Dirk
August 1, 2008, 6:15am
1
Testing the patches
http://www.pwsan.com/omap/gptimer_workaround_3.tar.gz
http://www.pwsan.com/omap/read_die_ids.patch
boot log outputs:
<7>OMAP_TAP_IDCODE 0x0b7ae02f REV 0 HAWKEYE 0xb7ae MANF 017
<7>OMAP_TAP_DIE_ID_0: 0x00000000
<7>OMAP_TAP_DIE_ID_1: 0x00000000 DEV_REV: 0
<7>OMAP_TAP_DIE_ID_2: 0x00000000
<7>OMAP_TAP_DIE_ID_3: 0x00000000
<7>OMAP_TAP_PROD_ID_0: 0x00000000 DEV_TYPE: 0
This board has the serial hang without patches applied.
I tried patch series 3 in two kernel configurations: without and with CONFIG_DEBUG_LL enabled. My default is without debug ll. I then wondered why I don't get the OMAP_TAP output at boot but only in dmesg, so for a second try I enabled debug ll.
Result:
- With the older patch series *2* yesterday I got a lot of "Timer workaround" outputs while typing at the serial console (and after some time a serial hang).
- With this patch series (*3*) I don't see any "*** GPTIMER missed match interrupt!" outputs, regardless of whether debug ll is enabled (not sure if I missed anything or if this is intended).
- *Without* debug ll enabled I get a serial hang (with patch series 3 applied) in < 10 min doing something like what is in the attachment.
- *With* debug ll enabled, I couldn't get a serial hang within ~20 min doing similar stuff to what is in the attachment. Maybe it simply takes longer to hang with debug ll enabled, or I was lucky.
Most probably I did something wrong here, sorry then. But maybe it helps.
Anyway, many thanks to Paul for looking at this!
Dirk
serial_hang_test.txt (40.2 KB)
Hello Dirk,
Result:
- *Without* debug ll enabled I get serial hang (with patch series 3 applied) in < 10min doing something like in attachment
- *With* debug ll enabled, I couldn't get serial hang within ~20min doing similar stuff like in attachment. Maybe it simply will take longer with debug ll enabled to hang. Or I had luck.
Most probably I did something wrong here, sorry then. But maybe it helps.
It does help, very much. Dirk, when your Beagle serial hangs again, could
you please send a Sysrq-q (break + q on serial console) and E-mail me the
GPTIMER register dump at the bottom? It will look something like this:
<3>GPT TCRR: ffff9c66
GPT TCRR: ffff9c66
<3>GPT TMAT: ffffbfff
GPT TMAT: ffffbfff
<3>GPT TISR: 00000000
GPT TISR: 00000000
<3>GPT TIER: 00000003
GPT TIER: 00000003
<3>GPT TCLR: 00000041
GPT TCLR: 00000041
<3>GPT TOCR: 00000000
GPT TOCR: 00000000
<3>GPT TOWR: 00000000
GPT TOWR: 00000000
thank you for the TAP data and the testing help,
- Paul
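For anyone reading such a dump, here is a rough sketch of turning it into register values and naming the bits that are set. The "<3>" prefix is the printk log level, so each register shows up twice and the parser simply keeps one value per name. The bit names (MAT/OVF/TCAR for TISR and TIER, ST/AR/CE for TCLR) are an assumption based on the usual OMAP GPTIMER register layout and should be checked against the TRM; `parse_gpt_dump` and `flags` are hypothetical helper names.

```python
import re

# Sample dump copied from the message above.
SAMPLE = """\
<3>GPT TCRR: ffff9c66
GPT TCRR: ffff9c66
<3>GPT TMAT: ffffbfff
GPT TMAT: ffffbfff
<3>GPT TISR: 00000000
GPT TISR: 00000000
<3>GPT TIER: 00000003
GPT TIER: 00000003
<3>GPT TCLR: 00000041
GPT TCLR: 00000041
<3>GPT TOCR: 00000000
GPT TOCR: 00000000
<3>GPT TOWR: 00000000
GPT TOWR: 00000000
"""

# Optional "<N>" log-level prefix, then "GPT <name>: <8 hex digits>".
LINE = re.compile(r"^(?:<\d>)?GPT (\w+): ([0-9a-fA-F]{8})$")

# ASSUMED bit names, not taken from this thread.
INT_BITS = {0: "MAT", 1: "OVF", 2: "TCAR"}   # TISR / TIER
TCLR_BITS = {0: "ST", 1: "AR", 6: "CE"}      # subset of TCLR


def parse_gpt_dump(text):
    """Return {register name: value}; duplicated lines collapse to one entry."""
    regs = {}
    for line in text.splitlines():
        match = LINE.match(line.strip())
        if match:
            regs[match.group(1)] = int(match.group(2), 16)
    return regs


def flags(value, names):
    """List the names of the bits that are set in value."""
    return [name for bit, name in sorted(names.items()) if value & (1 << bit)]


regs = parse_gpt_dump(SAMPLE)
print("TIER:", flags(regs["TIER"], INT_BITS))
print("TISR:", flags(regs["TISR"], INT_BITS))
print("TCLR:", flags(regs["TCLR"], TCLR_BITS))
```

If the assumed bit names are right, the sample describes a running timer with compare enabled, match and overflow interrupts enabled, and no interrupt pending.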
after 4 hours and 32 minutes:
GPT TCRR: 20a06241
GPT TMAT: ffffbfff
GPT TISR: 00000000
GPT TIER: 00000003
GPT TCLR: 00000041
GPT TOCR: 00000000
GPT TOWR: 00000000
regards,
Koen
Dirk
August 3, 2008, 12:01pm
6
After Paul explained to me that I have to use 'ctrl-a f q' in minicom to send Sysrq-q (thanks!), with debug ll disabled I get a serial hang after ~27 min:
-- cut --
root@beagleboard:~# SysRq : Show Pending Timers
Timer List Version: v0.3
HRTIMER_MAX_CLOCK_BASES: 2
now at 1724156066894 nsecs
cpu: 0
clock 0:
.index: 0
.resolution: 1 nsecs
.get_time: ktime_get_real
.offset: 1216686567818359375 nsecs
active timers:
clock 1:
.index: 1
.resolution: 1 nsecs
.get_time: ktime_get
.offset: 0 nsecs
active timers:
#0: <c03f3d98>, hrtimer_wakeup, S:01, do_nanosleep, xkbd/1580
# expires at 1382835683593 nsecs [in -341320383301 nsecs]
#1: <c03f3d98>, hrtimer_wakeup, S:01, do_nanosleep, ipaq-sleep/1571
# expires at 1383926635742 nsecs [in -340229431152 nsecs]
#2: <c03f3d98>, tick_sched_timer, S:01, tick_nohz_restart_sched_tick, swapper/0
# expires at 1656406250000 nsecs [in -67749816894 nsecs]
#3: <c03f3d98>, it_real_fn, S:01, do_setitimer, Xfbdev/1498
# expires at 1656428782958 nsecs [in -67727283936 nsecs]
.expires_next : 1382835683593 nsecs
.hres_active : 1
.nr_events : 146275
.nohz_mode : 2
.idle_tick : 1382828125000 nsecs
.tick_stopped : 0
.idle_jiffies : 138601
.idle_calls : 686178
.idle_sleeps : 668180
.idle_entrytime : 1722605987548 nsecs
.idle_waketime : 1656403564453 nsecs
.idle_exittime : 1656403747558 nsecs
.idle_sleeptime : 1663773041876 nsecs
.last_jiffies : 173619
.next_jiffies : 173620
.idle_expires : 1382835937500 nsecs
jiffies: 173619
Tick Device: mode: 1
Clock Event Device: gp timer
max_delta_ns: 2147483647
min_delta_ns: 30517
mult: 140737
shift: 32
mode: 3
next_event: 1382835683593 nsecs
set_next_event: omap2_gp_timer_set_next_event
set_mode: omap2_gp_timer_set_mode
event_handler: hrtimer_interrupt
GPT TCRR: 00aabe07
GPT TMAT: ffffbfff
GPT TISR: 00000000
GPT TIER: 00000003
GPT TCLR: 00000041
GPT TOCR: 00000000
GPT TOWR: 00000000
-- cut --
No "*** GPTIMER missed match interrupt!", though.
When I do 'ctrl-a f q' several times, "now at XXX nsecs" and GPT TCRR: ZZZZ still increase. The other GPT values stay the same.
Dirk
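The numbers in this dump can be put side by side with a little arithmetic. The sketch below computes how far in the past the clock event device's next_event lies (the same difference the timer list prints as "[in -... nsecs]") and converts the TCRR count into seconds. The 32768 Hz rate is an assumption (the timer being clocked from the 32 kHz source); the variable names are made up for illustration.

```python
NSEC_PER_SEC = 1_000_000_000
TIMER_HZ = 32768  # ASSUMPTION: GPTIMER runs from the 32 kHz clock

# Values copied from the SysRq dump above.
now_ns = 1724156066894          # "now at ... nsecs"
next_event_ns = 1382835683593   # Clock Event Device "next_event"
tcrr = 0x00AABE07               # GPT TCRR

# How long ago the programmed event should have fired.
overdue_s = (now_ns - next_event_ns) / NSEC_PER_SEC

# How long the counter has been counting up from zero at the assumed rate.
tcrr_s = tcrr / TIMER_HZ

print(f"next_event is overdue by {overdue_s:.2f} s")
print(f"TCRR corresponds to {tcrr_s:.2f} s at {TIMER_HZ} Hz")
```

Under that assumption the two figures come out within a fraction of a second of each other. That would be consistent with the counter having wrapped around and kept running since about the time the event was due, although nothing in the dump itself confirms this reading.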
In all the hangs I have seen, only TCRR differs; TMAT, TIER and TCLR are
always the same when hanging.
regards,
Koen
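That observation can be checked mechanically against the two hang dumps posted so far. A minimal sketch, with the hypothetical helpers `parse` and `differing`, that compares the dump taken after 4 hours and 32 minutes with the one taken after ~27 min:

```python
import re

# Dump posted "after 4 hours and 32 minutes".
DUMP_A = """\
GPT TCRR: 20a06241
GPT TMAT: ffffbfff
GPT TISR: 00000000
GPT TIER: 00000003
GPT TCLR: 00000041
GPT TOCR: 00000000
GPT TOWR: 00000000
"""

# Dump posted after the ~27 min hang with debug ll disabled.
DUMP_B = """\
GPT TCRR: 00aabe07
GPT TMAT: ffffbfff
GPT TISR: 00000000
GPT TIER: 00000003
GPT TCLR: 00000041
GPT TOCR: 00000000
GPT TOWR: 00000000
"""


def parse(text):
    """Map each register name to its integer value."""
    pairs = re.findall(r"GPT (\w+): ([0-9a-fA-F]{8})", text)
    return {name: int(value, 16) for name, value in pairs}


def differing(a, b):
    """Names of registers present in both dumps whose values differ."""
    return sorted(name for name in a.keys() & b.keys() if a[name] != b[name])


print(differing(parse(DUMP_A), parse(DUMP_B)))
```

For these two dumps the only register reported as different is TCRR.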
Dirk
August 4, 2008, 3:10pm
8
Yesterday, Philip had this with Koen's image:
*** GPTIMER missed match interrupt! last load: ffff76e1
------------[ cut here ]------------
WARNING: at arch/arm/mach-omap2/timer-gp.c:73 omap2_gp_timer_interrupt+0x44/0x78()
Modules linked in: pegasus ipv6
[<c0037b88>] (dump_stack+0x0/0x14) from [<c0053de0>] (warn_on_slowpath+0x4c/0x68)
[<c0053d94>] (warn_on_slowpath+0x0/0x68) from [<c003db74>] (omap2_gp_timer_interrupt+0x44/0x78)
r6:00000000 r5:c04059f8 r4:00000002
[<c003db30>] (omap2_gp_timer_interrupt+0x0/0x78) from [<c0079dac>] (handle_IRQ_event+0x3c/0x74)
r5:00000000 r4:c03ff300
[<c0079d70>] (handle_IRQ_event+0x0/0x74) from [<c007b67c>] (handle_level_irq+0xd4/0xf0)
r7:00000ab6 r6:00000000 r5:00000025 r4:c04086ec
[<c007b5a8>] (handle_level_irq+0x0/0xf0) from [<c0033048>] (__exception_text_start+0x48/0x64)
r5:c04086ec r4:00000025
[<c0033000>] (__exception_text_start+0x0/0x64) from [<c00336b0>] (__irq_svc+0x30/0x80)
Exception stack(0xc03fbf08 to 0xc03fbf50)
bf00: a0000013 e671ecb1 220cb6b1 00000000 198e134f a0000013
bf20: 00000000 00000ab6 19254d38 00000ab5 0000004a c03fbfa4 c03fbee8 c03fbf50
bf40: c0069908 c006fdbc 60000013 ffffffff
r7:00000ab6 r6:00000000 r5:d8200000 r4:ffffffff
[<c006fa60>] (tick_nohz_stop_sched_tick+0x0/0x390) from [<c0034a78>] (cpu_idle+0x44/0x80)
[<c0034a34>] (cpu_idle+0x0/0x80) from [<c0318f20>] (rest_init+0x58/0x6c)
r5:c042804c r4:c043fe8c
[<c0318ec8>] (rest_init+0x0/0x6c) from [<c0008b64>] (start_kernel+0x24c/0x2a4)
[<c0008918>] (start_kernel+0x0/0x2a4) from [<80008034>] (0x80008034)
---[ end trace 4118c6862fc8eec1 ]---
<snip>
Khasim reports that switching from the 32k timer to the MPU timer eliminates
the hang.
regards,
Koen
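On the "32k timer" point: the Clock Event Device values in the earlier SysRq dump (mult 140737, shift 32, min_delta_ns 30517) look consistent with a 32768 Hz clock. The sketch below assumes the usual clockevents scaling, where nanoseconds are converted to device ticks as (ns * mult) >> shift; the function name `ns_to_ticks` is made up for illustration.

```python
NSEC_PER_SEC = 1_000_000_000

# Values copied from the "Clock Event Device: gp timer" section of the dump.
MULT = 140737
SHIFT = 32
MIN_DELTA_NS = 30517


def ns_to_ticks(ns):
    """Scale nanoseconds to timer ticks (assumed clockevents convention)."""
    return (ns * MULT) >> SHIFT


# Clock rate implied by mult/shift, in ticks per second.
implied_hz = MULT * NSEC_PER_SEC / (1 << SHIFT)

# The mult value a 32768 Hz clock would produce with shift 32.
expected_mult = (32768 << SHIFT) // NSEC_PER_SEC

# Duration of a single tick in nanoseconds, using integer division.
one_tick_ns = (1 << SHIFT) // MULT

print(f"implied rate: {implied_hz:.1f} Hz")
print(f"mult for 32768 Hz: {expected_mult}")
print(f"one tick: {one_tick_ns} ns (dump says min_delta_ns {MIN_DELTA_NS})")
```

So the finest step this clock event device can be programmed with is one 32 kHz tick, roughly 30.5 microseconds. Whether that granularity has anything to do with the hang is not established in this thread.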
Dirk
August 5, 2008, 4:44pm
10
Update from Khasim about 32k timer weirdness:
http://www.beagleboard.org/irclogs/index.php?date=2008-08-05#T16:27:58
Dirk