Kernel BUG Ubuntu

Hi all,

Has anyone seen an error similar to this one? I encounter it every now
and then. It's not clear to me whether it could be caused by an
application doing something bad, or whether it can only come from the
kernel or a driver. Should I try switching to a different kernel
version to see if it goes away? Is there some additional information
that I should send? It has shown up on several different boards, all
running the same kernel version.

Thanks in advance for your assistance.

PS. I looked at the relevant line in omap_l3_smx.c, and it says
"If we have a timeout error, there's nothing we can do besides
rebooting the board. So let's BUG on any of such errors and handle the
others. timeout error is severe and not expected to occur."

[ 56.646026] kernel BUG at arch/arm/mach-omap2/omap_l3_smx.c:187!
[ 56.655548] Unable to handle kernel NULL pointer dereference at virtual addr0
[ 56.667205] pgd = deb74000
[ 56.673187] [00000000] *pgd=9dc32831, *pte=00000000, *ppte=00000000
[ 56.682952] Internal error: Oops: 817 [#1] SMP
[ 56.690765] last sysfs file: /sys/kernel/uevent_seqnum
[ 56.699310] Modules linked in: rtc_twl rtc_core smsc95xx gpio_keys
[ 56.709014] CPU: 0 Not tainted (2.6.38.4-x3 #1)
[ 56.717315] PC is at __bug+0x1c/0x24
[ 56.724182] LR is at __bug+0x18/0x24
[ 56.731018] pc : [<c00632c4>] lr : [<c00632c0>] psr: 20000193
[ 56.731048] sp : ddf5dda8 ip : 00d61000 fp : c08eea6c
[ 56.749328] r10: 00000000 r9 : 0000000a r8 : 00000000
[ 56.757904] r7 : 00000000 r6 : 00400000 r5 : 00000000 r4 : f8000000
[ 56.767822] r3 : 00000000 r2 : c088fdc8 r1 : 00000000 r0 : 0000004a
[ 56.777679] Flags: nzCv IRQs off FIQs on Mode SVC_32 ISA ARM Segment usr
[ 56.788269] Control: 10c5387d Table: 9eb74019 DAC: 00000015
[ 56.797332] Process pdextended (pid: 642, stack limit = 0xddf5c2f8)
[ 56.806976] Stack: (0xddf5dda8 to 0xddf5e000)
[ 56.814575] dda0: 00000000 c007ebd4 c08f081c 00000030 00000
[ 56.826263] ddc0: 00000000 de8681c0 c0882644 c08eea70 c08eea5c 00000000 00000
[ 56.837921] dde0: c08eea6c c010070c 00008001 ddf5dde8 00000000 c0882600 c088a
[ 56.849578] de00: 00000001 dde5e48c 00000040 00000040 00000000 c0102e10 c0888
[ 56.861236] de20: c00578e4 0000000a 00000000 c005906c 20000013 ffffffff fa208
[ 56.872894] de40: 00000000 c066b638 00000000 20000093 00000000 c1555aa0 ddfe0
[ 56.884582] de60: c09aad98 00000000 dde5e48c 00000040 00000040 00000000 00000
[ 56.896331] de80: 00000000 c053eea8 20000013 ffffffff 00000040 dea25aa0 00000
[ 56.908081] dea0: 00000000 01f902c8 408eb000 00000280 00000001 00000003 00000
[ 56.919799] dec0: 000021b0 00000000 ddf5c000 00000000 00000000 c053f1ec c0530
[ 56.931518] dee0: dde5e400 c053acf4 00000020 bedfc99c c08948d0 00000000 01f90
[ 56.943267] df00: bedfc99c dea25aa0 400c4150 000021b0 00000000 c0183850 408e0
[ 56.955047] df20: dea25aa0 00080000 00000000 c0345c5c 408e9000 fffffffa 0000c
[ 56.966766] df40: 00000000 00000001 000000d7 00000000 ffffffff 00000000 ffff0
[ 56.978424] df60: bedfc9d8 dea25aa0 bedfc99c 400c4150 00000007 00000000 ddf50
[ 56.990112] df80: 00000000 c0183e20 00000000 00000001 01f90890 01f908e0 00006
[ 57.001770] dfa0: c005f4a4 c005f320 01f90890 01f908e0 00000007 400c4150 bedfc
[ 57.013458] dfc0: 01f90890 01f908e0 00000000 00000036 00000000 004abe9c 00000
[ 57.025085] dfe0: 00000001 bedfc998 40121d5f 40303d6c 60000010 00000007 00000
[ 57.036743] [<c00632c4>] (__bug+0x1c/0x24) from [<c007ebd4>] (l3_interrupt_h)
[ 57.048797] [<c007ebd4>] (l3_interrupt_handler+0x0/0x168) from [<de8681c0>] )
[ 57.060394] Code: e3060f64 e34c007a eb18146c e3a03000 (e5833000)
[ 57.070007] ---[ end trace 19a8e106a3fc012b ]---
[ 57.078002] Kernel panic - not syncing: Fatal exception in interrupt
[ 57.087921] [<c0066ba4>] (unwind_backtrace+0x0/0xfc) from [<c0668354>] (pani)
[ 57.099792] [<c0668354>] (panic+0x6c/0x18c) from [<c00639f8>] (bad_mode+0x0/)
[ 57.110900] [<c00639f8>] (bad_mode+0x0/0x5c) from [<0000000b>] (0xb)
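
When comparing reports from several boards, it can help to pull just the
source location out of each saved log. A minimal sketch, assuming the
console output has been saved to a file; bug_location is a hypothetical
helper name, not part of any existing tool:

```shell
#!/bin/sh
# bug_location FILE
# Prints the file:line that follows the first "kernel BUG at" in FILE,
# without the trailing "!". Prints nothing if no such line exists.
bug_location() {
    sed -n 's/.*kernel BUG at \([^ !]*\)!.*/\1/p' "$1" | head -n 1
}
```

Run against a file holding the log above, this should print the
omap_l3_smx.c location reported on its first line.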

PPS. I'm wondering if this has been fixed, or if this is a different
issue since it's in a different file:
http://www.mail-archive.com/linux-omap@vger.kernel.org/msg55256.html

Yeah, that's a pretty old kernel; I pushed that out April 21, 2011.
I remember seeing omap_l3_smx.c errors, I just can't remember when they
were fixed.

As a safe test, just update to the latest stable release (the older
kernel will be backed up in the boot partition with *_old):

export DIST=option

(where option can be one of:
lucid/maverick/natty/oneiric/precise/squeeze/wheezy)

wget http://rcn-ee.net/deb/${DIST}-armel/LATEST-omap
wget $(cat ./LATEST-omap | grep STABLE | awk '{print $3}')
/bin/bash install-me.sh
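
If it helps, the same steps can be wrapped with a couple of sanity
checks so that a mistyped DIST or an unexpected LATEST-omap file fails
early. This is only a sketch: is_supported_dist and stable_url are
hypothetical helper names, and stable_url assumes (as the awk command
above does) that the download URL is the third whitespace-separated
field of the STABLE line. Unlike the plain grep, it stops at the first
matching line.

```shell
#!/bin/sh
# Succeeds only if $1 is one of the release names listed above.
is_supported_dist() {
    case "$1" in
        lucid|maverick|natty|oneiric|precise|squeeze|wheezy) return 0 ;;
        *) return 1 ;;
    esac
}

# Prints the third field of the first line containing STABLE in file $1.
stable_url() {
    awk '/STABLE/ { print $3; exit }' "$1"
}

# Possible usage (not run here, since it needs network access):
#   is_supported_dist "$DIST" || { echo "unknown DIST: $DIST" >&2; exit 1; }
#   wget "http://rcn-ee.net/deb/${DIST}-armel/LATEST-omap"
#   url=$(stable_url ./LATEST-omap)
#   [ -n "$url" ] && wget "$url" && /bin/bash install-me.sh
```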

Regards,

Thanks for the tip! I get this kernel BUG much, much less frequently
in 3.1.8 and 3.2.2. I might have some more data on that to come --
I've automated a turn-on procedure with a relay and am going to try
power-cycling the board for a while (so far I have only done it 50
times).
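
For anyone wanting to tally a run like this, here is a rough sketch.
It assumes the console output of each power cycle is captured to its
own *.log file in one directory; that layout and the name
count_bug_logs are hypothetical, not something my relay setup
necessarily produces.

```shell
#!/bin/sh
# count_bug_logs DIR
# Prints "<hits> <total>": the number of *.log files in DIR that contain
# a "kernel BUG at" line, followed by the number of *.log files examined.
count_bug_logs() {
    dir=$1
    total=0
    hits=0
    for f in "$dir"/*.log; do
        [ -e "$f" ] || continue   # the glob matched nothing
        total=$((total + 1))
        if grep -q 'kernel BUG at' "$f"; then
            hits=$((hits + 1))
        fi
    done
    echo "$hits $total"
}
```

The ratio of the two numbers gives a failure rate per boot that can be
compared across kernel versions.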

One thing I've noticed is that pd will start at relatively low audio
latency with the ALSA drivers in 3.1.8 and 2.6.38.4. There is a spike
in the CPU usage right as it's starting, but it recovers. However, in
3.2.2 the ALSA drivers just won't start in pd. I'm currently fiddling
with some kernel switches recommended for pro/low-latency audio. I'll
let you know if I come up with anything.

Regards!