Multiple kernel oopses at almost every boot, at random points in the sequence

Dear group,
Before digging deeper, I would like to ask whether anyone has experienced this:
About 8 out of 10 reboots of the BBxM stop with a kernel oops at a seemingly random point in the boot process.
I have tested this on multiple boards and always get oopses. Occasionally the boot sequence reaches the login prompt without an oops, but rarely.

Here are a few examples captured from the console at boot (only the beginning of each oops message):

[ 4.951324] udevd[85]: starting version 175
[ 5.144683] usb 2-2: new high-speed USB device number 2 using ehci-omap
[ 5.222869] Unable to handle kernel paging request at virtual address 1edf8026
[ 5.230438] pgd = de3e4000
[ 5.233276] [1edf8026] *pgd=00000000
[ 5.236999] Internal error: Oops: 805 [#1] SMP ARM
[ 5.242004] Modules linked in:
[ 5.245208] CPU: 0 PID: 118 Comm: udevd Not tainted 3.15.1-armv7-x2 #1


[ 9.412963] omap-twl4030 sound.5: twl4030-hifi <-> 49022000.mcbsp mapping ok
[ 9.878204] Unable to handle kernel NULL pointer dereference at virtual address 00000004
[ 9.886810] pgd = df414000
[ 9.889617] [00000004] *pgd=9e06b831, *pte=00000000, *ppte=00000000
[ 9.896240] Internal error: Oops: 17 [#1] SMP ARM
[ 9.901153] Modules linked in: leds_pwm snd_soc_omap_twl4030 snd_soc_omap gpio_keys snd_soc_omap_mcbsp snd_soc_twl4030 snd_soc_core snd_compress snd_pcm_dmaengine snd_pcm snd_seq omap_aes snd_seq_device snd_timer snd smsc95xx usbnet evdev soundcore rtc_twl twl4030_keypad matrix_keymap


fsck from util-linux 2.20.1

rootfs was not cleanly unmounted, check forced.
rootfs: 55636/232464 files (0.1% non-contiguous), 335295/942848 blocks
fsck died with exit status 1
failed (code 1).
[ 13.743408] EXT4-fs (mmcblk0p2): re-mounted. Opts: errors=remount-ro
[ ok ] Cleaning up temporary files… /tmp.
[ 14.735748] Unable to handle kernel paging request at virtual address 1edf8026
[ 14.743316] pgd = db77c000
[ 14.746154] [1edf8026] *pgd=00000000
[ 14.749877] Internal error: Oops: 805 [#3] SMP ARM
[ 14.754882] Modules linked in:


[ 11.426849] EXT4-fs (mmcblk0p2): re-mounted. Opts: (null)
[…] Checking root file system…fsck from util-linux 2.20.1
rootfs was not cleanly unmounted, check forced.
[ 13.818634] Internal error: Oops - undefined instruction: 0 [#2] SMP ARM
[ 13.825653] Modules linked in: snd_soc_omap_twl4030 leds_pwm snd_soc_omap gpio_keys snd_soc_omap_mcbsp omap_aes snd_soc_twl4030 snd_soc_core snd_compress snd_pcm_dmaengine snd_pcm snd_seq snd_seq_device snd_timer snd rtc_twl smsc95xx usbnet evdev soundcore twl4030_keypad matrix_keymap
[ 13.852111] CPU: 0 PID: 3 Comm: ksoftirqd/0 Tainted: G D 3.15.1-armv7-x2 #1
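
To compare captures from several boots, the fault lines can be tallied with a short script. This is only a minimal sketch, assuming the serial console output has been saved as text; the function name `summarize_oops` and the regular expressions are illustrative and were written against the excerpts above, so other oops formats may not match.

```python
import re
from collections import Counter

# Patterns written against the oops lines quoted in this post (an assumption:
# other kernels or architectures may print these lines differently).
FAULT_RE = re.compile(
    r"Unable to handle kernel (.+?) at virtual address ([0-9a-fA-F]{8})"
)
OOPS_RE = re.compile(
    r"Internal error: Oops(?: - ([^:]+))?: ([0-9a-fA-F]+) \[#(\d+)\]"
)


def summarize_oops(log_text):
    """Count fault addresses and Oops codes found in a console capture."""
    addresses = Counter()
    codes = Counter()
    for line in log_text.splitlines():
        fault = FAULT_RE.search(line)
        if fault:
            addresses[fault.group(2)] += 1
        oops = OOPS_RE.search(line)
        if oops:
            reason, code = oops.group(1), oops.group(2)
            # Keep the textual reason when the kernel prints one,
            # e.g. "undefined instruction: 0".
            codes[f"{reason}: {code}" if reason else code] += 1
    return addresses, codes
```

Feeding it the text of one or more saved captures returns two counters, which makes it easy to see whether the same address (such as 1edf8026 above) keeps recurring across boots.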

Just FYI, the first serial print:

U-Boot 2014.01-00014-gcc58fc1 (Feb 10 2014 - 16:24:52)

OMAP3630/3730-GP ES1.2, CPU-OPP2, L3-200MHz, Max CPU Clock 1 Ghz
OMAP3 Beagle board + LPDDR/NAND
I2C: ready
DRAM: 512 MiB
*** Warning - readenv() failed, using default environment

In: serial
Out: serial
Err: serial
Beagle xM Rev C
No EEPROM on expansion board
No EEPROM on expansion board
Die ID #60b000029ff800000160a7450701e011
Net: usb_ether
Hit any key to stop autoboot: 0
mmc0 is current device
gpio: pin 173 (gpio 173) value is 0
gpio: pin 4 (gpio 4) value is 0
SD/MMC found on device 0
reading uEnv.txt
807 bytes read in 3 ms (262.7 KiB/s)
Loaded environment from uEnv.txt
Importing environment from mmc …
Checking if lcdcmd is set …
Checking if uenvcmd is set …
Running uenvcmd …
reading zImage
4737144 bytes read in 292 ms (15.5 MiB/s)
reading initrd.img
3019480 bytes read in 188 ms (15.3 MiB/s)
reading /dtbs/omap3-beagle-xm.dtb
61643 bytes read in 13 ms (4.5 MiB/s)

Error: "expansion_args" not defined

Kernel image @ 0x80300000 [ 0x000000 - 0x484878 ]

Flattened Device Tree blob at 815f0000

Booting using the fdt blob at 0x815f0000
Using Device Tree in place at 815f0000, end 816020ca

Starting kernel …

The obvious question: what did I do wrong? I don’t know. One day everything worked consistently, then this started. I thought the SD card was corrupted and reflashed it with a dd image I had made earlier, but the problem is still there. I also tried different boards, with no luck. It seems like a virus :wink:
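
One way to rule out a bad write when reflashing is to compare the card against the image right after writing it. The following is a rough sketch under several assumptions: the helper names and paths are hypothetical, the card is read back before its first boot (booting modifies the filesystem, so a later mismatch is expected), and the card has been re-inserted so the read is not served from cache.

```python
import hashlib
import os


def sha256_prefix(path, length, chunk_size=1024 * 1024):
    """Hash the first `length` bytes of a file or block device."""
    digest = hashlib.sha256()
    remaining = length
    with open(path, "rb") as f:
        while remaining > 0:
            chunk = f.read(min(chunk_size, remaining))
            if not chunk:
                break  # source is shorter than the requested length
            digest.update(chunk)
            remaining -= len(chunk)
    return digest.hexdigest()


def card_matches_image(image_path, device_path):
    """Return True if the device starts with exactly the image contents."""
    size = os.path.getsize(image_path)
    return sha256_prefix(image_path, size) == sha256_prefix(device_path, size)
```

For example, `card_matches_image("backup.img", "/dev/sdX")` (both names are placeholders) would need read access to the device. A match only shows that the write succeeded; it says nothing about whether the image itself is good.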

Does it help if you disable this patch:


Oops, I wasn’t notified of your reply…
Which one of the twl4030 patches? I can’t see any at line 247…