I'm afraid I have a mysterious problem and I was wondering if anyone
could shed any light on it?
I am trying to run quite a bit of code on a Beagle xM, Rev C and I
get random segfaults and/or SIGILLs.
These turn up, occasionally, in every program I attempt to run - busybox, Python, my C++ code - everywhere. One just happened in
They seem mostly to be associated with ELF thunks; I will get some
ldr r3, [ r12, # <something> ]
mov r0, r3
If I inspect these crashes in gdb:
- my r12 looks plausible.
- [r12, #<something>] is a legal address.
- r3 is that legal address
- r0 is total garbage
- I have attempted to jump to the garbage, resulting in the
It looks as though something, somewhere (the kernel's context switch
code?) has caused my registers to become corrupt - either that, or there is a RAM problem somewhere causing occasional reads to go bad -
though I don't really buy that theory because then r3 should also be bad.
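
For anyone who wants to put the RAM theory to a cruder test, a userspace
pattern check along these lines could be left running. This is only a
minimal sketch: the function names, the buffer size and the number of
passes are my own arbitrary choices, and it only exercises whatever pages
the kernel hands to this one process (and the caches may well hide a
marginal DRAM fault), so a clean run would not prove much.

```python
import random


def fill_pattern(buf, seed):
    """Fill buf in place with a reproducible pseudo-random byte pattern."""
    rng = random.Random(seed)
    for i in range(len(buf)):
        buf[i] = rng.randrange(256)


def count_mismatches(buf, seed):
    """Regenerate the same pattern and count the bytes that differ from it."""
    rng = random.Random(seed)
    bad = 0
    for i in range(len(buf)):
        if buf[i] != rng.randrange(256):
            bad += 1
    return bad


def run_passes(size, passes):
    """Write and verify `passes` different patterns; return total bad bytes."""
    buf = bytearray(size)
    total = 0
    for seed in range(passes):
        fill_pattern(buf, seed)
        total += count_mismatches(buf, seed)
    return total


if __name__ == "__main__":
    # 1 MiB and 4 passes are arbitrary; larger values touch more memory.
    print("bad bytes:", run_passes(1 << 20, 4))
```

A non-zero count would point at memory (or at whatever is corrupting
state underneath the interpreter); a zero count leaves both theories open.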
Bolstering my "the kernel is screwing up" theory, strace segfaults
I am not out of memory.
I am using Linux 3.12-rc1, with Tony Lindgren's patches, an up-to-date
u-boot and X-Loader 1.51, gcc 4.8.1 and binutils 2.22 (from yesterday's
crosstool). I am ARM-only - no Thumb code at all.
I'm mounting my root over nfs.
My kernel arguments are:
console=ttyO2,115200n8 mpurate=1000 mem=256M rootwait rw root=/dev/nfs ip=10.30.1.8:10.30.1.1:10.30.1.1:255.255.255.0:eth0 nfsroot=10.30.1.1:/export/elb2 earlyprintk=1 earlycon=ttyO2,115200n81 loglevel=8 vram=12M omapfb.vram=0:4M omapfb.vram=1:4M omapfb.vram=2:4M omapfb.mode=dvi:640x480MR-16@60 omapdss.def_disp=dvi nohlt
.. and you might say "stop using such an adventurous setup". So I tried:
- Rootfs on the uSD card.
- Linux 3.2.0, which was stable on another project (with an Overo)
- u-boot from that project.
- x-loader hasn't changed since that project
- arm-2011.03 gcc from codesourcery
.. and got the same result.
I have also tried several different power supplies (the one this project uses, which is a local regulator, a bench supply, a wall wart):
You might also say "you have a bad batch of Beagle boards" - so I tried
one from a previous project which also seemed fine there. And that one
exhibited the same fault - though it might have been doing it in the
previous project and we would have been unlikely to have noticed. But
it's not this batch that's the problem - "old" boards do it too.
So, can anyone tell me what I've missed? I'm running out of straws
to clutch at ..