Beagleboard xM bizarre error - boot failing

Hello,

I have a Beagleboard xM rev C1 (P/N 296-25798-ND) from Digi-key. I didn’t get very far with the built-in Ångström Linux (it kept complaining about missing configuration files), so I put a copy of the RISC OS image on the supplied microSD card (which is what I was intending to do in the first place), and have been running RISC OS for about two weeks without issue or incident. I use s-video output (I don’t have a DVI/HDMI monitor).

Earlier in the week, booting to Linux, something went wrong and the thing suffered a kernel panic. I can’t say what went wrong as all over the screen were loads of numbers in columns (some sort of memory dump?). Following this, the Beagleboard would not boot up, symptoms as below. As I was Googling to see if others experienced anything similar, I tried to power up for like the twentieth time and the board decided to work (no, this doesn’t make any sense to me either…).

I have been running my board using an Iomega Zip power brick (it’s the only thing I have that actually outputs a reliable 5V). It has been quite capable of starting a Beagle xM with peripherals connected. But see later.

Saturday evening, I wished to make a copy of my original boot microSD, so I got a Verbatim 4GiB card and used Win32DiskImager to write the standard distribution from the circuitco site. [sorry, using Linux is not an option, I don’t have a working Linux machine] Now either the Verbatim is incompatible, or something went wrong with the write [the image MD5 was correct], because attempting to boot produced a lot of I/O errors to do with buffers and the SD card, just after the part where USB devices are checked. This resulted, obviously enough, in a kernel panic.
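For anyone wanting to rule out a bad write, one check is to read the card back and compare it with the image file. What follows is only a minimal sketch in Python, under some assumptions: the card can be opened as a raw device (the device path is a hypothetical placeholder and depends on the machine, and reading raw devices normally needs administrator rights), and the function names are made up for illustration. Only as many bytes as the image contains are hashed, since the card is larger than the image.

```python
import hashlib
import os

CHUNK = 1024 * 1024  # 1 MiB per read; a multiple of the 512-byte sector size


def md5_prefix(path, length):
    """Return the MD5 hex digest of the first `length` bytes of `path`."""
    digest = hashlib.md5()
    remaining = length
    with open(path, "rb", buffering=0) as f:
        while remaining > 0:
            # Always request a whole chunk (raw devices tend to want
            # sector-aligned reads), then trim to what is still needed.
            block = f.read(CHUNK)
            if not block:
                break  # source is shorter than `length`
            block = block[:remaining]
            digest.update(block)
            remaining -= len(block)
    return digest.hexdigest()


def card_matches_image(image_path, device_path):
    """True if the card starts with exactly the bytes of the image file."""
    size = os.path.getsize(image_path)
    return md5_prefix(image_path, size) == md5_prefix(device_path, size)
```

Calling card_matches_image with the image file and the raw device path returns True when the read-back data is identical. The comparison is only meaningful before the card has been booted, because Linux writes to the root partition once it mounts it. A mismatch does not say whether the card, the reader or the write itself is at fault.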

I actually videoed this happening on my mobile phone, so I can say it is errors like:
Buffer I/O error, dev mmcblk0, sector 1437296
lost page write due to I/O error on mmcblk0p2
two or three screenfuls of stuff like that, ending with:
Kernel panic - not syncing: VFS: Unable to mount rootfs on unknown-block
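With several screenfuls of these, it can help to count the errors per device in a saved capture (from the serial console, for example). A minimal sketch in Python: the patterns are assumptions based only on the message wording quoted in this thread, not a complete description of what the kernel can print, and the function name is made up for illustration.

```python
import re
from collections import Counter

# Patterns guessed from the messages quoted in this thread (assumption).
PATTERNS = {
    "buffer_io": re.compile(r"Buffer I/O error(?: on device |, dev )(\w+)"),
    "lost_write": re.compile(r"lost page write due to I/O error on (\w+)"),
}


def tally_io_errors(lines):
    """Count matching error lines, keyed by (error kind, device name)."""
    counts = Counter()
    for line in lines:
        for kind, pattern in PATTERNS.items():
            match = pattern.search(line)
            if match:
                counts[(kind, match.group(1))] += 1
    return counts


if __name__ == "__main__":
    sample = [
        "Buffer I/O error, dev mmcblk0, sector 1437296",
        "lost page write due to I/O error on mmcblk0p2",
    ]
    for (kind, device), n in sorted(tally_io_errors(sample).items()):
        print(kind, device, n)
```

If every error names the same partition, that points at the card or its filesystem; errors spread over the whole device look more like a connection or power problem. That reading is a rule of thumb, not a diagnosis.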

Now, I don’t know what on earth Linux does while in a kernel panic, but I have experienced two, and my Beagleboard xM acted up after each of them.

Only this time, it is just plain refusing to start.

At power-up, the power LED (D5) comes on. The hub LED (D14) briefly flickers. That is it.

Attaching a serial port, I see the beginning of the start-up message. Sometimes it gets as far as reporting the DRAM, but usually it says “Reading boot sector” and freezes there. This happens with both microSDs (the one that won’t correctly start up Linux, and the original). Both FAT partitions check out okay with chkdsk on the PC.

Having mentioned this on the RISC OS Open forum, it has been suggested that perhaps the power brick isn’t powerful enough (though I want to draw your attention to a week of starting RISC OS with all peripherals attached, and now failing to start with nothing connected save for the serial lead).
Therefore, I hooked the Beagleboard xM to the 5V line of a PC’s PSU. It is one I use with my RiscPC for powering the additional harddisc, so there is plenty of power to go around as it isn’t running an Intel board (it is of the late 486 / early Pentium era, so if it can run one of them then surely it can start up a Beagle!).

I do not believe the SD to be corrupted. The ext filesystem written to the second card would appear to be, but in either case I would have expected to get at least as far as U-Boot.

Should I return my board for RMA or is there something else I can try? I’d just really like to see all the LEDs come on again, along with the colour bars testcard.

Thank you.

Best wishes,

Rick.

Hi Rick,

Before I say anything, let me say that I am a beginner myself. However, I know for a fact that the error message

lost page write due to I/O error on mmcblk0p2

is usually generated when the SD card is removed. In fact, I can reproduce the messages at will: I just pulled out the SD card right now, and here are the error messages:

[ 66.121551] ---[ end trace 22c83c3369b5c774 ]---
[ 66.126220] Buffer I/O error on device mmcblk0p2, logical block 0
[ 66.133209] ---[ end trace 22c83c3369b5c775 ]---
[ 66.138305] ---[ end trace 22c83c3369b5c776 ]---
[ 66.143371] lost page write due to I/O error on mmcblk0p2
[ 66.148834] Buffer I/O error on device mmcblk0p2, logical block 0
[ 66.155609] lost page write due to I/O error on mmcblk0p2
[ 66.865722] EXT3-fs error (device mmcblk0p2): ext3_find_entry: reading directory #112897 offset 0
[ 66.874938] Buffer I/O error on device mmcblk0p2, logical block 0
[ 66.881958] lost page write due to I/O error on mmcblk0p2

Above these lines were the memory addresses. I can only tell you what I believe may be the issue:

  1. Some loose piece of solder has found its way into the SD card socket of your Beagleboard and is causing the issue (it could have come from the socket of your PC / card reader, from the microSD card adapter, or even from the Beagleboard itself, though the last is highly unlikely).
  2. There is some hardware issue with the SD card socket (it got pulled / pushed too hard, or there is a manufacturing defect; once again, the last is highly unlikely).

If I were you, I would hold the SD card socket firmly against the board and see if that fixes the issue. If it does, some soldering (touch-up) is in order.

As you mentioned, "I tried to power up for like the twentieth time and the board decided to work (no, this doesn’t make any sense to me either…)". Inconsistent errors are usually due to hardware issues, and are therefore not usually repeatable.

Hope this helps.

Husain

Hi Husain,

Thank you for your suggestions. The lost page write (etc) error happens during booting from the microSD - I had put this down to Win32DiskImager (i.e. not using Linux) and/or the card itself. The card was not removed during the process. Other than that, yes, your list of errors pretty much matches what I saw. Would this imply that as far as the computer could see, the microSD was there, then suddenly it wasn’t? How very peculiar.

I had a look at the microSD socket (it’s a Beagle xM) and there are about four connections either side. They appear to be well attached, though as you suggested I applied gentle pressure while powering up. Unfortunately there was no change. The card contacts do not appear to be contaminated with anything, either.

I am not going to poke around with a soldering iron - with my level of soldering (used to stripboard, not tiny stuff like this) and wondering if my eyesight is even up to it these days, I think for me to try resoldering anything would be about as productive as rubbing the Beagle on a cat… Hmm…

A friend is sending me a copy of his original microSD, and I’ll power up the Beagle xM when it arrives, running it off a BBC Micro power supply (an 8 bit TTL-logic '80s micro, sort of like an Apple II with bells on). If it doesn’t work with that, I guess I’ll have to RMA it. Until then, I’ll be sad - RISC OS ran like greased lightning!

Thank you again.

Best wishes,

Rick.

Hi Rick,

I was quite sure you had not pulled the SD card out yourself; I only did so to generate these error messages.

Good luck with the new SD card.

Regards

Husain