BeagleBone A5 freezes on login

Hello,

I have a couple of BeagleBone rev. A5 boards running the latest available kernel downloaded from rcn-ee.net (psp10 from 15 May) and Debian Squeeze. On most of the devices, everything works as it should. However, some of the machines freeze directly after login. As soon as I log in, the machine freezes and the I/O LEDs on the board switch off. The power LED stays green. I have tried several different memory cards, power from both USB and 5V, and with and without my USB devices connected. When I look at the boot log, I can't see anything out of the ordinary; the boot seems to be successful. There is no error output on the screen or on the serial port.

Has anyone experienced anything similar, or does anyone have tips on how I can make progress debugging this? I have uploaded the boot log to: http://pastebin.com/mfApp8sA

-Kristian

Yes, I've had reports of this freeze issue, but I've been unable to
reproduce it in the lab with any of my A2/A3/A5 boards.. For
the time being you can use the Angstrom uImage kernel and modules just
fine on the Debian rootfs..

If you'd like to help debug the issue, the source/build tree is available here:

https://github.com/RobertCNelson/linux-dev/tree/am33x-v3.2

(hint: it's probably a config difference between my tree and Angstrom's..)

Regards,

Don't switch frequency with cpufreq, it oopses the kernel nowadays in softidle and irqs. Maybe the irq bank fix has some unintended side effects :frowning:

Any chance that is related to the same bug where some boards suspend,
but others work fine.. (and I just happened to luckily get the good
Bones?) :wink:

Regards,

Suspend/resume is a silicon issue last I heard. Kinda like 1GHz on xM, select boards work, so there is no problem....

Thanks for the tip. Where can I find the Angstrom uImage/modules for the latest kernel? I have had lots of problems with USB in kernels earlier than psp7. What would be the Angstrom equivalent of that kernel?

When I have some spare time, I will start experimenting with the configs. I see this behavior on multiple boards, but have not been able to find a systematic difference.

Thanks again for the help,
Kristian

Also, if anyone has any suggestions for config options to enable/disable, please let me know.

-Kristian

With koen's suggestion i just tweaked this..

https://github.com/RobertCNelson/linux-dev/commit/bf451d395ff25352a37573b1b3f73a4d3b6fdb0b

I'll have a uImage/modules ready for download/testing in 10 minutes, which
I'd like you to test, as your board is showing the issue..

Regards,

You need the reverse diff of that :slight_smile: USERSPACE is the safest bet, provided nothing in userspace tries to touch cpufreq.

But we can't reverse that, as that's the way it was in his image by
default (using the userspace cpufreq governor), and the userspace
cpufreq tools are installed by default.. :slight_smile: So either have the user
remove the cpufreq tools, or just force performance...
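
For anyone who wants to check which governor a given board is actually running before and after swapping kernels, here is a minimal sketch. It assumes the kernel exposes the standard cpufreq sysfs directory for cpu0; the function names are made up for illustration, and it only reads the files, it never switches frequency.

```python
from pathlib import Path

# Standard cpufreq sysfs location for the first CPU (assumed to be present
# on the board; it is missing if cpufreq is not built into the kernel).
CPUFREQ_DIR = Path("/sys/devices/system/cpu/cpu0/cpufreq")


def read_governor(cpufreq_dir=CPUFREQ_DIR):
    """Return the active cpufreq governor, or None if it cannot be read."""
    try:
        return (Path(cpufreq_dir) / "scaling_governor").read_text().strip()
    except OSError:
        return None


def available_governors(cpufreq_dir=CPUFREQ_DIR):
    """Return the governors the running kernel offers (empty list if unknown)."""
    try:
        text = (Path(cpufreq_dir) / "scaling_available_governors").read_text()
    except OSError:
        return []
    return text.split()


if __name__ == "__main__":
    print("active governor:", read_governor())
    print("available governors:", available_governors())
```

If this reports "userspace" on a freezing board and "performance" on a working one, that would at least be consistent with the cpufreq theory above.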

Regards,

Can you give this a shot:

http://rcn-ee.homeip.net:81/testing/beaglebone/3.2-cpufreq/

Just cp the uImage file as uImage to the boot partition and extract
the modules as root to the rootfs partition..
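
If you have several boards to update, the two steps above can be scripted. This is only a rough sketch: the function name is hypothetical, the file names and mount points are whatever you pass in, and it has to run as root so the extracted modules get the right ownership. It also keeps a copy of the previous uImage so you can go back.

```python
import shutil
import tarfile
from pathlib import Path


def install_test_kernel(uimage_src, modules_tarball, boot_dir, rootfs_dir):
    """Copy a test uImage to the boot partition and unpack its modules.

    boot_dir and rootfs_dir are the mount points of the card's boot and
    rootfs partitions (assumed to be mounted already).
    """
    boot_dir = Path(boot_dir)
    target = boot_dir / "uImage"

    # Keep the kernel that was there before, in case the test kernel is worse.
    if target.exists():
        shutil.copy2(target, boot_dir / "uImage.bak")
    shutil.copy2(uimage_src, target)

    # Newer Python versions filter tar members by default, which can reject
    # the absolute symlinks often found in module tarballs. Only disable the
    # filter for tarballs from a source you trust.
    kwargs = {}
    if hasattr(tarfile, "fully_trusted_filter"):
        kwargs["filter"] = "fully_trusted"
    with tarfile.open(modules_tarball) as tar:
        tar.extractall(rootfs_dir, **kwargs)

    return target
```

Remember to sync and unmount both partitions before pulling the card.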

Regards,

I have issues with Angstrom. I am now using Ubuntu r8 11.10 with a 256MB swap file, and that works nicely …

Thanks. I only have remote access to the machines for the rest of today
and tomorrow, and the machine that freezes on boot has crashed (there
is a successful boot now and then). I will try this first thing
Friday morning.

-Kristian

I tried this kernel on my R4, same problem

done.
Begin: Running /scripts/init-bottom … done.

[ 240.780506] INFO: task init:1 blocked for more than 120 seconds.
[ 240.786841] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
[ 240.795101] init D c04d4c7c 0 1 0 0x00000001
[ 240.801818] Backtrace:
[ 240.804429] [] (__schedule+0x0/0x43c) from [] (schedule+0x78/0x7c)
[ 240.812795] [] (schedule+0x0/0x7c) from [] (schedule_timeout+0x20/0x16c)
[ 240.821705] [] (schedule_timeout+0x0/0x16c) from [] (wait_for_common+0xc8/0x154)
[ 240.831328] r6:00000002 r5:cf837abc r4:7fffffff
[ 240.836216] [] (wait_for_common+0x0/0x154) from [] (wait_for_completion+0x18/0x1c)
[ 240.846042] [] (wait_for_completion+0x0/0x1c) from [] (flush_work+0x30/0x3c)
[ 240.855322] [] (flush_work+0x0/0x3c) from [] (tty_flush_to_ldisc+0x14/0x18)
[ 240.864503] [] (tty_flush_to_ldisc+0x0/0x18) from [] (n_tty_poll+0x7c/0x180)
[ 240.873790] [] (n_tty_poll+0x0/0x180) from [] (tty_poll+0x68/0x88)
[ 240.882134] r7:00000000 r6:ced16ac0 r5:ced2c000 r4:cf2be980
[ 240.888131] [] (tty_poll+0x0/0x88) from [] (do_select+0x2b8/0x460)
[ 240.896475] r7:00000000 r6:00000002 r5:00000200 r4:00000400
[ 240.902481] [] (do_select+0x0/0x460) from [] (core_sys_select+0x244/0x308)
[ 240.911573] [] (core_sys_select+0x0/0x308) from [] (sys_select+0xdc/0x10c)

Rob, can you be a little more specific? There are about 3 revisions
floating around since yesterday..

For another user, this one seems to be solving his problems:

http://rcn-ee.homeip.net:81/testing/beaglebone/3.2-cpufreq-disable-oppturbo/

Regards,

I mean this pre-built one I just tested: http://rcn-ee.homeip.net:81/testing/beaglebone/3.2-cpufreq/
(I also tried one I cooked up from your instructions before that).

I’ll try the 3.2-cpufreq-disable-oppturbo one now.

Okay, thanks for confirming that..

Have you found a way to quickly replicate the error (certain program,
certain load, etc.)? On my A3 it happened once yesterday, but I've
been unable to replicate it since..

Regards,

Just booted [ 0.000000] Linux version 3.2.17-psp10.3 (voodoo@hera) (gcc version 4.6.3 (Linaro GCC 4.6-2012.03) ) #3 Wed May 16 18:26:45 CDT 2012

from the 3.2-cpufreq-disable-oppturbo pre built directory and same problem.

Maybe I am chasing a furphy here; mine does it every time with ARM Ubuntu. It boots the old Angstrom fine.
Does the hung-task timeout stack backtrace mean anything? It's the same every time.
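
To check that the backtrace really is the same on every boot (and across boards), one option is to reduce each captured log to its chain of function names and compare those. A small sketch, assuming the log lines look like the ones pasted above, "(callee+0x../0x..) from [] (caller+0x../0x..)"; the function name is made up for illustration.

```python
import re

# Matches "(symbol+0xOFFSET/0xSIZE)" as printed in the ARM backtrace above.
FRAME = re.compile(r"\(([A-Za-z_][\w.]*)\+0x[0-9a-f]+/0x[0-9a-f]+\)")


def call_chain(log_text):
    """Return the function names of a backtrace, innermost frame first."""
    chain = []
    for line in log_text.splitlines():
        names = FRAME.findall(line)
        if len(names) != 2:
            continue  # register dumps and other lines carry no frame pair
        callee, caller = names
        if not chain:
            chain.append(callee)
        chain.append(caller)
    return chain


if __name__ == "__main__":
    sample = (
        "[ 240.804429] [] (__schedule+0x0/0x43c) from [] (schedule+0x78/0x7c)\n"
        "[ 240.812795] [] (schedule+0x0/0x7c) from [] (schedule_timeout+0x20/0x16c)\n"
    )
    print(" <- ".join(call_chain(sample)))
```

Two serial captures show the same hang if call_chain() returns equal lists for both, regardless of timestamps and register values.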

I am having the same problems on an A4 board.
All was fine from the original 12.04 Alpha install and upgrades up until a recent “aptitude full-upgrade” when I ran into the problem.
All attempts at new installs and substitutions of 3.2.17-psp10.1-modules.tar.gz, 3.2.17-psp10.1.uImage, linux-image-3.2.17-psp10.2_1.0cross_armel.deb result in the same problem.
Regards
Sid.

Starting to lose hair over this one..

Images to test..
http://rcn-ee.homeip.net:81/testing/beaglebone/3.2-cpufreq-more/

Thought it might be SD card related, so I tested 7 different micro SD
cards, along with an A2 (main patch in that kernel change), A3 and
A5 BeagleBone.. I've got some more boards at home to try, and more micro
SD cards.. But I can't get these boards to show that error..

Regards,