Beaglebone Black Rebooting Several Times Every Day

Thank you!

Gerald

Well, there are new and interesting discoveries on this problem.

Last night we set up 9 BBWs alongside the Blacks as a control. During the night every BBB crashed at least once and none of the BBWs did. All were running the same kernel and software: 3.15.2-bone5.

This means that the problem is hardware-related on the BBB board. I am starting to suspect the PMIC, and that some interference somehow triggers a sys_reset. We are trying to connect a data logger to some pins on the BBB to see whether sys_reset goes low prior to a reboot, but I strongly suspect it does…
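Once the data logger is capturing, the check itself is simple. Here is a rough Python sketch of it, assuming the logger can export the reset line as (timestamp, level) samples. The sample format, the function names and the 2-second window are all made up for illustration and are not taken from any particular logger.

```python
from typing import Iterable, List, Tuple

# ASSUMPTION: each sample is (time in seconds, logic level 0 or 1).
Sample = Tuple[float, int]


def low_pulses(samples: Iterable[Sample]) -> List[Tuple[float, float]]:
    """Return (start, end) times of every interval where the line read low."""
    pulses = []
    start = None
    last_t = None
    for t, level in samples:
        if level == 0 and start is None:
            start = t                      # falling edge: a low interval begins
        elif level != 0 and start is not None:
            pulses.append((start, t))      # rising edge: the interval ends
            start = None
        last_t = t
    if start is not None:                  # capture ended while still low
        pulses.append((start, last_t))
    return pulses


def pulses_before(pulses, reboot_time: float, window: float = 2.0):
    """Low intervals that began within `window` seconds before the reboot."""
    return [p for p in pulses if reboot_time - window <= p[0] <= reboot_time]
```

If `pulses_before` comes back empty for a reboot, the reset line did not drop in that window and the trigger is probably somewhere else.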

I had a total of 5 resets yesterday (at 8am, 5pm, 9pm, 10pm and 11pm) using a switching power supply. Maybe I should go back to the cheapo power supply, as I was only resetting once or twice a day with that one. I’ll wait and see what Thomas comes up with before ordering the serial debug cable. The only thing I have changed in the last two days is the power supply; all else is static (kernel, running daemons, attached hub). I’m going to unplug the USB hub now and see what that does.
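To keep an exact record of reset times instead of noticing them by hand, something like the following could run from cron. It is only a sketch: the log file name and the 5-second tolerance are arbitrary choices of mine. It works out the boot time from `/proc/uptime` and appends it to a log when it differs from the last entry.

```python
import time


def boot_time(now: float, uptime_text: str) -> float:
    """Epoch time of the last boot, given the contents of /proc/uptime."""
    uptime_seconds = float(uptime_text.split()[0])
    return now - uptime_seconds


def is_new_boot(previous, current: float, tolerance: float = 5.0) -> bool:
    """True if `current` differs from the last recorded boot time."""
    return previous is None or abs(current - previous) > tolerance


def record_boot(log_path="boot_times.log", uptime_path="/proc/uptime") -> bool:
    """Append the boot time to log_path if this boot is not recorded yet."""
    with open(uptime_path) as f:
        current = boot_time(time.time(), f.read())
    try:
        with open(log_path) as f:
            entries = f.read().split()
        previous = float(entries[-1]) if entries else None
    except FileNotFoundError:
        previous = None
    if is_new_boot(previous, current):
        with open(log_path, "a") as f:
            f.write(f"{current:.0f}\n")
        return True
    return False
```

One caveat: the BBB has no battery-backed clock, so the computed time is only meaningful once the system time has been set (for example by NTP) after boot.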

I didn’t follow all of the conversation, but I also had reboots several times every day.

I found out that the network traffic was too much for the BeagleBone Black, and the watchdog decided to reboot the board every time.

Micka, that sounds interesting. Which watchdog are you talking about? Is this a hardware watchdog on the board or a software watchdog? How can you find evidence that it is the watchdog that is causing the reboot?

Which circuit are you talking about, and is there any dmesg output or other information you could share?
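One possible way to get evidence on the watchdog question: the AM335x SoC keeps a reset-status register (PRM_RSTST) that records the source of recent resets. Below is a sketch of decoding it in Python. The address and the bit positions are an assumption on my part, written from memory of the AM335x Technical Reference Manual, so please check them against the TRM before trusting the result. Reading `/dev/mem` needs root, and a kernel built with strict `/dev/mem` protection may refuse the access.

```python
import mmap
import os

# ASSUMPTION (verify against the AM335x TRM): PRM_DEVICE sits at 0x44E00F00
# and PRM_RSTST is at offset 0x8, i.e. 0xF08 into the page at 0x44E00000.
PRM_PAGE_BASE = 0x44E00000
PRM_RSTST_OFFSET = 0xF08

# ASSUMPTION (verify against the AM335x TRM): bit number -> reset source.
RESET_SOURCES = {
    0: "GLOBAL_COLD_RST",      # power-on / cold reset
    1: "GLOBAL_WARM_SW_RST",   # software-requested warm reset
    4: "WDT1_RST",             # watchdog timer reset
    5: "EXTERNAL_WARM_RST",    # external warm reset pin
    9: "ICEPICK_RST",          # debugger-initiated reset
}


def decode_rstst(value: int):
    """Names of the reset sources whose bits are set in `value`."""
    return [name for bit, name in sorted(RESET_SOURCES.items())
            if value & (1 << bit)]


def read_rstst() -> int:
    """Read the 32-bit register through /dev/mem (root only, BBB only)."""
    fd = os.open("/dev/mem", os.O_RDONLY | os.O_SYNC)
    try:
        with mmap.mmap(fd, mmap.PAGESIZE, mmap.MAP_SHARED, mmap.PROT_READ,
                       offset=PRM_PAGE_BASE) as mm:
            raw = mm[PRM_RSTST_OFFSET:PRM_RSTST_OFFSET + 4]
    finally:
        os.close(fd)
    return int.from_bytes(raw, "little")

# On the board, as root:  print(decode_rstst(read_rstst()))
```

As far as I understand, the bits stay set until software clears them, so a single reading may reflect more than one past reset. If WDT1_RST never shows up after an unexplained reboot, that would argue against the watchdog being the cause.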

As stated, I now have nine Blacks all rebooting and nine Whites not rebooting. Same kernel, same network, same network load.

You have to activate the watchdog yourself; normally it’s not activated by default.

I’m talking about the software watchdog.

One of my boards finally did this with 3.14-ti, just randomly
rebooted, no serial debug messages. Kinda looked like the watchdog
timer, as it was just "too" clean of a kernel reboot..

Regards,

Time to turn verbose mode on?

Forget about claiming it's a hardware issue, because it is not. We've had
two BeagleBone Blacks for close to a year and a half now. Both are rock
solid, running off barrel jack power or USB (one of each).

OK, well, I am just stating facts: the BBW is not rebooting and the BBB is,
with the same kernel.

OK, I would be happy to stop guessing and actually find the problem. Could
you give me some real advice on HOW to use the serial module appropriately?
I do have it connected to the serial console, and I have even built a
kernel with verbose debugging on; I am just not seeing anything.

So if you would like any information or dumps or whatever i am happy to
provide that.

Short term, you can disable the hardware watchdog in kernel config. That should temporarily fix your problem, and if it does not . . . bigger mystery.

Long term, it would be good to figure out what is triggering the watchdog. strace on the watchdog PID will not work - I tested this myself.

The only thing I can think of offhand is perhaps modifying the watchdog module to output a debug message before rebooting the system, stating which process triggered the shutdown and why. If possible . . .

Will research more tomorrow; it is very late (actually early) here.

I switched over from kernel 3.8.13 to 3.14.19-ti-r28 (compiled with netconsole support) and after a little over 24 hours uptime, my BBB rebooted without any entries in the netconsole log or any logs on the BBB that I could find.
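If the netconsole receiver does not already timestamp what it gets, a small listener like the one below can help narrow down when the board went silent. This is a sketch under a few assumptions: port 6666 is the netconsole default target port, but the port on your setup depends on the `netconsole=` parameter you passed, and the log file name is arbitrary.

```python
import socket
import time


def format_record(timestamp: float, sender: str, payload: bytes) -> str:
    """One log line: receiver-side UTC time, sender address, message text."""
    text = payload.decode("utf-8", errors="replace").rstrip("\n")
    stamp = time.strftime("%Y-%m-%d %H:%M:%S", time.gmtime(timestamp))
    return f"{stamp}Z {sender} {text}"


def listen(port: int = 6666, log_path: str = "netconsole.log") -> None:
    """Receive netconsole UDP packets forever and append them to log_path."""
    sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    sock.bind(("", port))                      # all interfaces
    with open(log_path, "a", buffering=1) as log:   # line-buffered
        while True:
            payload, (host, _src_port) = sock.recvfrom(65535)
            log.write(format_record(time.time(), host, payload) + "\n")

# On the receiving machine:  listen()
```

The timestamps come from the receiving machine's clock, so the gap between the last message before a reboot and the first boot message afterwards is visible even when the kernel itself logged nothing.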

Did anyone have any luck in troubleshooting this issue any further?

In the meantime, I guess I’ll try your suggestion of disabling the watchdog for now.

The issue still happened even using the kernel compiled without any watchdog support. The netconsole log didn’t include any details. It just rebooted again without any indication why. That was about 24 hours after the last restart.
I still had the watchdog package installed (but with even the software watchdog disabled in the kernel, I didn’t think it would do anything). Well, I now removed that, but I’m not convinced that it did cause the reboot.

For the record, even with the watchdog package removed, my BBB rebooted on me again after just 7 hours of uptime.
Here’s the kernel config I used. Searching for watchdog, there are only two entries. This is the one still enabled:
CONFIG_OMAP_REMOTEPROC_WATCHDOG=y
Should I disable that as well?

Did you check whether memory usage is increasing? Any log files?

Micka,

I remember that it was once suggested to check the reboot reason from the PMIC. Your issues are definitely caused by some hardware fault in the PMIC.

@Micka:
I’m not sure how to check for increased memory usage as it’s completely unpredictable when it’s going to reboot. Is there some kind of logging I can enable for memory usage so it would get logged to my netconsole?
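One possible approach, sketched in Python: messages written to `/dev/kmsg` go into the kernel log buffer, so they should reach netconsole like any other kernel message, provided the console log level lets them through. The script below formats a few `/proc/meminfo` fields into one such message. The `memwatch:` tag, the choice of fields and the idea of running it from cron every minute are my own assumptions, and writing to `/dev/kmsg` needs root.

```python
def parse_meminfo(text: str) -> dict:
    """Map each /proc/meminfo key to its integer value."""
    values = {}
    for line in text.splitlines():
        key, sep, rest = line.partition(":")
        if not sep:
            continue
        fields = rest.split()
        if fields and fields[0].isdigit():
            values[key.strip()] = int(fields[0])
    return values


def format_kmsg(values: dict,
                keys=("MemFree", "Buffers", "Cached", "SwapFree")) -> str:
    """One kernel-log line; "<4>" marks it as warning priority."""
    parts = [f"{k}={values[k]}kB" for k in keys if k in values]
    return "<4>memwatch: " + " ".join(parts) + "\n"


def log_once(meminfo_path="/proc/meminfo", kmsg_path="/dev/kmsg") -> None:
    """Write the current memory figures into the kernel log."""
    with open(meminfo_path) as f:
        values = parse_meminfo(f.read())
    with open(kmsg_path, "w") as kmsg:
        kmsg.write(format_kmsg(values))

# As root, e.g. from cron once a minute:  log_once()
```

If memory were leaking, the last few `memwatch:` lines before a reboot should show MemFree shrinking. As a side effect the regular messages also act as a heartbeat, which bounds the time of the reboot to within one interval.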

As for logs, the netconsole log has nothing before the reboot occurs (just like the previous reports on this thread, including the one from RobertCNelson). Here’s the kernel log from when it booted up after the sudden reboot.
I also checked the syslog and there’s nothing out of the ordinary in it. It appears as if someone killed the power for a second and the system just booted up again.
Is there any other log I could look for?

@lisarden:
How would I go about checking the PMIC? Looking at the kernel log from the boot I can only see this:

[ 5.761536] sr_init: No PMIC hook to init smartreflex
[ 5.767135] sr_init: platform driver register failed for SR

I’ve also checked my past kernel logs and it’s always the same line.

Checking the kernel config, I’m not seeing any obvious setting to enable. Can you point me in the correct direction?

Thanks to both of you for your replies.

Searching around a bit, I wonder if an old U-Boot version might have anything to do with it? From what I can tell, I’m still using a U-Boot release from 2013.07. Should I be updating this? If so, what would be the safest way to do so?
In /boot/uboot/tools, I noticed a couple of scripts (bootloader_update, update and update_boot_files). Would any of these work or should I get a more recent version?

Mine is still resetting one to three times a day. I have moved both processes I was running on it (CUPS and weewx) over to my RasPi so it is just sitting idle except for CRON jobs. No events in serial debug logs, just a sudden reset and cold boot restart.

We still have the same issues with reboots as well. We have had to move our project to the BBW for now instead. The BBW has run the same kernel with no reboots for two weeks. The BBB reboots on average 1-2 times per 24 hours, independent of load.

No serial messages, no kernel messages, no oops. We have also tried a kernel without watchdog support, to no avail…

Any suggestions are highly appreciated at this point…