Beaglebone Black Rebooting Several Times Every Day

William_Hermans · September 14, 2014, 9:42pm

Oh, and right, the only time I’ve had my BeagleBone Black reset is when it loses power (we switch from generator and back to solar), or when I’ve intentionally reset it. Mostly I reset my own board because I make lots of changes to various things, and I want to make sure it comes back up as intended from a reset.

William_Hermans · September 14, 2014, 9:53pm

Greg, buying a serial debug cable really is necessary if you intend to do anything serious with the BBB. However . . .

http://www.ebay.com/itm/like/351093728732?lpid=82

According to the PL2303HX datasheet it is 3v3 TTL, but double check for yourself. I’ve seen these cheap USB<->Serial converters go for as low as 2-3 USD. Cables, 6 bux US is about as low as I remember having seen them. From what I understand they’re also supposed to be fairly decent, but as with anything else YMMV.

You said something about having other cables / modules, just with the wrong header? Why not whip up an adapter? It’s not that hard to make one or to use jumper wires . . . doesn’t matter if it looks pretty so long as it serves its purpose.

David1 · September 15, 2014, 12:24am

I agree, serial cable is a must. Here's one a bit cheaper and on Amazon Prime:
http://www.amazon.com/gp/product/B008AGDTA4/ref=oh_aui_detailpage_o02_s00?ie=UTF8&psc=1

Mike · September 15, 2014, 12:52am

Not on Prime, but only 2.43. I’ve got several similar to this based on a Maxim 3232 chip. Supports both 3.3 and 5 volts. The shipping on the above rather kills the low price. I ordered mine from China with slow-boat shipping, typically 3 to 4 weeks. They are very versatile little boards. Another option is to use your Rpi to get the serial output from the BBB. Check the voltage levels first: the Rpi’s GPIO header, UART included, is 3.3 volt logic rather than 5 volt, so a level converter should not be needed there. Mike

Thomas_O · September 15, 2014, 9:02am

William, which kernel release are you running when you have no reboots?

Thomas_O · September 15, 2014, 9:15am

Hi Robert, I am having the same issues with at least 10 individual BBB boards, both rev 5b and 5c and from both Adafruit and CircuitCo. We are currently running them on a board with nice lab-quality professional 5v sources and Ethernet, and are pinging them constantly to simulate some sort of network activity. We have two of them connected via uart0 serial and have been logging that without spotting anything weird. Basically they just reboot. There is no trace of an Oops or Panic…

We have tried the 3.14 kernels from TI and your kernel builds up to the 3.16.2-bone5 build. We have also compiled custom kernels trying to disable stuff like the OTG, based on another thread, without success.

We still have an average of one reboot per day.

We do have access to serial cables and are happy to assist with any debug info you might want. We have found nothing in dmesg, and when logging the serial console we have so far seen nothing. It actually starts to look like a proper reset.

Some reboot stats…

192.168.1.x: uptime 4:19; reboot 3
192.168.1.x: uptime 1 day 2:12; reboot 0
192.168.1.162: uptime 1 day 2:13; reboot 0
192.168.1.x: uptime 16:10; reboot 1
192.168.1.x: uptime 19:21; reboot 2
192.168.1.x: uptime 11:23; reboot 1
192.168.1.x: uptime 15:05; reboot 1
192.168.1.x: uptime 11:14; reboot 1
192.168.1.x: uptime 02:29; reboot 1
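
Per-board reboot counts like these can be collected without touching the boards' logs. A minimal Python sketch, assuming each board's uptime is sampled periodically (for example by reading /proc/uptime over ssh); the helper names are made up for illustration, and this is not necessarily how the numbers above were gathered:

```python
from typing import Iterable


def read_uptime_seconds(path: str = "/proc/uptime") -> float:
    """Return seconds since boot: the first field of /proc/uptime on Linux."""
    with open(path) as f:
        return float(f.read().split()[0])


def count_reboots(samples: Iterable[float]) -> int:
    """Count reboots in a chronological series of uptime readings.

    Uptime only grows while a board stays up, so any reading lower than
    the previous one means the board restarted in between.
    """
    reboots = 0
    previous = None
    for uptime in samples:
        if previous is not None and uptime < previous:
            reboots += 1
        previous = uptime
    return reboots
```

Sampling has to be frequent relative to the reset rate: a reset is missed if the board has already been up longer than its previous reading by the time it is sampled again, and two resets between samples count as one.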

–Thomas

Thomas_O · September 15, 2014, 9:30am

Hi, we are experiencing the same problems as Greg. We are currently running more than 10 BBBs in test, both revision 5b and 5c, with the same results. We have tested kernel versions 3.14 and 3.15 from TI as well as your builds, with the same results.

Here are some stats from our boards running the 3.16.2-bone5 release…

192.168.1.160: uptime 9:52; reboot 3
192.168.1.161: no connection
192.168.1.162: uptime 2:16; reboot 1
192.168.1.163: uptime 8:06; reboot 2
192.168.1.164: uptime 2:50; reboot 3
192.168.1.165: uptime 5:12; reboot 1
192.168.1.167: uptime 13:38; reboot 3
192.168.1.169: no connection
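
To total up a list like this one, the lines can be parsed mechanically. The following Python sketch assumes the `host:uptime [N day] H:MM;reboot N` layout seen in this thread (with the semicolon optional); `parse_stat` is a hypothetical helper, not part of any tool mentioned here:

```python
import re
from typing import Optional, Tuple

# Accepts lines such as "192.168.1.160:uptime 9:52 ;reboot 3",
# "192.168.1.x:uptime 1 day 2:12 reboot 0" and "192.168.1.161:no connection".
STAT_RE = re.compile(
    r"^(?P<host>[^:\s]+):\s*"
    r"(?:no connection"
    r"|uptime\s+(?:(?P<days>\d+)\s+days?,?\s+)?(?P<h>\d+):(?P<m>\d+)"
    r"\s*;?\s*reboot\s+(?P<reboots>\d+))\s*$"
)


def parse_stat(line: str) -> Optional[Tuple[str, Optional[int], Optional[int]]]:
    """Return (host, uptime_minutes, reboots).

    Unreachable boards yield (host, None, None); lines that are not
    stat lines yield None.
    """
    m = STAT_RE.match(line.strip())
    if m is None:
        return None
    if m.group("h") is None:  # the "no connection" branch matched
        return (m.group("host"), None, None)
    days = int(m.group("days") or 0)
    minutes = (days * 24 + int(m.group("h"))) * 60 + int(m.group("m"))
    return (m.group("host"), minutes, int(m.group("reboots")))
```

Summing the third field over the reachable boards gives the total reboot count for the sampling window.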

We do have access to serial cables and are happy to support with any debug information you might want… We do not see anything weird in dmesg or the serial console which we have logged during the reboots.

Greg_Kelley · September 15, 2014, 12:09pm

William,

I’m not using any I/O at this time. Setting this up in steps. First was to get it running as a CUPS and weewx server like my RasPi. I can ‘play’ with it later once it’s stable. Found a cable on eBay for $6, shipped from CA, using a PL2303HX that claims 3.3v TTL.

Also, no resets since 3pm yesterday when I switched PS. Maybe there is a problem with the SuperPowerSupply model KZ0502000, although the other BBB (element14) was resetting using both PS (SPS and Garmin) and even a direct USB connection. It’s possible that there was a problem with both the other BBB and the SPS power supply. If it’s happy with the Garmin I’ll get a different 5v2a supply.

I am also solar but use a hybrid Grid-Tie Outback and a Manual Transfer Switch, so when we lose power, it’s dark until I can get to the switch. That’s why I have all my electronics (cable modem, router, Vonage, etc.) on a UPS.

Thomas_O · September 15, 2014, 12:33pm

Well, we are seeing exactly the same problems as Greg has experienced. We have 10 BBBs running for a project and we are seeing on average a crash, or maybe rather a reboot, per 24 hours.
We have tested the 3.14, 3.15 and 3.16 kernels built via Robert’s build environment as well as installing the TI kernels, with the same result.

We have ruled out the power supplies, since we are now running all the boards from a professional lab supply with measured ripple and performance far better than the board’s specs require.
We have had serial cables connected to uart0 and have not seen anything strange in dmesg or on the serial console. No trace of an oops or panic, just restarts…
There is no difference between kernel versions or board revisions.

We do have access to serial cables and are happy to help with providing any debug info needed.

–Thomas

Greg_Kelley · September 15, 2014, 5:28pm

I have now been running 22 hours without a reset - this is a first. Yesterday at 3:17pm I changed out power supplies. I also killed wicd, so now I’m suspicious of wicd as the culprit. I added ‘auto eth0’ in /etc/network/interfaces since I’m hard wire connected and wicd was bringing up eth0 far too late in the boot sequence, so that dhclient and ntp were both looping looking for the network (which wasn’t up yet), but I still had wicd starting up at boot. I suppose I can shut down, put the other power supply back on, and see what happens with wicd removed.
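
For reference, the change described above amounts to listing eth0 on an `auto` line so that ifupdown brings it up at boot. A small Python sketch that checks a Debian-style /etc/network/interfaces for this; `is_auto_interface` is a hypothetical helper, and backslash line continuations are not handled:

```python
def is_auto_interface(interfaces_text: str, iface: str) -> bool:
    """True if `iface` is named on an "auto" (or "allow-auto") line.

    In /etc/network/interfaces these lines list the interfaces that
    `ifup -a` brings up at boot; several names may share one line.
    Comment lines start with "#" and are skipped.
    """
    for raw in interfaces_text.splitlines():
        stripped = raw.strip()
        if not stripped or stripped.startswith("#"):
            continue
        words = stripped.split()
        if words[0] in ("auto", "allow-auto") and iface in words[1:]:
            return True
    return False
```

It could be run against the file's contents, for example `is_auto_interface(open("/etc/network/interfaces").read(), "eth0")`, before and after removing a network manager.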

William_Hermans · September 15, 2014, 10:02pm

Greg, ah yeah wicd . . .IMHO is the worst piece of **** program out there. First thing I do after installing an OS is remove it. If it exists.

After all, how hard is it to manually insert an interface into /etc/network/interfaces ? I’m old school anyhow, and prefer to do it this way.

So boot up time for me, with systemd enabled and static IPs set for my interfaces, is 15-16 seconds. systemd-analyze is great for this.

Using the standard init daemon, boot up times are not much slower, but there’s no cool tool to tell you exactly how long it takes.

Greg_Kelley · September 16, 2014, 1:15pm

Update: I changed power supplies back to the 5v2a and removed wicd yesterday at 3:20pm. Ran fine until 8:42am and reset after 16hrs. This time, it didn’t bring up eth0 and was unreachable. I pulled the power and powered back up, booted normally. Interesting thing is that syslog only shows the second boot on power up and not the reset reboot. Syslog was fresh at 6:25am and shows only one bootup sequence. Strange. I was hoping resets were from wicd, guess not. Problem still exists. I could deal with a single reset every day, but not when eth0 fails as my weather server uploads ftp files to my website every 15 minutes. I lost eth0 on reset back on 9/11 at 8:13am, so loss of eth0 is random in the resets. I have a Phihong Switching Power Supply 5v2a that I’m going to put on there and see if that helps, but unlikely since Thomas has lab quality power and all of his are resetting. Here’s what’s left in rc2.d, I have removed S06weewx for now (as well as apache2 and saned).

```
README              S01motd     S03cron         S03ssh           S05cups
S01boot_scripts.sh  S01rsyslog  S03dbus         S03udhcpd        S06rc.local
S01capemgr.sh       S01sudo     S03loadcpufreq  S04avahi-daemon  S06rmnologin
S01hostapd          S01xrdp     S03ntp          S04cpufrequtils
S01leds-off         S03acpid    S03rsync        S05bootlogs
```

Greg_Kelley · September 16, 2014, 1:20pm

One other note: the current Debian distro that comes with the BBB has ntpdate installed. I have removed ntpdate and installed ntp in its place.

Greg_Kelley · September 16, 2014, 1:28pm

I just changed over to the Switching Power Supply and eth0 failed to come up on boot. I hit the reset button and a reboot brought it up. So now it seems there are eth0 issues as well as resets. Going from bad to worse.

Thomas_O · September 16, 2014, 3:42pm

Greg, we are experiencing the same issue as well, with a reset not bringing up the eth0 interface. We were trying to do a hw watchdog that would reset the board, and when we press reset the eth0 interface fails as well. We are also setting the interface config statically in /etc/network/interfaces.
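
Since a reset can leave eth0 down, a boot-time or cron script could check the link and act on it. A hedged Python sketch, assuming a Linux kernel that exposes /sys/class/net/<iface>/operstate; `link_state` is a made-up name, and the recovery action (for example a reboot) is left out:

```python
import os


def link_state(iface: str = "eth0", sysfs_root: str = "/sys/class/net") -> str:
    """Return the kernel's operstate for `iface` ("up", "down", ...),
    or "missing" if the interface does not exist."""
    path = os.path.join(sysfs_root, iface, "operstate")
    try:
        with open(path) as f:
            return f.read().strip()
    except FileNotFoundError:
        return "missing"
```

A result other than "up" once boot has settled would be the signal to log or recover; some drivers report "unknown", so treat this as a starting point rather than a complete check.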

Chris_Morgan · September 16, 2014, 4:53pm

I very much appreciate your efforts in looking at these issues. The
company I work for is planning to use a BBB in a commercial product,
after discussions with CircuitCo, and this issue has me a little bit
worried. We haven't had a wide enough scale usage to know if we are
seeing something similar here with the resets and eth0 issues.

Looking forward to observing as you guys continue to work through this stuff.

Chris

Gerald_Coley1 · September 16, 2014, 4:56pm

I plan to address the Ethernet issue in the next revision of the board, in the form of a GPIO-based recovery mechanism that will allow SW to reset the Ethernet PHY.

Gerald

RobertCNelson · September 16, 2014, 5:00pm

we do have a phy workaround:

https://github.com/RobertCNelson/bb-kernel/blob/am33x-v3.16/patches/beaglebone/phy/0003-cpsw-search-for-phy.patch

i need to rebase it to the 3.14-ti tree..

Regards,

Gerald_Coley1 · September 16, 2014, 5:56pm

Well. That would be nice to have everywhere.

Gerald

RobertCNelson · September 16, 2014, 5:59pm

It is in all the "bone" branches, except in the new 3.14-ti tree... It
was one of those "I was just tagging a new release, oh crap, should
have added that" moments. But I was in a race to leave work today to run
some errands. It'll be the first thing I add back later.

Regards,