RE: [beagleboard] BBB intermittently rebooting.

As far as I remember how the Vbus issue was described in detail (I hope I'm not mistaken):
Vbus connects to both the CPU and the PMIC. The CPU has a charge-pump circuit to detect OTG devices (it injects some power and tracks the USB signals for incoming events), and it periodically injects 5V into the Vbus line. Sometimes the PMIC detects this 5V on its Vbus input as a good voltage and turns its internal power switch to Vbus, where in fact there is no 5V with sufficient current. A power failure and an immediate reboot follow.

William/Wulfman: So far, running on USB Power module (no USB communications) the BBB is staying up. It has been 18 hours or so now. Running the same software that rebooted twice in 24 hours on +5V barrel connector input. I have seen the ‘bad’ configuration run as long as 36 hours between reboots, so the test needs to continue for at least two days.

Thanks to Maxim for the explanation. If that is the problem, it looks like the problem existed in 2013, a software patch was applied, and the software patch was lost between kernel 3 and kernel 4.

— Graham

usb0_vbus on the processor is an analog input to a comparator.
Page 80 of the processor reference manual says: USB0_VBUS (7), Supply voltage
for USB VBUS comparator input.
I do not see it saying anywhere in the PDF that it supplies any voltage to
the circuit.
I can see the software causing the issue if there is some kind of noise
on the line.
Maybe, instead of a hard ground, which would cause headaches if one were to
plug a USB power connector into the BBB,
a 10k pulldown on the line would keep it low.
One would need to test this idea out.

Gerald might have a comment?

Glad to hear that so far everything seems good, Graham. So far. However, I have a sneaking suspicion it will continue to run until you decide to shut it off. I'm not 100% sure, though, hence the need for someone like you to test, in hopes of narrowing down this problem.

William:

OK. I plan to let it run for at least two days, perhaps three.

— Graham

For what it is worth . . .

debian@beaglebone:~$ uname -a
Linux beaglebone 4.1.2-ti-r3 #1 SMP PREEMPT Tue Jul 14 06:54:47 UTC 2015 armv7l GNU/Linux
debian@beaglebone:~$ cat /etc/dogtag
BeagleBoard.org Debian Image 2015-03-01
debian@beaglebone:~$ uptime
19:46:56 up 6 days, 5:48, 1 user, load average: 0.00, 0.01, 0.05

Powered via USB, and running ever since I installed that kernel

Since we now have two threads about the same problem…
https://groups.google.com/forum/#!topic/beagleboard/lF1X1XINjDo

My 13 BB-Black under test are powered from external +5V with these power supplies:
http://www.deutronic.com/products/power-supplies/ac-adapter/esc15g-15-watt.html
Each BB-B rebooted about 3 times per day on average.

After reading this thread, I simply connected a USB cable from the front-side Type-A port to the back-side Mini-USB port, and since then the number of reboots has decreased drastically: within the last 12h I saw just two in total.
Other systems under my control but not located in my lab are showing the same improvement.

— Günter (dl4mea)
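
For anyone who wants to count reboots the same way across several boards, here is a minimal sketch. It assumes you periodically sample the first field of /proc/uptime (seconds since boot) and keep the readings in order; the function names are made up for illustration and are not from any tool mentioned in this thread. A reboot is inferred whenever a reading is lower than the one before it.

```python
def read_uptime_seconds(path="/proc/uptime"):
    """Return the first field of /proc/uptime (seconds since boot) as a float."""
    with open(path) as f:
        return float(f.read().split()[0])


def count_reboots(uptime_samples):
    """Count reboots in a chronological list of uptime readings (seconds).

    Uptime only goes backwards when the board has restarted, so every
    drop between two consecutive readings is counted as one reboot.
    """
    reboots = 0
    for previous, current in zip(uptime_samples, uptime_samples[1:]):
        if current < previous:
            reboots += 1
    return reboots


if __name__ == "__main__":
    # Illustrative readings only, not measurements from any board in this thread.
    samples = [100.0, 160.0, 20.0, 80.0, 15.0]
    print(count_reboots(samples))
```

Note that this can undercount: two reboots between consecutive samples look like one, and a reboot is missed if the new reading happens to be higher than the previous one. A short sampling interval keeps that unlikely.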

I found this in my message archive:

I can confirm that the pulsing detected by PMIC on USB_DC signal is the probing from USB-OTG.

After I disabled the USB-OTG in the kernel, the system has never rebooted. Btw, I also re-loaded the Angstrom image (3.8 kernel) and Andrew’s Android image (with 3.8 kernel). I did not observe USB-OTG probing pulses on the VBus. I believe that in the 3.8 kernel, USB-OTG has not been implemented/enabled. That might be the reason why the 3.8 kernel doesn’t seem to have the random reboot behavior.

In case anyone wants to test it out, here is the change in the source code (NOTE: ignore the line and column numbers; just search for the struct “static struct omap_musb_board_data musb_board_data” ):

--- a/arch/arm/mach-omap2/board-am335xevm.c
+++ b/arch/arm/mach-omap2/board-am335xevm.c
@@ -3956,7 +4125,8 @@ static struct omap_musb_board_data musb_board_data = {
  * mode[4:7] = USB1PORT's mode
  * AM335X beta EVM has USB0 in OTG mode and USB1 in host mode.
  */
-       .mode = (MUSB_HOST << 4) | MUSB_OTG,
+//     .mode = (MUSB_HOST << 4) | MUSB_OTG,
+       .mode = (MUSB_HOST << 4) | MUSB_PERIPHERAL,
        .power = 500,
        .instances = 1,
 };

Thanks Maxim,

This shouldn't be an issue, as we define what they are..

&usb0 {
     status = "okay";
     dr_mode = "peripheral";
};

&usb1 {
     status = "okay";
     dr_mode = "host";
};

but just in case we have a regression somewhere else, I'll push out a
test with OTG disabled. (It's enabled in 3.14.x and we need it for the
x15.)

Regards,

diff --git a/patches/defconfig b/patches/defconfig
index 79ae693..7731b6a 100644
--- a/patches/defconfig
+++ b/patches/defconfig
@@ -4070,7 +4070,7 @@ CONFIG_USB_ANNOUNCE_NEW_DEVICES=y

Robert:

I can also confirm the behavior of rebooting on +5V Barrel power, and no reboots when on USB power.
Same hardware, same OS/Software, running on +5V barrel input reboots about three times per day,
randomly. Same-same on USB power (iPhone 1A charger, no USB activity) has been running without
reboot for almost two days, now.

I can also confirm that this reboot problem does NOT exist in 3.14. I have Debian 8.1/kernel 3.14
that have been running for multiple weeks on +5V barrel power, without reboot.

There have been some significant recent changes to USB power that allow tablets to both push
power out of a connector to power thumbdrives, etc, as well as take power in the SAME connector
for battery charging. It is a kludge of a system if I have ever seen one. You might want to see if
any of this new USB power code has affected things.

— Graham

BBB1 – Debian Image 2015-07-12
uname -a
Linux BBB1 4.1.2-ti-r4 #1 SMP PREEMPT Thu Jul 16 20:48:37 UTC 2015 armv7l GNU/Linux

By the way, my colleagues and I spent some time investigating this issue and found in the TPS65217 datasheet that the three power sources (Vac, Vusb and Vbat) should not all be enabled at the same time; only two of them should be. Because the BBB is a universal enthusiast’s board, all three power sources are left floating, which contradicts the TPS65217 architecture. We figured that since most users use only the barrel connector or USB, the Vbat input should be grounded. I don’t know whether this fixes the random reboot issue, but at least it complies with the TPS65217 architecture. Perhaps somebody could try grounding the Vbat input and see how the board behaves.

I was reviewing the TPS65217 datasheet and I think this may be the area of concern (from section 9.3.9.1):

The linear charger periodically applies a 10-mA current source to the BAT pin to check for the presence of a battery. This will cause the BAT terminal to float up to > 3 V which may interfere with AC removal detection and the ability to switch from AC to USB input. For this reason, it is not recommended to use both AC and USB inputs when the battery is absent.

Since the battery is absent for most BBB users, TI recommends using AC or USB power, but not both.

Furthermore, the TPS65217 has internal sinks on the AC and USB inputs, so it should not be necessary to short either input to ground when it is not used as long as the input sinks have not been forced off by the software.

9.3.9.4 AC and USB Input Discharge

AC and USB inputs have 90-µA internal current sinks which are used to discharge the input pins to avoid false detection of an input source. The AC sink is enabled when USB is a valid supply and VAC is below the detection threshold. Likewise, the USB sink is enabled when AC is a valid supply and VUSB is below the detection limit. Both current sinks can be forced OFF by setting the [ACSINK, USBSINK] bits to 11b. Both bits are located in register 0x01 (PPATH).

NOTE: [ACSINK, USBSINK] = 01b and 10b combinations are not recommended as these may lead to unexpected enabling and disabling of the current sinks.

Perhaps someone can check if these are being set to something other than 00b.

Dennis Cote
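
To make that check concrete, here is a small sketch that decodes the two sink bits from a raw PPATH (register 0x01) value. The bit positions (ACSINK in bit 7, USBSINK in bit 6) are my assumption and should be verified against the register map in the datasheet. How you obtain the raw byte is up to you; on a BBB something like `i2cget -y -f 0 0x24 0x01` may work, but the bus number and the 0x24 address are also assumptions here.

```python
PPATH_REG = 0x01  # register address, per the datasheet excerpt quoted above

# ASSUMED bit positions; check them against the TPS65217 register map.
ACSINK_BIT = 7
USBSINK_BIT = 6


def decode_sink_bits(ppath_value):
    """Decode [ACSINK, USBSINK] from a raw PPATH byte and classify the setting."""
    acsink = (ppath_value >> ACSINK_BIT) & 1
    usbsink = (ppath_value >> USBSINK_BIT) & 1
    if (acsink, usbsink) == (0, 0):
        status = "sinks not forced off"
    elif (acsink, usbsink) == (1, 1):
        status = "both sinks forced off"
    else:
        status = "not recommended combination"
    return {"acsink": acsink, "usbsink": usbsink, "status": status}


if __name__ == "__main__":
    # 0xC0 is an illustrative value, not a reading from a real board.
    print(decode_sink_bits(0xC0))
```

Anything other than "sinks not forced off" would be worth reporting back to the thread.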

You can disable Vac or Vusb in software, but not the battery source; that is why there is so much trouble.

What about adding a 1k resistor from the VBat+ input to ground, just for safety?

That should not be necessary. The battery detection logic is complicated, but seems quite robust. Again from the datasheet:

9.3.13 Battery Detection and Recharge

Whenever the battery voltage falls below VRCH, IBAT(DET) is pulled from the battery for a duration tDET to determine if the battery has been removed. If the voltage on the BAT pin remains above VLOWV, it indicates that the battery is still connected. If the charger is enabled (CH_EN = 1), a new battery charging cycle begins.

If the BAT pin voltage falls below VLOWV in the battery detection test, it indicates that the battery has been removed. The device then checks for battery insertion: it turns on the charging path and sources IPRECHG out of the BAT pin for duration tDET. If the voltage does not rise above VRCH, it indicates that a battery has been inserted, and a new charge cycle can begin. If, however, the voltage does rise above VRCH, it is possible that a fully charged battery has been inserted. To check for this, IBAT(DET) is pulled from the battery for tDET and if the voltage falls below VLOWV, no battery is present. The battery detection cycle continues until the device detects a battery or the charger is disabled.

When the battery is removed from the system the charger will also flag a BATTEMP error indicating that the TS input is not connected to a thermistor.

The one problem I see with the BBB design is that the test currents are applied to the BAT terminals and the battery voltage is measured on the BAT_SENSE terminal. Normally these would be connected at the battery, but they are not connected at all on the BBB. Perhaps connecting TP5 and TP6 with a jumper wire will help.

Dennis Cote
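
Since the detection sequence in 9.3.13 is easy to misread, here is a rough sketch of one pass through it as plain decision logic. This is only my reading of the quoted text, not TI's implementation. The function name and the three voltage arguments are hypothetical, the thresholds are passed in rather than taken from the datasheet, and treating a voltage exactly at a threshold as "not below" is an arbitrary choice.

```python
def battery_present(v_after_sink, v_after_source, v_after_second_sink,
                    v_lowv, v_rch):
    """Return True if one pass of the detection cycle concludes a battery is present.

    v_after_sink:        BAT voltage after IBAT(DET) is pulled for tDET
    v_after_source:      BAT voltage after IPRECHG is sourced for tDET
    v_after_second_sink: BAT voltage after IBAT(DET) is pulled a second time
    v_lowv, v_rch:       the VLOWV and VRCH thresholds (caller-supplied)
    """
    # Step 1: the voltage did not fall below VLOWV, so the battery is still there.
    if v_after_sink >= v_lowv:
        return True
    # Step 2: battery looks removed; the charger sources current to test for insertion.
    # If the voltage does not rise above VRCH, a battery has been inserted.
    if v_after_source <= v_rch:
        return True
    # Step 3: the voltage rose above VRCH, so this is either a fully charged
    # battery or an open pin. Sink again: falling below VLOWV means no battery.
    return v_after_second_sink >= v_lowv


if __name__ == "__main__":
    # Illustrative thresholds and voltages only; not datasheet values.
    print(battery_present(0.1, 4.3, 0.2, v_lowv=2.9, v_rch=4.1))
```

With no battery fitted, the expected path is step 1 failing, step 2 rising above VRCH, and step 3 falling again, which repeats until the charger is disabled. All of this assumes the voltage being measured actually reflects the test current, which is the BAT versus BAT_SENSE concern raised above.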

wget http://rcn-ee.homeip.net:81/farm/testing/linux-image-4.1.2-ti-r4.6_1cross_armhf.deb
sudo dpkg -i linux-image-4.1.2-ti-r4.6_1cross_armhf.deb
sudo reboot

It improves the situation, but there are still some reboots.
After 12h of operation, 5 of my 14 BBBs are still up since the reboot; the others have seen reboots, and one in particular has had 4 of them.