BBB frozen, how to reset?

Once in a blue moon one of my BeagleBones gets into a state where it has power (the power LED is lit) but has not booted. Normally this would be fine: just hit the power button to reset. But in this weird state the power button does nothing, and neither does the reset button.
I checked the power and reset button pins on the header: power was low and reset was high.
The only way to get the board out of this state was to pull the 5V power.
I’m using a KL16 on a cape as a watchdog for the BB. It reboots the board via the power and/or reset button pins on the header if the BB stops sending check-ins over UART. This has been working great, except for the rare case where the board ends up in this state where the power and reset buttons are not functioning.
Any ideas how the BB could get into this state, and is there any way to force a reboot other than physically pulling the 5V power?
Thanks,
JR

Press and hold the reset button, then remove power. A few seconds later, reapply power.

I would start with your cape design and try and rule that out first.

The reset is an input pin read by the processor, not actually a HW power reset. If the SW is locked up, this could happen.

If you hold the power button for 8 seconds or more the board should power cycle.

When it is in this state, what do the voltages read?

Gerald
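
For the external watchdog described in the first post, the escalation could be modeled roughly as below. This is only a sketch of the decision logic in Python, not KL16 firmware. The 8 second power-button hold comes from the suggestion above; every other timing value, the action names, and the final "cut_5v" stage (which would need a hardware power switch that the cape in this thread does not have) are assumptions for illustration.

```python
from dataclasses import dataclass

# Assumed timings, except the 8 s power-button hold suggested in this thread.
CHECKIN_TIMEOUT_S = 30.0   # assumed: UART silence tolerated before acting
RECOVERY_WAIT_S = 60.0     # assumed: time allowed for the board to boot again

# (action name, how long to assert the line, in seconds), tried in order.
ACTIONS = (
    ("pulse_reset", 0.1),  # assumed short pulse on the reset pin
    ("hold_power", 8.0),   # long press on the power button pin
    ("cut_5v", 5.0),       # hypothetical: needs an external 5V switch
)


@dataclass
class Watchdog:
    stage: int = 0                       # index of the next action to try
    deadline: float = CHECKIN_TIMEOUT_S  # time at which we act next

    def checkin(self, now: float) -> None:
        """Call whenever a check-in arrives from the board."""
        self.stage = 0
        self.deadline = now + CHECKIN_TIMEOUT_S

    def poll(self, now: float):
        """Return (action, seconds) if something should be done now, else None."""
        if now < self.deadline:
            return None
        # Stay on the last (strongest) action once the list is exhausted.
        action = ACTIONS[min(self.stage, len(ACTIONS) - 1)]
        self.stage += 1
        self.deadline = now + RECOVERY_WAIT_S
        return action
```

The point of the escalation is that each stage is only tried after the previous one has had time to work, and any check-in resets the sequence to the gentlest action.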

For what it’s worth Gerald, this happens with nothing connected to the board as well. This just happened to me last night after issuing a reboot command from the command line.

I remember at some point you all were talking about something about the “ramp time” of the PMIC or something.

Is this on power up or is this state happening some time later? If it is on power up, then the power supply most likely is the issue based on the ramp requirements of the PMIC.

If the power LED is on, then the PMIC is on and ramped up. That is why I asked for the voltages.

It also could be a boot pin read issue where it misreads the boot pins. If that is the case you should see that from the serial port.

Gerald

Gerald, it’s like the board hangs at power down, but I cannot be 100% sure. The reason I “assume” it’s at power down is that the heartbeat blink stops, but the rest of the LEDs stay on, and the ethernet port light still blinks.

The board I experienced this on last night is an Element14 RevC, but I do also have a circuitco A5A that exhibits the same thing.

Hmm, not sure what is going on. Sounds like the processor has stopped running the code and halted but it forgot to turn off the lights.

Gerald

By the way, last night I initially pressed the boot button just to confirm what I’ve already “known”, which is that once the board enters this state, you need to remove power for a few seconds. Anyway, the USR LEDs all flash on, like at normal power up, but then that is it. Nothing else.

I’m assuming this is a software “issue”, but honestly, I really do not know. One thing I do know for sure is that this has happened since . . . forever. That is to say, I seem to remember this happening since around May 2013, when we got our first boards. The only difference I notice is that when using reboot instead of shutdown now -r, this problem seems to rear its head less often, but it is not completely suppressed.

One more thing of note. I do not run systemd - Ever. I run SYSV as an init daemon. I only mention this as I think Robert said something about systemd lessening this issue.

Sounds to me like the BBB has gone into sleep mode and there is no trigger to wake it up. Is there a way to measure the current consumption?

Regards,
John

If the board was in sleep, then why won’t the reset button reset it? Beyond that, why would the USR LEDs cycle (flash on, then off) and then nothing?

From what Gerald said previously in this thread:

“The reset is an input pin read by the processor, not actually a HW power reset. If the SW is locked up, this could happen.”

Regards,
John

If the software is locked up, the USR LEDs would not cycle as if the system is attempting to restart.

Also, at POR, it would help to understand at which point the USR LEDs( all 4 at once ) come on, then go off again.

I’m assuming this is not done in uboot, but I really do not know.

Gerald, I do not have this setup yet, but perhaps in the future may have the means. Is this something that might be easily checkable via JTAG ? I’ve never used JTAG before, and do not have the header in place, but do have a JTAG emulator.

One thing that has been stopping me from seriously considering this as a debugging option is that I do not know if there is an open source tool for it (gcc, as in the GNU Compiler Collection, not the compiler itself). Beyond that, it’s all new to me, and probably a steep learning curve initially.

I didn’t test the 8 second holddown of the power button but I doubt it would help, and unfortunately it’s not a reproducible issue. I’ll have to wait for it to happen again.

The zero volts on power was very weird. From the KL16 I’m “toggling” my own effective power button that is a transistor between the power pin on the header and ground. The KL16 pin was not driven high (I checked), so I don’t think it was the transistor on the cape that was pulling pwr to ground on the BBB. And the physical button wasn’t pressed in. It was as if the pullup at the PMIC wasn’t active, yet the power LED was on. Is that possible?

Wish I hadn’t pulled the 5V power to reset, then I could do more testing.

I know what you mean, i.e. this happens so erratically that it’s hard to tell when it’ll happen next. But I could possibly whip up a script as a means to automate resetting the system. Really, you could probably do the same as well. Just put “sudo reboot” in a bash script, and run it through rc.d

With that said, I’m not 100% sure this is good for the board.

And, of course, in order to stop it, the sdcard would need to be put into another linux system to remove that file heh !
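
A slightly safer variant of that reboot loop might look like the sketch below, written in Python rather than bash. It counts boots in a file, stops after a maximum, and also stops if a "stop" file exists, which gives a way out without moving the sdcard to another machine. The function name, the file paths you would pass in, and the delay are all hypothetical; only the `reboot` command itself comes from the discussion above.

```python
import os
import subprocess
import time


def run_once(count_path, stop_path, max_reboots, settle_s=60.0, reboot=None):
    """Run once per boot. Returns True if a reboot was requested.

    count_path and stop_path are hypothetical file locations chosen by
    the caller; settle_s leaves a window to log in and create the stop file.
    """
    if reboot is None:
        # Default action: call the system reboot command (needs root).
        reboot = lambda: subprocess.run(["reboot"], check=False)

    # Escape hatch: an existing stop file ends the test immediately.
    if os.path.exists(stop_path):
        return False

    # Read the previous boot count, treating a missing or bad file as zero.
    try:
        with open(count_path) as f:
            count = int(f.read().strip() or 0)
    except (FileNotFoundError, ValueError):
        count = 0

    count += 1
    with open(count_path, "w") as f:
        f.write(str(count))

    if count > max_reboots:
        return False

    time.sleep(settle_s)
    reboot()
    return True
```

Calling `run_once(...)` from an init script at the end of boot would cycle the board until it either hangs, hits the limit, or finds the stop file. The boot count left in the file then shows how many cycles it took to reach the bad state.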

In my case linux is not booted at this time(none of the 4 user leds lit), so a script would not help. This is why I’m doing an external watchdog circuit.

Exactly. So here is what I mean. The USR LEDs cycle on for me if and only if I press the power button on the board. After that, nothing changes. Otherwise the LEDs are off, well the power LED is on, and the ethernet port lights are on too, and potentially blinking.

The script would just be to reboot the board repeatedly in an attempt to put it back into the bad state, for troubleshooting.