[PATCH 1/1] Fix sprz319 erratum 2.1

cc'ing beagleboard list, Kevin

Hi Richard,

thanks for the patch. A quick question for you or anyone else who is
experiencing this problem.

There is an erratum in DM3730 which results in the
EHCI USB PLL (DPLL5) not updating sufficiently frequently; this
leads to USB PHY clock drift and once the clock has drifted far
enough, the PHY's ULPI interface stops responding and USB
drops out. This is manifested on a Beagle xM by having the attached
SMSC9514 report 'Cannot enable port 2. Maybe the USB cable is bad?'
or similar.

The fix is to carefully adjust your DPLL5 settings so as to
keep the PHY clock as close as possible to 120MHz over the long
term; TI SPRZ319e gives a table of such settings and this patch
applies that table to systems with a 13MHz or a 26MHz clock,
thus fixing the issue (insofar as it can be fixed) on Beagle xM
and Overo Firestorm.

Signed-off-by: Richard Watts <rrw@kynesim.co.uk>

Funny, I just had a conversation with Kevin Hilman about needing to put
the DPLL rate rounding code back in for the OPP tables. This looks like
another reason why...

Could you, or anyone else who is experiencing this problem on a board with
a 26MHz oscillator try a quick test for me? I'm a little curious about
the (M, N) of (443, 12) that SPRZ319E is recommending. Could you see if
your USB problems are also solved by using (480, 13)?

...

Here's the rationale. Walking through the estimates here based on
SPRZ319E, (443, 12) results in a frequency error of -166,667 Hz at the VCO
output. This is 174 ppm below the desired target rate of 960MHz. But
(480, 13) results in no frequency error.
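
For anyone who wants to check these numbers, here is a minimal Python sketch of the arithmetic. It assumes the VCO rate is sys_clk * M / D, where D is the second number of each pair as written above, and a 26MHz sys_clk. The function names are made up for illustration and are not kernel or TI identifiers.

```python
from fractions import Fraction

SYS_CLK_HZ = 26_000_000      # 26MHz oscillator (assumed, as on the boards discussed)
TARGET_VCO_HZ = 960_000_000  # desired DPLL5 VCO rate

def vco_rate_hz(m, d, ref_hz=SYS_CLK_HZ):
    """VCO rate for multiplier m and divider d (assumed model: ref * m / d)."""
    return Fraction(ref_hz) * m / d

def error_ppm(m, d):
    """Signed frequency error relative to the 960MHz target, in ppm."""
    return float((vco_rate_hz(m, d) - TARGET_VCO_HZ) / TARGET_VCO_HZ * 1_000_000)

for m, d in [(443, 12), (480, 13)]:
    err_hz = float(vco_rate_hz(m, d) - TARGET_VCO_HZ)
    print(f"({m}, {d}): error {err_hz:+.0f} Hz, {error_ppm(m, d):+.1f} ppm")
```

Using exact fractions avoids any floating-point rounding until the final print, so a zero error really shows up as zero.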

The downside of using (480, 13) is that the PLL update interval increases
from 461 ns to 500 ns (about +8%). But if the long-term drift of the DPLL is
downward, then starting at a -174 ppm error has already removed 35% of the
total margin (+/- 500 ppm). This might be too naïve, but a downward drift
seems likely given that gate delay increases as temperature increases.
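
The interval and margin figures can be reproduced the same way. This is only a sketch: it assumes the update interval is the period of the divided reference clock (sys_clk / D, with a 26MHz sys_clk) and uses the +/- 500 ppm margin quoted above. The helper names are hypothetical.

```python
SYS_CLK_HZ = 26_000_000  # 26MHz oscillator (assumed)

def update_interval_ns(d, ref_hz=SYS_CLK_HZ):
    """Period of the divided reference clock, assumed to set the PLL update rate."""
    return 1e9 * d / ref_hz

def margin_used(error_ppm, margin_ppm=500.0):
    """Fraction of the one-sided drift margin consumed by a static offset."""
    return abs(error_ppm) / margin_ppm

t12 = update_interval_ns(12)
t13 = update_interval_ns(13)
print(f"{t12:.1f} ns -> {t13:.1f} ns ({(t13 / t12 - 1) * 100:+.1f}%)")
print(f"margin used by a -174 ppm offset: {margin_used(-174):.0%}")
```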

Mainline is currently using (120, 13) which results in a VCO output
frequency of 240MHz -- this presumably results in increased phase noise
compared to (443, 12) and (480, 13) per SPRZ319E.

- Paul

Hi,


Robert N provided me with a kernel that has the patch in it. It seems to work on the test unit I have. I am going to do more extensive tests next.

Without the patch, the board I have essentially never lasts beyond a few hours with the camera grabbing frames and Ethernet activity going on. The CPU is constantly at 90+ percent. Many times it would hit the USB reset problem in just a few minutes.

With the patch it has been running overnight so far.

If the extensive tests fail, I will update the group again.

Thanks to Robert and Richard for the patch!

cheers,
CS

Hi

We are facing this kind of issue on several Beagle xM and Panda ES boards. We are currently setting up a test case, which is:

Heavy network traffic:

A ftp server on the LAN serving one file to all boards.
All boards download the file, remove it, redownload it, …
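
The network half of this test could be driven by something like the following Python sketch. It is an illustration only, not the script we actually run: the host name, file names and helper names are placeholders, and it assumes the FTP server allows anonymous login.

```python
import ftplib
import os

def fetch_via_ftp(host, remote_name, local_path):
    """Download one file over anonymous FTP (host and names are placeholders)."""
    with ftplib.FTP(host) as ftp:
        ftp.login()
        with open(local_path, "wb") as out:
            ftp.retrbinary(f"RETR {remote_name}", out.write)

def download_loop(fetch, local_path, iterations):
    """Fetch a file and delete it, repeatedly; return the number of good rounds."""
    done = 0
    for _ in range(iterations):
        try:
            fetch(local_path)
        except ftplib.all_errors:
            # Includes OSError, e.g. the link going away when USB drops out.
            break
        os.remove(local_path)
        done += 1
    return done
```

A board would then call, for example, download_loop(lambda p: fetch_via_ftp("ftp.example.lan", "testfile.bin", p), "/tmp/testfile.bin", 100000) and log the returned count together with the uptime.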

Heavy usb traffic:

We have two USB-serial adapters wired together at 115200 baud with RTS/CTS enabled, each serving a file to the other simultaneously.

We’ll run the tests for a few days and I’ll give the results here.

OCS: do you have results to provide for your tests please?

Hi

The tests have been running for 6 days now and we have had only one failing BBxM.

There are five boards:

2 panda: high network + usb traffic
3 bbxm: high network + usb traffic

The only error we got was on a bbxm after 40 hours or so, see log below:

[81288.126525] usb 1-2.4: new full-speed USB device number 8 using ehci-omap
[81288.252624] usb 1-2.4: New USB device found, idVendor=0403, idProduct=6001
[81288.252655] usb 1-2.4: New USB device strings: Mfr=1, Product=2, SerialNumber=3
[81288.252685] usb 1-2.4: Product: FT232R USB UART
[81288.252716] usb 1-2.4: Manufacturer: FTDI
[81288.252746] usb 1-2.4: SerialNumber: A401180G
[81288.264221] ftdi_sio 1-2.4:1.0: FTDI USB Serial Device converter detected
[81288.264587] usb 1-2.4: Detected FT232RL
[81288.264617] usb 1-2.4: Number of endpoints 2
[81288.264648] usb 1-2.4: Endpoint 1 MaxPacketSize 64
[81288.264678] usb 1-2.4: Endpoint 2 MaxPacketSize 64
[81288.264709] usb 1-2.4: Setting MaxPacketSize 64
[81288.269409] usb 1-2.4: FTDI USB Serial Device converter now attached to ttyUSB0
[244500.186431] hub 1-0:1.0: port 2 disabled by hub (EMI?), re-enabling...
[244500.193756] usb 1-2: USB disconnect, device number 2
[244500.193756] usb 1-2.1: USB disconnect, device number 3
[244500.197296] usb 1-2: clear tt 4 (9081) error -19
[244500.206756] usb 1-2: clear tt 4 (9081) error -19
[244500.216766] smsc95xx 1-2.1:1.0: eth0: unregister 'smsc95xx' usb-ehci-omap.0-2.1, smsc95xx USB 2.0 Ethernet
[244500.272369] ftdi_sio ttyUSB0: ftdi_set_termios urb failed to set baudrate
[244500.279937] ftdi_sio ttyUSB0: urb failed to clear flow control
[244501.531372] usb 1-2.4: USB disconnect, device number 8
[244501.532318] ftdi_sio ttyUSB0: FTDI USB Serial Device converter now disconnected from ttyUSB0
[244501.532409] ftdi_sio 1-2.4:1.0: device disconnected
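
For long soak tests it may help to pull these events out of the kernel log automatically. Below is a minimal Python sketch that assumes the bracketed number at the start of each line is seconds since boot; the regular expression and function name are illustrative, written against the message text in the log above.

```python
import re

# Matches lines such as:
# "[244500.186431] hub 1-0:1.0: port 2 disabled by hub (EMI?), re-enabling..."
EMI_RE = re.compile(
    r"^\[\s*(\d+\.\d+)\]\s+hub (\S+): port (\d+) disabled by hub \(EMI\?\)"
)

def find_emi_events(lines):
    """Return (uptime_hours, hub, port) for each hub-disabled-port message."""
    events = []
    for line in lines:
        m = EMI_RE.match(line)
        if m:
            events.append((float(m.group(1)) / 3600.0, m.group(2), int(m.group(3))))
    return events

sample = [
    "[81288.269409] usb 1-2.4: FTDI USB Serial Device converter now attached to ttyUSB0",
    "[244500.186431] hub 1-0:1.0: port 2 disabled by hub (EMI?), re-enabling...",
]
for hours, hub, port in find_emi_events(sample):
    print(f"hub {hub} port {port} dropped at {hours:.1f} h uptime")
```

Feeding it the output of dmesg, split into lines, should list every drop along with the board uptime at that moment.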

This board has 2 XBee devices talking to each other: one on a USB serial adapter and the other mapped to a native UART of the BBxM.

This is not the first time I’ve seen this EMI error, but it is the first time I’ve seen the “usb 1-2: clear tt 4 (9081) error -19” part.

Hi Adrien

Adrien responded to me in private E-mail saying that this was without any
changes at all. Hmm...

- Paul

Hi,

Sorry about the private answer; I thought that replying by email would publish the content of the email here.

Nothing new has come up during the tests. It has been 14 days now and everything seems to run smoothly. I am starting to suspect that temperature is an issue. The next step would be to put the tested boards in some kind of enclosure and run them again for a couple of weeks.