beaglebone phy strapping error on A3

Bas_Laarhoven1 · April 23, 2012, 1:00pm

While tracking a race in the kernel that prevents the networking code from initializing properly for some configurations, I ran into the following design error on the A3 board:

Reading the PHY configuration showed transceiver mode 3 being configured at boot time, whereas the default strapping is for mode 7. How can that be? The mode bit-2 strapping is connected to pin 14 instead of pin 15 of the PHY, probably due to confusing signal naming!

This gives an unintended pull-up on the CRS signal and causes mode bit-2 to read low instead of high.

As the kernel is transmitting properly, but not receiving incoming packets (disturbance on CRS?), I decided to see if fixing this error would solve anything. After patching the board (disconnect straps from pin 14, reconnect to pin 15), the PHY mode is set properly. But alas, it doesn't seem to solve the problem I'm having.

-- Bas

Gerald_Coley1 · April 23, 2012, 4:03pm

This has been brought up before. The strapping works fine. There are also
pullups on the processor itself.

Gerald

Andrew_Bradford · April 23, 2012, 4:27pm

> Tracking a race in the kernel that prevents the networking code to
> initialize properly for some configurations
>
> <snip>
>
> As the kernel is transmitting properly, but not receiving incoming
> packets (disturbance on CRS?)

Are you able to provide more info on recreating this problem?
Either with the new or old strapping?

Are you running a PREEMPT kernel?
What kernel version? Whose patch set?
Are you running Debian or Ubuntu? Or Angstrom? Or...?
Are you using netplug (on Debian / Ubuntu) or do you have an 'auto
eth0' line in /etc/network/interfaces?

More info please

-Andrew

Bas_Laarhoven1 · April 23, 2012, 6:13pm

Gerald,

I have only seen a posting about an issue with PHYAD1, R215 & R216. This is a different issue.
What pull-ups on the processor are you referring to? The bit is read as a ‘0’ instead of a ‘1’, so I don’t think there is a pull-up. This makes the default configuration for the PHY 100 Mbps Full Duplex with auto-negotiation disabled. It’s probably working with the bootloader because U-Boot reconfigures the PHY, so it’s no show-stopper right now, and as long as nobody tries to ‘sysboot’ via the NIC it might never be.
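To make the strap arithmetic concrete, here is a minimal sketch: the intended strap value is mode 7 (binary 111), and with bit 2 read as 0 the PHY lands in mode 3 (binary 011). The description for mode 3 comes from this thread; the description for mode 7 is an assumption based on the usual MODE[2:0] table of SMSC LAN87xx-class PHYs and should be checked against the datasheet of the PHY actually fitted. The function name `strapped_mode` is made up for this example.

```python
# Sketch of the mode strap arithmetic, not driver code.
# MODE_NAMES[7] is an assumption (typical LAN87xx-class table);
# MODE_NAMES[3] matches the behaviour described in this thread.
MODE_NAMES = {
    0b011: "100 Mbps full duplex, auto-negotiation disabled",
    0b111: "all capable, auto-negotiation enabled (assumed)",
}


def strapped_mode(bit2, bit1, bit0):
    """Combine the three strap bits into the 3-bit mode value."""
    return (bit2 << 2) | (bit1 << 1) | bit0


intended = strapped_mode(1, 1, 1)  # all three straps read high
actual = strapped_mode(0, 1, 1)    # bit 2 read low, as observed on the A3

print(intended, MODE_NAMES.get(intended, "unknown"))
print(actual, MODE_NAMES.get(actual, "unknown"))
```

Clearing only bit 2 of mode 7 gives mode 3, which is consistent with what was read back from the PHY.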

What worried me a little was the unintended pull-up on the CRS line, especially since that pin was documented to be an output with a weak pull-down…(?)

But as I said, fixing the error didn’t solve my problem. So it’s probably a non-issue right now.

Cheers,
Bas

Bas_Laarhoven1 · April 23, 2012, 7:40pm

Hi Andrew,

My previous email wasn't a (software) problem report, but since you're asking for one, here it comes.

I've seen erratic network behaviour since I started with the BeagleBone at the end of last year. My configuration might be different from most because I have the entire configuration and filesystem on an NFS server. There is only an initial uEnv.txt on the SD card that allows me to bootstrap the real uEnv.txt from the NFS server. All development is done on the NFS server; this way I never have to swap or change the content of the SD card. But the network connection is critical, and if the kernel doesn't initialize it correctly there is no chance to fix it manually!

The first time, unreliable networking was fixed by replacing an old 10/100 Mbit hub with a modern switch.
A couple of weeks later, I found that the newer U-Boot/MLO code wouldn't boot. So I kept the old (2011.09) loaders on the SD card and had reliable behaviour.
Only last week I decided to try to build a more responsive kernel. Both the standard kernel (CONFIG_PREEMPT_NONE) and one with CONFIG_PREEMPT_VOLUNTARY worked like before, but the kernel built with CONFIG_PREEMPT did not.
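Since the three preemption options are what separate the working kernels from the failing one, it can help to confirm which one a given build actually uses. A minimal sketch, assuming you have the text of the kernel's `.config` at hand; the helper name `preemption_model` and the sample config text are made up for illustration:

```python
# Report which preemption model a kernel .config selects.
# The symbol names are the standard Kconfig ones mentioned above;
# the sample text below is hypothetical, not taken from a real build.
PREEMPT_SYMBOLS = (
    "CONFIG_PREEMPT_NONE",
    "CONFIG_PREEMPT_VOLUNTARY",
    "CONFIG_PREEMPT",
)


def preemption_model(config_text):
    """Return the preemption symbol that is set to 'y', or None."""
    for line in config_text.splitlines():
        line = line.strip()
        for symbol in PREEMPT_SYMBOLS:
            # Exact match, so CONFIG_PREEMPT_COUNT=y and similar are ignored.
            if line == symbol + "=y":
                return symbol
    return None


sample = """\
# CONFIG_PREEMPT_NONE is not set
# CONFIG_PREEMPT_VOLUNTARY is not set
CONFIG_PREEMPT=y
CONFIG_HZ=1000
"""

print(preemption_model(sample))
```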

The kernel loops sending DHCP requests while trying to establish the NFS connection, until something like the following trace shows up:

[ 76.156863] ------------[ cut here ]------------
[ 76.161759] WARNING: at kernel/softirq.c:159 local_bh_enable+0x33/0x98()
[ 76.168824] Modules linked in:
[ 76.172094] [<c000fa03>] (unwind_backtrace+0x1/0x8a) from [<c002b18b>] (warn_slowpath_common)
[ 76.182022] [<c002b18b>] (warn_slowpath_common+0x33/0x48) from [<c002b1af>] (warn_slowpath_n)
[ 76.192126] [<c002b1af>] (warn_slowpath_null+0xf/0x10) from [<c002f12b>] (local_bh_enable+0x)
[ 76.201782] [<c002f12b>] (local_bh_enable+0x33/0x98) from [<c01e125d>] (neigh_lookup+0xdd/0x)
[ 76.210987] [<c01e125d>] (neigh_lookup+0xdd/0xe2) from [<c020969b>] (arp_process+0x2f1/0x3ae)
[ 76.220006] [<c020969b>] (arp_process+0x2f1/0x3ae) from [<c01d941f>] (__netif_receive_skb+0x)
[ 76.229852] [<c01d941f>] (__netif_receive_skb+0x253/0x2a2) from [<c01896bb>] (cpsw_rx_handle)
[ 76.239871] [<c01896bb>] (cpsw_rx_handler+0x27/0xbc) from [<c0187069>] (__cpdma_chan_free+0x)
[ 76.249518] [<c0187069>] (__cpdma_chan_free+0x85/0x90) from [<c01870eb>] (__cpdma_chan_proce)
[ 76.259620] [<c01870eb>] (__cpdma_chan_process+0x77/0x7a) from [<c01871d1>] (cpdma_chan_stop)
[ 76.269635] [<c01871d1>] (cpdma_chan_stop+0xb9/0x168) from [<c01872cd>] (cpdma_ctlr_stop+0x4)
[ 76.279190] [<c01872cd>] (cpdma_ctlr_stop+0x4d/0x9c) from [<c0188915>] (cpsw_ndo_stop+0x3d/0)
[ 76.288563] [<c0188915>] (cpsw_ndo_stop+0x3d/0x12c) from [<c01daadb>] (__dev_close_many+0x5b)
[ 76.298026] [<c01daadb>] (__dev_close_many+0x5b/0x80) from [<c01dab17>] (__dev_close+0x17/0x)
[ 76.307214] [<c01dab17>] (__dev_close+0x17/0x20) from [<c01dc387>] (__dev_change_flags+0x6f/)
[ 76.316592] [<c01dc387>] (__dev_change_flags+0x6f/0xe8) from [<c01dc455>] (dev_change_flags+)
[ 76.326342] [<c01dc455>] (dev_change_flags+0xd/0x2c) from [<c03a916d>] (ic_close_devs+0x1d/0)
[ 76.335624] [<c03a916d>] (ic_close_devs+0x1d/0x34) from [<c03a9aff>] (ip_auto_config+0x771/0)
[ 76.344998] [<c03a9aff>] (ip_auto_config+0x771/0xad2) from [<c00086c9>] (do_one_initcall+0x6)
[ 76.354561] [<c00086c9>] (do_one_initcall+0x65/0xf0) from [<c03925bf>] (kernel_init+0x7f/0xf)
[ 76.363668] [<c03925bf>] (kernel_init+0x7f/0xf4) from [<c000ce29>] (kernel_thread_exit+0x1/0)
[ 76.372844] ---[ end trace e1b6d0d737eac75f ]---
[ 76.378678] IP-Config: Retrying forever (NFS root)...
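The backtrace above is easier to read as a plain call chain. The following sketch pulls the first function name out of each frame line; the regular expression and the helper name `call_chain` are my own, and the sample is just four lines copied from the trace above (including the truncated tails, which are left as they are).

```python
import re

# Matches the first "[<address>] (function+0x..." pair on a backtrace line.
FRAME = re.compile(r"\[<[0-9a-f]+>\] \((\w+)\+0x")


def call_chain(log_text):
    """Return function names in trace order (innermost frame first)."""
    chain = []
    for line in log_text.splitlines():
        match = FRAME.search(line)
        if match:
            chain.append(match.group(1))
    return chain


sample = """\
[ 76.201782] [<c002f12b>] (local_bh_enable+0x33/0x98) from [<c01e125d>] (neigh_lookup+0xdd/0x)
[ 76.210987] [<c01e125d>] (neigh_lookup+0xdd/0xe2) from [<c020969b>] (arp_process+0x2f1/0x3ae)
[ 76.220006] [<c020969b>] (arp_process+0x2f1/0x3ae) from [<c01d941f>] (__netif_receive_skb+0x)
[ 76.229852] [<c01d941f>] (__netif_receive_skb+0x253/0x2a2) from [<c01896bb>] (cpsw_rx_handle)
"""

print(" <- ".join(call_chain(sample)))
```

Run over the full trace, this shows the warning in local_bh_enable being reached from the receive path (cpsw_rx_handler) while the device is being closed from ip_auto_config.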

Then the kernel keeps sending DHCP requests. It also gets a proper DHCP offer from the server. But it's not clear how far the bits get upstream. That's why I was investigating the PHY configuration (and found the strapping problem).
Summarized: packet transmission seems to be working, but reception is failing. The kernel is up and running, but mounting the root FS via NFS fails.

The kernel is Angstrom, v 3.2.0+ extracted from the OE distribution from around January this year. The filesystem is from Arago 2011.09 am335x-evm.
Cross compiling is done with the Angstrom toolchain from the same OE distribution.

So, any ideas?

Cheers,
-- Bas

Gerald_Coley1 · April 23, 2012, 7:43pm

None whatsoever. I will see if we can get one of the SW guys looking into
this.

Gerald

Andrew_Bradford · April 23, 2012, 8:10pm

Logs of IRC discussion about PREEMPT the other day [1] led me to disable
PREEMPT for my Bone. I was having issues with Ethernet and stack
traces that look awfully similar to yours (removed for brevity) but I'm
not NFS booting (do have an NFS mount, though).

[1]: http://www.beagleboard.org/irclogs/index.php?date=2012-04-20
around 13:33 UTC time, I'm 'bradfa' and there's a few other names in
there that should be recognizable

My issues were at shutdown, the kernel would oops [2], or at boot time
only when using netplug on Debian I wouldn't consistently get a DHCP
address even though I have wireshark logs showing the server offering
multiple times (the Bone was just ignoring them). When changing from
one network to another, the problem was worse, rebooting onto the same
network wasn't so bad.

[2]: https://gist.github.com/2428758

If I changed my network setup to have an 'auto eth0'
in /etc/network/interfaces and disable netplug, I could consistently
get a DHCP address but it took much longer (like 10 seconds just to get
the DHCP address versus 10 seconds for my normal full boot). I
determined that the PREEMPT kernel was the root cause and so I've
disabled it in my builds now. I have HZ set to 1000 and I get "good
enough" latency for my needs at this point and I have a fairly stable
system.

-Andrew

Bas_Laarhoven1 · April 23, 2012, 9:15pm

>> Only last week I decided to try to build a more responsive kernel. Both
>> the standard kernel (CONFIG_PREEMPT_NONE) and one with
>> CONFIG_PREEMPT_VOLUNTARY worked like before, but the kernel built with
>> CONFIG_PREEMPT did not.
>
> Logs of IRC discussion about PREEMPT the other day [1] led me to disable
> PREEMPT for my Bone. I was having issues with Ethernet and stack
> traces that look awfully similar to yours (removed for brevity) but I'm
> not NFS booting (do have an NFS mount, though).

The problem is not NFS specific: it's not getting past the DHCP request (in the kernel code). This happens just after the bootloader has loaded the entire kernel via NFS.

I'm a bit further now and it's kinda weird what's happening. The trace I've put in the rx_handler shows packets coming in as expected for the (working) non-PREEMPT kernel.
The PREEMPT kernel, however, does not show any rx_handler calls until the code times out and the driver is being disabled. Then suddenly a stream of (queued???) calls to the rx_handler is made. But that is only after the cpsw device has been shut down (or is being shut down). It looks like some kind of inversion is happening. This doesn't look like a simple (hardware) initialization problem. I thought the preemption code was stable by now; obviously not for ARM!?

[ 1.875807] net eth0: initializing cpsw version 1.12 (0)
[ 1.881451] CPSW soft_reset called
[ 1.887331] CPSW soft_reset called
[ 1.891325]
[ 1.891331] CPSW phy found on slave 0: id is : 0x7c0f1
[ 1.898987] CPSW soft_reset called
[ 1.902568] PHY 0:01 not found
[ 1.905763] net eth0: phy 0:01 not found on slave 1
[ 1.911360] net eth0: submitted 64 rx descriptors
[ 4.890122] PHY: 0:00 - Link is Up - 100/Full
[ 4.929511] Sending DHCP requests ...... timed out!
[ 77.390716] net eth0: shutting down cpsw device
[ 77.395496] CPSW rx_handler receiving 346 bytes
[ 77.400257] CPSW rx_handler receiving 64 bytes
[ 77.404905] CPSW rx_handler receiving 346 bytes
[ 77.409644] CPSW rx_handler receiving 346 bytes
[ 77.414383] CPSW rx_handler receiving 94 bytes
[ 77.419030] CPSW rx_handler receiving 306 bytes
[ 77.423771] CPSW rx_handler receiving 312 bytes
[ 77.428509] CPSW rx_handler receiving 350 bytes
[ 77.433248] CPSW rx_handler receiving 358 bytes
[ 77.437988] CPSW rx_handler receiving 306 bytes
[ 77.442726] CPSW rx_handler receiving 312 bytes
[ 77.447465] CPSW rx_handler receiving 350 bytes
[ 77.452203] CPSW rx_handler receiving 358 bytes
[ 77.456942] CPSW rx_handler receiving 64 bytes
[ 77.461592] ------------[ cut here ]------------
[ 77.466437] WARNING: at kernel/softirq.c:159 local_bh_enable+0x33/0x98()
[ 77.473436] Modules linked in:
[ 77.476676] [<c000fa03>] (unwind_backtrace+0x1/0x8a) from [<c002b123>] (warn_slowpath_common)
[ 77.486509] [<c002b123>] (warn_slowpath_common+0x33/0x48) from [<c002b147>] (warn_slowpath_n)
[ 77.496520] [<c002b147>] (warn_slowpath_null+0xf/0x10) from [<c002f0c3>] (local_bh_enable+0x)
[ 77.506092] [<c002f0c3>] (local_bh_enable+0x33/0x98) from [<c023d511>] (neigh_lookup+0xdd/0x)
[ 77.515209] [<c023d511>] (neigh_lookup+0xdd/0xe2) from [<c0267a07>] (arp_process+0x2f1/0x3ae)
[ 77.524133] [<c0267a07>] (arp_process+0x2f1/0x3ae) from [<c023568b>] (__netif_receive_skb+0x)
[ 77.533878] [<c023568b>] (__netif_receive_skb+0x253/0x2a2) from [<c01afe2d>] (cpsw_rx_handle)
[ 77.543799] [<c01afe2d>] (cpsw_rx_handler+0x31/0xcc) from [<c01ad7d1>] (__cpdma_chan_free+0x)
[ 77.553356] [<c01ad7d1>] (__cpdma_chan_free+0x85/0x90) from [<c01ad853>] (__cpdma_chan_proce)
[ 77.563364] [<c01ad853>] (__cpdma_chan_process+0x77/0x7a) from [<c01ad939>] (cpdma_chan_stop)
[ 77.573281] [<c01ad939>] (cpdma_chan_stop+0xb9/0x168) from [<c01ada35>] (cpdma_ctlr_stop+0x4)
[ 77.582745] [<c01ada35>] (cpdma_ctlr_stop+0x4d/0x9c) from [<c01af07d>] (cpsw_ndo_stop+0x3d/0)
[ 77.592029] [<c01af07d>] (cpsw_ndo_stop+0x3d/0x12c) from [<c0236d6f>] (__dev_close_many+0x5b)
[ 77.601403] [<c0236d6f>] (__dev_close_many+0x5b/0x80) from [<c0236dab>] (__dev_close+0x17/0x)
[ 77.610506] [<c0236dab>] (__dev_close+0x17/0x20) from [<c023863b>] (__dev_change_flags+0x6f/)
[ 77.619791] [<c023863b>] (__dev_change_flags+0x6f/0xe8) from [<c0238709>] (dev_change_flags+)
[ 77.629452] [<c0238709>] (dev_change_flags+0xd/0x2c) from [<c045221d>] (ic_close_devs+0x1d/0)
[ 77.638649] [<c045221d>] (ic_close_devs+0x1d/0x34) from [<c0452baf>] (ip_auto_config+0x771/0)
[ 77.647934] [<c0452baf>] (ip_auto_config+0x771/0xad2) from [<c00086c9>] (do_one_initcall+0x6)
[ 77.657410] [<c00086c9>] (do_one_initcall+0x65/0xf0) from [<c043a5bf>] (kernel_init+0x7f/0xf)
[ 77.666433] [<c043a5bf>] (kernel_init+0x7f/0xf4) from [<c000ce29>] (kernel_thread_exit+0x1/0)
[ 77.675523] ---[ end trace 01b799af1856c3c8 ]---
[ 77.680414] CPSW rx_handler receiving 64 bytes
[ 77.685074] CPSW rx_handler receiving 346 bytes
[ 77.689817] CPSW rx_handler receiving 346 bytes
[ 77.694555] CPSW rx_handler receiving 64 bytes
[ 77.699206] CPSW rx_handler receiving 64 bytes
[ 77.703854] CPSW rx_handler receiving 346 bytes
[ 77.708593] CPSW rx_handler receiving 161 bytes
[ 77.713333] CPSW rx_handler receiving 181 bytes
[ 77.718070] CPSW rx_handler receiving 228 bytes
[ 77.722809] CPSW rx_handler receiving 248 bytes
[ 77.727547] CPSW rx_handler receiving 129 bytes
[ 77.732292] CPSW rx_handler receiving 149 bytes
[ 77.737032] CPSW rx_handler receiving 129 bytes
[ 77.741770] CPSW rx_handler receiving 149 bytes
[ 77.746507] CPSW rx_handler receiving 137 bytes
[ 77.751246] CPSW rx_handler receiving 157 bytes
[ 77.755983] CPSW rx_handler receiving 228 bytes
[ 77.760722] CPSW rx_handler receiving 248 bytes
[ 77.765460] CPSW rx_handler receiving 95 bytes
[ 77.770108] CPSW rx_handler receiving 115 bytes
[ 77.774845] CPSW rx_handler receiving 74 bytes
[ 77.779491] CPSW rx_handler receiving 137 bytes
[ 77.784230] CPSW rx_handler receiving 157 bytes
[ 77.788972] CPSW rx_handler receiving 292 bytes
[ 77.793710] CPSW rx_handler receiving 312 bytes
[ 77.798448] CPSW rx_handler receiving 64 bytes
[ 77.803095] CPSW rx_handler receiving 95 bytes
[ 77.807743] CPSW rx_handler receiving 115 bytes
[ 77.812482] CPSW rx_handler receiving 64 bytes
[ 77.817134] CPSW rx_handler receiving 292 bytes
[ 77.821873] CPSW rx_handler receiving 312 bytes
[ 77.826610] CPSW rx_handler receiving 64 bytes
[ 77.831257] CPSW rx_handler receiving 306 bytes
[ 77.835996] CPSW rx_handler receiving 312 bytes
[ 77.840735] CPSW rx_handler receiving 350 bytes
[ 77.845486] CPSW rx_handler receiving 358 bytes
[ 77.850226] CPSW rx_handler receiving 306 bytes
[ 77.854964] CPSW rx_handler receiving 312 bytes
[ 77.859703] CPSW rx_handler receiving 350 bytes
[ 77.864442] CPSW rx_handler receiving 358 bytes
[ 77.869181] CPSW rx_handler receiving 64 bytes
[ 77.873832] CPSW rx_handler receiving 179 bytes
[ 77.878571] CPSW rx_handler receiving 94 bytes
[ 77.883220] CPSW rx_handler receiving 0 bytes
[ 77.888281] IP-Config: Retrying forever (NFS root)...
[ 77.893644] net eth0: initializing cpsw version 1.12 (0)
[ 77.899193] CPSW soft_reset called
[ 77.905088] CPSW soft_reset called
[ 77.909018]
[ 77.909023] CPSW phy found on slave 0: id is : 0x7c0f1
[ 77.916683] CPSW soft_reset called
[ 77.920264] PHY 0:01 not found
[ 77.923459] net eth0: phy 0:01 not found on slave 1
[ 77.928916] net eth0: submitted 64 rx descriptors
[ 79.900185] PHY: 0:00 - Link is Up - 100/Full
[ 79.929507] Sending DHCP requests ....
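To check a captured log for this pattern without scrolling, one can count the rx_handler trace lines before and after the shutdown message. A minimal sketch: the "CPSW rx_handler receiving" text is the debug trace added for this test (not a stock kernel message), the helper name `rx_before_and_after_shutdown` is made up, and the sample is a short excerpt of the log above.

```python
import re

# Splits "[ 77.390716] message" into timestamp and message.
STAMPED = re.compile(r"\[\s*(\d+\.\d+)\]\s*(.*)")


def rx_before_and_after_shutdown(log_text):
    """Count rx_handler trace lines before and after the cpsw shutdown line."""
    before = 0
    after = 0
    shutdown_seen = False
    for line in log_text.splitlines():
        match = STAMPED.match(line.strip())
        if not match:
            continue
        message = match.group(2)
        if "shutting down cpsw device" in message:
            shutdown_seen = True
        elif "rx_handler receiving" in message:
            if shutdown_seen:
                after += 1
            else:
                before += 1
    return before, after


sample = """\
[ 4.929511] Sending DHCP requests ...... timed out!
[ 77.390716] net eth0: shutting down cpsw device
[ 77.395496] CPSW rx_handler receiving 346 bytes
[ 77.400257] CPSW rx_handler receiving 64 bytes
"""

print(rx_before_and_after_shutdown(sample))
```

On the failing PREEMPT boot the first count stays at zero; on a working kernel you would expect rx_handler lines before any shutdown.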

> [1]: http://www.beagleboard.org/irclogs/index.php?date=2012-04-20
> around 13:33 UTC time, I'm 'bradfa' and there's a few other names in
> there that should be recognizable
>
> My issues were at shutdown, the kernel would oops [2], or at boot time
> only when using netplug on Debian I wouldn't consistently get a DHCP
> address even though I have wireshark logs showing the server offering
> multiple times (the Bone was just ignoring them). When changing from
> one network to another, the problem was worse, rebooting onto the same
> network wasn't so bad.

The issues you're describing here look exactly the same as the ones I'm having. But instead of the consistent failure I'm seeing, your network is operational depending on how you start it. Is that correct? That brings me back to my original idea of a race during initialization. But at least I can reproduce it very consistently now!

> [2]: https://gist.github.com/2428758 (cpsw Ethernet crash on shutdown with PREEMPT)

> If I changed my network setup to have an 'auto eth0'
> in /etc/network/interfaces and disable netplug, I could consistently
> get a DHCP address but it took much longer (like 10 seconds just to get
> the DHCP address versus 10 seconds for my normal full boot). I
> determined that the PREEMPT kernel was the root cause and so I've
> disabled it in my builds now. I have HZ set to 1000 and I get "good
> enough" latency for my needs at this point and I have a fairly stable
> system.

Are you using PREEMPT_NONE or PREEMPT_VOLUNTARY? The latter does not give me the boot problems, although I haven't tested it much further.

Cheers,
-- Bas

Bas_Laarhoven1 · April 23, 2012, 9:31pm

Thinking aloud:
IIRC I've seen this (or a similar) problem in the past with the non-PREEMPT kernel. Can somebody confirm this? That would mean that it is _not_ a PREEMPT specific issue, but a race that is just more likely to happen with the PREEMPT kernel.

Agreed?

-- Bas

Chendra_Handoko · April 23, 2012, 10:08pm

Does anyone know what the differences are between the beagle XM rev. C and rev. C1 boards?

Thank you

RobertCNelson · April 23, 2012, 10:20pm

For one, they are two different boards..

(it looks like the C1 was a pre-production C2 according to:
http://elinux.org/BeagleBoard#Availability )
http://beagleboard.org/hardware/design

Regards,

Gerald_Coley1 · April 23, 2012, 10:24pm

You can read the Systems Reference Manual and it will tell you.

http://beagleboard.org/static/BBxMSRM_latest.pdf

Gerald

Gerald_Coley1 · April 23, 2012, 11:27pm

We changed the pads on the expansion header on Rev C1. No electrical
changes of any kind.

Gerald

Andrew_Bradford · April 24, 2012, 11:18am

> The PREEMPT kernel however, does not show any rx_handler calls until the
> code times out and the driver is being disabled. Then suddenly a stream
> of (queued???) calls on the rx_handler is made. But that is only after
> the cpsw device has been shut down (or is being shut down). It looks
> like some kind of inversion is happening. This doesn't look like a
> simple (hardware) initialization problem. I thought the preemption code
> was stable by now, obviously not for the ARM !???

PREEMPT is not stable on AM335x. Not sure about other OMAP devices.
Koen may know. I would expect it is stable on most ARM devices, afaik
ARM was one of the first architectures to get PREEMPT.

> The issues you're describing here look exactly the same as I'm having.
> But instead of the consistent failure I'm seeing your network is
> operational depending on how you start it. Is that correct? That brings
> me back to my original idea of a race during initialization. But at
> least I can reproduce it very consistently now!

Yes, that is correct. I can have operational networking with PREEMPT,
so long as I start it the slower and less flexible way (no netplugd) in
Debian. But the tradeoff of using the slower way is that on
Ethernet cable removal and reinsertion, the BeagleBone won't perform
DHCP_REQUEST or DHCP_DISCOVER again (this is not a bug). With netplugd,
it will, which makes moving from one network to another easy. I don't
know if there are other issues involving Ethernet and PREEMPT, I
haven't tested enough. Occasionally I can get a PREEMPT kernel and
netplugd to play nice and have working Ethernet on boot, but it's not
consistent.

Yours and my issues look very similar to me as well. I don't have
much time this week, but next week I may have more. In that time, I'd
like to debug this further. If you work on this more, I'd appreciate
it if you could please keep me CC'ed if you take it to another list.

> Are you using PREEMPT_NONE or PREEMPT_VOLUNTARY ? The latter does not
> give me the boot problems, although I haven't tested it much further.

I have run PREEMPT and PREEMPT_NONE on my BeagleBone. I've not compiled
many (1 maybe?) kernels with PREEMPT_VOLUNTARY. Sorry. Maybe next week
I'll try that more.

Sadly, I don't think the linux-omap git tree yet has AM335x code fully
integrated [1]. I've been using the Arago am335x repo v3.2-staging
branch [2] along with Greg KH's 3.2.y stable branch [3] patched on top
for my development. I'm running Debian 6 Squeeze armel.

[1]: kernel/git/tmlind/linux-omap.git - OMAP development tree
[2]: arago-project.org/git/projects/?p=linux-am33x.git
[3]: kernel/git/stable/linux.git - Linux kernel stable tree

-Andrew

Bas_Laarhoven1 · April 25, 2012, 7:38pm

Here are some more details: I've narrowed down what's happening when the DHCP code times out, but I'm not familiar enough with the code to see what's going wrong.
When the DHCP request code times out, cpsw_ndo_stop is called and a sequence of tx_handler and rx_handler calls is made.

cpsw_ndo_stop()
   cpdma_ctlr_stop()
     cpdma_chan_stop()
       __cpdma_chan_process()
         ==> varying series of tx_handler calls
     cpdma_chan_stop()
       __cpdma_chan_process()
         ==> varying series of rx_handler calls

A difference between the PREEMPT (failing) kernel and the PREEMPT_VOLUNTARY (working) kernel is whether cpsw_interrupt() gets called: in the working kernel it is, while in the PREEMPT kernel cpsw_interrupt() is _never_ called. So it looks like proper initialization is failing in the PREEMPT kernel!?
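The observation above (no interrupt, yet a burst of handler calls at shutdown) can be illustrated with a toy model. This is only a sketch of the hypothesis, not the real cpdma/cpsw code: `ToyChannel` and `run` are invented names, and the model simply assumes that completed packets sit in a queue until either an interrupt or the stop path processes them.

```python
from collections import deque


class ToyChannel:
    """Toy stand-in for a DMA channel; NOT the real cpdma implementation."""

    def __init__(self, irq_delivered):
        self.irq_delivered = irq_delivered  # does the interrupt ever fire?
        self.completed = deque()            # packets waiting for a handler
        self.handled = []                   # (packet, context) pairs

    def hardware_receives(self, packet):
        self.completed.append(packet)
        if self.irq_delivered:
            self.process("interrupt")

    def process(self, context):
        # Drain everything that has completed so far.
        while self.completed:
            self.handled.append((self.completed.popleft(), context))

    def stop(self):
        # Stopping the channel also drains whatever is still queued.
        self.process("stop")


def run(irq_delivered):
    channel = ToyChannel(irq_delivered)
    for packet in ("dhcp-offer-1", "arp", "dhcp-offer-2"):
        channel.hardware_receives(packet)
    channel.stop()
    return [context for _, context in channel.handled]


print(run(True))   # packets handled from the interrupt path
print(run(False))  # packets handled only when the channel is stopped
```

With the interrupt never delivered, every packet is handled from the stop path, which matches the burst of rx_handler calls seen after "shutting down cpsw device". Whether the real driver behaves this way for the same reason is still an open question here.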

--- Bas

Chendra_Handoko · April 25, 2012, 11:09pm

Does anyone know what the max spi fifo size in beagle XM is and how to set the spi fifo to max?

I found it in omap3530spi.h, but I get an “XFER Timeout” error if I change the SPI FIFO to 64:

#define OMAP3530_SPI_FIFOLEN 16 /* Half of the available FIFO for transmit/receive */

And how can I eliminate the delay between two SPI byte transfers?

Thank you

Mark_Lazarewicz · April 26, 2012, 9:53pm

The comment indicates the max is 32; why set it to 64?

— On Wed, 4/25/12, Chendra, Handoko Handoko.Chendra@spansion.com wrote:

> From: Chendra, Handoko Handoko.Chendra@spansion.com
> Subject: [beagleboard] beagle xm spi fifo in QNX
> To: “beagleboard@googlegroups.com” beagleboard@googlegroups.com
> Date: Wednesday, April 25, 2012, 6:09 PM
>
> Does anyone know what the max spi fifo size in beagle XM is and how to set the spi fifo to max?
>
> I found it in the omap3530spi.h but I got this error “XFER Timeout” if I changed the spi fifo to 64
>
> #define OMAP3530_SPI_FIFOLEN 16 /* Half of the available FIFO for transmit/receive */
>
> And how to eliminate the delay between two spi bytes transfer?
>
> Thank you

Koen_Kooi · May 2, 2012, 11:58am

Can you apply these 2 on top of the angstrom kernel and retry?

http://dominion.thruhere.net/koen/angstrom/beaglebone/preempt/

regards,

Koen

Bas_Laarhoven1 · May 2, 2012, 7:35pm

Patched, tried, verified, and tested again. No luck! Behaviour looks the same.

The patch seems to fix the highest interrupts not being detected and/or serviced, but the original code works with normal scheduling and fails in preemptive mode. So it's hard to imagine these patches fixing our problem. I've captured a complete bootlog and put it at <http://pastebin.com/aXAVDDRZ>. There are (and have always been) some suspicious messages there, so one of these may ring a bell with someone.

-- Bas

Bas_Laarhoven1 · May 2, 2012, 8:38pm

More info: Tried second BeagleBone. First one was (patched) A3, second is A5.
Shows same behaviour. Traced down oops to irqs_disabled() being true in kernel/softirq.c.

-- Bas