Need some SPI help

I've torn my hair out over this BeagleBone problem for a week so now I
need to come to the list for help.

What I need to do:

read an SPI A/D converter at 1 megasample/second. This requires a 16Mhz
SPI clock. I need to read 512 byte sample.

Here's what I can do so far.

Read the converter at 16MHz for 128 bytes. At 256 bytes, the ioctl
never returns, the process is locked in the kernel and nothing but a
hard reset or power cycle will bring it back.

I think this may have something to do with DMA but TFRM has my head
spinning.

Thanks in advance for any help forthcoming.
John

If you don't want to try to read this line-wrapped crap below, it is in
a file here:

https://dl.dropbox.com/u/81715047/spi_help.txt

Here: my dmesg for a SUCCESSFUL run. I've added many debugging
statement to the driver code.

[ 102.622772] spidev spi1.0: spi_omap2-mcspi.c:
omap2_mcspi_setup_transfer(): dividor is 0, speed is 24000000
[ 102.622802] spidev spi1.0: spi_omap2-mcspi.c: after
mcspi_writ_chcon(): speed 24000000, sample trailing edge, clk normal
[ 102.622802] spidev spi1.0: spi_omap2-mcspi.c:
omap2_mcpi_setup_transfer returned 0
[ 102.622863] spidev spi1.0: spi.c: spi_setup: mode 1, 8 bits/w,
24000000 Hz max -- status 0
[ 102.622863] spidev spi1.0: spi mode 01
[ 102.622894] spidev spi1.0: spi_omap2-mcspi.c:
omap2_mcspi_setup_transfer(): dividor is 0, speed is 24000000
[ 102.622924] spidev spi1.0: spi_omap2-mcspi.c: after
mcspi_writ_chcon(): speed 24000000, sample trailing edge, clk normal
[ 102.622955] spidev spi1.0: spi_omap2-mcspi.c:
omap2_mcpi_setup_transfer returned 0
[ 102.622955] spidev spi1.0: spi.c: spi_setup: mode 1, 16 bits/w,
24000000 Hz max -- status 0
[ 102.622985] spidev spi1.0: 16 bits per word
[ 102.623016] spidev spi1.0: spi_omap2-mcspi.c:
omap2_mcspi_setup_transfer(): dividor is 1, speed is 16000000
[ 102.623016] spidev spi1.0: spi_omap2-mcspi.c: after
mcspi_writ_chcon(): speed 12000000, sample trailing edge, clk normal
[ 102.623046] spidev spi1.0: spi_omap2-mcspi.c:
omap2_mcpi_setup_transfer returned 0
[ 102.623077] spidev spi1.0: spi.c: spi_setup: mode 1, 16 bits/w,
16000000 Hz max -- status 0
[ 102.623077] spidev spi1.0: spidev.c: 16000000 Hz (max)
[ 102.625732] spidev spi1.0: spi_omap2-mcspi.c:
omap2_mcspi_setup_transfer(): dividor is 1, speed is 16000000
[ 102.625762] spidev spi1.0: spi_omap2-mcspi.c: after
mcspi_writ_chcon(): speed 12000000, sample trailing edge, clk normal
[ 102.625762] spidev spi1.0: spi_omap2-mcspi.c:
omap2_mcpi_setup_transfer returned 0
[ 102.625823] spidev spi1.0: spi.c: spi_setup: mode 1, 16 bits/w,
16000000 Hz max -- status 0
[ 102.625823] spidev spi1.0: spi mode 01
[ 102.625854] spidev spi1.0: spi_omap2-mcspi.c:
omap2_mcspi_setup_transfer(): dividor is 1, speed is 16000000
[ 102.625885] spidev spi1.0: spi_omap2-mcspi.c: after
mcspi_writ_chcon(): speed 12000000, sample trailing edge, clk normal
[ 102.625915] spidev spi1.0: spi_omap2-mcspi.c:
omap2_mcpi_setup_transfer returned 0
[ 102.625915] spidev spi1.0: spi.c: spi_setup: mode 1, 16 bits/w,
16000000 Hz max -- status 0
[ 102.625946] spidev spi1.0: 16 bits per word
[ 102.625976] spidev spi1.0: spi_omap2-mcspi.c:
omap2_mcspi_setup_transfer(): dividor is 1, speed is 16000000
[ 102.625976] spidev spi1.0: spi_omap2-mcspi.c: after
mcspi_writ_chcon(): speed 12000000, sample trailing edge, clk normal
[ 102.626007] spidev spi1.0: spi_omap2-mcspi.c:
omap2_mcpi_setup_transfer returned 0
[ 102.626037] spidev spi1.0: spi.c: spi_setup: mode 1, 16 bits/w,
16000000 Hz max -- status 0
[ 102.626037] spidev spi1.0: spidev.c: 16000000 Hz (max)
[ 102.626708] spidev spi1.0: spidev.c: _IOC_NR(cmd) = 0,
_IOC_NR(SPI_IOC_MESSAGE(0) = 0 , -IOC_DIR(cmd) = 1 _IOC_WRITE = 1
[ 102.626800] spidev spi1.0: spidev.c: tmp = 32 bytes, cmd = 1075866368
, n_ioc = 1 bytes
[ 102.626800] spidev spi1.0: spidev.c: calling spidev_message()
[ 102.626831] spidev spi1.0: spidev.c:spidev_msg(): xfer len 128 rx
16bits per word 0 usec delay 16000000Hz
[ 102.626861] spidev spi1.0: calling spidev_sync(), sizeof(msg) = 44
[ 102.626892] spidev spi1.0: spicev.c: spidev_sync(): return from
spi_async(), status is 0
[ 102.626953] spidev spi1.0: spi_omap2-mcspi.c:
omap2_mcspi_setup_transfer(): dividor is 1, speed is 16000000
[ 102.626983] spidev spi1.0: spi_omap2-mcspi.c: after
mcspi_writ_chcon(): speed 12000000, sample trailing edge, clk normal
[ 102.626983] spidev spi1.0: spi_omap2-mcspi.c: calling
mcspi_write_chconf0() in omap2_mcspi_force_cs
[ 102.627014] spidev spi1.0: spi_omap2-mcspi.c: calling
mcspi_write_chconf0() in omap2_mcspi_work()
[ 102.627105] spidev spi1.0: spi_omap2-mcspi.c:
omap2_mcspi_setup_transfer(): dividor is 1, speed is 16000000
[ 102.627136] spidev spi1.0: spi_omap2-mcspi.c: after
mcspi_writ_chcon(): speed 12000000, sample trailing edge, clk normal
[ 102.627136] spidev spi1.0: spi_omap2-mcspi.c: calling
mcspi_write_chconf0() in omap2_mcspi_force_cs
[ 102.627197] spidev spi1.0: spidev_sync() returned, status = 128
[ 102.627197] spidev spi1.0: spidev.c:spidev_msg(): starting to copy
received data to user space
[ 102.627227] spidev spi1.0: spidev.c:spidev_msg(): receive data copy
finished, status = 128
[ 102.627227] spidev spi1.0: spidev.c:spidev_msg(): returning from
function
[ 102.627258] spidev spi1.0: spidev.c: returned from spidev_message()

Here is the dmesg from a failed, 256 byte run. Everything is the same
until I get into spi_omap2-mcspie.c. This is just the last part of the
log where thing go south.

[ 429.267517] spidev spi1.0: spi.c: spi_setup: mode 1, 16 bits/w,
16000000 Hz max -- status 0
[ 429.267517] spidev spi1.0: spi mode 01
[ 429.267547] spidev spi1.0: spi_omap2-mcspi.c:
omap2_mcspi_setup_transfer(): dividor is 1, speed is 16000000
[ 429.267578] spidev spi1.0: spi_omap2-mcspi.c: after
mcspi_writ_chcon(): speed 12000000, sample trailing edge, clk normal
[ 429.267608] spidev spi1.0: spi_omap2-mcspi.c:
omap2_mcpi_setup_transfer returned 0
[ 429.267639] spidev spi1.0: spi.c: spi_setup: mode 1, 16 bits/w,
16000000 Hz max -- status 0
[ 429.267639] spidev spi1.0: 16 bits per word
[ 429.267669] spidev spi1.0: spi_omap2-mcspi.c:
omap2_mcspi_setup_transfer(): dividor is 1, speed is 16000000
[ 429.267700] spidev spi1.0: spi_omap2-mcspi.c: after
mcspi_writ_chcon(): speed 12000000, sample trailing edge, clk normal
[ 429.267700] spidev spi1.0: spi_omap2-mcspi.c:
omap2_mcpi_setup_transfer returned 0
[ 429.267730] spidev spi1.0: spi.c: spi_setup: mode 1, 16 bits/w,
16000000 Hz max -- status 0
[ 429.267730] spidev spi1.0: spidev.c: 16000000 Hz (max)
[ 429.268402] spidev spi1.0: spidev.c: _IOC_NR(cmd) = 0,
_IOC_NR(SPI_IOC_MESSAGE(0) = 0 , -IOC_DIR(cmd) = 1 _IOC_WRITE = 1
[ 429.268493] spidev spi1.0: spidev.c: tmp = 32 bytes, cmd = 1075866368
, n_ioc = 1 bytes
[ 429.268524] spidev spi1.0: spidev.c: calling spidev_message()
[ 429.268554] spidev spi1.0: spidev.c:spidev_msg(): xfer len 256 rx
16bits per word 0 usec delay 16000000Hz
[ 429.268554] spidev spi1.0: calling spidev_sync(), sizeof(msg) = 44
[ 429.268585] spidev spi1.0: spicev.c: spidev_sync(): return from
spi_async(), status is 0
[ 429.268646] spidev spi1.0: spi_omap2-mcspi.c:
omap2_mcspi_setup_transfer(): dividor is 1, speed is 16000000
[ 429.268676] spidev spi1.0: spi_omap2-mcspi.c: after
mcspi_writ_chcon(): speed 12000000, sample trailing edge, clk normal
[ 429.268707] spidev spi1.0: spi_omap2-mcspi.c: calling
mcspi_write_chconf0() in omap2_mcspi_force_cs
[ 429.268707] spidev spi1.0: spi_omap2-mcspi.c: calling
mcspi_write_chconf0() in omap2_mcspi_work()
[ 429.268737] spidev spi1.0: spi_omap2-mcspi.c: calling
mcspi_write_chconf0() in omap2_mcspi_set_dma_req

A few minutes later the following is added to the log.

[ 600.338867] INFO: task kworker/u:1:39 blocked for more than 120 seconds.
[ 600.345886] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs"
disables this message.
[ 600.354095] kworker/u:1 D c04cc854 0 39 2 0x00000000
[ 600.354095] Backtrace:
[ 600.354156] [<c04cc468>] (__schedule+0x0/0x43c) from [<c04cc9c0>]
(schedule+0x78/0x7c)
[ 600.354187] [<c04cc948>] (schedule+0x0/0x7c) from [<c04ccd00>]
(schedule_timeout+0x20/0x16c)
[ 600.354217] [<c04ccce0>] (schedule_timeout+0x0/0x16c) from
[<c04cc220>] (wait_for_common+0xc8/0x154)
[ 600.354217] r6:00000002 r5:cf8b8cdc r4:7fffffff
[ 600.354248] [<c04cc158>] (wait_for_common+0x0/0x154) from
[<c04cc2c4>] (wait_for_completion+0x18/0x1c)
[ 600.354278] [<c04cc2ac>] (wait_for_completion+0x0/0x1c) from
[<c0333cdc>] (omap2_mcspi_work+0x46c/0xa14)
[ 600.354309] [<c0333870>] (omap2_mcspi_work+0x0/0xa14) from
[<c004cdcc>] (process_one_work+0x1e0/0x334)
[ 600.354339] [<c004cbec>] (process_one_work+0x0/0x334) from
[<c004d748>] (worker_thread+0x1b4/0x2b8)
[ 600.354370] [<c004d594>] (worker_thread+0x0/0x2b8) from [<c0050f48>]
(kthread+0x90/0x98)
[ 600.354400] [<c0050eb8>] (kthread+0x0/0x98) from [<c003c228>]
(do_exit+0x0/0x624)
[ 600.354431] r6:c003c228 r5:c0050eb8 r4:cf83fed8
[ 600.354461] INFO: task adcspi:327 blocked for more than 120 seconds.
[ 600.361114] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs"
disables this message.
[ 600.369293] adcspi D c04cc854 0 327 313 0x00000000
[ 600.369293] Backtrace:
[ 600.369323] [<c04cc468>] (__schedule+0x0/0x43c) from [<c04cc9c0>]
(schedule+0x78/0x7c)
[ 600.369354] [<c04cc948>] (schedule+0x0/0x7c) from [<c04ccd00>]
(schedule_timeout+0x20/0x16c)
[ 600.369384] [<c04ccce0>] (schedule_timeout+0x0/0x16c) from
[<c04cc220>] (wait_for_common+0xc8/0x154)
[ 600.369384] r6:00000002 r5:cf83de3c r4:7fffffff
[ 600.369415] [<c04cc158>] (wait_for_common+0x0/0x154) from
[<c04cc2c4>] (wait_for_completion+0x18/0x1c)
[ 600.369445] [<c04cc2ac>] (wait_for_completion+0x0/0x1c) from
[<bf04e204>] (spidev_sync+0xa0/0xc8 [spidev])
[ 600.369476] [<bf04e164>] (spidev_sync+0x0/0xc8 [spidev]) from
[<bf04e9bc>] (spidev_ioctl+0x58c/0x6e4 [spidev])
[ 600.369506] r6:cfaecce4 r5:cfa4c400 r4:cf801280
[ 600.369537] [<bf04e430>] (spidev_ioctl+0x0/0x6e4 [spidev]) from
[<c00cbc54>] (do_vfs_ioctl+0x4b0/0x524)
[ 600.369567] [<c00cb7a4>] (do_vfs_ioctl+0x0/0x524) from [<c00cbd20>]
(sys_ioctl+0x58/0x7c)
[ 600.369567] r9:cf83c000 r8:00000000 r7:00000003 r6:40206b00 r5:00012260
[ 600.369598] r4:cfb223c0
[ 600.369628] [<c00cbcc8>] (sys_ioctl+0x0/0x7c) from [<c000d5c0>]
(ret_fast_syscall+0x0/0x30)
[ 600.369628] r8:c000d744 r7:00000036 r6:00008491 r5:00000000 r4:00000000

Finally, the test code that is producing this stuff.

#undef FFT_SIZE
#define FFT_SIZE 256

static const char *device = "/dev/spidev1.0";
static uint8_t mode = SPI_CPHA;
static uint8_t bits = 16;
static uint32_t speed = 16000000L; // 16 MHz clock for 1.0 megasample/sec
static uint16_t delay=0;
//array is also 512 (FFT_SIZE) bytes
uint16_t rx[FFT_SIZE] = {0, };

struct spi_ioc_transfer tr; // the IOCTL struct

        tr.tx_buf = NULL;
        tr.rx_buf = (unsigned long)rx;
        tr.len = ARRAY_SIZE(rx);
        tr.delay_usecs = delay;
        tr.speed_hz = speed;
        tr.bits_per_word = bits;

        // This call should return with 512 samples in tr.rx_buf
  ret = ioctl(fd, SPI_IOC_MESSAGE(1), &tr);
  if (ret < 0)
    pabort("can't send spi message, conversion failed ");

        exit(0);
}

What kernel are you running? 3.2.x will be next to useless in this use case.

#undef FFT_SIZE
#define FFT_SIZE 256

static const char *device = "/dev/spidev1.0";
static uint8_t mode = SPI_CPHA;
static uint8_t bits = 16;
static uint32_t speed = 16000000L; // 16 MHz clock for 1.0 megasample/sec
static uint16_t delay=0;
//array is also 512 (FFT_SIZE) bytes
uint16_t rx[FFT_SIZE] = {0, };

struct spi_ioc_transfer tr; // the IOCTL struct

         tr.tx_buf = NULL;
         tr.rx_buf = (unsigned long)rx;
         tr.len = ARRAY_SIZE(rx);

This should be length in bytes, is it, i'm not so sure...

#defineARRAY_SIZE <LXR linux/(a <LXR linux/) (sizeof(a <LXR linux/) / sizeof((a <LXR linux/)[0]))

looks like it would return 256 rather than 512.

It would. I hacked up the SPI test program that comes with Robert
Nelson's kernel. I started off testing with what I needed - 512 bytes
and then dropped the definition of FFT_SIZE until it worked. 128
worked; 256 didn't. I didn't change the comment because I thought I
could find the problem fairly rapidly and fix it. That was 2 weeks ago :slight_smile:

John

Robert Nelson's 3.2.33-psp26

I've become very intimate with that kernel :slight_smile: I've patched it to get
all three SPI channels working.

Two questions.

Out of curiosity (not doubting you) what is wrong with 3.2.x?

What would you recommend? I'd really like to stay with Robert's work
since it's so clean and easy to work with but I'll change to what is
necessary.

Thanks,
John

I have encountered something similar if not the same. I'm trying to
send 288 bytes with a SPI_IOC_MESSAGE ioctl call to spidev and when I
do that, the session hangs completely, no response from any signal and
its is impossible to kill the process. It appears like an overflow of
a fifo buffer or something, because after that's happened, if I try to
write anything out to spi with that system call no matter how small,
that process dies in the same state. This just bit me last night, so
today I'll try breaking it into multiple transfers in a single message
and see if that works.

I'm using 3.2.21, not sure what the previous comment on the 3.2.x
means... but I would really like a follow up on it :slight_smile:
Also, I'm running at 48MHz with 32 bit words.

-Jon

3.2.x has a buggy dma implementation. I would strongly recommend trying to get your head round using the 3.8 kernel on the beagleboard github page. It uses devicetree which is very different to what people are used to, but it's the future(tm) and you will need to understand it at some point.

The specific SPI transfer cutoff if I remember (this is off the top of my head) is 156 bytes. Try running a transfer with 154 and then try with 158, the latter will hang due to the bug with 16bit spi dma transfers.

I myself spent a number of weeks banging my head with this problem too, and then when the new (post 3.2.x) DMA patches came about I helped test to ensure that my use case was now working.

The initial way I got around the problem was to do multiple transfers of a smaller size.

x = 512

while (x=>156)
{
     do 156 transfer
     x -= 156
}

if x != 0
     do 'x' transfer

Hacky I know, but so is the story of an almost year old in-house vendor kernel.

Cheers,
Jack.

Thanks a lot for the info, unfortunately I’m stuck with 3.2.21, it’s the only kernel that jives with Xenomai and beagle bone and I literally don’t have the time to make a different one work. Hopefully I can make the hack work fast enough for now, but eventually I need transfer periods approaching once every 200 microseconds, hence the realtime patch.

Thanks again for sharing your experience, banging my head against a kernel bug is not something I have time for right now and I’m extremely lucky this came up today.

-Jon

3.2.x has a buggy dma implementation. I would strongly recommend trying
to get your head round using the 3.8 kernel on the beagleboard github
page. It uses devicetree which is very different to what people are used
to, but it's the future(tm) and you will need to understand it at some
point.

OK, I'll take a look. Anything has to be better than OpenEnbedded.

Just out of curiosity, have you looked at Robert Nelson's kernel? It's
just nice clean code with no framework or whatever overhead. He's up to
3.6 now but when I try to clone it from his github account, all I get
are two files, a readme and a patch file. That's another problem to be
dealt with

I myself spent a number of weeks banging my head with this problem too,
and then when the new (post 3.2.x) DMA patches came about I helped test
to ensure that my use case was now working.

It's amazing that we have to spend WEEKS on problems that shouldn't be
there or at least should be documented. The person who wrote the SPI
code seems allergic to comments.

The initial way I got around the problem was to do multiple transfers of
a smaller size.

x = 512

while (x=>156)
{
    do 156 transfer
    x -= 156
}

if x != 0
    do 'x' transfer

I looked at doing that but I'm sampling a signal that's primarily a sine
wave. I do an FFT and then run a peak finder algorithm over the
results. Breaking the sample into discontinuous parts introduces
glitches that can result in peaks higher than the signal I'm interested
in. I gotta figure out a way to do the 512 byte transfer as an atomic
operation.

Your 156 byte limit triggered a memory. In spi-omap2-mcspi.c at about
line 108 in Robert's code is the following.

/* use PIO for small transfers, avoiding DMA setup/teardown overhead and
* cache operations; better heuristics consider wordsize and bitrate.
*/
#define DMA_MIN_BYTES 160

So what your solution was doing was staying under that limit so that DMA
wasn't used. I'm going to change that to something larger and see what
happens.

Hacky I know, but so is the story of an almost year old in-house vendor
kernel.

Don't get me started on that...

I just changed that define to 520, recompiled and viola! We now do a
512 byte transfer with no problems. I'm not sure if this is going to be
my solution or not because there is a gap between each 16 bit transfer
but maybe I can fix that. Maybe a mutex around the critical code would
do the trick.

My test board with the ADC on it came in yesterday so I'll get it hooked
up and see what the FFT looks like with the data set this returns. It
may be fine as it is.

Gotta look at that new kernel too.

Thanks everybody.

John

Hate to bother you again but could you aim me at the 3.8 kernel? I'm
currently at

https://github.com/beagleboard/kernel/tree/3.8

but the latest they have for the Bone is 3.6.

Thanks
John

3.8 branch is for the bone... Your just over analyzing the branch name
based on older branch names...

Regards,

That’s a good solution, does anyone know if this means that that the processor is busy during the entire transfer out the port or just for the copy into the spi hardware? The TRM says that mcSPI has a FIFO buffer, is this taken advantage of without DMA?

-Jon

how do I get it? When I do a

$ git clone git://github.com/beagleboard/kernel.git

I get two files, kernel/README.md and patch.sh

What next?

Do you by any chance have your version of 3.8 ready yet?

Thanks
John

git checkout origin/3.8 -b tmp
./patch.sh

Regards,