Streamlining Data Transfer from STM32 to BBB to PC

Hello everyone,

I need to efficiently transfer data from an ADC with 8 channels. The data transfer speed requirement is 8 Mbps (Megabits per second). The data flow will be from an STM32 microcontroller to a BeagleBone Black (BBB) over SPI, and then from the BBB to a PC over UDP.

My current approach involves the STM32 acting as an SPI master, sending a data-ready signal to the BBB. The BBB is supposed to handle this signal as a GPIO interrupt and read the data accordingly. I’ve attempted to implement this using the @zmatt sys-gpio example, but unfortunately, it did not work as expected. I don’t know where I went wrong.

Is there any alternative method to achieve this data transfer scenario efficiently? It’s crucial that there is no data loss and minimal latency in receiving the data. Ultimately, my goal is to stream the data seamlessly from the STM32 to the BBB and then to the PC.

Any insights or suggestions would be greatly appreciated.

Aji Mathew

You might have to use an older image with a kernel like 4.9 or so, where sysfs gpio is on by default. Just a guess.

Whoever wrote that left comments, so you should be able to make sense of it and fix it.

Just go to the CLI and see if you can toggle a pin. If so, your board is functioning with sysfs and the issue would be the code itself, not the board. Make sure you also have the correct device tree loaded and that it maps to the code you are using.
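As a rough illustration of that check, here is a minimal Python sketch that toggles one pin through the sysfs gpio files. The GPIO number 60 in the usage note and the helper name `toggle_gpio` are only examples, not taken from the code being discussed; substitute whatever pin your device tree actually exposes.

```python
import time
from pathlib import Path


def toggle_gpio(gpio: int, count: int = 3, delay_s: float = 0.5,
                base: str = "/sys/class/gpio") -> None:
    """Drive a sysfs gpio high and low `count` times (hypothetical helper)."""
    pin = Path(base) / f"gpio{gpio}"
    if not pin.exists():
        # Ask the kernel to export the pin; with cape-universal the pins
        # are usually exported already, so this branch may never run.
        (Path(base) / "export").write_text(str(gpio))
    (pin / "direction").write_text("out")
    for _ in range(count):
        for level in ("1", "0"):
            (pin / "value").write_text(level)
            time.sleep(delay_s)
```

Calling `toggle_gpio(60)` as root and watching the pin with a multimeter or LED should show it going high and low three times. If that works, sysfs gpio itself is fine and the problem is more likely in the application code or the pin mapping.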

Do you mean you’re using the ADC that’s integrated in the microcontroller, or do you have a separate ADC? In the latter case, what’s the role of the microcontroller (as opposed to interfacing the ADC directly to the BBB)?

That doesn’t make it the SPI master. The SPI master is the side that provides the SPI clock (and chip select, if used) and therefore controls the entire transfer. If the STM32 were the SPI master then it wouldn’t send a data ready signal, it would just send the data.

Using a synchronous serial interface (either SPI or a digital audio interface) is probably indeed the most appropriate choice, but that still leaves a whole bunch of ways to approach this with different trade-offs. I’ll go into more detail once I have a clearer picture of what you’re trying to do exactly.

Nor can anyone help you figure that out if you don’t describe what actually happened. Using sysfs-gpio to trigger linux userspace code that performs an SPI transfer will be too slow to achieve the sample rate you want, but other than that it should work fine. The result will depend on how the STM32 code behaves if the BBB doesn’t read the samples fast enough: if it keeps sending data ready pulses at a high rate then many of those events will end up getting dropped.

That’s not kernel dependent, I’ve a beaglebone here running debian buster with kernel upgraded to 5.10-ti and sysfs gpio works just fine.

Are you saying there’s something broken about sysfs gpio in current images? Did rcn turn cape-universal off by default or something?

Still enabled for v5.10.x… It’s anything v6.1.x and later that is getting harder and harder, as kernel gpiolib developers have been on a mission ripping out every legacy function call every single kernel cycle…

Honestly, cape-universal after v6.1.x will need a full rewrite…

For a full rewrite, I’ve been looking to see what others are doing… RPI’s pinctrl looks interesting: utils/pinctrl at master · raspberrypi/utils · GitHub


gpio-of-helper has long been in need of a rewrite yeah, but that probably shouldn’t be too difficult. If this is starting to become a critical issue then I’ll see if I can look into that since I consider gpio-of-helper to be a critical piece of infrastructure.

Another comparable is what Renesas does: kernel/git/next/linux-next.git - The linux-next integration testing tree

as ^ defined, users can switch groups/functions from userspace…

Ah… two (pretty much completely unrelated) functions of cape-universal are being conflated here:

  • Exporting sysfs gpios from dts (gpio-of-helper)
  • Runtime pinmux switching (bone-pinmux-helper)

I was talking about the former, you’re talking about the latter.

I am using an external ADC from TI, and I need to stream the data without any loss. As you said, DRDY is not required since the STM32 is the SPI master, so how should I approach this? Also, the STM32 uses SPI DMA to get the data, with a clock speed of 24 MHz.

Aji Mathew.

Okay, so then I’ll repeat the question I asked: what’s the reason to put the STM32 between the ADC and the BBB?

I said none of these things

The STM32’s job is not only to get the data from the ADC; it also has to control my whole system, which includes PID and DAC.

Why can’t the BBB read the spi adc?

If all you are doing is grabbing ADC data and dumping over udp…

Why not a simple esp32 read spi, tx data over WiFi to network?

No, the ESP32 is not capable of doing all the stuff that the STM32 can. The thing is that the system is too complex, with quadrature encoders, a PID loop, a DAC, etc.

Fair enough. Which STM32 are you using? (so I can check what features its SPI peripheral has)

STM32L4P5 series

It’s not just the embedded stuff: 6 and Ubuntu have had some issues with our big box setups. First to break was our video drivers for the RTX2000 GPU, then it broke VMware after the 6.5 update. It is all my fault for letting the OS update; I just got too comfortable with the updates working very well, then ouch! Might be better off waiting until 7.x is released and seeing what happens.

Dear zmatt,

I would like to revisit my approach based on this discussion: The ADC sends data to the STM32L4P5 microcontroller (uC) via SPI (SPI1). The uC manages the control loop (PID + DAC), and then sends the framed ADC data via SPI (SPI2) to the BBB. To initiate an SPI transfer from the BBB (as master) to the STM32 (as slave), the STM32 will send a data-ready interrupt to the BBB, which then initiates the SPI transfer and forwards the received data to the PC via UDP.

Regarding the response via sys-gpio, I wanted to know how fast it will respond to an external interrupt and I got a gpio toggle with a time period of 85 microseconds. I expect the data rate of ADC to uC to be maximum 8 megabits per second (including 8 channels). This data will be used within the uC for the control loop, formatting the results, and then sending them to the BBB via SPI. I plan to run the SPI in the uC and BBB at 15 MHz. Is this a good approach? Kindly share your insights.

Aji Mathew

In your private message you mentioned a 32k sample rate, so I’m guessing it’s something like 31250 samples/second * 8 channels * 32 bits per sample = 8 megabit/second ?

If you’ve measured a latency of 85 microseconds then that already answers your question since it means you can’t even deliver 32k events per second to linux userspace, let alone actually do anything with those events.
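To put rough numbers on that, here is a back-of-the-envelope check in Python. It assumes the guessed figures above (31250 samples/second, 8 channels, 32 bits per sample) and treats the measured 85 microseconds as the time needed to handle one event, which is an interpretation of the measurement rather than something that was verified.

```python
# Back-of-the-envelope check of the numbers discussed in this thread.
SAMPLE_RATE_HZ = 31_250        # guessed sample rate per channel
CHANNELS = 8
BITS_PER_SAMPLE = 32           # guessed word size
MEASURED_LATENCY_S = 85e-6     # assumption: time to handle one gpio event

# Raw payload rate produced by the ADC.
bit_rate = SAMPLE_RATE_HZ * CHANNELS * BITS_PER_SAMPLE

# Time available per data-ready event if every frame raises one interrupt.
frame_period_s = 1 / SAMPLE_RATE_HZ

# Upper bound on events per second that userspace could keep up with.
max_events_per_s = 1 / MEASURED_LATENCY_S

print(f"bit rate:           {bit_rate} bit/s")
print(f"time per frame:     {frame_period_s * 1e6:.1f} us")
print(f"max events/second:  {max_events_per_s:.0f}")
print(f"keeps up:           {max_events_per_s >= SAMPLE_RATE_HZ}")
```

With these assumptions each frame leaves far less time than one event takes to handle, so per-frame interrupts into userspace cannot keep up regardless of how the rest of the code is written.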

For this kind of data rate you really don’t want to have to deal with each individual data frame in real-time using linux software; linux just has way too much overhead and random latencies. You’ll want to use either EDMA or a PRU program to transfer the received frames to a ringbuffer in memory.
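To illustrate the ringbuffer idea, here is a minimal sketch in plain Python. It only models the bookkeeping: on real hardware the producer would be EDMA or PRU writing into shared memory, and the class name `FrameRing` and its methods are hypothetical, not part of any existing library.

```python
class FrameRing:
    """Fixed-size ring of equally sized frames with free-running counters."""

    def __init__(self, slots: int, frame_size: int):
        self.slots = slots
        self.frame_size = frame_size
        self.buf = bytearray(slots * frame_size)
        self.write_count = 0  # total frames ever written (producer side)
        self.read_count = 0   # total frames ever consumed (consumer side)

    def produce(self, frame: bytes) -> None:
        # Stand-in for the hardware producer: it never waits for the
        # consumer and simply overwrites the oldest slot when full.
        if len(frame) != self.frame_size:
            raise ValueError("unexpected frame size")
        off = (self.write_count % self.slots) * self.frame_size
        self.buf[off:off + self.frame_size] = frame
        self.write_count += 1

    def consume(self):
        """Return (frames, lost): unread frames and how many were overwritten."""
        lost = 0
        backlog = self.write_count - self.read_count
        if backlog > self.slots:
            # The consumer fell behind by more than one full ring.
            lost = backlog - self.slots
            self.read_count += lost
        frames = []
        while self.read_count < self.write_count:
            off = (self.read_count % self.slots) * self.frame_size
            frames.append(bytes(self.buf[off:off + self.frame_size]))
            self.read_count += 1
        return frames, lost
```

The point of this layout is that software no longer has to react to each frame. It can wake up every few milliseconds, drain whatever has accumulated, and use the counters to detect whether it ever fell far enough behind to lose data.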

Directly using peripherals (via UIO or using PRU) has a somewhat steep learning curve though, it’s probably actually much simpler to implement an SPI slave using PRU’s direct gpios. I can probably whip up an example later today or tomorrow.

Software then only has to deal with forwarding that data to the network. Hopefully this can manage to meet your performance targets, otherwise there’s always still options for further optimization. If necessary you could achieve extremely low latency by having PRU stick each frame into its own network packet and send them directly without involving linux, but that’s not entirely trivial (although it is on my to-do list of things I want to experiment with).
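As a sketch of that forwarding step, the Python below batches frames into UDP datagrams and prefixes each datagram with a sequence number so the PC can detect dropped packets. The 30-byte frame size, the batch size of 40, and the 6-byte header layout are all assumptions made up for this example; nothing here is a fixed protocol.

```python
import socket
import struct

FRAME_SIZE = 30            # assumption: 240-bit frames
FRAMES_PER_DATAGRAM = 40   # assumption: 1200-byte payload, below a 1500-byte MTU
HEADER = struct.Struct("!IH")  # hypothetical header: first frame number, frame count


def build_datagram(first_seq: int, frames: list) -> bytes:
    """Pack a batch of frames behind a small sequence header."""
    return HEADER.pack(first_seq & 0xFFFFFFFF, len(frames)) + b"".join(frames)


def parse_datagram(data: bytes):
    """Inverse of build_datagram, as the receiving PC might use it."""
    first_seq, count = HEADER.unpack_from(data)
    body = data[HEADER.size:]
    if len(body) != count * FRAME_SIZE:
        raise ValueError("datagram length does not match frame count")
    frames = [body[i * FRAME_SIZE:(i + 1) * FRAME_SIZE] for i in range(count)]
    return first_seq, frames


def send_frames(sock: socket.socket, addr, first_seq: int, frames: list) -> None:
    """Send frames in batches; first_seq is the number of frames[0]."""
    for i in range(0, len(frames), FRAMES_PER_DATAGRAM):
        chunk = frames[i:i + FRAMES_PER_DATAGRAM]
        sock.sendto(build_datagram(first_seq + i, chunk), addr)
```

On the BBB side this would be used with a socket created via `socket.socket(socket.AF_INET, socket.SOCK_DGRAM)`. Larger batches reduce per-packet overhead but add latency, since 40 frames at 32000 frames/second is 1.25 ms of data per datagram.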

Dear zmatt,

Thank you for sharing your valuable insights. I look forward to receiving more input from you.

Best regards,
Aji Mathew

Parameter                                        Value
Speed                                            32000 sps
Number of channels                               8
Data width (for 8 channels)                      24 x 8 bits
CRC                                              24 bits
Status bits                                      24 bits
Total number of bits                             240 bits
Maximum bits transmitted per second (8 channels) 7.7E+6 bits/sec
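For reference, a minimal Python sketch of how such a 240-bit frame could be unpacked on the receiving side. The field order (status word first, then the 8 channel words, then the CRC word), the big-endian byte order, and the two's complement sample format are assumptions that should be checked against the ADC datasheet; the CRC is only extracted here, not verified.

```python
FRAME_BYTES = 30   # 240 bits
WORD_BYTES = 3     # 24-bit words


def parse_frame(frame: bytes):
    """Split one frame into (status, channels, crc) under the assumed layout."""
    if len(frame) != FRAME_BYTES:
        raise ValueError("expected a 30-byte frame")
    words = [int.from_bytes(frame[i:i + WORD_BYTES], "big")
             for i in range(0, FRAME_BYTES, WORD_BYTES)]
    status, crc = words[0], words[9]
    # Sign-extend each 24-bit two's complement sample to a Python int.
    channels = [w - (1 << 24) if w & 0x800000 else w for w in words[1:9]]
    return status, channels, crc


# Cross-check of the bit rate listed above.
bits_per_second = 32_000 * FRAME_BYTES * 8
print(bits_per_second)
```

The cross-check multiplies the 32000 frames per second by the 240 bits per frame, which is where the roughly 7.7E+6 bits/sec figure comes from.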