Linux 3.8, am335x, How to set up continuous DMA transfers

leo1 · December 10, 2013, 8:28pm

Reading the documentation on the EDMA peripheral for this part (am3352), it is clearly capable of doing continuous/chaining DMA transfers via the link-address mechanism. The only method I see that interacts with this mechanism is the “CYCLIC” type transfer, which appears to set the address for the next transfer to the address of the current transfer.

I’d like to set up the DMA for chaining with four buffers, two for transmit and two for receive, with the following behavior:

T_BUFF_1: currently in use.

T_BUFF_2: its address is loaded into the link register of the current transmit operation, so that it is used automatically when the transfer from T_BUFF_1 completes.

R_BUFF_1: currently in use.

R_BUFF_2: its address is loaded into the link register of the current receive operation, so that it is used automatically when the transfer into R_BUFF_1 completes.

The callback routine for each transfer would just load the link address of the newly started transfer with the address of the buffer that is no longer in use (the one that has just finished).

So the transfers would look like T_BUFF_1 → T_BUFF_2 → T_BUFF_1 → T_BUFF_2 → … and R_BUFF_1 → R_BUFF_2 → R_BUFF_1 → R_BUFF_2 → …
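The alternation described above can be modeled in user space. This is only a sketch of the link idea, not driver code: it assumes the EDMA behavior described in the reference manual, where the parameter set named by the link field is copied into the channel's active set once the current transfer is exhausted. The names `param_set`, `setup_ping_pong` and `complete_transfer` are hypothetical, and buffers are represented by plain integer IDs.

```c
#include <assert.h>

#define LINK_NULL (-1)   /* hypothetical marker for "no link" */

/* Simplified stand-in for one EDMA parameter set. */
struct param_set {
    int buf_id;   /* which buffer this set transfers to/from */
    int link;     /* slot to reload from on completion, or LINK_NULL */
};

/* Slot 0 models the channel's active set; slots 1 and 2 are reload sets. */
struct param_set param[3];

/* Point the two reload sets at each other and start with the first buffer. */
void setup_ping_pong(int buf1, int buf2)
{
    param[1].buf_id = buf1;
    param[1].link = 2;
    param[2].buf_id = buf2;
    param[2].link = 1;
    param[0] = param[1];   /* active set starts on buf1, linked to slot 2 */
}

/*
 * Model the end of one transfer: report which buffer was just used, then
 * reload the active set from the linked slot, as the hardware link
 * mechanism is assumed to do.
 */
int complete_transfer(void)
{
    int done = param[0].buf_id;
    int next = param[0].link;

    assert(next == LINK_NULL || next == 1 || next == 2);
    if (next != LINK_NULL)
        param[0] = param[next];
    return done;
}
```

In this model the two reload sets link to each other, so the sequence 1 → 2 → 1 → 2 continues without the callback having to rewrite any link address. The callback would only need to process the buffer that has just finished.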

I don’t see any full implementations of this in any example/sample code, and like I said, the “CYCLIC” transfers just seem to overwrite the same buffer, rather than update to a new buffer upon completion.

I am writing a replacement for the Linux 3.8 SPI driver (“spi-omap2-mcspi.c”) that does continuous transfers rather than using the discrete message-queue method, in order to avoid the latency between the handling of messages.

John_USP · December 10, 2013, 11:49pm

Hi Leo,
You mean DMA_CYCLIC? You should look at the DMA implementation for McBSP. Unfortunately, TI didn’t implement the McBSP driver on its own, but instead made it part of the sound driver implementation. I’m working on the same problem, but my SPI device generates the clock. This isn’t the same as slave mode, in that my SPI device doesn’t control the SPI chip select. I’m hoping I can use the McBSP DMA implementation as a reference.
Regards,
John.

leo1 · December 11, 2013, 10:57pm

Yes, “DMA_CYCLIC” type transfers. What I’m attempting to do now is set up cyclic transfers (four total, two for transmit and two for receive), and then link the two transmits together and the two receives together.

For example, on the Rx side:

// RX
g_adcSpi.descRx1 = dmaengine_prep_dma_cyclic(
        g_adcSpi.dmaChanRx1,
        g_adcSpi.dmaRxBuff1,
        ADC_SPI_DMA_BUFFER_SIZE_BYTES,
        // This parameter (the period length) sets the transfer level for
        // signaling the callback. We can set it to a complete transfer since
        // the next transfer should already be set up and ready to go, i.e.,
        // we are not racing the clock.
        ADC_SPI_DMA_BUFFER_SIZE_BYTES,
        DMA_DEV_TO_MEM,
        DMA_PREP_INTERRUPT | DMA_CTRL_ACK /* DMA_PREP_CONTINUE ? */ );

// Check the descriptor that was just assigned
if ( !g_adcSpi.descRx1 )
{
    printk( KERN_ERR "\nError acquiring SPI DMA Rx1 descriptor!\n" );
    return( -1 );
}

g_adcSpi.descRx1->callback = AdcSpiRxCallback;

g_adcSpi.descRx2 = dmaengine_prep_dma_cyclic(
        g_adcSpi.dmaChanRx2,
        g_adcSpi.dmaRxBuff2,
        ADC_SPI_DMA_BUFFER_SIZE_BYTES,
        // Period length: same reasoning as for Rx1 above
        ADC_SPI_DMA_BUFFER_SIZE_BYTES,
        DMA_DEV_TO_MEM,
        DMA_PREP_INTERRUPT | DMA_CTRL_ACK /* DMA_PREP_CONTINUE ? */ );

if ( !g_adcSpi.descRx2 )
{
    printk( KERN_ERR "\nError acquiring SPI DMA Rx2 descriptor!\n" );
    return( -1 );
}

g_adcSpi.descRx2->callback = AdcSpiRxCallback;

// now link/chain them; 1-->2-->1-->2-->1...
struct omap_chan *omapChanRx1 = to_omap_dma_chan( g_adcSpi.dmaChanRx1 );
struct omap_chan *omapChanRx2 = to_omap_dma_chan( g_adcSpi.dmaChanRx2 );
omap_dma_link_lch( omapChanRx1->dma_ch, omapChanRx2->dma_ch );
omap_dma_link_lch( omapChanRx2->dma_ch, omapChanRx1->dma_ch );

I don’t think the McBSP code does anything very similar to this. All I can find is one example of setting up a cyclic transfer (overwriting the same buffer over and over again, which I don’t want, as it will trash my previous data) in /sound/soc/soc-dmaengine-pcm.c, and one example of linking channels (which actually turns out to be cyclic as well, since it just links a channel to itself) in /drivers/media/platform/soc_camera/omap1_camera.c

There’s got to be a complete example of how to set this up and get it working!

John_USP · December 12, 2013, 5:21am

Hi Leo,

I’m not sure you are correct here. Most drivers using DMA regularly use two buffers, labelled ping and pong. One buffer is filled while the other is being processed. TI have some C6000 training material which shows this working. Granted, the training material is for SysBIOS, but it is written for the EDMA, which is similar to the EDMA used on the AM3359. The same concepts will apply to a Linux driver. BTW, EDMA can support more than two buffers for both TX and RX.
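To illustrate the ping/pong idea, here is a small user-space model, not driver code. It assumes the dmaengine convention that a cyclic transfer signals its callback after each period, so a single ring of two periods behaves like a ping buffer and a pong buffer. Whether the EDMA dmaengine driver in a given kernel supports cyclic transfers is not confirmed here. All names (`cyclic_sim`, `consumer`, `on_period_done`, `sim_receive_byte`) are hypothetical, and the "processing" is just a byte sum.

```c
#include <assert.h>
#include <stddef.h>

#define PERIOD_BYTES 4   /* size of one half (ping or pong) */
#define NUM_PERIODS  2   /* two halves make up the ring */

typedef void (*period_cb)(void *ctx);

/* Stand-in for a cyclic DMA channel writing into a ring buffer. */
struct cyclic_sim {
    unsigned char ring[PERIOD_BYTES * NUM_PERIODS];
    size_t pos;           /* next byte the simulated DMA will write */
    period_cb callback;   /* invoked each time a period fills up */
    void *ctx;
};

/* Consumer state: tracks which half completes next. */
struct consumer {
    struct cyclic_sim *dma;
    unsigned next_period;   /* index of the half that will complete next */
    unsigned long sum;      /* stand-in for real processing */
    unsigned callbacks;     /* number of period callbacks seen */
};

/* Period callback: process the half that just filled while the other fills. */
void on_period_done(void *ctx)
{
    struct consumer *c = ctx;
    const unsigned char *half = c->dma->ring + c->next_period * PERIOD_BYTES;
    size_t i;

    assert(c->next_period < NUM_PERIODS);
    for (i = 0; i < PERIOD_BYTES; i++)
        c->sum += half[i];
    c->next_period = (c->next_period + 1) % NUM_PERIODS;
    c->callbacks++;
}

/* Simulate the DMA storing one received byte and wrapping at the ring end. */
void sim_receive_byte(struct cyclic_sim *d, unsigned char byte)
{
    d->ring[d->pos++] = byte;
    if (d->pos % PERIOD_BYTES == 0)
        d->callback(d->ctx);
    if (d->pos == sizeof d->ring)
        d->pos = 0;
}
```

In this model the previous data is not trashed as long as each callback finishes before the DMA wraps around to the same half again, which is the usual timing constraint for ping/pong buffering.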

Regards,
John