Direct Memory Access from FPGA

Hello everyone!
I’m picking up work again on my color space conversion project on the BeagleV-Fire FPGA, and I would really appreciate some guidance on a few architecture-related questions.

I’m using the BeagleV-Fire (PolarFire SoC) and my goal is to perform YUV - RGB conversion fully in the FPGA, with minimal processor involvement and fast data transfer.

I have a few main questions:

  1. Direct memory access from FPGA (DMA-like approach)
    I want the FPGA to read image data directly from memory, without the processor being involved in every transfer (basically a DMA mechanism).

What is the recommended way to achieve this on BeagleV-Fire?

  2. Memory accessible by the FPGA

What types of memory can the FPGA access directly, and how? I need bidirectional transfers: the FPGA should be able to both read from and write to the memory. Is the accessible memory large enough to store a 4K image, and which memory is typically preferred for this use case?

  3. Current approach & reasoning
    What I’ve tried so far:

I designed a custom DMA controller in the FPGA. (Here is the link to the repository and source code: repository. I’ve also attached the architecture of the DMA block and of the entire project.)

The processor configures it via APB, providing the base address of the memory buffer, the number of addresses (or pixels) to read, and a start signal. The processor first writes the image data into an SDRAM memory block. Once configured, the custom DMA initiates AXI transactions via FIC to read the data from that memory and feed it to the processing pipeline.
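
To make the handshake concrete, here is a minimal behavioral sketch in Python of the sequence described above. It is only a model, not the actual design: the register offsets, the self-clearing start bit, the `STATUS` register with a done bit, and the byte-based length are all assumptions made for illustration, and `DmaModel` and `start_transfer` are hypothetical names.

```python
# Hypothetical register map; the real offsets are defined by the custom DMA RTL.
REG_BASE_ADDR = 0x00   # start address of the source buffer
REG_LENGTH = 0x04      # number of bytes to read (assumed unit)
REG_CTRL = 0x08        # bit 0 = start
REG_STATUS = 0x0C      # bit 0 = done (assumed; not mentioned in the post)
CTRL_START = 1
STATUS_DONE = 1


class DmaModel:
    """Behavioral stand-in for the FPGA DMA block and the memory it reads."""

    def __init__(self, memory: bytes):
        self.memory = memory
        self.regs = {REG_BASE_ADDR: 0, REG_LENGTH: 0, REG_CTRL: 0, REG_STATUS: 0}
        self.stream = b""  # data handed to the processing pipeline

    def write_reg(self, offset: int, value: int) -> None:
        self.regs[offset] = value
        if offset == REG_CTRL and value & CTRL_START:
            self._run()

    def read_reg(self, offset: int) -> int:
        return self.regs[offset]

    def _run(self) -> None:
        # Models the burst reads the DMA would issue over AXI.
        self.regs[REG_STATUS] = 0
        base = self.regs[REG_BASE_ADDR]
        length = self.regs[REG_LENGTH]
        self.stream = self.memory[base:base + length]
        self.regs[REG_CTRL] = 0            # start bit self-clears
        self.regs[REG_STATUS] = STATUS_DONE


def start_transfer(dma: DmaModel, base_addr: int, length: int) -> bytes:
    """Processor side: three register writes, then poll for completion."""
    dma.write_reg(REG_BASE_ADDR, base_addr)
    dma.write_reg(REG_LENGTH, length)
    dma.write_reg(REG_CTRL, CTRL_START)
    while not (dma.read_reg(REG_STATUS) & STATUS_DONE):
        pass
    return dma.stream


memory = bytes(range(16))                  # stand-in for the image buffer
dma = DmaModel(memory)
data = start_transfer(dma, base_addr=4, length=8)
```

The point of the model is that the processor touches only a handful of registers per transfer; all of the data movement happens on the DMA side.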

Does this reasoning make sense and could this approach work correctly in practice, or is there a more appropriate / standard approach for this platform?

Any insights, best practices, or references would be highly appreciated. Thanks a lot!

dma_custom.zip (175.5 KB)

With regard to 2., pretty much every FPGA comes with dual-ported BRAMs,
so I would think you could use a number of them as two line buffers: hold a single line of the source in one
and work on it while the DMA fills the other, in an interleaved (ping-pong) scheme.
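
A rough software model of that interleaved scheme, just to show the buffer swapping. This is a sketch, not HDL: `convert_frame` and `convert_line` are made-up names, and in real hardware the fill and the processing would run concurrently rather than one after the other.

```python
def convert_frame(lines, convert_line):
    """Process a frame line by line using two alternating line buffers."""
    if not lines:
        return []
    buffers = [None, None]       # stand-ins for two BRAM line buffers
    fill = 0                     # index of the buffer the DMA writes next
    buffers[fill] = lines[0]     # prime: the DMA loads the first line
    results = []
    for i in range(len(lines)):
        work = fill              # the buffer just filled is processed now
        fill ^= 1                # the DMA switches to the other buffer
        if i + 1 < len(lines):
            # In hardware this fill overlaps with the processing below.
            buffers[fill] = lines[i + 1]
        results.append(convert_line(buffers[work]))
    return results


frame = [[1, 2], [3, 4], [5, 6]]
doubled = convert_frame(frame, lambda line: [2 * p for p in line])
```

With this arrangement the on-chip storage needed is two lines rather than a whole frame.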

As to the question of holding an entire 4K frame, I highly doubt you’ll find any FPGA
with 25 MB (figure from ChatGPT) of static RAM, but I could be mistaken.
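
For what it's worth, the figure is easy to check by hand. The sketch below assumes "4K" means 3840x2160 (UHD) and compares a few common pixel formats; the format list is my assumption, since the thread does not say which YUV layout is used. At 24 bits per pixel it comes to about 24.9 MB, which matches the number above.

```python
def frame_bytes(width, height, bits_per_pixel):
    """Size in bytes of one uncompressed frame."""
    return width * height * bits_per_pixel // 8


WIDTH, HEIGHT = 3840, 2160  # assumes 4K means UHD

sizes = {
    "RGB888": frame_bytes(WIDTH, HEIGHT, 24),
    "YUV 4:2:2": frame_bytes(WIDTH, HEIGHT, 16),
    "YUV 4:2:0": frame_bytes(WIDTH, HEIGHT, 12),
}

for name, size in sizes.items():
    print(f"{name}: {size} bytes ({size / 1e6:.1f} MB)")
```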

Seeing how Microchip also makes a Video Kit, perhaps you could find a more
suitable conversation partner on their Q&A board?
On a similar note, Microchip also has a number of highly skilled FAEs.
Have you tried to get in touch with one of them?

In any case, keep us posted; I’d love to see where you go with this!