Reduce PRU DDR read latency

I see up to 5.8 microseconds of latency when the PRU is reading from DDR
memory. Is it possible to reduce this? I tried changing registers that
looked promising for giving the PRU priority but they didn't have a
noticeable effect. This is running on a BeagleBone Black, using a PRU timer
to measure the maximum time an LBBO from DDR takes while the ARM is busy.

The examples I found for doing DMA from the PRU were all for older kernels
and wouldn't work with Linux 3.8.13. Is there an example I missed that
works with this kernel?

I’m not an expert, and just learning myself…

I found the examples in ‘am335x_pru_package’ on GitHub very useful, and was able to get them working on a 3.8 kernel.

DDR access will have additional latency but 5.8us sounds high.

Is that for 1 LBBO instruction?
How are you timing this (which timer)?
What is the CPU doing during this time? If it is running at 100% in a tight loop polling that same memory, then the CPU and PRU will be in contention, and that could cause excessive latency. Typically something like the PRU interrupt is needed to let the CPU know that the PRU is done, ready, etc.

> I found the examples in ‘am335x_pru_package’ on GitHub very useful, and was able to get them working on a 3.8 kernel.

I have used that. I don’t think it contains a DMA example that works. I have everything working without DMA, except that the read latency is causing difficulties in a few situations.

> DDR access will have additional latency but 5.8us sounds high.
>
> Is that for 1 LBBO instruction?

Yes that was for 1 LBBO. It times the LBBO, keeps the maximum and repeats.

> How are you timing this (which timer)?

The timer in the PRUSS enhanced capture unit.
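
For reference, the bookkeeping behind "time one LBBO, keep the maximum, repeat" can be sketched in plain C. This is only an illustration, not the code actually running on the PRU: it assumes the timer is a free-running 32-bit up-counter ticking at 200 MHz (5 ns per tick), and the names TICK_HZ, latency_stats, elapsed_ticks, latency_record and ticks_to_us are made up for this example.

```c
#include <stdint.h>

/* ASSUMPTION: the timer counts at 200 MHz (5 ns per tick).
 * Adjust TICK_HZ if the timer is clocked differently. */
#define TICK_HZ 200000000u

/* Hypothetical container for the worst case seen so far. */
typedef struct {
    uint32_t max_ticks;
} latency_stats;

/* Ticks between two samples of a free-running 32-bit counter.
 * Unsigned subtraction wraps modulo 2^32, so the result is still
 * correct if the counter rolled over once between the samples. */
static uint32_t elapsed_ticks(uint32_t start, uint32_t end)
{
    return end - start;
}

/* Record one measurement and keep only the maximum. */
static void latency_record(latency_stats *s, uint32_t start, uint32_t end)
{
    uint32_t d = elapsed_ticks(start, end);
    if (d > s->max_ticks) {
        s->max_ticks = d;
    }
}

/* Convert a tick count to microseconds at the assumed tick rate. */
static double ticks_to_us(uint32_t ticks)
{
    return (double)ticks * 1e6 / (double)TICK_HZ;
}
```

Under those assumptions, 5.8 us corresponds to 1160 timer ticks. The timer samples would be taken immediately before and after the LBBO, so the figure also includes the few cycles needed to read the timer itself.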

> What is the CPU doing during this time? If it is running at 100% in a tight loop polling that same memory, then the CPU and PRU will be in contention, and that could cause excessive latency. Typically something like the PRU interrupt is needed to let the CPU know that the PRU is done, ready, etc.

I was running various commands to create load. The maximum occurred while compiling my code; reading files came close to the maximum. I wanted to make sure it will work in the worst case. The ARM will be reading and writing a file and doing some other work in parallel with the PRU. Unless you carefully strip down the system, you also don’t know what else will decide to run. I am using the interrupt to signal the ARM that it needs to do more work for the PRU.