I’m working on a project where I use the pru to record a bunch of data with its fast IO (about 50 to 60 MHz). It is in some ways similar to the BeagleLogic project, but with some additional control code on the pru side. The problem was that you can’t compile that project from source code, so I built my own thing from the ground up and got it working. The only issue I have now is the amount of data that I am recording.
For my measurement I need to collect data on the order of 2 MB. I know that the pru also has access to some fast memory, but only 28 KB (8 KB pru0 data ram + 8 KB pru1 data ram + 12 KB shared ram), which is not even close. The obvious solution was to treat this memory as a buffer and push data out at regular intervals with rpmsg, which I was going to use to get the final data anyway.
This, however, turned out to be impractical, as it takes one of the cores approximately 800 cycles (it varies a little) to move 496 bytes to a mailbox. Sending smaller chunks is less efficient because there is a 400 cycle overhead, and bigger ones are not supported. Both prus are needed for the recording process. Pru1 does the controls, gathers IO data into its registers and moves it through broadside to pru0, which does a bit of formatting and saves it to the ram. There are only a handful of free cycles during recording, plus about 150 after every 3 KB of data (other hardware is busy at that point and both prus are free to do anything while waiting).
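To put rough numbers on the mismatch, here is a small back-of-the-envelope sketch using only the figures quoted above. It assumes "3 KB" means 3072 bytes and charges every message the full 800 cycles, which slightly overestimates the cost of the last, partially filled message.

```python
import math

# Figures quoted in the post
CYCLES_PER_MSG = 800      # cycles to push one full rpmsg payload
PAYLOAD_BYTES = 496       # largest payload per message
FREE_CYCLES = 150         # idle cycles available after each block of data
BLOCK_BYTES = 3 * 1024    # assumption: "3 KB" means 3072 bytes


def rpmsg_cycles(n_bytes: int) -> int:
    """Cycles needed to send n_bytes, charging every message at full cost."""
    messages = math.ceil(n_bytes / PAYLOAD_BYTES)
    return messages * CYCLES_PER_MSG


needed = rpmsg_cycles(BLOCK_BYTES)
print(f"cycles needed per block: {needed}")
print(f"cycles available:        {FREE_CYCLES}")
print(f"shortfall factor:        {needed / FREE_CYCLES:.1f}x")
```

With these assumptions, each 3 KB block takes 7 messages and about 5600 cycles, against roughly 150 free cycles.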
Under these circumstances I can only see three solutions: somehow enlarge the pru ram (I am not sure whether it is just a memory region or actually special hardware that can’t be changed), make the pru copy the data directly into a large section of ram (hoping that the speed doesn’t suffer too much), or have the prus do their thing and task the arm with grabbing the data from pru memory through some sort of memory mapping.
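For the third option, a minimal sketch of what mapping pru memory from the arm side could look like, assuming an AM335x (where the PRU-ICSS block sits at 0x4A300000 in the physical memory map), root access, a kernel that allows `/dev/mem` access to that range, and a pru subsystem that is already powered and clocked (for example by the remoteproc driver). The helper names `map_region`, `read_u32` and `read_shared_ram` are made up for illustration and are not part of any TI or library API.

```python
import mmap
import os
import struct

# AM335x physical addresses; adjust for other SoCs.
PRUSS_BASE = 0x4A300000
PRU0_DRAM_OFFSET = 0x00000    # 8 KB data ram of pru0
PRU1_DRAM_OFFSET = 0x02000    # 8 KB data ram of pru1
SHARED_RAM_OFFSET = 0x10000   # 12 KB shared ram
PRUSS_MAP_SIZE = 0x20000      # large enough to cover all three regions


def map_region(path: str, base: int, size: int) -> mmap.mmap:
    """Map `size` bytes of `path`, starting at the page-aligned offset `base`."""
    fd = os.open(path, os.O_RDWR | os.O_SYNC)
    try:
        return mmap.mmap(fd, size, mmap.MAP_SHARED,
                         mmap.PROT_READ | mmap.PROT_WRITE, offset=base)
    finally:
        os.close(fd)  # the mapping stays valid after the descriptor is closed


def read_u32(mem, offset: int) -> int:
    """Read one little-endian 32-bit word from the mapping."""
    return struct.unpack_from("<I", mem, offset)[0]


def read_shared_ram(n_bytes: int, path: str = "/dev/mem") -> bytes:
    """Copy the first n_bytes of pru shared ram into a bytes object."""
    pruss = map_region(path, PRUSS_BASE, PRUSS_MAP_SIZE)
    try:
        return pruss[SHARED_RAM_OFFSET:SHARED_RAM_OFFSET + n_bytes]
    finally:
        pruss.close()
```

On the board this would be called as something like `read_shared_ram(12 * 1024)`. How fast such reads are in practice is not something this sketch can answer; it only shows the mapping itself.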
On that note, I actually found a neat library (libpruio) that does exactly that (I didn’t test how fast it actually reads, though). There are a couple of things about it that I don’t particularly like:
- It takes forever to process an interrupt (between 3,000 and 10,000 cycles, presumably because it’s userspace), which means that I’d have to grab data blindly, hoping that it has already been written
- I can only get it to run on the older 4.19 kernel (mostly due to my inability to get the uio drivers working on 5.4, or for some other mystery reason)
- It’s third party software that will only be maintained for as long as the maker is willing to maintain it. Sure, it’s GPL, but the thought of maintaining a decade old project and learning CMake and FreeBASIC is almost as frightening as a devicetree
- Getting it to work requires some ancient software incantations. I am afraid that one day I won’t find them anywhere on the web.
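On the first point, one common way to avoid both slow interrupts and blind reads is to have the writer publish a running byte counter next to the data and let the reader poll it. The sketch below shows only the reader side. The layout is purely hypothetical: a 32-bit little-endian total-bytes-written counter at offset 0, followed by a ring buffer whose size is a power of two. It also assumes the pru updates the counter only after the corresponding data has been written. `mem` can be any buffer-like object, such as an `mmap` of the pru memory or a plain `bytearray` for testing.

```python
import struct

HEADER_BYTES = 4          # hypothetical: u32 counter of total bytes written
RING_BYTES = 8 * 1024     # hypothetical ring size, must be a power of two


class RingReader:
    """Polls a byte counter and copies out whatever is new in the ring."""

    def __init__(self, mem, ring_bytes: int = RING_BYTES):
        self.mem = mem
        self.ring_bytes = ring_bytes
        self.read_count = 0   # total bytes consumed so far (mod 2**32)

    def poll(self) -> bytes:
        (write_count,) = struct.unpack_from("<I", self.mem, 0)
        # Modulo-2**32 difference, so a wrapping counter is handled correctly.
        available = (write_count - self.read_count) & 0xFFFFFFFF
        if available > self.ring_bytes:
            raise OverflowError("writer overwrote data that was not read yet")
        out = bytearray()
        pos = self.read_count % self.ring_bytes
        while available:
            # Copy up to the end of the ring, then wrap around to the start.
            chunk = min(available, self.ring_bytes - pos)
            start = HEADER_BYTES + pos
            out += self.mem[start:start + chunk]
            pos = (pos + chunk) % self.ring_bytes
            available -= chunk
        self.read_count = write_count
        return bytes(out)
```

The reader never has to guess whether data is valid: it only copies bytes that the counter says exist, and it raises an error instead of silently returning overwritten data if it falls more than one ring behind.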
I would prefer some TI-endorsed tech: some way to get the arm to directly access the pru memory, or to get the pru to write data into a bigger memory chunk. Or maybe there is some other way that I am oblivious to?
In any case, I’d like to thank you for reading my long ramblings. I’d be thankful for any tips, directions or reading material!