Passing Mailbox messages between ARM and PRU

I’m reading through the TRM for how the mailbox is supposed to be interfaced with, but it’s giving me more questions than answers. The J721E TRM (spruil1c.pdf Sec 7.1.4) talks about Global Initialization and Mailbox initialization.

Unfortunately it doesn’t give any more useful examples of how the mailbox is interacted with within the context of Linux or PRU programming.

Suspecting the memory offset I’ll write values to will be a function of the queue number somehow, I start looking for more documentation that gives a little bit more specifics. Looking in the J721E_Register3.pdf, there’s a lot of mailbox registers. My interpretation of what’s documented is there’s 11 mailbox instances, each with 16 queues, each queue capable of holding up to 4 messages. The register I believe I would want to read/write to would be MAILBOX_MESSAGE_y as defined in the doc.

So the first queue of the 1st instance would be at 31F8_0040h. This register appears to be the 4-byte FIFO. So I write 2 values in, and then read it, I should read out the 1st value written. Then read it again, and get the 2nd value. That’s what I expected anyway.

But when I tested this theory, it didn’t seem to work. Here’s what I tried from the terminal of my BBAI64:
sudo busybox devmem 0x31f80040 w 0xdeadbeef
followed by
sudo busybox devmem 0x31f80040

The value read back was all 0s. So there’s either more configuration required to activate the mailbox or I’ve gone off the rails somehow. Anybody have any ideas?

Having an example already written and working would probably be all I need. Anybody know where such code might live?

I’ve revisited this and just to see what happens, I tried a different mailbox, and what do you know? It worked. Sort of.

When I send messages to 31F8_1040h, I can read those messages back out in the order I put them in. I can even reference 31F8_1080h which will return 1 if the queue is full (has 4 messages). I can even call 31F8_10C0h to get a count of how many messages are in the queue.

What is odd is when I 1st started testing, the default value if nothing is in the queue was 0x0. However after a while, one of my previous values wound up getting stuck in the register as the “default” value. So the queue would continue to return the values I put in, but once all messages were pulled, continuing to read the register no longer returned 0x0. It would return one of the previous values I’d sent. So it’s almost as though there’s an incantation you can do to set the default value OR the mailbox functionality is just flaky.

Once I’d played around with this a bit longer, I returned to the 0th mailbox, and it continues to not work. So I don’t know what’s up with that. Maybe there’s something in the background using that mailbox and as I’m putting garbage values in it, whatever is listening to the mailbox is pulling those values out faster than I can read them out.

But this is working well-enough for me to use and start writing some test code to pass messages back and forth between the PRU and ARM so I can get performance numbers. I vaguely recall that there’s a PRU timer value that can be referenced. I’d like to use that as the value I push to the ARM from the PRU so I can do other tasks in the PRU and essentially send a timer value before and after the task is complete and then get a delta in the ARM and calculate just how fast, things like accessing DDR is, and how much jitter there is in doing so. It’s a shame there’s only 4 message positions in the mailbox. It’ll likely be too cumbersome for me to use mailboxes to do bulk data transfers, so I’ll need to also test shared memory and use mailboxes to share, with the PRU, what the raw memory addresses are as well as how many bytes has been written so it can go read whatever I sent.

Anyway, it’s good to get confirmation on how this works and see that it does work just as simply as it is documented to albeit with some caveats.

If someone knows what is going on with the “default” value getting changed in the mailbox, let me know what it is I’m doing inadvertently to cause that…and how I can reset the mailbox to clear that out when/if it happens again.

With this information, I was able to successfully send data back and forth between the ARM and code I wrote for the PRU. However I never could find a timer counter that would return a value other than 0.

What I could send is the number of times my while-loop looped. As well, I was able to send a “stop” message from the ARM to the PRU. And by monitoring the mailbox count of the mailbox the PRU was pushing messages into, I could see that the PRU had, indeed, stopped confirming I successfully attained 2-way coms.

If someone knows a value I can use that’ll indicate a count of clock changes, I’ll put that in my program and retest.

The value I was trying was ICSSG0_PRU1_CYCLE_COUNT @ 0x0B02400C (I’m loading my code into PRU1).

I even found that the IEP subsystems have a 64-bit timer in them (IEP_COUNT_REG0 & IEP_COUNT_REG1). And I tried activating the IEP (via IEP_GLOBAL_CFG_REG) and reading the lower-32 but it never changed value either. Perhaps there’s more to activating it than setting the config register to a value of 1?

There has to be a stand-alone free-running timer register somewhere that can be used for delta-timing in the PRUs. That’s a pretty fundamental aspect to realtime programming & control.

The only other thing I found in the PRU subsystem that looks promising is the ECAP registers. But dealing with it just seems far more heavy-weight than what I need. I just need a value that increments on a known clock frequency.

Perhaps I’ll start another thread asking if anybody knows of a free running timer that can be accessed anywhere in the system. As long as I know how fast the timer ticks, I can then correlate different values and get an accurate indication as to how long tasks take to perform.

Anyway I’m going to mark this subject as solved since I do have a working solution for the title. There are still unanswered questions like why values get stuck in the FIFO as default values. But that’s easy enough to avoid, simply don’t read the FIFO if its empty.

How about the GTC?

12.10.1 Global Timebase Counter (GTC) GTC Overview
The GTC module provides a continuous running counter that can be used for time synchronization and debug trace time stamping.

   uint64_t t;
   t = (*((uint64_t volatile *)(0x00A90008)));

I didn’t know about that. I plugged it into the code…seems to work beautifully. Thank you.

Question though, I was trying to find something that seemed to be within the PRU subsystem. It seems all the documentation suggests that once the PRU has to reach outside the subsystem, there’s more than the standard opcode processing delay of 1-2 PRU clock cycles to complete, and the time it takes is also non-deterministic. Being GTC is “global”, are reads to it going to be subject to this non-deterministic delay?

I guess I’ll find out as I get into more realtime analysis, not manual terminal-checking the incoming mailbox on the ARM for values from the PRU.

I searched for some definitive information on this but, couldn’t find it readily. Based upon what I have gathered there is a “few nanoseconds of jitter” caused by going through the system interconnect. I also read somewhere that it can vary based upon the traffic on the interconnect.

Maybe this recent PRU timer post by user Juvinski could help you stay in the PRU subsystem: