PRU, remoteproc and unsigned long long types

We use Remoteproc to communicate summary/status data from the PRU back to the ARM host program. Our code is written in C.

We have had recurring problems with messages from the PRU to the ARM arriving inconsistently, and we just got used to the link being flaky. We put in a lot of workarounds during development.

We had an unsigned long long variable declared in the PRU code that was not related to the remoteproc processes at all.

We had an issue outside of the remoteproc processes where a counter variable for a 'for' loop was getting set with random values even though we initialized the variable correctly.

While debugging the issue, we noted the associated register usage for variable types in the TI PRU Optimizing C/C++ Compiler V2.2 manual and saw that unsigned long long types use register pairs.

We changed the definition of that variable to unsigned int and now remoteproc messages sent from the PRU to the ARM are received by the ARM flawlessly.

So, I'm interested in knowing more about how this works and why this could have been an issue.

Don't use "unsigned long long". The C standard only guarantees that it is at least 64 bits wide. Use the fixed-width types from <stdint.h>, such as "uint64_t", which is exactly 64 bits and unsigned.
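To make that concrete, here is a minimal sketch, assuming a C11 compiler (for static_assert) that provides <stdint.h>. The counter and function names are hypothetical, not taken from the original code.

```c
#include <assert.h>   /* static_assert (C11) */
#include <limits.h>   /* CHAR_BIT */
#include <stdint.h>   /* uint32_t, uint64_t */

/* The standard only promises a minimum width for unsigned long long. */
static_assert(sizeof(unsigned long long) * CHAR_BIT >= 64,
              "unsigned long long has at least 64 bits");

/* uint64_t, where the implementation provides it, is exactly 64 bits. */
static_assert(sizeof(uint64_t) * CHAR_BIT == 64,
              "uint64_t has exactly 64 bits");

/* Hypothetical batch counter: choose the narrowest fixed-width type
 * that covers the expected range. */
static uint32_t batch_count;

uint32_t next_batch_count(void)
{
    /* Unsigned arithmetic wraps to 0 after UINT32_MAX; that is well defined. */
    return ++batch_count;
}
```

If either assertion fails, the build stops, so a surprising type width shows up at compile time instead of at run time.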

However, I suspect the real issue is that your "uint64_t" was part of a struct or union. The problem is that compilers can pack, pad or align the internal entities of a struct in all manner of weird ways. A 64-bit entity may suddenly demand 8 byte alignment instead of 4 byte alignment, for example.
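One way to see what a given compiler actually did is to ask it. The sketch below uses a hypothetical message struct (not from the original code) and assumes a C11 compiler. The values it reports depend on the target ABI, so the PRU and the ARM may not agree with each other.

```c
#include <assert.h>   /* static_assert (C11) */
#include <stddef.h>   /* offsetof, size_t */
#include <stdint.h>

/* Hypothetical status message: a small field followed by a 64-bit field. */
struct status_msg {
    uint8_t  flags;
    uint64_t total;
};

/* The struct can never be smaller than the sum of its members,
 * but it is allowed to be larger. */
static_assert(sizeof(struct status_msg) >= sizeof(uint8_t) + sizeof(uint64_t),
              "struct is at least as large as its members");

/* Padding bytes the compiler inserted between 'flags' and 'total'. */
size_t status_msg_padding(void)
{
    return offsetof(struct status_msg, total) - sizeof(uint8_t);
}

/* Total size of the struct, including any trailing padding. */
size_t status_msg_size(void)
{
    return sizeof(struct status_msg);
}
```

Printing both values on each side of the link and comparing them is a quick check for whether the two compilers lay the struct out the same way.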

What you are doing goes by the term "serialization" and it has a whole host of traps and gotchas.

At the end of the day, if you want to be positive that your struct is unpacking correctly, you will want to pass an opaque bag of bytes and do all the encoding/decoding manually.
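A minimal sketch of that manual approach, assuming a hypothetical 12-byte message (a 32-bit id followed by a 64-bit total) encoded little-endian. The layout and every name here are made up for illustration; none of it is a remoteproc or RPMsg API.

```c
#include <assert.h>   /* static_assert (C11) */
#include <stdint.h>

#define STATUS_WIRE_SIZE 12u   /* hypothetical: 4-byte id + 8-byte total */

static_assert(STATUS_WIRE_SIZE == sizeof(uint32_t) + sizeof(uint64_t),
              "wire size matches the encoded fields");

/* Write v into p[0..3], least significant byte first. */
static void put_u32_le(uint8_t *p, uint32_t v)
{
    for (unsigned i = 0; i < 4; i++)
        p[i] = (uint8_t)(v >> (8u * i));
}

/* Write v into p[0..7], least significant byte first. */
static void put_u64_le(uint8_t *p, uint64_t v)
{
    for (unsigned i = 0; i < 8; i++)
        p[i] = (uint8_t)(v >> (8u * i));
}

static uint32_t get_u32_le(const uint8_t *p)
{
    uint32_t v = 0;
    for (unsigned i = 0; i < 4; i++)
        v |= (uint32_t)p[i] << (8u * i);
    return v;
}

static uint64_t get_u64_le(const uint8_t *p)
{
    uint64_t v = 0;
    for (unsigned i = 0; i < 8; i++)
        v |= (uint64_t)p[i] << (8u * i);
    return v;
}

/* Sender side: pack the fields into a plain byte buffer. */
void status_encode(uint8_t buf[STATUS_WIRE_SIZE], uint32_t id, uint64_t total)
{
    put_u32_le(buf, id);
    put_u64_le(buf + 4, total);
}

/* Receiver side: unpack the same bytes, independent of struct layout. */
void status_decode(const uint8_t buf[STATUS_WIRE_SIZE],
                   uint32_t *id, uint64_t *total)
{
    *id = get_u32_le(buf);
    *total = get_u64_le(buf + 4);
}
```

Because both sides only ever touch individual bytes, the result does not depend on either compiler's padding, alignment or endianness.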

Looking back, I'm not sure why we used long long in the first place. We are unlikely to see a number that large before the system ends a batch and starts a new one at zero. The weird thing is that this variable wasn't even involved in any of the processes that were misbehaving. It's clear from TI's compiler documentation that a long long uses two registers, and what you are describing is definitely possible. This variable is not part of a struct or union that we defined, by the way.

Recent C compilers have gotten stupidly aggressive about exploiting undefined behavior for optimization possibilities. This can cause all manner of strange behavior.

When my team finds strange embedded behavior, one of my first steps is simply to try to compile on my desktop compilers with warnings turned up to maximum. Often, this lists a bunch of stuff that the embedded compilers don't always catch. Killing those warnings often makes the problem go away.

The TI compiler is capable of producing detailed listing and memory map files that I find can be rather helpful in understanding how the compiler interpreted your code.
