PRU Memory Store Instruction with Autoincrement?

Hi, folks,

I’m trying to dump R31 over and over into either RAM0 or Shared DRAM (Data RAM).

So, basically it looks like I have to do:

Store R31 to address in register
Increment address in register
(Lather rinse repeat)

As that stands, that’s a 100MHz sample of R31. Is there any way to do the autoincrement on the store? That would double my sampling rate to 200MHz, which would be the maximum possible.
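For reference, here is the arithmetic behind those two figures as a quick sketch. It assumes the 200MHz PRU core clock implied above and one cycle per instruction in the loop body; the function name is made up for illustration.

```python
PRU_CLOCK_HZ = 200_000_000  # assumed PRU core clock, per the 200MHz figure above


def sample_rate_hz(cycles_per_sample, clock_hz=PRU_CLOCK_HZ):
    """Sampling rate when every sample costs a fixed number of cycles."""
    return clock_hz / cycles_per_sample


# Store + separate increment: two instructions per sample (assumed 1 cycle each).
store_then_increment = sample_rate_hz(2)

# Hypothetical store with built-in increment: one instruction per sample.
store_only = sample_rate_hz(1)
```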

Alternatively, is there a different way to sample and store R31 repeatedly to RAM0 or Shared DRAM? I can’t seem to see one, but there are so many permutations of Store Burst, mapping, the broadside transfer bus, and peripherals that I could quite easily be missing something that someone more experienced knows about.

Thanks.

Based on
https://www.google.com/url?sa=t&rct=j&q=&esrc=s&source=web&cd=&ved=2ahUKEwiYq4C71L7sAhURLKwKHSFhDtwQFjAAegQIBBAC&url=https%3A%2F%2Fwww.ti.com%2Flit%2Fpdf%2Fspruij2&usg=AOvVaw0nqU9tbqk5uktRGNNbV4go
(apologies for the ugly Google encoded link) the ONLY PRU instruction with
auto-increment/-decrement is MVIx (Move Register File Indirect) but the
documentation gives one the impression that it allows for incrementing
among the registers, not memory.

  Might still be worth the effort of trying it... (Report back if it
works)

Hi,

Do you require continuous 200MHz sampling? That would be difficult I think.

If you require bursts of 200MHz sampling, how long should those bursts be? Even if you find an autoincrement opcode, you would still need to add one jump instruction to “loop”.

You can try hardcoding the “increment” into the constant offset field, and load several registers with the starting address of each 256-byte chunk of your destination buffer. Example:

init:
ldi r0, __buf_start
ldi r1, __buf_start+256
ldi r2, __buf_start+256*2

burst_store:
sbbo &r31, r0, 0, 4
sbbo &r31, r0, 4, 4
sbbo &r31, r0, 8, 4
sbbo &r31, r0, 12, 4
sbbo &r31, r0, 16, 4

sbbo &r31, r0, 252, 4
sbbo &r31, r1, 0, 4

sbbo &r31, r1, 4, 4

I am not sure what the buffering limits of the PRU’s posted writes are. I doubt that so many consecutive sbbo instructions would run without stalling.
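Writing out that many stores by hand is error-prone, so the unrolled sequence above could be generated on the host. The following is only a sketch: the helper names are invented here, the `__buf_start` symbol is taken from the example, and it assumes the SBBO immediate offset tops out at 255 (hence the 256-byte chunk per base register). It says nothing about whether the stores will stall.

```python
def init_loads(num_bases, chunk_bytes=256, symbol="__buf_start"):
    """Emit one ldi per base register, each pointing at its own chunk."""
    lines = []
    for base in range(num_bases):
        offset = base * chunk_bytes
        target = symbol if offset == 0 else f"{symbol}+{offset}"
        lines.append(f"ldi r{base}, {target}")
    return lines


def unrolled_stores(num_bases, chunk_bytes=256, sample_bytes=4):
    """Emit sbbo lines with the 'increment' baked into the immediate offset."""
    lines = []
    for base in range(num_bases):
        # Offsets 0, 4, ..., 252 for each base register (assumed 8-bit offset field).
        for offset in range(0, chunk_bytes, sample_bytes):
            lines.append(f"sbbo &r31, r{base}, {offset}, {sample_bytes}")
    return lines


# Example: two base registers, i.e. 2 * 64 = 128 stores.
loads = init_loads(2)
stores = unrolled_stores(2)
```

Printing `"\n".join(["init:"] + loads + ["burst_store:"] + stores)` gives text that can be pasted into the assembly source.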

You should also examine the BeagleLogic firmware’s 100MHz mode of operation. Perhaps you can modify it to take 200MHz bursts instead of continuous 100MHz sampling. The difference from the single-PRU example above is that you would use the PRU register banks for buffering, which would decouple sampling from the SBBO bus operations.

Regards,
Dimitar

Hmm, that’s an interesting idea, Dimitar, to encode the increment in the immediate. That’s probably … useful.

That mechanism would give me a 64-sample burst per register which could possibly get me to 1920 samples if the SBBO doesn’t stall out anywhere.
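Spelling out where the 64 and 1920 come from, as a small sketch. The count of 30 base registers is an assumption inferred from 1920 / 64 (for example R0 to R29, leaving R30 and R31 for I/O); in practice some registers may be needed for other purposes.

```python
CHUNK_BYTES = 256      # span reachable from one base register via the immediate offset
SAMPLE_BYTES = 4       # each sbbo stores all 32 bits of R31
BASE_REGISTERS = 30    # assumption: R0..R29 all hold chunk base addresses

samples_per_register = CHUNK_BYTES // SAMPLE_BYTES       # 64
total_samples = samples_per_register * BASE_REGISTERS    # 1920
buffer_bytes = total_samples * SAMPLE_BYTES              # 7680
```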

I thought about the BeagleLogic, but it relies on the fact that every other cycle is NOP, so it can effectively run something simultaneously at half-speed in the interleaved time.

Looks like it’s time for some experiments. I’ll report back if I see anything interesting.

BeagleLogic[1] does some interesting tricks[2] to get a solid 100MHz sampling rate. The PRU Cookbook[3] presents an overview of it. Check it out.

–Mark

[1] Welcome to BeagleLogic! — BeagleLogic 2.0 documentation
[2] BeagleLogic: Building a logic analyzer with the PRUs: Part 1 – The Embedded Kitchen
[3] Case Studies - Introduction

@Dimitar

SBBO is a two-cycle instruction. In order to get 200 MHz you have to use MOV.

@Andrew

The MVIx family of instructions provides an auto-increment feature. But it doesn’t really help.

As Dimitar suggested, it’s best to code a sequence of immediate operations to store a burst of samples.

On PRU-1, lines 0 to 14 and 16 are wired to header pins (standard BB 2x46 headers). With word access you’ll get up to 15 input lines.

On PRU-0, lines 0 to 16 are wired (note: some of them are on the SD-card slot). With word access you’ll get up to 16 input lines.

Let’s say you sample 16 lines on PRU-0 writing to the register file by code like

MOV R0.w0, R31.w0
MOV R0.w2, R31.w0
MOV R1.w0, R31.w0
MOV R1.w2, R31.w0

MOV R29.w0, R31.w0
MOV R29.w2, R31.w0

Then you’ll have to spend a cycle to save the register file

XOUT 10, R0, 120

Afterwards the initial code follows again.

In the meantime, PRU-1 observes the program counter of PRU-0. When it reaches the XOUT instruction, PRU-1 resets the program counter of PRU-0 (to a value of 1, 2, or 3; test it out), then grabs the register file from the XFR interface and stores the samples in memory before it resumes observing the PRU-0 program counter.

This solution will get you a burst of 16 lines/60 samples (or 8 lines/120 samples) at 200 MHz. In continuous mode you’ll have to deal with a one-cycle gap for saving the register file.
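The MOV sequence and the burst figures can be reproduced with a short host-side sketch. The helper names are invented for illustration, and it assumes each MOV and the XOUT take one cycle at a 200 MHz PRU clock, with R0 to R29 (120 bytes, matching the XOUT length) used as the sample buffer.

```python
PRU_CLOCK_HZ = 200_000_000  # assumed PRU core clock
REGISTERS = 30              # R0..R29, i.e. the 120 bytes moved by the XOUT
REGISTER_BYTES = 4


def burst_samples(sample_bytes):
    """Samples that fit in the register file: 2-byte words or 1-byte samples."""
    return REGISTERS * REGISTER_BYTES // sample_bytes


def mov_sequence():
    """Emit the word-wide MOV lines followed by the single XOUT."""
    lines = []
    for reg in range(REGISTERS):
        lines.append(f"MOV R{reg}.w0, R31.w0")
        lines.append(f"MOV R{reg}.w2, R31.w0")
    lines.append(f"XOUT 10, R0, {REGISTERS * REGISTER_BYTES}")
    return lines


def continuous_rate_hz(samples, gap_cycles=1):
    """Average rate when each burst is followed by gap_cycles without a sample."""
    return PRU_CLOCK_HZ * samples / (samples + gap_cycles)


word_burst = burst_samples(2)   # 16 lines per sample
byte_burst = burst_samples(1)   # 8 lines per sample
average_rate = continuous_rate_hz(word_burst)
```

With these assumptions the one-cycle gap costs one sample slot in every 61 cycles for the 16-line case, so the average rate stays just under 200 MHz while the samples inside a burst remain evenly spaced.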

Regards