Shared PRU Memory and beyond

I am currently working with the BeagleBone Black Rev C, and the PRU subsystem has the following memory arrangement:

PRU0 = 8k RAM, PRU1 = 8k RAM, and 12k RAM Shared

I have successfully been able to have one PRU access its own 8K, the other PRU’s 8K, and the shared 12K.

This leaves me with a total memory of 28K. Storing floats (4 bytes each), I can store 7,168 values. I would like to capture 20,000 values in about 250ms.

The Derek Molloy tutorial on PRUs (http://exploringbeaglebone.com/chapter15/) claims, “a pool of 2,000,000 bytes is allocated for the sample data”. Things appear to have changed since that tutorial was written: dynamic Device Tree overlays through UIO are gone, and the PRUs are now accessed through Remote Proc. I am not sure whether the architecture changed as well. This makes most of the printed material about PRUs out of date, and it is painful to figure out what is current and what is not. I would love to have these 2,000,000 bytes available in the new architecture. I need help with the addressing, and some code samples would be excellent. I am hoping someone out there can help me get there.

Question: Is there a way, technique, or resource that would allow me to do this? When responding, please keep the requirement in mind: 20,000 samples in 250ms.

Thanks in advance.

This leaves me with a total memory of 28K. Storing float's I can store
7,168 values. I would like to capture 20,000 values in about 250ms.

  Capture from what?

  80 samples/msec... or 80000 per second. If I converted properly, about
0.0125msec per sample.
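
The conversion above can be checked with a few lines of Python. This is only a sketch of the arithmetic; the variable names are illustrative and do not come from any library.

```python
# Required sample rate for the capture described in the question.
samples = 20_000      # values to capture
window_s = 0.250      # capture window in seconds

rate_hz = samples / window_s    # samples per second
per_ms = rate_hz / 1000         # samples per millisecond
period_ms = 1000 / rate_hz      # time available per sample, in milliseconds

print(f"{rate_hz:.0f} samples/s, {per_ms:.0f} samples/ms, {period_ms:.4f} ms per sample")
```

The numbers agree with the estimate: 80 samples per millisecond, or 12.5 microseconds per sample.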

The Derek Malloy tutorial on PRU's (
http://exploringbeaglebone.com/chapter15/) claims, "a pool of 2,000,000
bytes is allocated for the sample data". Things appear to have changed
since the writing of that tutorial. No more dynamic Device Tree through UIO

  The tutorial you reference is based upon the FIRST EDITION of the book;
the page is labelled "Additional Content for First Edition" (when it was
chapter 13).

and PRU's accessed through Remote Proc. Not sure if the architecture

  Device trees are now loaded by u-Boot. If you really want UIO, there is
a device tree for that... By default, current images use the TI recommended
RPROC/msg system (partly because that is supposed to work with multiple
types of special processor cores -- the BB AI not only has PRU, but DSP
processors).

  Extract from /boot/uEnv.txt:

###PRUSS OPTIONS
###pru_rproc (4.14.x-ti kernel)
#uboot_overlay_pru=/lib/firmware/AM335X-PRU-RPROC-4-14-TI-00A0.dtbo
###pru_rproc (4.19.x-ti kernel)
uboot_overlay_pru=/lib/firmware/AM335X-PRU-RPROC-4-19-TI-00A0.dtbo
###pru_uio (4.14.x-ti, 4.19.x-ti & mainline/bone kernel)
#uboot_overlay_pru=/lib/firmware/AM335X-PRU-UIO-00A0.dtbo

Not sure I am replying properly to preserve the format desired for this page, but your (Dennis B) response definitely deserves a response from me.

This leaves me with a total memory of 28K. Storing float’s I can store
7,168 values. I would like to capture 20,000 values in about 250ms.

Capture from what?

80 samples/msec… or 80000 per second. If I converted properly, about
0.0125msec per sample.

I will be sampling from an ADC up to 1MHz maybe.

The Derek Malloy tutorial on PRU’s (
http://exploringbeaglebone.com/chapter15/) claims, “a pool of 2,000,000
bytes is allocated for the sample data”. Things appear to have changed
since the writing of that tutorial. No more dynamic Device Tree through UIO

The tutorial you reference is based upon the FIRST EDITION of the book
“Additional Content for First Edition” (when it was chapter 13).

Does the second edition contain the updates to the remote proc process?

and PRU’s accessed through Remote Proc. Not sure if the architecture

Device trees are now loaded by u-Boot. If you really want UIO, there is
a device tree for that… By default, current images use the TI recommended
RPROC/msg system (partly because that is supposed to work with multiple
types of special processor cores – the BB AI not only has PRU, but DSP
processors).

Thanks for the DSP lead…that is awesome as well.

Extract from /boot/uEnv.txt:

###PRUSS OPTIONS
###pru_rproc (4.14.x-ti kernel)
#uboot_overlay_pru=/lib/firmware/AM335X-PRU-RPROC-4-14-TI-00A0.dtbo
###pru_rproc (4.19.x-ti kernel)
uboot_overlay_pru=/lib/firmware/AM335X-PRU-RPROC-4-19-TI-00A0.dtbo
###pru_uio (4.14.x-ti, 4.19.x-ti & mainline/bone kernel)
#uboot_overlay_pru=/lib/firmware/AM335X-PRU-UIO-00A0.dtbo

You would have to uncomment the UIO line, and comment out the active
RPROC line (only one mode allowed at a time); then reboot.
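
The swap described above can be sketched as a small text transformation. This is a hypothetical helper (`select_pru_overlay` is not part of any BeagleBone tool); it works on a string only and does not touch `/boot/uEnv.txt`. On a real board you would normally make the same change by hand in an editor and then reboot.

```python
def select_pru_overlay(uenv_text: str, wanted: str = "UIO") -> str:
    """Return uEnv.txt text with only the wanted PRU overlay line active."""
    out = []
    for line in uenv_text.splitlines():
        bare = line.lstrip("#")
        if bare.startswith("uboot_overlay_pru="):
            # Activate the overlay whose file name contains PRU-<wanted>-,
            # and comment out every other PRU overlay line.
            line = bare if f"PRU-{wanted}-" in bare else "#" + bare
        out.append(line)
    return "\n".join(out) + "\n"


sample = """###PRUSS OPTIONS
###pru_rproc (4.19.x-ti kernel)
uboot_overlay_pru=/lib/firmware/AM335X-PRU-RPROC-4-19-TI-00A0.dtbo
###pru_uio (4.14.x-ti, 4.19.x-ti & mainline/bone kernel)
#uboot_overlay_pru=/lib/firmware/AM335X-PRU-UIO-00A0.dtbo
"""

switched = select_pru_overlay(sample)
print(switched)
```

Lines starting with `###` are left alone because, once the leading `#` characters are stripped, they do not begin with `uboot_overlay_pru=`.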

This is awesome and definitely gives me something to work with

changed as well. This makes most of the printed material out of date
concerning PRU’s and it’s painful to figure out what is current and not. I
would love to have these 2,000,000 bytes available in the new architecture.

Note that those bytes are allocated in the DDR RAM. There is no
hardware change.

Glad to know the HW did not change.

Needing help with the addressing, and some code samples would be excellent.
Hoping someone out there can help me get it.

Might https://github.com/pgmmpk/bbb_pru_adc be of use? Note that the
maximum speed for that code is 15 kHz, or just under 1/5th of your desired 80 kHz.

I’ll take a look

Not sure I am replying properly to preserve the format desired for this
page, but your (Dennis B) response definitely deserves a response from me.

  Difficult -- somehow your added comments are formatted as quoted text
(ie, my comments).

80 samples/msec... or 80000 per second. If I converted properly, about
0.0125msec per sample.

I will be sampling from an ADC up to 1MHz maybe.

  The PRU itself only runs at 200MHz, and you have to account for how
many instructions are needed to run your ADC-start, read, store, notify
host logic.
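
To put a number on that budget, the sketch below divides the 200MHz PRU clock by the sample rate. The function name is illustrative only, and how many cycles the start/read/store/notify logic really needs depends on the firmware.

```python
PRU_CLOCK_HZ = 200_000_000  # PRU core clock, as stated above


def cycles_per_sample(sample_rate_hz: float) -> float:
    """PRU clock cycles available between two consecutive samples."""
    return PRU_CLOCK_HZ / sample_rate_hz


for rate in (80_000, 1_000_000):
    print(f"{rate} S/s -> {cycles_per_sample(rate):.0f} PRU cycles per sample")
```

At 80 kS/s the firmware has 2,500 cycles per sample; at 1 MS/s it has 200.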

Does the second edition contain the updates to the remote proc process?

  It uses Remote Proc for the examples -- but does NOT have ADC examples.
It does warn that Remote Proc/rpmsg is fairly new (at the time of the book)
and that things might change. Unfortunately TI has changed its document
access, so the links in the 2nd edition book don't work. Most of the
documentation seems to be at
http://software-dl.ti.com/processor-sdk-linux/esd/docs/latest/linux/Foundational_Components_PRU-ICSS_PRU_ICSSG.html#remoteproc-and-rpmsg
(note: processor SDK -- TI's raw OS build system).

  The examples are a simple blink-LED-until-button-pressed, a PWM
generator, and an ultrasonic distance sensor.

Thanks for the DSP lead...that is awesome as well.

  Those are on the BB AI only -- and not much information has appeared on
programming them.

Hello

Google “AM335x PRU Support Package”. It has 6 labs and examples, including ADC.

It’s written for the SDK Linux that Dennis mentioned, but people have gotten these examples to work with the Linux images supported in this group.

There’s also a support package tar containing step-by-step HTML versions of all the labs above, preserved from the old wiki.

There are several PDF files supporting all this; some are application notes.

“Getting started with RPMsg on Linux” is one.

If you start with these HTML files, they are really well written and serve as a how-to.

Labs 1 to 3 require CCS JTAG.

The other labs deal with Linux.

The ADC example passes samples back to Linux.

For a full-blown PID loop using both PRUs, there’s also example code on GitHub, preserved from TI support expert Nick.

The examples in the PRU support package 5.8 code (on GitHub or wherever you get it) show linker commands for using shared RAM.

The concept of accessing these memories, in any book, has more to do with pragmas or compiler-specific linker commands; in gcc that should not change.

Every book, cookbook, or tutorial ever written amounts to all this data I pointed you to, explained by experienced engineers using the TI wiki data.

That’s my opinion.

Remember, this material talks about building code from the command line in the 4G SDK, which requires a native Ubuntu box or a VirtualBox Ubuntu on Windows 7. Officially, VirtualBox isn’t supported anymore, but I got it working recently; I only tested Windows 7.

All this discussion can be found by googling “PRU ADC beaglebone”.
There are also many discussions on the E2E forum; replace “beaglebone” with “E2E” in the search.

I’d suggest reading that getting-started document first.

Or use Mark Yoder’s cookbook, which uses web-based compilation on the board itself. I have not tried that.

Lastly, there is the apples-and-oranges approach: taking the source code I mentioned and trying to follow the cookbook’s web-based compilation. Most people seem to fail trying this, and I don’t recommend it.

Pick one approach

Good luck
Mark

Regardless of the timing, you want to store 20,000 values, but I think you’ve calculated correctly that you can only store 7,168 values in the 28k of combined PRU memory. Even that would only be true if none of the PRU memory were used by your PRU program when it’s loaded.
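
The capacity figure is easy to reproduce. The sketch below uses only the memory sizes given in the original question; the constant names are made up for illustration.

```python
# PRU memory sizes from the original question, in bytes.
PRU0_RAM = 8 * 1024
PRU1_RAM = 8 * 1024
SHARED_RAM = 12 * 1024
FLOAT_BYTES = 4

total_bytes = PRU0_RAM + PRU1_RAM + SHARED_RAM
float_capacity = total_bytes // FLOAT_BYTES   # floats that fit if nothing else uses the RAM
needed_as_float = 20_000 * FLOAT_BYTES        # bytes for 20,000 floats
needed_as_int16 = 20_000 * 2                  # bytes for 20,000 raw 16-bit samples

print(total_bytes, float_capacity, needed_as_float, needed_as_int16)
```

Even stored as raw 16-bit integers instead of floats, 20,000 samples need 40,000 bytes, which is still more than the 28,672 bytes of combined PRU memory.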

So are you trying to find a way to store the data back in the main host memory instead?

Remoteproc/RPMSG can send the data back, but I don’t think it’s fast enough for you. I’m doing that now with code written in C. (It was a bear to learn and get going, but I’m a rusty guy too.) I am reading 12-bit samples with the onboard ADC (the touchscreen controller in general-purpose mode). I saw one post on the TI E2E forum indicating that Remoteproc/RPMSG is not intended to be a fast data transfer mechanism. That was by a TI engineer, I think.

The only alternative I can think of is declaring variables in your PRU code whose address is actually in the main host memory (Linux) space. I’ve tried to do this using the example in Mark Yoder’s PRU Cookbook but I could not get it to work. I decided that I didn’t really need that and went another route.

@TJF on this forum promotes a solution called libpruio that might work. I don’t know if it’s fast enough though.

This is a good group so keep the discussion going and maybe we can figure something out.

@TJF on this forum promotes a solution called libpruio that might work. I don’t know if it’s fast enough though.

My understanding was that libpruio was designed to be very fast. I saw in the TI forum docs that UIO isn’t supported by TI in SDK Linux.

I would definitely agree with the quote below and with your math. If the OP can’t add, I don’t think he will get anything to work :face_with_hand_over_mouth:. He only said he’d reply if you did his math for him. :rofl:

I saw one post on the TI E2E forum that indicated that Remoteproc/RPMSG is not intended to be a fast data transfer mechanism. That was by a TI engineer I think.

As far as getting the conversation to continue, maybe TJF will reply.

I saw one post on the TI E2E forum that indicated that Remoteproc/RPMSG is not intended to be a fast data transfer mechanism. That was by a TI engineer I think.

In a parallel processing architecture, the ability to share data quickly between processors is paramount.
Without that, the ARM is just a gateway.
As I understood it, TJF solved this issue with libpruio, but his solution wasn’t adopted.
As I have mentioned before, I have seen custom solutions to share data between the ARM and DSP, so I know it’s possible using TI SoCs.

Hi Bruce,

Besides the 2x8k DRam and 12k SRam, libpruio uses a further memory block called ERam (extension or external). Find details at

https://users.freebasic-portal.de/tjf/Projekte/libpruio/doc/html/ChaMemory.html#SecERam

This block is 128k by default and can be configured up to 8MB = 4MS@16Bit. Using that block your task is easy:

  1. Send the physical ERam address to the PRU (DRam).
  2. Have the PRU firmware fetch ADC samples and store them in ERam (integer).
  3. Read ERam from the ARM side and compute (float).

That’s how the MM and RB modes in libpruio work (fetching samples from the internal TSC_ADC_SS).
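
Step 3 of the list above (integer samples in, floats out) might look roughly like this on the ARM side. It is a sketch under assumptions, not libpruio code: it assumes the buffer holds right-aligned 12-bit values in little-endian 16-bit words and a 1.8 V ADC reference, and `raw_to_volts` is a made-up name. The buffer here is built in memory for demonstration; on the board it would come from the ERam block.

```python
import struct


def raw_to_volts(buf: bytes, adc_bits: int = 12, vref: float = 1.8) -> list:
    """Convert little-endian unsigned 16-bit raw samples to volts.

    Assumes right-aligned adc_bits-wide values; a library that delivers
    samples in a different encoding needs a different scale factor.
    """
    count = len(buf) // 2
    raw = struct.unpack(f"<{count}H", buf[:count * 2])
    full_scale = (1 << adc_bits) - 1
    return [value * vref / full_scale for value in raw]


# Three fake samples: zero, mid-scale, full-scale.
demo = struct.pack("<3H", 0, 2048, 4095)
volts = raw_to_volts(demo)
print(volts)
```

Keeping the samples as integers on the PRU side and converting on the ARM halves the storage per value compared with 4-byte floats.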

You’re talking about 1MS/s. That’s 200 PRU cycles per value. It sounds like a simple task.

Regards

What are the memory locations of ERam?

Sorry, the link above doesn’t work any more since I updated the docs (using a newer Doxygen version which creates different file names).

Now find the memory page at
https://users.freebasic-portal.de/tjf/Projekte/libpruio/doc/html/_cha_memory.html

Edit:
I just checked the page and updated the second paragraph in the ERam section:

In both cases the number of total samples (`= AdcUdt::Samples x
AdcUdt::ChAz`) must not be greater than the available number of samples
in the external memory. The block size is available in member variable
PruIo::ESize. The member variable PruIo::EAddr (and in MM/RB mode also
the pointer AdcUdt::Samples) gets set to the start of the ERam block.
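
The constraint in the quoted paragraph can be checked before starting a capture. The helper below is hypothetical and does not call libpruio; it assumes the block size is given in bytes and that each sample occupies 2 bytes (the 16-bit case mentioned earlier in the thread).

```python
def fits_in_eram(samples_per_channel: int, channels: int,
                 esize_bytes: int, bytes_per_sample: int = 2) -> bool:
    """True if samples_per_channel x channels fits in an ERam block of esize_bytes."""
    total_samples = samples_per_channel * channels
    available = esize_bytes // bytes_per_sample
    return total_samples <= available


DEFAULT_ERAM = 128 * 1024  # default block size mentioned in the thread

one_channel_ok = fits_in_eram(20_000, 1, DEFAULT_ERAM)
four_channels_ok = fits_in_eram(20_000, 4, DEFAULT_ERAM)
print(one_channel_ok, four_channels_ok)
```

With the default 128k block, 20,000 samples on one channel fit (65,536 samples are available), while 20,000 samples on each of four channels would need a larger block.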