Streaming control data to PRU

I am trying to get started on a project where I will be streaming control data to the BeagleBone Black PRU to control two stepper motors. I have an endless stream of data generated in near real time in the format: wait X nanoseconds, set 4 GPIO output pins to state Y, wait the next X nanoseconds, and so on. I want to write, say, 4 KB of data at a time into PRU memory, then, while the PRU is replaying the first 4 KB, write the next 4 KB into PRU memory (double buffering), so that the PRU never stops or misses any nanoseconds when replaying the data.
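To make the idea concrete, here is a rough sketch in plain C of the layout and handshake I have in mind. It is only a host-side model, not working PRU code. The record layout (two 32-bit words per step), the `ready` flags, and the names `step_record_t`, `stream_area_t`, `host_try_fill`, and `pru_replay` are all my own assumptions. The PRU side is simulated by an ordinary function that just sums the delays instead of actually waiting and toggling pins.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define BUF_BYTES 4096u  /* size of each half of the double buffer */

/* One replay step (assumed layout): wait delay_ns, then drive the 4 pins. */
typedef struct {
    uint32_t delay_ns;  /* time to wait before applying the pin state */
    uint32_t pins;      /* low 4 bits = state of the 4 GPIO outputs */
} step_record_t;

#define RECORDS_PER_BUF (BUF_BYTES / sizeof(step_record_t))

/* Hypothetical shared area: two buffers plus one ownership flag each.
 * ready[i] == 1: filled by the host, owned by the PRU.
 * ready[i] == 0: consumed by the PRU, free for the host to refill. */
typedef struct {
    volatile uint32_t ready[2];
    step_record_t buf[2][RECORDS_PER_BUF];
} stream_area_t;

/* Host side: fill one half if the PRU has released it.
 * Returns 1 on success, 0 if that half is still waiting to be replayed. */
static int host_try_fill(stream_area_t *a, unsigned idx,
                         const step_record_t *src)
{
    assert(idx < 2);
    if (a->ready[idx])
        return 0;
    memcpy(a->buf[idx], src, sizeof a->buf[idx]);
    /* Publish the flag last; real hardware would likely also need a
     * memory barrier between the copy and this store. */
    a->ready[idx] = 1;
    return 1;
}

/* Stand-in for the PRU loop: replay one half and hand it back.
 * Returns 1 on success, 0 on underrun (the half was not ready).
 * Here it only accumulates the delays into *total_ns. */
static int pru_replay(stream_area_t *a, unsigned idx, uint64_t *total_ns)
{
    assert(idx < 2);
    if (!a->ready[idx])
        return 0;
    uint64_t sum = 0;
    for (size_t i = 0; i < RECORDS_PER_BUF; i++) {
        /* A real PRU would wait delay_ns and then write pins here. */
        sum += a->buf[idx][i].delay_ns;
    }
    *total_ns = sum;
    a->ready[idx] = 0;  /* release the half back to the host */
    return 1;
}
```

The intent is that the host keeps calling `host_try_fill` on whichever half is free while the PRU alternates between half 0 and half 1. With 8-byte records, each 4 KB half holds 512 steps, and the whole structure is a little over 8 KB.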

I really don’t know where to even start with device tree overlays and getting the PRU working. Does anyone know of existing code that streams data to the PRU that I could use as a starting point? Can a normal process even write to the PRU shared memory without interrupting PRU execution?