Yes, in general the MPU works the same.
The ROM code runs at boot, reads the SBL from the boot device (selected by the boot pins) into the internal MCU R5 static RAM and then executes it. This can run in lockstep mode if required. The executable image needs to be wrapped in a header.
What happens next depends on the SBL code. In practice on the BBAI64, I don’t know whether the SBL is actually U-Boot or whether the SBL loads U-Boot. In theory the SBL can be anything that fits into approximately 768k and can do whatever you want.
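As a rough illustration of that size limit, here is a minimal host-side sketch that checks whether an image fits. The 768 KiB figure is my reading of "approximately 768k" and is an assumption, and the function names are made up for the example; the real limit depends on the ROM memory map, so check the TRM before relying on it.

```python
import os

# Assumption: "approximately 768k" is read as 768 KiB. The true limit
# depends on the device and ROM memory map, so treat this as a placeholder.
SBL_BUDGET_BYTES = 768 * 1024


def sbl_headroom(image_size: int, budget: int = SBL_BUDGET_BYTES) -> int:
    """Return the bytes left in the budget (negative if the image is too big)."""
    if image_size < 0:
        raise ValueError("image size cannot be negative")
    return budget - image_size


def check_image(path: str, budget: int = SBL_BUDGET_BYTES) -> bool:
    """Return True if the file at `path` fits within the budget."""
    return sbl_headroom(os.path.getsize(path), budget) >= 0
```

Since the image has to be wrapped in a header, it makes sense to run a check like this on the final wrapped file rather than on the raw binary.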
The last time I did a custom BBB image, I flashed U-Boot directly to the eMMC and it was booted in raw mode. No FAT partition, no MLO.
There is a downloadable utility (SYSCONFIG) to set up pin muxing and clocks. There is also a cloud version, which requires you to set up an account with TI Cloud SYSCONFIG. I have no idea what output these utilities produce.
I put together a spreadsheet with the various pin muxings on the header connectors: PinMux Spreadsheet. I think just about every pin on the connectors can be connected to one of the PRUs if required.
At the very least you should download the PROCESSOR-SDK-RTOS-J721E. There is a lot of documentation there and code examples, plus low-level drivers. There is a section on the SBL with example code to load and run bare-metal stuff.
I am interested in running code on the MCU R5, but other than the SBL, that is not explained. I assume I would have to load the code into DDR and then jump there after the SBL code has executed. At the moment I don’t really have enough spare time to delve into it in the detail required.
I have got as far as compiling the SBL example code, changing the debug UART to the MCU UART on the mikroBUS header and outputting some text, but I have not actually booted anything yet. I have also built a custom cape with 6 CAN bus interfaces, as the code to run on the MCU R5 needs at least 6 CAN buses. Maybe over Christmas I will get some free time to investigate things further.