After scouring TI’s documentation and failing to find a minimal working example for the Cortex-R5 cores that doesn’t depend on the Processor SDK or TI’s compilers, I have decided to whip one up myself.
An update: I was able to properly configure the MPU and cache to run code from external memory. As a proof of concept I ported the Dhrystone benchmark to the Cortex-R5 running out of DDR memory. Performance is nearly identical to using TCMs due to the benchmark fitting entirely within the R5’s L1 cache.
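For anyone who wants to try it, the load itself goes through the stock remoteproc sysfs interface; below is a rough sketch of the sequence. The firmware filename and the remoteproc index are placeholders and will differ depending on which core you target.

# Copy the R5 ELF image somewhere the kernel firmware loader can find it
# ("dhrystone-r5.elf" is just a placeholder name).
sudo cp dhrystone-r5.elf /lib/firmware/

# Pick the remoteproc instance for the target R5 core; "remoteproc2" is
# only an example index - check which instance maps to which core first.
cd /sys/class/remoteproc/remoteproc2
echo dhrystone-r5.elf | sudo tee firmware
echo start | sudo tee state

# Stop the core again when you're done.
echo stop | sudo tee state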
Amazing work @Tony_Kao! Thanks for publishing this.
I don’t yet have a BB AI-64 of my own to play with – how is the latency between writing “start” to the “state” sysfs attribute and getting the “hello” message back? I imagine it’s under a second? (And although not directly related to your work, how long does the stock OS image take to boot to userspace?)
I infer the remoteproc devices exposed by the kernel are for the “MAIN”-domain cores. Are you aware of any existing exploration into the “MCU”-domain cores and implementing software images for them? Reading the TDA4VM reference manual, I’m finding plenty of theoretical capabilities for the R5 cores, but successful implementations seem rather sparse. It seems like the expected methodology is to load a boot image into the local RAM of the MCU-domain R5s and then communicate with them via internal SPI or I2C. Perhaps I’m just not looking hard enough in the upstream TI samples.
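For reference, one way I’ve been trying to map this out is to list the remoteproc instances the kernel exposes and match their device addresses against the TRM memory map. A rough sketch, assuming the standard remoteproc sysfs attributes:

# Each instance's "name" is the device name (e.g. "<address>.r5f"); the
# address can be looked up in the TDA4VM memory map to tell MCU-domain
# cores from MAIN-domain ones.
for d in /sys/class/remoteproc/remoteproc*; do
    echo "$d -> $(cat "$d/name") [$(cat "$d/state")]"
done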
Also, are you or others aware of what the debugging capabilities on the R5 cores look like? I see the AI-64 maps some UART pins to external debug headers, as well as an MCU-specific header which has a selection of GPIOs. But I don’t see a JTAG interface for on-chip debugging of the R5 cores. Those pins don’t seem to appear in the AI-64 schematic. Are you aware of what’s going on there?
Oh nice, thank you! I was overlooking that port expecting something specific to the MCU cores but evidently I should have read the reference manual more closely.
@Tony_Kao thanks for posting this example! Is there a recommended JTAG and debugger for working with the R5 cores? It looks like @Nishanth_Menon is running openocd and debugging with gdb on his fork.
Any suggestions are appreciated.
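For what it’s worth, the flow I have in mind once some probe or driver is working is the generic openocd + gdb attach sequence below; the config file and ELF names are placeholders, and openocd’s default GDB port (3333) is assumed.

# Terminal 1: openocd with whatever J721e/R5 config ends up working
# ("my-j721e-r5.cfg" is a placeholder, not a shipped config file).
sudo openocd -f my-j721e-r5.cfg

# Terminal 2: attach gdb to the target openocd exposes and poke around.
gdb-multiarch dhrystone-r5.elf
(gdb) target extended-remote localhost:3333
(gdb) info registers
(gdb) backtrace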
BTW, the first example runs fine for me, but the Dhrystone one isn’t printing anything in the trace. I need to debug!
The Dhrystone example worked for me, I haven’t tried the simpler one. I just compiled with the default settings and loaded the ELF firmware onto the R5 core. I got numbers almost exactly the same as the samples in the original GitHub repo. I might be able to help debug.
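One thing worth checking is whether anything is landing in the remoteproc trace buffer at all; the trace file only exists if the firmware’s resource table declares a trace entry. A rough sketch, with the remoteproc index again as a placeholder:

# debugfs is normally already mounted at /sys/kernel/debug on the stock image.
sudo cat /sys/kernel/debug/remoteproc/remoteproc2/trace0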
Re: debugging via direct on-chip memory access, I’m also interested in this. I came across @nmenon’s four patches here (https://review.openocd.org/c/openocd/+/7088) and have a local openocd tree with them applied. I cooked up a config file for the j721e and tried to debug code running on the R5F “main”-domain cores: attaching to the second set (running the custom Dhrystone benchmark) failed with a claim that they were offline, and attaching to the first set (running TI’s EdgeAI coprocessor firmware) locked up the whole SoC. I was partly guessing at the appropriate datasheet values to use in the config and might have gotten them wrong; I haven’t looked into it more deeply than that. I’m also only inferring that this “dmem” driver for emulated debug ports is supposed to work on the TDA4VM at all. I’d be happy to collect and post what I have if it’s helpful, perhaps in a new thread.
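In the meantime, this is roughly how I put my local tree together; the patchset number in the refspec below is a placeholder, so grab the exact refspec from the change’s Download box on Gerrit:

# Clone upstream openocd and fetch the dmem series from Gerrit.
git clone https://git.code.sf.net/p/openocd/code openocd
cd openocd
# refs/changes/88/7088/<patchset> - use the refspec Gerrit shows for the
# topmost change of the series so its parent commits come along with it.
git fetch https://review.openocd.org/openocd refs/changes/88/7088/4
git checkout FETCH_HEAD

# Standard autotools build.
./bootstrap
./configure
make -j"$(nproc)"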
Aah yes - I need to follow up to get it merged, but this should be way easier than the JTAG route. Though, it also needs the very latest TIFS firmware. My production AI-64 is on order; I’m waiting for it to arrive to run some experiments.
Awesome news! Looking forward to working with the R5 cores.
@kaelinl, if it is not too much trouble, what were your DogTag and kernel versions for the working Dhrystone run? I am going to start another thread to discuss some R5 experimentation I would like to perform.
@FredEckert @kaelinl → Apparently the TIFS firmware update (the one that allows firewall access) is yet to be released - it is due in TI’s 8.5 firmware release. So you might not be able to use the openocd dmem path just yet on j721e… This could explain the various failures you are seeing as a consequence.
Hmmm OK, that’s unfortunate. Thanks for the information though!
Do you have any sense of when it’ll become available? And is this firmware something I can flash in-place myself? (…what is TIFS?)
Separately, if I end up going the JTAG+TagConnect route, is it indeed infeasible to connect the probe without removing the heatsink? Where’d you get that fan shown in the pictures?
I figure if that new firmware is going to take more than a couple weeks I’ll spend the few hundred dollars on the aggressively overpriced debug equipment and do it the hard way.
Oh, also, for reference, these are the openocd options I derived from @Nishanth_Menon’s patches and the J721e datasheet. I have no idea where the 0x1d500000 size came from in the original sample for am625, and I’m pretty sure it’s wrong.
TIFS is TI Foundational Security (the firmware that runs inside the security enclave)… I hear it is due to be released in a couple of weeks, but working at TI, I do have access to a prebuilt, so I am able to check things out… as in the log above, I did try it out a bit…
debian@BeagleBone:~/openocd/openocd/tcl$ sudo devmem2 0x4C60000000
/dev/mem opened.
Memory mapped at address 0xffff9d48f000.
Read at address 0x4C60000000 (0xffff9d48f000): 0x00001003
debian@BeagleBone:~/openocd/openocd/tcl$ sudo devmem2 0x4C40002000
/dev/mem opened.
Memory mapped at address 0xffff9df24000.
Read at address 0x4C40002000 (0xffff9df24000): 0x1BB6402F