edgeAI flasher images not working on AI-64

Dear all,

I just spent a couple of hours finding out that none of the EdgeAI images work ‘out of the box’. There are altogether 4 images of this type for the AI-64. They show the heartbeat, but the board is not reachable via COM or SSH.
Am I really the first one to notice this, or did I miss something during installation?
In general I know how to handle a flasher image: I successfully flashed the 6.12 image (which doesn’t help me much, because I need the edgeAI lib).

Best Regards
Markus

Which of the 4 images are you trying?

I see the following available:

  • 11.6 (REVB1 production)
  • 11.8 (2023-10-07)
  • 11.7 (2023-09-02)
  • 11.5 (2022-11-01) not listed as a flasher image

My colleague got one of them working recently, I’ll verify which one they used.

My current status: I always write the card with the BeagleBoard Image utility, then press the boot button, power on (DC connector), and release the boot button. balenaEtcher always fails at validation, even if I unpack the images first. I tested most of the items in the list below on both an Ubuntu 24 and a Win11 host. My procedure works with this image without any problems. To make sure the eMMC content has no impact on the test, I booted the 13.3 image from eMMC and made it ‘unbootable’ with: sudo dd if=/dev/zero of=/dev/mmcblk0 bs=1M count=16 conv=fsync (verified: only the power LED is on, no heartbeat). I don’t have a serial cable available to show boot messages.

  • 11.8 claims to be a flasher image and has the following entries in extlinux.conf:
    menu title BeagleBone AI-64 microSD (extlinux.conf) Options

    timeout 50

    default BeagleBone AI-64 microSD (default)

    label BeagleBone AI-64 microSD Recovery
    kernel /Image
    append root=/dev/mmcblk1p2 ro rootfstype=ext4 rootwait net.ifnames=0
    fdtdir /
    initrd /initrd.img

    label BeagleBone AI-64 copy microSD to eMMC
    kernel /Image
    append root=/dev/mmcblk1p2 ro rootfstype=ext4 rootwait net.ifnames=0 init=/usr/sbin/init-beagle-flasher
    fdtdir /
    initrd /initrd.img

    label BeagleBone AI-64 microSD (default)
    kernel /Image
    append root=/dev/mmcblk1p2 ro rootfstype=ext4 rootwait net.ifnames=0 quiet init=/usr/sbin/init-beagle-flasher
    fdtdir /
    #fdtoverlays /overlays/.dtbo
    initrd /initrd.img

    Nevertheless, after a short period (shorter than expected) it shows a heartbeat, but the board is not reachable via COM or SSH. No ‘knight-rider’ running LEDs either.

  • 11.7: more or less the same as described for 11.8 above.

  • 11.6: more or less the same as described for 11.8 above. The description on the website says it was used in a webinar (I haven’t checked that yet).
    This image has only one entry in extlinux.conf:
    label Linux microSD
    kernel /Image
    append root=/dev/mmcblk1p2 ro rootfstype=ext4 rootwait net.ifnames=0 init=/usr/sbin/init-beagle-flasher quiet
    fdtdir /
    #fdtoverlays /overlays/.dtbo
    initrd /initrd.img

  • 11.5: more or less the same as described for 11.8 above. This image has only one entry in extlinux.conf:
    label Linux microSD
    kernel /Image
    append console=ttyS2,115200n8 earlycon=ns16550a,mmio32,0x02800000 root=/dev/mmcblk1p2 ro rootfstype=ext4 rootwait net.ifnames=0 quiet
    fdtdir /
    initrd /initrd.img

After testing these four images (in order from top to bottom) I did a successful back-to-back test with the working 6.12 image. I used the same SD card (64 GB) for all tests. One more observation: none of the edgeAI images has a sysconf.txt file in \boot.
So as a final test I added a sysconf.txt file (the default one) to the 11.8 image, but didn’t notice any difference (no surprise, because none of the listed images even starts the flash process).

Assuming that no one puts broken images on the web, and since I couldn’t find anything about this in the forum, I assume I missed something that applies specifically to EdgeAI images. Could someone please point me to it?

Best Regards
Markus

I started with a board with the 13.3 image loaded (kernel 6.12).

With a 64GB micro-sd, used BeagleBoard Imager (Linux Flatpak version) to write the Debian 11.8 flasher image. The written card contains an extlinux.conf that looks like what you posted.

I installed the card, held BOOT, and powered on. While continuing to hold BOOT, I got to the flashing lights roughly 20 seconds in. I do have a console cable, so I was able to monitor the progress, and it does indeed flash the board.

Thoughts:

  • After you power on, how long do you continue to hold the BOOT button? Try holding it for more than 10 seconds.
  • Have you tried a different power supply? I’m using USB-C, 5.1V / 3A max (one of the newer RPI branded supplies).

So the 11.8 image works, at least.

I’m not sure what else to suggest other than to verify with a console cable. For reference (as I had to do some digging to come up with this info), I found the following parts to construct a cable, since I didn’t find a ready-made assembly:

  • JST P/N ZHR-3 (DigiKey: 455-1160-ND) need 1 per cable, order at least 10
  • JST P/N ASZHSZH28K305 (DigiKey: 455-3080-ND) need 3 per cable, these are the 12” leads with pre-crimped connectors. I cut them in half, install them into the housing, and then solder to 0.1” headers for use with a 3.3v USB-TTL cable. (I’m sure there’s other ways to make the cable, if you can find pre-made 3-pin assemblies that’s great, this is just what worked for me.)

Hi,
thanks for all the hints.

I ordered the parts for a UART cable and should receive them by the middle of next week. I will report back as soon as I have it working.

One more question about the ‘hold’ time of the boot button. I thought the boot button forces the board to boot from the SD card if the low-level software reads the button as pressed at power-up. It then reads extlinux.conf and displays the menu, and after the timeout it chooses the option marked as ‘default’. So the duration of the hold shouldn’t really matter. I assume I don’t have the full picture here. Please let me know what effect the hold time of the boot button has.
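To make the 'default' lookup described above concrete, here is a minimal Python sketch. It assumes the simple keyword/value layout of the extlinux.conf files quoted earlier in this thread. The function names are hypothetical, and the parser only mimics the label selection; it says nothing about how the bootloader evaluates the BOOT button.

```python
def parse_extlinux(text):
    """Return (default_label, {label: {keyword: value}}) for a simple extlinux.conf."""
    default = None
    entries = {}
    current = None
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and commented-out keywords
        keyword, _, value = line.partition(" ")
        keyword, value = keyword.lower(), value.strip()
        if keyword == "default":
            default = value
        elif keyword == "label":
            current = value
            entries[current] = {}
        elif current is not None:
            entries[current][keyword] = value  # kernel, append, fdtdir, initrd
    return default, entries


def default_runs_flasher(text):
    """True if the default entry's kernel command line starts the flasher init."""
    default, entries = parse_extlinux(text)
    append = entries.get(default, {}).get("append", "")
    return "init=/usr/sbin/init-beagle-flasher" in append.split()


# Shortened copy of the 11.8 configuration quoted in this thread.
SAMPLE = """\
timeout 50
default BeagleBone AI-64 microSD (default)

label BeagleBone AI-64 microSD Recovery
kernel /Image
append root=/dev/mmcblk1p2 ro rootfstype=ext4 rootwait net.ifnames=0

label BeagleBone AI-64 microSD (default)
kernel /Image
append root=/dev/mmcblk1p2 ro rootfstype=ext4 rootwait net.ifnames=0 quiet init=/usr/sbin/init-beagle-flasher
"""

print(default_runs_flasher(SAMPLE))
```

Running this against the file on a freshly written card is a quick way to confirm that the default entry really is the flasher entry before suspecting the board.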

Best Regards
Markus

Dear all,

I am continuing with my observations; I still haven’t received my UART adapter.

I accidentally found out that success (in terms of a working board) after flashing also depends on the kernel version and the SD card. I have two Samsung (Pro, UHS-I U3) SD cards that don’t work with 5.10 kernels. The same cards run with 6.12 and 6.18 kernels. I have another SD card (whose manufacturer I can no longer read) that works with 5.x and 6.x kernels. Of course, I checked all my SD cards with the usual tools to make sure they are 100% OK.

There are already a couple of comments in this forum about suitable SD cards. To me it seems to be a question not only of the SD card but also of the kernel version. Maybe that rings a bell with someone involved in kernel driver topics.

So far I still haven’t managed to use EdgeAI on one of the latest kernels. I will keep trying.

Best Regards
Markus

To use the newer EdgeAI and kernels you’ll need to build your own Yocto based image.

Unfortunately, there is no working EdgeAI Yocto image build for the BeagleboneAI64. I did manage to get a TISDK 11.00 EdgeAI image build working for the TDA4VM-SK/J721E-SK dev board.

Here is my meta layer with a fix to get the EdgeAI demo working for the J721E-SK dev board meta-j721e-sk-edgeai-sdk-11-00-fix

I have been meaning to also make a Yocto meta layer to fix the TISDK 11.01 EDGE AI build for the BeagleboneAI64. After some time, I got burned out and have not gotten back to it yet.

Some links to help you get started:

Figuring out a Yocto EdgeAI build for the BeagleboneAI64

There are two paths I see to attempt to get a working EdgeAI image build for the BeagleboneAI64.

  • One direction is to start from the working SDK 11.00 setup I have for the J721E-SK build, and then make your own Yocto meta-layer with modifications for the BeagleBoneAI64. You’ll need to change the device tree config, bootloader config, and whatever else you run into. I have successfully used this method once for a similar TI SoC-based board.

  • The other direction is to start from either SDK 11.01.07.05 or SDK 11.02.00. IIRC these two SDKs have working tisdk-default-image Yocto builds for the BeagleboardAI64. They will boot up and make it to a bare Weston desktop. This is the direction I am attempting to take.

    For your Yocto local config, in conf/local.conf, make sure to set MACHINE = "beaglebone-ai64" to use beaglebone-ai64.conf.

Yocto build instructions for BBAI64

Start with the normal TI Scarthgap-based Yocto setup described here Setup dev container and so on

Inside your TI build container, do the following steps.

git clone https://git.ti.com/git/arago-project/oe-layersetup.git tisdk
cd tisdk
./oe-layertool-setup.sh -f configs/processor-sdk-analytics/processor-sdk-analytics-11.##.##-config.txt

cd build
. conf/setenv

echo 'ARAGO_BRAND = "edgeai"' >> conf/local.conf
echo 'MACHINE = "beaglebone-ai64"' >> conf/local.conf

Build the default tisdk image:

bitbake -k tisdk-default-image # NO TI EdgeAI!

Build the Edge AI image:

bitbake -k tisdk-edgeai-image #currently broken for BBAI64
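Since the two echo lines above append to conf/local.conf, running them twice or on top of an existing MACHINE line leaves several assignments in the file. The following Python sketch shows one way to check which plain assignment comes last. It is only an approximation under stated assumptions: it handles simple VAR = "value" lines, ignores ?=, overrides, and includes, and the function name is hypothetical. BitBake's real parsing is richer.

```python
import re

# Matches only plain assignments such as: MACHINE = "beaglebone-ai64"
ASSIGN = re.compile(r'^\s*([A-Z_][A-Z0-9_]*)\s*=\s*"([^"]*)"\s*$')


def last_assignments(local_conf_text):
    """Return {variable: value} using the last plain '=' assignment of each variable."""
    values = {}
    for line in local_conf_text.splitlines():
        match = ASSIGN.match(line)
        if match:
            values[match.group(1)] = match.group(2)  # later lines overwrite earlier ones
    return values


def check_bbai64_edgeai(local_conf_text):
    """Return a list of human-readable problems; empty means both settings look right."""
    values = last_assignments(local_conf_text)
    problems = []
    if values.get("MACHINE") != "beaglebone-ai64":
        problems.append("MACHINE is %r" % values.get("MACHINE"))
    if values.get("ARAGO_BRAND") != "edgeai":
        problems.append("ARAGO_BRAND is %r" % values.get("ARAGO_BRAND"))
    return problems
```

Usage would be something like check_bbai64_edgeai(open("conf/local.conf").read()) from the build directory before starting a long bitbake run.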

Building a current EdgeAI image for the BeagleBone AI-64 — progress + one remaining blocker

Following up on the “EdgeAI flasher images not working on AI-64” topic. I went the build-it-yourself route and got a current tisdk-edgeai-image to build, boot, and bring up a display on the BeagleBone AI-64. The accelerated TIOVX path still fails at runtime, and I think I’ve pinned down exactly why — would appreciate input from anyone who’s gone further.

Setup

  • oe-layersetup with processor-sdk-analytics-11.02.00-config.txt

  • MACHINE = "beaglebone-ai64", ARAGO_BRAND = "edgeai"

  • Building tisdk-default-image (works) and tisdk-edgeai-image

  • All fixes kept in a separate machine/brand-scoped meta layer so the TI layers stay untouched

Build fixes that got tisdk-edgeai-image to compile

1. ti-vision-apps was pinned to a stale manifest for the edgeai brand. The recipe has a brand override:

SRC_URI:edgeai = "...REL.PSDK.ANALYTICS.NONSAFETY.11.01.00.03...;manifest=vision_apps_yocto_nonsafety.xml"

With ARAGO_BRAND = "edgeai" that override wins, so vision_apps was built from the 11.01 release, while ti-tidl (its arm-tidl is at SRCREV 4366d488…) is at 11.02. Result: ti-tidl failed to link with

libvx_tidl_rt.so: undefined reference to `tivxCreateObjectArrayFromList'
libvx_tidl_rt.so: undefined reference to `tivxTIDLNodeV2'

i.e. the 11.01 libtivision_apps.so doesn’t export the newer TIOVX/TIDL symbols. There is no 11.02 NONSAFETY analytics tag (only REL.PSDK.ANALYTICS.11.02.00.01..10), which is presumably why the override was never bumped. I overrode SRC_URI:edgeai to the 11.02 manifest the recipe’s own default uses:

SRC_URI:edgeai = "...REL.PSDK.ANALYTICS.11.02.00.10...;manifest=vision_apps_yocto.xml"

2. The full 11.02 manifest pulls in the SRV GPU 3D demo, which needs two third-party headers. After the manifest fix, ti-vision-apps failed in kernels/srv/gpu/3dsrv/graphics:

fatal error: stb_image.h: No such file or directory
fatal error: nlohmann/json.hpp: No such file or directory

Both recipes already exist in the layer set, so adding

DEPENDS:append:edgeai = " nlohmann-json stb"

to ti-vision-apps resolved it, and the full image then built cleanly (all tasks succeeded).

Runtime status

  • Image boots from microSD, kernel 6.12.43 (from linux-bb.org).

  • edgeai-init.service autostarts the OOB demo (edgeai-gui-app -platform linuxfb).

  • Display: at the monitor’s native 4K I got TIDSS Plane1 underflow / CRTC0 SYNC LOST. There is no extlinux.conf and no devicetree line in the SD’s GRUB config; the SD boots via GRUB/EFI (EFI/BOOT/grub.cfg), and the name_overlays= line in uEnv.txt is not honored on this boot path. Forcing the mode in the GRUB linux line fixed it:

video=DP-1:1920x1080@60

The blocker: TIOVX can’t initialize its shared memory

The OOB GUI starts, then bails:

MEM: ERROR: Failed to initialize DMA HEAP [/dev/dma_heap/carveout_vision_apps_shared-memories] !!!
APP: ERROR: Memory init failed !!!
Bail out! ERROR: .../gsttiovxcontext.c:146:gst_tiovx_context_init: assertion failed: (0 == ret)

So the vision_apps_shared-memories carveout / DMA heap is missing from the device tree.

Root cause (as far as I can tell): MACHINE = beaglebone-ai64 uses the BeagleBoard.org kernel (linux-bb.org, here 6.12.43+git), and its device trees do not define the TI EdgeAI memory map. A grep -rl "vision_apps" across that kernel’s arch/arm64/boot/dts/ti returns nothing — not for the BBAI64 DT, and not even for the k3-j721e-sk.dtb it builds. The k3-j721e-edgeai-apps.dtbo overlay referenced in uEnv.txt is not built/present anywhere in the image. The TI vision_apps reserved-memory carveouts, DMA heaps and the related remoteproc/IPC nodes appear to live only in the TI kernel (ti-linux-kernel), not in the bb.org kernel this machine config selects.
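For anyone who wants to repeat the device-tree search on another kernel tree, here is a small Python sketch of the same idea as the grep -rl above. The function name and the restriction to .dts/.dtsi/.dtso files are my own assumptions; the original grep looked at every file.

```python
import os


def files_mentioning(root, needle="vision_apps", suffixes=(".dts", ".dtsi", ".dtso")):
    """Return device-tree sources below root that contain needle (paths relative to root)."""
    hits = []
    for dirpath, _dirs, files in os.walk(root):
        for name in files:
            if not name.endswith(suffixes):
                continue  # only look at device-tree sources
            path = os.path.join(dirpath, name)
            with open(path, "r", errors="replace") as handle:
                if needle in handle.read():
                    hits.append(os.path.relpath(path, root))
    return sorted(hits)
```

Called as files_mentioning("arch/arm64/boot/dts/ti") inside the kernel source, an empty list corresponds to the "returns nothing" result described above.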

Open questions

  1. Has anyone gotten the TIOVX vision_apps memory carveouts (the carveout_vision_apps_shared-memories DMA heap + the C7x/R5 reserved-memory and remoteproc nodes) onto the BBAI64 while using the linux-bb.org kernel?

  2. Is the realistic path to switch beaglebone-ai64 to ti-linux-kernel + a TI-style J721E device tree, or to port TI’s J721E rtos memory-map / vision_apps reserved-memory into the bb.org BBAI64 DT as an overlay/patch?

  3. Does a working k3-j721e-edgeai-apps.dtbo (or an equivalent carveout overlay) exist anywhere for the bb.org kernel — and if so, how is it applied on the GRUB/EFI boot path (which ignores uEnv.txt name_overlays)?

Happy to share the full fix layer if it’s useful. The build side seems solved; the device-tree memory map for EdgeAI on the bb.org kernel is the piece I’m stuck on.

Nice work and findings!

With my attempt at getting a Yocto TI EdgeAI build going, IIRC, I was able to get this overlay added and applied. I changed up a bunch of boot stuff. Unfortunately, I don’t remember much; it has been a few months since I touched this stuff. I will try looking into what I did later today. I need to clean things up a bit and make a nice meta layer.

Are you able to share your custom meta layer work? If not, I should be able to follow your instructions.

I put my meta layer up on GitHub. Please ignore most of it. I let the AI do some stuff without thinking much about it. It’s a mess.

Anyway, here is how I set up k3-j721e-edgeai-apps.dtso

recipes-kernel/linux/linux-bb.org_6.12.bbappend
recipes-kernel/linux/files/k3-j721e-edgeai-apps.dtso
recipes-kernel/linux/files/k3-j721e-rtos-memory-map.dtsi

U-Boot config changes

recipes-tisdk/tisdk-uenv/tisdk-uenv.bbappend
recipes-tisdk/tisdk-uenv/tisdk-uenv/beaglebone-ai64/uEnv.txt

vision_apps / EdgeAI on BBAI-64 with kernel 6.12: remote cores hang, “Exceeded max object descriptors”

Follow-up to my earlier post — first of all, thanks for the overlay files
(k3-j721e-edgeai-apps.dtso / k3-j721e-rtos-memory-map.dtsi). They were very
helpful as a reference, although the memory map itself turned out NOT to be the
remaining problem in my setup. I’m posting the full root cause and a step-by-step
fix here, because this will hit anyone running TI vision_apps (PSDK-Analytics
11.02.x) on the BeagleBone AI-64 with a mainline-based kernel (bb.org 6.12 in my
case), and the failure mode is extremely confusing to debug.

Symptom

vx_app_arm_mem.out (and every other vision_apps demo) fails on the A72 with:

ownContextCreateCmdObj: context object descriptor [0] allocation failed / Exceeded max

All 5 remote cores show running in remoteproc, no errors in dmesg. But the
shared APP_LOG (phys 0xAC000000) reveals that only mcu2_0, mcu2_1 and C6x_2
reach the vision_apps boot sync barrier (“APP: Syncing with 5 CPUs”), while
C6x_1 and C7x_1 never even print their first line — they hang in the very
earliest RTOS startup. Since the barrier never completes, no RTOS core ever runs
tivxPlatformResetObjDescTableInfo(), the object descriptor table at
0xAC040000 stays uninitialized, and the A72 finds no free (0xFFFF) slot →
“Exceeded max” at descriptor [0]. The error message is purely a downstream
symptom.

Root cause: Linux steals the firmware’s DMTimers

Each vision_apps RTOS core uses a fixed main-domain DMTimer as its OS tick:

Core      Timer
C66x_1    main_timer0
C66x_2    main_timer1
C7x_1     main_timer2
R5F       main_timer12–15

TI’s vendor kernel marks exactly these timers status = "reserved" in
k3-j721e-main.dtsi
(see ti-linux-6.6.y: main_timer0, 1, 2, 12, 13, 14, 15).
Mainline — and therefore the bb.org kernel — reserves none of them. So Linux
probes all 20 main timers with TI_SCI_PD_EXCLUSIVE power domains, and at boot
you get a race between the Linux timer driver and the auto-booting RTOS cores
over the TI-SCI device claims. In my boots, C6x_2 happened to win TIMER1 while
C6x_1 and C7x_1 lost TIMER0/TIMER2 — k3conf dump devices showed exactly those
two as DEVICE_STATE_OFF while everything else was ON. A core that cannot claim
its tick timer hangs before its first log line. That’s why two identical C66
cores behaved differently: it’s a boot lottery, not a configuration difference.

This explains why nobody sees this on the TI SK-EVM with the TI kernel — the
reservations are already in TI’s SoC dtsi there.
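As a rough illustration of the check described above, the following Python sketch scans saved k3conf dump devices output for firmware tick timers that are reported as DEVICE_STATE_OFF. Assumptions, loudly: I am assuming the main-domain timers appear with names ending in TIMER<n> and the MCU-domain ones with MCU_TIMER<n> in the name, and the function name is hypothetical. Adjust the pattern to whatever your k3conf actually prints.

```python
import re

# Timer indices used by the vision_apps firmware, per the table above.
FIRMWARE_TIMERS = {0, 1, 2, 12, 13, 14, 15}

# Assumed line shape: "... <PREFIX>TIMER<n> ... DEVICE_STATE_<STATE> ..."
LINE = re.compile(r"(\w*?)TIMER(\d+)\b.*?(DEVICE_STATE_\w+)")


def firmware_timers_off(dump_text):
    """Return the sorted main-domain firmware timer indices reported as OFF."""
    off = []
    for line in dump_text.splitlines():
        match = LINE.search(line)
        if not match:
            continue
        prefix, index, state = match.group(1), int(match.group(2)), match.group(3)
        if "MCU" in prefix:
            continue  # MCU-domain timers are not in the firmware list (assumed naming)
        if index in FIRMWARE_TIMERS and state == "DEVICE_STATE_OFF":
            off.append(index)
    return sorted(off)
```

On a boot that shows the failure described above, this would be expected to return the indices of the timers whose cores never logged anything.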

Secondary finding: C66 dma-region pairing differs between BBAI-64 and SK

While digging, I also found that k3-j721e-beagleboneai64.dts pairs the C66
reserved-memory nodes straight (c66_0 → dma@a6000000 + mem@a6100000),
whereas k3-j721e-som-p0.dtsi (SK) pairs them crossed. If you port the
vision_apps memory map by overriding reg on the existing nodes (instead of
using the overlay), the correct assignment per app_mem_map.h is:

  • node c66-dma-memory@a6000000 (dsp@4d80800000 / C66x_1) → 0xa9000000
  • node c66-dma-memory@a7000000 (dsp@4d81800000 / C66x_2) → 0xa8000000

Note this was NOT the cause of the hang: readelf -x .resource_table shows the
C66 firmwares request their vrings with da=0xFFFFFFFF (ADDR_ANY), so the
kernel allocates freely and the firmware reads the address back. It still needs
to be correct for a clean setup. Also, the C7x’s
remoteproc remoteproc2: unsupported resource 65538 in dmesg is harmless — it’s
TI’s vendor trace-buffer resource entry, which mainline simply skips.

The fix

Add this to your board DT (I include a .dtsi into the board dts via a Yocto
bbappend; the same content works in an overlay):

/* DMTimers used by vision_apps RTOS firmware — keep Linux off them.
 * Parity with TI vendor kernel (ti-linux-6.6.y k3-j721e-main.dtsi). */
&main_timer0  { status = "reserved"; };
&main_timer1  { status = "reserved"; };
&main_timer2  { status = "reserved"; };
&main_timer12 { status = "reserved"; };
&main_timer13 { status = "reserved"; };
&main_timer14 { status = "reserved"; };
&main_timer15 { status = "reserved"; };

plus the corrected C66 dma reg values mentioned above if you use the
reg-override approach.

Rebuild the DTB, deploy, reboot. Result on my board: all 5 RTOS cores boot and
complete the sync barrier (APP_LOG shows “Syncing with 5 CPUs” twice per core),
the obj_desc table gets reset (entries start 0000ffff — the 0xFFFF INVALID
marker is the type field at offset 2, not offset 0), and vx_app_arm_mem.out
runs clean end-to-end, including after a plain reboot with kernel auto-boot of
all cores.

How to verify (no JTAG needed)

# Timer states — none of the firmware timers should be OFF / Linux-bound:
k3conf dump devices | grep TIMER

# APP_LOG + obj_desc check via /dev/mem:
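python3 - <<'PYEOF'
import mmap, os
# Map physical memory read-only (requires root and an accessible /dev/mem).
fd = os.open('/dev/mem', os.O_RDONLY | os.O_SYNC)
# APP_LOG at 0xAC000000: count barrier lines and per-core log prefixes.
log = mmap.mmap(fd, 0x40000, prot=mmap.PROT_READ, offset=0xAC000000)[:0x40000]
print('Syncing:', log.count(b'Syncing with'),
      '| C6x_1:', log.count(b'C6x_1'), 'C7x_1:', log.count(b'C7x_1'))
# First 8 bytes of the object descriptor table at 0xAC040000.
obj = mmap.mmap(fd, 4096, prot=mmap.PROT_READ, offset=0xAC040000)
print('obj_desc[0..8]:', obj[:8].hex())
os.close(fd)
PYEOF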
# Goal: Syncing=10 (5 cores x 2 lines), C6x_1>=1, C7x_1>=1, obj_desc 0000ffff...
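To interpret the hex string printed by the script above, here is a minimal Python sketch. It relies only on what was stated earlier (the 0xFFFF marker sits in the type field at offset 2); the little-endian byte order and the function names are my assumptions, and the meaning of the first two bytes is deliberately left open.

```python
def obj_desc_type_field(first_bytes):
    """Return the 16-bit field at offset 2 of the first descriptor (little-endian assumed)."""
    if len(first_bytes) < 4:
        raise ValueError("need at least 4 bytes of the descriptor table")
    return int.from_bytes(first_bytes[2:4], "little")


def table_looks_reset(hex_dump):
    """True if the first descriptor carries the 0xFFFF marker in its type field."""
    return obj_desc_type_field(bytes.fromhex(hex_dump)) == 0xFFFF


def expected_sync_lines(cores=5, lines_per_core=2):
    """Number of 'Syncing with' lines expected once every core passed the barrier."""
    return cores * lines_per_core
```

A dump starting with 0000ffff therefore passes the check, while an all-zero table does not.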

One warning

Do NOT try to fix this at runtime by unbinding the Linux timer driver and
re-enabling the devices with k3conf enable device … while restarting cores. I
managed to hard-freeze the entire SoC that way (heartbeat included) — most
likely an interconnect stall or a k3conf-vs-kernel race on the secure proxy
(k3conf talks TI-SCI via /dev/mem behind the kernel’s back). Put the
reservations in the device tree, where they belong.

Environment for reference

  • BeagleBone AI-64 (J721E SR1.1), Yocto tisdk-edgeai-image
  • Kernel 6.12.43-ti (linux-bb.org), extlinux boot (kernel loaded low so all
    vision_apps reserved-memory regions can be placed)
  • PSDK-Analytics 11.02.03 (ti-vision-apps 11.02.03, ti-edgeai-firmware
    REL.PSDK.ANALYTICS.11.02.00.10)
  • Other prerequisites from earlier in the saga (kernel 6.12 specifics): carveout
    DT node for the dma-heap-carveout shared region, R5F split-mode
    (ti,cluster-mode = <0>), ti-rpmsg-char chrdev-lookup fix, mcu3 removed from
    the HLOS IPC list, DMA-heap name symlinks
    (vision_apps_shared-memories / carveout_…) at runtime.

Hope this saves someone the weeks it cost me. Happy to share the full .dtsi /
bbappend if anyone wants it.

EdgeAI OOB demos (TIDL) on BBAI-64 / kernel 6.12 — Part 3: the firmware/model version maze, a custom mcu2_0 build, and working demos

Follow-up to my previous post (timer reservations / “Exceeded max object
descriptors”). At the end of that one, vx_app_arm_mem.out ran cleanly and the
next milestone was the real out-of-box demo with TIDL on the C7x. That turned
out to be a whole second journey, so here is part 3 — ending with both
edgeai-gst-apps demos (image classification and object detection with bounding
boxes) running live from a USB camera to the display on a self-built Yocto
image (bb.org 6.12 kernel) on the BeagleBone AI-64.

TL;DR: the PSDK-ANALYTICS firmware that the edgeai Yocto recipes pin is
internally an older TIOVX/TIDL code line than its version number suggests. It is
incompatible with both the 11.02 userspace and every published model zoo. The
fix is to run the firmware from PSDK RTOS 11.02.00.06 instead — which
requires building mcu2_0 yourself with two flags, because the prebuilt is an
EVM build that hard-freezes the BBAI-64. Full recipe below.

Problem 1: no published model zoo matches the ANALYTICS firmware

First demo attempt (app_edgeai.py + TFL-CL-0000-mobileNetV1-mlperf from the
11_02 model zoo) failed with a TIDL network version mismatch. The firmware’s
TIDL runtime expects network version 0x20250429; the stamp is simply the build
date of the TIDL library of the edgeai-tidl-tools release the model was compiled
with. Byte-grepping the tools releases gives this mapping:

Zoo release   Network stamp   Tools release
10_01_00      0x20241120      10_01_00_02
11_00_00      0x20250630      11_00_07/08
11_01_00      0x20250821      11_01 line
11_02_00      0x20251208      11_02_04_00

0x20250429 = tools 11_00_06_00, which sits in a gap: no published zoo
release was compiled with it. Workaround that does work: compile the model
yourself with edgeai-tidl-tools 11_00_06_00 (docker ubuntu:22.04,
SOC=am68pa for TDA4VM,
python3 tflrt_delegate.py -c --models cl-tfl-mobilenet_v1_1.0_224).
Pitfall: if the .tflite already exists, the import skips the optimize step
(mean/scale folding, input→uint8) and you get “Got UINT8 but expected
FLOAT32”. Delete the model and artifacts, then recompile.
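Because the stamp is just a build date written as hex digits, it can be decoded mechanically. The sketch below is a small Python illustration using only the values from the table above; the function and dictionary names are hypothetical.

```python
from datetime import date

# Zoo releases keyed by network stamp, copied from the table above.
ZOO_BY_STAMP = {
    0x20241120: "10_01_00",
    0x20250630: "11_00_00",
    0x20250821: "11_01_00",
    0x20251208: "11_02_00",
}


def stamp_to_date(stamp):
    """Read the hex digits of a network stamp as YYYYMMDD."""
    digits = format(stamp, "08x")
    return date(int(digits[:4]), int(digits[4:6]), int(digits[6:8]))


def matching_zoo(stamp):
    """Return the zoo release compiled with this stamp, or None if there is none."""
    return ZOO_BY_STAMP.get(stamp)


firmware_stamp = 0x20250429
print(stamp_to_date(firmware_stamp), matching_zoo(firmware_stamp))
```

The firmware's stamp decodes to a date that falls between the 10_01_00 and 11_00_00 zoo builds, and the lookup returns None, which is the gap described above.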

Problem 2: the userspace and the firmware speak different TIOVX

With a matching model, the next failure was on the video path:

[MCU2_0] tivxMemShared2PhysPtr: Invalid mem_heap_region 8

plus MSC/mosaic submit errors — output JPEGs all byte-identical, mosaic frame
with a black tile. Root cause, provable from the public tiovx headers
(include/TI/tivx_mem.h): the memory-region enum in the ANALYTICS 11.00 and
11.01 tags ends at index 7. Index 8 = TIVX_MEM_EXTERNAL_SHARED only exists in
the 11.02 code line. A 11.2.0 userspace (libtivision_apps.so.11.2.0) tags
buffers with region 8; the ANALYTICS firmware — despite carrying “11.02” in its
package name — is ≤11.01-era code and rejects them. There is no ANALYTICS-11.02
tag in the tiovx repo at all.

Conclusion: you cannot mix-and-match. The only consistent stack with the 11.02
userspace (and the bonus that the official 11_02 zoo matches, stamp
0x20251208) is the firmware from PSDK RTOS J721E 11.02.00.06.

Problem 3: the RTOS prebuilt firmware is an EVM build (two land mines)

Swapping the five j7-*-fw symlinks to the RTOS-11.02 prebuilt
(vision_apps_evm/) gets four cores up, but mcu2_0 dies in early init:

ETHFW: Init ... !!!
PMLIBClkRateSet failed for clock Id = 42
Assertion @ Line: 188 in enet_apputils_k3.c

That’s mine 1: the prebuilt contains ETHFW (CPSW9G ethernet firmware) for
the EVM, which the BBAI-64 doesn’t have. So: rebuild mcu2_0 with
BUILD_ENABLE_ETHFW=no. I did, and hit mine 2: the board hard-froze the
entire SoC at cold boot (no heartbeat, SD-card surgery required to recover).
Lesson learned the hard way: the ETHFW assert had been acting as an accidental
early exit. Without it, init proceeds into the EVM build’s DSS/display and
board init. On the BBAI-64, Linux owns the DSS (tidss) and I2C; firmware
touching them at boot wedges the interconnect.

The proper fix is TI’s own switch, documented in
vision_apps/platform/j721e/rtos/common/app_cfg_mcu2_0.h:
BUILD_MCU_BOARD_DEPENDENCIES=no removes CSI2RX/TX, all ENABLE_DSS_*,
I2C and BOARD init — while keeping FVID2 + VHWA_VPAC, i.e. the MSC/VISS/LDC
hardware path the demos need. Exactly right for a board where the HLOS owns the
peripherals.

The mcu2_0 build recipe (PSDK RTOS 11.02.00.06 source package)

Host: Ubuntu 24.04 (TI tests against 22.04 — two small workarounds needed).

# 1) setup — fix the package list (python3-distutils doesn't exist on 24.04)
sed -i 's/"curl" "python3-distutils"/"curl"/' sdk_builder/scripts/setup_psdk_rtos.sh
export PIP_BREAK_SYSTEM_PACKAGES=1          # PEP-668 lock vs. the script's pip calls
./sdk_builder/scripts/setup_psdk_rtos.sh --firmware_only --skip_pc_emulation

# 2) one small makefile patch: the edgeai_deps target unconditionally tries to
#    build the Linux A72 lib (tivision_apps). Wrap its body in
#    ifeq ($(BUILD_LINUX_MPU),yes) ... endif in
#    sdk_builder/makerules/makefile_vision_apps.mak

# 3) build (from sdk_builder/)
make sdk BUILD_ENABLE_ETHFW=no BUILD_MCU_BOARD_DEPENDENCIES=no \
     BUILD_LINUX_MPU=no BUILD_CPU_MPU1=no BUILD_QNX_MPU=no \
     BUILD_EMULATION_MODE=no BUILD_APP_RTOS_LINUX=yes PROFILE=release -j$(nproc)

Notes:

  • BUILD_APP_RTOS_LINUX=yes must be explicit — its default is coupled to
    BUILD_LINUX_MPU, which we just turned off.
  • Stale-object trap: after changing DEFS-relevant flags, run
    make vision_apps_clean first. concerto does not recompile app_init.c on a
    flags-only change; the symptom is undefined appDssDefaultInit at link time.
  • Output: vision_apps/out/J721E/{R5F,C66,C71}/FREERTOS/release/ vx_app_rtos_linux_*.out (all five cores; mcu2_0 ≈ 3.9 MB).

Pre-deploy sanity checks that paid off:

strings mcu2_0.out | grep -c -i ethfw              # must be 0
strings mcu2_0.out | grep -c appDssDefaultInit     # must be 0
readelf -l mcu2_0.out | awk '/LOAD/{print $4,$6}'  # all inside a2100000..a4000000
# resource table diff vs. prebuilt: only the trace buffer address differs
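The address-range check in the readelf line above can also be automated. Here is a Python sketch under a few assumptions: it expects the one-line-per-segment layout that readelf -l prints for 32-bit ELF files (Type, Offset, VirtAddr, PhysAddr, FileSiz, MemSiz, ...), it treats a4000000 as an exclusive upper bound, and the function names are hypothetical.

```python
def load_segments(readelf_text):
    """Return [(phys_addr, mem_size)] for every LOAD line of 'readelf -l' output."""
    segments = []
    for line in readelf_text.splitlines():
        fields = line.split()
        # Same columns as the awk call above: $4 = PhysAddr, $6 = MemSiz.
        if len(fields) >= 6 and fields[0] == "LOAD":
            segments.append((int(fields[3], 16), int(fields[5], 16)))
    return segments


def all_inside(segments, start=0xA2100000, end=0xA4000000):
    """True if every segment lies completely within [start, end)."""
    return all(start <= addr and addr + size <= end for addr, size in segments)
```

Feeding it the captured output of readelf -l mcu2_0.out gives a single yes/no answer instead of a list of addresses to compare by eye.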

Safe test protocol (after one SoC freeze you get careful)

  • Keep the five j7-*-fw symlinks pointing at a known-bootable firmware set at
    all times (autoboot safety). Test a candidate by flipping one symlink,
    sync, reboot — and flip it back immediately after the boot, before any
    diagnosis.
  • Runtime echo stop > /sys/class/remoteproc/*/state on vision_apps R5F cores
    is unreliable (k3_r5_rproc_stop: timeout, EBUSY -16, even on a healthy
    core) — don’t fight it, reboot instead.
  • Restart the remote-log reader after a core restart (a running reader misses
    the buffer reset), and beware: the GTC timestamps survive warm reboots. If
    you see the same timestamps as in an earlier session, you are reading a
    stale buffer replay, not the present.
  • dmesg | grep "Booting fw image" — the size uniquely identifies which binary
    actually got loaded.
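The last check in the list above is easy to script. This Python sketch extracts the firmware name and size per remoteproc instance from saved dmesg text, based on the kernel's "Booting fw image <name>, size <bytes>" message. The function name is hypothetical; if a core was booted more than once, the last occurrence wins.

```python
import re

BOOT_LINE = re.compile(r"(remoteproc\d+): Booting fw image (\S+), size (\d+)")


def booted_firmware(dmesg_text):
    """Return {remoteproc name: (firmware file, size in bytes)} from dmesg output."""
    loaded = {}
    for match in BOOT_LINE.finditer(dmesg_text):
        loaded[match.group(1)] = (match.group(2), int(match.group(3)))
    return loaded
```

Comparing the reported size with the file size of the candidate binary (for example via os.path.getsize) tells you whether the symlink flip actually took effect.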

With the two flags the custom mcu2_0 boots cleanly: no ETHFW line, sync barrier
passes, REMOTE_SERVICE/FVID2/VHWA-VPAC up, IPC echo all-pass. Full symlink set
on vision_apps_rtos_11.02/ (4 prebuilt cores + custom mcu2_0), cold boot,
done.

The demos

Image classification (configs/image_classification.yaml, model
TFL-CL-0000-mobileNetV1-mlperf straight from the 11_02 zoo): file-based flow
produces correctly classified, varying outputs (the elephant test image yields
tusker / African elephant / Indian elephant as top-3), remote log clean — no
mem_heap_region, no TIDL errors. Camera flow
(flow0: [input0,model1,output0,[320,150,1280,720]]) runs live to the monitor
via kmssink — no desktop needed, and the DSS-free mcu2_0 is exactly the
right architecture for it, since Linux drives the display.

Object detection with bounding boxes (configs/object_detection.yaml):
install a detection model from the zoo, e.g.

wget https://software-dl.ti.com/jacinto7/esd/modelzoo/11_02_00/modelartifacts/TDA4VM/8bits/od-2020_tflitert_coco_tf1-models_ssdlite_mobiledet_dsp_320x320_coco_20200519_tflite.tar.gz
mkdir -p /opt/model_zoo/TFL-OD-2020-ssdLite-mobDet-DSP-coco-320x320
tar xzf od-2020_*.tar.gz -C /opt/model_zoo/TFL-OD-2020-ssdLite-mobDet-DSP-coco-320x320

(the per-release index is <zoo-url>/artifacts.csv), then set
flow0: [input0,model1,output0,…]. COCO classes, live boxes on the display —
stable for extended runs once the power supply is right (see the power gotcha
below; with an underpowered supply this demo resets the board within seconds,
which is easy to misread as a software problem).

The trap that cost me an afternoon: the silent CPU fallback (SOC env var)

This one deserves its own section because it mimics a hardware/firmware
regression perfectly. Symptom set: the demo still runs, but at 1 FPS instead
of 40, the A72s are pegged (~190 % CPU), the perf overlay shows C7x at 0 %,
and the MCU2 rows disappear entirely, right after you changed something
unrelated, so you naturally suspect that change. I chased a kernel
cmdline parameter and a “wedged remote core” theory for hours; the remote log
proved all cores healthy, IPC echo all-pass, error grep empty.

The actual cause was the very first output line, which is easy to scroll past:

[WARNING] SOC env var not specified. Defaulting target to arm.

The edgeai apps select the inference target via the SOC=j721e environment
variable. If it is missing, the TIDL runtime silently falls back to CPU
inference (XNNPACK) — no error, no abort, just 40× slower. And at least on my
Yocto image, nothing sets that variable persistently. It “worked” for days only
because my original interactive session had it exported. The moment I switched
to ssh board 'one-liner' style for debugging, it broke: dropbear command
execution is a non-login shell, so it neither sources /etc/profile.d nor
has /usr/local/bin in PATH (PATH=/usr/sbin:/usr/bin:/sbin:/bin).

Two lessons in one: (a) never change your measurement method in the middle of
a regression hunt — my ssh one-liners changed the environment and created
the symptom I was chasing; (b) silent fallbacks are the most expensive
failures — “runs slowly” usually has an environment cause, not a hardware one.

Robust fix — a tiny wrapper in /usr/bin (not /usr/local/bin!):

#!/bin/sh
# /usr/bin/edgeai-demo
export SOC=j721e
cd /opt/edgeai-gst-apps/apps_python || exit 1
exec ./app_edgeai.py "${1:-../configs/object_detection.yaml}"

plus /etc/profile.d/edgeai-soc.sh (export SOC=j721e) for interactive
logins.
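If you launch the demos from your own Python scripts, the same idea can be expressed as a guard that fails loudly instead of falling back silently. This is only a sketch: the function name is hypothetical and it is not part of the edgeai apps.

```python
import os


def require_soc(env=None, expected="j721e"):
    """Raise instead of letting a missing SOC variable trigger the CPU fallback."""
    env = os.environ if env is None else env
    soc = env.get("SOC")
    if soc != expected:
        raise RuntimeError(
            "SOC is %r, expected %r; refusing to start because inference "
            "would silently run on the CPU" % (soc, expected)
        )
    return soc
```

Calling require_soc() at the top of a launcher script turns the 1 FPS mystery into an immediate, readable error.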

Gotchas worth knowing

  • Camera paths: the edgeai apps detect cameras by the literal /dev/video…
    prefix. A stable /dev/v4l/by-id/... path is rejected as “Invalid Input”.
    Use /dev/videoN (find the capture node via /dev/v4l/by-id/*-index0), or
    add a udev rule creating e.g. /dev/video-usb-cam0 — that name passes the
    prefix check and is stable.
  • The flows: line carries the mosaic position as a 4th element
    (…,[320,150,1280,720]]) — leave your sed patterns open-ended.
  • /tmp is tmpfs on these images — never stage firmware there across a reboot.
  • Power, the final boss: the BBAI-64 accepts only 5 V on USB-C and does not
    negotiate USB-PD — so any PD charger tops out at 5 V/3 A (15 W), no matter
    whether it says 15 W or 65 W on the label. Object detection at full C7x load
    plus a Logitech BRIO plus 1080p output browned out my supply (board reset at
    a harmless 70 °C; a 65 W PD charger lasted seconds longer, then its
    over-current protection latched off). Use a genuine 5 V fixed-voltage supply
    with headroom. [UPDATE 2026-06-12] Verified: with a 5 V/6 A fixed supply
    (YU0506B) the object-detection demo runs stably at full load, with the BRIO
    plugged directly into the board — no self-powered hub needed.
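
For the camera-path gotcha, the by-id symlink can be resolved to the
/dev/videoN node at launch time. A minimal sketch, assuming a single camera
whose capture node is the first *-index0 entry; the function name is
hypothetical.

```shell
#!/bin/sh
# Hypothetical helper: print the device node behind the first *-index0
# symlink found in a by-id directory.
# $1 = by-id directory (default: /dev/v4l/by-id)
resolve_capture_node() {
    by_id_dir=${1:-/dev/v4l/by-id}
    for link in "$by_id_dir"/*-index0; do
        [ -e "$link" ] || continue      # unmatched glob or dangling link
        readlink -f "$link"             # canonical path, e.g. /dev/videoN
        return 0
    done
    echo "no *-index0 capture node under $by_id_dir" >&2
    return 1
}
```

The printed path starts with /dev/video, so it passes the prefix check while
the lookup itself stays tied to the stable by-id name.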

Final stack

  • BBAI-64, own Yocto image (bb.org 6.12 kernel) with the timer/memory-map dtsi
    from my previous post
  • Userspace: libtivision_apps 11.2.0
  • Firmware: PSDK RTOS 11.02.00.06 — 4 cores prebuilt, mcu2_0 self-built
    with BUILD_ENABLE_ETHFW=no BUILD_MCU_BOARD_DEPENDENCIES=no
  • Models: official TI model zoo 11_02_00 (TDA4VM/8bits), unmodified
  • Power: 5 V fixed-voltage supply, 6 A (no PD charger)

Hope this saves the next person the SD-card surgery. Happy to share the dtsi or
more details if anyone wants them.

One more time: EdgeAI on BBAI-64 — Part 4: M.2 Wi-Fi (Intel AX210) on a custom 6.12 kernel, or: the case of the swallowed MSIs

Follow-up to parts 1–3 (custom Yocto image, timer reservations, RTOS-11.02
firmware + working TIDL demos). This part is about something seemingly
unrelated — getting an Intel AX210 in the M.2 Key-E slot to work on the
same self-built image (bb.org 6.12 kernel) — but it ends in the same place
the timer story did: a platform quirk that fails silently when one half of it
is missing. If you run a custom kernel on the BBAI-64 and any PCIe card
behaves strangely, this one is for you.

TL;DR: the J721E routes PCIe MSIs through a “pre-ITS” block in front of the
GICv3 ITS. The device tree carries the matching
socionext,synquacer-pre-its property — but the kernel code that reads
that property is compiled out unless CONFIG_SOCIONEXT_SYNQUACER_PREITS=y
is set. TI’s and BeagleBoard’s Debian kernels set it; the bb.org Yocto
defconfig I built from does not. Result: MSI-X vectors allocate fine, but
every MSI write from the card vanishes — the AX210’s firmware boots and then
times out waiting for an interrupt that can never arrive. One config fragment
fixes it.

The symptom

The card itself was detected fine (lspci: 8086:2725 behind the TI bridge,
link up at 5GT/s x1), the iwlwifi driver and userspace were present, only
the firmware blobs were missing. After installing them
(iwlwifi-ty-a0-gf-a0-89.ucode + the .pnvm — note these moved into the
intel/iwlwifi/ subdirectory of the linux-firmware tree, the old root-level
URLs are dead):

iwlwifi 0000:01:00.0: loaded firmware version 89.1a492d28.0 ty-a0-gf-a0-89.ucode
...
iwlwifi 0000:01:00.0: Failed to start RT ucode: -110
iwlwifi 0000:01:00.0: Failed to run INIT ucode: -110

plus IML/ROM error dumps (IML/ROM error/state 0xB03,
FSEQ_ERROR_CODE 0x60000000). -110 is a timeout. The decisive observation
was in /proc/interrupts:

ITS-PCI-MSIX-0000:01:00.0   0 Edge  iwlwifi:default_queue   0  0
ITS-PCI-MSIX-0000:01:00.0   3 Edge  iwlwifi:exception       0  0

All four MSI-X vectors allocated — all counters permanently at zero. The
driver pushes the firmware over PCIe just fine; the card’s “alive”
notification, which arrives as an MSI, never lands.
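
The zero-counter check is easy to script. A minimal sketch, assuming the
standard /proc/interrupts layout (a header line naming the CPUs, then one
counter column per CPU after the IRQ label); the function name is
hypothetical.

```shell
#!/bin/sh
# Hypothetical helper: sum the per-CPU counters of all interrupt lines
# matching a pattern.
# $1 = pattern (e.g. iwlwifi), $2 = file (default: /proc/interrupts)
sum_irq_counts() {
    awk -v pat="$1" '
        NR == 1 { ncpu = NF; next }           # header: one field per CPU
        $0 ~ pat {
            for (i = 2; i <= ncpu + 1; i++)   # field 1 is the "IRQ:" label
                total += $i
        }
        END { print total + 0 }
    ' "${2:-/proc/interrupts}"
}
```

A result of 0 for `sum_irq_counts iwlwifi` some seconds after the driver
loaded corresponds to the picture above: vectors allocated, nothing delivered.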

Dead ends (so you can skip them)

  • pcie_aspm=off — no change; ASPM was already disabled on this link.
  • pci=nomsi — the AX210/iwlwifi has no legacy-INTx fallback at all;
    the probe fails outright with -22. MSI is mandatory for this card.
  • Removing the .pnvm file (a workaround documented for another ARM SBC
    with the same -110 symptom) — no change here. The PNVM transfer is just
    the first DMA/interrupt handshake that happens to die; the file was never
    the culprit.

The move that cracked it: a second OS on the same board

The BBAI-64 ships with Debian on the eMMC, and mine was still there. Pull the
SD card, cold boot, and you have a reference system on identical hardware:
same card, same slot, same antennas, even the same 6.12-ti kernel family
(6.12.57 there vs. 6.12.43 in my image). On the eMMC Debian the AX210 came up
instantly — firmware 89 loads, loaded PNVM version, wlan0 present,
interrupt counters running. Hardware fully exonerated in five minutes; the
delta had to be kernel config or device tree.

Both are easy to capture from a live system: /sys/firmware/fdt is the
running device tree blob, /proc/config.gz (or /boot/config-$(uname -r))
the running config. Decompiling and diffing the two DTBs was a dead end with
a useful result: the PCIe and ITS nodes were byte-identical — including
dma-coherent, the 64 GB inbound dma-ranges, msi-map, and the
socionext,synquacer-pre-its property on the ITS node. The config diff was
the hit:

# Debian (works):   CONFIG_SOCIONEXT_SYNQUACER_PREITS=y
# bb.org defconfig: # CONFIG_SOCIONEXT_SYNQUACER_PREITS is not set
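
The config comparison can be reduced to the symbols that actually differ. A
minimal sketch, assuming both configs were saved as plain text first; the
function name is hypothetical.

```shell
#!/bin/sh
# Hypothetical helper: list CONFIG_ symbols whose state differs between two
# plain-text kernel configs. Lines prefixed "<" come from the first file,
# lines prefixed ">" from the second.
config_delta() {
    a=$(mktemp) && b=$(mktemp) || return 1
    # Keep both "CONFIG_X=..." and "# CONFIG_X is not set" lines.
    grep -E '^(# )?CONFIG_' "$1" | sort > "$a"
    grep -E '^(# )?CONFIG_' "$2" | sort > "$b"
    diff "$a" "$b" | sed -n '/^[<>]/p'
    rm -f "$a" "$b"
}
```

On each system the input could come from `zcat /proc/config.gz > running.config`;
the device tree side can be decompiled with `dtc -I dtb -O dts /sys/firmware/fdt`,
assuming dtc is installed.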

Why this option matters on a TI SoC

The J721E’s GIC-500 sits behind a “pre-ITS” block: PCIe devices don’t write
their MSIs to the normal GITS_TRANSLATER doorbell, but into a separate
window whose offset encodes the device ID. The kernel handles this with the
quirk originally written for the Socionext Synquacer SoC (which has the same
construction) — that’s why a TI device tree carries a socionext,* property.
The crucial detail sits in drivers/irqchip/irq-gic-v3-its.c: the quirk’s
entry in the ITS quirk table is wrapped in
#ifdef CONFIG_SOCIONEXT_SYNQUACER_PREITS. If the option is off, the DT
property is never even read. The ITS then hands every PCIe device the
standard doorbell address, the hardware routes those writes nowhere, and you
get exactly the picture above: clean allocation, zero delivery, and a
perfectly healthy-looking card that times out. The Kconfig default is
actually y with no restrictive dependencies — the defconfig disables it
explicitly.
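
Both halves can be checked on a running system in one go. A minimal sketch,
assuming the live device tree is exposed under /sys/firmware/devicetree/base
(where each property appears as a file named after it) and the kernel config
is available as plain text; the function name is hypothetical.

```shell
#!/bin/sh
# Hypothetical helper: report whether the DT property and the kernel quirk
# are both present.
# $1 = device-tree root (default: /sys/firmware/devicetree/base)
# $2 = plain-text kernel config file
check_preits_halves() {
    dt_root=${1:-/sys/firmware/devicetree/base}
    config_file=$2
    dt=no
    cfg=no
    if find "$dt_root/" -name 'socionext,synquacer-pre-its' 2>/dev/null | grep -q .; then
        dt=yes
    fi
    if grep -q '^CONFIG_SOCIONEXT_SYNQUACER_PREITS=y$' "$config_file"; then
        cfg=yes
    fi
    echo "dt_property=$dt kernel_quirk=$cfg"
    [ "$dt" = yes ] && [ "$cfg" = yes ]
}
```

The failing combination described here prints `dt_property=yes kernel_quirk=no`
and returns a non-zero status.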

The fix (Yocto)

A one-line config fragment, plus one subtlety: the bb.org kernel recipe
inherits plain kernel with a classic copy-the-defconfig include — not
kernel-yocto — so .cfg fragments in SRC_URI are NOT merged
automatically. Merge explicitly in your bbappend:

# files/enable-gic-preits.cfg
CONFIG_SOCIONEXT_SYNQUACER_PREITS=y

# linux-bb.org_%.bbappend
FILESEXTRAPATHS:prepend := "${THISDIR}/files:"
SRC_URI += "file://enable-gic-preits.cfg"

do_configure:append() {
    cat ${WORKDIR}/enable-gic-preits.cfg >> ${B}/.config
    oe_runmake -C ${S} O=${B} olddefconfig
}

Verify the merge actually took before building the world — grep the kernel
build’s .config for the symbol. (My first attempt silently did nothing
because I had assumed automatic fragment merging; the .config check caught
it.)
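
That .config check can be generalized to whole fragments. A minimal sketch,
assuming the fragment contains plain CONFIG_...= lines and the generated
.config of the kernel build is readable; the function name is hypothetical.

```shell
#!/bin/sh
# Hypothetical helper: verify that every CONFIG_ line of a fragment appears
# verbatim in a generated .config. Prints missing lines, returns 1 if any.
# $1 = fragment (.cfg), $2 = kernel build .config
check_fragment_merged() {
    missing=0
    while IFS= read -r line || [ -n "$line" ]; do
        case $line in
            CONFIG_*) ;;          # only check assignments
            *) continue ;;        # skip comments and blank lines
        esac
        if ! grep -qxF -e "$line" "$2"; then
            echo "not merged: $line"
            missing=1
        fi
    done < "$1"
    return "$missing"
}
```

Run against the fragment and the .config in the kernel build directory right
after do_configure, it catches a fragment that was fetched but never merged.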

Acceptance

Four checks after deploying the new kernel image:

zcat /proc/config.gz | grep SYNQUACER_PREITS      # =y
dmesg | grep -i pre-ITS
#  GIC: enabling workaround for ITS: Socionext Synquacer pre-ITS
dmesg | grep -i pnvm                              # loaded PNVM version ...
grep -i iwl /proc/interrupts                      # counters > 0

With the quirk active, the previously “broken” PNVM transfer completes
normally, the interface appears, and wpa_supplicant + systemd-networkd do the
rest. One small surprise on the way: on a systemd image the interface is
named wlp1s0 (predictable naming), not wlan0 as on the stock Debian —
dmesg even says renamed from wlan0. Grep accordingly before assuming the
interface is missing.
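
To avoid depending on the interface name at all, the wireless interfaces can
be listed from sysfs. A minimal sketch, assuming a cfg80211-based driver
(such drivers expose a phy80211 link in the interface's sysfs directory); the
function name is hypothetical.

```shell
#!/bin/sh
# Hypothetical helper: list wireless interfaces regardless of their name.
# $1 = sysfs net directory (default: /sys/class/net)
list_wifi_ifaces() {
    net_dir=${1:-/sys/class/net}
    for iface in "$net_dir"/*; do
        if [ -e "$iface/phy80211" ]; then
            basename "$iface"
        fi
    done
    return 0
}
```

An empty result means no wireless interface exists under any name, which is a
stronger statement than a failed grep for wlan0.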

Final state

  • Intel AX210 (M.2 Key-E) on the self-built Yocto image: Wi-Fi 6 working,
    survives cold boots, EdgeAI demos unaffected and running in parallel
  • Kernel: bb.org 6.12 + one config fragment
    (CONFIG_SOCIONEXT_SYNQUACER_PREITS=y), versioned in my board layer
  • Firmware: iwlwifi-ty-a0-gf-a0-89.ucode + .pnvm (and intel/ibt-0041-*
    for the Bluetooth half) — in current linux-firmware these live under
    intel/iwlwifi/

Takeaways

  1. A DT property without its kernel code is a silent no-op. Device tree
    describes; whether the kernel acts on the description is a config
    question, and quirk tables are full of #ifdefs. If a property “should”
    work but doesn’t, check both halves. (On this board it’s a pattern: the
    timer-reservation story in part 2 had the same shape.)
  2. A second bootable OS on the same board is a diagnostic gold mine. The
    stock eMMC Debian exonerated the hardware, narrowed the suspects to
    config-vs-DT, and served as a live donor for both — think twice before
    wiping it.
  3. Zero interrupt counters are a statement, not an absence of data. They
    pointed away from firmware files and power and straight at the delivery
    path, long before the root cause was visible.

Hope this saves someone the detour. The fragment is one line; finding it was
the day.

Crazy impressive work. Do you have a Yocto Meta layer with all this put together that you can share?

My SDK 11.00 work

On my end, I gave up on the TISDK 11.02 path and went for TISDK 11.00. I have succeeded at getting the basic EdgeAI stack running. The EdgeAI gallery demo is running and working on boot. I am running the basic TI EdgeAI models.

Unfortunately, I do not have the vision_apps stack working. My R5 firmware fails to load, so I expect the vision_apps / OpenVX based EdgeAI demos will not work with my setup.

Digging into what you had to do for SDK 11.02, I might be able to back-port your work for SDK 11.00.

Here is my Yocto meta layer meta-beaglebone-ai64-edgeai-sdk-11-00-fix

TI descope

Looking at TI E2E, it seems that TI has descoped support for the basic EdgeAI stack in favor of the vision_apps stack. Evidently you have both working?

I have to agree.
Feels to me like this is only meant to work on one single developer’s machine;
a crazy amount of roadblocks…

Would be sad if all this work went unnoticed.

@MarkusKrug:
Would you be interested in putting all this knowledge on the Docs for posterity?

Only asking because the search button seems to be broken for most people…

Hi,

Well, with the help of my AI assistant I can share the following:
meta-mk.tar.gz (9,5 KB)
README.md (6,3 KB)

In my experience, most of the AI-generated explanations are fine. Perhaps a bit brief (I’ve already added a few comments manually).

If it seems too strange, please don’t hesitate to ask again.

Best regards,
Markus

You mean on the official BeagleBoard docs?

Indeed.

We’re always happy to accept writeups as detailed as yours!