EdgeAI OOB demos (TIDL) on BBAI-64 / kernel 6.12 — Part 3: the firmware/model version maze, a custom mcu2_0 build, and working demos
Follow-up to my previous post (timer reservations / “Exceeded max object
descriptors”). At the end of that one, vx_app_arm_mem.out ran cleanly and the
next milestone was the real out-of-box demo with TIDL on the C7x. That turned
out to be a whole second journey, so here is part 3 — ending with both
edgeai-gst-apps demos (image classification and object detection with bounding
boxes) running live from a USB camera to the display on a self-built Yocto
image (bb.org 6.12 kernel) on the BeagleBone AI-64.
TL;DR: the PSDK-ANALYTICS firmware that the edgeai Yocto recipes pin is
internally an older TIOVX/TIDL code line than its version number suggests. It is
incompatible with both the 11.02 userspace and every published model zoo. The
fix is to run the firmware from PSDK RTOS 11.02.00.06 instead — which
requires building mcu2_0 yourself with two flags, because the prebuilt is an
EVM build that hard-freezes the BBAI-64. Full recipe below.
Problem 1: no published model zoo matches the ANALYTICS firmware
First demo attempt (app_edgeai.py + TFL-CL-0000-mobileNetV1-mlperf from the
11_02 model zoo) failed with a TIDL network version mismatch. The firmware’s
TIDL runtime expects network version 0x20250429; the stamp is simply the build
date of the TIDL library of the edgeai-tidl-tools release the model was compiled
with. Byte-grepping the tools releases gives this mapping:
| Zoo release | Network stamp | Tools release |
|-------------|---------------|---------------|
| 10_01_00 | 0x20241120 | 10_01_00_02 |
| 11_00_00 | 0x20250630 | 11_00_07/08 |
| 11_01_00 | 0x20250821 | 11_01 line |
| 11_02_00 | 0x20251208 | 11_02_04_00 |
0x20250429 corresponds to tools 11_00_06_00, which sits in a gap: no published
zoo release was compiled with it. A workaround that does work: compile the model
yourself with edgeai-tidl-tools 11_00_06_00 (docker ubuntu:22.04,
SOC=am68pa for TDA4VM,
python3 tflrt_delegate.py -c --models cl-tfl-mobilenet_v1_1.0_224).
Pitfall: if the .tflite already exists, the import skips the optimize step
(mean/scale folding, input→uint8) and you get
“Got UINT8 but expected FLOAT32”. Delete the model and artifacts, then recompile.
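To make the stamp convention concrete, here is a small sketch that decodes a stamp into a date and looks it up in the table above. The function names are mine (hypothetical), and the date reading rests only on the observation that the hex digits spell a build date.

```sh
# Decode a TIDL network version stamp such as 0x20250429 into YYYY-MM-DD.
# Assumption: the eight hex digits are a date written in decimal digits.
stamp_to_date() {
    s=${1#0x}                          # strip the 0x prefix
    case $s in
        [0-9][0-9][0-9][0-9][0-9][0-9][0-9][0-9]) ;;
        *) echo "not a date-like stamp: $1" >&2; return 1 ;;
    esac
    y=${s%????}                        # first four digits: year
    md=${s#????}                       # last four digits: month and day
    echo "$y-${md%??}-${md#??}"
}

# Map a stamp to the tools release, using only the values listed above.
stamp_to_tools() {
    case $1 in
        0x20241120) echo "10_01_00_02" ;;
        0x20250429) echo "11_00_06_00" ;;
        0x20250630) echo "11_00_07/08" ;;
        0x20250821) echo "11_01 line" ;;
        0x20251208) echo "11_02_04_00" ;;
        *) echo "unknown"; return 1 ;;
    esac
}
```

Feeding it the stamp from the firmware error message tells you which tools release to compile with, or that you have hit a value that is not in the table yet.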
Problem 2: the userspace and the firmware speak different TIOVX
With a matching model, the next failure was on the video path:
[MCU2_0] tivxMemShared2PhysPtr: Invalid mem_heap_region 8
plus MSC/mosaic submit errors — output JPEGs all byte-identical, mosaic frame
with a black tile. Root cause, provable from the public tiovx headers
(include/TI/tivx_mem.h): the memory-region enum in the ANALYTICS 11.00 and
11.01 tags ends at index 7. Index 8 = TIVX_MEM_EXTERNAL_SHARED only exists in
the 11.02 code line. An 11.2.0 userspace (libtivision_apps.so.11.2.0) tags
buffers with region 8; the ANALYTICS firmware — despite carrying “11.02” in its
package name — is ≤11.01-era code and rejects them. There is no ANALYTICS-11.02
tag in the tiovx repo at all.
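If you want to repeat the header check yourself, something along these lines should do. It is only a sketch: the function name is hypothetical, and the tag name in the usage comment is left as a placeholder because it depends on which tiovx tags you compare.

```sh
# Classify a tivx_mem.h by whether it knows the region the 11.02 userspace
# tags its buffers with (index 8, TIVX_MEM_EXTERNAL_SHARED).
mem_region_era() {
    if grep -q 'TIVX_MEM_EXTERNAL_SHARED' "$1"; then
        echo "has-region-8"
    else
        echo "no-region-8"
    fi
}

# Usage sketch inside a tiovx checkout (<tag> is a placeholder):
#   git show "<tag>:include/TI/tivx_mem.h" > /tmp/tivx_mem.h
#   mem_region_era /tmp/tivx_mem.h
```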
Conclusion: you cannot mix-and-match. The only consistent stack with the 11.02
userspace (and the bonus that the official 11_02 zoo matches, stamp
0x20251208) is the firmware from PSDK RTOS J721E 11.02.00.06.
Problem 3: the RTOS prebuilt firmware is an EVM build (two land mines)
Swapping the five j7-*-fw symlinks to the RTOS-11.02 prebuilt
(vision_apps_evm/) gets four cores up, but mcu2_0 dies in early init:
ETHFW: Init ... !!!
PMLIBClkRateSet failed for clock Id = 42
Assertion @ Line: 188 in enet_apputils_k3.c
That’s mine 1: the prebuilt contains ETHFW (CPSW9G ethernet firmware) for
the EVM, which the BBAI-64 doesn’t have. So: rebuild mcu2_0 with
BUILD_ENABLE_ETHFW=no. I did — and hit mine 2: the board hard-froze the
entire SoC at cold boot (no heartbeat, SD-card surgery required to recover).
Lesson learned the hard way: the ETHFW assert had been acting as an accidental
early exit — without it, init proceeds into the EVM build’s DSS/display and
board init. On the BBAI-64, Linux owns the DSS (tidss) and I2C; firmware
touching them at boot wedges the interconnect.
The proper fix is TI’s own switch, documented in
vision_apps/platform/j721e/rtos/common/app_cfg_mcu2_0.h:
BUILD_MCU_BOARD_DEPENDENCIES=no removes CSI2RX/TX, all ENABLE_DSS_*,
I2C and BOARD init — while keeping FVID2 + VHWA_VPAC, i.e. the MSC/VISS/LDC
hardware path the demos need. Exactly right for a board where the HLOS owns the
peripherals.
The mcu2_0 build recipe (PSDK RTOS 11.02.00.06 source package)
Host: Ubuntu 24.04 (TI tests against 22.04 — two small workarounds needed).
# 1) setup — fix the package list (python3-distutils doesn't exist on 24.04)
sed -i 's/"curl" "python3-distutils"/"curl"/' sdk_builder/scripts/setup_psdk_rtos.sh
export PIP_BREAK_SYSTEM_PACKAGES=1 # PEP-668 lock vs. the script's pip calls
./sdk_builder/scripts/setup_psdk_rtos.sh --firmware_only --skip_pc_emulation
# 2) one small makefile patch: the edgeai_deps target unconditionally tries to
# build the Linux A72 lib (tivision_apps). Wrap its body in
# ifeq ($(BUILD_LINUX_MPU),yes) ... endif in
# sdk_builder/makerules/makefile_vision_apps.mak
# 3) build (from sdk_builder/)
make sdk BUILD_ENABLE_ETHFW=no BUILD_MCU_BOARD_DEPENDENCIES=no \
BUILD_LINUX_MPU=no BUILD_CPU_MPU1=no BUILD_QNX_MPU=no \
BUILD_EMULATION_MODE=no BUILD_APP_RTOS_LINUX=yes PROFILE=release -j$(nproc)
Notes:
- BUILD_APP_RTOS_LINUX=yes must be explicit: its default is coupled to
BUILD_LINUX_MPU, which we just turned off.
- Stale-object trap: after changing DEFS-relevant flags, run
make vision_apps_clean first. concerto does not recompile app_init.c on a
flags-only change; the symptom is undefined appDssDefaultInit at link time.
- Output:
vision_apps/out/J721E/{R5F,C66,C71}/FREERTOS/release/vx_app_rtos_linux_*.out
(all five cores; mcu2_0 ≈ 3.9 MB).
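One way to avoid the stale-object trap is to remember the flag set of the last build and clean whenever it changes. The sketch below does that with a stamp file; the file name .last_build_flags and the helper name are my own inventions, not something concerto or the SDK provides.

```sh
# Return 0 (and record the new flags) when the flag string differs from the
# one stored in the stamp file; return 1 when nothing changed.
flags_changed() {
    flags=$1
    stamp=$2
    if [ -f "$stamp" ] && [ "$(cat "$stamp")" = "$flags" ]; then
        return 1                       # same flags as last build: no clean
    fi
    printf '%s\n' "$flags" > "$stamp"  # first build or changed flags
    return 0
}

# Usage sketch (from sdk_builder/):
#   FLAGS="BUILD_ENABLE_ETHFW=no BUILD_MCU_BOARD_DEPENDENCIES=no"
#   flags_changed "$FLAGS" .last_build_flags && make vision_apps_clean
#   make sdk $FLAGS -j"$(nproc)"
```

A missing stamp file counts as "changed", so the first build after checkout always starts clean.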
Pre-deploy sanity checks that paid off:
strings mcu2_0.out | grep -c -i ethfw # must be 0
strings mcu2_0.out | grep -c appDssDefaultInit # must be 0
readelf -l mcu2_0.out | awk '/LOAD/{print $4,$6}' # all inside a2100000..a4000000
# resource table diff vs. prebuilt: only the trace buffer address differs
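The address-window check is easy to get wrong by eye, so here is a sketch that automates it. It assumes the readelf/awk line above prints one "address size" pair in hex per LOAD segment; the function name is hypothetical.

```sh
# Read "addr size" pairs (hex with 0x prefix) from stdin and fail if any
# segment starts below lo or ends above hi.
segments_in_range() {
    lo=$(( $1 ))
    hi=$(( $2 ))
    while read -r addr size; do
        [ -n "$addr" ] || continue     # skip blank lines
        start=$(( $addr ))
        end=$(( $addr + $size ))
        if [ "$start" -lt "$lo" ] || [ "$end" -gt "$hi" ]; then
            echo "segment $addr+$size is outside the window" >&2
            return 1
        fi
    done
    return 0
}

# Usage sketch:
#   readelf -l mcu2_0.out | awk '/LOAD/{print $4,$6}' \
#       | segments_in_range 0xa2100000 0xa4000000
```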
Safe test protocol (after one SoC freeze you get careful)
- Keep the five j7-*-fw symlinks pointing at a known-bootable firmware set at
all times (autoboot safety). Test a candidate by flipping one symlink,
sync, reboot, and flip it back immediately after the boot, before any
diagnosis.
- Runtime echo stop > /sys/class/remoteproc/*/state on vision_apps R5F cores
is unreliable (k3_r5_rproc_stop: timeout, EBUSY -16, even on a healthy
core). Don’t fight it, reboot instead.
- Restart the remote-log reader after a core restart (a running reader misses
the buffer reset), and beware: the GTC timestamps survive warm reboots.
If you see the same timestamps as in an earlier session, you are reading a
stale buffer replay, not the present.
- dmesg | grep "Booting fw image": the size uniquely identifies which binary
actually got loaded.
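The flip-and-restore step from the first bullet can be wrapped so the known-good target is never lost. This is a sketch under assumptions: the helper names and the .known-good bookkeeping file are hypothetical, and the symlink name in the usage comment (j7-main-r5f0_0-fw for mcu2_0) is what I expect on a TI-style image, so check yours.

```sh
# Point one firmware symlink at a candidate and remember the previous target.
fw_flip() {
    link=$1
    candidate=$2
    [ -L "$link" ] || { echo "$link is not a symlink" >&2; return 1; }
    [ -f "$candidate" ] || { echo "missing $candidate" >&2; return 1; }
    # Record the old target only once, so a second flip cannot overwrite it.
    [ -e "$link.known-good" ] || readlink "$link" > "$link.known-good"
    ln -sfn "$candidate" "$link" && sync
}

# Restore the recorded target; run this right after the test boot.
fw_restore() {
    link=$1
    [ -f "$link.known-good" ] || { echo "nothing recorded for $link" >&2; return 1; }
    ln -sfn "$(cat "$link.known-good")" "$link" && sync
    rm -f "$link.known-good"
}

# Usage sketch (paths are assumptions):
#   fw_flip /lib/firmware/j7-main-r5f0_0-fw /lib/firmware/candidate/mcu2_0.out
#   reboot    # then, first thing after boot:
#   fw_restore /lib/firmware/j7-main-r5f0_0-fw
```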
With the two flags the custom mcu2_0 boots cleanly: no ETHFW line, sync barrier
passes, REMOTE_SERVICE/FVID2/VHWA-VPAC up, IPC echo all-pass. Full symlink set
on vision_apps_rtos_11.02/ (4 prebuilt cores + custom mcu2_0), cold boot,
done.
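To confirm after the cold boot that the custom binary is really the one that got loaded, the size in the kernel's "Booting fw image" line can be compared with the file on disk. A sketch, with hypothetical helper names and an assumed symlink name in the usage comment:

```sh
# Print "<fw name> <size>" for every "Booting fw image" line on stdin.
booted_fw_sizes() {
    sed -n 's/.*Booting fw image \([^,]*\), size \([0-9]*\).*/\1 \2/p'
}

# Succeed if the last boot of firmware <name> in the log on stdin has the
# same size as <file>. wc -c follows symlinks, so a symlink path is fine.
fw_matches() {
    name=$1
    file=$2
    [ -r "$file" ] || return 1
    want=$(( $(wc -c < "$file") ))
    booted_fw_sizes | awk -v n="$name" -v w="$want" '
        $1 == n { last = $2; found = 1 }
        END     { exit !(found && last == w) }'
}

# Usage sketch:
#   dmesg | fw_matches j7-main-r5f0_0-fw /lib/firmware/j7-main-r5f0_0-fw \
#       && echo "custom mcu2_0 is the one running"
```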
The demos
Image classification (configs/image_classification.yaml, model
TFL-CL-0000-mobileNetV1-mlperf straight from the 11_02 zoo): file-based flow
produces correctly classified, varying outputs (the elephant test image yields
tusker / African elephant / Indian elephant as top-3), remote log clean — no
mem_heap_region, no TIDL errors. Camera flow
(flow0: [input0,model1,output0,[320,150,1280,720]]) runs live to the monitor
via kmssink — no desktop needed, and the DSS-free mcu2_0 is exactly the
right architecture for it, since Linux drives the display.
Object detection with bounding boxes (configs/object_detection.yaml):
install a detection model from the zoo, e.g.
wget https://software-dl.ti.com/jacinto7/esd/modelzoo/11_02_00/modelartifacts/TDA4VM/8bits/od-2020_tflitert_coco_tf1-models_ssdlite_mobiledet_dsp_320x320_coco_20200519_tflite.tar.gz
mkdir -p /opt/model_zoo/TFL-OD-2020-ssdLite-mobDet-DSP-coco-320x320
tar xzf od-2020_*.tar.gz -C /opt/model_zoo/TFL-OD-2020-ssdLite-mobDet-DSP-coco-320x320
(the per-release index is <zoo-url>/artifacts.csv), then set
flow0: [input0,model1,output0,…]. COCO classes, live boxes on the display —
stable for extended runs once the power supply is right (see the power gotcha
below; with an underpowered supply this demo resets the board within seconds,
which is easy to misread as a software problem).
The trap that cost me an afternoon: the silent CPU fallback (SOC env var)
This one deserves its own section because it mimics a hardware/firmware
regression perfectly. Symptom set: the demo still runs, but at 1 FPS instead
of 40, the A72s are pegged (~190 % CPU), the perf overlay shows C7x at
0 % and the MCU2 rows disappear entirely — right after you changed
something unrelated, so you naturally suspect that change. I chased a kernel
cmdline parameter and a “wedged remote core” theory for hours; the remote log
proved all cores healthy, IPC echo all-pass, error grep empty.
The actual cause was the very first output line, which is easy to scroll past:
[WARNING] SOC env var not specified. Defaulting target to arm.
The edgeai apps select the inference target via the SOC=j721e environment
variable. If it is missing, the TIDL runtime silently falls back to CPU
inference (XNNPACK) — no error, no abort, just 40× slower. And at least on my
Yocto image, nothing sets that variable persistently. It “worked” for days only
because my original interactive session had it exported. The moment I switched
to ssh board 'one-liner' style for debugging, it broke: dropbear command
execution is a non-login shell, so it neither sources /etc/profile.d nor
has /usr/local/bin in PATH (PATH=/usr/sbin:/usr/bin:/sbin:/bin).
Two lessons in one: (a) never change your measurement method in the middle of
a regression hunt — my ssh one-liners changed the environment and created
the symptom I was chasing; (b) silent fallbacks are the most expensive
failures — “runs slowly” usually has an environment cause, not a hardware one.
Robust fix — a tiny wrapper in /usr/bin (not /usr/local/bin!):
#!/bin/sh
# /usr/bin/edgeai-demo
export SOC=j721e
cd /opt/edgeai-gst-apps/apps_python || exit 1
exec ./app_edgeai.py "${1:-../configs/object_detection.yaml}"
plus /etc/profile.d/edgeai-soc.sh (export SOC=j721e) for interactive
logins.
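For scripts that call the app directly rather than through the wrapper, a fail-fast guard turns the silent fallback into a loud one. This is a sketch; both function names are hypothetical, and the second one simply searches for the warning text quoted above.

```sh
# Refuse to continue when SOC is unset, instead of running on the CPU.
require_soc() {
    if [ -z "${SOC:-}" ]; then
        echo "SOC is not set; refusing to run with CPU fallback" >&2
        return 1
    fi
}

# Succeed if captured app output on stdin contains the fallback warning.
fell_back_to_cpu() {
    grep -q 'SOC env var not specified'
}

# Usage sketch:
#   require_soc || exit 1
#   ./app_edgeai.py ../configs/object_detection.yaml 2>&1 | head -n 20 \
#       | fell_back_to_cpu && echo "running on CPU, check SOC"
```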
Gotchas worth knowing
- Camera paths: the edgeai apps detect cameras by the literal /dev/video…
prefix. A stable /dev/v4l/by-id/... path is rejected as “Invalid Input”.
Use /dev/videoN (find the capture node via /dev/v4l/by-id/*-index0), or
add a udev rule creating e.g. /dev/video-usb-cam0; that name passes the
prefix check and is stable.
- The flows: line carries the mosaic position as a 4th element
(…,[320,150,1280,720]]), so leave your sed patterns open-ended.
- /tmp is tmpfs on these images, so never stage firmware there across a reboot.
- Power, the final boss: the BBAI-64 takes 5 V only via USB-C and does not
negotiate USB-PD — so any PD charger tops out at 5 V/3 A (15 W), no matter
whether it says 15 W or 65 W on the label. Object detection at full C7x load
plus a Logitech BRIO plus 1080p output browned out my supply (board reset at
a harmless 70 °C; a 65 W PD charger lasted seconds longer, then its
over-current protection latched off). Use a genuine 5 V fixed-voltage supply
with headroom. [UPDATE 2026-06-12] Verified: with a 5 V/6 A fixed supply
(YU0506B) the object-detection demo runs stably at full load, with the BRIO
plugged directly into the board — no self-powered hub needed.
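For the camera-path and flows gotchas above, two small helpers as a sketch. The names are hypothetical; the first resolves a by-id link to the /dev/videoN node the apps accept, the second swaps the model in a flow0 line while leaving everything after it (including the mosaic box) untouched.

```sh
# Resolve a stable by-id symlink to its /dev/videoN target.
resolve_cam() {
    node=$(readlink -f "$1") || return 1
    case $node in
        */video[0-9]*) echo "$node" ;;
        *) echo "unexpected target: $node" >&2; return 1 ;;
    esac
}

# Replace the second element of a flow0 line read from stdin with $1.
# The pattern stops at the comma after the model, so trailing elements
# such as [320,150,1280,720] pass through unchanged.
set_flow_model() {
    sed "s/\(flow0: \[[^,]*,\)[^,]*,/\1$1,/"
}

# Usage sketch (the by-id name depends on your camera):
#   resolve_cam /dev/v4l/by-id/*-index0
#   set_flow_model model2 < object_detection.yaml > object_detection.new.yaml
```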
Final stack
- BBAI-64, own Yocto image (bb.org 6.12 kernel) with the timer/memory-map dtsi
from my previous post
- Userspace: libtivision_apps 11.2.0
- Firmware: PSDK RTOS 11.02.00.06 — 4 cores prebuilt, mcu2_0 self-built
with BUILD_ENABLE_ETHFW=no BUILD_MCU_BOARD_DEPENDENCIES=no
- Models: official TI model zoo 11_02_00 (TDA4VM/8bits), unmodified
- Power: 5 V fixed-voltage supply, 6 A (no PD charger)
Hope this saves the next person the SD-card surgery. Happy to share the dtsi or
more details if anyone wants them.