I’m attempting to run TI’s official EdgeAI benchmarks on my AI-64 to replicate their numbers as well as provide a POC before moving on to my own models.
The benchmarks are here: https://github.com/TexasInstruments/edgeai-benchmark
I don’t have any real background knowledge of how this is supposed to be done, so I’d appreciate any guidance. I’m just guessing how this is supposed to work from their (IMO rather poor) documentation. But I’ll walk through what I’ve tried and why it isn’t working.
My belief is that the “edgeai” version of the OS images includes a copy of TI’s builds of their libraries, and that this ought to enable me to run EdgeAI apps on it.
I started with a fresh install of the following image, on an SD card. bbai64-debian-11.4-xfce-edgeai-arm64-2022-09-02-10gb.img.xz
from: Debian 11.x (Bullseye) - Monthly Snapshots (ARM64).
Here’s my process so far. (Written as notes to self, so any "if"s are de facto true here.)
System setup and prep
Expand rootfs (if using SD card)
The default rootfs is 10GB for this image. To expand it to use your full SD card:
wget https://raw.githubusercontent.com/RobertCNelson/boot-scripts/master/tools/grow_partition.sh
chmod +x grow_partition.sh
sudo ./grow_partition.sh
sudo reboot
Disable web servers
If unused, you can disable these:
sudo systemctl disable --now bb-code-server
sudo systemctl disable --now nodered
Running EdgeAI benchmark suite
`/opt` already has some of these files in the EdgeAI image, but:
# Specific commits are my best guess to match the TIDL 8.2 that seems to come with this image.
# (git clone -b takes a branch or tag, not a commit SHA, so clone first and then check out the commit.)
git clone https://github.com/TexasInstruments/edgeai-benchmark.git
git -C edgeai-benchmark checkout 3342c09c56006f847a4907e2e930991bc2af4a21
git clone https://github.com/TexasInstruments/edgeai-modelzoo.git
git -C edgeai-modelzoo checkout 20ef897df41198201a88e6250901934466303b57
sudo apt update
sudo apt-get install python3-venv
# Note: Retroactively added --system-site-packages because it needs the global install of onnxruntime.
python3 -m venv --system-site-packages benchmarkenv
source ./benchmarkenv/bin/activate
# Required for onnx
sudo apt install protobuf-compiler
pip install --upgrade onnx
pip install -r requirements_evm.txt
# See notes below before continuing
./run_benchmarks_evm.sh
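Before running the script, a quick way to confirm that `--system-site-packages` actually took effect is to check that the venv's interpreter can locate the system-wide onnxruntime. This is my own sanity check, not part of TI's flow:

```python
# Quick check that --system-site-packages took effect: inside the venv,
# the interpreter should still be able to locate the system onnxruntime.
import importlib.util


def visible(mod: str) -> bool:
    """Return True if `mod` can be found on this interpreter's search path."""
    return importlib.util.find_spec(mod) is not None


if __name__ == "__main__":
    # On the EdgeAI image, inside the venv, I'd expect this to print True
    # (assumption: onnxruntime is the globally installed TI build).
    print("onnxruntime visible:", visible("onnxruntime"))
```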
I found the following issues in the EdgeAI setup which I had to manually correct:
- The NYU depth dataset’s normal host seems to be unavailable. I had to disable that dataset.
- The downloads from TI’s servers of the ONNX models failed with 403 errors, for reasons that are unclear.
`requests.get`/curl/etc. work, but `urllib.request.urlopen` gets an error. I manually downloaded the files into the right places.
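One common cause of this split behavior is a server rejecting urllib's default `Python-urllib/3.x` User-Agent while curl and requests send different ones. A minimal workaround sketch, assuming (unverified) that this is what TI's servers are doing:

```python
# Workaround sketch: fetch with a browser-like User-Agent, since some
# servers return 403 for urllib's default "Python-urllib/3.x" header.
# (Assumption -- I haven't confirmed this is why TI's servers reject it.)
import urllib.request


def fetch(url: str) -> bytes:
    req = urllib.request.Request(
        url,
        headers={"User-Agent": "Mozilla/5.0 (X11; Linux aarch64)"},
    )
    with urllib.request.urlopen(req) as resp:
        return resp.read()
```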
Current problem
<other output snipped>
libtidl_onnxrt_EP loaded 0xfaf27d0
Final number of subgraphs created are : 1, - Offloaded Nodes - 60, Total Nodes - 60
APP: Init ... !!!
APP_LOG: ERROR: Unable to open /dev/mem !!!
APP_LOG: ERROR: Unable to map memory @ 0xa90000 of size 512 bytes !!!
APP_LOG: ERROR: Unable to mmap gtc (0xa90000 of 512 bytes) !!!
APP: ERROR: Global timer init failed !!!
APP_LOG: ERROR: Unable to open /dev/mem !!!
APP_LOG: ERROR: Unable to map memory @ 0xb2000000 of size 262144 bytes !!!
APP: ERROR: Log writer init failed !!!
APP: Init ... Done !!!
./run_benchmarks_evm.sh: line 52: 1182 Segmentation fault python3 ./scripts/benchmark_modelzoo.py ${settings_file} "$@"
-------------------------------------------------------------------
Initial investigation
`strings` tells me that these errors come from `/usr/lib/libtivision_apps.so`. This binary seems to be closed-source – is that right?
Running under gdb, I am seeing the segfault at:
#0 __new_sem_wait (sem=0x0) at sem_wait.c:39
#1 0x0000ffffa6df67e8 in tivxPlatformSystemLock () from /usr/lib/libtivision_apps.so.8.2.0
#2 0x0000ffffa6dd43b4 in tivxObjDescAlloc () from /usr/lib/libtivision_apps.so.8.2.0
#3 0x0000ffffa6de3000 in vxCreateContext () from /usr/lib/libtivision_apps.so.8.2.0
#4 0x0000ffffa917a71c in TIDLRT_create () from /usr/lib/libvx_tidl_rt.so
#5 0x0000ffffe5a2a788 in TIDL_subgraphRtCreate () from /usr/lib/libtidl_onnxrt_EP.so
#6 0x0000ffffe5a29018 in TIDL_createStateInferFunc () from /usr/lib/libtidl_onnxrt_EP.so
#7 0x0000ffffa930a184 in ?? () from /usr/lib/python3.9/dist-packages/onnxruntime/capi/onnxruntime_pybind11_state.so
<cut to omit CPython function call internals>
I found docs for the two top (non-std) call frames.
If GDB is correctly reporting the arg to `sem_wait` as `0x0`, then I guess that'd be the problem. I'm not 100% sure I trust its stack analysis and haven't dug into it manually.
I don’t have a great theory for what’s going on here so far. My best guess is that the mmap failures reported via stdout cascade into an attempt to lock a null/garbage semaphore, but it could also be that those mmap errors have perfectly functional fallbacks – I don’t know.
I also imagine this wouldn’t be specific to their benchmarks, but I haven’t yet looked for a means of smoke testing TIDL with less “stuff” in the way.
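For the mmap half of the problem at least, a minimal probe like the following (my own sketch, not a TI tool; the addresses and sizes are the ones from the log above, and it assumes it's run as root on the AI-64) should reproduce the open/map failures without ONNX or the benchmark harness in the loop:

```python
# Minimal probe that mimics libtivision_apps' failing open/mmap of
# /dev/mem, independent of TIDL. Run as root on the target board.
import mmap
import os


def probe(phys_addr: int, size: int) -> str:
    """Try to mmap `size` bytes of /dev/mem at physical `phys_addr`."""
    try:
        fd = os.open("/dev/mem", os.O_RDWR | os.O_SYNC)
    except OSError as e:
        return f"open failed: {e}"
    try:
        m = mmap.mmap(fd, size, mmap.MAP_SHARED,
                      mmap.PROT_READ | mmap.PROT_WRITE, offset=phys_addr)
        m.close()
        return "ok"
    except OSError as e:
        return f"mmap failed: {e}"
    finally:
        os.close(fd)


if __name__ == "__main__":
    # The two addresses the APP_LOG errors complain about.
    for addr, size in [(0xA90000, 512), (0xB2000000, 262144)]:
        print(hex(addr), probe(addr, size))
```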
I also found a post which references the same error, although it doesn’t seem to give a specific misconfiguration to look for other than “incorrect memory mappings”: TDA4VM: APP_LOG: ERROR: Unable to map memory @ 0xb2000000 of size 262144 bytes !!! - Processors forum - Processors - TI E2E support forums. Perhaps someone from BB.org knows what the Beagle device tree looks like vs. what TI’s does.
dmesg:
ai64-tidl-initial-testing-dmesg.txt (48.3 KB)
Questions
Have others gotten TIDL/EdgeAI running on the AI-64? Is there something obvious I’m doing wrong, e.g. is a Debian 10 image or a different build of their libraries expected to work better?
I imagine that there’s either a device tree memory mapping in the Beagle OS distribution that doesn’t match what TI expects, or I’ve made a versioning error. Can anyone point me in a good direction re: resolving these errors? If that binary is indeed closed-source, it seems it’ll be a big pain to reverse-engineer the problem.
When I next have some time to look at this I’ll probably see if those mmap-failed addresses have particular significance in TI’s default device trees (e.g. are peripheral MMIO of interest); maybe that’ll uncover a lead. Although if what’s actually failing is whatever it’s trying to do with `/dev/mem`, maybe that’s a waste of time.
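When I do, one quick cross-check is to read the running kernel's reserved-memory carveouts from `/proc/device-tree` and see whether the failing addresses fall inside any of them. A sketch, assuming (as is the case in the J721E device trees I've seen) 64-bit `#address-cells`/`#size-cells` of 2; the node names are whatever the booted device tree defines:

```python
# Sketch: list reserved-memory carveouts from the live device tree and
# check whether a physical address (e.g. 0xb2000000) falls inside one.
import os
import struct


def reserved_regions(dt: str = "/proc/device-tree/reserved-memory"):
    """Return (node, base, size) tuples for each reserved-memory carveout."""
    regions = []
    if not os.path.isdir(dt):
        return regions
    for node in sorted(os.listdir(dt)):
        reg = os.path.join(dt, node, "reg")
        if not os.path.isfile(reg):
            continue
        with open(reg, "rb") as f:
            raw = f.read()
        # Device-tree cells are big-endian; assumes two 64-bit cells
        # per entry (#address-cells = #size-cells = 2).
        for off in range(0, len(raw) - 15, 16):
            base, size = struct.unpack(">QQ", raw[off:off + 16])
            regions.append((node, base, size))
    return regions


def covers(regions, phys: int):
    """Names of the carveouts containing physical address `phys`."""
    return [name for name, base, size in regions if base <= phys < base + size]
```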