Python code got killed on BBAI

I was trying to run a Python script. It failed with the error message "Killed". It looks like this message means the Python program ran out of memory: https://stackoverflow.com/questions/1811173/why-does-my-python-script-randomly-get-killed.

Then I opened /var/log/syslog and saw the following:

Nov 6 14:26:56 beaglebone kernel: [ 3072.894508] Out of memory: Kill process 28376 (python3) score 525 or sacrifice child
Nov 6 14:26:56 beaglebone kernel: [ 3072.919653] Killed process 28376 (python3) total-vm:491332kB, anon-rss:322104kB, file-rss:6724kB, shmem-rss:0kB

Below is what df reports:

debian@beaglebone:~/neo-ai-dlr/tests/python/integration$ df -h
Filesystem Size Used Avail Use% Mounted on
udev 199M 0 199M 0% /dev
tmpfs 62M 7.0M 55M 12% /run
/dev/mmcblk0p1 30G 4.2G 24G 15% /
tmpfs 306M 8.0K 306M 1% /dev/shm
tmpfs 5.0M 4.0K 5.0M 1% /run/lock
tmpfs 306M 0 306M 0% /sys/fs/cgroup
tmpfs 62M 4.0K 62M 1% /run/user/1000

Why did the Python script run out of memory, and how can I solve this?

Thanks and regards,
Jianzhong

Sounds like your BB AI ran out of memory, not disk space. Is the Python script you are running part of the Cloud9 apps, or your own creation?
A Python program can end up consuming all the available memory on a system if you let it, e.g. by not deleting variables or objects once they are no longer needed.
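As an illustration of that point, here is a small sketch (my own example, not the poster's script) of keeping peak memory bounded by processing data in chunks and dropping each large temporary as soon as it is no longer needed, rather than holding everything in memory at once:

```python
import gc

def process_in_chunks(n, chunk_size=100_000):
    """Sum the range [0, n) in fixed-size chunks so peak memory
    stays bounded instead of materializing one huge list."""
    total = 0
    for start in range(0, n, chunk_size):
        # Large temporary object for this chunk only
        chunk = list(range(start, min(start + chunk_size, n)))
        total += sum(chunk)
        del chunk      # drop the reference as soon as it is no longer needed
        gc.collect()   # optional: also reclaim any cyclic garbage now
    return total

print(process_in_chunks(1_000_000))
```

Whether this helps depends on the script, of course; if a model or data set simply doesn't fit in the BB AI's ~500 MB of RAM, no amount of `del` will save it.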

You can see the available memory in multiple ways:
Ex:

  • top
  • free
  • cat /proc/meminfo

Or from Python as in this example:
https://stackoverflow.com/questions/1204378/getting-total-free-ram-from-within-python
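Along the lines of that link, a minimal Linux-only sketch that parses /proc/meminfo from Python (MemAvailable is present on reasonably modern kernels; older ones only report MemFree):

```python
def meminfo_kb():
    """Parse /proc/meminfo into a dict of {field: value-in-kB} (Linux only)."""
    info = {}
    with open("/proc/meminfo") as f:
        for line in f:
            key, value = line.split(":", 1)
            info[key] = int(value.split()[0])  # numeric part; most fields are in kB
    return info

m = meminfo_kb()
print("MemTotal:    ", m["MemTotal"], "kB")
print("MemAvailable:", m.get("MemAvailable", m["MemFree"]), "kB")
```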

Cheers,

Jon

I took a look at the same code from https://neo-ai-dlr.readthedocs.io/en/latest/install.html.

I took the liberty of replacing python2.7 with python3.5 as the default. Also, I copied libdlr.* manually, as the install script doesn't seem to do it.

I was able to build it, but when I try to run the same script, I get:

debian@beaglebone:~/neo-ai-dlr/tests/python/integration$ python load_and_run_tvm_model.py
Preparing model artifacts for resnet18_v1 …
Preparing model artifacts for 4in2out …
Preparing model artifacts for assign_op …
Testing inference on resnet18…
Traceback (most recent call last):
  File "load_and_run_tvm_model.py", line 69, in <module>
    test_multi_input_multi_output()
  File "load_and_run_tvm_model.py", line 30, in test_multi_input_multi_output
    assert model._impl._get_output_size_dim(0) == (2, 1)
AttributeError: 'DLRModel' object has no attribute '_impl'

Jason,

I'm not sure if you've seen this, but the same error you are seeing is reported in this issue on the GitHub repo:
https://github.com/neo-ai/neo-ai-dlr/issues/74

Cheers,

Jon