Progress Report - Week 1 (23.05 - 31.05)

What I have done:

  • SPI tests (device tree for spidev not available on X15 => used soundcard device tree)
  • Created new device tree for CTAG face2|4 Audio Card (configures Multichannel Audio Serial Port (McASP), McSPI, audio codec, machine driver, … )
  • Modified ASoC machine driver for BB-X15
  • Master clock configuration differs between BB-X15 and BBB
  • => Added feature to dynamically configure master clock of McSPI via dt based on hardware
  • Tested SPI, I2S with oscilloscope (detected changes on pins when playing audio or setting up audio codec (e.g. hw_params))
  • no adapter / breakout board available yet => hard to connect logic analyzer or soundcard with expansion header of BB-X15
  • Set up BB-X15 for development (TI SDK Linux)
  • Manually built SDK to compile gstreamer DSP example (http://processors.wiki.ti.com/index.php/Processor_Training:_Multimedia#DSP_C66x_Gstreamer_Plugin_Internals)
  • took enormous time and disk capacity => skipped and focused on DSP examples in Linux SDK filesystem
  • Still some problems with TI SDK default SD card image (not booting, no serial debug messages)
  • had to get FTDI adapter
  • Got detailed information about DSP kernels / OpenCL

Conclusion:

  • Finished all project goals of first week (although not tested with hardware (expansion header X15 adapter / breakout board not available))
  • Got a lot of information for second project part (C66x audio lib) (DSP kernels, TI DSPLIB, OpenCL, …)
  • Ready to start with small coding tests (e.g. FFT)

Progress Report - Week 2 (31.05 - 07.06)

  • Setting up TI Linux SDK:
  • Problems setting up SD card (Arago project) with Ubuntu 16.04
  • Tried VM (VirtualBox and VMWare Workstation) with Ubuntu 14.04 (Ubuntu 16.04 not officially supported for TI Linux SDK) => Still problems creating SD Card
  • Used old netbook with Ubuntu 14.04 => Successfully created SD Card with Arago project
  • Porting the CTAG face2|4 Audio Card drivers to BB-X15 (project part 1):
  • Adapter board is developed by rma (mentor)
  • DSP audio library (project part 2):
  • Information gathering via TI OpenCL article (http://downloads.ti.com/mctools/esd/docs/opencl/index.html) and TI OpenCL examples in Linux SDK
  • Successfully built and ran example applications (platforms, dsplib_fft, float_compute, …)
  • Information gathering about theoretical background of signal operations (e.g. twiddle factors for FFT)
  • Created library template which uses TI DSPLIB
  • Information gathering about direct ALSA buffer access (mmap) and GStreamer multimedia framework
  • Began coding ALSA buffer access to continuously calculate typical signal operations (e.g. FFT, convolution, …)

Progress Report - Week 2 (31.05 - 07.06)

Setting up TI Linux SDK:

Problems setting up SD card (Arago project) with Ubuntu 16.04
Tried VM (VirtualBox and VMWare Workstation) with Ubuntu 14.04 (Ubuntu 16.04
not officially supported for TI Linux SDK) => Still problems creating SD Card

^ yeah, don't bother with any of the VM's, they can't handle the low
level SD card creation commands, without voodoo magic, or rain
dances..

Used old netbook with Ubuntu 14.04 => Successfully created SD Card with
Arago project

Porting the CTAG face2|4 Audio Card drivers to BB-X15 (project part 1):

Adapter board is developed by rma (mentor)

DSP audio library (project part 2):

Information gathering via TI OpenCL article
(TI OpenCL v01.02.xx User’s Guide — TI OpenCL User's Guide) and TI OpenCL
examples in Linux SDK
Successfully built and ran example applications (platforms, dsplib_fft,
float_compute, ...)
Information gathering about theoretical background of signal operations
(e.g. twiddle factors for FFT)
Created library template which uses TI DSPLIB
Information gathering about direct ALSA buffer access (mmap) and GStreamer
multimedia framework
Began coding ALSA buffer access to continuously calculate typical signal
operations (e.g. FFT, convolution, ...)

x15, fyi: regarding debian, only the v4.1.x kernel has opencl working
out of the box, i need to finish the port to our v4.4.x kernel. :wink:

Regards,

Progress Report Week 3 (08.06- 14.06)

  • Problems with compilation and a compiler error that was really hard to understand

  • Error was only showing undefined reference and some pointer addresses

  • Problem was the installed Linaro toolchain from Ubuntu repos and C++ std

  • => Solved by symlinking Linaro toolchain of Arago Linux, Makefile modifications and some extra environment variables

  • Problems with shared memory between host CPU and DSP (invalid pointer error)

  • Error occurred due to dynamically allocated heap (not in global address space => not accessible from DSP)

  • => Fixed and optimized with TI OpenCL zero-copy memory extension __malloc_ddr() and __free_ddr() which uses reserved DDR3 area of host for shared memory

  • Created DSP tests and GNU Octave script to plot results of DSP calculations (currently only FFT)

  • FFT tests were successful

  • Plots are pushed to repo

  • FFT with N = 16384 of 1kHz sine needs 2718 us (including buffer accesses)

  • DSP kernel is currently compiled online => Precompiled kernel should be quicker

  • Code refactorings and documentation in repo

  • Makefile bugfixes

  • Build instructions
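To put the 2718 us figure into perspective, it can be compared with the playback duration of one block of N samples. The sample rate is not stated above, so the 48 kHz used here is purely an assumption, and the variable names are only for illustration.

```python
# Rough real-time budget check for the FFT timing reported above.
# ASSUMED_SAMPLE_RATE_HZ is an assumption; the report does not state the rate.

def block_duration_us(n_samples, sample_rate_hz):
    """Time span covered by n_samples of audio, in microseconds."""
    return n_samples / sample_rate_hz * 1e6

N = 16384                          # FFT size from the report
FFT_TIME_US = 2718.0               # measured time from the report (incl. buffer accesses)
ASSUMED_SAMPLE_RATE_HZ = 48000.0   # assumption, not from the report

budget_us = block_duration_us(N, ASSUMED_SAMPLE_RATE_HZ)
load = FFT_TIME_US / budget_us

print(f"block duration: {budget_us:.0f} us")
print(f"FFT share of block time: {load * 100:.2f} %")
```

Under this assumption one block covers roughly 341 ms of audio, so the FFT itself would use less than 1 % of the available time. Such a large N still implies a correspondingly large block latency, which is a separate question from compute time.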

@Robert

Thanks for the information.
Then I'll stick to 4.1 kernel for now;-)

Progress Report Week 4 (15.06 - 21.06)

What I have done:

  • Optimized audio signal lib:

  • Signal operations are asynchronously offloaded to C66x DSPs now

  • Using zero-copy OpenCL extensions for buffers now

  • Performance tests and evaluation of FFT/IFFT

  • Still bugs with IFFT (some samples of reconstructed sine are wrong)

  • Did some research in signal theory and tests with MATLAB

  • Refactorings: Separated lib and main app

  • Switched to BeagleBoard Debian 8.4 2016-04-10 image from TI Linux SDK

  • OpenCL support for 4.4 kernel not ported yet (see post of Robert Nelson)

  • Created JACK client to receive and transmit audio streams in realtime

  • Began to create Qt app for GUI and realtime plots of signal / spectrum

  • Not sure yet whether to use QCustomPlot (namespace conflicts) or to do it from scratch

What to do next:

  • Finish GUI for signal plotting

  • Speed up tests and bug fixes with signal operations (hopefully;-))

  • Applying some windows before calculating FFT/IFFT to find bugs in sine reconstruction

  • Verifications with MATLAB

  • Speed up signal calculations

  • Using buffers in on-chip MSMC memory instead of DDR3

  • Find block size (N) for FFT/IFFT

  • Add IIR/FIR and other useful operations to lib

  • Implement a demo audio effect which uses lib (e.g. reverb based on impulse response)

Progress Report Week 5 (22.06 - 28.06)

What I have done:

  • Fixed IFFT bug in sine reconstruction
  • FFT worked fine already, reconstructed sine was buggy
  • Did some research in discrete LTI systems
  • Bug was caused by wrongly generated twiddle factors (have to be complex conjugated for IFFT)
  • FFT works correctly now → Next I will use real time audio via JACK for “testing”
  • After this I will start implementing a convolution reverb
  • Adapter boards for BB-X15 <-> CTAG face2|4 Audio Card arrived (thanks rma!), so I switched back to my other project part (porting soundcard drivers to BB-X15)
  • Test of SPI
  • At first no clock or data signal (had some bugs in device tree → fixed)
  • After successful data and clock signal output (decoded with logic analyzer) still problems with chip select line
  • Did a lot of tests with different chip selects, multiplex settings, …
  • Bug was caused by one of the adapter boards → tried another one and everything works fine now
  • Verified all register settings of audio codec (AD1938) → everything fine
  • Test of I2S
  • Audio codec is bit and frame clock master → clock signals work fine
  • Kernel panic as soon as CPU DAI is triggered
  • Error logs of serial debugging mentioned IOMMU and firmware errors of DSP firmware (although not used)
  • Tried several configurations (TDM channels, CPU DAI clock master, …)
  • Got some help in beagle-gsoc IRC channel (thanks w_m, wormo and nerdboy!)
  • Turned out that the DSP firmware causes the kernel crash => temporarily moved the firmware files => everything works fine
  • Next I will try to fix kernel bug (reserve memory area via cmemk for DSP)
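The twiddle factor issue mentioned in the IFFT fix above can be illustrated with a small pure-Python sketch. It uses a plain O(N^2) DFT with a precomputed twiddle table, not TI DSPLIB, and all function names are made up for this example. It shows that an inverse transform computed with the forward (non-conjugated) table does not reconstruct the sine, while the conjugated table does.

```python
import cmath
import math

def twiddle_table(n, inverse=False):
    """W[k] = exp(-2*pi*i*k/n) for the forward transform, conjugated for the inverse."""
    table = [cmath.exp(-2j * math.pi * k / n) for k in range(n)]
    if inverse:
        table = [w.conjugate() for w in table]
    return table

def transform(x, table):
    """Plain O(n^2) DFT using a precomputed twiddle table (no scaling)."""
    n = len(x)
    return [sum(x[m] * table[(k * m) % n] for m in range(n)) for k in range(n)]

def dft(x):
    return transform(x, twiddle_table(len(x)))

def idft(spectrum):
    """Inverse transform: conjugated twiddles plus 1/n scaling."""
    n = len(spectrum)
    return [v / n for v in transform(spectrum, twiddle_table(n, inverse=True))]

def idft_with_forward_twiddles(spectrum):
    """Faulty variant for comparison: inverse computed with the forward table."""
    n = len(spectrum)
    return [v / n for v in transform(spectrum, twiddle_table(n))]

N = 64
sine = [math.sin(2 * math.pi * 4 * m / N) for m in range(N)]  # 4 cycles per block
spectrum = dft(sine)

good = [v.real for v in idft(spectrum)]
bad = [v.real for v in idft_with_forward_twiddles(spectrum)]

err_good = max(abs(a - b) for a, b in zip(sine, good))
err_bad = max(abs(a - b) for a, b in zip(sine, bad))
print("max error, conjugated twiddles:", err_good)
print("max error, forward twiddles:   ", err_bad)
```

With the forward table the result is the time-reversed input, which for a sine means a sign flip. Whether this matches the exact symptom seen on the DSP is not known from the report.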

Progress Report Week 6 (29.06 - 05.07)

  • Started implementing convolution reverb
  • Created extra method in AudioAPI which has input buffer (input audio in time domain) and array with impulse response in frequency domain (downloaded some examples in WAV format).
  • Impulse responses can be loaded in WAV format with libsndfile
  • Audio from reverb is completely calculated on DSPs
    1. Spectrum of input buffer is calculated with FFT.
    2. Calculated spectrum is multiplied with impulse response of reverb.
    3. Audio in time domain is reconstructed using IFFT.
  • To speed up tests JACK client is used to get audio in real time
  • Soundcard still can’t be used in connection with DSPs due to firmware bug of DSP
  • => Switched over to fix bug with cmemk
  • Gathering of information about cmemk / ti linux utils
  • Have to compile cmemk kernel module on my own (kernel module from factory kernel of BB-X15 image not working with my custom kernel)
  • Maybe also have to reserve explicit memory location via device tree
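The three reverb steps listed above (FFT, multiplication in the frequency domain, IFFT) can be sketched on the host in plain Python. This is only a minimal illustration under assumptions: it is not the DSP implementation, it transforms the impulse response itself instead of receiving it already in the frequency domain, and the helper names are made up. Both inputs are zero-padded to a common power-of-two length so that the circular convolution of the FFT equals the linear convolution.

```python
import cmath

def fft(x, inverse=False):
    """Recursive radix-2 FFT; len(x) must be a power of two. No scaling."""
    n = len(x)
    if n == 1:
        return [complex(x[0])]
    sign = 1 if inverse else -1
    even = fft(x[0::2], inverse)
    odd = fft(x[1::2], inverse)
    out = [0j] * n
    for k in range(n // 2):
        t = cmath.exp(sign * 2j * cmath.pi * k / n) * odd[k]
        out[k] = even[k] + t
        out[k + n // 2] = even[k] - t
    return out

def ifft(spectrum):
    n = len(spectrum)
    return [v / n for v in fft(spectrum, inverse=True)]

def next_pow2(n):
    p = 1
    while p < n:
        p *= 2
    return p

def fft_convolve(signal, impulse_response):
    """Linear convolution via FFT -> multiply -> IFFT with zero-padding."""
    out_len = len(signal) + len(impulse_response) - 1
    n = next_pow2(out_len)
    x = list(signal) + [0.0] * (n - len(signal))
    h = list(impulse_response) + [0.0] * (n - len(impulse_response))
    product = [a * b for a, b in zip(fft(x), fft(h))]
    return [v.real for v in ifft(product)[:out_len]]

def direct_convolve(signal, impulse_response):
    """Reference: time-domain convolution."""
    out = [0.0] * (len(signal) + len(impulse_response) - 1)
    for i, xv in enumerate(signal):
        for j, hv in enumerate(impulse_response):
            out[i + j] += xv * hv
    return out

dry = [1.0, 2.0, 3.0, 4.0, 5.0]
ir = [0.5, 0.25, 0.125]
print(fft_convolve(dry, ir))
print(direct_convolve(dry, ir))
```

Both calls should print the same values up to rounding, which gives a simple host-side reference for checking DSP results.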

cmemk should be working now as of 4.4.14-ti-r34 (pushed last night)

[ 70.489185] cmem_init: allowOverlap parameter has been deprecated, ignoring...
[ 70.502916] cmemk initialized

https://github.com/RobertCNelson/ti-linux-kernel-dev/blob/ti-linux-4.4.y/patches/x15/fixes/0001-x15-mmc-cmem-debugss.patch

Regards,

Progress Report Week 7 (06.07 - 12.07)

What I have done:

  • Compiled cmemk module for 4.1 and 4.4 kernel
  • First I tried compiling cmemk module with kernel stubs (took a lot of time until I realized that cmemk module can be built at runtime via dkms)
  • Compiled cmemk module for own custom kernel with dkms
  • Still kernel crash when using soundcard driver
  • iommu crash in remoteproc in connection with dsp firmware (dra7-dsp1-fw.xe66 / dra7-dsp2-fw.xe66) when soundcard is triggered (i.e. audio is played)
  • Memory areas of DSPs and soundcard (or mcasp) seem to overlap
  • If DSP firmware is temporarily moved soundcard works fine
  • Tried several kernels / tags
  • 4.1.13-ti-rt-r39: (before some cmem properties are commented in device tree am57xx-beagle-x15.dts): Error: Unable to allocate OCL MSMC memory from 0x40500000
  • 4.1.13-ti-rt-r56: DSP calculations work fine. Kernel crash when using soundcard.
  • Later than 4.1.13-ti-rt-r56 (not booting on BB-X15! (see error output http://pastebin.com/Bm7ZYmpS))
  • 4.4.14-ti-rt-r34 bb.org_defconfig (latest commit with cmem patch): Unable to allocate OCL MSMC memory from 0x40500000
  • Tried some device tree modifications in cmem node for all kernel versions (see above) => no IPC connection to DSPs, or no execution on DSPs and errors
  • Removed other soundcard drivers from dt (sound0 (analog i/o), sound1 (hdmi)) => Still kernel crash
  • Tried several parameters for cmemk => no change (parameters seem to be parsed from dt)

What I do next:

  • Next I want to figure out the exact memory areas used by soundcard driver and remoteproc to check overlapping
  • When memory areas are clear, I want to reserve memory for cmemk with kernel parameter (mem=#[KMG]@0xXXXXXXXX) as mentioned in TI cmem troubleshooting (http://processors.wiki.ti.com/index.php/CMEM_Overview)

Progress Report Week 8 (13.07 - 19.07)

What I have done:

  • Verified reserved memory for cmem according to http://downloads.ti.com/mctools/esd/docs/opencl/memory/ddr-partition.html#am57

  • Tried different cmem configurations in dt for 4.1 kernel and 4.4 kernel

  • 4.1: Still kernel crash with dsp firmware or no ipc connection when using configuration of 4.4 kernel

  • 4.4: Clock errors with default config of dt (http://pastebin.com/53Fj9V7W). Successful exit with configuration of 4.1 kernel, but no DSP operations were executed

  • Looked for bugs in ASoC machine driver (e.g. widgets, clock config, …)

  • Still kernel crash

  • Tried ASoC simple card instead of own machine driver (is used for onboard analog audio i/o of BB-X15 as well)

  • Still kernel crash => problem does not seem to be caused by the soundcard driver (ASoC machine driver / dt)

  • Get more information about edma, dsp dt nodes in connection with kernel modules, …

  • Moved over to library development

  • Refactorings: lib is shared now

  • hid some implementation details (especially OpenCL)

  • Integrated QCustomPlot for realtime plots (had some problems with conflicting types of Qt OpenCL (is included from OpenGL) and TI OpenCL)

What I do next:

  • Get more help and information to fix kernel crash (highest priority right now)

  • Show audio signals in frequency and time domain in realtime with QCustomPlot

  • Continue implementing convolution reverb

  • Integrate more signal operations of TI DSPLIB (FIR, IIR, …)

Progress Report Week 9 (20.07 - 26.07)

What I have done:

  • Fixed kernel crash caused by DSP firmware (dra7-dsp1-fw.xe66 / dra7-dsp2-fw.xe66)
  • Kernel crash was caused by DSP firmware recovery path of remote processor framework
  • By disabling recovery (/sys/kernel/debug/remoteproc3|4/recovery) kernel crash can be avoided
  • Tested CTAG face 2|4 Audio Card and BB-X15 DSP lib simultaneously => everything works fine
  • Modified / extended build instructions for sound card drivers and DSP lib
  • Test program for DSP lib uses CTAG face2|4 Audio Card via JACK now instead of self calculated sine
  • DSP lib too slow to handle all audio samples => Performance optimizations
  • Huge refactorings / optimizations of DSP lib for stability and performance
  • Used C++11 features (smart pointers, std::function, …)
  • Implemented real time plots with gnuplot and gnuplot-iostream
  • Created additional API functions for preparing DSP calculations (e.g. buffers initialisation)
  • New bug occurred in FFT calculation (wrong spectrum of sine)
  • Seems to be caused by DSP lib performance optimizations (shouldn’t be hard to find and fix)
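The recovery workaround from the first item above could be scripted roughly as follows. This is a hedged sketch, not the procedure actually used: the directory layout is taken from the path quoted above and may differ between kernels (on some kernels the nodes sit one level deeper, under a remoteproc directory), the value "disabled" is an assumption based on the mainline remoteproc debugfs interface, and the function name is made up. It needs root and a mounted debugfs.

```python
from pathlib import Path

def disable_recovery(debugfs_dir, indices=(3, 4)):
    """Write "disabled" to <debugfs_dir>/remoteproc<N>/recovery for each index.

    Returns the list of nodes that were written; missing nodes are skipped.
    The default indices 3 and 4 follow the report above (assumption for other setups).
    """
    written = []
    for idx in indices:
        node = Path(debugfs_dir) / f"remoteproc{idx}" / "recovery"
        if node.exists():
            node.write_text("disabled")
            written.append(node)
    return written

# Example call on the target (run as root, path is an assumption):
#   disable_recovery("/sys/kernel/debug")
```

The setting does not persist across reboots, so it would have to be applied again after each boot, before the soundcard is used.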

What I do next:

  • Find and fix bug of FFT calculations
  • Performance optimizations:
  • Buffers in fast MSMC memory instead of DDR
  • Queue optimizations for managing OpenCL kernels
  • Add additional functions for FIR, IIR, …
  • Continue implementing convolution reverb

Progress Report - Week 10 (27.07 - 02.08.2016)

What I have done:

  • Recreated display for realtime plots
  • Switched from Qt to SDL1 (SDL2 not working) due to LLVM conflicts of OpenCL and OpenGL (see here)
  • Created mechanism for scaling amplitudes of calculated spectrum => plotting in realtime
  • Optimized queue mechanism to avoid callback overflow
  • Added more DSP operations to API (FIR, IIR)
  • Operations are available in DSP kernel already. Still need to create member functions of API for executing.
  • Created pull request for official BeagleBoard kernel to merge soundcard driver (travis error in build due to timeout???)
  • Added feature to query status of DSP operations
  • => Avoid callback overflows
  • Created new display class to plot waveforms or spectrum in realtime
  • Added interpolation (linear) to scale calculated spectrum amplitudes to SDL screen size
  • Tested with realtime audio input via JACK of CTAG face2|4 Audio Card (currently only mono)
  • Improved overall performance and stability
  • Did several tests for memory leaks and stability (e.g. callback overflow)
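The linear interpolation used to fit the spectrum onto the screen, as mentioned in the list above, could look roughly like the following sketch. It is plain Python rather than the SDL display class, and the function names and the normalisation to the peak value are assumptions made for this example.

```python
def resample_linear(values, width):
    """Linearly interpolate `values` onto `width` evenly spaced columns."""
    if width <= 0 or not values:
        return []
    if width == 1 or len(values) == 1:
        return [float(values[0])] * width
    step = (len(values) - 1) / (width - 1)
    out = []
    for col in range(width):
        pos = col * step                      # fractional index into `values`
        left = min(int(pos), len(values) - 1)
        right = min(left + 1, len(values) - 1)
        frac = pos - left
        out.append(values[left] * (1.0 - frac) + values[right] * frac)
    return out

def to_pixel_rows(amplitudes, height):
    """Map non-negative amplitudes to pixel rows (row 0 is the top of the screen)."""
    peak = max(amplitudes) or 1.0             # avoid division by zero for silence
    return [height - 1 - round(a / peak * (height - 1)) for a in amplitudes]

magnitudes = [0.0, 10.0]                      # two spectrum bins, for illustration
columns = resample_linear(magnitudes, 3)      # stretch to 3 screen columns
print(columns)
print(to_pixel_rows(columns, 11))
```

The same function also reduces a spectrum with more bins than screen columns, although plain interpolation then skips bins, so narrow peaks can be missed.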

What I do next:

  • Create windowing and interpolation mechanism for continuous FFT
  • Optimize plotting via SDL for performance
  • Create API methods for filter operations (data structure for parameters, …)
  • Continue implementing convolution reverb
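As background for the planned windowing step, here is a small sketch of a Hann window applied before a DFT. The choice of the Hann window, the periodic variant and all names are assumptions for illustration; the report does not say which window will be used. The example compares the leakage into a distant bin for a sine that does not fit the block exactly.

```python
import cmath
import math

def hann_periodic(n):
    """Periodic Hann window of length n (assumed window type for this sketch)."""
    return [0.5 - 0.5 * math.cos(2.0 * math.pi * i / n) for i in range(n)]

def bin_magnitude(x, k):
    """Magnitude of DFT bin k, computed directly."""
    n = len(x)
    return abs(sum(x[m] * cmath.exp(-2j * math.pi * k * m / n) for m in range(n)))

N = 64
# 10.5 cycles per block: the sine is cut off mid-period, which causes leakage
sine = [math.sin(2.0 * math.pi * 10.5 * m / N) for m in range(N)]
window = hann_periodic(N)
windowed = [s * w for s, w in zip(sine, window)]

FAR_BIN = 25                                  # a bin far away from the sine frequency
leak_rect = bin_magnitude(sine, FAR_BIN)
leak_hann = bin_magnitude(windowed, FAR_BIN)
print("leakage without window:", leak_rect)
print("leakage with Hann window:", leak_hann)
```

The window lowers the leakage far from the peak at the cost of a wider main lobe, so the spectral line of the sine itself becomes broader in the plot.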

Progress Report - Week 11 (03.08.2016 - 09.08.2016)

What I have done:

  • Began to create final GSoC report
  • Did detailed research of convolution reverb
  • Impulse responses can be very long (~10 seconds or even longer); therefore the realtime audio input signal and the reverb impulse response have to be sliced into equal blocks to achieve low latencies
  • Convolution in time domain (also available in TI DSPLIB) is very expensive therefore algorithm via FFT (in frequency domain) should be used
  • Last calculated block has to be taken into account for next calculation (otherwise calculated spectrum of single block isn’t correct/complete => unwanted acoustical effects due to aliasing effect and so on)
  • More information can be found here
  • Decided to create a different additional app for demonstration of DSP library
  • New demo effect idea is additive synthesis via IFFT:
  • User can “draw” spectral lines via SDL1
  • Via IFFT audio signal in time domain is calculated and saved in additional buffer
  • Buffer is continuously read and written to JACK client
  • If changes occurred in spectrum, signal in time domain is calculated again and additional buffer will be updated.
  • Very efficient (IFFT is calculated only once if changes occurred)
  • Low latency (N of IFFT can be very low / fast computation on DSPs)
  • Additional additive synth demo (and convolution reverb) will be also finished in the future if not finished in GSoC time (want to build Eurorack synthesizer module based on GSoC project)
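The point above that the previous block has to be taken into account for the next one can be illustrated with overlap-add, one common way of doing block-wise convolution. This is a sketch under assumptions, not the planned implementation: only the input signal is sliced (partitioning the impulse response is not shown), each block is convolved in the time domain for brevity where the real system would use FFT, multiplication and IFFT, and the class and function names are made up.

```python
def convolve(x, h):
    """Time-domain convolution of one block (stand-in for FFT-based convolution)."""
    out = [0.0] * (len(x) + len(h) - 1)
    for i, xv in enumerate(x):
        for j, hv in enumerate(h):
            out[i + j] += xv * hv
    return out

class OverlapAdd:
    """Block-wise convolution that carries the tail of each block into the next."""

    def __init__(self, impulse_response, block_size):
        self.h = list(impulse_response)
        self.block_size = block_size
        self.tail = [0.0] * (len(self.h) - 1)   # reverb tail still to be played

    def process(self, block):
        if len(block) != self.block_size:
            raise ValueError("unexpected block size")
        full = convolve(block, self.h)          # block_size + len(h) - 1 samples
        for i, t in enumerate(self.tail):       # add what earlier blocks left over
            full[i] += t
        self.tail = full[self.block_size:]      # keep the rest for later blocks
        return full[:self.block_size]

signal = [float(v) for v in range(1, 13)]       # 12 samples, 3 blocks of 4
ir = [1.0, 0.5, 0.25, 0.125, 0.0625, 0.03125, 0.015625]  # longer than one block

reverb = OverlapAdd(ir, block_size=4)
streamed = []
for start in range(0, len(signal), 4):
    streamed.extend(reverb.process(signal[start:start + 4]))

print(streamed)
print(convolve(signal, ir)[:len(signal)])
```

Both printed lists should agree up to rounding. Dropping the tail instead of carrying it over would cut off the reverb at every block boundary.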

What I do next:

  • Create windowing and interpolation (linear) mechanism for continuous FFT
  • Create API methods for filter operations (data structure for parameters, …)
  • Clean code and extend documentation (API description via Doxygen, Markdown document)
  • Continue/update final GSoC report
  • Create additive demo synth app

Hi Henrik,

I saw your progress on the Internet and I am also working on the AM57x DSP. I ran into the same problem:

Unable to allocate OCL MSMC memory from 0x40500000

I saw that you disabled remoteproc 3|4 and that this solved the issue. But when I followed your instructions, it did not work on my board…

Could you please share more details about how you solved this issue?

Thank you!

On Tuesday, July 19, 2016 at 10:53:03 PM UTC+8, Henrik Langer wrote:

Dear Henrik,

Thank you for your work and for sharing it. I am also getting the OCL error:

Unable to allocate OCL MSMC memory from 0x40500000

Could you please comment on how you finally solved this issue?

Thanks,

Fran