I’ll post weekly updates for this GSoC project here.
More information about the project is available in the proposal.
Updates:
- Created a repository and added links to the project page
- Met with mentors and discussed considerations for a Pure Data external for difflogic
- Prepared slides for intro video and got them approved by mentor
- Received shipment from Bela (Bela cape + Trill square); awaiting BeagleBone Black
Updates:
- Created an introductory video
- Experimented a bit with difflogic and different ways of representing data for input to/output from networks
- Got set up with Google Colab Pro for GPU access
- Created a trivial Pure Data external, started the process of embedding a compiled difflogic network inside it
- Received shipment with BeagleBone Black and peripherals
Blockers: none at the moment! Just debugging the Pd external.
Upcoming goals:
- Get Pd external working with embedded difflogic network
- Set up Bela (BeagleBone Black + Bela cape)
- Run difflogic network on Bela
- Try running on PRU
- Performance comparison? (laptop CPU vs. BB CPU vs. BB PRU)
Updates:
- Did a bit more experimenting with difflogic
- Set up Bela (BeagleBone Black + Bela cape); got a project to build and got audio out
- Debugged Pd external with embedded difflogic network; got it running at audio rate (on a laptop)
- Compiled Pd external to run on Bela; updated build script to work on both laptop and Bela
- Started investigating performance of simple difflogic network on Bela; running at audio rate, it does not keep up with real time
- Ran example project on PRU
Blockers: none; currently debugging performance.
Upcoming goals:
- Get simple difflogic network running at audio rate on Bela
- Run difflogic network on PRU
- Start work on additional wrappers (beyond Pd): C++, SuperCollider
Updates:
- Resolved performance issues with difflogic network on Bela, which is now keeping up with audio rate (both in standalone C++ project and as Pd external)
- Took basic performance measurements (e.g. observing CPU usage at different block sizes, timing different sections with `std::chrono::steady_clock`)
- Modified PRU example project to send custom values between CPU & PRU
- Wasn’t able to get PRU->CPU interrupts working; currently relying on shared memory to signal status
- Got PRU to perform a trivial calculation on-demand from CPU
- Set up TI toolchain for compiling C for PRU (`clpru` & co.)
- Got batching working in Colab notebook (CUDA)
- Started bringing compiled difflogic network into PRU project
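As a rough illustration of the timing measurements mentioned above, here is the same idea in Python using `time.perf_counter` (a monotonic clock, analogous to `std::chrono::steady_clock`); the block size, sample rate, and network stub are all assumptions, not the project’s actual values:

```python
import time

def run_network(samples):
    # Stand-in for the compiled difflogic network (hypothetical).
    return [s & 0xFF for s in samples]

BLOCK_SIZE = 64      # audio block size (assumed)
SAMPLE_RATE = 8000   # Hz (assumed)

block = list(range(BLOCK_SIZE))
start = time.perf_counter()  # monotonic, like std::chrono::steady_clock
run_network(block)
elapsed = time.perf_counter() - start

# To keep up with real-time audio, processing a block must take less
# wall-clock time than the audio the block represents.
budget = BLOCK_SIZE / SAMPLE_RATE
print(f"elapsed={elapsed:.6f}s, budget={budget:.6f}s")
```

Comparing per-block processing time against the block’s real-time duration is what determines whether the network “keeps up” at audio rate.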
Blockers: currently trying to fit generated assembly into my existing PRU code, which is proving to be a hacky endeavor that involves some toolchain headaches (e.g. disagreements between `clpru`, `dispru`, `pasm`)
Upcoming goals:
- Get difflogic network running on PRU
- Make process for doing so less hacky
- Write basic wrappers for C++, SuperCollider, CSound (or perhaps Faust)
Updates:
- Spent some time trying to get difflogic network running on PRU using a pipeline: generating C via the `difflogic` library, compiling the C with `clpru`, disassembling with `dispru`, and finally including it in some other (preexisting) assembly compiled with `pasm` (and loaded onto the PRU with `prussdrv_exec_code()` per the Bela example project).
- This approach caused a lot of headaches. After resolving numerous issues, the results obtained by the CPU and PRU still disagreed, and there were a few discrepancies in the intermediate computed values with no obvious cause.
- So, rather than continuing to debug `clpru`/`dispru` output, I went ahead and wrote a custom PRU assembly exporter for difflogic models (based on difflogic’s `CompiledLogicNet`, which generates some C). This sufficed to get my 8-bit identity network running on the PRU and producing the same results as the C-export version running on the CPU.
- Wrote basic wrapper for C++.
- Started on basic wrapper for SuperCollider.
Blockers: none at the moment
Upcoming goals:
- Clean up & generalize PRU assembly exporter
- Finish basic wrapper for SuperCollider
- Write basic wrapper for CSound or Faust
Updates:
- Improved generated PRU assembly significantly by loading the network as a graph (using `networkx`) and performing a series of simplifying transformations on it. On my test network, this reduces the number of basic operations by roughly two-thirds. (The number of assembly instructions is reduced further because the generated assembly now stores as many intermediate values as possible in the register file, rather than dumping everything into memory.)
- Also used `sympy` to verify correctness of the simplified network.
- Wrote basic wrapper for SuperCollider.
- Started working on Faust wrapper via Faust’s foreign function interface.
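To make the graph-plus-verification idea above concrete, here is a toy sketch: a tiny gate network as a `networkx` graph, one simplifying transformation (absorption), and an equivalence check with `sympy`. The gate set, node names, and the single rewrite rule are illustrative assumptions, not the project’s actual transformations:

```python
import networkx as nx
import sympy

# Toy gate network as a graph; edges point from inputs to gates.
g = nx.DiGraph()
g.add_node("a", op="input")
g.add_node("b", op="input")
g.add_node("n0", op="and")   # a AND b
g.add_node("n1", op="or")    # (a AND b) OR a
g.add_edge("a", "n0"); g.add_edge("b", "n0")
g.add_edge("n0", "n1"); g.add_edge("a", "n1")

def to_expr(g, node):
    """Convert the graph rooted at `node` into a sympy boolean expression."""
    info = g.nodes[node]
    if info["op"] == "input":
        return sympy.Symbol(node)
    args = [to_expr(g, p) for p in g.predecessors(node)]
    return {"and": sympy.And, "or": sympy.Or}[info["op"]](*args)

original = to_expr(g, "n1")

# One simplifying transformation: absorption, (x AND y) OR x -> x.
# Applied on the graph, the whole network reduces to input "a".
simplified = sympy.Symbol("a")

# Verify with sympy: the negated biconditional must be unsatisfiable.
check = sympy.satisfiable(sympy.Not(sympy.Equivalent(original, simplified)))
assert check is False
print("simplified network verified equivalent to original")
```

The SAT-based check scales better than exhaustive truth tables as input width grows, which is why it is a handy safety net for graph rewrites.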
Blockers: none
Upcoming goals:
- Finish basic wrapper for Faust
- Improve interfaces & functionality of wrappers
- Start experimenting with using difflogic in conjunction with DDSP techniques
Updates:
- Got basic Faust wrapper working via FFI
- Experimented a bit more with input encodings; tried thermometer (unary) encoding and one-hot encoding
- Fixed a bug in my notebook which was impacting the training of one of my types of models
- Compiled difflogic network as a shared library and loaded it with `dlopen()`
- Improved Pure Data external so it no longer has the model “baked in” but instead loads it dynamically (via `dlopen()`) from a user-specified path
- Added message-handling object `difflogic` to Pure Data external (to serve as a counterpart to the existing signal version, `difflogic~`)
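For reference, the two input encodings experimented with above can be sketched in a few lines (a minimal illustration; the bit width and bit ordering are assumptions):

```python
def one_hot(value, levels):
    # One-hot: exactly one bit set, at index `value`.
    return [1 if i == value else 0 for i in range(levels)]

def thermometer(value, levels):
    # Thermometer (unary): all bits below `value` set.
    return [1 if i < value else 0 for i in range(levels)]

print(one_hot(3, 8))      # [0, 0, 0, 1, 0, 0, 0, 0]
print(thermometer(3, 8))  # [1, 1, 1, 0, 0, 0, 0, 0]
```

Thermometer encoding preserves ordering (nearby values share most bits), which can make it friendlier than one-hot for networks operating on quantized audio values.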
Blockers: none
Upcoming goals:
- Improve SuperCollider wrapper
- Tidy up repository & write some basic documentation so others can try the wrappers
- Experiment with difflogic + DDSP
Updates:
- Trained new identity model based on unary encoding, following last week’s experiment
- Created custom pytorch layers for input/output conversions
- These can go in the model itself, rather than having to be known by the caller
- Later, we can also generate the appropriate code for these layers in the C & assembly exporters, rather than needing to bake the input/output conversion code into the various network wrappers.
- Per discussion with mentors, decided to refocus on demo applications for the time being
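A minimal sketch of what such input/output conversion layers could look like (layer names and shapes are hypothetical, not the project’s actual code; note that the hard comparison in the encoder is exactly the non-differentiable step mentioned in the blocker):

```python
import torch

class ThermometerEncode(torch.nn.Module):
    """Floats in [0, 1] -> unary (thermometer) bits. Hypothetical layer."""
    def __init__(self, levels):
        super().__init__()
        self.register_buffer("thresholds", torch.arange(levels) / levels)

    def forward(self, x):
        # (...,) floats -> (..., levels) bits; the comparison is a hard
        # threshold, so gradients do not flow through it.
        return (x.unsqueeze(-1) > self.thresholds).float()

class ThermometerDecode(torch.nn.Module):
    """Unary bits -> float, by counting set bits."""
    def forward(self, bits):
        return bits.sum(dim=-1) / bits.shape[-1]

enc, dec = ThermometerEncode(8), ThermometerDecode()
bits = enc(torch.tensor([0.5]))
print(bits.tolist(), dec(bits).item())
```

Keeping these conversions inside the model (rather than in each wrapper) means every wrapper can pass raw floats and let the exporters emit the equivalent conversion code.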
Blockers: for combining difflogic with non-logic layers (e.g. for DDSP), still thinking about how to make the input conversion (e.g. from floats to unary-encoded bits) differentiable.
Upcoming goals:
- Train some networks to do audio synthesis as a function of time, bytebeat-style
- Create some interface for visualizing and/or manipulating synthesis networks
- Record a brief demo video
Updates:
- Set up notebook for running and playing bytebeat expressions (e.g. `((t >> 10) & 42) * t`)
- Trained small networks on output from a bytebeat expression; generated and played audio from the network in the notebook.
- Found that networks with binary output (as opposed to GroupSum output) seem to yield more interesting results.
- Cleaned up repo a bit to reduce duplication (moving code shared between notebooks into utility modules).
- Calculated ratio between the size of the model and the size of its output (in bits) to get a sense of how much the model is a “compressed” representation of the output.
- Generated & played audio from notebook while also tweaking the model, live.
- Recorded a quick demo video.
- Modified difflogic’s `CompiledLogicNet` to support models without GroupSum
- Started building a web-based interface for exploring the space of synthesis networks.
- Currently, this plays the network live via Wasm + AudioWorklets and displays the network graph.
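The bytebeat setup above can be approximated in a few lines; the 8 kHz rate and 8-bit samples match figures elsewhere in these notes, but the exact notebook code is not reproduced here:

```python
SAMPLE_RATE = 8000  # Hz (as used elsewhere in these notes)

def bytebeat(t):
    # The example expression quoted above.
    return ((t >> 10) & 42) * t

# Two seconds of 8-bit audio: evaluate at each sample index, keep low byte.
samples = bytes(bytebeat(t) & 0xFF for t in range(2 * SAMPLE_RATE))
print(len(samples))  # 16000
```

The resulting bytes can be written out with the standard-library `wave` module for listening, or used directly as training data (each sample index `t` is the input, the low byte is the target).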
Blockers: none
Upcoming goals:
- Continue building network explorer
- Make it interactive: allow user to modify the network directly or swap it out for another network (perhaps a further-trained version of the same network) while still playing audio.
- Enable running the network in the browser or on the Bela.
- Increase size of input (which is currently 14 bits → ~2 seconds of 8kHz audio) to enable longer audio output.
- Try training models on other synthetic or natural sounds.
Updates:
- Continued work on explorer application.
- Enabled switching between networks while playing.
- Wrote network interpreter to allow for quickly tweaking a network while playing, without recompiling.
- Experimented with possible interactions to enable “playing” (or “bending”) the network.
- Currently, this takes the form of painting an overlay on top of the network, masking over different gates with `zero` or `one`.
- Made UI a bit nicer and more compact to better facilitate interaction.
- Trained synthesis network with larger input size (16 bits) to enable longer audio output.
- Generalized explorer to support networks of different sizes & periods.
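A minimal sketch of the interpreter-plus-masking idea described above (the gate set, network encoding, and mask format are all made up for illustration):

```python
# Minimal interpreter for a gate network, with "bending" masks that
# temporarily replace gates with constant 0 or 1.
GATES = {
    "and": lambda a, b: a & b,
    "or":  lambda a, b: a | b,
    "xor": lambda a, b: a ^ b,
}

# network: list of (gate, input_index_a, input_index_b), evaluated in
# order; indices refer to the growing value list (inputs come first).
network = [("xor", 0, 1), ("and", 2, 1), ("or", 2, 3)]

def run(network, inputs, mask=None):
    values = list(inputs)
    for i, (gate, a, b) in enumerate(network):
        if mask and i in mask:
            values.append(mask[i])  # gate is "painted over" with a constant
        else:
            values.append(GATES[gate](values[a], values[b]))
    return values[-1]

print(run(network, [1, 0]))               # unmasked -> 1
print(run(network, [1, 0], mask={0: 0}))  # gate 0 forced to 0 -> 0
```

Because the interpreter walks a data structure rather than compiled code, a mask (or an entirely new network) can be applied between audio blocks without recompiling.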
Blockers: none
Upcoming goals:
- Enable running the network on the Bela (in addition to the browser).
- Set up communication between explorer web interface and Bela.
- Connect the Trill Square and use it as an input device for network bending.
- Continue refining explorer: make it easy to open, save, and share networks; consider additional interactions & operations for tweaking & transforming the network.
Updates:
- Continued work on explorer.
- Always sort nodes by connection indices (even after loading new network).
- This serves as a heuristic to put child nodes closer to their parents in order to make the spatial layout a better proxy for the network structure.
- Allow loading/saving networks as JSON-encoded strings.
- Improve UI; don’t draw unused edges to zero- or one-operand gates.
- Run network interpreter on Bela in real time.
Blockers: none
Upcoming goals:
- Set up communication between the web interface and Bela.
- In this scenario, the Bela will act as another backend in addition to the AudioWorklet.
- Connect the Trill Square & OLED screen for input/output on the Bela.
- In this scenario, the Bela may be used as a standalone network-bending instrument.
Updates:
- Visualize network activity during playback
- Check out the demo!
- Add “reset time” button to explorer; continue tweaking explorer UI
- Experiment with training network on short recorded sound (“Hello World”)
- Create a ScoreCard from a synthesis network
- Establish communication between browser and Bela over websockets via GUI library
Blockers: none
Upcoming goals:
- Send network configuration from client to Bela; send visualization data from Bela to client
- Connect Trill & OLED screen for standalone usage on Bela
Updates:
- Bidirectional communication between web client and Bela via websocket
- Web client sends network changes (due to load, masking, or randomization) to Bela
- Bela sends touch data to web client for visual feedback
- Web client includes setting for whether to connect to Bela or run standalone (in the browser)
- Trill Square connected to Bela for network bending by touch
- Bela project can (nearly) run standalone, without web client
- Web client displays selected gate for visual feedback
- Network masking logic (temporarily replacing touched gates with constant 0’s) reimplemented in Bela project
Blockers: for OLED support (to get visual feedback without a browser), lack of Qwiic cable for I2C daisy-chaining
Upcoming goals:
- Add “pseudo-buttons” to Bela interface by delineating certain regions on the Trill Square
- If time & sourcing allows, connect OLED screen to Bela for visual feedback sans browser
- General cleanup; address most pressing “TODOs” and reduce code duplication
- Record demo video; write & submit final report
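One way the planned pseudo-buttons might work, sketched in Python (the region layout, button names, and normalized-coordinate convention are all assumptions, not the actual Trill mapping):

```python
# Hypothetical mapping from normalized Trill Square touch coordinates
# (x, y in [0, 1]) to "pseudo-buttons": named regions along the top edge.
BUTTONS = ["load", "randomize", "reset"]

def touch_to_button(x, y, strip_height=0.2):
    if y > 1 - strip_height:  # top strip acts as a button row
        index = min(int(x * len(BUTTONS)), len(BUTTONS) - 1)
        return BUTTONS[index]
    return None               # rest of the square: network-bending surface

print(touch_to_button(0.1, 0.95))  # "load"
print(touch_to_button(0.5, 0.5))   # None
```

Reserving a strip of the touch surface for discrete actions leaves the rest free for continuous bending gestures, so the Bela can run as a standalone instrument without the browser UI.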