Weekly Progress Report: Differentiable Logic for Interactive Systems and Generative Music

I’ll post weekly updates for this GSoC project here.
More information about the project is available in the proposal.

Updates:

  • Created a repository and added links to the project page
  • Met with mentors and discussed considerations for a Pure Data external for difflogic
  • Prepared slides for intro video and got them approved by mentor
  • Received shipment from Bela (Bela cape + Trill square); awaiting BeagleBone Black

Updates:

  • Created an introductory video
  • Experimented a bit with difflogic and different ways of representing data for input to/output from networks (a minimal sketch follows this list)
  • Got set up with Google Colab Pro for GPU access
  • Created a trivial Pure Data external and started the process of embedding a compiled difflogic network inside it
  • Received shipment with BeagleBone Black and peripherals
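As a concrete reference for the data-representation experiments above, here is a minimal sketch of a difflogic model that maps an 8-bit input vector to an 8-bit output. It assumes the pip-installable difflogic package's LogicLayer module; the layer sizes, random inputs, and thresholding step are illustrative, not the configuration actually used in this project.

```python
import torch
from difflogic import LogicLayer  # differentiable logic-gate layers

# Tiny network: 8 input bits -> 8 output bits (sizes are illustrative).
model = torch.nn.Sequential(
    LogicLayer(in_dim=8, out_dim=64),
    LogicLayer(in_dim=64, out_dim=64),
    LogicLayer(in_dim=64, out_dim=8),
)

# Inputs are float tensors of 0.0/1.0 "bits", shape (batch, in_dim).
x = torch.randint(0, 2, (16, 8)).float()

y = model(x)            # relaxed (continuous) gate outputs during training
bits = (y > 0.5).int()  # threshold to recover an 8-bit output word
print(bits.shape)       # torch.Size([16, 8])
```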

Blockers: none at the moment! Just debugging the Pd external.

Upcoming goals:

  • Get Pd external working with embedded difflogic network
  • Set up Bela (BeagleBone Black + Bela cape)
  • Run difflogic network on Bela
  • Try running on PRU
    • Performance comparison? (laptop CPU vs. BB CPU vs. BB PRU)

Updates:

  • Did a bit more experimenting with difflogic
  • Set up Bela (BeagleBone Black + Bela cape); got a project to build and got audio out
  • Debugged Pd external with embedded difflogic network; got it running at audio rate (on a laptop)
  • Compiled Pd external to run on Bela; updated build script to work on both laptop and Bela
  • Started investigating performance of a simple difflogic network on Bela; running at audio rate, it does not keep up with real time
  • Ran example project on PRU

Blockers: none; currently debugging performance.

Upcoming goals:

  • Get simple difflogic network running at audio rate on Bela
  • Run difflogic network on PRU
  • Start work on additional wrappers (beyond Pd): C++, SuperCollider

Updates:

  • Resolved performance issues with difflogic network on Bela, which is now keeping up with audio rate (both in standalone C++ project and as Pd external)
  • Took basic performance measurements (e.g. observing CPU usage at different block sizes, timing different sections with std::chrono::steady_clock)
  • Modified PRU example project to send custom values between CPU & PRU
    • Wasn’t able to get PRU->CPU interrupts working; currently relying on shared memory to signal status
  • Got PRU to perform a trivial calculation on-demand from CPU
  • Set up TI toolchain for compiling C for PRU (clpru & co.)
  • Got batching working in Colab notebook (CUDA); a rough sketch of the idea follows this list
  • Started bringing compiled difflogic network into PRU project
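For reference, a rough sketch of what batched GPU training of a small difflogic model looks like. It assumes the difflogic package's LogicLayer and GroupSum modules and uses made-up data, layer sizes, and hyperparameters; depending on the difflogic version, the layers may also need device/implementation arguments at construction for the CUDA kernels rather than a plain `.to(device)`.

```python
import torch
from difflogic import LogicLayer, GroupSum

device = "cuda" if torch.cuda.is_available() else "cpu"

# Illustrative model: 16 input bits -> 16 class scores via GroupSum.
model = torch.nn.Sequential(
    LogicLayer(in_dim=16, out_dim=256),
    LogicLayer(in_dim=256, out_dim=256),
    LogicLayer(in_dim=256, out_dim=160),
    GroupSum(k=16, tau=20),   # 160 gate outputs summed into 16 groups
).to(device)

opt = torch.optim.Adam(model.parameters(), lr=0.01)
loss_fn = torch.nn.CrossEntropyLoss()

# Made-up data: random bit vectors and random target classes.
x = torch.randint(0, 2, (4096, 16), device=device).float()
y = torch.randint(0, 16, (4096,), device=device)

for epoch in range(100):
    for i in range(0, len(x), 256):        # mini-batches of 256
        xb, yb = x[i:i + 256], y[i:i + 256]
        loss = loss_fn(model(xb), yb)
        opt.zero_grad()
        loss.backward()
        opt.step()
```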

Blockers: currently trying to fit generated assembly into my existing PRU code, which is proving to be a hacky endeavor that involves some toolchain headaches (e.g. disagreements between clpru, dispru, pasm)

Upcoming goals:

  • Get difflogic network running on PRU
  • Make process for doing so less hacky
  • Write basic wrappers for C++, SuperCollider, CSound (or perhaps Faust)

Updates:

  • Spent some time trying to get a difflogic network running on the PRU using a pipeline involving generating C via the difflogic library, compiling the C with clpru, disassembling it with dispru, and finally including the result in some other (preexisting) assembly compiled with pasm (and loaded onto the PRU with prussdrv_exec_code(), per the Bela example project).
  • This approach caused a lot of headaches. After resolving numerous issues, the results obtained by the CPU and PRU still disagreed, and there were a few discrepancies in the intermediate computed values with no obvious cause.
  • So, rather than continuing to debug the clpru/dispru output, I went ahead and wrote a custom PRU assembly exporter for difflogic models (based on difflogic’s CompiledLogicNet, which generates C); a rough sketch of the idea appears after this list. This sufficed to get my 8-bit identity network running on the PRU and producing the same results as the C-export version running on the CPU.
  • Wrote basic wrapper for C++.
  • Started on basic wrapper for SuperCollider.
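For a flavor of what the exporter does, here is a heavily simplified Python sketch. It assumes the network has already been flattened into (op, a, b) triples (one gate per wire, similar to what a CompiledLogicNet-style export walks over) and emits PRU-flavored assembly with one register byte per wire. The op names, register allocation, and instruction choices are all invented for the example; the real exporter is considerably more involved.

```python
# Each gate is an (op, a, b) triple; a and b index earlier wires.
# Wires 0 .. n_inputs-1 are the network inputs; gate i defines wire n_inputs+i.

def emit_pru_asm(gates, n_inputs):
    def reg(wire):                  # naive allocation: one register byte per wire
        return f"r{wire // 4}.b{wire % 4}"
    lines = []
    for i, (op, a, b) in enumerate(gates):
        out, ra, rb = reg(n_inputs + i), reg(a), reg(b)
        if op == "and":
            lines.append(f"AND {out}, {ra}, {rb}")
        elif op == "or":
            lines.append(f"OR  {out}, {ra}, {rb}")
        elif op == "xor":
            lines.append(f"XOR {out}, {ra}, {rb}")
        elif op == "not_a":
            lines.append(f"XOR {out}, {ra}, 1")   # flip the low bit of a
        elif op == "pass_a":
            lines.append(f"MOV {out}, {ra}")
        else:
            raise ValueError(f"unhandled gate op: {op}")
    return "\n".join(lines)

# Toy 2-input network: wire2 = a AND b, wire3 = wire2 XOR a.
print(emit_pru_asm([("and", 0, 1), ("xor", 2, 0)], n_inputs=2))
```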

Blockers: none at the moment 🙂

Upcoming goals:

  • Clean up & generalize PRU assembly exporter
  • Finish basic wrapper for SuperCollider
  • Write basic wrapper for CSound or Faust

Updates:

  • Improved the generated PRU assembly significantly by loading the network as a graph (using networkx) and performing a series of simplifying transformations on it; a toy sketch of this pass follows this list. On my test network, this reduces the number of basic operations by a factor of about 2/3. (The number of assembly instructions is reduced further because the generated assembly now stores as many intermediate values as possible in the register file, rather than dumping everything into memory.)
    • Also used sympy to verify the correctness of the simplified network.
  • Wrote basic wrapper for SuperCollider.
  • Started working on Faust wrapper via Faust’s foreign function interface.
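As a rough illustration of the simplify-then-verify idea (not the actual code), here is a sketch that builds a gate network as a networkx DiGraph, applies one trivial rewrite (collapsing pass-through gates), and uses sympy to check that the simplified network computes the same Boolean function. The gate representation, the rewrite, and the node names are all invented for the example.

```python
import networkx as nx
import sympy
from sympy.logic.boolalg import And, Xor, simplify_logic

# Toy gate network: two inputs, an AND gate, and a pass-through gate
# that simply forwards the AND's output.
g = nx.DiGraph()
g.add_node("a", kind="input")
g.add_node("b", kind="input")
g.add_node("g0", kind="and")
g.add_node("g1", kind="pass")
g.add_edges_from([("a", "g0"), ("b", "g0"), ("g0", "g1")])

def simplify(graph):
    """One trivial rewrite: remove pass-through gates, rewiring their consumers."""
    h = graph.copy()
    for n in list(h.nodes):
        if h.nodes[n]["kind"] == "pass":
            (src,) = h.predecessors(n)
            for dst in list(h.successors(n)):
                h.add_edge(src, dst)
            h.remove_node(n)
    return h

def to_expr(graph, node):
    """Recursively turn a gate node into a sympy Boolean expression."""
    kind = graph.nodes[node]["kind"]
    if kind == "input":
        return sympy.Symbol(node)
    ins = [to_expr(graph, p) for p in graph.predecessors(node)]
    return {"and": And, "xor": Xor, "pass": lambda e: e}[kind](*ins)

h = simplify(g)
# After simplification the output node is g0; verify both graphs compute a & b.
assert simplify_logic(to_expr(g, "g1")) == simplify_logic(to_expr(h, "g0"))
print(simplify_logic(to_expr(h, "g0")))   # a & b
```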

Blockers: none

Upcoming goals:

  • Finish basic wrapper for Faust
  • Improve interfaces & functionality of wrappers
  • Start experimenting with using difflogic in conjunction with DDSP techniques

Updates:

  • Got basic Faust wrapper working via FFI
  • Experimented a bit more with input encodings; tried thermometer (unary) encoding and one-hot encoding (a small sketch of both follows this list)
  • Fixed a bug in my notebook that was affecting the training of one of my model types
  • Compiled a difflogic network as a shared library and loaded it with dlopen() (a minimal Python-side sketch of the same loading idea also appears after this list)
  • Improved Pure Data external so it no longer has the model “baked in” but instead loads it dynamically (via dlopen()) from a user-specified path
  • Added message-handling object difflogic to Pure Data external (to serve as counterpart to existing signal version, difflogic~)
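For reference, a minimal sketch of the two encodings mentioned above, applied to small integers. The bit widths and helper names are made up for the example.

```python
import torch

def thermometer(x, n_bits):
    """Unary/thermometer encoding: value k -> k ones followed by zeros."""
    # x: integer tensor of shape (batch,), values in [0, n_bits]
    levels = torch.arange(n_bits)                 # 0, 1, ..., n_bits-1
    return (x.unsqueeze(-1) > levels).float()     # shape (batch, n_bits)

def one_hot(x, n_values):
    """One-hot encoding: value k -> a single 1 at position k."""
    return torch.nn.functional.one_hot(x, n_values).float()

x = torch.tensor([0, 3, 5])
print(thermometer(x, 8))   # e.g. 3 -> [1, 1, 1, 0, 0, 0, 0, 0]
print(one_hot(x, 8))       # e.g. 3 -> [0, 0, 0, 1, 0, 0, 0, 0]
```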
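To illustrate the shared-library loading path (the Pd external does this in C with dlopen()/dlsym()), here is a hedged Python sketch using ctypes, which wraps dlopen on Linux. The library path and the exported symbol name logic_net_forward are hypothetical; the real symbol and signature come from the C that the difflogic exporter generates.

```python
import ctypes

# Hypothetical shared library built from difflogic-generated C, e.g.:
#   gcc -O2 -fPIC -shared -o net.so net.c
lib = ctypes.CDLL("./net.so")   # ctypes.CDLL calls dlopen() under the hood

# Hypothetical exported function that reads input bits and writes output bits.
lib.logic_net_forward.argtypes = [ctypes.POINTER(ctypes.c_bool),
                                  ctypes.POINTER(ctypes.c_bool)]
lib.logic_net_forward.restype = None

inp = (ctypes.c_bool * 8)(1, 0, 1, 1, 0, 0, 1, 0)   # 8 input bits
out = (ctypes.c_bool * 8)()                          # room for 8 output bits
lib.logic_net_forward(inp, out)
print([int(b) for b in out])
```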

Blockers: none

Upcoming goals:

  • Improve SuperCollider wrapper
  • Tidy up repository & write some basic documentation so others can try the wrappers
  • Experiment with difflogic + DDSP

Updates:

  • Trained a new identity model based on unary encoding, following last week’s experiment
  • Created custom PyTorch layers for input/output conversions (a minimal sketch follows this list)
    • These can go in the model itself, rather than having to be known by the caller
    • Later, we can also generate the appropriate code for these layers in the C & assembly exporters, rather than needing to bake the input/output conversion code into the various network wrappers.
  • Per discussion with mentors, decided to refocus on demo applications for the time being
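As a rough sketch of the conversion-layer idea (not the project's actual layer), here is a thermometer-encoding nn.Module that can sit at the front of a model so callers can pass plain integers; the width and names are illustrative.

```python
import torch
import torch.nn as nn

class ThermometerEncode(nn.Module):
    """Converts integer inputs into unary/thermometer bit vectors.

    Lives inside the model, so callers don't need to know the encoding;
    exporters could later emit matching C/assembly for this layer too.
    """
    def __init__(self, n_bits):
        super().__init__()
        self.register_buffer("levels", torch.arange(n_bits))

    def forward(self, x):
        # x: integer tensor of shape (batch,); returns (batch, n_bits) floats
        return (x.unsqueeze(-1) > self.levels).float()

encoder = ThermometerEncode(n_bits=8)
print(encoder(torch.tensor([0, 3, 5])))   # 3 -> [1, 1, 1, 0, 0, 0, 0, 0], etc.
```

Note that, as written, the comparison step has no useful gradient with respect to its input, which is exactly the differentiability question raised in the blocker below.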

Blockers: for combining difflogic with non-logic layers (e.g. for DDSP), still thinking about how to make the input conversion (e.g. from floats to unary-encoded bits) differentiable.

Upcoming goals:

  • Train some networks to do audio synthesis as a function of time, bytebeat-style
  • Create some interface for visualizing and/or manipulating synthesis networks
  • Record a brief demo video

Updates:

  • Set up notebook for running and playing bytebeat expressions (e.g. ((t >> 10) & 42) * t)
  • Trained small networks on output from a bytebeat expression; generated and played audio from the network in the notebook (a minimal generation sketch follows this list).
    • Found that networks with binary output (as opposed to GroupSum output) seem to yield more interesting results.
  • Cleaned up repo a bit to reduce duplication (moving code shared between notebooks into utility modules).
  • Calculated ratio between the size of the model and the size of its output (in bits) to get a sense of how much the model is a “compressed” representation of the output.
  • Generated & played audio from notebook while also tweaking the model, live.
  • Modified difflogic’s CompiledLogicNet to support models without GroupSum
  • Started building a web-based interface for exploring the space of synthesis networks.
    • Currently, this plays the network live via Wasm + AudioWorklets and displays the network graph.
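For reference, a minimal sketch of the bytebeat side of this: evaluate the expression over a time index, keep the low byte as the audio sample, and build (input bits, output bits) training pairs. The bit widths, sample rate, and placeholder model size are illustrative, not the notebook's actual code.

```python
import numpy as np

N_BITS = 14                          # 14-bit time index -> 2**14 samples (~2 s at 8 kHz)
t = np.arange(2 ** N_BITS, dtype=np.int64)

# Evaluate the bytebeat expression and keep the low byte, as bytebeat playback does.
samples = (((t >> 10) & 42) * t) & 0xFF

# Training pairs: t unpacked into N_BITS input bits, each sample into 8 output bits.
x_bits = ((t[:, None] >> np.arange(N_BITS)) & 1).astype(np.float32)
y_bits = ((samples[:, None] >> np.arange(8)) & 1).astype(np.float32)

# Rough "compression" ratio: model size in bits vs. output size in bits.
# model_bits is a placeholder; in practice it is derived from the trained network.
model_bits = 4000
print("output bits:", y_bits.size, "model/output:", model_bits / y_bits.size)

# To listen in a notebook: scale bytes to [-1, 1] floats at an 8 kHz sample rate.
audio = samples.astype(np.float32) / 127.5 - 1.0
```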

Blockers: none

Upcoming goals:

  • Continue building network explorer
    • Make it interactive: allow user to modify the network directly or swap it out for another network (perhaps a further-trained version of the same network) while still playing audio.
    • Enable running the network in the browser or on the Bela.
  • Increase size of input (which is currently 14 bits → ~2 seconds of 8kHz audio) to enable longer audio output.
  • Try training models on other synthetic or natural sounds.

Updates:

  • Continued work on explorer application.
    • Enabled switching between networks while playing.
    • Wrote a network interpreter to allow quickly tweaking a network while playing, without recompiling (a toy version appears after this list).
    • Experimented with possible interactions to enable “playing” (or “bending”) the network.
      • Currently, this takes the form of painting an overlay on top of the network, masking different gates with a constant zero or one.
    • Made UI a bit nicer and more compact to better facilitate interaction.
  • Trained synthesis network with larger input size (16 bits) to enable longer audio output.
    • Generalized explorer to support networks of different sizes & periods.
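As a toy illustration of the interpreter-plus-masking idea (not the explorer's actual code), here is a sketch that evaluates a gate network described as (op, a, b) triples and lets a mask override individual gates with a constant 0 or 1, which is roughly what the painting interaction does. The network format and op names are invented for the example.

```python
# Gate list: gate i reads wires a and b and defines wire n_inputs + i.
# mask: optional dict {gate_index: 0 or 1} that overrides a gate's output.
def run_network(gates, input_bits, mask=None):
    mask = mask or {}
    wires = list(input_bits)
    ops = {
        "and":  lambda a, b: a & b,
        "or":   lambda a, b: a | b,
        "xor":  lambda a, b: a ^ b,
        "nand": lambda a, b: 1 - (a & b),
    }
    for i, (op, a, b) in enumerate(gates):
        if i in mask:                          # "painted over": constant 0 or 1
            wires.append(mask[i])
        else:
            wires.append(ops[op](wires[a], wires[b]))
    return wires[len(input_bits):]             # outputs of all gates, in order

gates = [("and", 0, 1), ("xor", 2, 0)]         # wire2 = a & b, wire3 = wire2 ^ a
print(run_network(gates, [1, 1]))              # [1, 0]
print(run_network(gates, [1, 1], mask={0: 0})) # gate 0 forced to 0 -> [0, 1]
```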

Blockers: none

Upcoming goals:

  • Enable running the network on the Bela (in addition to the browser).
    • Set up communication between the explorer web interface and Bela.
  • Connect the Trill Square and use it as an input device for network bending.
  • Continue refining explorer: make it easy to open, save, and share networks; consider additional interactions & operations for tweaking & transforming the network.

Updates:

  • Continued work on explorer.
    • Always sort nodes by connection indices (even after loading a new network).
      • This serves as a heuristic to put child nodes closer to their parents, making the spatial layout a better proxy for the network structure.
    • Allow loading/saving networks as JSON-encoded strings.
    • Improve UI; don’t draw unused edges to zero- or one-operand gates.
  • Ran the network interpreter on Bela in real time.

Blockers: none

Upcoming goals:

  • Set up communication between the web interface and Bela.
    • In this scenario, the Bela will act as another backend in addition to the AudioWorklet.
  • Connect the Trill Square & OLED screen for input/output on the Bela.
    • In this scenario, the Bela may be used as a standalone network-bending instrument.

Updates:

  • Visualized network activity during playback
  • Added a “reset time” button to the explorer; continued tweaking the explorer UI
  • Experimented with training a network on a short recorded sound (“Hello World”)
  • Created a ScoreCard from a synthesis network
  • Established communication between the browser and Bela over websockets via the GUI library

Blockers: none

Upcoming goals:

  • Send network configuration from client to Bela; send visualization data from Bela to client
  • Connect Trill & OLED screen for standalone usage on Bela

Updates:

  • Bidirectional communication between web client and Bela via websocket
    • Web client sends network changes (due to load, masking, or randomization) to Bela
    • Bela sends touch data to web client for visual feedback
    • Web client includes setting for whether to connect to Bela or run standalone (in the browser)
  • Trill Square connected to Bela for network bending by touch
    • Bela project can (nearly) run standalone, without web client
    • Web client displays selected gate for visual feedback
    • Network masking logic (temporarily replacing touched gates with constant 0’s) reimplemented in Bela project

Blockers: for OLED support (to get visual feedback without a browser), lack of Qwiic cable for I2C daisy-chaining

Upcoming goals:

  • Add “pseudo-buttons” to Bela interface by delineating certain regions on the Trill Square
  • If time & sourcing allows, connect OLED screen to Bela for visual feedback sans browser
  • General cleanup; address most pressing “TODOs” and reduce code duplication
  • Record demo video; write & submit final report