MS Thesis Defense: Taka Khoo

Wednesday, April 29, 2026
1:00pm–2:00pm ET

Rm 232, Cummings Hall (Jackson Conf Rm) / Online

"MODULO: A Generative Audio Workstation for AI-Native Music Co-Creation"

Abstract

Generative music has reached an inflection point. AI systems such as Suno and Udio now turn text prompts into album-quality audio in seconds, and the browser has made music production one click away. What these systems cannot do, however, is collaborate at the level a musician actually thinks. True artistry lies in chord theory, polyphony, subjective nuance, and the high-level musical reasoning that spans every harmonic choice and mix decision.

This thesis presents MODULO: Musician-Owned DAW for User-Led Orchestration, an AI-native desktop Generative Audio Workstation. Built in C++ on Tracktion Engine, MODULO embeds structured agentic co-creation directly inside a professional multi-track timeline with plugin hosting, low-latency audio, and full mixing infrastructure. Every generated note remains subject to the musician's ear and intent.

In a single session, MODULO offers parallel customizable chord generation, a Chord Workshop that treats harmony as an editable abstraction, and a rule-based, adaptive harmony engine that produces multiple independent voice lines in seconds. An audio-to-MIDI pipeline converts live recordings into editable material, and one-click stem separation opens finished tracks back up for rework. Prompt-conditioned music generation handles full-song synthesis with automatic stem layout, a generative sound-effects module fills in non-melodic content, and a composition planner turns free-form creative intent into section-level blueprints. Underneath sits a full mixer with parametric EQ, buses, sends, and plugin hosting, wrapped in a musician-facing interaction layer of domain-specific keyboard shortcuts and contextual affordances.

A series of controlled user studies with working musicians validates the system. Across every subsystem evaluated, participants worked dramatically faster, explored broader harmonic territory, and reported markedly lower cognitive load. Tasks that once demanded round trips to external web tools or hours of manual entry were completed efficiently in-session, and musicians preferred MODULO across nearly every dimension of post-study surveys.

MODULO argues that the interface mediating musician and model is itself a primary site of innovation. We propose that with the right architecture, generative power and musical agency are not in tension but rather mutually reinforcing. 

Thesis Committee

  • Peter Chin (Chair)
  • Eugene Santos
  • Nikhil Singh

Contact

For more information, contact the Thayer Registrar at thayer.registrar@dartmouth.edu.