sw sound-walk · workshop
01 · welcome

A soundwalk,
rendered as film.

A hands-on 18-hour workshop in Python, audio classifiers, and AI video. Students field-record a place, analyse what they captured, and render it back as a short film. From their own input to their own output.

18 hours · 6 sessions · Basic Python helpful · Each student leaves with a film
02 · the pipeline

From field recording
to generative film.

1
Record
Your own field audio of a real place, captured on the device you already carry.
2
Classify
Pretrained audio models tag what's there — bird, siren, rain, engine — with timestamps and confidence.
3
Map
Each sound class becomes a visual property.
4
Render
AI video generation via Replicate. Prototype on images, render on video, assemble in Python.
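# the heart of the pipeline
from transformers import pipeline
import librosa

# load as mono at 16 kHz, the rate the AST checkpoint expects
y, sr = librosa.load("recording.wav", sr=16000)

# Audio Spectrogram Transformer fine-tuned on AudioSet (full Hub id)
clf = pipeline("audio-classification",
  model="MIT/ast-finetuned-audioset-10-10-0.4593")

# five most likely labels for the whole clip:
# [{"label": ..., "score": ...}, ...]
events = clf({"raw": y, "sampling_rate": sr}, top_k=5)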
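The snippet above labels the recording as a whole, so it yields no timestamps. Below is a minimal sketch of one way to get them, assuming fixed-length windows and a merge of adjacent windows that share a top label. The names `classify_windows`, `merge_rows`, and `timeline_json`, the 2-second window, and the 0.3 score threshold are illustrative choices, not part of the workshop material. `classify` is any callable that returns a list of `{"label", "score"}` dicts, which is the shape the audio-classification pipeline returns.

```python
import json

def classify_windows(y, sr, classify, win_s=2.0, hop_s=2.0):
    """Run `classify` on consecutive windows and keep each window's top label."""
    win, hop = int(win_s * sr), int(hop_s * sr)
    rows = []
    for start in range(0, max(len(y) - win, 0) + 1, hop):
        chunk = y[start:start + win]
        best = max(classify(chunk), key=lambda r: r["score"])
        rows.append({
            "start": start / sr,                # seconds
            "end": (start + len(chunk)) / sr,   # seconds
            "label": best["label"],
            "score": float(best["score"]),
        })
    return rows

def merge_rows(rows, min_score=0.3):
    """Drop low-confidence windows, then merge touching windows with the same label."""
    events = []
    for row in rows:
        if row["score"] < min_score:
            continue
        last = events[-1] if events else None
        if last and last["label"] == row["label"] and last["end"] >= row["start"]:
            last["end"] = row["end"]
            last["score"] = max(last["score"], row["score"])
        else:
            events.append(dict(row))
    return events

def timeline_json(events):
    """Serialise the event list so later steps can read it from disk."""
    return json.dumps(events, indent=2)
```

With the pipeline above, `classify` could be `lambda chunk: clf({"raw": chunk, "sampling_rate": sr}, top_k=5)`. Window length trades timing precision against classifier context, so it is worth trying a few values on your own recording.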
03 · schedule

18 hours, one film.

Session 1 · Listen · Python warm-up, loading and plotting field audio with librosa · setup + theory
Session 2 · Recognize · classifying sound events with AST and AudioSet · hands-on
Session 3 · Structure · windowed segmentation into a clean event timeline · hands-on
Session 4 · Map · sound classes to visual properties, the artistic core · hands-on
Session 5 · Render & assemble · AI video via Replicate, composed in Python · hands-on
Session 6 · Your own sound-walk · new location, your pipeline, class screening · build
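Session 4's mapping can be prototyped as a plain lookup table before anything is rendered. This is a hedged sketch: the `Look` fields, every palette, motion, and light value, the 0.7 intensity cut-off, and the prompt wording are invented for illustration, and each student would replace them with their own choices. The point it shows is that the prompt describes visual properties only and never names the sound source.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Look:
    palette: str
    motion: str
    light: str

# Illustrative table: sound class -> visual properties (not objects).
MAPPING = {
    "bird": Look("warm ochre and pale green", "quick, fluttering", "dappled morning light"),
    "siren": Look("saturated red and blue", "pulsing, strobing", "hard artificial glare"),
    "rain": Look("desaturated grey-blue", "slow vertical drift", "soft overcast diffusion"),
    "engine": Look("dark graphite and rust", "low steady vibration", "dim sodium glow"),
}

# Fallback for classes the table does not cover.
DEFAULT = Look("neutral grey", "near-still", "flat even light")

def prompt_for(event):
    """Turn one timeline event into a text prompt built from visual properties."""
    look = MAPPING.get(event["label"].lower(), DEFAULT)
    intensity = "intense" if event["score"] >= 0.7 else "subtle"
    return (
        f"abstract {intensity} composition, {look.palette} palette, "
        f"{look.motion} motion, {look.light}"
    )
```

For example, `prompt_for({"label": "Siren", "score": 0.9})` produces a prompt about red and blue pulsing light, with no mention of a siren or a vehicle.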
6 × 3h
Sessions. Weekly pace with practice between.
~70%
Time spent making, not watching slides.
1 film
Finished short, in each student's hands.
04 · curriculum

From sound to film, step by step.

1
Listen
Loading, plotting, and exploring real field audio with librosa. The pipeline starts with your own ears.
2
Recognize
Pretrained classifiers tag the recording. What sound, when, and how confident.
3
Structure
Windowed segmentation and merging into a clean JSON event timeline. The scaffolding everything else builds on.
4
Map
The artistic core. Sound classes to visual properties — never to objects.
5
Render
AI video generation via Replicate. Render a still image first to check the prompt, then commit to video.
6
Compose
Concatenate the clips, sync to the original field recording, export. The film is finished.
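The Compose step can be sketched as two small helpers that prepare an ffmpeg call from Python. This assumes ffmpeg is installed, that the rendered clips share codec and resolution (so the video stream can be copied without re-encoding), and that the clips are already in timeline order. The function names and file names are hypothetical, and the workshop may well use a different assembly tool.

```python
from pathlib import Path

def concat_list(clips):
    """Text for ffmpeg's concat demuxer: one `file '...'` line per clip, in order."""
    return "".join(f"file '{Path(c).as_posix()}'\n" for c in clips)

def compose_command(list_file, audio, out):
    """Build the ffmpeg argument list; nothing is executed here."""
    return [
        "ffmpeg", "-y",
        "-f", "concat", "-safe", "0", "-i", list_file,  # input 0: clips, joined in order
        "-i", audio,                                    # input 1: original field recording
        "-map", "0:v:0", "-map", "1:a:0",               # picture from clips, sound from recording
        "-c:v", "copy", "-c:a", "aac",
        "-shortest",                                    # stop at the shorter of the two streams
        out,
    ]
```

To run it, write `concat_list(clips)` to a text file, then pass `compose_command("clips.txt", "recording.wav", "film.mp4")` to `subprocess.run(..., check=True)`. Sync holds only if each clip's duration matches the event it was rendered for in the timeline.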
05 · the offering

Built for the eye and the ear.

Participants leave with
  • A finished generative short film built from their own field recording.
  • A reusable Python pipeline they can run on any future recording.
  • The skill to design sound-to-visual mappings that work.
  • Practical experience with audio classifiers and diffusion video models.
Who it's for
  • Film, animation, audio, and design students with basic Python.
  • Independent artists expanding into generative film.
  • Creative coders curious about audio-reactive AI workflows.
Available formats
Course: 6 weekly sessions, with practice and recording between
Custom: in-house for schools, studios, or cultural institutions
About the instructor

AI engineer with 8+ years in production LLM and ML systems. Background in affective computing research and active studio practice in generative film. Teaches through building real things. Based in Athens.

sound-walk · workshop slides