A voice chat application that uses the LFM2-Audio-1.5B model to generate conversational audio responses. The application could run entirely locally, but the liquid-audio library requires CUDA. For that reason, the model is wrapped in a Modal function and deployed to a serverless GPU environment with CUDA, so you can run it even if you don’t have an NVIDIA GPU at home. Record your voice, send it for processing, and receive an audio response that plays automatically.
What’s in this example?
This example demonstrates how to build an interactive voice chat application using:
- Audio Recording: Records audio from your microphone with automatic silence detection
- Cloud Processing: Uses Modal to run the LFM2-Audio-1.5B model on GPU in the cloud
- Audio Playback: Automatically plays the generated audio response

The application works by:
- Recording your voice question from the microphone (with auto-stop on silence)
- Uploading the audio to a Modal volume
- Processing the audio with LFM2-Audio-1.5B on a GPU instance to generate an interleaved text and audio response
- Downloading the generated audio response
- Playing the response through your speakers
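
The auto-stop on silence in the recording step can be illustrated with a small sketch. This is not the example's actual implementation: the function names `rms` and `record_until_silence`, the RMS threshold, and the number of silent chunks are all assumptions chosen for illustration, and the input is assumed to be 16-bit little-endian mono PCM.

```python
import math
import struct
from typing import Iterable, List


def rms(chunk: bytes) -> float:
    """Root-mean-square amplitude of a chunk of 16-bit little-endian mono PCM."""
    n = len(chunk) // 2
    if n == 0:
        return 0.0
    samples = struct.unpack(f"<{n}h", chunk[: n * 2])
    return math.sqrt(sum(s * s for s in samples) / n)


def record_until_silence(
    chunks: Iterable[bytes],
    threshold: float = 500.0,    # hypothetical loudness cutoff
    max_silent_chunks: int = 3,  # hypothetical: quiet chunks tolerated after speech
) -> List[bytes]:
    """Collect chunks until enough consecutive quiet chunks follow speech."""
    recorded: List[bytes] = []
    heard_speech = False
    silent_run = 0
    for chunk in chunks:
        recorded.append(chunk)
        if rms(chunk) >= threshold:
            heard_speech = True
            silent_run = 0
        elif heard_speech:
            # Only count silence once the speaker has started talking.
            silent_run += 1
            if silent_run >= max_silent_chunks:
                break
    return recorded
```

In a real recorder the chunks would typically come from repeated reads on a PyAudio input stream; here any iterable of byte chunks works, which keeps the stopping logic easy to test without a microphone.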
Tools
This example uses the following tools and libraries:
- liquid-audio: Python library for working with LFM2-Audio models
- Modal: Serverless cloud platform for running GPU workloads
- PyAudio: Cross-platform audio I/O library for recording from microphone
- pygame: Audio playback library
- torchaudio: Audio processing utilities for PyTorch
- rich: Terminal formatting library for audio visualization
Prerequisites
Set up Modal
- Create a Modal account at modal.com
- Install the Modal CLI:
  uv add modal
- Authenticate:
  uv run modal token new
How to run it
The generated audio file is saved locally as answer_YYYYMMDD_HHMMSS.wav for each session.
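
As a rough illustration of that naming pattern, a timestamped filename of this form can be produced with the standard library. The helper name `answer_filename` is hypothetical and not taken from the example's code.

```python
from datetime import datetime
from typing import Optional


def answer_filename(now: Optional[datetime] = None) -> str:
    """Build a name of the form answer_YYYYMMDD_HHMMSS.wav (hypothetical helper)."""
    if now is None:
        now = datetime.now()
    return now.strftime("answer_%Y%m%d_%H%M%S.wav")
```

Passing an explicit `datetime` makes the result reproducible; calling it with no argument uses the current local time, so each session gets its own file.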