[InterSpeech 2023] The official PyTorch implementation of: "AudioToken: Adaptation of Text-Conditioned Diffusion Models for Audio-to-Image Generation"
Updated May 18, 2026 - Python
A set of tools for computing waterfall spectrograms from an audio recording.
This program creates an image from an audio file.