Template·Audio & Music

Audio To Image

Transform spoken descriptions into images with this workflow. Record or upload audio; Whisper transcribes it, and the transcript is passed as the prompt to Stable Diffusion, which generates the image. It is useful for quickly turning verbal ideas into images without typing.
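
The data flow described above can be sketched in plain Python. This is a minimal illustration, not NodeTool's API: `audio_to_image`, `Transcriber`, and `ImageGenerator` are hypothetical names, and the two stand-in functions only mark where a Whisper model and a Stable Diffusion model would be called.

```python
from typing import Callable

# Hypothetical type aliases for illustration; not part of NodeTool.
Transcriber = Callable[[bytes], str]      # audio bytes -> transcript
ImageGenerator = Callable[[str], bytes]   # prompt text -> image bytes


def audio_to_image(
    audio: bytes, transcribe: Transcriber, generate: ImageGenerator
) -> tuple[str, bytes]:
    """Transcribe audio, then use the transcript as the image prompt."""
    prompt = transcribe(audio).strip()
    if not prompt:
        raise ValueError("transcription produced no text to use as a prompt")
    return prompt, generate(prompt)


# Stand-ins so the sketch runs without any models installed.
def fake_transcribe(audio: bytes) -> str:
    return " a red fox in the snow "


def fake_generate(prompt: str) -> bytes:
    return f"<image for: {prompt}>".encode("utf-8")


prompt, image = audio_to_image(b"\x00\x01", fake_transcribe, fake_generate)
print(prompt)
```

Passing the two stages in as callables mirrors the "swap models" idea of the workflow: either stage can be replaced without touching the other.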

[Image: example output from the Audio To Image workflow in NodeTool]

The workflow

[Workflow editor view: Audio Input → Automatic Speech Recognition (input: Audio, output: Text) → Text To Image (input: Prompt)]

Nodes in this workflow

3 nodes · 3 types
  • Audio Input
    nodetool.input.AudioInput
  • Automatic Speech Recognition
    nodetool.text.AutomaticSpeechRecognition
  • Text To Image
    nodetool.image.TextToImage
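
As a rough illustration of how these three nodes connect, the graph can be written as plain data and ordered with Python's standard library. The node type identifiers are the ones listed above; the node ids (`audio_input`, `asr`, `text_to_image`), the dictionary layout, and the input names on the edges are assumptions made for this sketch, not NodeTool's actual workflow file format.

```python
from graphlib import TopologicalSorter

# Node ids (keys) are hypothetical; type strings are from the list above.
nodes = {
    "audio_input": "nodetool.input.AudioInput",
    "asr": "nodetool.text.AutomaticSpeechRecognition",
    "text_to_image": "nodetool.image.TextToImage",
}

# Each edge is (source node, target node, target input name).
# Input names mirror the editor labels ("Audio", "Prompt") and are assumed.
edges = [
    ("audio_input", "asr", "audio"),
    ("asr", "text_to_image", "prompt"),
]

# TopologicalSorter expects a mapping of node -> set of predecessors.
predecessors = {node_id: set() for node_id in nodes}
for source, target, _input_name in edges:
    predecessors[target].add(source)

# A node can only run after every node feeding it has produced output.
execution_order = list(TopologicalSorter(predecessors).static_order())
for node_id in execution_order:
    print(node_id, "->", nodes[node_id])
```

Because the graph is a single chain, the only valid order is input, then transcription, then image generation.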

How to run it

  1. Download NodeTool Studio

    Install the free desktop app for macOS, Windows, or Linux. It runs on your own machine, and no account is required to start.

  2. Open the Audio To Image template

    Browse the built-in template library inside Studio and open this workflow on the canvas. Every node is already wired up.

  3. Add your keys

    Connect the providers behind the model nodes in this workflow (Automatic Speech Recognition, Text To Image). Bring your own keys; you pay the provider directly.

  4. Run and remix

    Hit Run to execute the graph and watch results stream in. Swap models, edit prompts, or rewire nodes to make it yours.

Run Audio To Image on your machine

Free, open source, and yours to run. Download Studio, open the template, and run it with your own keys.