How I Built a Voice-Controlled Local AI Agent from Scratch
Introduction

What the System Does

Architecture Overview

Audio Input: Streamlit's built-in st.audio_input() handles browser microphone recording. File upload supports .wav, .mp3, and .m4a.

Speech-to-Text (STT): I used Groq's hosted Whisper API (whisper-large-v3). More on why below.

Intent Classific
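
The audio input and speech-to-text stages described above could be sketched roughly as follows. This is a minimal sketch under stated assumptions, not the author's actual code: the helper names is_supported_audio, transcribe, and main, the widget labels, and the recording.wav filename are hypothetical, and it assumes the streamlit and groq packages are installed with GROQ_API_KEY set in the environment.

```python
from pathlib import Path

# Upload extensions listed in the article.
ALLOWED_EXTENSIONS = {".wav", ".mp3", ".m4a"}
# Whisper model name given in the article.
STT_MODEL = "whisper-large-v3"


def is_supported_audio(filename: str) -> bool:
    """Return True if the file extension is one of the supported upload types."""
    return Path(filename).suffix.lower() in ALLOWED_EXTENSIONS


def transcribe(client, filename: str, audio_bytes: bytes) -> str:
    """Send audio to a Groq-style client and return the transcript text.

    `client` must expose audio.transcriptions.create(), as the Groq Python
    SDK does. It is passed in so that it can be replaced by a stub.
    """
    if not is_supported_audio(filename):
        raise ValueError(f"Unsupported audio type: {filename}")
    result = client.audio.transcriptions.create(
        file=(filename, audio_bytes),
        model=STT_MODEL,
    )
    return result.text.strip()


def main() -> None:
    """Hypothetical Streamlit page wiring the two stages together."""
    # Imported lazily so the helpers above work without these packages.
    import streamlit as st
    from groq import Groq

    recorded = st.audio_input("Record a command")
    uploaded = st.file_uploader("Or upload audio", type=["wav", "mp3", "m4a"])

    if recorded is not None:
        # Assumption: the browser recording is WAV, so a fixed name is used.
        name, data = "recording.wav", recorded.getvalue()
    elif uploaded is not None:
        name, data = uploaded.name, uploaded.getvalue()
    else:
        return  # nothing to transcribe yet

    client = Groq()  # reads GROQ_API_KEY from the environment
    st.write(transcribe(client, name, data))
```

To try it, add a call to main() at the bottom of the script and launch it with streamlit run. Passing the client into transcribe keeps the network call separate from the validation logic, which makes the helper easy to exercise with a stub.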



