# Unicorn AI Studio

> AI-Powered Podcast Generation, Voice Synthesis & Transcription Platform

## Overview

Unicorn AI Studio is a comprehensive AI audio platform that combines advanced text-to-speech, podcast generation, live transcription, and meeting recording capabilities. It is built for creators, podcasters, content producers, and businesses who need professional-quality voice synthesis and transcription.

## Key Features

### 1. Multi-Speaker Podcast Generation

- **AI Script Generation**: Automated script creation using the Gemma 4 LLM
- **Multiple TTS Models**: Quickhorn (0.5B, fast/local), Silvermane 1.5B, Silvermane Large, Silvermane 7B
- **Up to 4 Speakers**: Natural multi-speaker conversations with voice variety
- **Singing Capability**: AI voices can sing when prompted with ♪ notation
- **18+ Voice Options**: Multiple languages including English, Spanish, Chinese, Arabic, Hindi, Tamil
- **Advanced Controls**: CFG scale, inference steps, temperature, top-P sampling

### 2. Live Transcription

- **Real-Time Speech-to-Text**: Instant transcription using faster-whisper
- **Push-to-Talk Interface**: Space bar or click-to-record
- **Local Processing**: Privacy-first, runs on your device
- **Cloud Fallback**: Whisper large-v3-turbo hosted on Hugging Face as a backup

### 3. Call & Meeting Recording

- **Multi-Format Support**: WAV, MP3, M4A, FLAC, OGG
- **Accurate Transcription**: Powered by OpenAI Whisper models
- **AI Summarization**: Automatic extraction of key points, action items, and decisions
- **Export Options**: Download transcripts (.txt, .json) and summaries (.md)

### 4. Audio Library

- **Automatic Storage**: All generated content saved securely
- **In-Browser Playback**: Listen without downloading
- **Organization**: Filter by type (podcast, TTS, recording)
- **Download & Delete**: Full control over your audio files

## Technology Stack

### AI Models

- **TTS**: Unicorn AI Studio model family (Quickhorn 0.5B, Silvermane 1.5B, Silvermane Large, Silvermane 7B)
- **ASR**: faster-whisper (local), OpenAI Whisper (cloud fallback)
- **LLM**: Ollama Gemma 4 (e4b local, 31b-cloud fallback)

### Infrastructure

- **Backend**: Python FastAPI, SQLite, asyncio
- **Frontend**: Vanilla JavaScript, WebSocket, modern CSS
- **Cloud**: Hugging Face Spaces (A10G GPU) for Pro TTS
- **Deployment**: Works on macOS (MPS), Windows (CUDA), Linux (CPU/CUDA)

## Pricing Tiers

### Free - $0/month

- 10 minutes podcast generation
- 20 minutes transcription
- 5 AI script generations
- Quickhorn (0.5B) model only
- Perfect for trying the platform

### Bring Your Own (BYOK) - $9/month

- 30 minutes podcast generation
- 60 minutes transcription
- 20 AI script generations
- Use your own API keys (Hugging Face, OpenAI, Ollama)
- Cost control and flexibility

### Starter - $14.99/month

- 60 minutes podcast generation
- 120 minutes transcription
- 50 AI script generations
- Quickhorn (0.5B) model
- Ideal for regular creators

### Pro - $29/month

- 300 minutes podcast generation
- 500 minutes transcription
- 200 AI script generations
- Access to Silvermane 1.5B, Silvermane Large, Silvermane 7B
- Priority processing
- Best for professionals and businesses

**Note**: Larger models consume usage faster: Silvermane Large counts at 1.5x tokens and Silvermane 7B at 2x tokens. Each additional speaker increases processing cost by roughly 20%.
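
To make the pricing note concrete, here is a minimal sketch of how the model multiplier and the per-speaker surcharge might combine. The two multipliers (1.5x and 2x) and the ~20% figure come from the note; everything else is an assumption made for illustration: the function name, the dictionary keys, the 1.0 baseline for Quickhorn and Silvermane 1.5B, and the choice to multiply the two factors together. Actual metering may differ.

```python
# Hypothetical usage estimator. Only the 1.5x / 2x multipliers and the
# ~20% per-additional-speaker figure come from the pricing note; the
# baselines, key names, and the way factors combine are assumptions.
MODEL_MULTIPLIER = {
    "quickhorn-0.5b": 1.0,    # assumed baseline (no multiplier stated)
    "silvermane-1.5b": 1.0,   # assumed baseline (no multiplier stated)
    "silvermane-large": 1.5,  # from the pricing note
    "silvermane-7b": 2.0,     # from the pricing note
}
SPEAKER_SURCHARGE = 0.20  # ~20% per additional speaker
MAX_SPEAKERS = 4          # the platform supports up to 4 speakers


def estimate_usage(audio_minutes: float, model: str, speakers: int = 1) -> float:
    """Return the estimated usage charged for a generation, in minutes."""
    if audio_minutes < 0:
        raise ValueError("audio_minutes must be non-negative")
    if not 1 <= speakers <= MAX_SPEAKERS:
        raise ValueError(f"speakers must be between 1 and {MAX_SPEAKERS}")
    model_factor = MODEL_MULTIPLIER[model]
    # The first speaker is included; each extra speaker adds the surcharge.
    speaker_factor = 1.0 + SPEAKER_SURCHARGE * (speakers - 1)
    return round(audio_minutes * model_factor * speaker_factor, 2)


if __name__ == "__main__":
    # A 10-minute, 3-speaker episode on Silvermane 7B.
    print(estimate_usage(10, "silvermane-7b", speakers=3))
```

Under these assumptions, a Pro subscriber could compare the estimate against the 300-minute monthly allowance before choosing a larger model or adding speakers.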
## Use Cases

### Content Creators

- Generate podcast episodes without recording studios
- Create audiobooks and audio stories
- Produce multilingual content with consistent voices
- Test scripts before final recording

### Businesses

- Create training materials and presentations
- Generate product demos and explainer videos
- Transcribe and summarize meetings automatically while keeping data on device
- Produce marketing content at scale

### Developers

- API integration for voice synthesis
- Automated content generation pipelines
- Voice assistant prototyping
- Accessibility features for applications

### Educators

- Create educational podcast series
- Generate language learning materials
- Transcribe lectures and presentations
- Produce accessible course content

## Technical Requirements

### System Requirements

- **macOS**: macOS 12+ (Monterey), Apple Silicon (M1 and later) or Intel with 8GB+ RAM
- **Windows**: Windows 10/11, NVIDIA GPU (CUDA 11.8+) or CPU with 16GB+ RAM
- **Linux**: Ubuntu 20.04+, CUDA-capable GPU or CPU with 16GB+ RAM

### Browser Support

- Chrome/Edge 90+
- Firefox 88+
- Safari 14+
- Modern browsers with WebSocket and Web Audio API support

## API & Integration

### RESTful API Endpoints

- `/api/generate` - Podcast generation
- `/api/transcribe` - Audio transcription
- `/api/summarize` - Meeting summarization
- `/api/script` - AI script generation
- `/ws/transcribe` - WebSocket live transcription
- `/ws/stream-tts` - WebSocket streaming TTS

### Authentication

- JWT-based authentication
- API key support for BYOK tier
- Encrypted storage for user credentials

## Privacy & Security

- **Local Processing**: Free tier runs entirely on your device
- **Encrypted API Keys**: User-provided keys encrypted at rest
- **No Training Data**: Your content is never used for model training
- **Secure Storage**: SQLite with bcrypt password hashing
- **HTTPS Required**: SSL/TLS encryption for all communications

## Getting Started

1. **Download**: Get Unicorn AI Studio for macOS or Windows
2. **Install**: Run the installer and follow the setup wizard
3. **Sign Up**: Create a free account (no credit card required)
4. **Generate**: Start creating podcasts or transcribing audio immediately
5. **Upgrade**: Move to Pro when you need cloud models and higher limits

## Links

- **Website**: https://www.unicornstudio.ca
- **Documentation**: https://www.unicornstudio.ca/llms.txt
- **Support**: info@coreledger.ca
- **GitHub**: https://github.com/KELVI23/Unicorn-ai-studios

## Company

Powered by Coreledger Technologies Inc. - Building the future of AI-powered audio production.

---

*Last Updated: April 2026*
*Version: 1.0.0*
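
As an illustration of the `/api/generate` endpoint and JWT-based authentication listed under API & Integration, the sketch below builds (but does not send) a request using only the Python standard library. The source does not document request bodies or header formats, so the payload field names (`script`, `speakers`), the `Bearer` authorization scheme, the POST method, and the local base URL are all assumptions, not the documented API.

```python
import json
import urllib.request


def build_generate_request(
    base_url: str, jwt_token: str, script: str, speakers: int = 2
) -> urllib.request.Request:
    """Build a POST request for the /api/generate endpoint.

    The JSON field names and the Bearer scheme are hypothetical; only the
    endpoint path and the use of JWT come from the document.
    """
    payload = {"script": script, "speakers": speakers}  # hypothetical fields
    return urllib.request.Request(
        url=base_url.rstrip("/") + "/api/generate",
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {jwt_token}",  # assumed scheme
            "Content-Type": "application/json",
        },
        method="POST",
    )


if __name__ == "__main__":
    # Hypothetical local server address and placeholder token.
    req = build_generate_request(
        "http://localhost:8000", "YOUR_JWT", "Host: Welcome to the show!"
    )
    print(req.get_method(), req.full_url)
```

Passing the resulting object to `urllib.request.urlopen` would send it; that step is left out because the response format is not described here.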