Last updated
1/17/2025
Get started
Create your own tailored podcast using your documents
This Blueprint demonstrates how you can use open-source models and tools to convert input documents into a podcast featuring two speakers. It combines document pre-processing, language model-powered script generation, and text-to-speech synthesis. Designed to run on most local setups, it requires no external API calls or GPU access, which makes it more accessible and keeps all processing local for better privacy.
If you encounter any issues with the hosted demo below, try the Blueprint in the GPU-enabled Google Colab Notebook available here.
Preview this Blueprint in action
Hosted demo
Step by step walkthrough
Tools used to create
Trusted open source tools used for this Blueprint
Choices
Insights into our motivations and key technical decisions throughout the development process.
| Focus | Decision | Rationale | Alternatives Considered | Trade-offs |
| --- | --- | --- | --- | --- |
| Overall Motivation | Build a local-friendly, developer-centric document-to-podcast solution. | Ensures privacy, flexibility, and accessibility for developers without GPUs. | APIs or targeting GPU-accessible developers. | Less polished UX; smaller models yield weaker output but are more accessible. |
| Document Pre-processing | Used Python’s re library for text cleaning. | Lightweight, familiar, and effective for most input scenarios. | MarkItDown or custom regex libraries. | Limited scalability for complex tasks. |
| Podcast Script Generation | Used llama_cpp with Qwen2.5-3B-Instruct-GGUF. | CPU-friendly, consistent, and balanced performance. | Larger models or APIs; OLMoE-1B-7B-0924. | Smaller models are often less creative than larger ones, leading to weaker podcast scripts. |
| Audio Generation | Adopted Kokoro/OuteTTS for TTS. | Best open-source results after testing multiple frameworks. | Suno/bark and parler-tts. | Open-source TTS lags behind closed-source solutions like ElevenLabs. |
| Deployment Options | Supported Codespaces, CLI, Streamlit app, Colab, HF Spaces. | Flexible for diverse environments and compute needs. | Single-path setups like only local. | Maintaining multiple pathways adds complexity for updates and consistency. |
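To illustrate the document pre-processing choice, here is a minimal sketch of regex-based text cleaning with Python's re library. The function name clean_text and the specific patterns (HTML tags, URLs, whitespace) are assumptions chosen for illustration; they are not taken from the Blueprint's actual code, which may clean different things.

```python
import re


def clean_text(raw: str) -> str:
    """Hypothetical cleaner: strip markup and URLs, then normalize whitespace."""
    text = re.sub(r"<[^>]+>", " ", raw)        # drop HTML-style tags
    text = re.sub(r"https?://\S+", " ", text)  # drop URLs
    text = re.sub(r"[^\S\n]+", " ", text)      # collapse spaces and tabs, keep newlines
    text = re.sub(r" ?\n ?", "\n", text)       # trim spaces around line breaks
    text = re.sub(r"\n{3,}", "\n\n", text)     # allow at most one blank line
    return text.strip()


if __name__ == "__main__":
    sample = "<p>Hello   world</p>\n\n\n\nSee https://example.com now."
    print(clean_text(sample))
```

A cleaner like this keeps the prompt sent to a small language model short and free of markup, which matters more when the model has a limited context window. As the trade-off above notes, plain regexes may not scale to complex inputs such as heavily formatted documents.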
Ready? Try it yourself!
System Requirements
OS: Windows, macOS, or Linux. Python: 3.10 or higher. RAM: 10 GB minimum. Disk space: 32 GB minimum.
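A quick way to check part of these requirements before installing is sketched below. It uses only the Python standard library and covers the Python version and disk space. The helper names are hypothetical, and the sketch assumes that the 32 GB figure refers to free disk space. RAM is not checked because the standard library offers no portable way to query it.

```python
import shutil
import sys

MIN_PYTHON = (3, 10)  # from the requirements above
MIN_DISK_GB = 32      # assumption: interpreted as free space, in GiB


def find_problems(python_version, free_disk_gb):
    """Return human-readable problems; an empty list means both checks pass."""
    problems = []
    if tuple(python_version[:2]) < MIN_PYTHON:
        problems.append("Python 3.10 or higher is required.")
    if free_disk_gb < MIN_DISK_GB:
        problems.append("At least 32 GB of free disk space is required.")
    return problems


def check_this_machine(path="."):
    """Run the checks against the current interpreter and the given path."""
    free_gb = shutil.disk_usage(path).free / 1024**3
    return find_problems(sys.version_info, free_gb)


if __name__ == "__main__":
    issues = check_this_machine()
    print("\n".join(issues) if issues else "Python version and disk space look fine.")
```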
Help Documentation
Detailed guidance on GitHub that walks you through installing this project.
Discussion Points
Get involved in improving the Blueprint by visiting the GitHub Blueprint issues.
Explore Blueprints Extensions
See examples of extended Blueprints that unlock new capabilities and of adjusted configurations that enable tailored solutions, or try building one yourself.