Release and Version History

x.y.z (Backlog)

Features and Improvements

Minor Improvements

Bugfixes

Miscellaneous

0.1.1 (2024-08-11)

Features and Improvements

  • Core Transcription Engine: Built on faster-whisper for high-accuracy audio-to-text conversion

  • Parallel Processing: Automatic audio segmentation with multi-CPU core processing for exceptional speed

  • Smart Audio Segmentation: Intelligent chunking of large audio files by duration or segment count

  • Multi-Format Support: Handles popular audio formats including MP3, MP4, WAV, M4A, FLAC, OGG

  • Automatic Language Detection: AI-powered language detection without manual specification

  • Command Line Interface: Simple CLI with audinota transcribe –input=”file.mp3” syntax

  • Flexible Output Management: Support for directory output, custom file naming, and automatic conflict resolution

  • Cost-Effective Solution: 120x cheaper than AWS Transcribe when deployed on cloud infrastructure

CLI Features

  • Smart Output Resolution: Automatic .txt file creation next to input files when no output specified

  • Directory Output Support: Batch processing with organized output to specified directories

  • File Conflict Handling: Automatic numbering for directory outputs, overwrite protection for file outputs

  • Real-time Progress Feedback: Emoji-enhanced status updates during transcription process

  • Multiple Input Format Detection: Seamless handling of various audio/video formats

Python API

  • Simple Integration: Clean API with transcribe_audio_in_parallel() function

  • Audio Utility Functions: Duration calculation, segmentation by count/duration

  • Streaming Support: BytesIO input support for in-memory audio processing

  • Performance Optimization: Built-in parallel processing for large audio files

Documentation and Usability

  • Comprehensive README: Detailed usage examples, CLI documentation, and cost comparison

  • Real-world Examples: Practical use cases for researchers, content creators, and data analysts

  • Performance Benchmarking: Speed and cost comparisons with commercial transcription services