utils

Audio Processing Utilities

This module provides utilities for audio segmentation and metadata extraction. It uses soundfile for direct audio I/O to avoid deprecated audioread dependencies.

audinota.utils.segment_audio_by_count(audio: BinaryIO, n_seg: int) → list[bytes]

Split audio into a fixed number of segments of approximately equal duration.

Each segment has approximately the same duration. The last segment may be slightly longer because it absorbs any remaining samples.
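
The split described above can be sketched as frame arithmetic. This is a minimal illustration, not the library's implementation: `count_boundaries` and its parameters are hypothetical names, and it assumes each segment receives `total_frames // n_seg` frames, with the last segment taking the remainder.

```python
def count_boundaries(total_frames: int, n_seg: int) -> list[tuple[int, int]]:
    """Return (start, stop) frame ranges for n_seg segments.

    Hypothetical helper: every segment gets total_frames // n_seg frames,
    and the last segment also absorbs any leftover frames.
    """
    if n_seg <= 0:
        raise ValueError("n_seg must be a positive integer")
    base = total_frames // n_seg
    bounds = []
    for i in range(n_seg):
        start = i * base
        # The final segment runs to the end of the audio.
        stop = total_frames if i == n_seg - 1 else start + base
        bounds.append((start, stop))
    return bounds


bounds = count_boundaries(total_frames=10, n_seg=4)
print(bounds)  # [(0, 2), (2, 4), (4, 6), (6, 10)]
```

With 10 frames and 4 segments, the first three segments hold 2 frames each and the last holds 4, which matches the "last segment slightly longer" behavior.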

Parameters:
  • audio – Audio data as a binary stream (e.g., io.BytesIO from file bytes)

  • n_seg – Number of segments to create (must be a positive integer)

Returns:

List of WAV audio segments as bytes, ready for further processing

Example:
>>> import io
>>> from pathlib import Path
>>> from audinota.utils import segment_audio_by_count
>>> audio_bytes = Path("audio.mp3").read_bytes()
>>> audio_stream = io.BytesIO(audio_bytes)
>>> segments = segment_audio_by_count(audio_stream, 4)
>>> print(f"Created {len(segments)} segments")

audinota.utils.get_audio_duration(audio: BinaryIO) → float

Get audio duration in seconds from audio metadata without loading audio data.

This function reads only the audio file header to extract duration information, making it efficient for large audio files where you only need the duration.
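
To illustrate the header-only idea, here is a small sketch that uses Python's standard-library `wave` module as a stand-in for soundfile. It is not the library's implementation and only handles WAV input. `wav_duration` is a hypothetical name. The duration is the frame count divided by the sample rate, both of which come from the header.

```python
import io
import wave


def wav_duration(stream) -> float:
    """Return the duration in seconds of a WAV stream.

    Hypothetical helper: uses the frame count and sample rate stored in
    the WAV header, so the sample data itself is never decoded.
    """
    with wave.open(stream, "rb") as wav:
        return wav.getnframes() / wav.getframerate()


# Build a 2-second silent mono WAV in memory for demonstration.
buf = io.BytesIO()
with wave.open(buf, "wb") as wav:
    wav.setnchannels(1)      # mono
    wav.setsampwidth(2)      # 16-bit samples
    wav.setframerate(8000)   # 8 kHz
    wav.writeframes(b"\x00\x00" * 16000)  # 16000 frames of silence
buf.seek(0)

duration = wav_duration(buf)
print(f"Audio is {duration:.1f} seconds long")  # Audio is 2.0 seconds long
```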

Parameters:

audio – Audio data as a binary stream (e.g., io.BytesIO from file bytes)

Returns:

Audio duration in seconds as a floating-point number

Example:
>>> import io
>>> from pathlib import Path
>>> from audinota.utils import get_audio_duration
>>> audio_bytes = Path("recording.wav").read_bytes()
>>> audio_stream = io.BytesIO(audio_bytes)
>>> duration = get_audio_duration(audio_stream)
>>> print(f"Audio is {duration:.1f} seconds long")

audinota.utils.segment_audio_by_duration(audio: BinaryIO, duration: float) → list[bytes]

Split audio into segments with a target duration per segment.

The audio is divided so that each segment, except possibly the last, has approximately the specified duration. The last segment is shorter when the total duration is not an exact multiple of the target duration.
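
The duration-based split can be sketched the same way, in frames. This is an illustration under stated assumptions rather than the library's code: `duration_boundaries` is a hypothetical name, and it assumes the target duration is converted to a whole number of frames by rounding.

```python
def duration_boundaries(
    total_frames: int, samplerate: int, duration: float
) -> list[tuple[int, int]]:
    """Return (start, stop) frame ranges of roughly `duration` seconds each.

    Hypothetical helper: the last range is clipped to the end of the
    audio, so it may be shorter than the others.
    """
    if duration <= 0:
        raise ValueError("duration must be positive")
    # Convert the target duration to frames, keeping at least one frame.
    frames_per_seg = max(1, int(round(duration * samplerate)))
    return [
        (start, min(start + frames_per_seg, total_frames))
        for start in range(0, total_frames, frames_per_seg)
    ]


# 5 seconds of audio at 10 frames per second, split into 2-second segments.
bounds = duration_boundaries(total_frames=50, samplerate=10, duration=2.0)
print(bounds)  # [(0, 20), (20, 40), (40, 50)]
```

Here the first two segments cover 2 seconds each and the last covers the remaining 1 second.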

Parameters:
  • audio – Audio data as a binary stream (e.g., io.BytesIO from file bytes)

  • duration – Target duration for each segment in seconds (can be fractional)

Returns:

List of WAV audio segments as bytes, ready for further processing

Example:
>>> import io
>>> from pathlib import Path
>>> from audinota.utils import segment_audio_by_duration
>>> audio_bytes = Path("lecture.mp3").read_bytes()
>>> audio_stream = io.BytesIO(audio_bytes)
>>> # Split into 2-minute segments
>>> segments = segment_audio_by_duration(audio_stream, 120.0)
>>> print(f"Created {len(segments)} segments of ~2 minutes each")