System 2 package

Submodules

yt_audio_collector.system_2.preprocess_audio module

Splitting audio based on transcript

class yt_audio_collector.system_2.preprocess_audio.PreProcessAudio(source_path: str = '/home/runner/work/yt_audio_collector/yt_audio_collector/output', destination_path: str = '/home/runner/work/yt_audio_collector/yt_audio_collector/processed_output', background_sound: bool = False)

Bases: object

This class preprocesses audio for a speech recognition model. It extracts vocals from audio files, resamples the audio, and divides the audio into chunks based on the transcriptions.

extract_vocals(chunk_path: str) → None

Extracts the vocals from an audio file using the spleeter library and stores them in a temporary file (temp).

Parameters:

chunk_path: str: The path of the audio file.
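
As an illustration of the vocal extraction step, the following is a minimal sketch and not the module's actual implementation. It assumes spleeter's two-stem model (`spleeter:2stems`) and spleeter's default output layout; the helper `vocals_output_path`, the `temp_dir` parameter, and the returned path are hypothetical additions for the example.

```python
from pathlib import Path


def vocals_output_path(chunk_path: str, temp_dir: str = "temp") -> Path:
    """Where the vocals stem is expected to land (hypothetical helper).

    Assumes spleeter's default layout:
    <destination>/<input file name without extension>/vocals.wav
    """
    return Path(temp_dir) / Path(chunk_path).stem / "vocals.wav"


def extract_vocals(chunk_path: str, temp_dir: str = "temp") -> Path:
    """Separate vocals from accompaniment and return the vocals path."""
    # Imported lazily so the path helper stays usable without spleeter installed.
    from spleeter.separator import Separator

    # Building a Separator loads the model, so real code would likely
    # create it once and reuse it across chunks.
    separator = Separator("spleeter:2stems")
    separator.separate_to_file(chunk_path, temp_dir)
    return vocals_output_path(chunk_path, temp_dir)


print(vocals_output_path("/data/output/news/clip_001.wav"))
```

Only the path helper runs in the example above; calling `extract_vocals` requires spleeter and its pretrained model to be available.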

static get_file_name(total_file_path: Path) → str

Returns the file name, without its extension, from an absolute file path.

Parameters:

total_file_path: Path: The absolute path of the file

Return:

str: File name without extension
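
The behaviour described above can be sketched with `pathlib`. This is an assumption about how such a helper might be written, not the module's source; note that `Path.stem` removes only the final extension.

```python
from pathlib import Path


def get_file_name(total_file_path: Path) -> str:
    """Return the file name without its final extension."""
    # Path.stem drops the directory part and the last suffix only.
    return Path(total_file_path).stem


print(get_file_name(Path("/data/output/news/clip_001.wav")))  # clip_001
print(get_file_name(Path("/data/output/archive.tar.gz")))     # archive.tar
```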

preprocess_audio()

Traverses all audio categories and preprocesses the audio in each.
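
A minimal sketch of the traversal, assuming each category is a direct subdirectory of the source path. The function name `list_categories` and the directory names are hypothetical; the real method may discover categories differently.

```python
import tempfile
from pathlib import Path
from typing import List


def list_categories(source_path: str) -> List[str]:
    """Return the category directories under source_path, sorted by name."""
    root = Path(source_path)
    # Only directories count as categories; loose files are ignored.
    return sorted(str(p) for p in root.iterdir() if p.is_dir())


# Demo on a throwaway directory tree.
with tempfile.TemporaryDirectory() as tmp:
    for name in ("podcasts", "news"):
        (Path(tmp) / name).mkdir()
    (Path(tmp) / "notes.txt").write_text("not a category")
    found = [Path(p).name for p in list_categories(tmp)]

print(found)  # ['news', 'podcasts']
```

Each returned path could then be handed to a per-category step such as `preprocess_audio_chunks`.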

preprocess_audio_chunks(category_path: str) → None

Divides the audio into chunks based on the transcriptions and preprocesses each chunk.

Parameters:

category_path: str: The category name in the audio folder.
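
To illustrate transcript-based chunking, the sketch below converts transcript entries into millisecond ranges that could be used to slice an audio file. The transcript format (a list of entries with `text`, `start`, and `duration` in seconds) and the function name `transcript_to_ranges` are assumptions; the module's actual transcript format is not documented here.

```python
from typing import Dict, List, Tuple


def transcript_to_ranges(
    transcript: List[Dict], audio_length_ms: int
) -> List[Tuple[int, int, str]]:
    """Map transcript entries to (start_ms, end_ms, text) chunk boundaries."""
    ranges = []
    for entry in transcript:
        start_ms = max(0, int(round(entry["start"] * 1000)))
        end_ms = int(round((entry["start"] + entry["duration"]) * 1000))
        # Clamp to the audio length so a chunk never runs past the file.
        end_ms = min(end_ms, audio_length_ms)
        # Skip entries that fall entirely outside the audio.
        if end_ms > start_ms:
            ranges.append((start_ms, end_ms, entry["text"]))
    return ranges


demo = [
    {"text": "hello", "start": 0.0, "duration": 1.5},
    {"text": "world", "start": 1.5, "duration": 2.0},
    {"text": "past the end", "start": 9.0, "duration": 1.0},
]
ranges = transcript_to_ranges(demo, audio_length_ms=3000)
print(ranges)  # [(0, 1500, 'hello'), (1500, 3000, 'world')]
```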

resample(chunk_path: str, destination_chunk_path: str) → None

Changes the sample rate, sample width, and number of channels of the given audio file.

Parameters:

chunk_path: str: The audio file path before resampling.
destination_chunk_path: str: The audio file path after resampling.
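
The following standard-library sketch shows what such a conversion involves. It is not the module's implementation, which may rely on a dedicated audio library. The targets (16 kHz, 16-bit, mono) are assumed values that are common for speech recognition, the function name `resample_wav` is hypothetical, and the linear interpolation used here applies no low-pass filter, so it is suitable only as an illustration.

```python
import array
import os
import sys
import tempfile
import wave


def resample_wav(chunk_path: str, destination_chunk_path: str,
                 target_rate: int = 16000) -> None:
    """Convert a 16-bit PCM WAV file to mono at target_rate."""
    with wave.open(chunk_path, "rb") as src:
        channels = src.getnchannels()
        width = src.getsampwidth()
        rate = src.getframerate()
        frames = src.readframes(src.getnframes())
    if width != 2:
        raise ValueError("this sketch only handles 16-bit PCM input")

    samples = array.array("h")
    samples.frombytes(frames)
    if sys.byteorder == "big":
        samples.byteswap()  # WAV data is little-endian

    # Downmix to mono by averaging the channels of each frame.
    mono = [sum(samples[i:i + channels]) // channels
            for i in range(0, len(samples), channels)]

    # Resample by linear interpolation between neighbouring samples.
    out_len = int(len(mono) * target_rate / rate)
    out = array.array("h")
    for n in range(out_len):
        pos = n * rate / target_rate
        i = int(pos)
        frac = pos - i
        nxt = mono[min(i + 1, len(mono) - 1)]
        out.append(int(round(mono[i] * (1 - frac) + nxt * frac)))
    if sys.byteorder == "big":
        out.byteswap()

    with wave.open(destination_chunk_path, "wb") as dst:
        dst.setnchannels(1)
        dst.setsampwidth(2)
        dst.setframerate(target_rate)
        dst.writeframes(out.tobytes())


# Demo: one second of silent 44.1 kHz stereo converted to 16 kHz mono.
with tempfile.TemporaryDirectory() as tmp:
    src_path = os.path.join(tmp, "in.wav")
    dst_path = os.path.join(tmp, "out.wav")
    with wave.open(src_path, "wb") as w:
        w.setnchannels(2)
        w.setsampwidth(2)
        w.setframerate(44100)
        w.writeframes(array.array("h", [0, 0] * 44100).tobytes())
    resample_wav(src_path, dst_path)
    with wave.open(dst_path, "rb") as r:
        result = (r.getnchannels(), r.getsampwidth(),
                  r.getframerate(), r.getnframes())

print(result)  # (1, 2, 16000, 16000)
```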

Module contents