Reshaping multiple files with the Core API [1]
Create an OSEkit AudioDataset from the files on disk by specifying the time-related requirements directly in the from_folder constructor.
Only the folder containing the files is needed: there is no need to dig down to the file level.
from pathlib import Path
audio_folder = Path(r"_static/sample_audio")
from osekit.core_api.audio_dataset import AudioDataset
from pandas import Timestamp, Timedelta
audio_dataset = AudioDataset.from_folder(
    folder=audio_folder,
    strptime_format="%y%m%d_%H%M%S",
    begin=Timestamp("2022-09-25 22:35:15"),
    end=Timestamp("2022-09-25 22:36:25"),
    data_duration=Timedelta(seconds=5),
)
The AudioDataset object contains all the to-be-exported AudioData:
print(f"{' AUDIO DATASET ':#^60}")
print(f"{'Begin:':<30}{str(audio_dataset.begin):>30}")
print(f"{'End:':<30}{str(audio_dataset.end):>30}")
print(f"{'Sample rate:':<30}{str(audio_dataset.sample_rate):>30}")
print(f"{'Nb of audio data:':<30}{str(len(audio_dataset.data)):>30}")
###################### AUDIO DATASET #######################
Begin:                                   2022-09-25 22:35:15
End:                                     2022-09-25 22:36:25
Sample rate:                                           48000
Nb of audio data:                                         14
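The count of 14 can be checked by hand: the requested range spans 70 seconds, cut into 5-second pieces. The sketch below reproduces that arithmetic with the standard library only. It is not OSEkit code; the windowing loop is an assumption about how a fixed data_duration tiles the [begin, end) range, and the truncation of a final partial window is an illustrative choice (it does not matter here, since 70 s divides evenly by 5 s).

```python
from datetime import datetime, timedelta

begin = datetime(2022, 9, 25, 22, 35, 15)
end = datetime(2022, 9, 25, 22, 36, 25)
data_duration = timedelta(seconds=5)

# Build consecutive [start, stop) windows of data_duration covering [begin, end).
# Assumption: a trailing partial window would be cut at `end`.
windows = []
start = begin
while start < end:
    windows.append((start, min(start + data_duration, end)))
    start += data_duration

print(f"Number of windows: {len(windows)}")
print(f"First window starts at: {windows[0][0]}")
print(f"Last window ends at: {windows[-1][1]}")
```

Each tuple in windows stands for the time span one AudioData would cover, whether or not an audio file actually exists for that span.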
We also want to skip the AudioData that fall in the gap between recordings.
Such AudioData have no linked file, so their is_empty property should be True.
print(f"{' BEFORE FILTERING ':#^60}")
print(
    f"{'Nb of Empty data:':<30}{str(len([ad for ad in audio_dataset.data if ad.is_empty])):>30}\n"
)
# Remove the empty data by using the default AudioDataset constructor:
audio_dataset = AudioDataset([ad for ad in audio_dataset.data if not ad.is_empty])
##################### BEFORE FILTERING #####################
Nb of Empty data:                                          4
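The filtering pattern above is plain Python and can be tried without any audio files. The sketch below uses a hypothetical FakeAudioData stand-in (not an OSEkit class) that only mimics the is_empty property. Which items are flagged as empty is chosen arbitrarily for illustration; only the totals (14 items, 4 empty) mirror the output above.

```python
from dataclasses import dataclass


@dataclass
class FakeAudioData:
    """Hypothetical stand-in for an AudioData; it only mimics is_empty."""

    name: str
    is_empty: bool


# 14 items, 4 of them flagged as empty (positions are arbitrary, not from the sample files).
items = [FakeAudioData(f"data_{i}", is_empty=(6 <= i < 10)) for i in range(14)]

# Counting booleans with sum() avoids building a throwaway list.
nb_empty = sum(item.is_empty for item in items)

# Same list-comprehension filter as in the tutorial code.
kept = [item for item in items if not item.is_empty]

print(f"Empty before filtering: {nb_empty}")
print(f"Kept after filtering: {len(kept)}")
```

Rebuilding the collection from the filtered list, rather than deleting items in place, leaves the original list untouched until the new one is assigned.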
The AudioDataset should now contain only non-empty AudioData:
print(f"{' AFTER FILTERING ':#^60}")
print(f"{'Nb of audio data:':<30}{str(len(audio_dataset.data)):>30}")
print(
    f"{'Nb of Empty data:':<30}{str(len([ad for ad in audio_dataset.data if ad.is_empty])):>30}\n"
)
##################### AFTER FILTERING ######################
Nb of audio data:                                         10
Nb of Empty data:                                          0
Export all the AudioData of the AudioDataset at once:
audio_dataset.write(audio_folder / "exported_files")