Reshaping multiple files with the Public API
As always in the Public API, the first step is to build the dataset:
from pathlib import Path
audio_folder = Path(r"_static/sample_audio")
from osekit.public_api.dataset import Dataset
dataset = Dataset(
    folder=audio_folder,
    strptime_format="%y%m%d_%H%M%S",
)
dataset.build()
2025-08-27 10:39:19,993 Building the dataset...
2025-08-27 10:39:19,994 Analyzing original audio files...
2025-08-27 10:39:20,001 Organizing dataset folder...
2025-08-27 10:39:20,004 Build done!
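The strptime_format argument describes how timestamps are encoded in the file names. As a rough illustration using only the standard library (this is not OSEkit's own parsing code, and the regular expression and the parse_begin helper are assumptions made for this sketch), the format "%y%m%d_%H%M%S" maps the timestamp part of a sample file name to a datetime like this:

```python
import re
from datetime import datetime

STRPTIME_FORMAT = "%y%m%d_%H%M%S"


def parse_begin(file_name: str) -> datetime:
    """Extract the begin timestamp from a file name (hypothetical helper)."""
    # Assumption: the timestamp is the only 'YYMMDD_HHMMSS' digit run in the name.
    match = re.search(r"\d{6}_\d{6}", file_name)
    if match is None:
        raise ValueError(f"No timestamp found in {file_name!r}")
    return datetime.strptime(match.group(), STRPTIME_FORMAT)


begin = parse_begin("sample_220925_223450.wav")
print(begin)  # 2022-09-25 22:34:50
```

The parsed value agrees with the begin timestamp reported for the first original file.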
The Public API Dataset is now analyzed and organized:
print(f"{' DATASET ':#^60}")
print(f"{'Begin:':<30}{str(dataset.origin_dataset.begin):>30}")
print(f"{'End:':<30}{str(dataset.origin_dataset.end):>30}")
print(f"{'Sample rate:':<30}{str(dataset.origin_dataset.sample_rate):>30}\n")
print(f"{' ORIGINAL FILES ':#^60}")
import pandas as pd
pd.DataFrame(
    [
        {
            "Name": f.path.name,
            "Begin": f.begin,
            "End": f.end,
            "Sample Rate": f.sample_rate,
        }
        for f in dataset.origin_files
    ],
).set_index("Name")
######################### DATASET ##########################
Begin: 2022-09-25 22:34:50
End: 2022-09-25 22:36:50
Sample rate: 48000
###################### ORIGINAL FILES ######################
Name | Begin | End | Sample Rate |
---|---|---|---|
sample_220925_223450.wav | 2022-09-25 22:34:50 | 2022-09-25 22:35:00 | 48000 |
sample_220925_223500.wav | 2022-09-25 22:35:00 | 2022-09-25 22:35:10 | 48000 |
sample_220925_223510.wav | 2022-09-25 22:35:10 | 2022-09-25 22:35:20 | 48000 |
sample_220925_223520.wav | 2022-09-25 22:35:20 | 2022-09-25 22:35:30 | 48000 |
sample_220925_223530.wav | 2022-09-25 22:35:30 | 2022-09-25 22:35:40 | 48000 |
sample_220925_223600.wav | 2022-09-25 22:36:00 | 2022-09-25 22:36:10 | 48000 |
sample_220925_223610.wav | 2022-09-25 22:36:10 | 2022-09-25 22:36:20 | 48000 |
sample_220925_223620.wav | 2022-09-25 22:36:20 | 2022-09-25 22:36:30 | 48000 |
sample_220925_223630.wav | 2022-09-25 22:36:30 | 2022-09-25 22:36:40 | 48000 |
sample_220925_223640.wav | 2022-09-25 22:36:40 | 2022-09-25 22:36:50 | 48000 |
To run analyses in the Public API, use the Analysis class:
from osekit.public_api.analysis import Analysis, AnalysisType
from pandas import Timestamp, Timedelta
analysis = Analysis(
    analysis_type=AnalysisType.AUDIO,  # we only want to export the reshaped audio
    begin=Timestamp("2022-09-25 22:35:15"),
    end=Timestamp("2022-09-25 22:36:25"),
    data_duration=Timedelta(seconds=5),
    name="reshape_example",
)
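To make the begin, end, and data_duration parameters concrete, the sketch below lists the time windows such an analysis would cover. It uses only the standard library and assumes consecutive, non-overlapping windows starting at begin; this matches the exported files listed at the end of this page but is not taken from OSEkit's source.

```python
from datetime import datetime, timedelta

begin = datetime(2022, 9, 25, 22, 35, 15)
end = datetime(2022, 9, 25, 22, 36, 25)
data_duration = timedelta(seconds=5)

# Cut [begin, end) into consecutive windows of data_duration.
windows = []
start = begin
while start < end:
    windows.append((start, min(start + data_duration, end)))
    start += data_duration

print(len(windows))  # 14
```

The 70-second analysis period yields 14 five-second windows, whether or not the original files contain audio for each of them.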
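The Core API can still be used on top of the Public API.
The original files leave a gap between 22:35:40 and 22:36:00, so some of the 5-second AudioData in the analysis period contain no audio.
Here, we use the Core API to filter out these empty AudioData: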
# Returns a Core API AudioDataset that matches the analysis
audio_dataset = dataset.get_analysis_audiodataset(analysis=analysis)
# Filter the returned AudioDataset
audio_dataset.data = [ad for ad in audio_dataset.data if not ad.is_empty]
2025-08-27 10:39:20,027 Creating the audio data...
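The effect of this filter can be checked by hand. The sketch below is a plain-Python illustration of the idea, not OSEkit's is_empty implementation: it treats a window as empty when it overlaps none of the original files, using the file timestamps from the table above. The overlaps helper is hypothetical.

```python
from datetime import datetime, timedelta

# Begin times of the ten original 10-second files (from the table above).
file_begins = [datetime(2022, 9, 25, 22, 34, 50)]
file_begins += [datetime(2022, 9, 25, 22, 35, s) for s in (0, 10, 20, 30)]
file_begins += [datetime(2022, 9, 25, 22, 36, s) for s in (0, 10, 20, 30, 40)]
files = [(b, b + timedelta(seconds=10)) for b in file_begins]

# The fourteen 5-second windows of the analysis period.
analysis_begin = datetime(2022, 9, 25, 22, 35, 15)
step = timedelta(seconds=5)
windows = [(analysis_begin + i * step, analysis_begin + (i + 1) * step) for i in range(14)]


def overlaps(a, b):
    """True if the half-open intervals a and b share some time (hypothetical helper)."""
    return a[0] < b[1] and b[0] < a[1]


non_empty = [w for w in windows if any(overlaps(w, f) for f in files)]
empty = [w for w in windows if w not in non_empty]

print(len(non_empty), len(empty))  # 10 4
```

The four empty windows are the ones that fall in the 22:35:40 to 22:36:00 gap, which leaves ten windows to export.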
Running the analysis with the filtered audio_dataset skips the empty AudioData.
dataset.run_analysis(analysis=analysis, audio_dataset=audio_dataset)
2025-08-27 10:39:20,033 Running analysis...
2025-08-27 10:39:20,034 Writing audio files...
All the new files from the analysis are stored in an AudioDataset named after analysis.name:
pd.DataFrame(
    [
        {
            "Exported file": next(iter(ad.files)).path.name,
            "Begin": ad.begin,
            "End": ad.end,
            "Sample Rate": ad.sample_rate,
        }
        for ad in dataset.get_dataset(analysis.name).data
    ],
).set_index("Exported file")
Exported file | Begin | End | Sample Rate |
---|---|---|---|
2022_09_25_22_35_15_000000.wav | 2022-09-25 22:35:15 | 2022-09-25 22:35:20 | 48000 |
2022_09_25_22_35_20_000000.wav | 2022-09-25 22:35:20 | 2022-09-25 22:35:25 | 48000 |
2022_09_25_22_35_25_000000.wav | 2022-09-25 22:35:25 | 2022-09-25 22:35:30 | 48000 |
2022_09_25_22_35_30_000000.wav | 2022-09-25 22:35:30 | 2022-09-25 22:35:35 | 48000 |
2022_09_25_22_35_35_000000.wav | 2022-09-25 22:35:35 | 2022-09-25 22:35:40 | 48000 |
2022_09_25_22_36_00_000000.wav | 2022-09-25 22:36:00 | 2022-09-25 22:36:05 | 48000 |
2022_09_25_22_36_05_000000.wav | 2022-09-25 22:36:05 | 2022-09-25 22:36:10 | 48000 |
2022_09_25_22_36_10_000000.wav | 2022-09-25 22:36:10 | 2022-09-25 22:36:15 | 48000 |
2022_09_25_22_36_15_000000.wav | 2022-09-25 22:36:15 | 2022-09-25 22:36:20 | 48000 |
2022_09_25_22_36_20_000000.wav | 2022-09-25 22:36:20 | 2022-09-25 22:36:25 | 48000 |
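The exported file names appear to encode each segment's begin timestamp down to the microsecond. The following sketch reproduces that naming with the standard library; the format string is inferred from the table above rather than taken from OSEkit's documentation, and exported_name is a hypothetical helper.

```python
from datetime import datetime

# Assumption: format inferred from the exported file names in the table above.
EXPORT_NAME_FORMAT = "%Y_%m_%d_%H_%M_%S_%f"


def exported_name(begin: datetime) -> str:
    """Build a file name from a segment's begin timestamp (hypothetical helper)."""
    return f"{begin.strftime(EXPORT_NAME_FORMAT)}.wav"


print(exported_name(datetime(2022, 9, 25, 22, 35, 15)))
# 2022_09_25_22_35_15_000000.wav
```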