In [None]:
# Running this cell disables all tqdm progress-bar output.
import os

os.environ["DISABLE_TQDM"] = "True"

# Reshaping multiple files with the Public API [^download]

[^download]: This notebook can be downloaded as **{nb-download}`example_reshaping_multiple_files_public.ipynb`**.

As always in the **Public API**, the first step is to **build the dataset**:

In [None]:
from pathlib import Path

from osekit.public_api.dataset import Dataset

audio_folder = Path(r"_static/sample_audio")

dataset = Dataset(
    folder=audio_folder,
    strptime_format="%y%m%d_%H%M%S",  # format of the timestamps in the audio file names
)

dataset.build()
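
As a standalone illustration of what the `strptime_format` above describes, the sketch below parses a timestamp with Python's standard `datetime.strptime`. The file name is hypothetical and chosen only to match the format; how OSEkit itself extracts the timestamp from a file name is not shown here.

```python
from datetime import datetime
from pathlib import Path

# Same format string as the one passed to Dataset.
strptime_format = "%y%m%d_%H%M%S"

# Hypothetical file name, used only to illustrate the format.
file_name = "220925_223450.wav"

# Drop the extension so that only the timestamp text remains.
timestamp_text = Path(file_name).stem

# %y%m%d -> two-digit year, month, day; %H%M%S -> hour, minute, second.
parsed = datetime.strptime(timestamp_text, strptime_format)
print(parsed)  # 2022-09-25 22:34:50
```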

The **Public API** `Dataset` is now analyzed and organized:

In [None]:
import pandas as pd

print(f"{' DATASET ':#^60}")
print(f"{'Begin:':<30}{str(dataset.origin_dataset.begin):>30}")
print(f"{'End:':<30}{str(dataset.origin_dataset.end):>30}")
print(f"{'Sample rate:':<30}{str(dataset.origin_dataset.sample_rate):>30}\n")

print(f"{' ORIGINAL FILES ':#^60}")

pd.DataFrame(
    [
        {
            "Name": f.path.name,
            "Begin": f.begin,
            "End": f.end,
            "Sample Rate": f.sample_rate,
        }
        for f in dataset.origin_files
    ],
).set_index("Name")

To **run analyses** in the **Public API**, use the `Analysis` class:

In [None]:
from osekit.public_api.analysis import Analysis, AnalysisType
from pandas import Timestamp, Timedelta

analysis = Analysis(
    analysis_type=AnalysisType.AUDIO,  # only export the reshaped audio
    begin=Timestamp("2022-09-25 22:35:15"),
    end=Timestamp("2022-09-25 22:36:25"),
    data_duration=Timedelta(seconds=5),
    name="reshape_example",
)
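
To get a feel for these parameters, the sketch below counts the windows they imply, assuming the analysis cuts the `begin` to `end` range into consecutive, non-overlapping windows of `data_duration`. It uses only pandas and does not call OSEkit, so it illustrates the arithmetic rather than the library's internal logic.

```python
from pandas import Timedelta, Timestamp, date_range

# Same values as in the Analysis above.
begin = Timestamp("2022-09-25 22:35:15")
end = Timestamp("2022-09-25 22:36:25")
data_duration = Timedelta(seconds=5)

# Window boundaries every 5 s from begin to end, both included.
boundaries = date_range(start=begin, end=end, freq=data_duration)

# Pair consecutive boundaries into (window_begin, window_end) tuples.
windows = list(zip(boundaries[:-1], boundaries[1:]))

# 70 s of data split into 5 s windows.
print(len(windows))  # 14
```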

The **Core API** can still be used on top of the **Public API**.
Here, we use the **Core API** to filter out the empty `AudioData` objects:

In [None]:
# Get the Core API AudioDataset that matches the analysis
audio_dataset = dataset.get_analysis_audiodataset(analysis=analysis)

# Filter the returned AudioDataset
audio_dataset.data = [ad for ad in audio_dataset.data if not ad.is_empty]
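
The idea behind this filter can be sketched without OSEkit, assuming that an "empty" `AudioData` is a time window that overlaps none of the recorded files. The file spans, the windows and the `overlaps_any_file` helper below are all hypothetical and serve only as an illustration.

```python
from pandas import Timedelta, Timestamp

# Hypothetical recording spans (begin, end):
# two 10 s files separated by a 20 s gap.
file_spans = [
    (Timestamp("2022-09-25 22:35:15"), Timestamp("2022-09-25 22:35:25")),
    (Timestamp("2022-09-25 22:35:45"), Timestamp("2022-09-25 22:35:55")),
]

# Hypothetical 5 s windows covering 22:35:15 to 22:35:55.
step = Timedelta(seconds=5)
start = Timestamp("2022-09-25 22:35:15")
windows = [(start + i * step, start + (i + 1) * step) for i in range(8)]


def overlaps_any_file(window, spans):
    """Return True if the window shares some time with at least one file span."""
    w_begin, w_end = window
    return any(w_begin < f_end and f_begin < w_end for f_begin, f_end in spans)


# Keep only the windows that contain audio, mirroring the list comprehension above.
non_empty = [w for w in windows if overlaps_any_file(w, file_spans)]
print(f"{len(non_empty)} of {len(windows)} windows contain audio")
```

In this toy setup, the windows that fall entirely inside the gap between the two files are the ones that get dropped.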

Running the analysis with the filtered `audio_dataset` skips the empty `AudioData`:

In [None]:
dataset.run_analysis(analysis=analysis, audio_dataset=audio_dataset)

All the new files from the analysis are stored in an `AudioDataset` named after `analysis.name`:

In [None]:
pd.DataFrame(
    [
        {
            "Exported file": next(iter(ad.files)).path.name,
            "Begin": ad.begin,
            "End": ad.end,
            "Sample Rate": ad.sample_rate,
        }
        for ad in dataset.get_dataset(analysis.name).data
    ],
).set_index("Exported file")

In [None]:
# Reset the dataset to move all files back to their original location.

dataset.reset()