In [None]:
# Executing this cell will disable all TQDM outputs in stdout.
import os

os.environ["DISABLE_TQDM"] = "True"

# Computing multiple spectrograms with the Public API [^download]

[^download]: This notebook can be downloaded as **{nb-download}`example_multiple_spectrograms_public.ipynb`**.

As always in the **Public API**, the first step is to **build the dataset**.

An `Instrument` can be provided to the `Dataset` so that the WAV data is converted to pressure units. The resulting spectra are then expressed in dB SPL (rather than in dB FS):

In [None]:
from pathlib import Path

from osekit.core_api.instrument import Instrument
from osekit.public_api.dataset import Dataset

audio_folder = Path(r"_static/sample_audio")

dataset = Dataset(
    folder=audio_folder,
    strptime_format="%y%m%d_%H%M%S",
    instrument=Instrument(end_to_end_db=150.0),
)

dataset.build()
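
To make the dB FS / dB SPL distinction concrete, here is a minimal sketch in plain Python. It assumes that `end_to_end_db` is the level, in dB SPL, reached by a full-scale sample, so that the conversion is a constant offset; this relation and the helper names `db_fs` and `db_spl_from_db_fs` are illustrative assumptions, not part of the OSEkit API:

```python
import math

END_TO_END_DB = 150.0  # same value as Instrument(end_to_end_db=150.0) above


def db_fs(amplitude: float) -> float:
    """Level of a normalized, non-zero amplitude (full scale = 1.0) in dB FS."""
    return 20.0 * math.log10(abs(amplitude))


def db_spl_from_db_fs(level_db_fs: float, end_to_end_db: float = END_TO_END_DB) -> float:
    """Assumed relation: a full-scale sample corresponds to `end_to_end_db` dB SPL."""
    return level_db_fs + end_to_end_db


half_scale_fs = db_fs(0.5)  # about -6.02 dB FS
half_scale_spl = db_spl_from_db_fs(half_scale_fs)  # about 143.98 dB SPL
print(f"{half_scale_fs:.2f} dB FS -> {half_scale_spl:.2f} dB SPL")
```

Under this assumption, the `v_lim` range used for plotting should be chosen on the dB SPL scale rather than on the (negative) dB FS scale.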

The **Public API** `Dataset` is now analyzed and organized:

In [None]:
import pandas as pd

print(f"{' DATASET ':#^60}")
print(f"{'Begin:':<30}{str(dataset.origin_dataset.begin):>30}")
print(f"{'End:':<30}{str(dataset.origin_dataset.end):>30}")
print(f"{'Sample rate:':<30}{str(dataset.origin_dataset.sample_rate):>30}\n")

print(f"{' ORIGINAL FILES ':#^60}")

pd.DataFrame(
    [
        {
            "Name": f.path.name,
            "Begin": f.begin,
            "End": f.end,
            "Sample Rate": f.sample_rate,
        }
        for f in dataset.origin_files
    ],
).set_index("Name")

Since we will run a spectral analysis, we need to define the FFT parameters:

In [None]:
from scipy.signal import ShortTimeFFT
from scipy.signal.windows import hamming

sample_rate = 24_000

sft = ShortTimeFFT(win=hamming(1024), hop=128, fs=sample_rate)
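
The FFT parameters above fix the time and frequency resolution of the resulting spectra. The following is a quick check in plain Python, assuming the FFT size equals the window length and the spectrum is one-sided (the `scipy` defaults for `ShortTimeFFT`); the variable names are illustrative only:

```python
sample_rate = 24_000  # Hz, as in the cell above
window_size = 1024  # samples, length of the Hamming window
hop = 128  # samples between two consecutive windows

# Spacing between two frequency bins, in Hz.
frequency_resolution = sample_rate / window_size

# Time step between two spectrogram columns, in seconds.
time_resolution = hop / sample_rate

# Fraction of each window shared with the next one.
overlap = 1 - hop / window_size

# Number of bins of a one-sided spectrum (0 Hz up to Nyquist).
n_frequency_bins = window_size // 2 + 1

print(f"Frequency resolution: {frequency_resolution} Hz")
print(f"Time resolution: {time_resolution * 1e3:.2f} ms")
print(f"Overlap: {overlap:.1%}")
print(f"Frequency bins: {n_frequency_bins}")
```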

To **run analyses** in the **Public API**, use the `Analysis` class:

In [None]:
from osekit.public_api.analysis import Analysis, AnalysisType
from pandas import Timestamp, Timedelta

analysis = Analysis(
    analysis_type=AnalysisType.MATRIX
    | AnalysisType.SPECTROGRAM
    | AnalysisType.WELCH,  # Export all three spectral outputs
    begin=Timestamp("2022-09-25 22:35:15"),
    end=Timestamp("2022-09-25 22:36:25"),
    data_duration=Timedelta(seconds=5),
    sample_rate=sample_rate,
    fft=sft,
    v_lim=(0.0, 150.0),  # Lower and upper limits (in dB) of the spectrogram color scale
    colormap="viridis",  # Default value
    name="all_spectral_output",
)
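
As a sanity check on the time parameters, the sketch below counts the data segments covered by this analysis. It assumes the span between `begin` and `end` is split into consecutive, non-overlapping segments of `data_duration`, and it uses the standard library `datetime` module in place of `pandas`:

```python
from datetime import datetime, timedelta

begin = datetime(2022, 9, 25, 22, 35, 15)
end = datetime(2022, 9, 25, 22, 36, 25)
data_duration = timedelta(seconds=5)

# Floor division of two timedeltas returns an integer.
n_segments = (end - begin) // data_duration

# Boundaries of each assumed segment.
segments = [
    (begin + i * data_duration, begin + (i + 1) * data_duration)
    for i in range(n_segments)
]

print(f"{n_segments} segments of {data_duration.total_seconds():.0f} s")
print(f"First: {segments[0][0]} -> {segments[0][1]}")
print(f"Last:  {segments[-1][0]} -> {segments[-1][1]}")
```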

The **Core API** can still be used on top of the **Public API**.

We will use it to filter out the empty `AudioData` before running the analysis:

In [None]:
# Returns a Core API AudioDataset that matches the analysis
audio_dataset = dataset.get_analysis_audiodataset(analysis=analysis)

# Filter the returned AudioDataset
audio_dataset.data = [ad for ad in audio_dataset.data if not ad.is_empty]
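
The sketch below illustrates the idea behind this filter with stand-in objects. It assumes that a segment is empty when it overlaps none of the recorded files; the `Span` class and the file layout (two 10 s recordings separated by a 10 s gap) are hypothetical and do not come from OSEkit or from the sample dataset:

```python
from dataclasses import dataclass
from datetime import datetime, timedelta


@dataclass
class Span:
    """Hypothetical stand-in for anything with a begin and an end."""

    begin: datetime
    end: datetime

    def overlaps(self, other: "Span") -> bool:
        # Two half-open intervals overlap if each starts before the other ends.
        return self.begin < other.end and other.begin < self.end


t0 = datetime(2022, 9, 25, 22, 35, 15)

# Hypothetical recordings: 0-10 s and 20-30 s after t0.
files = [
    Span(t0, t0 + timedelta(seconds=10)),
    Span(t0 + timedelta(seconds=20), t0 + timedelta(seconds=30)),
]

# Six consecutive 5 s segments covering 0-30 s after t0.
step = timedelta(seconds=5)
segments = [Span(t0 + i * step, t0 + (i + 1) * step) for i in range(6)]

# Same list-comprehension pattern as above: keep only the non-empty segments.
non_empty = [s for s in segments if any(s.overlaps(f) for f in files)]

print(f"{len(non_empty)} of {len(segments)} segments contain audio")
```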

We can also glance at the spectrogram results with the **Core API**:

In [None]:
import matplotlib.pyplot as plt

analysis_spectro_dataset = dataset.get_analysis_spectrodataset(
    analysis=analysis,
    audio_dataset=audio_dataset,  # So that the filtered SpectroDataset is returned
)

analysis_spectro_dataset.data[1].plot()
plt.show()

Running the analysis with the filtered `audio_dataset` skips the empty `AudioData` (and thus the empty `SpectroData`):

In [None]:
dataset.run_analysis(analysis=analysis, audio_dataset=audio_dataset)

All the new files from the analysis are stored in a `SpectroDataset` named after `analysis.name`:

In [None]:
pd.DataFrame(
    [
        {
            "Exported file": next(iter(sd.files)).path.name,
            "Begin": sd.begin,
            "End": sd.end,
            "Sample Rate": sd.fft.fs,
        }
        for sd in dataset.get_dataset(analysis.name).data
    ],
).set_index("Exported file")

In [None]:
# Reset the dataset to move all files back to their original location.

dataset.reset()