Computing an LTAS with the Public API

As always in the Public API, the first step is to build the dataset.

An Instrument can be provided to the Dataset so that the WAV data is converted to pressure units. The resulting spectra are then expressed in dB SPL (rather than in dB FS):

from pathlib import Path

audio_folder = Path(r"_static/sample_audio")

from osekit.public_api.dataset import Dataset
from osekit.core_api.instrument import Instrument

dataset = Dataset(
    folder=audio_folder,
    strptime_format="%y%m%d_%H%M%S",
    instrument=Instrument(end_to_end_db=150.0),
)

dataset.build()
2025-08-27 10:39:02,696 Building the dataset...
2025-08-27 10:39:02,697 Analyzing original audio files...
2025-08-27 10:39:02,704 Organizing dataset folder...
2025-08-27 10:39:02,706 Build done!
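
The effect of the `Instrument` on the reported levels can be illustrated with a short standalone sketch. It assumes that `end_to_end_db` is the level, in dB re 1 µPa, that a full-scale WAV sample (amplitude 1.0) corresponds to; under that assumption, a level in dB SPL is simply the level in dB FS shifted by `end_to_end_db`. The helper names below are hypothetical and are not part of OSEkit.

```python
import math

P_REF = 1e-6  # Reference pressure of 1 µPa, in Pa (usual underwater convention)


def full_scale_to_pressure(sample: float, end_to_end_db: float) -> float:
    """Convert a normalized WAV sample to a pressure in Pa.

    Assumption: a full-scale sample (1.0) corresponds to end_to_end_db re 1 µPa.
    """
    return sample * P_REF * 10 ** (end_to_end_db / 20)


def level_db_fs(sample: float) -> float:
    """Level of a sample relative to full scale."""
    return 20 * math.log10(abs(sample))


def level_db_spl(sample: float, end_to_end_db: float) -> float:
    """Level of the same sample once converted to pressure, re 1 µPa."""
    pressure = full_scale_to_pressure(sample, end_to_end_db)
    return 20 * math.log10(abs(pressure) / P_REF)


sample = 0.5  # Half of full scale
print(f"{level_db_fs(sample):.2f} dB FS")
print(f"{level_db_spl(sample, end_to_end_db=150.0):.2f} dB SPL")
```

With `end_to_end_db=150.0`, a full-scale sample maps to 150 dB SPL under this assumption.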

The Public API Dataset is now analyzed and organized:

print(f"{' DATASET ':#^60}")
print(f"{'Begin:':<30}{str(dataset.origin_dataset.begin):>30}")
print(f"{'End:':<30}{str(dataset.origin_dataset.end):>30}")
print(f"{'Sample rate:':<30}{str(dataset.origin_dataset.sample_rate):>30}\n")

print(f"{' ORIGINAL FILES ':#^60}")
import pandas as pd

pd.DataFrame(
    [
        {
            "Name": f.path.name,
            "Begin": f.begin,
            "End": f.end,
            "Sample Rate": f.sample_rate,
        }
        for f in dataset.origin_files
    ],
).set_index("Name")
######################### DATASET ##########################
Begin:                                   2022-09-25 22:34:50
End:                                     2022-09-25 22:36:50
Sample rate:                                           48000

###################### ORIGINAL FILES ######################
Name                      Begin                End                  Sample Rate
sample_220925_223450.wav  2022-09-25 22:34:50  2022-09-25 22:35:00  48000
sample_220925_223500.wav  2022-09-25 22:35:00  2022-09-25 22:35:10  48000
sample_220925_223510.wav  2022-09-25 22:35:10  2022-09-25 22:35:20  48000
sample_220925_223520.wav  2022-09-25 22:35:20  2022-09-25 22:35:30  48000
sample_220925_223530.wav  2022-09-25 22:35:30  2022-09-25 22:35:40  48000
sample_220925_223600.wav  2022-09-25 22:36:00  2022-09-25 22:36:10  48000
sample_220925_223610.wav  2022-09-25 22:36:10  2022-09-25 22:36:20  48000
sample_220925_223620.wav  2022-09-25 22:36:20  2022-09-25 22:36:30  48000
sample_220925_223630.wav  2022-09-25 22:36:30  2022-09-25 22:36:40  48000
sample_220925_223640.wav  2022-09-25 22:36:40  2022-09-25 22:36:50  48000
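
The begin timestamps listed above come from the file names, interpreted with the `strptime_format` passed to the Dataset. The sketch below shows how such a name can be parsed with the standard library alone. The regular expression used to locate the timestamp is a hypothetical helper written for this example; how OSEkit itself locates the timestamp in a file name is not shown here.

```python
import re
from datetime import datetime

STRPTIME_FORMAT = "%y%m%d_%H%M%S"  # Same format as the one given to the Dataset

# Hypothetical pattern matching the timestamp part of the sample file names
TIMESTAMP_PATTERN = re.compile(r"\d{6}_\d{6}")


def begin_from_filename(filename: str) -> datetime:
    """Extract the begin timestamp encoded in a file name."""
    match = TIMESTAMP_PATTERN.search(filename)
    if match is None:
        raise ValueError(f"No timestamp found in {filename!r}")
    return datetime.strptime(match.group(), STRPTIME_FORMAT)


print(begin_from_filename("sample_220925_223450.wav"))  # 2022-09-25 22:34:50
```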

Since we will run a spectral analysis, we need to define the FFT parameters:

from scipy.signal import ShortTimeFFT
from scipy.signal.windows import hamming

sample_rate = 24_000  # Sample rate of the analysis (the original files are sampled at 48 kHz)

sft = ShortTimeFFT(
    win=hamming(1024),
    hop=128,  # This will be forced to len(win) if we compute an LTAS
    fs=sample_rate,
)
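
The FFT parameters above determine the resolution of the resulting spectra. The following sketch derives these figures by hand. It assumes the FFT length equals the window length and that a one-sided spectrum is computed, which are the `ShortTimeFFT` defaults when `mfft` and `fft_mode` are not specified.

```python
sample_rate = 24_000  # Hz, as in the analysis
window_size = 1024  # Samples, as in hamming(1024)

# Spacing between two consecutive frequency bins
frequency_resolution = sample_rate / window_size

# A one-sided spectrum spans 0 Hz to the Nyquist frequency, both included
nb_frequency_bins = window_size // 2 + 1
nyquist_frequency = sample_rate / 2

# Duration covered by one window; for an LTAS the hop equals the window
# length, so this is also the time step between two consecutive spectra
window_duration = window_size / sample_rate

print(f"Frequency resolution: {frequency_resolution} Hz")
print(f"Frequency bins: {nb_frequency_bins} (0 Hz to {nyquist_frequency} Hz)")
print(f"Window duration: {window_duration * 1000:.2f} ms")
```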

To run analyses in the Public API, use the Analysis class:

from osekit.public_api.analysis import Analysis, AnalysisType

analysis = Analysis(
    analysis_type=AnalysisType.SPECTROGRAM
    | AnalysisType.MATRIX,  # Export both the spectrogram image and the sx matrix
    nb_ltas_time_bins=3000,  # This turns the regular spectrum computation into an LTAS
    sample_rate=sample_rate,
    fft=sft,
    v_lim=(0.0, 150.0),  # Boundaries of the spectrograms
    colormap="viridis",  # Default value
    name="LTAS",
)
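
The idea behind an LTAS (long-term average spectrum) is to average many consecutive spectra so that a long recording fits in a fixed number of time bins. The sketch below is a generic illustration of that averaging and is not OSEkit's implementation: the function name is hypothetical, and the way OSEkit actually groups and averages the spectra may differ.

```python
def average_into_time_bins(
    spectra: list[list[float]],
    nb_time_bins: int,
) -> list[list[float]]:
    """Average consecutive power spectra into at most nb_time_bins spectra.

    Each item of spectra is one power spectrum (linear values, not dB).
    """
    if len(spectra) <= nb_time_bins:
        # Nothing to average: there are already few enough spectra
        return [list(spectrum) for spectrum in spectra]

    averaged = []
    for i in range(nb_time_bins):
        # Split the spectra into nb_time_bins groups of near-equal size
        start = i * len(spectra) // nb_time_bins
        stop = (i + 1) * len(spectra) // nb_time_bins
        group = spectra[start:stop]
        # Average each frequency bin over the spectra of the group
        averaged.append([sum(values) / len(group) for values in zip(*group)])
    return averaged


spectra = [[1.0, 2.0], [3.0, 4.0], [5.0, 6.0], [7.0, 8.0]]
print(average_into_time_bins(spectra, nb_time_bins=2))  # [[2.0, 3.0], [6.0, 7.0]]
```

The averaging is done on linear power values; converting to dB before averaging would give a different result.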

Running the analysis will compute the LTAS and save the output files to disk.

dataset.run_analysis(analysis=analysis)
2025-08-27 10:39:02,740 Creating the audio data...
2025-08-27 10:39:02,745 Running analysis...
2025-08-27 10:39:02,745 Computing and writing spectrum matrices and spectrograms...
2025-08-27 10:39:04,394 Analysis done!

As with regular spectrum analyses, the output LTAS is stored in a SpectroDataset named after analysis.name:

pd.DataFrame(
    [
        {
            "Exported file": list(sd.files)[0].path.name,
            "Begin": sd.begin,
            "End": sd.end,
            "Sample Rate": sd.fft.fs,
        }
        for sd in dataset.get_dataset(analysis.name).data
    ],
).set_index("Exported file")
Exported file                   Begin                End                  Sample Rate
2022_09_25_22_34_50_000000.npz  2022-09-25 22:34:50  2022-09-25 22:36:50  24000

# Reset the dataset to move all files back to their original location.

dataset.reset()