Public API#

This API provides tools for working on large sets of audio data.

Basically, the whole point of OSEkit’s Public API is to export large amounts of spectrograms and/or reshaped audio files with no consideration of the original format of the audio files.

The osekit.public.project.Project class is the cornerstone of OSEkit’s Public API.

Building a Project#

At first, A Project is built from a raw folder containing the audio files to be processed. For example, this folder containing 4 audio files plus some extra files:

7181.230205154906.wav
7181.230205164906.wav
7181.230205174906.wav
7181.230205194906.wav
foo
β”œβ”€β”€ bar.zip
└── 7181.wav
bar.txt

Only the folder path and strptime format are required to initialize the Project.

Extra parameters allow for e.g. localizing the files in a specific timezone or accounting for the measurement chain to link the raw wav data to the measured acoustic presure. The complete list of extra parameters is provided in the osekit.public.project.Project documentation.

from osekit.public.project import Project
from pathlib import Path

project = Project(
    folder=Path(r"...\project_folder"),
    strptime_format="%y%m%d%H%M%S" # Must match the strptime format of your audio files
)

If you don’t know (or don’t care to parse) the start timestamps of your audio files, you can specify the first_file_begin parameter. Then, the first valid audio file found in the folder will be considered as starting at this timestamp, and each next valid audio file will be considered as starting directly after the end of the previous one:

from osekit.public.project import Project
from pathlib import Path

project = Project(
    folder=Path(r"...\project_folder"),
    strptime_format=None # Must match the strptime format of your audio files,
    first_file_begin=Timestamp("2020-01-01 00:00:00") # Will mark the start of your audio files
)

Once this is done, the Project can be built using the osekit.public.project.Project.build() method.

project.build()

The folder is now organized in the following fashion:

data
└── audio
    └── original
        β”œβ”€β”€ 7181.230205154906.wav
        β”œβ”€β”€ 7181.230205164906.wav
        β”œβ”€β”€ 7181.230205174906.wav
        β”œβ”€β”€ 7181.230205194906.wav
        └── original.json
other
β”œβ”€β”€ foo
β”‚   β”œβ”€β”€ bar.zip
β”‚   └── 7181.wav
└── bar.txt
project.json

The original audio files have been turned into a osekit.core.audio_dataset.AudioDataset. In this AudioDataset, one osekit.core.audio_data.AudioData has been created per original audio file. Additionally, both this Core API Audiodataset and the Public API Project have been serialized into the original.json and project.json files, respectively.

Building a project from specific files#

It is possible to pass a specific collection of files to build within the project.

This is done thanks to the osekit.public.project.Project.build_from_files() method. The collection of files passed as the files parameter will be either moved or copied (depending on the move_files parameter) within Project.folder before the build is done. If Project.folder does not exist, it will be created before the files are moved:

from pathlib import Path
from osekit.public.project import Project

# Pick the files you want to include to the project
files = (
    r"cool\stuff\2007-11-05_00-01-00.wav",
    r"cool\things\2007-11-05_00-03-00.wav",
    r"cool\things\2007-11-05_00-05-00.wav"
)

# Set the DESTINATION folder of the project in the folder parameter
project = Project(
    folder = Path(r"cool\odd_hours"),
    strptime_format="%Y-%m-%d_%H-%M-%S",
)

project.build_from_files(
    files=files,
    move_files=False, # Copy files rather than moving them
)

Running an Transform#

In OSEkit, Analyses are run with the osekit.public.project.Project.run() method to process and export spectrogram images, spectrum matrices and audio files from original audio files.

Note

OSEkit makes it easy to reshape the original audio: it is not bound to the original files, and can freely be reshaped in audio data of any duration and sample rate.

The transform parameters are described by a osekit.public.transform.Transform instance passed as a parameter to this method.

Transform Type#

The output_type parameter passed to the initializer is a osekit.public.transform.OutputType instance that defines the transform output(s):

Output Types#

Flag

Output

OutputType.AUDIO

Reshaped audio files

OutputType.SPECTRUM

STFT NPZ spectrum files

OutputType.WELCH

Welch NPZ files

OutputType.SPECTROGRAM

PNG spectrogram images

Multiple outputs can be selected thanks to a logical or | separator.

For example, if a transform aims at exporting both the reshaped audio files and the corresponding spectrograms:

from osekit.public.transform import OutputType
output_type = OutputType.AUDIO | OutputType.SPECTROGRAM

Transform Parameters#

The remaining parameters of the transform (begin and end Timestamps, duration and sample rate of the reshaped data…) are described in the osekit.public.transform.Transform initializer docstring.

Note

If the Transform contains spectral computations (either OutputType.SPECTRUM, OutputType.SPECTROGRAM or OutputType.WELCH is in output_type), a scipy ShortTimeFFT instance should be passed to the Transform initializer.

Checking/Editing the transform#

If you want to take a peek at what the transform output will be before actually running it, the osekit.public.project.Project.prepare_audio() and osekit.public.project.Project.prepare_spectro() methods return a osekit.core.audio_dataset.AudioDataset and a osekit.core.spectro_dataset.SpectroDataset instance, respectively.

The returned AudioDataset can be edited at will and passed as a parameter later on when the transform is run:

ads = project.prepare_audio(transform=transform)

# Filtering out the AudioData that are not linked to any audio file:
ads.remove_empty_data(threshold=0.)

The returned SpectroDataset can be used e.g. to plot sample spectrograms prior to the transform running:

import matplotlib.pyplot as plt

sds = project.prepare_spectro(transform=transform, audio_dataset=ads) # audio_dataset is optional: here, the sds will match the edited ads (with no empty data)

# Computing/plotting the 100th SpectroData from the transform
sds.data[100].plot()
plt.show()

Running the transform#

To run the Transform, simply execute the osekit.public.project.Project.run() method:

project.run(transform=transform) # And that's it!

If you edited the transform AudioDataset as explained in the Checking/Editing the transform section, you can specify the edited AudioDataset on which the transform will be run:

project.run(transform=transform, audio_dataset=ads)

Note

Any OutputType.Spectrogram can be computed as a Long Term Average Spectrum by setting the nb_ltas_time_bins parameter to an integer value.

When the field is at the default None value, spectrograms are computed regularly:

# limit spectrograms to 3000 averaged time bins:
project.run(transform=transform, audio_dataset=ads, nb_ltas_time_bins=3000)

Simple Example: Reshaping audio#

Regardless of the format(s) of the original audio files (as always in OSEkit), let’s say we just want to resample our original audio data at 48 kHz and export it as 10 s-long audio files.

The corresponding Transform is the following:

from osekit.public.transform import Transform, OutputType
from pandas import Timedelta

transform = Transform(
    output_type = OutputType.AUDIO, # We just want to export the reshaped audio files
    data_duration=Timedelta("10s"), # Duration of the new audio files
    sample_rate=48_000, # Sample rate of the new audio files
    name="cool_reshape", # You can name the transform, or keep the default name.
)

project.run(transform=transform) # And that's it!

Output 1#

Once the transform is run, a osekit.core.audio_dataset.AudioDataset instance named cool_reshape has been created and added to the project’s osekit.public.project.Project.outputs field.

The project folder now looks like this:

data
└── audio
    β”œβ”€β”€ original
    β”‚   β”œβ”€β”€ 7181.230205154906.wav
    β”‚   β”œβ”€β”€ 7181.230205164906.wav
    β”‚   β”œβ”€β”€ 7181.230205174906.wav
    β”‚   β”œβ”€β”€ 7181.230205194906.wav
    β”‚   └── original.json
    └── cool_reshape
        β”œβ”€β”€ 2023_04_05_15_49_06_000000.wav
        β”œβ”€β”€ 2023_04_05_15_49_16_000000.wav
        β”œβ”€β”€ 2023_04_05_15_49_26_000000.wav
        β”œβ”€β”€ ...
        β”œβ”€β”€ 2023_04_05_20_48_46_000000.wav
        β”œβ”€β”€ 2023_04_05_20_48_56_000000.wav
        └── cool_reshape.json
other
β”œβ”€β”€ foo
β”‚   └── bar.zip
└── bar.txt
project.json

The cool_reshape folder has been created, containing the freshly created 10 s-long, 48 kHz-sampled audio files.

Note

The cool_reshape folder also contains a cool_reshape.json serialized version of the cool_reshape AudioDataset, which will be used for deserializing the project.json file in the project folder root.

Example: full transform#

Let’s now say we want to export audio, spectrum matrices and spectrograms with the following parameters:

Example transform parameters#

Parameter

Value

Begin

00:30:00 after the begin of the original audio files

End

01:30:00 after the begin of the original audio files

Data duration

10 s

Sample rate

48 kHz

Audio data normalization

dc_reject (removes the audio DC component)

FFT

hamming window, 1024 points, 40% overlap

Let’s first instantiate the ShortTimeFFT since we want to run a spectral transform:

from scipy.signal import ShortTimeFFT
from scipy.signal.windows import hamming

sft = ShortTimeFFT(
    win=hamming(1_024), # Window shape,
    hop=round(1_024*(1-.4)), # 40% overlap
    fs=48_000,
    scale_to="magnitude"
)

Then we are all set for running the transform:

from osekit.public.transform import Transform, OutputType
from pandas import Timedelta

transform = Transform(
    output_type = OutputType.AUDIO | OutputType.SPECTRUM | OutputType.WELCH | OutputType.SPECTROGRAM, # Full transform : audio files, spectrum matrices and spectrograms will be exported.
    begin=project.origin_dataset.begin + Timedelta(minutes=30), # 30m after the begin of the original dataset
    end=project.origin_dataset.begin + Timedelta(hours=1.5), # 1h30 after the begin of the original dataset
    data_duration=Timedelta("10s"), # Duration of the output data
    sample_rate=48_000, # Sample rate of the output data
    normalization="dc_reject",
    name="full_transform", # You can name the transform, or keep the default name.
    fft=sft, # The FFT parameters
)

project.run(transform=transform) # And that's it!

Output 2#

Since the transform contains both OutputType.AUDIO and spectral transform types, two core API datasets were created and added to the project’s osekit.public.project.Project.datasets field:

The project folder now looks like this (the output from the first example was removed for convenience):

data
└── audio
    β”œβ”€β”€ original
    β”‚   β”œβ”€β”€ 7181.230205154906.wav
    β”‚   β”œβ”€β”€ 7181.230205164906.wav
    β”‚   β”œβ”€β”€ 7181.230205174906.wav
    β”‚   β”œβ”€β”€ 7181.230205194906.wav
    β”‚   └── original.json
    └── full_transform_audio
        β”œβ”€β”€ 2023_04_05_16_19_06_000000.wav
        β”œβ”€β”€ 2023_04_05_16_19_16_000000.wav
        β”œβ”€β”€ 2023_04_05_16_19_26_000000.wav
        β”œβ”€β”€ ...
        β”œβ”€β”€ 2023_04_05_17_18_46_000000.wav
        β”œβ”€β”€ 2023_04_05_17_18_56_000000.wav
        └── full_transform_audio.json
processed
└── full_transform
    β”œβ”€β”€ spectrogram
    β”‚   β”œβ”€β”€ 2023_04_05_16_19_06_000000.png
    β”‚   β”œβ”€β”€ 2023_04_05_16_19_16_000000.png
    β”‚   β”œβ”€β”€ 2023_04_05_16_19_26_000000.png
    β”‚   β”œβ”€β”€ ...
    β”‚   β”œβ”€β”€ 2023_04_05_17_18_46_000000.png
    β”‚   └── 2023_04_05_17_18_56_000000.png
    β”œβ”€β”€ spectrum
    β”‚   β”œβ”€β”€ 2023_04_05_16_19_06_000000.npz
    β”‚   β”œβ”€β”€ 2023_04_05_16_19_16_000000.npz
    β”‚   β”œβ”€β”€ 2023_04_05_16_19_26_000000.npz
    β”‚   β”œβ”€β”€ ...
    β”‚   β”œβ”€β”€ 2023_04_05_17_18_46_000000.npz
    β”‚   └── 2023_04_05_17_18_56_000000.npz
    β”œβ”€β”€ welch
    β”‚   └── 2023_04_05_16_19_06_000000.npz
    └── full_transform.json
other
β”œβ”€β”€ foo
β”‚   └── bar.zip
└── bar.txt
project.json

As in the output of example 1, a full_transform_audio folder was created, containing the reshaped audio files.

Additionally, the fresh processed folder contains the output spectrograms and NPZ matrices, along with the full_transform.json serialized osekit.core.spectro_dataset.SpectroDataset.

Recovering a Project#

The project.json file in the root project folder can be used to deserialize a osekit.public.project.Project object thanks to the osekit.public.project.Project.from_json() method:

from pathlib import Path
from osekit.public.project import Project

json_file = Path(r"../project.json")
project = Project.from_json(json_file) # That's it!

Resetting a Project#

Warning

Calling this method is irreversible

The osekit.public.project.Project.reset() method resets the project’s folder to its initial state. All exported outputs ans json files will be removed, and the folder will be back to its state before building the project.