Public API#
This API provides tools for working on large sets of audio data.
The main purpose of OSEkit's Public API is to export large numbers of spectrograms and/or reshaped audio files, regardless of the format of the original audio files.
The osekit.public_api.dataset.Dataset class is the cornerstone of OSEkit's Public API.
Building a Dataset#
First, a Dataset is built from a raw folder containing the audio files to be processed.
For example, this folder containing 4 audio files plus some extra files:
7181.230205154906.wav
7181.230205164906.wav
7181.230205174906.wav
7181.230205194906.wav
foo
├── bar.zip
└── 7181.wav
bar.txt
Only the folder path and strptime format are required to initialize the Dataset.
Extra parameters allow for e.g. localizing the files in a specific timezone or accounting for the measurement chain that links the raw wav data to the measured acoustic pressure.
The complete list of extra parameters is provided in the osekit.public_api.dataset.Dataset documentation.
from osekit.public_api.dataset import Dataset
from pathlib import Path
dataset = Dataset(
folder=Path(r"...\dataset_folder"),
strptime_format="%y%m%d%H%M%S" # Must match the strptime format of your audio files
)
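To check that a strptime format matches your file names before building the Dataset, the timestamp can be parsed by hand with the standard library. The sketch below is not OSEkit's own parsing code: the regular expression and the parse_timestamp helper are assumptions made for illustration, applied to one file name from the example folder.

```python
import re
from datetime import datetime

STRPTIME_FORMAT = "%y%m%d%H%M%S"  # same format as the one passed to Dataset


def parse_timestamp(filename: str) -> datetime:
    """Hypothetical helper: find the 12-digit timestamp in a file name and parse it."""
    # %y%m%d%H%M%S always spans exactly 12 digits, so look for an isolated 12-digit run.
    match = re.search(r"(?<!\d)\d{12}(?!\d)", filename)
    if match is None:
        raise ValueError(f"No timestamp found in {filename!r}")
    return datetime.strptime(match.group(), STRPTIME_FORMAT)


print(parse_timestamp("7181.230205154906.wav"))  # 2023-02-05 15:49:06
```

If the format does not match, datetime.strptime raises a ValueError, which is a quick way to spot a wrong format string.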
Once this is done, the Dataset can be built using the osekit.public_api.dataset.Dataset.build() method.
dataset.build()
The folder is now organized in the following fashion:
data
└── audio
    └── original
        ├── 7181.230205154906.wav
        ├── 7181.230205164906.wav
        ├── 7181.230205174906.wav
        ├── 7181.230205194906.wav
        └── original.json
other
├── foo
│   ├── bar.zip
│   └── 7181.wav
└── bar.txt
dataset.json
The original audio files have been turned into an osekit.core_api.audio_dataset.AudioDataset.
In this AudioDataset, one osekit.core_api.audio_data.AudioData has been created per original audio file.
Additionally, both this Core API AudioDataset and the Public API Dataset have been serialized into the original.json and dataset.json files, respectively.
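To compare a built folder with the layout described here, its files can be listed with the standard library. This sketch does not use OSEkit: the list_tree helper is hypothetical, and the demonstration runs on a throwaway folder holding a few empty placeholder files.

```python
import tempfile
from pathlib import Path


def list_tree(root: Path) -> list[str]:
    """Hypothetical helper: sorted relative paths of every file under root."""
    return sorted(p.relative_to(root).as_posix() for p in root.rglob("*") if p.is_file())


# Throwaway folder standing in for a built dataset folder.
with tempfile.TemporaryDirectory() as tmp:
    root = Path(tmp)
    for rel in ("data/audio/original/original.json", "other/bar.txt", "dataset.json"):
        (root / rel).parent.mkdir(parents=True, exist_ok=True)
        (root / rel).touch()
    files = list_tree(root)

print(files)
# ['data/audio/original/original.json', 'dataset.json', 'other/bar.txt']
```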
Running an Analysis#
In OSEkit, Analyses are run with the osekit.public_api.dataset.Dataset.run_analysis()
method to process and export spectrogram images, spectrum matrices and audio files from original audio files.
Note
OSEkit makes it easy to reshape the original audio: the output is not bound to the original files, and can freely be reshaped into audio data of any duration and sample rate.
The analysis parameters are described by an osekit.public_api.analysis.Analysis instance passed as a parameter to this method.
Analysis Type#
The analysis_type parameter passed to the initializer is an osekit.public_api.analysis.AnalysisType instance that defines the analysis output(s):
Flag | Output |
---|---|
AnalysisType.AUDIO | Reshaped audio files |
AnalysisType.MATRIX | STFT NPZ matrix files |
AnalysisType.WELCH | Welch NPZ files |
AnalysisType.SPECTROGRAM | PNG spectrogram images |
Multiple outputs can be selected by combining flags with the | (bitwise OR) operator.
For example, if an analysis aims at exporting both the reshaped audio files and the corresponding spectrograms:
from osekit.public_api.analysis import AnalysisType
analysis_type = AnalysisType.AUDIO | AnalysisType.SPECTROGRAM
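Combining values with | is the usual behaviour of Python flag enumerations. The sketch below uses a hypothetical OutputFlag class built on the standard enum.Flag to show the mechanics. Whether AnalysisType is implemented exactly this way is an assumption.

```python
from enum import Flag, auto


class OutputFlag(Flag):
    """Hypothetical stand-in for AnalysisType, for illustration only."""

    AUDIO = auto()
    MATRIX = auto()
    WELCH = auto()
    SPECTROGRAM = auto()


selected = OutputFlag.AUDIO | OutputFlag.SPECTROGRAM

# Membership tests tell which outputs were requested.
print(OutputFlag.AUDIO in selected)   # True
print(OutputFlag.MATRIX in selected)  # False
```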
Analysis Parameters#
The remaining parameters of the analysis (begin and end Timestamps, duration and sample rate of the reshaped data, etc.) are described in the osekit.public_api.analysis.Analysis initializer docstring.
Note
If the Analysis contains spectral computations (either AnalysisType.MATRIX, AnalysisType.SPECTROGRAM or AnalysisType.WELCH is in analysis_type), a scipy ShortTimeFFT instance should be passed to the Analysis initializer.
Checking/Editing the analysis#
If you want to take a peek at what the analysis output will be before actually running it, the osekit.public_api.dataset.Dataset.get_analysis_audiodataset() and osekit.public_api.dataset.Dataset.get_analysis_spectrodataset() methods return an osekit.core_api.audio_dataset.AudioDataset and an osekit.core_api.spectro_dataset.SpectroDataset instance, respectively.
The returned AudioDataset can be edited at will and passed as a parameter later on when the analysis is run:
ads = dataset.get_analysis_audiodataset(analysis=analysis)
# Filtering out the AudioData that are not linked to any audio file:
ads.data = [ad for ad in ads.data if not ad.is_empty]
The returned SpectroDataset can be used e.g. to plot sample spectrograms prior to the analysis:
import matplotlib.pyplot as plt
# audio_dataset is optional: here, the sds will match the edited ads (with no empty data)
sds = dataset.get_analysis_spectrodataset(analysis=analysis, audio_dataset=ads)
# Computing/plotting the 100th SpectroData from the analysis
sds.data[100].plot()
plt.show()
Running the analysis#
To run the Analysis, simply execute the osekit.public_api.dataset.Dataset.run_analysis() method:
dataset.run_analysis(analysis=analysis) # And that's it!
If you edited the analysis AudioDataset as explained in the Checking/Editing the analysis section, you can specify the edited AudioDataset on which the analysis will be run:
dataset.run_analysis(analysis=analysis, audio_dataset=ads)
Note
Any AnalysisType.SPECTROGRAM output can be computed as a Long Term Average Spectrum (LTAS) by setting the nb_ltas_time_bins parameter to an integer value.
When the parameter is left at its default None value, spectrograms are computed regularly. To compute an LTAS instead:
# limit spectrograms to 3000 averaged time bins:
dataset.run_analysis(analysis=analysis, audio_dataset=ads, nb_ltas_time_bins=3000)
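The idea behind a long term average can be illustrated without OSEkit. The sketch below averages the time columns of a small frequency-by-time matrix into a fixed number of bins. It is only an assumption about the general technique: the average_time_bins helper is hypothetical, values are assumed to be linear (not dB), and the way OSEkit actually computes its LTAS is not described here.

```python
from statistics import fmean


def average_time_bins(spectrogram, nb_bins):
    """Hypothetical helper: average the time columns of a (freq x time) matrix into nb_bins columns."""
    nb_cols = len(spectrogram[0])
    if nb_cols <= nb_bins:
        # Nothing to average: the matrix already has few enough time bins.
        return [list(row) for row in spectrogram]
    # Column boundaries of each averaged bin (integer split, as even as possible).
    edges = [i * nb_cols // nb_bins for i in range(nb_bins + 1)]
    return [
        [fmean(row[start:stop]) for start, stop in zip(edges, edges[1:])]
        for row in spectrogram
    ]


# Two frequency rows, six time columns, reduced to three averaged time bins.
sx = [[1, 3, 5, 7, 9, 11], [2, 2, 4, 4, 6, 6]]
print(average_time_bins(sx, 3))  # [[2.0, 6.0, 10.0], [2.0, 4.0, 6.0]]
```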
Simple Example: Reshaping audio#
Regardless of the format(s) of the original audio files (as always in OSEkit), let's say we just want to resample our original audio data at 48 kHz and export it as 10 s-long audio files.
The corresponding Analysis is the following:
from osekit.public_api.analysis import Analysis, AnalysisType
from pandas import Timedelta
analysis = Analysis(
analysis_type = AnalysisType.AUDIO, # We just want to export the reshaped audio files
data_duration=Timedelta("10s"), # Duration of the new audio files
sample_rate=48_000, # Sample rate of the new audio files
name="cool_reshape", # You can name the analysis, or keep the default name.
)
dataset.run_analysis(analysis=analysis) # And that's it!
Output 1#
Once the analysis is run, an osekit.core_api.audio_dataset.AudioDataset instance named cool_reshape has been created and added to the dataset's osekit.public_api.dataset.Dataset.datasets field.
The dataset folder now looks like this:
data
└── audio
    ├── original
    │   ├── 7181.230205154906.wav
    │   ├── 7181.230205164906.wav
    │   ├── 7181.230205174906.wav
    │   ├── 7181.230205194906.wav
    │   └── original.json
    └── cool_reshape
        ├── 2023_02_05_15_49_06_000000.wav
        ├── 2023_02_05_15_49_16_000000.wav
        ├── 2023_02_05_15_49_26_000000.wav
        ├── ...
        ├── 2023_02_05_20_48_46_000000.wav
        ├── 2023_02_05_20_48_56_000000.wav
        └── cool_reshape.json
other
├── foo
│   └── bar.zip
└── bar.txt
dataset.json
The cool_reshape folder has been created, containing the freshly created 10 s-long, 48 kHz-sampled audio files.
Note
The cool_reshape folder also contains a cool_reshape.json serialized version of the cool_reshape AudioDataset, which will be used for deserializing the dataset.json file in the dataset folder root.
Example: full analysis#
Let's now say we want to export audio, spectrum matrices, Welch spectra and spectrograms with the following parameters:
Parameter | Value |
---|---|
Begin | 00:30:00 after the beginning of the original audio files |
End | 01:30:00 after the beginning of the original audio files |
Data duration | 10 s |
Sample rate | 48 kHz |
FFT | Hamming window, 1024 points, 40% overlap |
Let's first instantiate the ShortTimeFFT since we want to run a spectral analysis:
from scipy.signal import ShortTimeFFT
from scipy.signal.windows import hamming
sft = ShortTimeFFT(
win=hamming(1_024), # Window shape,
hop=round(1_024*(1-.4)), # 40% overlap
fs=48_000,
scale_to="magnitude"
)
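The numbers implied by these FFT parameters can be worked out with plain arithmetic. The sketch below only restates the values used above. The frequency bin count assumes a one-sided FFT with no zero-padding, which is an assumption about the ShortTimeFFT configuration rather than something stated on this page.

```python
window_size = 1_024   # samples in the Hamming window
overlap = 0.4         # requested overlap ratio
sample_rate = 48_000  # Hz

# Hop size in samples, rounded to an integer as in the ShortTimeFFT call.
hop = round(window_size * (1 - overlap))
effective_overlap = 1 - hop / window_size  # slightly above 0.4 because of rounding
time_step = hop / sample_rate              # seconds between two spectrum slices
freq_resolution = sample_rate / window_size  # Hz between two frequency bins
nb_freq_bins = window_size // 2 + 1          # one-sided FFT, no zero-padding (assumption)

print(hop)              # 614
print(freq_resolution)  # 46.875
print(nb_freq_bins)     # 513
print(round(effective_overlap, 4), round(time_step, 4))  # 0.4004 0.0128
```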
Then we are all set for running the analysis:
from osekit.public_api.analysis import Analysis, AnalysisType
from pandas import Timedelta
analysis = Analysis(
analysis_type = AnalysisType.AUDIO | AnalysisType.MATRIX | AnalysisType.WELCH | AnalysisType.SPECTROGRAM, # Full analysis: audio files, spectrum matrices, Welch spectra and spectrograms will be exported.
begin=dataset.origin_dataset.begin + Timedelta(minutes=30), # 30m after the begin of the original dataset
end=dataset.origin_dataset.begin + Timedelta(hours=1.5), # 1h30 after the begin of the original dataset
data_duration=Timedelta("10s"), # Duration of the output data
sample_rate=48_000, # Sample rate of the output data
name="full_analysis", # You can name the analysis, or keep the default name.
fft=sft, # The FFT parameters
)
dataset.run_analysis(analysis=analysis) # And that's it!
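The time span covered by this analysis can be checked with the standard library alone. The sketch below takes the begin of the original dataset from the first file name (7181.230205154906.wav parsed with %y%m%d%H%M%S) and lists the begin of every 10 s segment. The file naming template is an assumption inferred from the exported file names, not a documented OSEkit setting.

```python
from datetime import datetime, timedelta

# Begin of the original dataset, read from 7181.230205154906.wav.
origin_begin = datetime(2023, 2, 5, 15, 49, 6)

begin = origin_begin + timedelta(minutes=30)
end = origin_begin + timedelta(hours=1.5)
data_duration = timedelta(seconds=10)

# Begin timestamp of every full 10 s segment between begin and end.
nb_segments = (end - begin) // data_duration
segment_begins = [begin + i * data_duration for i in range(nb_segments)]

# Assumed naming template for the exported files.
names = [b.strftime("%Y_%m_%d_%H_%M_%S_%f") for b in segment_begins]

print(nb_segments)  # 360
print(names[0])     # 2023_02_05_16_19_06_000000
print(names[-1])    # 2023_02_05_17_18_56_000000
```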
Output 2#
Since the analysis contains both AnalysisType.AUDIO and spectral analysis types, two core API datasets were created and added to the dataset's osekit.public_api.dataset.Dataset.datasets field:
- An osekit.core_api.audio_dataset.AudioDataset named full_analysis_audio (with the _audio suffix)
- An osekit.core_api.spectro_dataset.SpectroDataset named full_analysis
The dataset folder now looks like this (the output from the first example was removed for convenience):
data
└── audio
    ├── original
    │   ├── 7181.230205154906.wav
    │   ├── 7181.230205164906.wav
    │   ├── 7181.230205174906.wav
    │   ├── 7181.230205194906.wav
    │   └── original.json
    └── full_analysis_audio
        ├── 2023_02_05_16_19_06_000000.wav
        ├── 2023_02_05_16_19_16_000000.wav
        ├── 2023_02_05_16_19_26_000000.wav
        ├── ...
        ├── 2023_02_05_17_18_46_000000.wav
        ├── 2023_02_05_17_18_56_000000.wav
        └── full_analysis_audio.json
processed
└── full_analysis
    ├── spectrogram
    │   ├── 2023_02_05_16_19_06_000000.png
    │   ├── 2023_02_05_16_19_16_000000.png
    │   ├── 2023_02_05_16_19_26_000000.png
    │   ├── ...
    │   ├── 2023_02_05_17_18_46_000000.png
    │   └── 2023_02_05_17_18_56_000000.png
    ├── matrix
    │   ├── 2023_02_05_16_19_06_000000.npz
    │   ├── 2023_02_05_16_19_16_000000.npz
    │   ├── 2023_02_05_16_19_26_000000.npz
    │   ├── ...
    │   ├── 2023_02_05_17_18_46_000000.npz
    │   └── 2023_02_05_17_18_56_000000.npz
    ├── welch
    │   └── 2023_02_05_16_19_06_000000.npz
    └── full_analysis.json
other
├── foo
│   └── bar.zip
└── bar.txt
dataset.json
As in the output of example 1, a full_analysis_audio folder was created, containing the reshaped audio files.
Additionally, the fresh processed folder contains the output spectrograms and NPZ matrices, along with the full_analysis.json serialized osekit.core_api.spectro_dataset.SpectroDataset.
Recovering a Dataset#
The dataset.json file in the root dataset folder can be used to deserialize an osekit.public_api.dataset.Dataset object thanks to the osekit.public_api.dataset.Dataset.from_json() method:
from pathlib import Path
from osekit.public_api.dataset import Dataset
json_file = Path(r"../dataset.json")
dataset = Dataset.from_json(json_file) # That's it!
Resetting a Dataset#
Warning
Calling this method is irreversible.
The osekit.public_api.dataset.Dataset.reset() method resets the dataset's folder to its initial state.
All exported analyses and json files will be removed, and the folder will be back to its state before building the dataset.
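Since a reset cannot be undone, it can be worth copying the dataset folder somewhere safe first. The sketch below uses only the standard library. The backup_folder helper and the _backup suffix are hypothetical, and the demonstration runs on a throwaway folder rather than a real dataset.

```python
import shutil
import tempfile
from pathlib import Path


def backup_folder(dataset_folder: Path, backup_root: Path) -> Path:
    """Hypothetical helper: copy the whole dataset folder into backup_root."""
    destination = backup_root / f"{dataset_folder.name}_backup"
    # copytree raises FileExistsError if the backup already exists.
    shutil.copytree(dataset_folder, destination)
    return destination


# Demonstration on a throwaway folder standing in for a real dataset folder.
with tempfile.TemporaryDirectory() as tmp:
    folder = Path(tmp) / "dataset_folder"
    folder.mkdir()
    (folder / "dataset.json").write_text("{}")
    backup = backup_folder(folder, Path(tmp))
    copied = sorted(p.name for p in backup.iterdir())

print(copied)  # ['dataset.json']
```

With a real dataset, the backup would be made before calling the reset method, and deleted once the exported analyses are confirmed to be no longer needed.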