Transform#

Transform#

class osekit.public.transform.Transform(output_type: OutputType, begin: Timestamp | None = None, end: Timestamp | None = None, data_duration: Timedelta | None = None, mode: Literal['files', 'timedelta_total', 'timedelta_file']='timedelta_total', overlap: float = 0.0, sample_rate: float | None = None, normalization: Normalization = <Normalization.RAW: 1>, name: str | None = None, subtype: str | None = None, fft: ShortTimeFFT | None = None, v_lim: tuple[float, float] | None=None, colormap: str | None = None, scale: Scale | None = None, nb_ltas_time_bins: int | None = None)#

Class that contains all the parameters of a transform.

Transform instances are passed to the public API Project, which runs the transform. The Transform object holds everything needed to describe the transform to run. The output_type parameter determines which core dataset(s) will be created and added to the Project.outputs property, and which output files will be written to disk (reshaped audio files, npz spectra matrices, png spectrograms…). The Transform instance also holds the technical parameters of the transform (begin/end times, FFT, sample rate…).

Initialize a Transform object.

Parameters#

output_type: OutputType

The type of transform to run. See OutputType docstring for more info.

begin: Timestamp | None

The start of the transform dataset. Defaults to the start of the original dataset.

end: Timestamp | None

The end of the transform dataset. Defaults to the end of the original dataset.

data_duration: Timedelta | None

Duration of each data object within the transform dataset. If provided, data objects of this duration will be evenly distributed between begin and end. Otherwise, a single data object will cover the whole time period.

mode: Literal["files", "timedelta_total", "timedelta_file"]

How the data objects of the dataset are created from the original files. "files": one data object is created for each file. "timedelta_total": data objects of duration data_duration are created from the begin timestamp to the end timestamp. "timedelta_file": data objects of duration data_duration are created starting from the beginning of the first file that contains the begin timestamp, until the next data object would begin between two files. The next data object is then created from the beginning of the next original file, and so on.

overlap: float

Overlap percentage between consecutive data objects.
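To make the interplay between begin, end, data_duration and overlap more concrete, here is a minimal sketch of a "timedelta_total"-style split using only the standard library. It is not OSEkit's implementation: the split_period helper is hypothetical, it assumes that overlap is expressed as a fraction of data_duration in [0, 1), and it assumes that the last data object may run past end.

```python
from datetime import datetime, timedelta


def split_period(begin, end, data_duration, overlap=0.0):
    """Return (start, stop) pairs covering the period from begin to end.

    Hypothetical helper, not part of OSEkit. Assumes overlap is a
    fraction of data_duration in [0, 1).
    """
    if not 0.0 <= overlap < 1.0:
        raise ValueError("overlap must be in [0, 1)")
    # Consecutive segments start one "hop" apart.
    hop = data_duration * (1.0 - overlap)
    segments = []
    start = begin
    while start < end:
        # Assumption: the last segment is allowed to run past `end`.
        segments.append((start, start + data_duration))
        start += hop
    return segments


begin = datetime(2024, 1, 1, 0, 0, 0)
end = datetime(2024, 1, 1, 0, 1, 0)
no_overlap = split_period(begin, end, timedelta(seconds=20))
half_overlap = split_period(begin, end, timedelta(seconds=20), overlap=0.5)
print(len(no_overlap), len(half_overlap))
```

Under these assumptions, a one-minute period with 20-second data objects yields 3 segments without overlap and 6 segments with an overlap of 0.5, since each segment then starts 10 seconds after the previous one.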

sample_rate: float | None

Sample rate of the new transform data. If provided, the audio data will be resampled. Otherwise, the sample rate of the original dataset is kept.

normalization: Normalization

The type of normalization to apply to the audio data.

name: str | None

Name of the transform dataset. Defaults to the begin timestamp of the transform dataset. If both audio and spectro outputs are selected, the audio transform dataset name will be suffixed with "_audio".

subtype: str | None

Subtype of the written audio files, as provided by the soundfile module. Defaults to the default 16-bit PCM for wav audio files. This parameter has no effect if OutputType.AUDIO is not in output_type.

fft: ShortTimeFFT | None

FFT to use for computing the spectra. This parameter is mandatory if either OutputType.SPECTRUM or OutputType.SPECTROGRAM is in output_type, and has no effect otherwise.
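The rule above (an FFT is needed as soon as a spectrum or spectrogram output is requested) can be sketched as a small check. The example below does not import OSEkit: OutputTypeSketch is a stand-in flag enum that only mirrors the documented member names, assuming OutputType behaves like a standard enum.Flag, and requires_fft is a hypothetical helper. WELCH is left out of the check because the description above only names SPECTRUM and SPECTROGRAM.

```python
from enum import Flag, auto


class OutputTypeSketch(Flag):
    """Stand-in for OutputType; member names follow the documentation."""

    AUDIO = auto()
    SPECTRUM = auto()
    SPECTROGRAM = auto()
    WELCH = auto()


def requires_fft(output_type):
    """Return True if the flags include SPECTRUM or SPECTROGRAM.

    Hypothetical helper, not part of OSEkit.
    """
    spectral = OutputTypeSketch.SPECTRUM | OutputTypeSketch.SPECTROGRAM
    # A non-empty intersection means at least one spectral flag is set.
    return bool(output_type & spectral)


audio_only = OutputTypeSketch.AUDIO
audio_and_png = OutputTypeSketch.AUDIO | OutputTypeSketch.SPECTROGRAM
print(requires_fft(audio_only), requires_fft(audio_and_png))
```

With these stand-in flags, an audio-only selection does not need an FFT, while a selection that also includes SPECTROGRAM does.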

v_lim: tuple[float, float] | None

Limits (in dB) of the colormap used for plotting the spectrogram. Has no effect if OutputType.SPECTROGRAM is not in output_type.

colormap: str | None

Colormap to use for plotting the spectrogram. Has no effect if OutputType.SPECTROGRAM is not in output_type.

scale: Scale | None

Custom frequency scale to use for plotting the spectrogram. Has no effect if OutputType.SPECTROGRAM is not in output_type.

nb_ltas_time_bins: int | None

If None, a regular spectrogram will be computed. If specified, the spectrogram will be computed as an LTAS (long-term average spectrum), with the value representing the maximum number of averaged time bins.
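As an illustration of the idea behind capping the number of time bins, the sketch below averages groups of consecutive time bins so that at most a given number of bins remain. This is only one plausible way to do it and is not taken from OSEkit's code: average_time_bins is a hypothetical helper, the spectrogram is modelled as a plain list of time bins, and the values are assumed to be linear power (averaging dB values directly would give a different result).

```python
from statistics import fmean


def average_time_bins(spectrogram, max_time_bins):
    """Average consecutive time bins so that at most max_time_bins remain.

    spectrogram: list of time bins, each a list of one value per frequency.
    Hypothetical helper, not OSEkit's implementation.
    """
    nb_bins = len(spectrogram)
    if nb_bins <= max_time_bins:
        # Already short enough: return a copy, unchanged.
        return [list(time_bin) for time_bin in spectrogram]
    # Ceiling division: number of original bins merged into one output bin.
    group_size = -(-nb_bins // max_time_bins)
    averaged = []
    for start in range(0, nb_bins, group_size):
        group = spectrogram[start:start + group_size]
        # Average each frequency across the time bins of the group.
        averaged.append([fmean(values) for values in zip(*group)])
    return averaged


# Four time bins, two frequency bins each (assumed linear power values).
spectrogram = [[1.0, 2.0], [3.0, 4.0], [5.0, 6.0], [7.0, 8.0]]
ltas = average_time_bins(spectrogram, max_time_bins=2)
print(ltas)
```

Here the four input time bins are merged two by two, so the result holds two averaged time bins.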

property fft: ShortTimeFFT | None#

Return the FFT used in the transform.

property is_spectro: bool#

Return True if the transform contains spectral computations, False otherwise.

property sample_rate: float | None#

Return the sample rate of the transform.

OutputType#

class osekit.public.transform.OutputType(*values)#

Enum of flags that should be used to specify the type of transform to run.

AUDIO:

Will add an AudioDataset to the outputs and write the reshaped audio files to disk. The new AudioDataset will be linked to the reshaped audio files rather than to the original files.

SPECTRUM:

Will write the npz SpectroFiles to disk and link the SpectroDataset to these files.

SPECTROGRAM:

Will export the spectrogram png images.

WELCH:

Will write the npz welches to disk.

Multiple flags can be combined with the bitwise OR operator (|): OutputType.AUDIO | OutputType.SPECTROGRAM will export both audio files and spectrogram images.

>>> # Exporting both the reshaped audio and the spectrograms
>>> # (without the npz matrices):
>>> export = OutputType.AUDIO | OutputType.SPECTROGRAM
>>> OutputType.AUDIO in export
True
>>> OutputType.SPECTROGRAM in export
True
>>> OutputType.SPECTRUM in export
False