Project#

Main class of the Public API.

The Project is the class that stores the original audio dataset and from which transforms are run to generate spectro datasets, reshaped audio datasets, etc. It has additional metadata that can be exported, e.g. to APLOSE.

Initialize a Project.

Parameters#

folder: Path: Path to the folder containing the original audio files.
strptime_format: str | list[str] | None: The strptime format used in the filenames. It should use valid strftime codes (https://strftime.org/). If None, the first audio file of the folder will start at first_file_begin, and each following file will start at the end of the previous one.
gps_coordinates: str | list | tuple: GPS coordinates of the location where the audio files were recorded.
depth: float: Depth at which the audio files were recorded.
timezone: str | None: Timezone in which the audio data will be located. If the audio file timestamps are parsed with a tz-aware strptime_format (%z or %Z code), the AudioFiles will be converted from the parsed timezone to the specified timezone.
outputs: dict | None: Core API Datasets that have been exported in this project. Mainly used for deserialization.
job_builder: Job_builder | None: If None, outputs from this Project will be run locally. Otherwise, PBS job files will be created and submitted when transforms are run. See the osekit.job module for more info.
instrument: Instrument | None: Instrument that might be used to obtain acoustic pressure from the wav audio data. See the osekit.core.instrument module for more info.
first_file_begin: Timestamp | None: Timestamp of the first audio file being processed. Will be ignored if strptime_format is specified.
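The timestamp-related parameters above can be illustrated with a small standard-library sketch. It is not OSEkit's implementation: `format_to_regex`, `parse_begin` and `chain_begins` are hypothetical helpers, the filenames are made up, and a fixed UTC+1 offset stands in for a named timezone. It shows a begin timestamp parsed from a filename with a strptime format, a tz-aware timestamp (`%z`) converted to another timezone, and the fallback where each file starts at the end of the previous one, beginning at `first_file_begin`.

```python
import re
from datetime import datetime, timedelta, timezone

# Regex fragments for a few common strptime codes (deliberately not exhaustive).
CODE_PATTERNS = {
    "%Y": r"\d{4}",
    "%y": r"\d{2}",
    "%m": r"\d{2}",
    "%d": r"\d{2}",
    "%H": r"\d{2}",
    "%M": r"\d{2}",
    "%S": r"\d{2}",
    "%z": r"[+-]\d{4}",
}


def format_to_regex(strptime_format: str) -> str:
    """Translate a strptime format into a regex matching the same text."""
    parts = re.split(r"(%[A-Za-z])", strptime_format)
    # Format codes become digit patterns (unsupported codes raise KeyError);
    # literal text between codes is escaped and matched as-is.
    return "".join(
        CODE_PATTERNS[part] if part.startswith("%") else re.escape(part)
        for part in parts
    )


def parse_begin(filename: str, strptime_format: str) -> datetime:
    """Find the timestamp embedded in a filename and parse it."""
    match = re.search(format_to_regex(strptime_format), filename)
    if match is None:
        raise ValueError(f"{filename!r} does not match {strptime_format!r}")
    return datetime.strptime(match.group(0), strptime_format)


def chain_begins(first_file_begin: datetime, durations: list) -> list:
    """Fallback without a format: each file starts where the previous one ends."""
    begins, current = [], first_file_begin
    for duration in durations:
        begins.append(current)
        current += duration
    return begins


# Naive timestamp parsed from a (made-up) filename.
naive = parse_begin("hydrophone_20230105_134500.wav", "%Y%m%d_%H%M%S")

# Tz-aware timestamp (%z) converted to a target timezone (fixed UTC+1 here).
aware = parse_begin("rec_2023-01-05T13:45:00+0000.wav", "%Y-%m-%dT%H:%M:%S%z")
localized = aware.astimezone(timezone(timedelta(hours=1)))

# No format: two files of 10 and 5 minutes chained from first_file_begin.
chained = chain_begins(
    datetime(2023, 1, 5),
    [timedelta(minutes=10), timedelta(minutes=5)],
)
```

Converting with `astimezone` changes the displayed wall-clock time but not the instant, which matches the description of `timezone` above: the parsed timezone is converted, not overwritten.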

build() → None#

Build the Project.

Building a Project moves the original audio files to a specific folder and creates serialized json files used by APLOSE.

build_from_files(files: Iterable[PathLike | str], *, move_files: bool = False) → None#

Build the Project from the specified files.

The files will be copied (or moved) to the project.folder folder.

Parameters#

files: Iterable[PathLike|str]: Files that are included in the project.
move_files: bool: If set to True, the files will be moved (rather than copied) into the project folder.
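The copy-versus-move behaviour of `move_files` can be sketched with the standard library alone. This is an illustration under assumptions, not OSEkit code: `gather_files` is a hypothetical helper, and the temporary folders and `.wav` names exist only for the demonstration.

```python
import shutil
import tempfile
from pathlib import Path


def gather_files(files, destination: Path, *, move_files: bool = False) -> list:
    """Copy (default) or move the given files into the destination folder."""
    destination.mkdir(parents=True, exist_ok=True)
    gathered = []
    for file in map(Path, files):  # accepts both str and PathLike entries
        target = destination / file.name
        if move_files:
            shutil.move(str(file), str(target))  # source no longer exists afterwards
        else:
            shutil.copy2(file, target)  # source is left untouched
        gathered.append(target)
    return gathered


with tempfile.TemporaryDirectory() as tmp:
    source = Path(tmp) / "source"
    source.mkdir()
    first = source / "a.wav"
    second = source / "b.wav"
    first.write_bytes(b"demo")
    second.write_bytes(b"demo")

    copied = gather_files([first], Path(tmp) / "copied")
    moved = gather_files([str(second)], Path(tmp) / "moved", move_files=True)

    copy_kept_source = first.exists()
    move_kept_source = second.exists()
    targets_exist = copied[0].exists() and moved[0].exists()
```

Copying is the safer default because the original recordings stay where they were; moving avoids duplicating large audio files on disk.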

delete_transform_with_outputs(transform_name: str) → None#

Delete all output datasets of an already run transform, given its name.

WARNING: all the output files will be deleted.

Parameters#

transform_name: str: Name of the transform whose outputs should be deleted.

deserialize_output(output_name: str) → type[DatasetChild]#

Deserialize an output dataset from its json file.

The self.outputs property is updated to store the deserialized dataset in place of the json file, so that each output is deserialized only once.

Parameters#

output_name: str: Name of the output dataset.

Returns#

type[DatasetChild]: The deserialized output dataset.
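The "deserialize only once" behaviour described above is a lazy-loading cache. A minimal sketch of that pattern follows, assuming a registry whose entries are either a path to a json file or an already loaded object. `OutputRegistry`, `load_count` and the json content are hypothetical and do not reflect how OSEkit stores its outputs.

```python
import json
import tempfile
from pathlib import Path


class OutputRegistry:
    """Hypothetical stand-in for a project holding serialized outputs."""

    def __init__(self, outputs: dict):
        # Each value is either a Path to a json file or a deserialized object.
        self.outputs = outputs
        self.load_count = 0  # counts actual reads from disk

    def deserialize_output(self, output_name: str) -> dict:
        entry = self.outputs[output_name]
        if isinstance(entry, Path):
            # First access: read the json file and replace the path by the
            # loaded object, so later accesses skip the disk entirely.
            entry = json.loads(entry.read_text())
            self.load_count += 1
            self.outputs[output_name] = entry
        return entry


with tempfile.TemporaryDirectory() as tmp:
    json_file = Path(tmp) / "demo_output.json"
    json_file.write_text(json.dumps({"name": "demo_output", "nb_files": 3}))

    registry = OutputRegistry({"demo_output": json_file})
    first_call = registry.deserialize_output("demo_output")
    second_call = registry.deserialize_output("demo_output")
    loads = registry.load_count
```

Both calls return the same object, and the file is read a single time.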

export(output_type: OutputType, ads: AudioDataset | None = None, sds: SpectroDataset | LTASDataset | None = None, subtype: str | None = None, spectrum_folder_name: str = 'spectrum', spectrogram_folder_name: str = 'spectrogram', welch_folder_name: str = 'welch', nb_jobs: int = 1, name: str = 'OSEkit_transform', *, link: bool = False) → None#

Perform a transform and write the results on disk.

A transform is defined as a manipulation of the original audio files: reshaping the audio, exporting png spectrograms or npz matrices (or a combination of those three) are examples of transforms. If self.job_builder is not None, the tasks are distributed across nb_jobs jobs; otherwise, they are run locally.

Parameters#

output_type: OutputType: Type of the transform to be performed. AudioDataset and SpectroDataset instances will be created depending on the flags. See osekit.public.transform.OutputType docstring for more information.
ads: AudioDataset: The AudioDataset on which the data should be written.
sds: SpectroDataset | LTASDataset: The SpectroDataset on which the data should be written.
subtype: str | None: The subtype of the audio files as provided by the soundfile module.
spectrum_folder_name: str: The name of the folder in which the npz matrices will be exported (relative to sds.folder).
spectrogram_folder_name: str: The name of the folder in which the png spectrograms will be exported (relative to sds.folder).
welch_folder_name: str: The name of the folder in which the npz welch files will be exported (relative to sds.folder).
nb_jobs: int: The number of jobs to run in parallel.
name: str: The name of the transform being performed.
link: bool: If True, the ads data will be linked to the exported files.
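The `output_type` description says datasets are created "depending on the flags". The sketch below shows how combinable flags of that kind typically work, using `enum.Flag`. `DemoOutputType`, its member names and `datasets_needed` are hypothetical; the real members are documented in the osekit.public.transform.OutputType docstring.

```python
from enum import Flag, auto


class DemoOutputType(Flag):
    """Hypothetical flags; not the members of the real OutputType."""

    AUDIO = auto()
    MATRIX = auto()
    SPECTROGRAM = auto()


def datasets_needed(output_type: DemoOutputType) -> list:
    """Decide which dataset kinds a given flag combination would require."""
    needed = []
    if DemoOutputType.AUDIO in output_type:
        needed.append("AudioDataset")
    # Any spectral output (matrix or picture) requires a spectro dataset.
    if output_type & (DemoOutputType.MATRIX | DemoOutputType.SPECTROGRAM):
        needed.append("SpectroDataset")
    return needed


audio_and_picture = datasets_needed(DemoOutputType.AUDIO | DemoOutputType.SPECTROGRAM)
matrix_only = datasets_needed(DemoOutputType.MATRIX)
```

Flags combine with `|`, so a single argument can request several outputs at once.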

classmethod from_dict(dictionary: dict) → Project#

Deserialize a project from a dictionary.

Parameters#

dictionary: dict: The serialized dictionary representing the project.

Returns#

Project: The deserialized project.

classmethod from_json(file: Path) → Project#

Deserialize a Project from a json file.

Parameters#

file: Path: Path to the serialized json file representing the Project.

Returns#

Project: The deserialized Project.

get_output(output_name: str) → type[DatasetChild] | None#

Get an output dataset from its name.

Parameters#

output_name: str: Name of the output dataset.

Returns#

type[DatasetChild]: Output dataset from the project.outputs property.

get_output_by_transform_name(transform_name: str) → list[type[DatasetChild]]#

Get all output datasets from a given transform.

Parameters#

transform_name: str: Name of the transform of which to get the output datasets.

Returns#

list[type[DatasetChild]]: List of the output datasets.

property origin_dataset: AudioDataset#: Return the AudioDataset from which this Project has been built.

property origin_files: list[AudioFile] | None#: Return the original audio files from which this Project has been built.

prepare_audio(transform: Transform) → AudioDataset#

Return an AudioDataset created from the transform parameters.

Parameters#

transform: Transform: Transform for which to generate an AudioDataset object.

Returns#

AudioDataset: The AudioDataset that matches the transform parameters. This AudioDataset can be used either to have a peek at the transform output, or to edit the transform (adding/removing data) by editing it and passing it as a parameter to the Project.run() method.

prepare_spectro(transform: Transform, audio_dataset: AudioDataset | None = None) → SpectroDataset | LTASDataset#

Return a SpectroDataset (or LTASDataset) created from transform parameters.

Parameters#

transform: Transform: Transform for which to generate a SpectroDataset (or LTASDataset) object.
audio_dataset: AudioDataset|None: If provided, the SpectroDataset will be initialized from this AudioDataset. This can be used to edit the transform (e.g. adding/removing data) before running it.

Returns#

SpectroDataset | LTASDataset: The SpectroDataset that matches the transform parameters. This SpectroDataset can be used, for example, to have a peek at the transform output before running it. If Transform.is_ltas is True, an LTASDataset is returned.

rename_transform_with_outputs(transform_name: str, new_transform_name: str) → None#

Rename an already run transform.

All outputs of the transform will be renamed as if the transform had originally been run with the new_transform_name name.

Parameters#

transform_name: str: Name of the transform to rename.
new_transform_name: str: New name for the transform.

reset() → None#

Reset the Project.

Resetting a project will move back the original audio files and the content of the other folder to the root folder. WARNING: all other files and folders will be deleted.

run(transform: Transform, audio_dataset: AudioDataset | None = None, spectro_dataset: SpectroDataset | None = None, nb_jobs: int = 1) → None#

Create a new transform dataset from the original audio files.

The transform parameter sets which type(s) of core dataset(s) will be created and added to the Project.outputs property, plus which output files will be written to disk (reshaped audio files, npz spectra matrices, png spectrograms…).

Parameters#

transform: Transform: Transform to run. Contains the transform type and required info. See the public.transform.Transform docstring for more info.
audio_dataset: AudioDataset | None: If provided, the transform will be run on this AudioDataset. Else, an AudioDataset will be created from the transform parameters. This can be used to edit the transform AudioDataset (adding, removing, renaming AudioData etc.).
spectro_dataset: SpectroDataset | None: If provided, the spectral transform will be run on this SpectroDataset. Else, a SpectroDataset will be created from the audio_dataset if provided, or from the transform parameters. This can be used to edit the transform SpectroDataset (adding, removing, renaming SpectroData etc.).
nb_jobs: int: Number of jobs to run in parallel.
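To give an idea of what running `nb_jobs` jobs in parallel implies, here is a sketch of one possible way to split a number of items into contiguous batches, one per job. The even split and the `split_into_batches` helper are assumptions made for illustration; the source does not state how OSEkit actually distributes the work.

```python
def split_into_batches(nb_items: int, nb_jobs: int) -> list:
    """Split item indexes into at most nb_jobs contiguous, near-equal ranges."""
    if nb_jobs < 1:
        raise ValueError("nb_jobs must be at least 1")
    base, extra = divmod(nb_items, nb_jobs)
    batches = []
    start = 0
    for job in range(nb_jobs):
        # The first `extra` jobs take one additional item each.
        size = base + (1 if job < extra else 0)
        if size == 0:
            continue  # fewer items than jobs: skip empty batches
        batches.append(range(start, start + size))
        start += size
    return batches


ten_items_three_jobs = split_into_batches(10, 3)
two_items_four_jobs = split_into_batches(2, 4)
```

With 10 items and 3 jobs, the batches cover indexes 0 to 3, 4 to 6 and 7 to 9; with more jobs than items, the surplus jobs simply receive nothing.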

to_dict() → dict#

Serialize a project to a dictionary.

Returns#

dict: The serialized dictionary representing the project.

property transforms: list[str]#: Return the list of the names of the transforms run with this Project.

write_json(folder: Path | None = None) → None#: Write a serialized Project to a JSON file.
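The four serialization methods (to_dict, from_dict, write_json, from_json) form a round trip. The sketch below reproduces that pattern on a deliberately tiny class. `DemoProject`, its three attributes, the `JSON_NAME` file name and the choice of the project folder as default target are all assumptions for illustration, not the real Project layout.

```python
import json
import tempfile
from pathlib import Path


class DemoProject:
    """Hypothetical, much reduced stand-in for a serializable project."""

    JSON_NAME = "demo_project.json"  # assumed file name, for this sketch only

    def __init__(self, folder: Path, depth: float = 0.0, timezone=None):
        self.folder = folder
        self.depth = depth
        self.timezone = timezone

    def to_dict(self) -> dict:
        # Paths are stored as strings so that the dictionary is json-friendly.
        return {
            "folder": str(self.folder),
            "depth": self.depth,
            "timezone": self.timezone,
        }

    @classmethod
    def from_dict(cls, dictionary: dict) -> "DemoProject":
        return cls(
            folder=Path(dictionary["folder"]),
            depth=dictionary["depth"],
            timezone=dictionary["timezone"],
        )

    def write_json(self, folder=None) -> None:
        # Assumption: default to the project's own folder when none is given.
        target = self.folder if folder is None else folder
        (target / self.JSON_NAME).write_text(json.dumps(self.to_dict(), indent=2))

    @classmethod
    def from_json(cls, file: Path) -> "DemoProject":
        return cls.from_dict(json.loads(file.read_text()))


with tempfile.TemporaryDirectory() as tmp:
    project = DemoProject(folder=Path(tmp), depth=12.5, timezone="UTC")
    project.write_json()
    restored = DemoProject.from_json(Path(tmp) / DemoProject.JSON_NAME)
    round_trip_ok = restored.to_dict() == project.to_dict()
```

Routing from_json through from_dict keeps a single place where the dictionary layout is interpreted.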
