AudioDataset
- class osekit.core_api.audio_dataset.AudioDataset(data: list[AudioData], name: str | None = None, suffix: str = '', folder: Path | None = None, instrument: Instrument | None = None)
AudioDataset is a collection of AudioData objects, with methods that simplify repeated operations on the audio data.
Initialize an AudioDataset.
- classmethod from_base_dataset(base_dataset: BaseDataset, sample_rate: float | None = None, name: str | None = None, instrument: Instrument | None = None) -> AudioDataset
Return an AudioDataset object from a BaseDataset object.
- classmethod from_dict(dictionary: dict) -> AudioDataset
Deserialize an AudioDataset from a dictionary.
Parameters
- dictionary: dict
The serialized dictionary representing the AudioDataset.
Returns
- AudioDataset
The deserialized AudioDataset.
- classmethod from_files(files: list[AudioFile], begin: Timestamp | None = None, end: Timestamp | None = None, mode: Literal['files', 'timedelta_total', 'timedelta_file'] = 'timedelta_total', data_duration: Timedelta | None = None, name: str | None = None, instrument: Instrument | None = None) -> AudioDataset
Return an AudioDataset object from a list of AudioFiles.
Parameters
- files: list[AudioFile]
The list of files contained in the Dataset.
- begin: Timestamp | None
Begin of the first data object. Defaults to the begin of the first file.
- end: Timestamp | None
End of the last data object. Defaults to the end of the last file.
- mode: Literal['files', 'timedelta_total', 'timedelta_file']
Mode used to create the dataset's data objects from the original files. 'files': one data object is created for each file. 'timedelta_total': data objects of duration data_duration are created from the begin timestamp to the end timestamp. 'timedelta_file': data objects of duration data_duration are created starting from the beginning of the first file that contains the begin timestamp, until the next data object would begin in a gap between two files. The next data object is then created from the beginning of the next original file, and so on.
- data_duration: Timedelta | None
Duration of the data objects. If mode is set to 'files', this parameter has no effect. If provided, data will be evenly distributed between begin and end. Otherwise, one data object will cover the whole time period.
- name: str|None
Name of the dataset.
- instrument: Instrument | None
Instrument that might be used to obtain acoustic pressure from the wav audio data.
Returns
- AudioDataset
The AudioDataset object.
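The three mode values can be illustrated with a minimal sketch that uses only the standard library. It is not the OSEkit implementation: the helper names (split_files, split_timedelta_total, split_timedelta_file) and the (begin, end) tuples standing in for files are hypothetical, and letting the last chunk run past end is an assumption made here for simplicity.

```python
from datetime import datetime, timedelta

# Hypothetical stand-in for a file: a (begin, end) pair of timestamps.
Span = tuple[datetime, datetime]


def split_files(files: list[Span]) -> list[Span]:
    """'files': one data object per file."""
    return sorted(files)


def split_timedelta_total(begin: datetime, end: datetime, duration: timedelta) -> list[Span]:
    """'timedelta_total': consecutive chunks from begin to end, ignoring file edges."""
    chunks = []
    start = begin
    while start < end:
        # Assumption of this sketch: the last chunk keeps its full duration.
        chunks.append((start, start + duration))
        start += duration
    return chunks


def split_timedelta_file(files: list[Span], begin: datetime, end: datetime, duration: timedelta) -> list[Span]:
    """'timedelta_file': like 'timedelta_total', but restart at each file after a gap."""
    files = sorted(files)
    # Start at the beginning of the file that contains `begin`, if any.
    start = next((fb for fb, fe in files if fb <= begin < fe), begin)
    chunks = []
    while start < end:
        chunks.append((start, start + duration))
        start += duration
        if not any(fb <= start < fe for fb, fe in files):
            # The next chunk would begin in a gap: jump to the next file.
            later = [fb for fb, _ in files if fb > start]
            if not later:
                break
            start = later[0]
    return chunks


def minutes(m: int) -> timedelta:
    return timedelta(minutes=m)


t0 = datetime(2024, 1, 1)
# Two 10-minute files separated by a 5-minute gap.
files = [(t0, t0 + minutes(10)), (t0 + minutes(15), t0 + minutes(25))]

total = split_timedelta_total(t0, t0 + minutes(25), minutes(4))
per_file = split_timedelta_file(files, t0, t0 + minutes(25), minutes(4))

# Chunk start offsets, in minutes after t0.
print([int((b - t0).total_seconds() // 60) for b, _ in total])     # [0, 4, 8, 12, 16, 20, 24]
print([int((b - t0).total_seconds() // 60) for b, _ in per_file])  # [0, 4, 8, 15, 19, 23]
```

In this example, the chunk that would start at minute 12 falls in the gap between the two files, so the 'timedelta_file' sketch restarts at minute 15, the beginning of the second file.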
- classmethod from_folder(folder: Path, strptime_format: str, begin: Timestamp | None = None, end: Timestamp | None = None, timezone: str | pytz.timezone | None = None, mode: Literal['files', 'timedelta_total', 'timedelta_file'] = 'timedelta_total', data_duration: Timedelta | None = None, name: str | None = None, instrument: Instrument | None = None, **kwargs: any) -> AudioDataset
Return an AudioDataset from a folder containing the audio files.
Parameters
- folder: Path
The folder containing the audio files.
- strptime_format: str
The strptime format of the timestamps in the audio file names.
- begin: Timestamp | None
The begin of the audio dataset. Defaults to the begin of the first file.
- end: Timestamp | None
The end of the audio dataset. Defaults to the end of the last file.
- timezone: str | pytz.timezone | None
The timezone in which the file should be localized. If None, the file begin/end will be tz-naive. If different from a timezone parsed from the filename, the timestamps' timezone will be converted from the parsed timezone to the specified timezone.
- mode: Literal['files', 'timedelta_total', 'timedelta_file']
Mode used to create the dataset's data objects from the original files. 'files': one data object is created for each file. 'timedelta_total': data objects of duration data_duration are created from the begin timestamp to the end timestamp. 'timedelta_file': data objects of duration data_duration are created starting from the beginning of the first file that contains the begin timestamp, until the next data object would begin in a gap between two files. The next data object is then created from the beginning of the next original file, and so on.
- data_duration: Timedelta | None
Duration of the audio data objects. If mode is set to 'files', this parameter has no effect. If provided, audio data will be evenly distributed between begin and end. Otherwise, one data object will cover the whole time period.
- name: str|None
Name of the dataset.
- instrument: Instrument | None
Instrument that might be used to obtain acoustic pressure from the wav audio data.
- kwargs: any
Keyword arguments passed to the BaseDataset.from_folder classmethod.
Returns
- AudioDataset
The audio dataset.
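The interplay between strptime_format and timezone described above can be sketched with the standard library alone. This is an illustration under stated assumptions, not the OSEkit code: timestamp_from_name is a hypothetical helper, the file stem is assumed to consist of the timestamp only, and fixed-offset datetime.timezone objects stand in for the str or pytz timezones that the real parameter accepts.

```python
from datetime import datetime, timezone
from pathlib import Path
from typing import Optional


def timestamp_from_name(path: Path, strptime_format: str, tz: Optional[timezone] = None) -> datetime:
    """Parse the begin timestamp of a file from its name (hypothetical helper)."""
    # Assumption: the file stem contains nothing but the timestamp.
    parsed = datetime.strptime(path.stem, strptime_format)
    if tz is None:
        # No timezone requested: keep the parsed value as-is
        # (tz-naive when the name carries no UTC offset).
        return parsed
    if parsed.tzinfo is None:
        # Naive timestamp: localize it in the requested timezone.
        return parsed.replace(tzinfo=tz)
    # A timezone was parsed from the name: convert to the requested one.
    return parsed.astimezone(tz)


naive = timestamp_from_name(Path("2023_04_05_14_49_06.wav"), "%Y_%m_%d_%H_%M_%S")
localized = timestamp_from_name(Path("2023_04_05_14_49_06.wav"), "%Y_%m_%d_%H_%M_%S", tz=timezone.utc)
converted = timestamp_from_name(Path("2023_04_05_14_49_06+0200.wav"), "%Y_%m_%d_%H_%M_%S%z", tz=timezone.utc)

print(naive.isoformat())      # 2023-04-05T14:49:06
print(localized.isoformat())  # 2023-04-05T14:49:06+00:00
print(converted.isoformat())  # 2023-04-05T12:49:06+00:00
```

The third call shows the conversion case: the name carries a +02:00 offset, so requesting UTC shifts the wall-clock time by two hours while denoting the same instant.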
- classmethod from_json(file: Path) -> AudioDataset
Deserialize an AudioDataset from a JSON file.
Parameters
- file: Path
Path to the serialized JSON file representing the AudioDataset.
Returns
- AudioDataset
The deserialized AudioDataset.
- property instrument: Instrument | None
Instrument that can be used to get acoustic pressure from wav audio data.
- property sample_rate: set[float] | float
Return the most frequent sample rate among the data objects of this dataset.
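Picking the most frequent value among several sample rates can be sketched with collections.Counter. The helper name most_frequent_sample_rate and the plain list of rates are hypothetical, and the tie-breaking shown (first value encountered wins) is a property of Counter, not a documented behaviour of the property. The documented return annotation also allows a set, which this sketch does not cover.

```python
from collections import Counter


def most_frequent_sample_rate(sample_rates: list[float]) -> float:
    """Return the sample rate that occurs most often (hypothetical helper)."""
    # most_common(1) returns a one-element list of (value, count) pairs.
    return Counter(sample_rates).most_common(1)[0][0]


# Sample rates of five hypothetical data objects.
rates = [48_000, 48_000, 96_000, 48_000, 96_000]
dominant = most_frequent_sample_rate(rates)
print(dominant)  # 48000
```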
- write(folder: Path, subtype: str | None = None, link: bool = False, first: int = 0, last: int | None = None) -> None
Write all data objects in the specified folder.
Parameters
- folder: Path
Folder in which to write the data.
- subtype: str | None
Subtype as provided by the soundfile module. Defaults to the default 16-bit PCM for WAV audio files.
- link: bool
If True, each AudioData will be bound to the corresponding written file. Their items will be replaced with a single item, which will match the whole new AudioFile.
- first: int
Index of the first AudioData object to write.
- last: int | None
Index after the last AudioData object to write.
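Since last is described as the index after the last object to write, first and last appear to form a half-open range, like a Python slice. A minimal sketch of that selection follows. The helper name select_for_write is hypothetical, and treating last=None as "up to the end of the dataset" is an assumption.

```python
from typing import Optional


def select_for_write(data: list, first: int = 0, last: Optional[int] = None) -> list:
    """Return the items in the half-open index range [first, last) (hypothetical helper)."""
    # Assumption: last=None means "up to and including the final item".
    last = len(data) if last is None else last
    return data[first:last]


# Placeholder strings standing in for AudioData objects.
data = ["d0", "d1", "d2", "d3", "d4"]

print(select_for_write(data, first=1, last=3))  # ['d1', 'd2']
print(select_for_write(data, first=3))          # ['d3', 'd4']
```

With this convention, writing a dataset in batches is a matter of passing the previous call's last as the next call's first, with no item written twice.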