SpectroDataset#
- class osekit.core_api.spectro_dataset.SpectroDataset(data: list[SpectroData], name: str | None = None, suffix: str = '', folder: Path | None = None, scale: Scale | None = None, v_lim: tuple[float, float] | None | object = <object object>)#
SpectroDataset is a collection of SpectroData objects, with methods that simplify repeated operations on the spectro data.
Initialize a SpectroDataset.
- property colormap: str#
Return the most frequent colormap of the spectro dataset.
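"Most frequent" can be read as the mode of the values carried by the individual SpectroData objects. The sketch below illustrates that idea with the standard library only. The helper name `most_frequent` is hypothetical and not part of OSEkit, and the tie-breaking rule (first value seen wins) is an assumption, not documented behavior.

```python
from collections import Counter


def most_frequent(values):
    """Return the value that occurs most often (first one seen wins ties)."""
    return Counter(values).most_common(1)[0][0]


# Illustrative per-item colormaps; the same idea applies to v_lim tuples.
colormaps = ["viridis", "magma", "viridis"]
dataset_colormap = most_frequent(colormaps)
print(dataset_colormap)
```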
- data_cls#
alias of SpectroData
- property fft: ShortTimeFFT#
Return the fft of the spectro data.
- property folder: Path#
Folder in which the dataset files are located.
- classmethod from_audio_dataset(audio_dataset: AudioDataset, fft: ShortTimeFFT, name: str | None = None, colormap: str | None = None, v_lim: tuple[float, float] | None = <object object>, scale: Scale | None = None) SpectroDataset#
Return a SpectroDataset object from an AudioDataset object.
The SpectroData is computed from the AudioData using the given fft.
- classmethod from_base_dataset(base_dataset: BaseDataset, fft: ShortTimeFFT, name: str | None = None, colormap: str | None = None, scale: Scale | None = None, v_lim: tuple[float, float] | None | object = <object object>) SpectroDataset#
Return a SpectroDataset object from a BaseDataset object.
- classmethod from_dict(dictionary: dict) SpectroDataset#
Deserialize a SpectroDataset from a dictionary.
Parameters#
- dictionary: dict
The serialized dictionary representing the SpectroDataset.
Returns#
- SpectroDataset
The deserialized SpectroDataset.
- classmethod from_folder(folder: Path, strptime_format: str, begin: Timestamp | None = None, end: Timestamp | None = None, timezone: str | pytz.timezone | None = None, mode: Literal['files', 'timedelta_total', 'timedelta_file'] = 'timedelta_total', overlap: float = 0.0, data_duration: Timedelta | None = None, name: str | None = None, v_lim: tuple[float, float] | None | object = <object object>, **kwargs: any) SpectroDataset#
Return a SpectroDataset from a folder containing the spectro files.
Parameters#
- folder: Path
The folder containing the spectro files.
- strptime_format: str
The strptime format of the timestamps in the spectro file names.
- begin: Timestamp | None
The start of the spectro dataset. Defaults to the start of the first file.
- end: Timestamp | None
The end of the spectro dataset. Defaults to the end of the last file.
- timezone: str | pytz.timezone | None
The timezone in which the file should be localized. If None, the file begin/end will be tz-naive. If different from a timezone parsed from the filename, the timestamps’ timezone will be converted from the parsed timezone to the specified timezone.
- mode: Literal[“files”, “timedelta_total”, “timedelta_file”]
How the dataset data are created from the original files. “files”: one data object is created per file. “timedelta_total”: data objects of duration data_duration are created from the begin timestamp to the end timestamp. “timedelta_file”: data objects of duration data_duration are created starting at the beginning of the first file that contains the begin timestamp, until a data object would begin between two files; the next data object then starts at the beginning of the next original file, and so on.
- overlap: float
Overlap percentage between consecutive data.
- data_duration: Timedelta | None
Duration of the spectro data objects. If mode is set to “files”, this parameter has no effect. If provided, spectro data will be evenly distributed between begin and end. Otherwise, one data object will cover the whole time period.
- name: str | None
Name of the dataset.
- v_lim: tuple[float, float] | None
Limits (in dB) of the colormap used for plotting the spectrogram.
- kwargs: any
Keyword arguments passed to the BaseDataset.from_folder classmethod.
Returns#
- SpectroDataset
The spectro dataset.
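The strptime_format, data_duration and overlap parameters above can be illustrated with a standard-library sketch of the “timedelta_total” mode. This is not OSEkit's implementation: the helper names are hypothetical, the file name is made up, overlap is assumed to be a fraction in [0, 1) of data_duration, and the last window is simply allowed to run past end.

```python
from datetime import datetime, timedelta


def parse_begin(file_name, strptime_format):
    """Parse the timestamp embedded in a file name (extension stripped)."""
    stem = file_name.rsplit(".", 1)[0]
    return datetime.strptime(stem, strptime_format)


def timedelta_total_windows(begin, end, data_duration, overlap=0.0):
    """Yield (start, stop) pairs of length data_duration from begin to end.

    Assumption: overlap is a fraction in [0, 1) of data_duration.
    """
    if not 0.0 <= overlap < 1.0:
        raise ValueError("overlap must be in [0, 1)")
    step = data_duration * (1.0 - overlap)
    start = begin
    while start < end:
        yield start, start + data_duration
        start += step


begin = parse_begin("2023_04_05_14_49_06.npz", "%Y_%m_%d_%H_%M_%S")
end = begin + timedelta(seconds=30)
windows = list(timedelta_total_windows(begin, end, timedelta(seconds=10)))
for start, stop in windows:
    print(start, stop)
```

With a non-zero overlap, consecutive windows start closer together, so more data objects cover the same period.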
- classmethod from_json(file: Path) SpectroDataset#
Deserialize a SpectroDataset from a JSON file.
Parameters#
- file: Path
Path to the serialized JSON file representing the SpectroDataset.
Returns#
- SpectroDataset
The deserialized SpectroDataset.
- get_welch(first: int = 0, last: int | None = None, nperseg: int | None = None, detrend: str | callable | False = 'constant', scaling: Literal['density', 'spectrum'] = 'density', average: Literal['mean', 'median'] = 'mean', *, return_onesided: bool = True) DataFrame#
Return the welch values of the SpectroDataset.
Each SpectroData of the SpectroDataset represents one column of the welch values.
Parameters#
- first: int
Index of the first SpectroData object to include in the welch.
- last: int | None
Index after the last SpectroData object to include in the welch.
- nperseg: int | None
Length of each segment. Defaults to None, but if window is str or tuple, is set to 256, and if window is array_like, is set to the length of the window.
- detrend: str | callable | False
Specifies how to detrend each segment. If detrend is a string, it is passed as the type argument to the detrend function. If it is a function, it takes a segment and returns a detrended segment. If detrend is False, no detrending is done. Defaults to ‘constant’.
- scaling: Literal[“density”, “spectrum”]
Selects between computing the power spectral density (‘density’) where Pxx has units of V**2/Hz and computing the squared magnitude spectrum (‘spectrum’) where Pxx has units of V**2, if x is measured in V and fs is measured in Hz. Defaults to ‘density’.
- average: Literal[“mean”, “median”]
Method to use when averaging periodograms. Defaults to ‘mean’.
- return_onesided: bool
If True, return a one-sided spectrum for real data. If False, return a two-sided spectrum. Defaults to True, but for complex data, a two-sided spectrum is always returned.
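The parameters above mirror those of scipy.signal.welch. As a rough illustration of what they control, the sketch below calls scipy.signal.welch directly on a synthetic tone. The signal, sample rate and segment length are illustrative assumptions, and whether get_welch delegates to this exact function is also an assumption.

```python
import numpy as np
from scipy.signal import welch

fs = 8_000                         # sample rate in Hz, illustrative
t = np.arange(fs) / fs             # one second of samples
x = np.sin(2 * np.pi * 1_000 * t)  # synthetic 1 kHz tone

freqs, pxx = welch(
    x,
    fs=fs,
    nperseg=256,           # length of each segment
    detrend="constant",    # remove the mean of each segment
    scaling="density",     # power spectral density, V**2/Hz
    average="mean",        # average the periodograms with the mean
    return_onesided=True,  # real input: keep non-negative frequencies only
)

peak_frequency = freqs[np.argmax(pxx)]
print(len(freqs), peak_frequency)
```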
- link_audio_dataset(audio_dataset: AudioDataset, first: int = 0, last: int | None = None) None#
Link the SpectroData of the SpectroDataset to the AudioData of the AudioDataset.
Parameters#
- audio_dataset: AudioDataset
The AudioDataset whose data will be linked to the SpectroDataset data.
- save_all(matrix_folder: Path, spectrogram_folder: Path, link: bool = False, first: int = 0, last: int | None = None) None#
Export both the Sx matrices (as npz files) and the spectrograms (as png images) for each data object.
Parameters#
- matrix_folder: Path
Path to the folder in which the Sx matrices npz files will be saved.
- spectrogram_folder: Path
Path to the folder in which the spectrograms png files will be saved.
- link: bool
If True, the SpectroData will be bound to the written npz file. Its items will be replaced with a single item, which will match the whole new SpectroFile.
- first: int
Index of the first SpectroData object to export.
- last: int | None
Index after the last SpectroData object to export.
- save_spectrogram(folder: Path, first: int = 0, last: int | None = None) None#
Export all spectrogram data as png images in the specified folder.
Parameters#
- folder: Path
Folder in which the spectrograms should be saved.
- first: int
Index of the first SpectroData object to export.
- last: int | None
Index after the last SpectroData object to export.
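The wording "index after the last" suggests a half-open range, as with Python slices: first is included and last is excluded. A minimal sketch under that assumption, where the `select` helper is hypothetical and not part of OSEkit:

```python
def select(items, first=0, last=None):
    """Return items[first:last]; last=None means 'up to the end'."""
    return items[first:last]


data = ["sd0", "sd1", "sd2", "sd3", "sd4"]
print(select(data, first=1, last=3))  # second and third items only
print(select(data))                   # defaults select every item
```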
- to_dict() dict#
Serialize a SpectroDataset to a dictionary.
Returns#
- dict:
The serialized dictionary representing the SpectroDataset.
- update_json_audio_data(first: int, last: int) None#
Update the serialized JSON file with the spectro data from first to last.
The update is done while using the locked decorator. That way, if a SpectroDataset is processed through multiple jobs, each one can update the JSON file safely.
Parameters#
- first: int
Index of the first data to update.
- last: int
Index of the last data to update.
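The idea of several jobs safely updating one JSON file can be sketched with an exclusive lock file. This is an illustration of the general technique only: the `locked` decorator and `update_json` helper below are hypothetical stand-ins, not OSEkit's implementation, and the file names are made up.

```python
import functools
import json
import os
import tempfile
import time
from pathlib import Path


def locked(lock_file):
    """Decorator: run the wrapped function while holding an exclusive lock file."""
    def decorator(func):
        @functools.wraps(func)
        def wrapper(*args, **kwargs):
            while True:
                try:
                    # O_EXCL makes the creation fail if the lock file already exists.
                    fd = os.open(lock_file, os.O_CREAT | os.O_EXCL | os.O_WRONLY)
                    break
                except FileExistsError:
                    time.sleep(0.05)  # another job holds the lock; retry shortly
            try:
                return func(*args, **kwargs)
            finally:
                os.close(fd)
                os.remove(lock_file)  # release the lock
        return wrapper
    return decorator


def update_json(json_file, updates):
    """Merge `updates` into the JSON object stored in json_file."""
    path = Path(json_file)
    content = json.loads(path.read_text()) if path.exists() else {}
    content.update(updates)
    path.write_text(json.dumps(content))


with tempfile.TemporaryDirectory() as tmp:
    json_file = Path(tmp) / "dataset.json"
    safe_update = locked(Path(tmp) / "dataset.lock")(update_json)
    safe_update(json_file, {"data_0": "done"})  # e.g. written by job 1
    safe_update(json_file, {"data_1": "done"})  # e.g. written by job 2
    result = json.loads(json_file.read_text())
print(result)
```

Because each update reads, modifies and writes the file while holding the lock, two jobs cannot overwrite each other's entries.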
- property v_lim: tuple[float, float] | None#
Return the most frequent v_lim of the spectro dataset.
- write_welch(folder: Path, first: int = 0, last: int | None = None, nperseg: int | None = None, detrend: str | callable | False = 'constant', scaling: Literal['density', 'spectrum'] = 'density', average: Literal['mean', 'median'] = 'mean', pxs: DataFrame | None = None, *, return_onesided: bool = True) None#
Write the welch values of the SpectroDataset to an npz file.
Parameters#
- folder: Path
The folder in which the NPZ file will be written.
- first: int
Index of the first SpectroData object to include in the welch.
- last: int | None
Index after the last SpectroData object to include in the welch.
- nperseg: int | None
Length of each segment. Defaults to None, but if window is str or tuple, is set to 256, and if window is array_like, is set to the length of the window.
- detrend: str | callable | False
Specifies how to detrend each segment. If detrend is a string, it is passed as the type argument to the detrend function. If it is a function, it takes a segment and returns a detrended segment. If detrend is False, no detrending is done. Defaults to ‘constant’.
- scaling: Literal[“density”, “spectrum”]
Selects between computing the power spectral density (‘density’) where Pxx has units of V**2/Hz and computing the squared magnitude spectrum (‘spectrum’) where Pxx has units of V**2, if x is measured in V and fs is measured in Hz. Defaults to ‘density’.
- average: Literal[“mean”, “median”]
Method to use when averaging periodograms. Defaults to ‘mean’.
- pxs: DataFrame | None
Welch values as returned by SpectroDataset.get_welch(). If provided, the computation will be skipped and the provided pxs values will be written to disk.
- return_onesided: bool
If True, return a one-sided spectrum for real data. If False, return a two-sided spectrum. Defaults to True, but for complex data, a two-sided spectrum is always returned.