Reshaping one file [1]

This simple example is covered with the Core API only.

First, create an OSEkit AudioFile from the audio file on disk. In this example, we use the file name to parse the begin timestamp. It is also possible to specify a begin timestamp by providing a pandas.Timestamp to the begin parameter of the constructor.

from pathlib import Path
from osekit.core_api.audio_file import AudioFile

audio_file = AudioFile(
    path=Path(r"_static/sample_audio/timestamped/sample_220925_223450.wav"),
    strptime_format="%y%m%d_%H%M%S",
)
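
To make the role of strptime_format concrete, here is a minimal sketch of how a begin timestamp could be recovered from a file name using only the standard library. It is not OSEkit's actual parsing code: the helper parse_begin_from_name and the _DIRECTIVE_PATTERNS mapping are hypothetical names introduced for illustration, and only the six directives used in this example are supported.

```python
import re
from datetime import datetime
from pathlib import Path

# Regex equivalents of the strptime directives used in this example.
# Hypothetical helper table, not part of OSEkit.
_DIRECTIVE_PATTERNS = {
    "%y": r"\d{2}",
    "%m": r"\d{2}",
    "%d": r"\d{2}",
    "%H": r"\d{2}",
    "%M": r"\d{2}",
    "%S": r"\d{2}",
}


def parse_begin_from_name(path: Path, strptime_format: str) -> datetime:
    """Find the part of the file name matching the format and parse it."""
    # Split the format into directives ("%y") and literal text ("_").
    parts = re.split(r"(%[a-zA-Z])", strptime_format)
    # Known directives become digit patterns; everything else is matched literally.
    pattern = "".join(
        _DIRECTIVE_PATTERNS[part] if part in _DIRECTIVE_PATTERNS else re.escape(part)
        for part in parts
    )
    match = re.search(pattern, path.name)
    if match is None:
        raise ValueError(f"No timestamp matching {strptime_format!r} in {path.name!r}")
    return datetime.strptime(match.group(), strptime_format)


begin = parse_begin_from_name(
    Path("sample_220925_223450.wav"),
    "%y%m%d_%H%M%S",
)
print(begin)  # 2022-09-25 22:34:50
```

The text around the timestamp (here the "sample_" prefix and the ".wav" extension) is ignored, which is why the format string only needs to describe the timestamp itself.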

The AudioFile object holds information about the audio file:

print(f"{' FILE ':#^60}")
print(f"{'Begin:':<30}{str(audio_file.begin):>30}")
print(f"{'End:':<30}{str(audio_file.end):>30}")
print(f"{'Sample rate:':<30}{str(audio_file.sample_rate):>30}")

########################### FILE ###########################
Begin:                                   2022-09-25 22:34:50
End:                                     2022-09-25 22:35:00
Sample rate:                                           48000
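
The end timestamp follows from the begin timestamp, the number of frames, and the sample rate. The sketch below reproduces that arithmetic with the standard library only. The frame count of 480_000 is an assumption inferred from the 10 seconds and 48 kHz shown above, not a value read from the file.

```python
from datetime import datetime, timedelta

begin = datetime(2022, 9, 25, 22, 34, 50)
sample_rate = 48_000
n_frames = 480_000  # assumed: 10 s of audio at 48 kHz

# Duration in seconds is the number of frames divided by the sample rate.
end = begin + timedelta(seconds=n_frames / sample_rate)
print(end)  # 2022-09-25 22:35:00
```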

Create an OSEkit AudioData, which represents a span of audio data that may be spread over one or more files:

from osekit.core_api.audio_data import AudioData
from pandas import Timestamp

audio_data = AudioData.from_files(
    files=[audio_file],
    begin=Timestamp("2022-09-25 22:34:52"),
    end=Timestamp("2022-09-25 22:34:56"),
)
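
As an illustration of what this time window means in terms of the source file, the following sketch converts the begin and end timestamps into frame indices at the file's native sample rate. The frame_range helper is hypothetical and only shows the arithmetic; how OSEkit locates the frames internally may differ.

```python
from datetime import datetime


def frame_range(
    file_begin: datetime, sample_rate: int, begin: datetime, end: datetime
) -> tuple[int, int]:
    """Return the (start, stop) frame indices of [begin, end) within the file."""
    start = round((begin - file_begin).total_seconds() * sample_rate)
    stop = round((end - file_begin).total_seconds() * sample_rate)
    return start, stop


start_frame, stop_frame = frame_range(
    file_begin=datetime(2022, 9, 25, 22, 34, 50),
    sample_rate=48_000,
    begin=datetime(2022, 9, 25, 22, 34, 52),
    end=datetime(2022, 9, 25, 22, 34, 56),
)
print(start_frame, stop_frame, stop_frame - start_frame)  # 96000 288000 192000
```

The window starts 2 seconds into the file and spans 4 seconds, that is 192,000 frames at the file's 48 kHz sample rate.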

Simply resample and normalize the AudioData by setting the corresponding properties:

from osekit.utils.audio_utils import Normalization

audio_data.sample_rate = 24_000
audio_data.normalization = Normalization.DC_REJECT  # Removes the DC component
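
For intuition about DC rejection, here is a minimal pure-Python sketch, assuming that removing the DC component amounts to subtracting the signal's mean from every sample. The dc_reject function is a hypothetical illustration and not OSEkit's implementation.

```python
from statistics import fmean


def dc_reject(samples: list[float]) -> list[float]:
    """Subtract the mean so that the signal is centered on zero."""
    mean = fmean(samples)
    return [sample - mean for sample in samples]


signal = [0.5, 1.5, 0.5, 1.5]  # oscillates around a DC offset of 1.0
centered = dc_reject(signal)
print(centered)  # [-0.5, 0.5, -0.5, 0.5]
```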

The AudioData only covers the requested 4 seconds of audio, now sampled at 24 kHz:

print(f"{' AUDIO DATA ':#^60}")
print(f"{'Begin:':<30}{str(audio_data.begin):>30}")
print(f"{'End:':<30}{str(audio_data.end):>30}")
print(f"{'Sample rate:':<30}{str(audio_data.sample_rate):>30}")

######################## AUDIO DATA ########################
Begin:                                   2022-09-25 22:34:52
End:                                     2022-09-25 22:34:56
Sample rate:                                           24000

The WAV data can be read (it is resampled at read time):

wav_data = audio_data.get_value()

duration_s = int(audio_data.duration.total_seconds())
expected_length = duration_s * audio_data.sample_rate

print(
    f"WAV data should be {duration_s}*{audio_data.sample_rate:_} samples long: "
    f"{len(wav_data):_} == {expected_length:_} samples"
)

WAV data should be 4*24_000 samples long: 96_000 == 96_000 samples

And/or written to disk:

audio_data.write(
    Path(r"../docs/source/_static/sample_audio/timestamped/exported_files/")
)