emonet.utils module#

Miscellaneous helper functions that would probably fit better elsewhere in the package.

emonet.utils.play_audio(waveform: torch.Tensor, sample_rate: int)[source]#

Play an audio signal within an IPython notebook.

Parameters
  • waveform (torch.Tensor) – Audio signal.

  • sample_rate (int) – Audio signal sample rate.

Returns

None
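
A minimal usage sketch for a notebook cell; the audio file path here is hypothetical:

>>> import torchaudio
>>> from emonet.utils import play_audio
>>> waveform, sample_rate = torchaudio.load("speech.wav")  # hypothetical file
>>> play_audio(waveform, sample_rate)  # renders IPython's audio player in the cell output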

emonet.utils.print_stats(waveform, sample_rate=None, src=None)[source]#
emonet.utils.get_metadata(file: pathlib.Path)[source]#
emonet.utils.async_file_operation(files: List, func: Callable, max_workers: int = 50)[source]#

Asynchronously apply a function to a list of files.

Parameters
  • files (List) – Files to apply function to.

  • func (Callable) – Function to apply.

  • max_workers (int) – Maximum number of concurrent threads.

Returns

None
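
A hedged sketch that fans a (hypothetical) per-file task out over a smaller thread pool:

>>> from pathlib import Path
>>> from emonet.utils import async_file_operation
>>> files = list(Path("data/raw").glob("*.wav"))  # hypothetical audio directory
>>> def touch_stats(path):                        # illustrative per-file function
...     _ = path.stat()
>>> async_file_operation(files, touch_stats, max_workers=8)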

emonet.utils.get_rating_encoder(ratings: List[str]) speechbrain.dataio.encoder.CategoricalEncoder[source]#

Get a categorical encoder for emotion severity labels.

Parameters

ratings (List[str]) – List containing all possible categorical ratings.

Returns

CategoricalEncoder – A fitted SpeechBrain CategoricalEncoder object.
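
Sketch of typical use; encode_label is the standard speechbrain CategoricalEncoder method for mapping a label string to its integer index:

>>> from emonet.utils import get_rating_encoder
>>> encoder = get_rating_encoder(["None", "Low", "Med", "High"])
>>> encoder.encode_label("High")  # returns the integer index assigned to "High"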

emonet.utils.binarize_ratings(ratings: Union[int, List, torch.Tensor]) Union[int, torch.Tensor][source]#

Convert emotion severity levels to binary.

Assigns None and Low (0 and 1, respectively) to 0; assigns Med and High (2 and 3, respectively) to 1.

Parameters

ratings (Union[int, List, torch.Tensor]) – Rating(s) to convert; expects {0, 1, 2, 3}.

Returns

Union[int, torch.Tensor] – Binarized label(s).
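
The documented mapping, sketched on a scalar and on a tensor of ratings (exact return dtypes follow the signature above):

>>> import torch
>>> from emonet.utils import binarize_ratings
>>> binarize_ratings(1)                            # None/Low map to 0
>>> binarize_ratings(torch.tensor([0, 1, 2, 3]))   # expected ~ tensor([0, 0, 1, 1])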

emonet.utils.binarize_labels(labels: Dict) Dict[source]#

Convert all labels within a dictionary to binary.

Parameters

labels (Dict) – A dictionary mapping each file or sample to its emotion severity label.

Returns

Dict – The original labels dictionary with binarized labels.
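
A short sketch with a hypothetical label dictionary:

>>> from emonet.utils import binarize_labels
>>> labels = {"utt_001.wav": 3, "utt_002.wav": 1}  # hypothetical file-to-rating mapping
>>> binarize_labels(labels)                        # expected ~ {"utt_001.wav": 1, "utt_002.wav": 0}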

emonet.utils.weights_to_binary(weights)[source]#
emonet.utils.get_sample(file: pathlib.Path, sample_rate: int) Tuple[torch.Tensor, int][source]#

Read an audio signal from a file and resample it, if necessary.

Parameters
  • file (pathlib.Path) – Path to audio file.

  • sample_rate (int) – Desired sample rate.

Returns

Tuple[torch.Tensor, int] – Audio signal (potentially resampled) and output sample rate.
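
Typical use when a fixed rate is required downstream; the file path is hypothetical:

>>> from pathlib import Path
>>> from emonet.utils import get_sample
>>> wav, sr = get_sample(Path("clips/utt_001.wav"), sample_rate=16000)
>>> # wav is resampled to 16 kHz if the file's native rate differs; sr is the output rate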

emonet.utils.get_random_segment(wav: torch.Tensor, seconds: int = 7, sample_rate: int = 16000) torch.Tensor[source]#

Get a random n-second sample from an audio signal.

Parameters
  • wav (torch.Tensor) – Original audio signal.

  • seconds (int) – Desired duration of random sample.

  • sample_rate (int) – Original audio sampling rate.

Returns

torch.Tensor – A random segment of the original audio signal.
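
A sketch cropping a random 7-second window from a dummy waveform (channel handling assumed to follow the (C, T) torchaudio convention):

>>> import torch
>>> from emonet.utils import get_random_segment
>>> wav = torch.randn(1, 10 * 16000)  # 10 s of dummy audio at 16 kHz
>>> segment = get_random_segment(wav, seconds=7, sample_rate=16000)
>>> segment.shape[-1]                 # expected ~ 7 * 16000 samples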

emonet.utils.channel1to3(t: torch.Tensor) torch.Tensor[source]#

Convert a single-channel spectrogram to three channels.

Parameters

t (torch.Tensor) – 3-d spectrogram (C, H, W)

Returns

torch.Tensor – A 3-channel version of the original spectrogram, where the new channels are copies of the original.
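
Illustration on a dummy (1, H, W) spectrogram, e.g. before feeding a model pretrained on 3-channel images:

>>> import torch
>>> from emonet.utils import channel1to3
>>> spec = torch.randn(1, 128, 400)  # (C, H, W) single-channel spectrogram
>>> channel1to3(spec).shape          # expected ~ torch.Size([3, 128, 400])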

emonet.utils.ohe_labels(labels, n_classes)[source]#
emonet.utils.decode_ohe(labels, as_dict=False)[source]#
emonet.utils.from_json(filepath: pathlib.Path) Dict[source]#

Read metadata from a JSON file.

emonet.utils.to_json(meta: Union[Dict, List], filepath: pathlib.Path)[source]#

Write metadata to a JSON file.
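
A minimal round trip; the metadata and path are hypothetical:

>>> from pathlib import Path
>>> from emonet.utils import to_json, from_json
>>> meta = {"utt_001.wav": {"rating": "High"}}  # hypothetical metadata
>>> to_json(meta, Path("meta.json"))
>>> from_json(Path("meta.json"))                # expected to return the same dictionary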