src.algorithms

Module contents

Module contents are imported from nussl.

class src.algorithms.FT2D(input_audio_signal, high_pass_cutoff=100.0, neighborhood_size=(1, 25), do_mono=False, use_librosa_stft=False, quadrants_to_keep=(0, 1, 2, 3), use_background_fourier_transform=True, mask_alpha=1.0, mask_type='soft', filter_approach='local_std')

Bases: nussl.separation.mask_separation_base.MaskSeparationBase

Implements foreground/background separation using the 2D Fourier Transform

Parameters

input_audio_signal – (AudioSignal object) The AudioSignal object that has the audio data that FT2D will be run on.
high_pass_cutoff – (Optional) (float) value (in Hz) for the high pass cutoff filter.
do_mono – (Optional) (bool) Flattens AudioSignal to mono before running the algorithm (does not affect the input AudioSignal object)
use_librosa_stft – (Optional) (bool) Calls librosa’s stft function instead of nussl’s
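
The docstring does not describe how the 2D Fourier transform is used to separate the sources. As a rough illustration only, the sketch below shows neighborhood-based peak picking on a small grid of magnitudes, which is the general kind of operation a method named filter_local_maxima could perform. The helper name local_maxima_mask, the toy values, and the reading of neighborhood_size as a (rows, columns) window are all assumptions; this is not nussl's implementation.

```python
def local_maxima_mask(grid, neighborhood_size=(1, 3)):
    """Return a 0/1 grid marking cells equal to the maximum of their window.

    Hypothetical helper: the window is centered on each cell and clipped
    at the edges of the grid.
    """
    n_rows, n_cols = len(grid), len(grid[0])
    half_r = neighborhood_size[0] // 2
    half_c = neighborhood_size[1] // 2
    mask = [[0] * n_cols for _ in range(n_rows)]
    for r in range(n_rows):
        for c in range(n_cols):
            window = [
                grid[i][j]
                for i in range(max(0, r - half_r), min(n_rows, r + half_r + 1))
                for j in range(max(0, c - half_c), min(n_cols, c + half_c + 1))
            ]
            if grid[r][c] == max(window):
                mask[r][c] = 1
    return mask


# Toy magnitudes standing in for a 2D Fourier transform (assumed values).
magnitudes = [
    [1.0, 5.0, 2.0, 0.5, 3.0],
    [4.0, 1.0, 1.5, 6.0, 2.0],
]
peaks = local_maxima_mask(magnitudes, neighborhood_size=(1, 3))
```

With a window one row tall, each row is scanned independently, so a cell is marked only when it is the largest value among its immediate horizontal neighbors.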

compute_ft2d_mask(ft2d, ch)

filter_local_maxima(ft2d)

filter_local_maxima_with_std(ft2d)

filter_quadrants(data)

make_audio_signals()

Returns the background and foreground audio signals. You must have run FT2D.run() prior to calling this function. This function will return None if run() has not been called.

Returns

2 element list.

bkgd: Audio signal with the calculated background track

fkgd: Audio signal with the calculated foreground track

Return type

Audio Signals (List)

Example

run()

Returns: An AudioSignal object with repeating background in background.audio_data (to get the corresponding non-repeating foreground, run self.make_audio_signals()).
Return type: background (AudioSignal)

Example

class src.algorithms.DeepClustering(input_audio_signal, model_path, metadata=None, extra_modules=None, use_cuda=False, **kwargs)

Bases: nussl.separation.clustering.clustering_separation_base.ClusteringSeparationBase, nussl.separation.deep_mixin.DeepMixin

extract_features()

make_audio_signals()

Applies each mask in self.masks and returns a list of audio_signal objects, one for each source.

Returns: An array of audio_signal objects containing each separated source
Return type: self.sources (np.array)

postprocess(assignments, confidence)

project_data(data)

set_audio_signal(new_audio_signal)

class src.algorithms.DeepMaskEstimation(input_audio_signal, model_path, extra_modules=None, mask_type='soft', use_librosa_stft=False, use_cuda=True)

Bases: nussl.separation.mask_separation_base.MaskSeparationBase, nussl.separation.deep_mixin.DeepMixin

Implements deep source separation models using PyTorch

apply_mask(mask): Applies an individual mask and returns an audio_signal object.

make_audio_signals()

Applies each mask in self.masks and returns a list of audio_signal objects, one for each source.

Returns: An array of audio_signal objects containing each separated source
Return type: self.sources (np.array)
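
Applying a mask usually means multiplying the mixture's complex spectrogram by the mask, element by element. The sketch below shows that idea on nested lists; the function body, the toy spectrogram, and the mask values are assumptions made for illustration, and the real methods return audio_signal objects rather than lists.

```python
def apply_mask(mixture_stft, mask):
    """Multiply a complex spectrogram by a real-valued mask, element by element."""
    return [
        [value * weight for value, weight in zip(stft_row, mask_row)]
        for stft_row, mask_row in zip(mixture_stft, mask)
    ]


# Toy 2x2 "spectrogram" (frequency bins x time frames) and two soft masks
# that sum to one in every cell. All values are hypothetical.
mixture_stft = [[1 + 1j, 2 + 0j], [0 + 4j, 3 - 3j]]
masks = [
    [[1.0, 0.25], [0.5, 0.0]],
    [[0.0, 0.75], [0.5, 1.0]],
]

# One estimate per mask, mirroring "applies each mask in self.masks".
sources = [apply_mask(mixture_stft, mask) for mask in masks]
```

Because the two soft masks here sum to one in every cell, the two estimates add back up to the original mixture.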

run(): Returns:

class src.algorithms.SeparationBase(input_audio_signal)

Bases: object

Base class for all separation algorithms in nussl.

Do not call this. It will not do anything.

Parameters: input_audio_signal (audio_signal.AudioSignal) – This will always make a copy of the provided AudioSignal object.

property audio_signal

Copy of the audio_signal.AudioSignal object passed in upon initialization.

Type: (audio_signal.AudioSignal)

classmethod from_json(json_string)

Creates a new SeparationBase object from the parameters stored in this JSON string.

Parameters: json_string (str) – A JSON string containing all the data to create a new SeparationBase object.
Returns: (SeparationBase) A new SeparationBase object from the JSON string.

See also

to_json() to make a JSON string to freeze this object.
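
The round trip between to_json() and from_json() can be sketched with the standard json module. ToySeparator and its two parameters are hypothetical stand-ins, not the real SeparationBase, which may serialize more state than this.

```python
import json


class ToySeparator:
    """Hypothetical stand-in; not the real SeparationBase."""

    def __init__(self, sample_rate=44100, mask_type="soft"):
        self.sample_rate = sample_rate
        self.mask_type = mask_type

    def to_json(self):
        # Freeze the constructor parameters as a JSON string.
        return json.dumps(
            {"sample_rate": self.sample_rate, "mask_type": self.mask_type}
        )

    @classmethod
    def from_json(cls, json_string):
        # Build a new object from the stored parameters.
        return cls(**json.loads(json_string))


original = ToySeparator(sample_rate=16000, mask_type="binary")
restored = ToySeparator.from_json(original.to_json())
```

The restored object is a new instance that carries the same parameters as the original.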

make_audio_signals()

Makes audio_signal.AudioSignal objects after the mask-based separation algorithm is run. Base class: Do not call directly!

Raises: NotImplementedError – Cannot call base class!

property mask_threshold

Threshold for deciding True/False when mask_type is BINARY_MASK. Some algorithms will first make a soft mask and then convert that to a binary mask using this threshold parameter. All values of the soft mask lie in [0.0, 1.0], so mask_threshold is expected to be a float in [0.0, 1.0].

Returns: Value in [0.0, 1.0] that indicates the True/False cutoff when converting a soft mask to a binary mask.
Return type: mask_threshold (float)
Raises: ValueError – if not a float or if set outside [0.0, 1.0].
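
A minimal sketch of the behavior described above: a validated property plus a soft-to-binary conversion. ThresholdHolder and to_binary are hypothetical names, and whether the comparison is strict (>) or inclusive (>=) is an assumption, since the text does not say.

```python
class ThresholdHolder:
    """Hypothetical class showing a validated mask_threshold property."""

    def __init__(self, mask_threshold=0.5):
        self.mask_threshold = mask_threshold

    @property
    def mask_threshold(self):
        return self._mask_threshold

    @mask_threshold.setter
    def mask_threshold(self, value):
        # Reject non-floats and values outside [0.0, 1.0].
        if not isinstance(value, float) or not 0.0 <= value <= 1.0:
            raise ValueError("mask_threshold must be a float in [0.0, 1.0]")
        self._mask_threshold = value


def to_binary(soft_mask, threshold):
    # Assumption: values strictly above the threshold become True.
    return [[value > threshold for value in row] for row in soft_mask]


holder = ThresholdHolder(0.5)
soft_mask = [[0.1, 0.5, 0.9], [0.7, 0.2, 0.51]]
binary_mask = to_binary(soft_mask, holder.mask_threshold)
```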

property mask_type

This property indicates what type of mask the derived algorithm will create and return from run(). Options are either 'soft' or 'binary'. mask_type is usually set when initializing a MaskSeparationBase-derived class and defaults to SOFT_MASK.

This property, though stored as a string, can be set in two ways when initializing:

First, it is possible to set this property with a string. Only 'soft' and 'binary' are accepted (case insensitive); every other value will raise an error. When initializing with a string, two helper attributes are provided: BINARY_MASK and SOFT_MASK.

It is HIGHLY encouraged to use these, as the API may change and code that uses bare strings (e.g. mask_type = 'soft' or mask_type = 'binary') for assignment might not be future-proof. BINARY_MASK and SOFT_MASK are safe aliases in case these underlying types change.

The second way to set this property is by passing the class itself, either separation.masks.binary_mask.BinaryMask or separation.masks.soft_mask.SoftMask. This is probably the most stable way to set this, and it's fairly succinct. For example, mask_type = nussl.BinaryMask or mask_type = nussl.SoftMask are both perfectly valid.

Though uncommon, this can be set outside of __init__().

Examples of both methods are shown below.

Returns: Either 'soft' or 'binary'.
Return type: mask_type (str)
Raises: ValueError – if set invalidly.

Example:

import nussl
mixture_signal = nussl.AudioSignal()

# Two options for determining mask upon init...

# Option 1: Init with a string (BINARY_MASK is a string 'constant')
repet_sim = nussl.RepetSim(mixture_signal, mask_type=nussl.MaskSeparationBase.BINARY_MASK)

# Option 2: Init with a class type
ola = nussl.OverlapAdd(mixture_signal, mask_type=nussl.SoftMask)

# It's also possible to change these values after init by changing the `mask_type` property...
repet_sim.mask_type = nussl.MaskSeparationBase.SOFT_MASK  # using a string
ola.mask_type = nussl.BinaryMask  # or using a class type
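
How a setter might accept both a string and a class can be sketched as follows. The placeholder classes and normalize_mask_type are hypothetical and only mirror the rules stated above (case-insensitive strings, two accepted classes, ValueError otherwise); the library's own validation may differ.

```python
class BinaryMask:
    """Hypothetical placeholder for a binary mask class."""


class SoftMask:
    """Hypothetical placeholder for a soft mask class."""


BINARY_MASK = "binary"
SOFT_MASK = "soft"


def normalize_mask_type(value):
    """Return 'soft' or 'binary' for a string or mask class, else raise ValueError."""
    if isinstance(value, str):
        lowered = value.lower()
        if lowered in (BINARY_MASK, SOFT_MASK):
            return lowered
    elif value is BinaryMask:
        return BINARY_MASK
    elif value is SoftMask:
        return SOFT_MASK
    raise ValueError(f"Invalid mask_type: {value!r}")


from_string = normalize_mask_type("SOFT")
from_class = normalize_mask_type(BinaryMask)
```

Storing the normalized string keeps the property's return type stable regardless of which form the caller used.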

ones_mask(shape)

Creates a new ones mask with this object's type.

Parameters: shape –

Returns:

run()

Runs the mask-based separation algorithm. Base class: Do not call directly!

Raises: NotImplementedError – Cannot call base class!
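
The base-class contract for run() and make_audio_signals() follows a common Python pattern: the base raises NotImplementedError and each subclass overrides both methods. ToyBase and ToyAlgorithm below are hypothetical and only illustrate that pattern.

```python
class ToyBase:
    """Hypothetical base class whose methods must be overridden."""

    def run(self):
        raise NotImplementedError("Cannot call base class!")

    def make_audio_signals(self):
        raise NotImplementedError("Cannot call base class!")


class ToyAlgorithm(ToyBase):
    """Hypothetical subclass that supplies both methods."""

    def __init__(self):
        self.result = None

    def run(self):
        # Placeholder strings stand in for separated audio.
        self.result = ["background", "foreground"]
        return self.result

    def make_audio_signals(self):
        # Returns None until run() has been called.
        return self.result


algorithm = ToyAlgorithm()
before_run = algorithm.make_audio_signals()
algorithm.run()
after_run = algorithm.make_audio_signals()
```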

zeros_mask(shape)

Creates a new zeros mask with this object's type.

Parameters: shape –

Returns:
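
The parameter and return descriptions for ones_mask and zeros_mask are empty in the source. As a loose sketch of what "a mask with this object's type" could mean, the functions below build nested-list masks whose fill value depends on a mask_type string. The (rows, columns) shape convention, the mask_type argument, and the use of booleans for binary masks are assumptions; the real methods presumably return mask objects.

```python
def filled_mask(shape, fill_value):
    """Build a nested-list mask of the given (rows, columns) shape."""
    n_rows, n_cols = shape
    return [[fill_value] * n_cols for _ in range(n_rows)]


def zeros_mask(shape, mask_type="soft"):
    # Assumption: binary masks hold booleans, soft masks hold floats.
    return filled_mask(shape, False if mask_type == "binary" else 0.0)


def ones_mask(shape, mask_type="soft"):
    return filled_mask(shape, True if mask_type == "binary" else 1.0)


soft_zeros = zeros_mask((2, 3))
binary_ones = ones_mask((1, 2), mask_type="binary")
```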