src.algorithms

Module contents

Module contents are imported from nussl.

class src.algorithms.FT2D(input_audio_signal, high_pass_cutoff=100.0, neighborhood_size=(1, 25), do_mono=False, use_librosa_stft=False, quadrants_to_keep=(0, 1, 2, 3), use_background_fourier_transform=True, mask_alpha=1.0, mask_type='soft', filter_approach='local_std')[source]

Bases: nussl.separation.mask_separation_base.MaskSeparationBase

Implements foreground/background separation using the 2D Fourier Transform
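As a rough illustration of the idea, the sketch below picks local peaks in the magnitude of a spectrogram's 2D Fourier transform and treats them as the repeating background. It is not nussl's implementation: the helper names `local_peak_mask` and `ft2d_soft_mask`, the reading of the default neighborhood_size=(1, 25) as a 25-bin window along one axis, and the ratio-based soft mask are all assumptions made for this example.

```python
import numpy as np


def local_peak_mask(magnitude, width=25):
    """Mark bins that equal the maximum of a 1 x width neighborhood (wraps at the edges)."""
    half = width // 2  # width is assumed to be odd
    padded = np.pad(magnitude, ((0, 0), (half, half)), mode="wrap")
    windows = np.lib.stride_tricks.sliding_window_view(padded, width, axis=1)
    return magnitude >= windows.max(axis=-1)


def ft2d_soft_mask(spectrogram, width=25):
    """Return a hypothetical soft background mask for a magnitude spectrogram."""
    ft2d = np.fft.fft2(spectrogram)
    peaks = local_peak_mask(np.abs(ft2d), width)
    # Assumption: peaks in the 2DFT carry the periodic (repeating) content.
    background = np.abs(np.fft.ifft2(ft2d * peaks))
    foreground = np.abs(np.fft.ifft2(ft2d * ~peaks))
    # Small constant avoids division by zero in silent bins.
    return background / (background + foreground + 1e-12)


rng = np.random.default_rng(0)
spectrogram = rng.random((32, 64))  # stand-in for |STFT|, shape (freq, time)
background_mask = ft2d_soft_mask(spectrogram)
```

The resulting mask has the same shape as the spectrogram, with values in [0, 1]; a foreground mask under the same assumptions would be one minus this mask.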

Parameters
  • input_audio_signal – (AudioSignal object) The AudioSignal object containing the audio data that FT2D will be run on.

  • high_pass_cutoff – (Optional) (float) Cutoff frequency (in Hz) of the high-pass filter.

  • do_mono – (Optional) (bool) Flattens the AudioSignal to mono before running the algorithm (does not affect the input AudioSignal object).

  • use_librosa_stft – (Optional) (bool) Calls librosa’s stft function instead of nussl’s

compute_ft2d_mask(ft2d, ch)[source]
filter_local_maxima(ft2d)[source]
filter_local_maxima_with_std(ft2d)[source]
filter_quadrants(data)[source]
make_audio_signals()[source]

Returns the background and foreground audio signals. You must have run FT2D.run() prior to calling this function. This function will return None if run() has not been called.

Returns

A two-element list:

  • bkgd: Audio signal with the calculated background track

  • fkgd: Audio signal with the calculated foreground track

Return type

Audio Signals (List)

run()[source]
Returns

An AudioSignal object with the repeating background in background.audio_data. To get the corresponding non-repeating foreground, call self.make_audio_signals().

Return type

background (AudioSignal)

class src.algorithms.DeepClustering(input_audio_signal, model_path, metadata=None, extra_modules=None, use_cuda=False, **kwargs)[source]

Bases: nussl.separation.clustering.clustering_separation_base.ClusteringSeparationBase, nussl.separation.deep_mixin.DeepMixin

extract_features()[source]
make_audio_signals()[source]
Applies each mask in self.masks and returns a list of audio_signal objects for each source.

Returns

An array of audio_signal objects containing each separated source

Return type

self.sources (np.array)

postprocess(assignments, confidence)[source]
project_data(data)[source]
set_audio_signal(new_audio_signal)[source]
class src.algorithms.DeepMaskEstimation(input_audio_signal, model_path, extra_modules=None, mask_type='soft', use_librosa_stft=False, use_cuda=True)[source]

Bases: nussl.separation.mask_separation_base.MaskSeparationBase, nussl.separation.deep_mixin.DeepMixin

Implements deep source separation models using PyTorch

apply_mask(mask)[source]

Applies individual mask and returns audio_signal object

make_audio_signals()[source]
Applies each mask in self.masks and returns a list of audio_signal objects for each source.

Returns

An array of audio_signal objects containing each separated source

Return type

self.sources (np.array)
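A minimal sketch of what applying masks amounts to, assuming each mask is an array with the same shape as the mixture's complex STFT and values in [0, 1]. The function name `apply_masks` and the plain NumPy arrays are illustrative stand-ins, not nussl's AudioSignal or mask classes.

```python
import numpy as np


def apply_masks(mixture_stft, masks):
    """Multiply the mixture STFT element-wise by each mask, one estimate per source."""
    return [mixture_stft * mask for mask in masks]


rng = np.random.default_rng(0)
mixture_stft = rng.standard_normal((4, 6)) + 1j * rng.standard_normal((4, 6))
vocals_mask = rng.random((4, 6))         # soft mask with values in [0, 1)
accompaniment_mask = 1.0 - vocals_mask   # complementary mask
sources = apply_masks(mixture_stft, [vocals_mask, accompaniment_mask])
```

Because the two masks in this example sum to one, the source estimates add back up to the mixture.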

run()[source]

Returns:

class src.algorithms.SeparationBase(input_audio_signal)[source]

Bases: object

Base class for all separation algorithms in nussl.

Do not instantiate this class directly; it does nothing on its own.

Parameters

input_audio_signal (audio_signal.AudioSignal) – The constructor always makes a copy of the provided AudioSignal object.
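The practical effect of copying on construction can be sketched as follows. `CopyingSeparator` is a hypothetical stand-in, a dictionary plays the role of the AudioSignal, and the use of `copy.deepcopy` is an assumption (the documentation only says that a copy is made).

```python
import copy


class CopyingSeparator:
    """Hypothetical stand-in that stores a deep copy of its input."""

    def __init__(self, input_audio_signal):
        self._audio_signal = copy.deepcopy(input_audio_signal)

    @property
    def audio_signal(self):
        return self._audio_signal


original = {"audio_data": [0.1, 0.2, 0.3], "sample_rate": 44100}  # stand-in for an AudioSignal
separator = CopyingSeparator(original)
original["audio_data"][0] = 9.9  # mutate the caller's object after construction
```

Changes made to the caller's object after construction do not reach the separator's copy.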

property audio_signal

Copy of the audio_signal.AudioSignal object passed in upon initialization.

Type

(audio_signal.AudioSignal)

classmethod from_json(json_string)[source]

Creates a new SeparationBase object from the parameters stored in this JSON string.

Parameters

json_string (str) – A JSON string containing all the data to create a new SeparationBase object.

Returns

(SeparationBase) A new SeparationBase object from the JSON string.

See also

to_json() to make a JSON string to freeze this object.

make_audio_signals()[source]

Makes audio_signal.AudioSignal objects after separation algorithm is run

Raises

NotImplementedError – Cannot call base class

plot(**kwargs)[source]

Plots relevant data for separation algorithm

Raises

NotImplementedError – Cannot call base class

run()[source]

Runs separation algorithm

Raises

NotImplementedError – Cannot call base class
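The base-class contract described above follows a common Python pattern, sketched here with hypothetical classes (`SketchSeparationBase` and `PassThroughSeparator` are not part of nussl): the base methods raise NotImplementedError and a subclass overrides them.

```python
class SketchSeparationBase:
    """Hypothetical base class mirroring the documented contract."""

    def run(self):
        raise NotImplementedError("Cannot call base class")

    def make_audio_signals(self):
        raise NotImplementedError("Cannot call base class")


class PassThroughSeparator(SketchSeparationBase):
    """Toy subclass that 'separates' by returning its input unchanged."""

    def __init__(self, samples):
        self.samples = list(samples)
        self.result = None

    def run(self):
        self.result = self.samples
        return self.result

    def make_audio_signals(self):
        return [self.result]


# Calling run() on the base class raises; the subclass works.
try:
    SketchSeparationBase().run()
    base_raised = False
except NotImplementedError:
    base_raised = True

estimates = PassThroughSeparator([0.0, 0.5]).run()
```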

property sample_rate

Sample rate of audio_signal. Literally audio_signal.sample_rate.

Type

(int)

property stft_params

spectral_utils.StftParams of audio_signal. Literally audio_signal.stft_params.

Type

(spectral_utils.StftParams)

to_json()[source]

Outputs JSON from the data stored in this object.

Returns

(str) a JSON string containing all of the information to restore this object exactly as it was when this was called.

See also

from_json() to restore a JSON frozen object.
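The freeze-and-restore round trip can be illustrated with a small, hypothetical class. `JsonFreezable` and its choice of stored fields are assumptions for this sketch; a real separation object would also need to serialize its audio data, which is not shown here.

```python
import json


class JsonFreezable:
    """Hypothetical object that freezes its constructor parameters to JSON."""

    def __init__(self, high_pass_cutoff=100.0, do_mono=False):
        self.high_pass_cutoff = high_pass_cutoff
        self.do_mono = do_mono

    def to_json(self):
        # Store exactly the arguments needed to rebuild the object.
        return json.dumps({"high_pass_cutoff": self.high_pass_cutoff,
                           "do_mono": self.do_mono})

    @classmethod
    def from_json(cls, json_string):
        return cls(**json.loads(json_string))


frozen = JsonFreezable(high_pass_cutoff=80.0, do_mono=True).to_json()
restored = JsonFreezable.from_json(frozen)
```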

class src.algorithms.MaskSeparationBase(input_audio_signal, mask_type='soft', mask_threshold=0.5)[source]

Bases: nussl.separation.separation_base.SeparationBase

Base class for separation algorithms that create a mask (binary or soft) to do their separation. Most algorithms in nussl are derived from MaskSeparationBase.

Although this class will do nothing if you instantiate and run it by itself, algorithms that are derived from this class are expected to return a list of separation.masks.mask_base.MaskBase-derived objects (i.e., either a separation.masks.binary_mask.BinaryMask or separation.masks.soft_mask.SoftMask object) by their run() method. Being a subclass of MaskSeparationBase is an implicit contract assuring this. Returning a separation.masks.mask_base.MaskBase-derived object standardizes algorithm return types for evaluation.evaluation_base.EvaluationBase-derived objects.

Parameters
  • input_audio_signal – (audio_signal.AudioSignal) An audio_signal.AudioSignal object containing the mixture to be separated.

  • mask_type – (str) Indicates whether to make binary or soft masks. See mask_type property for details.

  • mask_threshold – (float) Value in [0.0, 1.0] used to convert a soft mask to a binary mask. See mask_threshold property for details.

BINARY_MASK = 'binary'

String alias for setting this object to return separation.masks.binary_mask.BinaryMask objects

SOFT_MASK = 'soft'

String alias for setting this object to return separation.masks.soft_mask.SoftMask objects

classmethod from_json(json_string)[source]

Creates a new SeparationBase object from the parameters stored in this JSON string.

Parameters

json_string (str) – A JSON string containing all the data to create a new SeparationBase object.

Returns

(SeparationBase) A new SeparationBase object from the JSON string.

See also

to_json() to make a JSON string to freeze this object.

make_audio_signals()[source]

Makes audio_signal.AudioSignal objects after mask-based separation algorithm is run. Base class: Do not call directly!

Raises

NotImplementedError – Cannot call base class!

property mask_threshold

Threshold for determining True/False when mask_type is BINARY_MASK. Some algorithms first make a soft mask and then convert it to a binary mask using this threshold. All values of a soft mask lie in [0.0, 1.0], so mask_threshold is expected to be a float in [0.0, 1.0].

Returns

Value in [0.0, 1.0] that indicates the True/False cutoff when converting a soft mask to a binary mask.

Return type

mask_threshold (float)

Raises

ValueError – If the value is not a float or lies outside [0.0, 1.0].
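A minimal sketch of the thresholding described above, using plain NumPy arrays rather than nussl's mask classes. The helper names `validate_threshold` and `soft_to_binary` are hypothetical, and the strict "greater than" comparison is an assumption, since the documentation does not say how values equal to the threshold are treated.

```python
import numpy as np


def validate_threshold(value):
    """Raise ValueError unless value is a float in [0.0, 1.0]."""
    if not isinstance(value, float) or not 0.0 <= value <= 1.0:
        raise ValueError("mask_threshold must be a float in [0.0, 1.0]")
    return value


def soft_to_binary(soft_mask, threshold=0.5):
    """Entries strictly above the threshold become True (assumed comparison)."""
    return np.asarray(soft_mask) > validate_threshold(threshold)


soft_mask = np.array([[0.1, 0.6], [0.5, 0.9]])
binary_mask = soft_to_binary(soft_mask, threshold=0.5)
```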

property mask_type

This property indicates what type of mask the derived algorithm will create and return from run(). Options are either ‘soft’ or ‘binary’. mask_type is usually set when initializing a MaskSeparationBase-derived class and defaults to SOFT_MASK.

This property, though stored as a string, can be set in two ways when initializing:

  • First, it is possible to set this property with a string. Only 'soft' and 'binary' are accepted (case insensitive); any other value will raise an error. When initializing with a string, two helper attributes are provided: BINARY_MASK and SOFT_MASK.

    It is HIGHLY encouraged to use these, as the API may change and code that uses bare strings (e.g. mask_type = 'soft' or mask_type = 'binary') for assignment might not be future-proof. BINARY_MASK and SOFT_MASK are safe aliases in case these underlying types change.

  • The second way to set this property is by passing either the separation.masks.binary_mask.BinaryMask or separation.masks.soft_mask.SoftMask class itself. This is probably the most stable way to set this, and it’s fairly succinct. For example, mask_type = nussl.BinaryMask or mask_type = nussl.SoftMask are both perfectly valid.

Though uncommon, this can also be set outside of __init__().

Examples of both methods are shown below.

Returns

Either 'soft' or 'binary'.

Return type

mask_type (str)

Raises

ValueError – If set to an invalid value.
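One way a setter with this behavior could be written is sketched here. It is an illustration only: `MaskTypeHolder` and the empty `BinaryMask` and `SoftMask` classes are hypothetical stand-ins, not nussl's code.

```python
class BinaryMask:
    """Hypothetical stand-in for a binary mask class."""


class SoftMask:
    """Hypothetical stand-in for a soft mask class."""


class MaskTypeHolder:
    BINARY_MASK = "binary"
    SOFT_MASK = "soft"

    def __init__(self, mask_type=SOFT_MASK):
        self.mask_type = mask_type  # goes through the setter below

    @property
    def mask_type(self):
        return self._mask_type

    @mask_type.setter
    def mask_type(self, value):
        if isinstance(value, str):
            # Strings are matched case-insensitively.
            value = value.lower()
            if value not in (self.BINARY_MASK, self.SOFT_MASK):
                raise ValueError(f"Invalid mask type: {value!r}")
            self._mask_type = value
        elif value is BinaryMask:
            self._mask_type = self.BINARY_MASK
        elif value is SoftMask:
            self._mask_type = self.SOFT_MASK
        else:
            raise ValueError(f"Invalid mask type: {value!r}")


holder = MaskTypeHolder(mask_type="SOFT")
from_string = holder.mask_type
holder.mask_type = BinaryMask
from_class = holder.mask_type
```

Either route ends with the property stored as one of the two strings, which matches the documented return type.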

Example:

import nussl
mixture_signal = nussl.AudioSignal()

# Two options for determining mask upon init...

# Option 1: Init with a string (BINARY_MASK is a string 'constant')
repet_sim = nussl.RepetSim(mixture_signal, mask_type=nussl.MaskSeparationBase.BINARY_MASK)

# Option 2: Init with a class type
ola = nussl.OverlapAdd(mixture_signal, mask_type=nussl.SoftMask)

# It's also possible to change these values after init by changing the `mask_type` property...
repet_sim.mask_type = nussl.MaskSeparationBase.SOFT_MASK  # using a string
ola.mask_type = nussl.BinaryMask  # or using a class type

ones_mask(shape)[source]

Creates a new ones mask with this object’s type

Parameters

shape

Returns:

run()[source]

Runs mask-based separation algorithm. Base class: Do not call directly!

Raises

NotImplementedError – Cannot call base class!

zeros_mask(shape)[source]

Creates a new zeros mask with this object’s type

Parameters

shape

Returns:
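What a mask "with this object's type" might look like can be sketched with NumPy. The helper `make_constant_mask` is hypothetical, and the choice of a boolean array for binary masks and a float array for soft masks is an assumption, not something this page states.

```python
import numpy as np


def make_constant_mask(shape, mask_type, fill):
    """Build an all-zeros (fill=0) or all-ones (fill=1) mask; dtypes are assumptions."""
    dtype = bool if mask_type == "binary" else float
    return np.full(shape, fill, dtype=dtype)


zeros = make_constant_mask((2, 3), "binary", 0)  # analogous to zeros_mask(shape)
ones = make_constant_mask((2, 3), "soft", 1)     # analogous to ones_mask(shape)
```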