scripts

Submodules

scripts.analyze

scripts.analyze.analyze(path_to_yml_file, use_gsheet=False, upload_source_metrics=False)[source]

Analyzes the metrics for all the files that were evaluated in the experiment.

Parameters
  • path_to_yml_file (str) – Path to the yml file that defines the experiment. The corresponding results folder for the experiment is what will be analyzed and put into a Pandas dataframe.

  • use_gsheet (bool, optional) – Whether or not to upload to the Google Sheet. Defaults to False.

  • upload_source_metrics (bool) – Uploads metrics for each source if True. Defaults to False. Can have interactions with the API limit on Google Sheets. If there are too many sources, the script will hit the limit and break.

Returns

3-element tuple containing

  • results (pandas.DataFrame): DataFrame containing all of the results for every file evaluated in the experiment. The DataFrame also has every key in the experiment configuration in flattened format.

    For example, model_config_recurrent_stack_args_embedding_size is a column in the DataFrame.

  • config (dict): A dictionary containing the configuration of the experiment.

  • exp (comet_ml.Experiment): An instantiated experiment if comet.ml is needed, otherwise it is None.

Return type

tuple
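
For example, a minimal sketch of calling this from Python (the experiment path below is hypothetical):

from scripts.analyze import analyze

# Analyze the results folder for a (hypothetical) experiment configuration.
results, config, exp = analyze(
    'experiments/my_experiment.yml',
    use_gsheet=False,             # set to True to sync results to Google Sheets
    upload_source_metrics=False,  # per-source metrics can hit the Sheets API limit
)
# Every key of the experiment configuration appears as a flattened column.
print(results.columns)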

scripts.analyze.build_parser()[source]

Builds the parser for scripts.analyze.

usage: python -m scripts.analyze  [-h] -p PATH_TO_YML_FILE [--use_gsheet]
                    [--upload_source_metrics]

optional arguments:
  -h, --help            show this help message and exit
  -p PATH_TO_YML_FILE, --path_to_yml_file PATH_TO_YML_FILE
                        Path to the configuration for the experiment that is
                        getting analyzed. The corresponding results folder for
                        the experiment is what will be analyzed and put into a
                        Pandas dataframe.
  --use_gsheet          Results can be synced to a Google sheet after analysis
                        if this is true. Defaults to false.
  --upload_source_metrics
                        Uploads metrics for each source if True. Defaults to
                        False. Can have interactions with the API limit on
                        Google Sheets. If there are too many sources, then it
                        will hit the limit and the script will break.

Also see the arguments to scripts.analyze.analyze().

Returns

The parser for this script.

Return type

argparse.ArgumentParser

scripts.analyze.init_gsheet(credentials_path)[source]

Initializes the Google Sheets client given a path to credentials.

Parameters

credentials_path (str) – Path to your Google credentials that are used to authorize the Google Sheets access.

Returns

Google Sheets Client initialized with credentials.

Return type

gspread.Client

scripts.analyze.upload_to_gsheet(results, config, exp=None, upload_source_metrics=False)[source]

Uploads the analysis to the Google Sheet, if possible.

Parameters
  • results (pandas.DataFrame) – DataFrame containing all the results - output by scripts.analyze.analyze().

  • config (dict) – Dictionary containing the entire experiment configuration.

  • exp (comet_ml.Experiment) – Experiment given by comet.ml (optional).

  • upload_source_metrics (bool) – Uploads metrics for each source if True. Defaults to False. Can have interactions with the API limit on Google Sheets. If there are too many sources, the script will hit the limit and break.
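
A minimal sketch of pairing this with scripts.analyze.analyze() (the experiment path is hypothetical):

from scripts.analyze import analyze, upload_to_gsheet

# Analyze a (hypothetical) experiment, then sync the results to Google Sheets.
results, config, exp = analyze('experiments/my_experiment.yml')
upload_to_gsheet(results, config, exp=exp, upload_source_metrics=False)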

scripts.download_toy_data

scripts.download_toy_data.build_parser()[source]

Builds the parser for scripts.download_toy_data.

usage: python -m scripts.download_toy_data  [-h] --target_folder TARGET_FOLDER

optional arguments:
  -h, --help            show this help message and exit
  --target_folder TARGET_FOLDER
                        Folder where the toy data gets saved to.

Also see the arguments to scripts.download_toy_data.download_toy_data().

Returns

The parser for this script.

Return type

argparse.ArgumentParser

scripts.download_toy_data.download_toy_data(target_folder)[source]

Downloads toy data to a target folder for the purposes of running some demo scripts.

Parameters

target_folder (str) – Where to put the data that gets downloaded.
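
For example, a minimal sketch (the target folder is hypothetical):

from scripts.download_toy_data import download_toy_data

# Download the toy data into a (hypothetical) local folder.
download_toy_data('data/toy')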

scripts.evaluate

scripts.evaluate.build_parser()[source]

Builds the parser for scripts.evaluate.

usage: python -m scripts.evaluate  [-h] -p PATH_TO_YML_FILE [-e EVAL_KEYS [EVAL_KEYS ...]]

optional arguments:
  -h, --help            show this help message and exit
  -p PATH_TO_YML_FILE, --path_to_yml_file PATH_TO_YML_FILE
                        Path to the configuration for the experiment that is
                        getting evaluated. The corresponding test
                        configuration for the experiment will be used to
                        evaluate the experiment across all of the audio files
                        in the test dataset.
  -e EVAL_KEYS [EVAL_KEYS ...], --eval_keys EVAL_KEYS [EVAL_KEYS ...]
                        All of the keys to be used to evaluate the experiment.
                        Will run the evaluation on each eval_key in sequence.
                        Defaults to ['test'].

Also see the arguments to scripts.evaluate.evaluate().

Returns

The parser for this script.

Return type

argparse.ArgumentParser

scripts.evaluate.evaluate(path_to_yml_file, eval_keys=['test'])[source]

Evaluates an experiment across all of the data for each key in eval_keys. The key must correspond to a dataset included in the experiment configuration. This uses src.test.EvaluationRunner to evaluate the performance of the model on each dataset.

Parameters
  • path_to_yml_file (str) – Path to the yml file that defines the experiment. The corresponding test configuration for the experiment will be used to evaluate the experiment across all of the audio files in the test dataset.

  • eval_keys (list) – All of the keys to be used to evaluate the experiment. Will run the evaluation on each eval_key in sequence. Defaults to [‘test’].
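
A minimal sketch of calling this from Python, assuming both dataset keys are defined in the (hypothetical) experiment configuration:

from scripts.evaluate import evaluate

# Evaluate a (hypothetical) experiment on its validation and test datasets.
evaluate('experiments/my_experiment.yml', eval_keys=['val', 'test'])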

scripts.mix_with_scaper

scripts.mix_with_scaper.mix_with_scaper(**kwargs)[source]

Takes in keyword arguments containing the specification for mixing with Scaper. See src.dataset.scaper_mix() for a description of what should be in the kwargs. This function does some sanitization of the keyword arguments before passing them to src.dataset.scaper_mix().

Parameters

kwargs (dict) – All of the keyword arguments required for src.dataset.scaper_mix().

scripts.pipeline

scripts.pipeline.build_parser()[source]

Builds the parser for scripts.pipeline.

usage: python -m scripts.pipeline  [-h] [--script SCRIPT] [--config CONFIG] [--run_in RUN_IN]
                    [--num_gpus NUM_GPUS] [--blocking]

optional arguments:
  -h, --help           show this help message and exit
  --script SCRIPT      Path to script to run.
  --config CONFIG      Path to config for script.
  --run_in RUN_IN      Run in host or container.
  --num_gpus NUM_GPUS  How many GPUs to use.
  --blocking           Finish this job before proceeding to next.

Also see the arguments to scripts.pipeline.parallel_job_execution().

Returns

The parser for this script.

Return type

argparse.ArgumentParser

scripts.pipeline.parallel_job_execution(script_func, jobs, num_jobs=1)[source]

Runs a set of jobs, typically specified in a .yml file with structure as follows:

script: name of script in scripts/ folder
config: path/to/yml/config.yml
run_in: 'host' or 'container' (default: host)
num_gpus: how many gpus (default: 0)
blocking: whether to block on this job or not (default: false)

Could also be multiple jobs:

num_jobs: how many jobs to run in parallel (default: 1)

jobs:
- script: script1.py
  config: config1.yml
- script: script2.py
  config: config2.yml
...

The jobs get executed in sequence or in parallel.

Parameters
  • script_func (function) – Function that runs a single job given its arguments.

  • jobs (list) – List of dictionaries, each containing the arguments for one job.

  • num_jobs (int, optional) – How many jobs to run in parallel. Defaults to 1.

scripts.reorganize

scripts.reorganize.build_parser()[source]

Builds the parser for scripts.reorganize.

usage: python -m scripts.reorganize  [-h] [--input_path INPUT_PATH] [--output_path OUTPUT_PATH]
                    [--org_func ORG_FUNC] [--make_copy]
                    [--audio_extensions AUDIO_EXTENSIONS [AUDIO_EXTENSIONS ...]]

optional arguments:
  -h, --help            show this help message and exit
  --input_path INPUT_PATH
                        Root of folder where all audio files will be
                        reorganized.
  --output_path OUTPUT_PATH
                        Root of folder where all reorganized files will be
                        placed.
  --org_func ORG_FUNC   Organization function to use to reorganize the
                        dataset. Should correspond to the name of a function
                        in reorganize.py.
  --make_copy           Whether to use a symlink or to actually copy the file.
  --audio_extensions AUDIO_EXTENSIONS [AUDIO_EXTENSIONS ...]
                        Audio extensions to look for in the input_path.
                        Matching ones will be reorganized and placed into the
                        output directory via a symlink.

Also see the arguments to scripts.reorganize.reorganize().

Returns

The parser for this script.

Return type

argparse.ArgumentParser

scripts.reorganize.reorganize(input_path, output_path, org_func, make_copy=False, audio_extensions=['.wav', '.mp3', '.aac'], **kwargs)[source]

Reorganizes the folders in the input path into the output path given an organization function, passed in by org_func.

Parameters
  • input_path (str) – Root of folder where all audio files will be reorganized.

  • output_path (str) – Root of folder where the reorganized files will be placed.

  • org_func (str) – Organization function to use to reorganize the dataset. Should correspond to the name of a function in reorganize.py.

  • make_copy (bool) – Whether to use a symlink or to actually copy the file. Defaults to False.

  • audio_extensions (list, optional) – Audio extensions to look for in the input_path. Matching ones will be reorganized and placed into the output directory via a symlink. Defaults to [‘.wav’, ‘.mp3’, ‘.aac’].

  • kwargs (dict) – Additional keyword arguments that are passed to the org_func that is specified.
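
For example, a minimal sketch of reorganizing a MUSDB-style folder by source class (the paths are hypothetical):

from scripts.reorganize import reorganize

# Reorganize a (hypothetical) MUSDB-style folder so Scaper can consume it.
reorganize(
    input_path='data/musdb/train',
    output_path='data/musdb_organized/train',
    org_func='split_folder_by_class',  # name of a function in reorganize.py
    make_copy=False,                   # symlink instead of copying
    audio_extensions=['.wav'],
)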

scripts.reorganize.split_folder_by_class(path_to_file, output_directory, input_directory, make_copy=False)[source]

Splits a folder by class which is indicated by the name of the file.

The mixture name is the name of the parent directory to the file. This function is used to organize datasets like musdb for consumption by Scaper for mixing new datasets.

Takes a folder with audio file structure that looks like this:

folder_input/
    mixture_one_name/
        vocals.wav
        bass.wav
        drums.wav
        other.wav
    mixture_two_name/
        vocals.wav
        bass.wav
        drums.wav
        other.wav
    ...

and reorganizes it to a different folder like so:

folder_output/
    vocals/
        mixture_one_name.wav
        mixture_two_name.wav
        ...
    bass/
        mixture_one_name.wav
        mixture_two_name.wav
        ...
    drums/
        mixture_one_name.wav
        mixture_two_name.wav
        ...
    other/
        mixture_one_name.wav
        mixture_two_name.wav
        ...

so that it can be processed easily by Scaper. Notably, MUSDB has this folder structure. This reorganization is done via symlinks so that the entire dataset is not copied.

Parameters
  • path_to_file (str) – Path to the audio file that will be reorganized. Has form /path/to/mixture_name/source_name.ext

  • output_directory (str) – Where the file will be copied to after swapping the mixture_name and source_name.

  • input_directory (str) – The root of the directory that the file comes from. Useful for figuring out the relative path with respect to the input directory for copying to the output_directory.

  • make_copy (bool) – Whether to use a symlink or to actually copy the file. Defaults to False.

scripts.reorganize.split_folder_by_file(path_to_file, output_directory, input_directory, org_file, make_copy=False)[source]

Reorganizes a directory using an organization file. The organization file should contain a list of paths that are relative to the input_directory. If path_to_file is in the organization file, then it will be symlinked (or moved) to the same relative path in output_directory.

For example if organization file has an entry:

path/to/my/file/0.wav

And path to file looks like:

input_directory/path/to/my/file/0.wav

Then a new file will be created (or symlinked) at:

output_directory/path/to/my/file/0.wav

Parameters
  • path_to_file (str) – Path to the audio file that will be reorganized.

  • output_directory (str) – Where the file will be copied (or symlinked) to, at the same relative path as it has under the input_directory.

  • input_directory (str) – The root of the directory that the file comes from. Useful for figuring out the relative path with respect to the input directory for copying to the output_directory.

  • org_file (str) – Path to the file containing all of the file names that should be moved.

  • make_copy (bool, optional) – Whether to use a symlink or to actually copy the file. Defaults to False.

scripts.reorganize.split_urbansound_by_fold(path_to_file, output_directory, input_directory, make_copy=False, train_folds=[1, 2, 3, 4, 5, 6, 7, 8], val_folds=[9], test_folds=[10], path_to_urbansound_csv=None)[source]

Reorganizes the urbansound dataset using the metadata/UrbanSound8K.csv to determine which fold each file belongs to. It makes symlinks in the corresponding train, test, and val folders.

Parameters
  • path_to_file (str) – Path to the audio file that will be reorganized.

  • output_directory (str) – Where the file will be placed, inside the corresponding train, val, or test folder.

  • input_directory (str) – The root of the directory that the file comes from. Useful for figuring out the relative path with respect to the input directory for copying to the output_directory.

  • make_copy (bool, optional) – Whether to use a symlink or to actually copy the file. Defaults to False.

  • train_folds (list, optional) – Which folds belong to the train set. Defaults to [1, 2, 3, 4, 5, 6, 7, 8].

  • val_folds (list, optional) – Which folds belong to the validation set. Defaults to [9].

  • test_folds (list, optional) – Which folds belong to the test set. Defaults to [10].

  • path_to_urbansound_csv (str) – Path to metadata/UrbanSound8K.csv. Defaults to None.

Raises

ValueError – Raised if the path to the CSV isn’t given.
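
A minimal sketch of using this organization function through scripts.reorganize.reorganize() (the paths are hypothetical):

from scripts.reorganize import reorganize

# Split a (hypothetical) UrbanSound8K copy into train/val/test folders by fold.
reorganize(
    input_path='data/UrbanSound8K/audio',
    output_path='data/urbansound_organized',
    org_func='split_urbansound_by_fold',
    path_to_urbansound_csv='data/UrbanSound8K/metadata/UrbanSound8K.csv',
)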

scripts.resample

scripts.resample.build_parser()[source]

Builds the parser for scripts.resample.

usage: python -m scripts.resample  [-h] [--input_path INPUT_PATH] [--output_path OUTPUT_PATH]
                    [--sample_rate SAMPLE_RATE] [--num_workers NUM_WORKERS]
                    [--audio_extensions AUDIO_EXTENSIONS [AUDIO_EXTENSIONS ...]]

optional arguments:
  -h, --help            show this help message and exit
  --input_path INPUT_PATH
                        Root of folder where all audio files will be
                        resampled.
  --output_path OUTPUT_PATH
                        Root of folder where all resampled files will be
                        placed. Will match the same structure as the
                        input_path folder structure.
  --sample_rate SAMPLE_RATE
                        Sample rate to resample files to.
  --num_workers NUM_WORKERS
                        How many workers to use in parallel to resample files.
  --audio_extensions AUDIO_EXTENSIONS [AUDIO_EXTENSIONS ...]
                        Audio extensions to look for in the input_path.
                        Matching ones will be resampled and placed in the
                        output_path at the same relative location.

Also see the arguments to scripts.resample.resample().

Returns

The parser for this script.

Return type

argparse.ArgumentParser

scripts.resample.ig_f(dir, files)[source]

Filter for making sure something is a file.

Parameters
  • dir (str) – Directory to filter to only look for files.

  • files (list) – List of items to filter.

Returns

Filtered list.

Return type

list

scripts.resample.resample(input_path, output_path, sample_rate, num_workers=1, audio_extensions=['.wav', '.mp3', '.aac'])[source]

Resamples a folder of audio files into a copy of the same folder with the same structure but with every audio file replaced with a resampled version of that audio file. Relative paths to the audio file from the root of the folder will be the same.

Parameters
  • input_path (str) – Root of folder where all audio files will be resampled.

  • output_path (str) – Root of folder where all resampled files will be placed. Will match the same structure as the input_path folder structure.

  • sample_rate (int) – Sample rate to resample files to.

  • num_workers (int, optional) – How many workers to use in parallel to resample files. Defaults to 1.

  • audio_extensions (list, optional) – Audio extensions to look for in the input_path. Matching ones will be resampled and placed in the output_path at the same relative location. Defaults to [‘.wav’, ‘.mp3’, ‘.aac’].
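
For example, a minimal sketch (the paths and sample rate are illustrative):

from scripts.resample import resample

# Resample a (hypothetical) folder of audio to 16 kHz using 4 parallel workers.
resample(
    input_path='data/raw_audio',
    output_path='data/audio_16k',
    sample_rate=16000,
    num_workers=4,
    audio_extensions=['.wav', '.mp3'],
)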

scripts.resample.resample_audio_file(original_path, resample_path, sample_rate, verbose=False)[source]

Resamples an audio file at one path and places it at another path at a specified sample rate.

Parameters
  • original_path (str) – Path of audio file to be resampled.

  • resample_path (str) – Path to save resampled audio file to.

  • sample_rate (int) – Sample rate to resample audio file to.

scripts.run

scripts.run.build_parser()[source]

Builds the parser for scripts.run.

usage: python -m scripts.run  [-h] --command COMMAND [--run_in RUN_IN]

optional arguments:
  -h, --help         show this help message and exit
  --command COMMAND  Command to run.
  --run_in RUN_IN    Whether to run the command in the host or the container.
                     Defaults to host.

Also see the arguments to scripts.run.run().

Returns

The parser for this script.

Return type

argparse.ArgumentParser

scripts.run.run(command, run_in='host')[source]

Runs a command in the shell. Useful for sequences like wget a dataset, then unzip to another directory. You can just put the command in a yml file and call it here. You can run a sequence of commands easily by putting them one after the other in a .yml file and calling this script with the ‘-y’ option.

Parameters

command (str) – Command to run.
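
For example, a minimal sketch (the command and URL are illustrative):

from scripts.run import run

# Download and unpack a (hypothetical) dataset on the host.
run('wget https://example.com/dataset.zip && unzip dataset.zip -d data/', run_in='host')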

scripts.sweep_experiment

scripts.sweep_experiment.build_parser()[source]

Builds the parser for scripts.sweep_experiment.

usage: python -m scripts.sweep_experiment  [-h] -p PATH_TO_YML_FILE [--num_jobs NUM_JOBS]
                    [--num_gpus NUM_GPUS] [--run_in RUN_IN]

optional arguments:
  -h, --help            show this help message and exit
  -p PATH_TO_YML_FILE, --path_to_yml_file PATH_TO_YML_FILE
                        Path to the configuration for the base experiment.
                        This will be expanded by the script, filling in the
                        values defined in 'sweep' accordingly, and create new
                        experiments.
  --num_jobs NUM_JOBS   Controls the number of jobs to use in the created
                        pipelines. Defaults to 1.
  --num_gpus NUM_GPUS   Controls the number of gpus to use in the created
                        pipelines. Defaults to 0.
  --run_in RUN_IN       Run jobs in containers or on the host ('container' or
                        'host'). Defaults to host.

Also see the arguments to scripts.sweep_experiment.sweep_experiment().

Returns

The parser for this script.

Return type

argparse.ArgumentParser

scripts.sweep_experiment.create_experiments(path_to_yml_file)[source]

The main logic of this script. Takes the path to the base experiment file and loads the configuration. It then goes through the sweep dictionary kept in that base experiment file. The sweep dictionary tells how to update the configuration. The Cartesian product of all the possible settings specified by sweep is taken. Each experiment is updated accordingly. The length of the Cartesian product of the sweep is the number of experiments that get created.

Parameters

path_to_yml_file (str) – Path to base experiment file.

Returns

2-element tuple containing

  • experiments (list): List of paths to .yml files that define the generated experiments.

  • cache_experiments (list): List of paths to .yml files that define the experiments used for creating caches, if any.

Return type

tuple

scripts.sweep_experiment.create_pipeline(path_to_yml_files, script_name, num_jobs=1, num_gpus=0, run_in='host', blocking=False, prefix='-p', extra_cmd_args='')[source]

Takes a list of yml files, a script name, and some configuration options and creates a pipeline that can be passed to scripts.pipeline so that each job is executed accordingly.

Parameters
  • path_to_yml_files (list) – List of paths to each .yml file that contains the generated experiment configuration from the sweep.

  • script_name (str) – What script to use, should exist in scripts.

  • num_jobs (int, optional) – Number of jobs to run in parallel. Used as the max_workers argument in runners.script_runner_pool.ScriptRunnerPool. Defaults to 1.

  • num_gpus (int, optional) – Number of GPUs to use for each job. Defaults to 0.

  • run_in (str, optional) – Whether to run on ‘host’ or ‘container’. Defaults to ‘host’.

  • blocking (bool, optional) – Whether to block on each job (forces the jobs to run sequentially). Defaults to False.

  • prefix (str, optional) – The prefix to use before the command (either ‘-p’ or ‘-y’). Defaults to ‘-p’.

  • extra_cmd_args (str, optional) – Any extra command line arguments that pipeline may need to run the script, specified as a str as if it was on the command line. Defaults to ‘’.

Returns

A dictionary containing the sequence of pipelines that is later dumped to YAML so it can be passed to scripts.pipeline.

Return type

dict
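
A minimal sketch of building a pipeline for a set of generated experiments (the file names are hypothetical):

from scripts.sweep_experiment import create_pipeline

# Create a pipeline that trains two (hypothetical) generated experiments,
# two at a time, with one GPU each.
pipeline = create_pipeline(
    ['experiments/sweep_0.yml', 'experiments/sweep_1.yml'],
    'train',      # script in scripts/ to run for each experiment
    num_jobs=2,
    num_gpus=1,
    run_in='host',
)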

scripts.sweep_experiment.nested_set(element, value, *keys)[source]

Use a list of keys to replace a value in a dictionary. The result will look like:

element[key1][key2][key3]...[keyn] = value

Parameters
  • element (dict) – Dictionary to iteratively query.

  • value – Value to set at the end of the query.

  • keys – Sequence of keys specifying where in the dictionary to set the value.

Raises
  • AttributeError – First argument must be a dictionary.

  • AttributeError – Must be called with at least three arguments.
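
For example, a minimal sketch (the dictionary contents are illustrative):

from scripts.sweep_experiment import nested_set

config = {'model_config': {'modules': {'recurrent_stack': {'args': {'hidden_size': 50}}}}}
# Equivalent to config['model_config']['modules']['recurrent_stack']['args']['hidden_size'] = 100
nested_set(config, 100, 'model_config', 'modules', 'recurrent_stack', 'args', 'hidden_size')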

scripts.sweep_experiment.replace_item(obj, key, replace_value)[source]

Recursively replaces the value of any matching key in a dictionary with a specified replacement value.

Parameters
  • obj (dict) – Dictionary where item is being replaced.

  • key (obj) – Key whose value should be replaced in the dictionary.

  • replace_value (obj) – What to replace the value of the key with.

Returns

Dictionary with the value of every matching key replaced with the specified value.

Return type

dict
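
For example, a minimal sketch (the dictionary contents are illustrative):

from scripts.sweep_experiment import replace_item

config = {'embedding_size': 10, 'model': {'embedding_size': 10, 'hidden_size': 50}}
# Every 'embedding_size' key, at any depth, now maps to 20.
updated = replace_item(config, 'embedding_size', 20)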

scripts.sweep_experiment.sweep_experiment(path_to_yml_file, num_jobs=1, num_gpus=0, run_in='host')[source]

Takes a base experiment file and sweeps across the ‘sweep’ key in it, replacing values as needed. Results in the Cartesian product of all of the parameters that are being swept across. Also creates pipeline files that can be passed to scripts.pipeline so that everything can be run in sequence easily, or in parallel as determined by num_jobs.

The sweep config is used to replace dictionary keys and create experiments on the fly. A separate experiment will be created for each sweep discovered. The set of experiments can then be submitted to the job runner in parallel or in sequence. If one of the arguments is a list, then it will loop across each of the items in the list, creating a separate experiment for each one. There’s no real error checking, so be careful when setting things up: it is possible to create invalid or buggy experiments (e.g. num_frequencies and n_fft that don’t match).

If there is a ‘.’ in the key, then it is an absolute path to the exact value to update in the configuration. If there isn’t, then it is a global update for all matching keys.

Here’s a simple example of a sweep configuration that specifies the STFT parameters and sweeps across the number of hidden units and embedding size:

sweep:
   - n_fft: 128
     hop_length: 64
     num_frequencies: 65 # n_fft / 2 + 1
     num_features: 65
     model_config.modules.recurrent_stack.args.hidden_size: [50, 100] # specific sweep, delimited by '.'
     embedding_size: [10, 20] # global sweep
     cache: '${CACHE_DIRECTORY}/musdb_128'
     populate_cache: true # controls whether to create a separate experiment for caching
     num_cache_workers: 60 # how many workers to use when populating the cache

The above creates 5 experiments, across the Cartesian product of hidden size and embedding size, +1 for the caching experiment:

- caching "experiment" where training data is prepared
- hidden_size = 50, embedding_size = 10  # 1st experiment
- hidden_size = 50, embedding_size = 20  # 2nd experiment
- hidden_size = 100, embedding_size = 10 # 3rd experiment
- hidden_size = 100, embedding_size = 20 # 4th experiment

Each sweep within an item of the list should use the same cache. The cache is created as a separate experiment. For example, if we want to sweep across STFT parameters, then we need different caches as different STFTs will result in different training data.

sweep:
   - n_fft: 128
     hop_length: 64
     num_frequencies: 65 # n_fft / 2 + 1
     num_features: 65
     model_config.modules.recurrent_stack.args.hidden_size: [50, 100] # specific sweep, delimited by '.'
     embedding_size: [10, 20] # global sweep
     cache: '${CACHE_DIRECTORY}/musdb_128'
     populate_cache: true # controls whether to create a separate experiment for caching
     num_cache_workers: 60 # how many workers to use when populating the cache

   - n_fft: 256
     hop_length: 64
     num_frequencies: 129 # n_fft / 2 + 1
     num_features: 129
     model_config.modules.recurrent_stack.args.hidden_size: [50, 100] # specific sweep, delimited by '.'
     embedding_size: [10, 20] # global sweep
     cache: '${CACHE_DIRECTORY}/musdb_256'
     populate_cache: true # controls whether to create a separate experiment for caching
     num_cache_workers: 60 # how many workers to use when populating the cache

Now we create 10 experiments, 4 for each item in the list, +1 for each cache.

Parameters
  • path_to_yml_file (str) – Path to the configuration for the base experiment. This will be expanded by the script, filling in the values defined in ‘sweep’ accordingly, and creating new experiments.

  • num_jobs (int) – Controls the number of jobs to use in the created pipelines. Defaults to 1.

  • num_gpus (int) – Controls the number of gpus to use in the created pipelines. Defaults to 0.

  • run_in (str) – Run jobs in containers or on the host (‘container’ or ‘host’). Defaults to host.

scripts.sweep_experiment.update_config_with_sweep(config, sweep, combo)[source]

Update a configuration with a sweep. The experiment configuration is updated using the sweep and combo. The sweep contains every key that needs to be updated in the configuration. If something in the sweep is a list, then the associated key is updated with only one of the elements of the list; which element is specified by ‘combo’. Otherwise, the value from sweep is used.

Parameters
  • config (dict) – The experiment configuration that is being updated.

  • sweep (dict) – The full sweep that is used to update the configuration.

  • combo (dict) – The specific values for keys in the sweep that are lists.

Returns

An updated configuration using the sweep and combo arguments.

Return type

dict
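
A minimal sketch of how the sweep and combo interact (the values are illustrative):

from scripts.sweep_experiment import update_config_with_sweep

config = {'embedding_size': 10, 'n_fft': 128}
sweep = {'embedding_size': [10, 20], 'n_fft': 256}  # list-valued keys are swept
combo = {'embedding_size': 20}                       # picks one element for each list-valued key
updated = update_config_with_sweep(config, sweep, combo)
# updated now has embedding_size=20 (from combo) and n_fft=256 (taken directly from sweep).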

scripts.train

scripts.train.build_parser()[source]

Builds the parser for scripts.train.

usage: python -m scripts.train  [-h] -p PATH_TO_YML_FILE

optional arguments:
  -h, --help            show this help message and exit
  -p PATH_TO_YML_FILE, --path_to_yml_file PATH_TO_YML_FILE
                        Path to the configuration for the experiment that is
                        getting trained. The script will take the
                        configuration and launch a training job for the
                        experiment.

Also see the arguments to scripts.train.train_experiment().

Returns

The parser for this script.

Return type

argparse.ArgumentParser

scripts.train.train_experiment(path_to_yml_file, **kwargs)[source]

Starts a training job for the experiment defined at the specified path. Fits the model accordingly. You can also pass in keyword arguments, which get added to the “options” dictionary that is passed to the Trainer class.

Parameters
  • path_to_yml_file (str) – Path to the configuration for the experiment that is getting trained. The script will take the configuration and launch a training job for the experiment.

  • kwargs (dict) – Additional keyword arguments that are added to the “options” dictionary passed to the Trainer class.
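
A minimal sketch of calling this from Python (the experiment path and the extra option are hypothetical):

from scripts.train import train_experiment

# Launch training for a (hypothetical) experiment; extra keyword arguments are
# forwarded into the "options" dictionary that is passed to the Trainer class.
train_experiment('experiments/my_experiment.yml', num_epochs=50)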

scripts.visualize

scripts.visualize.build_parser()[source]

Builds the parser for scripts.visualize.

usage: python -m scripts.visualize  [-h] -p PATH_TO_YML_FILE [-f FILE_NAMES [FILE_NAMES ...]]
                    [-e EVAL_KEYS [EVAL_KEYS ...]]

optional arguments:
  -h, --help            show this help message and exit
  -p PATH_TO_YML_FILE, --path_to_yml_file PATH_TO_YML_FILE
                        Path to the yml file that defines the experiment. The
                        visualization will be placed into a "viz" folder in
                        the same directory as the yml file.
  -f FILE_NAMES [FILE_NAMES ...], --file_names FILE_NAMES [FILE_NAMES ...]
                        Files to evaluate. Use only the base name of each file
                        in the list that is being evaluated.
  -e EVAL_KEYS [EVAL_KEYS ...], --eval_keys EVAL_KEYS [EVAL_KEYS ...]
                        All of the dataset keys to be used to visualize the
                        experiment. Will visualize for each eval_key in
                        sequence. Defaults to ['test'].

Also see the arguments to scripts.visualize.visualize().

Returns

The parser for this script.

Return type

argparse.ArgumentParser

scripts.visualize.visualize(path_to_yml_file, file_names=[], eval_keys=['test'])[source]

Takes in a path to a yml file containing an experiment configuration and runs the algorithm specified in the experiment on a random file from its test dataset. If the algorithm has plotting available, then plotting is used to visualize the algorithm’s output and save it to a figure. The associated audio is also saved.

Parameters
  • path_to_yml_file (str) – Path to the yml file that defines the experiment. The visualization will be placed into a “viz” folder in the same directory as the yml file.

  • file_names (list, optional) – Files to visualize. Use only the base name of each file in the list that is being evaluated. Defaults to [].

  • eval_keys (list) – All of the dataset keys to be used to visualize the experiment. Will visualize for each eval_key in sequence. Defaults to [‘test’].
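
A minimal sketch of calling this from Python (the experiment path and file name are hypothetical):

from scripts.visualize import visualize

# Visualize a (hypothetical) experiment on a specific test file.
visualize(
    'experiments/my_experiment.yml',
    file_names=['mixture_one_name.wav'],  # base names of files in the test dataset
    eval_keys=['test'],
)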

Module contents

scripts.build_parser_for_yml_script()[source]

Builds an ArgumentParser with a common setup. Used in the scripts.

scripts.cmd(script_func, parser_func, exec_func=<function sequential_job_execution>)[source]

Builds a parser for any script in the scripts/ directory. Scripts should have two main functions: 1) a function that actually runs the script and 2) a build_parser function that builds up an ArgumentParser with informative help text for the script. This function allows the command line arguments to be passed to the script either through the command line as normal or through a YAML file which has matching keyword arguments for the script. Positional arguments are discouraged.

The arguments in the YAML file are checked by passing them back into the command line parser function before giving them to the script. This also allows for default values to be defined in the script argument parser.

A script can be called multiple times using a YAML file by having a top-level key called ‘jobs’. ‘jobs’ should contain a list where each item in the list is a set of arguments to be passed to the script one by one.

For each script, simply add this like so:

if __name__ == "__main__":
    cmd(script_func, parser_func)

Then to run a script, simply do:

python -m scripts.[script_name] --yml [path_to_yml_file] # for yml
python -m scripts.[script_name] [--arg val] # for cmd line

Parameters
  • script_func (function) – A function that will take in the arguments as keyword arguments and perform some action.

  • parser_func (function) – A function that will build up the argument parser for the script.

scripts.document_parser(script_name, reference)[source]

Fancy function for documenting a parser easily. Runs the function to build the parser, then gets the parser’s help text and formats it into the function’s docstring for Sphinx. A bit hacky, but it works great!

Parameters
  • script_name (str) – Name of the script.

  • reference (str) – Where to point the reference function for the script (e.g. the script that it runs).

scripts.sequential_job_execution(script_func, jobs)[source]

Execute jobs one by one with a simple for loop.

Parameters
  • script_func (function) – Function to run.

  • jobs (list) – List of dictionaries containing arguments for function.