In [2]:
from nussl import jupyter_utils, AudioSignal

In [41]:
from IPython.core.display import display, HTML
display(HTML("<style>.container { width:65% !important; font-size:1em;}</style>"))

In [16]:
from IPython.display import HTML

HTML('''<script>
code_show = true;
function code_toggle() {
    if (code_show) {
        $('div.input').hide();
    } else {
        $('div.input').show();
    }
    code_show = !code_show;
}
$(document).ready(code_toggle);
</script>
<form action="javascript:code_toggle()"><input type="submit" value="Toggle code"></form>''')

Out[16]:

# Figure 1.2

An example of audio representations. The top plot depicts the time-series representation, while the bottom plot depicts the spectrogram (time-frequency) representation. The auditory scene consists of a small jazz band (saxophone, piano, drums, and bass), chatter produced by a crowd of people, and dishes clattering as they are washed. It is difficult to visually separate the three sources from one another in either the time-frequency representation or the original time-series.
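A spectrogram like the one in the bottom plot can be computed by taking a short-time Fourier transform of the time series and keeping the magnitudes. The sketch below (not the notebook's own code; the sample rate, tone frequencies, and window length are illustrative assumptions) builds a synthetic two-tone mixture with NumPy and computes its magnitude spectrogram with SciPy:

```python
import numpy as np
from scipy.signal import stft

sr = 8000                                 # sample rate in Hz (assumed)
t = np.arange(sr) / sr                    # one second of audio
# A toy "mixture": a 440 Hz tone plus a quieter 880 Hz tone.
mix = np.sin(2 * np.pi * 440 * t) + 0.5 * np.sin(2 * np.pi * 880 * t)

# Short-time Fourier transform: complex matrix of frequency bins x time frames.
f, frames, Z = stft(mix, fs=sr, nperseg=512)
spectrogram = np.abs(Z)                   # magnitude spectrogram

print(spectrogram.shape)                  # (frequency bins, time frames)
```

Each column of `spectrogram` is the spectrum of one short window of the signal, which is what gives the time-frequency view its two axes.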

In [46]:
from wand.image import Image as WImage
img = WImage(filename='../mixture.png')
img

Out[46]:
In [47]:
audio = [
    ('mixture/mix2-261.wav', 'Mixture'),
]
for a, l in audio:
    print(l)
    s = AudioSignal(a)
    s.to_mono()
    jupyter_utils.embed_audio(s, display=True)

Mixture


# Figure 1.3

An example of time-frequency assignments in the spectrogram from Figure 1.2. The auditory scene consists of a small jazz band (saxophone, piano, drums, and bass), chatter produced by a crowd of people, and dishes clattering as they are washed. Dark blue represents silence, light blue the jazz band, red the chatter, and yellow the dishes.
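Assigning time-frequency bins to sources, as the colors in the figure do, is the idea behind masking-based separation: multiply the mixture's spectrogram by a binary mask that is 1 where a source dominates and 0 elsewhere, then invert back to audio. The sketch below (illustrative only, not the masks used for the figure; the two tones and STFT parameters are assumptions) recovers a low tone from a two-tone mixture this way:

```python
import numpy as np
from scipy.signal import stft, istft

sr = 8000
t = np.arange(sr) / sr
low = np.sin(2 * np.pi * 220 * t)         # "source 1": low tone
high = np.sin(2 * np.pi * 2000 * t)       # "source 2": high tone
mix = low + high

# STFTs of the mixture and of each (oracle) source.
_, _, Zmix = stft(mix, fs=sr, nperseg=512)
_, _, Zlow = stft(low, fs=sr, nperseg=512)
_, _, Zhigh = stft(high, fs=sr, nperseg=512)

# Binary mask: 1 in the time-frequency bins where the low source dominates.
mask = (np.abs(Zlow) > np.abs(Zhigh)).astype(float)

# Apply the mask to the mixture and invert back to a time series.
_, estimate = istft(mask * Zmix, fs=sr, nperseg=512)
```

Because the two tones occupy disjoint frequency bins, this oracle mask recovers the low tone almost exactly; real separation systems must estimate such masks without access to the isolated sources.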

In [48]:
from wand.image import Image as WImage
img = WImage(filename='../separated.png')
img

Out[48]:
In [49]:
audio = [
    ('mixture/separated_music-268.wav', 'Separated music (light blue mask)'),
]
for a, l in audio:
    print(l)
    s = AudioSignal(a)
    s.to_mono()
    jupyter_utils.embed_audio(s, display=True)

Separated music (light blue mask)