Synthetic soundscapes

Downloading the data

This page explains how to download the audio files and the scripts used to generate the synthetic soundscapes. Three scenarios are covered.

Users who just want to download the DCASE 2019 dataset

Download DESED_synth_dcase2019.tar.gz from DESED_synthetic.
Run `tar -xzvf DESED_synth_dcase2019.tar.gz` to extract it.
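
For users who prefer to stay in Python, the extraction step can be sketched with the standard library. This is a minimal sketch, assuming the archive has already been downloaded; the helper name `extract_archive` is hypothetical and not part of the DESED repository.

```python
import tarfile
from pathlib import Path


def extract_archive(archive_path, dest_dir="."):
    """Extract a .tar.gz archive into dest_dir and return its top-level names."""
    dest = Path(dest_dir)
    dest.mkdir(parents=True, exist_ok=True)
    with tarfile.open(archive_path, "r:gz") as tar:
        names = tar.getnames()
        if hasattr(tarfile, "data_filter"):
            # Newer Python versions: refuse unsafe members (absolute paths, etc.).
            tar.extractall(dest, filter="data")
        else:
            tar.extractall(dest)
    # Keep only the first path component of every member.
    return sorted({name.split("/")[0] for name in names})
```

Calling `extract_archive("DESED_synth_dcase2019.tar.gz")` from the download directory is then equivalent to the `tar -xzvf` command above.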

Users who want to reproduce the DCASE 2019 dataset

Clone the GitHub repository (https://github.com/turpaultn/DESED).
Follow the instructions in synthetic/create_dcase2019_dataset.sh.
Note that the distortions were produced with Matlab and you have to create them yourself; a Python version will be added later. For now, if you do not want to create them, uncomment the corresponding lines in `create_dcase2019_dataset.sh` to download the evaluation set, which contains the distorted data.

Users who want to create new synthetic data

Download DESED_synth_soundbank.tar.gz from DESED_synthetic.
Run `tar -xzvf DESED_synth_soundbank.tar.gz` to extract it.
Run `cd synthetic/src`.
Run `python get_background_training.py` to download the SINS background files.
See the example scripts in synthetic/src in the GitHub repository. They are described in Generating new synthetic data.

Generating new synthetic data

The data are generated using Scaper. The examples below show how to use it.
For more information, see the Scaper documentation.
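
To give a feel for the Scaper workflow, here is a minimal sketch of generating one soundscape. It is not taken from the DESED scripts: the helper names `make_event_spec` and `generate_soundscape`, the onset and SNR ranges, the 2 s event duration and the `ref_db` value are all illustrative assumptions. It also assumes the soundbank follows Scaper's layout of one subfolder per label.

```python
def make_event_spec(label, onset_range=(0.0, 8.0), snr_range=(6.0, 30.0), duration=2.0):
    """Build keyword arguments for Scaper.add_event (all values are illustrative)."""
    return {
        "label": ("const", label),
        "source_file": ("choose", []),  # empty list: pick any file with that label
        "source_time": ("const", 0),
        "event_time": ("uniform", onset_range[0], onset_range[1]),
        "event_duration": ("const", duration),
        "snr": ("uniform", snr_range[0], snr_range[1]),
        "pitch_shift": None,
        "time_stretch": None,
    }


def generate_soundscape(fg_path, bg_path, out_wav, out_jams, labels, duration=10.0):
    """Mix one background with one event per label and write audio plus JAMS."""
    import scaper  # imported lazily so the helper above works without Scaper

    sc = scaper.Scaper(duration, fg_path, bg_path)
    sc.ref_db = -50  # assumed reference loudness of the background
    sc.add_background(
        label=("choose", []),
        source_file=("choose", []),
        source_time=("const", 0),
    )
    for label in labels:
        sc.add_event(**make_event_spec(label))
    sc.generate(audio_path=out_wav, jams_path=out_jams)
```

The repository scripts remain the reference for the actual parameters used to build the dataset.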

Examples of how to generate new soundscapes in the same way as the DESED_synthetic dataset:

generate_training.py uses event_occurences_train.json for the co-occurrence of events.
generate_eval_FBSNR.py generates similar subsets with different foreground-background sound-to-noise ratios (FBSNR): 30dB, 24dB, 15dB, 0dB. It uses event_occurences_eval.json for the occurrence and co-occurrence of events.
generate_eval_var_onset.py generates subsets with a single event per file; the subsets differ in the onset position:
1. Onset between 0.25s and 0.75s.
2. Onset between 5.25s and 5.75s.
3. Onset between 9.25s and 9.75s.
generate_eval_long_short.py generates subsets with a long event in the background and short events in the foreground; the subsets differ in the FBSNR: 30dB, 15dB, 0dB.
generate_eval_distortion.py generates the distortion subsets. It is not yet available in Python; see generate_eval_distortion.m for the Matlab code (will be updated later).
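
The three onset windows listed above can be sketched as follows. This is an illustration only, not the code of generate_eval_var_onset.py: the subset names and the helpers `sample_onsets` and `as_scaper_event_time` are hypothetical, and uniform sampling inside each window is an assumption.

```python
import random

# Onset windows in seconds, copied from the list above.
# The subset names are hypothetical labels chosen for this sketch.
ONSET_WINDOWS = {
    "onset_start": (0.25, 0.75),
    "onset_middle": (5.25, 5.75),
    "onset_end": (9.25, 9.75),
}


def sample_onsets(n_files, seed=0):
    """Draw one onset per file for each subset, uniformly inside its window."""
    rng = random.Random(seed)
    return {
        name: [rng.uniform(low, high) for _ in range(n_files)]
        for name, (low, high) in ONSET_WINDOWS.items()
    }


def as_scaper_event_time(name):
    """Express a window as a Scaper-style distribution tuple."""
    low, high = ONSET_WINDOWS[name]
    return ("uniform", low, high)
```

The tuple returned by `as_scaper_event_time` has the form Scaper expects for the `event_time` argument of `add_event`.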

When a script generates multiple subfolders but only one csv file, the same csv applies to all of them. For example, when modifying the FBSNR, the labels (onsets, offsets) do not change.
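
The idea of one label file shared by several subfolders can be sketched as below. This is a hedged illustration, not the repository code: the function name `plan_fbsnr_subsets`, the subfolder names, the `snr_db` field and the tab-separated column names (`filename`, `onset`, `offset`, `event_label`) are assumptions made for the example.

```python
import csv
import io

FBSNR_DB = (30, 24, 15, 0)  # values listed for generate_eval_FBSNR.py


def plan_fbsnr_subsets(events, fbsnr_values=FBSNR_DB):
    """Return one shared label table and one event plan per FBSNR value.

    events: list of dicts with keys filename, onset, offset, event_label.
    """
    buf = io.StringIO()
    writer = csv.DictWriter(
        buf, fieldnames=["filename", "onset", "offset", "event_label"], delimiter="\t"
    )
    writer.writeheader()
    writer.writerows(events)
    # Timing is copied unchanged into every subset; only the level differs.
    subsets = {
        f"{db}dB": [dict(event, snr_db=db) for event in events]
        for db in fbsnr_values
    }
    return buf.getvalue(), subsets
```

Because the onsets and offsets are identical in every subset, a single label table is enough for all subfolders.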

Note: The training soundbank can be split into training and validation soundbanks if you want to create validation data.
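
One possible way to make such a split is sketched below. It assumes the foreground soundbank has one subfolder per class containing `.wav` files; the function name `split_soundbank` and the 10% validation fraction are hypothetical choices, not values from the DESED repository.

```python
import random
from pathlib import Path


def split_soundbank(foreground_dir, valid_fraction=0.1, seed=0):
    """Split the files of each class folder into train and validation lists."""
    rng = random.Random(seed)
    split = {"train": {}, "validation": {}}
    class_dirs = sorted(p for p in Path(foreground_dir).iterdir() if p.is_dir())
    for class_dir in class_dirs:
        files = sorted(class_dir.glob("*.wav"))
        rng.shuffle(files)
        # Keep at least one validation file when the class has more than one file.
        n_valid = max(1, round(len(files) * valid_fraction)) if len(files) > 1 else 0
        split["validation"][class_dir.name] = files[:n_valid]
        split["train"][class_dir.name] = files[n_valid:]
    return split
```

Splitting at the level of source files, before any soundscape is generated, keeps the same isolated event from appearing in both training and validation mixtures.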

Class-wise statistics in terms of isolated events.

Class	Development set	Evaluation set
Alarm/bell/ringing	190	63
Blender	98	27
Cat	88	26
Dishes	109	34
Dog	136	43
Electric shaver/toothbrush	56	17
Frying	64	17
Running water	68	20
Speech	128	47
Vacuum cleaner	74	20
Total	1011	314
