Synthetic soundscapes

Downloading the data

This page explains how to download the audio files and the scripts used to generate synthetic soundscapes. Three scenarios are considered.

Users who only want to download the DCASE 2019 dataset
  • Download DESED_synth_dcase2019.tar.gz from DESED_synthetic.
  • Run tar -xzvf DESED_synth_dcase2019.tar.gz to extract it.
Users who want to reproduce the DCASE 2019 dataset
  • Clone the GitHub repository (https://github.com/turpaultn/DESED).
  • Follow the instructions in synthetic/create_dcase2019_dataset.sh.
  • Be careful: the distortions were produced in Matlab and are left for you to create. The script will be updated later to do this in Python. For now, if you do not want to create them yourself, uncomment the corresponding lines in `create_dcase2019_dataset.sh` to download the evaluation set, which contains the distorted data.
Users who want to create new synthetic data
  • Download DESED_synth_soundbank.tar.gz from DESED_synthetic.
  • Run tar -xzvf DESED_synth_soundbank.tar.gz to extract it.
  • Run cd synthetic/src.
  • Run python get_background_training.py to download the SINS background files.
  • See the example scripts for creating files in synthetic/src of the GitHub repository. They are described in Generating new synthetic data.
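
For users who prefer to script the extraction step rather than call tar by hand, the following is a minimal sketch using only the Python standard library. The function name `extract_archive` is hypothetical and is not part of the DESED repository; it assumes the archive has already been downloaded.

```python
import tarfile
from pathlib import Path


def extract_archive(archive_path, dest_dir="."):
    """Extract a .tar.gz archive into dest_dir and return its top-level entries."""
    dest = Path(dest_dir)
    dest.mkdir(parents=True, exist_ok=True)
    with tarfile.open(archive_path, "r:gz") as tar:
        names = tar.getnames()
        tar.extractall(dest)
    # Keep only the first path component of each member (skip a bare "." entry).
    return sorted({Path(name).parts[0] for name in names if Path(name).parts})


# Example call (assumes the file sits in the current directory):
# extract_archive("DESED_synth_soundbank.tar.gz")
```

Only extract archives obtained from the official download location, since `extractall` writes whatever paths the archive contains.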

Generating new synthetic data

Data are generated using Scaper. The examples below show how to use it.
For more information, see the Scaper documentation.
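
As a rough illustration of what a Scaper-based generation script looks like, here is a minimal sketch. It is not taken from the DESED scripts: the helper names (`event_kwargs`, `make_soundscape`), the 10-second duration, the reference level and the onset, duration and SNR ranges are all assumptions chosen for illustration. It relies on the `Scaper`, `add_background`, `add_event` and `generate` calls from the Scaper library.

```python
def event_kwargs(label, onset_range=(0.0, 9.0), snr_range=(6, 30)):
    """Build the arguments of Scaper.add_event as distribution tuples.

    All numeric values are illustrative assumptions, not the DESED settings.
    """
    return {
        "label": ("const", label),
        "source_file": ("choose", []),  # empty list: any file of this label
        "source_time": ("const", 0),
        "event_time": ("uniform", onset_range[0], onset_range[1]),
        "event_duration": ("uniform", 0.25, 10.0),
        "snr": ("uniform", snr_range[0], snr_range[1]),
        "pitch_shift": None,
        "time_stretch": None,
    }


def make_soundscape(fg_path, bg_path, out_wav, out_jams, label, duration=10.0):
    """Generate one soundscape with a random background and a single event."""
    # Imported here so event_kwargs can be used without Scaper installed.
    import scaper

    sc = scaper.Scaper(duration, fg_path, bg_path)
    sc.ref_db = -50  # assumed reference loudness for the background
    sc.add_background(label=("choose", []),
                      source_file=("choose", []),
                      source_time=("const", 0))
    sc.add_event(**event_kwargs(label))
    sc.generate(out_wav, out_jams)
```

Here `fg_path` and `bg_path` are expected to point to folders containing one subfolder per label, which is the layout Scaper requires.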

Examples of how to generate new sounds in the same way as the DESED_synthetic dataset:

  • generate_training.py uses event_occurences_train.json for the co-occurrence of events.
  • generate_eval_FBSNR.py generates similar subsets with different foreground-background sound-to-noise ratios (FBSNR): 30 dB, 24 dB, 15 dB, 0 dB. It uses event_occurences_eval.json for the occurrence and co-occurrence of events.
  • generate_eval_var_onset.py generates subsets with a single event per file. The subsets differ in the onset position:
    1. Onset between 0.25 s and 0.75 s.
    2. Onset between 5.25 s and 5.75 s.
    3. Onset between 9.25 s and 9.75 s.
  • generate_eval_long_short.py generates subsets with a long event in the background and short events in the foreground. The subsets differ in the FBSNR: 30 dB, 15 dB, 0 dB.
  • generate_eval_distortion.py generates the distortion subsets. It is not yet available in Python; see generate_eval_distortion.m for the Matlab code (will be updated later).
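
The subset parameters listed above can be written down as plain configuration, as in the sketch below. The constant and function names are hypothetical, and drawing the onset uniformly inside each window is an assumption, since the list only states the window bounds.

```python
import random

# Values taken from the list above.
FBSNR_DB = [30, 24, 15, 0]
LONG_SHORT_FBSNR_DB = [30, 15, 0]
ONSET_WINDOWS_S = [(0.25, 0.75), (5.25, 5.75), (9.25, 9.75)]


def onset_distribution(window):
    """Express an onset window as a Scaper-style ('uniform', low, high) tuple."""
    low, high = window
    return ("uniform", low, high)


def sample_onset(window, rng):
    """Draw one onset (in seconds) uniformly inside the window (assumption)."""
    low, high = window
    return rng.uniform(low, high)


rng = random.Random(0)  # fixed seed so the draws are reproducible
example_onsets = [sample_onset(w, rng) for w in ONSET_WINDOWS_S]
```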

When a script generates multiple subfolders but only one CSV file, the same CSV applies to all of them. For example, when the FBSNR is modified, the labels (onsets, offsets) do not change.
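
To make the shared-CSV convention concrete, here is a small sketch that reads the labels once and pairs them with the audio in every subfolder. The column names (`filename`, `onset`, `offset`, `event_label`), the tab delimiter and the subfolder names are assumptions for illustration; check the generated files for the actual layout.

```python
import csv
import io
from pathlib import PurePosixPath


def read_labels(csv_text, delimiter="\t"):
    """Group (onset, offset, label) tuples by filename."""
    labels = {}
    for row in csv.DictReader(io.StringIO(csv_text), delimiter=delimiter):
        event = (float(row["onset"]), float(row["offset"]), row["event_label"])
        labels.setdefault(row["filename"], []).append(event)
    return labels


def pair_with_subfolders(labels, subfolders):
    """Yield (audio path, events) for each subfolder, reusing the same labels."""
    for folder in subfolders:
        for filename, events in labels.items():
            yield str(PurePosixPath(folder) / filename), events
```

With subfolders such as `["30dB", "0dB"]` (hypothetical names), every file appears once per subfolder with identical events, because only the mixing level differs between the subsets.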

Note: The training soundbank can be split into a training soundbank and a validation soundbank if you want to create validation data.
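
One possible way to make such a split is sketched below. The function `split_soundbank`, the 20% validation fraction and the fixed seed are assumptions for illustration and are not part of the DESED scripts. The split is done per class so that every class stays represented in both parts.

```python
import random


def split_soundbank(files_by_class, valid_fraction=0.2, seed=0):
    """Split each class's file list into training and validation parts."""
    rng = random.Random(seed)
    train, valid = {}, {}
    for label, files in files_by_class.items():
        shuffled = sorted(files)  # sort first so the result depends only on the seed
        rng.shuffle(shuffled)
        n_valid = int(round(len(shuffled) * valid_fraction))
        valid[label] = shuffled[:n_valid]
        train[label] = shuffled[n_valid:]
    return train, valid
```

The two resulting dictionaries can then be used to populate two separate foreground folders, one for generating training soundscapes and one for validation soundscapes.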

Class-wise statistics in terms of isolated events:

Class                         Development set   Evaluation set
Alarm/bell/ringing            190               63
Blender                       98                27
Cat                           88                26
Dishes                        109               34
Dog                           136               43
Electric shaver/toothbrush    56                17
Frying                        64                17
Running water                 68                20
Speech                        128               47
Vacuum cleaner                74                20
Total                         1011              314
