The DESED dataset contains:

  • Recorded soundscapes.
  • A synthetic sound bank (plus code to create new soundscapes using Scaper) and the DCASE 2019 soundscapes.
  • The public evaluation set (recorded soundscapes) used in DCASE 2019 (a.k.a. the YouTube eval set in DCASE; Vimeo is not available).


The dataset is split into two subsets as described below.

Recorded soundscapes
  • Verified and unverified subsets of AudioSet.
    • Unlabel_in_domain data: unverified data whose labels have been discarded: 14412 files.
    • Weakly labeled data: training data whose labels have been verified at the clip level: 1578 files.
    • Validation data: labels with time boundaries (strong labels): 1168 files.
    • Evaluation public files: 692 YouTube files.
Synthetic soundscapes
  • Background files are extracted from SINS [2], MUSAN [3] or YouTube and have been selected because they contain very few occurrences of our sound event classes.
  • Foreground files are extracted from Freesound [4][5], manually verified for quality, and segmented to remove silences.
  • Mixtures are described in Generating new synthetic data.
  • Sound bank:
    • Training: 2060 background files (SINS) and 1009 foreground files (Freesound).
    • Eval: 12 (Freesound) + 5 (YouTube) background files and 314 foreground files (Freesound).
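
As a quick sanity check after downloading, the file counts listed above can be written down and summed. This is only a minimal sketch: the numbers are copied from the lists above, while the dictionary keys and the helper name total_recorded are illustrative assumptions, not names used by the DESED code.

```python
# File counts copied from the lists above.
# The keys are illustrative labels, not official DESED folder names.
RECORDED = {
    "unlabel_in_domain": 14412,  # unverified, labels discarded
    "weak": 1578,                # clip-level (weak) labels
    "validation": 1168,          # strong labels with time boundaries
    "eval_public": 692,          # YouTube evaluation files
}

SOUNDBANK = {
    "training": {"background": 2060, "foreground": 1009},
    # Eval backgrounds: 12 from Freesound plus 5 from YouTube.
    "eval": {"background": 12 + 5, "foreground": 314},
}


def total_recorded(counts=RECORDED):
    """Return the total number of recorded soundscape files."""
    return sum(counts.values())


if __name__ == "__main__":
    print("Recorded soundscapes:", total_recorded())  # 17850
    for split, parts in SOUNDBANK.items():
        print(split, "sound bank files:", sum(parts.values()))
```

Comparing these totals with the number of files actually present on disk is one simple way to spot an incomplete download.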


You can find information about this dataset in these papers:

  • Turpault et al. Description of DESED dataset + official results of DCASE 2019 task 4.
  • Serizel et al. Robustness of DCASE 2019 systems on synthetic evaluation set.

Relation to DCASE task 4

If you want more information about the DCASE 2019 dataset, go to Desed for DCASE 2019 task 4 below, or visit the DCASE 2019 task 4 web page.
