Description

The DESED dataset contains:

- Recorded soundscapes.
- Synthetic sound bank (plus code to create new soundscapes using Scaper) and the DCASE 2019 soundscapes.
- Public evaluation set (recorded soundscapes) used in DCASE 2019, a.k.a. the YouTube eval set in DCASE (the Vimeo set is not available).

The dataset is split into two subsets as described below.

Verified and unverified subsets of AudioSet.
- Unlabel_in_domain data (unverified data, labels discarded): 14412 files.
- Weakly labeled data (training data, labels verified at the clip level): 1578 files.
- Validation data (labels with time boundaries, i.e. strong labels): 1168 files.
- Public evaluation files: 692 YouTube files.
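
The file counts above can serve as a quick sanity check on a local copy of the recorded soundscapes. The following is a minimal sketch only: the subset folder names and the assumption that every clip is a `.wav` file stored directly in its subset folder are hypothetical and may not match the actual DESED layout.

```python
from pathlib import Path

# File counts quoted in the list above.
# The keys are hypothetical folder names, not confirmed DESED paths.
EXPECTED_COUNTS = {
    "unlabel_in_domain": 14412,
    "weak": 1578,
    "validation": 1168,
    "eval_public": 692,
}


def count_wav_files(folder):
    """Count .wav files directly under `folder` (0 if the folder is missing)."""
    folder = Path(folder)
    if not folder.is_dir():
        return 0
    return sum(1 for p in folder.iterdir() if p.suffix.lower() == ".wav")


def check_subsets(root, expected=EXPECTED_COUNTS):
    """Return {subset: (found, expected)} for each subset whose count differs."""
    mismatches = {}
    for subset, n_expected in expected.items():
        n_found = count_wav_files(Path(root) / subset)
        if n_found != n_expected:
            mismatches[subset] = (n_found, n_expected)
    return mismatches
```

An empty dictionary returned by `check_subsets(root)` would mean every subset has the expected number of files. The four counts add up to 17850 recorded clips in total.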

Background files are extracted from SINS [2], MUSAN [3] or YouTube and were selected because they contain very few occurrences of our sound event classes.
Foreground files are extracted from Freesound [4][5], manually verified for quality, and segmented to remove silences.
Mixtures are described in "Generating new synthetic data".
Sound bank:
- Training: 2060 background files (SINS) and 1009 foreground files (Freesound).
- Eval: 12 (Freesound) + 5 (YouTube) background files and 314 foreground files (Freesound).
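
As a small worked tally, the sound bank sizes above can be summed per split. This is an illustrative sketch only; the dictionary layout and the `sound_bank_totals` helper are hypothetical and not part of the DESED code.

```python
# Sound bank file counts quoted above, grouped by split, role and source.
SOUND_BANK = {
    "training": {
        "background": {"SINS": 2060},
        "foreground": {"Freesound": 1009},
    },
    "eval": {
        "background": {"Freesound": 12, "YouTube": 5},
        "foreground": {"Freesound": 314},
    },
}


def sound_bank_totals(bank):
    """Sum the per-source counts into one number per split and role."""
    return {
        split: {role: sum(sources.values()) for role, sources in roles.items()}
        for split, roles in bank.items()
    }


totals = sound_bank_totals(SOUND_BANK)
# totals["eval"]["background"] is 17 (12 Freesound + 5 YouTube files)
```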

You can find information about this dataset in these papers:

- Turpault et al.: description of the DESED dataset and official results of DCASE 2019 task 4.
- Serizel et al.: robustness of DCASE 2019 systems on the synthetic evaluation set.

For more information about the DCASE 2019 dataset, see "Desed for DCASE 2019 task 4" below, or visit the DCASE 2019 task 4 web page.