What is ACM Multimedia Reproducibility?
Reproducibility of scientific results is at the core of the Scientific Method.
The reason why we do research is to create new knowledge. This knowledge is built up over time by research findings that are confirmed because they are reproducible by the research community.
An experimental result is not fully established unless it can be independently reproduced. (ACM DL. Read it here.)
Overall reproducibility objectives
ACM Multimedia Reproducibility has the following goals:
- Promoting reproducibility as a strong foundation for scientific productivity and progress in the multimedia community,
- Increasing the impact of multimedia research by accelerating the dissemination and uptake of multimedia research results,
- Establishing a common understanding among researchers of replicability and reproducibility in order to communicate clearly and visibly, via a badge, the effort that authors have invested in ensuring that their papers are reproducible,
- Supporting easier dissemination of code, data sets, experimental frameworks and other products of multimedia research.
The long-term objective of ACM Multimedia Reproducibility is to promote a common research culture where sharing the full set of products of multimedia research (e.g., not just the algorithm description, but also the data, the code, the plots) is the norm rather than an exception. The main challenge we face is promoting research reproducibility both efficiently and effectively. To tackle this challenge, we must build technical expertise that can support repeatable and shareable research. The ACM MM Reproducibility Committee is here to help you with this.
A reproducibility badge for your work
Reproducibility at ACM Multimedia takes the concrete form of a reproducibility badge that is given to papers whose authors have invested effort to describe their algorithms and experiments in detail and to develop and release research artifacts. Quoting ACM:
By “artifact” we mean a digital object that was either created by the authors to be used as part of the study or generated by the experiment itself. For example, artifacts can be software systems, scripts used to run experiments, input datasets, raw data collected in the experiment, or scripts used to analyze results. (ACM DL. Read it here.)
A reproducibility badge is prominently displayed in the ACM Digital Library and also embedded in the .pdf of the original paper. The ACM Multimedia Reproducibility Committee is responsible for carrying out the review process that determines whether or not a paper will receive a badge. That review process is a collaboration between the original authors and reviewers from the ACM Multimedia Reproducibility Committee.
In addition to the badge on the original paper, the process also results in a short reproducibility paper that documents the artifact and the process of review, including replication and/or reproduction. The reproducibility paper is a companion paper published in the proceedings of the ACM Multimedia conference. It also appears in the ACM Digital Library, from where it is linked to the original ACM Multimedia paper.
Note that the collaborative review process requires substantial effort, and for this reason the reproducibility paper is not published in the same ACM Multimedia proceedings as the original paper, but rather appears in the next year’s proceedings. This provides authors with the opportunity to present their work twice at an ACM Multimedia conference: the original paper is presented first, and the next year the reproducibility paper is presented as a poster. In this way, the work of authors who devote attention to reproducibility receives additional visibility at the conference.
An ACM reproducibility badge is a label that is permanently associated with your (original and companion) papers, both in their .pdf and also in the metadata at the ACM Digital Library. A badge serves as an official mark of the level of reproducibility that your work has achieved. Reproducibility badges for ACM Multimedia papers are awarded via the ACM Multimedia reproducibility review process.
Having a reproducibility badge awarded to your paper promotes the dissemination and uptake of your work. When they see the badge, other researchers will understand it is possible to replicate or reproduce your results and whether there are resources available that will allow them to adopt or extend your approach. The result is that your work receives wider recognition and is positioned to make higher impact.
ACM defines several badges, spanning several levels of reproducibility engagement, from light to strong. We are committed to setting the standard high for reproducibility at ACM Multimedia, and the Reproducibility Committee asks authors to target the two highest-quality badges for the companion papers they submit. The targets are:
- Results Replicated:
The main results of the paper were replicated by the reproducibility reviewers and were found to support the central results reported in the paper, using, in part, artifacts provided by the author. This means that the companion paper contains, at a minimum, two parts (further details on packaging are below):
- One part is a link to an archive where the artifacts (e.g., code, scripts, data sets, protocols) are cleanly packaged, ready for download and use in experimentation. The package must include a readme with clear descriptions on how to use the artifact (e.g., download, compile, run).
- The other part is accompanying text/schemas/illustrations that describe what the package contains and how the packaged contents are to be deployed and then used. This part should also contain notes about parameters that can be set or adjusted and about how to recreate the plots, and it must include commented examples.
- Results Reproduced:
The main results of the paper were reproduced by the reproducibility reviewers and were found to support the central results reported in the paper, without the use of author-supplied artifacts. This means that the companion paper does not need to include or point to any code or dataset, but it must describe the original contribution so precisely, including the definitions and the nature and role of parameters, that the reproducibility reviewers can fully reimplement the work from scratch.
These two badges are independent of each other, meaning that a single paper can receive the Results Replicated badge, or the Results Reproduced badge, or both. In the latter case, the authors include high-quality artifacts (targeting the light blue badge), and the description in the companion paper is good enough that the reviewers could reimplement everything themselves from scratch without looking at those artifacts (targeting the dark blue badge).
All badged papers will be advertised on MM-INTEREST as well as in the SIGMM Records. In addition, this ACM MM Reproducibility website maintains and advertises your badged papers, serving as a centralized location where researchers can find all the experimentation material of sharable ACM Multimedia papers. We will continue to enhance the functionality and material on this website to make it attractive and useful for the community, so stop by often.
Reproducibility papers are included in the proceedings of ACM MM and are presented during a reproducibility poster session at ACM Multimedia the year after the original paper was published.
Submitting your work for reproducibility review
A paper must have been accepted at ACM Multimedia (as a main conference oral or poster paper) in order for the authors of that paper to participate in the reproducibility review process.
All authors of accepted ACM MM papers are invited to prepare and submit a reproducibility paper. The reproducibility paper is a companion paper that is distinct from the main paper, and appears after the main paper has already been published. A reproducibility paper is typically 2-7 pages long, in double-column ACM style format. Reproducibility papers are submitted to the Reproducibility paper submission web site (not opened yet). Authors have to clearly declare which badge they target: replicability, reproducibility, or both. The content of the companion paper depends on this target.
Reproducibility papers can be submitted from the day of notification of acceptance of a main-conference paper (around July) through February 1 of the next calendar year. Submitting reproducibility papers is a process that runs in parallel with submissions to the main ACM Multimedia conference.
However, because reproducibility papers appear in the ACM Multimedia conference proceedings, the preparation of camera-ready papers follows the same procedure as for main conference papers.
Accepted reproducibility papers are published in the ACM Multimedia conference proceedings the year following the year in which the original main-conference paper was accepted. The authors of an accepted reproducibility paper present their work as part of a specific poster session at ACM Multimedia.
Adjusted instructions and timeline for ACM MM 2019
ACM Multimedia Reproducibility begins with ACM MM 2019, and this first edition is a pilot implementation. For this first edition, we exceptionally extend the eligibility period: we invite all authors of main conference papers accepted to ACM MM 2017 and ACM MM 2018 to prepare and submit companion papers. Authors will have until April 1st, 2019, to submit their reproducibility paper.
We ask authors to target the Results Replicated badge (the light blue badge). The main reason for considering only this badge is the constrained timeline, which leaves reviewers less time to evaluate the works. Accepted reproducibility papers will be notified on July 1st, and their final versions have to be ready by August, just as for any other regular paper.
Subsequent editions of ACM Multimedia Reproducibility will consider both the Results Replicated and the Results Reproduced badges. The review process will start earlier (February 1), leaving more time for reviewing papers targeting dark blue badges, which are huge commitments. Also, the eligibility period will be back to normal; that is, the original contribution and its reproducibility companion paper will be one year apart.
Reproducibility review process
Submitted reproducibility papers are reviewed by a collaborative reproducibility review process, which is distinct from the main conference paper reviewing process. The ACM MM Reproducibility Committee assigns two reviewers to each reproducibility paper submission.
During the process of assessing the submission and possibly the artifacts, the reviewers interact with the submitting authors in order to work through any technical issues or gaps in the documentation that they find. The review process is fully open: the reviewers know the authors; the authors know the reviewers. If it is possible to address all the issues, then the reviewers document the review process in a description which is added as a section to the reproducibility paper that is under review.
Because the reviewers interact with the authors, devote a large amount of time and effort to testing the artifacts or to reimplementing the work, and write a section of the final reproducibility paper, they become co-authors of the final reproducibility paper. If the reviewers find that the technical issues or problems with the documentation cannot be resolved, then the reproducibility paper is rejected.
This companion paper in the ACM DL and its exposure during the conference are fabulous incentives for both authors and reviewers.
General guidelines for the Results Replicated badge
A paper submitted for reproducibility review targeting the light blue badge should contain:
- Means for the committee to download a prototype system, provided either as a white box (source, configuration files, build environment) or as a fully specified black-box system,
- Input data: either the process used to generate the input data should be made available, or, when the data is not generated, the actual data itself or a link to the data should be provided,
- The set of experiments (system configuration and initialization, scripts, workload, measurement protocol, …) used to produce the raw experimental data,
- The scripts needed to transform the raw experimental data into the graphs included in the paper.
All of this material should be extensively described, well documented, and easy to understand; it forms the core of the paper.
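The last item in the list above (scripts that turn raw experimental data into graphs) can be as small as a single self-contained script. As an illustration only, here is a minimal sketch in Python; the file layout, column names, and the median-runtime metric are hypothetical examples, not something the committee prescribes:

```python
# Hypothetical sketch of a "raw data -> plot numbers" script.
# The CSV layout and column names are illustrative, not prescribed.
import statistics
from collections import defaultdict

def summarize(rows):
    """Group raw (algorithm, runtime_ms) rows and return the median
    runtime per algorithm -- the numbers a bar chart would display."""
    runs = defaultdict(list)
    for algorithm, runtime_ms in rows:
        runs[algorithm].append(float(runtime_ms))
    return {alg: statistics.median(times) for alg, times in runs.items()}

# In a real package, rows would be read from e.g. results/raw_timings.csv,
# produced by the experiment scripts that the readme documents.
raw = [("ours", "10.2"), ("ours", "11.8"), ("baseline", "30.5")]
print(summarize(raw))  # medians per algorithm
```

Shipping such a script next to the raw data lets reviewers regenerate the numbers behind each figure without rerunning the full experiments.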
The central results and claims of the corresponding published paper should be supported by the submitted experiments, meaning we can recreate result data and graphs that demonstrate behavior similar to that shown in the paper. Typically, when the results are about response times, we do not expect to get identical results. Instead, we expect to see that the overall behavior matches the conclusions from the paper, e.g., that a given algorithm is significantly faster than another one, or that a given parameter affects the behavior of a system negatively or positively.
One important characteristic of strong research results is how flexible and robust they are in terms of the parameters and the tested environment. For example, testing your algorithm for several input data distributions, workload characteristics and even hardware provides a complete picture of its properties.
We expect authors to provide a short description of different experiments that one could run to test their work, on top of what already exists in the paper. Ideally, the scripts provided should enable such functionality so that reviewers can test these cases. This allows reviewers to assess how “reproducible” the results of the paper are under different conditions.
We do not expect the authors to perform any additional experiments on top of the ones in their original paper. Any additional experiments submitted will be considered and tested but they are not required.
Each paper is reviewed by at least two members of the Reproducibility Committee. The evaluation process is not blind, and the authors and the reviewers cooperate in a proactive, open, constructive and rigorous way in order to produce good science.
The normal end result of this process is awarding the Results Replicated badge to the companion paper, permanently attached to the original paper inside the ACM Digital Library, along with the data and code. The reviewers who participated in this audit add to the companion paper a section describing their effort, and become co-authors.
This companion paper is an incentive for both authors and reviewers.
The goal of the committee is to properly assess and promote multimedia research. While we expect that authors try their best to prepare a submission that works out of the box, we know that sometimes unexpected problems appear and that in certain cases experiments are very hard to fully automate. The committee will not dismiss submissions if something does not work out of the box; instead, they will contact the authors to get their input on how to properly evaluate their work.
Additional guidelines for packaging your work can be found here.
Please check the other tabs at the top of the page where a lot of additional info is provided.
General guidelines for the Results Reproduced badge
A paper submitted for reproducibility review targeting the dark blue badge must provide a great deal of detail that could not be included in the original submission (possibly due to space constraints) but that is necessary for fully reimplementing the original contribution from scratch, and ultimately observing similar findings. Consequently, a companion paper targeting the dark blue badge might contain:
- A very precise description of the key and decisive features of the original contribution,
- Crucial elements that the brand new implementation should consider, for the sake of e.g., numerical accuracy, computational efficiency, convergence of processes, seeds, or any other such property,
- Crucial elements in relation to resource consumption, such as memory footprint, storage requirements, suggestions about data structures to use and about termination conditions, and elements about CPU, GPU, networking, exceptions, error handling, and NaNs,
- Precise justifications and details about the values for the thresholds, the various (lower, upper, …) bounds,
- Clear specification of the parameters that have to be defined: their domain of variation, their type, when they are to be used, how they should be tuned and why, as well as the resulting effects on the behavior of the contribution, default values, and cases that make no sense or that cause no significant difference with already observed behaviors,
- Justifications and detailed instructions for connecting to external software/libraries/packages (Matlab, NumPy, …),
- Clear specifications about the input data that your contribution needs,
- Elements about what results the contribution has to produce, how to best format them in order to facilitate interpretation of the results, and suggestions and recommendations for producing insightful plots,
- Examples of what one is expected to obtain given a clearly specified initial setup.
Your contribution may be such that it is extremely difficult to target a dark blue badge. It may be because it would take you far too much time and energy to describe this or that detail so finely; it may be because your algorithm uses a specific dataset of yours that cannot be recreated by any other means; it may be because it includes a learning step that took months to run in your lab and cannot easily be done again, etc. In such cases, it might be more appropriate to target a replicability badge, the light blue one, where you provide some of your own material: key code snippets, data, etc. Being awarded a light blue badge is already an immense achievement, so target the badge that best suits you in terms of reproducibility investments.
Best practices in reproducibility
Best practices in reproducibility are a set of actions and principles that you can take in order to ensure that your work is reproducible. Best practices should not be considered a single, static recipe; rather, they are a flexible body of processes and strategies that evolve over time.
To get started understanding best practices, we suggest that you take a look at the ACM Digital Library’s webpage entitled “Software and Data Artifacts in the ACM Digital Library”, which provides information on motivations for reproducibility and a look into how reproducibility is supported and continues to evolve in the ACM Digital library. From that page you can get more information on badges.
Researchers in the area of databases started explicitly emphasizing reproducibility early on. Much of what we propose to implement for ACM Multimedia is inspired from their dos and don’ts. We suggest that you take a look at material on reproducibility that has been published in the database community and consider how their best practices transfer to your work.
A good source of information can be found in the ICDE 2008 tutorial by Ioana Manolescu and Stefan Manegold. The tutorial includes a road-map of tips and tricks on how to organize and present code that performs experiments, so that an outsider can repeat them.
A discussion about reproducibility in research including guidelines and a review of existing tools can be found in the SIGMOD 2012 tutorial by Juliana Freire, Philippe Bonnet, and Dennis Shasha.
Closer to multimedia, ACM MMSys has led the multimedia research community in explicitly emphasizing reproducibility, and Gwendal Simon currently chairs the reproducibility track at MMSys. Emailing Gwendal or reading what he has written about reproducibility on his blog can be helpful.
Note, however, that reproducibility at ACM MMSys differs slightly from the one we propose for ACM Multimedia.
Examples of badged papers
Here are a few links to papers with badges in the ACM DL. They have been picked because they nicely illustrate what is described on this page, not because of their scientific value. They come from multiple domains, giving you a broad view of what colleagues have done. Note, however, that they do not comply with the ACM MM guidelines specified on this page: for example, sometimes there is no companion paper, or the companion paper is not a 2-to-7-page ACM-style paper. Furthermore, some papers target other types of badges that are about the availability of the artifacts, not about replicability or reproducibility of results.
- Generating Preview Tables for Entity Graphs, N. Yan, S. Hasani, A. Asudeh and C. Li. Winner of the 2017 Most Reproducible Paper award for SIGMOD 2016. Paper’s DOI
- Cicada: Dependably Fast Multi-Core In-Memory Transactions, H. Lim, M. Kaminsky, D. G. Andersen. ACM SIGMOD 2017 Reproducible Paper. Paper’s DOI. The artifacts are on GitHub.
- Reproducible experiments on dynamic resource allocation in cloud data centers, A. Wolke, M. Bichler, F. Chirigati, V. Steeves. Information Systems, Volume 59, July 2016, Pages 98-101. Paper’s DOI. This is the companion paper of the original scientific contribution.
- 360-Degree Video Head Movement Dataset, X. Corbillon, F. De Simone, G. Simon. MMSys 2017. Paper’s DOI. The linked artifacts are here. You will see that the artifacts are inside the ACM DL.
The ACM Multimedia Reproducibility Committee
- Laurent Amsaleg, CNRS-IRISA, France [email]
- Björn Þór Jónsson, IT University of Copenhagen, Denmark [email]
- Martin Aumüller, IT U. of Copenhagen, Denmark
- Ahmet Iscen, CTU Prague, Czech Republic
- Michael Riegler, Simula, Norway
- Stevan Rudinac, UVA, The Netherlands
- Lucile Sassatelli, U. Nice, France
- Gwendal Simon, IMT Atlantique, France
- Francesca De Simone, CWI, The Netherlands
- Bart Thomee, Google, USA
- Jan Zahálka, bohem.ai, Czech Republic
The committee is chaired by two chairpersons who each serve for two years. At the end of each year, one chair is replaced and the other continues. This system ensures a one-year overlap, which helps the committee maintain freshness while also facilitating the transmission of the gradually improved expertise in managing reproducibility.
The two chairpersons nominate the members of the committee, who are enrolled for at least one year. Chairs can also take part in the evaluation. There is a strict conflict-of-interest policy: the chairs cannot submit a reproducibility paper while they hold their positions.
About the 2019 edition of ACM MM
ACM Multimedia Reproducibility must be bootstrapped. In 2019, it is a pilot and we will test a compressed version of the full procedure in order to determine if any aspect needs to be adapted.
All authors of papers accepted to ACM MM 2017 and 2018 will receive an e-mail inviting them to be part of the reproducibility process targeting the ACM MM 2019 edition. They will have until April 1st, 2019 to submit their reproducibility paper. Some of these papers might be so good that reviewing them can be completed in time for their reproducibility companion papers to appear in the proceedings of ACM MM 2019. Otherwise, the review process will continue, with successful papers being awarded ACM badges at a later time. (We expect that some papers will require more time, which is why the deadline for future years is planned for February. In this first year, these papers should not be disadvantaged.)
For the 2019 edition of ACM MM, Laurent Amsaleg and Björn Þór Jónsson chair the committee. In this first year, an ACM MM GC (Laurent) will also serve as a Reproducibility Committee Chair in order to launch the initiative, but in future years the Reproducibility Committee Chairs will not include the GC. For the 2020 edition of ACM MM, Björn Þór will remain and Laurent will be replaced.
The approach we propose for reproducibility at ACM Multimedia has been inspired from extensive discussions with colleagues from the database field, namely Stratos Idreos from Harvard University, USA (e-mail), and Fernando Seabra Chirigati from NYU Tandon School of Engineering, USA (e-mail). We also want to thank Dennis Shasha, New York University, USA, for his advice.
Discussions with Gwendal Simon were also very helpful.
The look of this page and some of the text here originates from the Web page describing reproducibility at ACM SIGMOD, and was used with the authorization of the original authors. Naturally, however, that text was adapted when necessary.