Reproducibility - Notes on AI

# Reproducibility

Reproducibility is a key characteristic of good science, but it is hard to achieve in experimental disciplines such as data science and artificial intelligence. Science is people's work. Reproducibility is key.

## Terminology proposed by ACM

![[acm-reproducibilities.jpg]]

**Repeatability** is achieved when a researcher can obtain the same results for their own experiment under exactly the same conditions, i.e., they can reliably repeat their own experiment ("Same team, same experimental setup").

**Replicability** allows a different researcher to obtain the same results for an experiment under exactly the same conditions and using exactly the same artifacts, i.e., an independent researcher can reliably repeat someone else's experiment ("Different team, same experimental setup").

**Reproducibility** enables researchers other than the authors to obtain the same results for an experiment under different conditions and using their own self-developed artifacts ("Different team, different experimental setup").

## Current scenario

Baker, Is There a Reproducibility Crisis? Nature 2016

![[failures-reproducibility.jpg]]

What about computer science?

- Collberg, Proebsting, Repeatability in Computer Systems Research, CACM 2016
    - Analyzed 601 ACM papers.
    - Out of these, 508: locate and build the source code
        - Able to do so for 32.3% without communicating with the authors.
        - Increases to 48.3% with (one-shot) communication.

## What to document?

![[what-to-document.jpg]]

- Experiment
- Data
- Method

## What are the causes?

Again, from the paper: Baker, Is There a Reproducibility Crisis? Nature 2016

![[factors-irreproducible.jpg]]

- Aspects of the implementation not described or ambiguous
- Aspects of the experiment not described or ambiguous
- Not all hyper-parameters are specified
- Mismatch between the data in the paper and the data available online
- Method code shared, experiment code not
- Method not described in enough detail

## How to improve

**Pessimist**

- We continue to do what we have always done
- Over time, AI research loses credibility

**Optimist**

- Tools and platforms for reproducible experiments
- Sharing of data and code is on the increase
- Increasing appreciation for reproducibility papers

General best practices (before you write the paper)

- Problem formulation and design: hypothesize, plan and solicit feedback, iterate, factor in changes
- Documentation: record everything, automate everything, version control everything (writing, code, data, ...), backup, backup, backup
- Experimentation and data collection: validate and scale, don't reinvent the wheel, automatically monitor experiments
- Handling data: data privacy, data integrity, licensing, credit

---

## References

Based in part on:

1. ACM, 2018. Artifact Review and Badging. https://www.acm.org/publications/policies/artifact-review-badging
2. Bajpai et al., 2019. The Dagstuhl Beginners Guide to Reproducibility for Experimental Networking Research. Dagstuhl Report.
3. Chirigati et al., 2016. ReproZip: Computational Reproducibility With Ease. SIGMOD 2016. https://doi.org/10.1145/2882903.2899401
4. Nicola Ferro, 2017. Reproducibility Challenges in Information Retrieval Evaluation. J. Data and Information Quality. https://dl.acm.org/doi/10.1145/3020206
5. Odd Erik Gundersen, 2019. What I Talk About When I Talk About Reproducibility. IJCAI 2019.
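
## Appendix: a sketch of "record everything"

The notes above name unspecified hyper-parameters as a cause of irreproducibility and recommend recording and automating everything. The following is a minimal sketch of that idea, not a prescribed tool: the function names `make_run_record` and `run_experiment`, the parameter names `n_samples` and `scale`, and the record layout are all hypothetical and chosen only for illustration. It uses only the Python standard library.

```python
import hashlib
import json
import platform
import random
import sys


def make_run_record(hyperparams, seed):
    """Collect settings a reader would need to rerun an experiment (hypothetical layout)."""
    # Serialise with sorted keys so the same settings always produce the same hash.
    canonical = json.dumps(hyperparams, sort_keys=True)
    return {
        "hyperparams": hyperparams,
        "seed": seed,
        "config_sha256": hashlib.sha256(canonical.encode("utf-8")).hexdigest(),
        "python_version": sys.version.split()[0],
        "platform": platform.platform(),
    }


def run_experiment(hyperparams, seed):
    """Stand-in for a real experiment: draws seeded pseudo-random 'scores'."""
    rng = random.Random(seed)  # local generator, so no hidden global state
    return [
        round(rng.random() * hyperparams["scale"], 6)
        for _ in range(hyperparams["n_samples"])
    ]


if __name__ == "__main__":
    params = {"n_samples": 3, "scale": 1.0}  # illustrative values only
    record = make_run_record(params, seed=42)
    record["result"] = run_experiment(params, seed=42)
    # Store this JSON next to the results and put it under version control.
    print(json.dumps(record, indent=2))
```

Fixing the seed only addresses repeatability on the same setup in the ACM sense; the recorded Python version and platform are there because results may still differ across environments.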