News and Resources


Data, Society, and Open Science II: Roundtable on the FAIR principles and data-driven scientific practice

We are delighted to announce the second workshop in our Data, Society, and Open Science series, which will take the form of a roundtable discussion between invited participants Professor Sabina Leonelli (University of Exeter), Dr Marta Teperek (TU Delft), Dr Niccolò Tempini (University of Exeter), Dr Manuela Fernandez Pinto (University of Los Andes), and Professor Michael Resch (University of Stuttgart). The speakers will discuss a number of issues concerning the FAIR data principles and data-driven scientific practice. The discussion will take place (provisionally) on 1st March 2021 (15:30-17:00 CET) and will be held fully online. Please click the link below for more information on this and other upcoming events.


Livestream of Data, Society, and Open Science Workshop

The full recorded livestream of the Data, Society, and Open Science workshop, held online on 10th November 2020, is now available. Thanks again to our speakers Mark Alfano, Kees Vuik, Mark Wilkinson, and Björn Schembera.


First workshop announced: 'Data, Society, and Open Science'

We are pleased to announce the first in a series of three workshops that we will be holding as part of the Making Dark Data FAIR project. The workshop will be held on 10th November 2020 and will be completely online. Click the link below for more details.


First Speaker announced: Mark Alfano

We are delighted to announce that our first speaker for the Data, Society, and Open Science workshop (10th Nov 2020) will be Mark Alfano, with a talk entitled 'A case study on open data for the public good: The international collaboration on social & moral psychology: COVID-19'. 

Abstract: In the first half of 2020, dozens of researchers from around the globe sought to help the response to the COVID-19 pandemic by collecting data from nearly 50,000 adults in 67 countries. Our aim was to establish the psychological predictors of four key outcome variables: physical hygiene (e.g., increased hand-washing), physical distancing (e.g., staying home as much as possible), policy support (e.g., support for closing bars, restaurants, and schools), and rejection of conspiracy theories (e.g., COVID-19 is caused by 5G connectivity). Various predictors were also collected, including individual and collective narcissism, nation identity, political ideology, risk aversion, openmindedness, moral identity, and several others. To promote open science and rapid analyses of this large dataset using multiple methods, hypotheses, and theories, we made a random 10% sample of the full dataset freely available via the Open Science Framework. Anyone who pre-registers an analysis based on this 10% sample then receives the remaining 90%. To date, 34 plans have been pre-registered, around a dozen papers drafted, and at least one paper submitted for peer review. We believe that this approach showcases the value of many-labs approaches to urgent public health issues.
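For readers curious how a release of this kind might be prepared, the following is a minimal sketch of a seeded random 10%/90% split over record identifiers. It is not the collaboration's actual procedure: the function name `split_public_sample`, the seed value, and the use of integer record IDs are all assumptions made for illustration.

```python
import random


def split_public_sample(record_ids, fraction=0.10, seed=2020):
    """Split record IDs into a public sample and a held-back remainder.

    Hypothetical helper: the name, the seed, and the ID scheme are
    illustrative assumptions, not details taken from the project.
    """
    ids = list(record_ids)
    rng = random.Random(seed)  # fixed seed so the same split can be regenerated
    sample_size = round(len(ids) * fraction)
    public = set(rng.sample(ids, sample_size))  # sampling without replacement
    held_back = [i for i in ids if i not in public]
    return sorted(public), held_back


# Illustrative run with 50,000 placeholder IDs (roughly the survey's size).
public, held_back = split_public_sample(range(50_000))
print(len(public), len(held_back))
```

Fixing the seed means the same public sample can be regenerated later, so that the remaining 90% handed over after pre-registration never overlaps with what was already released.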


Blog Post: SoBigData

A blog post concerning the Making Dark Data FAIR project has recently been published on SoBigData (link below).


Second Speaker Announced: Kees Vuik

We are excited to confirm our second speaker for the Data, Society and Open Science workshop: TU Delft's Kees Vuik. Professor Vuik will be giving a talk entitled 'Reproducible Computational Science'.

Abstract: One of the most important characteristics of scientific research is that experimental, theoretical, and computational results should be reproducible. In this presentation we will restrict ourselves to reproducibility in Computational Science. This means that if numerical algorithms are presented, it should be possible to check whether the theoretical results (speed of convergence, conservative properties, bounds for rounding errors) are valid. In most cases, pen and paper would be enough to validate the propositions, lemmas and theorems given in the paper. However, many papers in this domain illustrate the performance of the computational methods, models, and algorithms by graphs and tables. In these graphs and tables accuracy, properties of the computed results, robustness, efficiency, etc. are presented. How to validate these results? This question is not easy to answer.

Various aspects that play a role are:

- is the problem well described?
- are the used algorithms given in enough detail?
- is the implementation publicly available?
- is the software open source, stand alone, commercial?
- are all results included, or only the 'nice' results?
- is the hardware specified and available?

A couple of these questions and problems will be discussed in this talk.


Third Speaker Announced: Björn Schembera

We are pleased to confirm our third speaker for the Data, Society, and Open Science workshop: Björn Schembera, with a talk entitled 'Update on Dark Data numbers and activity'.

Abstract: The concept of dark data has proven valuable for describing the problem of uncurated and inaccessible data. The talk presents updated numbers on the phenomenon of dark data, both global total estimates and specific numbers from one of the leading HPC centers in Europe. It is shown how specific data management measures took effect to reduce dark data in this center. Moreover, the talk shows why dark data is not FAIR and how other sources complete the picture of dark data.


Fourth Speaker announced: Mark Wilkinson

We are delighted to announce that our fourth and final speaker for the Data, Society and Open Science workshop will be Mark Wilkinson, with a talk entitled 'FAIR data principles'.

Abstract: In 2014 a workshop entitled "Jointly Designing a Data FAIRPORT" brought together a wide range of stakeholders with an interest in the domain of scholarly communication and publishing. The goal was to find a minimal set of guidelines and best practices that would make scholarly data (and other digital objects) reusable. Shortly after this meeting, the acronym FAIR (Findable, Accessible, Interoperable, Reusable) was coined, and in 2016 the FAIR Data Principles were published. This publication has already achieved >3300 citations, and "FAIRness" has become a key objective for a number of international initiatives, including the European Open Science Cloud, the US NIH, and the Horizon 2020 series of funding initiatives. In this presentation, I will walk through the history of FAIR, and explore several of the Principles in some detail. We will then discuss ongoing initiatives around FAIRness evaluations and how we hope to create tooling that facilitates data publishers in achieving maximum FAIRness with minimal effort through objective, automated, self-driven FAIR testing that generates helpful advice for improvement.

Bio: Dr. Mark D Wilkinson is the Fundacion BBVA Chair in Biotechnology, Isaac Peral Distinguished Researcher, Center for Plant Biotechnology and Genomics, Technical University of Madrid. For the past 18 years, his laboratory has focused on designing biomedical data/tool representation, discovery, and automated reuse infrastructures - what would now be called "FAIR". Dr. Wilkinson is the lead author of the primary FAIR Data Principles paper, and lead author on the first paper describing a complete implementation of those principles over legacy data. He is a founding member of the FAIR Metrics working group, tasked with defining the precise, measurable behaviors that FAIR resources should exhibit. Dr. Wilkinson's laboratory's flagship technologies are Semantic Automated Discovery and Integration (SADI) and Semantic Health and Research Environment (SHARE). SADI enables FAIR discoverability for analytical algorithms in biomedicine, where the algorithms natively consume FAIR data; as such, SADI design practices are perfectly suited for deployment on the emergent Web of FAIR Data and Services.
