Making Dark Data FAIR

The Making Dark Data FAIR project is generously funded by the EOSC.

 

Modern high-performance computing (HPC) facilities generate a colossal amount of data. Recent studies have shown that a significant share of the data they produce (up to 3.41% of total storage capacity) might never be used again, for reasons as mundane as improper labelling. Because large amounts of public money often go into producing that data, this is a serious problem.


The reasons for the proliferation of this so-called dark data (i.e., data that cannot be reused) are varied. Missing or incomplete metadata, non-standardised storage methods, and researchers simply forgetting that the data exists are just some of the reasons so much data becomes non-reusable.


Recently, the FAIR principles (Findable, Accessible, Interoperable, Reusable) for data management were introduced in an attempt to reduce the amount of non-reusable data. The principles outline a number of pragmatic measures institutions might enact to that end.


The project “Making Dark Data FAIR” aims to analyse why data goes dark in the first place, to develop concrete strategies for enacting the FAIR principles so as to reduce dark data, and finally to interrogate the FAIR principles themselves, to ensure they fulfil their intended role (i.e., reducing dark data). While our primary focus is the generation of dark data at HPC facilities, the results of this research are widely applicable: the need for the FAIR principles is recognised by a wide variety of parties, not only those interested in high-performance computing.


As part of the project, we are running a series of workshops in late 2020 and early 2021, which will serve as a platform to both discuss and disseminate the results obtained from this project. Workshops will be held both online and in-person, including a workshop at TU Delft. These workshops will bring together a wide variety of stakeholders, including researchers, policymakers, data stewards, and HPC facility personnel. 


The project is financed by the European Open Science Cloud Secretariat. The lead coordinator of the project is Juan M. Durán (TU Delft – j.m.duran@tudelft.nl). Jack Casey (TU Delft) is the postdoc and contact person (j.j.casey@tudelft.nl).


The project is a collaboration between TU Delft (The Netherlands), the University of Exeter (UK), the University of Stuttgart (Germany), the CNR-IOM Center (Italy), and the ERC Consortium SoBigData++.

 
 

Upcoming Workshops

  • Data, Society, and Open Science III: challenges for data management and data-based research
    30 Mar. 11:30 – 16:30 CEST
    Zoom
    Our final workshop in the Data, Society, and Open Science series - 30th March 2021 - 11:30-16:30
  • Data, Society, and Open Science II: Roundtable on the FAIR principles and data-driven scientific practice
    01 Mar. 15:30 – 17:00
    Zoom
    In our second workshop for the Data, Society, and Open Science series we will see a roundtable discussion between invited speakers Sabina Leonelli, Marta Teperek, Niccolò Tempini, Manuela Fernandez Pinto, and Michael Resch.
  • Data, Society, and Open Science
    10 Nov. 2020 10:00 – 19:05 CET
    Zoom
    We are happy to announce our first one-day workshop, to be held completely online. This workshop will be focussed primarily on a broad range of philosophical issues raised by modern scientific practice.
 

Program Announced: Data, Society, and Open Science III Workshop - 30th March 2021

Program: Data, Society, and Open Science III – 30th March 2021


11:30-11:45 - Introduction

11:45-12:15 - Saeedeh Babaii - Institute for Humanities and Cultural Studies, Tehran, Iran - GAN; A Promising Approach to Mitigate the Problem of Bias in AI


Break 2 – 12:15-12:30


12:30-13:00 - Martin Thomas Horsch, Taras Petrenko and Björn Schembera – HLRS Stuttgart - Automated metadata extraction and epistemic FAIRness in the engineering sciences


Lunch – 13:00-14:30


14:30-15:00 - Juniper Lovato and Randall Harp – The Vermont Complex Systems Center - Ethical Considerations of Dark Data: Making FAIR Data Fair


Break 3 – 15:00-15:15


15:15-15:45 - Fionn McGrath – Trinity College Dublin - An Adornoian critique of machine learning


Break 4 – 15:45-16:00


16:00-16:30 - Giovanni Galli – University of Urbino - Understanding the data-centric sciences, from dark data to Covid-19 tracking

Second Workshop Announced: Data, Society, and Open Science II: Roundtable on the FAIR principles and data-driven scientific practice

We are delighted to announce the second workshop in the Data, Society, and Open Science series, which will take the form of a roundtable discussion between invited participants Professor Sabina Leonelli (University of Exeter), Dr Marta Teperek (TU Delft), Dr Niccolò Tempini (University of Exeter), Dr Manuela Fernandez Pinto (University of Los Andes), and Professor Michael Resch (University of Stuttgart). The speakers will discuss a number of issues concerning the FAIR data principles and data-driven scientific practice. The discussion will take place on 1st March 2021 (15:30-17:00 CET) and will be held fully online. Please click the link below for more information on this and other upcoming events.

First Workshop Announced: Data, Society, and Open Science

We are pleased to announce the first in a series of three workshops held as part of the Making Dark Data FAIR project. The workshop will take place on 10th November 2020 and will be completely online. Click the link below for more details.

First Speaker Announced: Mark Alfano

We are delighted to announce that our first speaker for the Data, Society, and Open Science workshop (10th Nov 2020) will be Mark Alfano, with a talk entitled 'A case study on open data for the public good: The international collaboration on social & moral psychology: COVID-19'. 


Click the link for more details.

SoBigData Blog Post

Click the link below to read the blog post on the Making Dark Data FAIR project, recently published on SoBigData.eu.

Second Speaker Announced: Kees Vuik

We are excited to confirm our second speaker for the Data, Society, and Open Science workshop (10th Nov 2020): TU Delft's Kees Vuik. Professor Vuik will give a talk entitled 'Reproducible Computational Science'.

Click the link below for more details. 

Third Speaker Announced: Björn Schembera

We are delighted to announce that our third speaker for the Data, Society, and Open Science workshop (10th Nov 2020) will be Björn Schembera, with a talk entitled 'Update on Dark Data numbers and activity'.

Click the link below for more details.

Fourth Speaker Announced: Mark Wilkinson

We are excited to confirm that our fourth and final speaker for the Data, Society, and Open Science workshop (10th Nov 2020) will be Mark Wilkinson, with a talk entitled 'FAIR data principles'.

Click the link below for more details. 


Get in Touch

Department of Values, Technology and Innovation
Faculty of Technology, Policy and Management
Jaffalaan 5
2628 BX Delft
B4.250
The Netherlands

  • Twitter