Better Science through better data

On Friday 14th November I attended a half-day conference organised by Scientific Data at Nature’s HQ, dedicated to the publishing of research data and its contribution to better science.

More than 95% of the participants were either PhD or postdoctoral researchers, while the rest comprised the organisers and two librarians (one more apart from myself).

The well-balanced programme offered the perspective of publishers via Phil Campbell (editor at Nature) and Iain Hrynaszkiewicz (Scientific Data journal), of editors such as Monica Contestabile (Nature Climate Change) and Andrew Hufton (Scientific Data journal), and of research data managers like Veerle Van den Eynden (UK Data Archive) and Sally Rumsey (Digital Research Librarian, The Bodleian Libraries). The event wouldn’t have been complete without the voice of a curator, in this case Susanna Sansone; of a funder, David Carr from the Wellcome Trust; and of a scientific entrepreneur and founder of Figshare, Mark Hahnel.

Artwork from the venue, photo taken by Eleni Zazani

Although I tweeted most of my takeaway messages, I would like to revisit these notes and reflect on the positive points made during the event.

As with all evolving trends within academia, managing scientific data calls for an institutional cultural change…

… and although funders’ requirements vary in their approach, all have identified the need to cultivate a culture of data sharing, to raise awareness of openness and to integrate it as easily as possible into researchers’ workflows.

Top tips for researchers

What should a good Research Data Management (RDM) plan look like to attract funding? David Carr from the Wellcome Trust suggests that an RDM plan should respond clearly to seven important questions:

What funders look for in an RDM plan


How to choose a Data Repository:

Andrew Hufton suggests…

A librarian’s advice on RDM:

Advice to researchers from Sally Rumsey (University of Oxford)

Sally suggested that researchers should seek advice from their librarians. While RDM policies and infrastructure are still in their infancy in many universities, we, as librarians, can take Sally’s recommendations as a starting point for raising our own awareness of RDM. I was very pleased to hear Sally stress the importance of properly citing datasets, directing researchers to DataCite’s guidance.
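To make the idea of a properly cited dataset concrete, here is a minimal sketch in Python. It assumes the commonly seen ordering of citation elements (creator, publication year, title, publisher, identifier). The exact punctuation is an assumption rather than a quotation of DataCite's guidance, and the class name, field names and all example values (including the DOI) are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class DatasetCitation:
    """Hypothetical container for the core elements of a dataset citation."""
    creator: str
    publication_year: int
    title: str
    publisher: str
    identifier: str

    def format(self) -> str:
        # Assumed ordering: Creator (Year): Title. Publisher. Identifier
        # Check DataCite's own guidance for the authoritative form.
        return (
            f"{self.creator} ({self.publication_year}): "
            f"{self.title}. {self.publisher}. {self.identifier}"
        )


# Invented example values, for illustration only.
citation = DatasetCitation(
    creator="Example, A.",
    publication_year=2014,
    title="Example dataset title",
    publisher="Example Repository",
    identifier="https://doi.org/10.1234/example",
)

print(citation.format())
```

The point of keeping the elements as separate fields is that the same record can be rendered in whatever citation style a journal or repository asks for.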

Editor’s advice to new researchers:

Transparency and Validity

New publishing avenues & new type of content:

While much of the discussion so far has been dedicated to managing data via repositories, a new publishing avenue is dawning with the appearance of journals dedicated to publishing data. During the event, the case of Scientific Data was explained: a new initiative from the Nature Publishing Group offering an open-access, online-only journal where researchers can publish their data instead of only storing it locally. Researchers focus on a dataset generated through their research by providing a narrative, describing their data in a curated and structured manner, and providing information on the methodologies and other technical analyses and tools used.

This new type of content in scholarly communication is called a Data Descriptor!

Researchers gain the opportunity to be cited for their data and to publish and describe standalone datasets, and this approach will also have an impact on generating quality research outputs.

I can only agree with Andrew Hufton when he mentioned the difficulty of discovering whether a piece of research has used randomised methodologies. Last week, we spent a lot of time with a researcher sifting through a high volume of articles to establish whether individual pieces of diabetes research were randomised. Nowhere in the text or the methodology sections of these articles was the type of research undertaken stated; instead, we could only find scattered uses of the word “randomised” and had to read each article in full to identify whether it had actually followed a randomised method.

How much does it cost to publish in Scientific Data?

New career path:

Susanna explained that lab scientists in particular have a lot to offer in curating datasets. They bring valuable knowledge of their discipline, its standards and its vocabularies, which enables systems to make data discoverable via discipline-specific ontologies.

Semantic links between data and ontologies

New Tools

ISA-Tools: It was really refreshing to listen to a non-librarian talking about the importance of metadata for reusability and discoverability. Susanna talked about tools for tracking metadata. These are called ISA tools, which stands for ‘Investigation’ (the project context), ‘Study’ (a unit of research) and ‘Assay’ (analytical measurement).

ISA-Tab sparked my imagination, and I would like to investigate further the possibilities available for biomedical nanotechnology datasets (ISA-TAB-Nano), which are relevant to researchers in Materials (Nanomaterials) Sciences.
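The Investigation, Study and Assay levels described above nest inside one another, and a minimal Python sketch can show that shape. This is not the ISA tools' own software or the ISA-Tab file format; the class names, fields and example values below are hypothetical and only illustrate the three-level hierarchy.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Assay:
    """An analytical measurement performed within a study."""
    measurement: str


@dataclass
class Study:
    """A unit of research; groups the assays carried out for it."""
    title: str
    assays: List[Assay] = field(default_factory=list)


@dataclass
class Investigation:
    """The overall project context; groups related studies."""
    title: str
    studies: List[Study] = field(default_factory=list)

    def count_assays(self) -> int:
        # Total number of assays across every study in the investigation.
        return sum(len(study.assays) for study in self.studies)


# Invented example values, for illustration only.
investigation = Investigation(
    title="Example nanomaterials project",
    studies=[
        Study(
            title="Example study of particle size",
            assays=[Assay("size measurement"), Assay("surface charge measurement")],
        ),
        Study(
            title="Example follow-up study",
            assays=[Assay("toxicity measurement")],
        ),
    ],
)

print(investigation.count_assays())
```

Recording metadata at each level is what lets a later reader trace a single measurement back to the study and project it came from.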

Figshare: It is one of the many possibilities for storing datasets. It was born out of Mark Hahnel’s need, and frustration, to be able to publish non-conventional data (videos) generated as part of his PhD. Figshare fills a gap for researchers: it lets them generate impact for their research and showcase the breadth of data that traditional journals cannot support.

Personally, I like the clean, visual way pieces of research are shown on Figshare. If you have yet to explore its potential, have a look at the breadth of analytics and visualisation in this poster I picked from its repository.

News of interest:

Data curation in Bodleian Libraries (ORA-Data)

Some recommendations for future events:

  • I found the event invaluable for my personal understanding and for meeting other researchers. From the educator’s point of view, I loved the ice-breaking exercise in which we had to look in our conference bag and exchange a promotional t-shirt with other participants to find one in our size. I would have liked to have been given 5 minutes so that I could talk to people, rather than tuning in to the introductory talk while checking the size of my t-shirt. That was a very clever activity but unfortunately we didn’t manage to break the ice.
  • At the start of the event we were asked how many of us were PhD or postdoctoral researchers, and from which discipline; I must admit that I felt a bit excluded. What if someone is a researcher but not directly linked with a PhD programme, like me writing a book, or others researching as part of their employment with the same rigour as in a doctoral situation, and still wants to publish and share their data openly? Inclusive language can be a very powerful vehicle for cultural change.
  • Having said that, in a future event I would like to hear from a plurality of funders, such as EPSRC (Engineering and Physical Sciences Research Council), STFC (Science and Technology Facilities Council) and other RCUK councils.

Further links and resources:

Follow the recordings and the presentations from the event at the Scientific Data blog.

Read a snapshot of the event from the NatureJobs Blog.
