Summary

Recent developments in institutional and national databases have made more information available about research outputs, especially publications, which are now commonly used in research evaluation and funding allocation. Institutional and national publication databases, whether an institutional current research information system (later on CRIS) or a national in-house solution, are in almost all cases built to monitor researchers' outputs so that the data can be used for evaluation at the institutional level (e.g. tenure track, recruitment) or the national level (e.g. national funding models that take outputs into account) and explored further by the general public (e.g. via portals or statistical dashboards). What sets institutional and national databases apart from larger commercial databases is that they usually include outputs from social sciences and humanities (later on SSH). To some extent they also include research outputs that are highly national (e.g. articles in domestic journals, books in the national language) and not well covered by commercial databases.

Achieving the same at a European level has been on the agenda of quite a few organizations and projects. As part of the European Network for Research Evaluation within the Social Sciences and Humanities (later on ENRESSH), a pilot case study was made on applying the Finnish national VIRTA Publication Information system's solution to a wider set of organizations in Europe. This collaborative VIRTA-ENRESSH-POC of a decentralized approach to aggregating publication metadata was launched in spring 2016, and the case study was carried out in 2017-2018 with 6 organizations from 4 countries (Belgium, Finland, Norway and Spain). The POC discussed whether a so-called European Research Information Service could be built at a European level that would provide a complete overview, i.e. metadata on publications, covering all types of scholarly publications from all fields of science. The data collected in the pilot was of the highest quality and consistency in its bibliographic data, whereas the classifications varied.

  • One of the goals of ENRESSH is to design a roadmap for a European database for SSH outputs. A proof of concept, VIRTA-ENRESSH, was built - especially for SSH, but not excluding other fields either.
  • In the context of ENRESSH and research on publishing in SSH fields, metadata has to be at a sufficient level
  • VIRTA-ENRESSH has to some extent already explored the topic of a minimum metadata set

Perhaps an even bigger undertaking to collect and combine bibliographic metadata on research publications at a European level is ongoing as part of the agenda of OpenAIRE, an organization behind a network of open science specialists in Europe that currently hosts one of the largest databases in Europe on research outputs. With the CERIF data model at its core, OpenAIRE has gained momentum through the OpenAIRE Guidelines for CRIS Managers, whose first version dates back to 2015 and which were most recently updated in 2018, to support metadata harvests from various institutional and national publication databases (e.g. VIRTA) and CRIS systems (e.g. METIS, PURE). As it stands in 2019, several CRIS systems aim to be compliant with the Guidelines and thus harvestable by OpenAIRE.

Although both initiatives seemingly share the common goal of a complete set of metadata on publications produced in Europe, their approaches differ:

  • Based on the CERIF data model, the Guidelines for CRIS Managers set the bare minimum of metadata that is mandatory in metadata harvests
  • For purposes of monitoring and/or research, the metadata requirements in the Guidelines are not sufficient. These use cases need to be taken into account if and when CRIS data is to be aggregated
  • The Guidelines do not currently support the inclusion of institutional/national documentation on metadata, and institutional/national criteria on e.g. what counts as "article" or "scientific" in the databases are missing


  • Common European standardization and data content need to be defined in collaboration with ENRESSH - “lowest common denominator” + additional optional information
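The "lowest common denominator + additional optional information" idea can be illustrated with a minimal sketch. The field names below (title, publication_year, publication_type, organisation_id) are hypothetical placeholders chosen for illustration only; they are not the actual ENRESSH Minimum Data Model elements, nor the mandatory elements of the OpenAIRE Guidelines.

```python
from typing import Any, Dict, List, Tuple

# Hypothetical mandatory core, for illustration only. The real minimum
# set would be the one agreed in collaboration with ENRESSH.
MANDATORY_FIELDS = ("title", "publication_year", "publication_type", "organisation_id")


def split_record(raw: Dict[str, Any]) -> Tuple[Dict[str, Any], Dict[str, Any], List[str]]:
    """Split a raw publication record into (core, optional, missing).

    core     -- mandatory fields that are present and non-empty
    optional -- every other field, passed through unchanged
    missing  -- names of mandatory fields that are absent or empty
    """
    core = {k: raw[k] for k in MANDATORY_FIELDS if raw.get(k) not in (None, "")}
    optional = {k: v for k, v in raw.items() if k not in MANDATORY_FIELDS}
    missing = [k for k in MANDATORY_FIELDS if k not in core]
    return core, optional, missing


# A made-up record from a hypothetical source database.
example = {
    "title": "An example article",
    "publication_year": 2018,
    "language": "fi",  # optional, database-specific information
}
core, optional, missing = split_record(example)
```

A record with a non-empty `missing` list could be rejected or flagged by an aggregator, while the optional part lets each database keep its richer, local metadata.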


  • How and if this kind of minimum metadata standard can be implemented - and on what level → Implementing in ENRESSH-VIRTA infrastructure
    • Could be used on a separate ENRESSH publication database
    • Could be used as a reference for e.g. OpenAIRE harvesting


  1. Propose a minimum set of metadata (based on CERIF data model) that can be used in CRIS metadata exchanges

Deliverables

  1. A summary of minimum CERIF data model elements needed in research publication metadata transfers considering CRIS systems and national aggregators in European context
    1. ENRESSH Minimum Data Model
  2. An outline of implementing the research publication metadata transfer in ENRESSH-VIRTA infrastructure
    1. Implementing in ENRESSH-VIRTA Infrastructure
    2. Interoperability platform as supporting tool


Future tasks

  • As part of the NordRIS proposal and further work on ENRESSH-VIRTA
  • Discuss with those working on validation on OpenAIRE's side whether the ENRESSH Minimum Data Model could be used
  • CEF Telecom call for proposals. The next call on "Access to re-usable public sector information – PUBLIC OPEN DATA", opening in July, might be relevant for a project compiling and opening European publication data. The deadline for proposals is 14 November. Find more (pages 31-34): https://ec.europa.eu/inea/sites/inea/files/cef_telecom_work_programme_2019.pdf
  • Presentation(s) at the euroCRIS Membership Meeting 2019 and/or euroCRIS Conference 2020

Things to keep in mind

  • Focus on publications set only, or include others as well?
    • Of minimum elements, which are relevant to other sets?
    • Inclusion of emerging outputs (data sets, researcher activities) might lead to worse data quality
  • How and if this kind of minimum metadata standard can be implemented
    • And on what level?
    • Could be used on a separate ENRESSH publication database
    • Could be used as a reference for e.g. OpenAIRE harvesting
  • Authority lists (journal lists etc.) to support the quality aspect of publication metadata
    • Could be used to bypass some of the problems e.g. in field of science classification and/or quality aspects
  • Manual of good practices (SSH databases)

Supporting documents:

Proof of Concept of a European database for Social Sciences and Humanities publications

https://dspacecris.eurocris.org/bitstream/11366/682/4/VIRTA-ENRESSH-POC_CRIS2018_Puuska.pdf

Sīle, L. et al. (2017). European Databases and Repositories for Social Sciences and Humanities Research Output. Antwerp: ECOOM & ENRESSH. DOI:10.6084/m9.figshare.5172322

http://enressh.eu/wp-content/uploads/2017/09/2017_ENRESSH_European_Databases.pdf

Towards the integration of European research information

https://dspacecris.eurocris.org/handle/11366/593


https://openaire-guidelines-for-cris-managers.readthedocs.io/en/latest/index.html

CERIF-tietomallin määrittely OpenAIRE tiedonsiirrossa (Definition of the CERIF data model in OpenAIRE data transfer)

CERIF - VIRTA mapping

https://docs.google.com/document/d/1Rm4OMOUf3JEti6aLmCrnSilX-sbutknFTR7njeItmBc/edit?usp=sharing


CERIF

Figure 1: ENRESSH Minimum Data Model in relation to CERIF data model and OpenAIRE Guidelines for CRIS Managers
