MINTSERVICES

2/6 – The Linked Heritage Project Aggregator

Linked Heritage (2011-2013) is an initiative coordinated by the Central Institute for the Union Catalogue of Italian Libraries, which reports to MiBAC, and it extends and implements the ATHENA results. It is a best practice network funded within FP7 that began in April 2011 and will run for 30 months. It will contribute new content to Europeana from both the public and private sectors (mainly publishers); improve the quality of content in terms of metadata richness, potential for reuse and uniqueness; explore the potential of cultural Linked Open Data; and enable better search, retrieval and use of the content published in Europeana.

The Linked Heritage content providers

The Linked Heritage consortium has members from twenty-two countries: culture ministries, government agencies, museums, libraries, national aggregators, major research centres, publishers and small businesses, as well as organisations contributing to Europeana for the first time, with 3 million records of various cultural content.

Aggregating content

Metadata Interoperability Services (MINT)

MINT (Metadata Interoperability Services) is a web-based platform designed and developed to facilitate aggregation initiatives for cultural heritage content and metadata in Europe.

It functions as a server for content ingestion and is based on open source software developed by the National Technical University of Athens (NTUA) in the context of the ATHENA project.

MINT allows content providers to upload, map, validate and deliver the metadata to be sent to Europeana in an entirely web-based environment.

The platform also provides a management system both for users and organisations that allows the deployment and operation of different aggregation schemas with corresponding user roles and access rights.

Mapping content

Metadata records are critical to documenting and maintaining the interrelationships between information resources, and are used to find, gather, and maintain resources over long periods of time. Descriptive, administrative, technical, and preservation metadata contribute to the management of information resources and help to ensure their intellectual integrity both now and in the future.

A crosswalk provides a mapping of metadata elements from one metadata schema to another. By making it possible to retrieve the same or similar content from different data sources, crosswalks support so-called semantic interoperability.
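As an illustration of the idea, a crosswalk can be sketched as a lookup table from source element names to target element names. The sketch below is not how MINT implements its mappings; the two provider schemas and their field names (`titolo`, `autore`, `name`, `maker`) are hypothetical, and only the Dublin Core target names are real.

```python
# Hypothetical crosswalks: two different provider schemas mapped onto
# the same Dublin Core target elements.
CROSSWALK_PROVIDER_A = {"titolo": "dc:title", "autore": "dc:creator"}
CROSSWALK_PROVIDER_B = {"name": "dc:title", "maker": "dc:creator"}


def apply_crosswalk(record, crosswalk):
    """Return a new record whose keys are the target schema's elements.

    Source fields without an entry in the crosswalk are dropped.
    Values are collected in lists because several source fields may
    map to the same target element.
    """
    target = {}
    for source_field, value in record.items():
        target_field = crosswalk.get(source_field)
        if target_field is None:
            continue  # no mapping defined for this source field
        target.setdefault(target_field, []).append(value)
    return target


record_a = {"titolo": "La Primavera", "autore": "Botticelli", "inventario": "8360"}
record_b = {"name": "The Night Watch", "maker": "Rembrandt"}

mapped_a = apply_crosswalk(record_a, CROSSWALK_PROVIDER_A)
mapped_b = apply_crosswalk(record_b, CROSSWALK_PROVIDER_B)
```

After mapping, both records expose `dc:title` and `dc:creator`, so a single query on `dc:title` reaches content that was originally described in two different schemas.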

The Linked Heritage Technology Platform, MINT, implements an aggregation infrastructure offering a crosswalk mechanism to support the following critical activities:

  • harvesting and aggregating metadata records, whether in standard or proprietary schemas
  • migrating from the content providers' models to a reference model
  • transforming records from the Linked Heritage model to the Europeana Semantic Elements and the Europeana Data Model.

LIDO as Linked Heritage metadata reference model

MINT allows mapping and transformation of metadata into LIDO records.

LIDO stands for Lightweight Information Describing Objects. It is the result of a collaborative effort of international stakeholders in the museum sector, starting in 2008, to create a common solution for contributing cultural heritage content to web applications.

LIDO is based on the CIDOC-CRM conceptual reference model. It results from the integration of the CDWA Lite and museumdat metadata schemas, and it is based on the SPECTRUM standard. Being an application of the CIDOC-CRM, it provides an explicit format for delivering (museum) object information in a standardised way.

MINT implemented LIDO as its intermediate harvesting schema. Initially conceived for the needs of the museum sector, LIDO is now used in cross-domain contexts, proving its adaptability and effectiveness in preserving the integrity of rich metadata.

The ESE metadata profile

MINT allows LIDO metadata records to be converted into Europeana Semantic Elements (ESE).

ESE is a data model consisting of a Dublin Core-based set of fields plus 12 Europeana-specific elements. Content providers must make their metadata conform to the ESE profile so that records display correctly in Europeana.
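To make the LIDO-to-ESE conversion more concrete, here is a minimal sketch that reads two values from a simplified LIDO fragment and emits them under ESE/Dublin Core names. It is not MINT's transformation and does not reproduce the official mapping table: the sample fragment is deliberately incomplete (not schema-valid), and the two mappings shown (record identifier to `dc:identifier`, title appellation to `dc:title`) are assumptions chosen for illustration.

```python
import xml.etree.ElementTree as ET

LIDO_NS = {"lido": "http://www.lido-schema.org"}

# Simplified, hypothetical LIDO fragment; a real record has many more
# mandatory parts.
SAMPLE_LIDO = """\
<lido:lido xmlns:lido="http://www.lido-schema.org">
  <lido:lidoRecID lido:type="local">example-0001</lido:lidoRecID>
  <lido:descriptiveMetadata xml:lang="en">
    <lido:objectIdentificationWrap>
      <lido:titleWrap>
        <lido:titleSet>
          <lido:appellationValue>Portrait of a Lady</lido:appellationValue>
        </lido:titleSet>
      </lido:titleWrap>
    </lido:objectIdentificationWrap>
  </lido:descriptiveMetadata>
</lido:lido>
"""


def lido_to_ese(xml_text):
    """Extract a few ESE-style fields from a LIDO record (illustrative only)."""
    root = ET.fromstring(xml_text)
    ese = {}

    # Assumed mapping: LIDO record identifier -> dc:identifier
    rec_id = root.findtext("lido:lidoRecID", namespaces=LIDO_NS)
    if rec_id:
        ese["dc:identifier"] = [rec_id.strip()]

    # Assumed mapping: title appellation values -> dc:title
    titles = [
        node.text.strip()
        for node in root.findall(
            ".//lido:titleSet/lido:appellationValue", LIDO_NS
        )
        if node.text
    ]
    if titles:
        ese["dc:title"] = titles

    return ese


ese_record = lido_to_ese(SAMPLE_LIDO)
```

A complete conversion would cover every row of the LIDO v1.0 to ESE v3.4 mapping table rather than the two fields used here.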

ESE is a subset of the Europeana Data Model (EDM), the new application profile that will be implemented in the coming months. The EDM profile is also supported by MINT (see: MINT screencast EDM Ingestion Tool).

Mandatory metadata elements

The mandatory ESE and LIDO metadata elements are the following:

  • dc:title
  • dc:type
  • europeana:type
  • dc:language (mandatory if europeana:type="TEXT")
  • dc:identifier
  • europeana:dataProvider
  • dc:source
  • europeana:isShownAt
  • europeana:object
  • europeana:isShownBy
  • europeana:rights
  • dc:rights
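A check against this list can be sketched as follows. The sketch treats every element as required exactly as listed above, with `dc:language` required only when `europeana:type` is `"TEXT"`; it is an illustration, not the Europeana Content Checker, and the record layout (a plain dictionary keyed by element name) is an assumption.

```python
# Elements that the list above marks as always mandatory.
ALWAYS_MANDATORY = [
    "dc:title",
    "dc:type",
    "europeana:type",
    "dc:identifier",
    "europeana:dataProvider",
    "dc:source",
    "europeana:isShownAt",
    "europeana:object",
    "europeana:isShownBy",
    "europeana:rights",
    "dc:rights",
]


def missing_mandatory(record):
    """Return the names of mandatory elements that are absent or empty."""
    missing = [name for name in ALWAYS_MANDATORY if not record.get(name)]
    # dc:language is mandatory only for textual objects.
    if record.get("europeana:type") == "TEXT" and not record.get("dc:language"):
        missing.append("dc:language")
    return missing


# Hypothetical record with every always-mandatory element filled in.
complete_image = {name: "value" for name in ALWAYS_MANDATORY}
complete_image["europeana:type"] = "IMAGE"

# Same record typed as TEXT but without dc:language.
text_without_language = dict(complete_image, **{"europeana:type": "TEXT"})

image_problems = missing_mandatory(complete_image)
text_problems = missing_mandatory(text_without_language)
```

Here `image_problems` is empty, while `text_problems` reports the missing `dc:language`.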


The complete mapping table LIDO v1.0 to ESE v3.4 by Regine Stein (Philipps-Universitaet Marburg - Bildarchiv Foto Marburg) is available in Use of Content in Linked Heritage and Europeana (v.5), Annex 3, prepared by the Linked Heritage DEA Task Force (see also Content aggregation: tools & guidelines).

Metadata flow

MINT functions as a metadata ingestion server, enabling content providers:

  • to upload their datasets, which can be structured in heterogeneous metadata schemas, and map them to LIDO
  • to transform metadata records into LIDO records and convert them into ESE
  • to validate content through the Europeana Content Checker
  • and to transmit content to the Europeana ingestion office via the OAI-PMH protocol

However, Europeana may ask content providers to recheck the quality of the content once it is published online and to assess possible problems.
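For readers unfamiliar with OAI-PMH, the final step of this flow amounts to a harvester issuing HTTP requests such as `ListRecords` against a repository endpoint. The sketch below only builds such a request URL. The verb and the `metadataPrefix` and `set` arguments are defined by the OAI-PMH protocol, but the base URL, the prefix value `ese` and the set name are hypothetical and do not describe the actual MINT endpoint.

```python
from urllib.parse import urlencode


def build_list_records_url(base_url, metadata_prefix, set_spec=None):
    """Build an OAI-PMH ListRecords request URL.

    `set_spec` optionally restricts the harvest to one set, for example
    the records of a single content provider.
    """
    params = {"verb": "ListRecords", "metadataPrefix": metadata_prefix}
    if set_spec is not None:
        params["set"] = set_spec
    return f"{base_url}?{urlencode(params)}"


# Hypothetical endpoint and set name, for illustration only.
harvest_url = build_list_records_url(
    "https://mint.example.org/oai", "ese", set_spec="provider_1001"
)
```

A harvester would fetch this URL and follow any resumption token in the response until the full set has been retrieved.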

The graphic below summarises the metadata ingestion flow in MINT and the metadata flow towards Europeana:

The Workflow. Linked Heritage, June 2013

Licensing content

Europeana Data Exchange Agreement

The Europeana Data Exchange Agreement (DEA) is the new licence adopted by Europeana in September 2011. Under the DEA, descriptive metadata (but not thumbnails) are subject to the Creative Commons CC0 1.0 Universal Public Domain Dedication, which effectively means releasing the metadata into the public domain and allowing their commercial reuse.

This allows Europeana to support open re-use of data and to publish metadata as Linked Open Data (LOD).

Signing the DEA is mandatory for all content providers who want to make their collections available in Europeana.

The Linked Heritage DEA Task Force

The Linked Heritage DEA Task Force was set up to present the Linked Heritage consortium with practical ways to fulfil the project duties (which imply signing the DEA) while keeping the integrity of their data.

The task force developed a strategy that gives content providers three options for metadata publication:

  1. Publish a minimal metadata set to Europeana: of the metadata that is supplied to the Linked Heritage ingestion tool by the Content Provider, only the LIDO & ESE mandatory elements will be transmitted to Europeana under the Creative Commons CC0 1.0 Universal Public Domain Dedication.

  2. Publish an intermediate metadata set to Europeana: of the metadata that is supplied to the Linked Heritage ingestion tool by the Content Provider, all metadata elements will be transmitted to Europeana under the Creative Commons CC0 1.0 Universal Public Domain Dedication, except the LIDO elements that result in dc:description. This means that no object description, the part that most likely contains sensitive or valuable content, will be shown on Europeana.

  3. Publish a full metadata set to Europeana: of the metadata that is supplied to the Linked Heritage ingestion tool by the Content Provider, all metadata elements will be transmitted to Europeana under the Creative Commons CC0 1.0 Universal Public Domain Dedication.

MINT implemented a filter option that lets content providers select their preferred option during the aggregation process (see screencast How to set a metadata filter in MINT).
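The effect of the three options can be sketched as a filter applied to each outgoing record. This is only an illustration of the task force's strategy, not MINT's filter: it works on an ESE-style dictionary rather than on LIDO elements, and the option names `minimal`, `intermediate` and `full` are labels chosen here.

```python
# Mandatory elements as listed in the "Mandatory metadata elements" section.
MANDATORY_ELEMENTS = {
    "dc:title", "dc:type", "europeana:type", "dc:language",
    "dc:identifier", "europeana:dataProvider", "dc:source",
    "europeana:isShownAt", "europeana:object", "europeana:isShownBy",
    "europeana:rights", "dc:rights",
}


def apply_publication_filter(record, option):
    """Return the subset of `record` to publish under the chosen option."""
    if option == "minimal":
        # Option 1: only the mandatory elements.
        return {k: v for k, v in record.items() if k in MANDATORY_ELEMENTS}
    if option == "intermediate":
        # Option 2: everything except the object description.
        return {k: v for k, v in record.items() if k != "dc:description"}
    if option == "full":
        # Option 3: all elements.
        return dict(record)
    raise ValueError(f"unknown filter option: {option!r}")


# Hypothetical record used to compare the three options.
sample = {
    "dc:title": "Portrait of a Lady",
    "dc:creator": "Unknown",
    "dc:description": "Detailed curatorial description...",
    "europeana:type": "IMAGE",
}

minimal = apply_publication_filter(sample, "minimal")
intermediate = apply_publication_filter(sample, "intermediate")
full = apply_publication_filter(sample, "full")
```

With this sample, `minimal` keeps only `dc:title` and `europeana:type`, `intermediate` drops `dc:description`, and `full` is an unchanged copy.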

Although only a minimum set of mandatory metadata is required, Europeana and Linked Heritage encourage content providers to publish the widest range of information that can be made available through the Europeana portal, both to improve the exposure and exploitation of content by end users and to make the user experience richer.

At present, most Linked Heritage partners have signed the DEA.

Providing content

The Linked Heritage methodology: workflow

Assessing the Linked Heritage content providers' digital collections (the Linked Heritage survey)

The first step in bringing content into Europeana is to assess the digital collections that content providers described in the Description of Work (PDF), available in the Reserved Area of the Linked Heritage site.

This assessment can easily be done, for example, through a template. Linked Heritage content providers were asked to complete a survey providing the following information:

  • Country
  • Data provider
  • Primary contact
  • Technical contact
  • Collection URL
  • Amount of metadata to be aggregated
  • Amount of digital objects linked to metadata
  • Object types: image, text, sound, video
  • Description
  • Metadata formats used
  • Rights

As Europeana aggregates only metadata, it is of paramount importance to ask for the amount of metadata and the amount of digital objects separately, because a one-to-one ratio between metadata records and digital objects is not always the rule.
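One way to capture such a survey response is a small record type that keeps the two amounts separate and makes the ratio explicit. This is a sketch only: the class and attribute names are hypothetical renderings of the survey fields listed above, and the sample values are invented.

```python
from dataclasses import dataclass, field


@dataclass
class CollectionSurvey:
    """One content provider's answer to the collection survey."""
    country: str
    data_provider: str
    primary_contact: str
    technical_contact: str
    collection_url: str
    metadata_records: int   # amount of metadata to be aggregated
    digital_objects: int    # amount of digital objects linked to metadata
    object_types: list = field(default_factory=list)  # image, text, sound, video
    description: str = ""
    metadata_formats: list = field(default_factory=list)
    rights: str = ""

    def objects_per_record(self):
        """Average number of digital objects per metadata record."""
        if self.metadata_records == 0:
            return 0.0
        return self.digital_objects / self.metadata_records


# Invented example: one record may describe several digitised pages.
survey = CollectionSurvey(
    country="Italy",
    data_provider="Example Library",
    primary_contact="primary@example.org",
    technical_contact="tech@example.org",
    collection_url="https://example.org/collection",
    metadata_records=1000,
    digital_objects=2500,
    object_types=["image", "text"],
)

ratio = survey.objects_per_record()
```

In this invented case the ratio is 2.5 digital objects per metadata record, which is the kind of discrepancy the survey is meant to surface.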

Training and training materials

Face-to-face training sessions with the project content providers were organised to train them on LIDO mapping and MINT use.

After the training workshop, documents were delivered to all project partners.

Moreover, a specific section devoted to tools and guidelines for content aggregation was published at the same time on the Linked Heritage website.

The Help-desk service and Frequently Asked Questions

A help-desk service was set up at the beginning of the project to help content providers with their problems. FAQs were also compiled and posted on the Linked Heritage website.

Community

A workflow and feedback methodology is fundamental to assisting content providers and keeping the aggregation process under control; it also helps to build a sense of community.

Periodic interviews, constant review of the main aggregation issues, and the analysis of data reports from MINT, together with the ongoing updating of training materials, are crucial tasks for the overall success of the project.

Linked Heritage & Europeana workflows

The figure below summarises the way that metadata are contributed to Europeana through the Linked Heritage project.


Linked Heritage & Europeana Workflows. Michael Hopwood (EDItEUR), December 2011