Presentation type: Lightning talk
Abstract: Biblissima, a digital facility for medieval and Renaissance written cultural heritage, maintains an experimental discovery environment for searching across IIIF-compliant collections of manuscripts and rare books (materials dated before 1800). It has been online at https://iiif.biblissima.fr/collections since December 2018. The first version aggregates about 60,000 Manifests from many IIIF repositories in Europe and beyond: Gallica (BnF), Digital.Bodleian, e-codices, the BVMM, Parker on the Web, and the Bibliothèque Mazarine. More will follow in the coming weeks.
The general approach of this prototype is not only to harvest and index the Manifests’ metadata as present at the source, but also to reconcile, cluster and normalise some of the metadata elements in order to support powerful search and faceting. This data processing leverages the large cluster of medieval and early modern authorities that forms the backbone of the main Biblissima portal (https://biblissima.fr).
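To make the clustering and normalisation step concrete, here is a minimal sketch in Python. It uses fingerprint keying (a generic technique popularised by tools such as OpenRefine), which is an assumption made for illustration and not necessarily the method used in the Biblissima prototype. The function names `fingerprint` and `cluster_names` are hypothetical.

```python
import re
import unicodedata
from collections import defaultdict


def fingerprint(name: str) -> str:
    """Build an order-insensitive, accent-insensitive key for a name string."""
    # Decompose accented characters, then drop the combining marks.
    decomposed = unicodedata.normalize("NFKD", name)
    unaccented = "".join(ch for ch in decomposed if not unicodedata.combining(ch))
    # Lowercase and replace punctuation with spaces.
    cleaned = re.sub(r"[^\w\s]", " ", unaccented.lower())
    # Deduplicate and sort tokens so that word order does not matter.
    return " ".join(sorted(set(cleaned.split())))


def cluster_names(names: list[str]) -> dict[str, list[str]]:
    """Group the variant strings that share the same fingerprint."""
    clusters: dict[str, list[str]] = defaultdict(list)
    for name in names:
        clusters[fingerprint(name)].append(name)
    return dict(clusters)
```

With this sketch, variant strings such as "Pizan, Christine de" and "Christine de Pizan" fall into the same cluster. Each cluster would still need to be reconciled against an authority record, which this example does not attempt.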
Eventually, this application seeks to build on the work being done by the IIIF Discovery Technical Specification group. We also intend to collaborate with the IIIF Manuscripts community group to help promote best practices regarding search and discovery of Manifests (e.g. use of the seeAlso property, use of profile identifiers), and to define and share mappings between existing metadata formats. More broadly, it can be seen as an experiment to make concrete progress towards live discovery interfaces that allow users to search, browse and find IIIF resources kept in institutional silos.
In this presentation, we will tackle the main technical challenges we have faced while building this prototype: the lack of a common mechanism to harvest Manifests from IIIF providers, the infrequent use of "seeAlso" links, the availability of structured metadata and the diversity of source formats (even within the same domain, working on the same type of documents), and the extreme diversity of strings used to represent the same named entity, without any reference to linked open authority files. These are the issues we are trying to address in this domain-specific prototype, based on our experience with the Biblissima portal. We will then describe the process by which we aggregate the data, and finally present some potential development avenues for the web application.
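As an illustration of the "seeAlso" challenge mentioned above, the following Python sketch collects the seeAlso links from an already parsed Manifest. It assumes the shapes allowed by the IIIF Presentation API: in version 2 the value may be a bare URI string, an object with "@id", or a list of either, while version 3 uses a list of objects with "id". The function name `see_also_links` is hypothetical, and this is not the code of the Biblissima prototype.

```python
from typing import Any


def see_also_links(manifest: dict[str, Any]) -> list[dict[str, Any]]:
    """Return the seeAlso entries of a Manifest in one uniform shape."""
    raw = manifest.get("seeAlso", [])
    # A single string or object is allowed in Presentation API 2.x.
    if not isinstance(raw, list):
        raw = [raw]

    links = []
    for entry in raw:
        if isinstance(entry, str):
            # Bare URI: no format or profile is declared.
            links.append({"id": entry, "format": None, "profile": None})
        elif isinstance(entry, dict):
            # "id" in version 3, "@id" in version 2.
            uri = entry.get("id") or entry.get("@id")
            if uri:
                links.append({
                    "id": uri,
                    "format": entry.get("format"),
                    "profile": entry.get("profile"),
                })
    return links
```

An aggregator could use the "format" and "profile" values, when present, to decide which parser to apply to the linked metadata record. A Manifest without seeAlso simply yields an empty list.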
Topics:
- Discovering IIIF resources
Keywords:
- discovery
- manuscripts
- rare books
- aggregation
- search engine