
Rich markup

Pages in CollectionBuilder contain machine-readable rich markup that follows several standards to improve discoverability, search representation, and social media sharing.

Dublin Core

Dublin Core elements are added to Item pages driven by the "dc_map" column of config-metadata.csv.

Choose mapping options directly from the DCMI Terms namespace: http://purl.org/dc/terms/ Note: the original 15-element DCMI Elements namespace is mirrored in the Terms namespace (i.e. both have "title", "creator", etc.); however, using the newer Terms namespace is preferred.

The values are added directly to the meta tag's name attribute, so they should use the DCTERMS prefix.

For example, to use Dublin Core Title from the "Terms" namespace, the value would be DCTERMS.title. To use abstract, DCTERMS.abstract. If the "dc_map" column is empty, no DC meta tags will be added.
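
The mapping described above can be sketched in a few lines of Python. This is an illustration only, not CollectionBuilder's actual Liquid template: the "field" column name, the sample rows, the sample item, and the function name dc_meta_tags are all assumptions made for the example.

```python
import csv
import html
import io

# Hypothetical excerpt of config-metadata.csv. Only "dc_map" is named in this
# document; the "field" column and these rows are assumed for illustration.
CONFIG_CSV = """field,dc_map
title,DCTERMS.title
creator,DCTERMS.creator
identifier,
"""

def dc_meta_tags(config_csv, item):
    """Build one <meta> tag per metadata field that has a dc_map value."""
    tags = []
    for row in csv.DictReader(io.StringIO(config_csv)):
        dc_name = (row.get("dc_map") or "").strip()
        value = item.get(row["field"])
        # An empty dc_map (or an empty item value) produces no tag.
        if not dc_name or not value:
            continue
        content = html.escape(str(value), quote=True)
        tags.append('<meta name="{}" content="{}" />'.format(dc_name, content))
    return tags

# Hypothetical item metadata.
item = {"title": 'Map of "Moscow"', "creator": "Smith, Jane", "identifier": "demo_001"}
tags = dc_meta_tags(CONFIG_CSV, item)
print("\n".join(tags))
```

The "identifier" row has an empty "dc_map" value, so it contributes no tag, which mirrors the rule that unmapped fields are skipped.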

Recommended fields to map include:

  • DCTERMS.title
  • DCTERMS.creator
  • DCTERMS.created or DCTERMS.date
  • DCTERMS.description
  • DCTERMS.subject
  • DCTERMS.type

This implementation is based on DSpace, following the DC-HTML conventions.

Open Graph Protocol

Open Graph provides basic metadata in an open standard used by social media sites to generate representations of links shared on the platform. Open Graph was established by Facebook, but can be read by other platforms.

OG meta tags are automatically added to every page and are not configurable. They provide an authoritative title, description, and image that can be used to represent a link to the page. The OG image will be the item image/thumb in the case of item pages, or the site featured image for all other pages. For example:

<meta property="og:title" content="{{ page.title | escape }}" />
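
The image selection rule described above (item image/thumb on item pages, site featured image elsewhere) can be sketched as follows. This is a rough Python illustration, not the project's Liquid include: the page keys ("layout", "image", "thumb"), the order in which image and thumb are tried, and the function name og_meta_tags are assumptions.

```python
import html

def og_meta_tags(page, site_featured_image):
    """Return Open Graph <meta> tags for one page (illustrative only)."""
    if page.get("layout") == "item":
        # Assumed order: full image first, then thumb, then the site image.
        image = page.get("image") or page.get("thumb") or site_featured_image
    else:
        image = site_featured_image
    properties = [
        ("og:title", page.get("title")),
        ("og:description", page.get("description")),
        ("og:image", image),
    ]
    # Skip properties with no value and escape the rest for use in HTML.
    return [
        '<meta property="{}" content="{}" />'.format(name, html.escape(value, quote=True))
        for name, value in properties
        if value
    ]

# Hypothetical pages and paths.
item_page = {
    "layout": "item",
    "title": "Dogs & Cats",
    "description": "A photo.",
    "thumb": "/objects/thumbs/demo_001_th.jpg",
}
about_page = {"layout": "about", "title": "About"}

print("\n".join(og_meta_tags(item_page, "/assets/featured.jpg")))
print("\n".join(og_meta_tags(about_page, "/assets/featured.jpg")))
```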

Schema.org

Schema is a standard designed to provide structured semantic markup so that search engines can better understand the content of web pages. The concepts it describes apply to a generalized web landscape, centered mostly around commercial sites, and don't necessarily follow the logic and structure of library-based metadata or digital collections. However, it is useful to provide the markup to drive better representations of the data in search results. See Full Schema hierarchy, or Google Guide to Structured Data.

Keep in mind that although Schema is an open standard, Google is its biggest consumer, so information found in Google's Developer Docs is often more practically useful (e.g. Google highly recommends using JSON-LD, whereas Schema suggests microdata). Markup can be tested using Google's Structured Data Testing Tool.

The Schema markup differs by page type:

Item pages

Item pages have in-depth Schema markup in JSON-LD format driven by the object metadata. Schema elements are driven by the "schema_map" column of config-metadata.csv. Each item page is given the basic type of CreativeWork, so metadata fields can be mapped to any of the properties listed in the CreativeWork documentation. Copy the exact property name, as this value will be turned into Schema JSON-LD markup. If the "schema_map" column is empty, only the automatically generated markup will be added.

Suggested field mappings include:

  • headline (i.e. the title)
  • creator
  • dateCreated
  • description
  • keywords
  • contentLocation
  • encodingFormat (MIME type; should match the format field of CollectionBuilder items)
  • license (should only be used with a standardized rights URL)

Additionally, the Schema type, isPartOf (the collection), image (url), thumbnailUrl (url), and page url will be added automatically.
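
As a rough illustration of how mapped and automatic properties combine, the Python sketch below builds a CreativeWork JSON-LD object. It is not the project's template: the function name item_json_ld, the sample metadata, the example URLs, and the exact shape of the isPartOf value are assumptions.

```python
import json

def item_json_ld(item, schema_map, collection_name, page_url, image_url, thumb_url):
    """Build CreativeWork JSON-LD from mapped fields plus automatic properties."""
    data = {"@context": "https://schema.org", "@type": "CreativeWork"}
    # schema_map: metadata field name -> Schema property name, as it would be
    # read from the "schema_map" column; unmapped or empty fields are skipped.
    for field, prop in schema_map.items():
        value = item.get(field)
        if prop and value:
            data[prop] = value
    # Properties the document says are added automatically.
    # The nested shape of isPartOf is an assumption for this sketch.
    data["isPartOf"] = {"@type": "CreativeWork", "name": collection_name}
    data["image"] = image_url
    data["thumbnailUrl"] = thumb_url
    data["url"] = page_url
    return json.dumps(data, indent=2)

# Hypothetical mapping, item, and URLs.
schema_map = {"title": "headline", "creator": "creator", "date": "dateCreated", "identifier": ""}
item = {"title": "Main Street, 1910", "creator": "Smith, Jane", "date": "1910", "identifier": "demo_001"}

markup = item_json_ld(
    item,
    schema_map,
    collection_name="Demo Collection",
    page_url="https://example.org/items/demo_001.html",
    image_url="https://example.org/objects/demo_001.jpg",
    thumb_url="https://example.org/objects/thumbs/demo_001_th.jpg",
)
print('<script type="application/ld+json">\n' + markup + "\n</script>")
```

Because the property names are copied straight from the mapping into the JSON keys, a misspelled value in "schema_map" would end up as an unrecognized property in the output.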

Note: in the future, our base item type may move to ArchiveComponent (https://schema.org/ArchiveComponent) when this spec is fully integrated into the standard. An alternative approach would be to use ItemPage (https://schema.org/ItemPage) to describe the object pages, although this seems less direct.

Item pages are also marked up with Schema BreadcrumbList to represent their nesting in the site, which may be represented in search results.
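
For reference, a BreadcrumbList is an ordered list of ListItem entries, each with a position, name, and item URL. The Python sketch below generates one; the breadcrumb trail (Home, Browse, item title), the URLs, and the function name breadcrumb_json_ld are assumptions, and the nesting CollectionBuilder actually emits may differ.

```python
import json

def breadcrumb_json_ld(crumbs):
    """Build BreadcrumbList JSON-LD from (name, url) pairs, outermost first."""
    return {
        "@context": "https://schema.org",
        "@type": "BreadcrumbList",
        "itemListElement": [
            {"@type": "ListItem", "position": position, "name": name, "item": url}
            for position, (name, url) in enumerate(crumbs, start=1)
        ],
    }

# Hypothetical trail for one item page.
crumbs = [
    ("Home", "https://example.org/"),
    ("Browse", "https://example.org/browse.html"),
    ("Main Street, 1910", "https://example.org/items/demo_001.html"),
]
breadcrumbs = breadcrumb_json_ld(crumbs)
print(json.dumps(breadcrumbs, indent=2))
```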

Data page

The Data page includes Schema markup in JSON-LD representing the various data derivatives that can be downloaded (implemented in _includes/data-download-modal.html which is included by the data layout). See Google dataset docs and Schema Dataset for details behind this implementation.

The full metadata downloads in CSV and JSON are added automatically. A metadata facets JSON file is added if fields are set in the theme option "metadata-facets-fields". Additional datasets are selected based on which pages are in config-nav, following the same logic used to select which data download buttons are shown to users. If config-nav contains one of the following "stub" values, the corresponding data files are added to the markup:

  • "subject", subjects.csv, json
  • "location", locations.csv, json
  • "map", geodata.json
  • "timeline", timelinejs.json

This may not be accurate for all use cases. An easy way to set the downloads manually is to create a list based on the stub values shown above and edit the "stubs" assigned in the data.html layout. For example, to show all data downloads even though the pages are not in the navigation or are named something different, edit the "assign stubs" line in data.html like this:

{%- assign stubs = "subject;map;location;timeline" -%}

Also check docs/data.md for more information.
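
The stub-to-file selection can be pictured with the Python sketch below. It is an approximation of the logic, not the layout's Liquid code: it reads "subjects.csv, json" as subjects.csv plus subjects.json, assumes a simple substring test against the semicolon-separated stubs string, and the names STUB_FILES and data_files are made up for the example.

```python
# Stub value -> derivative files described in the markup (from the list above,
# reading "subjects.csv, json" as subjects.csv plus subjects.json).
STUB_FILES = {
    "subject": ["subjects.csv", "subjects.json"],
    "location": ["locations.csv", "locations.json"],
    "map": ["geodata.json"],
    "timeline": ["timelinejs.json"],
}

def data_files(stubs):
    """Return the extra data files implied by a ';'-separated stubs string."""
    files = []
    for stub, names in STUB_FILES.items():
        # Assumed matching rule: the stub appears anywhere in the string.
        if stub in stubs:
            files.extend(names)
    return files

# A navigation with only Browse and Map pages.
print(data_files("browse;map"))
# The manual override shown above enables every derivative.
print(data_files("subject;map;location;timeline"))
```

The full metadata CSV and JSON downloads are left out of the sketch because they are always included regardless of the stubs.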

Content pages

Other pages in the site receive more basic markup from the head/page-meta.html include following the WebPage schema type.