Globally Unique IDs (GUID)
Overview
In collaboration with the community via an open RFC process, iDigBio released a GUID statement regarding GUID usage requirements and effective practices for selecting and assigning GUIDs. Additional questions, comments, definitions, and information were generated pertaining to the GUID statement. That information is contained within this wiki as an appendix and is open for public discussion and enhancement.
Object Services to be Provided by iDigBio
Standard object services include ones that accept an object identifier and produce a webpage for people to learn about the object, a metadata document about the object, or the digital object itself. The iDigBio portal will provide all of these services for the digital objects accessible from the portal. Provider organizations may choose to provide some or all of these services for their objects.
Each request for metadata about a digital object will return a metadata document in some particular format. iDigBio services will produce metadata documents in RDF, RSS and JSON formats. Other formats will be supported as needed.
RDF metadata documents are of particular interest because of their use in the Linked Data protocol, which uses HTTP GET requests to access web pages and metadata. If the request includes the header Accept: application/rdf+xml, the service will return an RDF document containing the object metadata. Without that header, the web server will return a web page. iDigBio will use the Linked Data protocol for serving web pages and metadata documents.
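The header-based choice between a web page and an RDF document can be sketched as follows. This is a minimal illustration, not iDigBio's implementation: the function names choose_representation and build_metadata_request are hypothetical, and the sketch ignores quality values (q=) in the Accept header, which a full content-negotiation implementation would need to honor.

```python
import urllib.request

RDF_MEDIA_TYPE = "application/rdf+xml"


def choose_representation(accept_header):
    """Return "rdf" if the client asked for RDF/XML, otherwise "html".

    Simplification (an assumption of this sketch): q-values and wildcards
    in the Accept header are not interpreted.
    """
    if not accept_header:
        return "html"
    for item in accept_header.split(","):
        # Drop any parameters such as ";q=0.9" before comparing.
        media_type = item.split(";")[0].strip().lower()
        if media_type == RDF_MEDIA_TYPE:
            return "rdf"
    return "html"


def build_metadata_request(object_uri):
    """Build (but do not send) a GET request that asks for RDF metadata."""
    return urllib.request.Request(object_uri, headers={"Accept": RDF_MEDIA_TYPE})
```

A client would pass the request returned by build_metadata_request to urllib.request.urlopen; on the server side, the value of the incoming Accept header would be passed to choose_representation.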
The URI pattern need not be associated with any software structure of a Web server. A variety of standard techniques may be used to associate a URI with a webpage. All web application systems have strategies for mapping URLs onto the structure of the web application.
iDigBio Proxy Services
The iDigBio portal will be capable of redirecting object service requests to provider services, in order to assist providers in creating and managing both identifiers and object services. The portal will serve as a proxy for the provider. The portal will include a facility for registering URI patterns and service endpoints. When a proxy request is received, the portal will use standard HTTP capabilities to redirect the request to the provider service.
In the example below, the herbarium collection of the Florida Museum of Natural History has registered a URI pattern with iDigBio. A request for information about the particular digital object is sent by a user to the proxy server at iDigBio.
http://proxy.idigbio.org/?q=http://ids.flmnh.ufl.edu/herb/abcd12345678
The proxy server will send the user a response to redirect the request to the following URL for processing by the Museum’s object services.
http://services.flmnh.ufl.edu/herb/?id=abcd12345678
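The mapping from a registered URI pattern to a provider endpoint might look like the sketch below. The registry structure, the prefix-matching rule, and the name redirect_target are assumptions made for illustration; the source does not specify how patterns are registered or matched.

```python
from urllib.parse import parse_qs, urlsplit

# Hypothetical registry: identifier prefix registered by a provider,
# mapped to a template for that provider's object service.
REGISTERED_PATTERNS = {
    "http://ids.flmnh.ufl.edu/herb/":
        "http://services.flmnh.ufl.edu/herb/?id={local_id}",
}


def redirect_target(proxy_request_url):
    """Return the provider URL to redirect to, or None if nothing matches."""
    query = parse_qs(urlsplit(proxy_request_url).query)
    identifiers = query.get("q")
    if not identifiers:
        return None
    identifier = identifiers[0]
    for prefix, template in REGISTERED_PATTERNS.items():
        if identifier.startswith(prefix):
            local_id = identifier[len(prefix):]
            if local_id:
                return template.format(local_id=local_id)
    return None
```

A proxy built this way would answer the request with an HTTP redirect status and put the returned URL in the Location header, or report an error when the function returns None.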
Version management
Identifiers can be used to represent objects whose content is subject to change. iDigBio intends to provide a service for fetching a particular version of an object by date or version number. If the content of an object changes so much that it can be considered a different object, a different identifier should be attached to the new object.
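Fetching a version by date or by version number could be sketched as below. The class VersionedObject and its methods are hypothetical; the sketch assumes that versions are numbered from 1 and recorded in chronological order, neither of which is stated by iDigBio.

```python
import bisect
from datetime import date


class VersionedObject:
    """Keeps every version of one identified object's content."""

    def __init__(self):
        self._dates = []     # sorted; entry i belongs to version i + 1
        self._contents = []

    def add_version(self, when, content):
        """Record a new version and return its version number."""
        if self._dates and when < self._dates[-1]:
            raise ValueError("versions must be added in chronological order")
        self._dates.append(when)
        self._contents.append(content)
        return len(self._contents)

    def by_number(self, number):
        """Return the content of the given version number (1-based)."""
        if not 1 <= number <= len(self._contents):
            raise KeyError(number)
        return self._contents[number - 1]

    def as_of(self, when):
        """Return the content that was current on the given date."""
        index = bisect.bisect_right(self._dates, when)
        if index == 0:
            raise KeyError(when)  # the object did not exist yet
        return self._contents[index - 1]
```

The identifier stays the same across all versions held by one VersionedObject; content that has changed enough to count as a different object would get a new identifier and a new history.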
URI Resolution Services
A URI resolution service associates object delivery and metadata services with particular URIs.
Darwin Core Triples and Embedding Information in URIs
GBIF has previously relied on the Darwin Core triple of institution code, collection code and catalog number for specimen identification. Unfortunately, these triples have not provided reliable identification of all occurrences over time. GBIF is now advocating identifiers like those described in the iDigBio “Guidelines for Managing Persistent Identifiers” for all data. Darwin Core triples can form the basis of persistent identifiers, but must be supported with proper services and curation.
It is important to recognize that embedding information in identifiers creates problems. In particular, people will assume that the presence of an institution code in a specimen identifier, as recommended by iDigBio, means that the specimen is owned by or housed at that institution. We must continue to emphasize to all users that identifiers contain no reliable information. Metadata for an object must be acquired from an object service and not by parsing an identifier.
The major advantage of embedding information in an identifier is that the service provider can use the embedded information to aid in finding the metadata record for the identifier. However, each provider must maintain information about identifiers that supports finding metadata records when the embedded information is out of date, and that supports reporting on the state of inactive identifiers (those that no longer identify an object).
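One way a provider might combine the embedded hint with its own curated records is sketched below. Everything here is illustrative: the colon-separated identifier layout, the class IdentifierIndex, and the status values are assumptions of this sketch, not an iDigBio specification. The curated information is consulted first, so the embedded information is only ever used as a lookup shortcut.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Resolution:
    status: str                   # "active", "inactive", or "unknown"
    record: Optional[dict] = None
    note: str = ""


def embedded_key(identifier):
    """Parse a hypothetical 'institution:collection:catalog' identifier."""
    parts = identifier.split(":")
    return tuple(parts) if len(parts) == 3 and all(parts) else None


class IdentifierIndex:
    def __init__(self, records):
        self._records = records   # record key -> metadata record
        self._reassigned = {}     # identifier -> current record key
        self._inactive = {}       # identifier -> explanation

    def reassign(self, identifier, record_key):
        """Note that the embedded information no longer locates the record."""
        self._reassigned[identifier] = record_key

    def retire(self, identifier, note):
        """Mark an identifier as no longer identifying an object."""
        self._inactive[identifier] = note

    def resolve(self, identifier):
        # Curated state wins over anything parsed from the identifier.
        if identifier in self._inactive:
            return Resolution("inactive", note=self._inactive[identifier])
        key = self._reassigned.get(identifier) or embedded_key(identifier)
        if key in self._records:
            return Resolution("active", record=self._records[key])
        return Resolution("unknown")
```

Clients still receive metadata only through the service; the parsing in embedded_key stays internal to the provider.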