Identifiers.org Compact Identifier services
The Identifiers.org resolution system provides consistent access to life science data using Compact Identifiers (CIDs). A CID consists of an assigned unique prefix and a local, provider-designated accession number, written as prefix:accession. The resolving location of a CID is determined using information stored in the Identifiers.org Registry, which contains high-quality, manually curated information on over 700 life science data resources (largely databases).
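The prefix:accession structure described above can be illustrated with a minimal Python sketch. The names CompactIdentifier and parse_cid are hypothetical and are not part of any Identifiers.org library. The sketch assumes the first colon separates prefix from accession, and it deliberately leaves out case normalisation and registry lookups.

```python
from typing import NamedTuple


class CompactIdentifier(NamedTuple):
    """Hypothetical container for the two parts of a CID."""
    prefix: str
    accession: str


def parse_cid(cid: str) -> CompactIdentifier:
    """Split a 'prefix:accession' string at the first colon.

    Assumption: the first colon is the separator, so an accession
    may itself contain further colons.
    """
    prefix, sep, accession = cid.strip().partition(":")
    if not sep or not prefix or not accession:
        raise ValueError(f"not a compact identifier: {cid!r}")
    return CompactIdentifier(prefix, accession)


if __name__ == "__main__":
    for text in ("pdb:2gc4", "GO:0006915"):
        print(parse_cid(text))
```

Splitting on the first colon only keeps the function usable for accessions that contain a colon of their own.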
The prefix assignment process registers a unique prefix for each life science data collection and records a variety of useful metadata, including a description of the data resource, the accession identifier pattern and a list of known resolving locations. When a Compact Identifier (for example, pdb:2gc4 or GO:0006915) is presented to the Identifiers.org resolver, redirection can be accomplished in either a resource-specified or a location-independent (resource-unspecified) manner. The latter method takes into consideration information such as the uptime and reliability of all available hosting resources.
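As a rough sketch of the two redirection modes, the Python below builds a resolver URL with or without a named resource. Several things here are assumptions rather than details from this abstract: the URL layouts (https://identifiers.org/prefix:accession and https://identifiers.org/provider/prefix:accession), the provider code "rcsb", and the accession pattern shown for pdb, which is a hand-written stand-in for a registry record. The function name resolver_url is hypothetical.

```python
import re
from typing import Optional

# Assumed resolver base URL; not stated in the text above.
RESOLVER_BASE = "https://identifiers.org"

# Hypothetical stand-in for the accession pattern stored in one
# registry record. The real pattern lives in the Registry.
EXAMPLE_PATTERNS = {"pdb": r"^[0-9][A-Za-z0-9]{3}$"}


def resolver_url(prefix: str, accession: str,
                 provider_code: Optional[str] = None) -> str:
    """Build a resolver URL for a compact identifier.

    Without provider_code the URL is resource-unspecified, leaving
    the choice of hosting resource to the resolver. With it, the
    URL names one resource explicitly (assumed URL layout).
    """
    pattern = EXAMPLE_PATTERNS.get(prefix)
    if pattern is not None and re.fullmatch(pattern, accession) is None:
        raise ValueError(
            f"accession {accession!r} does not match pattern for {prefix!r}")
    cid = f"{prefix}:{accession}"
    if provider_code is None:
        return f"{RESOLVER_BASE}/{cid}"
    return f"{RESOLVER_BASE}/{provider_code}/{cid}"


if __name__ == "__main__":
    print(resolver_url("pdb", "2gc4"))          # resource-unspecified
    print(resolver_url("pdb", "2gc4", "rcsb"))  # resource-specified
```

Checking the accession against the recorded pattern before building the URL lets a client reject malformed identifiers without a network round trip.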
Besides resolution, Identifiers.org provides a number of additional services. One of these is a metadata service: when presented with a CID, it harvests and displays the Schema.org (and Bioschemas) metadata markup associated with the corresponding dataset.
We have also re-engineered the system to provide scalable, highly available and low-latency services within global scientific e-infrastructures. We have deployed the Identifiers.org infrastructure in multiple cloud environments, including Amazon Web Services and Google Cloud Platform, bringing our services closer to the data. The new Identifiers.org system benefits from the auto-scaling and multiple-zone availability afforded by cloud provision.