iDigBio API v1 Specification
Revision as of 11:13, 20 May 2014
API Version Information
This is the specification for v1 of the iDigBio API. Previous versions of the API continue to exist but should be considered deprecated. API users should migrate to the current version of the API. This document supersedes iDigBio API v0 Specification.
iDigBio Data and Schema
Data elements generally conform to the Biodiversity Information Standards (also known as the Taxonomic Databases Working Group or TDWG) Darwin Core and Audubon Core.
The iDigBio Data Ingestion Requirements and Guidelines may be useful to understand how data becomes available in iDigBio.
Endpoints
Unless otherwise noted, successful responses from the API will return a JSON-formatted document.
Most of the provided examples include a JSON formatter (such as json_pp) to make the output easier for humans to read. Additional usage examples, as well as information on JSON formatting and the "curl" command, are available in iDigBio API Examples.
There are two major types of API endpoints:
- Collection - A group endpoint that returns lists of multiple records. These URLs are of the form <base url>/<version>/<type>, such as http://api.idigbio.org/v1/mediarecords/ . A collection endpoint also accepts two optional query parameters: ?limit sets the number of records returned in the collection (default 1000), and ?offset sets the number of records to skip before returning a set of records (default 0). If a collection request matches more than the set limit of records, the response includes a "next page" link to retrieve the next set of records in the collection. See the endpoint properties section for more information on the properties returned.
- Entity - A single-item endpoint that returns all of the data available about an object. These URLs are of the form <base url>/<version>/<type>/<id>, as in the entity example below.
Examples:
collection: "http://api.idigbio.org/v1/mediarecords"
collection w/ optional query parameters: "http://api.idigbio.org/v1/mediarecords?limit=100&offset=100"
entity: "http://api.idigbio.org/v1/mediarecords/00000230-01bc-4a4f-8389-204f39da9530"
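The two URL forms can also be assembled programmatically. The following is a minimal Python sketch rather than part of the API: the helper names `collection_url` and `entity_url` are hypothetical, and only the base URL, the version, and the `limit`/`offset` parameters come from this specification.

```python
from urllib.parse import urlencode

BASE_URL = "http://api.idigbio.org"  # base URL used throughout this specification
API_VERSION = "v1"


def collection_url(record_type, limit=None, offset=None):
    """Build a collection URL: <base url>/<version>/<type>[?limit=..&offset=..]."""
    url = f"{BASE_URL}/{API_VERSION}/{record_type}"
    params = {}
    if limit is not None:
        params["limit"] = limit
    if offset is not None:
        params["offset"] = offset
    # Omit the query string entirely so the API applies its own defaults.
    return f"{url}?{urlencode(params)}" if params else url


def entity_url(record_type, entity_id):
    """Build an entity URL: <base url>/<version>/<type>/<id>."""
    return f"{BASE_URL}/{API_VERSION}/{record_type}/{entity_id}"


print(collection_url("mediarecords"))
print(collection_url("mediarecords", limit=100, offset=100))
print(entity_url("mediarecords", "00000230-01bc-4a4f-8389-204f39da9530"))
```

The three printed URLs correspond to the three examples listed above.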
GET /
- Description
- Returns a list of top-level api_version or service URLs
- Resource URL
http://api.idigbio.org/
- Optional Parameters
- None
- Sample Usage
$ curl -s http://api.idigbio.org/ | json_pp
{
   "v1" : "http://api.idigbio.org/v1/",
   "check" : "http://api.idigbio.org/check",
   "v0" : "http://api.idigbio.org/v0/"
}
GET /{api_version}
- Description
- Returns a list of top-level API feature types for a particular version of the API
- Resource URL
http://api.idigbio.org/v1
- Optional Parameters
- None
- Sample Usage
$ curl -s http://api.idigbio.org/v1 | json_pp
{
   "aggregates" : "http://api.idigbio.org/v1/aggregates",
   "records" : "http://api.idigbio.org/v1/records",
   "mediaaps" : "http://api.idigbio.org/v1/mediaaps",
   "taxa" : "http://api.idigbio.org/v1/taxa",
   "people" : "http://api.idigbio.org/v1/people",
   "organizations" : "http://api.idigbio.org/v1/organizations",
   "recordsets" : "http://api.idigbio.org/v1/recordsets",
   "mediarecords" : "http://api.idigbio.org/v1/mediarecords"
}
- Notes
- Some of the listed feature types may be deprecated. This will be noted elsewhere in the API specification document.
GET /v1/aggregates
- Description
- Deprecated, do not use.
GET /v1/mediaaps
- Description
- Deprecated, do not use.
GET /v1/mediarecords
- Description
- Returns a collection (list) of Media Record IDs.
- Resource URL
http://api.idigbio.org/v1/mediarecords
- Optional Parameters
| parameter | valid values | detailed description |
|---|---|---|
| limit | | Controls the number of records returned by a collection URL. Large values may cause HTTP requests to time out. |
| offset | | Controls the starting record offset when paging through the API. Large offsets are extremely inefficient, so combinations of small limits and large offsets may cause requests to fail. |
- Sample Usage
Request the first 5 media record entity ids:
$ curl -s "http://api.idigbio.org/v1/mediarecords?limit=5" | json_pp
{
   "idigbio:errors" : [],
   "idigbio:links" : {
      "idigbio:nextPage" : "http://api.idigbio.org/v1/mediarecords?limit=5&offset=5"
   },
   "idigbio:items" : [
      {
         "idigbio:links" : {
            "mediarecord" : "http://api.idigbio.org/v1/mediarecords/000003cd-0cca-421b-8f26-f557a26b0393"
         },
         "idigbio:uuid" : "000003cd-0cca-421b-8f26-f557a26b0393",
         "idigbio:version" : 1,
         "idigbio:etag" : "ce3e2f7272ec996bb479c87549ba90c15ba96426",
         "idigbio:dateModified" : "2014-04-21T22:19:27.436Z"
      },
      {
         "idigbio:links" : {
            "mediarecord" : "http://api.idigbio.org/v1/mediarecords/00000728-ffb3-4a68-9f93-137f19961121"
         },
         "idigbio:uuid" : "00000728-ffb3-4a68-9f93-137f19961121",
         "idigbio:version" : 3,
         "idigbio:etag" : "ef2cac326a60d89d8cb9005abaa82068bfa83565",
         "idigbio:dateModified" : "2014-04-24T05:03:56.782Z"
      },
      {
         "idigbio:links" : {
            "mediarecord" : "http://api.idigbio.org/v1/mediarecords/00000b03-e208-4d22-983b-506ad2842f7c"
         },
         "idigbio:uuid" : "00000b03-e208-4d22-983b-506ad2842f7c",
         "idigbio:version" : 2,
         "idigbio:etag" : "bc118a7ea53e004c82ab9b7e813e1010ae5f8e17",
         "idigbio:dateModified" : "2014-04-20T05:16:20.389Z"
      },
      {
         "idigbio:links" : {
            "mediarecord" : "http://api.idigbio.org/v1/mediarecords/000010bc-a4d4-483d-b71d-0dbdd4fd2d5a"
         },
         "idigbio:uuid" : "000010bc-a4d4-483d-b71d-0dbdd4fd2d5a",
         "idigbio:version" : 0,
         "idigbio:etag" : "68c441bd3c49507bf930f3b278f2c58f9cb792ec",
         "idigbio:dateModified" : "2014-04-20T21:38:46.679Z"
      },
      {
         "idigbio:links" : {
            "mediarecord" : "http://api.idigbio.org/v1/mediarecords/000012f9-d288-4a14-b898-77430e0a137a"
         },
         "idigbio:uuid" : "000012f9-d288-4a14-b898-77430e0a137a",
         "idigbio:version" : 1,
         "idigbio:etag" : "cf49416750fdb9bdb808c334a74b84f27bb8160b",
         "idigbio:dateModified" : "2014-04-23T02:43:08.344Z"
      }
   ],
   "idigbio:itemCount" : "2342880"
}
Of interest here is that "idigbio:itemCount" contains the total number of items of this type in the API. In this case, there are 2,342,880 mediarecords in total.
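A client typically needs only the record IDs and the total count from such a response. The Python sketch below shows one way to pull them out, assuming the response layout shown in the sample above; the function name `summarize_collection` is hypothetical, and the embedded document is a trimmed copy of that sample (two of the five items, links omitted).

```python
import json

# Trimmed copy of the sample response above.
sample = """
{
   "idigbio:errors" : [],
   "idigbio:items" : [
      {"idigbio:uuid" : "000003cd-0cca-421b-8f26-f557a26b0393", "idigbio:version" : 1},
      {"idigbio:uuid" : "00000728-ffb3-4a68-9f93-137f19961121", "idigbio:version" : 3}
   ],
   "idigbio:itemCount" : "2342880"
}
"""


def summarize_collection(document):
    """Return (uuids, total) from a parsed collection response."""
    uuids = [item["idigbio:uuid"] for item in document["idigbio:items"]]
    # itemCount is a JSON string in the sample output, so convert it explicitly.
    total = int(document["idigbio:itemCount"])
    return uuids, total


uuids, total = summarize_collection(json.loads(sample))
print(uuids)
print(total)
```

Note that "idigbio:itemCount" counts every item of the type, not just the items on the current page, so it will usually be far larger than the length of "idigbio:items".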
A link to the next "page" of records is also provided:
   "idigbio:links" : {
      "idigbio:nextPage" : "http://api.idigbio.org/v1/mediarecords?limit=5&offset=5"
   }
The next page of records can be requested by adding the "offset" parameter:
$ curl -s "http://api.idigbio.org/v1/mediarecords?limit=5&offset=5" | json_pp
{
   "idigbio:errors" : [],
   "idigbio:links" : {
      "idigbio:nextPage" : "http://api.idigbio.org/v1/mediarecords?limit=5&offset=10",
      "idigbio:prevPage" : "http://api.idigbio.org/v1/mediarecords?limit=5&offset=0"
   },
   "idigbio:items" : [
      {
         "idigbio:links" : {
            "mediarecord" : "http://api.idigbio.org/v1/mediarecords/00001478-c150-4faf-a617-439a838d4377"
         },
         "idigbio:uuid" : "00001478-c150-4faf-a617-439a838d4377",
         "idigbio:version" : 1,
         "idigbio:etag" : "30f602e4eb47ebb2ceb265f64217e3cf5664f517",
         "idigbio:dateModified" : "2014-03-21T23:09:39.752Z"
      },
      {
         "idigbio:links" : {
            "mediarecord" : "http://api.idigbio.org/v1/mediarecords/00001a91-189b-4002-b56e-a770a55951a0"
         },
         "idigbio:uuid" : "00001a91-189b-4002-b56e-a770a55951a0",
         "idigbio:version" : 0,
         "idigbio:etag" : "647e82d17ee435fb14f0f8607dabe88dfc3a1944",
         "idigbio:dateModified" : "2014-04-25T04:49:32.359Z"
      },
      {
         "idigbio:links" : {
            "mediarecord" : "http://api.idigbio.org/v1/mediarecords/00002091-4fb3-410a-9307-bd3e917dfcca"
         },
         "idigbio:uuid" : "00002091-4fb3-410a-9307-bd3e917dfcca",
         "idigbio:version" : 0,
         "idigbio:etag" : "90d98d48d9e7e07eab9064bd9b6e22ce6502c07f",
         "idigbio:dateModified" : "2014-05-03T18:45:47.112Z"
      },
      {
         "idigbio:links" : {
            "mediarecord" : "http://api.idigbio.org/v1/mediarecords/00002c32-ae3a-41ed-9bd9-f6c50d3e35fb"
         },
         "idigbio:uuid" : "00002c32-ae3a-41ed-9bd9-f6c50d3e35fb",
         "idigbio:version" : 3,
         "idigbio:etag" : "d1ded90d06e93876b1badd01222905add93e8806",
         "idigbio:dateModified" : "2014-04-19T00:25:59.471Z"
      },
      {
         "idigbio:links" : {
            "mediarecord" : "http://api.idigbio.org/v1/mediarecords/00002dbd-6415-463b-8cae-38f548415ffa"
         },
         "idigbio:uuid" : "00002dbd-6415-463b-8cae-38f548415ffa",
         "idigbio:version" : 2,
         "idigbio:etag" : "4e298045b496146f5c51e331c9887fd7afde4deb",
         "idigbio:dateModified" : "2014-04-21T20:29:39.531Z"
      }
   ],
   "idigbio:itemCount" : "2342880"
}
which includes links to the previous page and next page:
      "idigbio:nextPage" : "http://api.idigbio.org/v1/mediarecords?limit=5&offset=10",
      "idigbio:prevPage" : "http://api.idigbio.org/v1/mediarecords?limit=5&offset=0"
using offsets of 0 (previous page) and 10 (next page).
DO NOT expect to be able to page through the entire iDigBio data this way. See iDigBio API Performance if you find yourself trying to page through large amounts of data.
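For small result sets, the "idigbio:nextPage" links can be followed in a loop. The Python sketch below is one possible approach, not an official client: the names `fetch_json` and `iter_items` and the `max_pages` cap are assumptions introduced here, and the cap exists precisely because of the warning above about paging through large amounts of data.

```python
import json
from urllib.request import urlopen


def fetch_json(url):
    """Fetch a URL and parse the body as JSON (performs a real HTTP request)."""
    with urlopen(url) as response:
        return json.load(response)


def iter_items(start_url, max_pages=3, fetch=fetch_json):
    """Yield items page by page, following idigbio:nextPage for at most max_pages pages."""
    url = start_url
    for _ in range(max_pages):
        page = fetch(url)
        yield from page["idigbio:items"]
        # Stop when the response carries no link to a further page.
        url = page.get("idigbio:links", {}).get("idigbio:nextPage")
        if url is None:
            break
```

A call such as `iter_items("http://api.idigbio.org/v1/mediarecords?limit=5", max_pages=2)` would request at most two pages. The `fetch` argument can be replaced with a stub to exercise the loop without network access.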
GET /v1/mediarecords/{ID}
- Description
- Returns a Media Record with the specific entity ID
- Resource URL
http://api.idigbio.org/v1/mediarecords/{ID}
- Optional Parameters
| parameter | valid values | detailed description | 
|---|---|---|
| version | Integer values from 0 to the maximum version of a particular record | The API normally returns the most recent version of a particular record. Records may be updated over time. The version parameter can be used to retrieve previous versions of a record. |
- Sample Usage
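As an illustration of the version parameter, the Python sketch below builds one URL per version of a media record. The helper name `version_urls` is hypothetical. The record ID is taken from the collection listing above, where it is shown with "idigbio:version" 3; treating that value as the record's maximum version is an assumption.

```python
def version_urls(entity_id, latest_version):
    """Return one URL per stored version, from 0 up to latest_version inclusive."""
    base = f"http://api.idigbio.org/v1/mediarecords/{entity_id}"
    return [f"{base}?version={v}" for v in range(latest_version + 1)]


for url in version_urls("00000728-ffb3-4a68-9f93-137f19961121", 3):
    print(url)
```

Omitting the version parameter altogether requests the most recent version.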
GET /v1/mediarecords/{ID}/media
- Description
- Returns an image file (JPEG) associated with the specific entity ID. Omitting the "quality" parameter will return the full size image specified in the source data accessURI field. For many use cases, the recommended use of this endpoint would include the quality parameter.
- Resource URL
http://api.idigbio.org/v1/mediarecords/{ID}/media
- Optional Parameters
| parameter | valid values | detailed description | 
|---|---|---|
| quality | "thumbnail" "webview" | Specify the quality of the image returned from the API. Omitting quality will return the full-size, high-quality original image from the source provider. The values "thumbnail" and "webview" return images of width 260 and 600 pixels respectively. |
- Sample Usage
# CURL SOMETHING with -L to follow redirects
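As a rough illustration of the quality parameter, the Python sketch below builds media URLs. The helper name `media_url` is hypothetical, and the entity ID is borrowed from the mediarecords listing earlier in this document. The two accepted quality values are those given in the table above.

```python
from urllib.parse import urlencode

VALID_QUALITIES = ("thumbnail", "webview")  # values listed in the table above


def media_url(entity_id, quality=None):
    """Build the media URL; quality=None requests the full-size original."""
    url = f"http://api.idigbio.org/v1/mediarecords/{entity_id}/media"
    if quality is None:
        return url
    if quality not in VALID_QUALITIES:
        raise ValueError(f"quality must be one of {VALID_QUALITIES}, got {quality!r}")
    query = urlencode({"quality": quality})
    return url + "?" + query


print(media_url("000003cd-0cca-421b-8f26-f557a26b0393", quality="thumbnail"))
```

When such a URL is fetched with Python's `urllib.request.urlopen`, HTTP redirects are followed by default, which corresponds to passing -L to curl.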
GET /v1/records
- Description
- Returns a collection of record IDs
- Resource URL
http://api.idigbio.org/v1/records
- Optional Parameters
| parameter | valid values | detailed description | 
|---|---|---|
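| limit | | Number of records returned in the collection. Defaults to 1000. |
| offset | | Number of records to skip before returning a set of records. Defaults to 0. |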
- Sample Usage
# CURL SOMETHING
GET /v1/records/{ID}
- Description
- Returns a record with the specific entity ID
- Resource URL
http://api.idigbio.org/v1/records/{ID}
- Optional Parameters
| parameter | valid values | detailed description | 
|---|---|---|
| version | Numeric values from 0 to the maximum version of a particular record | The API normally returns the "latest" or most recent version of a particular record. Records may be updated over time. The version parameter can be used to retrieve previous versions of a record. |
- Sample Usage
# CURL SOMETHING
GET /v1/records/{ID}/media
- Description
- Returns an image (JPEG) associated with the specific entity ID (via the relationship to a mediarecord). If multiple mediarecords are associated with a specimen record, the particular image returned is non-deterministic.
- Resource URL
http://api.idigbio.org/v1/records/{ID}/media
- Optional Parameters
| parameter | valid values | detailed description | 
|---|---|---|
| quality | "thumbnail" "webview" | Specify the quality of the image returned from the API. Omitting quality will return the full-size, high-quality original image from the source provider. The values "thumbnail" and "webview" return images of width 260 and 600 pixels respectively. |
- Sample Usage
# CURL SOMETHING with -L to watch redirects
GET /v1/publishers
- Description
- Returns a collection of publisher IDs
- Resource URL
http://api.idigbio.org/v1/publishers
- Optional Parameters
- something goes here
- Sample Usage
# CURL SOMETHING
GET /v1/organizations
- Description
- Deprecated, do not use.
GET /v1/people
- Description
- Deprecated, do not use.
GET /v1/publishers/{ID}
- Description
- Returns a publisher with specific entity ID
- Resource URL
http://api.idigbio.org/v1/publishers/{ID}
- Optional Parameters
- something goes here
- Sample Usage
# CURL SOMETHING
GET /v1/recordsets
- Description
- Returns a collection of recordset IDs
- Resource URL
http://api.idigbio.org/v1/recordsets
- Optional Parameters
- something goes here
- Sample Usage
# CURL SOMETHING
GET /v1/recordsets/{ID}
- Description
- Returns information about a recordset with specific entity ID
- Resource URL
http://api.idigbio.org/v1/recordsets/{ID}
- Parameters
- something goes here
- Sample Usage
# CURL SOMETHING
GET /v1/recordsets/{ID}/mediarecords
- Description
- Returns a collection of mediarecord IDs that belong to the recordset of the specified entity ID
- Resource URL
http://api.idigbio.org/v1/recordsets/{ID}/mediarecords
- Optional Parameters
- something goes here
- Sample Usage
# CURL SOMETHING
GET /v1/recordsets/{ID}/records
- Description
- Returns a collection of record IDs that belong to the recordset of the specified entity ID
- Resource URL
http://api.idigbio.org/v1/recordsets/{ID}/records
- Optional Parameters
- something goes here
- Sample Usage
# CURL SOMETHING
GET /v1/taxa
- Description
- Deprecated, do not use.
Search
Elasticsearch is an open source distributed document-oriented NoSQL search system. Although not technically part of the API, iDigBio exposes a public Elasticsearch interface for programmers to access advanced search functionality of iDigBio data.
The following are external links to Elasticsearch reference documentation and should be considered prerequisite reading before attempting to use the iDigBio Elasticsearch interface.
There is also an elasticsearch Google Group available.
The iDigBio search index provides two document types to query on: Records (specimen records) and Media Records (media metadata). Search results are returned as JSON-formatted documents.
Each type can be queried through the following respective URLs:
| Query Type | Description | Search URL | 
|---|---|---|
| Records | specimen records | https://search.idigbio.org/idigbio/records/_search | 
| Media Records | media metadata records | https://search.idigbio.org/idigbio/mediarecords/_search | 
Examples specific to iDigBio are available in iDigBio API Examples.
Elasticsearch - Records
Specimen Records Query URL:
https://search.idigbio.org/idigbio/records/_search
The following terms are currently available in the index for Records type of queries to Elasticsearch:
"barcodevalue" "catalognumber" "class" "collectioncode" "collectionid" "collectionname" "collector" "commonname" "continent" "country" "county" "datecollected" "datemodified" "etag" "family" "fieldnumber" "genus" "geopoint" "hasImage" "highertaxon" "infraspecificepithet" "institutioncode" "institutionid" "institutionname" "kingdom" "locality" "maxdepth" "maxelevation" "mediarecords" "mindepth" "minelevation" "municipality" "occurenceid" "order" "phylum" "recordset" "scientificname" "specificepithet" "stateprovince" "typestatus" "uuid" "verbatimlocality" "version" "waterbody"
The values stored in these terms are converted to lowercase, so searches based on terms should use the all-lowercase version of the string.
For example, searching for "Arkansas" in stateprovince will return no records.
$ curl -s "http://search.idigbio.org/idigbio/records/_search?q=stateprovince:Arkansas" | json_pp | grep scientificname | wc -l
0
Searching for "arkansas" will return multiple records.
$ curl -s "http://search.idigbio.org/idigbio/records/_search?q=stateprovince:arkansas" | json_pp | grep scientificname | wc -l
10
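Because indexed values are lowercase, a client can normalize the search value before building the query. The Python sketch below shows one way to do this; the helper name `term_query_url` is hypothetical, and the search URL is the Records URL given above.

```python
from urllib.parse import urlencode

SEARCH_URL = "https://search.idigbio.org/idigbio/records/_search"


def term_query_url(term, value):
    """Build a q=term:value search URL, lowercasing the value to match the index."""
    query = f"{term}:{value.lower()}"
    # Keep ':' unescaped so the URL matches the curl examples above.
    return SEARCH_URL + "?" + urlencode({"q": query}, safe=":")


print(term_query_url("stateprovince", "Arkansas"))
```

With this helper, "Arkansas" and "arkansas" produce the same query URL, which avoids the empty result shown in the first curl example.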
See iDigBio API Examples page for more Elasticsearch examples that are specific to iDigBio.
Elasticsearch - Media Records
Media Records Query URL:
https://search.idigbio.org/idigbio/mediarecords/_search
There are no useful search terms for Media Records queries using Elasticsearch at this time.