|  | [[Category:CYWG]]
 |  | 
|  | [[Category:API]]
 |  | 
|  | [[Category:Documentation]]
 |  | 
|  | 
 |  | 
 | 
== iDigBio API Overview ==
 |  | '''Please feel free to make notes here if you find something is missing from the API documentation.''' | 
|  | 
 |  | 
 | 
|  | This document serves as the official documentation for the iDigBio Application Programming Interface (API).
 |  | Thanks!  - Dan Stoner | 
|  | 
 |  | 
 | 
|  | The iDigBio API serves as an abstraction layer for storing and retrieving data from the iDigBio back-end data systems.
 |  | ---- | 
|  |   |  | 
|  | == Quick Start ==
 |  | 
|  |   |  | 
The iDigBio API is a RESTful HTTP API that primarily delivers data in JSON format. Currently, the public API supports only GET requests, for read operations.
 |  | 
|  |   |  | 
|  | Experienced programmers may wish to jump straight to the [[iDigBio API Examples]] page or the [[iDigBio API v1 Specification]].
 |  | 
|  |   |  | 
|  |   |  | 
|  |   |  | 
|  | API URLs (endpoints) have several parameters in them, and typically follow the form of 
 |  | 
<pre style="color:red">
<base url>/<version>/<type>/<id>
</pre>
 |  | 
|  |   |  | 
|  | For example:
 |  | 
<pre>
 Using the following parameters for an API request

 base url = http://api.idigbio.org
 version = v1
 type = records
 id = 00000230-01bc-4a4f-8389-204f39da9530

 would produce a URL of the following form

 "http://api.idigbio.org/v1/records/00000230-01bc-4a4f-8389-204f39da9530"
</pre>
 |  | 
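The URL form above can be sketched in a few lines of Python. This is an illustrative helper only: the function name <code>build_url</code> and its arguments are assumptions made for this example and are not part of the iDigBio API.

```python
from urllib.parse import quote

# Base URL as quoted in the example above.
BASE_URL = "http://api.idigbio.org"


def build_url(version, type_, entity_id=None, base_url=BASE_URL):
    """Join <base url>/<version>/<type>[/<id>] into a request URL.

    build_url is a hypothetical helper for illustration, not an API call.
    """
    parts = [base_url.rstrip("/"), quote(version), quote(type_)]
    if entity_id is not None:
        parts.append(quote(entity_id))
    return "/".join(parts)


print(build_url("v1", "records", "00000230-01bc-4a4f-8389-204f39da9530"))
```

Leaving out the id yields the URL of the whole group of records of that type rather than a single item.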
|  |   |  | 
|  |   |  | 
There are two major types of API endpoints:
 |  | 
|  |   |  | 
*Collection - a group endpoint that returns lists of multiple records. These URLs are of the form <base url>/<version>/<type>, such as http://api.idigbio.org/v1/mediarecords/ . A collection endpoint also accepts two optional query parameters: ?limit sets the number of records returned in the collection and defaults to 1000, and ?offset sets the number of records to skip before returning a set of records and defaults to 0. If a collection request finds more than the set limit of records, the response includes a "next page" link to retrieve the next set of records in the collection. See the [[#endpointprops|endpoint properties]] section for more information on the properties returned.
 |  | 
|  |  
 |  | 
*Entity - a single-item endpoint that returns all of the data available about an object. These URLs are of the form <base url>/<version>/<type>/<id>, like the example used above.
 |  | 
|  |   |  | 
|  | NOTE: at this time the API does not support search capabilities on entities or collections.
 |  | 
|  |   |  | 
|  | Examples:
 |  | 
|  |   |  | 
 collection:
 "http://api.idigbio.org/v1/mediarecords"
 collection w/ optional query parameters:
 "http://api.idigbio.org/v1/mediarecords?limit=100&offset=100"
 entity:
 "http://api.idigbio.org/v1/mediarecords/00000230-01bc-4a4f-8389-204f39da9530"
 |  | 
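As a rough sketch of how the optional limit and offset parameters combine when paging through a collection, the following Python builds the URL for each page. The helper names <code>collection_url</code> and <code>page_urls</code> are hypothetical and exist only for this example; no request is actually sent.

```python
from urllib.parse import urlencode


def collection_url(type_, limit=None, offset=None,
                   base_url="http://api.idigbio.org", version="v1"):
    """Build a collection URL, adding limit/offset only when given."""
    url = f"{base_url}/{version}/{type_}"
    params = {}
    if limit is not None:
        params["limit"] = limit
    if offset is not None:
        params["offset"] = offset
    return f"{url}?{urlencode(params)}" if params else url


def page_urls(type_, limit, pages):
    """Return URLs for the first `pages` pages, each `limit` records long."""
    return [collection_url(type_, limit=limit, offset=i * limit)
            for i in range(pages)]


for url in page_urls("mediarecords", limit=100, pages=3):
    print(url)
```

The second URL printed matches the "collection w/ optional query parameters" example above. In practice, following the "next page" link returned by the API is likely preferable to computing offsets by hand.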
|  |   |  | 
|  | == Endpoint Basics ==
 |  | 
|  |   |  | 
|  |   |  | 
|  | Calling just the base URL will return a list of API version endpoints. For example, an HTTP GET request to "http://api.idigbio.org" will return the following JSON data:
 |  | 
|  |   |  | 
<pre>
{
   "v1" : "http://api.idigbio.org/v1/",
   "check" : "http://api.idigbio.org/check",
   "v0" : "http://api.idigbio.org/v0/"
}
</pre>
 |  | 
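A minimal sketch of reading that listing in Python follows. It parses the sample response shown above from a string rather than calling the service, and it assumes (this is not stated in the documentation) that version entries are exactly the keys made of "v" followed by digits.

```python
import json

# Sample body copied from the response shown above.
SAMPLE_BODY = """{
   "v1" : "http://api.idigbio.org/v1/",
   "check" : "http://api.idigbio.org/check",
   "v0" : "http://api.idigbio.org/v0/"
}"""


def version_endpoints(body):
    """Return only the entries that look like API versions (v0, v1, ...).

    The "v" + digits naming rule is an assumption made for this example.
    """
    listing = json.loads(body)
    return {name: url for name, url in listing.items()
            if name.startswith("v") and name[1:].isdigit()}


print(sorted(version_endpoints(SAMPLE_BODY)))
```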
|  | 
 |  | 
 | 
== Endpoint Properties ==
|  | 
 |  | 
 | 
|  |  |   ''This section preserved here for the moment until the content can be re-integrated or willfully discarded. -dstoner '' | 
|  | 
 |  | 
 | 
The iDigBio API tries to follow the HATEOAS (Hypermedia as the Engine of Application State) model of the REST paradigm: each API endpoint response includes a list of relevant links to further API actions. This list is typically stored in "idigbio:links"
*itemCount - the number of total items in the collection
|  | 
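As an illustration of how itemCount might be used, the sketch below works out how many requests are needed to page through a collection. The functions take the count as a plain number, since exactly where itemCount sits in the response is not shown here; the function names are hypothetical, and the default limit of 1000 is the one stated in the Quick Start section.

```python
import math


def page_count(item_count, limit=1000):
    """Number of requests needed to fetch item_count records."""
    if limit < 1:
        raise ValueError("limit must be at least 1")
    return math.ceil(item_count / limit)


def page_offsets(item_count, limit=1000):
    """Offset value to send with each successive request."""
    if limit < 1:
        raise ValueError("limit must be at least 1")
    return list(range(0, item_count, limit))


print(page_count(2500))
print(page_offsets(2500))
```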
 |  | 
 | 
|  |  | == Entity Data == | 
|  |  |  | 
|  |  |   ''This section preserved here for the moment until the content can be re-integrated or willfully discarded. -dstoner '' | 
|  | 
 |  | 
 | 
 |  | 
|  | 
 |  | 
 | 
|  |  | === Concepts === | 
|  | 
 |  | 
 | 
The data element for each entity can include any number of key-value pairs containing properties of the entity, potentially including values that are themselves lists or dictionaries. Typical key namespaces that might appear in each type are (in order of decreasing usefulness):
|  | 
 |  | 
 | 
*[[CollectionObject Bag of Terms|Records]]: typically contains Darwin Core elements ( http://rs.tdwg.org/dwc/terms/index.htm ) describing a physical specimen, may also contain custom elements or elements defined by other standards. See the complete list of terms [[CollectionObject Bag of Terms|here]].
*[[Media Bag of Terms|Mediarecords]]: typically contains Audubon Core elements ( http://terms.gbif.org/wiki/Audubon_Core_Term_List_(1.0_normative) ) describing a media capture event, may also contain custom elements or elements defined by other standards. See the complete list of terms [[Media Bag of Terms|here]].
*Publishers: A top-level entity for the data ingestion process. Each publisher contains metadata about a publishing location such as an IPT installation or Symbiota portal.
*Recordsets: An entity largely derived from the publisher metadata. These serve as the join point between multiple data files for a single collection, and all records and mediarecords in iDigBio are expected to be associated with a recordset that links them to a source.
*All other entities exposed via the API are either internal-only concepts with no fixed definition, or are unused.
|  | 
 |  | 
|  | 
 |  | 
|  | == Available API endpoints ==
 |  | 
|  | 
 |  | 
|  | All endpoints follow the form of 
 |  | 
|  | "http://api.idigbio.org/{api_version}{endpoint}"
 |  | 
|  | 
 |  | 
{|class="wikitable"
! align="left"| Endpoint
! Method
! API Versions Available
! Description
|-
| '/mediarecords'
| GET
| v0, v1
| returns a collection of media record IDs
|-
| '/mediarecords/{ID}'
| GET
| v0, v1
| returns a media record with the specific entity ID
|-
| '/mediarecords/{ID}/media'
| GET
| v0, v1
| returns an image associated with the specific entity ID
|-
| '/records'
| GET
| v0, v1
| returns a collection of record IDs
|-
| '/records/{ID}'
| GET
| v0, v1
| returns a record with the specific entity ID
|-
| '/records/{ID}/media'
| GET
| v0, v1
| returns an image associated with the specific entity ID
|-
| '/publishers'
| GET
| v0, v1
| returns a collection of publisher IDs
|-
| '/publishers/{ID}'
| GET
| v0, v1
| returns a publisher with specific entity ID
|-
| '/recordsets'
| GET
| v0, v1
| returns a collection of recordset IDs
|-
| '/recordsets/{ID}'
| GET
| v0, v1
| returns a recordset with specific entity ID
|}
 |  | 
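To show how an entry from the endpoint table turns into a full URL, here is a small Python sketch that fills in the <code>{api_version}</code> and <code>{ID}</code> placeholders. The function name <code>endpoint_url</code> and the identifier "abc" are made up for this example; the set of versions is taken from the table.

```python
# URL form and versions as given in this section.
URL_FORM = "http://api.idigbio.org/{api_version}{endpoint}"
API_VERSIONS = ("v0", "v1")


def endpoint_url(endpoint, api_version="v1", entity_id=None):
    """Expand an endpoint pattern such as '/records/{ID}/media'."""
    if api_version not in API_VERSIONS:
        raise ValueError(f"unknown API version: {api_version}")
    if "{ID}" in endpoint:
        if entity_id is None:
            raise ValueError("this endpoint needs an entity ID")
        endpoint = endpoint.replace("{ID}", entity_id)
    return URL_FORM.format(api_version=api_version, endpoint=endpoint)


# "abc" is a placeholder ID, not a real record.
print(endpoint_url("/records/{ID}/media", entity_id="abc"))
print(endpoint_url("/publishers", api_version="v0"))
```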
|  | 
 |  | 
|  | == Optional API Parameters ==
 |  | 
|  | 
 |  | 
{|class="wikitable"
! align="left"| Parameter
! Endpoint type
! Values
! Description
! Example
|-
| limit
| Collections
| [1-]
| Controls the number of records returned by a collection URL. Large numbers may cause requests to time out, but are significantly more efficient when attempting to query large numbers of records.
| http://api.idigbio.org/v1/mediarecords?limit=100
|-
| offset
| Collections
| [0-]
| Controls how many records to skip forward when paging through the API. Large offsets are extremely inefficient, so combinations of small limits and large offsets may cause requests to fail.
| http://api.idigbio.org/v1/mediarecords?limit=100&offset=100
|-
| version
| Entities
| [0-current version], -1 for latest version
| Returns a specific version of a record from the data store. Can be used to query historical data for iDigBio records.
| http://api.idigbio.org/v1/records/c93ebbee-64b5-4452-9e80-93bbfb11b815?version=0
|-
| quality
| Entities
| ["thumbnail", "web"]
| Specifies the quality of the image returned from the API (valid values are "thumbnail" and "web", which return images of width 260 and 600 pixels respectively). Omitting quality will return the full-size high quality image.
| http://api.idigbio.org/v1/mediarecords/55dd6860-213d-4478-8bfa-b5486afcffda/media?quality=thumbnail   http://api.idigbio.org/v1/mediarecords/55dd6860-213d-4478-8bfa-b5486afcffda/media?quality=web
|}
 |  | 
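The entity parameters in the table can be sketched as follows. The Python below only builds URLs and checks the values against the ranges listed above; the helper names <code>media_url</code> and <code>record_version_url</code> are assumptions for this example.

```python
from urllib.parse import urlencode

API_ROOT = "http://api.idigbio.org/v1"
VALID_QUALITIES = ("thumbnail", "web")  # values listed in the table


def media_url(media_id, quality=None):
    """URL of the image for a media record; no quality means full size."""
    url = f"{API_ROOT}/mediarecords/{media_id}/media"
    if quality is None:
        return url
    if quality not in VALID_QUALITIES:
        raise ValueError(f"quality must be one of {VALID_QUALITIES}")
    return f"{url}?{urlencode({'quality': quality})}"


def record_version_url(record_id, version):
    """URL of a specific stored version of a record (-1 means latest)."""
    if version < -1:
        raise ValueError("version must be -1 or a non-negative integer")
    return f"{API_ROOT}/records/{record_id}?{urlencode({'version': version})}"


print(media_url("55dd6860-213d-4478-8bfa-b5486afcffda", "thumbnail"))
print(record_version_url("c93ebbee-64b5-4452-9e80-93bbfb11b815", 0))
```

Both printed URLs correspond to examples in the table.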
|  | 
 |  | 
|  | == Searching iDigBio ==
 |  | 
|  | 
 |  | 
|  | === Search Portal and Bulk Record Downloads ===
 |  | 
|  | 
 |  | 
The recommended method for searching iDigBio is to use the Portal search, not the API. The portal also provides bulk download capabilities for acquiring larger sets of data.  See: https://www.idigbio.org/portal
 |  | 
|  | 
 |  | 
|  | === Elasticsearch Overview ===
 |  | 
|  | 
 |  | 
The iDigBio API does not yet provide query/search capabilities.  However, the back-end Elasticsearch interface is public-facing and available for use by advanced users and programmers. This is the same interface that is used by the iDigBio Portal search.
 |  | 
|  | 
 |  | 
'''Note: Direct queries to the iDigBio Elasticsearch service should be considered an advanced operation.'''
 |  | 
|  | 
 |  | 
|  | According to the [http://www.elasticsearch.org/overview/elasticsearch/ Elasticsearch project site], Elasticsearch is a "flexible and powerful open source, distributed, real-time search and analytics engine."
 |  | 
|  | 
 |  | 
|  | The iDigBio search index provides two document types to query on: '''Records''' (specimen records) and '''Media Records''' (media metadata). Search results are returned as JSON-formatted documents. Each type can be queried through the following respective URLs:
 |  | 
|  | 
 |  | 
{| class="wikitable"
!Query Type
!Description
!Search URL
|-
|Records
|specimen records
|https://search.idigbio.org/idigbio/records/_search
|-
|Media Records
|media metadata records
|https://search.idigbio.org/idigbio/mediarecords/_search
|}
 |  | 
|  | 
 |  | 
|  | The [http://www.elasticsearch.org/guide/en/elasticsearch/reference/current/query-dsl.html Elasticsearch Query Domain Specific Language (DSL)] and [http://www.elasticsearch.org/guide/en/elasticsearch/reference/0.90/search-uri-request.html Elasticsearch URI Search] documents will likely be useful.
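As a hedged sketch of the URI Search style, the Python below builds a search URL for either document type using the <code>q</code> query-string parameter. The helper name <code>search_url</code> is an assumption for this example, and the code only constructs the URL; it does not contact the service.

```python
from urllib.parse import urlencode

# Search URLs from the table above.
SEARCH_URLS = {
    "records": "https://search.idigbio.org/idigbio/records/_search",
    "mediarecords": "https://search.idigbio.org/idigbio/mediarecords/_search",
}


def search_url(doc_type, field, value):
    """Build a URI Search request of the form ?q=<field>:<value>."""
    if doc_type not in SEARCH_URLS:
        raise ValueError(f"unknown document type: {doc_type}")
    # urlencode percent-encodes the colon, which is equivalent in a query string.
    return f"{SEARCH_URLS[doc_type]}?{urlencode({'q': f'{field}:{value}'})}"


print(search_url("records", "genus", "acer"))
```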
 |  | 
|  | 
 |  | 
|  | There is also an [https://groups.google.com/forum/?fromgroups#!forum/elasticsearch elasticsearch Google Group] available.
 |  | 
|  | 
 |  | 
|  | === Elasticsearch - Records ===
 |  | 
|  | 
 |  | 
|  | Specimen Records Query URL:
 |  | 
<pre>
https://search.idigbio.org/idigbio/records/_search
</pre>
 |  | 
|  | 
 |  | 
|  | The following terms are currently available in the indexes for '''Records''' type of queries to Elasticsearch:
 |  | 
|  | 
 |  | 
<pre>
"barcodevalue"
"catalognumber"
"class"
"collectioncode"
"collectionid"
"collectionname"
"collector"
"commonname"
"continent"
"country"
"county"
"datecollected"
"datemodified"
"etag"
"family"
"fieldnumber"
"genus"
"geopoint"
"hasImage"
"highertaxon"
"infraspecificepithet"
"institutioncode"
"institutionid"
"institutionname"
"kingdom"
"locality"
"maxdepth"
"maxelevation"
"mediarecords"
"mindepth"
"minelevation"
"municipality"
"occurenceid"
"order"
"phylum"
"recordset"
"scientificname"
"specificepithet"
"stateprovince"
"typestatus"
"uuid"
"verbatimlocality"
"version"
"waterbody"
</pre>
 |  | 
|  | 
 |  | 
|  | The values stored in these terms are converted to lowercase, so searches based on terms should use the all-lowercase version of the string.
 |  | 
|  | 
 |  | 
|  | For example, searching for "Arkansas" in stateprovince will return no records.
 |  | 
|  | 
 |  | 
<pre>
$ curl -s "http://search.idigbio.org/idigbio/records/_search?q=stateprovince:Arkansas" | json_pp | grep scientificname | wc -l
0
</pre>
 |  | 
|  | 
 |  | 
|  | Searching for "arkansas" will return multiple records.
 |  | 
<pre>
$ curl -s "http://search.idigbio.org/idigbio/records/_search?q=stateprovince:arkansas" | json_pp | grep scientificname | wc -l
10
</pre>
 |  | 
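One way to avoid the case mismatch shown above is to lowercase the search value before building the query. The Python sketch below does only that; <code>term_query</code> is a hypothetical helper name. The field name is left untouched, since only the stored values are described as lowercased.

```python
def term_query(field, value):
    """Return a 'field:value' query string with the value lowercased.

    The field name is passed through unchanged (for example "hasImage").
    """
    return f"{field}:{value.lower()}"


print(term_query("stateprovince", "Arkansas"))
```

The resulting string can be used as the value of the <code>q</code> parameter in the curl commands above.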
|  | 
 |  | 
|  | 
 |  | 
See the [[iDigBio API Examples#Elasticsearch_Examples]] page for more Elasticsearch examples that are specific to iDigBio.
 |  | 
|  | 
 |  | 
|  | === Elasticsearch - Media Records ===
 |  | 
|  | 
 |  | 
|  | Media Records Query URL:
 |  | 
<pre>
https://search.idigbio.org/idigbio/mediarecords/_search
</pre>
 |  | 
|  | 
 |  | 
|  | There are no useful terms for '''Media Records''' queries using Elasticsearch at this time.
 |  | 
|  | 
 |  | 
See the [[iDigBio API Examples#Elasticsearch_Examples]] page for Elasticsearch examples that are specific to iDigBio.
 |  |