Input CSV Format

From iDigBio
Revision as of 13:11, 8 June 2015 by Joanna (talk | contribs)


| Column Name | Required | Description | Example |
|---|---|---|---|
| idigbio:MediaGUID | Y | GUID for a digital multimedia object. | "urn:uuid:3c1dc496-e5c6-4849-b616-cada2896190d", "ids.amnh.org/AMNH_PBI 00010147 lateral.jpg" |
| idigbio:OriginalFileName | Y | Entire path+filename as stored on the appliance user's local disk. | "C:/images/AMNH_PBI/AMNH_PBI 00010147 lateral.jpg", "\home\Plants\VSC0043330.jpg" |
| idigbio:CollectionObjectGUID | N | Relates the media record identified by idigbio:MediaGUID to a specimen record (CollectionObject concept). | "http://biocoll.inhs.illinois.edu/fish/INHS106002", "urn:uuid:3c1dc496-e5c6-4849-b616-cada2896190d" |
| idigbio:Title | N | Concise title, name, or brief descriptive label of individual resource. This field should include the complete title with all the subtitles, if any. | "Ilex glabra from FSU" |
| idigbio:Description | N | Description of the individual resource, containing the Who, What, When, Where and Why as free-form text. | "Scanned herbarium sheet with specimen collected West of Plant City 4 miles from Mango Jct., on Hwy 92." |
| idigbio:LanguageCode | N | Code for the language used in the title and description. Must be in ISO 639-1 format. | "en", "es", "pt" |
| idigbio:DigitizationDevice | N | Free form text describing the device or devices used to create the resource. | "Canon Supershot 2000", "Makroscan Scanner 2000", "Zeiss Axioscope with Camera IIIu", "SEM (Scanning Electron Microscope)" |
| idigbio:NominalPixelResolution | N | The real size of the pixel depicted in the image (e.g., microscopy). Include a number and a unit. | "128µm" |
| idigbio:Magnification | N | Magnification applied when capturing the image of an object. | "4x", "100x" |
| idigbio:OcrOutput | N | Output of the process of applying OCR to the multimedia object. | "\tThe New York Botanical Garden\n\tLICHENS of NEW YORK STATE, U.S.A.\n\n Polycoccum minutulum Kocourková & F. Berger\n\n on Trapelia placodioides Coppins & P. James\n\nRockland County: Harriman State Park, along\n Woodtown Road West near dam at S end of Lake\n Sebago along Seven Lakes Drive, 41°11'N, 74°08'W,\n ca. 240 m; mixed hardwood-hemlock forest with\n granitic erratics.\n\n19 April 1998\n\nRichard C. Harris 42164\tNEW YORK BOTANICAL GARDEN\n\n\t\t\t\t\t01075759\n" |
| idigbio:OcrTechnology | N | Free form text describing the software utilized for OCR as well as any additional technique (cropping, color alteration applied, controlled vocabulary). | "Tesseract version 3.01 on Windows, latin character set" |
| idigbio:InformationWithheld | N | Indication that additional information exists and that it has not been shared in the given record due to sensitive nature. It does not contain the withheld information itself. Should include information on how to obtain the withheld information by other means (e.g., a contact). | "location information not given for endangered species, contact my@email", "collector identities withheld, contact xyz", "ask about tissue samples by contacting my@email" |
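
As an illustration of the columns listed above, the following minimal sketch builds CSV text containing the two required columns plus two optional ones, using Python's standard csv module. It assumes the file starts with a header row of column names; the helper name build_csv, the choice of optional columns, and the generated GUID are hypothetical and only for illustration.

```python
import csv
import io
import uuid

# Columns marked "Y" in the Required column of the table.
REQUIRED = ["idigbio:MediaGUID", "idigbio:OriginalFileName"]
# Two optional columns picked arbitrarily for this example.
OPTIONAL = ["idigbio:Title", "idigbio:LanguageCode"]


def build_csv(records):
    """Serialize a list of dicts into CSV text with a header row (assumed)."""
    for rec in records:
        missing = [c for c in REQUIRED if not rec.get(c)]
        if missing:
            raise ValueError(f"missing required column(s): {missing}")
    buf = io.StringIO()
    writer = csv.DictWriter(buf, fieldnames=REQUIRED + OPTIONAL,
                            lineterminator="\n")
    writer.writeheader()
    writer.writerows(records)
    return buf.getvalue()


records = [{
    "idigbio:MediaGUID": "urn:uuid:" + str(uuid.uuid4()),  # illustrative GUID
    "idigbio:OriginalFileName": "C:/images/AMNH_PBI/AMNH_PBI 00010147 lateral.jpg",
    "idigbio:Title": "Ilex glabra from FSU",
    "idigbio:LanguageCode": "en",
}]
csv_text = build_csv(records)
```

A record that lacks either required column raises ValueError before anything is written, so an incomplete file is never produced.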


Invalid input CSV file
The image ingestion appliance checks the validity of the input CSV file. An input CSV file is invalid in the following cases:
- The number of columns differs among rows.
- Any entry contains a double quotation mark (") inside the field.
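
The two cases above can be checked before upload with a short script. This is a minimal sketch and not the appliance's own validation code: it assumes the file is parsed the way Python's standard csv module parses it, and the function name find_problems is hypothetical.

```python
import csv
import io


def find_problems(csv_text):
    """Return (row_number, message) pairs for the two documented cases."""
    problems = []
    rows = list(csv.reader(io.StringIO(csv_text)))
    if not rows:
        return problems
    expected = len(rows[0])  # the first row sets the expected column count
    for number, row in enumerate(rows, start=1):
        if len(row) != expected:
            problems.append(
                (number, f"expected {expected} columns, found {len(row)}"))
        # After parsing, any remaining quote character was inside the value.
        if any('"' in value for value in row):
            problems.append((number, "double quotation mark inside a field"))
    return problems


sample = 'a,b\n1,2,3\n"x ""y"" z",4\n'
for row_number, message in find_problems(sample):
    print(f"row {row_number}: {message}")
```

Row numbers are 1-based and count the first row, so they match the line numbers shown by most editors as long as no field spans several lines.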


Specimen Record UUID Field
To support the relationship between media records and specimens, users can provide the field "idigbio:SpecimenRecordUUID" in the CSV file.
Each value of this field should be the UUID of a specimen record. To relate one media record to several specimen records, list multiple UUIDs separated by commas (",").
For example:
"6c11a2a4-25f0-446b-b594-97f97a277bf7"
"6c11a2a4-25f0-446b-b594-97f97a277bf7,73c09da2-89ef-4b07-883c-d8de80ba14bd"

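A value of this field can be split and checked with Python's standard uuid module. This is a minimal sketch: the function name parse_specimen_uuids is hypothetical, and stripping whitespace around each UUID is an assumption, since the page does not say whether spaces are allowed.

```python
import uuid


def parse_specimen_uuids(field):
    """Split a comma-separated field into canonical UUID strings."""
    result = []
    for part in field.split(","):
        part = part.strip()  # assumption: surrounding spaces are tolerated
        if not part:
            continue
        # uuid.UUID raises ValueError if the text is not a well-formed UUID.
        result.append(str(uuid.UUID(part)))
    return result


value = ("6c11a2a4-25f0-446b-b594-97f97a277bf7,"
         "73c09da2-89ef-4b07-883c-d8de80ba14bd")
specimen_uuids = parse_specimen_uuids(value)
```

A malformed entry raises ValueError, which makes it easy to flag the offending media record before the file is submitted.
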
Examples of the input CSV files: Example 1 Example 2 Example 3