Input CSV Format: Difference between revisions

From iDigBio

Revision as of 12:23, 5 September 2017


Input CSV Format for iDigBio Media Appliance

If you don't already have an input CSV file, the appliance will build one for you. In the dialog, point it to your folder of media files and it will generate the bare minimum it needs to ingest your media. If you have richer metadata, for example from the media's EXIF or IPTC fields, you can build your own input file using Audubon Core terms.
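
For readers who want to prepare a file by hand, here is a minimal Python sketch of a bare-minimum input CSV. It assumes the file needs only an idigbio:OriginalFileName column holding each file's full path; the columns the appliance itself generates are not documented on this page. The function name write_minimal_input_csv and the extension list are hypothetical, chosen for illustration.

```python
import csv
from pathlib import Path

# Assumed set of media extensions; adjust to match your collection.
MEDIA_EXTENSIONS = {".jpg", ".jpeg", ".png", ".tif", ".tiff"}


def write_minimal_input_csv(media_dir, csv_path):
    """Write one row per media file, containing only idigbio:OriginalFileName.

    Returns the number of media files listed.
    """
    files = sorted(
        p for p in Path(media_dir).iterdir()
        if p.is_file() and p.suffix.lower() in MEDIA_EXTENSIONS
    )
    with open(csv_path, "w", newline="", encoding="utf-8") as handle:
        writer = csv.writer(handle)
        writer.writerow(["idigbio:OriginalFileName"])
        for path in files:
            # Store the entire path plus filename, as the field description asks.
            writer.writerow([str(path.resolve())])
    return len(files)
```

Calling write_minimal_input_csv("C:/images/AMNH_PBI", "input.csv") would list every matching file in that folder, one per row.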

With the exception of idigbio:OriginalFileName and a few additional fields (TBD), the following is deprecated. The information below is being maintained for historical reference only. Use Audubon Core terms (https://terms.tdwg.org/wiki/Audubon_Core_Term_List) instead.


Column Name | Required | Description | Example
idigbio:MediaGUID | Y | GUID for a digital multimedia object. | "urn:uuid:3c1dc496-e5c6-4849-b616-cada2896190d", "ids.amnh.org/AMNH_PBI 00010147 lateral.jpg"
idigbio:OriginalFileName | Y | Entire path+filename as stored on the appliance user's local disk. | "C:/images/AMNH_PBI/AMNH_PBI 00010147 lateral.jpg", "\home\Plants\VSC0043330.jpg"
idigbio:CollectionObjectGUID | N | Relates the media record identified by idigbio:MediaGUID to a specimen record (CollectionObject concept). | "http://biocoll.inhs.illinois.edu/fish/INHS106002", "urn:uuid:3c1dc496-e5c6-4849-b616-cada2896190d"
idigbio:Title | N | Concise title, name, or brief descriptive label of individual resource. This field should include the complete title with all the subtitles, if any. | "Ilex glabra from FSU"
idigbio:Description | N | Description of the individual resource, containing the Who, What, When, Where and Why as free-form text. | "Scanned herbarium sheet with specimen collected West of Plant City 4 miles from Mango Jct., on Hwy 92."
idigbio:LanguageCode | N | Code for the language used in the title and description. Must be in ISO 639-1 format. | "en", "es", "pt"
idigbio:DigitizationDevice | N | Free form text describing the device or devices used to create the resource. | "Canon Supershot 2000", "Makroscan Scanner 2000", "Zeiss Axioscope with Camera IIIu", "SEM (Scanning Electron Microscope)"
idigbio:NominalPixelResolution | N | The real size of the pixel depicted in the image (e.g., microscopy). Include a number and a unit. | "128µm"
idigbio:Magnification | N | Magnification applied when capturing the image of an object. | "4x", "100x"
idigbio:OcrOutput | N | Output of the process of applying OCR to the multimedia object. | "\tThe New York Botanical Garden\n\tLICHENS of NEW YORK STATE, U.S.A.\n\n Polycoccum minutulum Kocourková & F. Berger\n\n on Trapelia placodioides Coppins & P. James\n\nRockland County: Harriman State Park, along\n Woodtown Road West near dam at S end of Lake\n Sebago along Seven Lakes Drive, 41°11'N, 74°08'W,\n ca. 240 m; mixed hardwood-hemlock forest with\n granitic erratics.\n\n19 April 1998\n\nRichard C. Harris 42164\tNEW YORK BOTANICAL GARDEN\n\n\t\t\t\t\t01075759\n"
idigbio:OcrTechnology | N | Free form text describing the software utilized for OCR as well as any additional technique (cropping, color alteration applied, controlled vocabulary). | "Tesseract version 3.01 on Windows, latin character set"
idigbio:InformationWithheld | N | Indication that additional information exists and that it has not been shared in the given record due to sensitive nature. It does not contain the withheld information itself. Should include information on how to obtain the withheld information by other means (e.g., a contact). | "location information not given for endangered species, contact my@email", "collector identities withheld, contact xyz", "ask about tissue samples by contacting my@email"


Invalid input CSV file
The image ingestion appliance checks the validity of the input CSV file. An input CSV file is invalid in the following cases:
- The number of columns differs among rows.
- Any field value contains a double quotation mark (").
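
The two checks above can be approximated before upload with a short Python sketch. This is not the appliance's own validation code, which is not shown on this page; the function name find_csv_problems is hypothetical, and the sketch assumes the file is parsed with standard CSV quoting rules.

```python
import csv
import io


def find_csv_problems(text):
    """Return a list of human-readable problems found in CSV text."""
    problems = []
    expected_columns = None
    reader = csv.reader(io.StringIO(text, newline=""))
    for row_number, row in enumerate(reader, start=1):
        # The first row (the header) sets the expected column count.
        if expected_columns is None:
            expected_columns = len(row)
        elif len(row) != expected_columns:
            problems.append(
                f"row {row_number}: expected {expected_columns} columns, "
                f"found {len(row)}"
            )
        # csv.reader unescapes doubled quotes, so a literal quotation mark
        # inside a field appears as a '"' character in the parsed value.
        if any('"' in value for value in row):
            problems.append(f"row {row_number}: double quotation mark in field")
    return problems
```

An empty list means neither of the two listed problems was found; it does not guarantee that the appliance will accept the file.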


Specimen Record UUID Field
To support the relationship between media records and specimens, users can provide the field "idigbio:SpecimenRecordUUID" in the CSV file.
Each value of this field should be the UUID of a specimen record. Multiple UUIDs can be specified for one media record, separated by ",".
For example:
"6c11a2a4-25f0-446b-b594-97f97a277bf7"
"6c11a2a4-25f0-446b-b594-97f97a277bf7,73c09da2-89ef-4b07-883c-d8de80ba14bd"

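A value of the idigbio:SpecimenRecordUUID field can be split and checked with Python's standard uuid module, as in the sketch below. The function name parse_specimen_uuids is hypothetical, and how strictly the appliance itself validates these values is an assumption not confirmed by this page.

```python
import uuid


def parse_specimen_uuids(field_value):
    """Split a comma-separated field value into canonical UUID strings.

    Raises ValueError if any non-empty part is not a well-formed UUID.
    """
    parsed = []
    for part in field_value.split(","):
        part = part.strip()
        if not part:
            continue  # ignore empty pieces such as a trailing comma
        parsed.append(str(uuid.UUID(part)))
    return parsed
```

Because a value with several UUIDs contains commas, it has to be quoted as a single field when the CSV file is written; Python's csv.writer does this automatically.
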
Examples of the input CSV files: Example 1 Example 2 Example 3