Georeferencing for Research Use
Latest revision as of 13:28, 1 May 2019
Post Workshop Publication
Organizers and participants co-wrote a summary of the lessons learned and key observations from this workshop, published as:
- Seltmann K, Lafia S, Paul D, James S, Bloom D, Rios N, Ellis S, Farrell U, Utrup J, Yost M, Davis E, Emery R, Motz G, Kimmig J, Shirey V, Sandall E, Park D, Tyrrell C, Thackurdeen R, Collins M, O'Leary V, Prestridge H, Evelyn C, Nyberg B (2018) Georeferencing for Research Use (GRU): An integrated geospatial training paradigm for biocollections researchers and data providers. Research Ideas and Outcomes 4: e32449. https://doi.org/10.3897/rio.4.e32449
iDigBio - CCBER GWG Georeferencing for Research Use, a short course
| Georeferencing for Research Use, a short course | |
|---|---|
| Quick Links for GWG Second Train the Trainers Workshop | |
| Georeferencing for Research Use - link to agenda | |
| Biblio entries | |
| Georeferencing for Research Use, short course report | |
October 4 - 7, 2016 at NCEAS (https://www.nceas.ucsb.edu/), Santa Barbara, California
We welcome you to this short course, with a focus on research use of georeferenced natural history collections data. We will include activities and discussions about best practices and tools for georeferencing, capturing locality data in the field, and using georeferenced specimen locality data in research. Attendees must have a basic level of experience with georeferencing techniques and tools and be researchers or directly involved with researchers.
After the workshop, we will encourage our participants to share use cases, any training materials developed, and to offer workshops, webinars, talks, or other events aimed at increasing use of best practices for georeferencing legacy locality data, best practices for capturing the locality data from future biological and paleontological collecting and sampling events, and best practices for using the data in research.
Some anticipated course content includes discussion and activities about georeferencing integration, georeferenced data visualization, and georeferences for modeling and research.
Logistics:
- Hotel and NCEAS Map
- NCEAS is 3rd floor of the Balboa Building, 735 State Street
- Local restaurant list
Course Instructor List
(in alphabetical order) David Bloom, Matt Collins, Una Farrell, Shelley James, Sara Lafia, Deborah Paul, Marcy Revelez, Nelson Rios, Katja Seltmann, Jessica Utrup, Mike Yost
Bring your Datasets and Laptops:
Participants are strongly encouraged to bring representative datasets from their collections or research that need georeferencing to expose everyone to the variety of locality data georeferencing issues and give the experts and participants a chance to work together to address any challenges.
Participants must bring their own laptops and everyone will have wired access to facilitate the best possible workshop experience.
Reading Materials and Resources:
- Georeferencing.org
- Georeferencing Quick Reference Guide, version 2012-10-08. John Wieczorek, David Bloom, Heather Constable, Janet Fang, Michelle Koo, Carol Spencer, Kristina Yamamoto
- Guide to Best Practices for Georeferencing - Chapman, A.D. and J. Wieczorek (eds). 2006
- Georeferencing Working Group Training Videos
- Georeferencing Incidents from Locality Descriptions and its Applications: a Case Study from Yosemite National Park Search and Rescue Transactions in GIS, 2011, 15(6): 775–793 Authors: Doherty, Guo, Liu, Wieczorek, Doke
- iDigBio Georeferencing Wiki http://tinyurl.com/idbgeowiki
- HerpNET Georeferencing Resources
- Take Workshop Notes Together Here
- Post - Workshop Survey Questions
- Got a Georeferencing Question? Post it on the iDigBio Georeferencing List Serve
- BITC Global Online Seminar #25: Simple Workflow for Data Cleaning
Wireless / Wired Access Issues:
Both wired and wireless access will be provided to workshop participants. Connectivity instructions will be provided at the workshop.
Goals of the Workshop:
- Best practices for researchers for creating new locality data in the field and for georeferencing legacy data.
  - Tools (hardware and software) and standards (what to document, datum etc.).
  - How to repatriate data, and best practices for depositing data in a data repository if it can't be repatriated (what the obstacles are and how to minimize data loss).
- How to evaluate already georeferenced data. Current tools for visualization and evaluation.
  - Metrics to look for
  - Current tools for georeferencing
  - Online tools
  - R
  - QGIS
- Researchers give input on the challenges of georeferencing and of using existing georeferences.
- Workflow review of some research uses of georeferenced data (Katja, Shelley, ...)
Ultimate goal: Participant can point to aspects they have learned (tool, standard etc.) during the workshop and can indicate how they will use those aspects for their research goal/purpose (present or future).
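As one illustration of the "metrics to look for" when evaluating already georeferenced data, the sketch below flags a few common problems in a single record. It assumes records are dictionaries keyed by the Darwin Core terms decimalLatitude, decimalLongitude, geodeticDatum and coordinateUncertaintyInMeters; the flag names and the function `georeference_flags` are hypothetical and are not the flags used by iDigBio or any other aggregator.

```python
from typing import Dict, List, Optional


def _to_float(value) -> Optional[float]:
    """Return value as a float, or None if it is empty or not numeric."""
    try:
        return float(value)
    except (TypeError, ValueError):
        return None


def georeference_flags(record: Dict[str, object]) -> List[str]:
    """List simple, illustrative quality problems found in one record.

    The flag names are made up for this example.
    """
    flags = []
    lat = _to_float(record.get("decimalLatitude"))
    lon = _to_float(record.get("decimalLongitude"))
    if lat is None or lon is None:
        flags.append("missing_coordinates")
    else:
        if not (-90 <= lat <= 90 and -180 <= lon <= 180):
            flags.append("coordinates_out_of_range")
        if lat == 0 and lon == 0:
            # 0, 0 is usually a placeholder rather than a real locality
            flags.append("zero_coordinates")
    if not record.get("geodeticDatum"):
        flags.append("missing_datum")
    uncertainty = _to_float(record.get("coordinateUncertaintyInMeters"))
    if uncertainty is None:
        flags.append("missing_uncertainty")
    elif uncertainty <= 0:
        flags.append("nonpositive_uncertainty")
    return flags


example = {
    "decimalLatitude": "34.4208",
    "decimalLongitude": "-119.6982",
    "geodeticDatum": "WGS84",
    "coordinateUncertaintyInMeters": "",
}
print(georeference_flags(example))
```

A record with no flags is not necessarily correct; these checks only catch values that are absent or impossible, which is a reasonable first pass before mapping the data.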
Workshop Objectives:
Topics to be covered
Pre-workshop materials
- Introductory information about datums, mapping, coordinate systems
- Basic georeferencing how-to
During workshop
- Data standards, DwC terminology and fields (e.g. lat, long, datum), differences among disciplines (neo- and paleontological fields)
- Georeferencing toolkit and workflow examples (GEOLocate, maps, other resources, pros and cons)
- Best practices for field collection of data (locality strings and GPS units, precision, datum) 
- How best to record and store georeferencing notes and other data sources (database/CMS dependent)
- Best practices for georeferencing of legacy data given:
  - Varied research requirements for accuracy and precision
  - Project and collection management limitations
  - Uncertainty data - polygon vs. point radius, description and metadata, etc.
  - Datum - georectify to a standard versus verbatim
- Workflows for incorporating data into different collections databases
  - Best practice syntax in locality descriptions for use in automation vs verbatim strings
  - Database limitations
  - Multiple geopoint values and storage (verbatim, automated-non-vetted value, nearest named place, update to more accurate value, etc.)
- Downloading datasets - sources, different mechanisms
  - Assessing data quality
  - Uncertainty data - availability in data sources and interpretation
- Tools for aggregating, cleaning, visualizing and analyzing data
  - R, QGIS, OpenRefine
  - Creating maps
  - Spatial analyses
  - Automated, online tools and applications using geospatial data (e.g. LifeMapper)
- Difficult cases, such as geopolitically fluid locations over time, offshore localities
- Hands-on practice & case studies
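To make the point-radius idea in the topics above concrete, here is a minimal sketch that tests whether a coordinate (for example a GPS reading) falls inside a point-radius georeference. It assumes both coordinates are in the same datum, treats the Earth as a sphere, and uses the haversine great-circle formula; the function names and the example coordinates are made up for illustration.

```python
import math

# Mean Earth radius in meters; a spherical approximation, so distances
# can be off by a fraction of a percent compared with an ellipsoidal model.
EARTH_RADIUS_M = 6371008.8


def haversine_m(lat1, lon1, lat2, lon2):
    """Great-circle distance in meters between two points in decimal degrees."""
    phi1 = math.radians(lat1)
    phi2 = math.radians(lat2)
    dphi = phi2 - phi1
    dlambda = math.radians(lon2 - lon1)
    a = (math.sin(dphi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(dlambda / 2) ** 2)
    return 2 * EARTH_RADIUS_M * math.asin(math.sqrt(a))


def within_point_radius(center_lat, center_lon, radius_m, lat, lon):
    """True if (lat, lon) lies inside the circle described by the georeference."""
    return haversine_m(center_lat, center_lon, lat, lon) <= radius_m


# Hypothetical georeference: a point with a 1000 m uncertainty radius.
print(within_point_radius(34.0, -119.0, 1000, 34.005, -119.0))  # about 556 m away
print(within_point_radius(34.0, -119.0, 1000, 34.02, -119.0))   # about 2.2 km away
```

A polygon georeference would need a point-in-polygon test instead; the circle is simply the easier shape to check by hand.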
Schedule of Events - Agenda
Breakfast, Lunch and Dinner every day is on our own (not provided).
Day 1, Tuesday October 4th
Recording Day 1: https://vimeo.com/album/2163673/video/192472653
| Time | Activity | Presenter | 
|---|---|---|
| 8:45 | Pick up Name Tags, Wireless Log-In, Wired Setup, Collaborative Notes (google doc) | |
| 9:00 | Welcome by NCEAS host, Logistics, Trainer Introductions, Introduction to iDigBio, CCBER | Katja Seltmann - CCBER, Debbie Paul - iDigBio, Ben Halpern - Director NCEAS, Ginger Gillquist - Logistics NCEAS | 
| 9:20 | From the participants and instructors: a quick informal survey. Quick Name/Rank/Serial# introductions | Deb Paul | 
| 10:00 | Standards, Terms & Fields: Darwin Core Standard, Key Terminology | David Bloom, Shelley James | 
| 10:15 | Georeferencing Quick Reference Guide, and Georeferencing Template | Una Farrell | 
| 10:30 | Coffee Klatch w/ NCEAS | |
| 11:15 | Locality Types | Una Farrell | 
| 11:45 | Georeferencing Calculator, Calculator Manual | David Bloom | 
| 12:10 | Lunch | |
| 13:10 | Georeferencing Calculator Example and Exercises, MaNIS/HerpNET/ORNIS Georeferencing Guidelines | David Bloom | 
| 13:40 | Internet Resources - Where to Begin? georeferencing.org | Una Farrell | 
| 14:40 | Break | |
| 15:10 | Exercises cont. | |
| 15:30 | GEOLocate: Overview, Basics & Demos GEOLocate Introduction | Nelson Rios | 
| 17:00 | Day in Review, Trivia Question of the Day, Survey (15 min) | |
| 17:30 | End | |
Dinner on our own - See list of local restaurants. Optional Evening Activity: Happy hour and joyful GeoGathering at Hoffmann Brat Haus
Day 2, Wednesday October 5th
Recording Day 2: https://vimeo.com/album/2163673/video/192472654
| Time | Activity | Presenter | 
|---|---|---|
| 8:50 | Please complete Survey for Day 1! | |
| 9:00 | Two! Trivia Questions, Review and Questions, Software Installs check for tomorrow | All | 
| 9:10 | GEOLocate: Advanced Features, Collaborative Georeferencing and the GEOLocate API | Nelson Rios | 
| 10:00 | Importance of Polygons | Mike Yost, Nelson Rios | 
| 10:30 | Break | |
| 11:00 | GPS Units and APPs: Exercise Introduction | David Bloom, Mike Yost, Shelley James, Katja Seltmann | 
| 11:15 | GPS Exercises (continued outside) | All | 
| 12:15 | Lunch | |
| 13:15 | GPS Exercises (continued outside) Please upload your GPS Data here | All | 
| 13:30 | Good and Bad Localities, Field Locality Handout: MVZ and iDigBio GWG Guide for Recording Localities in Field Notes, Field Information Management Systems (FIMS) Paper maps | David Bloom | 
| 14:15 | Georeferencing Workflows: presentations and discussion. Researcher and Collections perspectives: Producers and Consumers | All | 
| 15:15 | Break | |
| 15:45 | Online Exercises, Review of known answers | |
| 16:30 | GPS Exercise - Review (.kmz), Summary Spreadsheet, Field Worksheet, Locality Descriptions | David Bloom, Jessica Utrup | 
| 16:45 | Day in Review. Download dataset for tomorrow | |
| 17:15 | Survey (15 min) | |
| 17:30 | End | |
Dinner on our own - See list of local restaurants. Optional Evening Activities: TBA
Day 3, Thursday October 6th
Download zipped dataset: http://s.idigbio.org/idigbio-downloads/a69d1541-4726-465d-84ad-50c7ed556eee.zip
This dataset contains specimens in the family Carabidae that have geocoordinates and are in California, about 25,000 records in total.
Recording Day 3: https://vimeo.com/album/2163673/video/192472656
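The dataset parameters described above (family Carabidae, has geocoordinates, in California) can also be written as a search query. The sketch below only builds a query URL and does not contact any server. The endpoint and the field names `family`, `stateprovince` and `geopoint` reflect one reading of the iDigBio search API and should be treated as assumptions to check against the current API documentation; `build_search_url` and `decode_rq` are hypothetical helper names.

```python
import json
from urllib.parse import parse_qs, urlencode, urlparse

# Assumed endpoint of the iDigBio record search service.
SEARCH_ENDPOINT = "https://search.idigbio.org/v2/search/records"


def build_search_url(family, state, limit=100):
    """Build a search URL for georeferenced records of one family in one state."""
    record_query = {
        # Assumption: indexed values are stored in lower case.
        "family": family.lower(),
        "stateprovince": state.lower(),
        # Assumption: this clause keeps only records that have coordinates.
        "geopoint": {"type": "exists"},
    }
    params = {"rq": json.dumps(record_query), "limit": limit}
    return SEARCH_ENDPOINT + "?" + urlencode(params)


def decode_rq(url):
    """Recover the record query from a URL, handy for checking what was sent."""
    return json.loads(parse_qs(urlparse(url).query)["rq"][0])


url = build_search_url("Carabidae", "California")
print(url)
print(decode_rq(url))
```

Keeping the query as a small dictionary makes it easy to record alongside the downloaded data, so the same filter can be repeated or compared with a GBIF download later.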
| Time | Activity | Presenter | 
|---|---|---|
| 9:00 | Review and Questions | All | 
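| 9:05 | Georeferencing for Research Use Workshop - iDigBio Datasets: downloading datasets from iDigBio (get data from the portal and explain each component of the dataset); filter and get the dataset; what is raw vs. not raw?; similar to or different from GBIF?; list of iDigBio flags; walk through the steps of the download, but provide the dataset | Matthew Collins (remote), Katja Seltmann, Shelley James | 
| 10:00 | Data Quality: How to evaluate existing georeferenced data/Fitness for Use | Katja Seltmann, Shelley James | 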
| 10:30 | Break | |
| 11:00 | Cleaning Datasets: Spreadsheets, Open Refine, tracking your work | Deb Paul, Nelson Rios, Katja Seltmann | 
| 12:00 | Lunch | |
| 13:00 | Cleaning Datasets: Spreadsheets, Open Refine, tracking your work (2) | Deb Paul, Nelson Rios, Katja Seltmann | 
| 13:30 | Visualizing datasets: Set up QGIS and load data. Auxiliary datasets: Download any additional datasets of interest. Online Tutorial | Sara Lafia | 
| 15:00 | Break | |
| 15:30 | Visualizing datasets: Preview and explore toolkits & saving your maps and data | Sara Lafia | 
| 17:15 | Survey (15 min) | |
| 17:30 | End | |
Dinner: TBD
Day 4, Friday October 7th
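Download zipped QGIS project: https://ucsb.box.com/s/5qqiiqw237jr5mb7ip8hm5yspl4b8hcn
The project, up to the point we completed on Day 3, is available for download in the same folder as the auxiliary data. Launch the QGIS project from the Tutorial.qgs file.
Recording Day 4: https://vimeo.com/album/2163673/video/192472655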
| Time | Activity | Presenter | 
|---|---|---|
| 9:00 | Questions and Review. Share your datasets! Upload your research datasets that you'd like to work on. | All | 
| 9:10 | Exploring datasets: Aggregating by Regions | Sara Lafia, Katja Seltmann, Nelson Rios | 
| 9:50 | Exploring datasets: Time animation | Sara Lafia | 
| 10:30 | Break | |
| 11:00 | Exploring datasets: Uncertainty. Bin points based on uncertainty rank; symbolize uncertainty by collector, data quality score - systematic error | Sara Lafia | 
| 11:30 | Exploring datasets: Spatial autocorrelation | Sara Lafia | 
| 12:00 | Lunch on our own. | |
| 13:00 | LifeMapper LIVE DEMO | Jeffrey Cavner, James Beach, et al | 
| 13:15 | Work on own data sets/Open question time/Practice. Polygon practice | Nelson Rios, et al | 
| 13:45 | Breakout sessions: Cleaning data using R | |
| 15:30 | Break | |
| 16:00 | Research Use of the Data. A conversation from the collective point-of-view of the researchers present. Challenges? Experiences? Needs (software, skills, infrastructure)? What changes might you make now to your workflows? | Ed Davis, Katja Seltmann, Shelley James, Nelson Rios, Sara Lafia | 
| 16:30 | Day & workshop in Review. iDigBio Webinar On Your Calendar: Oct 12th, 2016 - Isn't that Spatial? Post Workshop Survey | |
| 17:30 | Beer | |
Dinner on our own - See list of local restaurants. 
Some software install instructions from Data and Software Carpentry
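The Day 4 session on uncertainty involved binning points by uncertainty rank before symbolizing them. A minimal sketch of that binning step outside QGIS is shown here, assuming each record carries a coordinateUncertaintyInMeters value (or None when it is missing). The thresholds and labels are hypothetical choices for illustration, not the ranks used in the workshop.

```python
from bisect import bisect_left
from collections import Counter

# Hypothetical upper bounds (in meters) for each uncertainty rank.
BIN_EDGES_M = [100, 1000, 10000]
BIN_LABELS = ["<=100 m", "<=1 km", "<=10 km", ">10 km"]


def uncertainty_rank(uncertainty_m):
    """Return the label of the bin that an uncertainty value falls into."""
    if uncertainty_m is None:
        return "unknown"
    # bisect_left counts the edges strictly below the value, so a value
    # equal to an edge stays in the bin that the edge closes.
    return BIN_LABELS[bisect_left(BIN_EDGES_M, uncertainty_m)]


def count_by_rank(uncertainties):
    """Count how many records fall into each uncertainty rank."""
    return Counter(uncertainty_rank(value) for value in uncertainties)


sample = [30, 100, 250, 5000, 20000, None]
for label, count in sorted(count_by_rank(sample).items()):
    print(label, count)
```

Keeping records with no uncertainty in their own "unknown" bin, rather than dropping them, makes it visible how much of a dataset cannot be assessed at all.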
Requests for the Future
- Scripts/tools for repeated cleaning/analysis
- Using the iDigBio API (API for dummies)
- Inselect (note we provided links for more on this tool - to the workshop participants, see google doc)
- Automated data cleaning - iDigBio and VertNet activities
- What to do with quantified uncertainties & polygons - Jorge Soberon (KU team, others in the fitness for use GBIF working group); see Final Report of the Task Group on GBIF Data Fitness for Use in Distribution Modelling
- QGIS layers - use cases (e.g. elevation)
- Detailed Workflows - for georeferencing, when not to georeference (see iDigBio Georeferencing Working Group - https://www.idigbio.org/wiki/index.php/IDigBio_Working_Groups#Georeferencing_Working_Group_.28GWG.29), cleaning
- Documentation for tutorials
- Standards/possibility for storing multiple georeferences (and other possibilities such as annotations within iDigBio)
- QGIS tutorial as a Software/Data Carpentry format
- QGIS working group
- GEOLocate with R webinar (follow-on from the Symbiota webinar https://www.idigbio.org/content/symbiota-webinar-geolocate-toolkit and https://www.idigbio.org/content/coge-collaborative-georeferencing-demo-webinar)
Trained Georeferencers
- Map of Participants and Instructors for TTT1 and TTT2
- Wiki for all TTT1 and TTT2 Participants
Pre-Workshop Assignments
- Attend pre-workshop online meeting. Two options, choose one.
- Thursday September 15th - two times to choose from:
- 11am EDT (10am CDT, 9am MDT, 8am PDT)
- 3pm EDT (2pm CDT, 1pm MDT, 12pm PDT)
 
- Sign Up Here: https://goo.gl/forms/WmJO6z79rx5nHlv32
- Meet: http://idigbio.adobeconnect.com/geotrain
 
- Please watch the following videos before the workshop (flipped classroom). Be sure to note any questions / insights to share with the group.
- Collaboration to Automation: https://vimeo.com/53006304 (25 min lecture, 10 min discussion)
- Geographical Concepts: https://vimeo.com/53008556 (4 min lecture, 2 min discussion)
- https://vimeo.com/album/2163673/video/63692461 (4 min lecture only)
 
- Point Radius Method and Best Practices: https://vimeo.com/53006303 (20 min lecture, 5 min discussion)
- OPTIONAL video: BITC Global Online Seminar #25: Simple Workflow for Data Cleaning (1 hour)
 
- Please install the following software
- QGIS and then QGIS Plugins. NOTE it's easy to install all the plugins from inside QGIS once you have it installed. 
- QGIS: http://qgis.org/en/site/forusers/download.html
-  QGIS Plug-ins: Open your QGIS installation on your laptop > navigate to Plugins > Manage and Install Plugins (as seen in the screenshots). You can then add these plugins within QGIS by typing the tool name into the search box and clicking on "Install Plugin": Clipper, Coordinate Capture, GPS Tools, Heatmap, Interpolation, OpenLayers, Processing, TimeManager, and Lifemapper.
- Clipper (clip intersecting vector features)
- Coordinate Capture (find coordinates in various coordinate reference systems (CRS) via mouse-over)
- Gazetteer Search (finding named places via a search bar): NOTE: The Gazetteer Plugin is not "discoverable" through the Plugins manager in QGIS. You'll need to follow the installation steps listed here: https://github.com/AstunTechnology/QGIS-Gazetteer-Plugin#Installation
- Manual
- find where your QGIS is installed on your machine
- right click the folder to see contents and find the folder for Plugins
- for example, on Deb's Windows 10 laptop, the path to the correct QGIS plugins folder is C:\Users\dlpss\.qgis2\python\plugins
 
- make a folder called gazetteersearch inside of the QGIS Plugins directory
- download the contents from GitHub and move them into the gazetteersearch folder
- close and reopen QGIS in order for the plugin to show up
 
- via Git
- clone the repository into your QGIS Plugins folder following the steps from the link above. Please let Sara know if you have any other questions.
 
 
- GPS Tools (loading and importing GPS data)
- Heatmap (generate a heatmap raster given input vector points)
- Interpolation (interpolation techniques given vertices of a vector layer)
- OpenLayers (load basemaps from OpenStreetMap, Google, etc.)
- Processing (spatial data processing framework)
- TimeManager (event-visualization animation for vector features)
- Lifemapper: Plugin for Lifemapper webservices for SDM modeling, and multispecies Presence Absence Matrix (PAM) analysis. The tool allows you to build SDM models using GBIF, iDigBio, or user supplied species occurrence data.
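The manual Gazetteer Search installation described above (make a gazetteersearch folder inside the QGIS plugins directory, then move the downloaded contents into it) can also be scripted. The following is only a minimal sketch, not part of the original workshop materials: the function name `install_plugin` and the example paths in the comments are hypothetical, and you must substitute the plugins folder from your own QGIS installation.

```python
import shutil
import tempfile
from pathlib import Path


def install_plugin(downloaded_dir: Path, plugins_dir: Path,
                   plugin_name: str = "gazetteersearch") -> Path:
    """Copy an unpacked plugin download into <plugins_dir>/<plugin_name>.

    downloaded_dir: folder holding the files downloaded from GitHub.
    plugins_dir:    your existing QGIS plugins folder (assumed to exist).
    """
    if not plugins_dir.is_dir():
        raise FileNotFoundError(f"QGIS plugins folder not found: {plugins_dir}")
    target = plugins_dir / plugin_name
    # dirs_exist_ok (Python 3.8+) lets the copy be re-run to refresh the files.
    shutil.copytree(downloaded_dir, target, dirs_exist_ok=True)
    return target


# Hypothetical usage; both paths are examples only:
# install_plugin(Path.home() / "Downloads" / "gazetteer-download",
#                Path.home() / ".qgis2" / "python" / "plugins")
```

As with the manual steps, close and reopen QGIS afterwards so that the plugin shows up.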
 
 
 
- OpenRefine (previously Google Refine) is a data-cleaning tool that runs in a web browser. Any browser (Safari, Firefox, Chrome) should work fine; Internet Explorer is not recommended. You will need to download and install OpenRefine. When you open it, it runs in the browser, but you don't need an internet connection, and all of the data is stored on your computer. (Use these resources, Open Refine Install or Install Open Refine, for more help if you run into any OpenRefine install issues.)
- Windows
- Go to the OpenRefine download page.
- Click on Windows kit to download the install file
- To use it, unzip, and double-click on openrefine.exe (if you're having issues with openrefine.exe try refine.bat instead)
- OpenRefine will then open in your web browser.
- If it doesn't open automatically, open a web browser after you've started the program and go to the URL http://localhost:3333 and you should see OpenRefine.
 
- MacOS
- Go to the OpenRefine download page.
- Click on Mac kit to download the install file
- Open the downloaded .dmg file
- Drag the icon in to the Applications folder
- Double click on the icon and OpenRefine will then open in your web browser.
- If it doesn't open automatically, open a web browser after you've started the program and go to the URL http://localhost:3333 and you should see OpenRefine.
 
- Linux
- Go to the OpenRefine download page.
- Click on Linux kit to download the install file
- Download and extract
- Type ./refine in your terminal and OpenRefine will then open in your web browser.
- If it doesn't open automatically, open a web browser after you've started the program and go to the URL http://localhost:3333 and you should see OpenRefine.
 
 
- Spreadsheet software (your choice: LibreOffice, Excel, etc.)
- We'll be using a spreadsheet program. If you already have a spreadsheet program installed, like LibreOffice, Excel or OpenOffice, you can use whatever you already have. If you don't have a spreadsheet program, please download and install LibreOffice from http://www.libreoffice.org/download/libreoffice-fresh/
 
- Java: Please make sure you have Java installed (needed for Open Refine to work).
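Once OpenRefine has been started, you can confirm that it is answering at http://localhost:3333 (the URL given in the install steps above) without opening a browser. This is a minimal sketch under stated assumptions: the helper name `is_reachable` is hypothetical, and it only checks that some HTTP server responds at that address, not that the server is actually OpenRefine.

```python
import urllib.error
import urllib.request


def is_reachable(url: str = "http://localhost:3333", timeout: float = 3.0) -> bool:
    """Return True if an HTTP server answers successfully at url, else False."""
    # An empty ProxyHandler makes the request go directly to the local machine,
    # ignoring any proxy configured in the environment.
    opener = urllib.request.build_opener(urllib.request.ProxyHandler({}))
    try:
        with opener.open(url, timeout=timeout) as response:
            return 200 <= response.status < 400
    except (urllib.error.URLError, OSError):
        # Covers connection refused, timeouts, and HTTP error responses.
        return False


if __name__ == "__main__":
    if is_reachable():
        print("A server is answering at http://localhost:3333")
    else:
        print("Nothing answered; start OpenRefine and check that Java is installed")
```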
 
- OPTIONAL software install and tutorials - if you are interested in the R breakout section we will offer at the workshop.
- R & RStudio: R is a programming language that is especially powerful for data exploration, visualization, and statistical analysis. To interact with R, we use RStudio.
- Windows
- Video Tutorial
- Install R by downloading and running this .exe file from CRAN (http://cran.r-project.org/index.html).
- Also, please install the RStudio IDE.
 
- Mac OS X
- Video Tutorial
- Install R by downloading and running this .pkg file from CRAN (http://cran.r-project.org/index.html).
- Also, please install the RStudio IDE.
 
- Linux
- You can download the binary files for your distribution from CRAN. Or you can use your package manager
- e.g. for Debian/Ubuntu run sudo apt-get install r-base and for Fedora run sudo yum install R.
 
- Also, please install the RStudio IDE.
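After installing, one quick way to confirm that the command-line tools are available is to look them up on your PATH. The sketch below is an illustration only: the helper name `find_tools` is hypothetical, and it assumes the usual command names `R`, `Rscript`, and `java`. On Windows, R can be installed and usable from RStudio without being on PATH, so a "not found" result there is not necessarily a problem.

```python
import shutil


def find_tools(names):
    """Map each command name to its full path, or None if it is not on PATH."""
    return {name: shutil.which(name) for name in names}


if __name__ == "__main__":
    for name, path in find_tools(["R", "Rscript", "java"]).items():
        print(f"{name}: {path or 'not found on PATH'}")
```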
 
 
- Then install packages:
- R Tutorials. OPTIONAL: take a short course in R. If you are a novice, take a beginner course. We don't expect you to know R well, but we do need you to be familiar enough to follow along with one of our optional hands-on sessions. There are several good options:
- Try R (Code School course)
- Beginner Course: Up and Running with R with Barton Poulson (course at lynda.com)
- Intermediate Course: R Statistics Essential Training with Barton Poulson (course at lynda.com)
- For the future, you could take a Coursera class: Intro to R (Coursera course started August 22nd).
 
- Georeferencing using Apps: please install either of these on your device, if you want to try georeferencing this way to compare with results from a GPS unit.
 


