Data Management Interest Group
Data Management Interest Group (DMI)
This page is devoted to resources and discussion for the DMI Group. Keeping up with data requires certain skills and infrastructure. This Interest Group plans to discuss issues surrounding shared data, and the help and information the biodiversity community needs to ensure, where possible, that providers have the most up-to-date versions of their own datasets. We intend to provide a forum for discussion and to act as a resource that points providers toward potential solutions. Do you need help re-integrating data into your database? Are you able to help, or do you know of resources? Anyone with an interest in this challenging topic is welcome to join the interest group and contribute observations and potential solutions.
The interest group schedules regular discussion sessions via Adobe Connect for the purpose of sharing techniques, strategies, uses, improvements, and technology associated with re-integrating enhanced data back into a provider's database. Resources, related documents, and discussion notes are stored below. Our first meeting: Webinar 7 August 2014, 1:00 - 2:00 PM EDT.
Interest Group Members
Meeting Recordings
- First Meeting of the Data Management Interest Group (DMI) August 07, 2014
- Partnering with libraries for data management, by Brian Westra, Recorded Monday October 20th, 2014. Noon - 1 PM EDT
- Issues in Re-integrating Georeferenced Data, the FishNet2 Experience, by Nelson Rios, recorded Monday March 30th, 2015. 2 - 3 PM EDT
- Data quality, usage, and issue tracking using GitHub, by John Wieczorek, et al at VertNet, recorded Friday 23 April, 2015. 4 - 5 PM EDT.
- Towards user-definable, semi-automated workflows for curating biodiversity data (recording). Presenters (in alphabetical order): David Lowery, James A. Macklin, Timothy McPhillips, Paul J. Morris, Tianghong Song. Recorded 28 May 2015, 2 - 4 PM EDT.
- Improving Data Quality: iDigBio Recordset data cleaning methods, tools, and data flags. Presenters: Alex Thompson (iDigBio IT), Matt Collins (iDigBio IT), and guests: Heather Appleby and Katja Seltmann. Recorded 23 October 2015, 2 - 3 PM EDT.
- Variations on the theme of tracking loans, gifts, sampling, and more. Presenters: Simon Checksfield with Nicole Fisher, CSIRO; Andrew Bentley from University of Kansas Biodiversity Institute, Specify, and SPNHC; Christine Johnson, Entomology, AMNH; Tiffany Adrain, University of Iowa Paleontology; Elspeth Haston, RBGE.
- Shaping the semantic layer by mining digitised data: an encounter between iDigBio's plant records and the Environment Ontology (ENVO). Presenters: Dr. Pier Luigi Buttigieg, HGF-MPG Group for Deep Sea Ecology and Technology, c/o Max Planck Institute for Marine Microbiology, Bremen, Germany, Email: pbuttigi@mpi-bremen.de; and Grant Godden, Research Associate, Michigan State University, Email: goddengr@msu.edu
- If you have ideas for next steps with this work, or would like to be involved in the next steps conversation, please send a note to idigbio@acis.ufl.edu
- DAMmed if you Do or Don't : Archiving, what is it anyway? and just what is a DAM?, by Larry Gall, Yale Peabody Museum, recorded 17 November 2015. 4 - 5 PM EST.
- A follow-up webinar on this topic, panel-style, is planned for early 2016. Stay tuned for more about this.
- Insights into Inselect Software: automating image processing, barcode reading, and validation of user-defined metadata
- Adobe Connect webinar recording
- MP4 Version on Vimeo
- by Lawrence Hudson and Ben Price, Natural History Museum, London. Recorded 29 March, 2016. 11 AM - 12 PM EDT.
- Webinar Panel: DAMs and Archival Issues for Large and Small Collections: options, considerations, resources
- From Pensoft Publishers and Biodiversity Data Journal: Online direct import of specimen records from iDigBio infrastructure into taxonomic manuscripts
- Adobe Connect webinar recording by Viktor Senderov - Marie Curie PhD Student at Pensoft, datascience@pensoft.net and Lyubomir Penev - Managing Director and Founder of Pensoft Publishers, penev@pensoft.net. Recorded 16 June, 2016. 9 - 10 AM EDT.
- Mass Digitizing a Working Herbarium using a conveyor belt: Workflows, Strategies, Challenges, presented by Sylvia Orli, IT and Digitization Manager, US Herbarium, Smithsonian. Recorded 18 October 2016. 3 - 4 PM EDT.
Darwin Core Hour Recordings
All Darwin Core Hour resources are on or linked through GitHub: https://github.com/tdwg/dwc-qa/wiki/Webinars
- Chapter 0. Introduction to Darwin Core Hour Webinar Series (Adobe Connect) presented by John Wieczorek, Paula Zermoglio, and Deborah Paul. Recorded 2017-02-07. On Vimeo as mp4.
- Chapter 1. Introduction to Darwin Core (Adobe Connect) presented by John Wieczorek. Recorded 2017-02-07. On Vimeo as mp4.
- Chapter 2. Even Simple is Hard (Adobe Connect) presented by John Wieczorek. Recorded 2017-03-07. On Vimeo as mp4.
- Chapter 3. Thousands of shades for “Controlled” Vocabularies (Adobe Connect) presented by Paula Zermoglio. Recorded 2017-04-04. On Vimeo as mp4.
- Chapter 4a+b. Evolution of Darwin Core Terms and Extensions - two extant examples for community input (Adobe Connect) presented by Andy Bentley and Quentin Groom. Recorded 2017-05-02. On Vimeo as mp4.
- Chapter 5. Darwin Core in Practice: Introduction to the GBIF IPT (Adobe Connect) presented by Kyle Braak, Laura Russell, and Carole Sinou. Recorded 2017-06-13. On Vimeo as mp4.
- Chapter 6. Where am I, exactly? Darwin Core georeferencing terms (Adobe Connect) presented by David Bloom, Town Peterson, and John Wieczorek. Recorded 2017-07-11. On Vimeo as mp4.
- Chapter 7. Aggregators - a Darwin Core View Part I: GBIF & iDigBio (Adobe Connect) and Part II: (More Than Vert)Net (Adobe Connect) presented by GBIF, iDigBio, VertNet, ALA, and Canadensys. Recorded 2017-08-15. On Vimeo as mp4: Part I and Part II.
- Chapter 8. A bite from the core - testing for data quality (Adobe Connect) presented by Lee Belbin and Arthur Chapman. Recorded 2017-09-05 (North America) and 2017-09-06 (Oceania). On Vimeo as mp4.
- Chapter 9. Kurator Web: for Cleaner Biodiversity Data (Adobe Connect) presented by John Wieczorek. Recorded 2017-10-24. On Vimeo as mp4.
- Chapter 10. Audubon Core and 3D Biodiversity Data: Metadata, Practice, and Unification of Efforts (Adobe Connect) presented by Gary Motz and John Wieczorek. Recorded 2017-11-21. On Vimeo as mp4.
- Chapter 11. DwC Hour Brainstorming – Inviting the Community to Plan for Next Year (Adobe Connect). Recorded 2017-12-04. On Vimeo as mp4.
- Chapter 12. Making DNA and tissue collections available by using the GGBN extensions with IPT (Adobe Connect) presented by Gabriela Dröge and Katherine Barker. Recorded 2018-02-21. On Vimeo as mp4.
- Chapter 13. The Problem of Time: Dealing with Paleontological and Zooarchaeological Specimens in Darwin Core (Adobe Connect) presented by Laura Brenskelle. Recorded 2018-04-24. On Vimeo as mp4.
2020 Darwin Core Hours coming soon. Watch this space.
Collaborative Notes and Interest Group Documents
- DMI Kickoff Meeting - Collaborative Notes Document short url: http://goo.gl/S7nzs5
- DMI Google Notes, Partnering with Librarians for Data Mgmt August 07, 2014
- DMI Meeting Chat Transcript, Issues in Re-integrating Georeferenced Data, the FishNet2 Experience March 30, 2015
- DMI Google Doc notes for iDigBio Webinar: Data quality, usage, and issue tracking using GitHub 23 April 2015
- DMI Meeting 28 August - Planning a Webinar Series Group Notes
- 9 January 2017 DMI Organizational Meeting Notes
Presentations, Posters, Upcoming Topics
- Data Management: The Data Re-integration Step Presentation from first meeting (Webinar), 7 August 2014
- Partnering with libraries for data management, by Brian Westra, 20 October 2014
- DMI Poster at iDigBio Summit IV, 27-28 October 2014
- Poster presented by Mare Nazaire, content by working group, design by Jeremy Spinks.
- Webinar Calendar Announcement: FishNet2 on re-integrating georeferences back into local collections databases (March 30, 2015)
- Webinar Calendar Announcement: Data quality, usage, and issue tracking using GitHub: the view from VertNet (23 April 2015)
- Presentation Slides from Data quality, usage, and issue tracking using GitHub: the view from VertNet
- Webinar Calendar Announcement: Towards user-definable, semi-automated workflows for curating biodiversity data (Filtered PUSH, Kepler Kurator, *Akka), May 28th, 2015
- Kurator presentation (pdf)
- Part 2 of the webinar is designed for IT-oriented folks who want to install and test the tool. Please go to http://wiki.datakurator.net/web/iDigBioWebinar_May2015 and follow the instructions; you'll have some opportunities in the second half of the webinar to get input on using the tool.
- iDigBio Recordset Data Cleaning tools and flags: where do they come from? how can data providers use them to enhance their datasets?
- Alex Thompson and Matt Collins, Friday, October 23rd, 2 PM EDT
- Slides available here
- Check out the blog post by Heather Appleby and Katja Seltmann about their experience using the information in the data flags provided by iDigBio. What did they learn? What did we learn at iDigBio? What's next?
- Variations on the theme of tracking loans, gifts, sampling, and more
- Simon Checksfield, Nicole Fisher, Andrew Bentley, Matt Woodburn, Vince Smith, Christine Johnson, Tiffany Adrain, and Elspeth Haston; Friday, October 30, 2015, 21:00:00 (UTC): Friday 5:00 PM (EDT); Friday 4:00 PM (Kansas City); Friday 9:00 PM (Edinburgh, London); Sat 8:00 AM (Sydney)
- Shaping the semantic layer by mining digitised data: an encounter between iDigBio's plant records and the Environment Ontology (ENVO)
- Dr. Pier Luigi Buttigieg, Max Planck Institute; Tuesday, November 10, 2015 - 9:00am to 10:00am EST
- Announcement: DAMmed if you Do or Don't : Archiving, what is it anyway? and just what is a DAM?
- PowerPoint DAMmed if you Do or Don’t (ppt) by Larry Gall
- Recording, by Larry Gall (Yale Peabody Museum); Tuesday, 17 November 2015 at 4 PM EST (that's 21:00 UTC).
- DEMO and Webinar Announcement: Insights into Inselect Software: automating image processing, barcode reading, and validation of user-defined metadata
- Insights into Inselect presentation (pdf); Tuesday, 29 March 2016 11 AM EDT, 4 PM BST by software developers Lawrence Hudson and Ben Price from the Natural History Museum (NHM) in London.
- From Pensoft Publishers and Biodiversity Data Journal: Online direct import of specimen records from iDigBio infrastructure into taxonomic manuscripts
- Webinar Presentation (pdf) by Viktor Senderov - Marie Curie PhD Student at Pensoft, datascience@pensoft.net and Lyubomir Penev - Managing Director and Founder of Pensoft Publishers, penev@pensoft.net. Recorded 16 June, 2016. 9 - 10 AM EDT. (pptx)
- Providing Data to iDigBio - Getting Feedback from iDigBio: Experiencing the Data Life Cycle
- Mare Nazaire (date to be decided after their data is ingested)
Potential Topics
- Linking specimens, notes, and literature: what systems have you found that best serve those linkages?
- More about Archiving Options and Challenges
- Macroalgal TCN using Voice Recognition and OCR output to speed up digitization
Relevant Papers and Documents
A specialist’s audit of aggregated occurrence records, by Robert Mesibov