Access to Digitization Tools and Methods
Latest revision as of 12:43, 3 February 2015
- TDWG 2014 Agenda
- TDWG 2014 Biblio Entries
- TDWG 2014 IV Report
This wiki supports the BIS(TDWG) 2014 Symposium: Access to Digitisation Tools and Methods, Jönköping, Sweden, October 27th, 2014.
This broad digitisation symposium included three sessions, covering different elements of digitisation. The key focus was on current developments in digitisation, with a strong emphasis on the accessibility of tools and protocols (think open access, open source).
Example topics include tools for data and metadata capture and enrichment, such as Optical Character Recognition (OCR), text mining, Natural Handwriting Recognition (NHR), and Natural Language Processing (NLP), along with their availability and how they are being adopted and adapted. How are these tools being used currently, and how can we ensure that they are accessible to all? In addition, what tools are in use for image capture and management, quality control, and long-term preservation of images? What techniques are in use at the many institutes that are capturing images of their natural history collections and related objects such as field notebooks, illustrations, labels, card catalogs, journals, and literature?
BIS(TDWG) 2014 Symposium: Access to Digitisation Tools and Methods, Agenda and Logistics
- Agenda in Google Doc
- TDWG 2014 Program
- Start time: 11 am on Monday, October 27th, at the Elmia Congress Centre, Rydberg Hall, Jönköping, Sweden.
- TDWG 2014 Symposium: Access to Digitisation Tools and Methods - Calendar Announcement
- BIS(TDWG) 2014 Conference Website
- Twitter: @iDigBio @TDWG #tdwg2014 #tdwg #digitization and conveners: @emhaston @idbdeb @vsmithuk
Collaborative Documents
- Google Doc for Group Notes
  - Schedule is embedded in the Google Doc
Conference and Symposium Blog Post
iDigBio at BIS-TDWG 2014: some digitization, crowd-sourcing, and data use too
Photos
- Facebook Photo Album
Workshop Recordings
Monday, 27 October 2014
- 11:00 - 11:10 am Discovery and access to digitisation tools and methods. (recording) Elspeth M Haston, Robert Cubey
- 11:10 - 11:30 am The Open Drawer Project - Providing free access to high resolution images of entomological collection drawers. (recording) Alexander Kroupa, Falko Glöckler, Bernhard Schurian, Felix Maier, Stefan Schmidt, Gregor Hagedorn, Christoph Häuser
- 11:30 - 11:50 am StanDAP-Herb develops a standard process for extracting metadata from digitised herbarium specimens. (recording) Agnes Kirchhoff, Walter G. Berendsohn, Ulrich Bügel, Fernando Chaves, Cailin Guan, Markus Lindhorst, Dominik Röpert, Eduard Santamaria, Karl-Heinz Steinke, Hangyan Zheng
- 11:50 - 12:10 pm Moving beyond the box: automating the digitisation of insect collections. (recording) Pieter Holtzhausen, Stéfan van der Walt, Alice Heaton, Laurence Livermore, Vladimir Blagoderov, Ben Price, Lawrence Hudson, Vincent Smith (This talk begins 14:00 minutes into the recording.)
- 12:10 - 12:30 pm ZooSphere - Development of a software for automated spheric image capturing and interactive 3D visualization of biological collection objects. (recording) Martin Pluta, Falko Glöckler, Alexander Kroupa, Bernhard Schurian
- 2:00 - 2:20 pm Capturing Inventory level information about collections as a step in object to image to data workflows. (recording) Paul J Morris, James Hanken, David Lowery, Bertram Ludäscher, James A. Macklin, Robert A Morris, Tianhong Song, Patrick Sweeney
- 2:20 - 2:40 pm Data Discovery and Doer Happiness: Uses for Optical Character Recognition (OCR) Output. (recording) Deborah Paul, Andrea Matsunaga, Miao Chen, Jason Best, Sylvia Orli, William Ulate, Reed Beaman
- 2:40 - 3:00 pm Enriching the legacy literature with OCR corrections and text-mined semantic metadata. (recording mp4) Riza Batista-Navarro, Aminul Islam, William Ulate, Jennifer Hammock, Axel Soto, Sophia Ananiadou, Evangelos Milios
- 3:00 - 3:20 pm Managing Digitization Projects with Biospex. (recording) Greg Riccardi, Austin Mast, Elizabeth Ellwood, Robert Bruhn, Jeremy Spinks (Note: the recording opens with discussion from the previous talk. This talk begins at the 1 min 40 sec mark and is followed by discussion.)
- Optical character recognition (OCR) in linking entomological labels with field notebook data. Tero Mononen, Riitta Tegelberg, Janne Karppinen, Mira Sääskilahti, Hannu Saarenmaa, Tommi Koskinen, Jyrki Muona (not recorded)
- 4:20 - 4:40 pm What do you do when your Network Manager tells you there is no more space and they mean it? (recording) Sharon Grant, Kate Webbink, Marc Lambruschi, Mike Yoshida
- 4:40 - 5:00 pm ENVIRONMENTS-EOL: identification of Environment Ontology terms in text and the annotation of the Encyclopedia of Life. (recording) Evangelos Pafilis, Sune Frankild, Lucia Fanini, Sarah Faulwetter, Christina Pavloudi, Julia Schnetzer, Aikaterini Vasileiadou, Umer Ijaz, Christos Arvanitidis, Robert Stevenson, Lars Juhl Jensen (The talk begins 3 min 30 sec into the recording. Audio only; see the slides below.)
- 5:00 - 5:20 pm Case study of reuse of digitised content by creative industry in games: Europeana Creative. (recording) Jiri Frank
Presentation PowerPoints and PDFs
Monday, 27 October 2014
- Discovery and access to digitisation tools and methods. Elspeth M Haston, Robert Cubey
- The Open Drawer Project - Providing free access to high resolution images of entomological collection drawers. (pdf) Alexander Kroupa, Falko Glöckler, Bernhard Schurian, Felix Maier, Stefan Schmidt, Gregor Hagedorn, Christoph Häuser
- StanDAP-Herb develops a standard process for extracting metadata from digitised herbarium specimens. (pdf) Agnes Kirchhoff, Walter G. Berendsohn, Ulrich Bügel, Fernando Chaves, Cailin Guan, Markus Lindhorst, Dominik Röpert, Eduard Santamaria, Karl-Heinz Steinke, Hangyan Zheng
- Moving beyond the box: automating the digitisation of insect collections. (ppt) Pieter Holtzhausen, Stéfan van der Walt, Alice Heaton, Laurence Livermore, Vladimir Blagoderov, Ben Price, Lawrence Hudson, Vincent Smith
- ZooSphere - Development of a software for automated spheric image capturing and interactive 3D visualization of biological collection objects. (pdf) Martin Pluta, Falko Glöckler, Alexander Kroupa, Bernhard Schurian
- Capturing Inventory level information about collections as a step in object to image to data workflows. (pdf) Paul J Morris, James Hanken, David Lowery, Bertram Ludäscher, James A. Macklin, Robert A Morris, Tianhong Song, Patrick Sweeney
- Data Discovery and Doer Happiness: Uses for Optical Character Recognition (OCR) Output. (pptx) Deborah Paul, Andrea Matsunaga, Miao Chen, Jason Best, Sylvia Orli, William Ulate, Reed Beaman
- Enriching the legacy literature with OCR corrections and text-mined semantic metadata. (pptx) Riza Batista-Navarro, Aminul Islam, William Ulate, Jennifer Hammock, Axel Soto, Sophia Ananiadou, Evangelos Milios
- Managing Digitization Projects with Biospex. (pptx) Greg Riccardi, Austin Mast, Elizabeth Ellwood, Robert Bruhn, Jeremy Spinks
- Optical character recognition (OCR) in linking entomological labels with field notebook data. (pdf) Tero Mononen, Riitta Tegelberg, Janne Karppinen, Mira Sääskilahti, Hannu Saarenmaa, Tommi Koskinen, Jyrki Muona
- What do you do when your Network Manager tells you there is no more space and they mean it? (pptx) Sharon Grant, Kate Webbink, Marc Lambruschi, Mike Yoshida
- ENVIRONMENTS-EOL: identification of Environment Ontology terms in text and the annotation of the Encyclopedia of Life. (pdf) Evangelos Pafilis, Sune Frankild, Lucia Fanini, Sarah Faulwetter, Christina Pavloudi, Julia Schnetzer, Aikaterini Vasileiadou, Umer Ijaz, Christos Arvanitidis, Robert Stevenson, Lars Juhl Jensen
- Case study of reuse of digitised content by creative industry in games: Europeana Creative. (pdf) Jiri Frank
