Text Transcription Issues
Latest revision as of 16:31, 17 January 2013
About Standards for Transcribing Text
- Content here begins with resources put together by Jason Best (thank you, Jason) in an email sent to the AOCR wg on 19 December 2012.
- In our last meeting (18 Dec 2012) we discussed some of the challenges of transcribing text with corrections, alterations, strikeouts, ambiguous letters, etc., and I [Jason Best] briefly mentioned some transcription projects that have dealt with similar issues. A hackathon participant, Ben Brumfield, has much more experience in this topic, so first I'll point you to some information he has compiled. His blog home page (http://manuscripttranscription.blogspot.com) currently has a transcript of his talk about the variety of formats that various projects are using. A worthwhile read.
- If we decide to transcribe or preserve ambiguous or corrected/struck-out characters, then the Text Encoding Initiative (TEI) format might be a good start, though it would require the use of XML elements in angle brackets. A more lightweight approach might be to use one of the wiki-style markup formats, such as:
  - Markdown (http://daringfireball.net/projects/markdown/syntax) or
  - Textile (http://txstyle.org).
- Below I've listed some projects that either establish transcription or markup standards or have published guidelines or suggestions about how to transcribe text.
  - TEI elements for representing primary documents (in particular, errors, corrections, alterations, ambiguity, etc.) - http://www.tei-c.org/release/doc/tei-p5-doc/en/html/PH.html#PHCH
  - FreeReg, register transcription - http://www.freereg.org.uk/howto/transcribe.htm
  - Transcribe Bentham Guidelines (seems to be based on TEI) - http://www.transcribe-bentham.da.ulcc.ac.uk/td/Help:Transcription_Guidelines
  - New York Public Library Menu transcription guidelines - http://menus.nypl.org/help
  - National Archives Transcription tips - http://transcribe.archives.gov/tips
  - Leiden+ notation, used by classicists to mark damage and unclear readings in Greek papyri - http://papyri.info/editor/documentation?docotype=text (In use since the mid-1930s, updated and translated to TEI by the Integrating Digital Papyrology group.)
- Projects that might have additional approaches to transcription:
  - http://scripto.org
  - http://www.uscript.org
  - http://transcriptorium.eu
  - http://t-pen.org
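To make the trade-off between full TEI and a lightweight markup more concrete, here is a minimal Python sketch of how a transcriber-friendly shorthand could be converted to TEI elements after the fact. The TEI elements `<del>`, `<add>` and `<unclear>` are real; the shorthand itself (`~~...~~`, `++...++`, `[?...?]`), the `RULES` table and the `to_tei` function are assumptions made up for illustration and are not taken from Markdown, Textile, or any of the projects listed above.

```python
import re
from xml.sax.saxutils import escape

# Hypothetical shorthand, invented for this sketch only:
#   ~~text~~   struck-out text     -> TEI <del>
#   ++text++   inserted text       -> TEI <add>
#   [?text?]   uncertain reading   -> TEI <unclear>
RULES = [
    (re.compile(r"~~(.+?)~~"), r"<del>\1</del>"),
    (re.compile(r"\+\+(.+?)\+\+"), r"<add>\1</add>"),
    (re.compile(r"\[\?(.+?)\?\]"), r"<unclear>\1</unclear>"),
]


def to_tei(line: str) -> str:
    """Convert one line of shorthand transcription to TEI-style inline XML."""
    # Escape &, < and > first so transcribed text is never read as markup.
    out = escape(line)
    for pattern, replacement in RULES:
        out = pattern.sub(replacement, out)
    return out


if __name__ == "__main__":
    print(to_tei("Coll. ~~J. Smith~~ ++J. Smyth++, 12 [?May?] 1887"))
```

The idea is that transcribers would only ever type the shorthand, and the XML would be generated later, so the angle-bracket burden mentioned above falls on a script rather than on people. A real workflow would also need rules for nesting, illegible gaps, and a full TEI document wrapper, none of which this sketch attempts.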
Back to the 2013 AOCR Hackathon Wiki