Known OCR, ML, NLP Issues
Revision as of 22:46, 10 January 2013
Specific Issues Needing Work
This page is an ongoing list of known topics where work is needed to improve OCR output, overall parsing results, and the creation of meaningful data sets for digitization and human-in-the-loop data transcription.
Please add to the list.
- how to get OCR to ignore a map (reduce OCR confusion)
- ... and ___ present a challenge and confuse OCR and parsing.
- figure out an algorithm that would separate images into three sets: no handwriting, little handwriting (mostly typed or printed text), and lots of handwriting
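As a starting point for the handwriting item above, here is a minimal sketch of the final sorting step only. It assumes some upstream detector (not specified on this page) has already counted, per image, how many text regions look handwritten and how many text regions there are in total. The names `handwriting_bin`, `separate_images`, and the cut-off `LITTLE_MAX` are hypothetical, and the threshold value is a guess that would need tuning on real images.

```python
# Hypothetical cut-off (an assumption, not a measured value):
# images with at most this fraction of handwritten regions count as "little".
LITTLE_MAX = 0.25


def handwriting_bin(handwritten_regions, total_regions):
    """Return 'none', 'little', or 'lots' for a single image."""
    if total_regions <= 0 or handwritten_regions <= 0:
        return "none"
    fraction = handwritten_regions / total_regions
    return "little" if fraction <= LITTLE_MAX else "lots"


def separate_images(region_counts):
    """Group image names into the three sets.

    region_counts maps an image name to a (handwritten, total) pair
    produced by whatever handwriting detector is eventually chosen.
    """
    sets = {"none": [], "little": [], "lots": []}
    for name, (handwritten, total) in sorted(region_counts.items()):
        sets[handwriting_bin(handwritten, total)].append(name)
    return sets
```

The hard part, deciding which regions are handwritten, is deliberately left open here; this only shows how per-image counts could be turned into the three sets once such a detector exists.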
Back to the Hackathon Wiki