OCRRelative

Ray Smith, Daria Antonova, and Dar-Shyang Lee. 2009. Adapting the Tesseract open source OCR engine for multilingual OCR. In Proceedings of the International Workshop on Multilingual OCR (MOCR '09). Association for Computing Machinery, New York, NY, USA, Article 1, 1–8. DOI:https://doi.org/10.1145/1577802.1577804


Paper: Tesseract Open Source OCR Engine

Summary

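  • Open questions: how should words recognized with low confidence be discarded in a principled way, and how can knowledge-graph information be incorporated into the recognition pipeline?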
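
As a starting point for the first open question, the simplest baseline is a fixed confidence threshold. The sketch below is only an illustration, not the method from the paper: it assumes the OCR engine reports a per-word confidence on a 0 to 100 scale (Tesseract's TSV output does this, using -1 for non-word rows). The function name `filter_low_confidence`, the threshold of 60, and the sample data are all hypothetical.

```python
from typing import List, Tuple


def filter_low_confidence(
    words: List[Tuple[str, float]], min_conf: float = 60.0
) -> List[str]:
    """Keep only non-empty words whose confidence is at least min_conf.

    `words` is a list of (text, confidence) pairs. The 60.0 default is an
    arbitrary assumption and would need tuning on real data.
    """
    return [text for text, conf in words if text.strip() and conf >= min_conf]


# Hypothetical OCR output, made up for illustration only.
sample = [("Tesseract", 96.0), ("0CR", 41.0), ("engine", 88.0), ("", -1.0)]

kept = filter_low_confidence(sample)
print(kept)
```

A fixed threshold discards a word regardless of context, which is likely why the question is still open: a dictionary, language model, or knowledge graph could be used to rescue low-confidence words that are plausible in context.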

https://lddpicture.oss-cn-beijing.aliyuncs.com/picture/image-20210203154201326.png

https://lddpicture.oss-cn-beijing.aliyuncs.com/picture/image-20210203154754219.png

https://lddpicture.oss-cn-beijing.aliyuncs.com/picture/image-20210203160214969.png

1. U-Net document image denoising

https://lddpicture.oss-cn-beijing.aliyuncs.com/picture/20200830094005.png

2. Open-source tools
