Some paper and dataset links, collected from:
- [1] wanghaisheng/awesome-ocr
- [2] kba/awesome-ocr
- [3] chongyangtao/Awesome-Scene-Text-Recognition
- [4] whitelok/image-text-localization-recognition
- [5] Text detection and recognition resources
- [6] OCR material
- [7] handong1587
- [8] hs105/Deep-Learning-for-OCR
- [9] Collected materials on text detection and recognition
You can also visit the ICDAR website and find strong OCR models in the "Ranking Table" on each competition's results page.
- 【Synthetic data】T. E. de Campos, B. R. Babu, and M. Varma. Character recognition in natural images. In VISAPP, 2009
- Epshtein B, Ofek E, Wexler Y. Detecting text in natural scenes with stroke width transform[C]//Computer Vision and Pattern Recognition (CVPR), 2010 IEEE Conference on. IEEE, 2010: 2963-2970.
code:[code]
- Rusinol M, Aldavert D, Toledo R, et al. Browsing heterogeneous document collections by a segmentation-free word spotting method[C]//Document Analysis and Recognition (ICDAR), 2011 International Conference on. IEEE, 2011: 63-67.
- Neumann L, Matas J. Text localization in real-world images using efficiently pruned exhaustive search[C]//Document Analysis and Recognition (ICDAR), 2011 International Conference on. IEEE, 2011: 687-691.
- 【Synthetic data】Wang T, Wu D J, Coates A, et al. End-to-end text recognition with convolutional neural networks[C]//Pattern Recognition (ICPR), 2012 21st International Conference on. IEEE, 2012: 3304-3308.
code:[code]
- Elagouni K, Garcia C, Mamalet F, et al. Text recognition in videos using a recurrent connectionist approach[C]//International Conference on Artificial Neural Networks. Springer, Berlin, Heidelberg, 2012: 172-179.
- Frinken V, Fischer A, Manmatha R, et al. A novel word spotting method based on recurrent neural networks[J]. IEEE transactions on pattern analysis and machine intelligence, 2012, 34(2): 211-224.
- Neumann L, Matas J. Real-time scene text localization and recognition[C]//Computer Vision and Pattern Recognition (CVPR), 2012 IEEE Conference on. IEEE, 2012: 3538-3545.
code:[code]
- Mishra A, Alahari K, Jawahar C V. Top-down and bottom-up cues for scene text recognition[C]//Computer Vision and Pattern Recognition (CVPR), 2012 IEEE Conference on. IEEE, 2012: 2687-2694.
- Yin X C, Yin X, Huang K, et al. Robust text detection in natural scene images[J]. IEEE transactions on pattern analysis and machine intelligence, 2014, 36(5): 970-983.
- Bissacco A, Cummins M, Netzer Y, et al. Photoocr: Reading text in uncontrolled conditions[C]//Proceedings of the IEEE International Conference on Computer Vision. 2013: 785-792.
- Breuel T M, Ul-Hasan A, Al-Azawi M A, et al. High-performance OCR for printed English and Fraktur using LSTM networks[C]//Document Analysis and Recognition (ICDAR), 2013 12th International Conference on. IEEE, 2013: 683-687.
code:[code]
- Milyaev S, Barinova O, Novikova T, et al. Image binarization for end-to-end text understanding in natural images[C]//Document Analysis and Recognition (ICDAR), 2013 12th International Conference on. IEEE, 2013: 128-132.
- Neumann L, Matas J. On combining multiple segmentations in scene text recognition[C]//Document Analysis and Recognition (ICDAR), 2013 12th International Conference on. IEEE, 2013: 523-527.
- Koo H I, Kim D H. Scene text detection via connected component clustering and nontext filtering[J]. IEEE transactions on image processing, 2013, 22(6): 2296-2305.
- Shi C, Wang C, Xiao B, et al. Scene text recognition using part-based tree-structured character detection[C]//Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition. 2013: 2961-2968.
- Halima M B, Karray H, Alimi A M. Arabic text recognition in video sequences[J]. arXiv preprint arXiv:1308.3243, 2013.
- Zaghden N, Khelifi B, Alimi A M, et al. Text Recognition in both ancient and cartographic documents[J]. arXiv preprint arXiv:1308.6309, 2013.
- Alsharif O, Pineau J. End-to-end text recognition with hybrid HMM maxout models[J]. arXiv preprint arXiv:1310.1811, 2013.
- Louradour J, Kermorvant C. Curriculum learning for handwritten text line recognition[C]//Document Analysis Systems (DAS), 2014 11th IAPR International Workshop on. IEEE, 2014: 56-60.
- Goodfellow I J, Bulatov Y, Ibarz J, et al. Multi-digit number recognition from street view imagery using deep convolutional neural networks[J]. arXiv preprint arXiv:1312.6082, 2013.
- Bušta M, Drtina T, Helekal D, et al. Efficient character skew rectification in scene text images[C]//Asian Conference on Computer Vision. Springer, Cham, 2014: 134-146.
- Almazán J, Gordo A, Fornés A, et al. Word spotting and recognition with embedded attributes[J]. IEEE transactions on pattern analysis and machine intelligence, 2014, 36(12): 2552-2566.
code:[code]
- Jaderberg M, Vedaldi A, Zisserman A. Deep features for text spotting[C]//European conference on computer vision. Springer, Cham, 2014: 512-528.
code:[code]
- Bluche T, Ney H, Kermorvant C. A comparison of sequence-trained deep neural networks and recurrent neural networks optical modeling for handwriting recognition[C]//International Conference on Statistical Language and Speech Processing. Springer, Cham, 2014: 199-210.
- Yao C, Bai X, Liu W. A unified framework for multioriented text detection and recognition[J]. IEEE Transactions on Image Processing, 2014, 23(11): 4737-4749.
- Huang W, Qiao Y, Tang X. Robust scene text detection with convolution neural network induced mser trees[C]//European Conference on Computer Vision. Springer, Cham, 2014: 497-511.
- Bhowmick S, Banerjee P. Bangla text recognition from video sequence: A new focus[J]. arXiv preprint arXiv:1401.1190, 2014.
- 【Synthetic data】Jaderberg M, Simonyan K, Vedaldi A, et al. Synthetic data and artificial neural networks for natural scene text recognition[J]. arXiv preprint arXiv:1406.2227, 2014.
code:[model;official website]
- Jaderberg M, Simonyan K, Vedaldi A, et al. Reading text in the wild with convolutional neural networks[J]. International Journal of Computer Vision, 2016, 116(1): 1-20.
official website:[official website]
- Jaderberg M, Simonyan K, Vedaldi A, et al. Deep structured output learning for unconstrained text recognition[J]. arXiv preprint arXiv:1412.5903, 2014.
- Ye Q, Doermann D. Text detection and recognition in imagery: A survey[J]. IEEE transactions on pattern analysis and machine intelligence, 2015, 37(7): 1480-1500.
- Jaderberg M. Deep learning for text spotting[D]. University of Oxford, 2015.
- Ren X, Chen K, Yang X, et al. A new unsupervised convolutional neural network model for Chinese scene text detection[C]//Signal and Information Processing (ChinaSIP), 2015 IEEE China Summit and International Conference on. IEEE, 2015: 428-432.
- Wang Z, Yang J, Jin H, et al. Deepfont: Identify your font from an image[C]//Proceedings of the 23rd ACM international conference on Multimedia. ACM, 2015: 451-459.
- Gomez L, Karatzas D. Object proposals for text extraction in the wild[C]//Document Analysis and Recognition (ICDAR), 2015 13th International Conference on. IEEE, 2015: 206-210.[code]
- Shi B, Yao C, Zhang C, et al. Automatic script identification in the wild[C]//Document Analysis and Recognition (ICDAR), 2015 13th International Conference on. IEEE, 2015: 531-535.
- Busta M, Neumann L, Matas J. Fastext: Efficient unconstrained scene text detector[C]//Proceedings of the IEEE International Conference on Computer Vision. 2015: 1206-1214.[code]
- Zhang Z, Shen W, Yao C, et al. Symmetry-based text line detection in natural scenes[C]//Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition. 2015: 2558-2567.
code:[code]
- Ray A, Rajeswar S, Chaudhury S. A hypothesize-and-verify framework for text recognition using deep recurrent neural networks[C]//Document Analysis and Recognition (ICDAR), 2015 13th International Conference on. IEEE, 2015: 936-940.
- Neumann L, Matas J. Efficient scene text localization and recognition with local character refinement[C]//Document Analysis and Recognition (ICDAR), 2015 13th International Conference on. IEEE, 2015: 746-750.
- Visin F, Kastner K, Cho K, et al. Renet: A recurrent neural network based alternative to convolutional networks[J]. arXiv preprint arXiv:1505.00393, 2015.
- Zhong Z, Jin L, Xie Z. High performance offline handwritten chinese character recognition using googlenet and directional feature maps[C]//Document Analysis and Recognition (ICDAR), 2015 13th International Conference on. IEEE, 2015: 846-850.
code:[code]
- 【CRNN】Shi B, Bai X, Yao C. An end-to-end trainable neural network for image-based sequence recognition and its application to scene text recognition[J]. IEEE transactions on pattern analysis and machine intelligence, 2017, 39(11): 2298-2304.
code:【1 - official】; 【2 - crnn.pytorch】; 【3 - unfinished】; 【4 - crnn.pytorch-chinese】; 【5 - crnn+stn-tf】; 【6 - lstm+ctc】; 【7 - ctpn+crnn-merge-cannot-train】; 【8 - crnn-mnist-keras】; 【9 - crnn-tf】; 【10 - crnn-tf-could-be-better】; 【11 - crnn.mxnet】; 【12 - crnn-tf-estimators】; 【13 - crnn-attention-tf】; 【14 - crnn.caffe】; 【15 - chinese.ocr-ctpn+crnn-tf+pytorch】; 【16 - another.crnn-attentive pooling】; 【17 - crnn-tf-music】; 【18 - crnn-tf-developing】; 【19 - crnn-torch】; 【20 - crnn-tf-developing】; 【21 - chinese-ocr-keras】; 【22 - crnn-tf-developing】; 【23 - ctpn+crnn-cannot-train-7】; 【24 - crnn-pytorch】; 【25 - cnn+lstm+ctc-tf】; 【26 - crnn-tf-resnet】; 【27 - caffe_ocr】
- He T, Huang W, Qiao Y, et al. Text-attentional convolutional neural network for scene text detection[J]. IEEE transactions on image processing, 2016, 25(6): 2529-2541.
- Sahu D K, Sukhwani M. Sequence to sequence learning for optical character recognition[J]. arXiv preprint arXiv:1511.04176, 2015.
- Hosseini-Asl E, Guha A. Similarity-based Text Recognition by Deeply Supervised Siamese Network[J]. arXiv preprint arXiv:1511.04397, 2015.
- Wang D H, Wang H, Zhang D, et al. Robust Scene Text Recognition Using Sparse Coding based Features[J]. arXiv preprint arXiv:1512.08669, 2015.
- Yin X C, Zuo Z Y, Tian S, et al. Text detection, tracking and recognition in video: a comprehensive survey[J]. IEEE Transactions on Image Processing, 2016, 25(6): 2752-2773.
- Zhu Y, Yao C, Bai X. Scene text detection and recognition: Recent advances and future trends[J]. Frontiers of Computer Science, 2016, 10(1): 19-36.
- He P, Huang W, Qiao Y, et al. Reading Scene Text in Deep Convolutional Sequences[C]//AAAI. 2016: 3501-3508.
code:[code]
- Lee C Y, Osindero S. Recursive recurrent nets with attention modeling for OCR in the wild[C]//Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition. 2016: 2231-2239.
- 【Synthetic data】Gupta A, Vedaldi A, Zisserman A. Synthetic data for text localisation in natural images[C]//Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition. 2016: 2315-2324.
code:[official;vgg;other]
- Sivakorn S, Polakis J, Keromytis A D. I’m not a human: Breaking the Google reCAPTCHA[J]. Black Hat,(i), 2016: 1-12.
- Sivakorn S, Polakis I, Keromytis A D. I am robot:(deep) learning to break semantic image captchas[C]//Security and Privacy (EuroS&P), 2016 IEEE European Symposium on. IEEE, 2016: 388-403.
- Neumann L, Matas J. Real-time lexicon-free scene text localization and recognition[J]. IEEE transactions on pattern analysis and machine intelligence, 2016, 38(9): 1872-1885.
- Zhang Z, Zhang C, Shen W, et al. Multi-oriented text detection with fully convolutional networks[C]//Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition. 2016: 4159-4167.
- Fabrizio J, Robert-Seidowsky M, Dubuisson S, et al. TextCatcher: a method to detect curved and challenging text in natural scenes[J]. International Journal on Document Analysis and Recognition (IJDAR), 2016, 19(2): 99-117.
- Cho H, Sung M, Jun B. Canny text detector: Fast and robust scene text localization algorithm[C]//Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition. 2016: 3566-3573.
- Qiang G, Dan T, Guohui L, et al. Memory Matters: Convolutional Recurrent Neural Network for Scene Text Recognition[J]. arXiv preprint arXiv:1601.01100, 2016.
- Mishra A, Alahari K, Jawahar C V. Enhancing energy minimization framework for scene text recognition with top-down cues[J]. Computer Vision and Image Understanding, 2016, 145: 30-42.
- Li H, Shen C. Reading car license plates using deep convolutional neural networks and lstms[J]. arXiv preprint arXiv:1601.05610, 2016.
- Veit A, Matera T, Neumann L, et al. Coco-text: Dataset and benchmark for text detection and recognition in natural images[J]. arXiv preprint arXiv:1601.07140, 2016.
- Huang W. Context modeling for semantic text matching and scene text detection[M]. The Pennsylvania State University, 2016.
- Tian S, Pei W Y, Zuo Z Y, et al. Scene Text Detection in Video by Learning Locally and Globally[C]//IJCAI. 2016: 2647-2653.
- Shi B, Wang X, Lyu P, et al. Robust scene text recognition with automatic rectification[C]//Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition. 2016: 4168-4176.
- Shuye Zhang, Mude Lin, Tianshui Chen, Lianwen Jin, Liang Lin. Character Proposal Network for Robust Text Extraction. arXiv preprint arXiv:1602.04348, 2016.
- Lluis Gomez, Dimosthenis Karatzas. A fine-grained approach to scene text script identification. arXiv preprint arXiv:1602.07475, 2016.
- Lluis Gomez, Anguelos Nicolaou, Dimosthenis Karatzas. Improving patch-based scene text script identification with ensembles of conjoined networks. arXiv preprint arXiv:1602.07480, 2016.
- He T, Huang W, Qiao Y, et al. Accurate text localization in natural image with cascaded convolutional text network[J]. arXiv preprint arXiv:1603.09423, 2016.
- Hafemann L G, Sabourin R, Oliveira L S. Writer-independent feature learning for offline signature verification using deep convolutional neural networks[C]//Neural Networks (IJCNN), 2016 International Joint Conference on. IEEE, 2016: 2576-2583.
- Ren X, Chen K, Sun J. A CNN Based Scene Chinese Text Recognition Algorithm With Synthetic Data Engine[J]. arXiv preprint arXiv:1604.01891, 2016.
- Xiaohang Ren, Kai Chen, Jun Sun. A Novel Scene Text Detection Algorithm Based On Convolutional Neural Network. arXiv preprint arXiv:1604.01894, 2016.
- Gómez L, Karatzas D. Textproposals: a text-specific selective search algorithm for word spotting in the wild[J]. Pattern Recognition, 2017, 70: 60-74.[code]
- Bluche T, Louradour J, Messina R. Scan, attend and read: End-to-end handwritten paragraph recognition with mdlstm attention[J]. arXiv preprint arXiv:1604.03286, 2016.
- Zheng Zhang, Chengquan Zhang, Wei Shen, Cong Yao, Wenyu Liu, Xiang Bai. Multi-Oriented Text Detection with Fully Convolutional Networks. arXiv preprint arXiv:1604.04018, 2016.
- Xie Z, Sun Z, Jin L, et al. Fully convolutional recurrent network for handwritten Chinese text recognition[C]//Pattern Recognition (ICPR), 2016 23rd International Conference on. IEEE, 2016: 4011-4016.
- Shangxuan Tian, Yifeng Pan, Chang Huang, Shijian Lu, Kai Yu, Chew Lim Tan. Text Flow: A Unified Text Detection System in Natural Scene Images. arXiv preprint arXiv:1604.06877, 2016.
- Zhong Z, Jin L, Zhang S, et al. Deeptext: A unified framework for text proposal generation and text detection in natural images[J]. arXiv preprint arXiv:1605.07314, 2016.
- Zhang X Y, Yin F, Zhang Y M, et al. Drawing and recognizing chinese characters with recurrent neural network[J]. IEEE Transactions on Pattern Analysis and Machine Intelligence, 2017.
- Yao C, Bai X, Sang N, et al. Scene text detection via holistic, multi-channel prediction[J]. arXiv preprint arXiv:1606.09002, 2016.
- Hassanien A M A. Sequence to sequence learning for unconstrained scene text recognition[J]. arXiv preprint arXiv:1607.06125, 2016.
- Nitigya Sambyal, Pawanesh Abrol. Automatic text extraction and character segmentation using maximally stable extremal regions. arXiv preprint arXiv:1608.03374, 2016.
- 【Synthetic data】 Krishnan P, Jawahar C V. Generating Synthetic Data for Text Recognition[J]. arXiv preprint arXiv:1608.04224, 2016.
- 【CTPN】Tian Z, Huang W, He T, et al. Detecting text in natural image with connectionist text proposal network[C]//European Conference on Computer Vision. Springer International Publishing, 2016: 56-72.
code:[code;cuda8-caffe;official;ocr_detection_ctpn;keras_ocr]
dataset:[ICDAR 2011; ICDAR 2013; ICDAR 2015; SWT; Multilingual dataset]
- Xie Z, Sun Z, Jin L, et al. Learning spatial-semantic context with fully convolutional recurrent network for online handwritten chinese text recognition[J]. IEEE transactions on pattern analysis and machine intelligence, 2017.
- Hu B, Liu X, Wu X, et al. Stroke Sequence-Dependent Deep Convolutional Neural Network for Online Handwritten Chinese Character Recognition[J]. arXiv preprint arXiv:1610.04057, 2016.
- Ahmed Ibrahim, A. Lynn Abbott, Mohamed E. Hussein. An Image Dataset of Text Patches in Everyday Scenes. arXiv preprint arXiv:1610.06494, 2016.
- Lou X, Kansky K, Lehrach W, et al. Generative Shape Models: Joint Text Recognition and Segmentation with Very Little Training Data[C]//Advances in Neural Information Processing Systems. 2016: 2793-2801.
- Xu Y, Shan S, Qiu Z, et al. End-to-End Subtitle Detection and Recognition for Videos in East Asian Languages via CNN Ensemble with Near-Human-Level Performance[J]. arXiv preprint arXiv:1611.06159, 2016.
- Chengzhe Yan, Jie Hu, Changshui Zhang. A DNN Framework For Text Image Rectification From Planar Transformations. arXiv preprint arXiv:1611.04298, 2016.
- Minghui Liao, Baoguang Shi, Xiang Bai, Xinggang Wang, Wenyu Liu. TextBoxes: A Fast Text Detector with a Single Deep Neural Network. arXiv preprint arXiv:1611.06779, 2016.
- Jie Mei, Aminul Islam, Yajing Wu, Abidalrahman Moh'd, Evangelos E. Milios. Statistical Learning for OCR Text Correction. arXiv preprint arXiv:1611.06950, 2016.
- Yang X, He D, Huang W, et al. Smart Library: Identifying Books in a Library using Richly Supervised Deep Scene Text Reading[J]. arXiv preprint arXiv:1611.07385, 2016.
- Junnan Yu, Xuna Ma, Ting Han. Usability Investigation on the Localization of Text CAPTCHAs: Take Chinese Characters as a Case Study. arXiv preprint arXiv:1612.01070, 2016.
- Singh Vijendra, Nisha Vasudeva, Hem Jyotsana Parashar. Recognition of Text Image Using Multilayer Perceptron. arXiv preprint arXiv:1612.00625, 2016.
- Zichuan Liu, Yixing Li, Fengbo Ren, Hao Yu. A Binary Convolutional Encoder-decoder Network for Real-time Natural Scene Text Processing. arXiv preprint arXiv:1612.03630, 2016.
- Raj D, Sahu S, Anand A. Learning local and global contexts using a convolutional recurrent network model for relation classification in biomedical text[C]//Proceedings of the 21st Conference on Computational Natural Language Learning (CoNLL 2017). 2017: 311-321.
code:[code]
- Florian Fink, Klaus-U. Schulz, Uwe Springmann. Profiling of OCR'ed Historical Texts Revisited. arXiv preprint arXiv:1701.05377, 2017.
- Cheang T K, Chong Y S, Tay Y H. Segmentation-free Vehicle License Plate Recognition using ConvNet-RNN[J]. arXiv preprint arXiv:1701.06439, 2017.
- Shahin A A. Printed Arabic Text Recognition using Linear and Nonlinear Regression[J]. arXiv preprint arXiv:1702.01444, 2017.
- Smith R, Gu C, Lee D S, et al. End-to-end interpretation of the french street name signs dataset[C]//European Conference on Computer Vision. Springer International Publishing, 2016: 411-426.
code:[code]
- Bazazian D, Gomez R, Nicolaou A, et al. Improving Text Proposals for Scene Images with Fully Convolutional Networks[J]. arXiv preprint arXiv:1702.05089, 2017.
- 【synthetic Captcha】Le T A, Baydin A G, Zinkov R, et al. Using Synthetic Data to Train Neural Networks is Model-Based Reasoning[J]. arXiv preprint arXiv:1703.00868, 2017.
- Jianqi Ma, Weiyuan Shao, Hao Ye, Li Wang, Hong Wang, Yingbin Zheng, Xiangyang Xue. Arbitrary-Oriented Scene Text Detection via Rotation Proposals. arXiv preprint arXiv:1703.01086, 2017.
- Liu Y, Jin L. Deep matching prior network: Toward tighter multi-oriented text detection[J]. arXiv preprint arXiv:1703.01425, 2017.
- Shi B, Bai X, Belongie S. Detecting Oriented Text in Natural Images by Linking Segments[J]. arXiv preprint arXiv:1703.06520, 2017.
code:[code]
- Masood S Z, Shu G, Dehghan A, et al. License Plate Detection and Recognition Using Deeply Learned Convolutional Neural Networks[J]. arXiv preprint arXiv:1703.07330, 2017.
- Liao M, Shi B, Bai X, et al. TextBoxes: A Fast Text Detector with a Single Deep Neural Network[C]//AAAI. 2017: 4161-4167.
code:[code;code]
- He W, Zhang X Y, Yin F, et al. Deep Direct Regression for Multi-Oriented Scene Text Detection[J]. arXiv preprint arXiv:1703.08289, 2017.
- Qin S, Manduchi R. Cascaded Segmentation-Detection Networks for Word-Level Text Spotting[J]. arXiv preprint arXiv:1704.00834, 2017.
- Zhou X, Yao C, Wen H, et al. EAST: An Efficient and Accurate Scene Text Detector[J]. arXiv preprint arXiv:1704.03155, 2017.
code:[code]
- Wojna Z, Gorban A, Lee D S, et al. Attention-based Extraction of Structured Information from Street View Imagery[J]. arXiv preprint arXiv:1704.03549, 2017.
code:[official;similar]
- Moysset B, Kermorvant C, Wolf C. Full-Page Text Recognition: Learning Where to Start and When to Stop[J]. arXiv preprint arXiv:1704.08628, 2017.
- Nakamura T, Zhu A, Yanai K, et al. Scene Text Eraser[J]. arXiv preprint arXiv:1705.02772, 2017.
- Xiao X, Yang Y, Ahmad T, et al. Design of a Very Compact CNN Classifier for Online Handwritten Chinese Character Recognition Using DropWeight and Global Pooling[J]. arXiv preprint arXiv:1705.05207, 2017.
- Polzounov A, Ablavatski A, Escalera S, et al. WordFence: Text Detection in Natural Images with Border Awareness[J]. arXiv preprint arXiv:1705.05483, 2017.
- Ghosh S K, Valveny E, Bagdanov A D. Visual attention models for scene text recognition[J]. arXiv preprint arXiv:1706.01487, 2017.
- Lyu P, Bai X, Yao C, et al. Auto-Encoder Guided GAN for Chinese Calligraphy Synthesis[J]. arXiv preprint arXiv:1706.04041, 2017.
- Shervin Minaee, Yao Wang. Text Extraction From Texture Images Using Masked Signal Decomposition. arXiv preprint arXiv:1706.08789, 2017.
- Jiang Y, Zhu X, Wang X, et al. R2CNN: Rotational Region CNN for Orientation Robust Scene Text Detection[J]. arXiv preprint arXiv:1706.09579, 2017.
- Ghosh S, Valveny E. R-PHOC: Segmentation-Free Word Spotting using CNN[J]. arXiv preprint arXiv:1707.01294, 2017.
- Wang X, You M, Shen C. Adversarial generation of training examples for vehicle license plate recognition[J]. arXiv preprint arXiv:1707.03124, 2017.
- Li H, Wang P, Shen C. Towards End-to-end Text Spotting with Convolutional Recurrent Neural Networks[J]. arXiv preprint arXiv:1707.03985, 2017.
- Aneeshan Sain, Ayan Kumar Bhunia, Partha Pratim Roy, Umapada Pal. Multi-Oriented Text Detection and Verification in Video Frames and Scene Images. arXiv preprint arXiv:1707.07150, 2017.
- Bhunia A K, Kumar G, Roy P P, et al. Text recognition in scene image and video frame using Color Channel selection[J]. Multimedia Tools and Applications, 2017: 1-28.
- Partha Pratim Roy, Ayan Kumar Bhunia, Umapada Pal. Date-Field Retrieval in Scene Image and Video Frames using Text Enhancement and Shape Coding. arXiv preprint arXiv:1707.06833, 2017.
- Bartz C, Yang H, Meinel C. STN-OCR: A single Neural Network for Text Detection and Text Recognition[J]. arXiv preprint arXiv:1707.08831, 2017.
code:[code]
- Jiang F, Hao Z, Liu X. Deep Scene Text Detection with Connected Component Proposals[J]. arXiv preprint arXiv:1708.05133, 2017.
- Amarnath R, P. Nagabhushan. Spotting Separator Points at Line Terminals in Compressed Document Images for Text-line Segmentation. arXiv preprint arXiv:1708.05545, 2017.
- P. Shivakumara, D. S. Guru, H.T. Basavaraju. Color and Gradient Features for Text Segmentation from Video Frames. arXiv preprint arXiv:1708.06561, 2017.
- Hu H, Zhang C, Luo Y, et al. Wordsup: Exploiting word annotations for character based text detection[C]//Proceedings of the IEEE International Conference on Computer Vision. 2017.
- He P, Huang W, He T, et al. Single shot text detector with regional attention[C]//The IEEE International Conference on Computer Vision (ICCV). 2017.
code:[code;code]
- Yin F, Wu Y C, Zhang X Y, et al. Scene Text Recognition with Sliding Convolutional Character Models[J]. arXiv preprint arXiv:1709.01727, 2017.
- Ekta Vats, Anders Hast. On-the-fly Historical Handwritten Text Annotation. arXiv preprint arXiv:1709.01775, 2017.
- Cheng Z, Bai F, Xu Y, et al. Focusing Attention: Towards Accurate Text Recognition in Natural Images[C]//2017 IEEE International Conference on Computer Vision (ICCV). IEEE, 2017: 5086-5094.
- Dai Y, Huang Z, Gao Y, et al. Fused Text Segmentation Networks for Multi-oriented Scene Text Detection[J]. arXiv preprint arXiv:1709.03272, 2017.
- Teresa Nicole Brooks. Exploring Geometric Property Thresholds For Filtering Non-Text Regions In A Connected Component Based Text Detection Application. arXiv preprint arXiv:1709.03548, 2017.
- Yunze Gao, Yingying Chen, Jinqiao Wang, Hanqing Lu. Reading Scene Text with Attention Convolutional Sequence Modeling. arXiv preprint arXiv:1709.04303, 2017.
- Li H, Wang P, Shen C. Towards End-to-End Car License Plates Detection and Recognition with Deep Neural Networks[J]. arXiv preprint arXiv:1709.08828, 2017.
- Kazem Qazanfari, Saeed Shiri. Real time text localization for Indoor Mobile Robot Navigation. arXiv preprint arXiv:1709.09634, 2017.
- Zhan H, Wang Q, Lu Y. Handwritten digit string recognition by combination of residual network and RNN-CTC[C]//International Conference on Neural Information Processing. Springer, Cham, 2017: 583-591.
- Yang C, Yin X C, Li Z, et al. AdaDNNs: Adaptive Ensemble of Deep Neural Networks for Scene Text Recognition[J]. arXiv preprint arXiv:1710.03425, 2017.
- Tian S, Lu S, Li C. WeText: Scene Text Detection under Weak Supervision[J]. arXiv preprint arXiv:1710.04826, 2017.
- Kheng Chng C, Chan C S. Total-Text: A Comprehensive Dataset for Scene Text Detection and Recognition[J]. arXiv preprint arXiv:1710.10400, 2017.
- Jain M, Mathew M, Jawahar C V. Unconstrained scene text and video text recognition for Arabic script[C]//Arabic Script Analysis and Recognition (ASAR), 2017 1st International Workshop on. IEEE, 2017: 26-30.
- Ren H, Wang W. A New Hybrid-parameter Recurrent Neural Networks for Online Handwritten Chinese Character Recognition[J]. arXiv preprint arXiv:1711.02809, 2017.
- Zhu X, Jiang Y, Yang S, et al. Deep Residual Text Detection Network for Scene Text[J]. arXiv preprint arXiv:1711.04147, 2017.
- Cheng Z, Liu X, Bai F, et al. Arbitrarily-Oriented Text Recognition[J]. arXiv preprint arXiv:1711.04226, 2017.
- Zhang S, Liu Y, Jin L, et al. Feature Enhancement Network: A Refined Scene Text Detector[J]. arXiv preprint arXiv:1711.04249, 2017.
- Xing D, Li Z, Chen X, et al. ArbiText: Arbitrary-Oriented Text Detection in Unconstrained Scene[J]. arXiv preprint arXiv:1711.11249, 2017.
- Yuliang L, Lianwen J, Shuaitao Z, et al. Detecting Curve Text in the Wild: New Dataset and New Solution[J]. arXiv preprint arXiv:1712.02170, 2017.
code:[code]
- Jason Poulos, Rafael Valle. Attention networks for image-to-text. arXiv preprint arXiv:1712.04046, 2017.
- Aarushi Agrawal, Prerana Mukherjee, Siddharth Srivastava, Brejesh Lall. Enhanced Characterness for Text Detection in the Wild. arXiv preprint arXiv:1712.04927, 2017.
- Bartz C, Yang H, Meinel C. SEE: Towards Semi-Supervised End-to-End Scene Text Recognition[J]. arXiv preprint arXiv:1712.05404, 2017.
- Kang C, Kim G, Yoo S I. Detection and Recognition of Text Embedded in Online Images via Neural Context Models[C]//AAAI. 2017: 4103-4110.
code:[code]
- Busta M, Neumann L, Matas J. Deep TextSpotter: An End-to-End Trainable Scene Text Localization and Recognition Framework[C]//Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition. 2017: 2204-2212. [code]
- Wu Y, Natarajan P. Self-organized Text Detection with Minimal Post-processing via Border Learning[C]//Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition. 2017: 5000-5009.
- Rong X, Yi C, Tian Y. Unambiguous text localization and retrieval for cluttered scenes[C]//2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR). IEEE, 2017: 3279-3287.
- Deng D, Liu H, Li X, et al. PixelLink: Detecting Scene Text via Instance Segmentation[J]. arXiv preprint arXiv:1801.01315, 2018.
- Agnese Chiatti, Mu Jung Cho, Anupriya Gagneja, Xiao Yang, Miriam Brinberg, Katie Roehrick, Sagnik Ray Choudhury, Nilam Ram, Byron Reeves, C. Lee Giles. Text Extraction and Retrieval from Smartphone Screenshots: Building a Repository for Life in Media. arXiv preprint arXiv:1801.01316, 2018.
- Liu X, Liang D, Yan S, et al. FOTS: Fast Oriented Text Spotting with a Unified Network[J]. arXiv preprint arXiv:1801.01671, 2018.
- Liao M, Shi B, Bai X. TextBoxes++: A Single-Shot Oriented Scene Text Detector[J]. arXiv preprint arXiv:1801.02765, 2018.
- Anders Hast, Per Cullhed, Ekta Vats. TexT - Text Extractor Tool for Handwritten Document Transcription and Annotation. arXiv preprint arXiv:1801.05367, 2018.
- Yash Patel, Michal Bušta, Jiri Matas. E2E-MLT - an Unconstrained End-to-End Method for Multi-Language Scene Text. arXiv preprint arXiv:1801.09919, 2018.
- Yixing Zhu, Jun Du. Sliding Line Point Regression for Shape Robust Scene Text Detection. arXiv preprint arXiv:1801.09969, 2018.
- Tobias Grüning, Gundram Leifert, Tobias Strauß, Roger Labahn. A Two-Stage Method for Text Line Detection in Historical Documents. arXiv preprint arXiv:1802.03345, 2018.
- Congzheng Song, Vitaly Shmatikov. Fooling OCR Systems with Adversarial Text Images. arXiv preprint arXiv:1802.05385, 2018.
- Pengyuan Lyu, Cong Yao, Wenhao Wu, Shuicheng Yan, Xiang Bai. Multi-Oriented Scene Text Detection via Corner Localization and Region Segmentation. arXiv preprint arXiv:1802.08948, 2018.
- Tai-Ling Yuan, Zhe Zhu, Kun Xu, Cheng-Jun Li, Shi-Min Hu. Chinese Text in the Wild. arXiv preprint arXiv:1803.00085, 2018.
Three websites maintain dataset lists covering different data types:
1 - www.iapr-tc11.org
2 - tc11.cvc.uab.es
3 - rrc.cvc.uab.es
-
2017 COCO-Text
2017 DeTEXT
2017 DOST
2017 FSNS
2017 MLT
2017 IEHHR
2011-2015 Born-Digital Images
2013-2015 Focused Scene Text
2013-2015 Text in Videos
2015 Incidental Scene Text
-
ICDAR Chinese
2017
- More than 12,000 images, most of them collected in the wild with phone cameras.
- Task: Chinese Text in the Wild.
-
- 32,285 high resolution images, 1,018,402 character instances, 3,850 character categories, 6 kinds of attributes
-
Total-Text
2017
- 1,555 images, 11,459 text instances, including curved text
-
SCUT_FORU_DB_Release
2016
- FORU contains two parts: the Chinese2k and English2k datasets.
-
SynthText in the Wild Dataset
2016
- 800 thousand images, 8 million synthetic word instances.
- Each text instance is annotated with its text-string, word-level and character-level bounding-boxes.
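
The annotation layout described above (a text string plus word-level and character-level boxes) can be pictured with a small data structure. This is only a sketch: the class name, field names, and the axis-aligned `(x_min, y_min, x_max, y_max)` box layout are assumptions made for illustration and are not the dataset's actual file format.

```python
from dataclasses import dataclass, field
from typing import List, Tuple

# Assumed box layout for this sketch: (x_min, y_min, x_max, y_max).
Box = Tuple[float, float, float, float]


@dataclass
class TextInstance:
    """Hypothetical container for one annotated text instance."""
    text: str                      # the transcribed string
    word_box: Box                  # word-level bounding box
    char_boxes: List[Box] = field(default_factory=list)  # one box per character

    def is_consistent(self) -> bool:
        # Expect exactly one character box per non-space character.
        return len(self.char_boxes) == len(self.text.replace(" ", ""))


def enclosing_box(boxes: List[Box]) -> Box:
    """Smallest axis-aligned box that contains every box in `boxes`."""
    x0, y0, x1, y1 = zip(*boxes)
    return (min(x0), min(y0), max(x1), max(y1))
```

A check such as `enclosing_box(inst.char_boxes)` against `inst.word_box` is one simple way to sanity-test annotations after loading them into a structure like this.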
-
COCO-Text (Computer Vision Group, Cornell)
2016
- 63,686 images, 173,589 text instances, 3 fine-grained text attributes.
- Task: text location and recognition
COCO-Text API
-
USTB-SV1k
2014
- 1000 (500 for training and 500 for testing) street view (patch) images from 6 USA cities
-
Synthetic Word Dataset (Oxford, VGG)
2014
- 9 million images covering 90k English words
- Task: text recognition, segmentation
download
-
IIIT 5K-Words
2012
- 5000 images from Scene Texts and born-digital (2k training and 3k testing images)
- Each image is a cropped word image of scene text with case-insensitive labels
- Task: text recognition
download
-
StanfordSynth(Stanford, AI Group)
2012
- Small single-character images of 62 characters (0-9, a-z, A-Z)
- Task: text recognition
download
-
MSRA Text Detection 500 Database (MSRA-TD500)
2012
- 500 natural images (resolutions vary from 1296x864 to 1920x1280)
- Chinese, English or mixture of both
- Task: text detection
-
OSTD
2011
- The download link could not be found.
-
Traffic Guide Panel Text Dataset (TGPT)
2016
- 3841 high-resolution individual images: 2315 contain traffic guide panel level annotations (1911 for training and 404 for testing; all testing images are manually labeled with ground-truth tight text region bounding boxes), and 1526 contain no traffic signs.
-
- 350 high resolution images (average size 1260 × 860) (100 images for training and 250 images for testing)
- Only word level bounding boxes are provided with case-insensitive labels
- Task: text location
-
KAIST Scene_Text Database
2010
- 3000 images of indoor and outdoor scenes containing text
- Korean, English (Number), and Mixed (Korean + English + Number)
- Task: text location, segmentation and recognition
-
Chars74k
2009
- Over 74K images from natural images, as well as a set of synthetically generated characters
- Small single-character images of 62 characters (0-9, a-z, A-Z)
- Task: text recognition
-
ICDAR Benchmark Datasets
Dataset | Description | Competition Paper |
---|---|---|
ICDAR 2015 | 1000 training images and 500 testing images | paper |
ICDAR 2013 | 229 training images and 233 testing images | paper |
ICDAR 2011 | 229 training images and 255 testing images | paper |
ICDAR 2005 | 1001 training images and 489 testing images | paper |
ICDAR 2003 | 181 training images and 251 testing images (word level and character level) | paper |
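
To compare the benchmark sizes in the table, the train/test counts can be put into a small lookup and summarized. The numbers below are copied from the table; the names `ICDAR_SPLITS` and `summarize` are illustrative only and do not come from any ICDAR tooling.

```python
# (training images, testing images) per benchmark, as listed in the table.
ICDAR_SPLITS = {
    "ICDAR 2015": (1000, 500),
    "ICDAR 2013": (229, 233),
    "ICDAR 2011": (229, 255),
    "ICDAR 2005": (1001, 489),
    "ICDAR 2003": (181, 251),
}


def summarize(splits):
    """Return (name, total images, training fraction) for each benchmark."""
    rows = []
    for name, (train, test) in splits.items():
        total = train + test
        rows.append((name, total, round(train / total, 3)))
    return rows


if __name__ == "__main__":
    for name, total, train_frac in summarize(ICDAR_SPLITS):
        print(f"{name}: {total} images, {train_frac:.1%} for training")
```

Most of these sets are small by deep-learning standards, which is one reason the synthetic datasets listed earlier are commonly used for pre-training.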