[1] Yao C, Bai X, Shi B, et al. Strokelets: a learned multi-scale representation for scene text recognition[C]. Columbus: Computer Vision and Pattern Recognition, 2014.
[2] Jaderberg M, Vedaldi A, Zisserman A. Deep features for text spotting[C]. Cham: Computer Vision, 2014.
[3] Xue Haotian, Yang Jingdong, Tan Kaide. Application of an improved BP neural network in handwriting recognition[J]. Electronic Science and Technology, 2015, 28(5): 20-23.
[4] Xiong Haipeng, Chen Yangyang, Chen Chunwei. Text location in image based on convolution neural network[J]. Electronic Science and Technology, 2018, 31(1): 50-53.
[5] Bahdanau D, Chorowski J, Serdyuk D, et al. End-to-end attention-based large vocabulary speech recognition[C]. Shanghai: The 41st IEEE International Conference on Acoustics, Speech and Signal Processing, 2016.
[6] Luong M T, Pham H, Manning C D. Effective approaches to attention-based neural machine translation[C]. Lisbon: Empirical Methods in Natural Language Processing, 2015.
[7] Qu S, Xi Y, Ding S. Visual attention based on long-short term memory model for image caption generation[C]. Melbourne: Control & Decision Conference, 2017.
[8] Bahdanau D, Cho K, Bengio Y. Neural machine translation by jointly learning to align and translate[C]. San Diego: International Conference on Learning Representations, 2015.
[9] Graves A, Fernández S, Gomez F, et al. Connectionist temporal classification: labelling unsegmented sequence data with recurrent neural networks[C]. Pittsburgh: International Conference on Machine Learning, 2006.
[10] Yin Zheng, Tang Chunhui, Zhang Xuanxiong. Image recognition based on improved sparse auto-encoder[J]. Electronic Science and Technology, 2016, 29(1): 124-127.
[11] Hochreiter S, Schmidhuber J. Long short-term memory[J]. Neural Computation, 1997, 9(8): 1735-1780.
doi: 10.1162/neco.1997.9.8.1735
pmid: 9377276
[12] Lucas S M, Panaretos A, Sosa L, et al. ICDAR 2003 robust reading competitions[J]. Proceedings of the ICDAR, 2003, 7(2-3): 105-122.
[13] Ioffe S, Szegedy C. Batch normalization: accelerating deep network training by reducing internal covariate shift[C]. Lille Grand Palais: International Conference on Machine Learning, 2015.
[14] Szegedy C, Vanhoucke V, Ioffe S, et al. Rethinking the inception architecture for computer vision[C]. Las Vegas: Computer Vision and Pattern Recognition, 2016.
[15] Szegedy C, Ioffe S, Vanhoucke V, et al. Inception-v4, inception-resnet and the impact of residual connections on learning[C]. San Francisco: The Thirty-First AAAI Conference on Artificial Intelligence, 2017.
[16] Kim S, Hori T, Watanabe S. Joint CTC-attention based end-to-end speech recognition using multi-task learning[C]. New Orleans: The 42nd IEEE International Conference on Acoustics, Speech and Signal Processing, 2017.
[17] Hori T, Watanabe S, Zhang Y, et al. Advances in joint CTC-attention based end-to-end speech recognition with a deep CNN encoder and RNN-LM[C]. USA: IEEE International Conference, 2017.
[18] Xu K, Li D, Cassimatis N, et al. LCANet: end-to-end lipreading with cascaded attention-CTC[C]. Xi'an, China: Automatic Face & Gesture Recognition, 2018.