Glyce: Glyph-vectors for Chinese Character Representations | Proceedings of the 33rd International Conference on Neural Information Processing Systems


Authors: Yuxian Meng, Wei Wu, Fei Wang, Xiaoya Li, Ping Nie, Fan Yin, Muyu Li, Qinghong Han, Xiaofei Sun, and Jiwei Li

Proceedings of the 33rd International Conference on Neural Information Processing Systems, December 2019

Article No.: 247, Pages 2746–2757

Published: 08 December 2019


    Abstract

It is intuitive that NLP tasks for logographic languages like Chinese should benefit from the glyph information in those languages. However, due to the lack of rich pictographic evidence in glyphs and the weak generalization ability of standard computer vision models on character data, an effective way to utilize glyph information has yet to be found.

In this paper, we address this gap by presenting Glyce, glyph-vectors for Chinese character representations. We make three major innovations: (1) we use historical Chinese scripts (e.g., bronzeware script, seal script, traditional Chinese) to enrich the pictographic evidence in characters; (2) we design a CNN structure (called tianzege-CNN) tailored to Chinese character image processing; and (3) we use image classification as an auxiliary task in a multi-task learning setup to increase the model's ability to generalize.
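The multi-task setup in innovation (3) can be sketched as a weighted combination of the downstream-task loss and the auxiliary image-classification loss, with the auxiliary weight decaying over training so that glyph classification mainly regularizes the early epochs. This is a minimal illustrative sketch: the function name, decay schedule, and hyper-parameter values below are assumptions for exposition, not the paper's exact settings.

```python
def combined_loss(task_loss, img_cls_loss, epoch, lam0=0.1, decay=0.8):
    """Blend the downstream-task loss with the auxiliary image-classification
    loss. The auxiliary weight `lam` decays exponentially with the epoch, so
    early training leans on glyph classification and later training on the
    downstream task. `lam0` and `decay` are hypothetical illustrative values.
    """
    lam = lam0 * (decay ** epoch)
    return (1 - lam) * task_loss + lam * img_cls_loss

# At epoch 0 the auxiliary task contributes its full initial weight lam0;
# combined_loss(1.0, 2.0, 0) is approximately 0.9 * 1.0 + 0.1 * 2.0 = 1.1.
```

As training proceeds, `lam` shrinks toward zero and the combined loss converges to the pure task loss, which matches the intuition that the auxiliary objective is a generalization aid rather than the end goal.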

We show that glyph-based models consistently outperform word/char ID-based models on a wide range of Chinese NLP tasks. We set new state-of-the-art results for a variety of Chinese NLP tasks, including tagging (NER, CWS, POS), sentence-pair classification, single-sentence classification, dependency parsing, and semantic role labeling. For example, the proposed model achieves an F1 score of 80.6 on the OntoNotes NER dataset, +1.5 over BERT; it achieves nearly perfect accuracy (99.8%) on the Fudan corpus for text classification.


Cited By

• Wang J., Pan G., Sun D., Zhang J., Shen H., Zhuang Y., et al. (2021). Chinese Character Inpainting with Contextual Semantic Constraints. In Proceedings of the 29th ACM International Conference on Multimedia, pages 1829–1837. https://dl.acm.org/doi/10.1145/3474085.3475333

• Wang X., Xiong Y., Niu H., Yue J., Zhu Y., Yu P., et al. (2021). Improving Chinese Character Representation with Formation Graph Attention Network. In Proceedings of the 30th ACM International Conference on Information & Knowledge Management, pages 1999–2009. https://dl.acm.org/doi/10.1145/3459637.3482265

Index Terms

• Computing methodologies
  • Artificial intelligence
    • Computer vision
    • Natural language processing
  • Machine learning
    • Learning paradigms
      • Supervised learning
    • Machine learning approaches
      • Neural networks

Index terms have been assigned to the content through auto-classification.


Information

Published In

NIPS'19: Proceedings of the 33rd International Conference on Neural Information Processing Systems, December 2019, 15947 pages

Editors: Hanna M. Wallach, Hugo Larochelle, Alina Beygelzimer, Florence d'Alché-Buc, and Emily B. Fox

Copyright © 2019 Neural Information Processing Systems Foundation, Inc.

Publisher

Curran Associates Inc., Red Hook, NY, United States

Qualifiers

• Chapter
• Research
• Refereed limited


