In this study, researchers describe three lines of work that seek to improve the training and evaluation of neural models using naturally-occurring supervision.
Author: Mingda Chen

Table of Links
Abstract
Acknowledgements
1 INTRODUCTION
1.1 Overview
1.2 Contributions
2 BACKGROUND
2.1 Self-Supervised Language Pretraining
2.2 Naturally-Occurring Data Structures
2.3 Sentence Variational Autoencoder
2.4 Summary
3 IMPROVING SELF-SUPERVISION FOR LANGUAGE PRETRAINING
3.1 Improving Language Representation Learning via Sentence Ordering Prediction
3.2 Improving In-Context Few-Shot Learning via Self-Supervised Training
3.