Created April 30, 2023 22:56
llm-papers
URL,License,Title,Author(s),PDF
http://arxiv.org/abs/2202.03371v1,creativecommons.org/licenses/by-sa/4.0/,Cedille: A large autoregressive French language model,Martin Müller and Florian Laurent,http://arxiv.org/pdf/2202.03371v1
http://arxiv.org/abs/2303.00077v1,creativecommons.org/licenses/by-sa/4.0/,Beyond the limitations of any imaginable mechanism: large language models and psycholinguistics,Conor Houghton and Nina Kazanina and Priyanka Sukumaran,http://arxiv.org/pdf/2303.00077v1
http://arxiv.org/abs/2010.12858v2,creativecommons.org/licenses/by-sa/4.0/,When Being Unseen from mBERT is just the Beginning: Handling New Languages With Multilingual Language Models,Benjamin Muller and Antonis Anastasopoulos and Benoît Sagot and Djamé Seddah,http://arxiv.org/pdf/2010.12858v2
http://arxiv.org/abs/2303.01911v1,creativecommons.org/licenses/by-sa/4.0/,Investigating the Translation Performance of a Large Multilingual Language Model: the Case of BLOOM,Rachel Bawden and François Yvon,http://arxiv.org/pdf/2303.01911v1
http://arxiv.org/abs/2111.06053v1,creativecommons.org/licenses/by-sa/4.0/,Improving Large-scale Language Models and Resources for Filipino,Jan Christian Blaise Cruz and Charibeth Cheng,http://arxiv.org/pdf/2111.06053v1
http://arxiv.org/abs/2301.01162v1,creativecommons.org/licenses/by-sa/4.0/,Language Models are Drummers: Drum Composition with Natural Language Pre-Training,Li Zhang and Chris Callison-Burch,http://arxiv.org/pdf/2301.01162v1
http://arxiv.org/abs/2207.13988v2,creativecommons.org/licenses/by-sa/4.0/,Sequence to sequence pretraining for a less-resourced Slovenian language,Matej Ulčar and Marko Robnik-Šikonja,http://arxiv.org/pdf/2207.13988v2
http://arxiv.org/abs/2005.00318v1,creativecommons.org/licenses/by-sa/4.0/,Can Multilingual Language Models Transfer to an Unseen Dialect? A Case Study on North African Arabizi,Benjamin Muller and Benoit Sagot and Djamé Seddah,http://arxiv.org/pdf/2005.00318v1
http://arxiv.org/abs/2301.10472v1,creativecommons.org/licenses/by-sa/4.0/,XLM-V: Overcoming the Vocabulary Bottleneck in Multilingual Masked Language Models,Davis Liang and Hila Gonen and Yuning Mao and Rui Hou and Naman Goyal and Marjan Ghazvininejad and Luke Zettlemoyer and Madian Khabsa,http://arxiv.org/pdf/2301.10472v1
http://arxiv.org/abs/1810.11895v3,creativecommons.org/licenses/by-sa/4.0/,"Language Modeling for Code-Switching: Evaluation, Integration of Monolingual Data, and Discriminative Training",Hila Gonen and Yoav Goldberg,http://arxiv.org/pdf/1810.11895v3
http://arxiv.org/abs/2110.10319v1,creativecommons.org/licenses/by-sa/4.0/,LMSOC: An Approach for Socially Sensitive Pretraining,Vivek Kulkarni and Shubhanshu Mishra and Aria Haghighi,http://arxiv.org/pdf/2110.10319v1
http://arxiv.org/abs/2207.06882v1,creativecommons.org/licenses/by-sa/4.0/,Multilinguals at SemEval-2022 Task 11: Complex NER in Semantically Ambiguous Settings for Low Resource Languages,Amit Pandey and Swayatta Daw and Narendra Babu Unnam and Vikram Pudi,http://arxiv.org/pdf/2207.06882v1
http://arxiv.org/abs/2109.02903v2,creativecommons.org/licenses/by-sa/4.0/,IndicBART: A Pre-trained Model for Indic Natural Language Generation,Raj Dabre and Himani Shrotriya and Anoop Kunchukuttan and Ratish Puduppully and Mitesh M. Khapra and Pratyush Kumar,http://arxiv.org/pdf/2109.02903v2
http://arxiv.org/abs/2210.00320v1,creativecommons.org/licenses/by-sa/4.0/,MALM: Mixing Augmented Language Modeling for Zero-Shot Machine Translation,Kshitij Gupta,http://arxiv.org/pdf/2210.00320v1
http://arxiv.org/abs/2205.14288v1,creativecommons.org/licenses/by-sa/4.0/,Few-shot Subgoal Planning with Language Models,Lajanugen Logeswaran and Yao Fu and Moontae Lee and Honglak Lee,http://arxiv.org/pdf/2205.14288v1
http://arxiv.org/abs/2211.07524v1,creativecommons.org/licenses/by-sa/4.0/,Towards a Mathematics Formalisation Assistant using Large Language Models,Ayush Agrawal and Siddhartha Gadgil and Navin Goyal and Ashvni Narayanan and Anand Tadipatri,http://arxiv.org/pdf/2211.07524v1
http://arxiv.org/abs/1904.01989v1,creativecommons.org/licenses/by-sa/4.0/,Subword-Level Language Identification for Intra-Word Code-Switching,Manuel Mager and Özlem Çetinoğlu and Katharina Kann,http://arxiv.org/pdf/1904.01989v1
http://arxiv.org/abs/2009.08712v1,creativecommons.org/licenses/by-sa/4.0/,The birth of Romanian BERT,Stefan Daniel Dumitrescu and Andrei-Marius Avram and Sampo Pyysalo,http://arxiv.org/pdf/2009.08712v1
http://arxiv.org/abs/2204.02292v2,creativecommons.org/licenses/by-sa/4.0/,Parameter-Efficient Neural Reranking for Cross-Lingual and Multilingual Retrieval,Robert Litschko and Ivan Vulić and Goran Glavaš,http://arxiv.org/pdf/2204.02292v2
http://arxiv.org/abs/2101.11575v1,creativecommons.org/licenses/by-sa/4.0/,Mining Large-Scale Low-Resource Pronunciation Data From Wikipedia,Tania Chakraborty and Manasa Prasad and Theresa Breiner and Sandy Ritchie and Daan van Esch,http://arxiv.org/pdf/2101.11575v1
http://arxiv.org/abs/2203.06462v2,creativecommons.org/licenses/by-sa/4.0/,Low-Rank Softmax Can Have Unargmaxable Classes in Theory but Rarely in Practice,Andreas Grivas and Nikolay Bogoychev and Adam Lopez,http://arxiv.org/pdf/2203.06462v2
http://arxiv.org/abs/1910.04210v1,creativecommons.org/licenses/by-sa/4.0/,Perturbation Sensitivity Analysis to Detect Unintended Model Biases,Vinodkumar Prabhakaran and Ben Hutchinson and Margaret Mitchell,http://arxiv.org/pdf/1910.04210v1
http://arxiv.org/abs/2201.08471v1,creativecommons.org/licenses/by-sa/4.0/,Transfer Learning Approaches for Building Cross-Language Dense Retrieval Models,Suraj Nair and Eugene Yang and Dawn Lawrie and Kevin Duh and Paul McNamee and Kenton Murray and James Mayfield and Douglas W. Oard,http://arxiv.org/pdf/2201.08471v1
http://arxiv.org/abs/2203.03759v1,creativecommons.org/licenses/by-sa/4.0/,IT5: Large-scale Text-to-text Pretraining for Italian Language Understanding and Generation,Gabriele Sarti and Malvina Nissim,http://arxiv.org/pdf/2203.03759v1
http://arxiv.org/abs/2112.10553v1,creativecommons.org/licenses/by-sa/4.0/,Training dataset and dictionary sizes matter in BERT models: the case of Baltic languages,Matej Ulčar and Marko Robnik-Šikonja,http://arxiv.org/pdf/2112.10553v1
http://arxiv.org/abs/2204.09391v1,creativecommons.org/licenses/by-sa/4.0/,You Are What You Write: Preserving Privacy in the Era of Large Language Models,Richard Plant and Valerio Giuffrida and Dimitra Gkatzia,http://arxiv.org/pdf/2204.09391v1
http://arxiv.org/abs/2212.09895v1,creativecommons.org/licenses/by-sa/4.0/,Improved Long-Form Spoken Language Translation with Large Language Models,Arya D. McCarthy and Hao Zhang and Shankar Kumar and Felix Stahlberg and Axel H. Ng,http://arxiv.org/pdf/2212.09895v1
http://arxiv.org/abs/2004.09456v1,creativecommons.org/licenses/by-sa/4.0/,StereoSet: Measuring stereotypical bias in pretrained language models,Moin Nadeem and Anna Bethke and Siva Reddy,http://arxiv.org/pdf/2004.09456v1
http://arxiv.org/abs/2109.15290v1,creativecommons.org/licenses/by-sa/4.0/,MatSciBERT: A Materials Domain Language Model for Text Mining and Information Extraction,Tanishq Gupta and Mohd Zaki and N. M. Anoop Krishnan and Mausam,http://arxiv.org/pdf/2109.15290v1
http://arxiv.org/abs/2206.01205v1,creativecommons.org/licenses/by-sa/4.0/,Snow Mountain: Dataset of Audio Recordings of The Bible in Low Resource Languages,Kavitha Raju and Anjaly V and Ryan Lish and Joel Mathew,http://arxiv.org/pdf/2206.01205v1
http://arxiv.org/abs/2211.00142v1,creativecommons.org/licenses/by-sa/4.0/,TaTa: A Multilingual Table-to-Text Dataset for African Languages,Sebastian Gehrmann and Sebastian Ruder and Vitaly Nikolaev and Jan A. Botha and Michael Chavinda and Ankur Parikh and Clara Rivera,http://arxiv.org/pdf/2211.00142v1
http://arxiv.org/abs/2107.09931v1,creativecommons.org/licenses/by-sa/4.0/,The Effectiveness of Intermediate-Task Training for Code-Switched Natural Language Understanding,Archiki Prasad and Mohammad Ali Rehan and Shreya Pathak and Preethi Jyothi,http://arxiv.org/pdf/2107.09931v1
http://arxiv.org/abs/2302.07912v1,creativecommons.org/licenses/by-sa/4.0/,Meeting the Needs of Low-Resource Languages: The Value of Automatic Alignments via Pretrained Models,Abteen Ebrahimi and Arya D. McCarthy and Arturo Oncevay and Luis Chiruzzo and John E. Ortega and Gustavo A. Giménez-Lugo and Rolando Coto-Solano and Katharina Kann,http://arxiv.org/pdf/2302.07912v1
http://arxiv.org/abs/2102.12162v1,creativecommons.org/licenses/by-sa/4.0/,From Universal Language Model to Downstream Task: Improving RoBERTa-Based Vietnamese Hate Speech Detection,Quang Huu Pham and Viet Anh Nguyen and Linh Bao Doan and Ngoc N. Tran and Ta Minh Thanh,http://arxiv.org/pdf/2102.12162v1
http://arxiv.org/abs/2211.13613v2,creativecommons.org/licenses/by-sa/4.0/,Ham2Pose: Animating Sign Language Notation into Pose Sequences,Rotem Shalev-Arkushin and Amit Moryossef and Ohad Fried,http://arxiv.org/pdf/2211.13613v2
http://arxiv.org/abs/2212.05058v1,creativecommons.org/licenses/by-sa/4.0/,Structured Like a Language Model: Analysing AI as an Automated Subject,Liam Magee and Vanicka Arora and Luke Munn,http://arxiv.org/pdf/2212.05058v1
http://arxiv.org/abs/2108.01589v1,creativecommons.org/licenses/by-sa/4.0/,ExBERT: An External Knowledge Enhanced BERT for Natural Language Inference,Amit Gajbhiye and Noura Al Moubayed and Steven Bradley,http://arxiv.org/pdf/2108.01589v1
http://arxiv.org/abs/2108.02598v1,creativecommons.org/licenses/by-sa/4.0/,Knowledge Distillation from BERT Transformer to Speech Transformer for Intent Classification,Yidi Jiang and Bidisha Sharma and Maulik Madhavi and Haizhou Li,http://arxiv.org/pdf/2108.02598v1
http://arxiv.org/abs/2301.08130v2,creativecommons.org/licenses/by-sa/4.0/,A Cohesive Distillation Architecture for Neural Language Models,Jan Philip Wahle,http://arxiv.org/pdf/2301.08130v2
http://arxiv.org/abs/2107.10614v1,creativecommons.org/licenses/by-sa/4.0/,Evaluation of contextual embeddings on less-resourced languages,Matej Ulčar and Aleš Žagar and Carlos S. Armendariz and Andraž Repar and Senja Pollak and Matthew Purver and Marko Robnik-Šikonja,http://arxiv.org/pdf/2107.10614v1
http://arxiv.org/abs/2109.10724v1,creativecommons.org/licenses/by-sa/4.0/,Low-Latency Incremental Text-to-Speech Synthesis with Distilled Context Prediction Network,Takaaki Saeki and Shinnosuke Takamichi and Hiroshi Saruwatari,http://arxiv.org/pdf/2109.10724v1
http://arxiv.org/abs/2011.12432v2,creativecommons.org/licenses/by-sa/4.0/,Enhancing deep neural networks with morphological information,Matej Klemen and Luka Krsnik and Marko Robnik-Šikonja,http://arxiv.org/pdf/2011.12432v2
http://arxiv.org/abs/2211.02956v1,creativecommons.org/licenses/by-sa/4.0/,Privacy-Preserving Models for Legal Natural Language Processing,Ying Yin and Ivan Habernal,http://arxiv.org/pdf/2211.02956v1
http://arxiv.org/abs/1606.09403v1,creativecommons.org/licenses/by-sa/4.0/,Learning Crosslingual Word Embeddings without Bilingual Corpora,Long Duong and Hiroshi Kanayama and Tengfei Ma and Steven Bird and Trevor Cohn,http://arxiv.org/pdf/1606.09403v1
http://arxiv.org/abs/2303.12153v1,creativecommons.org/licenses/by-sa/4.0/,Text2Motion: From Natural Language Instructions to Feasible Plans,Kevin Lin and Christopher Agia and Toki Migimatsu and Marco Pavone and Jeannette Bohg,http://arxiv.org/pdf/2303.12153v1
http://arxiv.org/abs/2207.01772v1,creativecommons.org/licenses/by-sa/4.0/,Vision-and-Language Pretraining,Thong Nguyen and Cong-Duy Nguyen and Xiaobao Wu and Anh Tuan Luu,http://arxiv.org/pdf/2207.01772v1
http://arxiv.org/abs/2302.00923v4,creativecommons.org/licenses/by-sa/4.0/,Multimodal Chain-of-Thought Reasoning in Language Models,Zhuosheng Zhang and Aston Zhang and Mu Li and Hai Zhao and George Karypis and Alex Smola,http://arxiv.org/pdf/2302.00923v4
http://arxiv.org/abs/2210.03568v3,creativecommons.org/licenses/by-sa/4.0/,How Large Language Models are Transforming Machine-Paraphrased Plagiarism,Jan Philip Wahle and Terry Ruas and Frederic Kirstein and Bela Gipp,http://arxiv.org/pdf/2210.03568v3
http://arxiv.org/abs/2110.07982v1,creativecommons.org/licenses/by-sa/4.0/,Scribosermo: Fast Speech-to-Text models for German and other Languages,Daniel Bermuth and Alexander Poeppel and Wolfgang Reif,http://arxiv.org/pdf/2110.07982v1
http://arxiv.org/abs/2112.12650v3,creativecommons.org/licenses/by-sa/4.0/,Distilling the Knowledge of Romanian BERTs Using Multiple Teachers,Andrei-Marius Avram and Darius Catrina and Dumitru-Clementin Cercel and Mihai Dascălu and Traian Rebedea and Vasile Păiş and Dan Tufiş,http://arxiv.org/pdf/2112.12650v3
http://arxiv.org/abs/2207.06814v1,creativecommons.org/licenses/by-sa/4.0/,BERTIN: Efficient Pre-Training of a Spanish Language Model using Perplexity Sampling,Javier de la Rosa and Eduardo G. Ponferrada and Paulo Villegas and Pablo Gonzalez de Prado Salas and Manu Romero and Marıa Grandury,http://arxiv.org/pdf/2207.06814v1
http://arxiv.org/abs/2109.05772v1,creativecommons.org/licenses/by-sa/4.0/,Wine is Not v i n. -- On the Compatibility of Tokenizations Across Languages,Antonis Maronikolakis and Philipp Dufter and Hinrich Schütze,http://arxiv.org/pdf/2109.05772v1
http://arxiv.org/abs/2212.10548v1,creativecommons.org/licenses/by-sa/4.0/,T-Projection: High Quality Annotation Projection for Sequence Labeling Tasks,Iker García-Ferrero and Rodrigo Agerri and German Rigau,http://arxiv.org/pdf/2212.10548v1
http://arxiv.org/abs/2302.06476v2,creativecommons.org/licenses/by-sa/4.0/,Is ChatGPT a General-Purpose Natural Language Processing Task Solver?,Chengwei Qin and Aston Zhang and Zhuosheng Zhang and Jiaao Chen and Michihiro Yasunaga and Diyi Yang,http://arxiv.org/pdf/2302.06476v2
http://arxiv.org/abs/2101.03289v5,creativecommons.org/licenses/by-sa/4.0/,Trankit: A Light-Weight Transformer-based Toolkit for Multilingual Natural Language Processing,Minh Van Nguyen and Viet Dac Lai and Amir Pouran Ben Veyseh and Thien Huu Nguyen,http://arxiv.org/pdf/2101.03289v5
http://arxiv.org/abs/2211.09085v1,creativecommons.org/licenses/by-sa/4.0/,Galactica: A Large Language Model for Science,Ross Taylor and Marcin Kardas and Guillem Cucurull and Thomas Scialom and Anthony Hartshorn and Elvis Saravia and Andrew Poulton and Viktor Kerkez and Robert Stojnic,http://arxiv.org/pdf/2211.09085v1
http://arxiv.org/abs/2302.07735v1,creativecommons.org/licenses/by-sa/4.0/,Targeted Attack on GPT-Neo for the SATML Language Model Data Extraction Challenge,Ali Al-Kaswan and Maliheh Izadi and Arie van Deursen,http://arxiv.org/pdf/2302.07735v1
http://arxiv.org/abs/2202.07359v1,creativecommons.org/licenses/by-sa/4.0/,textless-lib: a Library for Textless Spoken Language Processing,Eugene Kharitonov and Jade Copet and Kushal Lakhotia and Tu Anh Nguyen and Paden Tomasello and Ann Lee and Ali Elkahky and Wei-Ning Hsu and Abdelrahman Mohamed and Emmanuel Dupoux and Yossi Adi,http://arxiv.org/pdf/2202.07359v1
http://arxiv.org/abs/2106.05822v1,creativecommons.org/licenses/by-sa/4.0/,GroupBERT: Enhanced Transformer Architecture with Efficient Grouped Structures,Ivan Chelombiev and Daniel Justus and Douglas Orr and Anastasia Dietrich and Frithjof Gressmann and Alexandros Koliousis and Carlo Luschi,http://arxiv.org/pdf/2106.05822v1
http://arxiv.org/abs/2207.00352v1,creativecommons.org/licenses/by-sa/4.0/,Toward Low-Cost End-to-End Spoken Language Understanding,Marco Dinarelli and Marco Naguib and François Portet,http://arxiv.org/pdf/2207.00352v1
http://arxiv.org/abs/2204.13913v1,creativecommons.org/licenses/by-sa/4.0/,Leaner and Faster: Two-Stage Model Compression for Lightweight Text-Image Retrieval,Siyu Ren and Kenny Q. Zhu,http://arxiv.org/pdf/2204.13913v1
http://arxiv.org/abs/2212.07126v1,creativecommons.org/licenses/by-sa/4.0/,Explainability of Text Processing and Retrieval Methods: A Critical Survey,Sourav Saha and Debapriyo Majumdar and Mandar Mitra,http://arxiv.org/pdf/2212.07126v1
http://arxiv.org/abs/2301.13382v1,creativecommons.org/licenses/by-sa/4.0/,Numeracy from Literacy: Data Science as an Emergent Skill from Large Language Models,David Noever and Forrest McKee,http://arxiv.org/pdf/2301.13382v1
http://arxiv.org/abs/2304.01373v1,creativecommons.org/licenses/by-sa/4.0/,Pythia: A Suite for Analyzing Large Language Models Across Training and Scaling,Stella Biderman and Hailey Schoelkopf and Quentin Anthony and Herbie Bradley and Kyle O'Brien and Eric Hallahan and Mohammad Aflah Khan and Shivanshu Purohit and USVSN Sai Prashanth and Edward Raff and Aviya Skowron and Lintang Sutawika and Oskar van der Wal,http://arxiv.org/pdf/2304.01373v1
http://arxiv.org/abs/2106.02679v1,creativecommons.org/licenses/by-sa/4.0/,Layered gradient accumulation and modular pipeline parallelism: fast and efficient training of large language models,Joel Lamy-Poirier,http://arxiv.org/pdf/2106.02679v1
http://arxiv.org/abs/2103.12450v5,creativecommons.org/licenses/by-sa/4.0/,Are Neural Language Models Good Plagiarists? A Benchmark for Neural Paraphrase Detection,Jan Philip Wahle and Terry Ruas and Norman Meuschke and Bela Gipp,http://arxiv.org/pdf/2103.12450v5
http://arxiv.org/abs/2201.03382v1,creativecommons.org/licenses/by-sa/4.0/,BERT for Sentiment Analysis: Pre-trained and Fine-Tuned Alternatives,Frederico Souza and João Filho,http://arxiv.org/pdf/2201.03382v1
http://arxiv.org/abs/2209.06899v1,creativecommons.org/licenses/by-sa/4.0/,"Out of One, Many: Using Language Models to Simulate Human Samples",Lisa P. Argyle and Ethan C. Busby and Nancy Fulda and Joshua Gubler and Christopher Rytting and David Wingate,http://arxiv.org/pdf/2209.06899v1
http://arxiv.org/abs/2102.06203v2,creativecommons.org/licenses/by-sa/4.0/,Proof Artifact Co-training for Theorem Proving with Language Models,Jesse Michael Han and Jason Rute and Yuhuai Wu and Edward W. Ayers and Stanislas Polu,http://arxiv.org/pdf/2102.06203v2
http://arxiv.org/abs/2006.14223v1,creativecommons.org/licenses/by-sa/4.0/,Neural Machine Translation For Paraphrase Generation,Alex Sokolov and Denis Filimonov,http://arxiv.org/pdf/2006.14223v1
http://arxiv.org/abs/2207.00349v1,creativecommons.org/licenses/by-sa/4.0/,Vers la compréhension automatique de la parole bout-en-bout à moindre effort,Marco Naguib and François Portet and Marco Dinarelli,http://arxiv.org/pdf/2207.00349v1
http://arxiv.org/abs/2210.10668v1,creativecommons.org/licenses/by-sa/4.0/,N-Best Hypotheses Reranking for Text-To-SQL Systems,Lu Zeng and Sree Hari Krishnan Parthasarathi and Dilek Hakkani-Tur,http://arxiv.org/pdf/2210.10668v1
http://arxiv.org/abs/2304.12203v1,creativecommons.org/licenses/by-sa/4.0/,Creating Large Language Model Resistant Exams: Guidelines and Strategies,Simon kaare Larsen,http://arxiv.org/pdf/2304.12203v1
http://arxiv.org/abs/2105.09680v4,creativecommons.org/licenses/by-sa/4.0/,KLUE: Korean Language Understanding Evaluation,Sungjoon Park and Jihyung Moon and Sungdong Kim and Won Ik Cho and Jiyoon Han and Jangwon Park and Chisung Song and Junseong Kim and Yongsook Song and Taehwan Oh and Joohong Lee and Juhyun Oh and Sungwon Lyu and Younghoon Jeong and Inkwon Lee and Sangwoo Seo and Dongjun Lee and Hyunwoo Kim and Myeonghwa Lee and Seongbo Jang and Seungwon Do and Sunkyoung Kim and Kyungtae Lim and Jongwon Lee and Kyumin Park and Jamin Shin and Seonghyun Kim and Lucy Park and Alice Oh and Jung-Woo Ha and Kyunghyun Cho,http://arxiv.org/pdf/2105.09680v4
http://arxiv.org/abs/2302.02083v3,creativecommons.org/licenses/by-sa/4.0/,Theory of Mind May Have Spontaneously Emerged in Large Language Models,Michal Kosinski,http://arxiv.org/pdf/2302.02083v3
http://arxiv.org/abs/2208.06042v1,creativecommons.org/licenses/by-sa/4.0/,CodeBERT-nt: code naturalness via CodeBERT,Ahmed Khanfir and Matthieu Jimenez and Mike Papadakis and Yves Le Traon,http://arxiv.org/pdf/2208.06042v1
http://arxiv.org/abs/2110.02056v1,creativecommons.org/licenses/by-sa/4.0/,Are Training Resources Insufficient? Predict First Then Explain!,Myeongjun Jang and Thomas Lukasiewicz,http://arxiv.org/pdf/2110.02056v1
http://arxiv.org/abs/2006.07890v1,creativecommons.org/licenses/by-sa/4.0/,FinEst BERT and CroSloEngual BERT: less is more in multilingual models,Matej Ulčar and Marko Robnik-Šikonja,http://arxiv.org/pdf/2006.07890v1
http://arxiv.org/abs/2302.13942v2,creativecommons.org/licenses/by-sa/4.0/,Inseq: An Interpretability Toolkit for Sequence Generation Models,Gabriele Sarti and Nils Feldhus and Ludwig Sickert and Oskar van der Wal and Malvina Nissim and Arianna Bisazza,http://arxiv.org/pdf/2302.13942v2
http://arxiv.org/abs/2011.10208v1,creativecommons.org/licenses/by-sa/4.0/,Collaborative Storytelling with Large-scale Neural Language Models,Eric Nichols and Leo Gao and Randy Gomez,http://arxiv.org/pdf/2011.10208v1
http://arxiv.org/abs/2001.07063v4,creativecommons.org/licenses/by-sa/4.0/,Modular coinduction up-to for higher-order languages via first-order transition systems,Jean-Marie Madiot and Damien Pous and Davide Sangiorgi,http://arxiv.org/pdf/2001.07063v4
http://arxiv.org/abs/2212.11135v1,creativecommons.org/licenses/by-sa/4.0/,Array-Aware Matching: Taming the Complexity of Large-Scale Simulation Models,Massimo Fioravanti and Daniele Cattaneo and Federico Terraneo and Silvano Seva and Stefano Cherubin and Giovanni Agosta and Francesco Casella and Alberto Leva,http://arxiv.org/pdf/2212.11135v1
http://arxiv.org/abs/2107.02027v2,creativecommons.org/licenses/by-sa/4.0/,Efficient Sequence Packing without Cross-contamination: Accelerating Large Language Models without Impacting Performance,Mario Michael Krell and Matej Kosec and Sergio P. Perez and Andrew Fitzgibbon,http://arxiv.org/pdf/2107.02027v2
http://arxiv.org/abs/2302.14021v1,creativecommons.org/licenses/by-sa/4.0/,Quantifying Valence and Arousal in Text with Multilingual Pre-trained Transformers,Gonçalo Azevedo Mendes and Bruno Martins,http://arxiv.org/pdf/2302.14021v1
http://arxiv.org/abs/2206.15076v1,creativecommons.org/licenses/by-sa/4.0/,BigBIO: A Framework for Data-Centric Biomedical Natural Language Processing,Jason Alan Fries and Leon Weber and Natasha Seelam and Gabriel Altay and Debajyoti Datta and Samuele Garda and Myungsun Kang and Ruisi Su and Wojciech Kusa and Samuel Cahyawijaya and Fabio Barth and Simon Ott and Matthias Samwald and Stephen Bach and Stella Biderman and Mario Sänger and Bo Wang and Alison Callahan and Daniel León Periñán and Théo Gigant and Patrick Haller and Jenny Chim and Jose David Posada and John Michael Giorgi and Karthik Rangasai Sivaraman and Marc Pàmies and Marianna Nezhurina and Robert Martin and Michael Cullan and Moritz Freidank and Nathan Dahlberg and Shubhanshu Mishra and Shamik Bose and Nicholas Michio Broad and Yanis Labrak and Shlok S Deshmukh and Sid Kiblawi and Ayush Singh and Minh Chien Vu and Trishala Neeraj and Jonas Golde and Albert Villanova del Moral and Benjamin Beilharz,http://arxiv.org/pdf/2206.15076v1
http://arxiv.org/abs/2211.17163v1,creativecommons.org/licenses/by-sa/4.0/,Misogyny classification of German newspaper forum comments,Johann Petrak and Brigitte Krenn,http://arxiv.org/pdf/2211.17163v1
http://arxiv.org/abs/2212.10448v1,creativecommons.org/licenses/by-sa/4.0/,Parameter-efficient Zero-shot Transfer for Cross-Language Dense Retrieval with Adapters,Eugene Yang and Suraj Nair and Dawn Lawrie and James Mayfield and Douglas W. Oard,http://arxiv.org/pdf/2212.10448v1
http://arxiv.org/abs/2304.02016v1,creativecommons.org/licenses/by-sa/4.0/,The Multimodal And Modular Ai Chef: Complex Recipe Generation From Imagery,David Noever and Samantha Elizabeth Miller Noever,http://arxiv.org/pdf/2304.02016v1
http://arxiv.org/abs/2108.06277v1,creativecommons.org/licenses/by-sa/4.0/,Towards Structured Dynamic Sparse Pre-Training of BERT,Anastasia Dietrich and Frithjof Gressmann and Douglas Orr and Ivan Chelombiev and Daniel Justus and Carlo Luschi,http://arxiv.org/pdf/2108.06277v1
http://arxiv.org/abs/2304.03325v1,creativecommons.org/licenses/by-sa/4.0/,ChatGPT-Crawler: Find out if ChatGPT really knows what it's talking about,Aman Rangapur and Haoran Wang,http://arxiv.org/pdf/2304.03325v1
http://arxiv.org/abs/2302.13652v1,creativecommons.org/licenses/by-sa/4.0/,Duration-aware pause insertion using pre-trained language model for multi-speaker text-to-speech,Dong Yang and Tomoki Koriyama and Yuki Saito and Takaaki Saeki and Detai Xin and Hiroshi Saruwatari,http://arxiv.org/pdf/2302.13652v1
http://arxiv.org/abs/2104.08512v2,creativecommons.org/licenses/by-sa/4.0/,Minimal Supervision for Morphological Inflection,Omer Goldman and Reut Tsarfaty,http://arxiv.org/pdf/2104.08512v2
http://arxiv.org/abs/2107.08091v1,creativecommons.org/licenses/by-sa/4.0/,A Comparison of Methods for OOV-word Recognition on a New Public Dataset,Rudolf A. Braun and Srikanth Madikeri and Petr Motlicek,http://arxiv.org/pdf/2107.08091v1
http://arxiv.org/abs/2201.07406v2,creativecommons.org/licenses/by-sa/4.0/,Fooling MOSS Detection with Pretrained Language Models,Stella Biderman and Edward Raff,http://arxiv.org/pdf/2201.07406v2
http://arxiv.org/abs/2206.07048v1,creativecommons.org/licenses/by-sa/4.0/,A smile is all you need: Predicting limiting activity coefficients from SMILES with natural language processing,Benedikt Winter and Clemens Winter and Johannes Schilling and André Bardow,http://arxiv.org/pdf/2206.07048v1
http://arxiv.org/abs/2210.13778v1,creativecommons.org/licenses/by-sa/4.0/,IDK-MRC: Unanswerable Questions for Indonesian Machine Reading Comprehension,Rifki Afina Putri and Alice Oh,http://arxiv.org/pdf/2210.13778v1
http://arxiv.org/abs/2008.02878v1,creativecommons.org/licenses/by-sa/4.0/,A Multilingual Neural Machine Translation Model for Biomedical Data,Alexandre Bérard and Zae Myung Kim and Vassilina Nikoulina and Eunjeong L. Park and Matthias Gallé,http://arxiv.org/pdf/2008.02878v1
http://arxiv.org/abs/2208.10448v1,creativecommons.org/licenses/by-sa/4.0/,Dialogue Term Extraction using Transfer Learning and Topological Data Analysis,Renato Vukovic and Michael Heck and Benjamin Matthias Ruppik and Carel van Niekerk and Marcus Zibrowius and Milica Gašić,http://arxiv.org/pdf/2208.10448v1
http://arxiv.org/abs/2210.15187v1,creativecommons.org/licenses/by-sa/4.0/,Learning Joint Representation of Human Motion and Language,Jihoon Kim and Youngjae Yu and Seungyoun Shin and Taehyun Byun and Sungjoon Choi,http://arxiv.org/pdf/2210.15187v1
http://arxiv.org/abs/2304.02496v1,creativecommons.org/licenses/by-sa/4.0/,Evaluation of ChatGPT Family of Models for Biomedical Reasoning and Classification,Shan Chen and Yingya Li and Sheng Lu and Hoang Van and Hugo JWL Aerts and Guergana K. Savova and Danielle S. Bitterman,http://arxiv.org/pdf/2304.02496v1
http://arxiv.org/abs/1711.01048v2,creativecommons.org/licenses/by-sa/4.0/,Dual Language Models for Code Switched Speech Recognition,Saurabh Garg and Tanmay Parekh and Preethi Jyothi,http://arxiv.org/pdf/1711.01048v2
http://arxiv.org/abs/2109.00271v1,creativecommons.org/licenses/by-sa/4.0/,Discovering Representation Sprachbund For Multilingual Pre-Training,Yimin Fan and Yaobo Liang and Alexandre Muzio and Hany Hassan and Houqiang Li and Ming Zhou and Nan Duan,http://arxiv.org/pdf/2109.00271v1
http://arxiv.org/abs/2204.12632v2,creativecommons.org/licenses/by-sa/4.0/,Testing the Ability of Language Models to Interpret Figurative Language,Emmy Liu and Chen Cui and Kenneth Zheng and Graham Neubig,http://arxiv.org/pdf/2204.12632v2
http://arxiv.org/abs/2010.00622v2,creativecommons.org/licenses/by-sa/4.0/,Pix2Prof: fast extraction of sequential information from galaxy imagery via a deep natural language 'captioning' model,Michael J. Smith and Nikhil Arora and Connor Stone and Stéphane Courteau and James E. Geach,http://arxiv.org/pdf/2010.00622v2
http://arxiv.org/abs/2208.07870v1,creativecommons.org/licenses/by-sa/4.0/,Language-guided Semantic Style Transfer of 3D Indoor Scenes,Bu Jin and Beiwen Tian and Hao Zhao and Guyue Zhou,http://arxiv.org/pdf/2208.07870v1
http://arxiv.org/abs/2304.11350v1,creativecommons.org/licenses/by-sa/4.0/,Romanian Multiword Expression Detection Using Multilingual Adversarial Training and Lateral Inhibition,Andrei-Marius Avram and Verginica Barbu Mititelu and Dumitru-Clementin Cercel,http://arxiv.org/pdf/2304.11350v1
http://arxiv.org/abs/2301.12596v2,creativecommons.org/licenses/by-sa/4.0/,Learning to Speak from Text: Zero-Shot Multilingual Text-to-Speech with Unsupervised Text Pretraining,Takaaki Saeki and Soumi Maiti and Xinjian Li and Shinji Watanabe and Shinnosuke Takamichi and Hiroshi Saruwatari,http://arxiv.org/pdf/2301.12596v2
http://arxiv.org/abs/2304.05468v1,creativecommons.org/licenses/by-sa/4.0/,A Survey of Resources and Methods for Natural Language Processing of Serbian Language,Ulfeta A. Marovac and Aldina R. Avdić and Nikola Lj. Milošević,http://arxiv.org/pdf/2304.05468v1
http://arxiv.org/abs/2012.12612v2,creativecommons.org/licenses/by-sa/4.0/,Incremental Text-to-Speech Synthesis Using Pseudo Lookahead with Large Pretrained Language Model,Takaaki Saeki and Shinnosuke Takamichi and Hiroshi Saruwatari,http://arxiv.org/pdf/2012.12612v2
http://arxiv.org/abs/2208.12461v1,creativecommons.org/licenses/by-sa/4.0/,AutoQGS: Auto-Prompt for Low-Resource Knowledge-based Question Generation from SPARQL,Guanming Xiong and Junwei Bao and Wen Zhao and Youzheng Wu and Xiaodong He,http://arxiv.org/pdf/2208.12461v1
http://arxiv.org/abs/2210.03493v1,creativecommons.org/licenses/by-sa/4.0/,Automatic Chain of Thought Prompting in Large Language Models,Zhuosheng Zhang and Aston Zhang and Mu Li and Alex Smola,http://arxiv.org/pdf/2210.03493v1
http://arxiv.org/abs/2101.08370v1,creativecommons.org/licenses/by-sa/4.0/,Evaluating Multilingual Text Encoders for Unsupervised Cross-Lingual Retrieval,Robert Litschko and Ivan Vulić and Simone Paolo Ponzetto and Goran Glavaš,http://arxiv.org/pdf/2101.08370v1
http://arxiv.org/abs/2211.15533v1,creativecommons.org/licenses/by-sa/4.0/,The Stack: 3 TB of permissively licensed source code,Denis Kocetkov and Raymond Li and Loubna Ben Allal and Jia Li and Chenghao Mou and Carlos Muñoz Ferrandis and Yacine Jernite and Margaret Mitchell and Sean Hughes and Thomas Wolf and Dzmitry Bahdanau and Leandro von Werra and Harm de Vries,http://arxiv.org/pdf/2211.15533v1
http://arxiv.org/abs/2106.04571v1,creativecommons.org/licenses/by-sa/4.0/,TIMEDIAL: Temporal Commonsense Reasoning in Dialog,Lianhui Qin and Aditya Gupta and Shyam Upadhyay and Luheng He and Yejin Choi and Manaal Faruqui,http://arxiv.org/pdf/2106.04571v1
http://arxiv.org/abs/2209.15168v1,creativecommons.org/licenses/by-sa/4.0/,Depth-Wise Attention (DWAtt): A Layer Fusion Method for Data-Efficient Classification,Muhammad ElNokrashy and Badr AlKhamissi and Mona Diab,http://arxiv.org/pdf/2209.15168v1
http://arxiv.org/abs/2301.10172v2,creativecommons.org/licenses/by-sa/4.0/,MTTN: Multi-Pair Text to Text Narratives for Prompt Generation,Archan Ghosh and Debgandhar Ghosh and Madhurima Maji and Suchinta Chanda and Kalporup Goswami,http://arxiv.org/pdf/2301.10172v2
http://arxiv.org/abs/2304.04083v1,creativecommons.org/licenses/by-sa/4.0/,"VOICE: Visual Oracle for Interaction, Conversation, and Explanation",Donggang Jia and Alexandra Irger and Ondrej Strnad and Johanna Bjorklund and Anders Ynnerman and Ivan Viola,http://arxiv.org/pdf/2304.04083v1
http://arxiv.org/abs/1906.02979v1,creativecommons.org/licenses/by-sa/4.0/,A Wind of Change: Detecting and Evaluating Lexical Semantic Change across Times and Domains,Dominik Schlechtweg and Anna Hätty and Marco del Tredici and Sabine Schulte im Walde,http://arxiv.org/pdf/1906.02979v1
http://arxiv.org/abs/2107.10989v1,creativecommons.org/licenses/by-sa/4.0/,Estimating Predictive Uncertainty Under Program Data Distribution Shift,Yufei Li and Simin Chen and Wei Yang,http://arxiv.org/pdf/2107.10989v1
http://arxiv.org/abs/2104.09712v1,creativecommons.org/licenses/by-sa/4.0/,Problems and Countermeasures in Natural Language Processing Evaluation,Qingxiu Dong and Zhifang Sui and Weidong Zhan and Baobao Chang,http://arxiv.org/pdf/2104.09712v1
http://arxiv.org/abs/2210.01512v2,creativecommons.org/licenses/by-sa/4.0/,Code-Switching without Switching: Language Agnostic End-to-End Speech Translation,Christian Huber and Enes Yavuz Ugan and Alexander Waibel,http://arxiv.org/pdf/2210.01512v2
http://arxiv.org/abs/1811.01115v1,creativecommons.org/licenses/by-sa/4.0/,Neural Task Representations as Weak Supervision for Model Agnostic Cross-Lingual Transfer,Sujay Kumar Jauhar and Michael Gamon and Patrick Pantel,http://arxiv.org/pdf/1811.01115v1
http://arxiv.org/abs/1903.08905v1,creativecommons.org/licenses/by-sa/4.0/,RAP-Net: Recurrent Attention Pooling Networks for Dialogue Response Selection,Chao-Wei Huang and Ting-Rui Chiang and Shang-Yu Su and Yun-Nung Chen,http://arxiv.org/pdf/1903.08905v1
http://arxiv.org/abs/2202.08316v2,creativecommons.org/licenses/by-sa/4.0/,FAMIE: A Fast Active Learning Framework for Multilingual Information Extraction,Minh Van Nguyen and Nghia Trung Ngo and Bonan Min and Thien Huu Nguyen,http://arxiv.org/pdf/2202.08316v2
http://arxiv.org/abs/2202.03052v2,creativecommons.org/licenses/by-sa/4.0/,"OFA: Unifying Architectures, Tasks, and Modalities Through a Simple Sequence-to-Sequence Learning Framework",Peng Wang and An Yang and Rui Men and Junyang Lin and Shuai Bai and Zhikang Li and Jianxin Ma and Chang Zhou and Jingren Zhou and Hongxia Yang,http://arxiv.org/pdf/2202.03052v2
http://arxiv.org/abs/2210.01478v3,creativecommons.org/licenses/by-sa/4.0/,When to Make Exceptions: Exploring Language Models as Accounts of Human Moral Judgment,Zhijing Jin and Sydney Levine and Fernando Gonzalez and Ojasv Kamal and Maarten Sap and Mrinmaya Sachan and Rada Mihalcea and Josh Tenenbaum and Bernhard Schölkopf,http://arxiv.org/pdf/2210.01478v3 | |
http://arxiv.org/abs/2110.03888v3,creativecommons.org/licenses/by-sa/4.0/,M6-10T: A Sharing-Delinking Paradigm for Efficient Multi-Trillion Parameter Pretraining,Junyang Lin and An Yang and Jinze Bai and Chang Zhou and Le Jiang and Xianyan Jia and Ang Wang and Jie Zhang and Yong Li and Wei Lin and Jingren Zhou and Hongxia Yang,http://arxiv.org/pdf/2110.03888v3 | |
http://arxiv.org/abs/1710.01799v1,creativecommons.org/licenses/by-sa/4.0/,Counterfactual Language Model Adaptation for Suggesting Phrases,Kenneth C. Arnold and Kai-Wei Chang and Adam T. Kalai,http://arxiv.org/pdf/1710.01799v1 | |
http://arxiv.org/abs/2102.12971v1,creativecommons.org/licenses/by-sa/4.0/,Are pre-trained text representations useful for multilingual and multi-dimensional language proficiency modeling?,Taraka Rama and Sowmya Vajjala,http://arxiv.org/pdf/2102.12971v1 | |
http://arxiv.org/abs/2303.03953v2,creativecommons.org/licenses/by-sa/4.0/,ChatGPT: Beginning of an End of Manual Linguistic Data Annotation? Use Case of Automatic Genre Identification,Taja Kuzman and Igor Mozetič and Nikola Ljubešić,http://arxiv.org/pdf/2303.03953v2 | |
http://arxiv.org/abs/2209.01335v2,creativecommons.org/licenses/by-sa/4.0/,Neural Approaches to Multilingual Information Retrieval,Dawn Lawrie and Eugene Yang and Douglas W. Oard and James Mayfield,http://arxiv.org/pdf/2209.01335v2 | |
http://arxiv.org/abs/1911.03894v3,creativecommons.org/licenses/by-sa/4.0/,CamemBERT: a Tasty French Language Model,Louis Martin and Benjamin Muller and Pedro Javier Ortiz Suárez and Yoann Dupont and Laurent Romary and Éric Villemonte de la Clergerie and Djamé Seddah and Benoît Sagot,http://arxiv.org/pdf/1911.03894v3 | |
http://arxiv.org/abs/2203.14507v2,creativecommons.org/licenses/by-sa/4.0/,ANNA: Enhanced Language Representation for Question Answering,Changwook Jun and Hansol Jang and Myoseop Sim and Hyun Kim and Jooyoung Choi and Kyungkoo Min and Kyunghoon Bae,http://arxiv.org/pdf/2203.14507v2 | |
http://arxiv.org/abs/1406.1241v1,creativecommons.org/licenses/by/3.0/,The Best Templates Match Technique For Example Based Machine Translation,T. El-Shishtawy and A. El-Sammak,http://arxiv.org/pdf/1406.1241v1 | |
http://arxiv.org/abs/1311.3837v1,creativecommons.org/licenses/by/3.0/,SBML for optimizing decision support's tools,Dalila Hamami and Baghdad Atmani,http://arxiv.org/pdf/1311.3837v1 | |
http://arxiv.org/abs/1005.4752v1,creativecommons.org/licenses/by/3.0/,A database approach to information retrieval: The remarkable relationship between language models and region models,Djoerd Hiemstra and Vojkan Mihajlovic,http://arxiv.org/pdf/1005.4752v1 | |
http://arxiv.org/abs/2104.10441v1,creativecommons.org/licenses/by/4.0/,"Should we Stop Training More Monolingual Models, and Simply Use Machine Translation Instead?",Tim Isbister and Fredrik Carlsson and Magnus Sahlgren,http://arxiv.org/pdf/2104.10441v1 | |
http://arxiv.org/abs/1909.04879v1,creativecommons.org/licenses/by/4.0/,Dynamic Fusion: Attentional Language Model for Neural Machine Translation,Michiki Kurosawa and Mamoru Komachi,http://arxiv.org/pdf/1909.04879v1 | |
http://arxiv.org/abs/2303.15324v1,creativecommons.org/licenses/by/4.0/,Can Large Language Models design a Robot?,Francesco Stella and Cosimo Della Santina and Josie Hughes,http://arxiv.org/pdf/2303.15324v1 | |
http://arxiv.org/abs/2112.02969v1,creativecommons.org/licenses/by/4.0/,Jigsaw: Large Language Models meet Program Synthesis,Naman Jain and Skanda Vaidyanath and Arun Iyer and Nagarajan Natarajan and Suresh Parthasarathy and Sriram Rajamani and Rahul Sharma,http://arxiv.org/pdf/2112.02969v1 | |
http://arxiv.org/abs/2105.00572v1,creativecommons.org/licenses/by/4.0/,Larger-Scale Transformers for Multilingual Masked Language Modeling,Naman Goyal and Jingfei Du and Myle Ott and Giri Anantharaman and Alexis Conneau,http://arxiv.org/pdf/2105.00572v1 | |
http://arxiv.org/abs/2206.02252v1,creativecommons.org/licenses/by/4.0/,Exploring Cross-lingual Textual Style Transfer with Large Multilingual Language Models,Daniil Moskovskiy and Daryna Dementieva and Alexander Panchenko,http://arxiv.org/pdf/2206.02252v1 | |
http://arxiv.org/abs/2104.00772v1,creativecommons.org/licenses/by/4.0/,Low-Resource Language Modelling of South African Languages,Stuart Mesham and Luc Hayward and Jared Shapiro and Jan Buys,http://arxiv.org/pdf/2104.00772v1 | |
http://arxiv.org/abs/2302.12299v1,creativecommons.org/licenses/by/4.0/,In What Languages are Generative Language Models the Most Formal? Analyzing Formality Distribution across Languages,Asım Ersoy and Gerson Vizcarra and Tasmiah Tahsin Mayeesha and Benjamin Muller,http://arxiv.org/pdf/2302.12299v1 | |
http://arxiv.org/abs/2212.09535v1,creativecommons.org/licenses/by/4.0/,BLOOM+1: Adding Language Support to BLOOM for Zero-Shot Prompting,Zheng-Xin Yong and Hailey Schoelkopf and Niklas Muennighoff and Alham Fikri Aji and David Ifeoluwa Adelani and Khalid Almubarak and M Saiful Bari and Lintang Sutawika and Jungo Kasai and Ahmed Baruwa and Genta Indra Winata and Stella Biderman and Dragomir Radev and Vassilina Nikoulina,http://arxiv.org/pdf/2212.09535v1 | |
http://arxiv.org/abs/2210.14473v1,creativecommons.org/licenses/by/4.0/,Benchmarking Language Models for Code Syntax Understanding,Da Shen and Xinyun Chen and Chenguang Wang and Koushik Sen and Dawn Song,http://arxiv.org/pdf/2210.14473v1 | |
http://arxiv.org/abs/2110.13658v1,creativecommons.org/licenses/by/4.0/,Can Character-based Language Models Improve Downstream Task Performance in Low-Resource and Noisy Language Scenarios?,Arij Riabi and Benoît Sagot and Djamé Seddah,http://arxiv.org/pdf/2110.13658v1 | |
http://arxiv.org/abs/2302.03491v1,creativecommons.org/licenses/by/4.0/,Learning Translation Quality Evaluation on Low Resource Languages from Large Language Models,Amirkeivan Mohtashami and Mauro Verzetti and Paul K. Rubenstein,http://arxiv.org/pdf/2302.03491v1 | |
http://arxiv.org/abs/2110.00687v1,creativecommons.org/licenses/by/4.0/,Investigating Robustness of Dialog Models to Popular Figurative Language Constructs,Harsh Jhamtani and Varun Gangal and Eduard Hovy and Taylor Berg-Kirkpatrick,http://arxiv.org/pdf/2110.00687v1 | |
http://arxiv.org/abs/2112.00567v1,creativecommons.org/licenses/by/4.0/,DPRK-BERT: The Supreme Language Model,Arda Akdemir and Yeojoo Jeon,http://arxiv.org/pdf/2112.00567v1 | |
http://arxiv.org/abs/2106.14127v1,creativecommons.org/licenses/by/4.0/,Visual Conceptual Blending with Large-scale Language and Vision Models,Songwei Ge and Devi Parikh,http://arxiv.org/pdf/2106.14127v1 | |
http://arxiv.org/abs/2209.15236v3,creativecommons.org/licenses/by/4.0/,Language-Family Adapters for Low-Resource Multilingual Neural Machine Translation,Alexandra Chronopoulou and Dario Stojanovski and Alexander Fraser,http://arxiv.org/pdf/2209.15236v3 | |
http://arxiv.org/abs/2203.05300v1,creativecommons.org/licenses/by/4.0/,Connecting Neural Response measurements & Computational Models of language: a non-comprehensive guide,Mostafa Abdou,http://arxiv.org/pdf/2203.05300v1 | |
http://arxiv.org/abs/2301.12597v1,creativecommons.org/licenses/by/4.0/,BLIP-2: Bootstrapping Language-Image Pre-training with Frozen Image Encoders and Large Language Models,Junnan Li and Dongxu Li and Silvio Savarese and Steven Hoi,http://arxiv.org/pdf/2301.12597v1 | |
http://arxiv.org/abs/2007.15813v1,creativecommons.org/licenses/by/4.0/,Language Modelling for Source Code with Transformer-XL,Thomas Dowdell and Hongyu Zhang,http://arxiv.org/pdf/2007.15813v1 | |
http://arxiv.org/abs/2210.00185v1,creativecommons.org/licenses/by/4.0/,Zemi: Learning Zero-Shot Semi-Parametric Language Models from Multiple Tasks,Zhenhailong Wang and Xiaoman Pan and Dian Yu and Dong Yu and Jianshu Chen and Heng Ji,http://arxiv.org/pdf/2210.00185v1 | |
http://arxiv.org/abs/2104.06546v1,creativecommons.org/licenses/by/4.0/,Large-Scale Contextualised Language Modelling for Norwegian,Andrey Kutuzov and Jeremy Barnes and Erik Velldal and Lilja Øvrelid and Stephan Oepen,http://arxiv.org/pdf/2104.06546v1 | |
http://arxiv.org/abs/2303.07304v1,creativecommons.org/licenses/by/4.0/,Algorithmic Ghost in the Research Shell: Large Language Models and Academic Knowledge Creation in Management Research,Nigel Williams and Stanislav Ivanov and Dimitrios Buhalis,http://arxiv.org/pdf/2303.07304v1 | |
http://arxiv.org/abs/1711.01100v1,creativecommons.org/licenses/by/4.0/,One Model to Rule them all: Multitask and Multilingual Modelling for Lexical Analysis,Johannes Bjerva,http://arxiv.org/pdf/1711.01100v1 | |
http://arxiv.org/abs/2212.09146v1,creativecommons.org/licenses/by/4.0/,Can Retriever-Augmented Language Models Reason? The Blame Game Between the Retriever and the Language Model,Parishad BehnamGhader and Santiago Miret and Siva Reddy,http://arxiv.org/pdf/2212.09146v1 | |
http://arxiv.org/abs/2209.01515v2,creativecommons.org/licenses/by/4.0/,Do Large Language Models know what humans know?,Sean Trott and Cameron Jones and Tyler Chang and James Michaelov and Benjamin Bergen,http://arxiv.org/pdf/2209.01515v2 | |
http://arxiv.org/abs/2204.06487v3,creativecommons.org/licenses/by/4.0/,Adapting Pre-trained Language Models to African Languages via Multilingual Adaptive Fine-Tuning,Jesujoba O. Alabi and David Ifeoluwa Adelani and Marius Mosbach and Dietrich Klakow,http://arxiv.org/pdf/2204.06487v3 | |
http://arxiv.org/abs/2211.02069v2,creativecommons.org/licenses/by/4.0/,LMentry: A Language Model Benchmark of Elementary Language Tasks,Avia Efrat and Or Honovich and Omer Levy,http://arxiv.org/pdf/2211.02069v2 | |
http://arxiv.org/abs/2301.12566v1,creativecommons.org/licenses/by/4.0/,Improving Cross-lingual Information Retrieval on Low-Resource Languages via Optimal Transport Distillation,Zhiqi Huang and Puxuan Yu and James Allan,http://arxiv.org/pdf/2301.12566v1 | |
http://arxiv.org/abs/2210.07041v1,creativecommons.org/licenses/by/4.0/,Spontaneous Emerging Preference in Two-tower Language Model,Zhengqi He and Taro Toyoizumi,http://arxiv.org/pdf/2210.07041v1 | |
http://arxiv.org/abs/2212.02564v1,creativecommons.org/licenses/by/4.0/,INCLUSIFY: A benchmark and a model for gender-inclusive German,David Pomerenke,http://arxiv.org/pdf/2212.02564v1 | |
http://arxiv.org/abs/2205.09634v2,creativecommons.org/licenses/by/4.0/,Phylogeny-Inspired Adaptation of Multilingual Models to New Languages,Fahim Faisal and Antonios Anastasopoulos,http://arxiv.org/pdf/2205.09634v2 | |
http://arxiv.org/abs/2209.02842v1,creativecommons.org/licenses/by/4.0/,ASR2K: Speech Recognition for Around 2000 Languages without Audio,Xinjian Li and Florian Metze and David R Mortensen and Alan W Black and Shinji Watanabe,http://arxiv.org/pdf/2209.02842v1 | |
http://arxiv.org/abs/2210.12302v1,creativecommons.org/licenses/by/4.0/,What do Large Language Models Learn beyond Language?,Avinash Madasu and Shashank Srivastava,http://arxiv.org/pdf/2210.12302v1 | |
http://arxiv.org/abs/2302.00093v2,creativecommons.org/licenses/by/4.0/,Large Language Models Can Be Easily Distracted by Irrelevant Context,Freda Shi and Xinyun Chen and Kanishka Misra and Nathan Scales and David Dohan and Ed Chi and Nathanael Schärli and Denny Zhou,http://arxiv.org/pdf/2302.00093v2 | |
http://arxiv.org/abs/2112.06598v2,creativecommons.org/licenses/by/4.0/,WECHSEL: Effective initialization of subword embeddings for cross-lingual transfer of monolingual language models,Benjamin Minixhofer and Fabian Paischer and Navid Rekabsaz,http://arxiv.org/pdf/2112.06598v2 | |
http://arxiv.org/abs/2301.06627v1,creativecommons.org/licenses/by/4.0/,Dissociating language and thought in large language models: a cognitive perspective,Kyle Mahowald and Anna A. Ivanova and Idan A. Blank and Nancy Kanwisher and Joshua B. Tenenbaum and Evelina Fedorenko,http://arxiv.org/pdf/2301.06627v1 | |
http://arxiv.org/abs/2212.08390v1,creativecommons.org/licenses/by/4.0/,Lessons learned from the evaluation of Spanish Language Models,Rodrigo Agerri and Eneko Agirre,http://arxiv.org/pdf/2212.08390v1 | |
http://arxiv.org/abs/2301.04589v1,creativecommons.org/licenses/by/4.0/,Memory Augmented Large Language Models are Computationally Universal,Dale Schuurmans,http://arxiv.org/pdf/2301.04589v1 | |
http://arxiv.org/abs/2110.12010v3,creativecommons.org/licenses/by/4.0/,ClimateBert: A Pretrained Language Model for Climate-Related Text,Nicolas Webersinke and Mathias Kraus and Julia Anna Bingler and Markus Leippold,http://arxiv.org/pdf/2110.12010v3 | |
http://arxiv.org/abs/2210.05758v1,creativecommons.org/licenses/by/4.0/,Decoupled Context Processing for Context Augmented Language Modeling,Zonglin Li and Ruiqi Guo and Sanjiv Kumar,http://arxiv.org/pdf/2210.05758v1 | |
http://arxiv.org/abs/2208.12097v1,creativecommons.org/licenses/by/4.0/,Training a T5 Using Lab-sized Resources,Manuel R. Ciosici and Leon Derczynski,http://arxiv.org/pdf/2208.12097v1 | |
http://arxiv.org/abs/2010.14571v2,creativecommons.org/licenses/by/4.0/,Language ID in the Wild: Unexpected Challenges on the Path to a Thousand-Language Web Text Corpus,Isaac Caswell and Theresa Breiner and Daan van Esch and Ankur Bapna,http://arxiv.org/pdf/2010.14571v2 | |
http://arxiv.org/abs/2203.13344v1,creativecommons.org/licenses/by/4.0/,Linking Emergent and Natural Languages via Corpus Transfer,Shunyu Yao and Mo Yu and Yang Zhang and Karthik R Narasimhan and Joshua B. Tenenbaum and Chuang Gan,http://arxiv.org/pdf/2203.13344v1 | |
http://arxiv.org/abs/1810.07156v2,creativecommons.org/licenses/by/4.0/,Strategies for Language Identification in Code-Mixed Low Resource Languages,Soumil Mandal and Sankalp Sanand,http://arxiv.org/pdf/1810.07156v2 | |
http://arxiv.org/abs/2105.02855v2,creativecommons.org/licenses/by/4.0/,Adapting Monolingual Models: Data can be Scarce when Language Similarity is High,Wietse de Vries and Martijn Bartelds and Malvina Nissim and Martijn Wieling,http://arxiv.org/pdf/2105.02855v2 | |
http://arxiv.org/abs/2206.02885v2,creativecommons.org/licenses/by/4.0/,Norm Participation Grounds Language,David Schlangen,http://arxiv.org/pdf/2206.02885v2 | |
http://arxiv.org/abs/2304.03728v1,creativecommons.org/licenses/by/4.0/,Interpretable Unified Language Checking,Tianhua Zhang and Hongyin Luo and Yung-Sung Chuang and Wei Fang and Luc Gaitskell and Thomas Hartvigsen and Xixin Wu and Danny Fox and Helen Meng and James Glass,http://arxiv.org/pdf/2304.03728v1 | |
http://arxiv.org/abs/2204.06283v2,creativecommons.org/licenses/by/4.0/,Curriculum: A Broad-Coverage Benchmark for Linguistic Phenomena in Natural Language Understanding,Zeming Chen and Qiyue Gao,http://arxiv.org/pdf/2204.06283v2 | |
http://arxiv.org/abs/2212.06094v1,creativecommons.org/licenses/by/4.0/,Prompting Is Programming: A Query Language For Large Language Models,Luca Beurer-Kellner and Marc Fischer and Martin Vechev,http://arxiv.org/pdf/2212.06094v1 | |
http://arxiv.org/abs/2206.12638v1,creativecommons.org/licenses/by/4.0/,Distilling a Pretrained Language Model to a Multilingual ASR Model,Kwanghee Choi and Hyung-Min Park,http://arxiv.org/pdf/2206.12638v1 | |
http://arxiv.org/abs/2204.05717v1,creativecommons.org/licenses/by/4.0/,Do Not Fire the Linguist: Grammatical Profiles Help Language Models Detect Semantic Change,Mario Giulianelli and Andrey Kutuzov and Lidia Pivovarova,http://arxiv.org/pdf/2204.05717v1 | |
http://arxiv.org/abs/2202.09662v6,creativecommons.org/licenses/by/4.0/,Reward Modeling for Mitigating Toxicity in Transformer-based Language Models,Farshid Faal and Ketra Schmitt and Jia Yuan Yu,http://arxiv.org/pdf/2202.09662v6 | |
http://arxiv.org/abs/2304.02015v1,creativecommons.org/licenses/by/4.0/,How well do Large Language Models perform in Arithmetic tasks?,Zheng Yuan and Hongyi Yuan and Chuanqi Tan and Wei Wang and Songfang Huang,http://arxiv.org/pdf/2304.02015v1 | |
http://arxiv.org/abs/2304.09960v2,creativecommons.org/licenses/by/4.0/,A Latent Space Theory for Emergent Abilities in Large Language Models,Hui Jiang,http://arxiv.org/pdf/2304.09960v2 | |
http://arxiv.org/abs/2211.03263v2,creativecommons.org/licenses/by/4.0/,AfroLM: A Self-Active Learning-based Multilingual Pretrained Language Model for 23 African Languages,Bonaventure F. P. Dossou and Atnafu Lambebo Tonja and Oreen Yousuf and Salomey Osei and Abigail Oppong and Iyanuoluwa Shode and Oluwabusayo Olufunke Awoyomi and Chris Chinenye Emezue,http://arxiv.org/pdf/2211.03263v2 | |
http://arxiv.org/abs/2302.01973v2,creativecommons.org/licenses/by/4.0/,Measuring The Impact Of Programming Language Distribution,Gabriel Orlanski and Kefan Xiao and Xavier Garcia and Jeffrey Hui and Joshua Howland and Jonathan Malmaud and Jacob Austin and Rishah Singh and Michele Catasta,http://arxiv.org/pdf/2302.01973v2 | |
http://arxiv.org/abs/2109.01207v4,creativecommons.org/licenses/by/4.0/,Similarity of Sentence Representations in Multilingual LMs: Resolving Conflicting Literature and Case Study of Baltic Languages,Maksym Del and Mark Fishel,http://arxiv.org/pdf/2109.01207v4 | |
http://arxiv.org/abs/2210.14409v1,creativecommons.org/licenses/by/4.0/,Modeling the Graphotactics of Low-Resource Languages Using Sequential GANs,Isaac Wasserman,http://arxiv.org/pdf/2210.14409v1 | |
http://arxiv.org/abs/2301.10439v2,creativecommons.org/licenses/by/4.0/,ViDeBERTa: A powerful pre-trained language model for Vietnamese,Cong Dao Tran and Nhut Huy Pham and Anh Nguyen and Truong Son Hy and Tu Vu,http://arxiv.org/pdf/2301.10439v2 | |
http://arxiv.org/abs/2304.00869v1,creativecommons.org/licenses/by/4.0/,GreekBART: The First Pretrained Greek Sequence-to-Sequence Model,Iakovos Evdaimon and Hadi Abdine and Christos Xypolopoulos and Stamatis Outsios and Michalis Vazirgiannis and Giorgos Stamou,http://arxiv.org/pdf/2304.00869v1 | |
http://arxiv.org/abs/2201.09227v2,creativecommons.org/licenses/by/4.0/,A Large and Diverse Arabic Corpus for Language Modeling,Abbas Raza Ali and Muhammad Ajmal Siddiqui and Rema Algunaibet and Hasan Raza Ali,http://arxiv.org/pdf/2201.09227v2 | |
http://arxiv.org/abs/2211.05417v1,creativecommons.org/licenses/by/4.0/,Can Transformers Reason in Fragments of Natural Language?,Viktor Schlegel and Kamen V. Pavlov and Ian Pratt-Hartmann,http://arxiv.org/pdf/2211.05417v1 | |
http://arxiv.org/abs/1904.09122v1,creativecommons.org/licenses/by/4.0/,Zero-Shot Cross-Lingual Opinion Target Extraction,Soufian Jebbara and Philipp Cimiano,http://arxiv.org/pdf/1904.09122v1 | |
http://arxiv.org/abs/2303.08006v2,creativecommons.org/licenses/by/4.0/,Data-Efficient Learning of Natural Language to Linear Temporal Logic Translators for Robot Task Specification,Jiayi Pan and Glen Chou and Dmitry Berenson,http://arxiv.org/pdf/2303.08006v2 | |
http://arxiv.org/abs/2111.01243v1,creativecommons.org/licenses/by/4.0/,Recent Advances in Natural Language Processing via Large Pre-Trained Language Models: A Survey,Bonan Min and Hayley Ross and Elior Sulem and Amir Pouran Ben Veyseh and Thien Huu Nguyen and Oscar Sainz and Eneko Agirre and Ilana Heinz and Dan Roth,http://arxiv.org/pdf/2111.01243v1 | |
http://arxiv.org/abs/1802.08375v2,creativecommons.org/licenses/by/4.0/,Reusing Weights in Subword-aware Neural Language Models,Zhenisbek Assylbekov and Rustem Takhanov,http://arxiv.org/pdf/1802.08375v2 | |
http://arxiv.org/abs/2304.06962v1,creativecommons.org/licenses/by/4.0/,Prompt Engineering and Calibration for Zero-Shot Commonsense Reasoning,Chenkai Ma,http://arxiv.org/pdf/2304.06962v1 | |
http://arxiv.org/abs/2304.08865v1,creativecommons.org/licenses/by/4.0/,Romanization-based Large-scale Adaptation of Multilingual Language Models,Sukannya Purkayastha and Sebastian Ruder and Jonas Pfeiffer and Iryna Gurevych and Ivan Vulić,http://arxiv.org/pdf/2304.08865v1 | |
http://arxiv.org/abs/2105.12428v1,creativecommons.org/licenses/by/4.0/,"Neural Morphology Dataset and Models for Multiple Languages, from the Large to the Endangered",Mika Hämäläinen and Niko Partanen and Jack Rueter and Khalid Alnajjar,http://arxiv.org/pdf/2105.12428v1 | |
http://arxiv.org/abs/2111.08546v1,creativecommons.org/licenses/by/4.0/,Interpreting Language Models Through Knowledge Graph Extraction,Vinitra Swamy and Angelika Romanou and Martin Jaggi,http://arxiv.org/pdf/2111.08546v1 | |
http://arxiv.org/abs/2104.08826v2,creativecommons.org/licenses/by/4.0/,GPT3Mix: Leveraging Large-scale Language Models for Text Augmentation,Kang Min Yoo and Dongju Park and Jaewook Kang and Sang-Woo Lee and Woomyeong Park,http://arxiv.org/pdf/2104.08826v2 | |
http://arxiv.org/abs/2201.11903v6,creativecommons.org/licenses/by/4.0/,Chain-of-Thought Prompting Elicits Reasoning in Large Language Models,Jason Wei and Xuezhi Wang and Dale Schuurmans and Maarten Bosma and Brian Ichter and Fei Xia and Ed Chi and Quoc Le and Denny Zhou,http://arxiv.org/pdf/2201.11903v6 | |
http://arxiv.org/abs/2002.05417v1,creativecommons.org/licenses/by/4.0/,Comparison of Turkish Word Representations Trained on Different Morphological Forms,Gökhan Güler and A. Cüneyd Tantuğ,http://arxiv.org/pdf/2002.05417v1 | |
http://arxiv.org/abs/2006.00591v2,creativecommons.org/licenses/by/4.0/,Efficient Deployment of Conversational Natural Language Interfaces over Databases,Anthony Colas and Trung Bui and Franck Dernoncourt and Moumita Sinha and Doo Soon Kim,http://arxiv.org/pdf/2006.00591v2 | |
http://arxiv.org/abs/2303.07226v1,creativecommons.org/licenses/by/4.0/,Scaling Vision-Language Models with Sparse Mixture of Experts,Sheng Shen and Zhewei Yao and Chunyuan Li and Trevor Darrell and Kurt Keutzer and Yuxiong He,http://arxiv.org/pdf/2303.07226v1 | |
http://arxiv.org/abs/2204.10198v2,creativecommons.org/licenses/by/4.0/,Context-Aware Language Modeling for Goal-Oriented Dialogue Systems,Charlie Snell and Mengjiao Yang and Justin Fu and Yi Su and Sergey Levine,http://arxiv.org/pdf/2204.10198v2 | |
http://arxiv.org/abs/2012.07331v1,creativecommons.org/licenses/by/4.0/,Audio Captioning using Pre-Trained Large-Scale Language Model Guided by Audio-based Similar Caption Retrieval,Yuma Koizumi and Yasunori Ohishi and Daisuke Niizumi and Daiki Takeuchi and Masahiro Yasuda,http://arxiv.org/pdf/2012.07331v1 | |
http://arxiv.org/abs/2102.07350v1,creativecommons.org/licenses/by/4.0/,Prompt Programming for Large Language Models: Beyond the Few-Shot Paradigm,Laria Reynolds and Kyle McDonell,http://arxiv.org/pdf/2102.07350v1 | |
http://arxiv.org/abs/2207.06839v1,creativecommons.org/licenses/by/4.0/,Neural Data-to-Text Generation Based on Small Datasets: Comparing the Added Value of Two Semi-Supervised Learning Approaches on Top of a Large Language Model,Chris van der Lee and Thiago Castro Ferreira and Chris Emmery and Travis Wiltshire and Emiel Krahmer,http://arxiv.org/pdf/2207.06839v1 | |
http://arxiv.org/abs/2212.10461v1,creativecommons.org/licenses/by/4.0/,Go-tuning: Improving Zero-shot Learning Abilities of Smaller Language Models,Jingjing Xu and Qingxiu Dong and Hongyi Liu and Lei Li,http://arxiv.org/pdf/2212.10461v1 | |
http://arxiv.org/abs/2207.03777v1,creativecommons.org/licenses/by/4.0/,Hidden Schema Networks,Ramsés J. Sánchez and Lukas Conrads and Pascal Welke and Kostadin Cvejoski and César Ojeda,http://arxiv.org/pdf/2207.03777v1 | |
http://arxiv.org/abs/2105.14880v1,creativecommons.org/licenses/by/4.0/,A Multilingual Modeling Method for Span-Extraction Reading Comprehension,Gaochen Wu and Bin Xu and Dejie Chang and Bangchang Liu,http://arxiv.org/pdf/2105.14880v1 | |
http://arxiv.org/abs/2301.12868v3,creativecommons.org/licenses/by/4.0/,On Robustness of Prompt-based Semantic Parsing with Large Pre-trained Language Model: An Empirical Study on Codex,Terry Yue Zhuo and Zhuang Li and Yujin Huang and Fatemeh Shiri and Weiqing Wang and Gholamreza Haffari and Yuan-Fang Li,http://arxiv.org/pdf/2301.12868v3 | |
http://arxiv.org/abs/2204.08110v4,creativecommons.org/licenses/by/4.0/,Language Contamination Helps Explain the Cross-lingual Capabilities of English Pretrained Models,Terra Blevins and Luke Zettlemoyer,http://arxiv.org/pdf/2204.08110v4 | |
http://arxiv.org/abs/2207.09152v1,creativecommons.org/licenses/by/4.0/,Benchmarking Transformers-based models on French Spoken Language Understanding tasks,Oralie Cattan and Sahar Ghannay and Christophe Servan and Sophie Rosset,http://arxiv.org/pdf/2207.09152v1 | |
http://arxiv.org/abs/2112.08346v1,creativecommons.org/licenses/by/4.0/,Simple Text Detoxification by Identifying a Linear Toxic Subspace in Language Model Embeddings,Andrew Wang and Mohit Sudhakar and Yangfeng Ji,http://arxiv.org/pdf/2112.08346v1 | |
http://arxiv.org/abs/2205.05718v1,creativecommons.org/licenses/by/4.0/,"Structured, flexible, and robust: benchmarking and improving large language models towards more human-like behavior in out-of-distribution reasoning tasks",Katherine M. Collins and Catherine Wong and Jiahai Feng and Megan Wei and Joshua B. Tenenbaum,http://arxiv.org/pdf/2205.05718v1 | |
http://arxiv.org/abs/2104.14830v2,creativecommons.org/licenses/by/4.0/,Scaling End-to-End Models for Large-Scale Multilingual ASR,Bo Li and Ruoming Pang and Tara N. Sainath and Anmol Gulati and Yu Zhang and James Qin and Parisa Haghani and W. Ronny Huang and Min Ma and Junwen Bai,http://arxiv.org/pdf/2104.14830v2 | |
http://arxiv.org/abs/2208.13078v1,creativecommons.org/licenses/by/4.0/,MDIA: A Benchmark for Multilingual Dialogue Generation in 46 Languages,Qingyu Zhang and Xiaoyu Shen and Ernie Chang and Jidong Ge and Pengke Chen,http://arxiv.org/pdf/2208.13078v1 | |
http://arxiv.org/abs/2304.09957v1,creativecommons.org/licenses/by/4.0/,Low-resource Bilingual Dialect Lexicon Induction with Large Language Models,Ekaterina Artemova and Barbara Plank,http://arxiv.org/pdf/2304.09957v1 | |
http://arxiv.org/abs/2102.03596v1,creativecommons.org/licenses/by/4.0/,Does He Wink or Does He Nod? A Challenging Benchmark for Evaluating Word Understanding of Language Models,Lutfi Kerem Senel and Hinrich Schütze,http://arxiv.org/pdf/2102.03596v1 | |
http://arxiv.org/abs/2109.12346v3,creativecommons.org/licenses/by/4.0/,DziriBERT: a Pre-trained Language Model for the Algerian Dialect,Amine Abdaoui and Mohamed Berrimi and Mourad Oussalah and Abdelouahab Moussaoui,http://arxiv.org/pdf/2109.12346v3 | |
http://arxiv.org/abs/2110.14782v3,creativecommons.org/licenses/by/4.0/,When is BERT Multilingual? Isolating Crucial Ingredients for Cross-lingual Transfer,Ameet Deshpande and Partha Talukdar and Karthik Narasimhan,http://arxiv.org/pdf/2110.14782v3 | |
http://arxiv.org/abs/2104.09411v1,creativecommons.org/licenses/by/4.0/,Understanding Chinese Video and Language via Contrastive Multimodal Pre-Training,Chenyi Lei and Shixian Luo and Yong Liu and Wanggui He and Jiamang Wang and Guoxin Wang and Haihong Tang and Chunyan Miao and Houqiang Li,http://arxiv.org/pdf/2104.09411v1 | |
http://arxiv.org/abs/2201.00150v5,creativecommons.org/licenses/by/4.0/,Cross-Domain Deep Code Search with Few-Shot Meta Learning,Yitian Chai and Hongyu Zhang and Beijun Shen and Xiaodong Gu,http://arxiv.org/pdf/2201.00150v5 | |
http://arxiv.org/abs/2304.03738v2,creativecommons.org/licenses/by/4.0/,Should ChatGPT be Biased? Challenges and Risks of Bias in Large Language Models,Emilio Ferrara,http://arxiv.org/pdf/2304.03738v2 | |
http://arxiv.org/abs/2104.05882v1,creativecommons.org/licenses/by/4.0/,Discourse Probing of Pretrained Language Models,Fajri Koto and Jey Han Lau and Timothy Baldwin,http://arxiv.org/pdf/2104.05882v1 | |
http://arxiv.org/abs/2107.08146v2,creativecommons.org/licenses/by/4.0/,Picard understanding Darmok: A Dataset and Model for Metaphor-Rich Translation in a Constructed Language,Peter Jansen and Jordan Boyd-Graber,http://arxiv.org/pdf/2107.08146v2 | |
http://arxiv.org/abs/2201.07311v1,creativecommons.org/licenses/by/4.0/,Datasheet for the Pile,Stella Biderman and Kieran Bicheno and Leo Gao,http://arxiv.org/pdf/2201.07311v1
http://arxiv.org/abs/2210.13693v1,creativecommons.org/licenses/by/4.0/,XRICL: Cross-lingual Retrieval-Augmented In-Context Learning for Cross-lingual Text-to-SQL Semantic Parsing,Peng Shi and Rui Zhang and He Bai and Jimmy Lin,http://arxiv.org/pdf/2210.13693v1
http://arxiv.org/abs/2202.08772v1,creativecommons.org/licenses/by/4.0/,A Survey of Knowledge-Intensive NLP with Pre-Trained Language Models,Da Yin and Li Dong and Hao Cheng and Xiaodong Liu and Kai-Wei Chang and Furu Wei and Jianfeng Gao,http://arxiv.org/pdf/2202.08772v1
http://arxiv.org/abs/2004.14963v3,creativecommons.org/licenses/by/4.0/,Data and Representation for Turkish Natural Language Inference,Emrah Budur and Rıza Özçelik and Tunga Güngör and Christopher Potts,http://arxiv.org/pdf/2004.14963v3
http://arxiv.org/abs/2205.14326v1,creativecommons.org/licenses/by/4.0/,Adaptive Activation Network For Low Resource Multilingual Speech Recognition,Jian Luo and Jianzong Wang and Ning Cheng and Zhenpeng Zheng and Jing Xiao,http://arxiv.org/pdf/2205.14326v1
http://arxiv.org/abs/2206.11871v1,creativecommons.org/licenses/by/4.0/,Offline RL for Natural Language Generation with Implicit Language Q Learning,Charlie Snell and Ilya Kostrikov and Yi Su and Mengjiao Yang and Sergey Levine,http://arxiv.org/pdf/2206.11871v1
http://arxiv.org/abs/2208.03067v2,creativecommons.org/licenses/by/4.0/,Large vocabulary speech recognition for languages of Africa: multilingual modeling and self-supervised learning,Sandy Ritchie and You-Chi Cheng and Mingqing Chen and Rajiv Mathews and Daan van Esch and Bo Li and Khe Chai Sim,http://arxiv.org/pdf/2208.03067v2
http://arxiv.org/abs/2212.09196v2,creativecommons.org/licenses/by/4.0/,Emergent Analogical Reasoning in Large Language Models,Taylor Webb and Keith J. Holyoak and Hongjing Lu,http://arxiv.org/pdf/2212.09196v2
http://arxiv.org/abs/2203.13411v1,creativecommons.org/licenses/by/4.0/,Reshaping Robot Trajectories Using Natural Language Commands: A Study of Multi-Modal Data Alignment Using Transformers,Arthur Bucker and Luis Figueredo and Sami Haddadin and Ashish Kapoor and Shuang Ma and Rogerio Bonatti,http://arxiv.org/pdf/2203.13411v1
http://arxiv.org/abs/2012.04307v2,creativecommons.org/licenses/by/4.0/,Cross-lingual Transfer of Abstractive Summarizer to Less-resource Language,Aleš Žagar and Marko Robnik-Šikonja,http://arxiv.org/pdf/2012.04307v2
http://arxiv.org/abs/2205.00551v3,creativecommons.org/licenses/by/4.0/,Gender Bias in Masked Language Models for Multiple Languages,Masahiro Kaneko and Aizhan Imankulova and Danushka Bollegala and Naoaki Okazaki,http://arxiv.org/pdf/2205.00551v3
http://arxiv.org/abs/2104.07358v2,creativecommons.org/licenses/by/4.0/,Adaptive Sparse Transformer for Multilingual Translation,Hongyu Gong and Xian Li and Dmitriy Genzel,http://arxiv.org/pdf/2104.07358v2
http://arxiv.org/abs/2108.07790v3,creativecommons.org/licenses/by/4.0/,Mitigating harm in language models with conditional-likelihood filtration,Helen Ngo and Cooper Raterink and João G. M. Araújo and Ivan Zhang and Carol Chen and Adrien Morisot and Nicholas Frosst,http://arxiv.org/pdf/2108.07790v3
http://arxiv.org/abs/2110.01485v2,creativecommons.org/licenses/by/4.0/,JuriBERT: A Masked-Language Model Adaptation for French Legal Text,Stella Douka and Hadi Abdine and Michalis Vazirgiannis and Rajaa El Hamdani and David Restrepo Amariles,http://arxiv.org/pdf/2110.01485v2
http://arxiv.org/abs/2110.13032v2,creativecommons.org/licenses/by/4.0/,Paradigm Shift in Language Modeling: Revisiting CNN for Modeling Sanskrit Originated Bengali and Hindi Language,Chowdhury Rafeed Rahman and MD. Hasibur Rahman and Mohammad Rafsan and Samiha Zakir and Mohammed Eunus Ali and Rafsanjani Muhammod,http://arxiv.org/pdf/2110.13032v2
http://arxiv.org/abs/2207.09099v1,creativecommons.org/licenses/by/4.0/,Analyzing Bagging Methods for Language Models,Pranab Islam and Shaan Khosla and Arthur Lok and Mudit Saxena,http://arxiv.org/pdf/2207.09099v1
http://arxiv.org/abs/2209.02982v2,creativecommons.org/licenses/by/4.0/,Improving the Cross-Lingual Generalisation in Visual Question Answering,Farhad Nooralahzadeh and Rico Sennrich,http://arxiv.org/pdf/2209.02982v2
http://arxiv.org/abs/2212.12937v1,creativecommons.org/licenses/by/4.0/,GAE-ISumm: Unsupervised Graph-Based Summarization of Indian Languages,Lakshmi Sireesha Vakada and Anudeep Ch and Mounika Marreddy and Subba Reddy Oota and Radhika Mamidi,http://arxiv.org/pdf/2212.12937v1
http://arxiv.org/abs/2207.08982v1,creativecommons.org/licenses/by/4.0/,Selection Bias Induced Spurious Correlations in Large Language Models,Emily McMilin,http://arxiv.org/pdf/2207.08982v1
http://arxiv.org/abs/2111.04909v3,creativecommons.org/licenses/by/4.0/,FPM: A Collection of Large-scale Foundation Pre-trained Language Models,Dezhou Shen,http://arxiv.org/pdf/2111.04909v3
http://arxiv.org/abs/2212.10471v1,creativecommons.org/licenses/by/4.0/,Little Red Riding Hood Goes Around the Globe: Crosslingual Story Planning and Generation with Large Language Models,Evgeniia Razumovskaia and Joshua Maynez and Annie Louis and Mirella Lapata and Shashi Narayan,http://arxiv.org/pdf/2212.10471v1
http://arxiv.org/abs/2302.12834v1,creativecommons.org/licenses/by/4.0/,Leveraging Large Language Model and Story-Based Gamification in Intelligent Tutoring System to Scaffold Introductory Programming Courses: A Design-Based Research Study,Chen Cao,http://arxiv.org/pdf/2302.12834v1
http://arxiv.org/abs/2106.13627v1,creativecommons.org/licenses/by/4.0/,Language Models are Good Translators,Shuo Wang and Zhaopeng Tu and Zhixing Tan and Wenxuan Wang and Maosong Sun and Yang Liu,http://arxiv.org/pdf/2106.13627v1
http://arxiv.org/abs/2204.10365v1,creativecommons.org/licenses/by/4.0/,Towards an Enhanced Understanding of Bias in Pre-trained Neural Language Models: A Survey with Special Emphasis on Affective Bias,Anoop K. and Manjary P. Gangan and Deepak P. and Lajish V. L,http://arxiv.org/pdf/2204.10365v1
http://arxiv.org/abs/2205.10782v1,creativecommons.org/licenses/by/4.0/,Instruction Induction: From Few Examples to Natural Language Task Descriptions,Or Honovich and Uri Shaham and Samuel R. Bowman and Omer Levy,http://arxiv.org/pdf/2205.10782v1
http://arxiv.org/abs/2206.04439v1,creativecommons.org/licenses/by/4.0/,Dict-NMT: Bilingual Dictionary based NMT for Extremely Low Resource Languages,Nalin Kumar and Deepak Kumar and Subhankar Mishra,http://arxiv.org/pdf/2206.04439v1
http://arxiv.org/abs/2205.10583v4,creativecommons.org/licenses/by/4.0/,Automated Repair of Programs from Large Language Models,Zhiyu Fan and Xiang Gao and Martin Mirchev and Abhik Roychoudhury and Shin Hwei Tan,http://arxiv.org/pdf/2205.10583v4
http://arxiv.org/abs/2304.06123v1,creativecommons.org/licenses/by/4.0/,The Impact of Large Language Multi-Modal Models on the Future of Job Market,Tarry Singh,http://arxiv.org/pdf/2304.06123v1
http://arxiv.org/abs/2210.03941v1,creativecommons.org/licenses/by/4.0/,Learning Fine-Grained Visual Understanding for Video Question Answering via Decoupling Spatial-Temporal Modeling,Hsin-Ying Lee and Hung-Ting Su and Bing-Chen Tsai and Tsung-Han Wu and Jia-Fong Yeh and Winston H. Hsu,http://arxiv.org/pdf/2210.03941v1
http://arxiv.org/abs/2211.05958v2,creativecommons.org/licenses/by/4.0/,MINION: a Large-Scale and Diverse Dataset for Multilingual Event Detection,Amir Pouran Ben Veyseh and Minh Van Nguyen and Franck Dernoncourt and Thien Huu Nguyen,http://arxiv.org/pdf/2211.05958v2
http://arxiv.org/abs/1909.04302v1,creativecommons.org/licenses/by/4.0/,Multimodal Embeddings from Language Models,Shao-Yen Tseng and Panayiotis Georgiou and Shrikanth Narayanan,http://arxiv.org/pdf/1909.04302v1
http://arxiv.org/abs/2202.12576v1,creativecommons.org/licenses/by/4.0/,A Survey of Multilingual Models for Automatic Speech Recognition,Hemant Yadav and Sunayana Sitaram,http://arxiv.org/pdf/2202.12576v1
http://arxiv.org/abs/2210.17236v1,creativecommons.org/licenses/by/4.0/,When Language Model Meets Private Library,Daoguang Zan and Bei Chen and Zeqi Lin and Bei Guan and Yongji Wang and Jian-Guang Lou,http://arxiv.org/pdf/2210.17236v1
http://arxiv.org/abs/2211.02098v1,creativecommons.org/licenses/by/4.0/,Overcoming Barriers to Skill Injection in Language Modeling: Case Study in Arithmetic,Mandar Sharma and Nikhil Muralidhar and Naren Ramakrishnan,http://arxiv.org/pdf/2211.02098v1
http://arxiv.org/abs/1902.08830v1,creativecommons.org/licenses/by/4.0/,Categorization in the Wild: Generalizing Cognitive Models to Naturalistic Data across Languages,Lea Frermann and Mirella Lapata,http://arxiv.org/pdf/1902.08830v1
http://arxiv.org/abs/2012.15643v2,creativecommons.org/licenses/by/4.0/,CoCoLM: COmplex COmmonsense Enhanced Language Model with Discourse Relations,Changlong Yu and Hongming Zhang and Yangqiu Song and Wilfred Ng,http://arxiv.org/pdf/2012.15643v2
http://arxiv.org/abs/2212.10551v1,creativecommons.org/licenses/by/4.0/,Lego-MT: Towards Detachable Models in Massively Multilingual Machine Translation,Fei Yuan and Yinquan Lu and WenHao Zhu and Lingpeng Kong and Lei Li and Jingjing Xu,http://arxiv.org/pdf/2212.10551v1
http://arxiv.org/abs/2301.12004v1,creativecommons.org/licenses/by/4.0/,Understanding the Effectiveness of Very Large Language Models on Dialog Evaluation,Jessica Huynh and Cathy Jiao and Prakhar Gupta and Shikib Mehri and Payal Bajaj and Vishrav Chaudhary and Maxine Eskenazi,http://arxiv.org/pdf/2301.12004v1
http://arxiv.org/abs/2111.07119v1,creativecommons.org/licenses/by/4.0/,Extracting and filtering paraphrases by bridging natural language inference and paraphrasing,Matej Klemen and Marko Robnik-Šikonja,http://arxiv.org/pdf/2111.07119v1
http://arxiv.org/abs/2210.11757v1,creativecommons.org/licenses/by/4.0/,University of Cape Town's WMT22 System: Multilingual Machine Translation for Southern African Languages,Khalid N. Elmadani and Francois Meyer and Jan Buys,http://arxiv.org/pdf/2210.11757v1
http://arxiv.org/abs/2003.01355v2,creativecommons.org/licenses/by/4.0/,CLUECorpus2020: A Large-scale Chinese Corpus for Pre-training Language Model,Liang Xu and Xuanwei Zhang and Qianqian Dong,http://arxiv.org/pdf/2003.01355v2
http://arxiv.org/abs/2110.08294v2,creativecommons.org/licenses/by/4.0/,Coherence boosting: When your pretrained language model is not paying enough attention,Nikolay Malkin and Zhen Wang and Nebojsa Jojic,http://arxiv.org/pdf/2110.08294v2
http://arxiv.org/abs/2112.13960v1,creativecommons.org/licenses/by/4.0/,A Preordered RNN Layer Boosts Neural Machine Translation in Low Resource Settings,Mohaddeseh Bastan and Shahram Khadivi,http://arxiv.org/pdf/2112.13960v1
http://arxiv.org/abs/2302.06555v1,creativecommons.org/licenses/by/4.0/,Implications of the Convergence of Language and Vision Model Geometries,Jiaang Li and Yova Kementchedjhieva and Anders Søgaard,http://arxiv.org/pdf/2302.06555v1
http://arxiv.org/abs/2110.03501v3,creativecommons.org/licenses/by/4.0/,Pretrained Language Models are Symbolic Mathematics Solvers too!,Kimia Noorbakhsh and Modar Sulaiman and Mahdi Sharifi and Kallol Roy and Pooyan Jamshidi,http://arxiv.org/pdf/2110.03501v3
http://arxiv.org/abs/2104.11390v1,creativecommons.org/licenses/by/4.0/,Transfer training from smaller language model,Han Zhang,http://arxiv.org/pdf/2104.11390v1
http://arxiv.org/abs/2301.04013v1,creativecommons.org/licenses/by/4.0/,There is No Big Brother or Small Brother: Knowledge Infusion in Language Models for Link Prediction and Question Answering,Ankush Agarwal and Sakharam Gawade and Sachin Channabasavarajendra and Pushpak Bhattacharyya,http://arxiv.org/pdf/2301.04013v1
http://arxiv.org/abs/2211.13899v1,creativecommons.org/licenses/by/4.0/,Comparison Study Between Token Classification and Sequence Classification In Text Classification,Amir Jafari,http://arxiv.org/pdf/2211.13899v1
http://arxiv.org/abs/2304.11389v1,creativecommons.org/licenses/by/4.0/,Transformer-Based LM Surprisal Predicts Human Reading Times Best with About Two Billion Training Tokens,Byung-Doh Oh and William Schuler,http://arxiv.org/pdf/2304.11389v1
http://arxiv.org/abs/2201.06317v1,creativecommons.org/licenses/by/4.0/,Language Model-Based Paired Variational Autoencoders for Robotic Language Learning,Ozan Özdemir and Matthias Kerzel and Cornelius Weber and Jae Hee Lee and Stefan Wermter,http://arxiv.org/pdf/2201.06317v1
http://arxiv.org/abs/2304.07880v2,creativecommons.org/licenses/by/4.0/,Sabiá: Portuguese Large Language Models,Ramon Pires and Hugo Abonizio and Thales Sales Almeida and Rodrigo Nogueira,http://arxiv.org/pdf/2304.07880v2
http://arxiv.org/abs/2302.12313v2,creativecommons.org/licenses/by/4.0/,Testing AI performance on less frequent aspects of language reveals insensitivity to underlying meaning,Vittoria Dentella and Elliot Murphy and Gary Marcus and Evelina Leivada,http://arxiv.org/pdf/2302.12313v2
http://arxiv.org/abs/2211.08264v1,creativecommons.org/licenses/by/4.0/,QAmeleon: Multilingual QA with Only 5 Examples,Priyanka Agrawal and Chris Alberti and Fantine Huot and Joshua Maynez and Ji Ma and Sebastian Ruder and Kuzman Ganchev and Dipanjan Das and Mirella Lapata,http://arxiv.org/pdf/2211.08264v1
http://arxiv.org/abs/2304.08485v1,creativecommons.org/licenses/by/4.0/,Visual Instruction Tuning,Haotian Liu and Chunyuan Li and Qingyang Wu and Yong Jae Lee,http://arxiv.org/pdf/2304.08485v1
http://arxiv.org/abs/2303.06689v1,creativecommons.org/licenses/by/4.0/,Self-planning Code Generation with Large Language Model,Xue Jiang and Yihong Dong and Lecheng Wang and Qiwei Shang and Ge Li,http://arxiv.org/pdf/2303.06689v1
http://arxiv.org/abs/2010.08319v1,creativecommons.org/licenses/by/4.0/,Detecting ESG topics using domain-specific language models and data augmentation approaches,Tim Nugent and Nicole Stelea and Jochen L. Leidner,http://arxiv.org/pdf/2010.08319v1
http://arxiv.org/abs/2203.16595v3,creativecommons.org/licenses/by/4.0/,Improving Speech Recognition for Indic Languages using Language Model,Ankur Dhuriya and Harveen Singh Chadha and Anirudh Gupta and Priyanshi Shah and Neeraj Chhimwal and Rishabh Gaur and Vivek Raghavan,http://arxiv.org/pdf/2203.16595v3
http://arxiv.org/abs/2205.12910v2,creativecommons.org/licenses/by/4.0/,NaturalProver: Grounded Mathematical Proof Generation with Language Models,Sean Welleck and Jiacheng Liu and Ximing Lu and Hannaneh Hajishirzi and Yejin Choi,http://arxiv.org/pdf/2205.12910v2
http://arxiv.org/abs/2209.12099v1,creativecommons.org/licenses/by/4.0/,Controllable Text Generation for Open-Domain Creativity and Fairness,Nanyun Peng,http://arxiv.org/pdf/2209.12099v1
http://arxiv.org/abs/2301.04347v3,creativecommons.org/licenses/by/4.0/,Counteracts: Testing Stereotypical Representation in Pre-trained Language Models,Damin Zhang and Julia Rayz and Romila Pradhan,http://arxiv.org/pdf/2301.04347v3
http://arxiv.org/abs/2109.04921v1,creativecommons.org/licenses/by/4.0/,Examining Cross-lingual Contextual Embeddings with Orthogonal Structural Probes,Tomasz Limisiewicz and David Mareček,http://arxiv.org/pdf/2109.04921v1
http://arxiv.org/abs/2207.10648v1,creativecommons.org/licenses/by/4.0/,A No-Code Low-Code Paradigm for Authoring Business Automations Using Natural Language,Michael Desmond and Evelyn Duesterwald and Vatche Isahagian and Vinod Muthusamy,http://arxiv.org/pdf/2207.10648v1
http://arxiv.org/abs/2111.09728v1,creativecommons.org/licenses/by/4.0/,Measuring source code conciseness across programming languages using compression,Lodewijk Bergmans and Xander Schrijen and Edwin Ouwehand and Magiel Bruntink,http://arxiv.org/pdf/2111.09728v1
http://arxiv.org/abs/2112.09600v1,creativecommons.org/licenses/by/4.0/,Transcribing Natural Languages for The Deaf via Neural Editing Programs,Dongxu Li and Chenchen Xu and Liu Liu and Yiran Zhong and Rong Wang and Lars Petersson and Hongdong Li,http://arxiv.org/pdf/2112.09600v1
http://arxiv.org/abs/2203.16972v3,creativecommons.org/licenses/by/4.0/,Improving Language Identification of Accented Speech,Kunnar Kukk and Tanel Alumäe,http://arxiv.org/pdf/2203.16972v3
http://arxiv.org/abs/2209.06794v2,creativecommons.org/licenses/by/4.0/,PaLI: A Jointly-Scaled Multilingual Language-Image Model,Xi Chen and Xiao Wang and Soravit Changpinyo and AJ Piergiovanni and Piotr Padlewski and Daniel Salz and Sebastian Goodman and Adam Grycner and Basil Mustafa and Lucas Beyer and Alexander Kolesnikov and Joan Puigcerver and Nan Ding and Keran Rong and Hassan Akbari and Gaurav Mishra and Linting Xue and Ashish Thapliyal and James Bradbury and Weicheng Kuo and Mojtaba Seyedhosseini and Chao Jia and Burcu Karagol Ayan and Carlos Riquelme and Andreas Steiner and Anelia Angelova and Xiaohua Zhai and Neil Houlsby and Radu Soricut,http://arxiv.org/pdf/2209.06794v2
http://arxiv.org/abs/2203.05081v1,creativecommons.org/licenses/by/4.0/,NLX-GPT: A Model for Natural Language Explanations in Vision and Vision-Language Tasks,Fawaz Sammani and Tanmoy Mukherjee and Nikos Deligiannis,http://arxiv.org/pdf/2203.05081v1
http://arxiv.org/abs/2303.12528v2,creativecommons.org/licenses/by/4.0/,MEGA: Multilingual Evaluation of Generative AI,Kabir Ahuja and Rishav Hada and Millicent Ochieng and Prachi Jain and Harshita Diddee and Samuel Maina and Tanuja Ganu and Sameer Segal and Maxamed Axmed and Kalika Bali and Sunayana Sitaram,http://arxiv.org/pdf/2303.12528v2
http://arxiv.org/abs/2302.05508v1,creativecommons.org/licenses/by/4.0/,FairPy: A Toolkit for Evaluation of Social Biases and their Mitigation in Large Language Models,Hrishikesh Viswanath and Tianyi Zhang,http://arxiv.org/pdf/2302.05508v1
http://arxiv.org/abs/2303.16104v1,creativecommons.org/licenses/by/4.0/,Hallucinations in Large Multilingual Translation Models,Nuno M. Guerreiro and Duarte Alves and Jonas Waldendorf and Barry Haddow and Alexandra Birch and Pierre Colombo and André F. T. Martins,http://arxiv.org/pdf/2303.16104v1
http://arxiv.org/abs/2204.06644v2,creativecommons.org/licenses/by/4.0/,METRO: Efficient Denoising Pretraining of Large Scale Autoencoding Language Models with Model Generated Signals,Payal Bajaj and Chenyan Xiong and Guolin Ke and Xiaodong Liu and Di He and Saurabh Tiwary and Tie-Yan Liu and Paul Bennett and Xia Song and Jianfeng Gao,http://arxiv.org/pdf/2204.06644v2
http://arxiv.org/abs/2212.10422v2,creativecommons.org/licenses/by/4.0/,Localising In-Domain Adaptation of Transformer-Based Biomedical Language Models,Tommaso Mario Buonocore and Claudio Crema and Alberto Redolfi and Riccardo Bellazzi and Enea Parimbelli,http://arxiv.org/pdf/2212.10422v2
http://arxiv.org/abs/1908.09892v1,creativecommons.org/licenses/by/4.0/,Does BERT agree? Evaluating knowledge of structure dependence through agreement relations,Geoff Bacon and Terry Regier,http://arxiv.org/pdf/1908.09892v1
http://arxiv.org/abs/1906.05149v1,creativecommons.org/licenses/by/4.0/,Putting words in context: LSTM language models and lexical ambiguity,Laura Aina and Kristina Gulordava and Gemma Boleda,http://arxiv.org/pdf/1906.05149v1
http://arxiv.org/abs/2212.10536v1,creativecommons.org/licenses/by/4.0/,"Measure More, Question More: Experimental Studies on Transformer-based Language Models and Complement Coercion",Yuling Gu,http://arxiv.org/pdf/2212.10536v1
http://arxiv.org/abs/2202.09452v1,creativecommons.org/licenses/by/4.0/,From FreEM to D'AlemBERT: a Large Corpus and a Language Model for Early Modern French,Simon Gabay and Pedro Ortiz Suarez and Alexandre Bartz and Alix Chagué and Rachel Bawden and Philippe Gambette and Benoît Sagot,http://arxiv.org/pdf/2202.09452v1
http://arxiv.org/abs/2203.02092v1,creativecommons.org/licenses/by/4.0/,Deep Lexical Hypothesis: Identifying personality structure in natural language,Andrew Cutler and David M. Condon,http://arxiv.org/pdf/2203.02092v1
http://arxiv.org/abs/2210.05287v2,creativecommons.org/licenses/by/4.0/,Revisiting and Advancing Chinese Natural Language Understanding with Accelerated Heterogeneous Knowledge Pre-training,Taolin Zhang and Junwei Dong and Jianing Wang and Chengyu Wang and Ang Wang and Yinghui Liu and Jun Huang and Yong Li and Xiaofeng He,http://arxiv.org/pdf/2210.05287v2
http://arxiv.org/abs/2212.10815v1,creativecommons.org/licenses/by/4.0/,ZEROTOP: Zero-Shot Task-Oriented Semantic Parsing using Large Language Models,Dheeraj Mekala and Jason Wolfe and Subhro Roy,http://arxiv.org/pdf/2212.10815v1
http://arxiv.org/abs/2201.09377v1,creativecommons.org/licenses/by/4.0/,An Application of Pseudo-Log-Likelihoods to Natural Language Scoring,Darren Abramson and Ali Emami,http://arxiv.org/pdf/2201.09377v1
http://arxiv.org/abs/2304.11158v1,creativecommons.org/licenses/by/4.0/,Emergent and Predictable Memorization in Large Language Models,Stella Biderman and USVSN Sai Prashanth and Lintang Sutawika and Hailey Schoelkopf and Quentin Anthony and Shivanshu Purohit and Edward Raff,http://arxiv.org/pdf/2304.11158v1
http://arxiv.org/abs/2104.04052v1,creativecommons.org/licenses/by/4.0/,AlephBERT:A Hebrew Large Pre-Trained Language Model to Start-off your Hebrew NLP Application With,Amit Seker and Elron Bandel and Dan Bareket and Idan Brusilovsky and Refael Shaked Greenfeld and Reut Tsarfaty,http://arxiv.org/pdf/2104.04052v1
http://arxiv.org/abs/2205.08184v1,creativecommons.org/licenses/by/4.0/,SKILL: Structured Knowledge Infusion for Large Language Models,Fedor Moiseev and Zhe Dong and Enrique Alfonseca and Martin Jaggi,http://arxiv.org/pdf/2205.08184v1
http://arxiv.org/abs/2012.10309v1,creativecommons.org/licenses/by/4.0/,Learning Contextual Representations for Semantic Parsing with Generation-Augmented Pre-Training,Peng Shi and Patrick Ng and Zhiguo Wang and Henghui Zhu and Alexander Hanbo Li and Jun Wang and Cicero Nogueira dos Santos and Bing Xiang,http://arxiv.org/pdf/2012.10309v1
http://arxiv.org/abs/2107.07253v5,creativecommons.org/licenses/by/4.0/,MarIA: Spanish Language Models,Asier Gutiérrez-Fandiño and Jordi Armengol-Estapé and Marc Pàmies and Joan Llop-Palao and Joaquín Silveira-Ocampo and Casimiro Pio Carrino and Aitor Gonzalez-Agirre and Carme Armentano-Oller and Carlos Rodriguez-Penagos and Marta Villegas,http://arxiv.org/pdf/2107.07253v5
http://arxiv.org/abs/2109.11321v2,creativecommons.org/licenses/by/4.0/,Transferring Knowledge from Vision to Language: How to Achieve it and how to Measure it?,Tobias Norlund and Lovisa Hagström and Richard Johansson,http://arxiv.org/pdf/2109.11321v2
http://arxiv.org/abs/2304.03159v1,creativecommons.org/licenses/by/4.0/,Bridging the Language Gap: Knowledge Injected Multilingual Question Answering,Zhichao Duan and Xiuxing Li and Zhengyan Zhang and Zhenyu Li and Ning Liu and Jianyong Wang,http://arxiv.org/pdf/2304.03159v1
http://arxiv.org/abs/1910.10893v1,creativecommons.org/licenses/by/4.0/,Low-Resource Sequence Labeling via Unsupervised Multilingual Contextualized Representations,Zuyi Bao and Rui Huang and Chen Li and Kenny Q. Zhu,http://arxiv.org/pdf/1910.10893v1
http://arxiv.org/abs/2202.08882v1,creativecommons.org/licenses/by/4.0/,Improving English to Sinhala Neural Machine Translation using Part-of-Speech Tag,Ravinga Perera and Thilakshi Fonseka and Rashmini Naranpanawa and Uthayasanker Thayasivam,http://arxiv.org/pdf/2202.08882v1
http://arxiv.org/abs/2304.10464v2,creativecommons.org/licenses/by/4.0/,Learning to Program with Natural Language,Yiduo Guo and Yaobo Liang and Chenfei Wu and Wenshan Wu and Dongyan Zhao and Nan Duan,http://arxiv.org/pdf/2304.10464v2
http://arxiv.org/abs/1811.00258v1,creativecommons.org/licenses/by/4.0/,Language-Independent Representor for Neural Machine Translation,Long Zhou and Yuchen Liu and Jiajun Zhang and Chengqing Zong and Guoping Huang,http://arxiv.org/pdf/1811.00258v1
http://arxiv.org/abs/2303.01793v1,creativecommons.org/licenses/by/4.0/,Exploiting Language Relatedness in Machine Translation Through Domain Adaptation Techniques,Amit Kumar and Rupjyoti Baruah and Ajay Pratap and Mayank Swarnkar and Anil Kumar Singh,http://arxiv.org/pdf/2303.01793v1
http://arxiv.org/abs/2103.06434v1,creativecommons.org/licenses/by/4.0/,Topical Language Generation using Transformers,Rohola Zandie and Mohammad H. Mahoor,http://arxiv.org/pdf/2103.06434v1
http://arxiv.org/abs/2104.05433v1,creativecommons.org/licenses/by/4.0/,Multilingual Language Models Predict Human Reading Behavior,Nora Hollenstein and Federico Pirovano and Ce Zhang and Lena Jäger and Lisa Beinborn,http://arxiv.org/pdf/2104.05433v1
http://arxiv.org/abs/2109.14989v2,creativecommons.org/licenses/by/4.0/,Structural Persistence in Language Models: Priming as a Window into Abstract Language Representations,Arabella Sinclair and Jaap Jumelet and Willem Zuidema and Raquel Fernández,http://arxiv.org/pdf/2109.14989v2
http://arxiv.org/abs/2206.14858v2,creativecommons.org/licenses/by/4.0/,Solving Quantitative Reasoning Problems with Language Models,Aitor Lewkowycz and Anders Andreassen and David Dohan and Ethan Dyer and Henryk Michalewski and Vinay Ramasesh and Ambrose Slone and Cem Anil and Imanol Schlag and Theo Gutman-Solo and Yuhuai Wu and Behnam Neyshabur and Guy Gur-Ari and Vedant Misra,http://arxiv.org/pdf/2206.14858v2
http://arxiv.org/abs/2207.04429v2,creativecommons.org/licenses/by/4.0/,"LM-Nav: Robotic Navigation with Large Pre-Trained Models of Language, Vision, and Action",Dhruv Shah and Blazej Osinski and Brian Ichter and Sergey Levine,http://arxiv.org/pdf/2207.04429v2
http://arxiv.org/abs/2105.02570v4,creativecommons.org/licenses/by/4.0/,Capturing the diversity of multilingual societies,Thomas Louf and David Sanchez and Jose J. Ramasco,http://arxiv.org/pdf/2105.02570v4
http://arxiv.org/abs/2107.13723v2,creativecommons.org/licenses/by/4.0/,An Empirical Study of Developers' Discussions about Security Challenges of Different Programming Languages,Roland Croft and Yongzheng Xie and Mansooreh Zahedi and M. Ali Babar and Christoph Treude,http://arxiv.org/pdf/2107.13723v2
http://arxiv.org/abs/2208.09021v3,creativecommons.org/licenses/by/4.0/,VAuLT: Augmenting the Vision-and-Language Transformer for Sentiment Classification on Social Media,Georgios Chochlakis and Tejas Srinivasan and Jesse Thomason and Shrikanth Narayanan,http://arxiv.org/pdf/2208.09021v3
http://arxiv.org/abs/2006.05014v2,creativecommons.org/licenses/by/4.0/,HausaMT v1.0: Towards English-Hausa Neural Machine Translation,Adewale Akinfaderin,http://arxiv.org/pdf/2006.05014v2
http://arxiv.org/abs/2109.05704v2,creativecommons.org/licenses/by/4.0/,Mitigating Language-Dependent Ethnic Bias in BERT,Jaimeen Ahn and Alice Oh,http://arxiv.org/pdf/2109.05704v2
http://arxiv.org/abs/2112.13800v1,creativecommons.org/licenses/by/4.0/,"""A Passage to India"": Pre-trained Word Embeddings for Indian Languages",Kumar Saurav and Kumar Saunack and Diptesh Kanojia and Pushpak Bhattacharyya,http://arxiv.org/pdf/2112.13800v1
http://arxiv.org/abs/2203.07911v2,creativecommons.org/licenses/by/4.0/,Signal in Noise: Exploring Meaning Encoded in Random Character Sequences with Character-Aware Language Models,Mark Chu and Bhargav Srinivasa Desikan and Ethan O. Nadler and D. Ruggiero Lo Sardo and Elise Darragh-Ford and Douglas Guilbeault,http://arxiv.org/pdf/2203.07911v2
http://arxiv.org/abs/2302.01308v1,creativecommons.org/licenses/by/4.0/,What Language Reveals about Perception: Distilling Psychophysical Knowledge from Large Language Models,Raja Marjieh and Ilia Sucholutsky and Pol van Rijn and Nori Jacoby and Thomas L. Griffiths,http://arxiv.org/pdf/2302.01308v1
http://arxiv.org/abs/2304.02468v1,creativecommons.org/licenses/by/4.0/,Comparative Analysis of CHATGPT and the evolution of language models,Oluwatosin Ogundare and Gustavo Quiros Araya,http://arxiv.org/pdf/2304.02468v1
http://arxiv.org/abs/2201.12469v1,creativecommons.org/licenses/by/4.0/,ScaLA: Accelerating Adaptation of Pre-Trained Transformer-Based Language Models via Efficient Large-Batch Adversarial Noise,Minjia Zhang and Niranjan Uma Naresh and Yuxiong He,http://arxiv.org/pdf/2201.12469v1
http://arxiv.org/abs/2206.07682v2,creativecommons.org/licenses/by/4.0/,Emergent Abilities of Large Language Models,Jason Wei and Yi Tay and Rishi Bommasani and Colin Raffel and Barret Zoph and Sebastian Borgeaud and Dani Yogatama and Maarten Bosma and Denny Zhou and Donald Metzler and Ed H. Chi and Tatsunori Hashimoto and Oriol Vinyals and Percy Liang and Jeff Dean and William Fedus,http://arxiv.org/pdf/2206.07682v2
http://arxiv.org/abs/2208.11671v1,creativecommons.org/licenses/by/4.0/,Interpreting Song Lyrics with an Audio-Informed Pre-trained Language Model,Yixiao Zhang and Junyan Jiang and Gus Xia and Simon Dixon,http://arxiv.org/pdf/2208.11671v1
http://arxiv.org/abs/2205.08605v1,creativecommons.org/licenses/by/4.0/,OneAligner: Zero-shot Cross-lingual Transfer with One Rich-Resource Language Pair for Low-Resource Sentence Retrieval,Tong Niu and Kazuma Hashimoto and Yingbo Zhou and Caiming Xiong,http://arxiv.org/pdf/2205.08605v1
http://arxiv.org/abs/2303.16275v1,creativecommons.org/licenses/by/4.0/,Writing Assistants Should Model Social Factors of Language,Vivek Kulkarni and Vipul Raheja,http://arxiv.org/pdf/2303.16275v1
http://arxiv.org/abs/2106.03379v1,creativecommons.org/licenses/by/4.0/,LAWDR: Language-Agnostic Weighted Document Representations from Pre-trained Models,Hongyu Gong and Vishrav Chaudhary and Yuqing Tang and Francisco Guzmán,http://arxiv.org/pdf/2106.03379v1
http://arxiv.org/abs/2112.04482v3,creativecommons.org/licenses/by/4.0/,FLAVA: A Foundational Language And Vision Alignment Model,Amanpreet Singh and Ronghang Hu and Vedanuj Goswami and Guillaume Couairon and Wojciech Galuba and Marcus Rohrbach and Douwe Kiela,http://arxiv.org/pdf/2112.04482v3
http://arxiv.org/abs/2112.05253v2,creativecommons.org/licenses/by/4.0/,MAGMA -- Multimodal Augmentation of Generative Models through Adapter-based Finetuning,Constantin Eichenberg and Sidney Black and Samuel Weinbach and Letitia Parcalabescu and Anette Frank,http://arxiv.org/pdf/2112.05253v2
http://arxiv.org/abs/2203.06386v2,creativecommons.org/licenses/by/4.0/,Enabling Multimodal Generation on CLIP via Vision-Language Knowledge Distillation,Wenliang Dai and Lu Hou and Lifeng Shang and Xin Jiang and Qun Liu and Pascale Fung,http://arxiv.org/pdf/2203.06386v2
http://arxiv.org/abs/2205.10893v1,creativecommons.org/licenses/by/4.0/,Thor: Wielding Hammers to Integrate Language Models and Automated Theorem Provers,Albert Q. Jiang and Wenda Li and Szymon Tworkowski and Konrad Czechowski and Tomasz Odrzygóźdź and Piotr Miłoś and Yuhuai Wu and Mateja Jamnik,http://arxiv.org/pdf/2205.10893v1
http://arxiv.org/abs/2301.12112v2,creativecommons.org/licenses/by/4.0/,On Pre-trained Language Models for Antibody,Danqing Wang and Fei Ye and Hao Zhou,http://arxiv.org/pdf/2301.12112v2
http://arxiv.org/abs/2110.07143v1,creativecommons.org/licenses/by/4.0/,bert2BERT: Towards Reusable Pretrained Language Models,Cheng Chen and Yichun Yin and Lifeng Shang and Xin Jiang and Yujia Qin and Fengyu Wang and Zhi Wang and Xiao Chen and Zhiyuan Liu and Qun Liu,http://arxiv.org/pdf/2110.07143v1
http://arxiv.org/abs/2304.05613v1,creativecommons.org/licenses/by/4.0/,ChatGPT Beyond English: Towards a Comprehensive Evaluation of Large Language Models in Multilingual Learning,Viet Dac Lai and Nghia Trung Ngo and Amir Pouran Ben Veyseh and Hieu Man and Franck Dernoncourt and Trung Bui and Thien Huu Nguyen,http://arxiv.org/pdf/2304.05613v1
http://arxiv.org/abs/2101.12462v1,creativecommons.org/licenses/by/4.0/,Synthesizing Monolingual Data for Neural Machine Translation,Benjamin Marie and Atsushi Fujita,http://arxiv.org/pdf/2101.12462v1
http://arxiv.org/abs/2212.09271v2,creativecommons.org/licenses/by/4.0/,Very Large Language Model as a Unified Methodology of Text Mining,Meng Jiang,http://arxiv.org/pdf/2212.09271v2
http://arxiv.org/abs/2301.13820v1,creativecommons.org/licenses/by/4.0/,Explaining Large Language Model-Based Neural Semantic Parsers (Student Abstract),Daking Rai and Yilun Zhou and Bailin Wang and Ziyu Yao,http://arxiv.org/pdf/2301.13820v1
http://arxiv.org/abs/2303.12024v2,creativecommons.org/licenses/by/4.0/,cTBL: Augmenting Large Language Models for Conversational Tables,Anirudh S Sundar and Larry Heck,http://arxiv.org/pdf/2303.12024v2
http://arxiv.org/abs/2212.00851v1,creativecommons.org/licenses/by/4.0/,SOLD: Sinhala Offensive Language Dataset,Tharindu Ranasinghe and Isuri Anuradha and Damith Premasiri and Kanishka Silva and Hansi Hettiarachchi and Lasitha Uyangodage and Marcos Zampieri,http://arxiv.org/pdf/2212.00851v1
http://arxiv.org/abs/2301.09003v1,creativecommons.org/licenses/by/4.0/,Blacks is to Anger as Whites is to Joy? Understanding Latent Affective Bias in Large Pre-trained Neural Language Models,Anoop Kadan and Deepak P. and Sahely Bhadra and Manjary P. Gangan and Lajish V. L,http://arxiv.org/pdf/2301.09003v1
http://arxiv.org/abs/2210.14431v3,creativecommons.org/licenses/by/4.0/,$N$-gram Is Back: Residual Learning of Neural Text Generation with $n$-gram Language Model,Huayang Li and Deng Cai and Jin Xu and Taro Watanabe,http://arxiv.org/pdf/2210.14431v3
http://arxiv.org/abs/2008.09049v1,creativecommons.org/licenses/by/4.0/,Discovering Useful Sentence Representations from Large Pretrained Language Models,Nishant Subramani and Nivedita Suresh,http://arxiv.org/pdf/2008.09049v1
http://arxiv.org/abs/2201.00971v1,creativecommons.org/licenses/by/4.0/,Submix: Practical Private Prediction for Large-Scale Language Models,Antonio Ginart and Laurens van der Maaten and James Zou and Chuan Guo,http://arxiv.org/pdf/2201.00971v1
http://arxiv.org/abs/2207.04901v2,creativecommons.org/licenses/by/4.0/,Exploring Length Generalization in Large Language Models,Cem Anil and Yuhuai Wu and Anders Andreassen and Aitor Lewkowycz and Vedant Misra and Vinay Ramasesh and Ambrose Slone and Guy Gur-Ari and Ethan Dyer and Behnam Neyshabur,http://arxiv.org/pdf/2207.04901v2
http://arxiv.org/abs/2303.15430v2,creativecommons.org/licenses/by/4.0/,TextMI: Textualize Multimodal Information for Integrating Non-verbal Cues in Pre-trained Language Models,Md Kamrul Hasan and Md Saiful Islam and Sangwu Lee and Wasifur Rahman and Iftekhar Naim and Mohammed Ibrahim Khan and Ehsan Hoque,http://arxiv.org/pdf/2303.15430v2
http://arxiv.org/abs/1906.10519v1,creativecommons.org/licenses/by/4.0/,Embedding Projection for Targeted Cross-Lingual Sentiment: Model Comparisons and a Real-World Study,Jeremy Barnes and Roman Klinger,http://arxiv.org/pdf/1906.10519v1
http://arxiv.org/abs/2106.06937v1,creativecommons.org/licenses/by/4.0/,Common Sense Beyond English: Evaluating and Improving Multilingual Language Models for Commonsense Reasoning,Bill Yuchen Lin and Seyeon Lee and Xiaoyang Qiao and Xiang Ren,http://arxiv.org/pdf/2106.06937v1
http://arxiv.org/abs/2201.12086v2,creativecommons.org/licenses/by/4.0/,BLIP: Bootstrapping Language-Image Pre-training for Unified Vision-Language Understanding and Generation,Junnan Li and Dongxu Li and Caiming Xiong and Steven Hoi,http://arxiv.org/pdf/2201.12086v2
http://arxiv.org/abs/2205.04086v1,creativecommons.org/licenses/by/4.0/,A Balanced Data Approach for Evaluating Cross-Lingual Transfer: Mapping the Linguistic Blood Bank,Dan Malkin and Tomasz Limisiewicz and Gabriel Stanovsky,http://arxiv.org/pdf/2205.04086v1
http://arxiv.org/abs/2210.07128v3,creativecommons.org/licenses/by/4.0/,Language Models of Code are Few-Shot Commonsense Learners,Aman Madaan and Shuyan Zhou and Uri Alon and Yiming Yang and Graham Neubig,http://arxiv.org/pdf/2210.07128v3
http://arxiv.org/abs/2304.05128v1,creativecommons.org/licenses/by/4.0/,Teaching Large Language Models to Self-Debug,Xinyun Chen and Maxwell Lin and Nathanael Schärli and Denny Zhou,http://arxiv.org/pdf/2304.05128v1
http://arxiv.org/abs/2109.08634v1,creativecommons.org/licenses/by/4.0/,Grounding Natural Language Instructions: Can Large Language Models Capture Spatial Information?,Julia Rozanova and Deborah Ferreira and Krishna Dubba and Weiwei Cheng and Dell Zhang and Andre Freitas,http://arxiv.org/pdf/2109.08634v1
http://arxiv.org/abs/2110.07640v1,creativecommons.org/licenses/by/4.0/,Sparks: Inspiration for Science Writing using Language Models,Katy Ilonka Gero and Vivian Liu and Lydia B. Chilton,http://arxiv.org/pdf/2110.07640v1
http://arxiv.org/abs/2207.01893v1,creativecommons.org/licenses/by/4.0/,ASR-Generated Text for Language Model Pre-training Applied to Speech Tasks,Valentin Pelloin and Franck Dary and Nicolas Herve and Benoit Favre and Nathalie Camelin and Antoine Laurent and Laurent Besacier,http://arxiv.org/pdf/2207.01893v1
http://arxiv.org/abs/2207.05289v1,creativecommons.org/licenses/by/4.0/,PLM-ICD: Automatic ICD Coding with Pretrained Language Models,Chao-Wei Huang and Shang-Chi Tsai and Yun-Nung Chen,http://arxiv.org/pdf/2207.05289v1
http://arxiv.org/abs/2210.03057v1,creativecommons.org/licenses/by/4.0/,Language Models are Multilingual Chain-of-Thought Reasoners,Freda Shi and Mirac Suzgun and Markus Freitag and Xuezhi Wang and Suraj Srivats and Soroush Vosoughi and Hyung Won Chung and Yi Tay and Sebastian Ruder and Denny Zhou and Dipanjan Das and Jason Wei,http://arxiv.org/pdf/2210.03057v1
http://arxiv.org/abs/2210.07700v2,creativecommons.org/licenses/by/4.0/,Language Generation Models Can Cause Harm: So What Can We Do About It? An Actionable Survey,Sachin Kumar and Vidhisha Balachandran and Lucille Njoo and Antonios Anastasopoulos and Yulia Tsvetkov,http://arxiv.org/pdf/2210.07700v2
http://arxiv.org/abs/2302.09664v3,creativecommons.org/licenses/by/4.0/,Semantic Uncertainty: Linguistic Invariances for Uncertainty Estimation in Natural Language Generation,Lorenz Kuhn and Yarin Gal and Sebastian Farquhar,http://arxiv.org/pdf/2302.09664v3
http://arxiv.org/abs/2304.10977v1,creativecommons.org/licenses/by/4.0/,Evaluating Transformer Language Models on Arithmetic Operations Using Number Decomposition,Matteo Muffo and Aldo Cocco and Enrico Bertino,http://arxiv.org/pdf/2304.10977v1
http://arxiv.org/abs/2210.14199v1,creativecommons.org/licenses/by/4.0/,"Same Pre-training Loss, Better Downstream: Implicit Bias Matters for Language Models",Hong Liu and Sang Michael Xie and Zhiyuan Li and Tengyu Ma,http://arxiv.org/pdf/2210.14199v1
http://arxiv.org/abs/2110.06490v2,creativecommons.org/licenses/by/4.0/,Dict-BERT: Enhancing Language Model Pre-training with Dictionary,Wenhao Yu and Chenguang Zhu and Yuwei Fang and Donghan Yu and Shuohang Wang and Yichong Xu and Michael Zeng and Meng Jiang,http://arxiv.org/pdf/2110.06490v2
http://arxiv.org/abs/2111.02840v2,creativecommons.org/licenses/by/4.0/,Adversarial GLUE: A Multi-Task Benchmark for Robustness Evaluation of Language Models,Boxin Wang and Chejian Xu and Shuohang Wang and Zhe Gan and Yu Cheng and Jianfeng Gao and Ahmed Hassan Awadallah and Bo Li,http://arxiv.org/pdf/2111.02840v2
http://arxiv.org/abs/2206.05658v1,creativecommons.org/licenses/by/4.0/,Fine-tuning Pre-trained Language Models with Noise Stability Regularization,Hang Hua and Xingjian Li and Dejing Dou and Cheng-Zhong Xu and Jiebo Luo,http://arxiv.org/pdf/2206.05658v1
http://arxiv.org/abs/2209.13279v1,creativecommons.org/licenses/by/4.0/,Improving Multilingual Neural Machine Translation System for Indic Languages,Sudhansu Bala Das and Atharv Biradar and Tapas Kumar Mishra and Bidyut Kumar Patra,http://arxiv.org/pdf/2209.13279v1
http://arxiv.org/abs/2109.12584v4,creativecommons.org/licenses/by/4.0/,Curb Your Carbon Emissions: Benchmarking Carbon Emissions in Machine Translation,Mirza Yusuf and Praatibh Surana and Gauri Gupta and Krithika Ramesh,http://arxiv.org/pdf/2109.12584v4
http://arxiv.org/abs/2212.10560v1,creativecommons.org/licenses/by/4.0/,Self-Instruct: Aligning Language Model with Self Generated Instructions,Yizhong Wang and Yeganeh Kordi and Swaroop Mishra and Alisa Liu and Noah A. Smith and Daniel Khashabi and Hannaneh Hajishirzi,http://arxiv.org/pdf/2212.10560v1
http://arxiv.org/abs/1703.02504v1,creativecommons.org/licenses/by/4.0/,Leveraging Large Amounts of Weakly Supervised Data for Multi-Language Sentiment Classification,Jan Deriu and Aurelien Lucchi and Valeria De Luca and Aliaksei Severyn and Simon Müller and Mark Cieliebak and Thomas Hofmann and Martin Jaggi,http://arxiv.org/pdf/1703.02504v1
http://arxiv.org/abs/2201.08277v3,creativecommons.org/licenses/by/4.0/,NaijaSenti: A Nigerian Twitter Sentiment Corpus for Multilingual Sentiment Analysis,Shamsuddeen Hassan Muhammad and David Ifeoluwa Adelani and Sebastian Ruder and Ibrahim Said Ahmad and Idris Abdulmumin and Bello Shehu Bello and Monojit Choudhury and Chris Chinenye Emezue and Saheed Salahudeen Abdullahi and Anuoluwapo Aremu and Alipio Jeorge and Pavel Brazdil,http://arxiv.org/pdf/2201.08277v3
http://arxiv.org/abs/2211.07615v1,creativecommons.org/licenses/by/4.0/,UGIF: UI Grounded Instruction Following,Sagar Gubbi Venkatesh and Partha Talukdar and Srini Narayanan,http://arxiv.org/pdf/2211.07615v1
http://arxiv.org/abs/2204.03067v2,creativecommons.org/licenses/by/4.0/,ByT5 model for massively multilingual grapheme-to-phoneme conversion,Jian Zhu and Cong Zhang and David Jurgens,http://arxiv.org/pdf/2204.03067v2
http://arxiv.org/abs/2210.07993v1,creativecommons.org/licenses/by/4.0/,MiQA: A Benchmark for Inference on Metaphorical Questions,Iulia-Maria Comsa and Julian Martin Eisenschlos and Srini Narayanan,http://arxiv.org/pdf/2210.07993v1
http://arxiv.org/abs/2303.01347v1,creativecommons.org/licenses/by/4.0/,Letz Translate: Low-Resource Machine Translation for Luxembourgish,Yewei Song and Saad Ezzini and Jacques Klein and Tegawende Bissyande and Clément Lefebvre and Anne Goujon,http://arxiv.org/pdf/2303.01347v1
http://arxiv.org/abs/2302.08091v1,creativecommons.org/licenses/by/4.0/,Do We Still Need Clinical Language Models?,Eric Lehman and Evan Hernandez and Diwakar Mahajan and Jonas Wulff and Micah J. Smith and Zachary Ziegler and Daniel Nadler and Peter Szolovits and Alistair Johnson and Emily Alsentzer,http://arxiv.org/pdf/2302.08091v1
http://arxiv.org/abs/2212.09723v1,creativecommons.org/licenses/by/4.0/,MANER: Mask Augmented Named Entity Recognition for Extreme Low-Resource Languages,Shashank Sonkar and Zichao Wang and Richard G. Baraniuk,http://arxiv.org/pdf/2212.09723v1
http://arxiv.org/abs/1809.02428v1,creativecommons.org/licenses/by/4.0/,Multitask and Multilingual Modelling for Lexical Analysis,Johannes Bjerva,http://arxiv.org/pdf/1809.02428v1
http://arxiv.org/abs/2107.12603v1,creativecommons.org/licenses/by/4.0/,Federated Learning Meets Natural Language Processing: A Survey,Ming Liu and Stella Ho and Mengqi Wang and Longxiang Gao and Yuan Jin and He Zhang,http://arxiv.org/pdf/2107.12603v1
http://arxiv.org/abs/2203.02912v1,creativecommons.org/licenses/by/4.0/,Graph Neural Network Enhanced Language Models for Efficient Multilingual Text Classification,Samujjwal Ghosh and Subhadeep Maji and Maunendra Sankar Desarkar,http://arxiv.org/pdf/2203.02912v1
http://arxiv.org/abs/2210.12814v1,creativecommons.org/licenses/by/4.0/,RuCoLA: Russian Corpus of Linguistic Acceptability,Vladislav Mikhailov and Tatiana Shamardina and Max Ryabinin and Alena Pestova and Ivan Smurov and Ekaterina Artemova,http://arxiv.org/pdf/2210.12814v1
http://arxiv.org/abs/2301.12507v1,creativecommons.org/licenses/by/4.0/,Distilling Internet-Scale Vision-Language Models into Embodied Agents,Theodore Sumers and Kenneth Marino and Arun Ahuja and Rob Fergus and Ishita Dasgupta,http://arxiv.org/pdf/2301.12507v1
http://arxiv.org/abs/2010.10077v2,creativecommons.org/licenses/by/4.0/,Neural Language Modeling for Contextualized Temporal Graph Generation,Aman Madaan and Yiming Yang,http://arxiv.org/pdf/2010.10077v2
http://arxiv.org/abs/2012.13978v1,creativecommons.org/licenses/by/4.0/,MeDAL: Medical Abbreviation Disambiguation Dataset for Natural Language Understanding Pretraining,Zhi Wen and Xing Han Lu and Siva Reddy,http://arxiv.org/pdf/2012.13978v1
http://arxiv.org/abs/2012.05983v2,creativecommons.org/licenses/by/4.0/,Towards Neural Programming Interfaces,Zachary C. Brown and Nathaniel Robinson and David Wingate and Nancy Fulda,http://arxiv.org/pdf/2012.05983v2
http://arxiv.org/abs/2301.12031v1,creativecommons.org/licenses/by/4.0/,Context Matters: A Strategy to Pre-train Language Model for Science Education,Zhengliang Liu and Xinyu He and Lei Liu and Tianming Liu and Xiaoming Zhai,http://arxiv.org/pdf/2301.12031v1
http://arxiv.org/abs/1909.04625v1,creativecommons.org/licenses/by/4.0/,Representation of Constituents in Neural Language Models: Coordination Phrase as a Case Study,Aixiu An and Peng Qian and Ethan Wilcox and Roger Levy,http://arxiv.org/pdf/1909.04625v1
http://arxiv.org/abs/2201.06642v1,creativecommons.org/licenses/by/4.0/,Towards a Cleaner Document-Oriented Multilingual Crawled Corpus,Julien Abadji and Pedro Ortiz Suarez and Laurent Romary and Benoît Sagot,http://arxiv.org/pdf/2201.06642v1
http://arxiv.org/abs/2301.12726v1,creativecommons.org/licenses/by/4.0/,Specializing Smaller Language Models towards Multi-Step Reasoning,Yao Fu and Hao Peng and Litu Ou and Ashish Sabharwal and Tushar Khot,http://arxiv.org/pdf/2301.12726v1
http://arxiv.org/abs/2303.03846v2,creativecommons.org/licenses/by/4.0/,Larger language models do in-context learning differently,Jerry Wei and Jason Wei and Yi Tay and Dustin Tran and Albert Webson and Yifeng Lu and Xinyun Chen and Hanxiao Liu and Da Huang and Denny Zhou and Tengyu Ma,http://arxiv.org/pdf/2303.03846v2
http://arxiv.org/abs/2205.06457v2,creativecommons.org/licenses/by/4.0/,ViT5: Pretrained Text-to-Text Transformer for Vietnamese Language Generation,Long Phan and Hieu Tran and Hieu Nguyen and Trieu H. Trinh,http://arxiv.org/pdf/2205.06457v2
http://arxiv.org/abs/2206.08916v2,creativecommons.org/licenses/by/4.0/,"Unified-IO: A Unified Model for Vision, Language, and Multi-Modal Tasks",Jiasen Lu and Christopher Clark and Rowan Zellers and Roozbeh Mottaghi and Aniruddha Kembhavi,http://arxiv.org/pdf/2206.08916v2
http://arxiv.org/abs/2212.10678v1,creativecommons.org/licenses/by/4.0/,Understanding Stereotypes in Language Models: Towards Robust Measurement and Zero-Shot Debiasing,Justus Mattern and Zhijing Jin and Mrinmaya Sachan and Rada Mihalcea and Bernhard Schölkopf,http://arxiv.org/pdf/2212.10678v1
http://arxiv.org/abs/2303.13367v2,creativecommons.org/licenses/by/4.0/,ChatGPT and a New Academic Reality: Artificial Intelligence-Written Research Papers and the Ethics of the Large Language Models in Scholarly Publishing,Brady Lund and Ting Wang and Nishith Reddy Mannuru and Bing Nie and Somipam Shimray and Ziang Wang,http://arxiv.org/pdf/2303.13367v2
http://arxiv.org/abs/2304.09542v1,creativecommons.org/licenses/by/4.0/,Is ChatGPT Good at Search? Investigating Large Language Models as Re-Ranking Agent,Weiwei Sun and Lingyong Yan and Xinyu Ma and Pengjie Ren and Dawei Yin and Zhaochun Ren,http://arxiv.org/pdf/2304.09542v1
http://arxiv.org/abs/1706.00377v1,creativecommons.org/licenses/by/4.0/,Morph-fitting: Fine-Tuning Word Vector Spaces with Simple Language-Specific Rules,Ivan Vulić and Nikola Mrkšić and Roi Reichart and Diarmuid Ó Séaghdha and Steve Young and Anna Korhonen,http://arxiv.org/pdf/1706.00377v1
http://arxiv.org/abs/1612.01744v1,creativecommons.org/licenses/by/4.0/,Listen and Translate: A Proof of Concept for End-to-End Speech-to-Text Translation,Alexandre Berard and Olivier Pietquin and Christophe Servan and Laurent Besacier,http://arxiv.org/pdf/1612.01744v1
http://arxiv.org/abs/2103.16613v1,creativecommons.org/licenses/by/4.0/,Tracking Knowledge Propagation Across Wikipedia Languages,Rodolfo Valentim and Giovanni Comarela and Souneil Park and Diego Saez-Trumper,http://arxiv.org/pdf/2103.16613v1
http://arxiv.org/abs/2111.09749v2,creativecommons.org/licenses/by/4.0/,Detecting Cross-Language Plagiarism using Open Knowledge Graphs,Johannes Stegmüller and Fabian Bauer-Marquart and Norman Meuschke and Terry Ruas and Moritz Schubotz and Bela Gipp,http://arxiv.org/pdf/2111.09749v2
http://arxiv.org/abs/2212.10011v1,creativecommons.org/licenses/by/4.0/,PLUE: Language Understanding Evaluation Benchmark for Privacy Policies in English,Jianfeng Chi and Wasi Uddin Ahmad and Yuan Tian and Kai-Wei Chang,http://arxiv.org/pdf/2212.10011v1
http://arxiv.org/abs/2302.11186v1,creativecommons.org/licenses/by/4.0/,UML: A Universal Monolingual Output Layer for Multilingual ASR,Chao Zhang and Bo Li and Tara N. Sainath and Trevor Strohman and Shuo-yiin Chang,http://arxiv.org/pdf/2302.11186v1
http://arxiv.org/abs/2304.07840v1,creativecommons.org/licenses/by/4.0/,Automated Program Repair Based on Code Review: How do Pre-trained Transformer Models Perform?,Rishov Paul and Md. Mohib Hossain and Masum Hasan and Anindya Iqbal,http://arxiv.org/pdf/2304.07840v1
http://arxiv.org/abs/2205.11342v1,creativecommons.org/licenses/by/4.0/,ScholarBERT: Bigger is Not Always Better,Zhi Hong and Aswathy Ajith and Gregory Pauloski and Eamon Duede and Carl Malamud and Roger Magoulas and Kyle Chard and Ian Foster,http://arxiv.org/pdf/2205.11342v1
http://arxiv.org/abs/2304.12244v1,creativecommons.org/licenses/by/4.0/,WizardLM: Empowering Large Language Models to Follow Complex Instructions,Can Xu and Qingfeng Sun and Kai Zheng and Xiubo Geng and Pu Zhao and Jiazhan Feng and Chongyang Tao and Daxin Jiang,http://arxiv.org/pdf/2304.12244v1
http://arxiv.org/abs/2211.14402v1,creativecommons.org/licenses/by/4.0/,An Analysis of Social Biases Present in BERT Variants Across Multiple Languages,Aristides Milios and Parishad BehnamGhader,http://arxiv.org/pdf/2211.14402v1
http://arxiv.org/abs/2301.10095v2,creativecommons.org/licenses/by/4.0/,Large Language Models as Fiduciaries: A Case Study Toward Robustly Communicating With Artificial Intelligence Through Legal Standards,John J. Nay,http://arxiv.org/pdf/2301.10095v2
http://arxiv.org/abs/2201.10066v1,creativecommons.org/licenses/by/4.0/,Documenting Geographically and Contextually Diverse Data Sources: The BigScience Catalogue of Language Data and Resources,Angelina McMillan-Major and Zaid Alyafeai and Stella Biderman and Kimbo Chen and Francesco De Toni and Gérard Dupont and Hady Elsahar and Chris Emezue and Alham Fikri Aji and Suzana Ilić and Nurulaqilla Khamis and Colin Leong and Maraim Masoud and Aitor Soroa and Pedro Ortiz Suarez and Zeerak Talat and Daniel van Strien and Yacine Jernite,http://arxiv.org/pdf/2201.10066v1
http://arxiv.org/abs/1909.09543v1,creativecommons.org/licenses/by/4.0/,"Process Query Language: Design, Implementation, and Evaluation",Artem Polyvyanyy and Arthur H. M. ter Hofstede and Marcello La Rosa and Chun Ouyang and Anastasiia Pika,http://arxiv.org/pdf/1909.09543v1
http://arxiv.org/abs/2012.07098v1,creativecommons.org/licenses/by/4.0/,MSVD-Turkish: A Comprehensive Multimodal Dataset for Integrated Vision and Language Research in Turkish,Begum Citamak and Ozan Caglayan and Menekse Kuyu and Erkut Erdem and Aykut Erdem and Pranava Madhyastha and Lucia Specia,http://arxiv.org/pdf/2012.07098v1
http://arxiv.org/abs/2208.11640v3,creativecommons.org/licenses/by/4.0/,Repair Is Nearly Generation: Multilingual Program Repair with LLMs,Harshit Joshi and José Cambronero and Sumit Gulwani and Vu Le and Ivan Radicek and Gust Verbruggen,http://arxiv.org/pdf/2208.11640v3
http://arxiv.org/abs/2210.01343v3,creativecommons.org/licenses/by/4.0/,The Surprising Computational Power of Nondeterministic Stack RNNs,Brian DuSell and David Chiang,http://arxiv.org/pdf/2210.01343v3
http://arxiv.org/abs/2211.10017v1,creativecommons.org/licenses/by/4.0/,Who Says Elephants Can't Run: Bringing Large Scale MoE Models into Cloud Scale Production,Young Jin Kim and Rawn Henry and Raffy Fahim and Hany Hassan Awadalla,http://arxiv.org/pdf/2211.10017v1
http://arxiv.org/abs/2206.08446v1,creativecommons.org/licenses/by/4.0/,Methods for Estimating and Improving Robustness of Language Models,Michal Štefánik,http://arxiv.org/pdf/2206.08446v1
http://arxiv.org/abs/2302.11412v1,creativecommons.org/licenses/by/4.0/,Data Augmentation for Neural NLP,Domagoj Pluščec and Jan Šnajder,http://arxiv.org/pdf/2302.11412v1
http://arxiv.org/abs/2109.04593v1,creativecommons.org/licenses/by/4.0/,A Large-Scale Study of Machine Translation in the Turkic Languages,Jamshidbek Mirzakhalov and Anoop Babu and Duygu Ataman and Sherzod Kariev and Francis Tyers and Otabek Abduraufov and Mammad Hajili and Sardana Ivanova and Abror Khaytbaev and Antonio Laverghetta Jr. and Behzodbek Moydinboyev and Esra Onal and Shaxnoza Pulatova and Ahsan Wahab and Orhan Firat and Sriram Chellappan,http://arxiv.org/pdf/2109.04593v1
http://arxiv.org/abs/2010.06478v1,creativecommons.org/licenses/by/4.0/,XL-WiC: A Multilingual Benchmark for Evaluating Semantic Contextualization,Alessandro Raganato and Tommaso Pasini and Jose Camacho-Collados and Mohammad Taher Pilehvar,http://arxiv.org/pdf/2010.06478v1
http://arxiv.org/abs/2304.10453v1,creativecommons.org/licenses/by/4.0/,Phoenix: Democratizing ChatGPT across Languages,Zhihong Chen and Feng Jiang and Junying Chen and Tiannan Wang and Fei Yu and Guiming Chen and Hongbo Zhang and Juhao Liang and Chen Zhang and Zhiyi Zhang and Jianquan Li and Xiang Wan and Benyou Wang and Haizhou Li,http://arxiv.org/pdf/2304.10453v1
http://arxiv.org/abs/2111.09734v1,creativecommons.org/licenses/by/4.0/,ClipCap: CLIP Prefix for Image Captioning,Ron Mokady and Amir Hertz and Amit H. Bermano,http://arxiv.org/pdf/2111.09734v1
http://arxiv.org/abs/2105.11832v2,creativecommons.org/licenses/by/4.0/,Estimating Redundancy in Clinical Text,Thomas Searle and Zina Ibrahim and James Teo and Richard JB Dobson,http://arxiv.org/pdf/2105.11832v2
http://arxiv.org/abs/2303.03457v1,creativecommons.org/licenses/by/4.0/,Spelling convention sensitivity in neural language models,Elizabeth Nielsen and Christo Kirov and Brian Roark,http://arxiv.org/pdf/2303.03457v1
http://arxiv.org/abs/2304.05406v1,creativecommons.org/licenses/by/4.0/,Galactic ChitChat: Using Large Language Models to Converse with Astronomy Literature,Ioana Ciucă and Yuan-Sen Ting,http://arxiv.org/pdf/2304.05406v1
http://arxiv.org/abs/2209.08141v1,creativecommons.org/licenses/by/4.0/,Psychologically-informed chain-of-thought prompts for metaphor understanding in large language models,Ben Prystawski and Paul Thibodeau and Noah Goodman,http://arxiv.org/pdf/2209.08141v1
http://arxiv.org/abs/2210.03871v1,creativecommons.org/licenses/by/4.0/,Data-Efficiency with a Single GPU: An Exploration of Transfer Methods for Small Language Models,Alon Albalak and Akshat Shrivastava and Chinnadhurai Sankar and Adithya Sagar and Mike Ross,http://arxiv.org/pdf/2210.03871v1
http://arxiv.org/abs/2001.03521v1,creativecommons.org/licenses/by/4.0/,Towards Minimal Supervision BERT-based Grammar Error Correction,Yiyuan Li and Antonios Anastasopoulos and Alan W Black,http://arxiv.org/pdf/2001.03521v1
http://arxiv.org/abs/2109.06605v1,creativecommons.org/licenses/by/4.0/,MDAPT: Multilingual Domain Adaptive Pretraining in a Single Model,Rasmus Kær Jørgensen and Mareike Hartmann and Xiang Dai and Desmond Elliott,http://arxiv.org/pdf/2109.06605v1
http://arxiv.org/abs/2210.03347v1,creativecommons.org/licenses/by/4.0/,Pix2Struct: Screenshot Parsing as Pretraining for Visual Language Understanding,Kenton Lee and Mandar Joshi and Iulia Turc and Hexiang Hu and Fangyu Liu and Julian Eisenschlos and Urvashi Khandelwal and Peter Shaw and Ming-Wei Chang and Kristina Toutanova,http://arxiv.org/pdf/2210.03347v1
http://arxiv.org/abs/2201.13348v3,creativecommons.org/licenses/by/4.0/,Advantages and Disadvantages of (Dedicated) Model Transformation Languages A Qualitative Interview Study,Stefan Höppner and Yves Haas and Matthias Tichy and Katharina Juhnke,http://arxiv.org/pdf/2201.13348v3
http://arxiv.org/abs/2204.10555v2,creativecommons.org/licenses/by/4.0/,KALA: Knowledge-Augmented Language Model Adaptation,Minki Kang and Jinheon Baek and Sung Ju Hwang,http://arxiv.org/pdf/2204.10555v2
http://arxiv.org/abs/2301.05402v1,creativecommons.org/licenses/by/4.0/,In BLOOM: Creativity and Affinity in Artificial Lyrics and Art,Evan Crothers and Herna Viktor and Nathalie Japkowicz,http://arxiv.org/pdf/2301.05402v1
http://arxiv.org/abs/2303.02927v1,creativecommons.org/licenses/by/4.0/,LIDA: A Tool for Automatic Generation of Grammar-Agnostic Visualizations and Infographics using Large Language Models,Victor Dibia,http://arxiv.org/pdf/2303.02927v1
http://arxiv.org/abs/2303.04496v1,creativecommons.org/licenses/by/4.0/,MenuCraft: Interactive Menu System Design with Large Language Models,Amir Hossein Kargaran and Nafiseh Nikeghbal and Abbas Heydarnoori and Hinrich Schütze,http://arxiv.org/pdf/2303.04496v1
http://arxiv.org/abs/2304.06597v1,creativecommons.org/licenses/by/4.0/,"""What It Wants Me To Say"": Bridging the Abstraction Gap Between End-User Programmers and Code-Generating Large Language Models",Michael Xieyang Liu and Advait Sarkar and Carina Negreanu and Ben Zorn and Jack Williams and Neil Toronto and Andrew D. Gordon,http://arxiv.org/pdf/2304.06597v1
http://arxiv.org/abs/2010.15036v1,creativecommons.org/licenses/by/4.0/,A Comprehensive Survey on Word Representation Models: From Classical to State-Of-The-Art Word Representation Language Models,Usman Naseem and Imran Razzak and Shah Khalid Khan and Mukesh Prasad,http://arxiv.org/pdf/2010.15036v1
http://arxiv.org/abs/2103.13997v1,creativecommons.org/licenses/by/4.0/,Real-time low-resource phoneme recognition on edge devices,Yonatan Alon,http://arxiv.org/pdf/2103.13997v1
http://arxiv.org/abs/2105.13573v1,creativecommons.org/licenses/by/4.0/,Investigating Code-Mixed Modern Standard Arabic-Egyptian to English Machine Translation,El Moatez Billah Nagoudi and AbdelRahim Elmadany and Muhammad Abdul-Mageed,http://arxiv.org/pdf/2105.13573v1
http://arxiv.org/abs/2106.16038v1,creativecommons.org/licenses/by/4.0/,ChineseBERT: Chinese Pretraining Enhanced by Glyph and Pinyin Information,Zijun Sun and Xiaoya Li and Xiaofei Sun and Yuxian Meng and Xiang Ao and Qing He and Fei Wu and Jiwei Li,http://arxiv.org/pdf/2106.16038v1
http://arxiv.org/abs/2109.07046v1,creativecommons.org/licenses/by/4.0/,A Conditional Generative Matching Model for Multi-lingual Reply Suggestion,Budhaditya Deb and Guoqing Zheng and Milad Shokouhi and Ahmed Hassan Awadallah,http://arxiv.org/pdf/2109.07046v1
http://arxiv.org/abs/2205.12630v1,creativecommons.org/licenses/by/4.0/,Multimodal Knowledge Alignment with Reinforcement Learning,Youngjae Yu and Jiwan Chung and Heeseung Yun and Jack Hessel and JaeSung Park and Ximing Lu and Prithviraj Ammanabrolu and Rowan Zellers and Ronan Le Bras and Gunhee Kim and Yejin Choi,http://arxiv.org/pdf/2205.12630v1
http://arxiv.org/abs/2212.01964v1,creativecommons.org/licenses/by/4.0/,Building Metadata Inference Using a Transducer Based Language Model,David Waterworth and Subbu Sethuvenkatraman and Quan Z. Sheng,http://arxiv.org/pdf/2212.01964v1
http://arxiv.org/abs/2212.03278v1,creativecommons.org/licenses/by/4.0/,Counterfactual reasoning: Do language models need world knowledge for causal understanding?,Jiaxuan Li and Lang Yu and Allyson Ettinger,http://arxiv.org/pdf/2212.03278v1
http://arxiv.org/abs/2302.03773v1,creativecommons.org/licenses/by/4.0/,What Matters In The Structured Pruning of Generative Language Models?,Michael Santacroce and Zixin Wen and Yelong Shen and Yuanzhi Li,http://arxiv.org/pdf/2302.03773v1
http://arxiv.org/abs/2110.08413v2,creativecommons.org/licenses/by/4.0/,Invariant Language Modeling,Maxime Peyrard and Sarvjeet Singh Ghotra and Martin Josifoski and Vidhan Agarwal and Barun Patra and Dean Carignan and Emre Kiciman and Robert West,http://arxiv.org/pdf/2110.08413v2
http://arxiv.org/abs/2203.00902v1,creativecommons.org/licenses/by/4.0/,Do Prompts Solve NLP Tasks Using Natural Language?,Sen Yang and Yunchen Zhang and Leyang Cui and Yue Zhang,http://arxiv.org/pdf/2203.00902v1
http://arxiv.org/abs/2201.10716v1,creativecommons.org/licenses/by/4.0/,Neural Grapheme-to-Phoneme Conversion with Pre-trained Grapheme Models,Lu Dong and Zhi-Qiang Guo and Chao-Hong Tan and Ya-Jun Hu and Yuan Jiang and Zhen-Hua Ling,http://arxiv.org/pdf/2201.10716v1
http://arxiv.org/abs/2301.13779v1,creativecommons.org/licenses/by/4.0/,FLAME: A small language model for spreadsheet formulas,Harshit Joshi and Abishai Ebenezer and José Cambronero and Sumit Gulwani and Aditya Kanade and Vu Le and Ivan Radiček and Gust Verbruggen,http://arxiv.org/pdf/2301.13779v1
http://arxiv.org/abs/2104.08786v2,creativecommons.org/licenses/by/4.0/,Fantastically Ordered Prompts and Where to Find Them: Overcoming Few-Shot Prompt Order Sensitivity,Yao Lu and Max Bartolo and Alastair Moore and Sebastian Riedel and Pontus Stenetorp,http://arxiv.org/pdf/2104.08786v2
http://arxiv.org/abs/2210.15458v1,creativecommons.org/licenses/by/4.0/,Arithmetic Sampling: Parallel Diverse Decoding for Large Language Models,Luke Vilnis and Yury Zemlyanskiy and Patrick Murray and Alexandre Passos and Sumit Sanghai,http://arxiv.org/pdf/2210.15458v1
http://arxiv.org/abs/2203.07687v1,creativecommons.org/licenses/by/4.0/,Compressing Sentence Representation for Semantic Retrieval via Homomorphic Projective Distillation,Xuandong Zhao and Zhiguo Yu and Ming Wu and Lei Li,http://arxiv.org/pdf/2203.07687v1
http://arxiv.org/abs/2110.07560v2,creativecommons.org/licenses/by/4.0/,Composable Sparse Fine-Tuning for Cross-Lingual Transfer,Alan Ansell and Edoardo Maria Ponti and Anna Korhonen and Ivan Vulić,http://arxiv.org/pdf/2110.07560v2
http://arxiv.org/abs/2302.13681v2,creativecommons.org/licenses/by/4.0/,The (ab)use of Open Source Code to Train Large Language Models,Ali Al-Kaswan and Maliheh Izadi,http://arxiv.org/pdf/2302.13681v2
http://arxiv.org/abs/2211.06452v1,creativecommons.org/licenses/by/4.0/,Cross-Platform and Cross-Domain Abusive Language Detection with Supervised Contrastive Learning,Md Tawkat Islam Khondaker and Muhammad Abdul-Mageed and Laks V. S. Lakshmanan,http://arxiv.org/pdf/2211.06452v1
http://arxiv.org/abs/2105.08645v4,creativecommons.org/licenses/by/4.0/,CoTexT: Multi-task Learning with Code-Text Transformer,Long Phan and Hieu Tran and Daniel Le and Hieu Nguyen and James Anibal and Alec Peltekian and Yanfang Ye,http://arxiv.org/pdf/2105.08645v4
http://arxiv.org/abs/2110.05679v6,creativecommons.org/licenses/by/4.0/,Large Language Models Can Be Strong Differentially Private Learners,Xuechen Li and Florian Tramèr and Percy Liang and Tatsunori Hashimoto,http://arxiv.org/pdf/2110.05679v6
http://arxiv.org/abs/1907.03187v1,creativecommons.org/licenses/by/4.0/,Applying a Pre-trained Language Model to Spanish Twitter Humor Prediction,Bobak Farzin and Piotr Czapla and Jeremy Howard,http://arxiv.org/pdf/1907.03187v1
http://arxiv.org/abs/2104.05277v1,creativecommons.org/licenses/by/4.0/,Building a Swedish Open-Domain Conversational Language Model,Tobias Norlund and Agnes Stenbom,http://arxiv.org/pdf/2104.05277v1
http://arxiv.org/abs/2104.09933v1,creativecommons.org/licenses/by/4.0/,Grammatical Error Generation Based on Translated Fragments,Eetu Sjöblom and Mathias Creutz and Teemu Vahtola,http://arxiv.org/pdf/2104.09933v1
http://arxiv.org/abs/2106.04653v1,creativecommons.org/licenses/by/4.0/,Comprehension Based Question Answering using Bloom's Taxonomy,Pritish Sahu and Michael Cogswell and Sara Rutherford-Quach and Ajay Divakaran,http://arxiv.org/pdf/2106.04653v1
http://arxiv.org/abs/2204.00498v1,creativecommons.org/licenses/by/4.0/,Evaluating the Text-to-SQL Capabilities of Large Language Models,Nitarshan Rajkumar and Raymond Li and Dzmitry Bahdanau,http://arxiv.org/pdf/2204.00498v1
http://arxiv.org/abs/2303.08014v1,creativecommons.org/licenses/by/4.0/,Does ChatGPT resemble humans in language use?,Zhenguang G. Cai and David A. Haslett and Xufeng Duan and Shuqi Wang and Martin J. Pickering,http://arxiv.org/pdf/2303.08014v1
http://arxiv.org/abs/2304.12191v1,creativecommons.org/licenses/by/4.0/,"""Genlangs"" and Zipf's Law: Do languages generated by ChatGPT statistically look human?",Justin Diamond,http://arxiv.org/pdf/2304.12191v1
http://arxiv.org/abs/2203.16601v3,creativecommons.org/licenses/by/4.0/,Is Word Error Rate a good evaluation metric for Speech Recognition in Indic Languages?,Priyanshi Shah and Harveen Singh Chadha and Anirudh Gupta and Ankur Dhuriya and Neeraj Chhimwal and Rishabh Gaur and Vivek Raghavan,http://arxiv.org/pdf/2203.16601v3
http://arxiv.org/abs/2303.13592v2,creativecommons.org/licenses/by/4.0/,Prompting Multilingual Large Language Models to Generate Code-Mixed Texts: The Case of South East Asian Languages,Zheng-Xin Yong and Ruochen Zhang and Jessica Zosa Forde and Skyler Wang and Samuel Cahyawijaya and Holy Lovenia and Genta Indra Winata and Lintang Sutawika and Jan Christian Blaise Cruz and Long Phan and Yin Lin Tan and Alham Fikri Aji,http://arxiv.org/pdf/2303.13592v2
http://arxiv.org/abs/1911.12579v3,creativecommons.org/licenses/by/4.0/,Word Embedding based New Corpus for Low-resourced Language: Sindhi,Wazir Ali and Jay Kumar and Junyu Lu and Zenglin Xu,http://arxiv.org/pdf/1911.12579v3
http://arxiv.org/abs/2101.04758v4,creativecommons.org/licenses/by/4.0/,Self-Training Pre-Trained Language Models for Zero- and Few-Shot Multi-Dialectal Arabic Sequence Labeling,Muhammad Khalifa and Muhammad Abdul-Mageed and Khaled Shaalan,http://arxiv.org/pdf/2101.04758v4
http://arxiv.org/abs/2110.06078v1,creativecommons.org/licenses/by/4.0/,Model-based analysis of brain activity reveals the hierarchy of language in 305 subjects,Charlotte Caucheteux and Alexandre Gramfort and Jean-Rémi King,http://arxiv.org/pdf/2110.06078v1
http://arxiv.org/abs/2205.10828v3,creativecommons.org/licenses/by/4.0/,What Do Compressed Multilingual Machine Translation Models Forget?,Alireza Mohammadshahi and Vassilina Nikoulina and Alexandre Berard and Caroline Brun and James Henderson and Laurent Besacier,http://arxiv.org/pdf/2205.10828v3
http://arxiv.org/abs/2003.12111v1,creativecommons.org/licenses/by/4.0/,FFR V1.0: Fon-French Neural Machine Translation,Bonaventure F. P. Dossou and Chris C. Emezue,http://arxiv.org/pdf/2003.12111v1
http://arxiv.org/abs/2012.05628v3,creativecommons.org/licenses/by/4.0/,As Good as New. How to Successfully Recycle English GPT-2 to Make Models for Other Languages,Wietse de Vries and Malvina Nissim,http://arxiv.org/pdf/2012.05628v3
http://arxiv.org/abs/2104.09585v1,creativecommons.org/licenses/by/4.0/,ELECTRAMed: a new pre-trained language representation model for biomedical NLP,Giacomo Miolo and Giulio Mantoan and Carlotta Orsenigo,http://arxiv.org/pdf/2104.09585v1
http://arxiv.org/abs/2105.04633v1,creativecommons.org/licenses/by/4.0/,"Language Acquisition is Embodied, Interactive, Emotive: a Research Proposal",Casey Kennington,http://arxiv.org/pdf/2105.04633v1
http://arxiv.org/abs/2106.04563v2,creativecommons.org/licenses/by/4.0/,XtremeDistilTransformers: Task Transfer for Task-agnostic Distillation,Subhabrata Mukherjee and Ahmed Hassan Awadallah and Jianfeng Gao,http://arxiv.org/pdf/2106.04563v2
http://arxiv.org/abs/2110.08527v3,creativecommons.org/licenses/by/4.0/,An Empirical Survey of the Effectiveness of Debiasing Techniques for Pre-trained Language Models,Nicholas Meade and Elinor Poole-Dayan and Siva Reddy,http://arxiv.org/pdf/2110.08527v3
http://arxiv.org/abs/2111.14031v1,creativecommons.org/licenses/by/4.0/,FastTrees: Parallel Latent Tree-Induction for Faster Sequence Encoding,Bill Tuck Weng Pung and Alvin Chan,http://arxiv.org/pdf/2111.14031v1
http://arxiv.org/abs/2205.12558v2,creativecommons.org/licenses/by/4.0/,Gradient-Based Constrained Sampling from Language Models,Sachin Kumar and Biswajit Paria and Yulia Tsvetkov,http://arxiv.org/pdf/2205.12558v2
http://arxiv.org/abs/2208.10264v4,creativecommons.org/licenses/by/4.0/,Using Large Language Models to Simulate Multiple Humans and Replicate Human Subject Studies,Gati Aher and Rosa I. Arriaga and Adam Tauman Kalai,http://arxiv.org/pdf/2208.10264v4
http://arxiv.org/abs/2301.00068v1,creativecommons.org/licenses/by/4.0/,On the Inconsistencies of Conditionals Learned by Masked Language Models,Tom Young and Yang You,http://arxiv.org/pdf/2301.00068v1
http://arxiv.org/abs/2301.12810v1,creativecommons.org/licenses/by/4.0/,Crawling the Internal Knowledge-Base of Language Models,Roi Cohen and Mor Geva and Jonathan Berant and Amir Globerson,http://arxiv.org/pdf/2301.12810v1
http://arxiv.org/abs/2304.08442v1,creativecommons.org/licenses/by/4.0/,The MiniPile Challenge for Data-Efficient Language Models,Jean Kaddour,http://arxiv.org/pdf/2304.08442v1
http://arxiv.org/abs/2112.10668v3,creativecommons.org/licenses/by/4.0/,Few-shot Learning with Multilingual Language Models,Xi Victoria Lin and Todor Mihaylov and Mikel Artetxe and Tianlu Wang and Shuohui Chen and Daniel Simig and Myle Ott and Naman Goyal and Shruti Bhosale and Jingfei Du and Ramakanth Pasunuru and Sam Shleifer and Punit Singh Koura and Vishrav Chaudhary and Brian O'Horo and Jeff Wang and Luke Zettlemoyer and Zornitsa Kozareva and Mona Diab and Veselin Stoyanov and Xian Li,http://arxiv.org/pdf/2112.10668v3
http://arxiv.org/abs/2111.13999v1,creativecommons.org/licenses/by/4.0/,Exploring Low-Cost Transformer Model Compression for Large-Scale Commercial Reply Suggestions,Vaishnavi Shrivastava and Radhika Gaonkar and Shashank Gupta and Abhishek Jha,http://arxiv.org/pdf/2111.13999v1
http://arxiv.org/abs/1904.07334v1,creativecommons.org/licenses/by/4.0/,Multi-Head Multi-Layer Attention to Deep Language Representations for Grammatical Error Detection,Masahiro Kaneko and Mamoru Komachi,http://arxiv.org/pdf/1904.07334v1
http://arxiv.org/abs/2111.05671v1,creativecommons.org/licenses/by/4.0/,Pre-trained Transformer-Based Approach for Arabic Question Answering : A Comparative Study,Kholoud Alsubhi and Amani Jamal and Areej Alhothali,http://arxiv.org/pdf/2111.05671v1
http://arxiv.org/abs/2206.01843v2,creativecommons.org/licenses/by/4.0/,Visual Clues: Bridging Vision and Language Foundations for Image Paragraph Captioning,Yujia Xie and Luowei Zhou and Xiyang Dai and Lu Yuan and Nguyen Bach and Ce Liu and Michael Zeng,http://arxiv.org/pdf/2206.01843v2
http://arxiv.org/abs/2301.09790v2,creativecommons.org/licenses/by/4.0/,Can Very Large Pretrained Language Models Learn Storytelling With A Few Examples?,Zhuohan Xie and Trevor Cohn and Jey Han Lau,http://arxiv.org/pdf/2301.09790v2
http://arxiv.org/abs/2208.05446v2,creativecommons.org/licenses/by/4.0/,CoditT5: Pretraining for Source Code and Natural Language Editing,Jiyang Zhang and Sheena Panthaplackel and Pengyu Nie and Junyi Jessy Li and Milos Gligoric,http://arxiv.org/pdf/2208.05446v2
http://arxiv.org/abs/2212.09682v1,creativecommons.org/licenses/by/4.0/,Multilingual Sequence-to-Sequence Models for Hebrew NLP,Matan Eyal and Hila Noga and Roee Aharoni and Idan Szpektor and Reut Tsarfaty,http://arxiv.org/pdf/2212.09682v1
http://arxiv.org/abs/2103.05327v2,creativecommons.org/licenses/by/4.0/,BERTese: Learning to Speak to BERT,Adi Haviv and Jonathan Berant and Amir Globerson,http://arxiv.org/pdf/2103.05327v2
http://arxiv.org/abs/2207.03546v1,creativecommons.org/licenses/by/4.0/,"BibleTTS: a large, high-fidelity, multilingual, and uniquely African speech corpus",Josh Meyer and David Ifeoluwa Adelani and Edresson Casanova and Alp Öktem and Daniel Whitenack and Julian Weber and Salomon Kabongo and Elizabeth Salesky and Iroro Orife and Colin Leong and Perez Ogayo and Chris Emezue and Jonathan Mukiibi and Salomey Osei and Apelete Agbolo and Victor Akinode and Bernard Opoku and Samuel Olanrewaju and Jesujoba Alabi and Shamsuddeen Muhammad,http://arxiv.org/pdf/2207.03546v1
http://arxiv.org/abs/2208.11857v1,creativecommons.org/licenses/by/4.0/,Shortcut Learning of Large Language Models in Natural Language Understanding: A Survey,Mengnan Du and Fengxiang He and Na Zou and Dacheng Tao and Xia Hu,http://arxiv.org/pdf/2208.11857v1
http://arxiv.org/abs/2209.05629v1,creativecommons.org/licenses/by/4.0/,Leveraging Large Language Models for Robot 3D Scene Understanding,William Chen and Siyi Hu and Rajat Talak and Luca Carlone,http://arxiv.org/pdf/2209.05629v1
http://arxiv.org/abs/2212.09736v1,creativecommons.org/licenses/by/4.0/,"Don't Generate, Discriminate: A Proposal for Grounding Language Models to Real-World Environments",Yu Gu and Xiang Deng and Yu Su,http://arxiv.org/pdf/2212.09736v1
http://arxiv.org/abs/2201.06796v2,creativecommons.org/licenses/by/4.0/,CoAuthor: Designing a Human-AI Collaborative Writing Dataset for Exploring Language Model Capabilities,Mina Lee and Percy Liang and Qian Yang,http://arxiv.org/pdf/2201.06796v2
http://arxiv.org/abs/2303.03004v2,creativecommons.org/licenses/by/4.0/,"xCodeEval: A Large Scale Multilingual Multitask Benchmark for Code Understanding, Generation, Translation and Retrieval",Mohammad Abdullah Matin Khan and M Saiful Bari and Xuan Long Do and Weishi Wang and Md Rizwan Parvez and Shafiq Joty,http://arxiv.org/pdf/2303.03004v2
http://arxiv.org/abs/2304.00116v1,creativecommons.org/licenses/by/4.0/,Enhancing Large Language Models with Climate Resources,Mathias Kraus and Julia Anna Bingler and Markus Leippold and Tobias Schimanski and Chiara Colesanti Senni and Dominik Stammbach and Saeid Ashraf Vaghefi and Nicolas Webersinke,http://arxiv.org/pdf/2304.00116v1
http://arxiv.org/abs/2302.04089v1,creativecommons.org/licenses/by/4.0/,ZipLM: Hardware-Aware Structured Pruning of Language Models,Eldar Kurtic and Elias Frantar and Dan Alistarh,http://arxiv.org/pdf/2302.04089v1
http://arxiv.org/abs/2211.07705v1,creativecommons.org/licenses/by/4.0/,A Machine Learning Approach to Classifying Construction Cost Documents into the International Construction Measurement Standard,J. Ignacio Deza and Hisham Ihshaish and Lamine Mahdjoubi,http://arxiv.org/pdf/2211.07705v1
http://arxiv.org/abs/2303.05546v1,creativecommons.org/licenses/by/4.0/,Weakly-Supervised HOI Detection from Interaction Labels Only and Language/Vision-Language Priors,Mesut Erhan Unal and Adriana Kovashka,http://arxiv.org/pdf/2303.05546v1
http://arxiv.org/abs/2212.01907v1,creativecommons.org/licenses/by/4.0/,Understanding How Model Size Affects Few-shot Instruction Prompting,Ayrton San Joaquin and Ardy Haroen,http://arxiv.org/pdf/2212.01907v1
http://arxiv.org/abs/2304.09871v1,creativecommons.org/licenses/by/4.0/,A Theory on Adam Instability in Large-Scale Machine Learning,Igor Molybog and Peter Albert and Moya Chen and Zachary DeVito and David Esiobu and Naman Goyal and Punit Singh Koura and Sharan Narang and Andrew Poulton and Ruan Silva and Binh Tang and Puxin Xu and Yuchen Zhang and Melanie Kambadur and Stephen Roller and Susan Zhang,http://arxiv.org/pdf/2304.09871v1
http://arxiv.org/abs/2209.01975v1,creativecommons.org/licenses/by/4.0/,Selective Annotation Makes Language Models Better Few-Shot Learners,Hongjin Su and Jungo Kasai and Chen Henry Wu and Weijia Shi and Tianlu Wang and Jiayi Xin and Rui Zhang and Mari Ostendorf and Luke Zettlemoyer and Noah A. Smith and Tao Yu,http://arxiv.org/pdf/2209.01975v1
http://arxiv.org/abs/2304.09181v1,creativecommons.org/licenses/by/4.0/,Large Language Models Based Automatic Synthesis of Software Specifications,Shantanu Mandal and Adhrik Chethan and Vahid Janfaza and S M Farabi Mahmud and Todd A Anderson and Javier Turek and Jesmin Jahan Tithi and Abdullah Muzahid,http://arxiv.org/pdf/2304.09181v1
http://arxiv.org/abs/2104.06678v1,creativecommons.org/licenses/by/4.0/,Large-Scale Self- and Semi-Supervised Learning for Speech Translation,Changhan Wang and Anne Wu and Juan Pino and Alexei Baevski and Michael Auli and Alexis Conneau,http://arxiv.org/pdf/2104.06678v1
http://arxiv.org/abs/2105.03791v2,creativecommons.org/licenses/by/4.0/,Enhancing Transformers with Gradient Boosted Decision Trees for NLI Fine-Tuning,Benjamin Minixhofer and Milan Gritta and Ignacio Iacobacci,http://arxiv.org/pdf/2105.03791v2
http://arxiv.org/abs/1912.05372v4,creativecommons.org/licenses/by/4.0/,FlauBERT: Unsupervised Language Model Pre-training for French,Hang Le and Loïc Vial and Jibril Frej and Vincent Segonne and Maximin Coavoux and Benjamin Lecouteux and Alexandre Allauzen and Benoît Crabbé and Laurent Besacier and Didier Schwab,http://arxiv.org/pdf/1912.05372v4
http://arxiv.org/abs/2109.06262v1,creativecommons.org/licenses/by/4.0/,Evaluating Multiway Multilingual NMT in the Turkic Languages,Jamshidbek Mirzakhalov and Anoop Babu and Aigiz Kunafin and Ahsan Wahab and Behzod Moydinboyev and Sardana Ivanova and Mokhiyakhon Uzokova and Shaxnoza Pulatova and Duygu Ataman and Julia Kreutzer and Francis Tyers and Orhan Firat and John Licato and Sriram Chellappan,http://arxiv.org/pdf/2109.06262v1
http://arxiv.org/abs/2205.12986v4,creativecommons.org/licenses/by/4.0/,Transcormer: Transformer for Sentence Scoring with Sliding Language Modeling,Kaitao Song and Yichong Leng and Xu Tan and Yicheng Zou and Tao Qin and Dongsheng Li,http://arxiv.org/pdf/2205.12986v4
http://arxiv.org/abs/2211.09084v1,creativecommons.org/licenses/by/4.0/,Technical Report on Neural Language Models and Few-Shot Learning for Systematic Requirements Processing in MDSE,Vincent Bertram and Miriam Boß and Evgeny Kusmenko and Imke Helene Nachmann and Bernhard Rumpe and Danilo Trotta and Louis Wachtmeister,http://arxiv.org/pdf/2211.09084v1
http://arxiv.org/abs/2304.11477v1,creativecommons.org/licenses/by/4.0/,LLM+P: Empowering Large Language Models with Optimal Planning Proficiency,Bo Liu and Yuqian Jiang and Xiaohan Zhang and Qiang Liu and Shiqi Zhang and Joydeep Biswas and Peter Stone,http://arxiv.org/pdf/2304.11477v1
http://arxiv.org/abs/2301.03980v1,creativecommons.org/licenses/by/4.0/,Language Models sounds the Death Knell of Knowledge Graphs,Kunal Suri and Atul Singh and Prakhar Mishra and Swapna Sourav Rout and Rajesh Sabapathy,http://arxiv.org/pdf/2301.03980v1
http://arxiv.org/abs/2204.02633v1,creativecommons.org/licenses/by/4.0/,DAGAM: Data Augmentation with Generation And Modification,Byeong-Cheol Jo and Tak-Sung Heo and Yeongjoon Park and Yongmin Yoo and Won Ik Cho and Kyungsun Kim,http://arxiv.org/pdf/2204.02633v1
http://arxiv.org/abs/2209.06430v4,creativecommons.org/licenses/by/4.0/,CLIP-ViP: Adapting Pre-trained Image-Text Model to Video-Language Representation Alignment,Hongwei Xue and Yuchong Sun and Bei Liu and Jianlong Fu and Ruihua Song and Houqiang Li and Jiebo Luo,http://arxiv.org/pdf/2209.06430v4
http://arxiv.org/abs/2303.07895v1,creativecommons.org/licenses/by/4.0/,The Learnability of In-Context Learning,Noam Wies and Yoav Levine and Amnon Shashua,http://arxiv.org/pdf/2303.07895v1
http://arxiv.org/abs/2010.04900v2,creativecommons.org/licenses/by/4.0/,Toward Micro-Dialect Identification in Diaglossic and Code-Switched Environments,Muhammad Abdul-Mageed and Chiyu Zhang and AbdelRahim Elmadany and Lyle Ungar,http://arxiv.org/pdf/2010.04900v2
http://arxiv.org/abs/2101.00027v1,creativecommons.org/licenses/by/4.0/,The Pile: An 800GB Dataset of Diverse Text for Language Modeling,Leo Gao and Stella Biderman and Sid Black and Laurence Golding and Travis Hoppe and Charles Foster and Jason Phang and Horace He and Anish Thite and Noa Nabeshima and Shawn Presser and Connor Leahy,http://arxiv.org/pdf/2101.00027v1
http://arxiv.org/abs/2103.12407v4,creativecommons.org/licenses/by/4.0/,Detecting Hate Speech with GPT-3,Ke-Li Chiu and Annie Collins and Rohan Alexander,http://arxiv.org/pdf/2103.12407v4
http://arxiv.org/abs/2108.02340v1,creativecommons.org/licenses/by/4.0/,Robust Transfer Learning with Pretrained Language Models through Adapters,Wenjuan Han and Bo Pang and Yingnian Wu,http://arxiv.org/pdf/2108.02340v1
http://arxiv.org/abs/2205.12586v2,creativecommons.org/licenses/by/4.0/,Perturbation Augmentation for Fairer NLP,Rebecca Qian and Candace Ross and Jude Fernandes and Eric Smith and Douwe Kiela and Adina Williams,http://arxiv.org/pdf/2205.12586v2
http://arxiv.org/abs/2208.05051v1,creativecommons.org/licenses/by/4.0/,Limitations of Language Models in Arithmetic and Symbolic Induction,Jing Qian and Hong Wang and Zekun Li and Shiyang Li and Xifeng Yan,http://arxiv.org/pdf/2208.05051v1
http://arxiv.org/abs/2210.07074v2,creativecommons.org/licenses/by/4.0/,CLASP: Few-Shot Cross-Lingual Data Augmentation for Semantic Parsing,Andy Rosenbaum and Saleh Soltan and Wael Hamza and Amir Saffari and Marco Damonte and Isabel Groves,http://arxiv.org/pdf/2210.07074v2
http://arxiv.org/abs/2101.03216v2,creativecommons.org/licenses/by/4.0/,Breaking Writer's Block: Low-cost Fine-tuning of Natural Language Generation Models,Alexandre Duval and Thomas Lamson and Gael de Leseleuc de Kerouara and Matthias Gallé,http://arxiv.org/pdf/2101.03216v2
http://arxiv.org/abs/2106.09204v1,creativecommons.org/licenses/by/4.0/,An Empirical Study on Hyperparameter Optimization for Fine-Tuning Pre-trained Language Models,Xueqing Liu and Chi Wang,http://arxiv.org/pdf/2106.09204v1
http://arxiv.org/abs/2107.05002v2,creativecommons.org/licenses/by/4.0/,Improving Low-resource Reading Comprehension via Cross-lingual Transposition Rethinking,Gaochen Wu and Bin Xu and Yuxin Qin and Fei Kong and Bangchang Liu and Hongwen Zhao and Dejie Chang,http://arxiv.org/pdf/2107.05002v2
http://arxiv.org/abs/2107.12600v1,creativecommons.org/licenses/by/4.0/,PiSLTRc: Position-informed Sign Language Transformer with Content-aware Convolution,Pan Xie and Mengyi Zhao and Xiaohui Hu,http://arxiv.org/pdf/2107.12600v1
http://arxiv.org/abs/2203.10415v1,creativecommons.org/licenses/by/4.0/,How does the pre-training objective affect what large language models learn about linguistic properties?,Ahmed Alajrami and Nikolaos Aletras,http://arxiv.org/pdf/2203.10415v1
http://arxiv.org/abs/2204.12820v1,creativecommons.org/licenses/by/4.0/,LyS_ACoruña at SemEval-2022 Task 10: Repurposing Off-the-Shelf Tools for Sentiment Analysis as Semantic Dependency Parsing,Iago Alonso-Alonso and David Vilares and Carlos Gómez-Rodríguez,http://arxiv.org/pdf/2204.12820v1
http://arxiv.org/abs/2206.08325v2,creativecommons.org/licenses/by/4.0/,Characteristics of Harmful Text: Towards Rigorous Benchmarking of Language Models,Maribeth Rauh and John Mellor and Jonathan Uesato and Po-Sen Huang and Johannes Welbl and Laura Weidinger and Sumanth Dathathri and Amelia Glaese and Geoffrey Irving and Iason Gabriel and William Isaac and Lisa Anne Hendricks,http://arxiv.org/pdf/2206.08325v2
http://arxiv.org/abs/2206.10744v1,creativecommons.org/licenses/by/4.0/,Don't Forget About Pronouns: Removing Gender Bias in Language Models Without Losing Factual Gender Information,Tomasz Limisiewicz and David Mareček,http://arxiv.org/pdf/2206.10744v1
http://arxiv.org/abs/2301.11660v2,creativecommons.org/licenses/by/4.0/,Probing Out-of-Distribution Robustness of Language Models with Parameter-Efficient Transfer Learning,Hyunsoo Cho and Choonghyun Park and Junyeop Kim and Hyuhng Joon Kim and Kang Min Yoo and Sang-goo Lee,http://arxiv.org/pdf/2301.11660v2
http://arxiv.org/abs/2303.07578v1,creativecommons.org/licenses/by/4.0/,VANI: Very-lightweight Accent-controllable TTS for Native and Non-native speakers with Identity Preservation,Rohan Badlani and Akshit Arora and Subhankar Ghosh and Rafael Valle and Kevin J. Shih and João Felipe Santos and Boris Ginsburg and Bryan Catanzaro,http://arxiv.org/pdf/2303.07578v1
http://arxiv.org/abs/2303.15619v1,creativecommons.org/licenses/by/4.0/,Typhoon: Towards an Effective Task-Specific Masking Strategy for Pre-trained Language Models,Muhammed Shahir Abdurrahman and Hashem Elezabi and Bruce Changlong Xu,http://arxiv.org/pdf/2303.15619v1
http://arxiv.org/abs/2209.15162v3,creativecommons.org/licenses/by/4.0/,Linearly Mapping from Image to Text Space,Jack Merullo and Louis Castricato and Carsten Eickhoff and Ellie Pavlick,http://arxiv.org/pdf/2209.15162v3
http://arxiv.org/abs/2205.09295v2,creativecommons.org/licenses/by/4.0/,Are Prompt-based Models Clueless?,Pride Kavumba and Ryo Takahashi and Yusuke Oda,http://arxiv.org/pdf/2205.09295v2
http://arxiv.org/abs/2207.12994v1,creativecommons.org/licenses/by/4.0/,V$^2$L: Leveraging Vision and Vision-language Models into Large-scale Product Retrieval,Wenhao Wang and Yifan Sun and Zongxin Yang and Yi Yang,http://arxiv.org/pdf/2207.12994v1
http://arxiv.org/abs/2212.14206v1,creativecommons.org/licenses/by/4.0/,Maximizing Use-Case Specificity through Precision Model Tuning,Pranjali Awasthi and David Recio-Mitter and Yosuke Kyle Sugi,http://arxiv.org/pdf/2212.14206v1
http://arxiv.org/abs/2205.07557v2,creativecommons.org/licenses/by/4.0/,"Heroes, Villains, and Victims, and GPT-3: Automated Extraction of Character Roles Without Training Data",Dominik Stammbach and Maria Antoniak and Elliott Ash,http://arxiv.org/pdf/2205.07557v2
http://arxiv.org/abs/1909.11272v1,creativecommons.org/licenses/by/4.0/,TalkDown: A Corpus for Condescension Detection in Context,Zijian Wang and Christopher Potts,http://arxiv.org/pdf/1909.11272v1
http://arxiv.org/abs/2109.00916v1,creativecommons.org/licenses/by/4.0/,Coarse-To-Fine And Cross-Lingual ASR Transfer,Peter Polák and Ondřej Bojar,http://arxiv.org/pdf/2109.00916v1
http://arxiv.org/abs/2206.05399v1,creativecommons.org/licenses/by/4.0/,Building a Personalized Dialogue System with Prompt-Tuning,Tomohito Kasahara and Daisuke Kawahara and Nguyen Tung and Shengzhe Li and Kenta Shinzato and Toshinori Sato,http://arxiv.org/pdf/2206.05399v1
http://arxiv.org/abs/2302.02178v1,creativecommons.org/licenses/by/4.0/,Construction Grammar Provides Unique Insight into Neural Language Models,Leonie Weissweiler and Taiqi He and Naoki Otani and David R. Mortensen and Lori Levin and Hinrich Schütze,http://arxiv.org/pdf/2302.02178v1
http://arxiv.org/abs/2303.15647v1,creativecommons.org/licenses/by/4.0/,Scaling Down to Scale Up: A Guide to Parameter-Efficient Fine-Tuning,Vladislav Lialin and Vijeta Deshpande and Anna Rumshisky,http://arxiv.org/pdf/2303.15647v1
http://arxiv.org/abs/2203.06096v1,creativecommons.org/licenses/by/4.0/,WLASL-LEX: a Dataset for Recognising Phonological Properties in American Sign Language,Federico Tavella and Viktor Schlegel and Marta Romeo and Aphrodite Galata and Angelo Cangelosi,http://arxiv.org/pdf/2203.06096v1
http://arxiv.org/abs/2204.04306v1,creativecommons.org/licenses/by/4.0/,MMTAfrica: Multilingual Machine Translation for African Languages,Chris C. Emezue and Bonaventure F. P. Dossou,http://arxiv.org/pdf/2204.04306v1
http://arxiv.org/abs/2205.10560v1,creativecommons.org/licenses/by/4.0/,Unsupervised Sign Language Phoneme Clustering using HamNoSys Notation,Boris Mocialov and Graham Turner and Helen Hastie,http://arxiv.org/pdf/2205.10560v1
http://arxiv.org/abs/2203.13151v1,creativecommons.org/licenses/by/4.0/,Multi-armed bandits for online optimization of language model pre-training: the use case of dynamic masking,Iñigo Urteaga and Moulay-Zaïdane Draïdia and Tomer Lancewicki and Shahram Khadivi,http://arxiv.org/pdf/2203.13151v1
http://arxiv.org/abs/2104.07639v4,creativecommons.org/licenses/by/4.0/,Robust Optimization for Multilingual Translation with Imbalanced Data,Xian Li and Hongyu Gong,http://arxiv.org/pdf/2104.07639v4
http://arxiv.org/abs/2111.02643v5,creativecommons.org/licenses/by/4.0/,Response Generation with Context-Aware Prompt Learning,Xiaodong Gu and Kang Min Yoo and Sang-Woo Lee,http://arxiv.org/pdf/2111.02643v5
http://arxiv.org/abs/2202.05144v1,creativecommons.org/licenses/by/4.0/,InPars: Data Augmentation for Information Retrieval using Large Language Models,Luiz Bonifacio and Hugo Abonizio and Marzieh Fadaee and Rodrigo Nogueira,http://arxiv.org/pdf/2202.05144v1
http://arxiv.org/abs/2303.05221v1,creativecommons.org/licenses/by/4.0/,SEAM: An Integrated Activation-Coupled Model of Sentence Processing and Eye Movements in Reading,Maximilian M. Rabe and Dario Paape and Daniela Mertzen and Shravan Vasishth and Ralf Engbert,http://arxiv.org/pdf/2303.05221v1
http://arxiv.org/abs/2206.04327v1,creativecommons.org/licenses/by/4.0/,Language Identification for Austronesian Languages,Jonathan Dunn and Wikke Nijhof,http://arxiv.org/pdf/2206.04327v1
http://arxiv.org/abs/2111.14745v1,creativecommons.org/licenses/by/4.0/,A Simple Long-Tailed Recognition Baseline via Vision-Language Model,Teli Ma and Shijie Geng and Mengmeng Wang and Jing Shao and Jiasen Lu and Hongsheng Li and Peng Gao and Yu Qiao,http://arxiv.org/pdf/2111.14745v1
http://arxiv.org/abs/2204.11574v1,creativecommons.org/licenses/by/4.0/,A global analysis of metrics used for measuring performance in natural language processing,Kathrin Blagec and Georg Dorffner and Milad Moradi and Simon Ott and Matthias Samwald,http://arxiv.org/pdf/2204.11574v1
http://arxiv.org/abs/2210.10358v2,creativecommons.org/licenses/by/4.0/,Leveraging a New Spanish Corpus for Multilingual and Crosslingual Metaphor Detection,Elisa Sanchez-Bayona and Rodrigo Agerri,http://arxiv.org/pdf/2210.10358v2
http://arxiv.org/abs/2105.11321v1,creativecommons.org/licenses/by/4.0/,Neural Language Models for Nineteenth-Century English,Kasra Hosseini and Kaspar Beelen and Giovanni Colavizza and Mariona Coll Ardanuy,http://arxiv.org/pdf/2105.11321v1
http://arxiv.org/abs/2106.01023v1,creativecommons.org/licenses/by/4.0/,One Teacher is Enough? Pre-trained Language Model Distillation from Multiple Teachers,Chuhan Wu and Fangzhao Wu and Yongfeng Huang,http://arxiv.org/pdf/2106.01023v1
http://arxiv.org/abs/2205.11277v2,creativecommons.org/licenses/by/4.0/,When does Parameter-Efficient Transfer Learning Work for Machine Translation?,Ahmet Üstün and Asa Cooper Stickland,http://arxiv.org/pdf/2205.11277v2
http://arxiv.org/abs/2205.11747v1,creativecommons.org/licenses/by/4.0/,BabyBear: Cheap inference triage for expensive language models,Leila Khalili and Yao You and John Bohannon,http://arxiv.org/pdf/2205.11747v1
http://arxiv.org/abs/2210.17497v1,creativecommons.org/licenses/by/4.0/,Leveraging Pre-trained Models for Failure Analysis Triplets Generation,Kenneth Ezukwoke and Anis Hoayek and Mireille Batton-Hubert and Xavier Boucher and Pascal Gounet and Jerome Adrian,http://arxiv.org/pdf/2210.17497v1
http://arxiv.org/abs/2211.15593v1,creativecommons.org/licenses/by/4.0/,GPT-Neo for commonsense reasoning-a theoretical and practical lens,Rohan Kashyap and Vivek Kashyap and Narendra C. P,http://arxiv.org/pdf/2211.15593v1
http://arxiv.org/abs/2212.08192v1,creativecommons.org/licenses/by/4.0/,The KITMUS Test: Evaluating Knowledge Integration from Multiple Sources in Natural Language Understanding Systems,Akshatha Arodi and Martin Pömsl and Kaheer Suleman and Adam Trischler and Alexandra Olteanu and Jackie Chi Kit Cheung,http://arxiv.org/pdf/2212.08192v1
http://arxiv.org/abs/2212.05238v1,creativecommons.org/licenses/by/4.0/,Structured information extraction from complex scientific text with fine-tuned large language models,Alexander Dunn and John Dagdelen and Nicholas Walker and Sanghoon Lee and Andrew S. Rosen and Gerbrand Ceder and Kristin Persson and Anubhav Jain,http://arxiv.org/pdf/2212.05238v1
http://arxiv.org/abs/2210.16431v1,creativecommons.org/licenses/by/4.0/,DiMBERT: Learning Vision-Language Grounded Representations with Disentangled Multimodal-Attention,Fenglin Liu and Xian Wu and Shen Ge and Xuancheng Ren and Wei Fan and Xu Sun and Yuexian Zou,http://arxiv.org/pdf/2210.16431v1
http://arxiv.org/abs/2101.05783v2,creativecommons.org/licenses/by/4.0/,Persistent Anti-Muslim Bias in Large Language Models,Abubakar Abid and Maheen Farooqi and James Zou,http://arxiv.org/pdf/2101.05783v2
http://arxiv.org/abs/2110.02370v1,creativecommons.org/licenses/by/4.0/,Leveraging the Inductive Bias of Large Language Models for Abstract Textual Reasoning,Christopher Michael Rytting and David Wingate,http://arxiv.org/pdf/2110.02370v1
http://arxiv.org/abs/2206.11484v2,creativecommons.org/licenses/by/4.0/,Towards WinoQueer: Developing a Benchmark for Anti-Queer Bias in Large Language Models,Virginia K. Felkner and Ho-Chun Herbert Chang and Eugene Jang and Jonathan May,http://arxiv.org/pdf/2206.11484v2
http://arxiv.org/abs/2211.04486v1,creativecommons.org/licenses/by/4.0/,Active Example Selection for In-Context Learning,Yiming Zhang and Shi Feng and Chenhao Tan,http://arxiv.org/pdf/2211.04486v1
http://arxiv.org/abs/2302.14520v1,creativecommons.org/licenses/by/4.0/,Large Language Models Are State-of-the-Art Evaluators of Translation Quality,Tom Kocmi and Christian Federmann,http://arxiv.org/pdf/2302.14520v1
http://arxiv.org/abs/1805.04453v1,creativecommons.org/licenses/by/4.0/,Bootstrapping Multilingual Intent Models via Machine Translation for Dialog Automation,Nicholas Ruiz and Srinivas Bangalore and John Chen,http://arxiv.org/pdf/1805.04453v1
http://arxiv.org/abs/2108.03353v1,creativecommons.org/licenses/by/4.0/,Screen2Words: Automatic Mobile UI Summarization with Multimodal Learning,Bryan Wang and Gang Li and Xin Zhou and Zhourong Chen and Tovi Grossman and Yang Li,http://arxiv.org/pdf/2108.03353v1
http://arxiv.org/abs/2110.08512v1,creativecommons.org/licenses/by/4.0/,AugmentedCode: Examining the Effects of Natural Language Resources in Code Retrieval Models,Mehdi Bahrami and N. C. Shrikanth and Yuji Mizobuchi and Lei Liu and Masahiro Fukuyori and Wei-Peng Chen and Kazuki Munakata,http://arxiv.org/pdf/2110.08512v1
http://arxiv.org/abs/2202.11923v1,creativecommons.org/licenses/by/4.0/,Welcome to the Modern World of Pronouns: Identity-Inclusive Natural Language Processing beyond Gender,Anne Lauscher and Archie Crowley and Dirk Hovy,http://arxiv.org/pdf/2202.11923v1
http://arxiv.org/abs/2205.00445v1,creativecommons.org/licenses/by/4.0/,"MRKL Systems: A modular, neuro-symbolic architecture that combines large language models, external knowledge sources and discrete reasoning",Ehud Karpas and Omri Abend and Yonatan Belinkov and Barak Lenz and Opher Lieber and Nir Ratner and Yoav Shoham and Hofit Bata and Yoav Levine and Kevin Leyton-Brown and Dor Muhlgay and Noam Rozen and Erez Schwartz and Gal Shachaf and Shai Shalev-Shwartz and Amnon Shashua and Moshe Tenenholtz,http://arxiv.org/pdf/2205.00445v1
http://arxiv.org/abs/2210.03162v1,creativecommons.org/licenses/by/4.0/,Prompt Compression and Contrastive Conditioning for Controllability and Toxicity Reduction in Language Models,David Wingate and Mohammad Shoeybi and Taylor Sorensen,http://arxiv.org/pdf/2210.03162v1
http://arxiv.org/abs/2210.03575v2,creativecommons.org/licenses/by/4.0/,Are Representations Built from the Ground Up? An Empirical Examination of Local Composition in Language Models,Emmy Liu and Graham Neubig,http://arxiv.org/pdf/2210.03575v2
http://arxiv.org/abs/2212.10770v1,creativecommons.org/licenses/by/4.0/,ImPaKT: A Dataset for Open-Schema Knowledge Base Construction,Luke Vilnis and Zach Fisher and Bhargav Kanagal and Patrick Murray and Sumit Sanghai,http://arxiv.org/pdf/2212.10770v1
http://arxiv.org/abs/2302.12069v1,creativecommons.org/licenses/by/4.0/,Deep learning model for Mongolian Citizens Feedback Analysis using Word Vector Embeddings,Zolzaya Dashdorj and Tsetsentsengel Munkhbayar and Stanislav Grigorev,http://arxiv.org/pdf/2302.12069v1
http://arxiv.org/abs/2303.06854v1,creativecommons.org/licenses/by/4.0/,Robust Contrastive Language-Image Pretraining against Adversarial Attacks,Wenhan Yang and Baharan Mirzasoleiman,http://arxiv.org/pdf/2303.06854v1
http://arxiv.org/abs/2303.14100v1,creativecommons.org/licenses/by/4.0/,Errors are Useful Prompts: Instruction Guided Task Programming with Verifier-Assisted Iterative Prompting,Marta Skreta and Naruki Yoshikawa and Sebastian Arellano-Rubach and Zhi Ji and Lasse Bjørn Kristensen and Kourosh Darvish and Alán Aspuru-Guzik and Florian Shkurti and Animesh Garg,http://arxiv.org/pdf/2303.14100v1
http://arxiv.org/abs/2303.15727v1,creativecommons.org/licenses/by/4.0/,Evaluation of ChatGPT for NLP-based Mental Health Applications,Bishal Lamichhane,http://arxiv.org/pdf/2303.15727v1
http://arxiv.org/abs/2209.12106v1,creativecommons.org/licenses/by/4.0/,Moral Mimicry: Large Language Models Produce Moral Rationalizations Tailored to Political Identity,Gabriel Simmons,http://arxiv.org/pdf/2209.12106v1
http://arxiv.org/abs/2304.10592v1,creativecommons.org/licenses/by/4.0/,MiniGPT-4: Enhancing Vision-Language Understanding with Advanced Large Language Models,Deyao Zhu and Jun Chen and Xiaoqian Shen and Xiang Li and Mohamed Elhoseiny,http://arxiv.org/pdf/2304.10592v1
http://arxiv.org/abs/2106.07716v1,creativecommons.org/licenses/by/4.0/,Overcoming Domain Mismatch in Low Resource Sequence-to-Sequence ASR Models using Hybrid Generated Pseudotranscripts,Chak-Fai Li and Francis Keith and William Hartmann and Matthew Snover and Owen Kimball,http://arxiv.org/pdf/2106.07716v1
http://arxiv.org/abs/2010.11856v3,creativecommons.org/licenses/by/4.0/,XOR QA: Cross-lingual Open-Retrieval Question Answering,Akari Asai and Jungo Kasai and Jonathan H. Clark and Kenton Lee and Eunsol Choi and Hannaneh Hajishirzi,http://arxiv.org/pdf/2010.11856v3
http://arxiv.org/abs/2301.01967v1,creativecommons.org/licenses/by/4.0/,A Survey of Code-switching: Linguistic and Social Perspectives for Language Technologies,A. Seza Doğruöz and Sunayana Sitaram and Barbara E. Bullock and Almeida Jacqueline Toribio,http://arxiv.org/pdf/2301.01967v1
http://arxiv.org/abs/2210.16147v3,creativecommons.org/licenses/by/4.0/,Modeling structure-building in the brain with CCG parsing and large language models,Miloš Stanojević and Jonathan R. Brennan and Donald Dunagan and Mark Steedman and John T. Hale,http://arxiv.org/pdf/2210.16147v3
http://arxiv.org/abs/2302.09458v1,creativecommons.org/licenses/by/4.0/,Learning Language Representations with Logical Inductive Bias,Jianshu Chen,http://arxiv.org/pdf/2302.09458v1
http://arxiv.org/abs/2102.13136v1,creativecommons.org/licenses/by/4.0/,Automated essay scoring using efficient transformer-based language models,Christopher M Ormerod and Akanksha Malhotra and Amir Jafari,http://arxiv.org/pdf/2102.13136v1
http://arxiv.org/abs/2103.10360v2,creativecommons.org/licenses/by/4.0/,GLM: General Language Model Pretraining with Autoregressive Blank Infilling,Zhengxiao Du and Yujie Qian and Xiao Liu and Ming Ding and Jiezhong Qiu and Zhilin Yang and Jie Tang,http://arxiv.org/pdf/2103.10360v2
http://arxiv.org/abs/2109.10847v2,creativecommons.org/licenses/by/4.0/,Small-Bench NLP: Benchmark for small single GPU trained models in Natural Language Processing,Kamal Raj Kanakarajan and Bhuvana Kundumani and Malaikannan Sankarasubbu,http://arxiv.org/pdf/2109.10847v2
http://arxiv.org/abs/2201.07434v2,creativecommons.org/licenses/by/4.0/,Interpreting Arabic Transformer Models,Ahmed Abdelali and Nadir Durrani and Fahim Dalvi and Hassan Sajjad,http://arxiv.org/pdf/2201.07434v2
http://arxiv.org/abs/2207.14255v1,creativecommons.org/licenses/by/4.0/,Efficient Training of Language Models to Fill in the Middle,Mohammad Bavarian and Heewoo Jun and Nikolas Tezak and John Schulman and Christine McLeavey and Jerry Tworek and Mark Chen,http://arxiv.org/pdf/2207.14255v1
http://arxiv.org/abs/2302.00070v1,creativecommons.org/licenses/by/4.0/,Debiasing Vision-Language Models via Biased Prompts,Ching-Yao Chuang and Varun Jampani and Yuanzhen Li and Antonio Torralba and Stefanie Jegelka,http://arxiv.org/pdf/2302.00070v1
http://arxiv.org/abs/2303.17612v2,creativecommons.org/licenses/by/4.0/,"oBERTa: Improving Sparse Transfer Learning via improved initialization, distillation, and pruning regimes",Daniel Campos and Alexandre Marques and Mark Kurtz and ChengXiang Zhai,http://arxiv.org/pdf/2303.17612v2 | |
http://arxiv.org/abs/2205.12005v2,creativecommons.org/licenses/by/4.0/,mPLUG: Effective and Efficient Vision-Language Learning by Cross-modal Skip-connections,Chenliang Li and Haiyang Xu and Junfeng Tian and Wei Wang and Ming Yan and Bin Bi and Jiabo Ye and Hehong Chen and Guohai Xu and Zheng Cao and Ji Zhang and Songfang Huang and Fei Huang and Jingren Zhou and Luo Si,http://arxiv.org/pdf/2205.12005v2 | |
http://arxiv.org/abs/2301.13294v2,creativecommons.org/licenses/by/4.0/,Adaptive Machine Translation with Large Language Models,Yasmin Moslem and Rejwanul Haque and John D. Kelleher and Andy Way,http://arxiv.org/pdf/2301.13294v2 | |
http://arxiv.org/abs/2203.07785v1,creativecommons.org/licenses/by/4.0/,The Ghost in the Machine has an American accent: value conflict in GPT-3,Rebecca L Johnson and Giada Pistilli and Natalia Menédez-González and Leslye Denisse Dias Duran and Enrico Panai and Julija Kalpokiene and Donald Jay Bertulfo,http://arxiv.org/pdf/2203.07785v1 | |
http://arxiv.org/abs/2010.11973v1,creativecommons.org/licenses/by/4.0/,Rediscovering the Slavic Continuum in Representations Emerging from Neural Models of Spoken Language Identification,Badr M. Abdullah and Jacek Kudera and Tania Avgustinova and Bernd Möbius and Dietrich Klakow,http://arxiv.org/pdf/2010.11973v1 | |
http://arxiv.org/abs/2211.08411v1,creativecommons.org/licenses/by/4.0/,Large Language Models Struggle to Learn Long-Tail Knowledge,Nikhil Kandpal and Haikang Deng and Adam Roberts and Eric Wallace and Colin Raffel,http://arxiv.org/pdf/2211.08411v1 | |
http://arxiv.org/abs/2301.00303v1,creativecommons.org/licenses/by/4.0/,Rethinking with Retrieval: Faithful Large Language Model Inference,Hangfeng He and Hongming Zhang and Dan Roth,http://arxiv.org/pdf/2301.00303v1 | |
http://arxiv.org/abs/2303.08559v1,creativecommons.org/licenses/by/4.0/,"Large Language Model Is Not a Good Few-shot Information Extractor, but a Good Reranker for Hard Samples!",Yubo Ma and Yixin Cao and YongChing Hong and Aixin Sun,http://arxiv.org/pdf/2303.08559v1 | |
http://arxiv.org/abs/2303.09136v1,creativecommons.org/licenses/by/4.0/,A Short Survey of Viewing Large Language Models in Legal Aspect,Zhongxiang Sun,http://arxiv.org/pdf/2303.09136v1 | |
http://arxiv.org/abs/2304.02868v1,creativecommons.org/licenses/by/4.0/,Can Large Language Models Play Text Games Well? Current State-of-the-Art and Open Questions,Chen Feng Tsai and Xiaochen Zhou and Sierra S. Liu and Jing Li and Mo Yu and Hongyuan Mei,http://arxiv.org/pdf/2304.02868v1 | |
http://arxiv.org/abs/2304.10611v1,creativecommons.org/licenses/by/4.0/,Multi-aspect Repetition Suppression and Content Moderation of Large Language Models,Minghui Zhang and Alex Sokolov and Weixin Cai and Si-Qing Chen,http://arxiv.org/pdf/2304.10611v1 | |
http://arxiv.org/abs/2205.12650v2,creativecommons.org/licenses/by/4.0/,Few-shot Reranking for Multi-hop QA via Language Model Prompting,Muhammad Khalifa and Lajanugen Logeswaran and Moontae Lee and Honglak Lee and Lu Wang,http://arxiv.org/pdf/2205.12650v2 | |
http://arxiv.org/abs/2212.10933v1,creativecommons.org/licenses/by/4.0/,Resolving Indirect Referring Expressions for Entity Selection,Mohammad Javad Hosseini and Filip Radlinski and Silvia Pareti and Annie Louis,http://arxiv.org/pdf/2212.10933v1 | |
http://arxiv.org/abs/2301.11596v2,creativecommons.org/licenses/by/4.0/,ThoughtSource: A central hub for large language model reasoning data,Simon Ott and Konstantin Hebenstreit and Valentin Liévin and Christoffer Egeberg Hother and Milad Moradi and Maximilian Mayrhauser and Robert Praas and Ole Winther and Matthias Samwald,http://arxiv.org/pdf/2301.11596v2 | |
http://arxiv.org/abs/2304.02138v2,creativecommons.org/licenses/by/4.0/,Geotechnical Parrot Tales (GPT): Harnessing Large Language Models in geotechnical engineering,Krishna Kumar,http://arxiv.org/pdf/2304.02138v2 | |
http://arxiv.org/abs/2010.12626v1,creativecommons.org/licenses/by/4.0/,Topic Modeling with Contextualized Word Representation Clusters,Laure Thompson and David Mimno,http://arxiv.org/pdf/2010.12626v1 | |
http://arxiv.org/abs/2203.10945v1,creativecommons.org/licenses/by/4.0/,AraBART: a Pretrained Arabic Sequence-to-Sequence Model for Abstractive Summarization,Moussa Kamal Eddine and Nadi Tomeh and Nizar Habash and Joseph Le Roux and Michalis Vazirgiannis,http://arxiv.org/pdf/2203.10945v1 | |
http://arxiv.org/abs/2010.14649v2,creativecommons.org/licenses/by/4.0/,Learning Contextualised Cross-lingual Word Embeddings and Alignments for Extremely Low-Resource Languages Using Parallel Corpora,Takashi Wada and Tomoharu Iwata and Yuji Matsumoto and Timothy Baldwin and Jey Han Lau,http://arxiv.org/pdf/2010.14649v2 | |
http://arxiv.org/abs/2109.12068v4,creativecommons.org/licenses/by/4.0/,AraT5: Text-to-Text Transformers for Arabic Language Generation,El Moatez Billah Nagoudi and AbdelRahim Elmadany and Muhammad Abdul-Mageed,http://arxiv.org/pdf/2109.12068v4 | |
http://arxiv.org/abs/2112.06905v2,creativecommons.org/licenses/by/4.0/,GLaM: Efficient Scaling of Language Models with Mixture-of-Experts,Nan Du and Yanping Huang and Andrew M. Dai and Simon Tong and Dmitry Lepikhin and Yuanzhong Xu and Maxim Krikun and Yanqi Zhou and Adams Wei Yu and Orhan Firat and Barret Zoph and Liam Fedus and Maarten Bosma and Zongwei Zhou and Tao Wang and Yu Emma Wang and Kellie Webster and Marie Pellat and Kevin Robinson and Kathleen Meier-Hellstern and Toju Duke and Lucas Dixon and Kun Zhang and Quoc V Le and Yonghui Wu and Zhifeng Chen and Claire Cui,http://arxiv.org/pdf/2112.06905v2 | |
http://arxiv.org/abs/2211.08412v1,creativecommons.org/licenses/by/4.0/,Evaluating the Factual Consistency of Large Language Models Through Summarization,Derek Tam and Anisha Mascarenhas and Shiyue Zhang and Sarah Kwan and Mohit Bansal and Colin Raffel,http://arxiv.org/pdf/2211.08412v1 | |
http://arxiv.org/abs/2303.06748v1,creativecommons.org/licenses/by/4.0/,DTT: An Example-Driven Tabular Transformer by Leveraging Large Language Models,Arash Dargahi Nobari and Davood Rafiei,http://arxiv.org/pdf/2303.06748v1 | |
http://arxiv.org/abs/2209.11035v2,creativecommons.org/licenses/by/4.0/,MonoByte: A Pool of Monolingual Byte-level Language Models,Hugo Abonizio and Leandro Rodrigues de Souza and Roberto Lotufo and Rodrigo Nogueira,http://arxiv.org/pdf/2209.11035v2 | |
http://arxiv.org/abs/2106.13474v2,creativecommons.org/licenses/by/4.0/,"Adapt-and-Distill: Developing Small, Fast and Effective Pretrained Language Models for Domains",Yunzhi Yao and Shaohan Huang and Wenhui Wang and Li Dong and Furu Wei,http://arxiv.org/pdf/2106.13474v2 | |
http://arxiv.org/abs/1910.14243v1,creativecommons.org/licenses/by/4.0/,DiaNet: BERT and Hierarchical Attention Multi-Task Learning of Fine-Grained Dialect,Muhammad Abdul-Mageed and Chiyu Zhang and AbdelRahim Elmadany and Arun Rajendran and Lyle Ungar,http://arxiv.org/pdf/1910.14243v1 | |
http://arxiv.org/abs/2011.11536v1,creativecommons.org/licenses/by/4.0/,Studying Taxonomy Enrichment on Diachronic WordNet Versions,Irina Nikishina and Alexander Panchenko and Varvara Logacheva and Natalia Loukachevitch,http://arxiv.org/pdf/2011.11536v1 | |
http://arxiv.org/abs/2210.05033v1,creativecommons.org/licenses/by/4.0/,Multilingual Representation Distillation with Contrastive Learning,Weiting Tan and Kevin Heffernan and Holger Schwenk and Philipp Koehn,http://arxiv.org/pdf/2210.05033v1 | |
http://arxiv.org/abs/2302.12746v2,creativecommons.org/licenses/by/4.0/,Spanish Built Factual Freectianary (Spanish-BFF): the first AI-generated free dictionary,Miguel Ortega-Martín and Óscar García-Sierra and Alfonso Ardoiz and Juan Carlos Armenteros and Jorge Álvarez and Adrián Alonso,http://arxiv.org/pdf/2302.12746v2 | |
http://arxiv.org/abs/2103.07449v3,creativecommons.org/licenses/by/4.0/,Cooperative Self-training of Machine Reading Comprehension,Hongyin Luo and Shang-Wen Li and Mingye Gao and Seunghak Yu and James Glass,http://arxiv.org/pdf/2103.07449v3 | |
http://arxiv.org/abs/2106.10076v2,creativecommons.org/licenses/by/4.0/,Label prompt for multi-label text classification,Rui Song and Xingbing Chen and Zelong Liu and Haining An and Zhiqi Zhang and Xiaoguang Wang and Hao Xu,http://arxiv.org/pdf/2106.10076v2 | |
http://arxiv.org/abs/2202.04350v1,creativecommons.org/licenses/by/4.0/,pNLP-Mixer: an Efficient all-MLP Architecture for Language,Francesco Fusco and Damian Pascual and Peter Staar,http://arxiv.org/pdf/2202.04350v1 | |
http://arxiv.org/abs/2209.11902v1,creativecommons.org/licenses/by/4.0/,Learning Chess With Language Models and Transformers,Michael DeLeo and Erhan Guven,http://arxiv.org/pdf/2209.11902v1 | |
http://arxiv.org/abs/2210.04186v2,creativecommons.org/licenses/by/4.0/,Analogy Generation by Prompting Large Language Models: A Case Study of InstructGPT,Bhavya Bhavya and Jinjun Xiong and Chengxiang Zhai,http://arxiv.org/pdf/2210.04186v2 | |
http://arxiv.org/abs/2211.11216v2,creativecommons.org/licenses/by/4.0/,Exploring the Efficacy of Pre-trained Checkpoints in Text-to-Music Generation Task,Shangda Wu and Maosong Sun,http://arxiv.org/pdf/2211.11216v2 | |
http://arxiv.org/abs/2010.09803v2,creativecommons.org/licenses/by/4.0/,Adversarial Training for Code Retrieval with Question-Description Relevance Regularization,Jie Zhao and Huan Sun,http://arxiv.org/pdf/2010.09803v2 | |
http://arxiv.org/abs/2105.15014v1,creativecommons.org/licenses/by/4.0/,Singing Language Identification using a Deep Phonotactic Approach,Lenny Renault and Andrea Vaglio and Romain Hennequin,http://arxiv.org/pdf/2105.15014v1 | |
http://arxiv.org/abs/2204.08887v1,creativecommons.org/licenses/by/4.0/,Cross-Lingual Phrase Retrieval,Heqi Zheng and Xiao Zhang and Zewen Chi and Heyan Huang and Tan Yan and Tian Lan and Wei Wei and Xian-Ling Mao,http://arxiv.org/pdf/2204.08887v1 | |
http://arxiv.org/abs/2210.13002v1,creativecommons.org/licenses/by/4.0/,An Empirical Revisiting of Linguistic Knowledge Fusion in Language Understanding Tasks,Changlong Yu and Tianyi Xiao and Lingpeng Kong and Yangqiu Song and Wilfred Ng,http://arxiv.org/pdf/2210.13002v1 | |
http://arxiv.org/abs/2301.10481v2,creativecommons.org/licenses/by/4.0/,FewShotTextGCN: K-hop neighborhood regularization for few-shot learning on graphs,Niels van der Heijden and Ekaterina Shutova and Helen Yannakoudakis,http://arxiv.org/pdf/2301.10481v2 | |
http://arxiv.org/abs/2207.12759v1,creativecommons.org/licenses/by/4.0/,Training Effective Neural Sentence Encoders from Automatically Mined Paraphrases,Sławomir Dadas,http://arxiv.org/pdf/2207.12759v1 | |
http://arxiv.org/abs/2208.11981v1,creativecommons.org/licenses/by/4.0/,On Reality and the Limits of Language Data,Nigel H. Collier and Fangyu Liu and Ehsan Shareghi,http://arxiv.org/pdf/2208.11981v1 | |
http://arxiv.org/abs/2210.06576v1,creativecommons.org/licenses/by/4.0/,DATScore: Evaluating Translation with Data Augmented Translations,Moussa Kamal Eddine and Guokan Shang and Michalis Vazirgiannis,http://arxiv.org/pdf/2210.06576v1 | |
http://arxiv.org/abs/2303.15621v2,creativecommons.org/licenses/by/4.0/,ChatGPT as a Factual Inconsistency Evaluator for Text Summarization,Zheheng Luo and Qianqian Xie and Sophia Ananiadou,http://arxiv.org/pdf/2303.15621v2 | |
http://arxiv.org/abs/2211.11081v1,creativecommons.org/licenses/by/4.0/,A Theory of Unsupervised Translation Motivated by Understanding Animal Communication,Shafi Goldwasser and David F. Gruber and Adam Tauman Kalai and Orr Paradise,http://arxiv.org/pdf/2211.11081v1 | |
http://arxiv.org/abs/2206.01335v2,creativecommons.org/licenses/by/4.0/,"Code Generation Tools (Almost) for Free? A Study of Few-Shot, Pre-Trained Language Models on Code",Patrick Bareiß and Beatriz Souza and Marcelo d'Amorim and Michael Pradel,http://arxiv.org/pdf/2206.01335v2 | |
http://arxiv.org/abs/2003.04866v1,creativecommons.org/licenses/by/4.0/,Multi-SimLex: A Large-Scale Evaluation of Multilingual and Cross-Lingual Lexical Semantic Similarity,Ivan Vulić and Simon Baker and Edoardo Maria Ponti and Ulla Petti and Ira Leviant and Kelly Wing and Olga Majewska and Eden Bar and Matt Malone and Thierry Poibeau and Roi Reichart and Anna Korhonen,http://arxiv.org/pdf/2003.04866v1 | |
http://arxiv.org/abs/2202.07894v1,creativecommons.org/licenses/by/4.0/,Knowledge Transfer from Large-scale Pretrained Language Models to End-to-end Speech Recognizers,Yotaro Kubo and Shigeki Karita and Michiel Bacchiani,http://arxiv.org/pdf/2202.07894v1 | |
http://arxiv.org/abs/2304.09842v1,creativecommons.org/licenses/by/4.0/,Chameleon: Plug-and-Play Compositional Reasoning with Large Language Models,Pan Lu and Baolin Peng and Hao Cheng and Michel Galley and Kai-Wei Chang and Ying Nian Wu and Song-Chun Zhu and Jianfeng Gao,http://arxiv.org/pdf/2304.09842v1 | |
http://arxiv.org/abs/2304.11679v1,creativecommons.org/licenses/by/4.0/,Domain Mastery Benchmark: An Ever-Updating Benchmark for Evaluating Holistic Domain Knowledge of Large Language Model--A Preliminary Release,Zhouhong Gu and Xiaoxuan Zhu and Haoning Ye and Lin Zhang and Zhuozhi Xiong and Zihan Li and Qianyu He and Sihang Jiang and Hongwei Feng and Yanghua Xiao,http://arxiv.org/pdf/2304.11679v1 | |
http://arxiv.org/abs/2301.09919v2,creativecommons.org/licenses/by/4.0/,Opportunities and Challenges in Neural Dialog Tutoring,Jakub Macina and Nico Daheim and Lingzhi Wang and Tanmay Sinha and Manu Kapur and Iryna Gurevych and Mrinmaya Sachan,http://arxiv.org/pdf/2301.09919v2 | |
http://arxiv.org/abs/2304.11082v1,creativecommons.org/licenses/by/4.0/,Fundamental Limitations of Alignment in Large Language Models,Yotam Wolf and Noam Wies and Yoav Levine and Amnon Shashua,http://arxiv.org/pdf/2304.11082v1 | |
http://arxiv.org/abs/2303.16749v1,creativecommons.org/licenses/by/4.0/,Improving Code Generation by Training with Natural Language Feedback,Angelica Chen and Jérémy Scheurer and Tomasz Korbak and Jon Ander Campos and Jun Shern Chan and Samuel R. Bowman and Kyunghyun Cho and Ethan Perez,http://arxiv.org/pdf/2303.16749v1 | |
http://arxiv.org/abs/2304.02017v5,creativecommons.org/licenses/by/4.0/,"Unlocking the Potential of ChatGPT: A Comprehensive Exploration of its Applications, Advantages, Limitations, and Future Directions in Natural Language Processing",Walid Hariri,http://arxiv.org/pdf/2304.02017v5 | |
http://arxiv.org/abs/2106.01091v1,creativecommons.org/licenses/by/4.0/,belabBERT: a Dutch RoBERTa-based language model applied to psychiatric classification,Joppe Wouts and Janna de Boer and Alban Voppel and Sanne Brederoo and Sander van Splunter and Iris Sommer,http://arxiv.org/pdf/2106.01091v1 | |
http://arxiv.org/abs/2203.04904v3,creativecommons.org/licenses/by/4.0/,Rethinking Task Sampling for Few-shot Vision-Language Transfer Learning,Zhenhailong Wang and Hang Yu and Manling Li and Han Zhao and Heng Ji,http://arxiv.org/pdf/2203.04904v3 | |
http://arxiv.org/abs/2103.14583v3,creativecommons.org/licenses/by/4.0/,Leveraging pre-trained representations to improve access to untranscribed speech from endangered languages,Nay San and Martijn Bartelds and Mitchell Browne and Lily Clifford and Fiona Gibson and John Mansfield and David Nash and Jane Simpson and Myfany Turpin and Maria Vollmer and Sasha Wilmoth and Dan Jurafsky,http://arxiv.org/pdf/2103.14583v3 | |
http://arxiv.org/abs/2205.12673v2,creativecommons.org/licenses/by/4.0/,InstructDial: Improving Zero and Few-shot Generalization in Dialogue through Instruction Tuning,Prakhar Gupta and Cathy Jiao and Yi-Ting Yeh and Shikib Mehri and Maxine Eskenazi and Jeffrey P. Bigham,http://arxiv.org/pdf/2205.12673v2 | |
http://arxiv.org/abs/2211.11875v1,creativecommons.org/licenses/by/4.0/,Enhancing Self-Consistency and Performance of Pre-Trained Language Models through Natural Language Inference,Eric Mitchell and Joseph J. Noh and Siyan Li and William S. Armstrong and Ananth Agarwal and Patrick Liu and Chelsea Finn and Christopher D. Manning,http://arxiv.org/pdf/2211.11875v1 | |
http://arxiv.org/abs/2302.06560v1,creativecommons.org/licenses/by/4.0/,Large Scale Multi-Lingual Multi-Modal Summarization Dataset,Yash Verma and Anubhav Jangra and Raghvendra Kumar and Sriparna Saha,http://arxiv.org/pdf/2302.06560v1 | |
http://arxiv.org/abs/2212.04088v3,creativecommons.org/licenses/by/4.0/,LLM-Planner: Few-Shot Grounded Planning for Embodied Agents with Large Language Models,Chan Hee Song and Jiaman Wu and Clayton Washington and Brian M. Sadler and Wei-Lun Chao and Yu Su,http://arxiv.org/pdf/2212.04088v3 | |
http://arxiv.org/abs/2108.03867v1,creativecommons.org/licenses/by/4.0/,Benchmarking Multi-Task Learning for Sentiment Analysis and Offensive Language Identification in Under-Resourced Dravidian Languages,Adeep Hande and Siddhanth U Hegde and Ruba Priyadharshini and Rahul Ponnusamy and Prasanna Kumar Kumaresan and Sajeetha Thavareesan and Bharathi Raja Chakravarthi,http://arxiv.org/pdf/2108.03867v1 | |
http://arxiv.org/abs/2205.14660v1,creativecommons.org/licenses/by/4.0/,SFE-AI at SemEval-2022 Task 11: Low-Resource Named Entity Recognition using Large Pre-trained Language Models,Changyu Hou and Jun Wang and Yixuan Qiao and Peng Jiang and Peng Gao and Guotong Xie and Qizhi Lin and Xiaopeng Wang and Xiandi Jiang and Benqi Wang and Qifeng Xiao,http://arxiv.org/pdf/2205.14660v1 | |
http://arxiv.org/abs/2211.01180v2,creativecommons.org/licenses/by/4.0/,"M-SpeechCLIP: Leveraging Large-Scale, Pre-Trained Models for Multilingual Speech to Image Retrieval",Layne Berry and Yi-Jen Shih and Hsuan-Fu Wang and Heng-Jui Chang and Hung-yi Lee and David Harwath,http://arxiv.org/pdf/2211.01180v2 | |
http://arxiv.org/abs/2212.10534v1,creativecommons.org/licenses/by/4.0/,DISCO: Distilling Phrasal Counterfactuals with Large Language Models,Zeming Chen and Qiyue Gao and Kyle Richardson and Antoine Bosselut and Ashish Sabharwal,http://arxiv.org/pdf/2212.10534v1 | |
http://arxiv.org/abs/2303.14951v1,creativecommons.org/licenses/by/4.0/,Improving Contextualized Topic Models with Negative Sampling,Suman Adhya and Avishek Lahiri and Debarshi Kumar Sanyal and Partha Pratim Das,http://arxiv.org/pdf/2303.14951v1 | |
http://arxiv.org/abs/2211.05110v1,creativecommons.org/licenses/by/4.0/,Large Language Models with Controllable Working Memory,Daliang Li and Ankit Singh Rawat and Manzil Zaheer and Xin Wang and Michal Lukasik and Andreas Veit and Felix Yu and Sanjiv Kumar,http://arxiv.org/pdf/2211.05110v1 | |
http://arxiv.org/abs/2302.09207v1,creativecommons.org/licenses/by/4.0/,RetVec: Resilient and Efficient Text Vectorizer,Elie Bursztein and Marina Zhang and Owen Vallis and Xinyu Jia and Alexey Kurakin,http://arxiv.org/pdf/2302.09207v1 | |
http://arxiv.org/abs/2110.01710v1,creativecommons.org/licenses/by/4.0/,PyTorrent: A Python Library Corpus for Large-scale Language Models,Mehdi Bahrami and N. C. Shrikanth and Shade Ruangwan and Lei Liu and Yuji Mizobuchi and Masahiro Fukuyori and Wei-Peng Chen and Kazuki Munakata and Tim Menzies,http://arxiv.org/pdf/2110.01710v1 | |
http://arxiv.org/abs/2111.00607v3,creativecommons.org/licenses/by/4.0/,A Systematic Investigation of Commonsense Knowledge in Large Language Models,Xiang Lorraine Li and Adhiguna Kuncoro and Jordan Hoffmann and Cyprien de Masson d'Autume and Phil Blunsom and Aida Nematzadeh,http://arxiv.org/pdf/2111.00607v3 | |
http://arxiv.org/abs/2208.14536v1,creativecommons.org/licenses/by/4.0/,MultiCoNER: A Large-scale Multilingual dataset for Complex Named Entity Recognition,Shervin Malmasi and Anjie Fang and Besnik Fetahu and Sudipta Kar and Oleg Rokhlenko,http://arxiv.org/pdf/2208.14536v1 | |
http://arxiv.org/abs/2212.10511v1,creativecommons.org/licenses/by/4.0/,When Not to Trust Language Models: Investigating Effectiveness and Limitations of Parametric and Non-Parametric Memories,Alex Mallen and Akari Asai and Victor Zhong and Rajarshi Das and Hannaneh Hajishirzi and Daniel Khashabi,http://arxiv.org/pdf/2212.10511v1 | |
http://arxiv.org/abs/2304.06712v1,creativecommons.org/licenses/by/4.0/,What does CLIP know about a red circle? Visual prompt engineering for VLMs,Aleksandar Shtedritski and Christian Rupprecht and Andrea Vedaldi,http://arxiv.org/pdf/2304.06712v1 | |
http://arxiv.org/abs/2304.10548v1,creativecommons.org/licenses/by/4.0/,Supporting Qualitative Analysis with Large Language Models: Combining Codebook with GPT-3 for Deductive Coding,Ziang Xiao and Xingdi Yuan and Q. Vera Liao and Rania Abdelghani and Pierre-Yves Oudeyer,http://arxiv.org/pdf/2304.10548v1 | |
http://arxiv.org/abs/1911.00637v1,creativecommons.org/licenses/by/4.0/,Sentence-Level BERT and Multi-Task Learning of Age and Gender in Social Media,Muhammad Abdul-Mageed and Chiyu Zhang and Arun Rajendran and AbdelRahim Elmadany and Michael Przystupa and Lyle Ungar,http://arxiv.org/pdf/1911.00637v1 | |
http://arxiv.org/abs/2012.08146v1,creativecommons.org/licenses/by/4.0/,Generation of complex database queries and API calls from natural language utterances,Amol Kelkar and Nachiketa Rajpurohit and Utkarsh Mittal and Peter Relan,http://arxiv.org/pdf/2012.08146v1 | |
http://arxiv.org/abs/2107.06483v1,creativecommons.org/licenses/by/4.0/,From Machine Translation to Code-Switching: Generating High-Quality Code-Switched Text,Ishan Tarunesh and Syamantak Kumar and Preethi Jyothi,http://arxiv.org/pdf/2107.06483v1 | |
http://arxiv.org/abs/2110.03047v1,creativecommons.org/licenses/by/4.0/,Integrating Categorical Features in End-to-End ASR,Rongqing Huang,http://arxiv.org/pdf/2110.03047v1 | |
http://arxiv.org/abs/2110.03252v1,creativecommons.org/licenses/by/4.0/,Layer-wise Pruning of Transformer Attention Heads for Efficient Language Modeling,Kyuhong Shim and Iksoo Choi and Wonyong Sung and Jungwook Choi,http://arxiv.org/pdf/2110.03252v1 | |
http://arxiv.org/abs/2110.13229v2,creativecommons.org/licenses/by/4.0/,Distributionally Robust Recurrent Decoders with Random Network Distillation,Antonio Valerio Miceli-Barone and Alexandra Birch and Rico Sennrich,http://arxiv.org/pdf/2110.13229v2 | |
http://arxiv.org/abs/2202.05209v1,creativecommons.org/licenses/by/4.0/,Improving Automatic Speech Recognition for Non-Native English with Transfer Learning and Language Model Decoding,Peter Sullivan and Toshiko Shibano and Muhammad Abdul-Mageed,http://arxiv.org/pdf/2202.05209v1 | |
http://arxiv.org/abs/2204.04289v1,creativecommons.org/licenses/by/4.0/,Towards Understanding Large-Scale Discourse Structures in Pre-Trained and Fine-Tuned Language Models,Patrick Huber and Giuseppe Carenini,http://arxiv.org/pdf/2204.04289v1 | |
http://arxiv.org/abs/2205.12702v3,creativecommons.org/licenses/by/4.0/,Detecting Label Errors by using Pre-Trained Language Models,Derek Chong and Jenny Hong and Christopher D. Manning,http://arxiv.org/pdf/2205.12702v3 | |
http://arxiv.org/abs/2206.10781v1,creativecommons.org/licenses/by/4.0/,Efficient and effective training of language and graph neural network models,Vassilis N. Ioannidis and Xiang Song and Da Zheng and Houyu Zhang and Jun Ma and Yi Xu and Belinda Zeng and Trishul Chilimbi and George Karypis,http://arxiv.org/pdf/2206.10781v1 | |
http://arxiv.org/abs/2208.03713v1,creativecommons.org/licenses/by/4.0/,Study of Encoder-Decoder Architectures for Code-Mix Search Query Translation,Mandar Kulkarni and Soumya Chennabasavaraj and Nikesh Garera,http://arxiv.org/pdf/2208.03713v1 | |
http://arxiv.org/abs/2210.06150v1,creativecommons.org/licenses/by/4.0/,Annotating Norwegian Language Varieties on Twitter for Part-of-Speech,Petter Mæhlum and Andre Kåsen and Samia Touileb and Jeremy Barnes,http://arxiv.org/pdf/2210.06150v1 | |
http://arxiv.org/abs/2211.08073v2,creativecommons.org/licenses/by/4.0/,GLUE-X: Evaluating Natural Language Understanding Models from an Out-of-distribution Generalization Perspective,Linyi Yang and Shuibai Zhang and Libo Qin and Yafu Li and Yidong Wang and Hanmeng Liu and Jindong Wang and Xing Xie and Yue Zhang,http://arxiv.org/pdf/2211.08073v2 | |
http://arxiv.org/abs/2301.11507v1,creativecommons.org/licenses/by/4.0/,Semi-Parametric Video-Grounded Text Generation,Sungdong Kim and Jin-Hwa Kim and Jiyoung Lee and Minjoon Seo,http://arxiv.org/pdf/2301.11507v1 | |
http://arxiv.org/abs/2304.05221v1,creativecommons.org/licenses/by/4.0/,Towards preserving word order importance through Forced Invalidation,Hadeel Al-Negheimish and Pranava Madhyastha and Alessandra Russo,http://arxiv.org/pdf/2304.05221v1 | |
http://arxiv.org/abs/2102.02557v1,creativecommons.org/licenses/by/4.0/,Adaptive Semiparametric Language Models,Dani Yogatama and Cyprien de Masson d'Autume and Lingpeng Kong,http://arxiv.org/pdf/2102.02557v1 | |
http://arxiv.org/abs/2104.09617v1,creativecommons.org/licenses/by/4.0/,Operationalizing a National Digital Library: The Case for a Norwegian Transformer Model,Per E Kummervold and Javier de la Rosa and Freddy Wetjen and Svein Arne Brygfjeld,http://arxiv.org/pdf/2104.09617v1 | |
http://arxiv.org/abs/2107.05697v1,creativecommons.org/licenses/by/4.0/,Few-shot Language Coordination by Modeling Theory of Mind,Hao Zhu and Graham Neubig and Yonatan Bisk,http://arxiv.org/pdf/2107.05697v1 | |
http://arxiv.org/abs/2205.11503v1,creativecommons.org/licenses/by/4.0/,Prompt-and-Rerank: A Method for Zero-Shot and Few-Shot Arbitrary Textual Style Transfer with Small Language Models,Mirac Suzgun and Luke Melas-Kyriazi and Dan Jurafsky,http://arxiv.org/pdf/2205.11503v1 | |
http://arxiv.org/abs/2211.10000v1,creativecommons.org/licenses/by/4.0/,Protein language model rescue mutations highlight variant effects and structure in clinically relevant genes,Onuralp Soylemez and Pablo Cordero,http://arxiv.org/pdf/2211.10000v1 | |
http://arxiv.org/abs/2212.10622v1,creativecommons.org/licenses/by/4.0/,mFACE: Multilingual Summarization with Factual Consistency Evaluation,Roee Aharoni and Shashi Narayan and Joshua Maynez and Jonathan Herzig and Elizabeth Clark and Mirella Lapata,http://arxiv.org/pdf/2212.10622v1 | |
http://arxiv.org/abs/2111.14709v3,creativecommons.org/licenses/by/4.0/,Linguistic Knowledge in Data Augmentation for Natural Language Processing: An Example on Chinese Question Matching,Zhengxiang Wang,http://arxiv.org/pdf/2111.14709v3 | |
http://arxiv.org/abs/2204.13509v2,creativecommons.org/licenses/by/4.0/,On the Effect of Pretraining Corpora on In-context Learning by a Large-scale Language Model,Seongjin Shin and Sang-Woo Lee and Hwijeen Ahn and Sungdong Kim and HyoungSeok Kim and Boseop Kim and Kyunghyun Cho and Gichang Lee and Woomyoung Park and Jung-Woo Ha and Nako Sung,http://arxiv.org/pdf/2204.13509v2 | |
http://arxiv.org/abs/2209.15093v1,creativecommons.org/licenses/by/4.0/,Unpacking Large Language Models with Conceptual Consistency,Pritish Sahu and Michael Cogswell and Yunye Gong and Ajay Divakaran,http://arxiv.org/pdf/2209.15093v1 | |
http://arxiv.org/abs/2109.00165v1,creativecommons.org/licenses/by/4.0/,An Unsupervised Method for Building Sentence Simplification Corpora in Multiple Languages,Xinyu Lu and Jipeng Qiang and Yun Li and Yunhao Yuan and Yi Zhu,http://arxiv.org/pdf/2109.00165v1 | |
http://arxiv.org/abs/2206.12036v2,creativecommons.org/licenses/by/4.0/,SC-Ques: A Sentence Completion Question Dataset for English as a Second Language Learners,Qiongqiong Liu and Yaying Huang and Zitao Liu and Shuyan Huang and Jiahao Chen and Xiangyu Zhao and Guimin Lin and Yuyu Zhou and Weiqi Luo,http://arxiv.org/pdf/2206.12036v2 | |
http://arxiv.org/abs/2212.03551v5,creativecommons.org/licenses/by/4.0/,Talking About Large Language Models,Murray Shanahan,http://arxiv.org/pdf/2212.03551v5 | |
http://arxiv.org/abs/2301.05843v1,creativecommons.org/licenses/by/4.0/,Leveraging Large Language Models to Power Chatbots for Collecting User Self-Reported Data,Jing Wei and Sungdong Kim and Hyunhoon Jung and Young-Ho Kim,http://arxiv.org/pdf/2301.05843v1 | |
http://arxiv.org/abs/2303.17511v1,creativecommons.org/licenses/by/4.0/,On pitfalls (and advantages) of sophisticated large language models,Anna Strasser,http://arxiv.org/pdf/2303.17511v1 | |
http://arxiv.org/abs/2110.05367v1,creativecommons.org/licenses/by/4.0/,Improving Gender Fairness of Pre-Trained Language Models without Catastrophic Forgetting,Zahra Fatemi and Chen Xing and Wenhao Liu and Caiming Xiong,http://arxiv.org/pdf/2110.05367v1 | |
http://arxiv.org/abs/2210.14868v3,creativecommons.org/licenses/by/4.0/,Multi-lingual Evaluation of Code Generation Models,Ben Athiwaratkun and Sanjay Krishna Gouda and Zijian Wang and Xiaopeng Li and Yuchen Tian and Ming Tan and Wasi Uddin Ahmad and Shiqi Wang and Qing Sun and Mingyue Shang and Sujan Kumar Gonugondla and Hantian Ding and Varun Kumar and Nathan Fulton and Arash Farahani and Siddhartha Jain and Robert Giaquinto and Haifeng Qian and Murali Krishna Ramanathan and Ramesh Nallapati and Baishakhi Ray and Parminder Bhatia and Sudipta Sengupta and Dan Roth and Bing Xiang,http://arxiv.org/pdf/2210.14868v3 | |
http://arxiv.org/abs/2012.07528v1,creativecommons.org/licenses/by/4.0/,Disentangling Homophemes in Lip Reading using Perplexity Analysis,Souheil Fenghour and Daqing Chen and Kun Guo and Perry Xiao,http://arxiv.org/pdf/2012.07528v1 | |
http://arxiv.org/abs/2011.09031v4,creativecommons.org/licenses/by/4.0/,Predictions For Pre-training Language Models,Tong Guo,http://arxiv.org/pdf/2011.09031v4 | |
http://arxiv.org/abs/2209.10063v3,creativecommons.org/licenses/by/4.0/,Generate rather than Retrieve: Large Language Models are Strong Context Generators,Wenhao Yu and Dan Iter and Shuohang Wang and Yichong Xu and Mingxuan Ju and Soumya Sanyal and Chenguang Zhu and Michael Zeng and Meng Jiang,http://arxiv.org/pdf/2209.10063v3 | |
http://arxiv.org/abs/2108.01928v1,creativecommons.org/licenses/by/4.0/,How to Query Language Models?,Leonard Adolphs and Shehzaad Dhuliawala and Thomas Hofmann,http://arxiv.org/pdf/2108.01928v1 | |
http://arxiv.org/abs/2304.10946v1,creativecommons.org/licenses/by/4.0/,CancerGPT: Few-shot Drug Pair Synergy Prediction using Large Pre-trained Language Models,Tianhao Li and Sandesh Shetty and Advaith Kamath and Ajay Jaiswal and Xianqian Jiang and Ying Ding and Yejin Kim,http://arxiv.org/pdf/2304.10946v1 | |
http://arxiv.org/abs/2011.02323v1,creativecommons.org/licenses/by/4.0/,Indic-Transformers: An Analysis of Transformer Language Models for Indian Languages,Kushal Jain and Adwait Deshpande and Kumar Shridhar and Felix Laumann and Ayushman Dash,http://arxiv.org/pdf/2011.02323v1 | |
http://arxiv.org/abs/2101.05509v3,creativecommons.org/licenses/by/4.0/,Transformer-based Language Model Fine-tuning Methods for COVID-19 Fake News Detection,Ben Chen and Bin Chen and Dehong Gao and Qijin Chen and Chengfu Huo and Xiaonan Meng and Weijun Ren and Yang Zhou,http://arxiv.org/pdf/2101.05509v3 | |
http://arxiv.org/abs/2110.08484v2,creativecommons.org/licenses/by/4.0/,A Good Prompt Is Worth Millions of Parameters: Low-resource Prompt-based Learning for Vision-Language Models,Woojeong Jin and Yu Cheng and Yelong Shen and Weizhu Chen and Xiang Ren,http://arxiv.org/pdf/2110.08484v2 | |
http://arxiv.org/abs/1808.07364v3,creativecommons.org/licenses/by/4.0/,Neural Named Entity Recognition from Subword Units,Abdalghani Abujabal and Judith Gaspers,http://arxiv.org/pdf/1808.07364v3 | |
http://arxiv.org/abs/2304.11384v1,creativecommons.org/licenses/by/4.0/,An Empirical Study on Using Large Language Models for Multi-Intent Comment Generation,Mingyang Geng and Shangwen Wang and Dezun Dong and Haotian Wang and Ge Li and Zhi Jin and Xiaoguang Mao and Xiangke Liao,http://arxiv.org/pdf/2304.11384v1 | |
http://arxiv.org/abs/2205.13708v1,creativecommons.org/licenses/by/4.0/,HiJoNLP at SemEval-2022 Task 2: Detecting Idiomaticity of Multiword Expressions using Multilingual Pretrained Language Models,Minghuan Tan,http://arxiv.org/pdf/2205.13708v1 | |
http://arxiv.org/abs/1804.01768v1,creativecommons.org/licenses/by/4.0/,Chinese-Portuguese Machine Translation: A Study on Building Parallel Corpora from Comparable Texts,Siyou Liu and Longyue Wang and Chao-Hong Liu,http://arxiv.org/pdf/1804.01768v1 | |
http://arxiv.org/abs/2010.14534v1,creativecommons.org/licenses/by/4.0/,Unmasking Contextual Stereotypes: Measuring and Mitigating BERT's Gender Bias,Marion Bartl and Malvina Nissim and Albert Gatt,http://arxiv.org/pdf/2010.14534v1 | |
http://arxiv.org/abs/2012.15263v1,creativecommons.org/licenses/by/4.0/,Predicting cross-linguistic adjective order with information gain,William Dyer and Richard Futrell and Zoey Liu and Gregory Scontras,http://arxiv.org/pdf/2012.15263v1 | |
http://arxiv.org/abs/2101.12338v1,creativecommons.org/licenses/by/4.0/,Enabling Robots to Draw and Tell: Towards Visually Grounded Multimodal Description Generation,Ting Han and Sina Zarrieß,http://arxiv.org/pdf/2101.12338v1 | |
http://arxiv.org/abs/2104.05753v1,creativecommons.org/licenses/by/4.0/,Towards a parallel corpus of Portuguese and the Bantu language Emakhuwa of Mozambique,Felermino D. M. A. Ali and Andrew Caines and Jaimito L. A. Malavi,http://arxiv.org/pdf/2104.05753v1 | |
http://arxiv.org/abs/2205.10078v1,creativecommons.org/licenses/by/4.0/,Uzbek affix finite state machine for stemming,Maksud Sharipov and Ulugbek Salaev,http://arxiv.org/pdf/2205.10078v1 | |
http://arxiv.org/abs/2207.00758v1,creativecommons.org/licenses/by/4.0/,MIA 2022 Shared Task: Evaluating Cross-lingual Open-Retrieval Question Answering for 16 Diverse Languages,Akari Asai and Shayne Longpre and Jungo Kasai and Chia-Hsuan Lee and Rui Zhang and Junjie Hu and Ikuya Yamada and Jonathan H. Clark and Eunsol Choi,http://arxiv.org/pdf/2207.00758v1 | |
http://arxiv.org/abs/2304.07297v1,creativecommons.org/licenses/by/4.0/,Language Instructed Reinforcement Learning for Human-AI Coordination,Hengyuan Hu and Dorsa Sadigh,http://arxiv.org/pdf/2304.07297v1 | |
http://arxiv.org/abs/2103.10685v3,creativecommons.org/licenses/by/4.0/,Controllable Generation from Pre-trained Language Models via Inverse Prompting,Xu Zou and Da Yin and Qingyang Zhong and Ming Ding and Hongxia Yang and Zhilin Yang and Jie Tang,http://arxiv.org/pdf/2103.10685v3 | |
http://arxiv.org/abs/2205.05535v1,creativecommons.org/licenses/by/4.0/,Clinical Prompt Learning with Frozen Language Models,Niall Taylor and Yi Zhang and Dan Joyce and Alejo Nevado-Holgado and Andrey Kormilitzin,http://arxiv.org/pdf/2205.05535v1 | |
http://arxiv.org/abs/2205.09712v1,creativecommons.org/licenses/by/4.0/,Selection-Inference: Exploiting Large Language Models for Interpretable Logical Reasoning,Antonia Creswell and Murray Shanahan and Irina Higgins,http://arxiv.org/pdf/2205.09712v1 | |
http://arxiv.org/abs/2302.00763v1,creativecommons.org/licenses/by/4.0/,Collaborating with language models for embodied reasoning,Ishita Dasgupta and Christine Kaeser-Chen and Kenneth Marino and Arun Ahuja and Sheila Babayan and Felix Hill and Rob Fergus,http://arxiv.org/pdf/2302.00763v1 | |
http://arxiv.org/abs/2209.05946v1,creativecommons.org/licenses/by/4.0/,OmDet: Language-Aware Object Detection with Large-scale Vision-Language Multi-dataset Pre-training,Tiancheng Zhao and Peng Liu and Xiaopeng Lu and Kyusong Lee,http://arxiv.org/pdf/2209.05946v1 | |
http://arxiv.org/abs/2211.03730v1,creativecommons.org/licenses/by/4.0/,DPCSpell: A Transformer-based Detector-Purificator-Corrector Framework for Spelling Error Correction of Bangla and Resource Scarce Indic Languages,Mehedi Hasan Bijoy and Nahid Hossain and Salekul Islam and Swakkhar Shatabda,http://arxiv.org/pdf/2211.03730v1 | |
http://arxiv.org/abs/1906.08237v2,creativecommons.org/licenses/by/4.0/,XLNet: Generalized Autoregressive Pretraining for Language Understanding,Zhilin Yang and Zihang Dai and Yiming Yang and Jaime Carbonell and Ruslan Salakhutdinov and Quoc V. Le,http://arxiv.org/pdf/1906.08237v2 | |
http://arxiv.org/abs/2011.14039v2,creativecommons.org/licenses/by/4.0/,An Investigation of Language Model Interpretability via Sentence Editing,Samuel Stevens and Yu Su,http://arxiv.org/pdf/2011.14039v2 | |
http://arxiv.org/abs/2105.02486v2,creativecommons.org/licenses/by/4.0/,Towards General Natural Language Understanding with Probabilistic Worldbuilding,Abulhair Saparov and Tom M. Mitchell,http://arxiv.org/pdf/2105.02486v2 | |
http://arxiv.org/abs/2109.06704v1,creativecommons.org/licenses/by/4.0/,KFCNet: Knowledge Filtering and Contrastive Learning Network for Generative Commonsense Reasoning,Haonan Li and Yeyun Gong and Jian Jiao and Ruofei Zhang and Timothy Baldwin and Nan Duan,http://arxiv.org/pdf/2109.06704v1 | |
http://arxiv.org/abs/2204.01845v1,creativecommons.org/licenses/by/4.0/,Compliance Checking with NLI: Privacy Policies vs. Regulations,Amin Rabinia and Zane Nygaard,http://arxiv.org/pdf/2204.01845v1 | |
http://arxiv.org/abs/2210.01296v2,creativecommons.org/licenses/by/4.0/,Recitation-Augmented Language Models,Zhiqing Sun and Xuezhi Wang and Yi Tay and Yiming Yang and Denny Zhou,http://arxiv.org/pdf/2210.01296v2 | |
http://arxiv.org/abs/2212.06369v3,creativecommons.org/licenses/by/4.0/,Technical Report -- Competition Solution for Prompt Tuning using Pretrained Language Model,Jiang-Long Song and Wu-He Zou and Feng Li and Xiao-Lei Qin and Wei-Dong Zhang,http://arxiv.org/pdf/2212.06369v3 | |
http://arxiv.org/abs/2302.07388v1,creativecommons.org/licenses/by/4.0/,Adding Instructions during Pretraining: Effective Way of Controlling Toxicity in Language Models,Shrimai Prabhumoye and Mostofa Patwary and Mohammad Shoeybi and Bryan Catanzaro,http://arxiv.org/pdf/2302.07388v1 | |
http://arxiv.org/abs/2110.08551v1,creativecommons.org/licenses/by/4.0/,HRKD: Hierarchical Relational Knowledge Distillation for Cross-domain Language Model Compression,Chenhe Dong and Yaliang Li and Ying Shen and Minghui Qiu,http://arxiv.org/pdf/2110.08551v1 | |
http://arxiv.org/abs/2203.15917v1,creativecommons.org/licenses/by/4.0/,Shallow Fusion of Weighted Finite-State Transducer and Language Model for Text Normalization,Evelina Bakhturina and Yang Zhang and Boris Ginsburg,http://arxiv.org/pdf/2203.15917v1 | |
http://arxiv.org/abs/2206.13289v1,creativecommons.org/licenses/by/4.0/,Analyzing Encoded Concepts in Transformer Language Models,Hassan Sajjad and Nadir Durrani and Fahim Dalvi and Firoj Alam and Abdul Rafae Khan and Jia Xu,http://arxiv.org/pdf/2206.13289v1 | |
http://arxiv.org/abs/2302.04269v1,creativecommons.org/licenses/by/4.0/,Diagnosing and Rectifying Vision Models using Language,Yuhui Zhang and Jeff Z. HaoChen and Shih-Cheng Huang and Kuan-Chieh Wang and James Zou and Serena Yeung,http://arxiv.org/pdf/2302.04269v1 | |
http://arxiv.org/abs/2303.03103v1,creativecommons.org/licenses/by/4.0/,Towards Zero-Shot Functional Compositionality of Language Models,Hangyeol Yu and Myeongho Jeong and Jamin Shin and Hyeongdon Moon and Juneyoung Park and Seungtaek Choi,http://arxiv.org/pdf/2303.03103v1 | |
http://arxiv.org/abs/2110.06419v1,creativecommons.org/licenses/by/4.0/,Federated Natural Language Generation for Personalized Dialogue System,Yujie Lu and Chao Huang and Huanli Zhan and Yong Zhuang,http://arxiv.org/pdf/2110.06419v1 | |
http://arxiv.org/abs/2202.00470v1,creativecommons.org/licenses/by/4.0/,An Assessment of the Impact of OCR Noise on Language Models,Konstantin Todorov and Giovanni Colavizza,http://arxiv.org/pdf/2202.00470v1 | |
http://arxiv.org/abs/2201.11576v1,creativecommons.org/licenses/by/4.0/,Grad2Task: Improved Few-shot Text Classification Using Gradients for Task Representation,Jixuan Wang and Kuan-Chieh Wang and Frank Rudzicz and Michael Brudno,http://arxiv.org/pdf/2201.11576v1 | |
http://arxiv.org/abs/2208.14754v1,creativecommons.org/licenses/by/4.0/,LexMAE: Lexicon-Bottlenecked Pretraining for Large-Scale Retrieval,Tao Shen and Xiubo Geng and Chongyang Tao and Can Xu and Xiaolong Huang and Binxing Jiao and Linjun Yang and Daxin Jiang,http://arxiv.org/pdf/2208.14754v1 | |
http://arxiv.org/abs/2303.06573v1,creativecommons.org/licenses/by/4.0/,Large Language Models Know Your Contextual Search Intent: A Prompting Framework for Conversational Search,Kelong Mao and Zhicheng Dou and Haonan Chen and Fengran Mo and Hongjin Qian,http://arxiv.org/pdf/2303.06573v1 | |
http://arxiv.org/abs/2304.06030v2,creativecommons.org/licenses/by/4.0/,The Role of Large Language Models in the Recognition of Territorial Sovereignty: An Analysis of the Construction of Legitimacy,Francisco Castillo-Eslava and Carlos Mougan and Alejandro Romero-Reche and Steffen Staab,http://arxiv.org/pdf/2304.06030v2 | |
http://arxiv.org/abs/2008.06268v1,creativecommons.org/licenses/by/4.0/,An Efficient Model Inference Algorithm for Learning-based Testing of Reactive Systems,Muddassar A. Sindhu,http://arxiv.org/pdf/2008.06268v1 | |
http://arxiv.org/abs/2011.12170v1,creativecommons.org/licenses/by/4.0/,Domain-Transferable Method for Named Entity Recognition Task,Vladislav Mikhailov and Tatiana Shavrina,http://arxiv.org/pdf/2011.12170v1 | |
http://arxiv.org/abs/2109.04607v1,creativecommons.org/licenses/by/4.0/,IndoBERTweet: A Pretrained Language Model for Indonesian Twitter with Effective Domain-Specific Vocabulary Initialization,Fajri Koto and Jey Han Lau and Timothy Baldwin,http://arxiv.org/pdf/2109.04607v1 | |
http://arxiv.org/abs/2206.03216v2,creativecommons.org/licenses/by/4.0/,Data Governance in the Age of Large-Scale Data-Driven Language Technology,Yacine Jernite and Huu Nguyen and Stella Biderman and Anna Rogers and Maraim Masoud and Valentin Danchev and Samson Tan and Alexandra Sasha Luccioni and Nishant Subramani and Gérard Dupont and Jesse Dodge and Kyle Lo and Zeerak Talat and Isaac Johnson and Dragomir Radev and Somaieh Nikpoor and Jörg Frohberg and Aaron Gokaslan and Peter Henderson and Rishi Bommasani and Margaret Mitchell,http://arxiv.org/pdf/2206.03216v2 | |
http://arxiv.org/abs/1803.05820v1,creativecommons.org/licenses/by/4.0/,RUSSE: The First Workshop on Russian Semantic Similarity,Alexander Panchenko and Natalia Loukachevitch and Dmitry Ustalov and Denis Paperno and Christian Meyer and Natalia Konstantinova,http://arxiv.org/pdf/1803.05820v1 | |
http://arxiv.org/abs/2109.00590v4,creativecommons.org/licenses/by/4.0/,WebQA: Multihop and Multimodal QA,Yingshan Chang and Mridu Narang and Hisami Suzuki and Guihong Cao and Jianfeng Gao and Yonatan Bisk,http://arxiv.org/pdf/2109.00590v4 | |
http://arxiv.org/abs/2111.09296v3,creativecommons.org/licenses/by/4.0/,XLS-R: Self-supervised Cross-lingual Speech Representation Learning at Scale,Arun Babu and Changhan Wang and Andros Tjandra and Kushal Lakhotia and Qiantong Xu and Naman Goyal and Kritika Singh and Patrick von Platen and Yatharth Saraf and Juan Pino and Alexei Baevski and Alexis Conneau and Michael Auli,http://arxiv.org/pdf/2111.09296v3 | |
http://arxiv.org/abs/2210.14472v1,creativecommons.org/licenses/by/4.0/,Sinhala Sentence Embedding: A Two-Tiered Structure for Low-Resource Languages,Gihan Weeraprameshwara and Vihanga Jayawickrama and Nisansa de Silva and Yudhanjaya Wijeratne,http://arxiv.org/pdf/2210.14472v1 | |
http://arxiv.org/abs/2212.10505v1,creativecommons.org/licenses/by/4.0/,DePlot: One-shot visual language reasoning by plot-to-table translation,Fangyu Liu and Julian Martin Eisenschlos and Francesco Piccinno and Syrine Krichene and Chenxi Pang and Kenton Lee and Mandar Joshi and Wenhu Chen and Nigel Collier and Yasemin Altun,http://arxiv.org/pdf/2212.10505v1 | |
http://arxiv.org/abs/2302.13241v1,creativecommons.org/licenses/by/4.0/,Cross-Lingual Question Answering over Knowledge Base as Reading Comprehension,Chen Zhang and Yuxuan Lai and Yansong Feng and Xingyu Shen and Haowei Du and Dongyan Zhao,http://arxiv.org/pdf/2302.13241v1 | |
http://arxiv.org/abs/2101.01785v3,creativecommons.org/licenses/by/4.0/,ARBERT & MARBERT: Deep Bidirectional Transformers for Arabic,Muhammad Abdul-Mageed and AbdelRahim Elmadany and El Moatez Billah Nagoudi,http://arxiv.org/pdf/2101.01785v3 | |
http://arxiv.org/abs/2108.11857v2,creativecommons.org/licenses/by/4.0/,Probing Pre-trained Auto-regressive Language Models for Named Entity Typing and Recognition,Elena V. Epure and Romain Hennequin,http://arxiv.org/pdf/2108.11857v2 | |
http://arxiv.org/abs/2110.10329v1,creativecommons.org/licenses/by/4.0/,SLAM: A Unified Encoder for Speech and Language Modeling via Speech-Text Joint Pre-Training,Ankur Bapna and Yu-an Chung and Nan Wu and Anmol Gulati and Ye Jia and Jonathan H. Clark and Melvin Johnson and Jason Riesa and Alexis Conneau and Yu Zhang,http://arxiv.org/pdf/2110.10329v1 | |
http://arxiv.org/abs/2210.07792v2,creativecommons.org/licenses/by/4.0/,Robust Preference Learning for Storytelling via Contrastive Reinforcement Learning,Louis Castricato and Alexander Havrilla and Shahbuland Matiana and Michael Pieler and Anbang Ye and Ian Yang and Spencer Frazier and Mark Riedl,http://arxiv.org/pdf/2210.07792v2 | |
http://arxiv.org/abs/2210.08726v2,creativecommons.org/licenses/by/4.0/,"RARR: Researching and Revising What Language Models Say, Using Language Models",Luyu Gao and Zhuyun Dai and Panupong Pasupat and Anthony Chen and Arun Tejasvi Chaganty and Yicheng Fan and Vincent Y. Zhao and Ni Lao and Hongrae Lee and Da-Cheng Juan and Kelvin Guu,http://arxiv.org/pdf/2210.08726v2 | |
http://arxiv.org/abs/2303.05707v1,creativecommons.org/licenses/by/4.0/,MuLTI: Efficient Video-and-Language Understanding with MultiWay-Sampler and Multiple Choice Modeling,Jiaqi Xu and Bo Liu and Yunkuo Chen and Mengli Cheng and Xing Shi,http://arxiv.org/pdf/2303.05707v1 | |
http://arxiv.org/abs/2304.07327v1,creativecommons.org/licenses/by/4.0/,OpenAssistant Conversations -- Democratizing Large Language Model Alignment,Andreas Köpf and Yannic Kilcher and Dimitri von Rütte and Sotiris Anagnostidis and Zhi-Rui Tam and Keith Stevens and Abdullah Barhoum and Nguyen Minh Duc and Oliver Stanley and Richárd Nagyfi and Shahul ES and Sameer Suri and David Glushkov and Arnav Dantuluri and Andrew Maguire and Christoph Schuhmann and Huu Nguyen and Alexander Mattick,http://arxiv.org/pdf/2304.07327v1 | |
http://arxiv.org/abs/2005.07503v1,creativecommons.org/licenses/by/4.0/,COVID-Twitter-BERT: A Natural Language Processing Model to Analyse COVID-19 Content on Twitter,Martin Müller and Marcel Salathé and Per E Kummervold,http://arxiv.org/pdf/2005.07503v1 | |
http://arxiv.org/abs/2105.08840v3,creativecommons.org/licenses/by/4.0/,Training Heterogeneous Features in Sequence to Sequence Tasks: Latent Enhanced Multi-filter Seq2Seq Model,Yunhao Yang and Zhaokun Xue,http://arxiv.org/pdf/2105.08840v3 | |
http://arxiv.org/abs/2205.12538v2,creativecommons.org/licenses/by/4.0/,Is a Question Decomposition Unit All We Need?,Pruthvi Patel and Swaroop Mishra and Mihir Parmar and Chitta Baral,http://arxiv.org/pdf/2205.12538v2 | |
http://arxiv.org/abs/2206.05802v2,creativecommons.org/licenses/by/4.0/,Self-critiquing models for assisting human evaluators,William Saunders and Catherine Yeh and Jeff Wu and Steven Bills and Long Ouyang and Jonathan Ward and Jan Leike,http://arxiv.org/pdf/2206.05802v2 | |
http://arxiv.org/abs/2211.06687v3,creativecommons.org/licenses/by/4.0/,Large-scale Contrastive Language-Audio Pretraining with Feature Fusion and Keyword-to-Caption Augmentation,Yusong Wu and Ke Chen and Tianyu Zhang and Yuchen Hui and Taylor Berg-Kirkpatrick and Shlomo Dubnov,http://arxiv.org/pdf/2211.06687v3 | |
http://arxiv.org/abs/2304.09138v1,creativecommons.org/licenses/by/4.0/,Exploring the Trade-Offs: Unified Large Language Models vs Local Fine-Tuned Models for Highly-Specific Radiology NLI Task,Zihao Wu and Lu Zhang and Chao Cao and Xiaowei Yu and Haixing Dai and Chong Ma and Zhengliang Liu and Lin Zhao and Gang Li and Wei Liu and Quanzheng Li and Dinggang Shen and Xiang Li and Dajiang Zhu and Tianming Liu,http://arxiv.org/pdf/2304.09138v1 | |
http://arxiv.org/abs/2210.13701v1,creativecommons.org/licenses/by/4.0/,Rich Knowledge Sources Bring Complex Knowledge Conflicts: Recalibrating Models to Reflect Conflicting Evidence,Hung-Ting Chen and Michael J. Q. Zhang and Eunsol Choi,http://arxiv.org/pdf/2210.13701v1 | |
http://arxiv.org/abs/2205.02022v2,creativecommons.org/licenses/by/4.0/,A Few Thousand Translations Go a Long Way! Leveraging Pre-trained Models for African News Translation,David Ifeoluwa Adelani and Jesujoba Oluwadara Alabi and Angela Fan and Julia Kreutzer and Xiaoyu Shen and Machel Reid and Dana Ruiter and Dietrich Klakow and Peter Nabende and Ernie Chang and Tajuddeen Gwadabe and Freshia Sackey and Bonaventure F. P. Dossou and Chris Chinenye Emezue and Colin Leong and Michael Beukman and Shamsuddeen Hassan Muhammad and Guyo Dub Jarso and Oreen Yousuf and Andre Niyongabo Rubungo and Gilles Hacheme and Eric Peter Wairagala and Muhammad Umair Nasir and Benjamin Ayoade Ajibade and Tunde Oluwaseyi Ajayi and Yvonne Wambui Gitau and Jade Abbott and Mohamed Ahmed and Millicent Ochieng and Anuoluwapo Aremu and Perez Ogayo and Jonathan Mukiibi and Fatoumata Ouoba Kabore and Godson Koffi Kalipe and Derguene Mbaye and Allahsera Auguste Tapo and Victoire Memdjokam Koagne and Edwin Munkoh-Buabeng and Valencia Wagner and Idris Abdulmumin and Ayodele Awokoya and Happy Buzaaba and Blessing Sibanda and Andiswa Bukula and Sam Manthalu,http://arxiv.org/pdf/2205.02022v2 | |
http://arxiv.org/abs/2301.11847v1,creativecommons.org/licenses/by/4.0/,A Comparative Study of Pretrained Language Models for Long Clinical Text,Yikuan Li and Ramsey M. Wehbe and Faraz S. Ahmad and Hanyin Wang and Yuan Luo,http://arxiv.org/pdf/2301.11847v1 | |
http://arxiv.org/abs/2105.04024v3,creativecommons.org/licenses/by/4.0/,DocSCAN: Unsupervised Text Classification via Learning from Neighbors,Dominik Stammbach and Elliott Ash,http://arxiv.org/pdf/2105.04024v3 | |
http://arxiv.org/abs/2109.06692v1,creativecommons.org/licenses/by/4.0/,LRWR: Large-Scale Benchmark for Lip Reading in Russian language,Evgeniy Egorov and Vasily Kostyumov and Mikhail Konyk and Sergey Kolesnikov,http://arxiv.org/pdf/2109.06692v1 | |
http://arxiv.org/abs/2207.09854v1,creativecommons.org/licenses/by/4.0/,"Auto-active Verification of Graph Algorithms, Written in OCaml",Daniel Castanho and Mário Pereira,http://arxiv.org/pdf/2207.09854v1 | |
http://arxiv.org/abs/2106.01251v1,creativecommons.org/licenses/by/4.0/,Multilingual Medical Question Answering and Information Retrieval for Rural Health Intelligence Access,Vishal Vinod and Susmit Agrawal and Vipul Gaurav and Pallavi R and Savita Choudhary,http://arxiv.org/pdf/2106.01251v1 | |
http://arxiv.org/abs/2111.08088v1,creativecommons.org/licenses/by/4.0/,Assessing gender bias in medical and scientific masked language models with StereoSet,Robert Robinson,http://arxiv.org/pdf/2111.08088v1 | |
http://arxiv.org/abs/2301.01224v1,creativecommons.org/licenses/by/4.0/,An Empirical Investigation into the Use of Image Captioning for Automated Software Documentation,Kevin Moran and Ali Yachnes and George Purnell and Junayed Mahmud and Michele Tufano and Carlos Bernal-Cárdenas and Denys Poshyvanyk and Zach H'Doubler,http://arxiv.org/pdf/2301.01224v1 | |
http://arxiv.org/abs/2304.03086v1,creativecommons.org/licenses/by/4.0/,ChatGPT for Shaping the Future of Dentistry: The Potential of Multi-Modal Large Language Model,Hanyao Huang and Ou Zheng and Dongdong Wang and Jiayi Yin and Zijin Wang and Shengxuan Ding and Heng Yin and Chuan Xu and Renjie Yang and Qian Zheng and Bing Shi,http://arxiv.org/pdf/2304.03086v1 | |
http://arxiv.org/abs/2304.05501v1,creativecommons.org/licenses/by/4.0/,L3MVN: Leveraging Large Language Models for Visual Target Navigation,Bangguo Yu and Hamidreza Kasaei and Ming Cao,http://arxiv.org/pdf/2304.05501v1 | |
http://arxiv.org/abs/2109.04877v1,creativecommons.org/licenses/by/4.0/,Efficient Test Time Adapter Ensembling for Low-resource Language Varieties,Xinyi Wang and Yulia Tsvetkov and Sebastian Ruder and Graham Neubig,http://arxiv.org/pdf/2109.04877v1 | |
http://arxiv.org/abs/2204.04873v1,creativecommons.org/licenses/by/4.0/,Adapting BigScience Multilingual Model to Unseen Languages,Zheng-Xin Yong and Vassilina Nikoulina,http://arxiv.org/pdf/2204.04873v1 | |
http://arxiv.org/abs/2009.13570v2,creativecommons.org/licenses/by/4.0/,DialoGLUE: A Natural Language Understanding Benchmark for Task-Oriented Dialogue,Shikib Mehri and Mihail Eric and Dilek Hakkani-Tur,http://arxiv.org/pdf/2009.13570v2 | |
http://arxiv.org/abs/2101.07120v1,creativecommons.org/licenses/by/4.0/,Neural Abstractive Text Summarizer for Telugu Language,Mohan Bharath B and Aravindh Gowtham B and Akhil M,http://arxiv.org/pdf/2101.07120v1 | |
http://arxiv.org/abs/2106.05589v1,creativecommons.org/licenses/by/4.0/,AUGNLG: Few-shot Natural Language Generation using Self-trained Data Augmentation,Xinnuo Xu and Guoyin Wang and Young-Bum Kim and Sungjin Lee,http://arxiv.org/pdf/2106.05589v1 | |
http://arxiv.org/abs/2205.03815v1,creativecommons.org/licenses/by/4.0/,Beyond Distributional Hypothesis: Let Language Models Learn Meaning-Text Correspondence,Myeongjun Jang and Frank Mtumbuka and Thomas Lukasiewicz,http://arxiv.org/pdf/2205.03815v1 | |
http://arxiv.org/abs/2205.07523v1,creativecommons.org/licenses/by/4.0/,Prompting to Distill: Boosting Data-Free Knowledge Distillation via Reinforced Prompt,Xinyin Ma and Xinchao Wang and Gongfan Fang and Yongliang Shen and Weiming Lu,http://arxiv.org/pdf/2205.07523v1 | |
http://arxiv.org/abs/2205.10036v1,creativecommons.org/licenses/by/4.0/,Exploring Extreme Parameter Compression for Pre-trained Language Models,Yuxin Ren and Benyou Wang and Lifeng Shang and Xin Jiang and Qun Liu,http://arxiv.org/pdf/2205.10036v1 | |
http://arxiv.org/abs/2210.09150v2,creativecommons.org/licenses/by/4.0/,Prompting GPT-3 To Be Reliable,Chenglei Si and Zhe Gan and Zhengyuan Yang and Shuohang Wang and Jianfeng Wang and Jordan Boyd-Graber and Lijuan Wang,http://arxiv.org/pdf/2210.09150v2 | |
http://arxiv.org/abs/2210.13578v1,creativecommons.org/licenses/by/4.0/,Speeding Up Question Answering Task of Language Models via Inverted Index,Xiang Ji and Yesim Sungu-Eryilmaz and Elaheh Momeni and Reza Rawassizadeh,http://arxiv.org/pdf/2210.13578v1 | |
http://arxiv.org/abs/2301.09211v1,creativecommons.org/licenses/by/4.0/,An Empirical Study of Metrics to Measure Representational Harms in Pre-Trained Language Models,Saghar Hosseini and Hamid Palangi and Ahmed Hassan Awadallah,http://arxiv.org/pdf/2301.09211v1 | |
http://arxiv.org/abs/2302.07926v1,creativecommons.org/licenses/by/4.0/,Commonsense Reasoning for Conversational AI: A Survey of the State of the Art,Christopher Richardson and Larry Heck,http://arxiv.org/pdf/2302.07926v1 | |
http://arxiv.org/abs/2303.03480v1,creativecommons.org/licenses/by/4.0/,"Can an Embodied Agent Find Your ""Cat-shaped Mug""? LLM-Based Zero-Shot Object Navigation",Vishnu Sashank Dorbala and James F. Mullen Jr. and Dinesh Manocha,http://arxiv.org/pdf/2303.03480v1 | |
http://arxiv.org/abs/2304.02886v1,creativecommons.org/licenses/by/4.0/,Automatic ICD-10 Code Association: A Challenging Task on French Clinical Texts,Yakini Tchouka and Jean-François Couchot and David Laiymani and Philippe Selles and Azzedine Rahmani,http://arxiv.org/pdf/2304.02886v1 | |
http://arxiv.org/abs/2006.00031v1,creativecommons.org/licenses/by/4.0/,A Comparative Study of Lexical Substitution Approaches based on Neural Language Models,Nikolay Arefyev and Boris Sheludko and Alexander Podolskiy and Alexander Panchenko,http://arxiv.org/pdf/2006.00031v1 | |
http://arxiv.org/abs/2102.04490v2,creativecommons.org/licenses/by/4.0/,Unsupervised Abstractive Summarization of Bengali Text Documents,Radia Rayan Chowdhury and Mir Tafseer Nayeem and Tahsin Tasnim Mim and Md. Saifur Rahman Chowdhury and Taufiqul Jannat,http://arxiv.org/pdf/2102.04490v2 | |
http://arxiv.org/abs/2104.04487v1,creativecommons.org/licenses/by/4.0/,Language model fusion for streaming end to end speech recognition,Rodrigo Cabrera and Xiaofeng Liu and Mohammadreza Ghodsi and Zebulun Matteson and Eugene Weinstein and Anjuli Kannan,http://arxiv.org/pdf/2104.04487v1 | |
http://arxiv.org/abs/2109.08259v1,creativecommons.org/licenses/by/4.0/,Self-training with Few-shot Rationalization: Teacher Explanations Aid Student in Few-shot NLU,Meghana Moorthy Bhat and Alessandro Sordoni and Subhabrata Mukherjee,http://arxiv.org/pdf/2109.08259v1 | |
http://arxiv.org/abs/2111.08284v2,creativecommons.org/licenses/by/4.0/,Few-Shot Self-Rationalization with Natural Language Prompts,Ana Marasović and Iz Beltagy and Doug Downey and Matthew E. Peters,http://arxiv.org/pdf/2111.08284v2 | |
http://arxiv.org/abs/2112.08547v2,creativecommons.org/licenses/by/4.0/,Learning Rich Representation of Keyphrases from Text,Mayank Kulkarni and Debanjan Mahata and Ravneet Arora and Rajarshi Bhowmik,http://arxiv.org/pdf/2112.08547v2 | |
http://arxiv.org/abs/2210.04873v2,creativecommons.org/licenses/by/4.0/,CORE: A Retrieve-then-Edit Framework for Counterfactual Data Generation,Tanay Dixit and Bhargavi Paranjape and Hannaneh Hajishirzi and Luke Zettlemoyer,http://arxiv.org/pdf/2210.04873v2 | |
http://arxiv.org/abs/2211.16198v2,creativecommons.org/licenses/by/4.0/,SuS-X: Training-Free Name-Only Transfer of Vision-Language Models,Vishaal Udandarao and Ankush Gupta and Samuel Albanie,http://arxiv.org/pdf/2211.16198v2 | |
http://arxiv.org/abs/2303.09384v1,creativecommons.org/licenses/by/4.0/,LLMSecEval: A Dataset of Natural Language Prompts for Security Evaluations,Catherine Tony and Markus Mutas and Nicolás E. Díaz Ferreyra and Riccardo Scandariato,http://arxiv.org/pdf/2303.09384v1 | |
http://arxiv.org/abs/2303.17760v1,creativecommons.org/licenses/by/4.0/,"CAMEL: Communicative Agents for ""Mind"" Exploration of Large Scale Language Model Society",Guohao Li and Hasan Abed Al Kader Hammoud and Hani Itani and Dmitrii Khizbullin and Bernard Ghanem,http://arxiv.org/pdf/2303.17760v1 | |
http://arxiv.org/abs/2011.03203v1,creativecommons.org/licenses/by/4.0/,Unleashing the Power of Neural Discourse Parsers -- A Context and Structure Aware Approach Using Large Scale Pretraining,Grigorii Guz and Patrick Huber and Giuseppe Carenini,http://arxiv.org/pdf/2011.03203v1 | |
http://arxiv.org/abs/2104.11070v2,creativecommons.org/licenses/by/4.0/,Adapting Long Context NLM for ASR Rescoring in Conversational Agents,Ashish Shenoy and Sravan Bodapati and Monica Sunkara and Srikanth Ronanki and Katrin Kirchhoff,http://arxiv.org/pdf/2104.11070v2 | |
http://arxiv.org/abs/2111.09564v2,creativecommons.org/licenses/by/4.0/,LAnoBERT : System Log Anomaly Detection based on BERT Masked Language Model,Yukyung Lee and Jina Kim and Pilsung Kang,http://arxiv.org/pdf/2111.09564v2 | |
http://arxiv.org/abs/2204.06745v1,creativecommons.org/licenses/by/4.0/,GPT-NeoX-20B: An Open-Source Autoregressive Language Model,Sid Black and Stella Biderman and Eric Hallahan and Quentin Anthony and Leo Gao and Laurence Golding and Horace He and Connor Leahy and Kyle McDonell and Jason Phang and Michael Pieler and USVSN Sai Prashanth and Shivanshu Purohit and Laria Reynolds and Jonathan Tow and Ben Wang and Samuel Weinbach,http://arxiv.org/pdf/2204.06745v1 | |
http://arxiv.org/abs/2302.06860v2,creativecommons.org/licenses/by/4.0/,BLIAM: Literature-based Data Synthesis for Synergistic Drug Combination Prediction,Cai Yang and Addie Woicik and Hoifung Poon and Sheng Wang,http://arxiv.org/pdf/2302.06860v2 | |
http://arxiv.org/abs/2303.17590v1,creativecommons.org/licenses/by/4.0/,Going Beyond Nouns With Vision & Language Models Using Synthetic Data,Paola Cascante-Bonilla and Khaled Shehada and James Seale Smith and Sivan Doveh and Donghyun Kim and Rameswar Panda and Gül Varol and Aude Oliva and Vicente Ordonez and Rogerio Feris and Leonid Karlinsky,http://arxiv.org/pdf/2303.17590v1 | |
http://arxiv.org/abs/2102.04887v2,creativecommons.org/licenses/by/4.0/,NewsBERT: Distilling Pre-trained Language Model for Intelligent News Application,Chuhan Wu and Fangzhao Wu and Yang Yu and Tao Qi and Yongfeng Huang and Qi Liu,http://arxiv.org/pdf/2102.04887v2 | |
http://arxiv.org/abs/2304.06762v1,creativecommons.org/licenses/by/4.0/,Shall We Pretrain Autoregressive Language Models with Retrieval? A Comprehensive Study,Boxin Wang and Wei Ping and Peng Xu and Lawrence McAfee and Zihan Liu and Mohammad Shoeybi and Yi Dong and Oleksii Kuchaiev and Bo Li and Chaowei Xiao and Anima Anandkumar and Bryan Catanzaro,http://arxiv.org/pdf/2304.06762v1 | |
http://arxiv.org/abs/1707.03762v4,creativecommons.org/licenses/by/4.0/,Revisiting Elementary Denotational Semantics,Jeremy G. Siek,http://arxiv.org/pdf/1707.03762v4 | |
http://arxiv.org/abs/2112.07868v2,creativecommons.org/licenses/by/4.0/,Few-shot Instruction Prompts for Pretrained Language Models to Detect Social Biases,Shrimai Prabhumoye and Rafal Kocielnik and Mohammad Shoeybi and Anima Anandkumar and Bryan Catanzaro,http://arxiv.org/pdf/2112.07868v2 | |
http://arxiv.org/abs/2303.09639v1,creativecommons.org/licenses/by/4.0/,Neural Architecture Search for Effective Teacher-Student Knowledge Transfer in Language Models,Aashka Trivedi and Takuma Udagawa and Michele Merler and Rameswar Panda and Yousef El-Kurdi and Bishwaranjan Bhattacharjee,http://arxiv.org/pdf/2303.09639v1 | |
http://arxiv.org/abs/2304.02210v1,creativecommons.org/licenses/by/4.0/,Document-Level Machine Translation with Large Language Models,Longyue Wang and Chenyang Lyu and Tianbo Ji and Zhirui Zhang and Dian Yu and Shuming Shi and Zhaopeng Tu,http://arxiv.org/pdf/2304.02210v1 | |
http://arxiv.org/abs/2108.02962v1,creativecommons.org/licenses/by/4.0/,Dezyne: Paving the Way to Practical Formal Software Engineering,Rutger van Beusekom and Bert de Jonge and Paul Hoogendijk and Jan Nieuwenhuizen,http://arxiv.org/pdf/2108.02962v1 | |
http://arxiv.org/abs/2204.06130v2,creativecommons.org/licenses/by/4.0/,Impossible Triangle: What's Next for Pre-trained Language Models?,Chenguang Zhu and Michael Zeng,http://arxiv.org/pdf/2204.06130v2 | |
http://arxiv.org/abs/2206.03931v3,creativecommons.org/licenses/by/4.0/,Learning to Generate Prompts for Dialogue Generation through Reinforcement Learning,Hsuan Su and Pohan Chi and Shih-Cheng Huang and Chung Ho Lam and Saurav Sahay and Shang-Tse Chen and Hung-yi Lee,http://arxiv.org/pdf/2206.03931v3 | |
http://arxiv.org/abs/2212.10474v1,creativecommons.org/licenses/by/4.0/,ByGPT5: End-to-End Style-conditioned Poetry Generation with Token-free Language Models,Jonas Belouadi and Steffen Eger,http://arxiv.org/pdf/2212.10474v1 | |
http://arxiv.org/abs/2102.04130v3,creativecommons.org/licenses/by/4.0/,Bias Out-of-the-Box: An Empirical Analysis of Intersectional Occupational Biases in Popular Generative Language Models,Hannah Kirk and Yennie Jun and Haider Iqbal and Elias Benussi and Filippo Volpin and Frederic A. Dreyer and Aleksandar Shtedritski and Yuki M. Asano,http://arxiv.org/pdf/2102.04130v3 | |
http://arxiv.org/abs/2105.03119v1,creativecommons.org/licenses/by/4.0/,Applying Model-based Requirements Engineering in Three Large European Collaborative Projects: An Experience Report,Andrey Sadovykh and Dragos Truscan and Hugo Bruneliere,http://arxiv.org/pdf/2105.03119v1 | |
http://arxiv.org/abs/2106.09449v1,creativecommons.org/licenses/by/4.0/,DocNLI: A Large-scale Dataset for Document-level Natural Language Inference,Wenpeng Yin and Dragomir Radev and Caiming Xiong,http://arxiv.org/pdf/2106.09449v1 | |
http://arxiv.org/abs/2205.10569v1,creativecommons.org/licenses/by/4.0/,HLATR: Enhance Multi-stage Text Retrieval with Hybrid List Aware Transformer Reranking,Yanzhao Zhang and Dingkun Long and Guangwei Xu and Pengjun Xie,http://arxiv.org/pdf/2205.10569v1 | |
http://arxiv.org/abs/2211.12508v1,creativecommons.org/licenses/by/4.0/,Time-Aware Datasets are Adaptive Knowledgebases for the New Normal,Abhijit Suprem and Sanjyot Vaidya and Joao Eduardo Ferreira and Calton Pu,http://arxiv.org/pdf/2211.12508v1 | |
http://arxiv.org/abs/2301.04788v1,creativecommons.org/licenses/by/4.0/,Language Cognition and Language Computation -- Human and Machine Language Understanding,Shaonan Wang and Nai Ding and Nan Lin and Jiajun Zhang and Chengqing Zong,http://arxiv.org/pdf/2301.04788v1 | |
http://arxiv.org/abs/2107.03141v1,creativecommons.org/licenses/by/4.0/,Hierarchical Text Classification of Urdu News using Deep Neural Network,Taimoor Ahmed Javed and Waseem Shahzad and Umair Arshad,http://arxiv.org/pdf/2107.03141v1 | |
http://arxiv.org/abs/1810.06635v1,creativecommons.org/licenses/by/4.0/,Semi-supervised and Active-learning Scenarios: Efficient Acoustic Model Refinement for a Low Resource Indian Language,Maharajan Chellapriyadharshini and Anoop Toffy and Srinivasa Raghavan K. M. and V Ramasubramanian,http://arxiv.org/pdf/1810.06635v1 | |
http://arxiv.org/abs/1910.00883v2,creativecommons.org/licenses/by/4.0/,Exploiting BERT for End-to-End Aspect-based Sentiment Analysis,Xin Li and Lidong Bing and Wenxuan Zhang and Wai Lam,http://arxiv.org/pdf/1910.00883v2 | |
http://arxiv.org/abs/2005.11768v2,creativecommons.org/licenses/by/4.0/,KaLM at SemEval-2020 Task 4: Knowledge-aware Language Models for Comprehension And Generation,Jiajing Wan and Xinting Huang,http://arxiv.org/pdf/2005.11768v2 | |
http://arxiv.org/abs/2104.05228v1,creativecommons.org/licenses/by/4.0/,SuperSim: a test set for word similarity and relatedness in Swedish,Simon Hengchen and Nina Tahmasebi,http://arxiv.org/pdf/2104.05228v1 | |
http://arxiv.org/abs/2106.03598v1,creativecommons.org/licenses/by/4.0/,SciFive: a text-to-text transformer model for biomedical literature,Long N. Phan and James T. Anibal and Hieu Tran and Shaurya Chanana and Erol Bahadroglu and Alec Peltekian and Grégoire Altan-Bonnet,http://arxiv.org/pdf/2106.03598v1 | |
http://arxiv.org/abs/2109.07465v1,creativecommons.org/licenses/by/4.0/,On the Limits of Minimal Pairs in Contrastive Evaluation,Jannis Vamvas and Rico Sennrich,http://arxiv.org/pdf/2109.07465v1 | |
http://arxiv.org/abs/2302.11773v1,creativecommons.org/licenses/by/4.0/,Detecting software vulnerabilities using Language Models,Marwan Omar,http://arxiv.org/pdf/2302.11773v1 | |
http://arxiv.org/abs/2110.10305v1,creativecommons.org/licenses/by/4.0/,"When in Doubt, Summon the Titans: Efficient Inference with Large Models",Ankit Singh Rawat and Manzil Zaheer and Aditya Krishna Menon and Amr Ahmed and Sanjiv Kumar,http://arxiv.org/pdf/2110.10305v1 | |
http://arxiv.org/abs/2108.13961v1,creativecommons.org/licenses/by/4.0/,Thermostat: A Large Collection of NLP Model Explanations and Analysis Tools,Nils Feldhus and Robert Schwarzenberg and Sebastian Möller,http://arxiv.org/pdf/2108.13961v1 | |
http://arxiv.org/abs/2110.01691v3,creativecommons.org/licenses/by/4.0/,AI Chains: Transparent and Controllable Human-AI Interaction by Chaining Large Language Model Prompts,Tongshuang Wu and Michael Terry and Carrie J. Cai,http://arxiv.org/pdf/2110.01691v3 | |
http://arxiv.org/abs/2110.08387v3,creativecommons.org/licenses/by/4.0/,Generated Knowledge Prompting for Commonsense Reasoning,Jiacheng Liu and Alisa Liu and Ximing Lu and Sean Welleck and Peter West and Ronan Le Bras and Yejin Choi and Hannaneh Hajishirzi,http://arxiv.org/pdf/2110.08387v3 | |
http://arxiv.org/abs/2205.00176v1,creativecommons.org/licenses/by/4.0/,Building a Role Specified Open-Domain Dialogue System Leveraging Large-Scale Language Models,Sanghwan Bae and Donghyun Kwak and Sungdong Kim and Donghoon Ham and Soyoung Kang and Sang-Woo Lee and Woomyoung Park,http://arxiv.org/pdf/2205.00176v1 | |
http://arxiv.org/abs/2206.11309v1,creativecommons.org/licenses/by/4.0/,GODEL: Large-Scale Pre-Training for Goal-Directed Dialog,Baolin Peng and Michel Galley and Pengcheng He and Chris Brockett and Lars Liden and Elnaz Nouri and Zhou Yu and Bill Dolan and Jianfeng Gao,http://arxiv.org/pdf/2206.11309v1 | |
http://arxiv.org/abs/2208.04417v2,creativecommons.org/licenses/by/4.0/,Debiased Large Language Models Still Associate Muslims with Uniquely Violent Acts,Babak Hemmatian and Lav R. Varshney,http://arxiv.org/pdf/2208.04417v2 | |
http://arxiv.org/abs/2208.10063v2,creativecommons.org/licenses/by/4.0/,Selection Collider Bias in Large Language Models,Emily McMilin,http://arxiv.org/pdf/2208.10063v2 | |
http://arxiv.org/abs/2208.14271v1,creativecommons.org/licenses/by/4.0/,Faithful Reasoning Using Large Language Models,Antonia Creswell and Murray Shanahan,http://arxiv.org/pdf/2208.14271v1 | |
http://arxiv.org/abs/2210.12022v1,creativecommons.org/licenses/by/4.0/,Performance-Efficiency Trade-Offs in Adapting Language Models to Text Classification Tasks,Laura Aina and Nikos Voskarides and Roi Blanco,http://arxiv.org/pdf/2210.12022v1 | |
http://arxiv.org/abs/2211.08466v1,creativecommons.org/licenses/by/4.0/,Reasoning Circuits: Few-shot Multihop Question Generation with Structured Rationales,Saurabh Kulshreshtha and Anna Rumshisky,http://arxiv.org/pdf/2211.08466v1 | |
http://arxiv.org/abs/2212.10509v1,creativecommons.org/licenses/by/4.0/,Interleaving Retrieval with Chain-of-Thought Reasoning for Knowledge-Intensive Multi-Step Questions,Harsh Trivedi and Niranjan Balasubramanian and Tushar Khot and Ashish Sabharwal,http://arxiv.org/pdf/2212.10509v1 | |
http://arxiv.org/abs/2212.10726v1,creativecommons.org/licenses/by/4.0/,Beyond Contrastive Learning: A Variational Generative Model for Multilingual Retrieval,John Wieting and Jonathan H. Clark and William W. Cohen and Graham Neubig and Taylor Berg-Kirkpatrick,http://arxiv.org/pdf/2212.10726v1 | |
http://arxiv.org/abs/2301.13848v1,creativecommons.org/licenses/by/4.0/,Benchmarking Large Language Models for News Summarization,Tianyi Zhang and Faisal Ladhak and Esin Durmus and Percy Liang and Kathleen McKeown and Tatsunori B. Hashimoto,http://arxiv.org/pdf/2301.13848v1 | |
http://arxiv.org/abs/2303.15125v1,creativecommons.org/licenses/by/4.0/,LMCanvas: Object-Oriented Interaction to Personalize Large Language Model-Powered Writing Environments,Tae Soo Kim and Arghya Sarkar and Yoonjoo Lee and Minsuk Chang and Juho Kim,http://arxiv.org/pdf/2303.15125v1 | |
http://arxiv.org/abs/2010.05731v1,creativecommons.org/licenses/by/4.0/,Probing Pretrained Language Models for Lexical Semantics,Ivan Vulić and Edoardo Maria Ponti and Robert Litschko and Goran Glavaš and Anna Korhonen,http://arxiv.org/pdf/2010.05731v1 | |
http://arxiv.org/abs/2109.06050v2,creativecommons.org/licenses/by/4.0/,Few-Shot Cross-Lingual Stance Detection with Sentiment-Based Pre-Training,Momchil Hardalov and Arnav Arora and Preslav Nakov and Isabelle Augenstein,http://arxiv.org/pdf/2109.06050v2 | |
http://arxiv.org/abs/2201.13405v1,creativecommons.org/licenses/by/4.0/,Cross-Lingual Dialogue Dataset Creation via Outline-Based Generation,Olga Majewska and Evgeniia Razumovskaia and Edoardo Maria Ponti and Ivan Vulić and Anna Korhonen,http://arxiv.org/pdf/2201.13405v1 | |
http://arxiv.org/abs/2210.01848v2,creativecommons.org/licenses/by/4.0/,Explaining Patterns in Data with Language Models via Interpretable Autoprompting,Chandan Singh and John X. Morris and Jyoti Aneja and Alexander M. Rush and Jianfeng Gao,http://arxiv.org/pdf/2210.01848v2 | |
http://arxiv.org/abs/2212.01944v3,creativecommons.org/licenses/by/4.0/,Automaton-Based Representations of Task Knowledge from Generative Language Models,Yunhao Yang and Jean-Raphaël Gaglione and Cyrus Neary and Ufuk Topcu,http://arxiv.org/pdf/2212.01944v3 | |
http://arxiv.org/abs/2211.14920v1,creativecommons.org/licenses/by/4.0/,EPIK: Eliminating multi-model Pipelines with Knowledge-distillation,Bhavesh Laddagiri and Yash Raj and Anshuman Dash,http://arxiv.org/pdf/2211.14920v1 | |
http://arxiv.org/abs/2205.03695v1,creativecommons.org/licenses/by/4.0/,AKI-BERT: a Pre-trained Clinical Language Model for Early Prediction of Acute Kidney Injury,Chengsheng Mao and Liang Yao and Yuan Luo,http://arxiv.org/pdf/2205.03695v1 | |
http://arxiv.org/abs/2212.10873v2,creativecommons.org/licenses/by/4.0/,Prompt-Augmented Linear Probing: Scaling Beyond The Limit of Few-shot In-Context Learners,Hyunsoo Cho and Hyuhng Joon Kim and Junyeob Kim and Sang-Woo Lee and Sang-goo Lee and Kang Min Yoo and Taeuk Kim,http://arxiv.org/pdf/2212.10873v2 | |
http://arxiv.org/abs/2301.02828v2,creativecommons.org/licenses/by/4.0/,Why do Nearest Neighbor Language Models Work?,Frank F. Xu and Uri Alon and Graham Neubig,http://arxiv.org/pdf/2301.02828v2 | |
http://arxiv.org/abs/2302.11713v2,creativecommons.org/licenses/by/4.0/,Can Pre-trained Vision and Language Models Answer Visual Information-Seeking Questions?,Yang Chen and Hexiang Hu and Yi Luan and Haitian Sun and Soravit Changpinyo and Alan Ritter and Ming-Wei Chang,http://arxiv.org/pdf/2302.11713v2 | |
http://arxiv.org/abs/2210.03251v1,creativecommons.org/licenses/by/4.0/,Small Character Models Match Large Word Models for Autocomplete Under Memory Constraints,Ganesh Jawahar and Subhabrata Mukherjee and Debadeepta Dey and Muhammad Abdul-Mageed and Laks V. S. Lakshmanan and Caio Cesar Teodoro Mendes and Gustavo Henrique de Rosa and Shital Shah,http://arxiv.org/pdf/2210.03251v1 | |
http://arxiv.org/abs/1807.10311v1,creativecommons.org/licenses/by/4.0/,Open Source Automatic Speech Recognition for German,Benjamin Milde and Arne Köhn,http://arxiv.org/pdf/1807.10311v1 | |
http://arxiv.org/abs/2004.14848v2,creativecommons.org/licenses/by/4.0/,Enriched Pre-trained Transformers for Joint Slot Filling and Intent Detection,Momchil Hardalov and Ivan Koychev and Preslav Nakov,http://arxiv.org/pdf/2004.14848v2 | |
http://arxiv.org/abs/2109.11295v1,creativecommons.org/licenses/by/4.0/,Dynamic Knowledge Distillation for Pre-trained Language Models,Lei Li and Yankai Lin and Shuhuai Ren and Peng Li and Jie Zhou and Xu Sun,http://arxiv.org/pdf/2109.11295v1 | |
http://arxiv.org/abs/2210.00045v1,creativecommons.org/licenses/by/4.0/,Calibrating Sequence likelihood Improves Conditional Language Generation,Yao Zhao and Misha Khalman and Rishabh Joshi and Shashi Narayan and Mohammad Saleh and Peter J. Liu,http://arxiv.org/pdf/2210.00045v1 | |
http://arxiv.org/abs/2304.06975v1,creativecommons.org/licenses/by/4.0/,HuaTuo: Tuning LLaMA Model with Chinese Medical Knowledge,Haochun Wang and Chi Liu and Nuwa Xi and Zewen Qiang and Sendong Zhao and Bing Qin and Ting Liu,http://arxiv.org/pdf/2304.06975v1 | |
http://arxiv.org/abs/2304.03153v1,creativecommons.org/licenses/by/4.0/,Zero-Shot Next-Item Recommendation using Large Pretrained Language Models,Lei Wang and Ee-Peng Lim,http://arxiv.org/pdf/2304.03153v1 | |
http://arxiv.org/abs/1803.02324v2,creativecommons.org/licenses/by/4.0/,Annotation Artifacts in Natural Language Inference Data,Suchin Gururangan and Swabha Swayamdipta and Omer Levy and Roy Schwartz and Samuel R. Bowman and Noah A. Smith,http://arxiv.org/pdf/1803.02324v2 | |
http://arxiv.org/abs/2003.01200v4,creativecommons.org/licenses/by/4.0/,Natural Language Processing Advancements By Deep Learning: A Survey,Amirsina Torfi and Rouzbeh A. Shirvani and Yaser Keneshloo and Nader Tavaf and Edward A. Fox,http://arxiv.org/pdf/2003.01200v4 | |
http://arxiv.org/abs/2108.08252v1,creativecommons.org/licenses/by/4.0/,Deep Natural Language Processing for LinkedIn Search Systems,Weiwei Guo and Xiaowei Liu and Sida Wang and Michaeel Kazi and Zhoutong Fu and Huiji Gao and Jun Jia and Liang Zhang and Bo Long,http://arxiv.org/pdf/2108.08252v1 | |
http://arxiv.org/abs/2109.10475v1,creativecommons.org/licenses/by/4.0/,Salience-Aware Event Chain Modeling for Narrative Understanding,Xiyang Zhang and Muhao Chen and Jonathan May,http://arxiv.org/pdf/2109.10475v1 | |
http://arxiv.org/abs/2203.06228v1,creativecommons.org/licenses/by/4.0/,CoDA21: Evaluating Language Understanding Capabilities of NLP Models With Context-Definition Alignment,Lütfi Kerem Senel and Timo Schick and Hinrich Schütze,http://arxiv.org/pdf/2203.06228v1 | |
http://arxiv.org/abs/2207.14525v1,creativecommons.org/licenses/by/4.0/,Curriculum Learning for Data-Efficient Vision-Language Alignment,Tejas Srinivasan and Xiang Ren and Jesse Thomason,http://arxiv.org/pdf/2207.14525v1 | |
http://arxiv.org/abs/2210.01091v2,creativecommons.org/licenses/by/4.0/,The (In)Effectiveness of Intermediate Task Training For Domain Adaptation and Cross-Lingual Transfer Learning,Sovesh Mohapatra and Somesh Mohapatra,http://arxiv.org/pdf/2210.01091v2 | |
http://arxiv.org/abs/2211.09102v2,creativecommons.org/licenses/by/4.0/,Prompting PaLM for Translation: Assessing Strategies and Performance,David Vilar and Markus Freitag and Colin Cherry and Jiaming Luo and Viresh Ratnakar and George Foster,http://arxiv.org/pdf/2211.09102v2 | |
http://arxiv.org/abs/2212.00616v1,creativecommons.org/licenses/by/4.0/,Extensible Prompts for Language Models,Tao Ge and Jing Hu and Li Dong and Shaoguang Mao and Yan Xia and Xun Wang and Si-Qing Chen and Furu Wei,http://arxiv.org/pdf/2212.00616v1 | |
http://arxiv.org/abs/2212.02712v1,creativecommons.org/licenses/by/4.0/,Improved Beam Search for Hallucination Mitigation in Abstractive Summarization,Arvind Krishna Sridhar and Erik Visser,http://arxiv.org/pdf/2212.02712v1 | |
http://arxiv.org/abs/2212.10408v1,creativecommons.org/licenses/by/4.0/,Geographic and Geopolitical Biases of Language Models,Fahim Faisal and Antonios Anastasopoulos,http://arxiv.org/pdf/2212.10408v1 | |
http://arxiv.org/abs/2302.03194v2,creativecommons.org/licenses/by/4.0/,UDApter -- Efficient Domain Adaptation Using Adapters,Bhavitvya Malik and Abhinav Ramesh Kashyap and Min-Yen Kan and Soujanya Poria,http://arxiv.org/pdf/2302.03194v2 | |
http://arxiv.org/abs/2303.00001v1,creativecommons.org/licenses/by/4.0/,Reward Design with Language Models,Minae Kwon and Sang Michael Xie and Kalesha Bullard and Dorsa Sadigh,http://arxiv.org/pdf/2303.00001v1 | |
http://arxiv.org/abs/2304.00557v1,creativecommons.org/licenses/by/4.0/,Semi-supervised Neural Machine Translation with Consistency Regularization for Low-Resource Languages,Viet H. Pham and Thang M. Pham and Giang Nguyen and Long Nguyen and Dien Dinh,http://arxiv.org/pdf/2304.00557v1 | |
http://arxiv.org/abs/2210.10692v2,creativecommons.org/licenses/by/4.0/,Separating Grains from the Chaff: Using Data Filtering to Improve Multilingual Translation for Low-Resourced African Languages,Idris Abdulmumin and Michael Beukman and Jesujoba O. Alabi and Chris Emezue and Everlyn Asiko and Tosin Adewumi and Shamsuddeen Hassan Muhammad and Mofetoluwa Adeyemi and Oreen Yousuf and Sahib Singh and Tajuddeen Rabiu Gwadabe,http://arxiv.org/pdf/2210.10692v2 | |
http://arxiv.org/abs/2301.05272v1,creativecommons.org/licenses/by/4.0/,Inaccessible Neural Language Models Could Reinvigorate Linguistic Nativism,Patrick Perrine,http://arxiv.org/pdf/2301.05272v1 | |
http://arxiv.org/abs/2207.11906v2,creativecommons.org/licenses/by/4.0/,Learning a Dual-Mode Speech Recognition Model via Self-Pruning,Chunxi Liu and Yuan Shangguan and Haichuan Yang and Yangyang Shi and Raghuraman Krishnamoorthi and Ozlem Kalinli,http://arxiv.org/pdf/2207.11906v2 | |
http://arxiv.org/abs/2202.02635v1,creativecommons.org/licenses/by/4.0/,Multilingual Hate Speech and Offensive Content Detection using Modified Cross-entropy Loss,Arka Mitra and Priyanshu Sankhala,http://arxiv.org/pdf/2202.02635v1 | |
http://arxiv.org/abs/2203.10692v1,creativecommons.org/licenses/by/4.0/,Better Language Model with Hypernym Class Prediction,He Bai and Tong Wang and Alessandro Sordoni and Peng Shi,http://arxiv.org/pdf/2203.10692v1 | |
http://arxiv.org/abs/2206.14576v1,creativecommons.org/licenses/by/4.0/,Using cognitive psychology to understand GPT-3,Marcel Binz and Eric Schulz,http://arxiv.org/pdf/2206.14576v1 | |
http://arxiv.org/abs/2208.02957v2,creativecommons.org/licenses/by/4.0/,Meaning without reference in large language models,Steven T. Piantadosi and Felix Hill,http://arxiv.org/pdf/2208.02957v2 | |
http://arxiv.org/abs/2210.05598v3,creativecommons.org/licenses/by/4.0/,Enriching Biomedical Knowledge for Low-resource Language Through Large-Scale Translation,Long Phan and Tai Dang and Hieu Tran and Trieu H. Trinh and Vy Phan and Lam D. Chau and Minh-Thang Luong,http://arxiv.org/pdf/2210.05598v3 | |
http://arxiv.org/abs/2210.09658v1,creativecommons.org/licenses/by/4.0/,ROSE: Robust Selective Fine-tuning for Pre-trained Language Models,Lan Jiang and Hao Zhou and Yankai Lin and Peng Li and Jie Zhou and Rui Jiang,http://arxiv.org/pdf/2210.09658v1 | |
http://arxiv.org/abs/2210.13979v2,creativecommons.org/licenses/by/4.0/,Meta-learning Pathologies from Radiology Reports using Variance Aware Prototypical Networks,Arijit Sehanobish and Kawshik Kannan and Nabila Abraham and Anasuya Das and Benjamin Odry,http://arxiv.org/pdf/2210.13979v2 | |
http://arxiv.org/abs/2212.01217v1,creativecommons.org/licenses/by/4.0/,Using Large Pre-Trained Language Model to Assist FDA in Premarket Medical Device,Zongzhe Xu,http://arxiv.org/pdf/2212.01217v1 | |
http://arxiv.org/abs/2212.13392v1,creativecommons.org/licenses/by/4.0/,DeepCuts: Single-Shot Interpretability based Pruning for BERT,Jasdeep Singh Grover and Bhavesh Gawri and Ruskin Raj Manku,http://arxiv.org/pdf/2212.13392v1 | |
http://arxiv.org/abs/2303.07678v1,creativecommons.org/licenses/by/4.0/,Query2doc: Query Expansion with Large Language Models,Liang Wang and Nan Yang and Furu Wei,http://arxiv.org/pdf/2303.07678v1 | |
http://arxiv.org/abs/2304.08637v1,creativecommons.org/licenses/by/4.0/,An Evaluation on Large Language Model Outputs: Discourse and Memorization,Adrian de Wynter and Xun Wang and Alex Sokolov and Qilong Gu and Si-Qing Chen,http://arxiv.org/pdf/2304.08637v1 | |
http://arxiv.org/abs/2304.11490v1,creativecommons.org/licenses/by/4.0/,Boosting Theory-of-Mind Performance in Large Language Models via Prompting,Shima Rahimi Moghaddam and Christopher J. Honey,http://arxiv.org/pdf/2304.11490v1 | |
http://arxiv.org/abs/2206.04105v3,creativecommons.org/licenses/by/4.0/,Words are all you need? Language as an approximation for human similarity judgments,Raja Marjieh and Pol van Rijn and Ilia Sucholutsky and Theodore R. Sumers and Harin Lee and Thomas L. Griffiths and Nori Jacoby,http://arxiv.org/pdf/2206.04105v3 | |
http://arxiv.org/abs/2211.05015v1,creativecommons.org/licenses/by/4.0/,Detecting Languages Unintelligible to Multilingual Models through Local Structure Probes,Louis Clouâtre and Prasanna Parthasarathi and Amal Zouaq and Sarath Chandar,http://arxiv.org/pdf/2211.05015v1 | |
http://arxiv.org/abs/2206.06888v1,creativecommons.org/licenses/by/4.0/,CERT: Continual Pre-Training on Sketches for Library-Oriented Code Generation,Daoguang Zan and Bei Chen and Dejian Yang and Zeqi Lin and Minsu Kim and Bei Guan and Yongji Wang and Weizhu Chen and Jian-Guang Lou,http://arxiv.org/pdf/2206.06888v1 | |
http://arxiv.org/abs/2207.14000v1,creativecommons.org/licenses/by/4.0/,Multi-Step Deductive Reasoning Over Natural Language: An Empirical Study on Out-of-Distribution Generalisation,Qiming Bao and Alex Yuxuan Peng and Tim Hartill and Neset Tan and Zhenyun Deng and Michael Witbrock and Jiamou Liu,http://arxiv.org/pdf/2207.14000v1 | |
http://arxiv.org/abs/2211.00384v2,creativecommons.org/licenses/by/4.0/,The future is different: Large pre-trained language models fail in prediction tasks,Kostadin Cvejoski and Ramsés J. Sánchez and César Ojeda,http://arxiv.org/pdf/2211.00384v2 | |
http://arxiv.org/abs/2303.18116v1,creativecommons.org/licenses/by/4.0/,Pair Programming with Large Language Models for Sampling and Estimation of Copulas,Jan Górecki,http://arxiv.org/pdf/2303.18116v1 | |
http://arxiv.org/abs/2005.11197v2,creativecommons.org/licenses/by/4.0/,Simplify-then-Translate: Automatic Preprocessing for Black-Box Machine Translation,Sneha Mehta and Bahareh Azarnoush and Boris Chen and Avneesh Saluja and Vinith Misra and Ballav Bihani and Ritwik Kumar,http://arxiv.org/pdf/2005.11197v2 | |
http://arxiv.org/abs/2203.17247v3,creativecommons.org/licenses/by/4.0/,VL-InterpreT: An Interactive Visualization Tool for Interpreting Vision-Language Transformers,Estelle Aflalo and Meng Du and Shao-Yen Tseng and Yongfei Liu and Chenfei Wu and Nan Duan and Vasudev Lal,http://arxiv.org/pdf/2203.17247v3 | |
http://arxiv.org/abs/2212.07798v1,creativecommons.org/licenses/by/4.0/,Utilizing Background Knowledge for Robust Reasoning over Traffic Situations,Jiarui Zhang and Filip Ilievski and Aravinda Kollaa and Jonathan Francis and Kaixin Ma and Alessandro Oltramari,http://arxiv.org/pdf/2212.07798v1 | |
http://arxiv.org/abs/2302.11521v1,creativecommons.org/licenses/by/4.0/,How Does In-Context Learning Help Prompt Tuning?,Simeng Sun and Yang Liu and Dan Iter and Chenguang Zhu and Mohit Iyyer,http://arxiv.org/pdf/2302.11521v1 | |
http://arxiv.org/abs/2110.13995v1,creativecommons.org/licenses/by/4.0/,Software Engineering Meets Systems Engineering: Conceptual Modeling Applied to Engineering Operations,Sabah Al-Fedaghi and Mahdi Modhaffar,http://arxiv.org/pdf/2110.13995v1 | |
http://arxiv.org/abs/2302.03900v1,creativecommons.org/licenses/by/4.0/,Zero-shot Generation of Coherent Storybook from Plain Text Story using Diffusion Models,Hyeonho Jeong and Gihyun Kwon and Jong Chul Ye,http://arxiv.org/pdf/2302.03900v1 | |
http://arxiv.org/abs/2201.11838v3,creativecommons.org/licenses/by/4.0/,Clinical-Longformer and Clinical-BigBird: Transformers for long clinical sequences,Yikuan Li and Ramsey M. Wehbe and Faraz S. Ahmad and Hanyin Wang and Yuan Luo,http://arxiv.org/pdf/2201.11838v3 | |
http://arxiv.org/abs/2301.01181v7,creativecommons.org/licenses/by/4.0/,Large Language Models as Corporate Lobbyists,John J. Nay,http://arxiv.org/pdf/2301.01181v7 | |
http://arxiv.org/abs/1912.09582v1,creativecommons.org/licenses/by/4.0/,BERTje: A Dutch BERT Model,Wietse de Vries and Andreas van Cranenburgh and Arianna Bisazza and Tommaso Caselli and Gertjan van Noord and Malvina Nissim,http://arxiv.org/pdf/1912.09582v1 | |
http://arxiv.org/abs/2101.00297v3,creativecommons.org/licenses/by/4.0/,Analyzing Commonsense Emergence in Few-shot Knowledge Models,Jeff Da and Ronan Le Bras and Ximing Lu and Yejin Choi and Antoine Bosselut,http://arxiv.org/pdf/2101.00297v3 | |
http://arxiv.org/abs/2104.08860v2,creativecommons.org/licenses/by/4.0/,CLIP4Clip: An Empirical Study of CLIP for End to End Video Clip Retrieval,Huaishao Luo and Lei Ji and Ming Zhong and Yang Chen and Wen Lei and Nan Duan and Tianrui Li,http://arxiv.org/pdf/2104.08860v2 | |
http://arxiv.org/abs/2106.00851v1,creativecommons.org/licenses/by/4.0/,Parameter-Efficient Neural Question Answering Models via Graph-Enriched Document Representations,Louis Castricato and Stephen Fitz and Won Young Shin,http://arxiv.org/pdf/2106.00851v1 | |
http://arxiv.org/abs/2110.02402v1,creativecommons.org/licenses/by/4.0/,Language Modeling using LMUs: 10x Better Data Efficiency or Improved Scaling Compared to Transformers,Narsimha Chilkuri and Eric Hunsberger and Aaron Voelker and Gurshaant Malik and Chris Eliasmith,http://arxiv.org/pdf/2110.02402v1 | |
http://arxiv.org/abs/2111.08210v1,creativecommons.org/licenses/by/4.0/,Meeting Summarization with Pre-training and Clustering Methods,Andras Huebner and Wei Ji and Xiang Xiao,http://arxiv.org/pdf/2111.08210v1 | |
http://arxiv.org/abs/2206.07593v1,creativecommons.org/licenses/by/4.0/,HICEM: A High-Coverage Emotion Model for Artificial Emotional Intelligence,Benjamin Wortman and James Z. Wang,http://arxiv.org/pdf/2206.07593v1 | |
http://arxiv.org/abs/2209.10505v1,creativecommons.org/licenses/by/4.0/,Text Revealer: Private Text Reconstruction via Model Inversion Attacks against Transformers,Ruisi Zhang and Seira Hidano and Farinaz Koushanfar,http://arxiv.org/pdf/2209.10505v1 | |
http://arxiv.org/abs/2210.02833v1,creativecommons.org/licenses/by/4.0/,Matching Text and Audio Embeddings: Exploring Transfer-learning Strategies for Language-based Audio Retrieval,Benno Weck and Miguel Pérez Fernández and Holger Kirchhoff and Xavier Serra,http://arxiv.org/pdf/2210.02833v1 | |
http://arxiv.org/abs/2210.11468v1,creativecommons.org/licenses/by/4.0/,ObSynth: An Interactive Synthesis System for Generating Object Models from Natural Language Specifications,Alex Gu and Tamara Mitrovska and Daniela Velez and Jacob Andreas and Armando Solar-Lezama,http://arxiv.org/pdf/2210.11468v1 | |
http://arxiv.org/abs/2211.00593v1,creativecommons.org/licenses/by/4.0/,Interpretability in the Wild: a Circuit for Indirect Object Identification in GPT-2 small,Kevin Wang and Alexandre Variengien and Arthur Conmy and Buck Shlegeris and Jacob Steinhardt,http://arxiv.org/pdf/2211.00593v1 | |
http://arxiv.org/abs/2211.08987v1,creativecommons.org/licenses/by/4.0/,TSMind: Alibaba and Soochow University's Submission to the WMT22 Translation Suggestion Task,Xin Ge and Ke Wang and Jiayi Wang and Nini Xiao and Xiangyu Duan and Yu Zhao and Yuqi Zhang,http://arxiv.org/pdf/2211.08987v1 | |
http://arxiv.org/abs/2302.01318v1,creativecommons.org/licenses/by/4.0/,Accelerating Large Language Model Decoding with Speculative Sampling,Charlie Chen and Sebastian Borgeaud and Geoffrey Irving and Jean-Baptiste Lespiau and Laurent Sifre and John Jumper,http://arxiv.org/pdf/2302.01318v1 | |
http://arxiv.org/abs/2302.05932v1,creativecommons.org/licenses/by/4.0/,Stabilized In-Context Learning with Pre-trained Language Models for Few Shot Dialogue State Tracking,Derek Chen and Kun Qian and Zhou Yu,http://arxiv.org/pdf/2302.05932v1 | |
http://arxiv.org/abs/2302.12128v1,creativecommons.org/licenses/by/4.0/,On the Generalization Ability of Retrieval-Enhanced Transformers,Tobias Norlund and Ehsan Doostmohammadi and Richard Johansson and Marco Kuhlmann,http://arxiv.org/pdf/2302.12128v1 | |
http://arxiv.org/abs/2304.11721v1,creativecommons.org/licenses/by/4.0/,A Lightweight Constrained Generation Alternative for Query-focused Summarization,Zhichao Xu and Daniel Cohen,http://arxiv.org/pdf/2304.11721v1 | |
http://arxiv.org/abs/2103.02432v2,creativecommons.org/licenses/by/4.0/,FuncADL: Functional Analysis Description Language,Mason Proffitt and Gordon Watts,http://arxiv.org/pdf/2103.02432v2 | |
http://arxiv.org/abs/2203.10744v1,creativecommons.org/licenses/by/4.0/,Programming Language Agnostic Mining of Code and Language Pairs with Sequence Labeling Based Question Answering,Changran Hu and Akshara Reddi Methukupalli and Yutong Zhou and Chen Wu and Yubo Chen,http://arxiv.org/pdf/2203.10744v1 | |
http://arxiv.org/abs/2304.03816v1,creativecommons.org/licenses/by/4.0/,Towards Generating Functionally Correct Code Edits from Natural Language Issue Descriptions,Sarah Fakhoury and Saikat Chakraborty and Madan Musuvathi and Shuvendu K. Lahiri,http://arxiv.org/pdf/2304.03816v1 | |
http://arxiv.org/abs/2207.00112v1,creativecommons.org/licenses/by/4.0/,Language model compression with weighted low-rank factorization,Yen-Chang Hsu and Ting Hua and Sungen Chang and Qian Lou and Yilin Shen and Hongxia Jin,http://arxiv.org/pdf/2207.00112v1 | |
http://arxiv.org/abs/2008.05055v1,creativecommons.org/licenses/by/4.0/,The Annotation Guideline of LST20 Corpus,Prachya Boonkwan and Vorapon Luantangsrisuk and Sitthaa Phaholphinyo and Kanyanat Kriengket and Dhanon Leenoi and Charun Phrombut and Monthika Boriboon and Krit Kosawat and Thepchai Supnithi,http://arxiv.org/pdf/2008.05055v1 | |
http://arxiv.org/abs/2205.07407v1,creativecommons.org/licenses/by/4.0/,What GPT Knows About Who is Who,Xiaohan Yang and Eduardo Peynetti and Vasco Meerman and Chris Tanner,http://arxiv.org/pdf/2205.07407v1 | |
http://arxiv.org/abs/2206.13517v1,creativecommons.org/licenses/by/4.0/,ProGen2: Exploring the Boundaries of Protein Language Models,Erik Nijkamp and Jeffrey Ruffolo and Eli N. Weinstein and Nikhil Naik and Ali Madani,http://arxiv.org/pdf/2206.13517v1 | |
http://arxiv.org/abs/2210.10332v2,creativecommons.org/licenses/by/4.0/,Revision Transformers: Getting RiT of No-Nos,Felix Friedrich and Wolfgang Stammer and Patrick Schramowski and Kristian Kersting,http://arxiv.org/pdf/2210.10332v2 | |
http://arxiv.org/abs/2304.06861v1,creativecommons.org/licenses/by/4.0/,Evaluation of Social Biases in Recent Large Pre-Trained Models,Swapnil Sharma and Nikita Anand and Kranthi Kiran G. V. and Alind Jain,http://arxiv.org/pdf/2304.06861v1 | |
http://arxiv.org/abs/2104.05146v1,creativecommons.org/licenses/by/4.0/,Assessing Reference-Free Peer Evaluation for Machine Translation,Sweta Agrawal and George Foster and Markus Freitag and Colin Cherry,http://arxiv.org/pdf/2104.05146v1 | |
http://arxiv.org/abs/2108.10580v1,creativecommons.org/licenses/by/4.0/,Detection of Criminal Texts for the Polish State Border Guard,Artur Nowakowski and Krzysztof Jassem,http://arxiv.org/pdf/2108.10580v1 | |
http://arxiv.org/abs/2210.06928v2,creativecommons.org/licenses/by/4.0/,"Sentence Ambiguity, Grammaticality and Complexity Probes",Sunit Bhattacharya and Vilém Zouhar and Ondřej Bojar,http://arxiv.org/pdf/2210.06928v2 | |
http://arxiv.org/abs/2303.15642v1,creativecommons.org/licenses/by/4.0/,Graph Sequence Learning for Premise Selection,Edvard K. Holden and Konstantin Korovin,http://arxiv.org/pdf/2303.15642v1 | |
http://arxiv.org/abs/2304.05012v1,creativecommons.org/licenses/by/4.0/,Human-machine cooperation for semantic feature listing,Kushin Mukherjee and Siddharth Suresh and Timothy T. Rogers,http://arxiv.org/pdf/2304.05012v1 | |
http://arxiv.org/abs/2304.08823v1,creativecommons.org/licenses/by/4.0/,Transfer to a Low-Resource Language via Close Relatives: The Case Study on Faroese,Vésteinn Snæbjarnarson and Annika Simonsen and Goran Glavaš and Ivan Vulić,http://arxiv.org/pdf/2304.08823v1 | |
http://arxiv.org/abs/2303.10464v1,creativecommons.org/licenses/by/4.0/,SPDF: Sparse Pre-training and Dense Fine-tuning for Large Language Models,Vithursan Thangarasa and Abhay Gupta and William Marshall and Tianda Li and Kevin Leong and Dennis DeCoste and Sean Lie and Shreyas Saxena,http://arxiv.org/pdf/2303.10464v1 | |
http://arxiv.org/abs/2203.09509v4,creativecommons.org/licenses/by/4.0/,ToxiGen: A Large-Scale Machine-Generated Dataset for Adversarial and Implicit Hate Speech Detection,Thomas Hartvigsen and Saadia Gabriel and Hamid Palangi and Maarten Sap and Dipankar Ray and Ece Kamar,http://arxiv.org/pdf/2203.09509v4 | |
http://arxiv.org/abs/2208.03030v1,creativecommons.org/licenses/by/4.0/,ChiQA: A Large Scale Image-based Real-World Question Answering Dataset for Multi-Modal Understanding,Bingning Wang and Feiyang Lv and Ting Yao and Yiming Yuan and Jin Ma and Yu Luo and Haijin Liang,http://arxiv.org/pdf/2208.03030v1 | |
http://arxiv.org/abs/2012.03084v1,creativecommons.org/licenses/by/4.0/,Pre-training Protein Language Models with Label-Agnostic Binding Pairs Enhances Performance in Downstream Tasks,Modestas Filipavicius and Matteo Manica and Joris Cadow and Maria Rodriguez Martinez,http://arxiv.org/pdf/2012.03084v1 | |
http://arxiv.org/abs/2203.05115v2,creativecommons.org/licenses/by/4.0/,Internet-augmented language models through few-shot prompting for open-domain question answering,Angeliki Lazaridou and Elena Gribovskaya and Wojciech Stokowiec and Nikolai Grigorev,http://arxiv.org/pdf/2203.05115v2 | |
http://arxiv.org/abs/2302.01588v1,creativecommons.org/licenses/by/4.0/,Bioformer: an efficient transformer language model for biomedical text mining,Li Fang and Qingyu Chen and Chih-Hsuan Wei and Zhiyong Lu and Kai Wang,http://arxiv.org/pdf/2302.01588v1 | |
http://arxiv.org/abs/2302.14233v1,creativecommons.org/licenses/by/4.0/,Goal Driven Discovery of Distributional Differences via Language Descriptions,Ruiqi Zhong and Peter Zhang and Steve Li and Jinwoo Ahn and Dan Klein and Jacob Steinhardt,http://arxiv.org/pdf/2302.14233v1 | |
http://arxiv.org/abs/2302.14828v1,creativecommons.org/licenses/by/4.0/,Automatic Scoring of Dream Reports' Emotional Content with Large Language Models,Lorenzo Bertolini and Valentina Elce and Adriana Michalak and Giulio Bernardi and Julie Weeds,http://arxiv.org/pdf/2302.14828v1 | |
http://arxiv.org/abs/2304.01852v2,creativecommons.org/licenses/by/4.0/,Summary of ChatGPT/GPT-4 Research and Perspective Towards the Future of Large Language Models,Yiheng Liu and Tianle Han and Siyuan Ma and Jiayue Zhang and Yuanyuan Yang and Jiaming Tian and Hao He and Antong Li and Mengshen He and Zhengliang Liu and Zihao Wu and Dajiang Zhu and Xiang Li and Ning Qiang and Dinggang Shen and Tianming Liu and Bao Ge,http://arxiv.org/pdf/2304.01852v2 | |
http://arxiv.org/abs/2302.11520v2,creativecommons.org/licenses/by/4.0/,Guiding Large Language Models via Directional Stimulus Prompting,Zekun Li and Baolin Peng and Pengcheng He and Michel Galley and Jianfeng Gao and Xifeng Yan,http://arxiv.org/pdf/2302.11520v2 | |
http://arxiv.org/abs/2303.17071v1,creativecommons.org/licenses/by/4.0/,DERA: Enhancing Large Language Model Completions with Dialog-Enabled Resolving Agents,Varun Nair and Elliot Schumacher and Geoffrey Tso and Anitha Kannan,http://arxiv.org/pdf/2303.17071v1 | |
http://arxiv.org/abs/2304.00457v1,creativecommons.org/licenses/by/4.0/,LLMMaps -- A Visual Metaphor for Stratified Evaluation of Large Language Models,Patrik Puchert and Poonam Poonam and Christian van Onzenoodt and Timo Ropinski,http://arxiv.org/pdf/2304.00457v1 | |
http://arxiv.org/abs/2304.09406v1,creativecommons.org/licenses/by/4.0/,How to Do Things with Deep Learning Code,Minh Hua and Rita Raley,http://arxiv.org/pdf/2304.09406v1 | |
http://arxiv.org/abs/2211.11720v3,creativecommons.org/licenses/by/4.0/,Multitask Vision-Language Prompt Tuning,Sheng Shen and Shijia Yang and Tianjun Zhang and Bohan Zhai and Joseph E. Gonzalez and Kurt Keutzer and Trevor Darrell,http://arxiv.org/pdf/2211.11720v3 | |
http://arxiv.org/abs/2212.00193v1,creativecommons.org/licenses/by/4.0/,Distilling Multi-Step Reasoning Capabilities of Large Language Models into Smaller Models via Semantic Decompositions,Kumar Shridhar and Alessandro Stolfo and Mrinmaya Sachan,http://arxiv.org/pdf/2212.00193v1 | |
http://arxiv.org/abs/2212.10537v2,creativecommons.org/licenses/by/4.0/,Does CLIP Bind Concepts? Probing Compositionality in Large Image Models,Martha Lewis and Nihal V. Nayak and Peilin Yu and Qinan Yu and Jack Merullo and Stephen H. Bach and Ellie Pavlick,http://arxiv.org/pdf/2212.10537v2 | |
http://arxiv.org/abs/2106.07225v1,creativecommons.org/licenses/by/4.0/,English to Bangla Machine Translation Using Recurrent Neural Network,Shaykh Siddique and Tahmid Ahmed and Md. Rifayet Azam Talukder and Md. Mohsin Uddin,http://arxiv.org/pdf/2106.07225v1 | |
http://arxiv.org/abs/2107.07651v2,creativecommons.org/licenses/by/4.0/,Align before Fuse: Vision and Language Representation Learning with Momentum Distillation,Junnan Li and Ramprasaath R. Selvaraju and Akhilesh Deepak Gotmare and Shafiq Joty and Caiming Xiong and Steven Hoi,http://arxiv.org/pdf/2107.07651v2 | |
http://arxiv.org/abs/2108.09105v1,creativecommons.org/licenses/by/4.0/,Airbert: In-domain Pretraining for Vision-and-Language Navigation,Pierre-Louis Guhur and Makarand Tapaswi and Shizhe Chen and Ivan Laptev and Cordelia Schmid,http://arxiv.org/pdf/2108.09105v1 | |
http://arxiv.org/abs/2202.12814v1,creativecommons.org/licenses/by/4.0/,The Reality of Multi-Lingual Machine Translation,Tom Kocmi and Dominik Macháček and Ondřej Bojar,http://arxiv.org/pdf/2202.12814v1 | |
http://arxiv.org/abs/2204.11454v2,creativecommons.org/licenses/by/4.0/,Natural Language to Code Translation with Execution,Freda Shi and Daniel Fried and Marjan Ghazvininejad and Luke Zettlemoyer and Sida I. Wang,http://arxiv.org/pdf/2204.11454v2 | |
http://arxiv.org/abs/2205.15503v3,creativecommons.org/licenses/by/4.0/,Leveraging Pre-Trained Language Models to Streamline Natural Language Interaction for Self-Tracking,Young-Ho Kim and Sungdong Kim and Minsuk Chang and Sang-Woo Lee,http://arxiv.org/pdf/2205.15503v3 | |
http://arxiv.org/abs/2210.03629v3,creativecommons.org/licenses/by/4.0/,ReAct: Synergizing Reasoning and Acting in Language Models,Shunyu Yao and Jeffrey Zhao and Dian Yu and Nan Du and Izhak Shafran and Karthik Narasimhan and Yuan Cao,http://arxiv.org/pdf/2210.03629v3 | |
http://arxiv.org/abs/2303.18027v2,creativecommons.org/licenses/by/4.0/,Evaluating GPT-4 and ChatGPT on Japanese Medical Licensing Examinations,Jungo Kasai and Yuhei Kasai and Keisuke Sakaguchi and Yutaro Yamada and Dragomir Radev,http://arxiv.org/pdf/2303.18027v2 | |
http://arxiv.org/abs/2304.02754v1,creativecommons.org/licenses/by/4.0/,Behavioral estimates of conceptual structure are robust across tasks in humans but not large language models,Siddharth Suresh and Lisa Padua and Kushin Mukherjee and Timothy T Rogers,http://arxiv.org/pdf/2304.02754v1 | |
http://arxiv.org/abs/2007.15211v2,creativecommons.org/licenses/by/4.0/,NeuralQA: A Usable Library for Question Answering (Contextual Query Expansion + BERT) on Large Datasets,Victor Dibia,http://arxiv.org/pdf/2007.15211v2 | |
http://arxiv.org/abs/2203.07259v3,creativecommons.org/licenses/by/4.0/,The Optimal BERT Surgeon: Scalable and Accurate Second-Order Pruning for Large Language Models,Eldar Kurtic and Daniel Campos and Tuan Nguyen and Elias Frantar and Mark Kurtz and Benjamin Fineran and Michael Goin and Dan Alistarh,http://arxiv.org/pdf/2203.07259v3 | |
http://arxiv.org/abs/2211.17192v1,creativecommons.org/licenses/by/4.0/,Fast Inference from Transformers via Speculative Decoding,Yaniv Leviathan and Matan Kalman and Yossi Matias,http://arxiv.org/pdf/2211.17192v1 | |
http://arxiv.org/abs/2302.06321v1,creativecommons.org/licenses/by/4.0/,Parameter-efficient Modularised Bias Mitigation via AdapterFusion,Deepak Kumar and Oleg Lesota and George Zerveas and Daniel Cohen and Carsten Eickhoff and Markus Schedl and Navid Rekabsaz,http://arxiv.org/pdf/2302.06321v1 | |
http://arxiv.org/abs/2011.06195v1,creativecommons.org/licenses/by/4.0/,Towards Semi-Supervised Semantics Understanding from Speech,Cheng-I Lai and Jin Cao and Sravan Bodapati and Shang-Wen Li,http://arxiv.org/pdf/2011.06195v1 | |
http://arxiv.org/abs/2101.04998v1,creativecommons.org/licenses/by/4.0/,Coarse and Fine-Grained Hostility Detection in Hindi Posts using Fine Tuned Multilingual Embeddings,Arkadipta De and Venkatesh E and Kaushal Kumar Maurya and Maunendra Sankar Desarkar,http://arxiv.org/pdf/2101.04998v1 | |
http://arxiv.org/abs/2104.06378v5,creativecommons.org/licenses/by/4.0/,QA-GNN: Reasoning with Language Models and Knowledge Graphs for Question Answering,Michihiro Yasunaga and Hongyu Ren and Antoine Bosselut and Percy Liang and Jure Leskovec,http://arxiv.org/pdf/2104.06378v5 | |
http://arxiv.org/abs/2106.10619v1,creativecommons.org/licenses/by/4.0/,A Brief Study on the Effects of Training Generative Dialogue Models with a Semantic loss,Prasanna Parthasarathi and Mohamed Abdelsalam and Joelle Pineau and Sarath Chandar,http://arxiv.org/pdf/2106.10619v1 | |
http://arxiv.org/abs/2108.00946v2,creativecommons.org/licenses/by/4.0/,StyleGAN-NADA: CLIP-Guided Domain Adaptation of Image Generators,Rinon Gal and Or Patashnik and Haggai Maron and Gal Chechik and Daniel Cohen-Or,http://arxiv.org/pdf/2108.00946v2 | |
http://arxiv.org/abs/2109.07953v1,creativecommons.org/licenses/by/4.0/,Efficient Attribute Injection for Pretrained Language Models,Reinald Kim Amplayo and Kang Min Yoo and Sang-Woo Lee,http://arxiv.org/pdf/2109.07953v1 | |
http://arxiv.org/abs/2110.15836v2,creativecommons.org/licenses/by/4.0/,Combining Unsupervised and Text Augmented Semi-Supervised Learning for Low Resourced Autoregressive Speech Recognition,Chak-Fai Li and Francis Keith and William Hartmann and Matthew Snover,http://arxiv.org/pdf/2110.15836v2 | |
http://arxiv.org/abs/2202.03753v2,creativecommons.org/licenses/by/4.0/,Semantic features of object concepts generated with GPT-3,Hannes Hansen and Martin N. Hebart,http://arxiv.org/pdf/2202.03753v2 | |
http://arxiv.org/abs/2205.05448v2,creativecommons.org/licenses/by/4.0/,Symphony Generation with Permutation Invariant Language Model,Jiafeng Liu and Yuanliang Dong and Zehua Cheng and Xinran Zhang and Xiaobing Li and Feng Yu and Maosong Sun,http://arxiv.org/pdf/2205.05448v2 | |
http://arxiv.org/abs/2206.15014v1,creativecommons.org/licenses/by/4.0/,Compressing Pre-trained Transformers via Low-Bit NxM Sparsity for Natural Language Understanding,Connor Holmes and Minjia Zhang and Yuxiong He and Bo Wu,http://arxiv.org/pdf/2206.15014v1 | |
http://arxiv.org/abs/2209.11068v1,creativecommons.org/licenses/by/4.0/,Prompting for a conversation: How to control a dialog model?,Josef Valvoda and Yimai Fang and David Vandyke,http://arxiv.org/pdf/2209.11068v1 | |
http://arxiv.org/abs/2210.09132v1,creativecommons.org/licenses/by/4.0/,Pseudo-OOD training for robust language models,Dhanasekar Sundararaman and Nikhil Mehta and Lawrence Carin,http://arxiv.org/pdf/2210.09132v1 | |
http://arxiv.org/abs/2210.15452v1,creativecommons.org/licenses/by/4.0/,Exploring Predictive Uncertainty and Calibration in NLP: A Study on the Impact of Method & Data Scarcity,Dennis Ulmer and Jes Frellsen and Christian Hardmeier,http://arxiv.org/pdf/2210.15452v1 | |
http://arxiv.org/abs/2211.08989v1,creativecommons.org/licenses/by/4.0/,Avoid Overthinking in Self-Supervised Models for Speech Recognition,Dan Berrebbi and Brian Yan and Shinji Watanabe,http://arxiv.org/pdf/2211.08989v1 | |
http://arxiv.org/abs/1709.09443v1,creativecommons.org/licenses/by/4.0/,Prosodic Features from Large Corpora of Child-Directed Speech as Predictors of the Age of Acquisition of Words,Lea Frermann and Michael C. Frank,http://arxiv.org/pdf/1709.09443v1 | |
http://arxiv.org/abs/2101.00376v2,creativecommons.org/licenses/by/4.0/,RiddleSense: Reasoning about Riddle Questions Featuring Linguistic Creativity and Commonsense Knowledge,Bill Yuchen Lin and Ziyi Wu and Yichi Yang and Dong-Ho Lee and Xiang Ren,http://arxiv.org/pdf/2101.00376v2 | |
http://arxiv.org/abs/2112.12926v1,creativecommons.org/licenses/by/4.0/,nvBench: A Large-Scale Synthesized Dataset for Cross-Domain Natural Language to Visualization Task,Yuyu Luo and Jiawei Tang and Guoliang Li,http://arxiv.org/pdf/2112.12926v1 | |
http://arxiv.org/abs/2202.04742v1,creativecommons.org/licenses/by/4.0/,FedQAS: Privacy-aware machine reading comprehension with federated learning,Addi Ait-Mlouk and Sadi Alawadi and Salman Toor and Andreas Hellander,http://arxiv.org/pdf/2202.04742v1 | |
http://arxiv.org/abs/2204.13309v1,creativecommons.org/licenses/by/4.0/,Improving robustness of language models from a geometry-aware perspective,Bin Zhu and Zhaoquan Gu and Le Wang and Jinyin Chen and Qi Xuan,http://arxiv.org/pdf/2204.13309v1 | |
http://arxiv.org/abs/2205.10981v1,creativecommons.org/licenses/by/4.0/,Improving Short Text Classification With Augmented Data Using GPT-3,Salvador Balkus and Donghui Yan,http://arxiv.org/pdf/2205.10981v1 | |
http://arxiv.org/abs/2205.12615v1,creativecommons.org/licenses/by/4.0/,Autoformalization with Large Language Models,Yuhuai Wu and Albert Q. Jiang and Wenda Li and Markus N. Rabe and Charles Staats and Mateja Jamnik and Christian Szegedy,http://arxiv.org/pdf/2205.12615v1 | |
http://arxiv.org/abs/2206.04585v2,creativecommons.org/licenses/by/4.0/,Extracting Zero-shot Common Sense from Large Language Models for Robot 3D Scene Understanding,William Chen and Siyi Hu and Rajat Talak and Luca Carlone,http://arxiv.org/pdf/2206.04585v2 | |
http://arxiv.org/abs/2209.04811v1,creativecommons.org/licenses/by/4.0/,Probing for Understanding of English Verb Classes and Alternations in Large Pre-trained Language Models,David K. Yi and James V. Bruno and Jiayu Han and Peter Zukerman and Shane Steinert-Threlkeld,http://arxiv.org/pdf/2209.04811v1 | |
http://arxiv.org/abs/2209.12711v1,creativecommons.org/licenses/by/4.0/,Can Large Language Models Truly Understand Prompts? A Case Study with Negated Prompts,Joel Jang and Seonghyeon Ye and Minjoon Seo,http://arxiv.org/pdf/2209.12711v1 | |
http://arxiv.org/abs/2211.04699v1,creativecommons.org/licenses/by/4.0/,FF2: A Feature Fusion Two-Stream Framework for Punctuation Restoration,Yangjun Wu and Kebin Fang and Yao Zhao and Hao Zhang and Lifeng Shi and Mengqi Zhang,http://arxiv.org/pdf/2211.04699v1 | |
http://arxiv.org/abs/2211.04898v2,creativecommons.org/licenses/by/4.0/,Mask More and Mask Later: Efficient Pre-training of Masked Language Models by Disentangling the [MASK] Token,Baohao Liao and David Thulke and Sanjika Hewavitharana and Hermann Ney and Christof Monz,http://arxiv.org/pdf/2211.04898v2 | |
http://arxiv.org/abs/2212.05113v1,creativecommons.org/licenses/by/4.0/,Automatically Generating CS Learning Materials with Large Language Models,Stephen MacNeil and Andrew Tran and Juho Leinonen and Paul Denny and Joanne Kim and Arto Hellas and Seth Bernstein and Sami Sarsa,http://arxiv.org/pdf/2212.05113v1 | |
http://arxiv.org/abs/2212.11456v1,creativecommons.org/licenses/by/4.0/,CAMeMBERT: Cascading Assistant-Mediated Multilingual BERT,Dan DeGenaro and Jugal Kalita,http://arxiv.org/pdf/2212.11456v1 | |
http://arxiv.org/abs/2212.14047v1,creativecommons.org/licenses/by/4.0/,Using Large Language Models to Generate Engaging Captions for Data Visualizations,Ashley Liew and Klaus Mueller,http://arxiv.org/pdf/2212.14047v1 | |
http://arxiv.org/abs/2301.08721v1,creativecommons.org/licenses/by/4.0/,Batch Prompting: Efficient Inference with Large Language Model APIs,Zhoujun Cheng and Jungo Kasai and Tao Yu,http://arxiv.org/pdf/2301.08721v1 | |
http://arxiv.org/abs/2302.07856v1,creativecommons.org/licenses/by/4.0/,Dictionary-based Phrase-level Prompting of Large Language Models for Machine Translation,Marjan Ghazvininejad and Hila Gonen and Luke Zettlemoyer,http://arxiv.org/pdf/2302.07856v1 | |
http://arxiv.org/abs/2302.12832v1,creativecommons.org/licenses/by/4.0/,Fluid Transformers and Creative Analogies: Exploring Large Language Models' Capacity for Augmenting Cross-Domain Analogical Creativity,Zijian Ding and Arvind Srinivasan and Stephen MacNeil and Joel Chan,http://arxiv.org/pdf/2302.12832v1 | |
http://arxiv.org/abs/2303.01580v1,creativecommons.org/licenses/by/4.0/,Mixture of Soft Prompts for Controllable Data Generation,Derek Chen and Celine Lee and Yunan Lu and Domenic Rosati and Zhou Yu,http://arxiv.org/pdf/2303.01580v1 | |
http://arxiv.org/abs/2303.06247v2,creativecommons.org/licenses/by/4.0/,Task and Motion Planning with Large Language Models for Object Rearrangement,Yan Ding and Xiaohan Zhang and Chris Paxton and Shiqi Zhang,http://arxiv.org/pdf/2303.06247v2 | |
http://arxiv.org/abs/2303.15473v1,creativecommons.org/licenses/by/4.0/,Can Large Language Models assist in Hazard Analysis?,Simon Diemert and Jens H Weber,http://arxiv.org/pdf/2303.15473v1 | |
http://arxiv.org/abs/2303.16421v1,creativecommons.org/licenses/by/4.0/,ChatGPT is a Knowledgeable but Inexperienced Solver: An Investigation of Commonsense Problem in Large Language Models,Ning Bian and Xianpei Han and Le Sun and Hongyu Lin and Yaojie Lu and Ben He,http://arxiv.org/pdf/2303.16421v1 | |
http://arxiv.org/abs/2304.04487v1,creativecommons.org/licenses/by/4.0/,Inference with Reference: Lossless Acceleration of Large Language Models,Nan Yang and Tao Ge and Liang Wang and Binxing Jiao and Daxin Jiang and Linjun Yang and Rangan Majumder and Furu Wei,http://arxiv.org/pdf/2304.04487v1 | |
http://arxiv.org/abs/2304.06638v1,creativecommons.org/licenses/by/4.0/,How Useful are Educational Questions Generated by Large Language Models?,Sabina Elkins and Ekaterina Kochmar and Jackie C. K. Cheung and Iulian Serban,http://arxiv.org/pdf/2304.06638v1 | |
http://arxiv.org/abs/2012.13354v2,creativecommons.org/licenses/by/4.0/,To what extent do human explanations of model behavior align with actual model behavior?,Grusha Prasad and Yixin Nie and Mohit Bansal and Robin Jia and Douwe Kiela and Adina Williams,http://arxiv.org/pdf/2012.13354v2 | |
http://arxiv.org/abs/2303.02939v3,creativecommons.org/licenses/by/4.0/,FoundationTTS: Text-to-Speech for ASR Customization with Generative Language Model,Ruiqing Xue and Yanqing Liu and Lei He and Xu Tan and Linquan Liu and Edward Lin and Sheng Zhao,http://arxiv.org/pdf/2303.02939v3 | |
http://arxiv.org/abs/2209.07686v2,creativecommons.org/licenses/by/4.0/,"Text and Patterns: For Effective Chain of Thought, It Takes Two to Tango",Aman Madaan and Amir Yazdanbakhsh,http://arxiv.org/pdf/2209.07686v2 | |
http://arxiv.org/abs/1604.05372v1,creativecommons.org/licenses/by/4.0/,Clustering Comparable Corpora of Russian and Ukrainian Academic Texts: Word Embeddings and Semantic Fingerprints,Andrey Kutuzov and Mikhail Kopotev and Tatyana Sviridenko and Lyubov Ivanova,http://arxiv.org/pdf/1604.05372v1 | |
http://arxiv.org/abs/2103.01273v2,creativecommons.org/licenses/by/4.0/,"On the Effectiveness of Dataset Embeddings in Mono-lingual, Multi-lingual and Zero-shot Conditions",Rob van der Goot and Ahmet Üstün and Barbara Plank,http://arxiv.org/pdf/2103.01273v2 | |
http://arxiv.org/abs/2105.09081v1,creativecommons.org/licenses/by/4.0/,Essay-BR: a Brazilian Corpus of Essays,Jeziel C. Marinho and Rafael T. Anchieta and Raimundo S. Moura,http://arxiv.org/pdf/2105.09081v1 | |
http://arxiv.org/abs/2109.05357v1,creativecommons.org/licenses/by/4.0/,Learning from Language Description: Low-shot Named Entity Recognition via Decomposed Framework,Yaqing Wang and Haoda Chu and Chao Zhang and Jing Gao,http://arxiv.org/pdf/2109.05357v1 | |
http://arxiv.org/abs/2109.10147v1,creativecommons.org/licenses/by/4.0/,Knowledge Distillation with Noisy Labels for Natural Language Understanding,Shivendra Bhardwaj and Abbas Ghaddar and Ahmad Rashid and Khalil Bibi and Chengyang Li and Ali Ghodsi and Philippe Langlais and Mehdi Rezagholizadeh,http://arxiv.org/pdf/2109.10147v1 | |
http://arxiv.org/abs/2110.09635v1,creativecommons.org/licenses/by/4.0/,A ground-truth dataset of real security patches,Sofia Reis and Rui Abreu,http://arxiv.org/pdf/2110.09635v1 | |
http://arxiv.org/abs/2110.11790v1,creativecommons.org/licenses/by/4.0/,Automatic Guide Generation for Stan via NumPyro,Guillaume Baudart and Louis Mandel,http://arxiv.org/pdf/2110.11790v1 | |
http://arxiv.org/abs/2112.08633v2,creativecommons.org/licenses/by/4.0/,Learning To Retrieve Prompts for In-Context Learning,Ohad Rubin and Jonathan Herzig and Jonathan Berant,http://arxiv.org/pdf/2112.08633v2 | |
http://arxiv.org/abs/2201.05613v2,creativecommons.org/licenses/by/4.0/,The Dark Side of the Language: Pre-trained Transformers in the DarkNet,Leonardo Ranaldi and Aria Nourbakhsh and Arianna Patrizi and Elena Sofia Ruzzetti and Dario Onorati and Francesca Fallucchi and Fabio Massimo Zanzotto,http://arxiv.org/pdf/2201.05613v2 | |
http://arxiv.org/abs/2202.07991v1,creativecommons.org/licenses/by/4.0/,ADIMA: Abuse Detection In Multilingual Audio,Vikram Gupta and Rini Sharon and Ramit Sawhney and Debdoot Mukherjee,http://arxiv.org/pdf/2202.07991v1 | |
http://arxiv.org/abs/2204.14243v2,creativecommons.org/licenses/by/4.0/,Training Naturalized Semantic Parsers with Very Little Data,Subendhu Rongali and Konstantine Arkoudas and Melanie Rubino and Wael Hamza,http://arxiv.org/pdf/2204.14243v2 | |
http://arxiv.org/abs/2209.00731v2,creativecommons.org/licenses/by/4.0/,In conversation with Artificial Intelligence: aligning language models with human values,Atoosa Kasirzadeh and Iason Gabriel,http://arxiv.org/pdf/2209.00731v2 | |
http://arxiv.org/abs/2210.13838v2,creativecommons.org/licenses/by/4.0/,Multilingual Relation Classification via Efficient and Effective Prompting,Yuxuan Chen and David Harbecke and Leonhard Hennig,http://arxiv.org/pdf/2210.13838v2 | |
http://arxiv.org/abs/2211.00046v1,creativecommons.org/licenses/by/4.0/,Very Low Resource Sentence Alignment: Luhya and Swahili,Everlyn Asiko Chimoto and Bruce A. Bassett,http://arxiv.org/pdf/2211.00046v1 | |
http://arxiv.org/abs/2212.10539v1,creativecommons.org/licenses/by/4.0/,"Toward Human Readable Prompt Tuning: Kubrick's The Shining is a good movie, and a good prompt too?",Weijia Shi and Xiaochuang Han and Hila Gonen and Ari Holtzman and Yulia Tsvetkov and Luke Zettlemoyer,http://arxiv.org/pdf/2212.10539v1 | |
http://arxiv.org/abs/2303.00733v1,creativecommons.org/licenses/by/4.0/,SpeechPrompt v2: Prompt Tuning for Speech Classification Tasks,Kai-Wei Chang and Yu-Kai Wang and Hua Shen and Iu-thing Kang and Wei-Cheng Tseng and Shang-Wen Li and Hung-yi Lee,http://arxiv.org/pdf/2303.00733v1 | |
http://arxiv.org/abs/2304.01746v1,creativecommons.org/licenses/by/4.0/,Is ChatGPT a Highly Fluent Grammatical Error Correction System? A Comprehensive Evaluation,Tao Fang and Shu Yang and Kaixin Lan and Derek F. Wong and Jinpeng Hu and Lidia S. Chao and Yue Zhang,http://arxiv.org/pdf/2304.01746v1 | |
http://arxiv.org/abs/2304.03682v1,creativecommons.org/licenses/by/4.0/,BenCoref: A Multi-Domain Dataset of Nominal Phrases and Pronominal Reference Annotations,Shadman Rohan and Mojammel Hossain and Mohammad Mamun Or Rashid and Nabeel Mohammed,http://arxiv.org/pdf/2304.03682v1 | |
http://arxiv.org/abs/1512.08823v2,creativecommons.org/licenses/by/4.0/,Reduction of Nondeterministic Tree Automata,Ricardo Almeida and Lukáš Holík and Richard Mayr,http://arxiv.org/pdf/1512.08823v2 | |
http://arxiv.org/abs/2004.02077v1,creativecommons.org/licenses/by/4.0/,Machine Translation Pre-training for Data-to-Text Generation -- A Case Study in Czech,Mihir Kale and Scott Roy,http://arxiv.org/pdf/2004.02077v1 | |
http://arxiv.org/abs/2112.12750v1,creativecommons.org/licenses/by/4.0/,SLIP: Self-supervision meets Language-Image Pre-training,Norman Mu and Alexander Kirillov and David Wagner and Saining Xie,http://arxiv.org/pdf/2112.12750v1 | |
http://arxiv.org/abs/2111.09543v4,creativecommons.org/licenses/by/4.0/,DeBERTaV3: Improving DeBERTa using ELECTRA-Style Pre-Training with Gradient-Disentangled Embedding Sharing,Pengcheng He and Jianfeng Gao and Weizhu Chen,http://arxiv.org/pdf/2111.09543v4 | |
http://arxiv.org/abs/2106.06297v1,creativecommons.org/licenses/by/4.0/,Dynamic Language Models for Continuously Evolving Content,Spurthi Amba Hombaiah and Tao Chen and Mingyang Zhang and Michael Bendersky and Marc Najork,http://arxiv.org/pdf/2106.06297v1 | |
http://arxiv.org/abs/2303.06135v2,creativecommons.org/licenses/by/4.0/,Rewarding Chatbots for Real-World Engagement with Millions of Users,Robert Irvine and Douglas Boubert and Vyas Raina and Adian Liusie and Ziyi Zhu and Vineet Mudupalli and Aliaksei Korshuk and Zongyi Liu and Fritz Cremer and Valentin Assassi and Christie-Carol Beauchamp and Xiaoding Lu and Thomas Rialan and William Beauchamp,http://arxiv.org/pdf/2303.06135v2 | |
http://arxiv.org/abs/2301.09626v1,creativecommons.org/licenses/by/4.0/,Efficient Language Model Training through Cross-Lingual and Progressive Transfer Learning,Malte Ostendorff and Georg Rehm,http://arxiv.org/pdf/2301.09626v1 | |
http://arxiv.org/abs/2012.07534v1,creativecommons.org/licenses/by/4.0/,Effect of Word Embedding Models on Hate and Offensive Speech Detection,Safa Alsafari and Samira Sadaoui and Malek Mouhoub,http://arxiv.org/pdf/2012.07534v1 | |
http://arxiv.org/abs/2105.07465v3,creativecommons.org/licenses/by/4.0/,SLGPT: Using Transfer Learning to Directly Generate Simulink Model Files and Find Bugs in the Simulink Toolchain,Sohil Lal Shrestha and Christoph Csallner,http://arxiv.org/pdf/2105.07465v3 | |
http://arxiv.org/abs/2106.00840v1,creativecommons.org/licenses/by/4.0/,Comparing Test Sets with Item Response Theory,Clara Vania and Phu Mon Htut and William Huang and Dhara Mungra and Richard Yuanzhe Pang and Jason Phang and Haokun Liu and Kyunghyun Cho and Samuel R. Bowman,http://arxiv.org/pdf/2106.00840v1 | |
http://arxiv.org/abs/2112.00791v2,creativecommons.org/licenses/by/4.0/,Controlling Conditional Language Models without Catastrophic Forgetting,Tomasz Korbak and Hady Elsahar and German Kruszewski and Marc Dymetman,http://arxiv.org/pdf/2112.00791v2 | |
http://arxiv.org/abs/2204.05979v1,creativecommons.org/licenses/by/4.0/,Discovering material information using hierarchical Reformer model on financial regulatory filings,Francois Mercier and Makesh Narsimhan,http://arxiv.org/pdf/2204.05979v1 | |
http://arxiv.org/abs/2211.09800v2,creativecommons.org/licenses/by/4.0/,InstructPix2Pix: Learning to Follow Image Editing Instructions,Tim Brooks and Aleksander Holynski and Alexei A. Efros,http://arxiv.org/pdf/2211.09800v2 | |
http://arxiv.org/abs/2211.15199v1,creativecommons.org/licenses/by/4.0/,Large Pre-Trained Models with Extra-Large Vocabularies: A Contrastive Analysis of Hebrew BERT Models and a New One to Outperform Them All,Eylon Guetta and Avi Shmidman and Shaltiel Shmidman and Cheyn Shmuel Shmidman and Joshua Guedalia and Moshe Koppel and Dan Bareket and Amit Seker and Reut Tsarfaty,http://arxiv.org/pdf/2211.15199v1 | |
http://arxiv.org/abs/2112.08726v1,creativecommons.org/licenses/by/4.0/,NeuroLogic A*esque Decoding: Constrained Text Generation with Lookahead Heuristics,Ximing Lu and Sean Welleck and Peter West and Liwei Jiang and Jungo Kasai and Daniel Khashabi and Ronan Le Bras and Lianhui Qin and Youngjae Yu and Rowan Zellers and Noah A. Smith and Yejin Choi,http://arxiv.org/pdf/2112.08726v1 | |
http://arxiv.org/abs/2201.06723v2,creativecommons.org/licenses/by/4.0/,Emojis as Anchors to Detect Arabic Offensive Language and Hate Speech,Hamdy Mubarak and Sabit Hassan and Shammur Absar Chowdhury,http://arxiv.org/pdf/2201.06723v2 | |
http://arxiv.org/abs/2202.01374v1,creativecommons.org/licenses/by/4.0/,mSLAM: Massively multilingual joint pre-training for speech and text,Ankur Bapna and Colin Cherry and Yu Zhang and Ye Jia and Melvin Johnson and Yong Cheng and Simran Khanuja and Jason Riesa and Alexis Conneau,http://arxiv.org/pdf/2202.01374v1 | |
http://arxiv.org/abs/2205.04652v1,creativecommons.org/licenses/by/4.0/,SuMe: A Dataset Towards Summarizing Biomedical Mechanisms,Mohaddeseh Bastan and Nishant Shankar and Mihai Surdeanu and Niranjan Balasubramanian,http://arxiv.org/pdf/2205.04652v1 | |
http://arxiv.org/abs/2206.08932v1,creativecommons.org/licenses/by/4.0/,Putting GPT-3's Creativity to the (Alternative Uses) Test,Claire Stevenson and Iris Smal and Matthijs Baas and Raoul Grasman and Han van der Maas,http://arxiv.org/pdf/2206.08932v1 | |
http://arxiv.org/abs/2212.10846v2,creativecommons.org/licenses/by/4.0/,From Images to Textual Prompts: Zero-shot VQA with Frozen Large Language Models,Jiaxian Guo and Junnan Li and Dongxu Li and Anthony Meng Huat Tiong and Boyang Li and Dacheng Tao and Steven C. H. Hoi,http://arxiv.org/pdf/2212.10846v2 | |
http://arxiv.org/abs/2302.04931v1,creativecommons.org/licenses/by/4.0/,In-Context Learning with Many Demonstration Examples,Mukai Li and Shansan Gong and Jiangtao Feng and Yiheng Xu and Jun Zhang and Zhiyong Wu and Lingpeng Kong,http://arxiv.org/pdf/2302.04931v1 | |
http://arxiv.org/abs/2111.00276v2,creativecommons.org/licenses/by/4.0/,EventNarrative: A large-scale Event-centric Dataset for Knowledge Graph-to-Text Generation,Anthony Colas and Ali Sadeghian and Yue Wang and Daisy Zhe Wang,http://arxiv.org/pdf/2111.00276v2 | |
http://arxiv.org/abs/2208.11701v1,creativecommons.org/licenses/by/4.0/,Ontology-Driven Self-Supervision for Adverse Childhood Experiences Identification Using Social Media Datasets,Jinge Wu and Rowena Smith and Honghan Wu,http://arxiv.org/pdf/2208.11701v1 | |
http://arxiv.org/abs/2205.05055v6,creativecommons.org/licenses/by/4.0/,Data Distributional Properties Drive Emergent In-Context Learning in Transformers,Stephanie C. Y. Chan and Adam Santoro and Andrew K. Lampinen and Jane X. Wang and Aaditya Singh and Pierre H. Richemond and Jay McClelland and Felix Hill,http://arxiv.org/pdf/2205.05055v6 | |
http://arxiv.org/abs/2205.11005v1,creativecommons.org/licenses/by/4.0/,Parameter-Efficient Sparsity for Large Language Models Fine-Tuning,Yuchao Li and Fuli Luo and Chuanqi Tan and Mengdi Wang and Songfang Huang and Shen Li and Junjie Bai,http://arxiv.org/pdf/2205.11005v1 | |
http://arxiv.org/abs/2110.12201v1,creativecommons.org/licenses/by/4.0/,Spanish Legalese Language Model and Corpora,Asier Gutiérrez-Fandiño and Jordi Armengol-Estapé and Aitor Gonzalez-Agirre and Marta Villegas,http://arxiv.org/pdf/2110.12201v1 | |
http://arxiv.org/abs/1911.00461v1,creativecommons.org/licenses/by/4.0/,On the Unintended Social Bias of Training Language Generation Models with Data from Local Media,Omar U. Florez,http://arxiv.org/pdf/1911.00461v1 | |
http://arxiv.org/abs/2010.04897v1,creativecommons.org/licenses/by/4.0/,Information Extraction from Swedish Medical Prescriptions with Sig-Transformer Encoder,John Pougue Biyong and Bo Wang and Terry Lyons and Alejo J Nevado-Holgado,http://arxiv.org/pdf/2010.04897v1 | |
http://arxiv.org/abs/2102.03551v1,creativecommons.org/licenses/by/4.0/,Jointly Improving Language Understanding and Generation with Quality-Weighted Weak Supervision of Automatic Labeling,Ernie Chang and Vera Demberg and Alex Marin,http://arxiv.org/pdf/2102.03551v1
http://arxiv.org/abs/2104.01394v1,creativecommons.org/licenses/by/4.0/,MMBERT: Multimodal BERT Pretraining for Improved Medical VQA,Yash Khare and Viraj Bagal and Minesh Mathew and Adithi Devi and U Deva Priyakumar and CV Jawahar,http://arxiv.org/pdf/2104.01394v1
http://arxiv.org/abs/2106.00590v2,creativecommons.org/licenses/by/4.0/,NewsEmbed: Modeling News through Pre-trained Document Representations,Jialu Liu and Tianqi Liu and Cong Yu,http://arxiv.org/pdf/2106.00590v2
http://arxiv.org/abs/2109.12036v1,creativecommons.org/licenses/by/4.0/,Transformers Generalize Linearly,Jackson Petty and Robert Frank,http://arxiv.org/pdf/2109.12036v1
http://arxiv.org/abs/2109.12406v1,creativecommons.org/licenses/by/4.0/,MINIMAL: Mining Models for Data Free Universal Adversarial Triggers,Swapnil Parekh and Yaman Singla Kumar and Somesh Singh and Changyou Chen and Balaji Krishnamurthy and Rajiv Ratn Shah,http://arxiv.org/pdf/2109.12406v1
http://arxiv.org/abs/2111.00526v2,creativecommons.org/licenses/by/4.0/,FinEAS: Financial Embedding Analysis of Sentiment,Asier Gutiérrez-Fandiño and Miquel Noguer i Alonso and Petter Kolm and Jordi Armengol-Estapé,http://arxiv.org/pdf/2111.00526v2
http://arxiv.org/abs/2111.01543v1,creativecommons.org/licenses/by/4.0/,UQuAD1.0: Development of an Urdu Question Answering Training Data for Machine Reading Comprehension,Samreen Kazi and Shakeel Khoja,http://arxiv.org/pdf/2111.01543v1
http://arxiv.org/abs/2203.10378v1,creativecommons.org/licenses/by/4.0/,On Robust Prefix-Tuning for Text Classification,Zonghan Yang and Yang Liu,http://arxiv.org/pdf/2203.10378v1
http://arxiv.org/abs/2204.05185v2,creativecommons.org/licenses/by/4.0/,Uniform Complexity for Text Generation,Joseph Marvin Imperial,http://arxiv.org/pdf/2204.05185v2
http://arxiv.org/abs/2205.00363v3,creativecommons.org/licenses/by/4.0/,Visual Spatial Reasoning,Fangyu Liu and Guy Emerson and Nigel Collier,http://arxiv.org/pdf/2205.00363v3
http://arxiv.org/abs/2205.09246v1,creativecommons.org/licenses/by/4.0/,Transformer-based Program Synthesis for Low-Data Environments,Jack Roper,http://arxiv.org/pdf/2205.09246v1
http://arxiv.org/abs/2208.05798v1,creativecommons.org/licenses/by/4.0/,Aesthetic Visual Question Answering of Photographs,Xin Jin and Wu Zhou and Xinghui Zhou and Shuai Cui and Le Zhang and Jianwen Lv and Shu Zhao,http://arxiv.org/pdf/2208.05798v1
http://arxiv.org/abs/2210.01240v4,creativecommons.org/licenses/by/4.0/,Language Models Are Greedy Reasoners: A Systematic Formal Analysis of Chain-of-Thought,Abulhair Saparov and He He,http://arxiv.org/pdf/2210.01240v4
http://arxiv.org/abs/2210.05839v1,creativecommons.org/licenses/by/4.0/,SEAL: Interactive Tool for Systematic Error Analysis and Labeling,Nazneen Rajani and Weixin Liang and Lingjiao Chen and Meg Mitchell and James Zou,http://arxiv.org/pdf/2210.05839v1
http://arxiv.org/abs/2211.07954v1,creativecommons.org/licenses/by/4.0/,An Overview on Controllable Text Generation via Variational Auto-Encoders,Haoqin Tu and Yitong Li,http://arxiv.org/pdf/2211.07954v1
http://arxiv.org/abs/2212.10114v1,creativecommons.org/licenses/by/4.0/,True Detective: A Challenging Benchmark for Deep Abductive Reasoning in Foundation Models,Maksym Del and Mark Fishel,http://arxiv.org/pdf/2212.10114v1
http://arxiv.org/abs/2212.10773v1,creativecommons.org/licenses/by/4.0/,MultiInstruct: Improving Multi-Modal Zero-Shot Learning via Instruction Tuning,Zhiyang Xu and Ying Shen and Lifu Huang,http://arxiv.org/pdf/2212.10773v1
http://arxiv.org/abs/2301.10166v1,creativecommons.org/licenses/by/4.0/,Leveraging Vision-Language Models for Granular Market Change Prediction,Christopher Wimmer and Navid Rekabsaz,http://arxiv.org/pdf/2301.10166v1
http://arxiv.org/abs/2301.12314v1,creativecommons.org/licenses/by/4.0/,Progressive Prompts: Continual Learning for Language Models,Anastasia Razdaibiedina and Yuning Mao and Rui Hou and Madian Khabsa and Mike Lewis and Amjad Almahairi,http://arxiv.org/pdf/2301.12314v1
http://arxiv.org/abs/2302.02463v3,creativecommons.org/licenses/by/4.0/,Nationality Bias in Text Generation,Pranav Narayanan Venkit and Sanjana Gautam and Ruchi Panchanadikar and Ting-Hao 'Kenneth' Huang and Shomir Wilson,http://arxiv.org/pdf/2302.02463v3
http://arxiv.org/abs/2303.03840v2,creativecommons.org/licenses/by/4.0/,A Challenging Benchmark for Low-Resource Learning,Yudong Wang and Chang Ma and Qingxiu Dong and Lingpeng Kong and Jingjing Xu,http://arxiv.org/pdf/2303.03840v2
http://arxiv.org/abs/2303.04497v1,creativecommons.org/licenses/by/4.0/,Exploiting the Textual Potential from Vision-Language Pre-training for Text-based Person Search,Guanshuo Wang and Fufu Yu and Junjie Li and Qiong Jia and Shouhong Ding,http://arxiv.org/pdf/2303.04497v1
http://arxiv.org/abs/2304.08243v1,creativecommons.org/licenses/by/4.0/,Stochastic Code Generation,Swapnil Sharma and Nikita Anand and Kranthi Kiran G. V,http://arxiv.org/pdf/2304.08243v1
http://arxiv.org/abs/2201.07520v1,creativecommons.org/licenses/by/4.0/,CM3: A Causal Masked Multimodal Model of the Internet,Armen Aghajanyan and Bernie Huang and Candace Ross and Vladimir Karpukhin and Hu Xu and Naman Goyal and Dmytro Okhonko and Mandar Joshi and Gargi Ghosh and Mike Lewis and Luke Zettlemoyer,http://arxiv.org/pdf/2201.07520v1
http://arxiv.org/abs/2009.09223v1,creativecommons.org/licenses/by/4.0/,BioALBERT: A Simple and Effective Pre-trained Language Model for Biomedical Named Entity Recognition,Usman Naseem and Matloob Khushi and Vinay Reddy and Sakthivel Rajendran and Imran Razzak and Jinman Kim,http://arxiv.org/pdf/2009.09223v1
http://arxiv.org/abs/2105.02605v2,creativecommons.org/licenses/by/4.0/,GraphFormers: GNN-nested Transformers for Representation Learning on Textual Graph,Junhan Yang and Zheng Liu and Shitao Xiao and Chaozhuo Li and Defu Lian and Sanjay Agrawal and Amit Singh and Guangzhong Sun and Xing Xie,http://arxiv.org/pdf/2105.02605v2
http://arxiv.org/abs/2301.02120v1,creativecommons.org/licenses/by/4.0/,Reprogramming Pretrained Language Models for Protein Sequence Representation Learning,Ria Vinod and Pin-Yu Chen and Payel Das,http://arxiv.org/pdf/2301.02120v1
http://arxiv.org/abs/2301.07851v1,creativecommons.org/licenses/by/4.0/,From English to More Languages: Parameter-Efficient Model Reprogramming for Cross-Lingual Speech Recognition,Chao-Han Huck Yang and Bo Li and Yu Zhang and Nanxin Chen and Rohit Prabhavalkar and Tara N. Sainath and Trevor Strohman,http://arxiv.org/pdf/2301.07851v1
http://arxiv.org/abs/2302.05527v1,creativecommons.org/licenses/by/4.0/,CodeBERTScore: Evaluating Code Generation with Pretrained Models of Code,Shuyan Zhou and Uri Alon and Sumit Agarwal and Graham Neubig,http://arxiv.org/pdf/2302.05527v1
http://arxiv.org/abs/2106.12797v1,creativecommons.org/licenses/by/4.0/,A comprehensive empirical analysis on cross-domain semantic enrichment for detection of depressive language,Nawshad Farruque and Randy Goebel and Osmar Zaiane,http://arxiv.org/pdf/2106.12797v1
http://arxiv.org/abs/2109.09707v1,creativecommons.org/licenses/by/4.0/,A Plug-and-Play Method for Controlled Text Generation,Damian Pascual and Beni Egressy and Clara Meister and Ryan Cotterell and Roger Wattenhofer,http://arxiv.org/pdf/2109.09707v1
http://arxiv.org/abs/2110.04544v1,creativecommons.org/licenses/by/4.0/,CLIP-Adapter: Better Vision-Language Models with Feature Adapters,Peng Gao and Shijie Geng and Renrui Zhang and Teli Ma and Rongyao Fang and Yongfeng Zhang and Hongsheng Li and Yu Qiao,http://arxiv.org/pdf/2110.04544v1
http://arxiv.org/abs/2210.12810v1,creativecommons.org/licenses/by/4.0/,Code4Struct: Code Generation for Few-Shot Structured Prediction from Natural Language,Xingyao Wang and Sha Li and Heng Ji,http://arxiv.org/pdf/2210.12810v1
http://arxiv.org/abs/2302.05852v1,creativecommons.org/licenses/by/4.0/,"""Why is this misleading?"": Detecting News Headline Hallucinations with Explanations",Jiaming Shen and Jialu Liu and Dan Finnie and Negar Rahmati and Michael Bendersky and Marc Najork,http://arxiv.org/pdf/2302.05852v1
http://arxiv.org/abs/2304.11107v1,creativecommons.org/licenses/by/4.0/,ChatABL: Abductive Learning via Natural Language Interaction with ChatGPT,Tianyang Zhong and Yaonai Wei and Li Yang and Zihao Wu and Zhengliang Liu and Xiaozheng Wei and Wenjun Li and Junjie Yao and Chong Ma and Xiang Li and Dajiang Zhu and Xi Jiang and Junwei Han and Dinggang Shen and Tianming Liu and Tuo Zhang,http://arxiv.org/pdf/2304.11107v1
http://arxiv.org/abs/2102.06991v2,creativecommons.org/licenses/by/4.0/,The first large scale collection of diverse Hausa language datasets,Isa Inuwa-Dutse,http://arxiv.org/pdf/2102.06991v2
http://arxiv.org/abs/2107.09948v4,creativecommons.org/licenses/by/4.0/,A Statistical Model of Word Rank Evolution,Alex John Quijano and Rick Dale and Suzanne Sindi,http://arxiv.org/pdf/2107.09948v4
http://arxiv.org/abs/2303.00807v1,creativecommons.org/licenses/by/4.0/,UDAPDR: Unsupervised Domain Adaptation via LLM Prompting and Distillation of Rerankers,Jon Saad-Falcon and Omar Khattab and Keshav Santhanam and Radu Florian and Martin Franz and Salim Roukos and Avirup Sil and Md Arafat Sultan and Christopher Potts,http://arxiv.org/pdf/2303.00807v1
http://arxiv.org/abs/1902.06092v1,creativecommons.org/licenses/by/4.0/,Exploring Language Similarities with Dimensionality Reduction Technique,Sangarshanan Veeraraghavan,http://arxiv.org/pdf/1902.06092v1
http://arxiv.org/abs/2205.12672v2,creativecommons.org/licenses/by/4.0/,Discovering Language-neutral Sub-networks in Multilingual Language Models,Negar Foroutan and Mohammadreza Banaei and Remi Lebret and Antoine Bosselut and Karl Aberer,http://arxiv.org/pdf/2205.12672v2
http://arxiv.org/abs/2303.15350v1,creativecommons.org/licenses/by/4.0/,Improving Neural Topic Models with Wasserstein Knowledge Distillation,Suman Adhya and Debarshi Kumar Sanyal,http://arxiv.org/pdf/2303.15350v1
http://arxiv.org/abs/2205.01204v1,creativecommons.org/licenses/by/4.0/,Multi-Task Text Classification using Graph Convolutional Networks for Large-Scale Low Resource Language,Mounika Marreddy and Subba Reddy Oota and Lakshmi Sireesha Vakada and Venkata Charan Chinni and Radhika Mamidi,http://arxiv.org/pdf/2205.01204v1
http://arxiv.org/abs/1904.02036v1,creativecommons.org/licenses/by/4.0/,A Large-Scale Comparison of Historical Text Normalization Systems,Marcel Bollmann,http://arxiv.org/pdf/1904.02036v1
http://arxiv.org/abs/2104.04243v1,creativecommons.org/licenses/by/4.0/,Incorporating External Knowledge to Enhance Tabular Reasoning,J. Neeraja and Vivek Gupta and Vivek Srikumar,http://arxiv.org/pdf/2104.04243v1
http://arxiv.org/abs/2204.10483v1,creativecommons.org/licenses/by/4.0/,NLP Based Anomaly Detection for Categorical Time Series,Matthew Horak and Sowmya Chandrasekaran and Giovanni Tobar,http://arxiv.org/pdf/2204.10483v1
http://arxiv.org/abs/2210.00131v2,creativecommons.org/licenses/by/4.0/,Selection Induced Collider Bias: A Gender Pronoun Uncertainty Case Study,Emily McMilin,http://arxiv.org/pdf/2210.00131v2
http://arxiv.org/abs/2304.05591v1,creativecommons.org/licenses/by/4.0/,Semantic Feature Verification in FLAN-T5,Siddharth Suresh and Kushin Mukherjee and Timothy T. Rogers,http://arxiv.org/pdf/2304.05591v1
http://arxiv.org/abs/2304.11094v1,creativecommons.org/licenses/by/4.0/,Effectiveness of Debiasing Techniques: An Indigenous Qualitative Analysis,Vithya Yogarajan and Gillian Dobbie and Henry Gouk,http://arxiv.org/pdf/2304.11094v1
http://arxiv.org/abs/2109.02797v1,creativecommons.org/licenses/by/4.0/,Puzzle Solving without Search or Human Knowledge: An Unnatural Language Approach,David Noever and Ryerson Burdick,http://arxiv.org/pdf/2109.02797v1
http://arxiv.org/abs/2109.11577v2,creativecommons.org/licenses/by/4.0/,Text Ranking and Classification using Data Compression,Nitya Kasturi and Igor L. Markov,http://arxiv.org/pdf/2109.11577v2
http://arxiv.org/abs/2205.12643v1,creativecommons.org/licenses/by/4.0/,Asking the Right Questions in Low Resource Template Extraction,Nils Holzenberger and Yunmo Chen and Benjamin Van Durme,http://arxiv.org/pdf/2205.12643v1
http://arxiv.org/abs/2210.06384v3,creativecommons.org/licenses/by/4.0/,GMP*: Well-Tuned Gradual Magnitude Pruning Can Outperform Most BERT-Pruning Methods,Eldar Kurtic and Dan Alistarh,http://arxiv.org/pdf/2210.06384v3
http://arxiv.org/abs/2210.14852v2,creativecommons.org/licenses/by/4.0/,Causality Detection using Multiple Annotation Decisions,Quynh Anh Nguyen and Arka Mitra,http://arxiv.org/pdf/2210.14852v2
http://arxiv.org/abs/2211.17201v1,creativecommons.org/licenses/by/4.0/,ExtremeBERT: A Toolkit for Accelerating Pretraining of Customized BERT,Rui Pan and Shizhe Diao and Jianlin Chen and Tong Zhang,http://arxiv.org/pdf/2211.17201v1
http://arxiv.org/abs/2302.11042v1,creativecommons.org/licenses/by/4.0/,In-context Example Selection with Influences,Tai Nguyen and Eric Wong,http://arxiv.org/pdf/2302.11042v1
http://arxiv.org/abs/2304.12102v1,creativecommons.org/licenses/by/4.0/,Unlocking Context Constraints of LLMs: Enhancing Context Efficiency of LLMs with Self-Information-Based Content Filtering,Yucheng Li,http://arxiv.org/pdf/2304.12102v1
http://arxiv.org/abs/2109.14259v1,creativecommons.org/licenses/by/4.0/,Hierarchical Character Tagger for Short Text Spelling Error Correction,Mengyi Gao and Canran Xu and Peng Shi,http://arxiv.org/pdf/2109.14259v1
http://arxiv.org/abs/2111.14447v2,creativecommons.org/licenses/by/4.0/,ZeroCap: Zero-Shot Image-to-Text Generation for Visual-Semantic Arithmetic,Yoad Tewel and Yoav Shalev and Idan Schwartz and Lior Wolf,http://arxiv.org/pdf/2111.14447v2
http://arxiv.org/abs/2206.01838v1,creativecommons.org/licenses/by/4.0/,Differentially Private Model Compression,Fatemehsadat Mireshghallah and Arturs Backurs and Huseyin A Inan and Lukas Wutschitz and Janardhan Kulkarni,http://arxiv.org/pdf/2206.01838v1
http://arxiv.org/abs/2212.01365v1,creativecommons.org/licenses/by/4.0/,An Information-Theoretic Analysis of Compute-Optimal Neural Scaling Laws,Hong Jun Jeon and Benjamin Van Roy,http://arxiv.org/pdf/2212.01365v1
http://arxiv.org/abs/2302.09185v1,creativecommons.org/licenses/by/4.0/,Bounding the Capabilities of Large Language Models in Open Text Generation with Prompt Constraints,Albert Lu and Hongxin Zhang and Yanzhe Zhang and Xuezhi Wang and Diyi Yang,http://arxiv.org/pdf/2302.09185v1
http://arxiv.org/abs/2212.14834v4,creativecommons.org/licenses/by/4.0/,Large Language Models are Zero-Shot Fuzzers: Fuzzing Deep-Learning Libraries via Large Language Models,Yinlin Deng and Chunqiu Steven Xia and Haoran Peng and Chenyuan Yang and Lingming Zhang,http://arxiv.org/pdf/2212.14834v4
http://arxiv.org/abs/2212.07143v1,creativecommons.org/licenses/by/4.0/,Reproducible scaling laws for contrastive language-image learning,Mehdi Cherti and Romain Beaumont and Ross Wightman and Mitchell Wortsman and Gabriel Ilharco and Cade Gordon and Christoph Schuhmann and Ludwig Schmidt and Jenia Jitsev,http://arxiv.org/pdf/2212.07143v1
http://arxiv.org/abs/2204.02311v5,creativecommons.org/licenses/by/4.0/,PaLM: Scaling Language Modeling with Pathways,Aakanksha Chowdhery and Sharan Narang and Jacob Devlin and Maarten Bosma and Gaurav Mishra and Adam Roberts and Paul Barham and Hyung Won Chung and Charles Sutton and Sebastian Gehrmann and Parker Schuh and Kensen Shi and Sasha Tsvyashchenko and Joshua Maynez and Abhishek Rao and Parker Barnes and Yi Tay and Noam Shazeer and Vinodkumar Prabhakaran and Emily Reif and Nan Du and Ben Hutchinson and Reiner Pope and James Bradbury and Jacob Austin and Michael Isard and Guy Gur-Ari and Pengcheng Yin and Toju Duke and Anselm Levskaya and Sanjay Ghemawat and Sunipa Dev and Henryk Michalewski and Xavier Garcia and Vedant Misra and Kevin Robinson and Liam Fedus and Denny Zhou and Daphne Ippolito and David Luan and Hyeontaek Lim and Barret Zoph and Alexander Spiridonov and Ryan Sepassi and David Dohan and Shivani Agrawal and Mark Omernick and Andrew M. Dai and Thanumalayan Sankaranarayana Pillai and Marie Pellat and Aitor Lewkowycz and Erica Moreira and Rewon Child and Oleksandr Polozov and Katherine Lee and Zongwei Zhou and Xuezhi Wang and Brennan Saeta and Mark Diaz and Orhan Firat and Michele Catasta and Jason Wei and Kathy Meier-Hellstern and Douglas Eck and Jeff Dean and Slav Petrov and Noah Fiedel,http://arxiv.org/pdf/2204.02311v5
http://arxiv.org/abs/2111.09791v1,creativecommons.org/licenses/by/4.0/,Supporting Undotted Arabic with Pre-trained Language Models,Aviad Rom and Kfir Bar,http://arxiv.org/pdf/2111.09791v1
http://arxiv.org/abs/2212.10502v1,creativecommons.org/licenses/by/4.0/,A Measure-Theoretic Characterization of Tight Language Models,Li Du and Lucas Torroba Hennigen and Tiago Pimentel and Clara Meister and Jason Eisner and Ryan Cotterell,http://arxiv.org/pdf/2212.10502v1
http://arxiv.org/abs/2301.06527v1,creativecommons.org/licenses/by/4.0/,XNLI 2.0: Improving XNLI dataset and performance on Cross Lingual Understanding (XLU),Ankit Kumar Upadhyay and Harsit Kumar Upadhya,http://arxiv.org/pdf/2301.06527v1
http://arxiv.org/abs/1910.06426v1,creativecommons.org/licenses/by/4.0/,Tell-the-difference: Fine-grained Visual Descriptor via a Discriminating Referee,Shuangjie Xu and Feng Xu and Yu Cheng and Pan Zhou,http://arxiv.org/pdf/1910.06426v1
http://arxiv.org/abs/2111.08267v1,creativecommons.org/licenses/by/4.0/,Solving Probability and Statistics Problems by Program Synthesis,Leonard Tang and Elizabeth Ke and Nikhil Singh and Nakul Verma and Iddo Drori,http://arxiv.org/pdf/2111.08267v1
http://arxiv.org/abs/2205.01287v3,creativecommons.org/licenses/by/4.0/,SemAttack: Natural Textual Attacks via Different Semantic Spaces,Boxin Wang and Chejian Xu and Xiangyu Liu and Yu Cheng and Bo Li,http://arxiv.org/pdf/2205.01287v3
http://arxiv.org/abs/2301.05318v1,creativecommons.org/licenses/by/4.0/,Language-Informed Transfer Learning for Embodied Household Activities,Yuqian Jiang and Qiaozi Gao and Govind Thattai and Gaurav Sukhatme,http://arxiv.org/pdf/2301.05318v1
http://arxiv.org/abs/2304.11163v1,creativecommons.org/licenses/by/4.0/,"ChatGPT, Large Language Technologies, and the Bumpy Road of Benefiting Humanity",Atoosa Kasirzadeh,http://arxiv.org/pdf/2304.11163v1
http://arxiv.org/abs/2302.07257v1,creativecommons.org/licenses/by/4.0/,ChatCAD: Interactive Computer-Aided Diagnosis on Medical Image using Large Language Models,Sheng Wang and Zihao Zhao and Xi Ouyang and Qian Wang and Dinggang Shen,http://arxiv.org/pdf/2302.07257v1
http://arxiv.org/abs/1908.05672v5,creativecommons.org/licenses/by/4.0/,Towards Making the Most of BERT in Neural Machine Translation,Jiacheng Yang and Mingxuan Wang and Hao Zhou and Chengqi Zhao and Yong Yu and Weinan Zhang and Lei Li,http://arxiv.org/pdf/1908.05672v5
http://arxiv.org/abs/2006.00671v2,creativecommons.org/licenses/by/4.0/,Conversational Machine Comprehension: a Literature Review,Somil Gupta and Bhanu Pratap Singh Rawat and Hong Yu,http://arxiv.org/pdf/2006.00671v2
http://arxiv.org/abs/2104.08251v1,creativecommons.org/licenses/by/4.0/,proScript: Partially Ordered Scripts Generation via Pre-trained Language Models,Keisuke Sakaguchi and Chandra Bhagavatula and Ronan Le Bras and Niket Tandon and Peter Clark and Yejin Choi,http://arxiv.org/pdf/2104.08251v1
http://arxiv.org/abs/2202.05993v1,creativecommons.org/licenses/by/4.0/,Wav2Vec2.0 on the Edge: Performance Evaluation,Santosh Gondi,http://arxiv.org/pdf/2202.05993v1
http://arxiv.org/abs/2205.11505v1,creativecommons.org/licenses/by/4.0/,What Makes Data-to-Text Generation Hard for Pretrained Language Models?,Moniba Keymanesh and Adrian Benton and Mark Dredze,http://arxiv.org/pdf/2205.11505v1
http://arxiv.org/abs/2205.07065v1,creativecommons.org/licenses/by/4.0/,What do Models Learn From Training on More Than Text? Measuring Visual Commonsense Knowledge,Lovisa Hagström and Richard Johansson,http://arxiv.org/pdf/2205.07065v1
http://arxiv.org/abs/2104.10658v1,creativecommons.org/licenses/by/4.0/,Using GPT-2 to Create Synthetic Data to Improve the Prediction Performance of NLP Machine Learning Classification Models,Dewayne Whitfield,http://arxiv.org/pdf/2104.10658v1
http://arxiv.org/abs/2211.13317v1,creativecommons.org/licenses/by/4.0/,Rank-One Editing of Encoder-Decoder Models,Vikas Raunak and Arul Menezes,http://arxiv.org/pdf/2211.13317v1
http://arxiv.org/abs/2211.04508v1,creativecommons.org/licenses/by/4.0/,SpeechMatrix: A Large-Scale Mined Corpus of Multilingual Speech-to-Speech Translations,Paul-Ambroise Duquenne and Hongyu Gong and Ning Dong and Jingfei Du and Ann Lee and Vedanuj Goswani and Changhan Wang and Juan Pino and Benoît Sagot and Holger Schwenk,http://arxiv.org/pdf/2211.04508v1
http://arxiv.org/abs/2211.14133v1,creativecommons.org/licenses/by/4.0/,PipeFisher: Efficient Training of Large Language Models Using Pipelining and Fisher Information Matrices,Kazuki Osawa and Shigang Li and Torsten Hoefler,http://arxiv.org/pdf/2211.14133v1
http://arxiv.org/abs/2302.08399v5,creativecommons.org/licenses/by/4.0/,Large Language Models Fail on Trivial Alterations to Theory-of-Mind Tasks,Tomer Ullman,http://arxiv.org/pdf/2302.08399v5
http://arxiv.org/abs/1812.01250v1,creativecommons.org/licenses/by/4.0/,Quantification and Analysis of Scientific Language Variation Across Research Fields,Pei Zhou and Muhao Chen and Kai-Wei Chang and Carlo Zaniolo,http://arxiv.org/pdf/1812.01250v1
http://arxiv.org/abs/2003.07019v1,creativecommons.org/licenses/by/4.0/,Key Phrase Classification in Complex Assignments,Manikandan Ravikiran,http://arxiv.org/pdf/2003.07019v1
http://arxiv.org/abs/2005.08314v1,creativecommons.org/licenses/by/4.0/,TaBERT: Pretraining for Joint Understanding of Textual and Tabular Data,Pengcheng Yin and Graham Neubig and Wen-tau Yih and Sebastian Riedel,http://arxiv.org/pdf/2005.08314v1
http://arxiv.org/abs/2009.09870v2,creativecommons.org/licenses/by/4.0/,Content Planning for Neural Story Generation with Aristotelian Rescoring,Seraphina Goldfarb-Tarrant and Tuhin Chakrabarty and Ralph Weischedel and Nanyun Peng,http://arxiv.org/pdf/2009.09870v2
http://arxiv.org/abs/2011.05197v1,creativecommons.org/licenses/by/4.0/,UmBERTo-MTSA @ AcCompl-It: Improving Complexity and Acceptability Prediction with Multi-task Learning on Self-Supervised Annotations,Gabriele Sarti,http://arxiv.org/pdf/2011.05197v1
http://arxiv.org/abs/2012.04332v1,creativecommons.org/licenses/by/4.0/,Facts2Story: Controlling Text Generation by Key Facts,Eyal Orbach and Yoav Goldberg,http://arxiv.org/pdf/2012.04332v1
http://arxiv.org/abs/2102.05766v2,creativecommons.org/licenses/by/4.0/,Fused Acoustic and Text Encoding for Multimodal Bilingual Pretraining and Speech Translation,Renjie Zheng and Junkun Chen and Mingbo Ma and Liang Huang,http://arxiv.org/pdf/2102.05766v2
http://arxiv.org/abs/2103.09535v1,creativecommons.org/licenses/by/4.0/,Towards Few-Shot Fact-Checking via Perplexity,Nayeon Lee and Yejin Bang and Andrea Madotto and Madian Khabsa and Pascale Fung,http://arxiv.org/pdf/2103.09535v1
http://arxiv.org/abs/2104.06999v2,creativecommons.org/licenses/by/4.0/,Detecting Cross-Geographic Biases in Toxicity Modeling on Social Media,Sayan Ghosh and Dylan Baker and David Jurgens and Vinodkumar Prabhakaran,http://arxiv.org/pdf/2104.06999v2
http://arxiv.org/abs/2104.07885v2,creativecommons.org/licenses/by/4.0/,Probing Across Time: What Does RoBERTa Know and When?,Leo Z. Liu and Yizhong Wang and Jungo Kasai and Hannaneh Hajishirzi and Noah A. Smith,http://arxiv.org/pdf/2104.07885v2
http://arxiv.org/abs/2105.11601v2,creativecommons.org/licenses/by/4.0/,Personalized Transformer for Explainable Recommendation,Lei Li and Yongfeng Zhang and Li Chen,http://arxiv.org/pdf/2105.11601v2
http://arxiv.org/abs/2105.14277v3,creativecommons.org/licenses/by/4.0/,Grammar Accuracy Evaluation (GAE): Quantifiable Quantitative Evaluation of Machine Translation Models,Dojun Park and Youngjin Jang and Harksoo Kim,http://arxiv.org/pdf/2105.14277v3
http://arxiv.org/abs/2106.07207v1,creativecommons.org/licenses/by/4.0/,Straight to the Gradient: Learning to Use Novel Tokens for Neural Text Generation,Xiang Lin and Simeng Han and Shafiq Joty,http://arxiv.org/pdf/2106.07207v1
http://arxiv.org/abs/2107.07610v3,creativecommons.org/licenses/by/4.0/,Self-Supervised Contrastive Learning with Adversarial Perturbations for Defending Word Substitution-based Attacks,Zhao Meng and Yihan Dong and Mrinmaya Sachan and Roger Wattenhofer,http://arxiv.org/pdf/2107.07610v3
http://arxiv.org/abs/2107.08582v1,creativecommons.org/licenses/by/4.0/,Bridging the Gap between Language Model and Reading Comprehension: Unsupervised MRC via Self-Supervision,Ning Bian and Xianpei Han and Bo Chen and Hongyu Lin and Ben He and Le Sun,http://arxiv.org/pdf/2107.08582v1
http://arxiv.org/abs/2107.09622v1,creativecommons.org/licenses/by/4.0/,More Parameters? No Thanks!,Zeeshan Khan and Kartheek Akella and Vinay P. Namboodiri and C V Jawahar,http://arxiv.org/pdf/2107.09622v1
http://arxiv.org/abs/2109.06822v2,creativecommons.org/licenses/by/4.0/,LM-Critic: Language Models for Unsupervised Grammatical Error Correction,Michihiro Yasunaga and Jure Leskovec and Percy Liang,http://arxiv.org/pdf/2109.06822v2
http://arxiv.org/abs/2109.10274v2,creativecommons.org/licenses/by/4.0/,The Trade-offs of Domain Adaptation for Neural Language Models,David Grangier and Dan Iter,http://arxiv.org/pdf/2109.10274v2
http://arxiv.org/abs/2109.12788v1,creativecommons.org/licenses/by/4.0/,Multiplicative Position-aware Transformer Models for Language Understanding,Zhiheng Huang and Davis Liang and Peng Xu and Bing Xiang,http://arxiv.org/pdf/2109.12788v1
http://arxiv.org/abs/2110.03111v3,creativecommons.org/licenses/by/4.0/,Cut the CARP: Fishing for zero-shot story evaluation,Shahbuland Matiana and JR Smith and Ryan Teehan and Louis Castricato and Stella Biderman and Leo Gao and Spencer Frazier,http://arxiv.org/pdf/2110.03111v3
http://arxiv.org/abs/2111.13611v1,creativecommons.org/licenses/by/4.0/,Predicting Document Coverage for Relation Extraction,Sneha Singhania and Simon Razniewski and Gerhard Weikum,http://arxiv.org/pdf/2111.13611v1
http://arxiv.org/abs/2111.15417v1,creativecommons.org/licenses/by/4.0/,A Comparative Study of Transformers on Word Sense Disambiguation,Avi Chawla and Nidhi Mulay and Vikas Bishnoi and Gaurav Dhama and Dr. Anil Kumar Singh,http://arxiv.org/pdf/2111.15417v1
http://arxiv.org/abs/2112.00283v1,creativecommons.org/licenses/by/4.0/,Wiki to Automotive: Understanding the Distribution Shift and its impact on Named Entity Recognition,Anmol Nayak and Hari Prasad Timmapathini,http://arxiv.org/pdf/2112.00283v1
http://arxiv.org/abs/2112.07522v2,creativecommons.org/licenses/by/4.0/,LMTurk: Few-Shot Learners as Crowdsourcing Workers in a Language-Model-as-a-Service Framework,Mengjie Zhao and Fei Mi and Yasheng Wang and Minglei Li and Xin Jiang and Qun Liu and Hinrich Schütze,http://arxiv.org/pdf/2112.07522v2
http://arxiv.org/abs/2112.11480v1,creativecommons.org/licenses/by/4.0/,On the Compression of Natural Language Models,Saeed Damadi,http://arxiv.org/pdf/2112.11480v1
http://arxiv.org/abs/2202.02617v1,creativecommons.org/licenses/by/4.0/,Adaptive Fine-Tuning of Transformer-Based Language Models for Named Entity Recognition,Felix Stollenwerk,http://arxiv.org/pdf/2202.02617v1
http://arxiv.org/abs/2202.04728v1,creativecommons.org/licenses/by/4.0/,Predicting Human Similarity Judgments Using Large Language Models,Raja Marjieh and Ilia Sucholutsky and Theodore R. Sumers and Nori Jacoby and Thomas L. Griffiths,http://arxiv.org/pdf/2202.04728v1
http://arxiv.org/abs/2204.07289v1,creativecommons.org/licenses/by/4.0/,Identifying and Measuring Token-Level Sentiment Bias in Pre-trained Language Models with Prompts,Apoorv Garg and Deval Srivastava and Zhiyang Xu and Lifu Huang,http://arxiv.org/pdf/2204.07289v1
http://arxiv.org/abs/2205.06160v2,creativecommons.org/licenses/by/4.0/,Localized Vision-Language Matching for Open-vocabulary Object Detection,Maria A. Bravo and Sudhanshu Mittal and Thomas Brox,http://arxiv.org/pdf/2205.06160v2
http://arxiv.org/abs/2205.07081v1,creativecommons.org/licenses/by/4.0/,GoalNet: Inferring Conjunctive Goal Predicates from Human Plan Demonstrations for Robot Instruction Following,Shreya Sharma and Jigyasa Gupta and Shreshth Tuli and Rohan Paul and Mausam,http://arxiv.org/pdf/2205.07081v1
http://arxiv.org/abs/2208.03711v1,creativecommons.org/licenses/by/4.0/,Vernacular Search Query Translation with Unsupervised Domain Adaptation,Mandar Kulkarni and Nikesh Garera,http://arxiv.org/pdf/2208.03711v1
http://arxiv.org/abs/2209.11000v1,creativecommons.org/licenses/by/4.0/,Selecting Better Samples from Pre-trained LLMs: A Case Study on Question Generation,Xingdi Yuan and Tong Wang and Yen-Hsiang Wang and Emery Fine and Rania Abdelghani and Pauline Lucas and Hélène Sauzéon and Pierre-Yves Oudeyer,http://arxiv.org/pdf/2209.11000v1
http://arxiv.org/abs/2210.04726v1,creativecommons.org/licenses/by/4.0/,Knowledge Prompts: Injecting World Knowledge into Language Models through Soft Prompts,Cicero Nogueira dos Santos and Zhe Dong and Daniel Cer and John Nham and Siamak Shakeri and Jianmo Ni and Yun-hsuan Sung,http://arxiv.org/pdf/2210.04726v1
http://arxiv.org/abs/2210.04964v1,creativecommons.org/licenses/by/4.0/,Generating Executable Action Plans with Environmentally-Aware Language Models,Maitrey Gramopadhye and Daniel Szafir,http://arxiv.org/pdf/2210.04964v1
http://arxiv.org/abs/2210.07323v3,creativecommons.org/licenses/by/4.0/,Experiments on Turkish ASR with Self-Supervised Speech Representation Learning,Ali Safaya and Engin Erzin,http://arxiv.org/pdf/2210.07323v3
http://arxiv.org/abs/2210.07688v2,creativecommons.org/licenses/by/4.0/,Plausible May Not Be Faithful: Probing Object Hallucination in Vision-Language Pre-training,Wenliang Dai and Zihan Liu and Ziwei Ji and Dan Su and Pascale Fung,http://arxiv.org/pdf/2210.07688v2
http://arxiv.org/abs/2211.04126v1,creativecommons.org/licenses/by/4.0/,Conciseness: An Overlooked Language Task,Felix Stahlberg and Aashish Kumar and Chris Alberti and Shankar Kumar,http://arxiv.org/pdf/2211.04126v1
http://arxiv.org/abs/2211.07828v1,creativecommons.org/licenses/by/4.0/,Adaptation Approaches for Nearest Neighbor Language Models,Rishabh Bhardwaj and George Polovets and Monica Sunkara,http://arxiv.org/pdf/2211.07828v1
http://arxiv.org/abs/2212.02437v1,creativecommons.org/licenses/by/4.0/,In-context Examples Selection for Machine Translation,Sweta Agrawal and Chunting Zhou and Mike Lewis and Luke Zettlemoyer and Marjan Ghazvininejad,http://arxiv.org/pdf/2212.02437v1
http://arxiv.org/abs/2212.08120v1,creativecommons.org/licenses/by/4.0/,Injecting Domain Knowledge in Language Models for Task-Oriented Dialogue Systems,Denis Emelin and Daniele Bonadiman and Sawsan Alqahtani and Yi Zhang and Saab Mansour,http://arxiv.org/pdf/2212.08120v1
http://arxiv.org/abs/2212.10535v1,creativecommons.org/licenses/by/4.0/,A Survey of Deep Learning for Mathematical Reasoning,Pan Lu and Liang Qiu and Wenhao Yu and Sean Welleck and Kai-Wei Chang,http://arxiv.org/pdf/2212.10535v1
http://arxiv.org/abs/2212.11185v1,creativecommons.org/licenses/by/4.0/,Entropy- and Distance-Based Predictors From GPT-2 Attention Patterns Predict Reading Times Over and Above GPT-2 Surprisal,Byung-Doh Oh and William Schuler,http://arxiv.org/pdf/2212.11185v1
http://arxiv.org/abs/2212.14882v1,creativecommons.org/licenses/by/4.0/,ChatGPT Makes Medicine Easy to Swallow: An Exploratory Case Study on Simplified Radiology Reports,Katharina Jeblick and Balthasar Schachtner and Jakob Dexl and Andreas Mittermeier and Anna Theresa Stüber and Johanna Topalis and Tobias Weber and Philipp Wesp and Bastian Sabel and Jens Ricke and Michael Ingrisch,http://arxiv.org/pdf/2212.14882v1
http://arxiv.org/abs/2302.06541v1,creativecommons.org/licenses/by/4.0/,Towards Agile Text Classifiers for Everyone,Maximilian Mozes and Jessica Hoffmann and Katrin Tomanek and Muhamed Kouate and Nithum Thain and Ann Yuan and Tolga Bolukbasi and Lucas Dixon,http://arxiv.org/pdf/2302.06541v1
http://arxiv.org/abs/2302.07371v1,creativecommons.org/licenses/by/4.0/,AutoBiasTest: Controllable Sentence Generation for Automated and Open-Ended Social Bias Testing in Language Models,Rafal Kocielnik and Shrimai Prabhumoye and Vivian Zhang and R. Michael Alvarez and Anima Anandkumar,http://arxiv.org/pdf/2302.07371v1
http://arxiv.org/abs/2302.11054v1,creativecommons.org/licenses/by/4.0/,Conversational Text-to-SQL: An Odyssey into State-of-the-Art and Challenges Ahead,Sree Hari Krishnan Parthasarathi and Lu Zeng and Dilek Hakkani-Tur,http://arxiv.org/pdf/2302.11054v1
http://arxiv.org/abs/2303.13547v1,creativecommons.org/licenses/by/4.0/,A comprehensive evaluation of ChatGPT's zero-shot Text-to-SQL capability,Aiwei Liu and Xuming Hu and Lijie Wen and Philip S. Yu,http://arxiv.org/pdf/2303.13547v1
http://arxiv.org/abs/2303.15422v1,creativecommons.org/licenses/by/4.0/,KPEval: Towards Fine-grained Semantic-based Evaluation of Keyphrase Extraction and Generation Systems,Di Wu and Da Yin and Kai-Wei Chang,http://arxiv.org/pdf/2303.15422v1
http://arxiv.org/abs/2102.10958v1,creativecommons.org/licenses/by/4.0/,"Bilingual Language Modeling, A transfer learning technique for Roman Urdu",Usama Khalid and Mirza Omer Beg and Muhammad Umair Arshad,http://arxiv.org/pdf/2102.10958v1
http://arxiv.org/abs/2212.10785v1,creativecommons.org/licenses/by/4.0/,SERENGETI: Massively Multilingual Language Models for Africa,Ife Adebara and AbdelRahim Elmadany and Muhammad Abdul-Mageed and Alcides Alcoba Inciarte,http://arxiv.org/pdf/2212.10785v1
http://arxiv.org/abs/2206.11719v2,creativecommons.org/licenses/by/4.0/,AST-Probe: Recovering abstract syntax trees from hidden representations of pre-trained language models,José Antonio Hernández López and Martin Weyssow and Jesús Sánchez Cuadrado and Houari Sahraoui,http://arxiv.org/pdf/2206.11719v2
http://arxiv.org/abs/2102.04472v1,creativecommons.org/licenses/by/4.0/,PyAutoFit: A Classy Probabilistic Programming Language for Model Composition and Fitting,James. W. Nightingale and Richard G. Hayes and Matthew Griffiths,http://arxiv.org/pdf/2102.04472v1
http://arxiv.org/abs/2201.01845v2,creativecommons.org/licenses/by/4.0/,Data-driven Model Generalizability in Crosslinguistic Low-resource Morphological Segmentation,Zoey Liu and Emily Prud'hommeaux,http://arxiv.org/pdf/2201.01845v2
http://arxiv.org/abs/2201.06170v2,creativecommons.org/licenses/by/4.0/,Evaluation of HTR models without Ground Truth Material,Phillip Benjamin Ströbel and Simon Clematide and Martin Volk and Raphael Schwitter and Tobias Hodel and David Schoch,http://arxiv.org/pdf/2201.06170v2
http://arxiv.org/abs/2302.04870v1,creativecommons.org/licenses/by/4.0/,Offsite-Tuning: Transfer Learning without Full Model,Guangxuan Xiao and Ji Lin and Song Han,http://arxiv.org/pdf/2302.04870v1
http://arxiv.org/abs/2011.04732v1,creativecommons.org/licenses/by/4.0/,CLAR: A Cross-Lingual Argument Regularizer for Semantic Role Labeling,Ishan Jindal and Yunyao Li and Siddhartha Brahma and Huaiyu Zhu,http://arxiv.org/pdf/2011.04732v1
http://arxiv.org/abs/2102.10957v1,creativecommons.org/licenses/by/4.0/,Co-occurrences using Fasttext embeddings for word similarity tasks in Urdu,Usama Khalid and Aizaz Hussain and Muhammad Umair Arshad and Waseem Shahzad and Mirza Omer Beg,http://arxiv.org/pdf/2102.10957v1
http://arxiv.org/abs/2012.07974v3,creativecommons.org/licenses/by/4.0/,A review of on-device fully neural end-to-end automatic speech recognition algorithms,Chanwoo Kim and Dhananjaya Gowda and Dongsoo Lee and Jiyeon Kim and Ankur Kumar and Sungsoo Kim and Abhinav Garg and Changwoo Han,http://arxiv.org/pdf/2012.07974v3 | |
http://arxiv.org/abs/2210.01293v1,creativecommons.org/licenses/by/4.0/,ThinkSum: Probabilistic reasoning over sets using large language models,Batu Ozturkler and Nikolay Malkin and Zhen Wang and Nebojsa Jojic,http://arxiv.org/pdf/2210.01293v1 | |
http://arxiv.org/abs/2101.06351v1,creativecommons.org/licenses/by/4.0/,Weakly-Supervised Hierarchical Models for Predicting Persuasive Strategies in Good-faith Textual Requests,Jiaao Chen and Diyi Yang,http://arxiv.org/pdf/2101.06351v1 | |
http://arxiv.org/abs/2211.15603v3,creativecommons.org/licenses/by/4.0/,Action-GPT: Leveraging Large-scale Language Models for Improved and Generalized Action Generation,Sai Shashank Kalakonda and Shubh Maheshwari and Ravi Kiran Sarvadevabhatla,http://arxiv.org/pdf/2211.15603v3 | |
http://arxiv.org/abs/2302.08722v3,creativecommons.org/licenses/by/4.0/,GPT4MIA: Utilizing Generative Pre-trained Transformer (GPT-3) as A Plug-and-Play Transductive Model for Medical Image Analysis,Yizhe Zhang and Danny Z. Chen,http://arxiv.org/pdf/2302.08722v3 | |
http://arxiv.org/abs/2304.09337v1,creativecommons.org/licenses/by/4.0/,Promptify: Text-to-Image Generation through Interactive Prompt Exploration with Large Language Models,Stephen Brade and Bryan Wang and Mauricio Sousa and Sageev Oore and Tovi Grossman,http://arxiv.org/pdf/2304.09337v1 | |
http://arxiv.org/abs/1908.10747v1,creativecommons.org/licenses/by/4.0/,Language Tasks and Language Games: On Methodology in Current Natural Language Processing Research,David Schlangen,http://arxiv.org/pdf/1908.10747v1 | |
http://arxiv.org/abs/2104.01294v1,creativecommons.org/licenses/by/4.0/,Representations of Language Varieties Are Reliable Given Corpus Similarity Measures,Jonathan Dunn,http://arxiv.org/pdf/2104.01294v1 | |
http://arxiv.org/abs/2108.09814v1,creativecommons.org/licenses/by/4.0/,UzBERT: pretraining a BERT model for Uzbek,B. Mansurov and A. Mansurov,http://arxiv.org/pdf/2108.09814v1 | |
http://arxiv.org/abs/2109.15254v2,creativecommons.org/licenses/by/4.0/,SlovakBERT: Slovak Masked Language Model,Matúš Pikuliak and Štefan Grivalský and Martin Konôpka and Miroslav Blšták and Martin Tamajka and Viktor Bachratý and Marián Šimko and Pavol Balážik and Michal Trnka and Filip Uhlárik,http://arxiv.org/pdf/2109.15254v2 | |
http://arxiv.org/abs/2108.03070v1,creativecommons.org/licenses/by/4.0/,SWSR: A Chinese Dataset and Lexicon for Online Sexism Detection,Aiqi Jiang and Xiaohan Yang and Yang Liu and Arkaitz Zubiaga,http://arxiv.org/pdf/2108.03070v1 | |
http://arxiv.org/abs/2209.04725v1,creativecommons.org/licenses/by/4.0/,Anticipating the Unseen Discrepancy for Vision and Language Navigation,Yujie Lu and Huiliang Zhang and Ping Nie and Weixi Feng and Wenda Xu and Xin Eric Wang and William Yang Wang,http://arxiv.org/pdf/2209.04725v1 | |
http://arxiv.org/abs/2304.04498v2,creativecommons.org/licenses/by/4.0/,Towards Digital Nature: Bridging the Gap between Turing Machine Objects and Linguistic Objects in LLMMs for Universal Interaction of Object-Oriented Descriptions,Yoichi Ochiai and Naruya Kondo and Tatsuki Fushimi,http://arxiv.org/pdf/2304.04498v2 | |
http://arxiv.org/abs/2208.05577v1,creativecommons.org/licenses/by/4.0/,Reducing Retraining by Recycling Parameter-Efficient Prompts,Brian Lester and Joshua Yurtsever and Siamak Shakeri and Noah Constant,http://arxiv.org/pdf/2208.05577v1 | |
http://arxiv.org/abs/2210.16298v1,creativecommons.org/licenses/by/4.0/,Investigating Ensemble Methods for Model Robustness Improvement of Text Classifiers,Jieyu Zhao and Xuezhi Wang and Yao Qin and Jilin Chen and Kai-Wei Chang,http://arxiv.org/pdf/2210.16298v1 | |
http://arxiv.org/abs/2112.01753v2,creativecommons.org/licenses/by/4.0/,Probing Linguistic Information For Logical Inference In Pre-trained Language Models,Zeming Chen and Qiyue Gao,http://arxiv.org/pdf/2112.01753v2 | |
http://arxiv.org/abs/2203.05936v2,creativecommons.org/licenses/by/4.0/,Are discrete units necessary for Spoken Language Modeling?,Tu Anh Nguyen and Benoit Sagot and Emmanuel Dupoux,http://arxiv.org/pdf/2203.05936v2 | |
http://arxiv.org/abs/2108.05652v1,creativecommons.org/licenses/by/4.0/,Modeling Relevance Ranking under the Pre-training and Fine-tuning Paradigm,Lin Bo and Liang Pang and Gang Wang and Jun Xu and XiuQiang He and Ji-Rong Wen,http://arxiv.org/pdf/2108.05652v1 | |
http://arxiv.org/abs/2009.08445v2,creativecommons.org/licenses/by/4.0/,Self-Supervised Meta-Learning for Few-Shot Natural Language Classification Tasks,Trapit Bansal and Rishikesh Jha and Tsendsuren Munkhdalai and Andrew McCallum,http://arxiv.org/pdf/2009.08445v2 | |
http://arxiv.org/abs/2104.06591v2,creativecommons.org/licenses/by/4.0/,Zero-Resource Multi-Dialectal Arabic Natural Language Understanding,Muhammad Khalifa and Hesham Hassan and Aly Fahmy,http://arxiv.org/pdf/2104.06591v2 | |
http://arxiv.org/abs/2108.10561v1,creativecommons.org/licenses/by/4.0/,Taming the Beast: Learning to Control Neural Conversational Models,Andrea Madotto,http://arxiv.org/pdf/2108.10561v1 | |
http://arxiv.org/abs/2109.13620v1,creativecommons.org/licenses/by/4.0/,Cross-lingual Intermediate Fine-tuning improves Dialogue State Tracking,Nikita Moghe and Mark Steedman and Alexandra Birch,http://arxiv.org/pdf/2109.13620v1 | |
http://arxiv.org/abs/2112.02945v1,creativecommons.org/licenses/by/4.0/,Configuration Space Exploration for Digital Printing Systems,Jasper Denkers and Marvin Brunner and Louis van Gool and Eelco Visser,http://arxiv.org/pdf/2112.02945v1 | |
http://arxiv.org/abs/2201.03425v1,creativecommons.org/licenses/by/4.0/,"Towards Trustworthy AutoGrading of Short, Multi-lingual, Multi-type Answers",Johannes Schneider and Robin Richner and Micha Riser,http://arxiv.org/pdf/2201.03425v1 | |
http://arxiv.org/abs/2202.13529v1,creativecommons.org/licenses/by/4.0/,"KMIR: A Benchmark for Evaluating Knowledge Memorization, Identification and Reasoning Abilities of Language Models",Daniel Gao and Yantao Jia and Lei Li and Chengzhen Fu and Zhicheng Dou and Hao Jiang and Xinyu Zhang and Lei Chen and Zhao Cao,http://arxiv.org/pdf/2202.13529v1 | |
http://arxiv.org/abs/2203.07731v1,creativecommons.org/licenses/by/4.0/,Evaluating BERT-based Pre-training Language Models for Detecting Misinformation,Rini Anggrainingsih and Ghulam Mubashar Hassan and Amitava Datta,http://arxiv.org/pdf/2203.07731v1 | |
http://arxiv.org/abs/2204.08405v1,creativecommons.org/licenses/by/4.0/,Zero-shot Entity and Tweet Characterization with Designed Conditional Prompts and Contexts,Sharath Srivatsa and Tushar Mohan and Kumari Neha and Nishchay Malakar and Ponnurangam Kumaraguru and Srinath Srinivasa,http://arxiv.org/pdf/2204.08405v1 | |
http://arxiv.org/abs/2206.00761v2,creativecommons.org/licenses/by/4.0/,On Reinforcement Learning and Distribution Matching for Fine-Tuning Language Models with no Catastrophic Forgetting,Tomasz Korbak and Hady Elsahar and Germán Kruszewski and Marc Dymetman,http://arxiv.org/pdf/2206.00761v2 | |
http://arxiv.org/abs/2209.12786v1,creativecommons.org/licenses/by/4.0/,Do ever larger octopi still amplify reporting biases? Evidence from judgments of typical colour,Fangyu Liu and Julian Martin Eisenschlos and Jeremy R. Cole and Nigel Collier,http://arxiv.org/pdf/2209.12786v1 | |
http://arxiv.org/abs/2303.08033v1,creativecommons.org/licenses/by/4.0/,Large Language Models (GPT) Struggle to Answer Multiple-Choice Questions about Code,Jaromir Savelka and Arav Agarwal and Christopher Bogart and Majd Sakr,http://arxiv.org/pdf/2303.08033v1 | |
http://arxiv.org/abs/2010.01063v1,creativecommons.org/licenses/by/4.0/,Syntax Representation in Word Embeddings and Neural Networks -- A Survey,Tomasz Limisiewicz and David Mareček,http://arxiv.org/pdf/2010.01063v1 | |
http://arxiv.org/abs/2205.06644v1,creativecommons.org/licenses/by/4.0/,Controlling Translation Formality Using Pre-trained Multilingual Language Models,Elijah Rippeth and Sweta Agrawal and Marine Carpuat,http://arxiv.org/pdf/2205.06644v1 | |
http://arxiv.org/abs/2012.08673v2,creativecommons.org/licenses/by/4.0/,A Closer Look at the Robustness of Vision-and-Language Pre-trained Models,Linjie Li and Zhe Gan and Jingjing Liu,http://arxiv.org/pdf/2012.08673v2 | |
http://arxiv.org/abs/2106.12230v1,creativecommons.org/licenses/by/4.0/,Recognising Biomedical Names: Challenges and Solutions,Xiang Dai,http://arxiv.org/pdf/2106.12230v1 | |
http://arxiv.org/abs/2205.11388v1,creativecommons.org/licenses/by/4.0/,StreamingQA: A Benchmark for Adaptation to New Knowledge over Time in Question Answering Models,Adam Liška and Tomáš Kočiský and Elena Gribovskaya and Tayfun Terzi and Eren Sezener and Devang Agrawal and Cyprien de Masson d'Autume and Tim Scholtes and Manzil Zaheer and Susannah Young and Ellen Gilsenan-McMahon and Sophia Austin and Phil Blunsom and Angeliki Lazaridou,http://arxiv.org/pdf/2205.11388v1 | |
http://arxiv.org/abs/2210.07144v1,creativecommons.org/licenses/by/4.0/,Reprogramming Large Pretrained Language Models for Antibody Sequence Infilling,Igor Melnyk and Vijil Chenthamarakshan and Pin-Yu Chen and Payel Das and Amit Dhurandhar and Inkit Padhi and Devleena Das,http://arxiv.org/pdf/2210.07144v1 | |
http://arxiv.org/abs/2303.14070v4,creativecommons.org/licenses/by/4.0/,ChatDoctor: A Medical Chat Model Fine-tuned on LLaMA Model using Medical Domain Knowledge,Yunxiang Li and Zihan Li and Kai Zhang and Ruilong Dan and You Zhang,http://arxiv.org/pdf/2303.14070v4 | |
http://arxiv.org/abs/2212.10947v1,creativecommons.org/licenses/by/4.0/,Parallel Context Windows Improve In-Context Learning of Large Language Models,Nir Ratner and Yoav Levine and Yonatan Belinkov and Ori Ram and Omri Abend and Ehud Karpas and Amnon Shashua and Kevin Leyton-Brown and Yoav Shoham,http://arxiv.org/pdf/2212.10947v1 | |
http://arxiv.org/abs/2304.10014v1,creativecommons.org/licenses/by/4.0/,Physics task development of prospective physics teachers using ChatGPT,Stefan Küchemann and Steffen Steinert and Natalia Revenga and Matthias Schweinberger and Yavuz Dinc and Karina E. Avila and Jochen Kuhn,http://arxiv.org/pdf/2304.10014v1 | |
http://arxiv.org/abs/2304.10691v1,creativecommons.org/licenses/by/4.0/,SkinGPT: A Dermatology Diagnostic System with Vision Large Language Model,Juexiao Zhou and Xin Gao,http://arxiv.org/pdf/2304.10691v1 | |
http://arxiv.org/abs/2203.10326v2,creativecommons.org/licenses/by/4.0/,Pretraining with Artificial Language: Studying Transferable Knowledge in Language Models,Ryokan Ri and Yoshimasa Tsuruoka,http://arxiv.org/pdf/2203.10326v2 | |
http://arxiv.org/abs/2101.09368v2,creativecommons.org/licenses/by/4.0/,Effects of Pre- and Post-Processing on type-based Embeddings in Lexical Semantic Change Detection,Jens Kaiser and Sinan Kurtyigit and Serge Kotchourko and Dominik Schlechtweg,http://arxiv.org/pdf/2101.09368v2 | |
http://arxiv.org/abs/2304.05403v1,creativecommons.org/licenses/by/4.0/,Isolated Sign Language Recognition based on Tree Structure Skeleton Images,David Laines and Gissella Bejarano and Miguel Gonzalez-Mendoza and Gilberto Ochoa-Ruiz,http://arxiv.org/pdf/2304.05403v1 | |
http://arxiv.org/abs/1806.02557v2,creativecommons.org/licenses/by/4.0/,Emoji-Powered Representation Learning for Cross-Lingual Sentiment Classification,Zhenpeng Chen and Sheng Shen and Ziniu Hu and Xuan Lu and Qiaozhu Mei and Xuanzhe Liu,http://arxiv.org/pdf/1806.02557v2 | |
http://arxiv.org/abs/2107.06569v1,creativecommons.org/licenses/by/4.0/,Importance-based Neuron Allocation for Multilingual Neural Machine Translation,Wanying Xie and Yang Feng and Shuhao Gu and Dong Yu,http://arxiv.org/pdf/2107.06569v1 | |
http://arxiv.org/abs/2303.01249v1,creativecommons.org/licenses/by/4.0/,Language-Universal Adapter Learning with Knowledge Distillation for End-to-End Multilingual Speech Recognition,Zhijie Shen and Wu Guo and Bin Gu,http://arxiv.org/pdf/2303.01249v1 | |
http://arxiv.org/abs/2001.05315v1,creativecommons.org/licenses/by/4.0/,A Continuous Space Neural Language Model for Bengali Language,Hemayet Ahmed Chowdhury and Md. Azizul Haque Imon and Anisur Rahman and Aisha Khatun and Md. Saiful Islam,http://arxiv.org/pdf/2001.05315v1 | |
http://arxiv.org/abs/2211.00635v1,creativecommons.org/licenses/by/4.0/,Preserving In-Context Learning ability in Large Language Model Fine-tuning,Yihan Wang and Si Si and Daliang Li and Michal Lukasik and Felix Yu and Cho-Jui Hsieh and Inderjit S Dhillon and Sanjiv Kumar,http://arxiv.org/pdf/2211.00635v1 | |
http://arxiv.org/abs/2210.14353v2,creativecommons.org/licenses/by/4.0/,"RoMQA: A Benchmark for Robust, Multi-evidence, Multi-answer Question Answering",Victor Zhong and Weijia Shi and Wen-tau Yih and Luke Zettlemoyer,http://arxiv.org/pdf/2210.14353v2 | |
http://arxiv.org/abs/2301.11293v1,creativecommons.org/licenses/by/4.0/,Understanding Finetuning for Factual Knowledge Extraction from Language Models,Mehran Kazemi and Sid Mittal and Deepak Ramachandran,http://arxiv.org/pdf/2301.11293v1 | |
http://arxiv.org/abs/2303.01903v2,creativecommons.org/licenses/by/4.0/,Prompting Large Language Models with Answer Heuristics for Knowledge-based Visual Question Answering,Zhenwei Shao and Zhou Yu and Meng Wang and Jun Yu,http://arxiv.org/pdf/2303.01903v2 | |
http://arxiv.org/abs/2303.09128v1,creativecommons.org/licenses/by/4.0/,Exploring Distributional Shifts in Large Language Models for Code Analysis,Shushan Arakelyan and Rocktim Jyoti Das and Yi Mao and Xiang Ren,http://arxiv.org/pdf/2303.09128v1 | |
http://arxiv.org/abs/2303.10431v1,creativecommons.org/licenses/by/4.0/,DeAR: Debiasing Vision-Language Models with Additive Residuals,Ashish Seth and Mayur Hemani and Chirag Agarwal,http://arxiv.org/pdf/2303.10431v1 | |
http://arxiv.org/abs/2303.13217v3,creativecommons.org/licenses/by/4.0/,Fairness-guided Few-shot Prompting for Large Language Models,Huan Ma and Changqing Zhang and Yatao Bian and Lemao Liu and Zhirui Zhang and Peilin Zhao and Shu Zhang and Huazhu Fu and Qinghua Hu and Bingzhe Wu,http://arxiv.org/pdf/2303.13217v3 | |
http://arxiv.org/abs/2303.13379v1,creativecommons.org/licenses/by/4.0/,Practical and Ethical Challenges of Large Language Models in Education: A Systematic Literature Review,Lixiang Yan and Lele Sha and Linxuan Zhao and Yuheng Li and Roberto Martinez-Maldonado and Guanliang Chen and Xinyu Li and Yueqiao Jin and Dragan Gašević,http://arxiv.org/pdf/2303.13379v1 | |
http://arxiv.org/abs/2108.06665v1,creativecommons.org/licenses/by/4.0/,"Accurate, yet inconsistent? Consistency Analysis on Language Understanding Models",Myeongjun Jang and Deuk Sin Kwon and Thomas Lukasiewicz,http://arxiv.org/pdf/2108.06665v1 | |
http://arxiv.org/abs/2012.14005v1,creativecommons.org/licenses/by/4.0/,Neural document expansion for ad-hoc information retrieval,Cheng Tang and Andrew Arnold,http://arxiv.org/pdf/2012.14005v1 | |
http://arxiv.org/abs/2103.15760v2,creativecommons.org/licenses/by/4.0/,Shrinking Bigfoot: Reducing wav2vec 2.0 footprint,Zilun Peng and Akshay Budhkar and Ilana Tuil and Jason Levy and Parinaz Sobhani and Raphael Cohen and Jumana Nassour,http://arxiv.org/pdf/2103.15760v2 | |
http://arxiv.org/abs/2109.08627v1,creativecommons.org/licenses/by/4.0/,Classification-based Quality Estimation: Small and Efficient Models for Real-world Applications,Shuo Sun and Ahmed El-Kishky and Vishrav Chaudhary and James Cross and Francisco Guzmán and Lucia Specia,http://arxiv.org/pdf/2109.08627v1 | |
http://arxiv.org/abs/2205.12113v2,creativecommons.org/licenses/by/4.0/,The Curious Case of Control,Elias Stengel-Eskin and Benjamin Van Durme,http://arxiv.org/pdf/2205.12113v2 | |
http://arxiv.org/abs/2211.02941v3,creativecommons.org/licenses/by/4.0/,Small Language Models for Tabular Data,Benjamin L. Badger,http://arxiv.org/pdf/2211.02941v3 | |
http://arxiv.org/abs/2210.06475v2,creativecommons.org/licenses/by/4.0/,Equi-Tuning: Group Equivariant Fine-Tuning of Pretrained Models,Sourya Basu and Prasanna Sattigeri and Karthikeyan Natesan Ramamurthy and Vijil Chenthamarakshan and Kush R. Varshney and Lav R. Varshney and Payel Das,http://arxiv.org/pdf/2210.06475v2 | |
http://arxiv.org/abs/2212.10696v1,creativecommons.org/licenses/by/4.0/,Analyzing Semantic Faithfulness of Language Models via Input Intervention on Conversational Question Answering,Akshay Chaturvedi and Swarnadeep Bhar and Soumadeep Saha and Utpal Garain and Nicholas Asher,http://arxiv.org/pdf/2212.10696v1 | |
http://arxiv.org/abs/1803.06456v1,creativecommons.org/licenses/by/4.0/,Experiments with Neural Networks for Small and Large Scale Authorship Verification,Marjan Hosseinia and Arjun Mukherjee,http://arxiv.org/pdf/1803.06456v1 | |
http://arxiv.org/abs/2104.05022v2,creativecommons.org/licenses/by/4.0/,WEC: Deriving a Large-scale Cross-document Event Coreference dataset from Wikipedia,Alon Eirew and Arie Cattan and Ido Dagan,http://arxiv.org/pdf/2104.05022v2 | |
http://arxiv.org/abs/2203.08118v3,creativecommons.org/licenses/by/4.0/,Representation Learning for Resource-Constrained Keyphrase Generation,Di Wu and Wasi Uddin Ahmad and Sunipa Dev and Kai-Wei Chang,http://arxiv.org/pdf/2203.08118v3 | |
http://arxiv.org/abs/2210.05793v2,creativecommons.org/licenses/by/4.0/,Comparison of Soft and Hard Target RNN-T Distillation for Large-scale ASR,Dongseong Hwang and Khe Chai Sim and Yu Zhang and Trevor Strohman,http://arxiv.org/pdf/2210.05793v2 | |
http://arxiv.org/abs/1904.00585v2,creativecommons.org/licenses/by/4.0/,Using Similarity Measures to Select Pretraining Data for NER,Xiang Dai and Sarvnaz Karimi and Ben Hachey and Cecile Paris,http://arxiv.org/pdf/1904.00585v2 | |
http://arxiv.org/abs/1911.04286v1,creativecommons.org/licenses/by/4.0/,Deep Contextualized Self-training for Low Resource Dependency Parsing,Guy Rotman and Roi Reichart,http://arxiv.org/pdf/1911.04286v1 | |
http://arxiv.org/abs/2004.12198v2,creativecommons.org/licenses/by/4.0/,Quantifying the Contextualization of Word Representations with Semantic Class Probing,Mengjie Zhao and Philipp Dufter and Yadollah Yaghoobzadeh and Hinrich Schütze,http://arxiv.org/pdf/2004.12198v2 | |
http://arxiv.org/abs/2010.07261v2,creativecommons.org/licenses/by/4.0/,Learning Improvised Chatbots from Adversarial Modifications of Natural Language Feedback,Makesh Narsimhan Sreedhar and Kun Ni and Siva Reddy,http://arxiv.org/pdf/2010.07261v2 | |
http://arxiv.org/abs/2012.00124v1,creativecommons.org/licenses/by/4.0/,Extreme Model Compression for On-device Natural Language Understanding,Kanthashree Mysore Sathyendra and Samridhi Choudhary and Leah Nicolich-Henkin,http://arxiv.org/pdf/2012.00124v1 | |
http://arxiv.org/abs/2012.04446v1,creativecommons.org/licenses/by/4.0/,LAMP: Label Augmented Multimodal Pretraining,Jia Guo and Chen Zhu and Yilun Zhao and Heda Wang and Yao Hu and Xiaofei He and Deng Cai,http://arxiv.org/pdf/2012.04446v1 | |
http://arxiv.org/abs/2103.15335v1,creativecommons.org/licenses/by/4.0/,Changing the Mind of Transformers for Topically-Controllable Language Generation,Haw-Shiuan Chang and Jiaming Yuan and Mohit Iyyer and Andrew McCallum,http://arxiv.org/pdf/2103.15335v1 | |
http://arxiv.org/abs/2108.12848v2,creativecommons.org/licenses/by/4.0/,Span Fine-tuning for Pre-trained Language Models,Rongzhou Bao and Zhuosheng Zhang and Hai Zhao,http://arxiv.org/pdf/2108.12848v2 | |
http://arxiv.org/abs/2109.05190v3,creativecommons.org/licenses/by/4.0/,PoKE: A Prompt-based Knowledge Eliciting Approach for Event Argument Extraction,Jiaju Lin and Qin Chen,http://arxiv.org/pdf/2109.05190v3 | |
http://arxiv.org/abs/2109.07765v1,creativecommons.org/licenses/by/4.0/,"Spanish Biomedical Crawled Corpus: A Large, Diverse Dataset for Spanish Biomedical Language Models",Casimiro Pio Carrino and Jordi Armengol-Estapé and Ona de Gibert Bonet and Asier Gutiérrez-Fandiño and Aitor Gonzalez-Agirre and Martin Krallinger and Marta Villegas,http://arxiv.org/pdf/2109.07765v1 | |
http://arxiv.org/abs/2109.08648v1,creativecommons.org/licenses/by/4.0/,Efficient Measuring of Readability to Improve Documents Accessibility for Arabic Language Learners,Sadik Bessou and Ghozlane Chenni,http://arxiv.org/pdf/2109.08648v1 | |
http://arxiv.org/abs/2110.01799v1,creativecommons.org/licenses/by/4.0/,ContractNLI: A Dataset for Document-level Natural Language Inference for Contracts,Yuta Koreeda and Christopher D. Manning,http://arxiv.org/pdf/2110.01799v1 | |
http://arxiv.org/abs/2110.04541v3,creativecommons.org/licenses/by/4.0/,The Inductive Bias of In-Context Learning: Rethinking Pretraining Example Design,Yoav Levine and Noam Wies and Daniel Jannai and Dan Navon and Yedid Hoshen and Amnon Shashua,http://arxiv.org/pdf/2110.04541v3 | |
http://arxiv.org/abs/2110.08329v2,creativecommons.org/licenses/by/4.0/,Control Prefixes for Parameter-Efficient Text Generation,Jordan Clive and Kris Cao and Marek Rei,http://arxiv.org/pdf/2110.08329v2 | |
http://arxiv.org/abs/2111.06467v2,creativecommons.org/licenses/by/4.0/,SynthBio: A Case Study in Human-AI Collaborative Curation of Text Datasets,Ann Yuan and Daphne Ippolito and Vitaly Nikolaev and Chris Callison-Burch and Andy Coenen and Sebastian Gehrmann,http://arxiv.org/pdf/2111.06467v2 | |
http://arxiv.org/abs/2111.08249v1,creativecommons.org/licenses/by/4.0/,Bengali Handwritten Grapheme Classification: Deep Learning Approach,Tarun Roy and Hasib Hasan and Kowsar Hossain and Masuma Akter Rumi,http://arxiv.org/pdf/2111.08249v1 | |
http://arxiv.org/abs/2203.02167v1,creativecommons.org/licenses/by/4.0/,SimKGC: Simple Contrastive Knowledge Graph Completion with Pre-trained Language Models,Liang Wang and Wei Zhao and Zhuoyu Wei and Jingming Liu,http://arxiv.org/pdf/2203.02167v1 | |
http://arxiv.org/abs/2203.05648v1,creativecommons.org/licenses/by/4.0/,"Contextualized Sensorimotor Norms: multi-dimensional measures of sensorimotor strength for ambiguous English words, in context",Sean Trott and Benjamin Bergen,http://arxiv.org/pdf/2203.05648v1 | |
http://arxiv.org/abs/2203.11171v4,creativecommons.org/licenses/by/4.0/,Self-Consistency Improves Chain of Thought Reasoning in Language Models,Xuezhi Wang and Jason Wei and Dale Schuurmans and Quoc Le and Ed Chi and Sharan Narang and Aakanksha Chowdhery and Denny Zhou,http://arxiv.org/pdf/2203.11171v4 | |
http://arxiv.org/abs/2204.01959v1,creativecommons.org/licenses/by/4.0/,Data Augmentation for Intent Classification with Off-the-shelf Large Language Models,Gaurav Sahu and Pau Rodriguez and Issam H. Laradji and Parmida Atighehchian and David Vazquez and Dzmitry Bahdanau,http://arxiv.org/pdf/2204.01959v1 | |
http://arxiv.org/abs/2204.03031v2,creativecommons.org/licenses/by/4.0/,VALUE: Understanding Dialect Disparity in NLU,Caleb Ziems and Jiaao Chen and Camille Harris and Jessica Anderson and Diyi Yang,http://arxiv.org/pdf/2204.03031v2 | |
http://arxiv.org/abs/2206.11815v1,creativecommons.org/licenses/by/4.0/,Always Keep your Target in Mind: Studying Semantics and Improving Performance of Neural Lexical Substitution,Nikolay Arefyev and Boris Sheludko and Alexander Podolskiy and Alexander Panchenko,http://arxiv.org/pdf/2206.11815v1 | |
http://arxiv.org/abs/2206.14607v1,creativecommons.org/licenses/by/4.0/,NERDA-Con: Extending NER models for Continual Learning -- Integrating Distinct Tasks and Updating Distribution Shifts,Supriti Vijay and Aman Priyanshu,http://arxiv.org/pdf/2206.14607v1 | |
http://arxiv.org/abs/2208.05393v1,creativecommons.org/licenses/by/4.0/,A Quantum Natural Language Processing Approach to Pronoun Resolution,Hadi Wazni and Kin Ian Lo and Lachlan McPheat and Mehrnoosh Sadrzadeh,http://arxiv.org/pdf/2208.05393v1 | |
http://arxiv.org/abs/2208.05596v1,creativecommons.org/licenses/by/4.0/,Finding Reusable Machine Learning Components to Build Programming Language Processing Pipelines,Patrick Flynn and Tristan Vanderbruggen and Chunhua Liao and Pei-Hung Lin and Murali Emani and Xipeng Shen,http://arxiv.org/pdf/2208.05596v1 | |
http://arxiv.org/abs/2208.08195v2,creativecommons.org/licenses/by/4.0/,Benchmarking Compositionality with Formal Languages,Josef Valvoda and Naomi Saphra and Jonathan Rawski and Adina Williams and Ryan Cotterell,http://arxiv.org/pdf/2208.08195v2 | |
http://arxiv.org/abs/2208.12367v2,creativecommons.org/licenses/by/4.0/,A Compact Pretraining Approach for Neural Language Models,Shahriar Golchin and Mihai Surdeanu and Nazgol Tavabi and Ata Kiapour,http://arxiv.org/pdf/2208.12367v2 | |
http://arxiv.org/abs/2208.14652v1,creativecommons.org/licenses/by/4.0/,Unified Knowledge Prompt Pre-training for Customer Service Dialogues,Keqing He and Jingang Wang and Chaobo Sun and Wei Wu,http://arxiv.org/pdf/2208.14652v1 | |
http://arxiv.org/abs/2209.14161v1,creativecommons.org/licenses/by/4.0/,Supervised Contrastive Learning as Multi-Objective Optimization for Fine-Tuning Large Pre-trained Language Models,Youness Moukafih and Mounir Ghogho and Kamel Smaili,http://arxiv.org/pdf/2209.14161v1 | |
http://arxiv.org/abs/2210.02969v3,creativecommons.org/licenses/by/4.0/,Guess the Instruction! Flipped Learning Makes Language Models Stronger Zero-Shot Learners,Seonghyeon Ye and Doyoung Kim and Joel Jang and Joongbo Shin and Minjoon Seo,http://arxiv.org/pdf/2210.02969v3 | |
http://arxiv.org/abs/2210.05245v2,creativecommons.org/licenses/by/4.0/,PatternRank: Leveraging Pretrained Language Models and Part of Speech for Unsupervised Keyphrase Extraction,Tim Schopf and Simon Klimek and Florian Matthes,http://arxiv.org/pdf/2210.05245v2 | |
http://arxiv.org/abs/2210.10325v1,creativecommons.org/licenses/by/4.0/,Improving Stability of Fine-Tuning Pretrained Language Models via Component-Wise Gradient Norm Clipping,Chenghao Yang and Xuezhe Ma,http://arxiv.org/pdf/2210.10325v1 | |
http://arxiv.org/abs/2210.11255v1,creativecommons.org/licenses/by/4.0/,Evidence > Intuition: Transferability Estimation for Encoder Selection,Elisa Bassignana and Max Müller-Eberstein and Mike Zhang and Barbara Plank,http://arxiv.org/pdf/2210.11255v1 | |
http://arxiv.org/abs/2210.11617v1,creativecommons.org/licenses/by/4.0/,Boosting Natural Language Generation from Instructions with Meta-Learning,Budhaditya Deb and Guoqing Zheng and Ahmed Hassan Awadallah,http://arxiv.org/pdf/2210.11617v1 | |
http://arxiv.org/abs/2210.13144v1,creativecommons.org/licenses/by/4.0/,Weak-Supervised Dysarthria-invariant Features for Spoken Language Understanding using an FHVAE and Adversarial Training,Jinzi Qi and Hugo Van hamme,http://arxiv.org/pdf/2210.13144v1 | |
http://arxiv.org/abs/2210.14128v1,creativecommons.org/licenses/by/4.0/,IELM: An Open Information Extraction Benchmark for Pre-Trained Language Models,Chenguang Wang and Xiao Liu and Dawn Song,http://arxiv.org/pdf/2210.14128v1 | |
http://arxiv.org/abs/2210.15224v1,creativecommons.org/licenses/by/4.0/,The Effect of Normalization for Bi-directional Amharic-English Neural Machine Translation,Tadesse Destaw Belay and Atnafu Lambebo Tonja and Olga Kolesnikova and Seid Muhie Yimam and Abinew Ali Ayele and Silesh Bogale Haile and Grigori Sidorov and Alexander Gelbukh,http://arxiv.org/pdf/2210.15224v1 | |
http://arxiv.org/abs/2211.01071v1,creativecommons.org/licenses/by/4.0/,Gradient Knowledge Distillation for Pre-trained Language Models,Lean Wang and Lei Li and Xu Sun,http://arxiv.org/pdf/2211.01071v1 | |
http://arxiv.org/abs/2211.17121v1,creativecommons.org/licenses/by/4.0/,sEHR-CE: Language modelling of structured EHR data for efficient and generalizable patient cohort expansion,Anna Munoz-Farre and Harry Rose and Sera Aylin Cakiroglu,http://arxiv.org/pdf/2211.17121v1 | |
http://arxiv.org/abs/2212.11680v1,creativecommons.org/licenses/by/4.0/,Smooth Sailing: Improving Active Learning for Pre-trained Language Models with Representation Smoothness Analysis,Josip Jukić and Jan Šnajder,http://arxiv.org/pdf/2212.11680v1 | |
http://arxiv.org/abs/2302.03494v8,creativecommons.org/licenses/by/4.0/,A Categorical Archive of ChatGPT Failures,Ali Borji,http://arxiv.org/pdf/2302.03494v8 | |
http://arxiv.org/abs/2303.03628v1,creativecommons.org/licenses/by/4.0/,CoTEVer: Chain of Thought Prompting Annotation Toolkit for Explanation Verification,Seungone Kim and Se June Joo and Yul Jang and Hyungjoo Chae and Jinyoung Yeo,http://arxiv.org/pdf/2303.03628v1 | |
http://arxiv.org/abs/2303.08991v2,creativecommons.org/licenses/by/4.0/,DeltaScore: Story Evaluation with Perturbations,Zhuohan Xie and Miao Li and Trevor Cohn and Jey Han Lau,http://arxiv.org/pdf/2303.08991v2 | |
http://arxiv.org/abs/2303.17650v1,creativecommons.org/licenses/by/4.0/,Comparing Abstractive Summaries Generated by ChatGPT to Real Summaries Through Blinded Reviewers and Text Classification Algorithms,Mayank Soni and Vincent Wade,http://arxiv.org/pdf/2303.17650v1 | |
http://arxiv.org/abs/2304.02828v1,creativecommons.org/licenses/by/4.0/,Uncurated Image-Text Datasets: Shedding Light on Demographic Bias,Noa Garcia and Yusuke Hirota and Yankun Wu and Yuta Nakashima,http://arxiv.org/pdf/2304.02828v1 | |
http://arxiv.org/abs/2304.07830v1,creativecommons.org/licenses/by/4.0/,How does ChatGPT rate sound semantics?,Kai Siedenburg and Charalampos Saitis,http://arxiv.org/pdf/2304.07830v1 | |
http://arxiv.org/abs/2304.10346v1,creativecommons.org/licenses/by/4.0/,Interventional Probing in High Dimensions: An NLI Case Study,Julia Rozanova and Marco Valentino and Lucas Cordeiro and Andre Freitas,http://arxiv.org/pdf/2304.10346v1 | |
http://arxiv.org/abs/2303.12417v2,creativecommons.org/licenses/by/4.0/,CLIP$^2$: Contrastive Language-Image-Point Pretraining from Real-World Point Cloud Data,Yihan Zeng and Chenhan Jiang and Jiageng Mao and Jianhua Han and Chaoqiang Ye and Qingqiu Huang and Dit-Yan Yeung and Zhen Yang and Xiaodan Liang and Hang Xu,http://arxiv.org/pdf/2303.12417v2 | |
http://arxiv.org/abs/1810.11960v2,creativecommons.org/licenses/by/4.0/,Investigation of enhanced Tacotron text-to-speech synthesis systems with self-attention for pitch accent language,Yusuke Yasuda and Xin Wang and Shinji Takaki and Junichi Yamagishi,http://arxiv.org/pdf/1810.11960v2 | |
http://arxiv.org/abs/2106.06090v2,creativecommons.org/licenses/by/4.0/,Graph Neural Networks for Natural Language Processing: A Survey,Lingfei Wu and Yu Chen and Kai Shen and Xiaojie Guo and Hanning Gao and Shucheng Li and Jian Pei and Bo Long,http://arxiv.org/pdf/2106.06090v2 | |
http://arxiv.org/abs/2205.11024v2,creativecommons.org/licenses/by/4.0/,Vector-Quantized Input-Contextualized Soft Prompts for Natural Language Understanding,Rishabh Bhardwaj and Amrita Saha and Steven C. H. Hoi and Soujanya Poria,http://arxiv.org/pdf/2205.11024v2 | |
http://arxiv.org/abs/2010.07665v1,creativecommons.org/licenses/by/4.0/,Diverse Keyphrase Generation with Neural Unlikelihood Training,Hareesh Bahuleyan and Layla El Asri,http://arxiv.org/pdf/2010.07665v1 | |
http://arxiv.org/abs/2102.07983v1,creativecommons.org/licenses/by/4.0/,"FEWS: Large-Scale, Low-Shot Word Sense Disambiguation with the Dictionary",Terra Blevins and Mandar Joshi and Luke Zettlemoyer,http://arxiv.org/pdf/2102.07983v1 | |
http://arxiv.org/abs/2105.07623v2,creativecommons.org/licenses/by/4.0/,Sentence Similarity Based on Contexts,Xiaofei Sun and Yuxian Meng and Xiang Ao and Fei Wu and Tianwei Zhang and Jiwei Li and Chun Fan,http://arxiv.org/pdf/2105.07623v2 | |
http://arxiv.org/abs/2109.06264v3,creativecommons.org/licenses/by/4.0/,Post-OCR Document Correction with large Ensembles of Character Sequence-to-Sequence Models,Juan Ramirez-Orta and Eduardo Xamena and Ana Maguitman and Evangelos Milios and Axel J. Soto,http://arxiv.org/pdf/2109.06264v3 | |
http://arxiv.org/abs/2301.07093v2,creativecommons.org/licenses/by/4.0/,GLIGEN: Open-Set Grounded Text-to-Image Generation,Yuheng Li and Haotian Liu and Qingyang Wu and Fangzhou Mu and Jianwei Yang and Jianfeng Gao and Chunyuan Li and Yong Jae Lee,http://arxiv.org/pdf/2301.07093v2 | |
http://arxiv.org/abs/2301.07543v1,creativecommons.org/licenses/by/4.0/,Large Language Models as Simulated Economic Agents: What Can We Learn from Homo Silicus?,John J. Horton,http://arxiv.org/pdf/2301.07543v1 | |
http://arxiv.org/abs/2109.03537v2,creativecommons.org/licenses/by/4.0/,On the Transferability of Pre-trained Language Models: A Study from Artificial Datasets,Cheng-Han Chiang and Hung-yi Lee,http://arxiv.org/pdf/2109.03537v2 | |
http://arxiv.org/abs/2209.06422v1,creativecommons.org/licenses/by/4.0/,Language Chameleon: Transformation analysis between languages using Cross-lingual Post-training based on Pre-trained language models,Suhyune Son and Chanjun Park and Jungseob Lee and Midan Shim and Chanhee Lee and Yoonna Jang and Jaehyung Seo and Heuiseok Lim,http://arxiv.org/pdf/2209.06422v1 | |
http://arxiv.org/abs/2303.18232v1,creativecommons.org/licenses/by/4.0/,DIME-FM: DIstilling Multimodal and Efficient Foundation Models,Ximeng Sun and Pengchuan Zhang and Peizhao Zhang and Hardik Shah and Kate Saenko and Xide Xia,http://arxiv.org/pdf/2303.18232v1 | |
http://arxiv.org/abs/1606.06361v2,creativecommons.org/licenses/by/4.0/,A Probabilistic Generative Grammar for Semantic Parsing,Abulhair Saparov,http://arxiv.org/pdf/1606.06361v2 | |
http://arxiv.org/abs/2107.00650v2,creativecommons.org/licenses/by/4.0/,CLIP-It! Language-Guided Video Summarization,Medhini Narasimhan and Anna Rohrbach and Trevor Darrell,http://arxiv.org/pdf/2107.00650v2 | |
http://arxiv.org/abs/2112.04426v3,creativecommons.org/licenses/by/4.0/,Improving language models by retrieving from trillions of tokens,Sebastian Borgeaud and Arthur Mensch and Jordan Hoffmann and Trevor Cai and Eliza Rutherford and Katie Millican and George van den Driessche and Jean-Baptiste Lespiau and Bogdan Damoc and Aidan Clark and Diego de Las Casas and Aurelia Guy and Jacob Menick and Roman Ring and Tom Hennigan and Saffron Huang and Loren Maggiore and Chris Jones and Albin Cassirer and Andy Brock and Michela Paganini and Geoffrey Irving and Oriol Vinyals and Simon Osindero and Karen Simonyan and Jack W. Rae and Erich Elsen and Laurent Sifre,http://arxiv.org/pdf/2112.04426v3 | |
http://arxiv.org/abs/2203.09590v4,creativecommons.org/licenses/by/4.0/,Enhanced Temporal Knowledge Embeddings with Contextualized Language Representations,Zhen Han and Ruotong Liao and Beiyan Liu and Yao Zhang and Zifeng Ding and Jindong Gu and Heinz Köppl and Hinrich Schütze and Volker Tresp,http://arxiv.org/pdf/2203.09590v4 | |
http://arxiv.org/abs/2203.14267v2,creativecommons.org/licenses/by/4.0/,bitsa_nlp@LT-EDI-ACL2022: Leveraging Pretrained Language Models for Detecting Homophobia and Transphobia in Social Media Comments,Vitthal Bhandari and Poonam Goyal,http://arxiv.org/pdf/2203.14267v2 | |
http://arxiv.org/abs/2204.02123v1,creativecommons.org/licenses/by/4.0/,Improved and Efficient Conversational Slot Labeling through Question Answering,Gabor Fuisz and Ivan Vulić and Samuel Gibbons and Inigo Casanueva and Paweł Budzianowski,http://arxiv.org/pdf/2204.02123v1 | |
http://arxiv.org/abs/2303.03697v1,creativecommons.org/licenses/by/4.0/,Stylometric Detection of AI-Generated Text in Twitter Timelines,Tharindu Kumarage and Joshua Garland and Amrita Bhattacharjee and Kirill Trapeznikov and Scott Ruston and Huan Liu,http://arxiv.org/pdf/2303.03697v1 | |
http://arxiv.org/abs/2303.03836v2,creativecommons.org/licenses/by/4.0/,Exploring the Feasibility of ChatGPT for Event Extraction,Jun Gao and Huan Zhao and Changlong Yu and Ruifeng Xu,http://arxiv.org/pdf/2303.03836v2 | |
http://arxiv.org/abs/2303.14822v1,creativecommons.org/licenses/by/4.0/,MGTBench: Benchmarking Machine-Generated Text Detection,Xinlei He and Xinyue Shen and Zeyuan Chen and Michael Backes and Yang Zhang,http://arxiv.org/pdf/2303.14822v1 | |
http://arxiv.org/abs/2304.03843v1,creativecommons.org/licenses/by/4.0/,Why think step-by-step? Reasoning emerges from the locality of experience,Ben Prystawski and Noah D. Goodman,http://arxiv.org/pdf/2304.03843v1 | |
http://arxiv.org/abs/2212.13138v1,creativecommons.org/licenses/by/4.0/,Large Language Models Encode Clinical Knowledge,Karan Singhal and Shekoofeh Azizi and Tao Tu and S. Sara Mahdavi and Jason Wei and Hyung Won Chung and Nathan Scales and Ajay Tanwani and Heather Cole-Lewis and Stephen Pfohl and Perry Payne and Martin Seneviratne and Paul Gamble and Chris Kelly and Nathanael Scharli and Aakanksha Chowdhery and Philip Mansfield and Blaise Aguera y Arcas and Dale Webster and Greg S. Corrado and Yossi Matias and Katherine Chou and Juraj Gottweis and Nenad Tomasev and Yun Liu and Alvin Rajkomar and Joelle Barral and Christopher Semturs and Alan Karthikesalingam and Vivek Natarajan,http://arxiv.org/pdf/2212.13138v1 | |
http://arxiv.org/abs/2208.02402v2,creativecommons.org/licenses/by/4.0/,Fusing Sentence Embeddings Into LSTM-based Autoregressive Language Models,Vilém Zouhar and Marius Mosbach and Dietrich Klakow,http://arxiv.org/pdf/2208.02402v2 | |
http://arxiv.org/abs/2103.13275v1,creativecommons.org/licenses/by/4.0/,When Word Embeddings Become Endangered,Khalid Alnajjar,http://arxiv.org/pdf/2103.13275v1 | |
http://arxiv.org/abs/2111.07180v1,creativecommons.org/licenses/by/4.0/,Explainable Semantic Space by Grounding Language to Vision with Cross-Modal Contrastive Learning,Yizhen Zhang and Minkyu Choi and Kuan Han and Zhongming Liu,http://arxiv.org/pdf/2111.07180v1 | |
http://arxiv.org/abs/2107.10137v2,creativecommons.org/licenses/by/4.0/,Improved Text Classification via Contrastive Adversarial Training,Lin Pan and Chung-Wei Hang and Avirup Sil and Saloni Potdar,http://arxiv.org/pdf/2107.10137v2 | |
http://arxiv.org/abs/1903.04739v1,creativecommons.org/licenses/by/4.0/,Syllable-based Neural Named Entity Recognition for Myanmar Language,Hsu Myat Mo and Khin Mar Soe,http://arxiv.org/pdf/1903.04739v1 | |
http://arxiv.org/abs/2102.00894v1,creativecommons.org/licenses/by/4.0/,Multilingual LAMA: Investigating Knowledge in Multilingual Pretrained Language Models,Nora Kassner and Philipp Dufter and Hinrich Schütze,http://arxiv.org/pdf/2102.00894v1 | |
http://arxiv.org/abs/2106.06683v2,creativecommons.org/licenses/by/4.0/,Assessing Multilingual Fairness in Pre-trained Multimodal Representations,Jialu Wang and Yang Liu and Xin Eric Wang,http://arxiv.org/pdf/2106.06683v2 | |
http://arxiv.org/abs/2210.00066v1,creativecommons.org/licenses/by/4.0/,Improving Policy Learning via Language Dynamics Distillation,Victor Zhong and Jesse Mu and Luke Zettlemoyer and Edward Grefenstette and Tim Rocktäschel,http://arxiv.org/pdf/2210.00066v1 | |
http://arxiv.org/abs/2108.03739v2,creativecommons.org/licenses/by/4.0/,Machine Translation of Low-Resource Indo-European Languages,Wei-Rui Chen and Muhammad Abdul-Mageed,http://arxiv.org/pdf/2108.03739v2 | |
http://arxiv.org/abs/2104.09400v1,creativecommons.org/licenses/by/4.0/,Probing for Bridging Inference in Transformer Language Models,Onkar Pandit and Yufang Hou,http://arxiv.org/pdf/2104.09400v1 | |
http://arxiv.org/abs/2202.09955v2,creativecommons.org/licenses/by/4.0/,StyleBERT: Chinese pretraining by font style information,Chao Lv and Han Zhang and XinKai Du and Yunhao Zhang and Ying Huang and Wenhao Li and Jia Han and Shanshan Gu,http://arxiv.org/pdf/2202.09955v2 | |
http://arxiv.org/abs/2303.13310v1,creativecommons.org/licenses/by/4.0/,SwissBERT: The Multilingual Language Model for Switzerland,Jannis Vamvas and Johannes Graën and Rico Sennrich,http://arxiv.org/pdf/2303.13310v1 | |
http://arxiv.org/abs/2110.07304v1,creativecommons.org/licenses/by/4.0/,An Empirical Investigation of Multi-bridge Multilingual NMT models,Anoop Kunchukuttan,http://arxiv.org/pdf/2110.07304v1 | |
http://arxiv.org/abs/2208.11194v1,creativecommons.org/licenses/by/4.0/,Bitext Mining for Low-Resource Languages via Contrastive Learning,Weiting Tan and Philipp Koehn,http://arxiv.org/pdf/2208.11194v1 | |
http://arxiv.org/abs/2103.15877v1,creativecommons.org/licenses/by/4.0/,Unsupervised Machine Translation On Dravidian Languages,Sai Koneru and Danni Liu and Jan Niehues,http://arxiv.org/pdf/2103.15877v1 | |
http://arxiv.org/abs/2110.15943v2,creativecommons.org/licenses/by/4.0/,MetaICL: Learning to Learn In Context,Sewon Min and Mike Lewis and Luke Zettlemoyer and Hannaneh Hajishirzi,http://arxiv.org/pdf/2110.15943v2 | |
http://arxiv.org/abs/2111.10337v2,creativecommons.org/licenses/by/4.0/,Advancing High-Resolution Video-Language Representation with Large-Scale Video Transcriptions,Hongwei Xue and Tiankai Hang and Yanhong Zeng and Yuchong Sun and Bei Liu and Huan Yang and Jianlong Fu and Baining Guo,http://arxiv.org/pdf/2111.10337v2 | |
http://arxiv.org/abs/2202.12837v2,creativecommons.org/licenses/by/4.0/,Rethinking the Role of Demonstrations: What Makes In-Context Learning Work?,Sewon Min and Xinxi Lyu and Ari Holtzman and Mikel Artetxe and Mike Lewis and Hannaneh Hajishirzi and Luke Zettlemoyer,http://arxiv.org/pdf/2202.12837v2 | |
http://arxiv.org/abs/2302.12813v3,creativecommons.org/licenses/by/4.0/,Check Your Facts and Try Again: Improving Large Language Models with External Knowledge and Automated Feedback,Baolin Peng and Michel Galley and Pengcheng He and Hao Cheng and Yujia Xie and Yu Hu and Qiuyuan Huang and Lars Liden and Zhou Yu and Weizhu Chen and Jianfeng Gao,http://arxiv.org/pdf/2302.12813v3 | |
http://arxiv.org/abs/2304.10428v1,creativecommons.org/licenses/by/4.0/,GPT-NER: Named Entity Recognition via Large Language Models,Shuhe Wang and Xiaofei Sun and Xiaoya Li and Rongbin Ouyang and Fei Wu and Tianwei Zhang and Jiwei Li and Guoyin Wang,http://arxiv.org/pdf/2304.10428v1 | |
http://arxiv.org/abs/2101.00419v2,creativecommons.org/licenses/by/4.0/,KM-BART: Knowledge Enhanced Multimodal BART for Visual Commonsense Generation,Yiran Xing and Zai Shi and Zhao Meng and Gerhard Lakemeyer and Yunpu Ma and Roger Wattenhofer,http://arxiv.org/pdf/2101.00419v2 | |
http://arxiv.org/abs/2110.02067v1,creativecommons.org/licenses/by/4.0/,Teach Me What to Say and I Will Learn What to Pick: Unsupervised Knowledge Selection Through Response Generation with Pretrained Generative Models,Ehsan Lotfi and Maxime De Bruyn and Jeska Buhmann and Walter Daelemans,http://arxiv.org/pdf/2110.02067v1 | |
http://arxiv.org/abs/2112.08616v1,creativecommons.org/licenses/by/4.0/,Masked Measurement Prediction: Learning to Jointly Predict Quantities and Units from Textual Context,Daniel Spokoyny and Ivan Lee and Zhao Jin and Taylor Berg-Kirkpatrick,http://arxiv.org/pdf/2112.08616v1 | |
http://arxiv.org/abs/2112.11909v2,creativecommons.org/licenses/by/4.0/,Few-shot Multi-hop Question Answering over Knowledge Base,Meihao Fan and Lei Zhang and Siyao Xiao and Yuru Liang,http://arxiv.org/pdf/2112.11909v2 | |
http://arxiv.org/abs/2211.08473v1,creativecommons.org/licenses/by/4.0/,On the Compositional Generalization Gap of In-Context Learning,Arian Hosseini and Ankit Vani and Dzmitry Bahdanau and Alessandro Sordoni and Aaron Courville,http://arxiv.org/pdf/2211.08473v1 | |
http://arxiv.org/abs/2211.14865v2,creativecommons.org/licenses/by/4.0/,Understanding BLOOM: An empirical study on diverse NLP tasks,Parag Pravin Dakle and SaiKrishna Rallabandi and Preethi Raghavan,http://arxiv.org/pdf/2211.14865v2 | |
http://arxiv.org/abs/2210.08402v1,creativecommons.org/licenses/by/4.0/,LAION-5B: An open large-scale dataset for training next generation image-text models,Christoph Schuhmann and Romain Beaumont and Richard Vencu and Cade Gordon and Ross Wightman and Mehdi Cherti and Theo Coombes and Aarush Katta and Clayton Mullis and Mitchell Wortsman and Patrick Schramowski and Srivatsa Kundurthy and Katherine Crowson and Ludwig Schmidt and Robert Kaczmarczyk and Jenia Jitsev,http://arxiv.org/pdf/2210.08402v1 | |
http://arxiv.org/abs/1707.06480v1,creativecommons.org/licenses/by/4.0/,Syllable-aware Neural Language Models: A Failure to Beat Character-aware Ones,Zhenisbek Assylbekov and Rustem Takhanov and Bagdat Myrzakhmetov and Jonathan N. Washington,http://arxiv.org/pdf/1707.06480v1 | |
http://arxiv.org/abs/2012.05776v3,creativecommons.org/licenses/by/4.0/,Multi-Sense Language Modelling,Andrea Lekkas and Peter Schneider-Kamp and Isabelle Augenstein,http://arxiv.org/pdf/2012.05776v3 | |
http://arxiv.org/abs/2012.12543v2,creativecommons.org/licenses/by/4.0/,Code Switching Language Model Using Monolingual Training Data,Asad Ullah and Tauseef Ahmed,http://arxiv.org/pdf/2012.12543v2 | |
http://arxiv.org/abs/2203.16512v2,creativecommons.org/licenses/by/4.0/,Vakyansh: ASR Toolkit for Low Resource Indic languages,Harveen Singh Chadha and Anirudh Gupta and Priyanshi Shah and Neeraj Chhimwal and Ankur Dhuriya and Rishabh Gaur and Vivek Raghavan,http://arxiv.org/pdf/2203.16512v2 | |
http://arxiv.org/abs/2212.10440v1,creativecommons.org/licenses/by/4.0/,Perplexed by Quality: A Perplexity-based Method for Adult and Harmful Content Detection in Multilingual Heterogeneous Web Data,Tim Jansen and Yangling Tong and Victoria Zevallos and Pedro Ortiz Suarez,http://arxiv.org/pdf/2212.10440v1 | |
http://arxiv.org/abs/2303.05453v1,creativecommons.org/licenses/by/4.0/,Personalisation within bounds: A risk taxonomy and policy framework for the alignment of large language models with personalised feedback,Hannah Rose Kirk and Bertie Vidgen and Paul Röttger and Scott A. Hale,http://arxiv.org/pdf/2303.05453v1 | |
http://arxiv.org/abs/2210.15042v3,creativecommons.org/licenses/by/4.0/,Privately Fine-Tuning Large Language Models with Differential Privacy,Rouzbeh Behnia and Mohammadreza Ebrahimi and Jason Pacheco and Balaji Padmanabhan,http://arxiv.org/pdf/2210.15042v3 | |
http://arxiv.org/abs/2110.08207v3,creativecommons.org/licenses/by/4.0/,Multitask Prompted Training Enables Zero-Shot Task Generalization,Victor Sanh and Albert Webson and Colin Raffel and Stephen H. Bach and Lintang Sutawika and Zaid Alyafeai and Antoine Chaffin and Arnaud Stiegler and Teven Le Scao and Arun Raja and Manan Dey and M Saiful Bari and Canwen Xu and Urmish Thakker and Shanya Sharma Sharma and Eliza Szczechla and Taewoon Kim and Gunjan Chhablani and Nihal Nayak and Debajyoti Datta and Jonathan Chang and Mike Tian-Jian Jiang and Han Wang and Matteo Manica and Sheng Shen and Zheng Xin Yong and Harshit Pandey and Rachel Bawden and Thomas Wang and Trishala Neeraj and Jos Rozen and Abheesht Sharma and Andrea Santilli and Thibault Fevry and Jason Alan Fries and Ryan Teehan and Tali Bers and Stella Biderman and Leo Gao and Thomas Wolf and Alexander M. Rush,http://arxiv.org/pdf/2110.08207v3 | |
http://arxiv.org/abs/2207.07411v1,creativecommons.org/licenses/by/4.0/,Plex: Towards Reliability using Pretrained Large Model Extensions,Dustin Tran and Jeremiah Liu and Michael W. Dusenberry and Du Phan and Mark Collier and Jie Ren and Kehang Han and Zi Wang and Zelda Mariet and Huiyi Hu and Neil Band and Tim G. J. Rudner and Karan Singhal and Zachary Nado and Joost van Amersfoort and Andreas Kirsch and Rodolphe Jenatton and Nithum Thain and Honglin Yuan and Kelly Buchanan and Kevin Murphy and D. Sculley and Yarin Gal and Zoubin Ghahramani and Jasper Snoek and Balaji Lakshminarayanan,http://arxiv.org/pdf/2207.07411v1 | |
http://arxiv.org/abs/2110.05877v1,creativecommons.org/licenses/by/4.0/,OpenHands: Making Sign Language Recognition Accessible with Pose-based Pretrained Models across Languages,Prem Selvaraj and Gokul NC and Pratyush Kumar and Mitesh Khapra,http://arxiv.org/pdf/2110.05877v1 | |
http://arxiv.org/abs/2302.05016v2,creativecommons.org/licenses/by/4.0/,Is Multimodal Vision Supervision Beneficial to Language?,Avinash Madasu and Vasudev Lal,http://arxiv.org/pdf/2302.05016v2 | |
http://arxiv.org/abs/2205.07307v1,creativecommons.org/licenses/by/4.0/,Optimization of Decision Tree Evaluation Using SIMD Instructions,Alexey Mironov and Ilnur Khuziev,http://arxiv.org/pdf/2205.07307v1 | |
http://arxiv.org/abs/2211.15271v2,creativecommons.org/licenses/by/4.0/,The Myth of Culturally Agnostic AI Models,Eva Cetinic,http://arxiv.org/pdf/2211.15271v2 | |
http://arxiv.org/abs/1708.00415v2,creativecommons.org/licenses/by/4.0/,A Generative Parser with a Discriminative Recognition Algorithm,Jianpeng Cheng and Adam Lopez and Mirella Lapata,http://arxiv.org/pdf/1708.00415v2 | |
http://arxiv.org/abs/2103.16716v1,creativecommons.org/licenses/by/4.0/,"BASE Layers: Simplifying Training of Large, Sparse Models",Mike Lewis and Shruti Bhosale and Tim Dettmers and Naman Goyal and Luke Zettlemoyer,http://arxiv.org/pdf/2103.16716v1 | |
http://arxiv.org/abs/2107.03176v1,creativecommons.org/licenses/by/4.0/,On Training Instance Selection for Few-Shot Neural Text Generation,Ernie Chang and Xiaoyu Shen and Hui-Syuan Yeh and Vera Demberg,http://arxiv.org/pdf/2107.03176v1 | |
http://arxiv.org/abs/2203.08568v3,creativecommons.org/licenses/by/4.0/,In-Context Learning for Few-Shot Dialogue State Tracking,Yushi Hu and Chia-Hsuan Lee and Tianbao Xie and Tao Yu and Noah A. Smith and Mari Ostendorf,http://arxiv.org/pdf/2203.08568v3 | |
http://arxiv.org/abs/2205.09685v1,creativecommons.org/licenses/by/4.0/,ArabGlossBERT: Fine-Tuning BERT on Context-Gloss Pairs for WSD,Moustafa Al-Hajj and Mustafa Jarrar,http://arxiv.org/pdf/2205.09685v1 | |
http://arxiv.org/abs/2205.11194v1,creativecommons.org/licenses/by/4.0/,UnifieR: A Unified Retriever for Large-Scale Retrieval,Tao Shen and Xiubo Geng and Chongyang Tao and Can Xu and Kai Zhang and Daxin Jiang,http://arxiv.org/pdf/2205.11194v1 | |
http://arxiv.org/abs/2212.01558v1,creativecommons.org/licenses/by/4.0/,PartSLIP: Low-Shot Part Segmentation for 3D Point Clouds via Pretrained Image-Language Models,Minghua Liu and Yinhao Zhu and Hong Cai and Shizhong Han and Zhan Ling and Fatih Porikli and Hao Su,http://arxiv.org/pdf/2212.01558v1 | |
http://arxiv.org/abs/2212.14149v1,creativecommons.org/licenses/by/4.0/,Macro-block dropout for improved regularization in training end-to-end speech recognition models,Chanwoo Kim and Sathish Indurti and Jinhwan Park and Wonyong Sung,http://arxiv.org/pdf/2212.14149v1 | |
http://arxiv.org/abs/2301.01820v3,creativecommons.org/licenses/by/4.0/,InPars-v2: Large Language Models as Efficient Dataset Generators for Information Retrieval,Vitor Jeronymo and Luiz Bonifacio and Hugo Abonizio and Marzieh Fadaee and Roberto Lotufo and Jakub Zavrel and Rodrigo Nogueira,http://arxiv.org/pdf/2301.01820v3 | |
http://arxiv.org/abs/2304.11015v1,creativecommons.org/licenses/by/4.0/,DIN-SQL: Decomposed In-Context Learning of Text-to-SQL with Self-Correction,Mohammadreza Pourreza and Davood Rafiei,http://arxiv.org/pdf/2304.11015v1 | |
http://arxiv.org/abs/2101.04566v1,creativecommons.org/licenses/by/4.0/,Frequency Limited $\mathcal{H}_2$ Optimal Model Reduction of Large-Scale Sparse Dynamical Systems,Xin Du and M. Monir Uddin and A. Mostakim Fony and Md. Tanzim Hossain and Mohammad Sahadat-Hossain,http://arxiv.org/pdf/2101.04566v1 | |
http://arxiv.org/abs/1910.03806v1,creativecommons.org/licenses/by/4.0/,Is Multilingual BERT Fluent in Language Generation?,Samuel Rönnqvist and Jenna Kanerva and Tapio Salakoski and Filip Ginter,http://arxiv.org/pdf/1910.03806v1 | |
http://arxiv.org/abs/1911.07613v1,creativecommons.org/licenses/by/4.0/,A Subword Level Language Model for Bangla Language,Aisha Khatun and Anisur Rahman and Hemayet Ahmed Chowdhury and Md. Saiful Islam and Ayesha Tasnim,http://arxiv.org/pdf/1911.07613v1 | |
http://arxiv.org/abs/2206.10668v1,creativecommons.org/licenses/by/4.0/,BenchCLAMP: A Benchmark for Evaluating Language Models on Semantic Parsing,Subhro Roy and Sam Thomson and Tongfei Chen and Richard Shin and Adam Pauls and Jason Eisner and Benjamin Van Durme,http://arxiv.org/pdf/2206.10668v1 | |
http://arxiv.org/abs/2105.09938v3,creativecommons.org/licenses/by/4.0/,Measuring Coding Challenge Competence With APPS,Dan Hendrycks and Steven Basart and Saurav Kadavath and Mantas Mazeika and Akul Arora and Ethan Guo and Collin Burns and Samir Puranik and Horace He and Dawn Song and Jacob Steinhardt,http://arxiv.org/pdf/2105.09938v3 | |
http://arxiv.org/abs/2205.12302v2,creativecommons.org/licenses/by/4.0/,Garden-Path Traversal in GPT-2,William Jurayj and William Rudman and Carsten Eickhoff,http://arxiv.org/pdf/2205.12302v2 | |
http://arxiv.org/abs/2303.15846v1,creativecommons.org/licenses/by/4.0/,Soft-prompt tuning to predict lung cancer using primary care free-text Dutch medical notes,Auke Elfrink and Iacopo Vagliano and Ameen Abu-Hanna and Iacer Calixto,http://arxiv.org/pdf/2303.15846v1 | |
http://arxiv.org/abs/2304.07258v1,creativecommons.org/licenses/by/4.0/,"Learn What Is Possible, Then Choose What Is Best: Disentangling One-To-Many Relations in Language Through Text-based Games",Benjamin Towle and Ke Zhou,http://arxiv.org/pdf/2304.07258v1 | |
http://arxiv.org/abs/2104.08384v2,creativecommons.org/licenses/by/4.0/,"""Wikily"" Supervised Neural Translation Tailored to Cross-Lingual Tasks",Mohammad Sadegh Rasooli and Chris Callison-Burch and Derry Tanti Wijaya,http://arxiv.org/pdf/2104.08384v2 | |
http://arxiv.org/abs/2105.14839v2,creativecommons.org/licenses/by/4.0/,Greedy-layer Pruning: Speeding up Transformer Models for Natural Language Processing,David Peer and Sebastian Stabinger and Stefan Engl and Antonio Rodriguez-Sanchez,http://arxiv.org/pdf/2105.14839v2 | |
http://arxiv.org/abs/2212.10233v1,creativecommons.org/licenses/by/4.0/,Pre-trained Language Models for Keyphrase Generation: A Thorough Empirical Study,Di Wu and Wasi Uddin Ahmad and Kai-Wei Chang,http://arxiv.org/pdf/2212.10233v1 | |
http://arxiv.org/abs/2302.08583v1,creativecommons.org/licenses/by/4.0/,JEIT: Joint End-to-End Model and Internal Language Model Training for Speech Recognition,Zhong Meng and Weiran Wang and Rohit Prabhavalkar and Tara N. Sainath and Tongzhou Chen and Ehsan Variani and Yu Zhang and Bo Li and Andrew Rosenberg and Bhuvana Ramabhadran,http://arxiv.org/pdf/2302.08583v1 | |
http://arxiv.org/abs/2209.09513v2,creativecommons.org/licenses/by/4.0/,Learn to Explain: Multimodal Reasoning via Thought Chains for Science Question Answering,Pan Lu and Swaroop Mishra and Tony Xia and Liang Qiu and Kai-Wei Chang and Song-Chun Zhu and Oyvind Tafjord and Peter Clark and Ashwin Kalyan,http://arxiv.org/pdf/2209.09513v2 | |
http://arxiv.org/abs/2302.07459v2,creativecommons.org/licenses/by/4.0/,The Capacity for Moral Self-Correction in Large Language Models,Deep Ganguli and Amanda Askell and Nicholas Schiefer and Thomas I. Liao and Kamilė Lukošiūtė and Anna Chen and Anna Goldie and Azalia Mirhoseini and Catherine Olsson and Danny Hernandez and Dawn Drain and Dustin Li and Eli Tran-Johnson and Ethan Perez and Jackson Kernion and Jamie Kerr and Jared Mueller and Joshua Landau and Kamal Ndousse and Karina Nguyen and Liane Lovitt and Michael Sellitto and Nelson Elhage and Noemi Mercado and Nova DasSarma and Oliver Rausch and Robert Lasenby and Robin Larson and Sam Ringer and Sandipan Kundu and Saurav Kadavath and Scott Johnston and Shauna Kravec and Sheer El Showk and Tamera Lanham and Timothy Telleen-Lawton and Tom Henighan and Tristan Hume and Yuntao Bai and Zac Hatfield-Dodds and Ben Mann and Dario Amodei and Nicholas Joseph and Sam McCandlish and Tom Brown and Christopher Olah and Jack Clark and Samuel R. Bowman and Jared Kaplan,http://arxiv.org/pdf/2302.07459v2 | |
http://arxiv.org/abs/2201.00075v1,creativecommons.org/licenses/by/4.0/,How do lexical semantics affect translation? An empirical study,Vivek Subramanian and Dhanasekar Sundararaman,http://arxiv.org/pdf/2201.00075v1 | |
http://arxiv.org/abs/2208.10347v1,creativecommons.org/licenses/by/4.0/,A robust class of languages of 2-nested words,Séverine Fratani and Guillaume Maurras and Pierre-Alain Reynier,http://arxiv.org/pdf/2208.10347v1 | |
http://arxiv.org/abs/2211.09110v1,creativecommons.org/licenses/by/4.0/,Holistic Evaluation of Language Models,Percy Liang and Rishi Bommasani and Tony Lee and Dimitris Tsipras and Dilara Soylu and Michihiro Yasunaga and Yian Zhang and Deepak Narayanan and Yuhuai Wu and Ananya Kumar and Benjamin Newman and Binhang Yuan and Bobby Yan and Ce Zhang and Christian Cosgrove and Christopher D. Manning and Christopher Ré and Diana Acosta-Navas and Drew A. Hudson and Eric Zelikman and Esin Durmus and Faisal Ladhak and Frieda Rong and Hongyu Ren and Huaxiu Yao and Jue Wang and Keshav Santhanam and Laurel Orr and Lucia Zheng and Mert Yuksekgonul and Mirac Suzgun and Nathan Kim and Neel Guha and Niladri Chatterji and Omar Khattab and Peter Henderson and Qian Huang and Ryan Chi and Sang Michael Xie and Shibani Santurkar and Surya Ganguli and Tatsunori Hashimoto and Thomas Icard and Tianyi Zhang and Vishrav Chaudhary and William Wang and Xuechen Li and Yifan Mai and Yuhui Zhang and Yuta Koreeda,http://arxiv.org/pdf/2211.09110v1 | |
http://arxiv.org/abs/1812.11549v1,creativecommons.org/licenses/by/4.0/,Visibly Pushdown Languages over Sliding Windows,Moses Ganardi,http://arxiv.org/pdf/1812.11549v1 | |
http://arxiv.org/abs/2102.10535v1,creativecommons.org/licenses/by/4.0/,Automatic Code Generation using Pre-Trained Language Models,Luis Perez and Lizi Ottens and Sudharshan Viswanathan,http://arxiv.org/pdf/2102.10535v1 | |
http://arxiv.org/abs/2204.03951v1,creativecommons.org/licenses/by/4.0/,RuBioRoBERTa: a pre-trained biomedical language model for Russian language biomedical text mining,Alexander Yalunin and Alexander Nesterov and Dmitriy Umerenkov,http://arxiv.org/pdf/2204.03951v1 | |
http://arxiv.org/abs/2005.12656v3,creativecommons.org/licenses/by/4.0/,An open-source voice type classifier for child-centered daylong recordings,Marvin Lavechin and Ruben Bousbib and Hervé Bredin and Emmanuel Dupoux and Alejandrina Cristia,http://arxiv.org/pdf/2005.12656v3 | |
http://arxiv.org/abs/2104.08815v3,creativecommons.org/licenses/by/4.0/,FedNLP: Benchmarking Federated Learning Methods for Natural Language Processing Tasks,Bill Yuchen Lin and Chaoyang He and Zihang Zeng and Hulin Wang and Yufen Huang and Christophe Dupuy and Rahul Gupta and Mahdi Soltanolkotabi and Xiang Ren and Salman Avestimehr,http://arxiv.org/pdf/2104.08815v3 | |
http://arxiv.org/abs/2105.11407v1,creativecommons.org/licenses/by/4.0/,VANiLLa : Verbalized Answers in Natural Language at Large Scale,Debanjali Biswas and Mohnish Dubey and Md Rashad Al Hasan Rony and Jens Lehmann,http://arxiv.org/pdf/2105.11407v1 | |
http://arxiv.org/abs/2111.02574v2,creativecommons.org/licenses/by/4.0/,Contextual Semantic Parsing for Multilingual Task-Oriented Dialogues,Mehrad Moradshahi and Victoria Tsai and Giovanni Campagna and Monica S. Lam,http://arxiv.org/pdf/2111.02574v2 | |
http://arxiv.org/abs/2111.06741v2,creativecommons.org/licenses/by/4.0/,A Quantum Natural Language Processing Approach to Musical Intelligence,Eduardo Reck Miranda and Richie Yeung and Anna Pearson and Konstantinos Meichanetzidis and Bob Coecke,http://arxiv.org/pdf/2111.06741v2 | |
http://arxiv.org/abs/2112.02333v1,creativecommons.org/licenses/by/4.0/,LoNLI: An Extensible Framework for Testing Diverse Logical Reasoning Capabilities for NLI,Ishan Tarunesh and Somak Aditya and Monojit Choudhury,http://arxiv.org/pdf/2112.02333v1 | |
http://arxiv.org/abs/2302.13007v3,creativecommons.org/licenses/by/4.0/,AugGPT: Leveraging ChatGPT for Text Data Augmentation,Haixing Dai and Zhengliang Liu and Wenxiong Liao and Xiaoke Huang and Yihan Cao and Zihao Wu and Lin Zhao and Shaochen Xu and Wei Liu and Ninghao Liu and Sheng Li and Dajiang Zhu and Hongmin Cai and Lichao Sun and Quanzheng Li and Dinggang Shen and Tianming Liu and Xiang Li,http://arxiv.org/pdf/2302.13007v3 | |
http://arxiv.org/abs/2303.12093v2,creativecommons.org/licenses/by/4.0/,ChatGPT for Programming Numerical Methods,Ali Kashefi and Tapan Mukerji,http://arxiv.org/pdf/2303.12093v2 | |
http://arxiv.org/abs/2304.03347v1,creativecommons.org/licenses/by/4.0/,On the Evaluations of ChatGPT and Emotion-enhanced Prompting for Mental Health Analysis,Kailai Yang and Shaoxiong Ji and Tianlin Zhang and Qianqian Xie and Sophia Ananiadou,http://arxiv.org/pdf/2304.03347v1 | |
http://arxiv.org/abs/2111.13792v3,creativecommons.org/licenses/by/4.0/,LAFITE: Towards Language-Free Training for Text-to-Image Generation,Yufan Zhou and Ruiyi Zhang and Changyou Chen and Chunyuan Li and Chris Tensmeyer and Tong Yu and Jiuxiang Gu and Jinhui Xu and Tong Sun,http://arxiv.org/pdf/2111.13792v3
http://arxiv.org/abs/2203.00056v1,creativecommons.org/licenses/by/4.0/,An Empirical Study on Explanations in Out-of-Domain Settings,George Chrysostomou and Nikolaos Aletras,http://arxiv.org/pdf/2203.00056v1
http://arxiv.org/abs/2211.13196v1,creativecommons.org/licenses/by/4.0/,SeedBERT: Recovering Annotator Rating Distributions from an Aggregated Label,Aneesha Sampath and Victoria Lin and Louis-Philippe Morency,http://arxiv.org/pdf/2211.13196v1
http://arxiv.org/abs/1907.09038v1,creativecommons.org/licenses/by/4.0/,Augmenting a BiLSTM tagger with a Morphological Lexicon and a Lexical Category Identification Step,Steinþór Steingrímsson and Örvar Kárason and Hrafn Loftsson,http://arxiv.org/pdf/1907.09038v1
http://arxiv.org/abs/2007.12544v3,creativecommons.org/licenses/by/4.0/,FiSSA at SemEval-2020 Task 9: Fine-tuned For Feelings,Bertelt Braaksma and Richard Scholtens and Stan van Suijlekom and Remy Wang and Ahmet Üstün,http://arxiv.org/pdf/2007.12544v3
http://arxiv.org/abs/2101.00416v2,creativecommons.org/licenses/by/4.0/,Improving Sequence-to-Sequence Pre-training via Sequence Span Rewriting,Wangchunshu Zhou and Tao Ge and Canwen Xu and Ke Xu and Furu Wei,http://arxiv.org/pdf/2101.00416v2
http://arxiv.org/abs/2101.11423v1,creativecommons.org/licenses/by/4.0/,A More Efficient Chinese Named Entity Recognition base on BERT and Syntactic Analysis,Xiao Fu and Guijun Zhang,http://arxiv.org/pdf/2101.11423v1
http://arxiv.org/abs/2106.08770v1,creativecommons.org/licenses/by/4.0/,TSSuBERT: Tweet Stream Summarization Using BERT,Alexis Dusart and Karen Pinel-Sauvagnat and Gilles Hubert,http://arxiv.org/pdf/2106.08770v1
http://arxiv.org/abs/2111.15588v4,creativecommons.org/licenses/by/4.0/,SimpleTRON: Simple Transformer with O(N) Complexity,Uladzislau Yorsh and Alexander Kovalenko and Vojtěch Vančura and Daniel Vašata and Pavel Kordík and Tomáš Mikolov,http://arxiv.org/pdf/2111.15588v4
http://arxiv.org/abs/2303.13506v1,creativecommons.org/licenses/by/4.0/,The Quantization Model of Neural Scaling,Eric J. Michaud and Ziming Liu and Uzay Girit and Max Tegmark,http://arxiv.org/pdf/2303.13506v1
http://arxiv.org/abs/2203.11480v5,creativecommons.org/licenses/by/4.0/,WuDaoMM: A large-scale Multi-Modal Dataset for Pre-training models,Sha Yuan and Shuai Zhao and Jiahong Leng and Zhao Xue and Hanyu Zhao and Peiyu Liu and Zheng Gong and Wayne Xin Zhao and Junyi Li and Jie Tang,http://arxiv.org/pdf/2203.11480v5
http://arxiv.org/abs/2209.08982v1,creativecommons.org/licenses/by/4.0/,How to Adapt Pre-trained Vision-and-Language Models to a Text-only Input?,Lovisa Hagström and Richard Johansson,http://arxiv.org/pdf/2209.08982v1
http://arxiv.org/abs/2212.10923v1,creativecommons.org/licenses/by/4.0/,Language Models as Inductive Reasoners,Zonglin Yang and Li Dong and Xinya Du and Hao Cheng and Erik Cambria and Xiaodong Liu and Jianfeng Gao and Furu Wei,http://arxiv.org/pdf/2212.10923v1
http://arxiv.org/abs/2104.07483v2,creativecommons.org/licenses/by/4.0/,IndT5: A Text-to-Text Transformer for 10 Indigenous Languages,El Moatez Billah Nagoudi and Wei-Rui Chen and Muhammad Abdul-Mageed and Hasan Cavusogl,http://arxiv.org/pdf/2104.07483v2
http://arxiv.org/abs/2107.11976v2,creativecommons.org/licenses/by/4.0/,One Question Answering Model for Many Languages with Cross-lingual Dense Passage Retrieval,Akari Asai and Xinyan Yu and Jungo Kasai and Hannaneh Hajishirzi,http://arxiv.org/pdf/2107.11976v2
http://arxiv.org/abs/2103.11790v3,creativecommons.org/licenses/by/4.0/,Large Pre-trained Language Models Contain Human-like Biases of What is Right and Wrong to Do,Patrick Schramowski and Cigdem Turan and Nico Andersen and Constantin A. Rothkopf and Kristian Kersting,http://arxiv.org/pdf/2103.11790v3
http://arxiv.org/abs/2103.13020v3,creativecommons.org/licenses/by/4.0/,deGraphCS: Embedding Variable-based Flow Graph for Neural Code Search,Chen Zeng and Yue Yu and Shanshan Li and Xin Xia and Zhiming Wang and Mingyang Geng and Bailin Xiao and Wei Dong and Xiangke Liao,http://arxiv.org/pdf/2103.13020v3
http://arxiv.org/abs/2203.08410v3,creativecommons.org/licenses/by/4.0/,Thinking about GPT-3 In-Context Learning for Biomedical IE? Think Again,Bernal Jiménez Gutiérrez and Nikolas McNeal and Clay Washington and You Chen and Lang Li and Huan Sun and Yu Su,http://arxiv.org/pdf/2203.08410v3
http://arxiv.org/abs/2204.00458v2,creativecommons.org/licenses/by/4.0/,Evaluation of Fake News Detection with Knowledge-Enhanced Language Models,Chenxi Whitehouse and Tillman Weyde and Pranava Madhyastha and Nikos Komninos,http://arxiv.org/pdf/2204.00458v2
http://arxiv.org/abs/2205.03767v3,creativecommons.org/licenses/by/4.0/,Context-Aware Abbreviation Expansion Using Large Language Models,Shanqing Cai and Subhashini Venugopalan and Katrin Tomanek and Ajit Narayanan and Meredith Ringel Morris and Michael P. Brenner,http://arxiv.org/pdf/2205.03767v3
http://arxiv.org/abs/2205.12105v2,creativecommons.org/licenses/by/4.0/,HiVLP: Hierarchical Vision-Language Pre-Training for Fast Image-Text Retrieval,Feilong Chen and Xiuyi Chen and Jiaxin Shi and Duzhen Zhang and Jianlong Chang and Qi Tian,http://arxiv.org/pdf/2205.12105v2
http://arxiv.org/abs/2209.02812v1,creativecommons.org/licenses/by/4.0/,Increasing Adverse Drug Events extraction robustness on social media: case study on negation and speculation,Simone Scaboro and Beatrice Portelli and Emmanuele Chersoni and Enrico Santus and Giuseppe Serra,http://arxiv.org/pdf/2209.02812v1
http://arxiv.org/abs/2209.09900v1,creativecommons.org/licenses/by/4.0/,LINGUIST: Language Model Instruction Tuning to Generate Annotated Utterances for Intent Classification and Slot Tagging,Andy Rosenbaum and Saleh Soltan and Wael Hamza and Yannick Versley and Markus Boese,http://arxiv.org/pdf/2209.09900v1
http://arxiv.org/abs/2210.00720v2,creativecommons.org/licenses/by/4.0/,Complexity-Based Prompting for Multi-Step Reasoning,Yao Fu and Hao Peng and Ashish Sabharwal and Peter Clark and Tushar Khot,http://arxiv.org/pdf/2210.00720v2
http://arxiv.org/abs/2210.05675v2,creativecommons.org/licenses/by/4.0/,Transformers generalize differently from information stored in context vs in weights,Stephanie C. Y. Chan and Ishita Dasgupta and Junkyung Kim and Dharshan Kumaran and Andrew K. Lampinen and Felix Hill,http://arxiv.org/pdf/2210.05675v2
http://arxiv.org/abs/2211.09699v3,creativecommons.org/licenses/by/4.0/,PromptCap: Prompt-Guided Task-Aware Image Captioning,Yushi Hu and Hang Hua and Zhengyuan Yang and Weijia Shi and Noah A. Smith and Jiebo Luo,http://arxiv.org/pdf/2211.09699v3
http://arxiv.org/abs/2212.05856v1,creativecommons.org/licenses/by/4.0/,"""I think this is the most disruptive technology"": Exploring Sentiments of ChatGPT Early Adopters using Twitter Data",Mubin Ul Haque and Isuru Dharmadasa and Zarrin Tasnim Sworna and Roshan Namal Rajapakse and Hussain Ahmad,http://arxiv.org/pdf/2212.05856v1
http://arxiv.org/abs/2212.08037v2,creativecommons.org/licenses/by/4.0/,Attributed Question Answering: Evaluation and Modeling for Attributed Large Language Models,Bernd Bohnet and Vinh Q. Tran and Pat Verga and Roee Aharoni and Daniel Andor and Livio Baldini Soares and Massimiliano Ciaramita and Jacob Eisenstein and Kuzman Ganchev and Jonathan Herzig and Kai Hui and Tom Kwiatkowski and Ji Ma and Jianmo Ni and Lierni Sestorain Saralegui and Tal Schuster and William W. Cohen and Michael Collins and Dipanjan Das and Donald Metzler and Slav Petrov and Kellie Webster,http://arxiv.org/pdf/2212.08037v2
http://arxiv.org/abs/2302.08500v1,creativecommons.org/licenses/by/4.0/,Auditing large language models: a three-layered approach,Jakob Mökander and Jonas Schuett and Hannah Rose Kirk and Luciano Floridi,http://arxiv.org/pdf/2302.08500v1
http://arxiv.org/abs/2303.08896v1,creativecommons.org/licenses/by/4.0/,SelfCheckGPT: Zero-Resource Black-Box Hallucination Detection for Generative Large Language Models,Potsawee Manakul and Adian Liusie and Mark J. F. Gales,http://arxiv.org/pdf/2303.08896v1
http://arxiv.org/abs/2107.11020v3,creativecommons.org/licenses/by/4.0/,Emotion analysis and detection during COVID-19,Tiberiu Sosea and Chau Pham and Alexander Tekle and Cornelia Caragea and Junyi Jessy Li,http://arxiv.org/pdf/2107.11020v3
http://arxiv.org/abs/2212.01476v1,creativecommons.org/licenses/by/4.0/,NarraSum: A Large-Scale Dataset for Abstractive Narrative Summarization,Chao Zhao and Faeze Brahman and Kaiqiang Song and Wenlin Yao and Dian Yu and Snigdha Chaturvedi,http://arxiv.org/pdf/2212.01476v1
http://arxiv.org/abs/2212.09257v1,creativecommons.org/licenses/by/4.0/,PromptBoosting: Black-Box Text Classification with Ten Forward Passes,Bairu Hou and Joe O'Connor and Jacob Andreas and Shiyu Chang and Yang Zhang,http://arxiv.org/pdf/2212.09257v1
http://arxiv.org/abs/2303.14770v1,creativecommons.org/licenses/by/4.0/,Koala: An Index for Quantifying Overlaps with Pre-training Corpora,Thuy-Trang Vu and Xuanli He and Gholamreza Haffari and Ehsan Shareghi,http://arxiv.org/pdf/2303.14770v1
http://arxiv.org/abs/2109.01942v2,creativecommons.org/licenses/by/4.0/,On the ability of monolingual models to learn language-agnostic representations,Leandro Rodrigues de Souza and Rodrigo Nogueira and Roberto Lotufo,http://arxiv.org/pdf/2109.01942v2
http://arxiv.org/abs/2301.08913v1,creativecommons.org/licenses/by/4.0/,Unifying Structure Reasoning and Language Model Pre-training for Complex Reasoning,Siyuan Wang and Zhongyu Wei and Jiarong Xu and Zhihao Fan,http://arxiv.org/pdf/2301.08913v1
http://arxiv.org/abs/1912.01072v2,creativecommons.org/licenses/by/4.0/,Leveraging Contextual Embeddings for Detecting Diachronic Semantic Shift,Matej Martinc and Petra Kralj Novak and Senja Pollak,http://arxiv.org/pdf/1912.01072v2
http://arxiv.org/abs/2004.11449v1,creativecommons.org/licenses/by/4.0/,Upgrading the Newsroom: An Automated Image Selection System for News Articles,Fangyu Liu and Rémi Lebret and Didier Orel and Philippe Sordet and Karl Aberer,http://arxiv.org/pdf/2004.11449v1
http://arxiv.org/abs/2008.08547v1,creativecommons.org/licenses/by/4.0/,UoB at SemEval-2020 Task 12: Boosting BERT with Corpus Level Information,Wah Meng Lim and Harish Tayyar Madabushi,http://arxiv.org/pdf/2008.08547v1
http://arxiv.org/abs/2101.11360v1,creativecommons.org/licenses/by/4.0/,An Empirical Study of Cross-Lingual Transferability in Generative Dialogue State Tracker,Yen-Ting Lin and Yun-Nung Chen,http://arxiv.org/pdf/2101.11360v1
http://arxiv.org/abs/2103.01620v2,creativecommons.org/licenses/by/4.0/,Disentangling Syntax and Semantics in the Brain with Deep Networks,Charlotte Caucheteux and Alexandre Gramfort and Jean-Remi King,http://arxiv.org/pdf/2103.01620v2
http://arxiv.org/abs/2103.08993v1,creativecommons.org/licenses/by/4.0/,Fast Development of ASR in African Languages using Self Supervised Speech Representation Learning,Jama Hussein Mohamud and Lloyd Acquaye Thompson and Aissatou Ndoye and Laurent Besacier,http://arxiv.org/pdf/2103.08993v1
http://arxiv.org/abs/2103.10198v2,creativecommons.org/licenses/by/4.0/,Phylogenetic typology,Gerhard Jäger and Johannes Wahle,http://arxiv.org/pdf/2103.10198v2
http://arxiv.org/abs/2105.06027v1,creativecommons.org/licenses/by/4.0/,Towards Human-Free Automatic Quality Evaluation of German Summarization,Neslihan Iskender and Oleg Vasilyev and Tim Polzehl and John Bohannon and Sebastian Möller,http://arxiv.org/pdf/2105.06027v1
http://arxiv.org/abs/2107.01982v1,creativecommons.org/licenses/by/4.0/,The DCU-EPFL Enhanced Dependency Parser at the IWPT 2021 Shared Task,James Barry and Alireza Mohammadshahi and Joachim Wagner and Jennifer Foster and James Henderson,http://arxiv.org/pdf/2107.01982v1
http://arxiv.org/abs/2107.09710v1,creativecommons.org/licenses/by/4.0/,TLA: Twitter Linguistic Analysis,Tushar Sarkar and Nishant Rajadhyaksha,http://arxiv.org/pdf/2107.09710v1
http://arxiv.org/abs/2109.15196v2,creativecommons.org/licenses/by/4.0/,Multilingual AMR Parsing with Noisy Knowledge Distillation,Deng Cai and Xin Li and Jackie Chun-Sing Ho and Lidong Bing and Wai Lam,http://arxiv.org/pdf/2109.15196v2
http://arxiv.org/abs/2110.08559v1,creativecommons.org/licenses/by/4.0/,"FrugalScore: Learning Cheaper, Lighter and Faster Evaluation Metrics for Automatic Text Generation",Moussa Kamal Eddine and Guokan Shang and Antoine J. -P. Tixier and Michalis Vazirgiannis,http://arxiv.org/pdf/2110.08559v1
http://arxiv.org/abs/2111.13662v2,creativecommons.org/licenses/by/4.0/,Modular Information Flow through Ownership,Will Crichton and Marco Patrignani and Maneesh Agrawala and Pat Hanrahan,http://arxiv.org/pdf/2111.13662v2
http://arxiv.org/abs/2204.00743v2,creativecommons.org/licenses/by/4.0/,Entity-Centric Query Refinement,David Wadden and Nikita Gupta and Kenton Lee and Kristina Toutanova,http://arxiv.org/pdf/2204.00743v2
http://arxiv.org/abs/2207.08286v1,creativecommons.org/licenses/by/4.0/,An Overview of Distant Supervision for Relation Extraction with a Focus on Denoising and Pre-training Methods,William Hogan,http://arxiv.org/pdf/2207.08286v1
http://arxiv.org/abs/2208.01009v2,creativecommons.org/licenses/by/4.0/,Few-shot Adaptation Works with UnpredicTable Data,Jun Shern Chan and Michael Pieler and Jonathan Jao and Jérémy Scheurer and Ethan Perez,http://arxiv.org/pdf/2208.01009v2
http://arxiv.org/abs/2208.05559v1,creativecommons.org/licenses/by/4.0/,"Comparing Channel Restrictions of Communicating State Machines, High-level Message Sequence Charts, and Multiparty Session Types",Felix Stutz and Damien Zufferey,http://arxiv.org/pdf/2208.05559v1
http://arxiv.org/abs/2208.14610v2,creativecommons.org/licenses/by/4.0/,The Sparse Abstract Machine,Olivia Hsu and Maxwell Strange and Ritvik Sharma and Jaeyeon Won and Kunle Olukotun and Joel Emer and Mark Horowitz and Fredrik Kjolstad,http://arxiv.org/pdf/2208.14610v2
http://arxiv.org/abs/2209.06049v3,creativecommons.org/licenses/by/4.0/,Pre-training Transformers on Indian Legal Text,Shounak Paul and Arpan Mandal and Pawan Goyal and Saptarshi Ghosh,http://arxiv.org/pdf/2209.06049v3
http://arxiv.org/abs/2211.05596v1,creativecommons.org/licenses/by/4.0/,Prompt Learning for Domain Adaptation in Task-Oriented Dialogue,Makesh Narsimhan Sreedhar and Christopher Parisien,http://arxiv.org/pdf/2211.05596v1
http://arxiv.org/abs/2211.11890v1,creativecommons.org/licenses/by/4.0/,TEMPERA: Test-Time Prompting via Reinforcement Learning,Tianjun Zhang and Xuezhi Wang and Denny Zhou and Dale Schuurmans and Joseph E. Gonzalez,http://arxiv.org/pdf/2211.11890v1
http://arxiv.org/abs/2301.12074v1,creativecommons.org/licenses/by/4.0/,Comparing Intrinsic Gender Bias Evaluation Measures without using Human Annotated Examples,Masahiro Kaneko and Danushka Bollegala and Naoaki Okazaki,http://arxiv.org/pdf/2301.12074v1
http://arxiv.org/abs/2302.00493v1,creativecommons.org/licenses/by/4.0/,You Are What You Talk About: Inducing Evaluative Topics for Personality Analysis,Josip Jukić and Iva Vukojević and Jan Šnajder,http://arxiv.org/pdf/2302.00493v1
http://arxiv.org/abs/2302.00739v1,creativecommons.org/licenses/by/4.0/,Inference of Partial Colexifications from Multilingual Wordlists,Johann-Mattis List,http://arxiv.org/pdf/2302.00739v1
http://arxiv.org/abs/2302.03694v1,creativecommons.org/licenses/by/4.0/,Characterizing Financial Market Coverage using Artificial Intelligence,Jean Marie Tshimula and D'Jeff K. Nkashama and Patrick Owusu and Marc Frappier and Pierre-Martin Tardif and Froduald Kabanza and Armelle Brun and Jean-Marc Patenaude and Shengrui Wang and Belkacem Chikhaoui,http://arxiv.org/pdf/2302.03694v1
http://arxiv.org/abs/2303.12810v1,creativecommons.org/licenses/by/4.0/,Are LLMs the Master of All Trades? : Exploring Domain-Agnostic Reasoning Skills of LLMs,Shrivats Agrawal,http://arxiv.org/pdf/2303.12810v1
http://arxiv.org/abs/2303.16145v1,creativecommons.org/licenses/by/4.0/,NeuralMind-UNICAMP at 2022 TREC NeuCLIR: Large Boring Rerankers for Cross-lingual Retrieval,Vitor Jeronymo and Roberto Lotufo and Rodrigo Nogueira,http://arxiv.org/pdf/2303.16145v1
http://arxiv.org/abs/2304.01240v2,creativecommons.org/licenses/by/4.0/,Identifying Mentions of Pain in Mental Health Records Text: A Natural Language Processing Approach,Jaya Chaturvedi and Sumithra Velupillai and Robert Stewart and Angus Roberts,http://arxiv.org/pdf/2304.01240v2
http://arxiv.org/abs/2304.06028v1,creativecommons.org/licenses/by/4.0/,RECLIP: Resource-efficient CLIP by Training with Small Images,Runze Li and Dahun Kim and Bir Bhanu and Weicheng Kuo,http://arxiv.org/pdf/2304.06028v1
http://arxiv.org/abs/2205.11656v1,creativecommons.org/licenses/by/4.0/,FlexiBERT: Are Current Transformer Architectures too Homogeneous and Rigid?,Shikhar Tuli and Bhishma Dedhia and Shreshth Tuli and Niraj K. Jha,http://arxiv.org/pdf/2205.11656v1
http://arxiv.org/abs/1911.04118v2,creativecommons.org/licenses/by/4.0/,TANDA: Transfer and Adapt Pre-Trained Transformer Models for Answer Sentence Selection,Siddhant Garg and Thuy Vu and Alessandro Moschitti,http://arxiv.org/pdf/1911.04118v2
http://arxiv.org/abs/2303.07514v1,creativecommons.org/licenses/by/4.0/,Handwritten Word Recognition using Deep Learning Approach: A Novel Way of Generating Handwritten Words,Mst Shapna Akter and Hossain Shahriar and Alfredo Cuzzocrea and Nova Ahmed and Carson Leung,http://arxiv.org/pdf/2303.07514v1
http://arxiv.org/abs/1412.7415v2,creativecommons.org/licenses/by/4.0/,A prototype Malayalam to Sign Language Automatic Translator,Jestin Joy and Kannan Balakrishnan,http://arxiv.org/pdf/1412.7415v2
http://arxiv.org/abs/1709.02076v1,creativecommons.org/licenses/by/4.0/,Composition by Conversation,Donya Quick and Clayton T. Morrison,http://arxiv.org/pdf/1709.02076v1
http://arxiv.org/abs/2007.11865v1,creativecommons.org/licenses/by/4.0/,AI4D -- African Language Dataset Challenge,Kathleen Siminyu and Sackey Freshia and Jade Abbott and Vukosi Marivate,http://arxiv.org/pdf/2007.11865v1
http://arxiv.org/abs/2103.14698v1,creativecommons.org/licenses/by/4.0/,Implementing G-Machine in HyperLMNtal,Jin Sano,http://arxiv.org/pdf/2103.14698v1
http://arxiv.org/abs/2109.01628v1,creativecommons.org/licenses/by/4.0/,Cross-Lingual Training with Dense Retrieval for Document Retrieval,Peng Shi and Rui Zhang and He Bai and Jimmy Lin,http://arxiv.org/pdf/2109.01628v1
http://arxiv.org/abs/2102.09268v2,creativecommons.org/licenses/by/4.0/,Training Large-Scale News Recommenders with Pretrained Language Models in the Loop,Shitao Xiao and Zheng Liu and Yingxia Shao and Tao Di and Xing Xie,http://arxiv.org/pdf/2102.09268v2
http://arxiv.org/abs/2106.03373v4,creativecommons.org/licenses/by/4.0/,Pre-trained Language Model for Web-scale Retrieval in Baidu Search,Yiding Liu and Guan Huang and Jiaxiang Liu and Weixue Lu and Suqi Cheng and Yukun Li and Daiting Shi and Shuaiqiang Wang and Zhicong Cheng and Dawei Yin,http://arxiv.org/pdf/2106.03373v4
http://arxiv.org/abs/2109.02401v4,creativecommons.org/licenses/by/4.0/,Vision Guided Generative Pre-trained Language Models for Multimodal Abstractive Summarization,Tiezheng Yu and Wenliang Dai and Zihan Liu and Pascale Fung,http://arxiv.org/pdf/2109.02401v4
http://arxiv.org/abs/2204.12130v2,creativecommons.org/licenses/by/4.0/,LM-Debugger: An Interactive Tool for Inspection and Intervention in Transformer-Based Language Models,Mor Geva and Avi Caciularu and Guy Dar and Paul Roit and Shoval Sadde and Micah Shlain and Bar Tamir and Yoav Goldberg,http://arxiv.org/pdf/2204.12130v2
http://arxiv.org/abs/2208.13916v1,creativecommons.org/licenses/by/4.0/,A Language Agnostic Multilingual Streaming On-Device ASR System,Bo Li and Tara N. Sainath and Ruoming Pang and Shuo-yiin Chang and Qiumin Xu and Trevor Strohman and Vince Chen and Qiao Liang and Heguang Liu and Yanzhang He and Parisa Haghani and Sameer Bidichandani,http://arxiv.org/pdf/2208.13916v1
http://arxiv.org/abs/2304.08448v1,creativecommons.org/licenses/by/4.0/,ImpressionGPT: An Iterative Optimizing Framework for Radiology Report Summarization with ChatGPT,Chong Ma and Zihao Wu and Jiaqi Wang and Shaochen Xu and Yaonai Wei and Zhengliang Liu and Lei Guo and Xiaoyan Cai and Shu Zhang and Tuo Zhang and Dajiang Zhu and Dinggang Shen and Tianming Liu and Xiang Li,http://arxiv.org/pdf/2304.08448v1
http://arxiv.org/abs/2212.01488v2,creativecommons.org/licenses/by/4.0/,Event knowledge in large language models: the gap between the impossible and the unlikely,Carina Kauf and Anna A. Ivanova and Giulia Rambelli and Emmanuele Chersoni and Jingyuan S. She and Zawad Chowdhury and Evelina Fedorenko and Alessandro Lenci,http://arxiv.org/pdf/2212.01488v2
http://arxiv.org/abs/2208.10091v2,creativecommons.org/licenses/by/4.0/,Incorporating Domain Knowledge through Task Augmentation for Front-End JavaScript Code Generation,Sijie Shen and Xiang Zhu and Yihong Dong and Qizhi Guo and Yankun Zhen and Ge Li,http://arxiv.org/pdf/2208.10091v2
http://arxiv.org/abs/2004.01221v1,creativecommons.org/licenses/by/4.0/,Towards Relevance and Sequence Modeling in Language Recognition,Bharat Padi and Anand Mohan and Sriram Ganapathy,http://arxiv.org/pdf/2004.01221v1
http://arxiv.org/abs/2011.03965v1,creativecommons.org/licenses/by/4.0/,On the Practical Ability of Recurrent Neural Networks to Recognize Hierarchical Languages,Satwik Bhattamishra and Kabir Ahuja and Navin Goyal,http://arxiv.org/pdf/2011.03965v1
http://arxiv.org/abs/2105.07144v3,creativecommons.org/licenses/by/4.0/,A Cognitive Regularizer for Language Modeling,Jason Wei and Clara Meister and Ryan Cotterell,http://arxiv.org/pdf/2105.07144v3
http://arxiv.org/abs/2212.09662v1,creativecommons.org/licenses/by/4.0/,MatCha: Enhancing Visual Language Pretraining with Math Reasoning and Chart Derendering,Fangyu Liu and Francesco Piccinno and Syrine Krichene and Chenxi Pang and Kenton Lee and Mandar Joshi and Yasemin Altun and Nigel Collier and Julian Martin Eisenschlos,http://arxiv.org/pdf/2212.09662v1
http://arxiv.org/abs/2105.14444v1,creativecommons.org/licenses/by/4.0/,NAS-BERT: Task-Agnostic and Adaptive-Size BERT Compression with Neural Architecture Search,Jin Xu and Xu Tan and Renqian Luo and Kaitao Song and Jian Li and Tao Qin and Tie-Yan Liu,http://arxiv.org/pdf/2105.14444v1
http://arxiv.org/abs/2112.11805v2,creativecommons.org/licenses/by/4.0/,Neural-Symbolic Integration for Interactive Learning and Conceptual Grounding,Benedikt Wagner and Artur d'Avila Garcez,http://arxiv.org/pdf/2112.11805v2
http://arxiv.org/abs/2205.01941v1,creativecommons.org/licenses/by/4.0/,Lexical Knowledge Internalization for Neural Dialog Generation,Zhiyong Wu and Wei Bi and Xiang Li and Lingpeng Kong and Ben Kao,http://arxiv.org/pdf/2205.01941v1
http://arxiv.org/abs/1903.10915v1,creativecommons.org/licenses/by/4.0/,Language Model Adaptation for Language and Dialect Identification of Text,Tommi Jauhiainen and Krister Lindén and Heidi Jauhiainen,http://arxiv.org/pdf/1903.10915v1
http://arxiv.org/abs/2010.03542v1,creativecommons.org/licenses/by/4.0/,Galileo at SemEval-2020 Task 12: Multi-lingual Learning for Offensive Language Identification using Pre-trained Language Models,Shuohuan Wang and Jiaxiang Liu and Xuan Ouyang and Yu Sun,http://arxiv.org/pdf/2010.03542v1
http://arxiv.org/abs/2106.02232v1,creativecommons.org/licenses/by/4.0/,Language Scaling for Universal Suggested Replies Model,Qianlan Ying and Payal Bajaj and Budhaditya Deb and Yu Yang and Wei Wang and Bojia Lin and Milad Shokouhi and Xia Song and Yang Yang and Daxin Jiang,http://arxiv.org/pdf/2106.02232v1
http://arxiv.org/abs/2201.06469v1,creativecommons.org/licenses/by/4.0/,Handling Compounding in Mobile Keyboard Input,Andreas Kabel and Keith Hall and Tom Ouyang and David Rybach and Daan van Esch and Françoise Beaufays,http://arxiv.org/pdf/2201.06469v1
http://arxiv.org/abs/2201.10707v1,creativecommons.org/licenses/by/4.0/,A Unified Strategy for Multilingual Grammatical Error Correction with Pre-trained Cross-Lingual Language Model,Xin Sun and Tao Ge and Shuming Ma and Jingjing Li and Furu Wei and Houfeng Wang,http://arxiv.org/pdf/2201.10707v1
http://arxiv.org/abs/2210.12246v1,creativecommons.org/licenses/by/4.0/,A General Architecture for Client-Agnostic Hybrid Model Editors as a Service,Liam Walsh and Juergen Dingel and Karim Jahed,http://arxiv.org/pdf/2210.12246v1
http://arxiv.org/abs/2303.17517v1,creativecommons.org/licenses/by/4.0/,Hindi as a Second Language: Improving Visually Grounded Speech with Semantically Similar Samples,Hyeonggon Ryu and Arda Senocak and In So Kweon and Joon Son Chung,http://arxiv.org/pdf/2303.17517v1
http://arxiv.org/abs/2102.00405v2,creativecommons.org/licenses/by/4.0/,BNLP: Natural language processing toolkit for Bengali language,Sagor Sarker,http://arxiv.org/pdf/2102.00405v2
http://arxiv.org/abs/2102.07150v1,creativecommons.org/licenses/by/4.0/,indicnlp@kgp at DravidianLangTech-EACL2021: Offensive Language Identification in Dravidian Languages,Kushal Kedia and Abhilash Nandy,http://arxiv.org/pdf/2102.07150v1
http://arxiv.org/abs/2106.01195v2,creativecommons.org/licenses/by/4.0/,Figurative Language in Recognizing Textual Entailment,Tuhin Chakrabarty and Debanjan Ghosh and Adam Poliak and Smaranda Muresan,http://arxiv.org/pdf/2106.01195v2
http://arxiv.org/abs/2201.13072v1,creativecommons.org/licenses/by/4.0/,Are Mutually Intelligible Languages Easier to Translate?,Avital Friedland and Jonathan Zeltser and Omer Levy,http://arxiv.org/pdf/2201.13072v1
http://arxiv.org/abs/2203.09313v2,creativecommons.org/licenses/by/4.0/,EVA2.0: Investigating Open-Domain Chinese Dialogue Systems with Large-Scale Pre-Training,Yuxian Gu and Jiaxin Wen and Hao Sun and Yi Song and Pei Ke and Chujie Zheng and Zheng Zhang and Jianzhu Yao and Xiaoyan Zhu and Jie Tang and Minlie Huang,http://arxiv.org/pdf/2203.09313v2
http://arxiv.org/abs/2111.00830v2,creativecommons.org/licenses/by/4.0/,Deep Learning Transformer Architecture for Named Entity Recognition on Low Resourced Languages: State of the art results,Ridewaan Hanslo,http://arxiv.org/pdf/2111.00830v2
http://arxiv.org/abs/1908.03837v2,creativecommons.org/licenses/by/4.0/,Modeling Graphs with Vertex Replacement Grammars,Satyaki Sikdar and Justus Hibshman and Tim Weninger,http://arxiv.org/pdf/1908.03837v2
http://arxiv.org/abs/1707.04678v1,creativecommons.org/licenses/by/4.0/,Lyrics-Based Music Genre Classification Using a Hierarchical Attention Network,Alexandros Tsaptsinos,http://arxiv.org/pdf/1707.04678v1
http://arxiv.org/abs/1906.01543v2,creativecommons.org/licenses/by/4.0/,Training Neural Response Selection for Task-Oriented Dialogue Systems,Matthew Henderson and Ivan Vulić and Daniela Gerz and Iñigo Casanueva and Paweł Budzianowski and Sam Coope and Georgios Spithourakis and Tsung-Hsien Wen and Nikola Mrkšić and Pei-Hao Su,http://arxiv.org/pdf/1906.01543v2
http://arxiv.org/abs/2012.15419v1,creativecommons.org/licenses/by/4.0/,An Experimental Evaluation of Transformer-based Language Models in the Biomedical Domain,Paul Grouchy and Shobhit Jain and Michael Liu and Kuhan Wang and Max Tian and Nidhi Arora and Hillary Ngai and Faiza Khan Khattak and Elham Dolatabadi and Sedef Akinli Kocak,http://arxiv.org/pdf/2012.15419v1
http://arxiv.org/abs/2106.02497v1,creativecommons.org/licenses/by/4.0/,COINS: Dynamically Generating COntextualized Inference Rules for Narrative Story Completion,Debjit Paul and Anette Frank,http://arxiv.org/pdf/2106.02497v1
http://arxiv.org/abs/2106.05634v1,creativecommons.org/licenses/by/4.0/,Exploring Unsupervised Pretraining Objectives for Machine Translation,Christos Baziotis and Ivan Titov and Alexandra Birch and Barry Haddow,http://arxiv.org/pdf/2106.05634v1
http://arxiv.org/abs/2106.15332v1,creativecommons.org/licenses/by/4.0/,Winner Team Mia at TextVQA Challenge 2021: Vision-and-Language Representation Learning with Pre-trained Sequence-to-Sequence Model,Yixuan Qiao and Hao Chen and Jun Wang and Yihao Chen and Xianbin Ye and Ziliang Li and Xianbiao Qi and Peng Gao and Guotong Xie,http://arxiv.org/pdf/2106.15332v1
http://arxiv.org/abs/2109.10164v2,creativecommons.org/licenses/by/4.0/,RAIL-KD: RAndom Intermediate Layer Mapping for Knowledge Distillation,Md Akmal Haidar and Nithin Anchuri and Mehdi Rezagholizadeh and Abbas Ghaddar and Philippe Langlais and Pascal Poupart,http://arxiv.org/pdf/2109.10164v2
http://arxiv.org/abs/2109.11745v1,creativecommons.org/licenses/by/4.0/,DACT-BERT: Differentiable Adaptive Computation Time for an Efficient BERT Inference,Cristóbal Eyzaguirre and Felipe del Río and Vladimir Araujo and Álvaro Soto,http://arxiv.org/pdf/2109.11745v1
http://arxiv.org/abs/2110.06176v2,creativecommons.org/licenses/by/4.0/,Mention Memory: incorporating textual knowledge into Transformers through entity mention attention,Michiel de Jong and Yury Zemlyanskiy and Nicholas FitzGerald and Fei Sha and William Cohen,http://arxiv.org/pdf/2110.06176v2
http://arxiv.org/abs/2112.01147v2,creativecommons.org/licenses/by/4.0/,CO2Sum:Contrastive Learning for Factual-Consistent Abstractive Summarization,Wei Liu and Huanqin Wu and Wenjing Mu and Zhen Li and Tao Chen and Dan Nie,http://arxiv.org/pdf/2112.01147v2
http://arxiv.org/abs/2205.09224v2,creativecommons.org/licenses/by/4.0/,Entailment Tree Explanations via Iterative Retrieval-Generation Reasoner,Danilo Ribeiro and Shen Wang and Xiaofei Ma and Rui Dong and Xiaokai Wei and Henry Zhu and Xinchi Chen and Zhiheng Huang and Peng Xu and Andrew Arnold and Dan Roth,http://arxiv.org/pdf/2205.09224v2
http://arxiv.org/abs/2205.13657v3,creativecommons.org/licenses/by/4.0/,An enhanced Conv-TasNet model for speech separation using a speaker distance-based loss function,Jose A. Arango-Sánchez and Julián D. Arias-Londoño,http://arxiv.org/pdf/2205.13657v3
http://arxiv.org/abs/2207.10802v2,creativecommons.org/licenses/by/4.0/,Combing for Credentials: Active Pattern Extraction from Smart Reply,Bargav Jayaraman and Esha Ghosh and Melissa Chase and Sambuddha Roy and Huseyin Inan and Wei Dai and David Evans,http://arxiv.org/pdf/2207.10802v2
http://arxiv.org/abs/2211.01542v2,creativecommons.org/licenses/by/4.0/,Continual Learning of Neural Machine Translation within Low Forgetting Risk Regions,Shuhao Gu and Bojie Hu and Yang Feng,http://arxiv.org/pdf/2211.01542v2
http://arxiv.org/abs/2211.07712v1,creativecommons.org/licenses/by/4.0/,Cloning Ideology and Style using Deep Learning,Dr. Omer Beg and Muhammad Nasir Zafar and Waleed Anjum,http://arxiv.org/pdf/2211.07712v1
http://arxiv.org/abs/2211.16912v1,creativecommons.org/licenses/by/4.0/,Quadapter: Adapter for GPT-2 Quantization,Minseop Park and Jaeseong You and Markus Nagel and Simyung Chang,http://arxiv.org/pdf/2211.16912v1
http://arxiv.org/abs/2212.01692v1,creativecommons.org/licenses/by/4.0/,What is Not in the Context? Evaluation of Few-shot Learners with Informative Demonstrations,Michal Štefánik and Marek Kadlčík,http://arxiv.org/pdf/2212.01692v1
http://arxiv.org/abs/2301.10521v1,creativecommons.org/licenses/by/4.0/,ExaRanker: Explanation-Augmented Neural Ranker,Fernando Ferraretto and Thiago Laitz and Roberto Lotufo and Rodrigo Nogueira,http://arxiv.org/pdf/2301.10521v1
http://arxiv.org/abs/2301.11845v2,creativecommons.org/licenses/by/4.0/,Learning the Effects of Physical Actions in a Multi-modal Environment,Gautier Dagan and Frank Keller and Alex Lascarides,http://arxiv.org/pdf/2301.11845v2
http://arxiv.org/abs/2302.05608v1,creativecommons.org/licenses/by/4.0/,Differentiable Outlier Detection Enable Robust Deep Multimodal Analysis,Zhu Wang and Sourav Medya and Sathya N. Ravi,http://arxiv.org/pdf/2302.05608v1
http://arxiv.org/abs/2302.05888v1,creativecommons.org/licenses/by/4.0/,Position Matters! Empirical Study of Order Effect in Knowledge-grounded Dialogue,Hsuan Su and Shachi H Kumar and Sahisnu Mazumder and Wenda Chen and Ramesh Manuvinakurike and Eda Okur and Saurav Sahay and Lama Nachman and Shang-Tse Chen and Hung-yi Lee,http://arxiv.org/pdf/2302.05888v1
http://arxiv.org/abs/2302.12367v1,creativecommons.org/licenses/by/4.0/,Extracting Victim Counts from Text,Mian Zhong and Shehzaad Dhuliawala and Niklas Stoehr,http://arxiv.org/pdf/2302.12367v1
http://arxiv.org/abs/2303.17003v1,creativecommons.org/licenses/by/4.0/,Evaluating GPT-3.5 and GPT-4 Models on Brazilian University Admission Exams,Desnes Nunes and Ricardo Primi and Ramon Pires and Roberto Lotufo and Rodrigo Nogueira,http://arxiv.org/pdf/2303.17003v1
http://arxiv.org/abs/2303.17972v2,creativecommons.org/licenses/by/4.0/,$\varepsilon$ KÚ <MASK>: Integrating Yorùbá cultural greetings into machine translation,Idris Akinade and Jesujoba Alabi and David Adelani and Clement Odoje and Dietrich Klakow,http://arxiv.org/pdf/2303.17972v2
http://arxiv.org/abs/2304.01830v1,creativecommons.org/licenses/by/4.0/,Learning to Name Classes for Vision and Language Models,Sarah Parisot and Yongxin Yang and Steven McDonagh,http://arxiv.org/pdf/2304.01830v1
http://arxiv.org/abs/2304.08247v1,creativecommons.org/licenses/by/4.0/,MedAlpaca -- An Open-Source Collection of Medical Conversational AI Models and Training Data,Tianyu Han and Lisa C. Adams and Jens-Michalis Papaioannou and Paul Grundmann and Tom Oberhauser and Alexander Löser and Daniel Truhn and Keno K. Bressem,http://arxiv.org/pdf/2304.08247v1
http://arxiv.org/abs/2304.11075v1,creativecommons.org/licenses/by/4.0/,Spaiche: Extending State-of-the-Art ASR Models to Swiss German Dialects,Clément Sicard and Kajetan Pyszkowski and Victor Gillioz,http://arxiv.org/pdf/2304.11075v1
http://arxiv.org/abs/2110.09574v1,creativecommons.org/licenses/by/4.0/,Multilingual Domain Adaptation for NMT: Decoupling Language and Domain Information with Adapters,Asa Cooper Stickland and Alexandre Bérard and Vassilina Nikoulina,http://arxiv.org/pdf/2110.09574v1
http://arxiv.org/abs/2205.08755v1,creativecommons.org/licenses/by/4.0/,Persian Natural Language Inference: A Meta-learning approach,Heydar Soudani and Mohammad Hassan Mojab and Hamid Beigy,http://arxiv.org/pdf/2205.08755v1
http://arxiv.org/abs/2101.11038v1,creativecommons.org/licenses/by/4.0/,Muppet: Massive Multi-task Representations with Pre-Finetuning,Armen Aghajanyan and Anchit Gupta and Akshat Shrivastava and Xilun Chen and Luke Zettlemoyer and Sonal Gupta,http://arxiv.org/pdf/2101.11038v1
http://arxiv.org/abs/2201.12799v1,creativecommons.org/licenses/by/4.0/,Recognition of Implicit Geographic Movement in Text,Scott Pezanowski and Prasenjit Mitra,http://arxiv.org/pdf/2201.12799v1
http://arxiv.org/abs/2204.04711v1,creativecommons.org/licenses/by/4.0/,Data Augmentation for Biomedical Factoid Question Answering,Dimitris Pappas and Prodromos Malakasiotis and Ion Androutsopoulos,http://arxiv.org/pdf/2204.04711v1
http://arxiv.org/abs/2211.15914v1,creativecommons.org/licenses/by/4.0/,Zero-Shot Opinion Summarization with GPT-3,Adithya Bhaskar and Alexander R. Fabbri and Greg Durrett,http://arxiv.org/pdf/2211.15914v1
http://arxiv.org/abs/2109.05812v2,creativecommons.org/licenses/by/4.0/,UniMS: A Unified Framework for Multimodal Summarization with Knowledge Distillation,Zhengkun Zhang and Xiaojun Meng and Yasheng Wang and Xin Jiang and Qun Liu and Zhenglu Yang,http://arxiv.org/pdf/2109.05812v2
http://arxiv.org/abs/2203.06486v3,creativecommons.org/licenses/by/4.0/,Chart-to-Text: A Large-Scale Benchmark for Chart Summarization,Shankar Kantharaj and Rixie Tiffany Ko Leong and Xiang Lin and Ahmed Masry and Megh Thakkar and Enamul Hoque and Shafiq Joty,http://arxiv.org/pdf/2203.06486v3
http://arxiv.org/abs/2205.11409v1,creativecommons.org/licenses/by/4.0/,Many-Class Text Classification with Matching,Yi Song and Yuxian Gu and Minlie Huang,http://arxiv.org/pdf/2205.11409v1
http://arxiv.org/abs/2212.01588v1,creativecommons.org/licenses/by/4.0/,RHO ($ρ$): Reducing Hallucination in Open-domain Dialogues with Knowledge Grounding,Ziwei Ji and Zihan Liu and Nayeon Lee and Tiezheng Yu and Bryan Wilie and Min Zeng and Pascale Fung,http://arxiv.org/pdf/2212.01588v1
http://arxiv.org/abs/2212.08153v1,creativecommons.org/licenses/by/4.0/,FiDO: Fusion-in-Decoder optimized for stronger performance and faster inference,Michiel de Jong and Yury Zemlyanskiy and Joshua Ainslie and Nicholas FitzGerald and Sumit Sanghai and Fei Sha and William Cohen,http://arxiv.org/pdf/2212.08153v1 | |
http://arxiv.org/abs/2303.10138v1,creativecommons.org/licenses/by/4.0/,"Generate, Transform, Answer: Question Specific Tool Synthesis for Tabular Data",Carlos Gemmell and Jeffrey Dalton,http://arxiv.org/pdf/2303.10138v1 | |
http://arxiv.org/abs/2304.09386v1,creativecommons.org/licenses/by/4.0/,Towards Objective-Tailored Genetic Improvement Through Large Language Models,Sungmin Kang and Shin Yoo,http://arxiv.org/pdf/2304.09386v1 | |
http://arxiv.org/abs/2112.08491v1,creativecommons.org/licenses/by/4.0/,"Human Languages with Greater Information Density Increase Communication Speed, but Decrease Conversation Breadth",Pedro Aceves and James A. Evans,http://arxiv.org/pdf/2112.08491v1 | |
http://arxiv.org/abs/2303.03915v1,creativecommons.org/licenses/by/4.0/,The BigScience ROOTS Corpus: A 1.6TB Composite Multilingual Dataset,Hugo Laurençon and Lucile Saulnier and Thomas Wang and Christopher Akiki and Albert Villanova del Moral and Teven Le Scao and Leandro Von Werra and Chenghao Mou and Eduardo González Ponferrada and Huu Nguyen and Jörg Frohberg and Mario Šaško and Quentin Lhoest and Angelina McMillan-Major and Gerard Dupont and Stella Biderman and Anna Rogers and Loubna Ben allal and Francesco De Toni and Giada Pistilli and Olivier Nguyen and Somaieh Nikpoor and Maraim Masoud and Pierre Colombo and Javier de la Rosa and Paulo Villegas and Tristan Thrush and Shayne Longpre and Sebastian Nagel and Leon Weber and Manuel Muñoz and Jian Zhu and Daniel Van Strien and Zaid Alyafeai and Khalid Almubarak and Minh Chien Vu and Itziar Gonzalez-Dios and Aitor Soroa and Kyle Lo and Manan Dey and Pedro Ortiz Suarez and Aaron Gokaslan and Shamik Bose and David Adelani and Long Phan and Hieu Tran and Ian Yu and Suhas Pai and Jenny Chim and Violette Lepercq and Suzana Ilic and Margaret Mitchell and Sasha Alexandra Luccioni and Yacine Jernite,http://arxiv.org/pdf/2303.03915v1 | |
http://arxiv.org/abs/1912.10308v1,creativecommons.org/licenses/by/4.0/,Candidate Fusion: Integrating Language Modelling into a Sequence-to-Sequence Handwritten Word Recognition Architecture,Lei Kang and Pau Riba and Mauricio Villegas and Alicia Fornés and Marçal Rusiñol,http://arxiv.org/pdf/1912.10308v1 | |
http://arxiv.org/abs/2204.03498v1,creativecommons.org/licenses/by/4.0/,On the Effectiveness of Pretrained Models for API Learning,Mohammad Abdul Hadi and Imam Nur Bani Yusuf and Ferdian Thung and Kien Gia Luong and Jiang Lingxiao and Fatemeh H. Fard and David Lo,http://arxiv.org/pdf/2204.03498v1 | |
http://arxiv.org/abs/2204.07288v1,creativecommons.org/licenses/by/4.0/,Characterizing the Efficiency vs. Accuracy Trade-off for Long-Context NLP Models,Phyllis Ang and Bhuwan Dhingra and Lisa Wu Wills,http://arxiv.org/pdf/2204.07288v1 | |
http://arxiv.org/abs/2109.14728v1,creativecommons.org/licenses/by/4.0/,Collaborative Storytelling with Human Actors and AI Narrators,Boyd Branch and Piotr Mirowski and Kory W. Mathewson,http://arxiv.org/pdf/2109.14728v1 | |
http://arxiv.org/abs/2112.14757v2,creativecommons.org/licenses/by/4.0/,A Simple Baseline for Open-Vocabulary Semantic Segmentation with Pre-trained Vision-language Model,Mengde Xu and Zheng Zhang and Fangyun Wei and Yutong Lin and Yue Cao and Han Hu and Xiang Bai,http://arxiv.org/pdf/2112.14757v2 | |
http://arxiv.org/abs/2208.00636v2,creativecommons.org/licenses/by/4.0/,Interacting with next-phrase suggestions: How suggestion systems aid and influence the cognitive processes of writing,Advait Bhat and Saaket Agashe and Niharika Mohile and Parth Oberoi and Ravi Jangir and Anirudha Joshi,http://arxiv.org/pdf/2208.00636v2 | |
http://arxiv.org/abs/2301.11305v1,creativecommons.org/licenses/by/4.0/,DetectGPT: Zero-Shot Machine-Generated Text Detection using Probability Curvature,Eric Mitchell and Yoonho Lee and Alexander Khazatsky and Christopher D. Manning and Chelsea Finn,http://arxiv.org/pdf/2301.11305v1 | |
http://arxiv.org/abs/2111.00640v2,creativecommons.org/licenses/by/4.0/,VSEC: Transformer-based Model for Vietnamese Spelling Correction,Dinh-Truong Do and Ha Thanh Nguyen and Thang Ngoc Bui and Dinh Hieu Vo,http://arxiv.org/pdf/2111.00640v2 | |
http://arxiv.org/abs/2205.01068v4,creativecommons.org/licenses/by/4.0/,OPT: Open Pre-trained Transformer Language Models,Susan Zhang and Stephen Roller and Naman Goyal and Mikel Artetxe and Moya Chen and Shuohui Chen and Christopher Dewan and Mona Diab and Xian Li and Xi Victoria Lin and Todor Mihaylov and Myle Ott and Sam Shleifer and Kurt Shuster and Daniel Simig and Punit Singh Koura and Anjali Sridhar and Tianlu Wang and Luke Zettlemoyer,http://arxiv.org/pdf/2205.01068v4 | |
http://arxiv.org/abs/2303.11156v1,creativecommons.org/licenses/by/4.0/,Can AI-Generated Text be Reliably Detected?,Vinu Sankar Sadasivan and Aounon Kumar and Sriram Balasubramanian and Wenxiao Wang and Soheil Feizi,http://arxiv.org/pdf/2303.11156v1 | |
http://arxiv.org/abs/2303.08448v1,creativecommons.org/licenses/by/4.0/,A Cross-institutional Evaluation on Breast Cancer Phenotyping NLP Algorithms on Electronic Health Records,Sicheng Zhou and Nan Wang and Liwei Wang and Ju Sun and Anne Blaes and Hongfang Liu and Rui Zhang,http://arxiv.org/pdf/2303.08448v1 | |
http://arxiv.org/abs/2203.13590v1,creativecommons.org/licenses/by/4.0/,Impact of Dataset on Acoustic Models for Automatic Speech Recognition,Siddhesh Singh,http://arxiv.org/pdf/2203.13590v1 | |
http://arxiv.org/abs/2206.08853v2,creativecommons.org/licenses/by/4.0/,MineDojo: Building Open-Ended Embodied Agents with Internet-Scale Knowledge,Linxi Fan and Guanzhi Wang and Yunfan Jiang and Ajay Mandlekar and Yuncong Yang and Haoyi Zhu and Andrew Tang and De-An Huang and Yuke Zhu and Anima Anandkumar,http://arxiv.org/pdf/2206.08853v2 | |
http://arxiv.org/abs/2211.10438v4,creativecommons.org/licenses/by/4.0/,SmoothQuant: Accurate and Efficient Post-Training Quantization for Large Language Models,Guangxuan Xiao and Ji Lin and Mickael Seznec and Hao Wu and Julien Demouth and Song Han,http://arxiv.org/pdf/2211.10438v4 | |
http://arxiv.org/abs/2301.13268v2,creativecommons.org/licenses/by/4.0/,Contextual Dynamic Prompting for Response Generation in Task-oriented Dialog Systems,Sandesh Swamy and Narges Tabari and Chacha Chen and Rashmi Gangadharaiah,http://arxiv.org/pdf/2301.13268v2 | |
http://arxiv.org/abs/2302.05319v1,creativecommons.org/licenses/by/4.0/,Controlling Large Language Models to Generate Secure and Vulnerable Code,Jingxuan He and Martin Vechev,http://arxiv.org/pdf/2302.05319v1 | |
http://arxiv.org/abs/2302.08659v1,creativecommons.org/licenses/by/4.0/,Uncertainty-aware Self-training for Low-resource Neural Sequence Labeling,Jianing Wang and Chengyu Wang and Jun Huang and Ming Gao and Aoying Zhou,http://arxiv.org/pdf/2302.08659v1 | |
http://arxiv.org/abs/2304.01246v1,creativecommons.org/licenses/by/4.0/,Safety Analysis in the Era of Large Language Models: A Case Study of STPA using ChatGPT,Yi Qi and Xingyu Zhao and Xiaowei Huang,http://arxiv.org/pdf/2304.01246v1 | |
http://arxiv.org/abs/2304.09667v2,creativecommons.org/licenses/by/4.0/,GeneGPT: Augmenting Large Language Models with Domain Tools for Improved Access to Biomedical Information,Qiao Jin and Yifan Yang and Qingyu Chen and Zhiyong Lu,http://arxiv.org/pdf/2304.09667v2 | |
http://arxiv.org/abs/2112.01742v1,creativecommons.org/licenses/by/4.0/,Multitask Finetuning for Improving Neural Machine Translation in Indian Languages,Shaily Desai and Atharva Kshirsagar and Manisha Marathe,http://arxiv.org/pdf/2112.01742v1 | |
http://arxiv.org/abs/2107.05476v1,creativecommons.org/licenses/by/4.0/,Technical Report of Team GraphMIRAcles in the WikiKG90M-LSC Track of OGB-LSC @ KDD Cup 2021,Jianyu Cai and Jiajun Chen and Taoxing Pan and Zhanqiu Zhang and Jie Wang,http://arxiv.org/pdf/2107.05476v1 | |
http://arxiv.org/abs/2110.01963v1,creativecommons.org/licenses/by/4.0/,"Multimodal datasets: misogyny, pornography, and malignant stereotypes",Abeba Birhane and Vinay Uday Prabhu and Emmanuel Kahembwe,http://arxiv.org/pdf/2110.01963v1 | |
http://arxiv.org/abs/2109.09405v1,creativecommons.org/licenses/by/4.0/,Assessing the quality of sources in Wikidata across languages: a hybrid approach,Gabriel Amaral and Alessandro Piscopo and Lucie-Aimée Kaffee and Odinaldo Rodrigues and Elena Simperl,http://arxiv.org/pdf/2109.09405v1 | |
http://arxiv.org/abs/2109.09475v1,creativecommons.org/licenses/by/4.0/,Knowledge Graph Question Answering via SPARQL Silhouette Generation,Sukannya Purkayastha and Saswati Dana and Dinesh Garg and Dinesh Khandelwal and G P Shrivatsa Bhargav,http://arxiv.org/pdf/2109.09475v1 | |
http://arxiv.org/abs/2109.10504v3,creativecommons.org/licenses/by/4.0/,KD-VLP: Improving End-to-End Vision-and-Language Pretraining with Object Knowledge Distillation,Yongfei Liu and Chenfei Wu and Shao-yen Tseng and Vasudev Lal and Xuming He and Nan Duan,http://arxiv.org/pdf/2109.10504v3 | |
http://arxiv.org/abs/2205.09229v3,creativecommons.org/licenses/by/4.0/,PromptDA: Label-guided Data Augmentation for Prompt-based Few-shot Learners,Canyu Chen and Kai Shu,http://arxiv.org/pdf/2205.09229v3 | |
http://arxiv.org/abs/2205.12089v2,creativecommons.org/licenses/by/4.0/,Sim-To-Real Transfer of Visual Grounding for Human-Aided Ambiguity Resolution,Georgios Tziafas and Hamidreza Kasaei,http://arxiv.org/pdf/2205.12089v2 | |
http://arxiv.org/abs/2210.12328v1,creativecommons.org/licenses/by/4.0/,"R$^2$F: A General Retrieval, Reading and Fusion Framework for Document-level Natural Language Inference",Hao Wang and Yixin Cao and Yangguang Li and Zhen Huang and Kun Wang and Jing Shao,http://arxiv.org/pdf/2210.12328v1 | |
http://arxiv.org/abs/2211.08462v1,creativecommons.org/licenses/by/4.0/,Navigating Connected Memories with a Task-oriented Dialog System,Seungwhan Moon and Satwik Kottur and Alborz Geramifard and Babak Damavandi,http://arxiv.org/pdf/2211.08462v1 | |
http://arxiv.org/abs/2303.04142v1,creativecommons.org/licenses/by/4.0/,From Copilot to Pilot: Towards AI Supported Software Development,Rohith Pudari and Neil A. Ernst,http://arxiv.org/pdf/2303.04142v1 | |
http://arxiv.org/abs/2211.10086v1,creativecommons.org/licenses/by/4.0/,Metadata Might Make Language Models Better,Kaspar Beelen and Daniel van Strien,http://arxiv.org/pdf/2211.10086v1 | |
http://arxiv.org/abs/2204.09168v2,creativecommons.org/licenses/by/4.0/,Analyzing Gender Representation in Multilingual Models,Hila Gonen and Shauli Ravfogel and Yoav Goldberg,http://arxiv.org/pdf/2204.09168v2 | |
http://arxiv.org/abs/2103.00854v3,creativecommons.org/licenses/by/4.0/,Vyākarana: A Colorless Green Benchmark for Syntactic Evaluation in Indic Languages,Rajaswa Patil and Jasleen Dhillon and Siddhant Mahurkar and Saumitra Kulkarni and Manav Malhotra and Veeky Baths,http://arxiv.org/pdf/2103.00854v3 | |
http://arxiv.org/abs/2108.04080v2,creativecommons.org/licenses/by/4.0/,Aspect-based Sentiment Analysis in Document -- FOMC Meeting Minutes on Economic Projection,Sarah-Yifei-Wang,http://arxiv.org/pdf/2108.04080v2 | |
http://arxiv.org/abs/2110.03142v1,creativecommons.org/licenses/by/4.0/,A Comparative Study of Transformer-Based Language Models on Extractive Question Answering,Kate Pearce and Tiffany Zhan and Aneesh Komanduri and Justin Zhan,http://arxiv.org/pdf/2110.03142v1 | |
http://arxiv.org/abs/1909.08135v3,creativecommons.org/licenses/by/4.0/,SUPP.AI: Finding Evidence for Supplement-Drug Interactions,Lucy Lu Wang and Oyvind Tafjord and Arman Cohan and Sarthak Jain and Sam Skjonsberg and Carissa Schoenick and Nick Botner and Waleed Ammar,http://arxiv.org/pdf/1909.08135v3 | |
http://arxiv.org/abs/2011.04767v1,creativecommons.org/licenses/by/4.0/,An Analysis of Dataset Overlap on Winograd-Style Tasks,Ali Emami and Adam Trischler and Kaheer Suleman and Jackie Chi Kit Cheung,http://arxiv.org/pdf/2011.04767v1 | |
http://arxiv.org/abs/2011.10285v1,creativecommons.org/licenses/by/4.0/,Learning Informative Representations of Biomedical Relations with Latent Variable Models,Harshil Shah and Julien Fauqueur,http://arxiv.org/pdf/2011.10285v1 | |
http://arxiv.org/abs/2012.15353v1,creativecommons.org/licenses/by/4.0/,Deriving Contextualised Semantic Features from BERT (and Other Transformer Model) Embeddings,Jacob Turton and David Vinson and Robert Elliott Smith,http://arxiv.org/pdf/2012.15353v1 | |
http://arxiv.org/abs/2103.06779v2,creativecommons.org/licenses/by/4.0/,MERMAID: Metaphor Generation with Symbolism and Discriminative Decoding,Tuhin Chakrabarty and Xurui Zhang and Smaranda Muresan and Nanyun Peng,http://arxiv.org/pdf/2103.06779v2 | |
http://arxiv.org/abs/2103.16102v1,creativecommons.org/licenses/by/4.0/,XRJL-HKUST at SemEval-2021 Task 4: WordNet-Enhanced Dual Multi-head Co-Attention for Reading Comprehension of Abstract Meaning,Yuxin Jiang and Ziyi Shou and Qijun Wang and Hao Wu and Fangzhen Lin,http://arxiv.org/pdf/2103.16102v1 | |
http://arxiv.org/abs/2104.03465v1,creativecommons.org/licenses/by/4.0/,Nutribullets Hybrid: Multi-document Health Summarization,Darsh J Shah and Lili Yu and Tao Lei and Regina Barzilay,http://arxiv.org/pdf/2104.03465v1 | |
http://arxiv.org/abs/2109.00239v1,creativecommons.org/licenses/by/4.0/,OptAGAN: Entropy-based finetuning on text VAE-GAN,Paolo Tirotta and Stefano Lodi,http://arxiv.org/pdf/2109.00239v1 | |
http://arxiv.org/abs/2109.13067v1,creativecommons.org/licenses/by/4.0/,Multi-Task and Multi-Corpora Training Strategies to Enhance Argumentative Sentence Linking Performance,Jan Wira Gotama Putra and Simone Teufel and Takenobu Tokunaga,http://arxiv.org/pdf/2109.13067v1 | |
http://arxiv.org/abs/2110.08355v2,creativecommons.org/licenses/by/4.0/,Clean or Annotate: How to Spend a Limited Data Collection Budget,Derek Chen and Zhou Yu and Samuel R. Bowman,http://arxiv.org/pdf/2110.08355v2 | |
http://arxiv.org/abs/2112.11670v1,creativecommons.org/licenses/by/4.0/,Domain Adaptation with Pre-trained Transformers for Query Focused Abstractive Text Summarization,Md Tahmid Rahman Laskar and Enamul Hoque and Jimmy Xiangji Huang,http://arxiv.org/pdf/2112.11670v1 | |
http://arxiv.org/abs/2204.10989v3,creativecommons.org/licenses/by/4.0/,Dialogue Meaning Representation for Task-Oriented Dialogue Systems,Xiangkun Hu and Junqi Dai and Hang Yan and Yi Zhang and Qipeng Guo and Xipeng Qiu and Zheng Zhang,http://arxiv.org/pdf/2204.10989v3 | |
http://arxiv.org/abs/2204.12061v2,creativecommons.org/licenses/by/4.0/,PLOD: An Abbreviation Detection Dataset for Scientific Documents,Leonardo Zilio and Hadeel Saadany and Prashant Sharma and Diptesh Kanojia and Constantin Orăsan,http://arxiv.org/pdf/2204.12061v2 | |
http://arxiv.org/abs/2205.12506v2,creativecommons.org/licenses/by/4.0/,Memorization in NLP Fine-tuning Methods,Fatemehsadat Mireshghallah and Archit Uniyal and Tianhao Wang and David Evans and Taylor Berg-Kirkpatrick,http://arxiv.org/pdf/2205.12506v2 | |
http://arxiv.org/abs/2206.02770v1,creativecommons.org/licenses/by/4.0/,Multimodal Contrastive Learning with LIMoE: the Language-Image Mixture of Experts,Basil Mustafa and Carlos Riquelme and Joan Puigcerver and Rodolphe Jenatton and Neil Houlsby,http://arxiv.org/pdf/2206.02770v1 | |
http://arxiv.org/abs/2209.12616v1,creativecommons.org/licenses/by/4.0/,T-NER: An All-Round Python Library for Transformer-based Named Entity Recognition,Asahi Ushio and Jose Camacho-Collados,http://arxiv.org/pdf/2209.12616v1 | |
http://arxiv.org/abs/2210.02570v1,creativecommons.org/licenses/by/4.0/,Revisiting Structured Dropout,Yiren Zhao and Oluwatomisin Dada and Xitong Gao and Robert D Mullins,http://arxiv.org/pdf/2210.02570v1 | |
http://arxiv.org/abs/2210.07469v2,creativecommons.org/licenses/by/4.0/,StyLEx: Explaining Style Using Human Lexical Annotations,Shirley Anugrah Hayati and Kyumin Park and Dheeraj Rajagopal and Lyle Ungar and Dongyeop Kang,http://arxiv.org/pdf/2210.07469v2 | |
http://arxiv.org/abs/2210.12929v1,creativecommons.org/licenses/by/4.0/,Finding Memo: Extractive Memorization in Constrained Sequence Generation Tasks,Vikas Raunak and Arul Menezes,http://arxiv.org/pdf/2210.12929v1 | |
http://arxiv.org/abs/2211.07716v1,creativecommons.org/licenses/by/4.0/,Zero-Shot Text Matching for Automated Auditing using Sentence Transformers,David Biesner and Maren Pielka and Rajkumar Ramamurthy and Tim Dilmaghani and Bernd Kliem and Rüdiger Loitz and Rafet Sifa,http://arxiv.org/pdf/2211.07716v1 | |
http://arxiv.org/abs/2211.13638v1,creativecommons.org/licenses/by/4.0/,Prototypical Fine-tuning: Towards Robust Performance Under Varying Data Sizes,Yiqiao Jin and Xiting Wang and Yaru Hao and Yizhou Sun and Xing Xie,http://arxiv.org/pdf/2211.13638v1 | |
http://arxiv.org/abs/2212.10515v1,creativecommons.org/licenses/by/4.0/,CausalDialogue: Modeling Utterance-level Causality in Conversations,Yi-Lin Tuan and Alon Albalak and Wenda Xu and Michael Saxon and Connor Pryor and Lise Getoor and William Yang Wang,http://arxiv.org/pdf/2212.10515v1 | |
http://arxiv.org/abs/2302.05454v1,creativecommons.org/licenses/by/4.0/,Distillation of encoder-decoder transformers for sequence labelling,Marco Farina and Duccio Pappadopulo and Anant Gupta and Leslie Huang and Ozan İrsoy and Thamar Solorio,http://arxiv.org/pdf/2302.05454v1 | |
http://arxiv.org/abs/2302.06598v1,creativecommons.org/licenses/by/4.0/,Gradient-Based Automated Iterative Recovery for Parameter-Efficient Tuning,Maximilian Mozes and Tolga Bolukbasi and Ann Yuan and Frederick Liu and Nithum Thain and Lucas Dixon,http://arxiv.org/pdf/2302.06598v1 | |
http://arxiv.org/abs/2303.12135v1,creativecommons.org/licenses/by/4.0/,Understand Legal Documents with Contextualized Large Language Models,Xin Jin and Yuchen Wang,http://arxiv.org/pdf/2303.12135v1 | |
http://arxiv.org/abs/2303.13314v1,creativecommons.org/licenses/by/4.0/,Leveraging Foundation Models for Clinical Text Analysis,Shaina Raza and Syed Raza Bashir,http://arxiv.org/pdf/2303.13314v1 | |
http://arxiv.org/abs/2304.06148v1,creativecommons.org/licenses/by/4.0/,Detection of Fake Generated Scientific Abstracts,Panagiotis C. Theocharopoulos and Panagiotis Anagnostou and Anastasia Tsoukala and Spiros V. Georgakopoulos and Sotiris K. Tasoulis and Vassilis P. Plagianakos,http://arxiv.org/pdf/2304.06148v1 | |
http://arxiv.org/abs/2212.08094v1,creativecommons.org/licenses/by/4.0/,Joint processing of linguistic properties in brains and language models,Subba Reddy Oota and Manish Gupta and Mariya Toneva,http://arxiv.org/pdf/2212.08094v1 | |
http://arxiv.org/abs/2108.03089v1,creativecommons.org/licenses/by/4.0/,Cross-lingual Capsule Network for Hate Speech Detection in Social Media,Aiqi Jiang and Arkaitz Zubiaga,http://arxiv.org/pdf/2108.03089v1 | |
http://arxiv.org/abs/2203.09299v1,creativecommons.org/licenses/by/4.0/,Proceedings Fifth Workshop on Models for Formal Analysis of Real Systems,Clemens Dubslaff and Bas Luttik,http://arxiv.org/pdf/2203.09299v1 | |
http://arxiv.org/abs/2207.04648v1,creativecommons.org/licenses/by/4.0/,Learning Large-scale Universal User Representation with Sparse Mixture of Experts,Caigao Jiang and Siqiao Xue and James Zhang and Lingyue Liu and Zhibo Zhu and Hongyan Hao,http://arxiv.org/pdf/2207.04648v1 | |
http://arxiv.org/abs/2012.03837v2,creativecommons.org/licenses/by/4.0/,Parallel Training of Deep Networks with Local Updates,Michael Laskin and Luke Metz and Seth Nabarro and Mark Saroufim and Badreddine Noune and Carlo Luschi and Jascha Sohl-Dickstein and Pieter Abbeel,http://arxiv.org/pdf/2012.03837v2 | |
http://arxiv.org/abs/1910.04269v1,creativecommons.org/licenses/by/4.0/,Spoken Language Identification using ConvNets,Sarthak and Shikhar Shukla and Govind Mittal,http://arxiv.org/pdf/1910.04269v1 | |
http://arxiv.org/abs/2104.00767v1,creativecommons.org/licenses/by/4.0/,Canonical and Surface Morphological Segmentation for Nguni Languages,Tumi Moeng and Sheldon Reay and Aaron Daniels and Jan Buys,http://arxiv.org/pdf/2104.00767v1 | |
http://arxiv.org/abs/2010.05953v2,creativecommons.org/licenses/by/4.0/,COMET-ATOMIC 2020: On Symbolic and Neural Commonsense Knowledge Graphs,Jena D. Hwang and Chandra Bhagavatula and Ronan Le Bras and Jeff Da and Keisuke Sakaguchi and Antoine Bosselut and Yejin Choi,http://arxiv.org/pdf/2010.05953v2 | |
http://arxiv.org/abs/2107.06955v1,creativecommons.org/licenses/by/4.0/,HTLM: Hyper-Text Pre-Training and Prompting of Language Models,Armen Aghajanyan and Dmytro Okhonko and Mike Lewis and Mandar Joshi and Hu Xu and Gargi Ghosh and Luke Zettlemoyer,http://arxiv.org/pdf/2107.06955v1 | |
http://arxiv.org/abs/2210.08773v3,creativecommons.org/licenses/by/4.0/,Plug-and-Play VQA: Zero-shot VQA by Conjoining Large Pretrained Models with Zero Training,Anthony Meng Huat Tiong and Junnan Li and Boyang Li and Silvio Savarese and Steven C. H. Hoi,http://arxiv.org/pdf/2210.08773v3 | |
http://arxiv.org/abs/2301.03238v1,creativecommons.org/licenses/by/4.0/,MAQA: A Multimodal QA Benchmark for Negation,Judith Yue Li and Aren Jansen and Qingqing Huang and Joonseok Lee and Ravi Ganti and Dima Kuzmin,http://arxiv.org/pdf/2301.03238v1 | |
http://arxiv.org/abs/2304.11029v2,creativecommons.org/licenses/by/4.0/,CLaMP: Contrastive Language-Music Pre-training for Cross-Modal Symbolic Music Information Retrieval,Shangda Wu and Dingyao Yu and Xu Tan and Maosong Sun,http://arxiv.org/pdf/2304.11029v2 | |
http://arxiv.org/abs/2110.02600v3,creativecommons.org/licenses/by/4.0/,Sequential Reptile: Inter-Task Gradient Alignment for Multilingual Learning,Seanie Lee and Hae Beom Lee and Juho Lee and Sung Ju Hwang,http://arxiv.org/pdf/2110.02600v3 | |
http://arxiv.org/abs/2203.13550v1,creativecommons.org/licenses/by/4.0/,Modeling Target-Side Morphology in Neural Machine Translation: A Comparison of Strategies,Marion Weller-Di Marco and Matthias Huck and Alexander Fraser,http://arxiv.org/pdf/2203.13550v1 | |
http://arxiv.org/abs/2205.12542v3,creativecommons.org/licenses/by/4.0/,ER-Test: Evaluating Explanation Regularization Methods for Language Models,Brihi Joshi and Aaron Chan and Ziyi Liu and Shaoliang Nie and Maziar Sanjabi and Hamed Firooz and Xiang Ren,http://arxiv.org/pdf/2205.12542v3 | |
http://arxiv.org/abs/2209.09815v2,creativecommons.org/licenses/by/4.0/,Towards Fine-tuning Pre-trained Language Models with Integer Forward and Backward Propagation,Mohammadreza Tayaranian and Alireza Ghaffari and Marzieh S. Tahaei and Mehdi Rezagholizadeh and Masoud Asgharian and Vahid Partovi Nia,http://arxiv.org/pdf/2209.09815v2 | |
http://arxiv.org/abs/2210.02914v2,creativecommons.org/licenses/by/4.0/,Generative Entity Typing with Curriculum Learning,Siyu Yuan and Deqing Yang and Jiaqing Liang and Zhixu Li and Jinxi Liu and Jingyue Huang and Yanghua Xiao,http://arxiv.org/pdf/2210.02914v2 | |
http://arxiv.org/abs/2210.17049v2,creativecommons.org/licenses/by/4.0/,Modular Hybrid Autoregressive Transducer,Zhong Meng and Tongzhou Chen and Rohit Prabhavalkar and Yu Zhang and Gary Wang and Kartik Audhkhasi and Jesse Emond and Trevor Strohman and Bhuvana Ramabhadran and W. Ronny Huang and Ehsan Variani and Yinghui Huang and Pedro J. Moreno,http://arxiv.org/pdf/2210.17049v2 | |
http://arxiv.org/abs/2212.07127v4,creativecommons.org/licenses/by/4.0/,Towards mapping the contemporary art world with ArtLM: an art-specific NLP model,Qinkai Chen and Mohamed El-Mennaoui and Antoine Fosset and Amine Rebei and Haoyang Cao and Philine Bouscasse and Christy Eóin O'Beirne and Sasha Shevchenko and Mathieu Rosenbaum,http://arxiv.org/pdf/2212.07127v4 | |
http://arxiv.org/abs/2303.14480v1,creativecommons.org/licenses/by/4.0/,GANTEE: Generative Adversatial Network for Taxonomy Entering Evaluation,Zhouhong Gu and Sihang Jiang and Jingping Liu and Yanghua Xiao and Hongwei Feng and Zhixu Li and Jiaqing Liang and Jian Zhong,http://arxiv.org/pdf/2303.14480v1 | |
http://arxiv.org/abs/1808.03570v1,creativecommons.org/licenses/by/4.0/,Densely Connected Convolutional Networks for Speech Recognition,Chia Yu Li and Ngoc Thang Vu,http://arxiv.org/pdf/1808.03570v1 | |
http://arxiv.org/abs/1811.06968v2,creativecommons.org/licenses/by/4.0/,Symbolic Register Automata,Loris D'Antoni and Tiago Ferreira and Matteo Sammartino and Alexandra Silva,http://arxiv.org/pdf/1811.06968v2 | |
http://arxiv.org/abs/1909.03526v3,creativecommons.org/licenses/by/4.0/,Multi-Task Bidirectional Transformer Representations for Irony Detection,Chiyu Zhang and Muhammad Abdul-Mageed,http://arxiv.org/pdf/1909.03526v3 | |
http://arxiv.org/abs/2109.10441v1,creativecommons.org/licenses/by/4.0/,Evaluating Debiasing Techniques for Intersectional Biases,Shivashankar Subramanian and Xudong Han and Timothy Baldwin and Trevor Cohn and Lea Frermann,http://arxiv.org/pdf/2109.10441v1 | |
http://arxiv.org/abs/2203.07890v1,creativecommons.org/licenses/by/4.0/,K-VQG: Knowledge-aware Visual Question Generation for Common-sense Acquisition,Kohei Uehara and Tatsuya Harada,http://arxiv.org/pdf/2203.07890v1 | |
http://arxiv.org/abs/2206.02661v1,creativecommons.org/licenses/by/4.0/,Evaluating Deep Taylor Decomposition for Reliability Assessment in the Wild,Stephanie Brandl and Daniel Hershcovich and Anders Søgaard,http://arxiv.org/pdf/2206.02661v1 | |
http://arxiv.org/abs/2209.14981v2,creativecommons.org/licenses/by/4.0/,Stop Wasting My Time! Saving Days of ImageNet and BERT Training with Latest Weight Averaging,Jean Kaddour,http://arxiv.org/pdf/2209.14981v2 | |
http://arxiv.org/abs/2210.05103v1,creativecommons.org/licenses/by/4.0/,Leveraging Artificial Intelligence on Binary Code Comprehension,Yifan Zhang,http://arxiv.org/pdf/2210.05103v1 | |
http://arxiv.org/abs/2212.08926v1,creativecommons.org/licenses/by/4.0/,A Simple Baseline for Beam Search Reranking,Lior Vassertail and Omer Levy,http://arxiv.org/pdf/2212.08926v1 | |
http://arxiv.org/abs/2212.10544v1,creativecommons.org/licenses/by/4.0/,Pretraining Without Attention,Junxiong Wang and Jing Nathan Yan and Albert Gu and Alexander M. Rush,http://arxiv.org/pdf/2212.10544v1 | |
http://arxiv.org/abs/2204.05541v1,creativecommons.org/licenses/by/4.0/,Not always about you: Prioritizing community needs when developing endangered language technology,Zoey Liu and Crystal Richardson and Richard Hatcher Jr and Emily Prud'hommeaux,http://arxiv.org/pdf/2204.05541v1 | |
http://arxiv.org/abs/2201.09680v1,creativecommons.org/licenses/by/4.0/,Relational Memory Augmented Language Models,Qi Liu and Dani Yogatama and Phil Blunsom,http://arxiv.org/pdf/2201.09680v1 | |
http://arxiv.org/abs/2106.06017v2,creativecommons.org/licenses/by/4.0/,Cross-lingual Emotion Detection,Sabit Hassan and Shaden Shaar and Kareem Darwish,http://arxiv.org/pdf/2106.06017v2 | |
http://arxiv.org/abs/1906.11608v2,creativecommons.org/licenses/by/4.0/,Simple Natural Language Processing Tools for Danish,Leon Derczynski,http://arxiv.org/pdf/1906.11608v2 | |
http://arxiv.org/abs/2210.11416v5,creativecommons.org/licenses/by/4.0/,Scaling Instruction-Finetuned Language Models,Hyung Won Chung and Le Hou and Shayne Longpre and Barret Zoph and Yi Tay and William Fedus and Yunxuan Li and Xuezhi Wang and Mostafa Dehghani and Siddhartha Brahma and Albert Webson and Shixiang Shane Gu and Zhuyun Dai and Mirac Suzgun and Xinyun Chen and Aakanksha Chowdhery and Alex Castro-Ros and Marie Pellat and Kevin Robinson and Dasha Valter and Sharan Narang and Gaurav Mishra and Adams Yu and Vincent Zhao and Yanping Huang and Andrew Dai and Hongkun Yu and Slav Petrov and Ed H. Chi and Jeff Dean and Jacob Devlin and Adam Roberts and Denny Zhou and Quoc V. Le and Jason Wei,http://arxiv.org/pdf/2210.11416v5 | |
http://arxiv.org/abs/2104.12470v5,creativecommons.org/licenses/by/4.0/,Easy and Efficient Transformer : Scalable Inference Solution For large NLP model,Gongzheng Li and Yadong Xi and Jingzhen Ding and Duan Wang and Bai Liu and Changjie Fan and Xiaoxi Mao and Zeng Zhao,http://arxiv.org/pdf/2104.12470v5
http://arxiv.org/abs/2111.14247v2,creativecommons.org/licenses/by/4.0/,A Survey of Large-Scale Deep Learning Serving System Optimization: Challenges and Opportunities,Fuxun Yu and Di Wang and Longfei Shangguan and Minjia Zhang and Xulong Tang and Chenchen Liu and Xiang Chen,http://arxiv.org/pdf/2111.14247v2
http://arxiv.org/abs/2304.07493v1,creativecommons.org/licenses/by/4.0/,OliVe: Accelerating Large Language Models via Hardware-friendly Outlier-Victim Pair Quantization,Cong Guo and Jiaming Tang and Weiming Hu and Jingwen Leng and Chen Zhang and Fan Yang and Yunxin Liu and Minyi Guo and Yuhao Zhu,http://arxiv.org/pdf/2304.07493v1
http://arxiv.org/abs/2304.12198v1,creativecommons.org/licenses/by/4.0/,Performance of ChatGPT on the US Fundamentals of Engineering Exam: Comprehensive Assessment of Proficiency and Potential Implications for Professional Environmental Engineering Practice,Vinay Pursnani and Yusuf Sermet and Ibrahim Demir,http://arxiv.org/pdf/2304.12198v1
http://arxiv.org/abs/2301.01701v2,creativecommons.org/licenses/by/4.0/,Extending Source Code Pre-Trained Language Models to Summarise Decompiled Binaries,Ali Al-Kaswan and Toufique Ahmed and Maliheh Izadi and Anand Ashok Sawant and Premkumar Devanbu and Arie van Deursen,http://arxiv.org/pdf/2301.01701v2
http://arxiv.org/abs/2204.06271v1,creativecommons.org/licenses/by/4.0/,TangoBERT: Reducing Inference Cost by using Cascaded Architecture,Jonathan Mamou and Oren Pereg and Moshe Wasserblat and Roy Schwartz,http://arxiv.org/pdf/2204.06271v1
http://arxiv.org/abs/2211.04939v1,creativecommons.org/licenses/by/4.0/,Efficient Speech Translation with Pre-trained Models,Zhaolin Li and Jan Niehues,http://arxiv.org/pdf/2211.04939v1
http://arxiv.org/abs/2111.02687v1,creativecommons.org/licenses/by/4.0/,CoreLM: Coreference-aware Language Model Fine-Tuning,Nikolaos Stylianou and Ioannis Vlahavas,http://arxiv.org/pdf/2111.02687v1
http://arxiv.org/abs/2303.18190v1,creativecommons.org/licenses/by/4.0/,Assessing Language Model Deployment with Risk Cards,Leon Derczynski and Hannah Rose Kirk and Vidhisha Balachandran and Sachin Kumar and Yulia Tsvetkov and M. R. Leiser and Saif Mohammad,http://arxiv.org/pdf/2303.18190v1
http://arxiv.org/abs/1911.12559v1,creativecommons.org/licenses/by/4.0/,KPTimes: A Large-Scale Dataset for Keyphrase Generation on News Documents,Ygor Gallina and Florian Boudin and Béatrice Daille,http://arxiv.org/pdf/1911.12559v1
http://arxiv.org/abs/2012.07575v2,creativecommons.org/licenses/by/4.0/,Large-scale Quantitative Evidence of Media Impact on Public Opinion toward China,Junming Huang and Gavin Cook and Yu Xie,http://arxiv.org/pdf/2012.07575v2
http://arxiv.org/abs/2103.01242v2,creativecommons.org/licenses/by/4.0/,Cryptonite: A Cryptic Crossword Benchmark for Extreme Ambiguity in Language,Avia Efrat and Uri Shaham and Dan Kilman and Omer Levy,http://arxiv.org/pdf/2103.01242v2
http://arxiv.org/abs/2110.10328v1,creativecommons.org/licenses/by/4.0/,R$^3$Net:Relation-embedded Representation Reconstruction Network for Change Captioning,Yunbin Tu and Liang Li and Chenggang Yan and Shengxiang Gao and Zhengtao Yu,http://arxiv.org/pdf/2110.10328v1
http://arxiv.org/abs/2111.11431v1,creativecommons.org/licenses/by/4.0/,"RedCaps: web-curated image-text data created by the people, for the people",Karan Desai and Gaurav Kaul and Zubin Aysola and Justin Johnson,http://arxiv.org/pdf/2111.11431v1
http://arxiv.org/abs/2203.14371v1,creativecommons.org/licenses/by/4.0/,MedMCQA : A Large-scale Multi-Subject Multi-Choice Dataset for Medical domain Question Answering,Ankit Pal and Logesh Kumar Umapathi and Malaikannan Sankarasubbu,http://arxiv.org/pdf/2203.14371v1
http://arxiv.org/abs/2204.12785v1,creativecommons.org/licenses/by/4.0/,Plug-and-Play Adaptation for Continuously-updated QA,Kyungjae Lee and Wookje Han and Seung-won Hwang and Hwaran Lee and Joonsuk Park and Sang-Woo Lee,http://arxiv.org/pdf/2204.12785v1
http://arxiv.org/abs/2205.04050v1,creativecommons.org/licenses/by/4.0/,Few-shot Mining of Naturally Occurring Inputs and Outputs,Mandar Joshi and Terra Blevins and Mike Lewis and Daniel S. Weld and Luke Zettlemoyer,http://arxiv.org/pdf/2205.04050v1
http://arxiv.org/abs/2205.10153v1,creativecommons.org/licenses/by/4.0/,Mapping Complex Technologies via Science-Technology Linkages; The Case of Neuroscience -- A transformer based keyword extraction approach,Daniel Hain and Roman Jurowetzki and Mariagrazia Squicciarini,http://arxiv.org/pdf/2205.10153v1
http://arxiv.org/abs/2206.07106v1,creativecommons.org/licenses/by/4.0/,NewsEdits: A News Article Revision Dataset and a Document-Level Reasoning Challenge,Alexander Spangher and Xiang Ren and Jonathan May and Nanyun Peng,http://arxiv.org/pdf/2206.07106v1
http://arxiv.org/abs/2210.05257v1,creativecommons.org/licenses/by/4.0/,Rethinking the Event Coding Pipeline with Prompt Entailment,Clément Lefebvre and Niklas Stoehr,http://arxiv.org/pdf/2210.05257v1
http://arxiv.org/abs/2302.14534v1,creativecommons.org/licenses/by/4.0/,Spacerini: Plug-and-play Search Engines with Pyserini and Hugging Face,Christopher Akiki and Odunayo Ogundepo and Aleksandra Piktus and Xinyu Zhang and Akintunde Oladipo and Jimmy Lin and Martin Potthast,http://arxiv.org/pdf/2302.14534v1
http://arxiv.org/abs/2212.08967v1,creativecommons.org/licenses/by/4.0/,"Foundation models in brief: A historical, socio-technical focus",Johannes Schneider,http://arxiv.org/pdf/2212.08967v1
http://arxiv.org/abs/1711.05159v5,creativecommons.org/licenses/by/4.0/,"Classical Control, Quantum Circuits and Linear Logic in Enriched Category Theory",Mathys Rennela and Sam Staton,http://arxiv.org/pdf/1711.05159v5
http://arxiv.org/abs/2109.04727v1,creativecommons.org/licenses/by/4.0/,A Simple and Effective Method To Eliminate the Self Language Bias in Multilingual Representations,Ziyi Yang and Yinfei Yang and Daniel Cer and Eric Darve,http://arxiv.org/pdf/2109.04727v1
http://arxiv.org/abs/2205.12148v3,creativecommons.org/licenses/by/4.0/,Hyper-X: A Unified Hypernetwork for Multi-Task Multilingual Transfer,Ahmet Üstün and Arianna Bisazza and Gosse Bouma and Gertjan van Noord and Sebastian Ruder,http://arxiv.org/pdf/2205.12148v3
http://arxiv.org/abs/2210.11359v1,creativecommons.org/licenses/by/4.0/,Data-Efficient Strategies for Expanding Hate Speech Detection into Under-Resourced Languages,Paul Röttger and Debora Nozza and Federico Bianchi and Dirk Hovy,http://arxiv.org/pdf/2210.11359v1
http://arxiv.org/abs/2211.02136v1,creativecommons.org/licenses/by/4.0/,Logographic Information Aids Learning Better Representations for Natural Language Inference,Zijian Jin and Duygu Ataman,http://arxiv.org/pdf/2211.02136v1
http://arxiv.org/abs/2301.08115v1,creativecommons.org/licenses/by/4.0/,Language Embeddings Sometimes Contain Typological Generalizations,Robert Östling and Murathan Kurfalı,http://arxiv.org/pdf/2301.08115v1
http://arxiv.org/abs/2303.01616v1,creativecommons.org/licenses/by/4.0/,Separated and Shared Effects in Higher-Order Languages,Pedro H. Azevedo de Amorim and Justin Hsu,http://arxiv.org/pdf/2303.01616v1
http://arxiv.org/abs/2104.09777v2,creativecommons.org/licenses/by/4.0/,Subsentence Extraction from Text Using Coverage-Based Deep Learning Language Models,JongYoon Lim and Inkyu Sa and Ho Seok Ahn and Norina Gasteiger and Sanghyub John Lee and Bruce MacDonald,http://arxiv.org/pdf/2104.09777v2
http://arxiv.org/abs/2110.07592v3,creativecommons.org/licenses/by/4.0/,DeToxy: A Large-Scale Multimodal Dataset for Toxicity Classification in Spoken Utterances,Sreyan Ghosh and Samden Lepcha and S Sakshi and Rajiv Ratn Shah and S. Umesh,http://arxiv.org/pdf/2110.07592v3
http://arxiv.org/abs/2201.02662v2,creativecommons.org/licenses/by/4.0/,Imagined versus Remembered Stories: Quantifying Differences in Narrative Flow,Maarten Sap and Anna Jafarpour and Yejin Choi and Noah A. Smith and James W. Pennebaker and Eric Horvitz,http://arxiv.org/pdf/2201.02662v2
http://arxiv.org/abs/2211.12164v1,creativecommons.org/licenses/by/4.0/,OLGA : An Ontology and LSTM-based approach for generating Arithmetic Word Problems (AWPs) of transfer type,Suresh Kumar and P Sreenivasa Kumar,http://arxiv.org/pdf/2211.12164v1
http://arxiv.org/abs/2301.12030v1,creativecommons.org/licenses/by/4.0/,TiLT: A Time-Centric Approach for Stream Query Optimization and Parallelization,Anand Jayarajan and Wei Zhao and Yudi Sun and Gennady Pekhimenko,http://arxiv.org/pdf/2301.12030v1
http://arxiv.org/abs/2301.12867v3,creativecommons.org/licenses/by/4.0/,Exploring AI Ethics of ChatGPT: A Diagnostic Analysis,Terry Yue Zhuo and Yujin Huang and Chunyang Chen and Zhenchang Xing,http://arxiv.org/pdf/2301.12867v3
http://arxiv.org/abs/1507.01701v1,creativecommons.org/licenses/by/4.0/,A Survey and Classification of Controlled Natural Languages,Tobias Kuhn,http://arxiv.org/pdf/1507.01701v1
http://arxiv.org/abs/1711.05468v1,creativecommons.org/licenses/by/4.0/,Tracking Typological Traits of Uralic Languages in Distributed Language Representations,Johannes Bjerva and Isabelle Augenstein,http://arxiv.org/pdf/1711.05468v1
http://arxiv.org/abs/2210.05726v1,creativecommons.org/licenses/by/4.0/,Automatic Speech Recognition of Low-Resource Languages Based on Chukchi,Anastasia Safonova and Tatiana Yudina and Emil Nadimanov and Cydnie Davenport,http://arxiv.org/pdf/2210.05726v1
http://arxiv.org/abs/2106.00143v1,creativecommons.org/licenses/by/4.0/,An Exploratory Analysis of Multilingual Word-Level Quality Estimation with Cross-Lingual Transformers,Tharindu Ranasinghe and Constantin Orasan and Ruslan Mitkov,http://arxiv.org/pdf/2106.00143v1
http://arxiv.org/abs/2110.15621v1,creativecommons.org/licenses/by/4.0/,MentalBERT: Publicly Available Pretrained Language Models for Mental Healthcare,Shaoxiong Ji and Tianlin Zhang and Luna Ansari and Jie Fu and Prayag Tiwari and Erik Cambria,http://arxiv.org/pdf/2110.15621v1
http://arxiv.org/abs/2203.09679v1,creativecommons.org/licenses/by/4.0/,Modeling Intensification for Sign Language Generation: A Computational Approach,Mert İnan and Yang Zhong and Sabit Hassan and Lorna Quandt and Malihe Alikhani,http://arxiv.org/pdf/2203.09679v1
http://arxiv.org/abs/2207.00560v1,creativecommons.org/licenses/by/4.0/,Is neural language acquisition similar to natural? A chronological probing study,Ekaterina Voloshina and Oleg Serikov and Tatiana Shavrina,http://arxiv.org/pdf/2207.00560v1
http://arxiv.org/abs/2207.05666v1,creativecommons.org/licenses/by/4.0/,Zero-shot Cross-lingual Transfer is Under-specified Optimization,Shijie Wu and Benjamin Van Durme and Mark Dredze,http://arxiv.org/pdf/2207.05666v1
http://arxiv.org/abs/2207.10576v2,creativecommons.org/licenses/by/4.0/,Democratizing Ethical Assessment of Natural Language Generation Models,Amin Rasekh and Ian Eisenberg,http://arxiv.org/pdf/2207.10576v2
http://arxiv.org/abs/2208.10801v1,creativecommons.org/licenses/by/4.0/,MATra: A Multilingual Attentive Transliteration System for Indian Scripts,Yash Raj and Bhavesh Laddagiri,http://arxiv.org/pdf/2208.10801v1
http://arxiv.org/abs/2211.08237v2,creativecommons.org/licenses/by/4.0/,Multilingual Speech Emotion Recognition With Multi-Gating Mechanism and Neural Architecture Search,Zihan Wang and Qi Meng and HaiFeng Lan and XinRui Zhang and KeHao Guo and Akshat Gupta,http://arxiv.org/pdf/2211.08237v2
http://arxiv.org/abs/2302.09345v1,creativecommons.org/licenses/by/4.0/,Improving the Out-Of-Distribution Generalization Capability of Language Models: Counterfactually-Augmented Data is not Enough,Caoyun Fan and Wenqing Chen and Jidong Tian and Yitian Li and Hao He and Yaohui Jin,http://arxiv.org/pdf/2302.09345v1
http://arxiv.org/abs/2303.01157v2,creativecommons.org/licenses/by/4.0/,How will Language Modelers like ChatGPT Affect Occupations and Industries?,Ed Felten and Manav Raj and Robert Seamans,http://arxiv.org/pdf/2303.01157v2
http://arxiv.org/abs/2212.00006v1,creativecommons.org/licenses/by/4.0/,"Operationalizing Specifications, In Addition to Test Sets for Evaluating Constrained Generative Models",Vikas Raunak and Matt Post and Arul Menezes,http://arxiv.org/pdf/2212.00006v1
http://arxiv.org/abs/2302.07867v3,creativecommons.org/licenses/by/4.0/,Learning Performance-Improving Code Edits,Aman Madaan and Alexander Shypula and Uri Alon and Milad Hashemi and Parthasarathy Ranganathan and Yiming Yang and Graham Neubig and Amir Yazdanbakhsh,http://arxiv.org/pdf/2302.07867v3
http://arxiv.org/abs/2012.04080v1,creativecommons.org/licenses/by/4.0/,A Taxonomy of Empathetic Response Intents in Human Social Conversations,Anuradha Welivita and Pearl Pu,http://arxiv.org/pdf/2012.04080v1
http://arxiv.org/abs/2303.16985v1,creativecommons.org/licenses/by/4.0/,Adapting to the Low-Resource Double-Bind: Investigating Low-Compute Methods on Low-Resource African Languages,Colin Leong and Herumb Shandilya and Bonaventure F. P. Dossou and Atnafu Lambebo Tonja and Joel Mathew and Abdul-Hakeem Omotayo and Oreen Yousuf and Zainab Akinjobi and Chris Chinenye Emezue and Shamsudeen Muhammad and Steven Kolawole and Younwoo Choi and Tosin Adewumi,http://arxiv.org/pdf/2303.16985v1
http://arxiv.org/abs/2102.00291v1,creativecommons.org/licenses/by/4.0/,Speech Recognition by Simply Fine-tuning BERT,Wen-Chin Huang and Chia-Hua Wu and Shang-Bao Luo and Kuan-Yu Chen and Hsin-Min Wang and Tomoki Toda,http://arxiv.org/pdf/2102.00291v1
http://arxiv.org/abs/2103.05070v1,creativecommons.org/licenses/by/4.0/,Text Simplification by Tagging,Kostiantyn Omelianchuk and Vipul Raheja and Oleksandr Skurzhanskyi,http://arxiv.org/pdf/2103.05070v1
http://arxiv.org/abs/2103.09548v1,creativecommons.org/licenses/by/4.0/,ENCONTER: Entity Constrained Progressive Sequence Generation via Insertion-based Transformer,Lee-Hsun Hsieh and Yang-Yin Lee and Ee-Peng Lim,http://arxiv.org/pdf/2103.09548v1
http://arxiv.org/abs/2104.05696v1,creativecommons.org/licenses/by/4.0/,Joint Universal Syntactic and Semantic Parsing,Elias Stengel-Eskin and Kenton Murray and Sheng Zhang and Aaron Steven White and Benjamin Van Durme,http://arxiv.org/pdf/2104.05696v1
http://arxiv.org/abs/2104.08459v3,creativecommons.org/licenses/by/4.0/,KazakhTTS: An Open-Source Kazakh Text-to-Speech Synthesis Dataset,Saida Mussakhojayeva and Aigerim Janaliyeva and Almas Mirzakhmetov and Yerbolat Khassanov and Huseyin Atakan Varol,http://arxiv.org/pdf/2104.08459v3
http://arxiv.org/abs/2106.00149v1,creativecommons.org/licenses/by/4.0/,HiddenCut: Simple Data Augmentation for Natural Language Understanding with Better Generalization,Jiaao Chen and Dinghan Shen and Weizhu Chen and Diyi Yang,http://arxiv.org/pdf/2106.00149v1
http://arxiv.org/abs/2107.12460v1,creativecommons.org/licenses/by/4.0/,Don't Sweep your Learning Rate under the Rug: A Closer Look at Cross-modal Transfer of Pretrained Transformers,Danielle Rothermel and Margaret Li and Tim Rocktäschel and Jakob Foerster,http://arxiv.org/pdf/2107.12460v1
http://arxiv.org/abs/2107.14600v2,creativecommons.org/licenses/by/4.0/,MDQE: A More Accurate Direct Pretraining for Machine Translation Quality Estimation,Lei Lin,http://arxiv.org/pdf/2107.14600v2
http://arxiv.org/abs/2108.01377v1,creativecommons.org/licenses/by/4.0/,A Dynamic Head Importance Computation Mechanism for Neural Machine Translation,Akshay Goindani and Manish Shrivastava,http://arxiv.org/pdf/2108.01377v1
http://arxiv.org/abs/2108.06614v1,creativecommons.org/licenses/by/4.0/,The SelectGen Challenge: Finding the Best Training Samples for Few-Shot Neural Text Generation,Ernie Chang and Xiaoyu Shen and Alex Marin and Vera Demberg,http://arxiv.org/pdf/2108.06614v1
http://arxiv.org/abs/2108.09164v1,creativecommons.org/licenses/by/4.0/,A Neural Conversation Generation Model via Equivalent Shared Memory Investigation,Changzhen Ji and Yating Zhang and Xiaozhong Liu and Adam Jatowt and Changlong Sun and Conghui Zhu and Tiejun Zhao,http://arxiv.org/pdf/2108.09164v1
http://arxiv.org/abs/2110.01643v1,creativecommons.org/licenses/by/4.0/,Privacy enabled Financial Text Classification using Differential Privacy and Federated Learning,Priyam Basu and Tiasa Singha Roy and Rakshit Naidu and Zumrut Muftuoglu,http://arxiv.org/pdf/2110.01643v1
http://arxiv.org/abs/2110.13495v1,creativecommons.org/licenses/by/4.0/,Assessing the Sufficiency of Arguments through Conclusion Generation,Timon Gurcke and Milad Alshomary and Henning Wachsmuth,http://arxiv.org/pdf/2110.13495v1
http://arxiv.org/abs/2112.01187v1,creativecommons.org/licenses/by/4.0/,Computing Class Hierarchies from Classifiers,Kai Kang and Fangzhen Lin,http://arxiv.org/pdf/2112.01187v1
http://arxiv.org/abs/2201.09523v1,creativecommons.org/licenses/by/4.0/,BTPK-based learning: An Interpretable Method for Named Entity Recognition,Yulin Chen and Zelai Yao and Haixiao Chi and Dov Gabbay and Bo Yuan and Bruno Bentzen and Beishui Liao,http://arxiv.org/pdf/2201.09523v1
http://arxiv.org/abs/2203.09178v1,creativecommons.org/licenses/by/4.0/,Multilingual Detection of Personal Employment Status on Twitter,Manuel Tonneau and Dhaval Adjodah and João Palotti and Nir Grinberg and Samuel Fraiberger,http://arxiv.org/pdf/2203.09178v1
http://arxiv.org/abs/2204.08952v3,creativecommons.org/licenses/by/4.0/,Retrieval Enhanced Data Augmentation for Question Answering on Privacy Policies,Md Rizwan Parvez and Jianfeng Chi and Wasi Uddin Ahmad and Yuan Tian and Kai-Wei Chang,http://arxiv.org/pdf/2204.08952v3
http://arxiv.org/abs/2205.00498v2,creativecommons.org/licenses/by/4.0/,CUP: Curriculum Learning based Prompt Tuning for Implicit Event Argument Extraction,Jiaju Lin and Qin Chen and Jie Zhou and Jian Jin and Liang He,http://arxiv.org/pdf/2205.00498v2
http://arxiv.org/abs/2205.13792v2,creativecommons.org/licenses/by/4.0/,kNN-Prompt: Nearest Neighbor Zero-Shot Inference,Weijia Shi and Julian Michael and Suchin Gururangan and Luke Zettlemoyer,http://arxiv.org/pdf/2205.13792v2
http://arxiv.org/abs/2205.15661v1,creativecommons.org/licenses/by/4.0/,NEWTS: A Corpus for News Topic-Focused Summarization,Seyed Ali Bahrainian and Sheridan Feucht and Carsten Eickhoff,http://arxiv.org/pdf/2205.15661v1
http://arxiv.org/abs/2206.05696v1,creativecommons.org/licenses/by/4.0/,Grounding in social media: An approach to building a chit-chat dialogue model,Ritvik Choudhary and Daisuke Kawahara,http://arxiv.org/pdf/2206.05696v1
http://arxiv.org/abs/2209.01712v1,creativecommons.org/licenses/by/4.0/,ChemBERTa-2: Towards Chemical Foundation Models,Walid Ahmad and Elana Simon and Seyone Chithrananda and Gabriel Grand and Bharath Ramsundar,http://arxiv.org/pdf/2209.01712v1
http://arxiv.org/abs/2209.12687v1,creativecommons.org/licenses/by/4.0/,"A Case Report On ""The A.I. Locked-In Problem"": social concerns with modern NLP",Yoshija Walter,http://arxiv.org/pdf/2209.12687v1
http://arxiv.org/abs/2209.12953v1,creativecommons.org/licenses/by/4.0/,Dialog Acts for Task-Driven Embodied Agents,Spandana Gella and Aishwarya Padmakumar and Patrick Lange and Dilek Hakkani-Tur,http://arxiv.org/pdf/2209.12953v1
http://arxiv.org/abs/2210.00705v2,creativecommons.org/licenses/by/4.0/,SpeechCLIP: Integrating Speech with Pre-Trained Vision and Language Model,Yi-Jen Shih and Hsuan-Fu Wang and Heng-Jui Chang and Layne Berry and Hung-yi Lee and David Harwath,http://arxiv.org/pdf/2210.00705v2
http://arxiv.org/abs/2210.13952v5,creativecommons.org/licenses/by/4.0/,KnowGL: Knowledge Generation and Linking from Text,Gaetano Rossiello and Md Faisal Mahbub Chowdhury and Nandana Mihindukulasooriya and Owen Cornec and Alfio Massimiliano Gliozzo,http://arxiv.org/pdf/2210.13952v5
http://arxiv.org/abs/2210.17525v1,creativecommons.org/licenses/by/4.0/,Query Refinement Prompts for Closed-Book Long-Form Question Answering,Reinald Kim Amplayo and Kellie Webster and Michael Collins and Dipanjan Das and Shashi Narayan,http://arxiv.org/pdf/2210.17525v1
http://arxiv.org/abs/2212.01453v1,creativecommons.org/licenses/by/4.0/,Twitter Data Analysis: Izmir Earthquake Case,Özgür Agrali and Hakan Sökün and Enis Karaarslan,http://arxiv.org/pdf/2212.01453v1
http://arxiv.org/abs/2212.09865v1,creativecommons.org/licenses/by/4.0/,Z-ICL: Zero-Shot In-Context Learning with Pseudo-Demonstrations,Xinxi Lyu and Sewon Min and Iz Beltagy and Luke Zettlemoyer and Hannaneh Hajishirzi,http://arxiv.org/pdf/2212.09865v1
http://arxiv.org/abs/2212.10504v1,creativecommons.org/licenses/by/4.0/,Can Current Task-oriented Dialogue Models Automate Real-world Scenarios in the Wild?,Sang-Woo Lee and Sungdong Kim and Donghyeon Ko and Donghoon Ham and Youngki Hong and Shin Ah Oh and Hyunhoon Jung and Wangkyo Jung and Kyunghyun Cho and Donghyun Kwak and Hyungsuk Noh and Woomyoung Park,http://arxiv.org/pdf/2212.10504v1
http://arxiv.org/abs/2304.01097v2,creativecommons.org/licenses/by/4.0/,DoctorGLM: Fine-tuning your Chinese Doctor is not a Herculean Task,Honglin Xiong and Sheng Wang and Yitao Zhu and Zihao Zhao and Yuxiao Liu and Linlin Huang and Qian Wang and Dinggang Shen,http://arxiv.org/pdf/2304.01097v2
http://arxiv.org/abs/2304.02822v1,creativecommons.org/licenses/by/4.0/,Approach Intelligent Writing Assistants Usability with Seven Stages of Action,Avinash Bhat and Disha Shrivastava and Jin L. C. Guo,http://arxiv.org/pdf/2304.02822v1
http://arxiv.org/abs/2304.09865v1,creativecommons.org/licenses/by/4.0/,Safer Conversational AI as a Source of User Delight,Xiaoding Lu and Aleksey Korshuk and Zongyi Liu and William Beauchamp and Chai Research,http://arxiv.org/pdf/2304.09865v1
http://arxiv.org/abs/2201.01549v4,creativecommons.org/licenses/by/4.0/,SPT-Code: Sequence-to-Sequence Pre-Training for Learning Source Code Representations,Changan Niu and Chuanyi Li and Vincent Ng and Jidong Ge and Liguo Huang and Bin Luo,http://arxiv.org/pdf/2201.01549v4
http://arxiv.org/abs/2206.10265v2,creativecommons.org/licenses/by/4.0/,KnowDA: All-in-One Knowledge Mixture Model for Data Augmentation in Low-Resource NLP,Yufei Wang and Jiayi Zheng and Can Xu and Xiubo Geng and Tao Shen and Chongyang Tao and Daxin Jiang,http://arxiv.org/pdf/2206.10265v2
http://arxiv.org/abs/1805.01952v2,creativecommons.org/licenses/by/4.0/,A Coherent Unsupervised Model for Toponym Resolution,Ehsan Kamalloo and Davood Rafiei,http://arxiv.org/pdf/1805.01952v2
http://arxiv.org/abs/2106.13876v4,creativecommons.org/licenses/by/4.0/,Knowledge-Grounded Self-Rationalization via Extractive and Natural Language Explanations,Bodhisattwa Prasad Majumder and Oana-Maria Camburu and Thomas Lukasiewicz and Julian McAuley,http://arxiv.org/pdf/2106.13876v4
http://arxiv.org/abs/2204.13874v1,creativecommons.org/licenses/by/4.0/,OA-Mine: Open-World Attribute Mining for E-Commerce Products with Weak Supervision,Xinyang Zhang and Chenwei Zhang and Xian Li and Xin Luna Dong and Jingbo Shang and Christos Faloutsos and Jiawei Han,http://arxiv.org/pdf/2204.13874v1
http://arxiv.org/abs/2206.02369v2,creativecommons.org/licenses/by/4.0/,Learning to Break the Loop: Analyzing and Mitigating Repetitions for Neural Text Generation,Jin Xu and Xiaojiang Liu and Jianhao Yan and Deng Cai and Huayang Li and Jian Li,http://arxiv.org/pdf/2206.02369v2
http://arxiv.org/abs/2208.08080v1,creativecommons.org/licenses/by/4.0/,Multimodal Lecture Presentations Dataset: Understanding Multimodality in Educational Slides,Dong Won Lee and Chaitanya Ahuja and Paul Pu Liang and Sanika Natu and Louis-Philippe Morency,http://arxiv.org/pdf/2208.08080v1
http://arxiv.org/abs/2209.05433v2,creativecommons.org/licenses/by/4.0/,FP8 Formats for Deep Learning,Paulius Micikevicius and Dusan Stosic and Neil Burgess and Marius Cornea and Pradeep Dubey and Richard Grisenthwaite and Sangwon Ha and Alexander Heinecke and Patrick Judd and John Kamalu and Naveen Mellempudi and Stuart Oberman and Mohammad Shoeybi and Michael Siu and Hao Wu,http://arxiv.org/pdf/2209.05433v2
http://arxiv.org/abs/2209.15197v1,creativecommons.org/licenses/by/4.0/,Evaluation of taxonomic and neural embedding methods for calculating semantic similarity,Dongqiang Yang and Yanqin Yin,http://arxiv.org/pdf/2209.15197v1
http://arxiv.org/abs/2302.07324v1,creativecommons.org/licenses/by/4.0/,READIN: A Chinese Multi-Task Benchmark with Realistic and Diverse Input Noises,Chenglei Si and Zhengyan Zhang and Yingfa Chen and Xiaozhi Wang and Zhiyuan Liu and Maosong Sun,http://arxiv.org/pdf/2302.07324v1
http://arxiv.org/abs/2303.15078v2,creativecommons.org/licenses/by/4.0/,Large Language Models are Diverse Role-Players for Summarization Evaluation,Ning Wu and Ming Gong and Linjun Shou and Shining Liang and Daxin Jiang,http://arxiv.org/pdf/2303.15078v2
http://arxiv.org/abs/2304.02080v1,creativecommons.org/licenses/by/4.0/,Scalable and Accurate Self-supervised Multimodal Representation Learning without Aligned Video and Text Data,Vladislav Lialin and Stephen Rawls and David Chan and Shalini Ghosh and Anna Rumshisky and Wael Hamza,http://arxiv.org/pdf/2304.02080v1
http://arxiv.org/abs/2111.09453v3,creativecommons.org/licenses/by/4.0/,RoBERTuito: a pre-trained language model for social media text in Spanish,Juan Manuel Pérez and Damián A. Furman and Laura Alonso Alemany and Franco Luque,http://arxiv.org/pdf/2111.09453v3
http://arxiv.org/abs/2104.08666v2,creativecommons.org/licenses/by/4.0/,Worst of Both Worlds: Biases Compound in Pre-trained Vision-and-Language Models,Tejas Srinivasan and Yonatan Bisk,http://arxiv.org/pdf/2104.08666v2
http://arxiv.org/abs/2206.11993v1,creativecommons.org/licenses/by/4.0/,A Disability Lens towards Biases in GPT-3 Generated Open-Ended Languages,Akhter Al Amin and Kazi Sinthia Kabir,http://arxiv.org/pdf/2206.11993v1
http://arxiv.org/abs/2206.01685v2,creativecommons.org/licenses/by/4.0/,Toward a realistic model of speech processing in the brain with self-supervised learning,Juliette Millet and Charlotte Caucheteux and Pierre Orhan and Yves Boubenec and Alexandre Gramfort and Ewan Dunbar and Christophe Pallier and Jean-Remi King,http://arxiv.org/pdf/2206.01685v2
http://arxiv.org/abs/2211.16044v1,creativecommons.org/licenses/by/4.0/,Model Extraction Attack against Self-supervised Speech Models,Tsu-Yuan Hsu and Chen-An Li and Tung-Yu Wu and Hung-yi Lee,http://arxiv.org/pdf/2211.16044v1
http://arxiv.org/abs/2103.07762v2,creativecommons.org/licenses/by/4.0/,OkwuGbé: End-to-End Speech Recognition for Fon and Igbo,Bonaventure F. P. Dossou and Chris C. Emezue,http://arxiv.org/pdf/2103.07762v2
http://arxiv.org/abs/2205.03983v3,creativecommons.org/licenses/by/4.0/,Building Machine Translation Systems for the Next Thousand Languages,Ankur Bapna and Isaac Caswell and Julia Kreutzer and Orhan Firat and Daan van Esch and Aditya Siddhant and Mengmeng Niu and Pallavi Baljekar and Xavier Garcia and Wolfgang Macherey and Theresa Breiner and Vera Axelrod and Jason Riesa and Yuan Cao and Mia Xu Chen and Klaus Macherey and Maxim Krikun and Pidong Wang and Alexander Gutkin and Apurva Shah and Yanping Huang and Zhifeng Chen and Yonghui Wu and Macduff Hughes,http://arxiv.org/pdf/2205.03983v3
http://arxiv.org/abs/2302.01496v1,creativecommons.org/licenses/by/4.0/,Efficient Domain Adaptation for Speech Foundation Models,Bo Li and Dongseong Hwang and Zhouyuan Huo and Junwen Bai and Guru Prakash and Tara N. Sainath and Khe Chai Sim and Yu Zhang and Wei Han and Trevor Strohman and Francoise Beaufays,http://arxiv.org/pdf/2302.01496v1
http://arxiv.org/abs/2303.16133v1,creativecommons.org/licenses/by/4.0/,Exposing and Addressing Cross-Task Inconsistency in Unified Vision-Language Models,Adyasha Maharana and Amita Kamath and Christopher Clark and Mohit Bansal and Aniruddha Kembhavi,http://arxiv.org/pdf/2303.16133v1
http://arxiv.org/abs/2010.02429v3,creativecommons.org/licenses/by/4.0/,Modeling Preconditions in Text with a Crowd-sourced Dataset,Heeyoung Kwon and Mahnaz Koupaee and Pratyush Singh and Gargi Sawhney and Anmol Shukla and Keerthi Kumar Kallur and Nathanael Chambers and Niranjan Balasubramanian,http://arxiv.org/pdf/2010.02429v3
http://arxiv.org/abs/2106.13802v1,creativecommons.org/licenses/by/4.0/,Efficient Document Image Classification Using Region-Based Graph Neural Network,Jaya Krishna Mandivarapu and Eric Bunch and Qian You and Glenn Fung,http://arxiv.org/pdf/2106.13802v1
http://arxiv.org/abs/2107.06785v2,creativecommons.org/licenses/by/4.0/,Large-Scale News Classification using BERT Language Model: Spark NLP Approach,Kuncahyo Setyo Nugroho and Anantha Yullian Sukmadewa and Novanto Yudistira,http://arxiv.org/pdf/2107.06785v2
http://arxiv.org/abs/2112.09118v4,creativecommons.org/licenses/by/4.0/,Unsupervised Dense Information Retrieval with Contrastive Learning,Gautier Izacard and Mathilde Caron and Lucas Hosseini and Sebastian Riedel and Piotr Bojanowski and Armand Joulin and Edouard Grave,http://arxiv.org/pdf/2112.09118v4
http://arxiv.org/abs/2205.15241v2,creativecommons.org/licenses/by/4.0/,Multi-Game Decision Transformers,Kuang-Huei Lee and Ofir Nachum and Mengjiao Yang and Lisa Lee and Daniel Freeman and Winnie Xu and Sergio Guadarrama and Ian Fischer and Eric Jang and Henryk Michalewski and Igor Mordatch,http://arxiv.org/pdf/2205.15241v2
http://arxiv.org/abs/2211.15388v2,creativecommons.org/licenses/by/4.0/,Shifted Diffusion for Text-to-image Generation,Yufan Zhou and Bingchen Liu and Yizhe Zhu and Xiao Yang and Changyou Chen and Jinhui Xu,http://arxiv.org/pdf/2211.15388v2
http://arxiv.org/abs/2102.10094v3,creativecommons.org/licenses/by/4.0/,Formal Language Theory Meets Modern NLP,William Merrill,http://arxiv.org/pdf/2102.10094v3
http://arxiv.org/abs/2203.10321v1,creativecommons.org/licenses/by/4.0/,Sequence-to-Sequence Knowledge Graph Completion and Question Answering,Apoorv Saxena and Adrian Kochsiek and Rainer Gemulla,http://arxiv.org/pdf/2203.10321v1
http://arxiv.org/abs/2206.12866v1,creativecommons.org/licenses/by/4.0/,Contextual embedding and model weighting by fusing domain knowledge on Biomedical Question Answering,Yuxuan Lu and Jingya Yan and Zhixuan Qi and Zhongzheng Ge and Yongping Du,http://arxiv.org/pdf/2206.12866v1
http://arxiv.org/abs/2211.03270v1,creativecommons.org/licenses/by/4.0/,Reconciliation of Pre-trained Models and Prototypical Neural Networks in Few-shot Named Entity Recognition,Youcheng Huang and Wenqiang Lei and Jie Fu and Jiancheng Lv,http://arxiv.org/pdf/2211.03270v1
http://arxiv.org/abs/2011.08242v1,creativecommons.org/licenses/by/4.0/,Opportunities and Challenges for Circuit Board Level Hardware Description Languages,Richard Lin and Björn Hartmann,http://arxiv.org/pdf/2011.08242v1
http://arxiv.org/abs/2012.11995v1,creativecommons.org/licenses/by/4.0/,Pre-Training a Language Model Without Human Language,Cheng-Han Chiang and Hung-yi Lee,http://arxiv.org/pdf/2012.11995v1
http://arxiv.org/abs/2109.00087v3,creativecommons.org/licenses/by/4.0/,It's not Rocket Science : Interpreting Figurative Language in Narratives,Tuhin Chakrabarty and Yejin Choi and Vered Shwartz,http://arxiv.org/pdf/2109.00087v3
http://arxiv.org/abs/2304.00906v1,creativecommons.org/licenses/by/4.0/,ScandEval: A Benchmark for Scandinavian Natural Language Processing,Dan Saattrup Nielsen,http://arxiv.org/pdf/2304.00906v1
http://arxiv.org/abs/2108.09931v1,creativecommons.org/licenses/by/4.0/,"Towards a Formal Modelling, Analysis, and Verification of a Clone Node Attack Detection Scheme in the Internet of Things",Khizar Hameed and Saurabh Garg and Muhammad Bilal Amin and Byeong Kang,http://arxiv.org/pdf/2108.09931v1
http://arxiv.org/abs/2301.00704v1,creativecommons.org/licenses/by/4.0/,Muse: Text-To-Image Generation via Masked Generative Transformers,Huiwen Chang and Han Zhang and Jarred Barber and AJ Maschinot and Jose Lezama and Lu Jiang and Ming-Hsuan Yang and Kevin Murphy and William T. Freeman and Michael Rubinstein and Yuanzhen Li and Dilip Krishnan,http://arxiv.org/pdf/2301.00704v1
http://arxiv.org/abs/2010.13347v1,creativecommons.org/licenses/by/4.0/,"A Language and Methodology based on Scenarios, Grammars and Views, for Administrative Business Processes Modelling",Milliam Maxime Zekeng Ndadji and Maurice Tchoupé Tchendji and Clémentin Tayou Djamegni and Didier Parigot,http://arxiv.org/pdf/2010.13347v1
http://arxiv.org/abs/2012.03864v1,creativecommons.org/licenses/by/4.0/,Evaluating Cross-Lingual Transfer Learning Approaches in Multilingual Conversational Agent Models,Lizhen Tan and Olga Golovneva,http://arxiv.org/pdf/2012.03864v1
http://arxiv.org/abs/2210.15184v2,creativecommons.org/licenses/by/4.0/,Too Brittle To Touch: Comparing the Stability of Quantization and Distillation Towards Developing Lightweight Low-Resource MT Models,Harshita Diddee and Sandipan Dandapat and Monojit Choudhury and Tanuja Ganu and Kalika Bali,http://arxiv.org/pdf/2210.15184v2
http://arxiv.org/abs/2106.03441v3,creativecommons.org/licenses/by/4.0/,Attention Temperature Matters in Abstractive Summarization Distillation,Shengqiang Zhang and Xingxing Zhang and Hangbo Bao and Furu Wei,http://arxiv.org/pdf/2106.03441v3
http://arxiv.org/abs/2006.16370v1,creativecommons.org/licenses/by/4.0/,Classification of cancer pathology reports: a large-scale comparative study,Stefano Martina and Leonardo Ventura and Paolo Frasconi,http://arxiv.org/pdf/2006.16370v1
http://arxiv.org/abs/2210.13312v2,creativecommons.org/licenses/by/4.0/,Neural Theory-of-Mind? On the Limits of Social Intelligence in Large LMs,Maarten Sap and Ronan LeBras and Daniel Fried and Yejin Choi,http://arxiv.org/pdf/2210.13312v2
http://arxiv.org/abs/2112.09866v1,creativecommons.org/licenses/by/4.0/,Cascading Adaptors to Leverage English Data to Improve Performance of Question Answering for Low-Resource Languages,Hariom A. Pandya and Bhavik Ardeshna and Dr. Brijesh S. Bhatt,http://arxiv.org/pdf/2112.09866v1
http://arxiv.org/abs/2203.09435v2,creativecommons.org/licenses/by/4.0/,Expanding Pretrained Models to Thousands More Languages via Lexicon-based Adaptation,Xinyi Wang and Sebastian Ruder and Graham Neubig,http://arxiv.org/pdf/2203.09435v2
http://arxiv.org/abs/2207.05259v1,creativecommons.org/licenses/by/4.0/,Language-Based Causal Representation Learning,Blai Bonet and Hector Geffner,http://arxiv.org/pdf/2207.05259v1
http://arxiv.org/abs/2302.04087v1,creativecommons.org/licenses/by/4.0/,Büchi-like characterizations for Parikh-recognizable omega-languages,Mario Grobler and Sebastian Siebertz,http://arxiv.org/pdf/2302.04087v1
http://arxiv.org/abs/2103.11441v3,creativecommons.org/licenses/by/4.0/,TextFlint: Unified Multilingual Robustness Evaluation Toolkit for Natural Language Processing,Tao Gui and Xiao Wang and Qi Zhang and Qin Liu and Yicheng Zou and Xin Zhou and Rui Zheng and Chong Zhang and Qinzhuo Wu and Jiacheng Ye and Zexiong Pang and Yongxin Zhang and Zhengyan Li and Ruotian Ma and Zichu Fei and Ruijian Cai and Jun Zhao and Xingwu Hu and Zhiheng Yan and Yiding Tan and Yuan Hu and Qiyuan Bian and Zhihua Liu and Bolin Zhu and Shan Qin and Xiaoyu Xing and Jinlan Fu and Yue Zhang and Minlong Peng and Xiaoqing Zheng and Yaqian Zhou and Zhongyu Wei and Xipeng Qiu and Xuanjing Huang,http://arxiv.org/pdf/2103.11441v3
http://arxiv.org/abs/2204.13743v1,creativecommons.org/licenses/by/4.0/,HiNER: A Large Hindi Named Entity Recognition Dataset,Rudra Murthy and Pallab Bhattacharjee and Rahul Sharnagat and Jyotsana Khatri and Diptesh Kanojia and Pushpak Bhattacharyya,http://arxiv.org/pdf/2204.13743v1
http://arxiv.org/abs/2203.02094v2,creativecommons.org/licenses/by/4.0/,LiteTransformerSearch: Training-free Neural Architecture Search for Efficient Language Models,Mojan Javaheripi and Gustavo H. de Rosa and Subhabrata Mukherjee and Shital Shah and Tomasz L. Religa and Caio C. T. Mendes and Sebastien Bubeck and Farinaz Koushanfar and Debadeepta Dey,http://arxiv.org/pdf/2203.02094v2
http://arxiv.org/abs/2011.08626v2,creativecommons.org/licenses/by/4.0/,Neural Semi-supervised Learning for Text Classification Under Large-Scale Pretraining,Zijun Sun and Chun Fan and Xiaofei Sun and Yuxian Meng and Fei Wu and Jiwei Li,http://arxiv.org/pdf/2011.08626v2 | |
http://arxiv.org/abs/2304.03589v1,creativecommons.org/licenses/by/4.0/,On Efficient Training of Large-Scale Deep Learning Models: A Literature Review,Li Shen and Yan Sun and Zhiyuan Yu and Liang Ding and Xinmei Tian and Dacheng Tao,http://arxiv.org/pdf/2304.03589v1 | |
http://arxiv.org/abs/2202.00540v1,creativecommons.org/licenses/by/4.0/,Dominant Set-based Active Learning for Text Classification and its Application to Online Social Media,Toktam A. Oghaz and Ivan Garibay,http://arxiv.org/pdf/2202.00540v1 | |
http://arxiv.org/abs/2205.10487v1,creativecommons.org/licenses/by/4.0/,Scaling Laws and Interpretability of Learning from Repeated Data,Danny Hernandez and Tom Brown and Tom Conerly and Nova DasSarma and Dawn Drain and Sheer El-Showk and Nelson Elhage and Zac Hatfield-Dodds and Tom Henighan and Tristan Hume and Scott Johnston and Ben Mann and Chris Olah and Catherine Olsson and Dario Amodei and Nicholas Joseph and Jared Kaplan and Sam McCandlish,http://arxiv.org/pdf/2205.10487v1 | |
http://arxiv.org/abs/1706.03499v1,creativecommons.org/licenses/by/4.0/,SU-RUG at the CoNLL-SIGMORPHON 2017 shared task: Morphological Inflection with Attentional Sequence-to-Sequence Models,Robert Östling and Johannes Bjerva,http://arxiv.org/pdf/1706.03499v1 | |
http://arxiv.org/abs/1611.02988v1,creativecommons.org/licenses/by/4.0/,Distant supervision for emotion detection using Facebook reactions,Chris Pool and Malvina Nissim,http://arxiv.org/pdf/1611.02988v1 | |
http://arxiv.org/abs/2001.03216v1,creativecommons.org/licenses/by/4.0/,Simulating Lexical Semantic Change from Sense-Annotated Data,Dominik Schlechtweg and Sabine Schulte im Walde,http://arxiv.org/pdf/2001.03216v1 | |
http://arxiv.org/abs/2001.05314v2,creativecommons.org/licenses/by/4.0/,Embedding Compression with Isotropic Iterative Quantization,Siyu Liao and Jie Chen and Yanzhi Wang and Qinru Qiu and Bo Yuan,http://arxiv.org/pdf/2001.05314v2 | |
http://arxiv.org/abs/2009.12695v1,creativecommons.org/licenses/by/4.0/,Techniques to Improve Q&A Accuracy with Transformer-based models on Large Complex Documents,Chejui Liao and Tabish Maniar and Sravanajyothi N and Anantha Sharma,http://arxiv.org/pdf/2009.12695v1 | |
http://arxiv.org/abs/2011.03138v1,creativecommons.org/licenses/by/4.0/,Semi-supervised URL Segmentation with Recurrent Neural Networks Pre-trained on Knowledge Graph Entities,Hao Zhang and Jae Ro and Richard Sproat,http://arxiv.org/pdf/2011.03138v1 | |
http://arxiv.org/abs/2103.07259v1,creativecommons.org/licenses/by/4.0/,Explaining and Improving BERT Performance on Lexical Semantic Change Detection,Severin Laicher and Sinan Kurtyigit and Dominik Schlechtweg and Jonas Kuhn and Sabine Schulte im Walde,http://arxiv.org/pdf/2103.07259v1 | |
http://arxiv.org/abs/2106.03111v1,creativecommons.org/licenses/by/4.0/,Lexical Semantic Change Discovery,Sinan Kurtyigit and Maike Park and Dominik Schlechtweg and Jonas Kuhn and Sabine Schulte im Walde,http://arxiv.org/pdf/2106.03111v1 | |
http://arxiv.org/abs/2106.03161v2,creativecommons.org/licenses/by/4.0/,Identifying Populist Paragraphs in Text: A machine-learning approach,Jogilė Ulinskaitė and Lukas Pukelis,http://arxiv.org/pdf/2106.03161v2 | |
http://arxiv.org/abs/2107.03474v1,creativecommons.org/licenses/by/4.0/,Differentiable Random Access Memory using Lattices,Adam P. Goucher and Rajan Troll,http://arxiv.org/pdf/2107.03474v1 | |
http://arxiv.org/abs/2109.03200v2,creativecommons.org/licenses/by/4.0/,ExCode-Mixed: Explainable Approaches towards Sentiment Analysis on Code-Mixed Data using BERT models,Aman Priyanshu and Aleti Vardhan and Sudarshan Sivakumar and Supriti Vijay and Nipuna Chhabra,http://arxiv.org/pdf/2109.03200v2 | |
http://arxiv.org/abs/2112.02810v1,creativecommons.org/licenses/by/4.0/,An Effective GCN-based Hierarchical Multi-label classification for Protein Function Prediction,Kyudam Choi and Yurim Lee and Cheongwon Kim and Minsung Yoon,http://arxiv.org/pdf/2112.02810v1 | |
http://arxiv.org/abs/2203.01282v2,creativecommons.org/licenses/by/4.0/,py-irt: A Scalable Item Response Theory Library for Python,John P. Lalor and Pedro Rodriguez,http://arxiv.org/pdf/2203.01282v2 | |
http://arxiv.org/abs/2205.01825v1,creativecommons.org/licenses/by/4.0/,AmbiPun: Generating Humorous Puns with Ambiguous Context,Anirudh Mittal and Yufei Tian and Nanyun Peng,http://arxiv.org/pdf/2205.01825v1 | |
http://arxiv.org/abs/2209.00797v2,creativecommons.org/licenses/by/4.0/,"Random Text Perturbations Work, but not Always",Zhengxiang Wang,http://arxiv.org/pdf/2209.00797v2 | |
http://arxiv.org/abs/2209.12614v1,creativecommons.org/licenses/by/4.0/,Identifying epidemic related Tweets using noisy learning,Ramya Tekumalla and Juan M. Banda,http://arxiv.org/pdf/2209.12614v1 | |
http://arxiv.org/abs/2211.04417v1,creativecommons.org/licenses/by/4.0/,nBIIG: A Neural BI Insights Generation System for Table Reporting,Yotam Perlitz and Dafna Sheinwald and Noam Slonim and Michal Shmueli-Scheuer,http://arxiv.org/pdf/2211.04417v1 | |
http://arxiv.org/abs/2304.09172v1,creativecommons.org/licenses/by/4.0/,Hyperbolic Image-Text Representations,Karan Desai and Maximilian Nickel and Tanmay Rajpurohit and Justin Johnson and Ramakrishna Vedantam,http://arxiv.org/pdf/2304.09172v1 | |
http://arxiv.org/abs/2106.05426v4,creativecommons.org/licenses/by/4.0/,Low-Dimensional Structure in the Space of Language Representations is Reflected in Brain Responses,Richard Antonello and Javier Turek and Vy Vo and Alexander Huth,http://arxiv.org/pdf/2106.05426v4 | |
http://arxiv.org/abs/1906.12230v1,creativecommons.org/licenses/by/4.0/,FIESTA: Fast IdEntification of State-of-The-Art models using adaptive bandit algorithms,Henry B. Moss and Andrew Moore and David S. Leslie and Paul Rayson,http://arxiv.org/pdf/1906.12230v1 | |
http://arxiv.org/abs/2003.13198v4,creativecommons.org/licenses/by/4.0/,InterBERT: Vision-and-Language Interaction for Multi-modal Pretraining,Junyang Lin and An Yang and Yichang Zhang and Jie Liu and Jingren Zhou and Hongxia Yang,http://arxiv.org/pdf/2003.13198v4 | |
http://arxiv.org/abs/1911.03268v1,creativecommons.org/licenses/by/4.0/,Inducing brain-relevant bias in natural language processing models,Dan Schwartz and Mariya Toneva and Leila Wehbe,http://arxiv.org/pdf/1911.03268v1 | |
http://arxiv.org/abs/2103.05132v2,creativecommons.org/licenses/by/4.0/,AfriVEC: Word Embedding Models for African Languages. Case Study of Fon and Nobiin,Bonaventure F. P. Dossou and Mohammed Sabry,http://arxiv.org/pdf/2103.05132v2 | |
http://arxiv.org/abs/2210.11621v1,creativecommons.org/licenses/by/4.0/,SMaLL-100: Introducing Shallow Multilingual Machine Translation Model for Low-Resource Languages,Alireza Mohammadshahi and Vassilina Nikoulina and Alexandre Berard and Caroline Brun and James Henderson and Laurent Besacier,http://arxiv.org/pdf/2210.11621v1 | |
http://arxiv.org/abs/2211.11678v1,creativecommons.org/licenses/by/4.0/,Measuring Harmful Representations in Scandinavian Language Models,Samia Touileb and Debora Nozza,http://arxiv.org/pdf/2211.11678v1 | |
http://arxiv.org/abs/2301.12652v2,creativecommons.org/licenses/by/4.0/,REPLUG: Retrieval-Augmented Black-Box Language Models,Weijia Shi and Sewon Min and Michihiro Yasunaga and Minjoon Seo and Rich James and Mike Lewis and Luke Zettlemoyer and Wen-tau Yih,http://arxiv.org/pdf/2301.12652v2 | |
http://arxiv.org/abs/2011.07347v1,creativecommons.org/licenses/by/4.0/,Conditioned Natural Language Generation using only Unconditioned Language Model: An Exploration,Fan-Keng Sun and Cheng-I Lai,http://arxiv.org/pdf/2011.07347v1 | |
http://arxiv.org/abs/2206.11146v1,creativecommons.org/licenses/by/4.0/,Modeling Emergent Lexicon Formation with a Self-Reinforcing Stochastic Process,Brendon Boldt and David Mortensen,http://arxiv.org/pdf/2206.11146v1 | |
http://arxiv.org/abs/2303.15714v1,creativecommons.org/licenses/by/4.0/,Explicit Planning Helps Language Models in Logical Reasoning,Hongyu Zhao and Kangrui Wang and Mo Yu and Hongyuan Mei,http://arxiv.org/pdf/2303.15714v1 | |
http://arxiv.org/abs/1909.05088v1,creativecommons.org/licenses/by/4.0/,Getting Gender Right in Neural Machine Translation,Eva Vanmassenhove and Christian Hardmeier and Andy Way,http://arxiv.org/pdf/1909.05088v1 | |
http://arxiv.org/abs/1807.01784v1,creativecommons.org/licenses/by/4.0/,Program Language Translation Using a Grammar-Driven Tree-to-Tree Model,Mehdi Drissi and Olivia Watkins and Aditya Khant and Vivaswat Ojha and Pedro Sandoval and Rakia Segev and Eric Weiner and Robert Keller,http://arxiv.org/pdf/1807.01784v1 | |
http://arxiv.org/abs/2007.02629v2,creativecommons.org/licenses/by/4.0/,Learning Spoken Language Representations with Neural Lattice Language Modeling,Chao-Wei Huang and Yun-Nung Chen,http://arxiv.org/pdf/2007.02629v2 | |
http://arxiv.org/abs/2010.04482v1,creativecommons.org/licenses/by/4.0/,Word Level Language Identification in English Telugu Code Mixed Data,Sunil Gundapu and Radhika Mamidi,http://arxiv.org/pdf/2010.04482v1 | |
http://arxiv.org/abs/2110.06128v3,creativecommons.org/licenses/by/4.0/,Regionalized models for Spanish language variations based on Twitter,Eric S. Tellez and Daniela Moctezuma and Sabino Miranda and Mario Graff and Guillermo Ruiz,http://arxiv.org/pdf/2110.06128v3 | |
http://arxiv.org/abs/2112.10543v1,creativecommons.org/licenses/by/4.0/,Spiral Language Modeling,Yong Cao and Yukun Feng and Shaohui Kuang and Gu Xu,http://arxiv.org/pdf/2112.10543v1 | |
http://arxiv.org/abs/2202.00794v2,creativecommons.org/licenses/by/4.0/,Learning to pronounce as measuring cross-lingual joint orthography-phonology complexity,Domenic Rosati,http://arxiv.org/pdf/2202.00794v2 | |
http://arxiv.org/abs/2205.01620v2,creativecommons.org/licenses/by/4.0/,Unifying the Convergences in Multilingual Neural Machine Translation,Yichong Huang and Xiaocheng Feng and Xinwei Geng and Bing Qin,http://arxiv.org/pdf/2205.01620v2 | |
http://arxiv.org/abs/2205.12404v3,creativecommons.org/licenses/by/4.0/,FLUTE: Figurative Language Understanding through Textual Explanations,Tuhin Chakrabarty and Arkadiy Saakyan and Debanjan Ghosh and Smaranda Muresan,http://arxiv.org/pdf/2205.12404v3 | |
http://arxiv.org/abs/2207.09157v1,creativecommons.org/licenses/by/4.0/,On the cross-lingual transferability of multilingual prototypical models across NLU tasks,Oralie Cattan and Christophe Servan and Sophie Rosset,http://arxiv.org/pdf/2207.09157v1 | |
http://arxiv.org/abs/2211.03818v1,creativecommons.org/licenses/by/4.0/,CELLS: A Parallel Corpus for Biomedical Lay Language Generation,Yue Guo and Wei Qiu and Gondy Leroy and Sheng Wang and Trevor Cohen,http://arxiv.org/pdf/2211.03818v1 | |
http://arxiv.org/abs/2302.07974v1,creativecommons.org/licenses/by/4.0/,Tree-Based Representation and Generation of Natural and Mathematical Language,Alexander Scarlatos and Andrew Lan,http://arxiv.org/pdf/2302.07974v1 | |
http://arxiv.org/abs/2304.05764v1,creativecommons.org/licenses/by/4.0/,Measuring Normative and Descriptive Biases in Language Models Using Census Data,Samia Touileb and Lilja Øvrelid and Erik Velldal,http://arxiv.org/pdf/2304.05764v1 | |
http://arxiv.org/abs/2011.09567v1,creativecommons.org/licenses/by/4.0/,Predicting metrical patterns in Spanish poetry with language models,Javier de la Rosa and Salvador Ros and Elena González-Blanco,http://arxiv.org/pdf/2011.09567v1 | |
http://arxiv.org/abs/2211.02011v4,creativecommons.org/licenses/by/4.0/,Inverse scaling can become U-shaped,Jason Wei and Najoung Kim and Yi Tay and Quoc V. Le,http://arxiv.org/pdf/2211.02011v4 | |
http://arxiv.org/abs/2103.06922v3,creativecommons.org/licenses/by/4.0/,Towards Interpreting and Mitigating Shortcut Learning Behavior of NLU Models,Mengnan Du and Varun Manjunatha and Rajiv Jain and Ruchi Deshpande and Franck Dernoncourt and Jiuxiang Gu and Tong Sun and Xia Hu,http://arxiv.org/pdf/2103.06922v3 | |
http://arxiv.org/abs/2205.01541v1,creativecommons.org/licenses/by/4.0/,Efficient Fine-Tuning of BERT Models on the Edge,Danilo Vucetic and Mohammadreza Tayaranian and Maryam Ziaeefard and James J. Clark and Brett H. Meyer and Warren J. Gross,http://arxiv.org/pdf/2205.01541v1 | |
http://arxiv.org/abs/2212.09747v1,creativecommons.org/licenses/by/4.0/,Do CoNLL-2003 Named Entity Taggers Still Work Well in 2023?,Shuheng Liu and Alan Ritter,http://arxiv.org/pdf/2212.09747v1 | |
http://arxiv.org/abs/2204.02531v1,creativecommons.org/licenses/by/4.0/,Improving Zero-Shot Event Extraction via Sentence Simplification,Sneha Mehta and Huzefa Rangwala and Naren Ramakrishnan,http://arxiv.org/pdf/2204.02531v1 | |
http://arxiv.org/abs/2202.11844v3,creativecommons.org/licenses/by/4.0/,First is Better Than Last for Language Data Influence,Chih-Kuan Yeh and Ankur Taly and Mukund Sundararajan and Frederick Liu and Pradeep Ravikumar,http://arxiv.org/pdf/2202.11844v3 | |
http://arxiv.org/abs/2212.12017v3,creativecommons.org/licenses/by/4.0/,OPT-IML: Scaling Language Model Instruction Meta Learning through the Lens of Generalization,Srinivasan Iyer and Xi Victoria Lin and Ramakanth Pasunuru and Todor Mihaylov and Daniel Simig and Ping Yu and Kurt Shuster and Tianlu Wang and Qing Liu and Punit Singh Koura and Xian Li and Brian O'Horo and Gabriel Pereyra and Jeff Wang and Christopher Dewan and Asli Celikyilmaz and Luke Zettlemoyer and Ves Stoyanov,http://arxiv.org/pdf/2212.12017v3 | |
http://arxiv.org/abs/2006.07698v2,creativecommons.org/publicdomain/zero/1.0/,Transferring Monolingual Model to Low-Resource Language: The Case of Tigrinya,Abrhalei Tela and Abraham Woubie and Ville Hautamaki,http://arxiv.org/pdf/2006.07698v2 | |
http://arxiv.org/abs/2109.06327v2,creativecommons.org/publicdomain/zero/1.0/,Evaluating Transferability of BERT Models on Uralic Languages,Judit Ács and Dániel Lévai and András Kornai,http://arxiv.org/pdf/2109.06327v2 | |
http://arxiv.org/abs/2205.06885v1,creativecommons.org/publicdomain/zero/1.0/,PathologyBERT -- Pre-trained Vs. A New Transformer Language Model for Pathology Domain,Thiago Santos and Amara Tariq and Susmita Das and Kavyasree Vayalpati and Geoffrey H. Smith and Hari Trivedi and Imon Banerjee,http://arxiv.org/pdf/2205.06885v1 | |
http://arxiv.org/abs/2209.10583v1,creativecommons.org/publicdomain/zero/1.0/,Representing Affect Information in Word Embeddings,Yuhan Zhang and Wenqi Chen and Ruihan Zhang and Xiajie Zhang,http://arxiv.org/pdf/2209.10583v1 | |
http://arxiv.org/abs/2204.04748v1,creativecommons.org/publicdomain/zero/1.0/,Breaking Character: Are Subwords Good Enough for MRLs After All?,Omri Keren and Tal Avinari and Reut Tsarfaty and Omer Levy,http://arxiv.org/pdf/2204.04748v1 | |
http://arxiv.org/abs/1912.13415v1,creativecommons.org/publicdomain/zero/1.0/,End-to-end Named Entity Recognition and Relation Extraction using Pre-trained Language Models,John Giorgi and Xindi Wang and Nicola Sahar and Won Young Shin and Gary D. Bader and Bo Wang,http://arxiv.org/pdf/1912.13415v1 | |
http://arxiv.org/abs/2011.08724v2,creativecommons.org/publicdomain/zero/1.0/,Multi-SQL: An extensible multi-model data query language,Yu Yan and Nan Jiang and Hongzhi Wang and Yutong Wang and Chang Liu and Yuzhuo Wang,http://arxiv.org/pdf/2011.08724v2 | |
http://arxiv.org/abs/2302.00083v1,creativecommons.org/publicdomain/zero/1.0/,In-Context Retrieval-Augmented Language Models,Ori Ram and Yoav Levine and Itay Dalmedigos and Dor Muhlgay and Amnon Shashua and Kevin Leyton-Brown and Yoav Shoham,http://arxiv.org/pdf/2302.00083v1 | |
http://arxiv.org/abs/2303.10131v1,creativecommons.org/publicdomain/zero/1.0/,She Elicits Requirements and He Tests: Software Engineering Gender Bias in Large Language Models,Christoph Treude and Hideaki Hata,http://arxiv.org/pdf/2303.10131v1 | |
http://arxiv.org/abs/2004.13819v1,creativecommons.org/publicdomain/zero/1.0/,Neural Machine Translation for Low-Resourced Indian Languages,Himanshu Choudhary and Shivansh Rao and Rajesh Rohilla,http://arxiv.org/pdf/2004.13819v1 | |
http://arxiv.org/abs/2104.04670v5,creativecommons.org/publicdomain/zero/1.0/,Adapting Language Models for Zero-shot Learning by Meta-tuning on Dataset and Prompt Collections,Ruiqi Zhong and Kristy Lee and Zheng Zhang and Dan Klein,http://arxiv.org/pdf/2104.04670v5 | |
http://arxiv.org/abs/2108.02170v1,creativecommons.org/publicdomain/zero/1.0/,Curriculum learning for language modeling,Daniel Campos,http://arxiv.org/pdf/2108.02170v1 | |
http://arxiv.org/abs/2209.10792v2,creativecommons.org/publicdomain/zero/1.0/,Deep Learning Based Page Creation for Improving E-Commerce Organic Search Traffic,Cheng Jie and Da Xu and Zigeng Wang and Wei Shen,http://arxiv.org/pdf/2209.10792v2 | |
http://arxiv.org/abs/1810.08606v1,creativecommons.org/publicdomain/zero/1.0/,An Exploration of Dropout with RNNs for Natural Language Inference,Amit Gajbhiye and Sardar Jaf and Noura Al Moubayed and A. Stephen McGough and Steven Bradley,http://arxiv.org/pdf/1810.08606v1 | |
http://arxiv.org/abs/2302.13344v1,creativecommons.org/publicdomain/zero/1.0/,Tailoring Language Generation Models under Total Variation Distance,Haozhe Ji and Pei Ke and Zhipeng Hu and Rongsheng Zhang and Minlie Huang,http://arxiv.org/pdf/2302.13344v1 | |
http://arxiv.org/abs/2203.01104v4,creativecommons.org/publicdomain/zero/1.0/,Parameter-Efficient Mixture-of-Experts Architecture for Pre-trained Language Models,Ze-Feng Gao and Peiyu Liu and Wayne Xin Zhao and Zhong-Yi Lu and Ji-Rong Wen,http://arxiv.org/pdf/2203.01104v4 | |
http://arxiv.org/abs/2204.10624v1,creativecommons.org/publicdomain/zero/1.0/,Learning Functional Distributional Semantics with Visual Data,Yinhong Liu and Guy Emerson,http://arxiv.org/pdf/2204.10624v1 | |
http://arxiv.org/abs/2301.05226v1,creativecommons.org/publicdomain/zero/1.0/,"See, Think, Confirm: Interactive Prompting Between Vision and Language Models for Knowledge-based Visual Reasoning",Zhenfang Chen and Qinhong Zhou and Yikang Shen and Yining Hong and Hao Zhang and Chuang Gan,http://arxiv.org/pdf/2301.05226v1 | |
http://arxiv.org/abs/2001.08896v5,creativecommons.org/publicdomain/zero/1.0/,Compressing Language Models using Doped Kronecker Products,Urmish Thakker and Paul N. Whatmough and Zhi-Gang Liu and Matthew Mattina and Jesse Beu,http://arxiv.org/pdf/2001.08896v5 | |
http://arxiv.org/abs/2006.03659v4,creativecommons.org/publicdomain/zero/1.0/,DeCLUTR: Deep Contrastive Learning for Unsupervised Textual Representations,John Giorgi and Osvald Nitski and Bo Wang and Gary Bader,http://arxiv.org/pdf/2006.03659v4 | |
http://arxiv.org/abs/2206.04575v1,creativecommons.org/publicdomain/zero/1.0/,Transformer based Urdu Handwritten Text Optical Character Reader,Mohammad Daniyal Shaiq and Musa Dildar Ahmed Cheema and Ali Kamal,http://arxiv.org/pdf/2206.04575v1 | |
http://arxiv.org/abs/2212.10179v1,creativecommons.org/publicdomain/zero/1.0/,Toward Human-Like Evaluation for Natural Language Generation with Error Analysis,Qingyu Lu and Liang Ding and Liping Xie and Kanjian Zhang and Derek F. Wong and Dacheng Tao,http://arxiv.org/pdf/2212.10179v1 | |
http://arxiv.org/abs/2107.00157v5,creativecommons.org/publicdomain/zero/1.0/,Cross-Lingual Transfer Learning for Statistical Type Inference,Zhiming Li and Xiaofei Xie and Haoliang Li and Zhengzi Xu and Yi Li and Yang Liu,http://arxiv.org/pdf/2107.00157v5 | |
http://arxiv.org/abs/2302.14338v3,creativecommons.org/publicdomain/zero/1.0/,Turning a CLIP Model into a Scene Text Detector,Wenwen Yu and Yuliang Liu and Wei Hua and Deqiang Jiang and Bo Ren and Xiang Bai,http://arxiv.org/pdf/2302.14338v3 | |
http://arxiv.org/abs/2211.10435v2,creativecommons.org/publicdomain/zero/1.0/,PAL: Program-aided Language Models,Luyu Gao and Aman Madaan and Shuyan Zhou and Uri Alon and Pengfei Liu and Yiming Yang and Jamie Callan and Graham Neubig,http://arxiv.org/pdf/2211.10435v2 | |
http://arxiv.org/abs/2109.11314v1,creativecommons.org/publicdomain/zero/1.0/,ParaShoot: A Hebrew Question Answering Dataset,Omri Keren and Omer Levy,http://arxiv.org/pdf/2109.11314v1 | |
http://arxiv.org/abs/2301.09072v1,creativecommons.org/publicdomain/zero/1.0/,ContraBERT: Enhancing Code Pre-trained Models via Contrastive Learning,Shangqing Liu and Bozhi Wu and Xiaofei Xie and Guozhu Meng and Yang Liu,http://arxiv.org/pdf/2301.09072v1 | |
http://arxiv.org/abs/2302.08901v1,creativecommons.org/publicdomain/zero/1.0/,Exploring External Knowledge for Accurate modeling of Visual and Language Problems,Xuewen Yang,http://arxiv.org/pdf/2302.08901v1 | |
http://arxiv.org/abs/2303.13809v1,creativecommons.org/publicdomain/zero/1.0/,Error Analysis Prompting Enables Human-Like Translation Evaluation in Large Language Models: A Case Study on ChatGPT,Qingyu Lu and Baopu Qiu and Liang Ding and Liping Xie and Dacheng Tao,http://arxiv.org/pdf/2303.13809v1 | |
http://arxiv.org/abs/2103.03493v1,creativecommons.org/publicdomain/zero/1.0/,Causal Attention for Vision-Language Tasks,Xu Yang and Hanwang Zhang and Guojun Qi and Jianfei Cai,http://arxiv.org/pdf/2103.03493v1 | |
http://arxiv.org/abs/2207.07039v3,creativecommons.org/publicdomain/zero/1.0/,Convolutional Bypasses Are Better Vision Transformer Adapters,Shibo Jie and Zhi-Hong Deng,http://arxiv.org/pdf/2207.07039v3 | |
http://arxiv.org/abs/2109.13348v2,creativecommons.org/publicdomain/zero/1.0/,Evaluating Biomedical BERT Models for Vocabulary Alignment at Scale in the UMLS Metathesaurus,Goonmeet Bajaj and Vinh Nguyen and Thilini Wijesiriwardene and Hong Yung Yip and Vishesh Javangula and Srinivasan Parthasarathy and Amit Sheth and Olivier Bodenreider,http://arxiv.org/pdf/2109.13348v2 | |
http://arxiv.org/abs/2010.04823v1,creativecommons.org/publicdomain/zero/1.0/,On some representations of context-free languages,Krasimir Yordzhev,http://arxiv.org/pdf/2010.04823v1 | |
http://arxiv.org/abs/2303.09062v1,creativecommons.org/publicdomain/zero/1.0/,Knowledge Transfer for Pseudo-code Generation from Low Resource Programming Language,Ankita Sontakke and Kanika Kalra and Manasi Patwardhan and Lovekesh Vig and Raveendra Kumar Medicherla and Ravindra Naik and Shrishti Pradhan,http://arxiv.org/pdf/2303.09062v1 | |
http://arxiv.org/abs/2205.08514v2,creativecommons.org/publicdomain/zero/1.0/,Recovering Private Text in Federated Learning of Language Models,Samyak Gupta and Yangsibo Huang and Zexuan Zhong and Tianyu Gao and Kai Li and Danqi Chen,http://arxiv.org/pdf/2205.08514v2 | |
http://arxiv.org/abs/2210.05549v1,creativecommons.org/publicdomain/zero/1.0/,Continual Training of Language Models for Few-Shot Learning,Zixuan Ke and Haowei Lin and Yijia Shao and Hu Xu and Lei Shu and Bing Liu,http://arxiv.org/pdf/2210.05549v1 | |
http://arxiv.org/abs/2006.06434v1,creativecommons.org/publicdomain/zero/1.0/,TableQA: a Large-Scale Chinese Text-to-SQL Dataset for Table-Aware SQL Generation,Ningyuan Sun and Xuefeng Yang and Yunfeng Liu,http://arxiv.org/pdf/2006.06434v1 | |
http://arxiv.org/abs/2205.00484v1,creativecommons.org/publicdomain/zero/1.0/,Dynamic Programming in Rank Space: Scaling Structured Inference with Low-Rank HMMs and PCFGs,Songlin Yang and Wei Liu and Kewei Tu,http://arxiv.org/pdf/2205.00484v1 | |
http://arxiv.org/abs/2207.14157v1,creativecommons.org/publicdomain/zero/1.0/,A Hazard Analysis Framework for Code Synthesis Large Language Models,Heidy Khlaaf and Pamela Mishkin and Joshua Achiam and Gretchen Krueger and Miles Brundage,http://arxiv.org/pdf/2207.14157v1 | |
http://arxiv.org/abs/1909.09491v1,creativecommons.org/publicdomain/zero/1.0/,A simple discriminative training method for machine translation with large-scale features,Tian Xia and Shaodan Zhai and Shaojun Wang,http://arxiv.org/pdf/1909.09491v1 | |
http://arxiv.org/abs/2101.00434v2,creativecommons.org/publicdomain/zero/1.0/,Coreference Resolution without Span Representations,Yuval Kirstain and Ori Ram and Omer Levy,http://arxiv.org/pdf/2101.00434v2 | |
http://arxiv.org/abs/2210.02441v3,creativecommons.org/publicdomain/zero/1.0/,Ask Me Anything: A simple strategy for prompting language models,Simran Arora and Avanika Narayan and Mayee F. Chen and Laurel Orr and Neel Guha and Kush Bhatia and Ines Chami and Frederic Sala and Christopher Ré,http://arxiv.org/pdf/2210.02441v3 | |
http://arxiv.org/abs/1912.03334v1,creativecommons.org/publicdomain/zero/1.0/,Explaining Sequence-Level Knowledge Distillation as Data-Augmentation for Neural Machine Translation,Mitchell A. Gordon and Kevin Duh,http://arxiv.org/pdf/1912.03334v1 | |
http://arxiv.org/abs/2201.05337v2,creativecommons.org/publicdomain/zero/1.0/,A Survey of Controllable Text Generation using Transformer-based Pre-trained Language Models,Hanqing Zhang and Haolin Song and Shaoyu Li and Ming Zhou and Dawei Song,http://arxiv.org/pdf/2201.05337v2 | |
http://arxiv.org/abs/2304.09433v2,creativecommons.org/publicdomain/zero/1.0/,Language Models Enable Simple Systems for Generating Structured Views of Heterogeneous Data Lakes,Simran Arora and Brandon Yang and Sabri Eyuboglu and Avanika Narayan and Andrew Hojel and Immanuel Trummer and Christopher Ré,http://arxiv.org/pdf/2304.09433v2 | |
http://arxiv.org/abs/2004.05986v3,creativecommons.org/publicdomain/zero/1.0/,CLUE: A Chinese Language Understanding Evaluation Benchmark,Liang Xu and Hai Hu and Xuanwei Zhang and Lu Li and Chenjie Cao and Yudong Li and Yechen Xu and Kai Sun and Dian Yu and Cong Yu and Yin Tian and Qianqian Dong and Weitang Liu and Bo Shi and Yiming Cui and Junyi Li and Jun Zeng and Rongzhao Wang and Weijian Xie and Yanting Li and Yina Patterson and Zuoyu Tian and Yiwen Zhang and He Zhou and Shaoweihua Liu and Zhe Zhao and Qipeng Zhao and Cong Yue and Xinrui Zhang and Zhengliang Yang and Kyle Richardson and Zhenzhong Lan,http://arxiv.org/pdf/2004.05986v3 | |
http://arxiv.org/abs/2110.05287v1,creativecommons.org/publicdomain/zero/1.0/,TEET! Tunisian Dataset for Toxic Speech Detection,Slim Gharbi and Heger Arfaoui and Hatem Haddad and Mayssa Kchaou,http://arxiv.org/pdf/2110.05287v1 | |
http://arxiv.org/abs/2203.15754v1,creativecommons.org/publicdomain/zero/1.0/,Evaluating Prompts Across Multiple Choice Tasks In a Zero-Shot Setting,Gabriel Orlanski,http://arxiv.org/pdf/2203.15754v1 | |
http://arxiv.org/abs/2210.09304v1,creativecommons.org/publicdomain/zero/1.0/,Non-Contrastive Learning Meets Language-Image Pre-Training,Jinghao Zhou and Li Dong and Zhe Gan and Lijuan Wang and Furu Wei,http://arxiv.org/pdf/2210.09304v1 | |
http://arxiv.org/abs/2209.09444v3,creativecommons.org/publicdomain/zero/1.0/,Vega-MT: The JD Explore Academy Translation System for WMT22,Changtong Zan and Keqin Peng and Liang Ding and Baopu Qiu and Boan Liu and Shwai He and Qingyu Lu and Zheng Zhang and Chuang Liu and Weifeng Liu and Yibing Zhan and Dacheng Tao,http://arxiv.org/pdf/2209.09444v3 | |
http://arxiv.org/abs/2304.11276v1,creativecommons.org/publicdomain/zero/1.0/,The Role of AI in Human-AI Creative Writing for Hong Kong Secondary Students,Hengky Susanto and David James Woo and Kai Guo,http://arxiv.org/pdf/2304.11276v1 | |
http://arxiv.org/abs/2212.09656v1,creativecommons.org/publicdomain/zero/1.0/,Visconde: Multi-document QA with GPT-3 and Neural Reranking,Jayr Pereira and Robson Fidalgo and Roberto Lotufo and Rodrigo Nogueira,http://arxiv.org/pdf/2212.09656v1 | |
http://arxiv.org/abs/2205.09911v2,creativecommons.org/publicdomain/zero/1.0/,Can Foundation Models Wrangle Your Data?,Avanika Narayan and Ines Chami and Laurel Orr and Simran Arora and Christopher Ré,http://arxiv.org/pdf/2205.09911v2 | |
http://arxiv.org/abs/2103.14620v2,creativecommons.org/publicdomain/zero/1.0/,LiGCN: Label-interpretable Graph Convolutional Networks for Multi-label Text Classification,Irene Li and Aosong Feng and Hao Wu and Tianxiao Li and Toyotaro Suzumura and Ruihai Dong,http://arxiv.org/pdf/2103.14620v2
http://arxiv.org/abs/1803.10299v3,creativecommons.org/publicdomain/zero/1.0/,Multi-Modal Data Augmentation for End-to-End ASR,Adithya Renduchintala and Shuoyang Ding and Matthew Wiesner and Shinji Watanabe,http://arxiv.org/pdf/1803.10299v3
http://arxiv.org/abs/2211.07855v1,creativecommons.org/publicdomain/zero/1.0/,Relationship of the language distance to English ability of a country,Cao Xinxin and Lei Xiaolan and Murtadha Ahmed,http://arxiv.org/pdf/2211.07855v1
http://arxiv.org/abs/1908.01841v2,creativecommons.org/publicdomain/zero/1.0/,DLGNet: A Transformer-based Model for Dialogue Response Generation,Oluwatobi Olabiyi and Erik T. Mueller,http://arxiv.org/pdf/1908.01841v2
http://arxiv.org/abs/2207.11716v1,creativecommons.org/publicdomain/zero/1.0/,A Cognitive Study on Semantic Similarity Analysis of Large Corpora: A Transformer-based Approach,Praneeth Nemani and Satyanarayana Vollala,http://arxiv.org/pdf/2207.11716v1
http://arxiv.org/abs/2202.13871v2,creativecommons.org/publicdomain/zero/1.0/,Wastewater Pipe Rating Model Using Natural Language Processing,Sai Nethra Betgeri and Shashank Reddy Vadyala and Dr. John C. Mattews and Dr. Hongfang Lu,http://arxiv.org/pdf/2202.13871v2
http://arxiv.org/abs/2109.12997v1,creativecommons.org/publicdomain/zero/1.0/,Benchmarking the Status of Default Pseudorandom Number Generators in Common Programming Languages,Nils van den Honert and Diederick Vermetten and Anna V. Kononova,http://arxiv.org/pdf/2109.12997v1
http://arxiv.org/abs/1212.6183v1,creativecommons.org/licenses/by-nc-sa/3.0/,The Buffered π-Calculus: A Model for Concurrent Languages,Xiaojie Deng and Yu Zhang and Yuxin Deng and Farong Zhong,http://arxiv.org/pdf/1212.6183v1
http://arxiv.org/abs/2304.11406v1,creativecommons.org/licenses/by-nc-sa/4.0/,LaMP: When Large Language Models Meet Personalization,Alireza Salemi and Sheshera Mysore and Michael Bendersky and Hamed Zamani,http://arxiv.org/pdf/2304.11406v1
http://arxiv.org/abs/1810.12387v1,creativecommons.org/licenses/by-nc-sa/4.0/,Language Modeling with Sparse Product of Sememe Experts,Yihong Gu and Jun Yan and Hao Zhu and Zhiyuan Liu and Ruobing Xie and Maosong Sun and Fen Lin and Leyu Lin,http://arxiv.org/pdf/1810.12387v1
http://arxiv.org/abs/2303.01229v1,creativecommons.org/licenses/by-nc-sa/4.0/,Almanac: Knowledge-Grounded Language Models for Clinical Medicine,Cyril Zakka and Akash Chaurasia and Rohan Shad and William Hiesinger,http://arxiv.org/pdf/2303.01229v1
http://arxiv.org/abs/2006.02144v1,creativecommons.org/licenses/by-nc-sa/4.0/,Transfer Learning for British Sign Language Modelling,Boris Mocialov and Graham Turner and Helen Hastie,http://arxiv.org/pdf/2006.02144v1
http://arxiv.org/abs/2202.07138v2,creativecommons.org/licenses/by-nc-sa/4.0/,Integrating AI Planning with Natural Language Processing: A Combination of Explicit and Tacit Knowledge,Kebing Jin and Hankz Hankui Zhuo,http://arxiv.org/pdf/2202.07138v2
http://arxiv.org/abs/2109.05522v1,creativecommons.org/licenses/by-nc-sa/4.0/,TEASEL: A Transformer-Based Speech-Prefixed Language Model,Mehdi Arjmand and Mohammad Javad Dousti and Hadi Moradi,http://arxiv.org/pdf/2109.05522v1
http://arxiv.org/abs/2210.07054v2,creativecommons.org/licenses/by-nc-sa/4.0/,Scaling Back-Translation with Domain Text Generation for Sign Language Gloss Translation,Jinhui Ye and Wenxiang Jiao and Xing Wang and Zhaopeng Tu,http://arxiv.org/pdf/2210.07054v2
http://arxiv.org/abs/1709.03759v1,creativecommons.org/licenses/by-nc-sa/4.0/,Language Models of Spoken Dutch,Lyan Verwimp and Joris Pelemans and Marieke Lycke and Hugo Van hamme and Patrick Wambacq,http://arxiv.org/pdf/1709.03759v1
http://arxiv.org/abs/2006.02120v1,creativecommons.org/licenses/by-nc-sa/4.0/,Towards Large-Scale Data Mining for Data-Driven Analysis of Sign Languages,Boris Mocialov and Graham Turner and Helen Hastie,http://arxiv.org/pdf/2006.02120v1
http://arxiv.org/abs/2302.09432v2,creativecommons.org/licenses/by-nc-sa/4.0/,"BBT-Fin: Comprehensive Construction of Chinese Financial Domain Pre-trained Language Model, Corpus and Benchmark",Dakuan Lu and Hengkui Wu and Jiaqing Liang and Yipei Xu and Qianyu He and Yipeng Geng and Mengkun Han and Yingsi Xin and Yanghua Xiao,http://arxiv.org/pdf/2302.09432v2
http://arxiv.org/abs/1805.09016v1,creativecommons.org/licenses/by-nc-sa/4.0/,Bilingual Sentiment Embeddings: Joint Projection of Sentiment Across Languages,Jeremy Barnes and Roman Klinger and Sabine Schulte im Walde,http://arxiv.org/pdf/1805.09016v1
http://arxiv.org/abs/2203.13397v1,creativecommons.org/licenses/by-nc-sa/4.0/,GPT-D: Inducing Dementia-related Linguistic Anomalies by Deliberate Degradation of Artificial Neural Language Models,Changye Li and David Knopman and Weizhe Xu and Trevor Cohen and Serguei Pakhomov,http://arxiv.org/pdf/2203.13397v1
http://arxiv.org/abs/2210.05359v1,creativecommons.org/licenses/by-nc-sa/4.0/,Mind's Eye: Grounded Language Model Reasoning through Simulation,Ruibo Liu and Jason Wei and Shixiang Shane Gu and Te-Yen Wu and Soroush Vosoughi and Claire Cui and Denny Zhou and Andrew M. Dai,http://arxiv.org/pdf/2210.05359v1
http://arxiv.org/abs/2112.11070v1,creativecommons.org/licenses/by-nc-sa/4.0/,An Inference Approach To Question Answering Over Knowledge Graphs,Aayushee Gupta and K. M. Annervaz and Ambedkar Dukkipati and Shubhashis Sengupta,http://arxiv.org/pdf/2112.11070v1
http://arxiv.org/abs/2010.12472v2,creativecommons.org/licenses/by-nc-sa/4.0/,HateBERT: Retraining BERT for Abusive Language Detection in English,Tommaso Caselli and Valerio Basile and Jelena Mitrović and Michael Granitzer,http://arxiv.org/pdf/2010.12472v2
http://arxiv.org/abs/2203.06906v1,creativecommons.org/licenses/by-nc-sa/4.0/,PERT: Pre-training BERT with Permuted Language Model,Yiming Cui and Ziqing Yang and Ting Liu,http://arxiv.org/pdf/2203.06906v1
http://arxiv.org/abs/2304.05368v2,creativecommons.org/licenses/by-nc-sa/4.0/,Are Large Language Models Ready for Healthcare? A Comparative Study on Clinical Language Understanding,Yuqing Wang and Yun Zhao and Linda Petzold,http://arxiv.org/pdf/2304.05368v2
http://arxiv.org/abs/2008.06788v2,creativecommons.org/licenses/by-nc-sa/4.0/,Is Supervised Syntactic Parsing Beneficial for Language Understanding? An Empirical Investigation,Goran Glavaš and Ivan Vulić,http://arxiv.org/pdf/2008.06788v2
http://arxiv.org/abs/2211.00083v1,creativecommons.org/licenses/by-nc-sa/4.0/,WHEN FLUE MEETS FLANG: Benchmarks and Large Pre-trained Language Model for Financial Domain,Raj Sanjay Shah and Kunal Chawla and Dheeraj Eidnani and Agam Shah and Wendi Du and Sudheer Chava and Natraj Raman and Charese Smiley and Jiaao Chen and Diyi Yang,http://arxiv.org/pdf/2211.00083v1
http://arxiv.org/abs/2109.03646v1,creativecommons.org/licenses/by-nc-sa/4.0/,Sustainable Modular Debiasing of Language Models,Anne Lauscher and Tobias Lüken and Goran Glavaš,http://arxiv.org/pdf/2109.03646v1
http://arxiv.org/abs/2103.12801v1,creativecommons.org/licenses/by-nc-sa/4.0/,Variable Name Recovery in Decompiled Binary Code using Constrained Masked Language Modeling,Pratyay Banerjee and Kuntal Kumar Pal and Fish Wang and Chitta Baral,http://arxiv.org/pdf/2103.12801v1
http://arxiv.org/abs/2109.05093v1,creativecommons.org/licenses/by-nc-sa/4.0/,PICARD: Parsing Incrementally for Constrained Auto-Regressive Decoding from Language Models,Torsten Scholak and Nathan Schucher and Dzmitry Bahdanau,http://arxiv.org/pdf/2109.05093v1
http://arxiv.org/abs/2205.10661v1,creativecommons.org/licenses/by-nc-sa/4.0/,An Empirical Investigation of Commonsense Self-Supervision with Knowledge Graphs,Jiarui Zhang and Filip Ilievski and Kaixin Ma and Jonathan Francis and Alessandro Oltramari,http://arxiv.org/pdf/2205.10661v1
http://arxiv.org/abs/2208.14493v1,creativecommons.org/licenses/by-nc-sa/4.0/,Annotated Dataset Creation through General Purpose Language Models for non-English Medical NLP,Johann Frei and Frank Kramer,http://arxiv.org/pdf/2208.14493v1
http://arxiv.org/abs/1801.06436v1,creativecommons.org/licenses/by-nc-sa/4.0/,A Resource-Light Method for Cross-Lingual Semantic Textual Similarity,Goran Glavaš and Marc Franco-Salvador and Simone Paolo Ponzetto and Paolo Rosso,http://arxiv.org/pdf/1801.06436v1
http://arxiv.org/abs/2110.03546v2,creativecommons.org/licenses/by-nc-sa/4.0/,mRAT-SQL+GAP:A Portuguese Text-to-SQL Transformer,Marcelo Archanjo José and Fabio Gagliardi Cozman,http://arxiv.org/pdf/2110.03546v2
http://arxiv.org/abs/2108.00801v2,creativecommons.org/licenses/by-nc-sa/4.0/,LICHEE: Improving Language Model Pre-training with Multi-grained Tokenization,Weidong Guo and Mingjun Zhao and Lusheng Zhang and Di Niu and Jinwen Luo and Zhenhua Liu and Zhenyang Li and Jianbo Tang,http://arxiv.org/pdf/2108.00801v2
http://arxiv.org/abs/2210.15133v1,creativecommons.org/licenses/by-nc-sa/4.0/,Retrieval Oriented Masking Pre-training Language Model for Dense Passage Retrieval,Dingkun Long and Yanzhao Zhang and Guangwei Xu and Pengjun Xie,http://arxiv.org/pdf/2210.15133v1
http://arxiv.org/abs/2202.11558v1,creativecommons.org/licenses/by-nc-sa/4.0/,Short-answer scoring with ensembles of pretrained language models,Christopher Ormerod,http://arxiv.org/pdf/2202.11558v1
http://arxiv.org/abs/2205.10012v3,creativecommons.org/licenses/by-nc-sa/4.0/,Descartes: Generating Short Descriptions of Wikipedia Articles,Marija Sakota and Maxime Peyrard and Robert West,http://arxiv.org/pdf/2205.10012v3
http://arxiv.org/abs/2112.08709v2,creativecommons.org/licenses/by-nc-sa/4.0/,DOCmT5: Document-Level Pretraining of Multilingual Language Models,Chia-Hsuan Lee and Aditya Siddhant and Viresh Ratnakar and Melvin Johnson,http://arxiv.org/pdf/2112.08709v2
http://arxiv.org/abs/2204.05939v1,creativecommons.org/licenses/by-nc-sa/4.0/,Mining Logical Event Schemas From Pre-Trained Language Models,Lane Lawley and Lenhart Schubert,http://arxiv.org/pdf/2204.05939v1
http://arxiv.org/abs/2302.05578v2,creativecommons.org/licenses/by-nc-sa/4.0/,Characterizing Attribution and Fluency Tradeoffs for Retrieval-Augmented Large Language Models,Renat Aksitov and Chung-Ching Chang and David Reitter and Siamak Shakeri and Yunhsuan Sung,http://arxiv.org/pdf/2302.05578v2
http://arxiv.org/abs/1907.12009v1,creativecommons.org/licenses/by-nc-sa/4.0/,Representation Degeneration Problem in Training Natural Language Generation Models,Jun Gao and Di He and Xu Tan and Tao Qin and Liwei Wang and Tie-Yan Liu,http://arxiv.org/pdf/1907.12009v1
http://arxiv.org/abs/2111.01582v1,creativecommons.org/licenses/by-nc-sa/4.0/,LMdiff: A Visual Diff Tool to Compare Language Models,Hendrik Strobelt and Benjamin Hoover and Arvind Satyanarayan and Sebastian Gehrmann,http://arxiv.org/pdf/2111.01582v1
http://arxiv.org/abs/2102.10275v1,creativecommons.org/licenses/by-nc-sa/4.0/,An Attention Ensemble Approach for Efficient Text Classification of Indian Languages,Atharva Kulkarni and Amey Hengle and Rutuja Udyawar,http://arxiv.org/pdf/2102.10275v1
http://arxiv.org/abs/2211.16594v3,creativecommons.org/licenses/by-nc-sa/4.0/,Exploiting Category Names for Few-Shot Classification with Vision-Language Models,Taihong Xiao and Zirui Wang and Liangliang Cao and Jiahui Yu and Shengyang Dai and Ming-Hsuan Yang,http://arxiv.org/pdf/2211.16594v3
http://arxiv.org/abs/2202.04173v3,creativecommons.org/licenses/by-nc-sa/4.0/,Exploring the Limits of Domain-Adaptive Training for Detoxifying Large-Scale Language Models,Boxin Wang and Wei Ping and Chaowei Xiao and Peng Xu and Mostofa Patwary and Mohammad Shoeybi and Bo Li and Anima Anandkumar and Bryan Catanzaro,http://arxiv.org/pdf/2202.04173v3
http://arxiv.org/abs/2211.09527v1,creativecommons.org/licenses/by-nc-sa/4.0/,Ignore Previous Prompt: Attack Techniques For Language Models,Fábio Perez and Ian Ribeiro,http://arxiv.org/pdf/2211.09527v1
http://arxiv.org/abs/2108.13169v1,creativecommons.org/licenses/by-nc-sa/4.0/,Enterprise Architecture Model Transformation Engine,Erik Heiland and Peter Hillmann and Andreas Karcher,http://arxiv.org/pdf/2108.13169v1
http://arxiv.org/abs/2109.10234v1,creativecommons.org/licenses/by-nc-sa/4.0/,BERTweetFR : Domain Adaptation of Pre-Trained Language Models for French Tweets,Yanzhu Guo and Virgile Rennard and Christos Xypolopoulos and Michalis Vazirgiannis,http://arxiv.org/pdf/2109.10234v1
http://arxiv.org/abs/2102.12516v1,creativecommons.org/licenses/by-nc-sa/4.0/,"A Large-Scale, Automated Study of Language Surrounding Artificial Intelligence",Autumn Toney,http://arxiv.org/pdf/2102.12516v1
http://arxiv.org/abs/2209.14901v2,creativecommons.org/licenses/by-nc-sa/4.0/,DR.BENCH: Diagnostic Reasoning Benchmark for Clinical Natural Language Processing,Yanjun Gao and Dmitriy Dligach and Timothy Miller and John Caskey and Brihat Sharma and Matthew M Churpek and Majid Afshar,http://arxiv.org/pdf/2209.14901v2
http://arxiv.org/abs/2302.08150v1,creativecommons.org/licenses/by-nc-sa/4.0/,Reanalyzing L2 Preposition Learning with Bayesian Mixed Effects and a Large Language Model,Jakob Prange and Man Ho Ivy Wong,http://arxiv.org/pdf/2302.08150v1
http://arxiv.org/abs/2303.08288v1,creativecommons.org/licenses/by-nc-sa/4.0/,Attention-likelihood relationship in transformers,Valeria Ruscio and Valentino Maiorca and Fabrizio Silvestri,http://arxiv.org/pdf/2303.08288v1
http://arxiv.org/abs/2007.10629v1,creativecommons.org/licenses/by-nc-sa/4.0/,SLNSpeech: solving extended speech separation problem by the help of sign language,Jiasong Wu and Taotao Li and Youyong Kong and Guanyu Yang and Lotfi Senhadji and Huazhong Shu,http://arxiv.org/pdf/2007.10629v1
http://arxiv.org/abs/1912.01580v2,creativecommons.org/licenses/by-nc-sa/4.0/,A Comparative Study of Pretrained Language Models on Thai Social Text Categorization,Thanapapas Horsuwan and Kasidis Kanwatchara and Peerapon Vateekul and Boonserm Kijsirikul,http://arxiv.org/pdf/1912.01580v2
http://arxiv.org/abs/2002.05955v1,creativecommons.org/licenses/by-nc-sa/4.0/,A Data Efficient End-To-End Spoken Language Understanding Architecture,Marco Dinarelli and Nikita Kapoor and Bassam Jabaian and Laurent Besacier,http://arxiv.org/pdf/2002.05955v1
http://arxiv.org/abs/2212.03404v1,creativecommons.org/licenses/by-nc-sa/4.0/,Towards using Few-Shot Prompt Learning for Automating Model Completion,Meriem Ben Chaaben and Lola Burgueño and Houari Sahraoui,http://arxiv.org/pdf/2212.03404v1
http://arxiv.org/abs/2004.15006v2,creativecommons.org/licenses/by-nc-sa/4.0/,Template Guided Text Generation for Task-Oriented Dialogue,Mihir Kale and Abhinav Rastogi,http://arxiv.org/pdf/2004.15006v2
http://arxiv.org/abs/2304.00228v1,creativecommons.org/licenses/by-nc-sa/4.0/,Large language models can rate news outlet credibility,Kai-Cheng Yang and Filippo Menczer,http://arxiv.org/pdf/2304.00228v1
http://arxiv.org/abs/2110.12396v2,creativecommons.org/licenses/by-nc-sa/4.0/,Using Motion History Images with 3D Convolutional Networks in Isolated Sign Language Recognition,Ozge Mercanoglu Sincan and Hacer Yalim Keles,http://arxiv.org/pdf/2110.12396v2
http://arxiv.org/abs/1907.12763v2,creativecommons.org/licenses/by-nc-sa/4.0/,Finding Moments in Video Collections Using Natural Language,Victor Escorcia and Mattia Soldan and Josef Sivic and Bernard Ghanem and Bryan Russell,http://arxiv.org/pdf/1907.12763v2
http://arxiv.org/abs/2211.00609v1,creativecommons.org/licenses/by-nc-sa/4.0/,"A Simple, Yet Effective Approach to Finding Biases in Code Generation",Spyridon Mouselinos and Mateusz Malinowski and Henryk Michalewski,http://arxiv.org/pdf/2211.00609v1
http://arxiv.org/abs/2204.08975v1,creativecommons.org/licenses/by-nc-sa/4.0/,Detecting Text Formality: A Study of Text Classification Approaches,Daryna Dementieva and Ivan Trifinov and Andrey Likhachev and Alexander Panchenko,http://arxiv.org/pdf/2204.08975v1
http://arxiv.org/abs/2302.13939v2,creativecommons.org/licenses/by-nc-sa/4.0/,SpikeGPT: Generative Pre-trained Language Model with Spiking Neural Networks,Rui-Jie Zhu and Qihang Zhao and Jason K. Eshraghian,http://arxiv.org/pdf/2302.13939v2
http://arxiv.org/abs/2106.00241v1,creativecommons.org/licenses/by-nc-sa/4.0/,Reinforced Iterative Knowledge Distillation for Cross-Lingual Named Entity Recognition,Shining Liang and Ming Gong and Jian Pei and Linjun Shou and Wanli Zuo and Xianglin Zuo and Daxin Jiang,http://arxiv.org/pdf/2106.00241v1
http://arxiv.org/abs/2212.05740v1,creativecommons.org/licenses/by-nc-sa/4.0/,Searching for Effective Multilingual Fine-Tuning Methods: A Case Study in Summarization,Yiwei Qin and Graham Neubig and Pengfei Liu,http://arxiv.org/pdf/2212.05740v1
http://arxiv.org/abs/2106.05852v2,creativecommons.org/licenses/by-nc-sa/4.0/,Automatic Speech Recognition in Sanskrit: A New Speech Corpus and Modelling Insights,Devaraja Adiga and Rishabh Kumar and Amrith Krishna and Preethi Jyothi and Ganesh Ramakrishnan and Pawan Goyal,http://arxiv.org/pdf/2106.05852v2
http://arxiv.org/abs/2112.00405v1,creativecommons.org/licenses/by-nc-sa/4.0/,NER-BERT: A Pre-trained Model for Low-Resource Entity Tagging,Zihan Liu and Feijun Jiang and Yuxiang Hu and Chen Shi and Pascale Fung,http://arxiv.org/pdf/2112.00405v1
http://arxiv.org/abs/2206.14366v3,creativecommons.org/licenses/by-nc-sa/4.0/,Knowledge Distillation of Transformer-based Language Models Revisited,Chengqiang Lu and Jianwei Zhang and Yunfei Chu and Zhengyu Chen and Jingren Zhou and Fei Wu and Haiqing Chen and Hongxia Yang,http://arxiv.org/pdf/2206.14366v3
http://arxiv.org/abs/1901.00297v1,creativecommons.org/licenses/by-nc-sa/4.0/,"A Deep Learning Approach for Similar Languages, Varieties and Dialects",Vidya Prasad K and Akarsh S and Vinayakumar R and Soman KP,http://arxiv.org/pdf/1901.00297v1
http://arxiv.org/abs/2108.00356v4,creativecommons.org/licenses/by-nc-sa/4.0/,Improving Social Meaning Detection with Pragmatic Masking and Surrogate Fine-Tuning,Chiyu Zhang and Muhammad Abdul-Mageed,http://arxiv.org/pdf/2108.00356v4
http://arxiv.org/abs/2110.00428v1,creativecommons.org/licenses/by-nc-sa/4.0/,Zero-shot Natural Language Video Localization,Jinwoo Nam and Daechul Ahn and Dongyeop Kang and Seong Jong Ha and Jonghyun Choi,http://arxiv.org/pdf/2110.00428v1
http://arxiv.org/abs/2110.08455v1,creativecommons.org/licenses/by-nc-sa/4.0/,Knowledge Enhanced Pretrained Language Models: A Compreshensive Survey,Xiaokai Wei and Shen Wang and Dejiao Zhang and Parminder Bhatia and Andrew Arnold,http://arxiv.org/pdf/2110.08455v1
http://arxiv.org/abs/2303.07142v3,creativecommons.org/licenses/by-nc-sa/4.0/,Large Language Models in the Workplace: A Case Study on Prompt Engineering for Job Type Classification,Benjamin Clavié and Alexandru Ciceu and Frederick Naylor and Guillaume Soulié and Thomas Brightwell,http://arxiv.org/pdf/2303.07142v3
http://arxiv.org/abs/1909.11687v2,creativecommons.org/licenses/by-nc-sa/4.0/,Extremely Small BERT Models from Mixed-Vocabulary Training,Sanqiang Zhao and Raghav Gupta and Yang Song and Denny Zhou,http://arxiv.org/pdf/1909.11687v2
http://arxiv.org/abs/2105.00824v1,creativecommons.org/licenses/by-nc-sa/4.0/,A Survey of Recent Abstract Summarization Techniques,Diyah Puspitaningrum,http://arxiv.org/pdf/2105.00824v1
http://arxiv.org/abs/2212.08635v1,creativecommons.org/licenses/by-nc-sa/4.0/,Self-Prompting Large Language Models for Open-Domain QA,Junlong Li and Zhuosheng Zhang and Hai Zhao,http://arxiv.org/pdf/2212.08635v1
http://arxiv.org/abs/1701.09123v1,creativecommons.org/licenses/by-nc-sa/4.0/,Robust Multilingual Named Entity Recognition with Shallow Semi-Supervised Features,Rodrigo Agerri and German Rigau,http://arxiv.org/pdf/1701.09123v1
http://arxiv.org/abs/2202.13257v1,creativecommons.org/licenses/by-nc-sa/4.0/,Controllable Natural Language Generation with Contrastive Prefixes,Jing Qian and Li Dong and Yelong Shen and Furu Wei and Weizhu Chen,http://arxiv.org/pdf/2202.13257v1
http://arxiv.org/abs/2303.03290v1,creativecommons.org/licenses/by-nc-sa/4.0/,AmQA: Amharic Question Answering Dataset,Tilahun Abedissa and Ricardo Usbeck and Yaregal Assabie,http://arxiv.org/pdf/2303.03290v1
http://arxiv.org/abs/2304.00634v1,creativecommons.org/licenses/by-nc-sa/4.0/,MMT: A Multilingual and Multi-Topic Indian Social Media Dataset,Dwip Dalal and Vivek Srivastava and Mayank Singh,http://arxiv.org/pdf/2304.00634v1
http://arxiv.org/abs/2203.09866v1,creativecommons.org/licenses/by-nc-sa/4.0/,Under the Morphosyntactic Lens: A Multifaceted Evaluation of Gender Bias in Speech Translation,Beatrice Savoldi and Marco Gaido and Luisa Bentivogli and Matteo Negri and Marco Turchi,http://arxiv.org/pdf/2203.09866v1
http://arxiv.org/abs/2304.01238v2,creativecommons.org/licenses/by-nc-sa/4.0/,Spam-T5: Benchmarking Large Language Models for Few-Shot Email Spam Detection,Maxime Labonne and Sean Moran,http://arxiv.org/pdf/2304.01238v2
http://arxiv.org/abs/2010.01611v2,creativecommons.org/licenses/by-nc-sa/4.0/,"When in Doubt, Ask: Generating Answerable and Unanswerable Questions, Unsupervised",Liubov Nikolenko and Pouya Rezazadeh Kalehbasti,http://arxiv.org/pdf/2010.01611v2
http://arxiv.org/abs/2109.00796v1,creativecommons.org/licenses/by-nc-sa/4.0/,Multi-Modal Zero-Shot Sign Language Recognition,Razieh Rastgoo and Kourosh Kiani and Sergio Escalera and Mohammad Sabokrou,http://arxiv.org/pdf/2109.00796v1
http://arxiv.org/abs/2211.06398v1,creativecommons.org/licenses/by-nc-sa/4.0/,Investigating Fairness Disparities in Peer Review: A Language Model Enhanced Approach,Jiayao Zhang and Hongming Zhang and Zhun Deng and Dan Roth,http://arxiv.org/pdf/2211.06398v1
http://arxiv.org/abs/2210.12770v2,creativecommons.org/licenses/by-nc-sa/4.0/,On Cross-Domain Pre-Trained Language Models for Clinical Text Mining: How Do They Perform on Data-Constrained Fine-Tuning?,Yuping Wu and Lifeng Han and Valerio Antonini and Goran Nenadic,http://arxiv.org/pdf/2210.12770v2
http://arxiv.org/abs/2204.02685v3,creativecommons.org/licenses/by-nc-sa/4.0/,SecureBERT: A Domain-Specific Language Model for Cybersecurity,Ehsan Aghaei and Xi Niu and Waseem Shadid and Ehab Al-Shaer,http://arxiv.org/pdf/2204.02685v3
http://arxiv.org/abs/2203.01111v2,creativecommons.org/licenses/by-nc-sa/4.0/,Large-Scale Hate Speech Detection with Cross-Domain Transfer,Cagri Toraman and Furkan Şahinuç and Eyup Halit Yilmaz,http://arxiv.org/pdf/2203.01111v2
http://arxiv.org/abs/2301.02071v1,creativecommons.org/licenses/by-nc-sa/4.0/,Towards Table-to-Text Generation with Pretrained Language Model: A Table Structure Understanding and Text Deliberating Approach,Miao Chen and Xinjiang Lu and Tong Xu and Yanyan Li and Jingbo Zhou and Dejing Dou and Hui Xiong,http://arxiv.org/pdf/2301.02071v1
http://arxiv.org/abs/2210.01185v1,creativecommons.org/licenses/by-nc-sa/4.0/,ContraGen: Effective Contrastive Learning For Causal Language Model,Nihal Jain and Dejiao Zhang and Wasi Uddin Ahmad and Zijian Wang and Feng Nan and Xiaopeng Li and Ming Tan and Ramesh Nallapati and Baishakhi Ray and Parminder Bhatia and Xiaofei Ma and Bing Xiang,http://arxiv.org/pdf/2210.01185v1
http://arxiv.org/abs/2010.07835v3,creativecommons.org/licenses/by-nc-sa/4.0/,Fine-Tuning Pre-trained Language Model with Weak Supervision: A Contrastive-Regularized Self-Training Approach,Yue Yu and Simiao Zuo and Haoming Jiang and Wendi Ren and Tuo Zhao and Chao Zhang,http://arxiv.org/pdf/2010.07835v3
http://arxiv.org/abs/2202.13758v3,creativecommons.org/licenses/by-nc-sa/4.0/,Logical Fallacy Detection,Zhijing Jin and Abhinav Lalwani and Tejas Vaidhya and Xiaoyu Shen and Yiwen Ding and Zhiheng Lyu and Mrinmaya Sachan and Rada Mihalcea and Bernhard Schölkopf,http://arxiv.org/pdf/2202.13758v3
http://arxiv.org/abs/2304.02697v1,creativecommons.org/licenses/by-nc-sa/4.0/,Revolutionizing Single Cell Analysis: The Power of Large Language Models for Cell Type Annotation,Zehua Zeng and Hongwu Du,http://arxiv.org/pdf/2304.02697v1
http://arxiv.org/abs/2301.04528v1,creativecommons.org/licenses/by-nc-sa/4.0/,The Role of Interactive Visualization in Explaining (Large) NLP Models: from Data to Inference,Richard Brath and Daniel Keim and Johannes Knittel and Shimei Pan and Pia Sommerauer and Hendrik Strobelt,http://arxiv.org/pdf/2301.04528v1
http://arxiv.org/abs/1510.01717v1,creativecommons.org/licenses/by-nc-sa/4.0/,Language Segmentation,David Alfter,http://arxiv.org/pdf/1510.01717v1
http://arxiv.org/abs/2109.03009v1,creativecommons.org/licenses/by-nc-sa/4.0/,Sequential Attention Module for Natural Language Processing,Mengyuan Zhou and Jian Ma and Haiqin Yang and Lianxin Jiang and Yang Mo,http://arxiv.org/pdf/2109.03009v1
http://arxiv.org/abs/2105.05066v1,creativecommons.org/licenses/by-nc-sa/4.0/,"ChaLearn LAP Large Scale Signer Independent Isolated Sign Language Recognition Challenge: Design, Results and Future Research",Ozge Mercanoglu Sincan and Julio C. S. Jacques Junior and Sergio Escalera and Hacer Yalim Keles,http://arxiv.org/pdf/2105.05066v1
http://arxiv.org/abs/2301.02241v1,creativecommons.org/licenses/by-nc-sa/4.0/,CiT: Curation in Training for Effective Vision-Language Data,Hu Xu and Saining Xie and Po-Yao Huang and Licheng Yu and Russell Howes and Gargi Ghosh and Luke Zettlemoyer and Christoph Feichtenhofer,http://arxiv.org/pdf/2301.02241v1
http://arxiv.org/abs/1812.06624v1,creativecommons.org/licenses/by-nc-sa/4.0/,Feature Fusion Effects of Tensor Product Representation on (De)Compositional Network for Caption Generation for Images,Chiranjib Sur,http://arxiv.org/pdf/1812.06624v1
http://arxiv.org/abs/2204.04327v2,creativecommons.org/licenses/by-nc-sa/4.0/,"Show, Don't Tell: Demonstrations Outperform Descriptions for Schema-Guided Task-Oriented Dialogue",Raghav Gupta and Harrison Lee and Jeffrey Zhao and Abhinav Rastogi and Yuan Cao and Yonghui Wu,http://arxiv.org/pdf/2204.04327v2
http://arxiv.org/abs/2304.03472v2,creativecommons.org/licenses/by-nc-sa/4.0/,Does Prompt-Tuning Language Model Ensure Privacy?,Shangyu Xie and Wei Dai and Esha Ghosh and Sambuddha Roy and Dan Schwartz and Kim Laine,http://arxiv.org/pdf/2304.03472v2
http://arxiv.org/abs/2102.07818v2,creativecommons.org/licenses/by-nc-sa/4.0/,Certified Robustness to Programmable Transformations in LSTMs,Yuhao Zhang and Aws Albarghouthi and Loris D'Antoni,http://arxiv.org/pdf/2102.07818v2
http://arxiv.org/abs/2009.06054v1,creativecommons.org/licenses/by-nc-sa/4.0/,Deconstructing Legal Text_Object Oriented Design in Legal Adjudication,Megan Ma and Dmitriy Podkopaev and Avalon Campbell-Cousins and Adam Nicholas,http://arxiv.org/pdf/2009.06054v1
http://arxiv.org/abs/2211.13112v1,creativecommons.org/licenses/by-nc-sa/4.0/,"This is the way: designing and compiling LEPISZCZE, a comprehensive NLP benchmark for Polish",Łukasz Augustyniak and Kamil Tagowski and Albert Sawczyn and Denis Janiak and Roman Bartusiak and Adrian Szymczak and Marcin Wątroba and Arkadiusz Janz and Piotr Szymański and Mikołaj Morzy and Tomasz Kajdanowicz and Maciej Piasecki,http://arxiv.org/pdf/2211.13112v1
http://arxiv.org/abs/2302.04012v1,creativecommons.org/licenses/by-nc-sa/4.0/,Systematically Finding Security Vulnerabilities in Black-Box Code Generation Models,Hossein Hajipour and Thorsten Holz and Lea Schönherr and Mario Fritz,http://arxiv.org/pdf/2302.04012v1
http://arxiv.org/abs/2203.05008v2,creativecommons.org/licenses/by-nc-sa/4.0/,Sentence-Select: Large-Scale Language Model Data Selection for Rare-Word Speech Recognition,W. Ronny Huang and Cal Peyser and Tara N. Sainath and Ruoming Pang and Trevor Strohman and Shankar Kumar,http://arxiv.org/pdf/2203.05008v2
http://arxiv.org/abs/2301.13126v1,creativecommons.org/licenses/by-nc-sa/4.0/,LEXTREME: A Multi-Lingual and Multi-Task Benchmark for the Legal Domain,Joel Niklaus and Veton Matoshi and Pooja Rani and Andrea Galassi and Matthias Stürmer and Ilias Chalkidis,http://arxiv.org/pdf/2301.13126v1
http://arxiv.org/abs/2107.00430v3,creativecommons.org/licenses/by-nc-sa/4.0/,Language-Level Semantics Conditioned 3D Point Cloud Segmentation,Bo Liu and Shuang Deng and Qiulei Dong and Zhanyi Hu,http://arxiv.org/pdf/2107.00430v3
http://arxiv.org/abs/1902.00508v1,creativecommons.org/licenses/by-nc-sa/4.0/,"How to (Properly) Evaluate Cross-Lingual Word Embeddings: On Strong Baselines, Comparative Analyses, and Some Misconceptions",Goran Glavas and Robert Litschko and Sebastian Ruder and Ivan Vulic,http://arxiv.org/pdf/1902.00508v1
http://arxiv.org/abs/2302.02029v1,creativecommons.org/licenses/by-nc-sa/4.0/,Towards Few-Shot Identification of Morality Frames using In-Context Learning,Shamik Roy and Nishanth Sridhar Nakshatri and Dan Goldwasser,http://arxiv.org/pdf/2302.02029v1
http://arxiv.org/abs/2304.08592v1,creativecommons.org/licenses/by-nc-sa/4.0/,Improving Scene Text Recognition for Character-Level Long-Tailed Distribution,Sunghyun Park and Sunghyo Chung and Jungsoo Lee and Jaegul Choo,http://arxiv.org/pdf/2304.08592v1
http://arxiv.org/abs/2302.08575v1,creativecommons.org/licenses/by-nc-sa/4.0/,Foundation Models for Natural Language Processing -- Pre-trained Language Models Integrating Media,Gerhard Paaß and Sven Giesselbach,http://arxiv.org/pdf/2302.08575v1
http://arxiv.org/abs/2008.06121v1,creativecommons.org/licenses/by-nc-sa/4.0/,LSTM Acoustic Models Learn to Align and Pronounce with Graphemes,Arindrima Datta and Guanlong Zhao and Bhuvana Ramabhadran and Eugene Weinstein,http://arxiv.org/pdf/2008.06121v1
http://arxiv.org/abs/2203.11239v1,creativecommons.org/licenses/by-nc-sa/4.0/,DQ-BART: Efficient Sequence-to-Sequence Model via Joint Distillation and Quantization,Zheng Li and Zijian Wang and Ming Tan and Ramesh Nallapati and Parminder Bhatia and Andrew Arnold and Bing Xiang and Dan Roth,http://arxiv.org/pdf/2203.11239v1
http://arxiv.org/abs/2106.08898v1,creativecommons.org/licenses/by-nc-sa/4.0/,RefBERT: Compressing BERT by Referencing to Pre-computed Representations,Xinyi Wang and Haiqin Yang and Liang Zhao and Yang Mo and Jianping Shen,http://arxiv.org/pdf/2106.08898v1
http://arxiv.org/abs/1906.09379v1,creativecommons.org/licenses/by-nc-sa/4.0/,Evaluating Computational Language Models with Scaling Properties of Natural Language,Shuntaro Takahashi and Kumiko Tanaka-Ishii,http://arxiv.org/pdf/1906.09379v1
http://arxiv.org/abs/2303.15669v1,creativecommons.org/licenses/by-nc-sa/4.0/,Unsupervised Pre-Training For Data-Efficient Text-to-Speech On Low Resource Languages,Seongyeon Park and Myungseo Song and Bohyung Kim and Tae-Hyun Oh,http://arxiv.org/pdf/2303.15669v1
http://arxiv.org/abs/2104.00933v1,creativecommons.org/licenses/by-nc-sa/4.0/,Humor@IITK at SemEval-2021 Task 7: Large Language Models for Quantifying Humor and Offensiveness,Aishwarya Gupta and Avik Pal and Bholeshwar Khurana and Lakshay Tyagi and Ashutosh Modi,http://arxiv.org/pdf/2104.00933v1
http://arxiv.org/abs/2209.00981v1,creativecommons.org/licenses/by-nc-sa/4.0/,Exploiting Pretrained Biochemical Language Models for Targeted Drug Design,Gökçe Uludoğan and Elif Ozkirimli and Kutlu O. Ulgen and Nilgün Karalı and Arzucan Özgür,http://arxiv.org/pdf/2209.00981v1
http://arxiv.org/abs/2209.05034v1,creativecommons.org/licenses/by-nc-sa/4.0/,CSL: A Large-scale Chinese Scientific Literature Dataset,Yudong Li and Yuqing Zhang and Zhe Zhao and Linlin Shen and Weijie Liu and Weiquan Mao and Hui Zhang,http://arxiv.org/pdf/2209.05034v1
http://arxiv.org/abs/2304.11060v1,creativecommons.org/licenses/by-nc-sa/4.0/,SkillGPT: a RESTful API service for skill extraction and standardization using a Large Language Model,Nan Li and Bo Kang and Tijl De Bie,http://arxiv.org/pdf/2304.11060v1
http://arxiv.org/abs/2105.05605v1,creativecommons.org/licenses/by-nc-sa/4.0/,Priberam Labs at the NTCIR-15 SHINRA2020-ML: Classification Task,Ruben Cardoso and Afonso Mendes and Andre Lamurias,http://arxiv.org/pdf/2105.05605v1
http://arxiv.org/abs/2209.02267v1,creativecommons.org/licenses/by-nc-sa/4.0/,Entity Aware Syntax Tree Based Data Augmentation for Natural Language Understanding,Jiaxing Xu and Jianbin Cui and Jiangneng Li and Wenge Rong and Noboru Matsuda,http://arxiv.org/pdf/2209.02267v1
http://arxiv.org/abs/2301.10075v1,creativecommons.org/licenses/by-nc-sa/4.0/,From Inclusive Language to Gender-Neutral Machine Translation,Andrea Piergentili and Dennis Fucci and Beatrice Savoldi and Luisa Bentivogli and Matteo Negri,http://arxiv.org/pdf/2301.10075v1
http://arxiv.org/abs/2109.00859v1,creativecommons.org/licenses/by-nc-sa/4.0/,CodeT5: Identifier-aware Unified Pre-trained Encoder-Decoder Models for Code Understanding and Generation,Yue Wang and Weishi Wang and Shafiq Joty and Steven C. H. Hoi,http://arxiv.org/pdf/2109.00859v1
http://arxiv.org/abs/2012.01266v2,creativecommons.org/licenses/by-nc-sa/4.0/,Meta-KD: A Meta Knowledge Distillation Framework for Language Model Compression across Domains,Haojie Pan and Chengyu Wang and Minghui Qiu and Yichang Zhang and Yaliang Li and Jun Huang,http://arxiv.org/pdf/2012.01266v2
http://arxiv.org/abs/2207.06591v3,creativecommons.org/licenses/by-nc-sa/4.0/,A methodology to characterize bias and harmful stereotypes in natural language processing in Latin America,Laura Alonso Alemany and Luciana Benotti and Hernán Maina and Lucía González and Mariela Rajngewerc and Lautaro Martínez and Jorge Sánchez and Mauro Schilman and Guido Ivetta and Alexia Halvorsen and Amanda Mata Rojo and Matías Bordone and Beatriz Busaniche,http://arxiv.org/pdf/2207.06591v3
http://arxiv.org/abs/1903.09442v2,creativecommons.org/licenses/by-nc-sa/4.0/,LINSPECTOR: Multilingual Probing Tasks for Word Representations,Gözde Gül Şahin and Clara Vania and Ilia Kuznetsov and Iryna Gurevych,http://arxiv.org/pdf/1903.09442v2
http://arxiv.org/abs/2301.11564v1,creativecommons.org/licenses/by-nc-sa/4.0/,Learning 6-DoF Fine-grained Grasp Detection Based on Part Affordance Grounding,Yaoxian Song and Penglei Sun and Yi Ren and Yu Zheng and Yue Zhang,http://arxiv.org/pdf/2301.11564v1
http://arxiv.org/abs/2304.02213v5,creativecommons.org/licenses/by-nc-sa/4.0/,Large Language Models as Master Key: Unlocking the Secrets of Materials Science with GPT,Tong Xie and Yuwei Wan and Wei Huang and Yufei Zhou and Yixuan Liu and Qingyuan Linghu and Shaozhou Wang and Chunyu Kit and Clara Grazian and Wenjie Zhang and Bram Hoex,http://arxiv.org/pdf/2304.02213v5
http://arxiv.org/abs/2004.03636v1,creativecommons.org/licenses/by-nc-sa/4.0/,Efficient long-distance relation extraction with DG-SpanBERT,Jun Chen and Robert Hoehndorf and Mohamed Elhoseiny and Xiangliang Zhang,http://arxiv.org/pdf/2004.03636v1
http://arxiv.org/abs/2204.00923v3,creativecommons.org/licenses/by-nc-sa/4.0/,Word separation in continuous sign language using isolated signs and post-processing,Razieh Rastgoo and Kourosh Kiani and Sergio Escalera,http://arxiv.org/pdf/2204.00923v3
http://arxiv.org/abs/2211.07713v1,creativecommons.org/licenses/by-nc-sa/4.0/,How Long Is Enough? Exploring the Optimal Intervals of Long-Range Clinical Note Language Modeling,Samuel Cahyawijaya and Bryan Wilie and Holy Lovenia and Huan Zhong and MingQian Zhong and Yuk-Yu Nancy Ip and Pascale Fung,http://arxiv.org/pdf/2211.07713v1
http://arxiv.org/abs/2301.02773v1,creativecommons.org/licenses/by-nc-sa/4.0/,Building a Parallel Corpus and Training Translation Models Between Luganda and English,Richard Kimera and Daniela N. Rim and Heeyoul Choi,http://arxiv.org/pdf/2301.02773v1
http://arxiv.org/abs/2301.12066v1,creativecommons.org/licenses/by-nc-sa/4.0/,Truth Machines: Synthesizing Veracity in AI Language Models,Luke Munn and Liam Magee and Vanicka Arora,http://arxiv.org/pdf/2301.12066v1
http://arxiv.org/abs/2210.11431v1,creativecommons.org/licenses/by-nc-sa/4.0/,Counterfactual Recipe Generation: Exploring Compositional Generalization in a Realistic Scenario,Xiao Liu and Yansong Feng and Jizhi Tang and Chengang Hu and Dongyan Zhao,http://arxiv.org/pdf/2210.11431v1
http://arxiv.org/abs/1609.06649v1,creativecommons.org/licenses/by-nc-sa/4.0/,Minimally Supervised Written-to-Spoken Text Normalization,Ke Wu and Kyle Gorman and Richard Sproat,http://arxiv.org/pdf/1609.06649v1
http://arxiv.org/abs/2109.13582v2,creativecommons.org/licenses/by-nc-sa/4.0/,PPL-MCTS: Constrained Textual Generation Through Discriminator-Guided MCTS Decoding,Antoine Chaffin and Vincent Claveau and Ewa Kijak,http://arxiv.org/pdf/2109.13582v2 | |
http://arxiv.org/abs/2201.07338v2,creativecommons.org/licenses/by-nc-sa/4.0/,Controllable Protein Design with Language Models,Noelia Ferruz and Birte Höcker,http://arxiv.org/pdf/2201.07338v2 | |
http://arxiv.org/abs/2205.05391v1,creativecommons.org/licenses/by-nc-sa/4.0/,Query-Based Keyphrase Extraction from Long Documents,Martin Docekal and Pavel Smrz,http://arxiv.org/pdf/2205.05391v1 | |
http://arxiv.org/abs/2304.06337v1,creativecommons.org/licenses/by-nc-sa/4.0/,Computational modeling of semantic change,Nina Tahmasebi and Haim Dubossarsky,http://arxiv.org/pdf/2304.06337v1 | |
http://arxiv.org/abs/2101.08106v2,creativecommons.org/licenses/by-nc-sa/4.0/,Learning to Augment for Data-Scarce Domain BERT Knowledge Distillation,Lingyun Feng and Minghui Qiu and Yaliang Li and Hai-Tao Zheng and Ying Shen,http://arxiv.org/pdf/2101.08106v2 | |
http://arxiv.org/abs/2106.04403v2,creativecommons.org/licenses/by-nc-sa/4.0/,SynthRef: Generation of Synthetic Referring Expressions for Object Segmentation,Ioannis Kazakos and Carles Ventura and Miriam Bellver and Carina Silberer and Xavier Giro-i-Nieto,http://arxiv.org/pdf/2106.04403v2 | |
http://arxiv.org/abs/2102.10407v5,creativecommons.org/licenses/by-nc-sa/4.0/,VisualGPT: Data-efficient Adaptation of Pretrained Language Models for Image Captioning,Jun Chen and Han Guo and Kai Yi and Boyang Li and Mohamed Elhoseiny,http://arxiv.org/pdf/2102.10407v5 | |
http://arxiv.org/abs/2301.03029v6,creativecommons.org/licenses/by-nc-sa/4.0/,Topic Modelling of Swedish Newspaper Articles about Coronavirus: a Case Study using Latent Dirichlet Allocation Method,Bernadeta Griciūtė and Lifeng Han and Goran Nenadic,http://arxiv.org/pdf/2301.03029v6 | |
http://arxiv.org/abs/2302.10593v1,creativecommons.org/licenses/by-nc-sa/4.0/,Connecting Humanities and Social Sciences: Applying Language and Speech Technology to Online Panel Surveys,Henk van den Heuvel and Martijn Bentum and Simone Wills and Judith C. Koops,http://arxiv.org/pdf/2302.10593v1 | |
http://arxiv.org/abs/2107.07653v3,creativecommons.org/licenses/by-nc-sa/4.0/,TAPEX: Table Pre-training via Learning a Neural SQL Executor,Qian Liu and Bei Chen and Jiaqi Guo and Morteza Ziyadi and Zeqi Lin and Weizhu Chen and Jian-Guang Lou,http://arxiv.org/pdf/2107.07653v3 | |
http://arxiv.org/abs/1805.06087v1,creativecommons.org/licenses/by-nc-sa/4.0/,Learning to Write with Cooperative Discriminators,Ari Holtzman and Jan Buys and Maxwell Forbes and Antoine Bosselut and David Golub and Yejin Choi,http://arxiv.org/pdf/1805.06087v1 | |
http://arxiv.org/abs/2106.06566v1,creativecommons.org/licenses/by-nc-sa/4.0/,Sample-efficient Linguistic Generalizations through Program Synthesis: Experiments with Phonology Problems,Saujas Vaduguru and Aalok Sathe and Monojit Choudhury and Dipti Misra Sharma,http://arxiv.org/pdf/2106.06566v1 | |
http://arxiv.org/abs/2204.04914v1,creativecommons.org/licenses/by-nc-sa/4.0/,Zero-shot Cross-lingual Conversational Semantic Role Labeling,Han Wu and Haochen Tan and Kun Xu and Shuqi Liu and Lianwei Wu and Linqi Song,http://arxiv.org/pdf/2204.04914v1 | |
http://arxiv.org/abs/2205.00258v2,creativecommons.org/licenses/by-nc-sa/4.0/,EasyNLP: A Comprehensive and Easy-to-use Toolkit for Natural Language Processing,Chengyu Wang and Minghui Qiu and Chen Shi and Taolin Zhang and Tingting Liu and Lei Li and Jianing Wang and Ming Wang and Jun Huang and Wei Lin,http://arxiv.org/pdf/2205.00258v2 | |
http://arxiv.org/abs/2206.07666v1,creativecommons.org/licenses/by-nc-sa/4.0/,Transformer-based Automatic Speech Recognition of Formal and Colloquial Czech in MALACH Project,Jan Lehečka and Josef V. Psutka and Josef Psutka,http://arxiv.org/pdf/2206.07666v1 | |
http://arxiv.org/abs/2210.05883v1,creativecommons.org/licenses/by-nc-sa/4.0/,AD-DROP: Attribution-Driven Dropout for Robust Language Model Fine-Tuning,Tao Yang and Jinghao Deng and Xiaojun Quan and Qifan Wang and Shaoliang Nie,http://arxiv.org/pdf/2210.05883v1 | |
http://arxiv.org/abs/2301.13816v2,creativecommons.org/licenses/by-nc-sa/4.0/,Execution-based Code Generation using Deep Reinforcement Learning,Parshin Shojaee and Aneesh Jain and Sindhu Tipirneni and Chandan K. Reddy,http://arxiv.org/pdf/2301.13816v2 | |
http://arxiv.org/abs/2303.09306v2,creativecommons.org/licenses/by-nc-sa/4.0/,BanglaCoNER: Towards Robust Bangla Complex Named Entity Recognition,HAZ Sameen Shahgir and Ramisa Alam and Md. Zarif Ul Alam,http://arxiv.org/pdf/2303.09306v2 | |
http://arxiv.org/abs/2205.06983v2,creativecommons.org/licenses/by-nc-sa/4.0/,RASAT: Integrating Relational Structures into Pretrained Seq2Seq Model for Text-to-SQL,Jiexing Qi and Jingyao Tang and Ziwei He and Xiangpeng Wan and Yu Cheng and Chenghu Zhou and Xinbing Wang and Quanshi Zhang and Zhouhan Lin,http://arxiv.org/pdf/2205.06983v2 | |
http://arxiv.org/abs/2202.07630v1,creativecommons.org/licenses/by-nc-sa/4.0/,Delving Deeper into Cross-lingual Visual Question Answering,Chen Liu and Jonas Pfeiffer and Anna Korhonen and Ivan Vulic and Iryna Gurevych,http://arxiv.org/pdf/2202.07630v1 | |
http://arxiv.org/abs/2203.10250v1,creativecommons.org/licenses/by-nc-sa/4.0/,Meta-X$_{NLG}$: A Meta-Learning Approach Based on Language Clustering for Zero-Shot Cross-Lingual Transfer and Generation,Kaushal Kumar Maurya and Maunendra Sankar Desarkar,http://arxiv.org/pdf/2203.10250v1 | |
http://arxiv.org/abs/1810.09699v1,creativecommons.org/licenses/by-nc-sa/4.0/,Semi-supervised acoustic model training for speech with code-switching,Emre Yılmaz and Mitchell McLaren and Henk van den Heuvel and David A. van Leeuwen,http://arxiv.org/pdf/1810.09699v1 | |
http://arxiv.org/abs/2002.12683v2,creativecommons.org/licenses/by-nc-sa/4.0/,RP-DNN: A Tweet level propagation context based deep neural networks for early rumor detection in Social Media,Jie Gao and Sooji Han and Xingyi Song and Fabio Ciravegna,http://arxiv.org/pdf/2002.12683v2 | |
http://arxiv.org/abs/2206.07278v1,creativecommons.org/licenses/by-nc-sa/4.0/,Nebula Graph: An open source distributed graph database,Min Wu and Xinglu Yi and Hui Yu and Yu Liu and Yujue Wang,http://arxiv.org/pdf/2206.07278v1 | |
http://arxiv.org/abs/1901.03116v2,creativecommons.org/licenses/by-nc-sa/4.0/,Equalizing Gender Biases in Neural Machine Translation with Word Embeddings Techniques,Joel Escudé Font and Marta R. Costa-jussà,http://arxiv.org/pdf/1901.03116v2 | |
http://arxiv.org/abs/2110.08518v2,creativecommons.org/licenses/by-nc-sa/4.0/,MarkupLM: Pre-training of Text and Markup Language for Visually-rich Document Understanding,Junlong Li and Yiheng Xu and Lei Cui and Furu Wei,http://arxiv.org/pdf/2110.08518v2 | |
http://arxiv.org/abs/2112.06953v1,creativecommons.org/licenses/by-nc-sa/4.0/,Controlled Cue Generation for Play Scripts,Alara Dirik and Hilal Donmez and Pinar Yanardag,http://arxiv.org/pdf/2112.06953v1 | |
http://arxiv.org/abs/2202.00535v2,creativecommons.org/licenses/by-nc-sa/4.0/,Novelty Controlled Paraphrase Generation with Retrieval Augmented Conditional Prompt Tuning,Jishnu Ray Chowdhury and Yong Zhuang and Shuyi Wang,http://arxiv.org/pdf/2202.00535v2 | |
http://arxiv.org/abs/2304.06594v1,creativecommons.org/licenses/by-nc-sa/4.0/,Solving Tensor Low Cycle Rank Approximation,Yichuan Deng and Yeqi Gao and Zhao Song,http://arxiv.org/pdf/2304.06594v1 | |
http://arxiv.org/abs/2212.07016v2,creativecommons.org/licenses/by-nc-sa/4.0/,Understanding Zero-Shot Adversarial Robustness for Large-Scale Models,Chengzhi Mao and Scott Geng and Junfeng Yang and Xin Wang and Carl Vondrick,http://arxiv.org/pdf/2212.07016v2 | |
http://arxiv.org/abs/2104.01619v1,creativecommons.org/licenses/by-nc-sa/4.0/,KnowGraph@IITK at SemEval-2021 Task 11: Building KnowledgeGraph for NLP Research,Shashank Shailabh and Sajal Chaurasia and Ashutosh Modi,http://arxiv.org/pdf/2104.01619v1 | |
http://arxiv.org/abs/2110.08395v2,creativecommons.org/licenses/by-nc-sa/4.0/,DS-TOD: Efficient Domain Specialization for Task Oriented Dialog,Chia-Chien Hung and Anne Lauscher and Simone Paolo Ponzetto and Goran Glavaš,http://arxiv.org/pdf/2110.08395v2 | |
http://arxiv.org/abs/2302.09051v4,creativecommons.org/licenses/by-nc-sa/4.0/,"Complex QA and language models hybrid architectures, Survey",Xavier Daull and Patrice Bellot and Emmanuel Bruno and Vincent Martin and Elisabeth Murisasco,http://arxiv.org/pdf/2302.09051v4 | |
http://arxiv.org/abs/2110.07816v1,creativecommons.org/licenses/by-nc-sa/4.0/,Multilingual Neural Machine Translation: Can Linguistic Hierarchies Help?,Fahimeh Saleh and Wray Buntine and Gholamreza Haffari and Lan Du,http://arxiv.org/pdf/2110.07816v1 | |
http://arxiv.org/abs/2304.00472v1,creativecommons.org/licenses/by-nc-sa/4.0/,Querying Large Language Models with SQL,Mohammed Saeed and Nicola De Cao and Paolo Papotti,http://arxiv.org/pdf/2304.00472v1 | |
http://arxiv.org/abs/1712.05972v2,creativecommons.org/licenses/by-nc-sa/4.0/,"Train Once, Test Anywhere: Zero-Shot Learning for Text Classification",Pushpankar Kumar Pushp and Muktabh Mayank Srivastava,http://arxiv.org/pdf/1712.05972v2 | |
http://arxiv.org/abs/2109.10282v5,creativecommons.org/licenses/by-nc-sa/4.0/,TrOCR: Transformer-based Optical Character Recognition with Pre-trained Models,Minghao Li and Tengchao Lv and Jingye Chen and Lei Cui and Yijuan Lu and Dinei Florencio and Cha Zhang and Zhoujun Li and Furu Wei,http://arxiv.org/pdf/2109.10282v5 | |
http://arxiv.org/abs/2203.07996v2,creativecommons.org/licenses/by-nc-sa/4.0/,Leveraging Unimodal Self-Supervised Learning for Multimodal Audio-Visual Speech Recognition,Xichen Pan and Peiyu Chen and Yichen Gong and Helong Zhou and Xinbing Wang and Zhouhan Lin,http://arxiv.org/pdf/2203.07996v2 | |
http://arxiv.org/abs/2109.08270v3,creativecommons.org/licenses/by-nc-sa/4.0/,Language Models as a Knowledge Source for Cognitive Agents,"Robert E. Wray, III and James R. Kirk and John E. Laird",http://arxiv.org/pdf/2109.08270v3 | |
http://arxiv.org/abs/2210.12460v1,creativecommons.org/licenses/by-nc-sa/4.0/,Collaborative Reasoning on Multi-Modal Semantic Graphs for Video-Grounded Dialogue Generation,Xueliang Zhao and Yuxuan Wang and Chongyang Tao and Chenshuo Wang and Dongyan Zhao,http://arxiv.org/pdf/2210.12460v1 | |
http://arxiv.org/abs/2108.03533v3,creativecommons.org/licenses/by-nc-sa/4.0/,Improving Similar Language Translation With Transfer Learning,Ife Adebara and Muhammad Abdul-Mageed,http://arxiv.org/pdf/2108.03533v3 | |
http://arxiv.org/abs/2004.00139v1,creativecommons.org/licenses/by-nc-sa/4.0/,A Swiss German Dictionary: Variation in Speech and Writing,Larissa Schmidt and Lucy Linder and Sandra Djambazovska and Alexandros Lazaridis and Tanja Samardžić and Claudiu Musat,http://arxiv.org/pdf/2004.00139v1 | |
http://arxiv.org/abs/2105.09043v2,creativecommons.org/licenses/by-nc-sa/4.0/,Sentence Extraction-Based Machine Reading Comprehension for Vietnamese,Phong Nguyen-Thuan Do and Nhat Duy Nguyen and Tin Van Huynh and Kiet Van Nguyen and Anh Gia-Tuan Nguyen and Ngan Luu-Thuy Nguyen,http://arxiv.org/pdf/2105.09043v2 | |
http://arxiv.org/abs/2105.14779v2,creativecommons.org/licenses/by-nc-sa/4.0/,Towards One Model to Rule All: Multilingual Strategy for Dialectal Code-Switching Arabic ASR,Shammur Absar Chowdhury and Amir Hussein and Ahmed Abdelali and Ahmed Ali,http://arxiv.org/pdf/2105.14779v2 | |
http://arxiv.org/abs/1904.02818v1,creativecommons.org/licenses/by-nc-sa/4.0/,Neural Networks for Modeling Source Code Edits,Rui Zhao and David Bieber and Kevin Swersky and Daniel Tarlow,http://arxiv.org/pdf/1904.02818v1 | |
http://arxiv.org/abs/2205.14583v2,creativecommons.org/licenses/by-nc-sa/4.0/,Learning Locality and Isotropy in Dialogue Modeling,Han Wu and Haochen Tan and Mingjie Zhan and Gangming Zhao and Shaoqing Lu and Ding Liang and Linqi Song,http://arxiv.org/pdf/2205.14583v2 | |
http://arxiv.org/abs/2107.00281v3,creativecommons.org/licenses/by-nc-sa/4.0/,Scientia Potentia Est -- On the Role of Knowledge in Computational Argumentation,Anne Lauscher and Henning Wachsmuth and Iryna Gurevych and Goran Glavaš,http://arxiv.org/pdf/2107.00281v3 | |
http://arxiv.org/abs/1908.01341v1,creativecommons.org/licenses/by-nc-sa/4.0/,SF-Net: Structured Feature Network for Continuous Sign Language Recognition,Zhaoyang Yang and Zhenmei Shi and Xiaoyong Shen and Yu-Wing Tai,http://arxiv.org/pdf/1908.01341v1 | |
http://arxiv.org/abs/2110.07938v1,creativecommons.org/licenses/by-nc-sa/4.0/,Identifying Causal Influences on Publication Trends and Behavior: A Case Study of the Computational Linguistics Community,Maria Glenski and Svitlana Volkova,http://arxiv.org/pdf/2110.07938v1 | |
http://arxiv.org/abs/2111.01340v2,creativecommons.org/licenses/by-nc-sa/4.0/,Adapting to the Long Tail: A Meta-Analysis of Transfer Learning Research for Language Understanding Tasks,Aakanksha Naik and Jill Lehman and Carolyn Rose,http://arxiv.org/pdf/2111.01340v2 | |
http://arxiv.org/abs/2111.08545v1,creativecommons.org/licenses/by-nc-sa/4.0/,Coral: An Approach for Conversational Agents in Mental Health Applications,Harsh Sakhrani and Saloni Parekh and Shubham Mahajan,http://arxiv.org/pdf/2111.08545v1 | |
http://arxiv.org/abs/2206.08474v1,creativecommons.org/licenses/by-nc-sa/4.0/,XLCoST: A Benchmark Dataset for Cross-lingual Code Intelligence,Ming Zhu and Aneesh Jain and Karthik Suresh and Roshan Ravindran and Sindhu Tipirneni and Chandan K. Reddy,http://arxiv.org/pdf/2206.08474v1 | |
http://arxiv.org/abs/2210.07095v2,creativecommons.org/licenses/by-nc-sa/4.0/,Incorporating Context into Subword Vocabularies,Shaked Yehezkel and Yuval Pinter,http://arxiv.org/pdf/2210.07095v2 | |
http://arxiv.org/abs/2303.06273v1,creativecommons.org/licenses/by-nc-sa/4.0/,Consistency Analysis of ChatGPT,Myeongjun Jang and Thomas Lukasiewicz,http://arxiv.org/pdf/2303.06273v1 | |
http://arxiv.org/abs/2212.10015v1,creativecommons.org/licenses/by-nc-sa/4.0/,Benchmarking Spatial Relationships in Text-to-Image Generation,Tejas Gokhale and Hamid Palangi and Besmira Nushi and Vibhav Vineet and Eric Horvitz and Ece Kamar and Chitta Baral and Yezhou Yang,http://arxiv.org/pdf/2212.10015v1 | |
http://arxiv.org/abs/2301.11749v1,creativecommons.org/licenses/by-nc-sa/4.0/,A Multi-task Multi-stage Transitional Training Framework for Neural Chat Translation,Chulun Zhou and Yunlong Liang and Fandong Meng and Jie Zhou and Jinan Xu and Hongji Wang and Min Zhang and Jinsong Su,http://arxiv.org/pdf/2301.11749v1 | |
http://arxiv.org/abs/1909.05855v2,creativecommons.org/licenses/by-nc-sa/4.0/,Towards Scalable Multi-domain Conversational Agents: The Schema-Guided Dialogue Dataset,Abhinav Rastogi and Xiaoxue Zang and Srinivas Sunkara and Raghav Gupta and Pranav Khaitan,http://arxiv.org/pdf/1909.05855v2 | |
http://arxiv.org/abs/2106.00245v2,creativecommons.org/licenses/by-nc-sa/4.0/,Adversarial VQA: A New Benchmark for Evaluating the Robustness of VQA Models,Linjie Li and Jie Lei and Zhe Gan and Jingjing Liu,http://arxiv.org/pdf/2106.00245v2 | |
http://arxiv.org/abs/2006.07490v1,creativecommons.org/licenses/by-nc-sa/4.0/,Understanding Unintended Memorization in Federated Learning,Om Thakkar and Swaroop Ramaswamy and Rajiv Mathews and Françoise Beaufays,http://arxiv.org/pdf/2006.07490v1 | |
http://arxiv.org/abs/2210.04802v1,creativecommons.org/licenses/by-nc-sa/4.0/,SimSCOOD: Systematic Analysis of Out-of-Distribution Behavior of Source Code Models,Hossein Hajipour and Ning Yu and Cristian-Alexandru Staicu and Mario Fritz,http://arxiv.org/pdf/2210.04802v1 | |
http://arxiv.org/abs/2304.03277v1,creativecommons.org/licenses/by-nc-sa/4.0/,Instruction Tuning with GPT-4,Baolin Peng and Chunyuan Li and Pengcheng He and Michel Galley and Jianfeng Gao,http://arxiv.org/pdf/2304.03277v1 | |
http://arxiv.org/abs/2112.11914v1,creativecommons.org/licenses/by-nc-sa/4.0/,Assisted Text Annotation Using Active Learning to Achieve High Quality with Little Effort,Franziska Weeber and Felix Hamborg and Karsten Donnay and Bela Gipp,http://arxiv.org/pdf/2112.11914v1 | |
http://arxiv.org/abs/2210.03235v3,creativecommons.org/licenses/by-nc-sa/4.0/,Improving Large-scale Paraphrase Acquisition and Generation,Yao Dou and Chao Jiang and Wei Xu,http://arxiv.org/pdf/2210.03235v3 | |
http://arxiv.org/abs/2303.15725v1,creativecommons.org/licenses/by-nc-sa/4.0/,"Solving Regularized Exp, Cosh and Sinh Regression Problems",Zhihang Li and Zhao Song and Tianyi Zhou,http://arxiv.org/pdf/2303.15725v1 | |
http://arxiv.org/abs/2304.02839v1,creativecommons.org/licenses/by-nc-sa/4.0/,"Whose Text Is It Anyway? Exploring BigCode, Intellectual Property, and Ethics",Madiha Zahrah Choksi and David Goedicke,http://arxiv.org/pdf/2304.02839v1 | |
http://arxiv.org/abs/1604.03184v1,creativecommons.org/licenses/by-nc-sa/4.0/,Desiree - a Refinement Calculus for Requirements Engineering,Feng-Lin Li and John Mylopoulos,http://arxiv.org/pdf/1604.03184v1 | |
http://arxiv.org/abs/1912.03768v2,creativecommons.org/licenses/by-nc-sa/4.0/,TypeWriter: Neural Type Prediction with Search-based Validation,Michael Pradel and Georgios Gousios and Jason Liu and Satish Chandra,http://arxiv.org/pdf/1912.03768v2 | |
http://arxiv.org/abs/2104.08087v1,creativecommons.org/licenses/by-nc-sa/4.0/,Citations are not opinions: a corpus linguistics approach to understanding how citations are made,Domenic Rosati,http://arxiv.org/pdf/2104.08087v1 | |
http://arxiv.org/abs/2303.01490v1,creativecommons.org/licenses/by-nc-sa/4.0/,Language Variety Identification with True Labels,Marcos Zampieri and Kai North and Tommi Jauhiainen and Mariano Felice and Neha Kumari and Nishant Nair and Yash Bangera,http://arxiv.org/pdf/2303.01490v1 | |
http://arxiv.org/abs/2111.05193v2,creativecommons.org/licenses/by-nc-sa/4.0/,A Survey on Green Deep Learning,Jingjing Xu and Wangchunshu Zhou and Zhiyi Fu and Hao Zhou and Lei Li,http://arxiv.org/pdf/2111.05193v2 | |
http://arxiv.org/abs/2304.07772v1,creativecommons.org/licenses/by-nc-sa/4.0/,A Comprehensive Evaluation of the Copy Mechanism for Natural Language to SPARQL Query Generation,Samuel Reyd and Amal Zouaq and Papa Abdou Karim Karou Diallo,http://arxiv.org/pdf/2304.07772v1 | |
http://arxiv.org/abs/2212.01757v1,creativecommons.org/licenses/by-nc-sa/4.0/,Languages You Know Influence Those You Learn: Impact of Language Characteristics on Multi-Lingual Text-to-Text Transfer,Benjamin Muller and Deepanshu Gupta and Siddharth Patwardhan and Jean-Philippe Fauconnier and David Vandyke and Sachin Agarwal,http://arxiv.org/pdf/2212.01757v1 | |
http://arxiv.org/abs/1809.01229v1,creativecommons.org/licenses/by-nc-sa/4.0/,t-Exponential Memory Networks for Question-Answering Machines,Kyriakos Tolias and Sotirios Chatzis,http://arxiv.org/pdf/1809.01229v1 | |
http://arxiv.org/abs/2202.13623v1,creativecommons.org/licenses/by-nc-sa/4.0/,Interactive Machine Learning for Image Captioning,Mareike Hartmann and Aliki Anagnostopoulou and Daniel Sonntag,http://arxiv.org/pdf/2202.13623v1 | |
http://arxiv.org/abs/2109.03926v2,creativecommons.org/licenses/by-nc-sa/4.0/,Transformers in the loop: Polarity in neural models of language,Lisa Bylinina and Alexey Tikhonov,http://arxiv.org/pdf/2109.03926v2 | |
http://arxiv.org/abs/2206.07627v1,creativecommons.org/licenses/by-nc-sa/4.0/,Exploring Capabilities of Monolingual Audio Transformers using Large Datasets in Automatic Speech Recognition of Czech,Jan Lehečka and Jan Švec and Aleš Pražák and Josef V. Psutka,http://arxiv.org/pdf/2206.07627v1 | |
http://arxiv.org/abs/2106.07876v3,creativecommons.org/licenses/by-nc-sa/4.0/,Vision-Language Navigation with Random Environmental Mixup,Chong Liu and Fengda Zhu and Xiaojun Chang and Xiaodan Liang and Zongyuan Ge and Yi-Dong Shen,http://arxiv.org/pdf/2106.07876v3 | |
http://arxiv.org/abs/1809.06471v1,creativecommons.org/licenses/by-nc-sa/4.0/,A Language for Large-Scale Collaboration in Economics: A Streamlined Computational Representation of Financial Models,Jorge Faleiro,http://arxiv.org/pdf/1809.06471v1 | |
http://arxiv.org/abs/2212.01218v1,creativecommons.org/licenses/by-nc-sa/4.0/,Answer ranking in Community Question Answering: a deep learning approach,Lucas Valentin,http://arxiv.org/pdf/2212.01218v1 | |
http://arxiv.org/abs/1610.08431v3,creativecommons.org/licenses/by-nc-sa/4.0/,Broad Context Language Modeling as Reading Comprehension,Zewei Chu and Hai Wang and Kevin Gimpel and David McAllester,http://arxiv.org/pdf/1610.08431v3 | |
http://arxiv.org/abs/2105.15065v1,creativecommons.org/licenses/by-nc-sa/4.0/,Picking Pearl From Seabed: Extracting Artefacts from Noisy Issue Triaging Collaborative Conversations for Hybrid Cloud Services,Amar Prakash Azad and Supriyo Ghosh and Ajay Gupta and Harshit Kumar and Prateeti Mohapatra,http://arxiv.org/pdf/2105.15065v1 | |
http://arxiv.org/abs/2111.08374v3,creativecommons.org/licenses/by-nc-sa/4.0/,Literature-Augmented Clinical Outcome Prediction,Aakanksha Naik and Sravanthi Parasa and Sergey Feldman and Lucy Lu Wang and Tom Hope,http://arxiv.org/pdf/2111.08374v3 | |
http://arxiv.org/abs/2205.02293v3,creativecommons.org/licenses/by-nc-sa/4.0/,Original or Translated? A Causal Analysis of the Impact of Translationese on Machine Translation Performance,Jingwei Ni and Zhijing Jin and Markus Freitag and Mrinmaya Sachan and Bernhard Schölkopf,http://arxiv.org/pdf/2205.02293v3 | |
http://arxiv.org/abs/2209.08966v2,creativecommons.org/licenses/by-nc-sa/4.0/,Will It Blend? Mixing Training Paradigms & Prompting for Argument Quality Prediction,Michiel van der Meer and Myrthe Reuver and Urja Khurana and Lea Krause and Selene Báez Santamaría,http://arxiv.org/pdf/2209.08966v2 | |
http://arxiv.org/abs/2212.02851v1,creativecommons.org/licenses/by-nc-sa/4.0/,DiSTRICT: Dialogue State Tracking with Retriever Driven In-Context Tuning,Praveen Venkateswaran and Evelyn Duesterwald and Vatche Isahagian,http://arxiv.org/pdf/2212.02851v1 | |
http://arxiv.org/abs/2303.07240v1,creativecommons.org/licenses/by-nc-sa/4.0/,PMC-CLIP: Contrastive Language-Image Pre-training using Biomedical Documents,Weixiong Lin and Ziheng Zhao and Xiaoman Zhang and Chaoyi Wu and Ya Zhang and Yanfeng Wang and Weidi Xie,http://arxiv.org/pdf/2303.07240v1 | |
http://arxiv.org/abs/2304.05973v1,creativecommons.org/licenses/by-nc-sa/4.0/,HiPrompt: Few-Shot Biomedical Knowledge Fusion via Hierarchy-Oriented Prompting,Jiaying Lu and Jiaming Shen and Bo Xiong and Wenjing Ma and Steffen Staab and Carl Yang,http://arxiv.org/pdf/2304.05973v1 | |
http://arxiv.org/abs/2206.05224v2,creativecommons.org/licenses/by-nc-sa/4.0/,A Multi-Task Benchmark for Korean Legal Language Understanding and Judgement Prediction,Wonseok Hwang and Dongjun Lee and Kyoungyeon Cho and Hanuhl Lee and Minjoon Seo,http://arxiv.org/pdf/2206.05224v2 | |
http://arxiv.org/abs/2201.07614v1,creativecommons.org/licenses/by-nc-sa/4.0/,Uncovering More Shallow Heuristics: Probing the Natural Language Inference Capacities of Transformer-Based Pre-Trained Language Models Using Syllogistic Patterns,Reto Gubelmann and Siegfried Handschuh,http://arxiv.org/pdf/2201.07614v1 | |
http://arxiv.org/abs/2203.03191v1,creativecommons.org/licenses/by-nc-sa/4.0/,Language-Agnostic Meta-Learning for Low-Resource Text-to-Speech with Articulatory Features,Florian Lux and Ngoc Thang Vu,http://arxiv.org/pdf/2203.03191v1 | |
http://arxiv.org/abs/2012.14740v4,creativecommons.org/licenses/by-nc-sa/4.0/,LayoutLMv2: Multi-modal Pre-training for Visually-Rich Document Understanding,Yang Xu and Yiheng Xu and Tengchao Lv and Lei Cui and Furu Wei and Guoxin Wang and Yijuan Lu and Dinei Florencio and Cha Zhang and Wanxiang Che and Min Zhang and Lidong Zhou,http://arxiv.org/pdf/2012.14740v4 | |
http://arxiv.org/abs/2204.04504v1,creativecommons.org/licenses/by-nc-sa/4.0/,TANet: Thread-Aware Pretraining for Abstractive Conversational Summarization,Ze Yang and Liran Wang and Zhoujin Tian and Wei Wu and Zhoujun Li,http://arxiv.org/pdf/2204.04504v1 | |
http://arxiv.org/abs/2011.13633v2,creativecommons.org/licenses/by-nc-sa/4.0/,CoRe: An Efficient Coarse-refined Training Framework for BERT,Cheng Yang and Shengnan Wang and Yuechuan Li and Chao Yang and Ming Yan and Jingqiao Zhang and Fangquan Lin,http://arxiv.org/pdf/2011.13633v2 | |
http://arxiv.org/abs/1806.10423v2,creativecommons.org/licenses/by-nc-sa/4.0/,Implementing Convex Optimization in R: Two Econometric Examples,Zhan Gao and Zhentao Shi,http://arxiv.org/pdf/1806.10423v2 | |
http://arxiv.org/abs/2210.09472v2,creativecommons.org/licenses/by-nc-sa/4.0/,Multi-granularity Argument Mining in Legal Texts,Huihui Xu and Kevin Ashley,http://arxiv.org/pdf/2210.09472v2 | |
http://arxiv.org/abs/2211.06774v2,creativecommons.org/licenses/by-nc-sa/4.0/,Large-Scale Bidirectional Training for Zero-Shot Image Captioning,Taehoon Kim and Mark Marsden and Pyunghwan Ahn and Sangyun Kim and Sihaeng Lee and Alessandra Sala and Seung Hwan Kim,http://arxiv.org/pdf/2211.06774v2 | |
http://arxiv.org/abs/1908.05828v1,creativecommons.org/licenses/by-nc-sa/4.0/,Named Entity Recognition for Nepali Language,Oyesh Mann Singh and Ankur Padia and Anupam Joshi,http://arxiv.org/pdf/1908.05828v1 | |
http://arxiv.org/abs/2002.00759v2,creativecommons.org/licenses/by-nc-sa/4.0/,Comparison Between Traditional Machine Learning Models And Neural Network Models For Vietnamese Hate Speech Detection,Son T. Luu and Hung P. Nguyen and Kiet Van Nguyen and Ngan Luu-Thuy Nguyen,http://arxiv.org/pdf/2002.00759v2 | |
http://arxiv.org/abs/2005.06752v1,creativecommons.org/licenses/by-nc-sa/4.0/,Large Scale Font Independent Urdu Text Recognition System,Atique Ur Rehman and Sibt Ul Hussain,http://arxiv.org/pdf/2005.06752v1 |