Created April 30, 2023 22:56
llm-papers
URL,License,Title,Author(s),PDF
http://arxiv.org/abs/2202.03371v1,creativecommons.org/licenses/by-sa/4.0/,Cedille: A large autoregressive French language model,Martin Müller and Florian Laurent,http://arxiv.org/pdf/2202.03371v1
http://arxiv.org/abs/2303.00077v1,creativecommons.org/licenses/by-sa/4.0/,Beyond the limitations of any imaginable mechanism: large language models and psycholinguistics,Conor Houghton and Nina Kazanina and Priyanka Sukumaran,http://arxiv.org/pdf/2303.00077v1
http://arxiv.org/abs/2010.12858v2,creativecommons.org/licenses/by-sa/4.0/,When Being Unseen from mBERT is just the Beginning: Handling New Languages With Multilingual Language Models,Benjamin Muller and Antonis Anastasopoulos and Benoît Sagot and Djamé Seddah,http://arxiv.org/pdf/2010.12858v2
http://arxiv.org/abs/2303.01911v1,creativecommons.org/licenses/by-sa/4.0/,Investigating the Translation Performance of a Large Multilingual Language Model: the Case of BLOOM,Rachel Bawden and François Yvon,http://arxiv.org/pdf/2303.01911v1
http://arxiv.org/abs/2111.06053v1,creativecommons.org/licenses/by-sa/4.0/,Improving Large-scale Language Models and Resources for Filipino,Jan Christian Blaise Cruz and Charibeth Cheng,http://arxiv.org/pdf/2111.06053v1
http://arxiv.org/abs/2301.01162v1,creativecommons.org/licenses/by-sa/4.0/,Language Models are Drummers: Drum Composition with Natural Language Pre-Training,Li Zhang and Chris Callison-Burch,http://arxiv.org/pdf/2301.01162v1
http://arxiv.org/abs/2207.13988v2,creativecommons.org/licenses/by-sa/4.0/,Sequence to sequence pretraining for a less-resourced Slovenian language,Matej Ulčar and Marko Robnik-Šikonja,http://arxiv.org/pdf/2207.13988v2
http://arxiv.org/abs/2005.00318v1,creativecommons.org/licenses/by-sa/4.0/,Can Multilingual Language Models Transfer to an Unseen Dialect? A Case Study on North African Arabizi,Benjamin Muller and Benoit Sagot and Djamé Seddah,http://arxiv.org/pdf/2005.00318v1
http://arxiv.org/abs/2301.10472v1,creativecommons.org/licenses/by-sa/4.0/,XLM-V: Overcoming the Vocabulary Bottleneck in Multilingual Masked Language Models,Davis Liang and Hila Gonen and Yuning Mao and Rui Hou and Naman Goyal and Marjan Ghazvininejad and Luke Zettlemoyer and Madian Khabsa,http://arxiv.org/pdf/2301.10472v1
http://arxiv.org/abs/1810.11895v3,creativecommons.org/licenses/by-sa/4.0/,"Language Modeling for Code-Switching: Evaluation, Integration of Monolingual Data, and Discriminative Training",Hila Gonen and Yoav Goldberg,http://arxiv.org/pdf/1810.11895v3
http://arxiv.org/abs/2110.10319v1,creativecommons.org/licenses/by-sa/4.0/,LMSOC: An Approach for Socially Sensitive Pretraining,Vivek Kulkarni and Shubhanshu Mishra and Aria Haghighi,http://arxiv.org/pdf/2110.10319v1
http://arxiv.org/abs/2207.06882v1,creativecommons.org/licenses/by-sa/4.0/,Multilinguals at SemEval-2022 Task 11: Complex NER in Semantically Ambiguous Settings for Low Resource Languages,Amit Pandey and Swayatta Daw and Narendra Babu Unnam and Vikram Pudi,http://arxiv.org/pdf/2207.06882v1
http://arxiv.org/abs/2109.02903v2,creativecommons.org/licenses/by-sa/4.0/,IndicBART: A Pre-trained Model for Indic Natural Language Generation,Raj Dabre and Himani Shrotriya and Anoop Kunchukuttan and Ratish Puduppully and Mitesh M. Khapra and Pratyush Kumar,http://arxiv.org/pdf/2109.02903v2
http://arxiv.org/abs/2210.00320v1,creativecommons.org/licenses/by-sa/4.0/,MALM: Mixing Augmented Language Modeling for Zero-Shot Machine Translation,Kshitij Gupta,http://arxiv.org/pdf/2210.00320v1
http://arxiv.org/abs/2205.14288v1,creativecommons.org/licenses/by-sa/4.0/,Few-shot Subgoal Planning with Language Models,Lajanugen Logeswaran and Yao Fu and Moontae Lee and Honglak Lee,http://arxiv.org/pdf/2205.14288v1
http://arxiv.org/abs/2211.07524v1,creativecommons.org/licenses/by-sa/4.0/,Towards a Mathematics Formalisation Assistant using Large Language Models,Ayush Agrawal and Siddhartha Gadgil and Navin Goyal and Ashvni Narayanan and Anand Tadipatri,http://arxiv.org/pdf/2211.07524v1
http://arxiv.org/abs/1904.01989v1,creativecommons.org/licenses/by-sa/4.0/,Subword-Level Language Identification for Intra-Word Code-Switching,Manuel Mager and Özlem Çetinoğlu and Katharina Kann,http://arxiv.org/pdf/1904.01989v1
http://arxiv.org/abs/2009.08712v1,creativecommons.org/licenses/by-sa/4.0/,The birth of Romanian BERT,Stefan Daniel Dumitrescu and Andrei-Marius Avram and Sampo Pyysalo,http://arxiv.org/pdf/2009.08712v1
http://arxiv.org/abs/2204.02292v2,creativecommons.org/licenses/by-sa/4.0/,Parameter-Efficient Neural Reranking for Cross-Lingual and Multilingual Retrieval,Robert Litschko and Ivan Vulić and Goran Glavaš,http://arxiv.org/pdf/2204.02292v2
http://arxiv.org/abs/2101.11575v1,creativecommons.org/licenses/by-sa/4.0/,Mining Large-Scale Low-Resource Pronunciation Data From Wikipedia,Tania Chakraborty and Manasa Prasad and Theresa Breiner and Sandy Ritchie and Daan van Esch,http://arxiv.org/pdf/2101.11575v1
http://arxiv.org/abs/2203.06462v2,creativecommons.org/licenses/by-sa/4.0/,Low-Rank Softmax Can Have Unargmaxable Classes in Theory but Rarely in Practice,Andreas Grivas and Nikolay Bogoychev and Adam Lopez,http://arxiv.org/pdf/2203.06462v2
http://arxiv.org/abs/1910.04210v1,creativecommons.org/licenses/by-sa/4.0/,Perturbation Sensitivity Analysis to Detect Unintended Model Biases,Vinodkumar Prabhakaran and Ben Hutchinson and Margaret Mitchell,http://arxiv.org/pdf/1910.04210v1
http://arxiv.org/abs/2201.08471v1,creativecommons.org/licenses/by-sa/4.0/,Transfer Learning Approaches for Building Cross-Language Dense Retrieval Models,Suraj Nair and Eugene Yang and Dawn Lawrie and Kevin Duh and Paul McNamee and Kenton Murray and James Mayfield and Douglas W. Oard,http://arxiv.org/pdf/2201.08471v1
http://arxiv.org/abs/2203.03759v1,creativecommons.org/licenses/by-sa/4.0/,IT5: Large-scale Text-to-text Pretraining for Italian Language Understanding and Generation,Gabriele Sarti and Malvina Nissim,http://arxiv.org/pdf/2203.03759v1
http://arxiv.org/abs/2112.10553v1,creativecommons.org/licenses/by-sa/4.0/,Training dataset and dictionary sizes matter in BERT models: the case of Baltic languages,Matej Ulčar and Marko Robnik-Šikonja,http://arxiv.org/pdf/2112.10553v1
http://arxiv.org/abs/2204.09391v1,creativecommons.org/licenses/by-sa/4.0/,You Are What You Write: Preserving Privacy in the Era of Large Language Models,Richard Plant and Valerio Giuffrida and Dimitra Gkatzia,http://arxiv.org/pdf/2204.09391v1
http://arxiv.org/abs/2212.09895v1,creativecommons.org/licenses/by-sa/4.0/,Improved Long-Form Spoken Language Translation with Large Language Models,Arya D. McCarthy and Hao Zhang and Shankar Kumar and Felix Stahlberg and Axel H. Ng,http://arxiv.org/pdf/2212.09895v1
http://arxiv.org/abs/2004.09456v1,creativecommons.org/licenses/by-sa/4.0/,StereoSet: Measuring stereotypical bias in pretrained language models,Moin Nadeem and Anna Bethke and Siva Reddy,http://arxiv.org/pdf/2004.09456v1
http://arxiv.org/abs/2109.15290v1,creativecommons.org/licenses/by-sa/4.0/,MatSciBERT: A Materials Domain Language Model for Text Mining and Information Extraction,Tanishq Gupta and Mohd Zaki and N. M. Anoop Krishnan and Mausam,http://arxiv.org/pdf/2109.15290v1
http://arxiv.org/abs/2206.01205v1,creativecommons.org/licenses/by-sa/4.0/,Snow Mountain: Dataset of Audio Recordings of The Bible in Low Resource Languages,Kavitha Raju and Anjaly V and Ryan Lish and Joel Mathew,http://arxiv.org/pdf/2206.01205v1
http://arxiv.org/abs/2211.00142v1,creativecommons.org/licenses/by-sa/4.0/,TaTa: A Multilingual Table-to-Text Dataset for African Languages,Sebastian Gehrmann and Sebastian Ruder and Vitaly Nikolaev and Jan A. Botha and Michael Chavinda and Ankur Parikh and Clara Rivera,http://arxiv.org/pdf/2211.00142v1
http://arxiv.org/abs/2107.09931v1,creativecommons.org/licenses/by-sa/4.0/,The Effectiveness of Intermediate-Task Training for Code-Switched Natural Language Understanding,Archiki Prasad and Mohammad Ali Rehan and Shreya Pathak and Preethi Jyothi,http://arxiv.org/pdf/2107.09931v1
http://arxiv.org/abs/2302.07912v1,creativecommons.org/licenses/by-sa/4.0/,Meeting the Needs of Low-Resource Languages: The Value of Automatic Alignments via Pretrained Models,Abteen Ebrahimi and Arya D. McCarthy and Arturo Oncevay and Luis Chiruzzo and John E. Ortega and Gustavo A. Giménez-Lugo and Rolando Coto-Solano and Katharina Kann,http://arxiv.org/pdf/2302.07912v1
http://arxiv.org/abs/2102.12162v1,creativecommons.org/licenses/by-sa/4.0/,From Universal Language Model to Downstream Task: Improving RoBERTa-Based Vietnamese Hate Speech Detection,Quang Huu Pham and Viet Anh Nguyen and Linh Bao Doan and Ngoc N. Tran and Ta Minh Thanh,http://arxiv.org/pdf/2102.12162v1
http://arxiv.org/abs/2211.13613v2,creativecommons.org/licenses/by-sa/4.0/,Ham2Pose: Animating Sign Language Notation into Pose Sequences,Rotem Shalev-Arkushin and Amit Moryossef and Ohad Fried,http://arxiv.org/pdf/2211.13613v2
http://arxiv.org/abs/2212.05058v1,creativecommons.org/licenses/by-sa/4.0/,Structured Like a Language Model: Analysing AI as an Automated Subject,Liam Magee and Vanicka Arora and Luke Munn,http://arxiv.org/pdf/2212.05058v1
http://arxiv.org/abs/2108.01589v1,creativecommons.org/licenses/by-sa/4.0/,ExBERT: An External Knowledge Enhanced BERT for Natural Language Inference,Amit Gajbhiye and Noura Al Moubayed and Steven Bradley,http://arxiv.org/pdf/2108.01589v1
http://arxiv.org/abs/2108.02598v1,creativecommons.org/licenses/by-sa/4.0/,Knowledge Distillation from BERT Transformer to Speech Transformer for Intent Classification,Yidi Jiang and Bidisha Sharma and Maulik Madhavi and Haizhou Li,http://arxiv.org/pdf/2108.02598v1
http://arxiv.org/abs/2301.08130v2,creativecommons.org/licenses/by-sa/4.0/,A Cohesive Distillation Architecture for Neural Language Models,Jan Philip Wahle,http://arxiv.org/pdf/2301.08130v2
http://arxiv.org/abs/2107.10614v1,creativecommons.org/licenses/by-sa/4.0/,Evaluation of contextual embeddings on less-resourced languages,Matej Ulčar and Aleš Žagar and Carlos S. Armendariz and Andraž Repar and Senja Pollak and Matthew Purver and Marko Robnik-Šikonja,http://arxiv.org/pdf/2107.10614v1
http://arxiv.org/abs/2109.10724v1,creativecommons.org/licenses/by-sa/4.0/,Low-Latency Incremental Text-to-Speech Synthesis with Distilled Context Prediction Network,Takaaki Saeki and Shinnosuke Takamichi and Hiroshi Saruwatari,http://arxiv.org/pdf/2109.10724v1
http://arxiv.org/abs/2011.12432v2,creativecommons.org/licenses/by-sa/4.0/,Enhancing deep neural networks with morphological information,Matej Klemen and Luka Krsnik and Marko Robnik-Šikonja,http://arxiv.org/pdf/2011.12432v2
http://arxiv.org/abs/2211.02956v1,creativecommons.org/licenses/by-sa/4.0/,Privacy-Preserving Models for Legal Natural Language Processing,Ying Yin and Ivan Habernal,http://arxiv.org/pdf/2211.02956v1
http://arxiv.org/abs/1606.09403v1,creativecommons.org/licenses/by-sa/4.0/,Learning Crosslingual Word Embeddings without Bilingual Corpora,Long Duong and Hiroshi Kanayama and Tengfei Ma and Steven Bird and Trevor Cohn,http://arxiv.org/pdf/1606.09403v1
http://arxiv.org/abs/2303.12153v1,creativecommons.org/licenses/by-sa/4.0/,Text2Motion: From Natural Language Instructions to Feasible Plans,Kevin Lin and Christopher Agia and Toki Migimatsu and Marco Pavone and Jeannette Bohg,http://arxiv.org/pdf/2303.12153v1
http://arxiv.org/abs/2207.01772v1,creativecommons.org/licenses/by-sa/4.0/,Vision-and-Language Pretraining,Thong Nguyen and Cong-Duy Nguyen and Xiaobao Wu and Anh Tuan Luu,http://arxiv.org/pdf/2207.01772v1
http://arxiv.org/abs/2302.00923v4,creativecommons.org/licenses/by-sa/4.0/,Multimodal Chain-of-Thought Reasoning in Language Models,Zhuosheng Zhang and Aston Zhang and Mu Li and Hai Zhao and George Karypis and Alex Smola,http://arxiv.org/pdf/2302.00923v4
http://arxiv.org/abs/2210.03568v3,creativecommons.org/licenses/by-sa/4.0/,How Large Language Models are Transforming Machine-Paraphrased Plagiarism,Jan Philip Wahle and Terry Ruas and Frederic Kirstein and Bela Gipp,http://arxiv.org/pdf/2210.03568v3
http://arxiv.org/abs/2110.07982v1,creativecommons.org/licenses/by-sa/4.0/,Scribosermo: Fast Speech-to-Text models for German and other Languages,Daniel Bermuth and Alexander Poeppel and Wolfgang Reif,http://arxiv.org/pdf/2110.07982v1
http://arxiv.org/abs/2112.12650v3,creativecommons.org/licenses/by-sa/4.0/,Distilling the Knowledge of Romanian BERTs Using Multiple Teachers,Andrei-Marius Avram and Darius Catrina and Dumitru-Clementin Cercel and Mihai Dascălu and Traian Rebedea and Vasile Păiş and Dan Tufiş,http://arxiv.org/pdf/2112.12650v3
http://arxiv.org/abs/2207.06814v1,creativecommons.org/licenses/by-sa/4.0/,BERTIN: Efficient Pre-Training of a Spanish Language Model using Perplexity Sampling,Javier de la Rosa and Eduardo G. Ponferrada and Paulo Villegas and Pablo Gonzalez de Prado Salas and Manu Romero and Marıa Grandury,http://arxiv.org/pdf/2207.06814v1
http://arxiv.org/abs/2109.05772v1,creativecommons.org/licenses/by-sa/4.0/,Wine is Not v i n. -- On the Compatibility of Tokenizations Across Languages,Antonis Maronikolakis and Philipp Dufter and Hinrich Schütze,http://arxiv.org/pdf/2109.05772v1
http://arxiv.org/abs/2212.10548v1,creativecommons.org/licenses/by-sa/4.0/,T-Projection: High Quality Annotation Projection for Sequence Labeling Tasks,Iker García-Ferrero and Rodrigo Agerri and German Rigau,http://arxiv.org/pdf/2212.10548v1
http://arxiv.org/abs/2302.06476v2,creativecommons.org/licenses/by-sa/4.0/,Is ChatGPT a General-Purpose Natural Language Processing Task Solver?,Chengwei Qin and Aston Zhang and Zhuosheng Zhang and Jiaao Chen and Michihiro Yasunaga and Diyi Yang,http://arxiv.org/pdf/2302.06476v2
http://arxiv.org/abs/2101.03289v5,creativecommons.org/licenses/by-sa/4.0/,Trankit: A Light-Weight Transformer-based Toolkit for Multilingual Natural Language Processing,Minh Van Nguyen and Viet Dac Lai and Amir Pouran Ben Veyseh and Thien Huu Nguyen,http://arxiv.org/pdf/2101.03289v5
http://arxiv.org/abs/2211.09085v1,creativecommons.org/licenses/by-sa/4.0/,Galactica: A Large Language Model for Science,Ross Taylor and Marcin Kardas and Guillem Cucurull and Thomas Scialom and Anthony Hartshorn and Elvis Saravia and Andrew Poulton and Viktor Kerkez and Robert Stojnic,http://arxiv.org/pdf/2211.09085v1
http://arxiv.org/abs/2302.07735v1,creativecommons.org/licenses/by-sa/4.0/,Targeted Attack on GPT-Neo for the SATML Language Model Data Extraction Challenge,Ali Al-Kaswan and Maliheh Izadi and Arie van Deursen,http://arxiv.org/pdf/2302.07735v1
http://arxiv.org/abs/2202.07359v1,creativecommons.org/licenses/by-sa/4.0/,textless-lib: a Library for Textless Spoken Language Processing,Eugene Kharitonov and Jade Copet and Kushal Lakhotia and Tu Anh Nguyen and Paden Tomasello and Ann Lee and Ali Elkahky and Wei-Ning Hsu and Abdelrahman Mohamed and Emmanuel Dupoux and Yossi Adi,http://arxiv.org/pdf/2202.07359v1
http://arxiv.org/abs/2106.05822v1,creativecommons.org/licenses/by-sa/4.0/,GroupBERT: Enhanced Transformer Architecture with Efficient Grouped Structures,Ivan Chelombiev and Daniel Justus and Douglas Orr and Anastasia Dietrich and Frithjof Gressmann and Alexandros Koliousis and Carlo Luschi,http://arxiv.org/pdf/2106.05822v1
http://arxiv.org/abs/2207.00352v1,creativecommons.org/licenses/by-sa/4.0/,Toward Low-Cost End-to-End Spoken Language Understanding,Marco Dinarelli and Marco Naguib and François Portet,http://arxiv.org/pdf/2207.00352v1
http://arxiv.org/abs/2204.13913v1,creativecommons.org/licenses/by-sa/4.0/,Leaner and Faster: Two-Stage Model Compression for Lightweight Text-Image Retrieval,Siyu Ren and Kenny Q. Zhu,http://arxiv.org/pdf/2204.13913v1
http://arxiv.org/abs/2212.07126v1,creativecommons.org/licenses/by-sa/4.0/,Explainability of Text Processing and Retrieval Methods: A Critical Survey,Sourav Saha and Debapriyo Majumdar and Mandar Mitra,http://arxiv.org/pdf/2212.07126v1
http://arxiv.org/abs/2301.13382v1,creativecommons.org/licenses/by-sa/4.0/,Numeracy from Literacy: Data Science as an Emergent Skill from Large Language Models,David Noever and Forrest McKee,http://arxiv.org/pdf/2301.13382v1
http://arxiv.org/abs/2304.01373v1,creativecommons.org/licenses/by-sa/4.0/,Pythia: A Suite for Analyzing Large Language Models Across Training and Scaling,Stella Biderman and Hailey Schoelkopf and Quentin Anthony and Herbie Bradley and Kyle O'Brien and Eric Hallahan and Mohammad Aflah Khan and Shivanshu Purohit and USVSN Sai Prashanth and Edward Raff and Aviya Skowron and Lintang Sutawika and Oskar van der Wal,http://arxiv.org/pdf/2304.01373v1
http://arxiv.org/abs/2106.02679v1,creativecommons.org/licenses/by-sa/4.0/,Layered gradient accumulation and modular pipeline parallelism: fast and efficient training of large language models,Joel Lamy-Poirier,http://arxiv.org/pdf/2106.02679v1
http://arxiv.org/abs/2103.12450v5,creativecommons.org/licenses/by-sa/4.0/,Are Neural Language Models Good Plagiarists? A Benchmark for Neural Paraphrase Detection,Jan Philip Wahle and Terry Ruas and Norman Meuschke and Bela Gipp,http://arxiv.org/pdf/2103.12450v5
http://arxiv.org/abs/2201.03382v1,creativecommons.org/licenses/by-sa/4.0/,BERT for Sentiment Analysis: Pre-trained and Fine-Tuned Alternatives,Frederico Souza and João Filho,http://arxiv.org/pdf/2201.03382v1
http://arxiv.org/abs/2209.06899v1,creativecommons.org/licenses/by-sa/4.0/,"Out of One, Many: Using Language Models to Simulate Human Samples",Lisa P. Argyle and Ethan C. Busby and Nancy Fulda and Joshua Gubler and Christopher Rytting and David Wingate,http://arxiv.org/pdf/2209.06899v1
http://arxiv.org/abs/2102.06203v2,creativecommons.org/licenses/by-sa/4.0/,Proof Artifact Co-training for Theorem Proving with Language Models,Jesse Michael Han and Jason Rute and Yuhuai Wu and Edward W. Ayers and Stanislas Polu,http://arxiv.org/pdf/2102.06203v2
http://arxiv.org/abs/2006.14223v1,creativecommons.org/licenses/by-sa/4.0/,Neural Machine Translation For Paraphrase Generation,Alex Sokolov and Denis Filimonov,http://arxiv.org/pdf/2006.14223v1
http://arxiv.org/abs/2207.00349v1,creativecommons.org/licenses/by-sa/4.0/,Vers la compréhension automatique de la parole bout-en-bout à moindre effort,Marco Naguib and François Portet and Marco Dinarelli,http://arxiv.org/pdf/2207.00349v1
http://arxiv.org/abs/2210.10668v1,creativecommons.org/licenses/by-sa/4.0/,N-Best Hypotheses Reranking for Text-To-SQL Systems,Lu Zeng and Sree Hari Krishnan Parthasarathi and Dilek Hakkani-Tur,http://arxiv.org/pdf/2210.10668v1
http://arxiv.org/abs/2304.12203v1,creativecommons.org/licenses/by-sa/4.0/,Creating Large Language Model Resistant Exams: Guidelines and Strategies,Simon kaare Larsen,http://arxiv.org/pdf/2304.12203v1
http://arxiv.org/abs/2105.09680v4,creativecommons.org/licenses/by-sa/4.0/,KLUE: Korean Language Understanding Evaluation,Sungjoon Park and Jihyung Moon and Sungdong Kim and Won Ik Cho and Jiyoon Han and Jangwon Park and Chisung Song and Junseong Kim and Yongsook Song and Taehwan Oh and Joohong Lee and Juhyun Oh and Sungwon Lyu and Younghoon Jeong and Inkwon Lee and Sangwoo Seo and Dongjun Lee and Hyunwoo Kim and Myeonghwa Lee and Seongbo Jang and Seungwon Do and Sunkyoung Kim and Kyungtae Lim and Jongwon Lee and Kyumin Park and Jamin Shin and Seonghyun Kim and Lucy Park and Alice Oh and Jung-Woo Ha and Kyunghyun Cho,http://arxiv.org/pdf/2105.09680v4
http://arxiv.org/abs/2302.02083v3,creativecommons.org/licenses/by-sa/4.0/,Theory of Mind May Have Spontaneously Emerged in Large Language Models,Michal Kosinski,http://arxiv.org/pdf/2302.02083v3
http://arxiv.org/abs/2208.06042v1,creativecommons.org/licenses/by-sa/4.0/,CodeBERT-nt: code naturalness via CodeBERT,Ahmed Khanfir and Matthieu Jimenez and Mike Papadakis and Yves Le Traon,http://arxiv.org/pdf/2208.06042v1
http://arxiv.org/abs/2110.02056v1,creativecommons.org/licenses/by-sa/4.0/,Are Training Resources Insufficient? Predict First Then Explain!,Myeongjun Jang and Thomas Lukasiewicz,http://arxiv.org/pdf/2110.02056v1
http://arxiv.org/abs/2006.07890v1,creativecommons.org/licenses/by-sa/4.0/,FinEst BERT and CroSloEngual BERT: less is more in multilingual models,Matej Ulčar and Marko Robnik-Šikonja,http://arxiv.org/pdf/2006.07890v1
http://arxiv.org/abs/2302.13942v2,creativecommons.org/licenses/by-sa/4.0/,Inseq: An Interpretability Toolkit for Sequence Generation Models,Gabriele Sarti and Nils Feldhus and Ludwig Sickert and Oskar van der Wal and Malvina Nissim and Arianna Bisazza,http://arxiv.org/pdf/2302.13942v2
http://arxiv.org/abs/2011.10208v1,creativecommons.org/licenses/by-sa/4.0/,Collaborative Storytelling with Large-scale Neural Language Models,Eric Nichols and Leo Gao and Randy Gomez,http://arxiv.org/pdf/2011.10208v1
http://arxiv.org/abs/2001.07063v4,creativecommons.org/licenses/by-sa/4.0/,Modular coinduction up-to for higher-order languages via first-order transition systems,Jean-Marie Madiot and Damien Pous and Davide Sangiorgi,http://arxiv.org/pdf/2001.07063v4
http://arxiv.org/abs/2212.11135v1,creativecommons.org/licenses/by-sa/4.0/,Array-Aware Matching: Taming the Complexity of Large-Scale Simulation Models,Massimo Fioravanti and Daniele Cattaneo and Federico Terraneo and Silvano Seva and Stefano Cherubin and Giovanni Agosta and Francesco Casella and Alberto Leva,http://arxiv.org/pdf/2212.11135v1
http://arxiv.org/abs/2107.02027v2,creativecommons.org/licenses/by-sa/4.0/,Efficient Sequence Packing without Cross-contamination: Accelerating Large Language Models without Impacting Performance,Mario Michael Krell and Matej Kosec and Sergio P. Perez and Andrew Fitzgibbon,http://arxiv.org/pdf/2107.02027v2
http://arxiv.org/abs/2302.14021v1,creativecommons.org/licenses/by-sa/4.0/,Quantifying Valence and Arousal in Text with Multilingual Pre-trained Transformers,Gonçalo Azevedo Mendes and Bruno Martins,http://arxiv.org/pdf/2302.14021v1
http://arxiv.org/abs/2206.15076v1,creativecommons.org/licenses/by-sa/4.0/,BigBIO: A Framework for Data-Centric Biomedical Natural Language Processing,Jason Alan Fries and Leon Weber and Natasha Seelam and Gabriel Altay and Debajyoti Datta and Samuele Garda and Myungsun Kang and Ruisi Su and Wojciech Kusa and Samuel Cahyawijaya and Fabio Barth and Simon Ott and Matthias Samwald and Stephen Bach and Stella Biderman and Mario Sänger and Bo Wang and Alison Callahan and Daniel León Periñán and Théo Gigant and Patrick Haller and Jenny Chim and Jose David Posada and John Michael Giorgi and Karthik Rangasai Sivaraman and Marc Pàmies and Marianna Nezhurina and Robert Martin and Michael Cullan and Moritz Freidank and Nathan Dahlberg and Shubhanshu Mishra and Shamik Bose and Nicholas Michio Broad and Yanis Labrak and Shlok S Deshmukh and Sid Kiblawi and Ayush Singh and Minh Chien Vu and Trishala Neeraj and Jonas Golde and Albert Villanova del Moral and Benjamin Beilharz,http://arxiv.org/pdf/2206.15076v1
http://arxiv.org/abs/2211.17163v1,creativecommons.org/licenses/by-sa/4.0/,Misogyny classification of German newspaper forum comments,Johann Petrak and Brigitte Krenn,http://arxiv.org/pdf/2211.17163v1
http://arxiv.org/abs/2212.10448v1,creativecommons.org/licenses/by-sa/4.0/,Parameter-efficient Zero-shot Transfer for Cross-Language Dense Retrieval with Adapters,Eugene Yang and Suraj Nair and Dawn Lawrie and James Mayfield and Douglas W. Oard,http://arxiv.org/pdf/2212.10448v1
http://arxiv.org/abs/2304.02016v1,creativecommons.org/licenses/by-sa/4.0/,The Multimodal And Modular Ai Chef: Complex Recipe Generation From Imagery,David Noever and Samantha Elizabeth Miller Noever,http://arxiv.org/pdf/2304.02016v1
http://arxiv.org/abs/2108.06277v1,creativecommons.org/licenses/by-sa/4.0/,Towards Structured Dynamic Sparse Pre-Training of BERT,Anastasia Dietrich and Frithjof Gressmann and Douglas Orr and Ivan Chelombiev and Daniel Justus and Carlo Luschi,http://arxiv.org/pdf/2108.06277v1
http://arxiv.org/abs/2304.03325v1,creativecommons.org/licenses/by-sa/4.0/,ChatGPT-Crawler: Find out if ChatGPT really knows what it's talking about,Aman Rangapur and Haoran Wang,http://arxiv.org/pdf/2304.03325v1
http://arxiv.org/abs/2302.13652v1,creativecommons.org/licenses/by-sa/4.0/,Duration-aware pause insertion using pre-trained language model for multi-speaker text-to-speech,Dong Yang and Tomoki Koriyama and Yuki Saito and Takaaki Saeki and Detai Xin and Hiroshi Saruwatari,http://arxiv.org/pdf/2302.13652v1
http://arxiv.org/abs/2104.08512v2,creativecommons.org/licenses/by-sa/4.0/,Minimal Supervision for Morphological Inflection,Omer Goldman and Reut Tsarfaty,http://arxiv.org/pdf/2104.08512v2
http://arxiv.org/abs/2107.08091v1,creativecommons.org/licenses/by-sa/4.0/,A Comparison of Methods for OOV-word Recognition on a New Public Dataset,Rudolf A. Braun and Srikanth Madikeri and Petr Motlicek,http://arxiv.org/pdf/2107.08091v1
http://arxiv.org/abs/2201.07406v2,creativecommons.org/licenses/by-sa/4.0/,Fooling MOSS Detection with Pretrained Language Models,Stella Biderman and Edward Raff,http://arxiv.org/pdf/2201.07406v2
http://arxiv.org/abs/2206.07048v1,creativecommons.org/licenses/by-sa/4.0/,A smile is all you need: Predicting limiting activity coefficients from SMILES with natural language processing,Benedikt Winter and Clemens Winter and Johannes Schilling and André Bardow,http://arxiv.org/pdf/2206.07048v1
http://arxiv.org/abs/2210.13778v1,creativecommons.org/licenses/by-sa/4.0/,IDK-MRC: Unanswerable Questions for Indonesian Machine Reading Comprehension,Rifki Afina Putri and Alice Oh,http://arxiv.org/pdf/2210.13778v1
http://arxiv.org/abs/2008.02878v1,creativecommons.org/licenses/by-sa/4.0/,A Multilingual Neural Machine Translation Model for Biomedical Data,Alexandre Bérard and Zae Myung Kim and Vassilina Nikoulina and Eunjeong L. Park and Matthias Gallé,http://arxiv.org/pdf/2008.02878v1
http://arxiv.org/abs/2208.10448v1,creativecommons.org/licenses/by-sa/4.0/,Dialogue Term Extraction using Transfer Learning and Topological Data Analysis,Renato Vukovic and Michael Heck and Benjamin Matthias Ruppik and Carel van Niekerk and Marcus Zibrowius and Milica Gašić,http://arxiv.org/pdf/2208.10448v1
http://arxiv.org/abs/2210.15187v1,creativecommons.org/licenses/by-sa/4.0/,Learning Joint Representation of Human Motion and Language,Jihoon Kim and Youngjae Yu and Seungyoun Shin and Taehyun Byun and Sungjoon Choi,http://arxiv.org/pdf/2210.15187v1
http://arxiv.org/abs/2304.02496v1,creativecommons.org/licenses/by-sa/4.0/,Evaluation of ChatGPT Family of Models for Biomedical Reasoning and Classification,Shan Chen and Yingya Li and Sheng Lu and Hoang Van and Hugo JWL Aerts and Guergana K. Savova and Danielle S. Bitterman,http://arxiv.org/pdf/2304.02496v1
http://arxiv.org/abs/1711.01048v2,creativecommons.org/licenses/by-sa/4.0/,Dual Language Models for Code Switched Speech Recognition,Saurabh Garg and Tanmay Parekh and Preethi Jyothi,http://arxiv.org/pdf/1711.01048v2
http://arxiv.org/abs/2109.00271v1,creativecommons.org/licenses/by-sa/4.0/,Discovering Representation Sprachbund For Multilingual Pre-Training,Yimin Fan and Yaobo Liang and Alexandre Muzio and Hany Hassan and Houqiang Li and Ming Zhou and Nan Duan,http://arxiv.org/pdf/2109.00271v1
http://arxiv.org/abs/2204.12632v2,creativecommons.org/licenses/by-sa/4.0/,Testing the Ability of Language Models to Interpret Figurative Language,Emmy Liu and Chen Cui and Kenneth Zheng and Graham Neubig,http://arxiv.org/pdf/2204.12632v2
http://arxiv.org/abs/2010.00622v2,creativecommons.org/licenses/by-sa/4.0/,Pix2Prof: fast extraction of sequential information from galaxy imagery via a deep natural language 'captioning' model,Michael J. Smith and Nikhil Arora and Connor Stone and Stéphane Courteau and James E. Geach,http://arxiv.org/pdf/2010.00622v2
http://arxiv.org/abs/2208.07870v1,creativecommons.org/licenses/by-sa/4.0/,Language-guided Semantic Style Transfer of 3D Indoor Scenes,Bu Jin and Beiwen Tian and Hao Zhao and Guyue Zhou,http://arxiv.org/pdf/2208.07870v1
http://arxiv.org/abs/2304.11350v1,creativecommons.org/licenses/by-sa/4.0/,Romanian Multiword Expression Detection Using Multilingual Adversarial Training and Lateral Inhibition,Andrei-Marius Avram and Verginica Barbu Mititelu and Dumitru-Clementin Cercel,http://arxiv.org/pdf/2304.11350v1
http://arxiv.org/abs/2301.12596v2,creativecommons.org/licenses/by-sa/4.0/,Learning to Speak from Text: Zero-Shot Multilingual Text-to-Speech with Unsupervised Text Pretraining,Takaaki Saeki and Soumi Maiti and Xinjian Li and Shinji Watanabe and Shinnosuke Takamichi and Hiroshi Saruwatari,http://arxiv.org/pdf/2301.12596v2
http://arxiv.org/abs/2304.05468v1,creativecommons.org/licenses/by-sa/4.0/,A Survey of Resources and Methods for Natural Language Processing of Serbian Language,Ulfeta A. Marovac and Aldina R. Avdić and Nikola Lj. Milošević,http://arxiv.org/pdf/2304.05468v1
http://arxiv.org/abs/2012.12612v2,creativecommons.org/licenses/by-sa/4.0/,Incremental Text-to-Speech Synthesis Using Pseudo Lookahead with Large Pretrained Language Model,Takaaki Saeki and Shinnosuke Takamichi and Hiroshi Saruwatari,http://arxiv.org/pdf/2012.12612v2
http://arxiv.org/abs/2208.12461v1,creativecommons.org/licenses/by-sa/4.0/,AutoQGS: Auto-Prompt for Low-Resource Knowledge-based Question Generation from SPARQL,Guanming Xiong and Junwei Bao and Wen Zhao and Youzheng Wu and Xiaodong He,http://arxiv.org/pdf/2208.12461v1
http://arxiv.org/abs/2210.03493v1,creativecommons.org/licenses/by-sa/4.0/,Automatic Chain of Thought Prompting in Large Language Models,Zhuosheng Zhang and Aston Zhang and Mu Li and Alex Smola,http://arxiv.org/pdf/2210.03493v1
http://arxiv.org/abs/2101.08370v1,creativecommons.org/licenses/by-sa/4.0/,Evaluating Multilingual Text Encoders for Unsupervised Cross-Lingual Retrieval,Robert Litschko and Ivan Vulić and Simone Paolo Ponzetto and Goran Glavaš,http://arxiv.org/pdf/2101.08370v1
http://arxiv.org/abs/2211.15533v1,creativecommons.org/licenses/by-sa/4.0/,The Stack: 3 TB of permissively licensed source code,Denis Kocetkov and Raymond Li and Loubna Ben Allal and Jia Li and Chenghao Mou and Carlos Muñoz Ferrandis and Yacine Jernite and Margaret Mitchell and Sean Hughes and Thomas Wolf and Dzmitry Bahdanau and Leandro von Werra and Harm de Vries,http://arxiv.org/pdf/2211.15533v1
http://arxiv.org/abs/2106.04571v1,creativecommons.org/licenses/by-sa/4.0/,TIMEDIAL: Temporal Commonsense Reasoning in Dialog,Lianhui Qin and Aditya Gupta and Shyam Upadhyay and Luheng He and Yejin Choi and Manaal Faruqui,http://arxiv.org/pdf/2106.04571v1
http://arxiv.org/abs/2209.15168v1,creativecommons.org/licenses/by-sa/4.0/,Depth-Wise Attention (DWAtt): A Layer Fusion Method for Data-Efficient Classification,Muhammad ElNokrashy and Badr AlKhamissi and Mona Diab,http://arxiv.org/pdf/2209.15168v1
http://arxiv.org/abs/2301.10172v2,creativecommons.org/licenses/by-sa/4.0/,MTTN: Multi-Pair Text to Text Narratives for Prompt Generation,Archan Ghosh and Debgandhar Ghosh and Madhurima Maji and Suchinta Chanda and Kalporup Goswami,http://arxiv.org/pdf/2301.10172v2
http://arxiv.org/abs/2304.04083v1,creativecommons.org/licenses/by-sa/4.0/,"VOICE: Visual Oracle for Interaction, Conversation, and Explanation",Donggang Jia and Alexandra Irger and Ondrej Strnad and Johanna Bjorklund and Anders Ynnerman and Ivan Viola,http://arxiv.org/pdf/2304.04083v1
http://arxiv.org/abs/1906.02979v1,creativecommons.org/licenses/by-sa/4.0/,A Wind of Change: Detecting and Evaluating Lexical Semantic Change across Times and Domains,Dominik Schlechtweg and Anna Hätty and Marco del Tredici and Sabine Schulte im Walde,http://arxiv.org/pdf/1906.02979v1
http://arxiv.org/abs/2107.10989v1,creativecommons.org/licenses/by-sa/4.0/,Estimating Predictive Uncertainty Under Program Data Distribution Shift,Yufei Li and Simin Chen and Wei Yang,http://arxiv.org/pdf/2107.10989v1
http://arxiv.org/abs/2104.09712v1,creativecommons.org/licenses/by-sa/4.0/,Problems and Countermeasures in Natural Language Processing Evaluation,Qingxiu Dong and Zhifang Sui and Weidong Zhan and Baobao Chang,http://arxiv.org/pdf/2104.09712v1
http://arxiv.org/abs/2210.01512v2,creativecommons.org/licenses/by-sa/4.0/,Code-Switching without Switching: Language Agnostic End-to-End Speech Translation,Christian Huber and Enes Yavuz Ugan and Alexander Waibel,http://arxiv.org/pdf/2210.01512v2
http://arxiv.org/abs/1811.01115v1,creativecommons.org/licenses/by-sa/4.0/,Neural Task Representations as Weak Supervision for Model Agnostic Cross-Lingual Transfer,Sujay Kumar Jauhar and Michael Gamon and Patrick Pantel,http://arxiv.org/pdf/1811.01115v1
http://arxiv.org/abs/1903.08905v1,creativecommons.org/licenses/by-sa/4.0/,RAP-Net: Recurrent Attention Pooling Networks for Dialogue Response Selection,Chao-Wei Huang and Ting-Rui Chiang and Shang-Yu Su and Yun-Nung Chen,http://arxiv.org/pdf/1903.08905v1
http://arxiv.org/abs/2202.08316v2,creativecommons.org/licenses/by-sa/4.0/,FAMIE: A Fast Active Learning Framework for Multilingual Information Extraction,Minh Van Nguyen and Nghia Trung Ngo and Bonan Min and Thien Huu Nguyen,http://arxiv.org/pdf/2202.08316v2
http://arxiv.org/abs/2202.03052v2,creativecommons.org/licenses/by-sa/4.0/,"OFA: Unifying Architectures, Tasks, and Modalities Through a Simple Sequence-to-Sequence Learning Framework",Peng Wang and An Yang and Rui Men and Junyang Lin and Shuai Bai and Zhikang Li and Jianxin Ma and Chang Zhou and Jingren Zhou and Hongxia Yang,http://arxiv.org/pdf/2202.03052v2
http://arxiv.org/abs/2210.01478v3,creativecommons.org/licenses/by-sa/4.0/,When to Make Exceptions: Exploring Language Models as Accounts of Human Moral Judgment,Zhijing Jin and Sydney Levine and Fernando Gonzalez and Ojasv Kamal and Maarten Sap and Mrinmaya Sachan and Rada Mihalcea and Josh Tenenbaum and Bernhard Schölkopf,http://arxiv.org/pdf/2210.01478v3 | |
http://arxiv.org/abs/2110.03888v3,creativecommons.org/licenses/by-sa/4.0/,M6-10T: A Sharing-Delinking Paradigm for Efficient Multi-Trillion Parameter Pretraining,Junyang Lin and An Yang and Jinze Bai and Chang Zhou and Le Jiang and Xianyan Jia and Ang Wang and Jie Zhang and Yong Li and Wei Lin and Jingren Zhou and Hongxia Yang,http://arxiv.org/pdf/2110.03888v3 | |
http://arxiv.org/abs/1710.01799v1,creativecommons.org/licenses/by-sa/4.0/,Counterfactual Language Model Adaptation for Suggesting Phrases,Kenneth C. Arnold and Kai-Wei Chang and Adam T. Kalai,http://arxiv.org/pdf/1710.01799v1 | |
http://arxiv.org/abs/2102.12971v1,creativecommons.org/licenses/by-sa/4.0/,Are pre-trained text representations useful for multilingual and multi-dimensional language proficiency modeling?,Taraka Rama and Sowmya Vajjala,http://arxiv.org/pdf/2102.12971v1 | |
http://arxiv.org/abs/2303.03953v2,creativecommons.org/licenses/by-sa/4.0/,ChatGPT: Beginning of an End of Manual Linguistic Data Annotation? Use Case of Automatic Genre Identification,Taja Kuzman and Igor Mozetič and Nikola Ljubešić,http://arxiv.org/pdf/2303.03953v2 | |
http://arxiv.org/abs/2209.01335v2,creativecommons.org/licenses/by-sa/4.0/,Neural Approaches to Multilingual Information Retrieval,Dawn Lawrie and Eugene Yang and Douglas W. Oard and James Mayfield,http://arxiv.org/pdf/2209.01335v2 | |
http://arxiv.org/abs/1911.03894v3,creativecommons.org/licenses/by-sa/4.0/,CamemBERT: a Tasty French Language Model,Louis Martin and Benjamin Muller and Pedro Javier Ortiz Suárez and Yoann Dupont and Laurent Romary and Éric Villemonte de la Clergerie and Djamé Seddah and Benoît Sagot,http://arxiv.org/pdf/1911.03894v3 | |
http://arxiv.org/abs/2203.14507v2,creativecommons.org/licenses/by-sa/4.0/,ANNA: Enhanced Language Representation for Question Answering,Changwook Jun and Hansol Jang and Myoseop Sim and Hyun Kim and Jooyoung Choi and Kyungkoo Min and Kyunghoon Bae,http://arxiv.org/pdf/2203.14507v2 | |
http://arxiv.org/abs/1406.1241v1,creativecommons.org/licenses/by/3.0/,The Best Templates Match Technique For Example Based Machine Translation,T. El-Shishtawy and A. El-Sammak,http://arxiv.org/pdf/1406.1241v1 | |
http://arxiv.org/abs/1311.3837v1,creativecommons.org/licenses/by/3.0/,SBML for optimizing decision support's tools,Dalila Hamami and Baghdad Atmani,http://arxiv.org/pdf/1311.3837v1 | |
http://arxiv.org/abs/1005.4752v1,creativecommons.org/licenses/by/3.0/,A database approach to information retrieval: The remarkable relationship between language models and region models,Djoerd Hiemstra and Vojkan Mihajlovic,http://arxiv.org/pdf/1005.4752v1 | |
http://arxiv.org/abs/2104.10441v1,creativecommons.org/licenses/by/4.0/,"Should we Stop Training More Monolingual Models, and Simply Use Machine Translation Instead?",Tim Isbister and Fredrik Carlsson and Magnus Sahlgren,http://arxiv.org/pdf/2104.10441v1 | |
http://arxiv.org/abs/1909.04879v1,creativecommons.org/licenses/by/4.0/,Dynamic Fusion: Attentional Language Model for Neural Machine Translation,Michiki Kurosawa and Mamoru Komachi,http://arxiv.org/pdf/1909.04879v1 | |
http://arxiv.org/abs/2303.15324v1,creativecommons.org/licenses/by/4.0/,Can Large Language Models design a Robot?,Francesco Stella and Cosimo Della Santina and Josie Hughes,http://arxiv.org/pdf/2303.15324v1 | |
http://arxiv.org/abs/2112.02969v1,creativecommons.org/licenses/by/4.0/,Jigsaw: Large Language Models meet Program Synthesis,Naman Jain and Skanda Vaidyanath and Arun Iyer and Nagarajan Natarajan and Suresh Parthasarathy and Sriram Rajamani and Rahul Sharma,http://arxiv.org/pdf/2112.02969v1 | |
http://arxiv.org/abs/2105.00572v1,creativecommons.org/licenses/by/4.0/,Larger-Scale Transformers for Multilingual Masked Language Modeling,Naman Goyal and Jingfei Du and Myle Ott and Giri Anantharaman and Alexis Conneau,http://arxiv.org/pdf/2105.00572v1 | |
http://arxiv.org/abs/2206.02252v1,creativecommons.org/licenses/by/4.0/,Exploring Cross-lingual Textual Style Transfer with Large Multilingual Language Models,Daniil Moskovskiy and Daryna Dementieva and Alexander Panchenko,http://arxiv.org/pdf/2206.02252v1 | |
http://arxiv.org/abs/2104.00772v1,creativecommons.org/licenses/by/4.0/,Low-Resource Language Modelling of South African Languages,Stuart Mesham and Luc Hayward and Jared Shapiro and Jan Buys,http://arxiv.org/pdf/2104.00772v1 | |
http://arxiv.org/abs/2302.12299v1,creativecommons.org/licenses/by/4.0/,In What Languages are Generative Language Models the Most Formal? Analyzing Formality Distribution across Languages,Asım Ersoy and Gerson Vizcarra and Tasmiah Tahsin Mayeesha and Benjamin Muller,http://arxiv.org/pdf/2302.12299v1 | |
http://arxiv.org/abs/2212.09535v1,creativecommons.org/licenses/by/4.0/,BLOOM+1: Adding Language Support to BLOOM for Zero-Shot Prompting,Zheng-Xin Yong and Hailey Schoelkopf and Niklas Muennighoff and Alham Fikri Aji and David Ifeoluwa Adelani and Khalid Almubarak and M Saiful Bari and Lintang Sutawika and Jungo Kasai and Ahmed Baruwa and Genta Indra Winata and Stella Biderman and Dragomir Radev and Vassilina Nikoulina,http://arxiv.org/pdf/2212.09535v1 | |
http://arxiv.org/abs/2210.14473v1,creativecommons.org/licenses/by/4.0/,Benchmarking Language Models for Code Syntax Understanding,Da Shen and Xinyun Chen and Chenguang Wang and Koushik Sen and Dawn Song,http://arxiv.org/pdf/2210.14473v1 | |
http://arxiv.org/abs/2110.13658v1,creativecommons.org/licenses/by/4.0/,Can Character-based Language Models Improve Downstream Task Performance in Low-Resource and Noisy Language Scenarios?,Arij Riabi and Benoît Sagot and Djamé Seddah,http://arxiv.org/pdf/2110.13658v1 | |
http://arxiv.org/abs/2302.03491v1,creativecommons.org/licenses/by/4.0/,Learning Translation Quality Evaluation on Low Resource Languages from Large Language Models,Amirkeivan Mohtashami and Mauro Verzetti and Paul K. Rubenstein,http://arxiv.org/pdf/2302.03491v1 | |
http://arxiv.org/abs/2110.00687v1,creativecommons.org/licenses/by/4.0/,Investigating Robustness of Dialog Models to Popular Figurative Language Constructs,Harsh Jhamtani and Varun Gangal and Eduard Hovy and Taylor Berg-Kirkpatrick,http://arxiv.org/pdf/2110.00687v1 | |
http://arxiv.org/abs/2112.00567v1,creativecommons.org/licenses/by/4.0/,DPRK-BERT: The Supreme Language Model,Arda Akdemir and Yeojoo Jeon,http://arxiv.org/pdf/2112.00567v1 | |
http://arxiv.org/abs/2106.14127v1,creativecommons.org/licenses/by/4.0/,Visual Conceptual Blending with Large-scale Language and Vision Models,Songwei Ge and Devi Parikh,http://arxiv.org/pdf/2106.14127v1 | |
http://arxiv.org/abs/2209.15236v3,creativecommons.org/licenses/by/4.0/,Language-Family Adapters for Low-Resource Multilingual Neural Machine Translation,Alexandra Chronopoulou and Dario Stojanovski and Alexander Fraser,http://arxiv.org/pdf/2209.15236v3 | |
http://arxiv.org/abs/2203.05300v1,creativecommons.org/licenses/by/4.0/,Connecting Neural Response measurements & Computational Models of language: a non-comprehensive guide,Mostafa Abdou,http://arxiv.org/pdf/2203.05300v1 | |
http://arxiv.org/abs/2301.12597v1,creativecommons.org/licenses/by/4.0/,BLIP-2: Bootstrapping Language-Image Pre-training with Frozen Image Encoders and Large Language Models,Junnan Li and Dongxu Li and Silvio Savarese and Steven Hoi,http://arxiv.org/pdf/2301.12597v1 | |
http://arxiv.org/abs/2007.15813v1,creativecommons.org/licenses/by/4.0/,Language Modelling for Source Code with Transformer-XL,Thomas Dowdell and Hongyu Zhang,http://arxiv.org/pdf/2007.15813v1 | |
http://arxiv.org/abs/2210.00185v1,creativecommons.org/licenses/by/4.0/,Zemi: Learning Zero-Shot Semi-Parametric Language Models from Multiple Tasks,Zhenhailong Wang and Xiaoman Pan and Dian Yu and Dong Yu and Jianshu Chen and Heng Ji,http://arxiv.org/pdf/2210.00185v1 | |
http://arxiv.org/abs/2104.06546v1,creativecommons.org/licenses/by/4.0/,Large-Scale Contextualised Language Modelling for Norwegian,Andrey Kutuzov and Jeremy Barnes and Erik Velldal and Lilja Øvrelid and Stephan Oepen,http://arxiv.org/pdf/2104.06546v1 | |
http://arxiv.org/abs/2303.07304v1,creativecommons.org/licenses/by/4.0/,Algorithmic Ghost in the Research Shell: Large Language Models and Academic Knowledge Creation in Management Research,Nigel Williams and Stanislav Ivanov and Dimitrios Buhalis,http://arxiv.org/pdf/2303.07304v1 | |
http://arxiv.org/abs/1711.01100v1,creativecommons.org/licenses/by/4.0/,One Model to Rule them all: Multitask and Multilingual Modelling for Lexical Analysis,Johannes Bjerva,http://arxiv.org/pdf/1711.01100v1 | |
http://arxiv.org/abs/2212.09146v1,creativecommons.org/licenses/by/4.0/,Can Retriever-Augmented Language Models Reason? The Blame Game Between the Retriever and the Language Model,Parishad BehnamGhader and Santiago Miret and Siva Reddy,http://arxiv.org/pdf/2212.09146v1 | |
http://arxiv.org/abs/2209.01515v2,creativecommons.org/licenses/by/4.0/,Do Large Language Models know what humans know?,Sean Trott and Cameron Jones and Tyler Chang and James Michaelov and Benjamin Bergen,http://arxiv.org/pdf/2209.01515v2 | |
http://arxiv.org/abs/2204.06487v3,creativecommons.org/licenses/by/4.0/,Adapting Pre-trained Language Models to African Languages via Multilingual Adaptive Fine-Tuning,Jesujoba O. Alabi and David Ifeoluwa Adelani and Marius Mosbach and Dietrich Klakow,http://arxiv.org/pdf/2204.06487v3 | |
http://arxiv.org/abs/2211.02069v2,creativecommons.org/licenses/by/4.0/,LMentry: A Language Model Benchmark of Elementary Language Tasks,Avia Efrat and Or Honovich and Omer Levy,http://arxiv.org/pdf/2211.02069v2 | |
http://arxiv.org/abs/2301.12566v1,creativecommons.org/licenses/by/4.0/,Improving Cross-lingual Information Retrieval on Low-Resource Languages via Optimal Transport Distillation,Zhiqi Huang and Puxuan Yu and James Allan,http://arxiv.org/pdf/2301.12566v1 | |
http://arxiv.org/abs/2210.07041v1,creativecommons.org/licenses/by/4.0/,Spontaneous Emerging Preference in Two-tower Language Model,Zhengqi He and Taro Toyoizumi,http://arxiv.org/pdf/2210.07041v1 | |
http://arxiv.org/abs/2212.02564v1,creativecommons.org/licenses/by/4.0/,INCLUSIFY: A benchmark and a model for gender-inclusive German,David Pomerenke,http://arxiv.org/pdf/2212.02564v1 | |
http://arxiv.org/abs/2205.09634v2,creativecommons.org/licenses/by/4.0/,Phylogeny-Inspired Adaptation of Multilingual Models to New Languages,Fahim Faisal and Antonios Anastasopoulos,http://arxiv.org/pdf/2205.09634v2 | |
http://arxiv.org/abs/2209.02842v1,creativecommons.org/licenses/by/4.0/,ASR2K: Speech Recognition for Around 2000 Languages without Audio,Xinjian Li and Florian Metze and David R Mortensen and Alan W Black and Shinji Watanabe,http://arxiv.org/pdf/2209.02842v1 | |
http://arxiv.org/abs/2210.12302v1,creativecommons.org/licenses/by/4.0/,What do Large Language Models Learn beyond Language?,Avinash Madasu and Shashank Srivastava,http://arxiv.org/pdf/2210.12302v1 | |
http://arxiv.org/abs/2302.00093v2,creativecommons.org/licenses/by/4.0/,Large Language Models Can Be Easily Distracted by Irrelevant Context,Freda Shi and Xinyun Chen and Kanishka Misra and Nathan Scales and David Dohan and Ed Chi and Nathanael Schärli and Denny Zhou,http://arxiv.org/pdf/2302.00093v2 | |
http://arxiv.org/abs/2112.06598v2,creativecommons.org/licenses/by/4.0/,WECHSEL: Effective initialization of subword embeddings for cross-lingual transfer of monolingual language models,Benjamin Minixhofer and Fabian Paischer and Navid Rekabsaz,http://arxiv.org/pdf/2112.06598v2 | |
http://arxiv.org/abs/2301.06627v1,creativecommons.org/licenses/by/4.0/,Dissociating language and thought in large language models: a cognitive perspective,Kyle Mahowald and Anna A. Ivanova and Idan A. Blank and Nancy Kanwisher and Joshua B. Tenenbaum and Evelina Fedorenko,http://arxiv.org/pdf/2301.06627v1 | |
http://arxiv.org/abs/2212.08390v1,creativecommons.org/licenses/by/4.0/,Lessons learned from the evaluation of Spanish Language Models,Rodrigo Agerri and Eneko Agirre,http://arxiv.org/pdf/2212.08390v1 | |
http://arxiv.org/abs/2301.04589v1,creativecommons.org/licenses/by/4.0/,Memory Augmented Large Language Models are Computationally Universal,Dale Schuurmans,http://arxiv.org/pdf/2301.04589v1 | |
http://arxiv.org/abs/2110.12010v3,creativecommons.org/licenses/by/4.0/,ClimateBert: A Pretrained Language Model for Climate-Related Text,Nicolas Webersinke and Mathias Kraus and Julia Anna Bingler and Markus Leippold,http://arxiv.org/pdf/2110.12010v3 | |
http://arxiv.org/abs/2210.05758v1,creativecommons.org/licenses/by/4.0/,Decoupled Context Processing for Context Augmented Language Modeling,Zonglin Li and Ruiqi Guo and Sanjiv Kumar,http://arxiv.org/pdf/2210.05758v1 | |
http://arxiv.org/abs/2208.12097v1,creativecommons.org/licenses/by/4.0/,Training a T5 Using Lab-sized Resources,Manuel R. Ciosici and Leon Derczynski,http://arxiv.org/pdf/2208.12097v1 | |
http://arxiv.org/abs/2010.14571v2,creativecommons.org/licenses/by/4.0/,Language ID in the Wild: Unexpected Challenges on the Path to a Thousand-Language Web Text Corpus,Isaac Caswell and Theresa Breiner and Daan van Esch and Ankur Bapna,http://arxiv.org/pdf/2010.14571v2 | |
http://arxiv.org/abs/2203.13344v1,creativecommons.org/licenses/by/4.0/,Linking Emergent and Natural Languages via Corpus Transfer,Shunyu Yao and Mo Yu and Yang Zhang and Karthik R Narasimhan and Joshua B. Tenenbaum and Chuang Gan,http://arxiv.org/pdf/2203.13344v1 | |
http://arxiv.org/abs/1810.07156v2,creativecommons.org/licenses/by/4.0/,Strategies for Language Identification in Code-Mixed Low Resource Languages,Soumil Mandal and Sankalp Sanand,http://arxiv.org/pdf/1810.07156v2 | |
http://arxiv.org/abs/2105.02855v2,creativecommons.org/licenses/by/4.0/,Adapting Monolingual Models: Data can be Scarce when Language Similarity is High,Wietse de Vries and Martijn Bartelds and Malvina Nissim and Martijn Wieling,http://arxiv.org/pdf/2105.02855v2 | |
http://arxiv.org/abs/2206.02885v2,creativecommons.org/licenses/by/4.0/,Norm Participation Grounds Language,David Schlangen,http://arxiv.org/pdf/2206.02885v2 | |
http://arxiv.org/abs/2304.03728v1,creativecommons.org/licenses/by/4.0/,Interpretable Unified Language Checking,Tianhua Zhang and Hongyin Luo and Yung-Sung Chuang and Wei Fang and Luc Gaitskell and Thomas Hartvigsen and Xixin Wu and Danny Fox and Helen Meng and James Glass,http://arxiv.org/pdf/2304.03728v1 | |
http://arxiv.org/abs/2204.06283v2,creativecommons.org/licenses/by/4.0/,Curriculum: A Broad-Coverage Benchmark for Linguistic Phenomena in Natural Language Understanding,Zeming Chen and Qiyue Gao,http://arxiv.org/pdf/2204.06283v2 | |
http://arxiv.org/abs/2212.06094v1,creativecommons.org/licenses/by/4.0/,Prompting Is Programming: A Query Language For Large Language Models,Luca Beurer-Kellner and Marc Fischer and Martin Vechev,http://arxiv.org/pdf/2212.06094v1 | |
http://arxiv.org/abs/2206.12638v1,creativecommons.org/licenses/by/4.0/,Distilling a Pretrained Language Model to a Multilingual ASR Model,Kwanghee Choi and Hyung-Min Park,http://arxiv.org/pdf/2206.12638v1 | |
http://arxiv.org/abs/2204.05717v1,creativecommons.org/licenses/by/4.0/,Do Not Fire the Linguist: Grammatical Profiles Help Language Models Detect Semantic Change,Mario Giulianelli and Andrey Kutuzov and Lidia Pivovarova,http://arxiv.org/pdf/2204.05717v1 | |
http://arxiv.org/abs/2202.09662v6,creativecommons.org/licenses/by/4.0/,Reward Modeling for Mitigating Toxicity in Transformer-based Language Models,Farshid Faal and Ketra Schmitt and Jia Yuan Yu,http://arxiv.org/pdf/2202.09662v6 | |
http://arxiv.org/abs/2304.02015v1,creativecommons.org/licenses/by/4.0/,How well do Large Language Models perform in Arithmetic tasks?,Zheng Yuan and Hongyi Yuan and Chuanqi Tan and Wei Wang and Songfang Huang,http://arxiv.org/pdf/2304.02015v1 | |
http://arxiv.org/abs/2304.09960v2,creativecommons.org/licenses/by/4.0/,A Latent Space Theory for Emergent Abilities in Large Language Models,Hui Jiang,http://arxiv.org/pdf/2304.09960v2 | |
http://arxiv.org/abs/2211.03263v2,creativecommons.org/licenses/by/4.0/,AfroLM: A Self-Active Learning-based Multilingual Pretrained Language Model for 23 African Languages,Bonaventure F. P. Dossou and Atnafu Lambebo Tonja and Oreen Yousuf and Salomey Osei and Abigail Oppong and Iyanuoluwa Shode and Oluwabusayo Olufunke Awoyomi and Chris Chinenye Emezue,http://arxiv.org/pdf/2211.03263v2 | |
http://arxiv.org/abs/2302.01973v2,creativecommons.org/licenses/by/4.0/,Measuring The Impact Of Programming Language Distribution,Gabriel Orlanski and Kefan Xiao and Xavier Garcia and Jeffrey Hui and Joshua Howland and Jonathan Malmaud and Jacob Austin and Rishah Singh and Michele Catasta,http://arxiv.org/pdf/2302.01973v2 | |
http://arxiv.org/abs/2109.01207v4,creativecommons.org/licenses/by/4.0/,Similarity of Sentence Representations in Multilingual LMs: Resolving Conflicting Literature and Case Study of Baltic Languages,Maksym Del and Mark Fishel,http://arxiv.org/pdf/2109.01207v4 | |
http://arxiv.org/abs/2210.14409v1,creativecommons.org/licenses/by/4.0/,Modeling the Graphotactics of Low-Resource Languages Using Sequential GANs,Isaac Wasserman,http://arxiv.org/pdf/2210.14409v1 | |
http://arxiv.org/abs/2301.10439v2,creativecommons.org/licenses/by/4.0/,ViDeBERTa: A powerful pre-trained language model for Vietnamese,Cong Dao Tran and Nhut Huy Pham and Anh Nguyen and Truong Son Hy and Tu Vu,http://arxiv.org/pdf/2301.10439v2 | |
http://arxiv.org/abs/2304.00869v1,creativecommons.org/licenses/by/4.0/,GreekBART: The First Pretrained Greek Sequence-to-Sequence Model,Iakovos Evdaimon and Hadi Abdine and Christos Xypolopoulos and Stamatis Outsios and Michalis Vazirgiannis and Giorgos Stamou,http://arxiv.org/pdf/2304.00869v1 | |
http://arxiv.org/abs/2201.09227v2,creativecommons.org/licenses/by/4.0/,A Large and Diverse Arabic Corpus for Language Modeling,Abbas Raza Ali and Muhammad Ajmal Siddiqui and Rema Algunaibet and Hasan Raza Ali,http://arxiv.org/pdf/2201.09227v2 | |
http://arxiv.org/abs/2211.05417v1,creativecommons.org/licenses/by/4.0/,Can Transformers Reason in Fragments of Natural Language?,Viktor Schlegel and Kamen V. Pavlov and Ian Pratt-Hartmann,http://arxiv.org/pdf/2211.05417v1 | |
http://arxiv.org/abs/1904.09122v1,creativecommons.org/licenses/by/4.0/,Zero-Shot Cross-Lingual Opinion Target Extraction,Soufian Jebbara and Philipp Cimiano,http://arxiv.org/pdf/1904.09122v1 | |
http://arxiv.org/abs/2303.08006v2,creativecommons.org/licenses/by/4.0/,Data-Efficient Learning of Natural Language to Linear Temporal Logic Translators for Robot Task Specification,Jiayi Pan and Glen Chou and Dmitry Berenson,http://arxiv.org/pdf/2303.08006v2 | |
http://arxiv.org/abs/2111.01243v1,creativecommons.org/licenses/by/4.0/,Recent Advances in Natural Language Processing via Large Pre-Trained Language Models: A Survey,Bonan Min and Hayley Ross and Elior Sulem and Amir Pouran Ben Veyseh and Thien Huu Nguyen and Oscar Sainz and Eneko Agirre and Ilana Heinz and Dan Roth,http://arxiv.org/pdf/2111.01243v1 | |
http://arxiv.org/abs/1802.08375v2,creativecommons.org/licenses/by/4.0/,Reusing Weights in Subword-aware Neural Language Models,Zhenisbek Assylbekov and Rustem Takhanov,http://arxiv.org/pdf/1802.08375v2 | |
http://arxiv.org/abs/2304.06962v1,creativecommons.org/licenses/by/4.0/,Prompt Engineering and Calibration for Zero-Shot Commonsense Reasoning,Chenkai Ma,http://arxiv.org/pdf/2304.06962v1 | |
http://arxiv.org/abs/2304.08865v1,creativecommons.org/licenses/by/4.0/,Romanization-based Large-scale Adaptation of Multilingual Language Models,Sukannya Purkayastha and Sebastian Ruder and Jonas Pfeiffer and Iryna Gurevych and Ivan Vulić,http://arxiv.org/pdf/2304.08865v1 | |
http://arxiv.org/abs/2105.12428v1,creativecommons.org/licenses/by/4.0/,"Neural Morphology Dataset and Models for Multiple Languages, from the Large to the Endangered",Mika Hämäläinen and Niko Partanen and Jack Rueter and Khalid Alnajjar,http://arxiv.org/pdf/2105.12428v1 | |
http://arxiv.org/abs/2111.08546v1,creativecommons.org/licenses/by/4.0/,Interpreting Language Models Through Knowledge Graph Extraction,Vinitra Swamy and Angelika Romanou and Martin Jaggi,http://arxiv.org/pdf/2111.08546v1 | |
http://arxiv.org/abs/2104.08826v2,creativecommons.org/licenses/by/4.0/,GPT3Mix: Leveraging Large-scale Language Models for Text Augmentation,Kang Min Yoo and Dongju Park and Jaewook Kang and Sang-Woo Lee and Woomyeong Park,http://arxiv.org/pdf/2104.08826v2 | |
http://arxiv.org/abs/2201.11903v6,creativecommons.org/licenses/by/4.0/,Chain-of-Thought Prompting Elicits Reasoning in Large Language Models,Jason Wei and Xuezhi Wang and Dale Schuurmans and Maarten Bosma and Brian Ichter and Fei Xia and Ed Chi and Quoc Le and Denny Zhou,http://arxiv.org/pdf/2201.11903v6 | |
http://arxiv.org/abs/2002.05417v1,creativecommons.org/licenses/by/4.0/,Comparison of Turkish Word Representations Trained on Different Morphological Forms,Gökhan Güler and A. Cüneyd Tantuğ,http://arxiv.org/pdf/2002.05417v1 | |
http://arxiv.org/abs/2006.00591v2,creativecommons.org/licenses/by/4.0/,Efficient Deployment of Conversational Natural Language Interfaces over Databases,Anthony Colas and Trung Bui and Franck Dernoncourt and Moumita Sinha and Doo Soon Kim,http://arxiv.org/pdf/2006.00591v2 | |
http://arxiv.org/abs/2303.07226v1,creativecommons.org/licenses/by/4.0/,Scaling Vision-Language Models with Sparse Mixture of Experts,Sheng Shen and Zhewei Yao and Chunyuan Li and Trevor Darrell and Kurt Keutzer and Yuxiong He,http://arxiv.org/pdf/2303.07226v1 | |
http://arxiv.org/abs/2204.10198v2,creativecommons.org/licenses/by/4.0/,Context-Aware Language Modeling for Goal-Oriented Dialogue Systems,Charlie Snell and Mengjiao Yang and Justin Fu and Yi Su and Sergey Levine,http://arxiv.org/pdf/2204.10198v2 | |
http://arxiv.org/abs/2012.07331v1,creativecommons.org/licenses/by/4.0/,Audio Captioning using Pre-Trained Large-Scale Language Model Guided by Audio-based Similar Caption Retrieval,Yuma Koizumi and Yasunori Ohishi and Daisuke Niizumi and Daiki Takeuchi and Masahiro Yasuda,http://arxiv.org/pdf/2012.07331v1 | |
http://arxiv.org/abs/2102.07350v1,creativecommons.org/licenses/by/4.0/,Prompt Programming for Large Language Models: Beyond the Few-Shot Paradigm,Laria Reynolds and Kyle McDonell,http://arxiv.org/pdf/2102.07350v1 | |
http://arxiv.org/abs/2207.06839v1,creativecommons.org/licenses/by/4.0/,Neural Data-to-Text Generation Based on Small Datasets: Comparing the Added Value of Two Semi-Supervised Learning Approaches on Top of a Large Language Model,Chris van der Lee and Thiago Castro Ferreira and Chris Emmery and Travis Wiltshire and Emiel Krahmer,http://arxiv.org/pdf/2207.06839v1 | |
http://arxiv.org/abs/2212.10461v1,creativecommons.org/licenses/by/4.0/,Go-tuning: Improving Zero-shot Learning Abilities of Smaller Language Models,Jingjing Xu and Qingxiu Dong and Hongyi Liu and Lei Li,http://arxiv.org/pdf/2212.10461v1 | |
http://arxiv.org/abs/2207.03777v1,creativecommons.org/licenses/by/4.0/,Hidden Schema Networks,Ramsés J. Sánchez and Lukas Conrads and Pascal Welke and Kostadin Cvejoski and César Ojeda,http://arxiv.org/pdf/2207.03777v1 | |
http://arxiv.org/abs/2105.14880v1,creativecommons.org/licenses/by/4.0/,A Multilingual Modeling Method for Span-Extraction Reading Comprehension,Gaochen Wu and Bin Xu and Dejie Chang and Bangchang Liu,http://arxiv.org/pdf/2105.14880v1 | |
http://arxiv.org/abs/2301.12868v3,creativecommons.org/licenses/by/4.0/,On Robustness of Prompt-based Semantic Parsing with Large Pre-trained Language Model: An Empirical Study on Codex,Terry Yue Zhuo and Zhuang Li and Yujin Huang and Fatemeh Shiri and Weiqing Wang and Gholamreza Haffari and Yuan-Fang Li,http://arxiv.org/pdf/2301.12868v3 | |
http://arxiv.org/abs/2204.08110v4,creativecommons.org/licenses/by/4.0/,Language Contamination Helps Explain the Cross-lingual Capabilities of English Pretrained Models,Terra Blevins and Luke Zettlemoyer,http://arxiv.org/pdf/2204.08110v4 | |
http://arxiv.org/abs/2207.09152v1,creativecommons.org/licenses/by/4.0/,Benchmarking Transformers-based models on French Spoken Language Understanding tasks,Oralie Cattan and Sahar Ghannay and Christophe Servan and Sophie Rosset,http://arxiv.org/pdf/2207.09152v1 | |
http://arxiv.org/abs/2112.08346v1,creativecommons.org/licenses/by/4.0/,Simple Text Detoxification by Identifying a Linear Toxic Subspace in Language Model Embeddings,Andrew Wang and Mohit Sudhakar and Yangfeng Ji,http://arxiv.org/pdf/2112.08346v1 | |
http://arxiv.org/abs/2205.05718v1,creativecommons.org/licenses/by/4.0/,"Structured, flexible, and robust: benchmarking and improving large language models towards more human-like behavior in out-of-distribution reasoning tasks",Katherine M. Collins and Catherine Wong and Jiahai Feng and Megan Wei and Joshua B. Tenenbaum,http://arxiv.org/pdf/2205.05718v1 | |
http://arxiv.org/abs/2104.14830v2,creativecommons.org/licenses/by/4.0/,Scaling End-to-End Models for Large-Scale Multilingual ASR,Bo Li and Ruoming Pang and Tara N. Sainath and Anmol Gulati and Yu Zhang and James Qin and Parisa Haghani and W. Ronny Huang and Min Ma and Junwen Bai,http://arxiv.org/pdf/2104.14830v2 | |
http://arxiv.org/abs/2208.13078v1,creativecommons.org/licenses/by/4.0/,MDIA: A Benchmark for Multilingual Dialogue Generation in 46 Languages,Qingyu Zhang and Xiaoyu Shen and Ernie Chang and Jidong Ge and Pengke Chen,http://arxiv.org/pdf/2208.13078v1 | |
http://arxiv.org/abs/2304.09957v1,creativecommons.org/licenses/by/4.0/,Low-resource Bilingual Dialect Lexicon Induction with Large Language Models,Ekaterina Artemova and Barbara Plank,http://arxiv.org/pdf/2304.09957v1 | |
http://arxiv.org/abs/2102.03596v1,creativecommons.org/licenses/by/4.0/,Does He Wink or Does He Nod? A Challenging Benchmark for Evaluating Word Understanding of Language Models,Lutfi Kerem Senel and Hinrich Schütze,http://arxiv.org/pdf/2102.03596v1 | |
http://arxiv.org/abs/2109.12346v3,creativecommons.org/licenses/by/4.0/,DziriBERT: a Pre-trained Language Model for the Algerian Dialect,Amine Abdaoui and Mohamed Berrimi and Mourad Oussalah and Abdelouahab Moussaoui,http://arxiv.org/pdf/2109.12346v3 | |
http://arxiv.org/abs/2110.14782v3,creativecommons.org/licenses/by/4.0/,When is BERT Multilingual? Isolating Crucial Ingredients for Cross-lingual Transfer,Ameet Deshpande and Partha Talukdar and Karthik Narasimhan,http://arxiv.org/pdf/2110.14782v3 | |
http://arxiv.org/abs/2104.09411v1,creativecommons.org/licenses/by/4.0/,Understanding Chinese Video and Language via Contrastive Multimodal Pre-Training,Chenyi Lei and Shixian Luo and Yong Liu and Wanggui He and Jiamang Wang and Guoxin Wang and Haihong Tang and Chunyan Miao and Houqiang Li,http://arxiv.org/pdf/2104.09411v1 | |
http://arxiv.org/abs/2201.00150v5,creativecommons.org/licenses/by/4.0/,Cross-Domain Deep Code Search with Few-Shot Meta Learning,Yitian Chai and Hongyu Zhang and Beijun Shen and Xiaodong Gu,http://arxiv.org/pdf/2201.00150v5 | |
http://arxiv.org/abs/2304.03738v2,creativecommons.org/licenses/by/4.0/,Should ChatGPT be Biased? Challenges and Risks of Bias in Large Language Models,Emilio Ferrara,http://arxiv.org/pdf/2304.03738v2 | |
http://arxiv.org/abs/2104.05882v1,creativecommons.org/licenses/by/4.0/,Discourse Probing of Pretrained Language Models,Fajri Koto and Jey Han Lau and Timothy Baldwin,http://arxiv.org/pdf/2104.05882v1 | |
http://arxiv.org/abs/2107.08146v2,creativecommons.org/licenses/by/4.0/,Picard understanding Darmok: A Dataset and Model for Metaphor-Rich Translation in a Constructed Language,Peter Jansen and Jordan Boyd-Graber,http://arxiv.org/pdf/2107.08146v2 | |
http://arxiv.org/abs/2201.07311v1,creativecommons.org/licenses/by/4.0/,Datasheet for the Pile,Stella Biderman and Kieran Bicheno and Leo Gao,http://arxiv.org/pdf/2201.07311v1
http://arxiv.org/abs/2210.13693v1,creativecommons.org/licenses/by/4.0/,XRICL: Cross-lingual Retrieval-Augmented In-Context Learning for Cross-lingual Text-to-SQL Semantic Parsing,Peng Shi and Rui Zhang and He Bai and Jimmy Lin,http://arxiv.org/pdf/2210.13693v1
http://arxiv.org/abs/2202.08772v1,creativecommons.org/licenses/by/4.0/,A Survey of Knowledge-Intensive NLP with Pre-Trained Language Models,Da Yin and Li Dong and Hao Cheng and Xiaodong Liu and Kai-Wei Chang and Furu Wei and Jianfeng Gao,http://arxiv.org/pdf/2202.08772v1
http://arxiv.org/abs/2004.14963v3,creativecommons.org/licenses/by/4.0/,Data and Representation for Turkish Natural Language Inference,Emrah Budur and Rıza Özçelik and Tunga Güngör and Christopher Potts,http://arxiv.org/pdf/2004.14963v3
http://arxiv.org/abs/2205.14326v1,creativecommons.org/licenses/by/4.0/,Adaptive Activation Network For Low Resource Multilingual Speech Recognition,Jian Luo and Jianzong Wang and Ning Cheng and Zhenpeng Zheng and Jing Xiao,http://arxiv.org/pdf/2205.14326v1
http://arxiv.org/abs/2206.11871v1,creativecommons.org/licenses/by/4.0/,Offline RL for Natural Language Generation with Implicit Language Q Learning,Charlie Snell and Ilya Kostrikov and Yi Su and Mengjiao Yang and Sergey Levine,http://arxiv.org/pdf/2206.11871v1
http://arxiv.org/abs/2208.03067v2,creativecommons.org/licenses/by/4.0/,Large vocabulary speech recognition for languages of Africa: multilingual modeling and self-supervised learning,Sandy Ritchie and You-Chi Cheng and Mingqing Chen and Rajiv Mathews and Daan van Esch and Bo Li and Khe Chai Sim,http://arxiv.org/pdf/2208.03067v2
http://arxiv.org/abs/2212.09196v2,creativecommons.org/licenses/by/4.0/,Emergent Analogical Reasoning in Large Language Models,Taylor Webb and Keith J. Holyoak and Hongjing Lu,http://arxiv.org/pdf/2212.09196v2
http://arxiv.org/abs/2203.13411v1,creativecommons.org/licenses/by/4.0/,Reshaping Robot Trajectories Using Natural Language Commands: A Study of Multi-Modal Data Alignment Using Transformers,Arthur Bucker and Luis Figueredo and Sami Haddadin and Ashish Kapoor and Shuang Ma and Rogerio Bonatti,http://arxiv.org/pdf/2203.13411v1
http://arxiv.org/abs/2012.04307v2,creativecommons.org/licenses/by/4.0/,Cross-lingual Transfer of Abstractive Summarizer to Less-resource Language,Aleš Žagar and Marko Robnik-Šikonja,http://arxiv.org/pdf/2012.04307v2
http://arxiv.org/abs/2205.00551v3,creativecommons.org/licenses/by/4.0/,Gender Bias in Masked Language Models for Multiple Languages,Masahiro Kaneko and Aizhan Imankulova and Danushka Bollegala and Naoaki Okazaki,http://arxiv.org/pdf/2205.00551v3
http://arxiv.org/abs/2104.07358v2,creativecommons.org/licenses/by/4.0/,Adaptive Sparse Transformer for Multilingual Translation,Hongyu Gong and Xian Li and Dmitriy Genzel,http://arxiv.org/pdf/2104.07358v2
http://arxiv.org/abs/2108.07790v3,creativecommons.org/licenses/by/4.0/,Mitigating harm in language models with conditional-likelihood filtration,Helen Ngo and Cooper Raterink and João G. M. Araújo and Ivan Zhang and Carol Chen and Adrien Morisot and Nicholas Frosst,http://arxiv.org/pdf/2108.07790v3
http://arxiv.org/abs/2110.01485v2,creativecommons.org/licenses/by/4.0/,JuriBERT: A Masked-Language Model Adaptation for French Legal Text,Stella Douka and Hadi Abdine and Michalis Vazirgiannis and Rajaa El Hamdani and David Restrepo Amariles,http://arxiv.org/pdf/2110.01485v2
http://arxiv.org/abs/2110.13032v2,creativecommons.org/licenses/by/4.0/,Paradigm Shift in Language Modeling: Revisiting CNN for Modeling Sanskrit Originated Bengali and Hindi Language,Chowdhury Rafeed Rahman and MD. Hasibur Rahman and Mohammad Rafsan and Samiha Zakir and Mohammed Eunus Ali and Rafsanjani Muhammod,http://arxiv.org/pdf/2110.13032v2
http://arxiv.org/abs/2207.09099v1,creativecommons.org/licenses/by/4.0/,Analyzing Bagging Methods for Language Models,Pranab Islam and Shaan Khosla and Arthur Lok and Mudit Saxena,http://arxiv.org/pdf/2207.09099v1
http://arxiv.org/abs/2209.02982v2,creativecommons.org/licenses/by/4.0/,Improving the Cross-Lingual Generalisation in Visual Question Answering,Farhad Nooralahzadeh and Rico Sennrich,http://arxiv.org/pdf/2209.02982v2
http://arxiv.org/abs/2212.12937v1,creativecommons.org/licenses/by/4.0/,GAE-ISumm: Unsupervised Graph-Based Summarization of Indian Languages,Lakshmi Sireesha Vakada and Anudeep Ch and Mounika Marreddy and Subba Reddy Oota and Radhika Mamidi,http://arxiv.org/pdf/2212.12937v1
http://arxiv.org/abs/2207.08982v1,creativecommons.org/licenses/by/4.0/,Selection Bias Induced Spurious Correlations in Large Language Models,Emily McMilin,http://arxiv.org/pdf/2207.08982v1
http://arxiv.org/abs/2111.04909v3,creativecommons.org/licenses/by/4.0/,FPM: A Collection of Large-scale Foundation Pre-trained Language Models,Dezhou Shen,http://arxiv.org/pdf/2111.04909v3
http://arxiv.org/abs/2212.10471v1,creativecommons.org/licenses/by/4.0/,Little Red Riding Hood Goes Around the Globe: Crosslingual Story Planning and Generation with Large Language Models,Evgeniia Razumovskaia and Joshua Maynez and Annie Louis and Mirella Lapata and Shashi Narayan,http://arxiv.org/pdf/2212.10471v1
http://arxiv.org/abs/2302.12834v1,creativecommons.org/licenses/by/4.0/,Leveraging Large Language Model and Story-Based Gamification in Intelligent Tutoring System to Scaffold Introductory Programming Courses: A Design-Based Research Study,Chen Cao,http://arxiv.org/pdf/2302.12834v1
http://arxiv.org/abs/2106.13627v1,creativecommons.org/licenses/by/4.0/,Language Models are Good Translators,Shuo Wang and Zhaopeng Tu and Zhixing Tan and Wenxuan Wang and Maosong Sun and Yang Liu,http://arxiv.org/pdf/2106.13627v1
http://arxiv.org/abs/2204.10365v1,creativecommons.org/licenses/by/4.0/,Towards an Enhanced Understanding of Bias in Pre-trained Neural Language Models: A Survey with Special Emphasis on Affective Bias,Anoop K. and Manjary P. Gangan and Deepak P. and Lajish V. L,http://arxiv.org/pdf/2204.10365v1
http://arxiv.org/abs/2205.10782v1,creativecommons.org/licenses/by/4.0/,Instruction Induction: From Few Examples to Natural Language Task Descriptions,Or Honovich and Uri Shaham and Samuel R. Bowman and Omer Levy,http://arxiv.org/pdf/2205.10782v1
http://arxiv.org/abs/2206.04439v1,creativecommons.org/licenses/by/4.0/,Dict-NMT: Bilingual Dictionary based NMT for Extremely Low Resource Languages,Nalin Kumar and Deepak Kumar and Subhankar Mishra,http://arxiv.org/pdf/2206.04439v1
http://arxiv.org/abs/2205.10583v4,creativecommons.org/licenses/by/4.0/,Automated Repair of Programs from Large Language Models,Zhiyu Fan and Xiang Gao and Martin Mirchev and Abhik Roychoudhury and Shin Hwei Tan,http://arxiv.org/pdf/2205.10583v4
http://arxiv.org/abs/2304.06123v1,creativecommons.org/licenses/by/4.0/,The Impact of Large Language Multi-Modal Models on the Future of Job Market,Tarry Singh,http://arxiv.org/pdf/2304.06123v1
http://arxiv.org/abs/2210.03941v1,creativecommons.org/licenses/by/4.0/,Learning Fine-Grained Visual Understanding for Video Question Answering via Decoupling Spatial-Temporal Modeling,Hsin-Ying Lee and Hung-Ting Su and Bing-Chen Tsai and Tsung-Han Wu and Jia-Fong Yeh and Winston H. Hsu,http://arxiv.org/pdf/2210.03941v1
http://arxiv.org/abs/2211.05958v2,creativecommons.org/licenses/by/4.0/,MINION: a Large-Scale and Diverse Dataset for Multilingual Event Detection,Amir Pouran Ben Veyseh and Minh Van Nguyen and Franck Dernoncourt and Thien Huu Nguyen,http://arxiv.org/pdf/2211.05958v2
http://arxiv.org/abs/1909.04302v1,creativecommons.org/licenses/by/4.0/,Multimodal Embeddings from Language Models,Shao-Yen Tseng and Panayiotis Georgiou and Shrikanth Narayanan,http://arxiv.org/pdf/1909.04302v1
http://arxiv.org/abs/2202.12576v1,creativecommons.org/licenses/by/4.0/,A Survey of Multilingual Models for Automatic Speech Recognition,Hemant Yadav and Sunayana Sitaram,http://arxiv.org/pdf/2202.12576v1
http://arxiv.org/abs/2210.17236v1,creativecommons.org/licenses/by/4.0/,When Language Model Meets Private Library,Daoguang Zan and Bei Chen and Zeqi Lin and Bei Guan and Yongji Wang and Jian-Guang Lou,http://arxiv.org/pdf/2210.17236v1
http://arxiv.org/abs/2211.02098v1,creativecommons.org/licenses/by/4.0/,Overcoming Barriers to Skill Injection in Language Modeling: Case Study in Arithmetic,Mandar Sharma and Nikhil Muralidhar and Naren Ramakrishnan,http://arxiv.org/pdf/2211.02098v1
http://arxiv.org/abs/1902.08830v1,creativecommons.org/licenses/by/4.0/,Categorization in the Wild: Generalizing Cognitive Models to Naturalistic Data across Languages,Lea Frermann and Mirella Lapata,http://arxiv.org/pdf/1902.08830v1
http://arxiv.org/abs/2012.15643v2,creativecommons.org/licenses/by/4.0/,CoCoLM: COmplex COmmonsense Enhanced Language Model with Discourse Relations,Changlong Yu and Hongming Zhang and Yangqiu Song and Wilfred Ng,http://arxiv.org/pdf/2012.15643v2
http://arxiv.org/abs/2212.10551v1,creativecommons.org/licenses/by/4.0/,Lego-MT: Towards Detachable Models in Massively Multilingual Machine Translation,Fei Yuan and Yinquan Lu and WenHao Zhu and Lingpeng Kong and Lei Li and Jingjing Xu,http://arxiv.org/pdf/2212.10551v1
http://arxiv.org/abs/2301.12004v1,creativecommons.org/licenses/by/4.0/,Understanding the Effectiveness of Very Large Language Models on Dialog Evaluation,Jessica Huynh and Cathy Jiao and Prakhar Gupta and Shikib Mehri and Payal Bajaj and Vishrav Chaudhary and Maxine Eskenazi,http://arxiv.org/pdf/2301.12004v1
http://arxiv.org/abs/2111.07119v1,creativecommons.org/licenses/by/4.0/,Extracting and filtering paraphrases by bridging natural language inference and paraphrasing,Matej Klemen and Marko Robnik-Šikonja,http://arxiv.org/pdf/2111.07119v1
http://arxiv.org/abs/2210.11757v1,creativecommons.org/licenses/by/4.0/,University of Cape Town's WMT22 System: Multilingual Machine Translation for Southern African Languages,Khalid N. Elmadani and Francois Meyer and Jan Buys,http://arxiv.org/pdf/2210.11757v1
http://arxiv.org/abs/2003.01355v2,creativecommons.org/licenses/by/4.0/,CLUECorpus2020: A Large-scale Chinese Corpus for Pre-training Language Model,Liang Xu and Xuanwei Zhang and Qianqian Dong,http://arxiv.org/pdf/2003.01355v2
http://arxiv.org/abs/2110.08294v2,creativecommons.org/licenses/by/4.0/,Coherence boosting: When your pretrained language model is not paying enough attention,Nikolay Malkin and Zhen Wang and Nebojsa Jojic,http://arxiv.org/pdf/2110.08294v2
http://arxiv.org/abs/2112.13960v1,creativecommons.org/licenses/by/4.0/,A Preordered RNN Layer Boosts Neural Machine Translation in Low Resource Settings,Mohaddeseh Bastan and Shahram Khadivi,http://arxiv.org/pdf/2112.13960v1
http://arxiv.org/abs/2302.06555v1,creativecommons.org/licenses/by/4.0/,Implications of the Convergence of Language and Vision Model Geometries,Jiaang Li and Yova Kementchedjhieva and Anders Søgaard,http://arxiv.org/pdf/2302.06555v1
http://arxiv.org/abs/2110.03501v3,creativecommons.org/licenses/by/4.0/,Pretrained Language Models are Symbolic Mathematics Solvers too!,Kimia Noorbakhsh and Modar Sulaiman and Mahdi Sharifi and Kallol Roy and Pooyan Jamshidi,http://arxiv.org/pdf/2110.03501v3
http://arxiv.org/abs/2104.11390v1,creativecommons.org/licenses/by/4.0/,Transfer training from smaller language model,Han Zhang,http://arxiv.org/pdf/2104.11390v1
http://arxiv.org/abs/2301.04013v1,creativecommons.org/licenses/by/4.0/,There is No Big Brother or Small Brother: Knowledge Infusion in Language Models for Link Prediction and Question Answering,Ankush Agarwal and Sakharam Gawade and Sachin Channabasavarajendra and Pushpak Bhattacharyya,http://arxiv.org/pdf/2301.04013v1
http://arxiv.org/abs/2211.13899v1,creativecommons.org/licenses/by/4.0/,Comparison Study Between Token Classification and Sequence Classification In Text Classification,Amir Jafari,http://arxiv.org/pdf/2211.13899v1
http://arxiv.org/abs/2304.11389v1,creativecommons.org/licenses/by/4.0/,Transformer-Based LM Surprisal Predicts Human Reading Times Best with About Two Billion Training Tokens,Byung-Doh Oh and William Schuler,http://arxiv.org/pdf/2304.11389v1
http://arxiv.org/abs/2201.06317v1,creativecommons.org/licenses/by/4.0/,Language Model-Based Paired Variational Autoencoders for Robotic Language Learning,Ozan Özdemir and Matthias Kerzel and Cornelius Weber and Jae Hee Lee and Stefan Wermter,http://arxiv.org/pdf/2201.06317v1
http://arxiv.org/abs/2304.07880v2,creativecommons.org/licenses/by/4.0/,Sabiá: Portuguese Large Language Models,Ramon Pires and Hugo Abonizio and Thales Sales Almeida and Rodrigo Nogueira,http://arxiv.org/pdf/2304.07880v2
http://arxiv.org/abs/2302.12313v2,creativecommons.org/licenses/by/4.0/,Testing AI performance on less frequent aspects of language reveals insensitivity to underlying meaning,Vittoria Dentella and Elliot Murphy and Gary Marcus and Evelina Leivada,http://arxiv.org/pdf/2302.12313v2
http://arxiv.org/abs/2211.08264v1,creativecommons.org/licenses/by/4.0/,QAmeleon: Multilingual QA with Only 5 Examples,Priyanka Agrawal and Chris Alberti and Fantine Huot and Joshua Maynez and Ji Ma and Sebastian Ruder and Kuzman Ganchev and Dipanjan Das and Mirella Lapata,http://arxiv.org/pdf/2211.08264v1
http://arxiv.org/abs/2304.08485v1,creativecommons.org/licenses/by/4.0/,Visual Instruction Tuning,Haotian Liu and Chunyuan Li and Qingyang Wu and Yong Jae Lee,http://arxiv.org/pdf/2304.08485v1
http://arxiv.org/abs/2303.06689v1,creativecommons.org/licenses/by/4.0/,Self-planning Code Generation with Large Language Model,Xue Jiang and Yihong Dong and Lecheng Wang and Qiwei Shang and Ge Li,http://arxiv.org/pdf/2303.06689v1
http://arxiv.org/abs/2010.08319v1,creativecommons.org/licenses/by/4.0/,Detecting ESG topics using domain-specific language models and data augmentation approaches,Tim Nugent and Nicole Stelea and Jochen L. Leidner,http://arxiv.org/pdf/2010.08319v1
http://arxiv.org/abs/2203.16595v3,creativecommons.org/licenses/by/4.0/,Improving Speech Recognition for Indic Languages using Language Model,Ankur Dhuriya and Harveen Singh Chadha and Anirudh Gupta and Priyanshi Shah and Neeraj Chhimwal and Rishabh Gaur and Vivek Raghavan,http://arxiv.org/pdf/2203.16595v3
http://arxiv.org/abs/2205.12910v2,creativecommons.org/licenses/by/4.0/,NaturalProver: Grounded Mathematical Proof Generation with Language Models,Sean Welleck and Jiacheng Liu and Ximing Lu and Hannaneh Hajishirzi and Yejin Choi,http://arxiv.org/pdf/2205.12910v2
http://arxiv.org/abs/2209.12099v1,creativecommons.org/licenses/by/4.0/,Controllable Text Generation for Open-Domain Creativity and Fairness,Nanyun Peng,http://arxiv.org/pdf/2209.12099v1
http://arxiv.org/abs/2301.04347v3,creativecommons.org/licenses/by/4.0/,Counteracts: Testing Stereotypical Representation in Pre-trained Language Models,Damin Zhang and Julia Rayz and Romila Pradhan,http://arxiv.org/pdf/2301.04347v3
http://arxiv.org/abs/2109.04921v1,creativecommons.org/licenses/by/4.0/,Examining Cross-lingual Contextual Embeddings with Orthogonal Structural Probes,Tomasz Limisiewicz and David Mareček,http://arxiv.org/pdf/2109.04921v1
http://arxiv.org/abs/2207.10648v1,creativecommons.org/licenses/by/4.0/,A No-Code Low-Code Paradigm for Authoring Business Automations Using Natural Language,Michael Desmond and Evelyn Duesterwald and Vatche Isahagian and Vinod Muthusamy,http://arxiv.org/pdf/2207.10648v1
http://arxiv.org/abs/2111.09728v1,creativecommons.org/licenses/by/4.0/,Measuring source code conciseness across programming languages using compression,Lodewijk Bergmans and Xander Schrijen and Edwin Ouwehand and Magiel Bruntink,http://arxiv.org/pdf/2111.09728v1
http://arxiv.org/abs/2112.09600v1,creativecommons.org/licenses/by/4.0/,Transcribing Natural Languages for The Deaf via Neural Editing Programs,Dongxu Li and Chenchen Xu and Liu Liu and Yiran Zhong and Rong Wang and Lars Petersson and Hongdong Li,http://arxiv.org/pdf/2112.09600v1
http://arxiv.org/abs/2203.16972v3,creativecommons.org/licenses/by/4.0/,Improving Language Identification of Accented Speech,Kunnar Kukk and Tanel Alumäe,http://arxiv.org/pdf/2203.16972v3
http://arxiv.org/abs/2209.06794v2,creativecommons.org/licenses/by/4.0/,PaLI: A Jointly-Scaled Multilingual Language-Image Model,Xi Chen and Xiao Wang and Soravit Changpinyo and AJ Piergiovanni and Piotr Padlewski and Daniel Salz and Sebastian Goodman and Adam Grycner and Basil Mustafa and Lucas Beyer and Alexander Kolesnikov and Joan Puigcerver and Nan Ding and Keran Rong and Hassan Akbari and Gaurav Mishra and Linting Xue and Ashish Thapliyal and James Bradbury and Weicheng Kuo and Mojtaba Seyedhosseini and Chao Jia and Burcu Karagol Ayan and Carlos Riquelme and Andreas Steiner and Anelia Angelova and Xiaohua Zhai and Neil Houlsby and Radu Soricut,http://arxiv.org/pdf/2209.06794v2
http://arxiv.org/abs/2203.05081v1,creativecommons.org/licenses/by/4.0/,NLX-GPT: A Model for Natural Language Explanations in Vision and Vision-Language Tasks,Fawaz Sammani and Tanmoy Mukherjee and Nikos Deligiannis,http://arxiv.org/pdf/2203.05081v1
http://arxiv.org/abs/2303.12528v2,creativecommons.org/licenses/by/4.0/,MEGA: Multilingual Evaluation of Generative AI,Kabir Ahuja and Rishav Hada and Millicent Ochieng and Prachi Jain and Harshita Diddee and Samuel Maina and Tanuja Ganu and Sameer Segal and Maxamed Axmed and Kalika Bali and Sunayana Sitaram,http://arxiv.org/pdf/2303.12528v2
http://arxiv.org/abs/2302.05508v1,creativecommons.org/licenses/by/4.0/,FairPy: A Toolkit for Evaluation of Social Biases and their Mitigation in Large Language Models,Hrishikesh Viswanath and Tianyi Zhang,http://arxiv.org/pdf/2302.05508v1
http://arxiv.org/abs/2303.16104v1,creativecommons.org/licenses/by/4.0/,Hallucinations in Large Multilingual Translation Models,Nuno M. Guerreiro and Duarte Alves and Jonas Waldendorf and Barry Haddow and Alexandra Birch and Pierre Colombo and André F. T. Martins,http://arxiv.org/pdf/2303.16104v1
http://arxiv.org/abs/2204.06644v2,creativecommons.org/licenses/by/4.0/,METRO: Efficient Denoising Pretraining of Large Scale Autoencoding Language Models with Model Generated Signals,Payal Bajaj and Chenyan Xiong and Guolin Ke and Xiaodong Liu and Di He and Saurabh Tiwary and Tie-Yan Liu and Paul Bennett and Xia Song and Jianfeng Gao,http://arxiv.org/pdf/2204.06644v2
http://arxiv.org/abs/2212.10422v2,creativecommons.org/licenses/by/4.0/,Localising In-Domain Adaptation of Transformer-Based Biomedical Language Models,Tommaso Mario Buonocore and Claudio Crema and Alberto Redolfi and Riccardo Bellazzi and Enea Parimbelli,http://arxiv.org/pdf/2212.10422v2
http://arxiv.org/abs/1908.09892v1,creativecommons.org/licenses/by/4.0/,Does BERT agree? Evaluating knowledge of structure dependence through agreement relations,Geoff Bacon and Terry Regier,http://arxiv.org/pdf/1908.09892v1
http://arxiv.org/abs/1906.05149v1,creativecommons.org/licenses/by/4.0/,Putting words in context: LSTM language models and lexical ambiguity,Laura Aina and Kristina Gulordava and Gemma Boleda,http://arxiv.org/pdf/1906.05149v1
http://arxiv.org/abs/2212.10536v1,creativecommons.org/licenses/by/4.0/,"Measure More, Question More: Experimental Studies on Transformer-based Language Models and Complement Coercion",Yuling Gu,http://arxiv.org/pdf/2212.10536v1
http://arxiv.org/abs/2202.09452v1,creativecommons.org/licenses/by/4.0/,From FreEM to D'AlemBERT: a Large Corpus and a Language Model for Early Modern French,Simon Gabay and Pedro Ortiz Suarez and Alexandre Bartz and Alix Chagué and Rachel Bawden and Philippe Gambette and Benoît Sagot,http://arxiv.org/pdf/2202.09452v1
http://arxiv.org/abs/2203.02092v1,creativecommons.org/licenses/by/4.0/,Deep Lexical Hypothesis: Identifying personality structure in natural language,Andrew Cutler and David M. Condon,http://arxiv.org/pdf/2203.02092v1
http://arxiv.org/abs/2210.05287v2,creativecommons.org/licenses/by/4.0/,Revisiting and Advancing Chinese Natural Language Understanding with Accelerated Heterogeneous Knowledge Pre-training,Taolin Zhang and Junwei Dong and Jianing Wang and Chengyu Wang and Ang Wang and Yinghui Liu and Jun Huang and Yong Li and Xiaofeng He,http://arxiv.org/pdf/2210.05287v2
http://arxiv.org/abs/2212.10815v1,creativecommons.org/licenses/by/4.0/,ZEROTOP: Zero-Shot Task-Oriented Semantic Parsing using Large Language Models,Dheeraj Mekala and Jason Wolfe and Subhro Roy,http://arxiv.org/pdf/2212.10815v1
http://arxiv.org/abs/2201.09377v1,creativecommons.org/licenses/by/4.0/,An Application of Pseudo-Log-Likelihoods to Natural Language Scoring,Darren Abramson and Ali Emami,http://arxiv.org/pdf/2201.09377v1
http://arxiv.org/abs/2304.11158v1,creativecommons.org/licenses/by/4.0/,Emergent and Predictable Memorization in Large Language Models,Stella Biderman and USVSN Sai Prashanth and Lintang Sutawika and Hailey Schoelkopf and Quentin Anthony and Shivanshu Purohit and Edward Raff,http://arxiv.org/pdf/2304.11158v1
http://arxiv.org/abs/2104.04052v1,creativecommons.org/licenses/by/4.0/,AlephBERT:A Hebrew Large Pre-Trained Language Model to Start-off your Hebrew NLP Application With,Amit Seker and Elron Bandel and Dan Bareket and Idan Brusilovsky and Refael Shaked Greenfeld and Reut Tsarfaty,http://arxiv.org/pdf/2104.04052v1
http://arxiv.org/abs/2205.08184v1,creativecommons.org/licenses/by/4.0/,SKILL: Structured Knowledge Infusion for Large Language Models,Fedor Moiseev and Zhe Dong and Enrique Alfonseca and Martin Jaggi,http://arxiv.org/pdf/2205.08184v1
http://arxiv.org/abs/2012.10309v1,creativecommons.org/licenses/by/4.0/,Learning Contextual Representations for Semantic Parsing with Generation-Augmented Pre-Training,Peng Shi and Patrick Ng and Zhiguo Wang and Henghui Zhu and Alexander Hanbo Li and Jun Wang and Cicero Nogueira dos Santos and Bing Xiang,http://arxiv.org/pdf/2012.10309v1
http://arxiv.org/abs/2107.07253v5,creativecommons.org/licenses/by/4.0/,MarIA: Spanish Language Models,Asier Gutiérrez-Fandiño and Jordi Armengol-Estapé and Marc Pàmies and Joan Llop-Palao and Joaquín Silveira-Ocampo and Casimiro Pio Carrino and Aitor Gonzalez-Agirre and Carme Armentano-Oller and Carlos Rodriguez-Penagos and Marta Villegas,http://arxiv.org/pdf/2107.07253v5
http://arxiv.org/abs/2109.11321v2,creativecommons.org/licenses/by/4.0/,Transferring Knowledge from Vision to Language: How to Achieve it and how to Measure it?,Tobias Norlund and Lovisa Hagström and Richard Johansson,http://arxiv.org/pdf/2109.11321v2
http://arxiv.org/abs/2304.03159v1,creativecommons.org/licenses/by/4.0/,Bridging the Language Gap: Knowledge Injected Multilingual Question Answering,Zhichao Duan and Xiuxing Li and Zhengyan Zhang and Zhenyu Li and Ning Liu and Jianyong Wang,http://arxiv.org/pdf/2304.03159v1
http://arxiv.org/abs/1910.10893v1,creativecommons.org/licenses/by/4.0/,Low-Resource Sequence Labeling via Unsupervised Multilingual Contextualized Representations,Zuyi Bao and Rui Huang and Chen Li and Kenny Q. Zhu,http://arxiv.org/pdf/1910.10893v1
http://arxiv.org/abs/2202.08882v1,creativecommons.org/licenses/by/4.0/,Improving English to Sinhala Neural Machine Translation using Part-of-Speech Tag,Ravinga Perera and Thilakshi Fonseka and Rashmini Naranpanawa and Uthayasanker Thayasivam,http://arxiv.org/pdf/2202.08882v1
http://arxiv.org/abs/2304.10464v2,creativecommons.org/licenses/by/4.0/,Learning to Program with Natural Language,Yiduo Guo and Yaobo Liang and Chenfei Wu and Wenshan Wu and Dongyan Zhao and Nan Duan,http://arxiv.org/pdf/2304.10464v2
http://arxiv.org/abs/1811.00258v1,creativecommons.org/licenses/by/4.0/,Language-Independent Representor for Neural Machine Translation,Long Zhou and Yuchen Liu and Jiajun Zhang and Chengqing Zong and Guoping Huang,http://arxiv.org/pdf/1811.00258v1
http://arxiv.org/abs/2303.01793v1,creativecommons.org/licenses/by/4.0/,Exploiting Language Relatedness in Machine Translation Through Domain Adaptation Techniques,Amit Kumar and Rupjyoti Baruah and Ajay Pratap and Mayank Swarnkar and Anil Kumar Singh,http://arxiv.org/pdf/2303.01793v1
http://arxiv.org/abs/2103.06434v1,creativecommons.org/licenses/by/4.0/,Topical Language Generation using Transformers,Rohola Zandie and Mohammad H. Mahoor,http://arxiv.org/pdf/2103.06434v1
http://arxiv.org/abs/2104.05433v1,creativecommons.org/licenses/by/4.0/,Multilingual Language Models Predict Human Reading Behavior,Nora Hollenstein and Federico Pirovano and Ce Zhang and Lena Jäger and Lisa Beinborn,http://arxiv.org/pdf/2104.05433v1
http://arxiv.org/abs/2109.14989v2,creativecommons.org/licenses/by/4.0/,Structural Persistence in Language Models: Priming as a Window into Abstract Language Representations,Arabella Sinclair and Jaap Jumelet and Willem Zuidema and Raquel Fernández,http://arxiv.org/pdf/2109.14989v2
http://arxiv.org/abs/2206.14858v2,creativecommons.org/licenses/by/4.0/,Solving Quantitative Reasoning Problems with Language Models,Aitor Lewkowycz and Anders Andreassen and David Dohan and Ethan Dyer and Henryk Michalewski and Vinay Ramasesh and Ambrose Slone and Cem Anil and Imanol Schlag and Theo Gutman-Solo and Yuhuai Wu and Behnam Neyshabur and Guy Gur-Ari and Vedant Misra,http://arxiv.org/pdf/2206.14858v2
http://arxiv.org/abs/2207.04429v2,creativecommons.org/licenses/by/4.0/,"LM-Nav: Robotic Navigation with Large Pre-Trained Models of Language, Vision, and Action",Dhruv Shah and Blazej Osinski and Brian Ichter and Sergey Levine,http://arxiv.org/pdf/2207.04429v2
http://arxiv.org/abs/2105.02570v4,creativecommons.org/licenses/by/4.0/,Capturing the diversity of multilingual societies,Thomas Louf and David Sanchez and Jose J. Ramasco,http://arxiv.org/pdf/2105.02570v4
http://arxiv.org/abs/2107.13723v2,creativecommons.org/licenses/by/4.0/,An Empirical Study of Developers' Discussions about Security Challenges of Different Programming Languages,Roland Croft and Yongzheng Xie and Mansooreh Zahedi and M. Ali Babar and Christoph Treude,http://arxiv.org/pdf/2107.13723v2
http://arxiv.org/abs/2208.09021v3,creativecommons.org/licenses/by/4.0/,VAuLT: Augmenting the Vision-and-Language Transformer for Sentiment Classification on Social Media,Georgios Chochlakis and Tejas Srinivasan and Jesse Thomason and Shrikanth Narayanan,http://arxiv.org/pdf/2208.09021v3
http://arxiv.org/abs/2006.05014v2,creativecommons.org/licenses/by/4.0/,HausaMT v1.0: Towards English-Hausa Neural Machine Translation,Adewale Akinfaderin,http://arxiv.org/pdf/2006.05014v2
http://arxiv.org/abs/2109.05704v2,creativecommons.org/licenses/by/4.0/,Mitigating Language-Dependent Ethnic Bias in BERT,Jaimeen Ahn and Alice Oh,http://arxiv.org/pdf/2109.05704v2
http://arxiv.org/abs/2112.13800v1,creativecommons.org/licenses/by/4.0/,"""A Passage to India"": Pre-trained Word Embeddings for Indian Languages",Kumar Saurav and Kumar Saunack and Diptesh Kanojia and Pushpak Bhattacharyya,http://arxiv.org/pdf/2112.13800v1
http://arxiv.org/abs/2203.07911v2,creativecommons.org/licenses/by/4.0/,Signal in Noise: Exploring Meaning Encoded in Random Character Sequences with Character-Aware Language Models,Mark Chu and Bhargav Srinivasa Desikan and Ethan O. Nadler and D. Ruggiero Lo Sardo and Elise Darragh-Ford and Douglas Guilbeault,http://arxiv.org/pdf/2203.07911v2
http://arxiv.org/abs/2302.01308v1,creativecommons.org/licenses/by/4.0/,What Language Reveals about Perception: Distilling Psychophysical Knowledge from Large Language Models,Raja Marjieh and Ilia Sucholutsky and Pol van Rijn and Nori Jacoby and Thomas L. Griffiths,http://arxiv.org/pdf/2302.01308v1
http://arxiv.org/abs/2304.02468v1,creativecommons.org/licenses/by/4.0/,Comparative Analysis of CHATGPT and the evolution of language models,Oluwatosin Ogundare and Gustavo Quiros Araya,http://arxiv.org/pdf/2304.02468v1
http://arxiv.org/abs/2201.12469v1,creativecommons.org/licenses/by/4.0/,ScaLA: Accelerating Adaptation of Pre-Trained Transformer-Based Language Models via Efficient Large-Batch Adversarial Noise,Minjia Zhang and Niranjan Uma Naresh and Yuxiong He,http://arxiv.org/pdf/2201.12469v1
http://arxiv.org/abs/2206.07682v2,creativecommons.org/licenses/by/4.0/,Emergent Abilities of Large Language Models,Jason Wei and Yi Tay and Rishi Bommasani and Colin Raffel and Barret Zoph and Sebastian Borgeaud and Dani Yogatama and Maarten Bosma and Denny Zhou and Donald Metzler and Ed H. Chi and Tatsunori Hashimoto and Oriol Vinyals and Percy Liang and Jeff Dean and William Fedus,http://arxiv.org/pdf/2206.07682v2
http://arxiv.org/abs/2208.11671v1,creativecommons.org/licenses/by/4.0/,Interpreting Song Lyrics with an Audio-Informed Pre-trained Language Model,Yixiao Zhang and Junyan Jiang and Gus Xia and Simon Dixon,http://arxiv.org/pdf/2208.11671v1
http://arxiv.org/abs/2205.08605v1,creativecommons.org/licenses/by/4.0/,OneAligner: Zero-shot Cross-lingual Transfer with One Rich-Resource Language Pair for Low-Resource Sentence Retrieval,Tong Niu and Kazuma Hashimoto and Yingbo Zhou and Caiming Xiong,http://arxiv.org/pdf/2205.08605v1
http://arxiv.org/abs/2303.16275v1,creativecommons.org/licenses/by/4.0/,Writing Assistants Should Model Social Factors of Language,Vivek Kulkarni and Vipul Raheja,http://arxiv.org/pdf/2303.16275v1
http://arxiv.org/abs/2106.03379v1,creativecommons.org/licenses/by/4.0/,LAWDR: Language-Agnostic Weighted Document Representations from Pre-trained Models,Hongyu Gong and Vishrav Chaudhary and Yuqing Tang and Francisco Guzmán,http://arxiv.org/pdf/2106.03379v1
http://arxiv.org/abs/2112.04482v3,creativecommons.org/licenses/by/4.0/,FLAVA: A Foundational Language And Vision Alignment Model,Amanpreet Singh and Ronghang Hu and Vedanuj Goswami and Guillaume Couairon and Wojciech Galuba and Marcus Rohrbach and Douwe Kiela,http://arxiv.org/pdf/2112.04482v3
http://arxiv.org/abs/2112.05253v2,creativecommons.org/licenses/by/4.0/,MAGMA -- Multimodal Augmentation of Generative Models through Adapter-based Finetuning,Constantin Eichenberg and Sidney Black and Samuel Weinbach and Letitia Parcalabescu and Anette Frank,http://arxiv.org/pdf/2112.05253v2
http://arxiv.org/abs/2203.06386v2,creativecommons.org/licenses/by/4.0/,Enabling Multimodal Generation on CLIP via Vision-Language Knowledge Distillation,Wenliang Dai and Lu Hou and Lifeng Shang and Xin Jiang and Qun Liu and Pascale Fung,http://arxiv.org/pdf/2203.06386v2
http://arxiv.org/abs/2205.10893v1,creativecommons.org/licenses/by/4.0/,Thor: Wielding Hammers to Integrate Language Models and Automated Theorem Provers,Albert Q. Jiang and Wenda Li and Szymon Tworkowski and Konrad Czechowski and Tomasz Odrzygóźdź and Piotr Miłoś and Yuhuai Wu and Mateja Jamnik,http://arxiv.org/pdf/2205.10893v1
http://arxiv.org/abs/2301.12112v2,creativecommons.org/licenses/by/4.0/,On Pre-trained Language Models for Antibody,Danqing Wang and Fei Ye and Hao Zhou,http://arxiv.org/pdf/2301.12112v2
http://arxiv.org/abs/2110.07143v1,creativecommons.org/licenses/by/4.0/,bert2BERT: Towards Reusable Pretrained Language Models,Cheng Chen and Yichun Yin and Lifeng Shang and Xin Jiang and Yujia Qin and Fengyu Wang and Zhi Wang and Xiao Chen and Zhiyuan Liu and Qun Liu,http://arxiv.org/pdf/2110.07143v1
http://arxiv.org/abs/2304.05613v1,creativecommons.org/licenses/by/4.0/,ChatGPT Beyond English: Towards a Comprehensive Evaluation of Large Language Models in Multilingual Learning,Viet Dac Lai and Nghia Trung Ngo and Amir Pouran Ben Veyseh and Hieu Man and Franck Dernoncourt and Trung Bui and Thien Huu Nguyen,http://arxiv.org/pdf/2304.05613v1
http://arxiv.org/abs/2101.12462v1,creativecommons.org/licenses/by/4.0/,Synthesizing Monolingual Data for Neural Machine Translation,Benjamin Marie and Atsushi Fujita,http://arxiv.org/pdf/2101.12462v1
http://arxiv.org/abs/2212.09271v2,creativecommons.org/licenses/by/4.0/,Very Large Language Model as a Unified Methodology of Text Mining,Meng Jiang,http://arxiv.org/pdf/2212.09271v2
http://arxiv.org/abs/2301.13820v1,creativecommons.org/licenses/by/4.0/,Explaining Large Language Model-Based Neural Semantic Parsers (Student Abstract),Daking Rai and Yilun Zhou and Bailin Wang and Ziyu Yao,http://arxiv.org/pdf/2301.13820v1
http://arxiv.org/abs/2303.12024v2,creativecommons.org/licenses/by/4.0/,cTBL: Augmenting Large Language Models for Conversational Tables,Anirudh S Sundar and Larry Heck,http://arxiv.org/pdf/2303.12024v2
http://arxiv.org/abs/2212.00851v1,creativecommons.org/licenses/by/4.0/,SOLD: Sinhala Offensive Language Dataset,Tharindu Ranasinghe and Isuri Anuradha and Damith Premasiri and Kanishka Silva and Hansi Hettiarachchi and Lasitha Uyangodage and Marcos Zampieri,http://arxiv.org/pdf/2212.00851v1
http://arxiv.org/abs/2301.09003v1,creativecommons.org/licenses/by/4.0/,Blacks is to Anger as Whites is to Joy? Understanding Latent Affective Bias in Large Pre-trained Neural Language Models,Anoop Kadan and Deepak P. and Sahely Bhadra and Manjary P. Gangan and Lajish V. L,http://arxiv.org/pdf/2301.09003v1
http://arxiv.org/abs/2210.14431v3,creativecommons.org/licenses/by/4.0/,$N$-gram Is Back: Residual Learning of Neural Text Generation with $n$-gram Language Model,Huayang Li and Deng Cai and Jin Xu and Taro Watanabe,http://arxiv.org/pdf/2210.14431v3
http://arxiv.org/abs/2008.09049v1,creativecommons.org/licenses/by/4.0/,Discovering Useful Sentence Representations from Large Pretrained Language Models,Nishant Subramani and Nivedita Suresh,http://arxiv.org/pdf/2008.09049v1
http://arxiv.org/abs/2201.00971v1,creativecommons.org/licenses/by/4.0/,Submix: Practical Private Prediction for Large-Scale Language Models,Antonio Ginart and Laurens van der Maaten and James Zou and Chuan Guo,http://arxiv.org/pdf/2201.00971v1
http://arxiv.org/abs/2207.04901v2,creativecommons.org/licenses/by/4.0/,Exploring Length Generalization in Large Language Models,Cem Anil and Yuhuai Wu and Anders Andreassen and Aitor Lewkowycz and Vedant Misra and Vinay Ramasesh and Ambrose Slone and Guy Gur-Ari and Ethan Dyer and Behnam Neyshabur,http://arxiv.org/pdf/2207.04901v2
http://arxiv.org/abs/2303.15430v2,creativecommons.org/licenses/by/4.0/,TextMI: Textualize Multimodal Information for Integrating Non-verbal Cues in Pre-trained Language Models,Md Kamrul Hasan and Md Saiful Islam and Sangwu Lee and Wasifur Rahman and Iftekhar Naim and Mohammed Ibrahim Khan and Ehsan Hoque,http://arxiv.org/pdf/2303.15430v2
http://arxiv.org/abs/1906.10519v1,creativecommons.org/licenses/by/4.0/,Embedding Projection for Targeted Cross-Lingual Sentiment: Model Comparisons and a Real-World Study,Jeremy Barnes and Roman Klinger,http://arxiv.org/pdf/1906.10519v1
http://arxiv.org/abs/2106.06937v1,creativecommons.org/licenses/by/4.0/,Common Sense Beyond English: Evaluating and Improving Multilingual Language Models for Commonsense Reasoning,Bill Yuchen Lin and Seyeon Lee and Xiaoyang Qiao and Xiang Ren,http://arxiv.org/pdf/2106.06937v1
http://arxiv.org/abs/2201.12086v2,creativecommons.org/licenses/by/4.0/,BLIP: Bootstrapping Language-Image Pre-training for Unified Vision-Language Understanding and Generation,Junnan Li and Dongxu Li and Caiming Xiong and Steven Hoi,http://arxiv.org/pdf/2201.12086v2
http://arxiv.org/abs/2205.04086v1,creativecommons.org/licenses/by/4.0/,A Balanced Data Approach for Evaluating Cross-Lingual Transfer: Mapping the Linguistic Blood Bank,Dan Malkin and Tomasz Limisiewicz and Gabriel Stanovsky,http://arxiv.org/pdf/2205.04086v1
http://arxiv.org/abs/2210.07128v3,creativecommons.org/licenses/by/4.0/,Language Models of Code are Few-Shot Commonsense Learners,Aman Madaan and Shuyan Zhou and Uri Alon and Yiming Yang and Graham Neubig,http://arxiv.org/pdf/2210.07128v3
http://arxiv.org/abs/2304.05128v1,creativecommons.org/licenses/by/4.0/,Teaching Large Language Models to Self-Debug,Xinyun Chen and Maxwell Lin and Nathanael Schärli and Denny Zhou,http://arxiv.org/pdf/2304.05128v1
http://arxiv.org/abs/2109.08634v1,creativecommons.org/licenses/by/4.0/,Grounding Natural Language Instructions: Can Large Language Models Capture Spatial Information?,Julia Rozanova and Deborah Ferreira and Krishna Dubba and Weiwei Cheng and Dell Zhang and Andre Freitas,http://arxiv.org/pdf/2109.08634v1
http://arxiv.org/abs/2110.07640v1,creativecommons.org/licenses/by/4.0/,Sparks: Inspiration for Science Writing using Language Models,Katy Ilonka Gero and Vivian Liu and Lydia B. Chilton,http://arxiv.org/pdf/2110.07640v1
http://arxiv.org/abs/2207.01893v1,creativecommons.org/licenses/by/4.0/,ASR-Generated Text for Language Model Pre-training Applied to Speech Tasks,Valentin Pelloin and Franck Dary and Nicolas Herve and Benoit Favre and Nathalie Camelin and Antoine Laurent and Laurent Besacier,http://arxiv.org/pdf/2207.01893v1
http://arxiv.org/abs/2207.05289v1,creativecommons.org/licenses/by/4.0/,PLM-ICD: Automatic ICD Coding with Pretrained Language Models,Chao-Wei Huang and Shang-Chi Tsai and Yun-Nung Chen,http://arxiv.org/pdf/2207.05289v1
http://arxiv.org/abs/2210.03057v1,creativecommons.org/licenses/by/4.0/,Language Models are Multilingual Chain-of-Thought Reasoners,Freda Shi and Mirac Suzgun and Markus Freitag and Xuezhi Wang and Suraj Srivats and Soroush Vosoughi and Hyung Won Chung and Yi Tay and Sebastian Ruder and Denny Zhou and Dipanjan Das and Jason Wei,http://arxiv.org/pdf/2210.03057v1
http://arxiv.org/abs/2210.07700v2,creativecommons.org/licenses/by/4.0/,Language Generation Models Can Cause Harm: So What Can We Do About It? An Actionable Survey,Sachin Kumar and Vidhisha Balachandran and Lucille Njoo and Antonios Anastasopoulos and Yulia Tsvetkov,http://arxiv.org/pdf/2210.07700v2
http://arxiv.org/abs/2302.09664v3,creativecommons.org/licenses/by/4.0/,Semantic Uncertainty: Linguistic Invariances for Uncertainty Estimation in Natural Language Generation,Lorenz Kuhn and Yarin Gal and Sebastian Farquhar,http://arxiv.org/pdf/2302.09664v3
http://arxiv.org/abs/2304.10977v1,creativecommons.org/licenses/by/4.0/,Evaluating Transformer Language Models on Arithmetic Operations Using Number Decomposition,Matteo Muffo and Aldo Cocco and Enrico Bertino,http://arxiv.org/pdf/2304.10977v1
http://arxiv.org/abs/2210.14199v1,creativecommons.org/licenses/by/4.0/,"Same Pre-training Loss, Better Downstream: Implicit Bias Matters for Language Models",Hong Liu and Sang Michael Xie and Zhiyuan Li and Tengyu Ma,http://arxiv.org/pdf/2210.14199v1
http://arxiv.org/abs/2110.06490v2,creativecommons.org/licenses/by/4.0/,Dict-BERT: Enhancing Language Model Pre-training with Dictionary,Wenhao Yu and Chenguang Zhu and Yuwei Fang and Donghan Yu and Shuohang Wang and Yichong Xu and Michael Zeng and Meng Jiang,http://arxiv.org/pdf/2110.06490v2
http://arxiv.org/abs/2111.02840v2,creativecommons.org/licenses/by/4.0/,Adversarial GLUE: A Multi-Task Benchmark for Robustness Evaluation of Language Models,Boxin Wang and Chejian Xu and Shuohang Wang and Zhe Gan and Yu Cheng and Jianfeng Gao and Ahmed Hassan Awadallah and Bo Li,http://arxiv.org/pdf/2111.02840v2
http://arxiv.org/abs/2206.05658v1,creativecommons.org/licenses/by/4.0/,Fine-tuning Pre-trained Language Models with Noise Stability Regularization,Hang Hua and Xingjian Li and Dejing Dou and Cheng-Zhong Xu and Jiebo Luo,http://arxiv.org/pdf/2206.05658v1
http://arxiv.org/abs/2209.13279v1,creativecommons.org/licenses/by/4.0/,Improving Multilingual Neural Machine Translation System for Indic Languages,Sudhansu Bala Das and Atharv Biradar and Tapas Kumar Mishra and Bidyut Kumar Patra,http://arxiv.org/pdf/2209.13279v1
http://arxiv.org/abs/2109.12584v4,creativecommons.org/licenses/by/4.0/,Curb Your Carbon Emissions: Benchmarking Carbon Emissions in Machine Translation,Mirza Yusuf and Praatibh Surana and Gauri Gupta and Krithika Ramesh,http://arxiv.org/pdf/2109.12584v4
http://arxiv.org/abs/2212.10560v1,creativecommons.org/licenses/by/4.0/,Self-Instruct: Aligning Language Model with Self Generated Instructions,Yizhong Wang and Yeganeh Kordi and Swaroop Mishra and Alisa Liu and Noah A. Smith and Daniel Khashabi and Hannaneh Hajishirzi,http://arxiv.org/pdf/2212.10560v1
http://arxiv.org/abs/1703.02504v1,creativecommons.org/licenses/by/4.0/,Leveraging Large Amounts of Weakly Supervised Data for Multi-Language Sentiment Classification,Jan Deriu and Aurelien Lucchi and Valeria De Luca and Aliaksei Severyn and Simon Müller and Mark Cieliebak and Thomas Hofmann and Martin Jaggi,http://arxiv.org/pdf/1703.02504v1
http://arxiv.org/abs/2201.08277v3,creativecommons.org/licenses/by/4.0/,NaijaSenti: A Nigerian Twitter Sentiment Corpus for Multilingual Sentiment Analysis,Shamsuddeen Hassan Muhammad and David Ifeoluwa Adelani and Sebastian Ruder and Ibrahim Said Ahmad and Idris Abdulmumin and Bello Shehu Bello and Monojit Choudhury and Chris Chinenye Emezue and Saheed Salahudeen Abdullahi and Anuoluwapo Aremu and Alipio Jeorge and Pavel Brazdil,http://arxiv.org/pdf/2201.08277v3
http://arxiv.org/abs/2211.07615v1,creativecommons.org/licenses/by/4.0/,UGIF: UI Grounded Instruction Following,Sagar Gubbi Venkatesh and Partha Talukdar and Srini Narayanan,http://arxiv.org/pdf/2211.07615v1
http://arxiv.org/abs/2204.03067v2,creativecommons.org/licenses/by/4.0/,ByT5 model for massively multilingual grapheme-to-phoneme conversion,Jian Zhu and Cong Zhang and David Jurgens,http://arxiv.org/pdf/2204.03067v2
http://arxiv.org/abs/2210.07993v1,creativecommons.org/licenses/by/4.0/,MiQA: A Benchmark for Inference on Metaphorical Questions,Iulia-Maria Comsa and Julian Martin Eisenschlos and Srini Narayanan,http://arxiv.org/pdf/2210.07993v1
http://arxiv.org/abs/2303.01347v1,creativecommons.org/licenses/by/4.0/,Letz Translate: Low-Resource Machine Translation for Luxembourgish,Yewei Song and Saad Ezzini and Jacques Klein and Tegawende Bissyande and Clément Lefebvre and Anne Goujon,http://arxiv.org/pdf/2303.01347v1
http://arxiv.org/abs/2302.08091v1,creativecommons.org/licenses/by/4.0/,Do We Still Need Clinical Language Models?,Eric Lehman and Evan Hernandez and Diwakar Mahajan and Jonas Wulff and Micah J. Smith and Zachary Ziegler and Daniel Nadler and Peter Szolovits and Alistair Johnson and Emily Alsentzer,http://arxiv.org/pdf/2302.08091v1
http://arxiv.org/abs/2212.09723v1,creativecommons.org/licenses/by/4.0/,MANER: Mask Augmented Named Entity Recognition for Extreme Low-Resource Languages,Shashank Sonkar and Zichao Wang and Richard G. Baraniuk,http://arxiv.org/pdf/2212.09723v1
http://arxiv.org/abs/1809.02428v1,creativecommons.org/licenses/by/4.0/,Multitask and Multilingual Modelling for Lexical Analysis,Johannes Bjerva,http://arxiv.org/pdf/1809.02428v1
http://arxiv.org/abs/2107.12603v1,creativecommons.org/licenses/by/4.0/,Federated Learning Meets Natural Language Processing: A Survey,Ming Liu and Stella Ho and Mengqi Wang and Longxiang Gao and Yuan Jin and He Zhang,http://arxiv.org/pdf/2107.12603v1
http://arxiv.org/abs/2203.02912v1,creativecommons.org/licenses/by/4.0/,Graph Neural Network Enhanced Language Models for Efficient Multilingual Text Classification,Samujjwal Ghosh and Subhadeep Maji and Maunendra Sankar Desarkar,http://arxiv.org/pdf/2203.02912v1
http://arxiv.org/abs/2210.12814v1,creativecommons.org/licenses/by/4.0/,RuCoLA: Russian Corpus of Linguistic Acceptability,Vladislav Mikhailov and Tatiana Shamardina and Max Ryabinin and Alena Pestova and Ivan Smurov and Ekaterina Artemova,http://arxiv.org/pdf/2210.12814v1
http://arxiv.org/abs/2301.12507v1,creativecommons.org/licenses/by/4.0/,Distilling Internet-Scale Vision-Language Models into Embodied Agents,Theodore Sumers and Kenneth Marino and Arun Ahuja and Rob Fergus and Ishita Dasgupta,http://arxiv.org/pdf/2301.12507v1
http://arxiv.org/abs/2010.10077v2,creativecommons.org/licenses/by/4.0/,Neural Language Modeling for Contextualized Temporal Graph Generation,Aman Madaan and Yiming Yang,http://arxiv.org/pdf/2010.10077v2
http://arxiv.org/abs/2012.13978v1,creativecommons.org/licenses/by/4.0/,MeDAL: Medical Abbreviation Disambiguation Dataset for Natural Language Understanding Pretraining,Zhi Wen and Xing Han Lu and Siva Reddy,http://arxiv.org/pdf/2012.13978v1
http://arxiv.org/abs/2012.05983v2,creativecommons.org/licenses/by/4.0/,Towards Neural Programming Interfaces,Zachary C. Brown and Nathaniel Robinson and David Wingate and Nancy Fulda,http://arxiv.org/pdf/2012.05983v2
http://arxiv.org/abs/2301.12031v1,creativecommons.org/licenses/by/4.0/,Context Matters: A Strategy to Pre-train Language Model for Science Education,Zhengliang Liu and Xinyu He and Lei Liu and Tianming Liu and Xiaoming Zhai,http://arxiv.org/pdf/2301.12031v1
http://arxiv.org/abs/1909.04625v1,creativecommons.org/licenses/by/4.0/,Representation of Constituents in Neural Language Models: Coordination Phrase as a Case Study,Aixiu An and Peng Qian and Ethan Wilcox and Roger Levy,http://arxiv.org/pdf/1909.04625v1
http://arxiv.org/abs/2201.06642v1,creativecommons.org/licenses/by/4.0/,Towards a Cleaner Document-Oriented Multilingual Crawled Corpus,Julien Abadji and Pedro Ortiz Suarez and Laurent Romary and Benoît Sagot,http://arxiv.org/pdf/2201.06642v1
http://arxiv.org/abs/2301.12726v1,creativecommons.org/licenses/by/4.0/,Specializing Smaller Language Models towards Multi-Step Reasoning,Yao Fu and Hao Peng and Litu Ou and Ashish Sabharwal and Tushar Khot,http://arxiv.org/pdf/2301.12726v1
http://arxiv.org/abs/2303.03846v2,creativecommons.org/licenses/by/4.0/,Larger language models do in-context learning differently,Jerry Wei and Jason Wei and Yi Tay and Dustin Tran and Albert Webson and Yifeng Lu and Xinyun Chen and Hanxiao Liu and Da Huang and Denny Zhou and Tengyu Ma,http://arxiv.org/pdf/2303.03846v2
http://arxiv.org/abs/2205.06457v2,creativecommons.org/licenses/by/4.0/,ViT5: Pretrained Text-to-Text Transformer for Vietnamese Language Generation,Long Phan and Hieu Tran and Hieu Nguyen and Trieu H. Trinh,http://arxiv.org/pdf/2205.06457v2
http://arxiv.org/abs/2206.08916v2,creativecommons.org/licenses/by/4.0/,"Unified-IO: A Unified Model for Vision, Language, and Multi-Modal Tasks",Jiasen Lu and Christopher Clark and Rowan Zellers and Roozbeh Mottaghi and Aniruddha Kembhavi,http://arxiv.org/pdf/2206.08916v2
http://arxiv.org/abs/2212.10678v1,creativecommons.org/licenses/by/4.0/,Understanding Stereotypes in Language Models: Towards Robust Measurement and Zero-Shot Debiasing,Justus Mattern and Zhijing Jin and Mrinmaya Sachan and Rada Mihalcea and Bernhard Schölkopf,http://arxiv.org/pdf/2212.10678v1
http://arxiv.org/abs/2303.13367v2,creativecommons.org/licenses/by/4.0/,ChatGPT and a New Academic Reality: Artificial Intelligence-Written Research Papers and the Ethics of the Large Language Models in Scholarly Publishing,Brady Lund and Ting Wang and Nishith Reddy Mannuru and Bing Nie and Somipam Shimray and Ziang Wang,http://arxiv.org/pdf/2303.13367v2
http://arxiv.org/abs/2304.09542v1,creativecommons.org/licenses/by/4.0/,Is ChatGPT Good at Search? Investigating Large Language Models as Re-Ranking Agent,Weiwei Sun and Lingyong Yan and Xinyu Ma and Pengjie Ren and Dawei Yin and Zhaochun Ren,http://arxiv.org/pdf/2304.09542v1
http://arxiv.org/abs/1706.00377v1,creativecommons.org/licenses/by/4.0/,Morph-fitting: Fine-Tuning Word Vector Spaces with Simple Language-Specific Rules,Ivan Vulić and Nikola Mrkšić and Roi Reichart and Diarmuid Ó Séaghdha and Steve Young and Anna Korhonen,http://arxiv.org/pdf/1706.00377v1
http://arxiv.org/abs/1612.01744v1,creativecommons.org/licenses/by/4.0/,Listen and Translate: A Proof of Concept for End-to-End Speech-to-Text Translation,Alexandre Berard and Olivier Pietquin and Christophe Servan and Laurent Besacier,http://arxiv.org/pdf/1612.01744v1
http://arxiv.org/abs/2103.16613v1,creativecommons.org/licenses/by/4.0/,Tracking Knowledge Propagation Across Wikipedia Languages,Rodolfo Valentim and Giovanni Comarela and Souneil Park and Diego Saez-Trumper,http://arxiv.org/pdf/2103.16613v1
http://arxiv.org/abs/2111.09749v2,creativecommons.org/licenses/by/4.0/,Detecting Cross-Language Plagiarism using Open Knowledge Graphs,Johannes Stegmüller and Fabian Bauer-Marquart and Norman Meuschke and Terry Ruas and Moritz Schubotz and Bela Gipp,http://arxiv.org/pdf/2111.09749v2
http://arxiv.org/abs/2212.10011v1,creativecommons.org/licenses/by/4.0/,PLUE: Language Understanding Evaluation Benchmark for Privacy Policies in English,Jianfeng Chi and Wasi Uddin Ahmad and Yuan Tian and Kai-Wei Chang,http://arxiv.org/pdf/2212.10011v1
http://arxiv.org/abs/2302.11186v1,creativecommons.org/licenses/by/4.0/,UML: A Universal Monolingual Output Layer for Multilingual ASR,Chao Zhang and Bo Li and Tara N. Sainath and Trevor Strohman and Shuo-yiin Chang,http://arxiv.org/pdf/2302.11186v1
http://arxiv.org/abs/2304.07840v1,creativecommons.org/licenses/by/4.0/,Automated Program Repair Based on Code Review: How do Pre-trained Transformer Models Perform?,Rishov Paul and Md. Mohib Hossain and Masum Hasan and Anindya Iqbal,http://arxiv.org/pdf/2304.07840v1
http://arxiv.org/abs/2205.11342v1,creativecommons.org/licenses/by/4.0/,ScholarBERT: Bigger is Not Always Better,Zhi Hong and Aswathy Ajith and Gregory Pauloski and Eamon Duede and Carl Malamud and Roger Magoulas and Kyle Chard and Ian Foster,http://arxiv.org/pdf/2205.11342v1
http://arxiv.org/abs/2304.12244v1,creativecommons.org/licenses/by/4.0/,WizardLM: Empowering Large Language Models to Follow Complex Instructions,Can Xu and Qingfeng Sun and Kai Zheng and Xiubo Geng and Pu Zhao and Jiazhan Feng and Chongyang Tao and Daxin Jiang,http://arxiv.org/pdf/2304.12244v1
http://arxiv.org/abs/2211.14402v1,creativecommons.org/licenses/by/4.0/,An Analysis of Social Biases Present in BERT Variants Across Multiple Languages,Aristides Milios and Parishad BehnamGhader,http://arxiv.org/pdf/2211.14402v1
http://arxiv.org/abs/2301.10095v2,creativecommons.org/licenses/by/4.0/,Large Language Models as Fiduciaries: A Case Study Toward Robustly Communicating With Artificial Intelligence Through Legal Standards,John J. Nay,http://arxiv.org/pdf/2301.10095v2
http://arxiv.org/abs/2201.10066v1,creativecommons.org/licenses/by/4.0/,Documenting Geographically and Contextually Diverse Data Sources: The BigScience Catalogue of Language Data and Resources,Angelina McMillan-Major and Zaid Alyafeai and Stella Biderman and Kimbo Chen and Francesco De Toni and Gérard Dupont and Hady Elsahar and Chris Emezue and Alham Fikri Aji and Suzana Ilić and Nurulaqilla Khamis and Colin Leong and Maraim Masoud and Aitor Soroa and Pedro Ortiz Suarez and Zeerak Talat and Daniel van Strien and Yacine Jernite,http://arxiv.org/pdf/2201.10066v1
http://arxiv.org/abs/1909.09543v1,creativecommons.org/licenses/by/4.0/,"Process Query Language: Design, Implementation, and Evaluation",Artem Polyvyanyy and Arthur H. M. ter Hofstede and Marcello La Rosa and Chun Ouyang and Anastasiia Pika,http://arxiv.org/pdf/1909.09543v1
http://arxiv.org/abs/2012.07098v1,creativecommons.org/licenses/by/4.0/,MSVD-Turkish: A Comprehensive Multimodal Dataset for Integrated Vision and Language Research in Turkish,Begum Citamak and Ozan Caglayan and Menekse Kuyu and Erkut Erdem and Aykut Erdem and Pranava Madhyastha and Lucia Specia,http://arxiv.org/pdf/2012.07098v1
http://arxiv.org/abs/2208.11640v3,creativecommons.org/licenses/by/4.0/,Repair Is Nearly Generation: Multilingual Program Repair with LLMs,Harshit Joshi and José Cambronero and Sumit Gulwani and Vu Le and Ivan Radicek and Gust Verbruggen,http://arxiv.org/pdf/2208.11640v3
http://arxiv.org/abs/2210.01343v3,creativecommons.org/licenses/by/4.0/,The Surprising Computational Power of Nondeterministic Stack RNNs,Brian DuSell and David Chiang,http://arxiv.org/pdf/2210.01343v3
http://arxiv.org/abs/2211.10017v1,creativecommons.org/licenses/by/4.0/,Who Says Elephants Can't Run: Bringing Large Scale MoE Models into Cloud Scale Production,Young Jin Kim and Rawn Henry and Raffy Fahim and Hany Hassan Awadalla,http://arxiv.org/pdf/2211.10017v1
http://arxiv.org/abs/2206.08446v1,creativecommons.org/licenses/by/4.0/,Methods for Estimating and Improving Robustness of Language Models,Michal Štefánik,http://arxiv.org/pdf/2206.08446v1
http://arxiv.org/abs/2302.11412v1,creativecommons.org/licenses/by/4.0/,Data Augmentation for Neural NLP,Domagoj Pluščec and Jan Šnajder,http://arxiv.org/pdf/2302.11412v1
http://arxiv.org/abs/2109.04593v1,creativecommons.org/licenses/by/4.0/,A Large-Scale Study of Machine Translation in the Turkic Languages,Jamshidbek Mirzakhalov and Anoop Babu and Duygu Ataman and Sherzod Kariev and Francis Tyers and Otabek Abduraufov and Mammad Hajili and Sardana Ivanova and Abror Khaytbaev and Antonio Laverghetta Jr. and Behzodbek Moydinboyev and Esra Onal and Shaxnoza Pulatova and Ahsan Wahab and Orhan Firat and Sriram Chellappan,http://arxiv.org/pdf/2109.04593v1
http://arxiv.org/abs/2010.06478v1,creativecommons.org/licenses/by/4.0/,XL-WiC: A Multilingual Benchmark for Evaluating Semantic Contextualization,Alessandro Raganato and Tommaso Pasini and Jose Camacho-Collados and Mohammad Taher Pilehvar,http://arxiv.org/pdf/2010.06478v1
http://arxiv.org/abs/2304.10453v1,creativecommons.org/licenses/by/4.0/,Phoenix: Democratizing ChatGPT across Languages,Zhihong Chen and Feng Jiang and Junying Chen and Tiannan Wang and Fei Yu and Guiming Chen and Hongbo Zhang and Juhao Liang and Chen Zhang and Zhiyi Zhang and Jianquan Li and Xiang Wan and Benyou Wang and Haizhou Li,http://arxiv.org/pdf/2304.10453v1
http://arxiv.org/abs/2111.09734v1,creativecommons.org/licenses/by/4.0/,ClipCap: CLIP Prefix for Image Captioning,Ron Mokady and Amir Hertz and Amit H. Bermano,http://arxiv.org/pdf/2111.09734v1
http://arxiv.org/abs/2105.11832v2,creativecommons.org/licenses/by/4.0/,Estimating Redundancy in Clinical Text,Thomas Searle and Zina Ibrahim and James Teo and Richard JB Dobson,http://arxiv.org/pdf/2105.11832v2
http://arxiv.org/abs/2303.03457v1,creativecommons.org/licenses/by/4.0/,Spelling convention sensitivity in neural language models,Elizabeth Nielsen and Christo Kirov and Brian Roark,http://arxiv.org/pdf/2303.03457v1
http://arxiv.org/abs/2304.05406v1,creativecommons.org/licenses/by/4.0/,Galactic ChitChat: Using Large Language Models to Converse with Astronomy Literature,Ioana Ciucă and Yuan-Sen Ting,http://arxiv.org/pdf/2304.05406v1
http://arxiv.org/abs/2209.08141v1,creativecommons.org/licenses/by/4.0/,Psychologically-informed chain-of-thought prompts for metaphor understanding in large language models,Ben Prystawski and Paul Thibodeau and Noah Goodman,http://arxiv.org/pdf/2209.08141v1
http://arxiv.org/abs/2210.03871v1,creativecommons.org/licenses/by/4.0/,Data-Efficiency with a Single GPU: An Exploration of Transfer Methods for Small Language Models,Alon Albalak and Akshat Shrivastava and Chinnadhurai Sankar and Adithya Sagar and Mike Ross,http://arxiv.org/pdf/2210.03871v1
http://arxiv.org/abs/2001.03521v1,creativecommons.org/licenses/by/4.0/,Towards Minimal Supervision BERT-based Grammar Error Correction,Yiyuan Li and Antonios Anastasopoulos and Alan W Black,http://arxiv.org/pdf/2001.03521v1
http://arxiv.org/abs/2109.06605v1,creativecommons.org/licenses/by/4.0/,MDAPT: Multilingual Domain Adaptive Pretraining in a Single Model,Rasmus Kær Jørgensen and Mareike Hartmann and Xiang Dai and Desmond Elliott,http://arxiv.org/pdf/2109.06605v1
http://arxiv.org/abs/2210.03347v1,creativecommons.org/licenses/by/4.0/,Pix2Struct: Screenshot Parsing as Pretraining for Visual Language Understanding,Kenton Lee and Mandar Joshi and Iulia Turc and Hexiang Hu and Fangyu Liu and Julian Eisenschlos and Urvashi Khandelwal and Peter Shaw and Ming-Wei Chang and Kristina Toutanova,http://arxiv.org/pdf/2210.03347v1
http://arxiv.org/abs/2201.13348v3,creativecommons.org/licenses/by/4.0/,Advantages and Disadvantages of (Dedicated) Model Transformation Languages A Qualitative Interview Study,Stefan Höppner and Yves Haas and Matthias Tichy and Katharina Juhnke,http://arxiv.org/pdf/2201.13348v3
http://arxiv.org/abs/2204.10555v2,creativecommons.org/licenses/by/4.0/,KALA: Knowledge-Augmented Language Model Adaptation,Minki Kang and Jinheon Baek and Sung Ju Hwang,http://arxiv.org/pdf/2204.10555v2
http://arxiv.org/abs/2301.05402v1,creativecommons.org/licenses/by/4.0/,In BLOOM: Creativity and Affinity in Artificial Lyrics and Art,Evan Crothers and Herna Viktor and Nathalie Japkowicz,http://arxiv.org/pdf/2301.05402v1
http://arxiv.org/abs/2303.02927v1,creativecommons.org/licenses/by/4.0/,LIDA: A Tool for Automatic Generation of Grammar-Agnostic Visualizations and Infographics using Large Language Models,Victor Dibia,http://arxiv.org/pdf/2303.02927v1
http://arxiv.org/abs/2303.04496v1,creativecommons.org/licenses/by/4.0/,MenuCraft: Interactive Menu System Design with Large Language Models,Amir Hossein Kargaran and Nafiseh Nikeghbal and Abbas Heydarnoori and Hinrich Schütze,http://arxiv.org/pdf/2303.04496v1
http://arxiv.org/abs/2304.06597v1,creativecommons.org/licenses/by/4.0/,"""What It Wants Me To Say"": Bridging the Abstraction Gap Between End-User Programmers and Code-Generating Large Language Models",Michael Xieyang Liu and Advait Sarkar and Carina Negreanu and Ben Zorn and Jack Williams and Neil Toronto and Andrew D. Gordon,http://arxiv.org/pdf/2304.06597v1
http://arxiv.org/abs/2010.15036v1,creativecommons.org/licenses/by/4.0/,A Comprehensive Survey on Word Representation Models: From Classical to State-Of-The-Art Word Representation Language Models,Usman Naseem and Imran Razzak and Shah Khalid Khan and Mukesh Prasad,http://arxiv.org/pdf/2010.15036v1
http://arxiv.org/abs/2103.13997v1,creativecommons.org/licenses/by/4.0/,Real-time low-resource phoneme recognition on edge devices,Yonatan Alon,http://arxiv.org/pdf/2103.13997v1
http://arxiv.org/abs/2105.13573v1,creativecommons.org/licenses/by/4.0/,Investigating Code-Mixed Modern Standard Arabic-Egyptian to English Machine Translation,El Moatez Billah Nagoudi and AbdelRahim Elmadany and Muhammad Abdul-Mageed,http://arxiv.org/pdf/2105.13573v1
http://arxiv.org/abs/2106.16038v1,creativecommons.org/licenses/by/4.0/,ChineseBERT: Chinese Pretraining Enhanced by Glyph and Pinyin Information,Zijun Sun and Xiaoya Li and Xiaofei Sun and Yuxian Meng and Xiang Ao and Qing He and Fei Wu and Jiwei Li,http://arxiv.org/pdf/2106.16038v1
http://arxiv.org/abs/2109.07046v1,creativecommons.org/licenses/by/4.0/,A Conditional Generative Matching Model for Multi-lingual Reply Suggestion,Budhaditya Deb and Guoqing Zheng and Milad Shokouhi and Ahmed Hassan Awadallah,http://arxiv.org/pdf/2109.07046v1
http://arxiv.org/abs/2205.12630v1,creativecommons.org/licenses/by/4.0/,Multimodal Knowledge Alignment with Reinforcement Learning,Youngjae Yu and Jiwan Chung and Heeseung Yun and Jack Hessel and JaeSung Park and Ximing Lu and Prithviraj Ammanabrolu and Rowan Zellers and Ronan Le Bras and Gunhee Kim and Yejin Choi,http://arxiv.org/pdf/2205.12630v1
http://arxiv.org/abs/2212.01964v1,creativecommons.org/licenses/by/4.0/,Building Metadata Inference Using a Transducer Based Language Model,David Waterworth and Subbu Sethuvenkatraman and Quan Z. Sheng,http://arxiv.org/pdf/2212.01964v1
http://arxiv.org/abs/2212.03278v1,creativecommons.org/licenses/by/4.0/,Counterfactual reasoning: Do language models need world knowledge for causal understanding?,Jiaxuan Li and Lang Yu and Allyson Ettinger,http://arxiv.org/pdf/2212.03278v1
http://arxiv.org/abs/2302.03773v1,creativecommons.org/licenses/by/4.0/,What Matters In The Structured Pruning of Generative Language Models?,Michael Santacroce and Zixin Wen and Yelong Shen and Yuanzhi Li,http://arxiv.org/pdf/2302.03773v1
http://arxiv.org/abs/2110.08413v2,creativecommons.org/licenses/by/4.0/,Invariant Language Modeling,Maxime Peyrard and Sarvjeet Singh Ghotra and Martin Josifoski and Vidhan Agarwal and Barun Patra and Dean Carignan and Emre Kiciman and Robert West,http://arxiv.org/pdf/2110.08413v2
http://arxiv.org/abs/2203.00902v1,creativecommons.org/licenses/by/4.0/,Do Prompts Solve NLP Tasks Using Natural Language?,Sen Yang and Yunchen Zhang and Leyang Cui and Yue Zhang,http://arxiv.org/pdf/2203.00902v1
http://arxiv.org/abs/2201.10716v1,creativecommons.org/licenses/by/4.0/,Neural Grapheme-to-Phoneme Conversion with Pre-trained Grapheme Models,Lu Dong and Zhi-Qiang Guo and Chao-Hong Tan and Ya-Jun Hu and Yuan Jiang and Zhen-Hua Ling,http://arxiv.org/pdf/2201.10716v1
http://arxiv.org/abs/2301.13779v1,creativecommons.org/licenses/by/4.0/,FLAME: A small language model for spreadsheet formulas,Harshit Joshi and Abishai Ebenezer and José Cambronero and Sumit Gulwani and Aditya Kanade and Vu Le and Ivan Radiček and Gust Verbruggen,http://arxiv.org/pdf/2301.13779v1
http://arxiv.org/abs/2104.08786v2,creativecommons.org/licenses/by/4.0/,Fantastically Ordered Prompts and Where to Find Them: Overcoming Few-Shot Prompt Order Sensitivity,Yao Lu and Max Bartolo and Alastair Moore and Sebastian Riedel and Pontus Stenetorp,http://arxiv.org/pdf/2104.08786v2
http://arxiv.org/abs/2210.15458v1,creativecommons.org/licenses/by/4.0/,Arithmetic Sampling: Parallel Diverse Decoding for Large Language Models,Luke Vilnis and Yury Zemlyanskiy and Patrick Murray and Alexandre Passos and Sumit Sanghai,http://arxiv.org/pdf/2210.15458v1
http://arxiv.org/abs/2203.07687v1,creativecommons.org/licenses/by/4.0/,Compressing Sentence Representation for Semantic Retrieval via Homomorphic Projective Distillation,Xuandong Zhao and Zhiguo Yu and Ming Wu and Lei Li,http://arxiv.org/pdf/2203.07687v1
http://arxiv.org/abs/2110.07560v2,creativecommons.org/licenses/by/4.0/,Composable Sparse Fine-Tuning for Cross-Lingual Transfer,Alan Ansell and Edoardo Maria Ponti and Anna Korhonen and Ivan Vulić,http://arxiv.org/pdf/2110.07560v2
http://arxiv.org/abs/2302.13681v2,creativecommons.org/licenses/by/4.0/,The (ab)use of Open Source Code to Train Large Language Models,Ali Al-Kaswan and Maliheh Izadi,http://arxiv.org/pdf/2302.13681v2
http://arxiv.org/abs/2211.06452v1,creativecommons.org/licenses/by/4.0/,Cross-Platform and Cross-Domain Abusive Language Detection with Supervised Contrastive Learning,Md Tawkat Islam Khondaker and Muhammad Abdul-Mageed and Laks V. S. Lakshmanan,http://arxiv.org/pdf/2211.06452v1
http://arxiv.org/abs/2105.08645v4,creativecommons.org/licenses/by/4.0/,CoTexT: Multi-task Learning with Code-Text Transformer,Long Phan and Hieu Tran and Daniel Le and Hieu Nguyen and James Anibal and Alec Peltekian and Yanfang Ye,http://arxiv.org/pdf/2105.08645v4
http://arxiv.org/abs/2110.05679v6,creativecommons.org/licenses/by/4.0/,Large Language Models Can Be Strong Differentially Private Learners,Xuechen Li and Florian Tramèr and Percy Liang and Tatsunori Hashimoto,http://arxiv.org/pdf/2110.05679v6
http://arxiv.org/abs/1907.03187v1,creativecommons.org/licenses/by/4.0/,Applying a Pre-trained Language Model to Spanish Twitter Humor Prediction,Bobak Farzin and Piotr Czapla and Jeremy Howard,http://arxiv.org/pdf/1907.03187v1
http://arxiv.org/abs/2104.05277v1,creativecommons.org/licenses/by/4.0/,Building a Swedish Open-Domain Conversational Language Model,Tobias Norlund and Agnes Stenbom,http://arxiv.org/pdf/2104.05277v1
http://arxiv.org/abs/2104.09933v1,creativecommons.org/licenses/by/4.0/,Grammatical Error Generation Based on Translated Fragments,Eetu Sjöblom and Mathias Creutz and Teemu Vahtola,http://arxiv.org/pdf/2104.09933v1
http://arxiv.org/abs/2106.04653v1,creativecommons.org/licenses/by/4.0/,Comprehension Based Question Answering using Bloom's Taxonomy,Pritish Sahu and Michael Cogswell and Sara Rutherford-Quach and Ajay Divakaran,http://arxiv.org/pdf/2106.04653v1
http://arxiv.org/abs/2204.00498v1,creativecommons.org/licenses/by/4.0/,Evaluating the Text-to-SQL Capabilities of Large Language Models,Nitarshan Rajkumar and Raymond Li and Dzmitry Bahdanau,http://arxiv.org/pdf/2204.00498v1
http://arxiv.org/abs/2303.08014v1,creativecommons.org/licenses/by/4.0/,Does ChatGPT resemble humans in language use?,Zhenguang G. Cai and David A. Haslett and Xufeng Duan and Shuqi Wang and Martin J. Pickering,http://arxiv.org/pdf/2303.08014v1
http://arxiv.org/abs/2304.12191v1,creativecommons.org/licenses/by/4.0/,"""Genlangs"" and Zipf's Law: Do languages generated by ChatGPT statistically look human?",Justin Diamond,http://arxiv.org/pdf/2304.12191v1
http://arxiv.org/abs/2203.16601v3,creativecommons.org/licenses/by/4.0/,Is Word Error Rate a good evaluation metric for Speech Recognition in Indic Languages?,Priyanshi Shah and Harveen Singh Chadha and Anirudh Gupta and Ankur Dhuriya and Neeraj Chhimwal and Rishabh Gaur and Vivek Raghavan,http://arxiv.org/pdf/2203.16601v3
http://arxiv.org/abs/2303.13592v2,creativecommons.org/licenses/by/4.0/,Prompting Multilingual Large Language Models to Generate Code-Mixed Texts: The Case of South East Asian Languages,Zheng-Xin Yong and Ruochen Zhang and Jessica Zosa Forde and Skyler Wang and Samuel Cahyawijaya and Holy Lovenia and Genta Indra Winata and Lintang Sutawika and Jan Christian Blaise Cruz and Long Phan and Yin Lin Tan and Alham Fikri Aji,http://arxiv.org/pdf/2303.13592v2
http://arxiv.org/abs/1911.12579v3,creativecommons.org/licenses/by/4.0/,Word Embedding based New Corpus for Low-resourced Language: Sindhi,Wazir Ali and Jay Kumar and Junyu Lu and Zenglin Xu,http://arxiv.org/pdf/1911.12579v3
http://arxiv.org/abs/2101.04758v4,creativecommons.org/licenses/by/4.0/,Self-Training Pre-Trained Language Models for Zero- and Few-Shot Multi-Dialectal Arabic Sequence Labeling,Muhammad Khalifa and Muhammad Abdul-Mageed and Khaled Shaalan,http://arxiv.org/pdf/2101.04758v4
http://arxiv.org/abs/2110.06078v1,creativecommons.org/licenses/by/4.0/,Model-based analysis of brain activity reveals the hierarchy of language in 305 subjects,Charlotte Caucheteux and Alexandre Gramfort and Jean-Rémi King,http://arxiv.org/pdf/2110.06078v1
http://arxiv.org/abs/2205.10828v3,creativecommons.org/licenses/by/4.0/,What Do Compressed Multilingual Machine Translation Models Forget?,Alireza Mohammadshahi and Vassilina Nikoulina and Alexandre Berard and Caroline Brun and James Henderson and Laurent Besacier,http://arxiv.org/pdf/2205.10828v3
http://arxiv.org/abs/2003.12111v1,creativecommons.org/licenses/by/4.0/,FFR V1.0: Fon-French Neural Machine Translation,Bonaventure F. P. Dossou and Chris C. Emezue,http://arxiv.org/pdf/2003.12111v1
http://arxiv.org/abs/2012.05628v3,creativecommons.org/licenses/by/4.0/,As Good as New. How to Successfully Recycle English GPT-2 to Make Models for Other Languages,Wietse de Vries and Malvina Nissim,http://arxiv.org/pdf/2012.05628v3
http://arxiv.org/abs/2104.09585v1,creativecommons.org/licenses/by/4.0/,ELECTRAMed: a new pre-trained language representation model for biomedical NLP,Giacomo Miolo and Giulio Mantoan and Carlotta Orsenigo,http://arxiv.org/pdf/2104.09585v1
http://arxiv.org/abs/2105.04633v1,creativecommons.org/licenses/by/4.0/,"Language Acquisition is Embodied, Interactive, Emotive: a Research Proposal",Casey Kennington,http://arxiv.org/pdf/2105.04633v1
http://arxiv.org/abs/2106.04563v2,creativecommons.org/licenses/by/4.0/,XtremeDistilTransformers: Task Transfer for Task-agnostic Distillation,Subhabrata Mukherjee and Ahmed Hassan Awadallah and Jianfeng Gao,http://arxiv.org/pdf/2106.04563v2
http://arxiv.org/abs/2110.08527v3,creativecommons.org/licenses/by/4.0/,An Empirical Survey of the Effectiveness of Debiasing Techniques for Pre-trained Language Models,Nicholas Meade and Elinor Poole-Dayan and Siva Reddy,http://arxiv.org/pdf/2110.08527v3
http://arxiv.org/abs/2111.14031v1,creativecommons.org/licenses/by/4.0/,FastTrees: Parallel Latent Tree-Induction for Faster Sequence Encoding,Bill Tuck Weng Pung and Alvin Chan,http://arxiv.org/pdf/2111.14031v1
http://arxiv.org/abs/2205.12558v2,creativecommons.org/licenses/by/4.0/,Gradient-Based Constrained Sampling from Language Models,Sachin Kumar and Biswajit Paria and Yulia Tsvetkov,http://arxiv.org/pdf/2205.12558v2
http://arxiv.org/abs/2208.10264v4,creativecommons.org/licenses/by/4.0/,Using Large Language Models to Simulate Multiple Humans and Replicate Human Subject Studies,Gati Aher and Rosa I. Arriaga and Adam Tauman Kalai,http://arxiv.org/pdf/2208.10264v4
http://arxiv.org/abs/2301.00068v1,creativecommons.org/licenses/by/4.0/,On the Inconsistencies of Conditionals Learned by Masked Language Models,Tom Young and Yang You,http://arxiv.org/pdf/2301.00068v1
http://arxiv.org/abs/2301.12810v1,creativecommons.org/licenses/by/4.0/,Crawling the Internal Knowledge-Base of Language Models,Roi Cohen and Mor Geva and Jonathan Berant and Amir Globerson,http://arxiv.org/pdf/2301.12810v1
http://arxiv.org/abs/2304.08442v1,creativecommons.org/licenses/by/4.0/,The MiniPile Challenge for Data-Efficient Language Models,Jean Kaddour,http://arxiv.org/pdf/2304.08442v1
http://arxiv.org/abs/2112.10668v3,creativecommons.org/licenses/by/4.0/,Few-shot Learning with Multilingual Language Models,Xi Victoria Lin and Todor Mihaylov and Mikel Artetxe and Tianlu Wang and Shuohui Chen and Daniel Simig and Myle Ott and Naman Goyal and Shruti Bhosale and Jingfei Du and Ramakanth Pasunuru and Sam Shleifer and Punit Singh Koura and Vishrav Chaudhary and Brian O'Horo and Jeff Wang and Luke Zettlemoyer and Zornitsa Kozareva and Mona Diab and Veselin Stoyanov and Xian Li,http://arxiv.org/pdf/2112.10668v3
http://arxiv.org/abs/2111.13999v1,creativecommons.org/licenses/by/4.0/,Exploring Low-Cost Transformer Model Compression for Large-Scale Commercial Reply Suggestions,Vaishnavi Shrivastava and Radhika Gaonkar and Shashank Gupta and Abhishek Jha,http://arxiv.org/pdf/2111.13999v1
http://arxiv.org/abs/1904.07334v1,creativecommons.org/licenses/by/4.0/,Multi-Head Multi-Layer Attention to Deep Language Representations for Grammatical Error Detection,Masahiro Kaneko and Mamoru Komachi,http://arxiv.org/pdf/1904.07334v1
http://arxiv.org/abs/2111.05671v1,creativecommons.org/licenses/by/4.0/,Pre-trained Transformer-Based Approach for Arabic Question Answering : A Comparative Study,Kholoud Alsubhi and Amani Jamal and Areej Alhothali,http://arxiv.org/pdf/2111.05671v1
http://arxiv.org/abs/2206.01843v2,creativecommons.org/licenses/by/4.0/,Visual Clues: Bridging Vision and Language Foundations for Image Paragraph Captioning,Yujia Xie and Luowei Zhou and Xiyang Dai and Lu Yuan and Nguyen Bach and Ce Liu and Michael Zeng,http://arxiv.org/pdf/2206.01843v2
http://arxiv.org/abs/2301.09790v2,creativecommons.org/licenses/by/4.0/,Can Very Large Pretrained Language Models Learn Storytelling With A Few Examples?,Zhuohan Xie and Trevor Cohn and Jey Han Lau,http://arxiv.org/pdf/2301.09790v2
http://arxiv.org/abs/2208.05446v2,creativecommons.org/licenses/by/4.0/,CoditT5: Pretraining for Source Code and Natural Language Editing,Jiyang Zhang and Sheena Panthaplackel and Pengyu Nie and Junyi Jessy Li and Milos Gligoric,http://arxiv.org/pdf/2208.05446v2
http://arxiv.org/abs/2212.09682v1,creativecommons.org/licenses/by/4.0/,Multilingual Sequence-to-Sequence Models for Hebrew NLP,Matan Eyal and Hila Noga and Roee Aharoni and Idan Szpektor and Reut Tsarfaty,http://arxiv.org/pdf/2212.09682v1
http://arxiv.org/abs/2103.05327v2,creativecommons.org/licenses/by/4.0/,BERTese: Learning to Speak to BERT,Adi Haviv and Jonathan Berant and Amir Globerson,http://arxiv.org/pdf/2103.05327v2
http://arxiv.org/abs/2207.03546v1,creativecommons.org/licenses/by/4.0/,"BibleTTS: a large, high-fidelity, multilingual, and uniquely African speech corpus",Josh Meyer and David Ifeoluwa Adelani and Edresson Casanova and Alp Öktem and Daniel Whitenack and Julian Weber and Salomon Kabongo and Elizabeth Salesky and Iroro Orife and Colin Leong and Perez Ogayo and Chris Emezue and Jonathan Mukiibi and Salomey Osei and Apelete Agbolo and Victor Akinode and Bernard Opoku and Samuel Olanrewaju and Jesujoba Alabi and Shamsuddeen Muhammad,http://arxiv.org/pdf/2207.03546v1
http://arxiv.org/abs/2208.11857v1,creativecommons.org/licenses/by/4.0/,Shortcut Learning of Large Language Models in Natural Language Understanding: A Survey,Mengnan Du and Fengxiang He and Na Zou and Dacheng Tao and Xia Hu,http://arxiv.org/pdf/2208.11857v1
http://arxiv.org/abs/2209.05629v1,creativecommons.org/licenses/by/4.0/,Leveraging Large Language Models for Robot 3D Scene Understanding,William Chen and Siyi Hu and Rajat Talak and Luca Carlone,http://arxiv.org/pdf/2209.05629v1
http://arxiv.org/abs/2212.09736v1,creativecommons.org/licenses/by/4.0/,"Don't Generate, Discriminate: A Proposal for Grounding Language Models to Real-World Environments",Yu Gu and Xiang Deng and Yu Su,http://arxiv.org/pdf/2212.09736v1
http://arxiv.org/abs/2201.06796v2,creativecommons.org/licenses/by/4.0/,CoAuthor: Designing a Human-AI Collaborative Writing Dataset for Exploring Language Model Capabilities,Mina Lee and Percy Liang and Qian Yang,http://arxiv.org/pdf/2201.06796v2
http://arxiv.org/abs/2303.03004v2,creativecommons.org/licenses/by/4.0/,"xCodeEval: A Large Scale Multilingual Multitask Benchmark for Code Understanding, Generation, Translation and Retrieval",Mohammad Abdullah Matin Khan and M Saiful Bari and Xuan Long Do and Weishi Wang and Md Rizwan Parvez and Shafiq Joty,http://arxiv.org/pdf/2303.03004v2
http://arxiv.org/abs/2304.00116v1,creativecommons.org/licenses/by/4.0/,Enhancing Large Language Models with Climate Resources,Mathias Kraus and Julia Anna Bingler and Markus Leippold and Tobias Schimanski and Chiara Colesanti Senni and Dominik Stammbach and Saeid Ashraf Vaghefi and Nicolas Webersinke,http://arxiv.org/pdf/2304.00116v1
http://arxiv.org/abs/2302.04089v1,creativecommons.org/licenses/by/4.0/,ZipLM: Hardware-Aware Structured Pruning of Language Models,Eldar Kurtic and Elias Frantar and Dan Alistarh,http://arxiv.org/pdf/2302.04089v1
http://arxiv.org/abs/2211.07705v1,creativecommons.org/licenses/by/4.0/,A Machine Learning Approach to Classifying Construction Cost Documents into the International Construction Measurement Standard,J. Ignacio Deza and Hisham Ihshaish and Lamine Mahdjoubi,http://arxiv.org/pdf/2211.07705v1
http://arxiv.org/abs/2303.05546v1,creativecommons.org/licenses/by/4.0/,Weakly-Supervised HOI Detection from Interaction Labels Only and Language/Vision-Language Priors,Mesut Erhan Unal and Adriana Kovashka,http://arxiv.org/pdf/2303.05546v1
http://arxiv.org/abs/2212.01907v1,creativecommons.org/licenses/by/4.0/,Understanding How Model Size Affects Few-shot Instruction Prompting,Ayrton San Joaquin and Ardy Haroen,http://arxiv.org/pdf/2212.01907v1
http://arxiv.org/abs/2304.09871v1,creativecommons.org/licenses/by/4.0/,A Theory on Adam Instability in Large-Scale Machine Learning,Igor Molybog and Peter Albert and Moya Chen and Zachary DeVito and David Esiobu and Naman Goyal and Punit Singh Koura and Sharan Narang and Andrew Poulton and Ruan Silva and Binh Tang and Puxin Xu and Yuchen Zhang and Melanie Kambadur and Stephen Roller and Susan Zhang,http://arxiv.org/pdf/2304.09871v1
http://arxiv.org/abs/2209.01975v1,creativecommons.org/licenses/by/4.0/,Selective Annotation Makes Language Models Better Few-Shot Learners,Hongjin Su and Jungo Kasai and Chen Henry Wu and Weijia Shi and Tianlu Wang and Jiayi Xin and Rui Zhang and Mari Ostendorf and Luke Zettlemoyer and Noah A. Smith and Tao Yu,http://arxiv.org/pdf/2209.01975v1
http://arxiv.org/abs/2304.09181v1,creativecommons.org/licenses/by/4.0/,Large Language Models Based Automatic Synthesis of Software Specifications,Shantanu Mandal and Adhrik Chethan and Vahid Janfaza and S M Farabi Mahmud and Todd A Anderson and Javier Turek and Jesmin Jahan Tithi and Abdullah Muzahid,http://arxiv.org/pdf/2304.09181v1
http://arxiv.org/abs/2104.06678v1,creativecommons.org/licenses/by/4.0/,Large-Scale Self- and Semi-Supervised Learning for Speech Translation,Changhan Wang and Anne Wu and Juan Pino and Alexei Baevski and Michael Auli and Alexis Conneau,http://arxiv.org/pdf/2104.06678v1
http://arxiv.org/abs/2105.03791v2,creativecommons.org/licenses/by/4.0/,Enhancing Transformers with Gradient Boosted Decision Trees for NLI Fine-Tuning,Benjamin Minixhofer and Milan Gritta and Ignacio Iacobacci,http://arxiv.org/pdf/2105.03791v2
http://arxiv.org/abs/1912.05372v4,creativecommons.org/licenses/by/4.0/,FlauBERT: Unsupervised Language Model Pre-training for French,Hang Le and Loïc Vial and Jibril Frej and Vincent Segonne and Maximin Coavoux and Benjamin Lecouteux and Alexandre Allauzen and Benoît Crabbé and Laurent Besacier and Didier Schwab,http://arxiv.org/pdf/1912.05372v4
http://arxiv.org/abs/2109.06262v1,creativecommons.org/licenses/by/4.0/,Evaluating Multiway Multilingual NMT in the Turkic Languages,Jamshidbek Mirzakhalov and Anoop Babu and Aigiz Kunafin and Ahsan Wahab and Behzod Moydinboyev and Sardana Ivanova and Mokhiyakhon Uzokova and Shaxnoza Pulatova and Duygu Ataman and Julia Kreutzer and Francis Tyers and Orhan Firat and John Licato and Sriram Chellappan,http://arxiv.org/pdf/2109.06262v1
http://arxiv.org/abs/2205.12986v4,creativecommons.org/licenses/by/4.0/,Transcormer: Transformer for Sentence Scoring with Sliding Language Modeling,Kaitao Song and Yichong Leng and Xu Tan and Yicheng Zou and Tao Qin and Dongsheng Li,http://arxiv.org/pdf/2205.12986v4
http://arxiv.org/abs/2211.09084v1,creativecommons.org/licenses/by/4.0/,Technical Report on Neural Language Models and Few-Shot Learning for Systematic Requirements Processing in MDSE,Vincent Bertram and Miriam Boß and Evgeny Kusmenko and Imke Helene Nachmann and Bernhard Rumpe and Danilo Trotta and Louis Wachtmeister,http://arxiv.org/pdf/2211.09084v1
http://arxiv.org/abs/2304.11477v1,creativecommons.org/licenses/by/4.0/,LLM+P: Empowering Large Language Models with Optimal Planning Proficiency,Bo Liu and Yuqian Jiang and Xiaohan Zhang and Qiang Liu and Shiqi Zhang and Joydeep Biswas and Peter Stone,http://arxiv.org/pdf/2304.11477v1
http://arxiv.org/abs/2301.03980v1,creativecommons.org/licenses/by/4.0/,Language Models sounds the Death Knell of Knowledge Graphs,Kunal Suri and Atul Singh and Prakhar Mishra and Swapna Sourav Rout and Rajesh Sabapathy,http://arxiv.org/pdf/2301.03980v1
http://arxiv.org/abs/2204.02633v1,creativecommons.org/licenses/by/4.0/,DAGAM: Data Augmentation with Generation And Modification,Byeong-Cheol Jo and Tak-Sung Heo and Yeongjoon Park and Yongmin Yoo and Won Ik Cho and Kyungsun Kim,http://arxiv.org/pdf/2204.02633v1
http://arxiv.org/abs/2209.06430v4,creativecommons.org/licenses/by/4.0/,CLIP-ViP: Adapting Pre-trained Image-Text Model to Video-Language Representation Alignment,Hongwei Xue and Yuchong Sun and Bei Liu and Jianlong Fu and Ruihua Song and Houqiang Li and Jiebo Luo,http://arxiv.org/pdf/2209.06430v4
http://arxiv.org/abs/2303.07895v1,creativecommons.org/licenses/by/4.0/,The Learnability of In-Context Learning,Noam Wies and Yoav Levine and Amnon Shashua,http://arxiv.org/pdf/2303.07895v1
http://arxiv.org/abs/2010.04900v2,creativecommons.org/licenses/by/4.0/,Toward Micro-Dialect Identification in Diaglossic and Code-Switched Environments,Muhammad Abdul-Mageed and Chiyu Zhang and AbdelRahim Elmadany and Lyle Ungar,http://arxiv.org/pdf/2010.04900v2
http://arxiv.org/abs/2101.00027v1,creativecommons.org/licenses/by/4.0/,The Pile: An 800GB Dataset of Diverse Text for Language Modeling,Leo Gao and Stella Biderman and Sid Black and Laurence Golding and Travis Hoppe and Charles Foster and Jason Phang and Horace He and Anish Thite and Noa Nabeshima and Shawn Presser and Connor Leahy,http://arxiv.org/pdf/2101.00027v1
http://arxiv.org/abs/2103.12407v4,creativecommons.org/licenses/by/4.0/,Detecting Hate Speech with GPT-3,Ke-Li Chiu and Annie Collins and Rohan Alexander,http://arxiv.org/pdf/2103.12407v4
http://arxiv.org/abs/2108.02340v1,creativecommons.org/licenses/by/4.0/,Robust Transfer Learning with Pretrained Language Models through Adapters,Wenjuan Han and Bo Pang and Yingnian Wu,http://arxiv.org/pdf/2108.02340v1
http://arxiv.org/abs/2205.12586v2,creativecommons.org/licenses/by/4.0/,Perturbation Augmentation for Fairer NLP,Rebecca Qian and Candace Ross and Jude Fernandes and Eric Smith and Douwe Kiela and Adina Williams,http://arxiv.org/pdf/2205.12586v2
http://arxiv.org/abs/2208.05051v1,creativecommons.org/licenses/by/4.0/,Limitations of Language Models in Arithmetic and Symbolic Induction,Jing Qian and Hong Wang and Zekun Li and Shiyang Li and Xifeng Yan,http://arxiv.org/pdf/2208.05051v1
http://arxiv.org/abs/2210.07074v2,creativecommons.org/licenses/by/4.0/,CLASP: Few-Shot Cross-Lingual Data Augmentation for Semantic Parsing,Andy Rosenbaum and Saleh Soltan and Wael Hamza and Amir Saffari and Marco Damonte and Isabel Groves,http://arxiv.org/pdf/2210.07074v2
http://arxiv.org/abs/2101.03216v2,creativecommons.org/licenses/by/4.0/,Breaking Writer's Block: Low-cost Fine-tuning of Natural Language Generation Models,Alexandre Duval and Thomas Lamson and Gael de Leseleuc de Kerouara and Matthias Gallé,http://arxiv.org/pdf/2101.03216v2
http://arxiv.org/abs/2106.09204v1,creativecommons.org/licenses/by/4.0/,An Empirical Study on Hyperparameter Optimization for Fine-Tuning Pre-trained Language Models,Xueqing Liu and Chi Wang,http://arxiv.org/pdf/2106.09204v1
http://arxiv.org/abs/2107.05002v2,creativecommons.org/licenses/by/4.0/,Improving Low-resource Reading Comprehension via Cross-lingual Transposition Rethinking,Gaochen Wu and Bin Xu and Yuxin Qin and Fei Kong and Bangchang Liu and Hongwen Zhao and Dejie Chang,http://arxiv.org/pdf/2107.05002v2
http://arxiv.org/abs/2107.12600v1,creativecommons.org/licenses/by/4.0/,PiSLTRc: Position-informed Sign Language Transformer with Content-aware Convolution,Pan Xie and Mengyi Zhao and Xiaohui Hu,http://arxiv.org/pdf/2107.12600v1
http://arxiv.org/abs/2203.10415v1,creativecommons.org/licenses/by/4.0/,How does the pre-training objective affect what large language models learn about linguistic properties?,Ahmed Alajrami and Nikolaos Aletras,http://arxiv.org/pdf/2203.10415v1
http://arxiv.org/abs/2204.12820v1,creativecommons.org/licenses/by/4.0/,LyS_ACoruña at SemEval-2022 Task 10: Repurposing Off-the-Shelf Tools for Sentiment Analysis as Semantic Dependency Parsing,Iago Alonso-Alonso and David Vilares and Carlos Gómez-Rodríguez,http://arxiv.org/pdf/2204.12820v1
http://arxiv.org/abs/2206.08325v2,creativecommons.org/licenses/by/4.0/,Characteristics of Harmful Text: Towards Rigorous Benchmarking of Language Models,Maribeth Rauh and John Mellor and Jonathan Uesato and Po-Sen Huang and Johannes Welbl and Laura Weidinger and Sumanth Dathathri and Amelia Glaese and Geoffrey Irving and Iason Gabriel and William Isaac and Lisa Anne Hendricks,http://arxiv.org/pdf/2206.08325v2
http://arxiv.org/abs/2206.10744v1,creativecommons.org/licenses/by/4.0/,Don't Forget About Pronouns: Removing Gender Bias in Language Models Without Losing Factual Gender Information,Tomasz Limisiewicz and David Mareček,http://arxiv.org/pdf/2206.10744v1
http://arxiv.org/abs/2301.11660v2,creativecommons.org/licenses/by/4.0/,Probing Out-of-Distribution Robustness of Language Models with Parameter-Efficient Transfer Learning,Hyunsoo Cho and Choonghyun Park and Junyeop Kim and Hyuhng Joon Kim and Kang Min Yoo and Sang-goo Lee,http://arxiv.org/pdf/2301.11660v2
http://arxiv.org/abs/2303.07578v1,creativecommons.org/licenses/by/4.0/,VANI: Very-lightweight Accent-controllable TTS for Native and Non-native speakers with Identity Preservation,Rohan Badlani and Akshit Arora and Subhankar Ghosh and Rafael Valle and Kevin J. Shih and João Felipe Santos and Boris Ginsburg and Bryan Catanzaro,http://arxiv.org/pdf/2303.07578v1
http://arxiv.org/abs/2303.15619v1,creativecommons.org/licenses/by/4.0/,Typhoon: Towards an Effective Task-Specific Masking Strategy for Pre-trained Language Models,Muhammed Shahir Abdurrahman and Hashem Elezabi and Bruce Changlong Xu,http://arxiv.org/pdf/2303.15619v1
http://arxiv.org/abs/2209.15162v3,creativecommons.org/licenses/by/4.0/,Linearly Mapping from Image to Text Space,Jack Merullo and Louis Castricato and Carsten Eickhoff and Ellie Pavlick,http://arxiv.org/pdf/2209.15162v3
http://arxiv.org/abs/2205.09295v2,creativecommons.org/licenses/by/4.0/,Are Prompt-based Models Clueless?,Pride Kavumba and Ryo Takahashi and Yusuke Oda,http://arxiv.org/pdf/2205.09295v2
http://arxiv.org/abs/2207.12994v1,creativecommons.org/licenses/by/4.0/,V$^2$L: Leveraging Vision and Vision-language Models into Large-scale Product Retrieval,Wenhao Wang and Yifan Sun and Zongxin Yang and Yi Yang,http://arxiv.org/pdf/2207.12994v1
http://arxiv.org/abs/2212.14206v1,creativecommons.org/licenses/by/4.0/,Maximizing Use-Case Specificity through Precision Model Tuning,Pranjali Awasthi and David Recio-Mitter and Yosuke Kyle Sugi,http://arxiv.org/pdf/2212.14206v1
http://arxiv.org/abs/2205.07557v2,creativecommons.org/licenses/by/4.0/,"Heroes, Villains, and Victims, and GPT-3: Automated Extraction of Character Roles Without Training Data",Dominik Stammbach and Maria Antoniak and Elliott Ash,http://arxiv.org/pdf/2205.07557v2
http://arxiv.org/abs/1909.11272v1,creativecommons.org/licenses/by/4.0/,TalkDown: A Corpus for Condescension Detection in Context,Zijian Wang and Christopher Potts,http://arxiv.org/pdf/1909.11272v1
http://arxiv.org/abs/2109.00916v1,creativecommons.org/licenses/by/4.0/,Coarse-To-Fine And Cross-Lingual ASR Transfer,Peter Polák and Ondřej Bojar,http://arxiv.org/pdf/2109.00916v1
http://arxiv.org/abs/2206.05399v1,creativecommons.org/licenses/by/4.0/,Building a Personalized Dialogue System with Prompt-Tuning,Tomohito Kasahara and Daisuke Kawahara and Nguyen Tung and Shengzhe Li and Kenta Shinzato and Toshinori Sato,http://arxiv.org/pdf/2206.05399v1
http://arxiv.org/abs/2302.02178v1,creativecommons.org/licenses/by/4.0/,Construction Grammar Provides Unique Insight into Neural Language Models,Leonie Weissweiler and Taiqi He and Naoki Otani and David R. Mortensen and Lori Levin and Hinrich Schütze,http://arxiv.org/pdf/2302.02178v1
http://arxiv.org/abs/2303.15647v1,creativecommons.org/licenses/by/4.0/,Scaling Down to Scale Up: A Guide to Parameter-Efficient Fine-Tuning,Vladislav Lialin and Vijeta Deshpande and Anna Rumshisky,http://arxiv.org/pdf/2303.15647v1
http://arxiv.org/abs/2203.06096v1,creativecommons.org/licenses/by/4.0/,WLASL-LEX: a Dataset for Recognising Phonological Properties in American Sign Language,Federico Tavella and Viktor Schlegel and Marta Romeo and Aphrodite Galata and Angelo Cangelosi,http://arxiv.org/pdf/2203.06096v1
http://arxiv.org/abs/2204.04306v1,creativecommons.org/licenses/by/4.0/,MMTAfrica: Multilingual Machine Translation for African Languages,Chris C. Emezue and Bonaventure F. P. Dossou,http://arxiv.org/pdf/2204.04306v1
http://arxiv.org/abs/2205.10560v1,creativecommons.org/licenses/by/4.0/,Unsupervised Sign Language Phoneme Clustering using HamNoSys Notation,Boris Mocialov and Graham Turner and Helen Hastie,http://arxiv.org/pdf/2205.10560v1
http://arxiv.org/abs/2203.13151v1,creativecommons.org/licenses/by/4.0/,Multi-armed bandits for online optimization of language model pre-training: the use case of dynamic masking,Iñigo Urteaga and Moulay-Zaïdane Draïdia and Tomer Lancewicki and Shahram Khadivi,http://arxiv.org/pdf/2203.13151v1
http://arxiv.org/abs/2104.07639v4,creativecommons.org/licenses/by/4.0/,Robust Optimization for Multilingual Translation with Imbalanced Data,Xian Li and Hongyu Gong,http://arxiv.org/pdf/2104.07639v4
http://arxiv.org/abs/2111.02643v5,creativecommons.org/licenses/by/4.0/,Response Generation with Context-Aware Prompt Learning,Xiaodong Gu and Kang Min Yoo and Sang-Woo Lee,http://arxiv.org/pdf/2111.02643v5
http://arxiv.org/abs/2202.05144v1,creativecommons.org/licenses/by/4.0/,InPars: Data Augmentation for Information Retrieval using Large Language Models,Luiz Bonifacio and Hugo Abonizio and Marzieh Fadaee and Rodrigo Nogueira,http://arxiv.org/pdf/2202.05144v1
http://arxiv.org/abs/2303.05221v1,creativecommons.org/licenses/by/4.0/,SEAM: An Integrated Activation-Coupled Model of Sentence Processing and Eye Movements in Reading,Maximilian M. Rabe and Dario Paape and Daniela Mertzen and Shravan Vasishth and Ralf Engbert,http://arxiv.org/pdf/2303.05221v1
http://arxiv.org/abs/2206.04327v1,creativecommons.org/licenses/by/4.0/,Language Identification for Austronesian Languages,Jonathan Dunn and Wikke Nijhof,http://arxiv.org/pdf/2206.04327v1
http://arxiv.org/abs/2111.14745v1,creativecommons.org/licenses/by/4.0/,A Simple Long-Tailed Recognition Baseline via Vision-Language Model,Teli Ma and Shijie Geng and Mengmeng Wang and Jing Shao and Jiasen Lu and Hongsheng Li and Peng Gao and Yu Qiao,http://arxiv.org/pdf/2111.14745v1
http://arxiv.org/abs/2204.11574v1,creativecommons.org/licenses/by/4.0/,A global analysis of metrics used for measuring performance in natural language processing,Kathrin Blagec and Georg Dorffner and Milad Moradi and Simon Ott and Matthias Samwald,http://arxiv.org/pdf/2204.11574v1
http://arxiv.org/abs/2210.10358v2,creativecommons.org/licenses/by/4.0/,Leveraging a New Spanish Corpus for Multilingual and Crosslingual Metaphor Detection,Elisa Sanchez-Bayona and Rodrigo Agerri,http://arxiv.org/pdf/2210.10358v2
http://arxiv.org/abs/2105.11321v1,creativecommons.org/licenses/by/4.0/,Neural Language Models for Nineteenth-Century English,Kasra Hosseini and Kaspar Beelen and Giovanni Colavizza and Mariona Coll Ardanuy,http://arxiv.org/pdf/2105.11321v1
http://arxiv.org/abs/2106.01023v1,creativecommons.org/licenses/by/4.0/,One Teacher is Enough? Pre-trained Language Model Distillation from Multiple Teachers,Chuhan Wu and Fangzhao Wu and Yongfeng Huang,http://arxiv.org/pdf/2106.01023v1
http://arxiv.org/abs/2205.11277v2,creativecommons.org/licenses/by/4.0/,When does Parameter-Efficient Transfer Learning Work for Machine Translation?,Ahmet Üstün and Asa Cooper Stickland,http://arxiv.org/pdf/2205.11277v2
http://arxiv.org/abs/2205.11747v1,creativecommons.org/licenses/by/4.0/,BabyBear: Cheap inference triage for expensive language models,Leila Khalili and Yao You and John Bohannon,http://arxiv.org/pdf/2205.11747v1
http://arxiv.org/abs/2210.17497v1,creativecommons.org/licenses/by/4.0/,Leveraging Pre-trained Models for Failure Analysis Triplets Generation,Kenneth Ezukwoke and Anis Hoayek and Mireille Batton-Hubert and Xavier Boucher and Pascal Gounet and Jerome Adrian,http://arxiv.org/pdf/2210.17497v1
http://arxiv.org/abs/2211.15593v1,creativecommons.org/licenses/by/4.0/,GPT-Neo for commonsense reasoning-a theoretical and practical lens,Rohan Kashyap and Vivek Kashyap and Narendra C. P,http://arxiv.org/pdf/2211.15593v1
http://arxiv.org/abs/2212.08192v1,creativecommons.org/licenses/by/4.0/,The KITMUS Test: Evaluating Knowledge Integration from Multiple Sources in Natural Language Understanding Systems,Akshatha Arodi and Martin Pömsl and Kaheer Suleman and Adam Trischler and Alexandra Olteanu and Jackie Chi Kit Cheung,http://arxiv.org/pdf/2212.08192v1
http://arxiv.org/abs/2212.05238v1,creativecommons.org/licenses/by/4.0/,Structured information extraction from complex scientific text with fine-tuned large language models,Alexander Dunn and John Dagdelen and Nicholas Walker and Sanghoon Lee and Andrew S. Rosen and Gerbrand Ceder and Kristin Persson and Anubhav Jain,http://arxiv.org/pdf/2212.05238v1
http://arxiv.org/abs/2210.16431v1,creativecommons.org/licenses/by/4.0/,DiMBERT: Learning Vision-Language Grounded Representations with Disentangled Multimodal-Attention,Fenglin Liu and Xian Wu and Shen Ge and Xuancheng Ren and Wei Fan and Xu Sun and Yuexian Zou,http://arxiv.org/pdf/2210.16431v1
http://arxiv.org/abs/2101.05783v2,creativecommons.org/licenses/by/4.0/,Persistent Anti-Muslim Bias in Large Language Models,Abubakar Abid and Maheen Farooqi and James Zou,http://arxiv.org/pdf/2101.05783v2
http://arxiv.org/abs/2110.02370v1,creativecommons.org/licenses/by/4.0/,Leveraging the Inductive Bias of Large Language Models for Abstract Textual Reasoning,Christopher Michael Rytting and David Wingate,http://arxiv.org/pdf/2110.02370v1
http://arxiv.org/abs/2206.11484v2,creativecommons.org/licenses/by/4.0/,Towards WinoQueer: Developing a Benchmark for Anti-Queer Bias in Large Language Models,Virginia K. Felkner and Ho-Chun Herbert Chang and Eugene Jang and Jonathan May,http://arxiv.org/pdf/2206.11484v2
http://arxiv.org/abs/2211.04486v1,creativecommons.org/licenses/by/4.0/,Active Example Selection for In-Context Learning,Yiming Zhang and Shi Feng and Chenhao Tan,http://arxiv.org/pdf/2211.04486v1
http://arxiv.org/abs/2302.14520v1,creativecommons.org/licenses/by/4.0/,Large Language Models Are State-of-the-Art Evaluators of Translation Quality,Tom Kocmi and Christian Federmann,http://arxiv.org/pdf/2302.14520v1
http://arxiv.org/abs/1805.04453v1,creativecommons.org/licenses/by/4.0/,Bootstrapping Multilingual Intent Models via Machine Translation for Dialog Automation,Nicholas Ruiz and Srinivas Bangalore and John Chen,http://arxiv.org/pdf/1805.04453v1
http://arxiv.org/abs/2108.03353v1,creativecommons.org/licenses/by/4.0/,Screen2Words: Automatic Mobile UI Summarization with Multimodal Learning,Bryan Wang and Gang Li and Xin Zhou and Zhourong Chen and Tovi Grossman and Yang Li,http://arxiv.org/pdf/2108.03353v1
http://arxiv.org/abs/2110.08512v1,creativecommons.org/licenses/by/4.0/,AugmentedCode: Examining the Effects of Natural Language Resources in Code Retrieval Models,Mehdi Bahrami and N. C. Shrikanth and Yuji Mizobuchi and Lei Liu and Masahiro Fukuyori and Wei-Peng Chen and Kazuki Munakata,http://arxiv.org/pdf/2110.08512v1
http://arxiv.org/abs/2202.11923v1,creativecommons.org/licenses/by/4.0/,Welcome to the Modern World of Pronouns: Identity-Inclusive Natural Language Processing beyond Gender,Anne Lauscher and Archie Crowley and Dirk Hovy,http://arxiv.org/pdf/2202.11923v1
http://arxiv.org/abs/2205.00445v1,creativecommons.org/licenses/by/4.0/,"MRKL Systems: A modular, neuro-symbolic architecture that combines large language models, external knowledge sources and discrete reasoning",Ehud Karpas and Omri Abend and Yonatan Belinkov and Barak Lenz and Opher Lieber and Nir Ratner and Yoav Shoham and Hofit Bata and Yoav Levine and Kevin Leyton-Brown and Dor Muhlgay and Noam Rozen and Erez Schwartz and Gal Shachaf and Shai Shalev-Shwartz and Amnon Shashua and Moshe Tenenholtz,http://arxiv.org/pdf/2205.00445v1
http://arxiv.org/abs/2210.03162v1,creativecommons.org/licenses/by/4.0/,Prompt Compression and Contrastive Conditioning for Controllability and Toxicity Reduction in Language Models,David Wingate and Mohammad Shoeybi and Taylor Sorensen,http://arxiv.org/pdf/2210.03162v1
http://arxiv.org/abs/2210.03575v2,creativecommons.org/licenses/by/4.0/,Are Representations Built from the Ground Up? An Empirical Examination of Local Composition in Language Models,Emmy Liu and Graham Neubig,http://arxiv.org/pdf/2210.03575v2
http://arxiv.org/abs/2212.10770v1,creativecommons.org/licenses/by/4.0/,ImPaKT: A Dataset for Open-Schema Knowledge Base Construction,Luke Vilnis and Zach Fisher and Bhargav Kanagal and Patrick Murray and Sumit Sanghai,http://arxiv.org/pdf/2212.10770v1
http://arxiv.org/abs/2302.12069v1,creativecommons.org/licenses/by/4.0/,Deep learning model for Mongolian Citizens Feedback Analysis using Word Vector Embeddings,Zolzaya Dashdorj and Tsetsentsengel Munkhbayar and Stanislav Grigorev,http://arxiv.org/pdf/2302.12069v1
http://arxiv.org/abs/2303.06854v1,creativecommons.org/licenses/by/4.0/,Robust Contrastive Language-Image Pretraining against Adversarial Attacks,Wenhan Yang and Baharan Mirzasoleiman,http://arxiv.org/pdf/2303.06854v1
http://arxiv.org/abs/2303.14100v1,creativecommons.org/licenses/by/4.0/,Errors are Useful Prompts: Instruction Guided Task Programming with Verifier-Assisted Iterative Prompting,Marta Skreta and Naruki Yoshikawa and Sebastian Arellano-Rubach and Zhi Ji and Lasse Bjørn Kristensen and Kourosh Darvish and Alán Aspuru-Guzik and Florian Shkurti and Animesh Garg,http://arxiv.org/pdf/2303.14100v1
http://arxiv.org/abs/2303.15727v1,creativecommons.org/licenses/by/4.0/,Evaluation of ChatGPT for NLP-based Mental Health Applications,Bishal Lamichhane,http://arxiv.org/pdf/2303.15727v1
http://arxiv.org/abs/2209.12106v1,creativecommons.org/licenses/by/4.0/,Moral Mimicry: Large Language Models Produce Moral Rationalizations Tailored to Political Identity,Gabriel Simmons,http://arxiv.org/pdf/2209.12106v1
http://arxiv.org/abs/2304.10592v1,creativecommons.org/licenses/by/4.0/,MiniGPT-4: Enhancing Vision-Language Understanding with Advanced Large Language Models,Deyao Zhu and Jun Chen and Xiaoqian Shen and Xiang Li and Mohamed Elhoseiny,http://arxiv.org/pdf/2304.10592v1
http://arxiv.org/abs/2106.07716v1,creativecommons.org/licenses/by/4.0/,Overcoming Domain Mismatch in Low Resource Sequence-to-Sequence ASR Models using Hybrid Generated Pseudotranscripts,Chak-Fai Li and Francis Keith and William Hartmann and Matthew Snover and Owen Kimball,http://arxiv.org/pdf/2106.07716v1
http://arxiv.org/abs/2010.11856v3,creativecommons.org/licenses/by/4.0/,XOR QA: Cross-lingual Open-Retrieval Question Answering,Akari Asai and Jungo Kasai and Jonathan H. Clark and Kenton Lee and Eunsol Choi and Hannaneh Hajishirzi,http://arxiv.org/pdf/2010.11856v3
http://arxiv.org/abs/2301.01967v1,creativecommons.org/licenses/by/4.0/,A Survey of Code-switching: Linguistic and Social Perspectives for Language Technologies,A. Seza Doğruöz and Sunayana Sitaram and Barbara E. Bullock and Almeida Jacqueline Toribio,http://arxiv.org/pdf/2301.01967v1
http://arxiv.org/abs/2210.16147v3,creativecommons.org/licenses/by/4.0/,Modeling structure-building in the brain with CCG parsing and large language models,Miloš Stanojević and Jonathan R. Brennan and Donald Dunagan and Mark Steedman and John T. Hale,http://arxiv.org/pdf/2210.16147v3
http://arxiv.org/abs/2302.09458v1,creativecommons.org/licenses/by/4.0/,Learning Language Representations with Logical Inductive Bias,Jianshu Chen,http://arxiv.org/pdf/2302.09458v1
http://arxiv.org/abs/2102.13136v1,creativecommons.org/licenses/by/4.0/,Automated essay scoring using efficient transformer-based language models,Christopher M Ormerod and Akanksha Malhotra and Amir Jafari,http://arxiv.org/pdf/2102.13136v1
http://arxiv.org/abs/2103.10360v2,creativecommons.org/licenses/by/4.0/,GLM: General Language Model Pretraining with Autoregressive Blank Infilling,Zhengxiao Du and Yujie Qian and Xiao Liu and Ming Ding and Jiezhong Qiu and Zhilin Yang and Jie Tang,http://arxiv.org/pdf/2103.10360v2
http://arxiv.org/abs/2109.10847v2,creativecommons.org/licenses/by/4.0/,Small-Bench NLP: Benchmark for small single GPU trained models in Natural Language Processing,Kamal Raj Kanakarajan and Bhuvana Kundumani and Malaikannan Sankarasubbu,http://arxiv.org/pdf/2109.10847v2
http://arxiv.org/abs/2201.07434v2,creativecommons.org/licenses/by/4.0/,Interpreting Arabic Transformer Models,Ahmed Abdelali and Nadir Durrani and Fahim Dalvi and Hassan Sajjad,http://arxiv.org/pdf/2201.07434v2
http://arxiv.org/abs/2207.14255v1,creativecommons.org/licenses/by/4.0/,Efficient Training of Language Models to Fill in the Middle,Mohammad Bavarian and Heewoo Jun and Nikolas Tezak and John Schulman and Christine McLeavey and Jerry Tworek and Mark Chen,http://arxiv.org/pdf/2207.14255v1
http://arxiv.org/abs/2302.00070v1,creativecommons.org/licenses/by/4.0/,Debiasing Vision-Language Models via Biased Prompts,Ching-Yao Chuang and Varun Jampani and Yuanzhen Li and Antonio Torralba and Stefanie Jegelka,http://arxiv.org/pdf/2302.00070v1
http://arxiv.org/abs/2303.17612v2,creativecommons.org/licenses/by/4.0/,"oBERTa: Improving Sparse Transfer Learning via improved initialization, distillation, and pruning regimes",Daniel Campos and Alexandre Marques and Mark Kurtz and ChengXiang Zhai,http://arxiv.org/pdf/2303.17612v2 | |
http://arxiv.org/abs/2205.12005v2,creativecommons.org/licenses/by/4.0/,mPLUG: Effective and Efficient Vision-Language Learning by Cross-modal Skip-connections,Chenliang Li and Haiyang Xu and Junfeng Tian and Wei Wang and Ming Yan and Bin Bi and Jiabo Ye and Hehong Chen and Guohai Xu and Zheng Cao and Ji Zhang and Songfang Huang and Fei Huang and Jingren Zhou and Luo Si,http://arxiv.org/pdf/2205.12005v2 | |
http://arxiv.org/abs/2301.13294v2,creativecommons.org/licenses/by/4.0/,Adaptive Machine Translation with Large Language Models,Yasmin Moslem and Rejwanul Haque and John D. Kelleher and Andy Way,http://arxiv.org/pdf/2301.13294v2 | |
http://arxiv.org/abs/2203.07785v1,creativecommons.org/licenses/by/4.0/,The Ghost in the Machine has an American accent: value conflict in GPT-3,Rebecca L Johnson and Giada Pistilli and Natalia Menédez-González and Leslye Denisse Dias Duran and Enrico Panai and Julija Kalpokiene and Donald Jay Bertulfo,http://arxiv.org/pdf/2203.07785v1 | |
http://arxiv.org/abs/2010.11973v1,creativecommons.org/licenses/by/4.0/,Rediscovering the Slavic Continuum in Representations Emerging from Neural Models of Spoken Language Identification,Badr M. Abdullah and Jacek Kudera and Tania Avgustinova and Bernd Möbius and Dietrich Klakow,http://arxiv.org/pdf/2010.11973v1 | |
http://arxiv.org/abs/2211.08411v1,creativecommons.org/licenses/by/4.0/,Large Language Models Struggle to Learn Long-Tail Knowledge,Nikhil Kandpal and Haikang Deng and Adam Roberts and Eric Wallace and Colin Raffel,http://arxiv.org/pdf/2211.08411v1 | |
http://arxiv.org/abs/2301.00303v1,creativecommons.org/licenses/by/4.0/,Rethinking with Retrieval: Faithful Large Language Model Inference,Hangfeng He and Hongming Zhang and Dan Roth,http://arxiv.org/pdf/2301.00303v1 | |
http://arxiv.org/abs/2303.08559v1,creativecommons.org/licenses/by/4.0/,"Large Language Model Is Not a Good Few-shot Information Extractor, but a Good Reranker for Hard Samples!",Yubo Ma and Yixin Cao and YongChing Hong and Aixin Sun,http://arxiv.org/pdf/2303.08559v1 | |
http://arxiv.org/abs/2303.09136v1,creativecommons.org/licenses/by/4.0/,A Short Survey of Viewing Large Language Models in Legal Aspect,Zhongxiang Sun,http://arxiv.org/pdf/2303.09136v1 | |
http://arxiv.org/abs/2304.02868v1,creativecommons.org/licenses/by/4.0/,Can Large Language Models Play Text Games Well? Current State-of-the-Art and Open Questions,Chen Feng Tsai and Xiaochen Zhou and Sierra S. Liu and Jing Li and Mo Yu and Hongyuan Mei,http://arxiv.org/pdf/2304.02868v1 | |
http://arxiv.org/abs/2304.10611v1,creativecommons.org/licenses/by/4.0/,Multi-aspect Repetition Suppression and Content Moderation of Large Language Models,Minghui Zhang and Alex Sokolov and Weixin Cai and Si-Qing Chen,http://arxiv.org/pdf/2304.10611v1 | |
http://arxiv.org/abs/2205.12650v2,creativecommons.org/licenses/by/4.0/,Few-shot Reranking for Multi-hop QA via Language Model Prompting,Muhammad Khalifa and Lajanugen Logeswaran and Moontae Lee and Honglak Lee and Lu Wang,http://arxiv.org/pdf/2205.12650v2 | |
http://arxiv.org/abs/2212.10933v1,creativecommons.org/licenses/by/4.0/,Resolving Indirect Referring Expressions for Entity Selection,Mohammad Javad Hosseini and Filip Radlinski and Silvia Pareti and Annie Louis,http://arxiv.org/pdf/2212.10933v1 | |
http://arxiv.org/abs/2301.11596v2,creativecommons.org/licenses/by/4.0/,ThoughtSource: A central hub for large language model reasoning data,Simon Ott and Konstantin Hebenstreit and Valentin Liévin and Christoffer Egeberg Hother and Milad Moradi and Maximilian Mayrhauser and Robert Praas and Ole Winther and Matthias Samwald,http://arxiv.org/pdf/2301.11596v2 | |
http://arxiv.org/abs/2304.02138v2,creativecommons.org/licenses/by/4.0/,Geotechnical Parrot Tales (GPT): Harnessing Large Language Models in geotechnical engineering,Krishna Kumar,http://arxiv.org/pdf/2304.02138v2 | |
http://arxiv.org/abs/2010.12626v1,creativecommons.org/licenses/by/4.0/,Topic Modeling with Contextualized Word Representation Clusters,Laure Thompson and David Mimno,http://arxiv.org/pdf/2010.12626v1 | |
http://arxiv.org/abs/2203.10945v1,creativecommons.org/licenses/by/4.0/,AraBART: a Pretrained Arabic Sequence-to-Sequence Model for Abstractive Summarization,Moussa Kamal Eddine and Nadi Tomeh and Nizar Habash and Joseph Le Roux and Michalis Vazirgiannis,http://arxiv.org/pdf/2203.10945v1 | |
http://arxiv.org/abs/2010.14649v2,creativecommons.org/licenses/by/4.0/,Learning Contextualised Cross-lingual Word Embeddings and Alignments for Extremely Low-Resource Languages Using Parallel Corpora,Takashi Wada and Tomoharu Iwata and Yuji Matsumoto and Timothy Baldwin and Jey Han Lau,http://arxiv.org/pdf/2010.14649v2 | |
http://arxiv.org/abs/2109.12068v4,creativecommons.org/licenses/by/4.0/,AraT5: Text-to-Text Transformers for Arabic Language Generation,El Moatez Billah Nagoudi and AbdelRahim Elmadany and Muhammad Abdul-Mageed,http://arxiv.org/pdf/2109.12068v4 | |
http://arxiv.org/abs/2112.06905v2,creativecommons.org/licenses/by/4.0/,GLaM: Efficient Scaling of Language Models with Mixture-of-Experts,Nan Du and Yanping Huang and Andrew M. Dai and Simon Tong and Dmitry Lepikhin and Yuanzhong Xu and Maxim Krikun and Yanqi Zhou and Adams Wei Yu and Orhan Firat and Barret Zoph and Liam Fedus and Maarten Bosma and Zongwei Zhou and Tao Wang and Yu Emma Wang and Kellie Webster and Marie Pellat and Kevin Robinson and Kathleen Meier-Hellstern and Toju Duke and Lucas Dixon and Kun Zhang and Quoc V Le and Yonghui Wu and Zhifeng Chen and Claire Cui,http://arxiv.org/pdf/2112.06905v2 | |
http://arxiv.org/abs/2211.08412v1,creativecommons.org/licenses/by/4.0/,Evaluating the Factual Consistency of Large Language Models Through Summarization,Derek Tam and Anisha Mascarenhas and Shiyue Zhang and Sarah Kwan and Mohit Bansal and Colin Raffel,http://arxiv.org/pdf/2211.08412v1 | |
http://arxiv.org/abs/2303.06748v1,creativecommons.org/licenses/by/4.0/,DTT: An Example-Driven Tabular Transformer by Leveraging Large Language Models,Arash Dargahi Nobari and Davood Rafiei,http://arxiv.org/pdf/2303.06748v1 | |
http://arxiv.org/abs/2209.11035v2,creativecommons.org/licenses/by/4.0/,MonoByte: A Pool of Monolingual Byte-level Language Models,Hugo Abonizio and Leandro Rodrigues de Souza and Roberto Lotufo and Rodrigo Nogueira,http://arxiv.org/pdf/2209.11035v2 | |
http://arxiv.org/abs/2106.13474v2,creativecommons.org/licenses/by/4.0/,"Adapt-and-Distill: Developing Small, Fast and Effective Pretrained Language Models for Domains",Yunzhi Yao and Shaohan Huang and Wenhui Wang and Li Dong and Furu Wei,http://arxiv.org/pdf/2106.13474v2 | |
http://arxiv.org/abs/1910.14243v1,creativecommons.org/licenses/by/4.0/,DiaNet: BERT and Hierarchical Attention Multi-Task Learning of Fine-Grained Dialect,Muhammad Abdul-Mageed and Chiyu Zhang and AbdelRahim Elmadany and Arun Rajendran and Lyle Ungar,http://arxiv.org/pdf/1910.14243v1 | |
http://arxiv.org/abs/2011.11536v1,creativecommons.org/licenses/by/4.0/,Studying Taxonomy Enrichment on Diachronic WordNet Versions,Irina Nikishina and Alexander Panchenko and Varvara Logacheva and Natalia Loukachevitch,http://arxiv.org/pdf/2011.11536v1 | |
http://arxiv.org/abs/2210.05033v1,creativecommons.org/licenses/by/4.0/,Multilingual Representation Distillation with Contrastive Learning,Weiting Tan and Kevin Heffernan and Holger Schwenk and Philipp Koehn,http://arxiv.org/pdf/2210.05033v1 | |
http://arxiv.org/abs/2302.12746v2,creativecommons.org/licenses/by/4.0/,Spanish Built Factual Freectianary (Spanish-BFF): the first AI-generated free dictionary,Miguel Ortega-Martín and Óscar García-Sierra and Alfonso Ardoiz and Juan Carlos Armenteros and Jorge Álvarez and Adrián Alonso,http://arxiv.org/pdf/2302.12746v2 | |
http://arxiv.org/abs/2103.07449v3,creativecommons.org/licenses/by/4.0/,Cooperative Self-training of Machine Reading Comprehension,Hongyin Luo and Shang-Wen Li and Mingye Gao and Seunghak Yu and James Glass,http://arxiv.org/pdf/2103.07449v3 | |
http://arxiv.org/abs/2106.10076v2,creativecommons.org/licenses/by/4.0/,Label prompt for multi-label text classification,Rui Song and Xingbing Chen and Zelong Liu and Haining An and Zhiqi Zhang and Xiaoguang Wang and Hao Xu,http://arxiv.org/pdf/2106.10076v2 | |
http://arxiv.org/abs/2202.04350v1,creativecommons.org/licenses/by/4.0/,pNLP-Mixer: an Efficient all-MLP Architecture for Language,Francesco Fusco and Damian Pascual and Peter Staar,http://arxiv.org/pdf/2202.04350v1 | |
http://arxiv.org/abs/2209.11902v1,creativecommons.org/licenses/by/4.0/,Learning Chess With Language Models and Transformers,Michael DeLeo and Erhan Guven,http://arxiv.org/pdf/2209.11902v1 | |
http://arxiv.org/abs/2210.04186v2,creativecommons.org/licenses/by/4.0/,Analogy Generation by Prompting Large Language Models: A Case Study of InstructGPT,Bhavya Bhavya and Jinjun Xiong and Chengxiang Zhai,http://arxiv.org/pdf/2210.04186v2 | |
http://arxiv.org/abs/2211.11216v2,creativecommons.org/licenses/by/4.0/,Exploring the Efficacy of Pre-trained Checkpoints in Text-to-Music Generation Task,Shangda Wu and Maosong Sun,http://arxiv.org/pdf/2211.11216v2 | |
http://arxiv.org/abs/2010.09803v2,creativecommons.org/licenses/by/4.0/,Adversarial Training for Code Retrieval with Question-Description Relevance Regularization,Jie Zhao and Huan Sun,http://arxiv.org/pdf/2010.09803v2 | |
http://arxiv.org/abs/2105.15014v1,creativecommons.org/licenses/by/4.0/,Singing Language Identification using a Deep Phonotactic Approach,Lenny Renault and Andrea Vaglio and Romain Hennequin,http://arxiv.org/pdf/2105.15014v1 | |
http://arxiv.org/abs/2204.08887v1,creativecommons.org/licenses/by/4.0/,Cross-Lingual Phrase Retrieval,Heqi Zheng and Xiao Zhang and Zewen Chi and Heyan Huang and Tan Yan and Tian Lan and Wei Wei and Xian-Ling Mao,http://arxiv.org/pdf/2204.08887v1 | |
http://arxiv.org/abs/2210.13002v1,creativecommons.org/licenses/by/4.0/,An Empirical Revisiting of Linguistic Knowledge Fusion in Language Understanding Tasks,Changlong Yu and Tianyi Xiao and Lingpeng Kong and Yangqiu Song and Wilfred Ng,http://arxiv.org/pdf/2210.13002v1 | |
http://arxiv.org/abs/2301.10481v2,creativecommons.org/licenses/by/4.0/,FewShotTextGCN: K-hop neighborhood regularization for few-shot learning on graphs,Niels van der Heijden and Ekaterina Shutova and Helen Yannakoudakis,http://arxiv.org/pdf/2301.10481v2 | |
http://arxiv.org/abs/2207.12759v1,creativecommons.org/licenses/by/4.0/,Training Effective Neural Sentence Encoders from Automatically Mined Paraphrases,Sławomir Dadas,http://arxiv.org/pdf/2207.12759v1 | |
http://arxiv.org/abs/2208.11981v1,creativecommons.org/licenses/by/4.0/,On Reality and the Limits of Language Data,Nigel H. Collier and Fangyu Liu and Ehsan Shareghi,http://arxiv.org/pdf/2208.11981v1 | |
http://arxiv.org/abs/2210.06576v1,creativecommons.org/licenses/by/4.0/,DATScore: Evaluating Translation with Data Augmented Translations,Moussa Kamal Eddine and Guokan Shang and Michalis Vazirgiannis,http://arxiv.org/pdf/2210.06576v1 | |
http://arxiv.org/abs/2303.15621v2,creativecommons.org/licenses/by/4.0/,ChatGPT as a Factual Inconsistency Evaluator for Text Summarization,Zheheng Luo and Qianqian Xie and Sophia Ananiadou,http://arxiv.org/pdf/2303.15621v2 | |
http://arxiv.org/abs/2211.11081v1,creativecommons.org/licenses/by/4.0/,A Theory of Unsupervised Translation Motivated by Understanding Animal Communication,Shafi Goldwasser and David F. Gruber and Adam Tauman Kalai and Orr Paradise,http://arxiv.org/pdf/2211.11081v1 | |
http://arxiv.org/abs/2206.01335v2,creativecommons.org/licenses/by/4.0/,"Code Generation Tools (Almost) for Free? A Study of Few-Shot, Pre-Trained Language Models on Code",Patrick Bareiß and Beatriz Souza and Marcelo d'Amorim and Michael Pradel,http://arxiv.org/pdf/2206.01335v2 | |
http://arxiv.org/abs/2003.04866v1,creativecommons.org/licenses/by/4.0/,Multi-SimLex: A Large-Scale Evaluation of Multilingual and Cross-Lingual Lexical Semantic Similarity,Ivan Vulić and Simon Baker and Edoardo Maria Ponti and Ulla Petti and Ira Leviant and Kelly Wing and Olga Majewska and Eden Bar and Matt Malone and Thierry Poibeau and Roi Reichart and Anna Korhonen,http://arxiv.org/pdf/2003.04866v1 | |
http://arxiv.org/abs/2202.07894v1,creativecommons.org/licenses/by/4.0/,Knowledge Transfer from Large-scale Pretrained Language Models to End-to-end Speech Recognizers,Yotaro Kubo and Shigeki Karita and Michiel Bacchiani,http://arxiv.org/pdf/2202.07894v1 | |
http://arxiv.org/abs/2304.09842v1,creativecommons.org/licenses/by/4.0/,Chameleon: Plug-and-Play Compositional Reasoning with Large Language Models,Pan Lu and Baolin Peng and Hao Cheng and Michel Galley and Kai-Wei Chang and Ying Nian Wu and Song-Chun Zhu and Jianfeng Gao,http://arxiv.org/pdf/2304.09842v1 | |
http://arxiv.org/abs/2304.11679v1,creativecommons.org/licenses/by/4.0/,Domain Mastery Benchmark: An Ever-Updating Benchmark for Evaluating Holistic Domain Knowledge of Large Language Model--A Preliminary Release,Zhouhong Gu and Xiaoxuan Zhu and Haoning Ye and Lin Zhang and Zhuozhi Xiong and Zihan Li and Qianyu He and Sihang Jiang and Hongwei Feng and Yanghua Xiao,http://arxiv.org/pdf/2304.11679v1 | |
http://arxiv.org/abs/2301.09919v2,creativecommons.org/licenses/by/4.0/,Opportunities and Challenges in Neural Dialog Tutoring,Jakub Macina and Nico Daheim and Lingzhi Wang and Tanmay Sinha and Manu Kapur and Iryna Gurevych and Mrinmaya Sachan,http://arxiv.org/pdf/2301.09919v2 | |
http://arxiv.org/abs/2304.11082v1,creativecommons.org/licenses/by/4.0/,Fundamental Limitations of Alignment in Large Language Models,Yotam Wolf and Noam Wies and Yoav Levine and Amnon Shashua,http://arxiv.org/pdf/2304.11082v1 | |
http://arxiv.org/abs/2303.16749v1,creativecommons.org/licenses/by/4.0/,Improving Code Generation by Training with Natural Language Feedback,Angelica Chen and Jérémy Scheurer and Tomasz Korbak and Jon Ander Campos and Jun Shern Chan and Samuel R. Bowman and Kyunghyun Cho and Ethan Perez,http://arxiv.org/pdf/2303.16749v1 | |
http://arxiv.org/abs/2304.02017v5,creativecommons.org/licenses/by/4.0/,"Unlocking the Potential of ChatGPT: A Comprehensive Exploration of its Applications, Advantages, Limitations, and Future Directions in Natural Language Processing",Walid Hariri,http://arxiv.org/pdf/2304.02017v5 | |
http://arxiv.org/abs/2106.01091v1,creativecommons.org/licenses/by/4.0/,belabBERT: a Dutch RoBERTa-based language model applied to psychiatric classification,Joppe Wouts and Janna de Boer and Alban Voppel and Sanne Brederoo and Sander van Splunter and Iris Sommer,http://arxiv.org/pdf/2106.01091v1 | |
http://arxiv.org/abs/2203.04904v3,creativecommons.org/licenses/by/4.0/,Rethinking Task Sampling for Few-shot Vision-Language Transfer Learning,Zhenhailong Wang and Hang Yu and Manling Li and Han Zhao and Heng Ji,http://arxiv.org/pdf/2203.04904v3 | |
http://arxiv.org/abs/2103.14583v3,creativecommons.org/licenses/by/4.0/,Leveraging pre-trained representations to improve access to untranscribed speech from endangered languages,Nay San and Martijn Bartelds and Mitchell Browne and Lily Clifford and Fiona Gibson and John Mansfield and David Nash and Jane Simpson and Myfany Turpin and Maria Vollmer and Sasha Wilmoth and Dan Jurafsky,http://arxiv.org/pdf/2103.14583v3 | |
http://arxiv.org/abs/2205.12673v2,creativecommons.org/licenses/by/4.0/,InstructDial: Improving Zero and Few-shot Generalization in Dialogue through Instruction Tuning,Prakhar Gupta and Cathy Jiao and Yi-Ting Yeh and Shikib Mehri and Maxine Eskenazi and Jeffrey P. Bigham,http://arxiv.org/pdf/2205.12673v2 | |
http://arxiv.org/abs/2211.11875v1,creativecommons.org/licenses/by/4.0/,Enhancing Self-Consistency and Performance of Pre-Trained Language Models through Natural Language Inference,Eric Mitchell and Joseph J. Noh and Siyan Li and William S. Armstrong and Ananth Agarwal and Patrick Liu and Chelsea Finn and Christopher D. Manning,http://arxiv.org/pdf/2211.11875v1 | |
http://arxiv.org/abs/2302.06560v1,creativecommons.org/licenses/by/4.0/,Large Scale Multi-Lingual Multi-Modal Summarization Dataset,Yash Verma and Anubhav Jangra and Raghvendra Kumar and Sriparna Saha,http://arxiv.org/pdf/2302.06560v1 | |
http://arxiv.org/abs/2212.04088v3,creativecommons.org/licenses/by/4.0/,LLM-Planner: Few-Shot Grounded Planning for Embodied Agents with Large Language Models,Chan Hee Song and Jiaman Wu and Clayton Washington and Brian M. Sadler and Wei-Lun Chao and Yu Su,http://arxiv.org/pdf/2212.04088v3 | |
http://arxiv.org/abs/2108.03867v1,creativecommons.org/licenses/by/4.0/,Benchmarking Multi-Task Learning for Sentiment Analysis and Offensive Language Identification in Under-Resourced Dravidian Languages,Adeep Hande and Siddhanth U Hegde and Ruba Priyadharshini and Rahul Ponnusamy and Prasanna Kumar Kumaresan and Sajeetha Thavareesan and Bharathi Raja Chakravarthi,http://arxiv.org/pdf/2108.03867v1 | |
http://arxiv.org/abs/2205.14660v1,creativecommons.org/licenses/by/4.0/,SFE-AI at SemEval-2022 Task 11: Low-Resource Named Entity Recognition using Large Pre-trained Language Models,Changyu Hou and Jun Wang and Yixuan Qiao and Peng Jiang and Peng Gao and Guotong Xie and Qizhi Lin and Xiaopeng Wang and Xiandi Jiang and Benqi Wang and Qifeng Xiao,http://arxiv.org/pdf/2205.14660v1 | |
http://arxiv.org/abs/2211.01180v2,creativecommons.org/licenses/by/4.0/,"M-SpeechCLIP: Leveraging Large-Scale, Pre-Trained Models for Multilingual Speech to Image Retrieval",Layne Berry and Yi-Jen Shih and Hsuan-Fu Wang and Heng-Jui Chang and Hung-yi Lee and David Harwath,http://arxiv.org/pdf/2211.01180v2 | |
http://arxiv.org/abs/2212.10534v1,creativecommons.org/licenses/by/4.0/,DISCO: Distilling Phrasal Counterfactuals with Large Language Models,Zeming Chen and Qiyue Gao and Kyle Richardson and Antoine Bosselut and Ashish Sabharwal,http://arxiv.org/pdf/2212.10534v1 | |
http://arxiv.org/abs/2303.14951v1,creativecommons.org/licenses/by/4.0/,Improving Contextualized Topic Models with Negative Sampling,Suman Adhya and Avishek Lahiri and Debarshi Kumar Sanyal and Partha Pratim Das,http://arxiv.org/pdf/2303.14951v1 | |
http://arxiv.org/abs/2211.05110v1,creativecommons.org/licenses/by/4.0/,Large Language Models with Controllable Working Memory,Daliang Li and Ankit Singh Rawat and Manzil Zaheer and Xin Wang and Michal Lukasik and Andreas Veit and Felix Yu and Sanjiv Kumar,http://arxiv.org/pdf/2211.05110v1 | |
http://arxiv.org/abs/2302.09207v1,creativecommons.org/licenses/by/4.0/,RetVec: Resilient and Efficient Text Vectorizer,Elie Bursztein and Marina Zhang and Owen Vallis and Xinyu Jia and Alexey Kurakin,http://arxiv.org/pdf/2302.09207v1 | |
http://arxiv.org/abs/2110.01710v1,creativecommons.org/licenses/by/4.0/,PyTorrent: A Python Library Corpus for Large-scale Language Models,Mehdi Bahrami and N. C. Shrikanth and Shade Ruangwan and Lei Liu and Yuji Mizobuchi and Masahiro Fukuyori and Wei-Peng Chen and Kazuki Munakata and Tim Menzies,http://arxiv.org/pdf/2110.01710v1 | |
http://arxiv.org/abs/2111.00607v3,creativecommons.org/licenses/by/4.0/,A Systematic Investigation of Commonsense Knowledge in Large Language Models,Xiang Lorraine Li and Adhiguna Kuncoro and Jordan Hoffmann and Cyprien de Masson d'Autume and Phil Blunsom and Aida Nematzadeh,http://arxiv.org/pdf/2111.00607v3 | |
http://arxiv.org/abs/2208.14536v1,creativecommons.org/licenses/by/4.0/,MultiCoNER: A Large-scale Multilingual dataset for Complex Named Entity Recognition,Shervin Malmasi and Anjie Fang and Besnik Fetahu and Sudipta Kar and Oleg Rokhlenko,http://arxiv.org/pdf/2208.14536v1 | |
http://arxiv.org/abs/2212.10511v1,creativecommons.org/licenses/by/4.0/,When Not to Trust Language Models: Investigating Effectiveness and Limitations of Parametric and Non-Parametric Memories,Alex Mallen and Akari Asai and Victor Zhong and Rajarshi Das and Hannaneh Hajishirzi and Daniel Khashabi,http://arxiv.org/pdf/2212.10511v1 | |
http://arxiv.org/abs/2304.06712v1,creativecommons.org/licenses/by/4.0/,What does CLIP know about a red circle? Visual prompt engineering for VLMs,Aleksandar Shtedritski and Christian Rupprecht and Andrea Vedaldi,http://arxiv.org/pdf/2304.06712v1 | |
http://arxiv.org/abs/2304.10548v1,creativecommons.org/licenses/by/4.0/,Supporting Qualitative Analysis with Large Language Models: Combining Codebook with GPT-3 for Deductive Coding,Ziang Xiao and Xingdi Yuan and Q. Vera Liao and Rania Abdelghani and Pierre-Yves Oudeyer,http://arxiv.org/pdf/2304.10548v1 | |
http://arxiv.org/abs/1911.00637v1,creativecommons.org/licenses/by/4.0/,Sentence-Level BERT and Multi-Task Learning of Age and Gender in Social Media,Muhammad Abdul-Mageed and Chiyu Zhang and Arun Rajendran and AbdelRahim Elmadany and Michael Przystupa and Lyle Ungar,http://arxiv.org/pdf/1911.00637v1 | |
http://arxiv.org/abs/2012.08146v1,creativecommons.org/licenses/by/4.0/,Generation of complex database queries and API calls from natural language utterances,Amol Kelkar and Nachiketa Rajpurohit and Utkarsh Mittal and Peter Relan,http://arxiv.org/pdf/2012.08146v1 | |
http://arxiv.org/abs/2107.06483v1,creativecommons.org/licenses/by/4.0/,From Machine Translation to Code-Switching: Generating High-Quality Code-Switched Text,Ishan Tarunesh and Syamantak Kumar and Preethi Jyothi,http://arxiv.org/pdf/2107.06483v1 | |
http://arxiv.org/abs/2110.03047v1,creativecommons.org/licenses/by/4.0/,Integrating Categorical Features in End-to-End ASR,Rongqing Huang,http://arxiv.org/pdf/2110.03047v1 | |
http://arxiv.org/abs/2110.03252v1,creativecommons.org/licenses/by/4.0/,Layer-wise Pruning of Transformer Attention Heads for Efficient Language Modeling,Kyuhong Shim and Iksoo Choi and Wonyong Sung and Jungwook Choi,http://arxiv.org/pdf/2110.03252v1 | |
http://arxiv.org/abs/2110.13229v2,creativecommons.org/licenses/by/4.0/,Distributionally Robust Recurrent Decoders with Random Network Distillation,Antonio Valerio Miceli-Barone and Alexandra Birch and Rico Sennrich,http://arxiv.org/pdf/2110.13229v2 | |
http://arxiv.org/abs/2202.05209v1,creativecommons.org/licenses/by/4.0/,Improving Automatic Speech Recognition for Non-Native English with Transfer Learning and Language Model Decoding,Peter Sullivan and Toshiko Shibano and Muhammad Abdul-Mageed,http://arxiv.org/pdf/2202.05209v1 | |
http://arxiv.org/abs/2204.04289v1,creativecommons.org/licenses/by/4.0/,Towards Understanding Large-Scale Discourse Structures in Pre-Trained and Fine-Tuned Language Models,Patrick Huber and Giuseppe Carenini,http://arxiv.org/pdf/2204.04289v1 | |
http://arxiv.org/abs/2205.12702v3,creativecommons.org/licenses/by/4.0/,Detecting Label Errors by using Pre-Trained Language Models,Derek Chong and Jenny Hong and Christopher D. Manning,http://arxiv.org/pdf/2205.12702v3 | |
http://arxiv.org/abs/2206.10781v1,creativecommons.org/licenses/by/4.0/,Efficient and effective training of language and graph neural network models,Vassilis N. Ioannidis and Xiang Song and Da Zheng and Houyu Zhang and Jun Ma and Yi Xu and Belinda Zeng and Trishul Chilimbi and George Karypis,http://arxiv.org/pdf/2206.10781v1 | |
http://arxiv.org/abs/2208.03713v1,creativecommons.org/licenses/by/4.0/,Study of Encoder-Decoder Architectures for Code-Mix Search Query Translation,Mandar Kulkarni and Soumya Chennabasavaraj and Nikesh Garera,http://arxiv.org/pdf/2208.03713v1 | |
http://arxiv.org/abs/2210.06150v1,creativecommons.org/licenses/by/4.0/,Annotating Norwegian Language Varieties on Twitter for Part-of-Speech,Petter Mæhlum and Andre Kåsen and Samia Touileb and Jeremy Barnes,http://arxiv.org/pdf/2210.06150v1 | |
http://arxiv.org/abs/2211.08073v2,creativecommons.org/licenses/by/4.0/,GLUE-X: Evaluating Natural Language Understanding Models from an Out-of-distribution Generalization Perspective,Linyi Yang and Shuibai Zhang and Libo Qin and Yafu Li and Yidong Wang and Hanmeng Liu and Jindong Wang and Xing Xie and Yue Zhang,http://arxiv.org/pdf/2211.08073v2 | |
http://arxiv.org/abs/2301.11507v1,creativecommons.org/licenses/by/4.0/,Semi-Parametric Video-Grounded Text Generation,Sungdong Kim and Jin-Hwa Kim and Jiyoung Lee and Minjoon Seo,http://arxiv.org/pdf/2301.11507v1 | |
http://arxiv.org/abs/2304.05221v1,creativecommons.org/licenses/by/4.0/,Towards preserving word order importance through Forced Invalidation,Hadeel Al-Negheimish and Pranava Madhyastha and Alessandra Russo,http://arxiv.org/pdf/2304.05221v1 | |
http://arxiv.org/abs/2102.02557v1,creativecommons.org/licenses/by/4.0/,Adaptive Semiparametric Language Models,Dani Yogatama and Cyprien de Masson d'Autume and Lingpeng Kong,http://arxiv.org/pdf/2102.02557v1 | |
http://arxiv.org/abs/2104.09617v1,creativecommons.org/licenses/by/4.0/,Operationalizing a National Digital Library: The Case for a Norwegian Transformer Model,Per E Kummervold and Javier de la Rosa and Freddy Wetjen and Svein Arne Brygfjeld,http://arxiv.org/pdf/2104.09617v1 | |
http://arxiv.org/abs/2107.05697v1,creativecommons.org/licenses/by/4.0/,Few-shot Language Coordination by Modeling Theory of Mind,Hao Zhu and Graham Neubig and Yonatan Bisk,http://arxiv.org/pdf/2107.05697v1 | |
http://arxiv.org/abs/2205.11503v1,creativecommons.org/licenses/by/4.0/,Prompt-and-Rerank: A Method for Zero-Shot and Few-Shot Arbitrary Textual Style Transfer with Small Language Models,Mirac Suzgun and Luke Melas-Kyriazi and Dan Jurafsky,http://arxiv.org/pdf/2205.11503v1 | |
http://arxiv.org/abs/2211.10000v1,creativecommons.org/licenses/by/4.0/,Protein language model rescue mutations highlight variant effects and structure in clinically relevant genes,Onuralp Soylemez and Pablo Cordero,http://arxiv.org/pdf/2211.10000v1 | |
http://arxiv.org/abs/2212.10622v1,creativecommons.org/licenses/by/4.0/,mFACE: Multilingual Summarization with Factual Consistency Evaluation,Roee Aharoni and Shashi Narayan and Joshua Maynez and Jonathan Herzig and Elizabeth Clark and Mirella Lapata,http://arxiv.org/pdf/2212.10622v1 | |
http://arxiv.org/abs/2111.14709v3,creativecommons.org/licenses/by/4.0/,Linguistic Knowledge in Data Augmentation for Natural Language Processing: An Example on Chinese Question Matching,Zhengxiang Wang,http://arxiv.org/pdf/2111.14709v3 | |
http://arxiv.org/abs/2204.13509v2,creativecommons.org/licenses/by/4.0/,On the Effect of Pretraining Corpora on In-context Learning by a Large-scale Language Model,Seongjin Shin and Sang-Woo Lee and Hwijeen Ahn and Sungdong Kim and HyoungSeok Kim and Boseop Kim and Kyunghyun Cho and Gichang Lee and Woomyoung Park and Jung-Woo Ha and Nako Sung,http://arxiv.org/pdf/2204.13509v2 | |
http://arxiv.org/abs/2209.15093v1,creativecommons.org/licenses/by/4.0/,Unpacking Large Language Models with Conceptual Consistency,Pritish Sahu and Michael Cogswell and Yunye Gong and Ajay Divakaran,http://arxiv.org/pdf/2209.15093v1 | |
http://arxiv.org/abs/2109.00165v1,creativecommons.org/licenses/by/4.0/,An Unsupervised Method for Building Sentence Simplification Corpora in Multiple Languages,Xinyu Lu and Jipeng Qiang and Yun Li and Yunhao Yuan and Yi Zhu,http://arxiv.org/pdf/2109.00165v1 | |
http://arxiv.org/abs/2206.12036v2,creativecommons.org/licenses/by/4.0/,SC-Ques: A Sentence Completion Question Dataset for English as a Second Language Learners,Qiongqiong Liu and Yaying Huang and Zitao Liu and Shuyan Huang and Jiahao Chen and Xiangyu Zhao and Guimin Lin and Yuyu Zhou and Weiqi Luo,http://arxiv.org/pdf/2206.12036v2 | |
http://arxiv.org/abs/2212.03551v5,creativecommons.org/licenses/by/4.0/,Talking About Large Language Models,Murray Shanahan,http://arxiv.org/pdf/2212.03551v5 | |
http://arxiv.org/abs/2301.05843v1,creativecommons.org/licenses/by/4.0/,Leveraging Large Language Models to Power Chatbots for Collecting User Self-Reported Data,Jing Wei and Sungdong Kim and Hyunhoon Jung and Young-Ho Kim,http://arxiv.org/pdf/2301.05843v1 | |
http://arxiv.org/abs/2303.17511v1,creativecommons.org/licenses/by/4.0/,On pitfalls (and advantages) of sophisticated large language models,Anna Strasser,http://arxiv.org/pdf/2303.17511v1 | |
http://arxiv.org/abs/2110.05367v1,creativecommons.org/licenses/by/4.0/,Improving Gender Fairness of Pre-Trained Language Models without Catastrophic Forgetting,Zahra Fatemi and Chen Xing and Wenhao Liu and Caiming Xiong,http://arxiv.org/pdf/2110.05367v1 | |
http://arxiv.org/abs/2210.14868v3,creativecommons.org/licenses/by/4.0/,Multi-lingual Evaluation of Code Generation Models,Ben Athiwaratkun and Sanjay Krishna Gouda and Zijian Wang and Xiaopeng Li and Yuchen Tian and Ming Tan and Wasi Uddin Ahmad and Shiqi Wang and Qing Sun and Mingyue Shang and Sujan Kumar Gonugondla and Hantian Ding and Varun Kumar and Nathan Fulton and Arash Farahani and Siddhartha Jain and Robert Giaquinto and Haifeng Qian and Murali Krishna Ramanathan and Ramesh Nallapati and Baishakhi Ray and Parminder Bhatia and Sudipta Sengupta and Dan Roth and Bing Xiang,http://arxiv.org/pdf/2210.14868v3 | |
http://arxiv.org/abs/2012.07528v1,creativecommons.org/licenses/by/4.0/,Disentangling Homophemes in Lip Reading using Perplexity Analysis,Souheil Fenghour and Daqing Chen and Kun Guo and Perry Xiao,http://arxiv.org/pdf/2012.07528v1 | |
http://arxiv.org/abs/2011.09031v4,creativecommons.org/licenses/by/4.0/,Predictions For Pre-training Language Models,Tong Guo,http://arxiv.org/pdf/2011.09031v4 | |
http://arxiv.org/abs/2209.10063v3,creativecommons.org/licenses/by/4.0/,Generate rather than Retrieve: Large Language Models are Strong Context Generators,Wenhao Yu and Dan Iter and Shuohang Wang and Yichong Xu and Mingxuan Ju and Soumya Sanyal and Chenguang Zhu and Michael Zeng and Meng Jiang,http://arxiv.org/pdf/2209.10063v3 | |
http://arxiv.org/abs/2108.01928v1,creativecommons.org/licenses/by/4.0/,How to Query Language Models?,Leonard Adolphs and Shehzaad Dhuliawala and Thomas Hofmann,http://arxiv.org/pdf/2108.01928v1 | |
http://arxiv.org/abs/2304.10946v1,creativecommons.org/licenses/by/4.0/,CancerGPT: Few-shot Drug Pair Synergy Prediction using Large Pre-trained Language Models,Tianhao Li and Sandesh Shetty and Advaith Kamath and Ajay Jaiswal and Xianqian Jiang and Ying Ding and Yejin Kim,http://arxiv.org/pdf/2304.10946v1 | |
http://arxiv.org/abs/2011.02323v1,creativecommons.org/licenses/by/4.0/,Indic-Transformers: An Analysis of Transformer Language Models for Indian Languages,Kushal Jain and Adwait Deshpande and Kumar Shridhar and Felix Laumann and Ayushman Dash,http://arxiv.org/pdf/2011.02323v1 | |
http://arxiv.org/abs/2101.05509v3,creativecommons.org/licenses/by/4.0/,Transformer-based Language Model Fine-tuning Methods for COVID-19 Fake News Detection,Ben Chen and Bin Chen and Dehong Gao and Qijin Chen and Chengfu Huo and Xiaonan Meng and Weijun Ren and Yang Zhou,http://arxiv.org/pdf/2101.05509v3 | |
http://arxiv.org/abs/2110.08484v2,creativecommons.org/licenses/by/4.0/,A Good Prompt Is Worth Millions of Parameters: Low-resource Prompt-based Learning for Vision-Language Models,Woojeong Jin and Yu Cheng and Yelong Shen and Weizhu Chen and Xiang Ren,http://arxiv.org/pdf/2110.08484v2 | |
http://arxiv.org/abs/1808.07364v3,creativecommons.org/licenses/by/4.0/,Neural Named Entity Recognition from Subword Units,Abdalghani Abujabal and Judith Gaspers,http://arxiv.org/pdf/1808.07364v3 | |
http://arxiv.org/abs/2304.11384v1,creativecommons.org/licenses/by/4.0/,An Empirical Study on Using Large Language Models for Multi-Intent Comment Generation,Mingyang Geng and Shangwen Wang and Dezun Dong and Haotian Wang and Ge Li and Zhi Jin and Xiaoguang Mao and Xiangke Liao,http://arxiv.org/pdf/2304.11384v1 | |
http://arxiv.org/abs/2205.13708v1,creativecommons.org/licenses/by/4.0/,HiJoNLP at SemEval-2022 Task 2: Detecting Idiomaticity of Multiword Expressions using Multilingual Pretrained Language Models,Minghuan Tan,http://arxiv.org/pdf/2205.13708v1 | |
http://arxiv.org/abs/1804.01768v1,creativecommons.org/licenses/by/4.0/,Chinese-Portuguese Machine Translation: A Study on Building Parallel Corpora from Comparable Texts,Siyou Liu and Longyue Wang and Chao-Hong Liu,http://arxiv.org/pdf/1804.01768v1 | |
http://arxiv.org/abs/2010.14534v1,creativecommons.org/licenses/by/4.0/,Unmasking Contextual Stereotypes: Measuring and Mitigating BERT's Gender Bias,Marion Bartl and Malvina Nissim and Albert Gatt,http://arxiv.org/pdf/2010.14534v1 | |
http://arxiv.org/abs/2012.15263v1,creativecommons.org/licenses/by/4.0/,Predicting cross-linguistic adjective order with information gain,William Dyer and Richard Futrell and Zoey Liu and Gregory Scontras,http://arxiv.org/pdf/2012.15263v1 | |
http://arxiv.org/abs/2101.12338v1,creativecommons.org/licenses/by/4.0/,Enabling Robots to Draw and Tell: Towards Visually Grounded Multimodal Description Generation,Ting Han and Sina Zarrieß,http://arxiv.org/pdf/2101.12338v1 | |
http://arxiv.org/abs/2104.05753v1,creativecommons.org/licenses/by/4.0/,Towards a parallel corpus of Portuguese and the Bantu language Emakhuwa of Mozambique,Felermino D. M. A. Ali and Andrew Caines and Jaimito L. A. Malavi,http://arxiv.org/pdf/2104.05753v1 | |
http://arxiv.org/abs/2205.10078v1,creativecommons.org/licenses/by/4.0/,Uzbek affix finite state machine for stemming,Maksud Sharipov and Ulugbek Salaev,http://arxiv.org/pdf/2205.10078v1 | |
http://arxiv.org/abs/2207.00758v1,creativecommons.org/licenses/by/4.0/,MIA 2022 Shared Task: Evaluating Cross-lingual Open-Retrieval Question Answering for 16 Diverse Languages,Akari Asai and Shayne Longpre and Jungo Kasai and Chia-Hsuan Lee and Rui Zhang and Junjie Hu and Ikuya Yamada and Jonathan H. Clark and Eunsol Choi,http://arxiv.org/pdf/2207.00758v1 | |
http://arxiv.org/abs/2304.07297v1,creativecommons.org/licenses/by/4.0/,Language Instructed Reinforcement Learning for Human-AI Coordination,Hengyuan Hu and Dorsa Sadigh,http://arxiv.org/pdf/2304.07297v1 | |
http://arxiv.org/abs/2103.10685v3,creativecommons.org/licenses/by/4.0/,Controllable Generation from Pre-trained Language Models via Inverse Prompting,Xu Zou and Da Yin and Qingyang Zhong and Ming Ding and Hongxia Yang and Zhilin Yang and Jie Tang,http://arxiv.org/pdf/2103.10685v3 | |
http://arxiv.org/abs/2205.05535v1,creativecommons.org/licenses/by/4.0/,Clinical Prompt Learning with Frozen Language Models,Niall Taylor and Yi Zhang and Dan Joyce and Alejo Nevado-Holgado and Andrey Kormilitzin,http://arxiv.org/pdf/2205.05535v1 | |
http://arxiv.org/abs/2205.09712v1,creativecommons.org/licenses/by/4.0/,Selection-Inference: Exploiting Large Language Models for Interpretable Logical Reasoning,Antonia Creswell and Murray Shanahan and Irina Higgins,http://arxiv.org/pdf/2205.09712v1 | |
http://arxiv.org/abs/2302.00763v1,creativecommons.org/licenses/by/4.0/,Collaborating with language models for embodied reasoning,Ishita Dasgupta and Christine Kaeser-Chen and Kenneth Marino and Arun Ahuja and Sheila Babayan and Felix Hill and Rob Fergus,http://arxiv.org/pdf/2302.00763v1 | |
http://arxiv.org/abs/2209.05946v1,creativecommons.org/licenses/by/4.0/,OmDet: Language-Aware Object Detection with Large-scale Vision-Language Multi-dataset Pre-training,Tiancheng Zhao and Peng Liu and Xiaopeng Lu and Kyusong Lee,http://arxiv.org/pdf/2209.05946v1 | |
http://arxiv.org/abs/2211.03730v1,creativecommons.org/licenses/by/4.0/,DPCSpell: A Transformer-based Detector-Purificator-Corrector Framework for Spelling Error Correction of Bangla and Resource Scarce Indic Languages,Mehedi Hasan Bijoy and Nahid Hossain and Salekul Islam and Swakkhar Shatabda,http://arxiv.org/pdf/2211.03730v1 | |
http://arxiv.org/abs/1906.08237v2,creativecommons.org/licenses/by/4.0/,XLNet: Generalized Autoregressive Pretraining for Language Understanding,Zhilin Yang and Zihang Dai and Yiming Yang and Jaime Carbonell and Ruslan Salakhutdinov and Quoc V. Le,http://arxiv.org/pdf/1906.08237v2 | |
http://arxiv.org/abs/2011.14039v2,creativecommons.org/licenses/by/4.0/,An Investigation of Language Model Interpretability via Sentence Editing,Samuel Stevens and Yu Su,http://arxiv.org/pdf/2011.14039v2 | |
http://arxiv.org/abs/2105.02486v2,creativecommons.org/licenses/by/4.0/,Towards General Natural Language Understanding with Probabilistic Worldbuilding,Abulhair Saparov and Tom M. Mitchell,http://arxiv.org/pdf/2105.02486v2 | |
http://arxiv.org/abs/2109.06704v1,creativecommons.org/licenses/by/4.0/,KFCNet: Knowledge Filtering and Contrastive Learning Network for Generative Commonsense Reasoning,Haonan Li and Yeyun Gong and Jian Jiao and Ruofei Zhang and Timothy Baldwin and Nan Duan,http://arxiv.org/pdf/2109.06704v1 | |
http://arxiv.org/abs/2204.01845v1,creativecommons.org/licenses/by/4.0/,Compliance Checking with NLI: Privacy Policies vs. Regulations,Amin Rabinia and Zane Nygaard,http://arxiv.org/pdf/2204.01845v1 | |
http://arxiv.org/abs/2210.01296v2,creativecommons.org/licenses/by/4.0/,Recitation-Augmented Language Models,Zhiqing Sun and Xuezhi Wang and Yi Tay and Yiming Yang and Denny Zhou,http://arxiv.org/pdf/2210.01296v2 | |
http://arxiv.org/abs/2212.06369v3,creativecommons.org/licenses/by/4.0/,Technical Report -- Competition Solution for Prompt Tuning using Pretrained Language Model,Jiang-Long Song and Wu-He Zou and Feng Li and Xiao-Lei Qin and Wei-Dong Zhang,http://arxiv.org/pdf/2212.06369v3 | |
http://arxiv.org/abs/2302.07388v1,creativecommons.org/licenses/by/4.0/,Adding Instructions during Pretraining: Effective Way of Controlling Toxicity in Language Models,Shrimai Prabhumoye and Mostofa Patwary and Mohammad Shoeybi and Bryan Catanzaro,http://arxiv.org/pdf/2302.07388v1 | |
http://arxiv.org/abs/2110.08551v1,creativecommons.org/licenses/by/4.0/,HRKD: Hierarchical Relational Knowledge Distillation for Cross-domain Language Model Compression,Chenhe Dong and Yaliang Li and Ying Shen and Minghui Qiu,http://arxiv.org/pdf/2110.08551v1 | |
http://arxiv.org/abs/2203.15917v1,creativecommons.org/licenses/by/4.0/,Shallow Fusion of Weighted Finite-State Transducer and Language Model for Text Normalization,Evelina Bakhturina and Yang Zhang and Boris Ginsburg,http://arxiv.org/pdf/2203.15917v1 | |
http://arxiv.org/abs/2206.13289v1,creativecommons.org/licenses/by/4.0/,Analyzing Encoded Concepts in Transformer Language Models,Hassan Sajjad and Nadir Durrani and Fahim Dalvi and Firoj Alam and Abdul Rafae Khan and Jia Xu,http://arxiv.org/pdf/2206.13289v1 | |
http://arxiv.org/abs/2302.04269v1,creativecommons.org/licenses/by/4.0/,Diagnosing and Rectifying Vision Models using Language,Yuhui Zhang and Jeff Z. HaoChen and Shih-Cheng Huang and Kuan-Chieh Wang and James Zou and Serena Yeung,http://arxiv.org/pdf/2302.04269v1 | |
http://arxiv.org/abs/2303.03103v1,creativecommons.org/licenses/by/4.0/,Towards Zero-Shot Functional Compositionality of Language Models,Hangyeol Yu and Myeongho Jeong and Jamin Shin and Hyeongdon Moon and Juneyoung Park and Seungtaek Choi,http://arxiv.org/pdf/2303.03103v1 | |
http://arxiv.org/abs/2110.06419v1,creativecommons.org/licenses/by/4.0/,Federated Natural Language Generation for Personalized Dialogue System,Yujie Lu and Chao Huang and Huanli Zhan and Yong Zhuang,http://arxiv.org/pdf/2110.06419v1 | |
http://arxiv.org/abs/2202.00470v1,creativecommons.org/licenses/by/4.0/,An Assessment of the Impact of OCR Noise on Language Models,Konstantin Todorov and Giovanni Colavizza,http://arxiv.org/pdf/2202.00470v1 | |
http://arxiv.org/abs/2201.11576v1,creativecommons.org/licenses/by/4.0/,Grad2Task: Improved Few-shot Text Classification Using Gradients for Task Representation,Jixuan Wang and Kuan-Chieh Wang and Frank Rudzicz and Michael Brudno,http://arxiv.org/pdf/2201.11576v1 | |
http://arxiv.org/abs/2208.14754v1,creativecommons.org/licenses/by/4.0/,LexMAE: Lexicon-Bottlenecked Pretraining for Large-Scale Retrieval,Tao Shen and Xiubo Geng and Chongyang Tao and Can Xu and Xiaolong Huang and Binxing Jiao and Linjun Yang and Daxin Jiang,http://arxiv.org/pdf/2208.14754v1 | |
http://arxiv.org/abs/2303.06573v1,creativecommons.org/licenses/by/4.0/,Large Language Models Know Your Contextual Search Intent: A Prompting Framework for Conversational Search,Kelong Mao and Zhicheng Dou and Haonan Chen and Fengran Mo and Hongjin Qian,http://arxiv.org/pdf/2303.06573v1 | |
http://arxiv.org/abs/2304.06030v2,creativecommons.org/licenses/by/4.0/,The Role of Large Language Models in the Recognition of Territorial Sovereignty: An Analysis of the Construction of Legitimacy,Francisco Castillo-Eslava and Carlos Mougan and Alejandro Romero-Reche and Steffen Staab,http://arxiv.org/pdf/2304.06030v2 | |
http://arxiv.org/abs/2008.06268v1,creativecommons.org/licenses/by/4.0/,An Efficient Model Inference Algorithm for Learning-based Testing of Reactive Systems,Muddassar A. Sindhu,http://arxiv.org/pdf/2008.06268v1 | |
http://arxiv.org/abs/2011.12170v1,creativecommons.org/licenses/by/4.0/,Domain-Transferable Method for Named Entity Recognition Task,Vladislav Mikhailov and Tatiana Shavrina,http://arxiv.org/pdf/2011.12170v1 | |
http://arxiv.org/abs/2109.04607v1,creativecommons.org/licenses/by/4.0/,IndoBERTweet: A Pretrained Language Model for Indonesian Twitter with Effective Domain-Specific Vocabulary Initialization,Fajri Koto and Jey Han Lau and Timothy Baldwin,http://arxiv.org/pdf/2109.04607v1 | |
http://arxiv.org/abs/2206.03216v2,creativecommons.org/licenses/by/4.0/,Data Governance in the Age of Large-Scale Data-Driven Language Technology,Yacine Jernite and Huu Nguyen and Stella Biderman and Anna Rogers and Maraim Masoud and Valentin Danchev and Samson Tan and Alexandra Sasha Luccioni and Nishant Subramani and Gérard Dupont and Jesse Dodge and Kyle Lo and Zeerak Talat and Isaac Johnson and Dragomir Radev and Somaieh Nikpoor and Jörg Frohberg and Aaron Gokaslan and Peter Henderson and Rishi Bommasani and Margaret Mitchell,http://arxiv.org/pdf/2206.03216v2 | |
http://arxiv.org/abs/1803.05820v1,creativecommons.org/licenses/by/4.0/,RUSSE: The First Workshop on Russian Semantic Similarity,Alexander Panchenko and Natalia Loukachevitch and Dmitry Ustalov and Denis Paperno and Christian Meyer and Natalia Konstantinova,http://arxiv.org/pdf/1803.05820v1 | |
http://arxiv.org/abs/2109.00590v4,creativecommons.org/licenses/by/4.0/,WebQA: Multihop and Multimodal QA,Yingshan Chang and Mridu Narang and Hisami Suzuki and Guihong Cao and Jianfeng Gao and Yonatan Bisk,http://arxiv.org/pdf/2109.00590v4 | |
http://arxiv.org/abs/2111.09296v3,creativecommons.org/licenses/by/4.0/,XLS-R: Self-supervised Cross-lingual Speech Representation Learning at Scale,Arun Babu and Changhan Wang and Andros Tjandra and Kushal Lakhotia and Qiantong Xu and Naman Goyal and Kritika Singh and Patrick von Platen and Yatharth Saraf and Juan Pino and Alexei Baevski and Alexis Conneau and Michael Auli,http://arxiv.org/pdf/2111.09296v3 | |
http://arxiv.org/abs/2210.14472v1,creativecommons.org/licenses/by/4.0/,Sinhala Sentence Embedding: A Two-Tiered Structure for Low-Resource Languages,Gihan Weeraprameshwara and Vihanga Jayawickrama and Nisansa de Silva and Yudhanjaya Wijeratne,http://arxiv.org/pdf/2210.14472v1 | |
http://arxiv.org/abs/2212.10505v1,creativecommons.org/licenses/by/4.0/,DePlot: One-shot visual language reasoning by plot-to-table translation,Fangyu Liu and Julian Martin Eisenschlos and Francesco Piccinno and Syrine Krichene and Chenxi Pang and Kenton Lee and Mandar Joshi and Wenhu Chen and Nigel Collier and Yasemin Altun,http://arxiv.org/pdf/2212.10505v1 | |
http://arxiv.org/abs/2302.13241v1,creativecommons.org/licenses/by/4.0/,Cross-Lingual Question Answering over Knowledge Base as Reading Comprehension,Chen Zhang and Yuxuan Lai and Yansong Feng and Xingyu Shen and Haowei Du and Dongyan Zhao,http://arxiv.org/pdf/2302.13241v1 | |
http://arxiv.org/abs/2101.01785v3,creativecommons.org/licenses/by/4.0/,ARBERT & MARBERT: Deep Bidirectional Transformers for Arabic,Muhammad Abdul-Mageed and AbdelRahim Elmadany and El Moatez Billah Nagoudi,http://arxiv.org/pdf/2101.01785v3 | |
http://arxiv.org/abs/2108.11857v2,creativecommons.org/licenses/by/4.0/,Probing Pre-trained Auto-regressive Language Models for Named Entity Typing and Recognition,Elena V. Epure and Romain Hennequin,http://arxiv.org/pdf/2108.11857v2 | |
http://arxiv.org/abs/2110.10329v1,creativecommons.org/licenses/by/4.0/,SLAM: A Unified Encoder for Speech and Language Modeling via Speech-Text Joint Pre-Training,Ankur Bapna and Yu-an Chung and Nan Wu and Anmol Gulati and Ye Jia and Jonathan H. Clark and Melvin Johnson and Jason Riesa and Alexis Conneau and Yu Zhang,http://arxiv.org/pdf/2110.10329v1 | |
http://arxiv.org/abs/2210.07792v2,creativecommons.org/licenses/by/4.0/,Robust Preference Learning for Storytelling via Contrastive Reinforcement Learning,Louis Castricato and Alexander Havrilla and Shahbuland Matiana and Michael Pieler and Anbang Ye and Ian Yang and Spencer Frazier and Mark Riedl,http://arxiv.org/pdf/2210.07792v2 | |
http://arxiv.org/abs/2210.08726v2,creativecommons.org/licenses/by/4.0/,"RARR: Researching and Revising What Language Models Say, Using Language Models",Luyu Gao and Zhuyun Dai and Panupong Pasupat and Anthony Chen and Arun Tejasvi Chaganty and Yicheng Fan and Vincent Y. Zhao and Ni Lao and Hongrae Lee and Da-Cheng Juan and Kelvin Guu,http://arxiv.org/pdf/2210.08726v2 | |
http://arxiv.org/abs/2303.05707v1,creativecommons.org/licenses/by/4.0/,MuLTI: Efficient Video-and-Language Understanding with MultiWay-Sampler and Multiple Choice Modeling,Jiaqi Xu and Bo Liu and Yunkuo Chen and Mengli Cheng and Xing Shi,http://arxiv.org/pdf/2303.05707v1 | |
http://arxiv.org/abs/2304.07327v1,creativecommons.org/licenses/by/4.0/,OpenAssistant Conversations -- Democratizing Large Language Model Alignment,Andreas Köpf and Yannic Kilcher and Dimitri von Rütte and Sotiris Anagnostidis and Zhi-Rui Tam and Keith Stevens and Abdullah Barhoum and Nguyen Minh Duc and Oliver Stanley and Richárd Nagyfi and Shahul ES and Sameer Suri and David Glushkov and Arnav Dantuluri and Andrew Maguire and Christoph Schuhmann and Huu Nguyen and Alexander Mattick,http://arxiv.org/pdf/2304.07327v1 | |
http://arxiv.org/abs/2005.07503v1,creativecommons.org/licenses/by/4.0/,COVID-Twitter-BERT: A Natural Language Processing Model to Analyse COVID-19 Content on Twitter,Martin Müller and Marcel Salathé and Per E Kummervold,http://arxiv.org/pdf/2005.07503v1 | |
http://arxiv.org/abs/2105.08840v3,creativecommons.org/licenses/by/4.0/,Training Heterogeneous Features in Sequence to Sequence Tasks: Latent Enhanced Multi-filter Seq2Seq Model,Yunhao Yang and Zhaokun Xue,http://arxiv.org/pdf/2105.08840v3 | |
http://arxiv.org/abs/2205.12538v2,creativecommons.org/licenses/by/4.0/,Is a Question Decomposition Unit All We Need?,Pruthvi Patel and Swaroop Mishra and Mihir Parmar and Chitta Baral,http://arxiv.org/pdf/2205.12538v2 | |
http://arxiv.org/abs/2206.05802v2,creativecommons.org/licenses/by/4.0/,Self-critiquing models for assisting human evaluators,William Saunders and Catherine Yeh and Jeff Wu and Steven Bills and Long Ouyang and Jonathan Ward and Jan Leike,http://arxiv.org/pdf/2206.05802v2 | |
http://arxiv.org/abs/2211.06687v3,creativecommons.org/licenses/by/4.0/,Large-scale Contrastive Language-Audio Pretraining with Feature Fusion and Keyword-to-Caption Augmentation,Yusong Wu and Ke Chen and Tianyu Zhang and Yuchen Hui and Taylor Berg-Kirkpatrick and Shlomo Dubnov,http://arxiv.org/pdf/2211.06687v3 | |
http://arxiv.org/abs/2304.09138v1,creativecommons.org/licenses/by/4.0/,Exploring the Trade-Offs: Unified Large Language Models vs Local Fine-Tuned Models for Highly-Specific Radiology NLI Task,Zihao Wu and Lu Zhang and Chao Cao and Xiaowei Yu and Haixing Dai and Chong Ma and Zhengliang Liu and Lin Zhao and Gang Li and Wei Liu and Quanzheng Li and Dinggang Shen and Xiang Li and Dajiang Zhu and Tianming Liu,http://arxiv.org/pdf/2304.09138v1 | |
http://arxiv.org/abs/2210.13701v1,creativecommons.org/licenses/by/4.0/,Rich Knowledge Sources Bring Complex Knowledge Conflicts: Recalibrating Models to Reflect Conflicting Evidence,Hung-Ting Chen and Michael J. Q. Zhang and Eunsol Choi,http://arxiv.org/pdf/2210.13701v1 | |
http://arxiv.org/abs/2205.02022v2,creativecommons.org/licenses/by/4.0/,A Few Thousand Translations Go a Long Way! Leveraging Pre-trained Models for African News Translation,David Ifeoluwa Adelani and Jesujoba Oluwadara Alabi and Angela Fan and Julia Kreutzer and Xiaoyu Shen and Machel Reid and Dana Ruiter and Dietrich Klakow and Peter Nabende and Ernie Chang and Tajuddeen Gwadabe and Freshia Sackey and Bonaventure F. P. Dossou and Chris Chinenye Emezue and Colin Leong and Michael Beukman and Shamsuddeen Hassan Muhammad and Guyo Dub Jarso and Oreen Yousuf and Andre Niyongabo Rubungo and Gilles Hacheme and Eric Peter Wairagala and Muhammad Umair Nasir and Benjamin Ayoade Ajibade and Tunde Oluwaseyi Ajayi and Yvonne Wambui Gitau and Jade Abbott and Mohamed Ahmed and Millicent Ochieng and Anuoluwapo Aremu and Perez Ogayo and Jonathan Mukiibi and Fatoumata Ouoba Kabore and Godson Koffi Kalipe and Derguene Mbaye and Allahsera Auguste Tapo and Victoire Memdjokam Koagne and Edwin Munkoh-Buabeng and Valencia Wagner and Idris Abdulmumin and Ayodele Awokoya and Happy Buzaaba and Blessing Sibanda and Andiswa Bukula and Sam Manthalu,http://arxiv.org/pdf/2205.02022v2 | |
http://arxiv.org/abs/2301.11847v1,creativecommons.org/licenses/by/4.0/,A Comparative Study of Pretrained Language Models for Long Clinical Text,Yikuan Li and Ramsey M. Wehbe and Faraz S. Ahmad and Hanyin Wang and Yuan Luo,http://arxiv.org/pdf/2301.11847v1 | |
http://arxiv.org/abs/2105.04024v3,creativecommons.org/licenses/by/4.0/,DocSCAN: Unsupervised Text Classification via Learning from Neighbors,Dominik Stammbach and Elliott Ash,http://arxiv.org/pdf/2105.04024v3 | |
http://arxiv.org/abs/2109.06692v1,creativecommons.org/licenses/by/4.0/,LRWR: Large-Scale Benchmark for Lip Reading in Russian language,Evgeniy Egorov and Vasily Kostyumov and Mikhail Konyk and Sergey Kolesnikov,http://arxiv.org/pdf/2109.06692v1 | |
http://arxiv.org/abs/2207.09854v1,creativecommons.org/licenses/by/4.0/,"Auto-active Verification of Graph Algorithms, Written in OCaml",Daniel Castanho and Mário Pereira,http://arxiv.org/pdf/2207.09854v1 | |
http://arxiv.org/abs/2106.01251v1,creativecommons.org/licenses/by/4.0/,Multilingual Medical Question Answering and Information Retrieval for Rural Health Intelligence Access,Vishal Vinod and Susmit Agrawal and Vipul Gaurav and Pallavi R and Savita Choudhary,http://arxiv.org/pdf/2106.01251v1 | |
http://arxiv.org/abs/2111.08088v1,creativecommons.org/licenses/by/4.0/,Assessing gender bias in medical and scientific masked language models with StereoSet,Robert Robinson,http://arxiv.org/pdf/2111.08088v1 | |
http://arxiv.org/abs/2301.01224v1,creativecommons.org/licenses/by/4.0/,An Empirical Investigation into the Use of Image Captioning for Automated Software Documentation,Kevin Moran and Ali Yachnes and George Purnell and Junayed Mahmud and Michele Tufano and Carlos Bernal-Cárdenas and Denys Poshyvanyk and Zach H'Doubler,http://arxiv.org/pdf/2301.01224v1 | |
http://arxiv.org/abs/2304.03086v1,creativecommons.org/licenses/by/4.0/,ChatGPT for Shaping the Future of Dentistry: The Potential of Multi-Modal Large Language Model,Hanyao Huang and Ou Zheng and Dongdong Wang and Jiayi Yin and Zijin Wang and Shengxuan Ding and Heng Yin and Chuan Xu and Renjie Yang and Qian Zheng and Bing Shi,http://arxiv.org/pdf/2304.03086v1 | |
http://arxiv.org/abs/2304.05501v1,creativecommons.org/licenses/by/4.0/,L3MVN: Leveraging Large Language Models for Visual Target Navigation,Bangguo Yu and Hamidreza Kasaei and Ming Cao,http://arxiv.org/pdf/2304.05501v1 | |
http://arxiv.org/abs/2109.04877v1,creativecommons.org/licenses/by/4.0/,Efficient Test Time Adapter Ensembling for Low-resource Language Varieties,Xinyi Wang and Yulia Tsvetkov and Sebastian Ruder and Graham Neubig,http://arxiv.org/pdf/2109.04877v1 | |
http://arxiv.org/abs/2204.04873v1,creativecommons.org/licenses/by/4.0/,Adapting BigScience Multilingual Model to Unseen Languages,Zheng-Xin Yong and Vassilina Nikoulina,http://arxiv.org/pdf/2204.04873v1 | |
http://arxiv.org/abs/2009.13570v2,creativecommons.org/licenses/by/4.0/,DialoGLUE: A Natural Language Understanding Benchmark for Task-Oriented Dialogue,Shikib Mehri and Mihail Eric and Dilek Hakkani-Tur,http://arxiv.org/pdf/2009.13570v2 | |
http://arxiv.org/abs/2101.07120v1,creativecommons.org/licenses/by/4.0/,Neural Abstractive Text Summarizer for Telugu Language,Mohan Bharath B and Aravindh Gowtham B and Akhil M,http://arxiv.org/pdf/2101.07120v1 | |
http://arxiv.org/abs/2106.05589v1,creativecommons.org/licenses/by/4.0/,AUGNLG: Few-shot Natural Language Generation using Self-trained Data Augmentation,Xinnuo Xu and Guoyin Wang and Young-Bum Kim and Sungjin Lee,http://arxiv.org/pdf/2106.05589v1 | |
http://arxiv.org/abs/2205.03815v1,creativecommons.org/licenses/by/4.0/,Beyond Distributional Hypothesis: Let Language Models Learn Meaning-Text Correspondence,Myeongjun Jang and Frank Mtumbuka and Thomas Lukasiewicz,http://arxiv.org/pdf/2205.03815v1 | |
http://arxiv.org/abs/2205.07523v1,creativecommons.org/licenses/by/4.0/,Prompting to Distill: Boosting Data-Free Knowledge Distillation via Reinforced Prompt,Xinyin Ma and Xinchao Wang and Gongfan Fang and Yongliang Shen and Weiming Lu,http://arxiv.org/pdf/2205.07523v1 | |
http://arxiv.org/abs/2205.10036v1,creativecommons.org/licenses/by/4.0/,Exploring Extreme Parameter Compression for Pre-trained Language Models,Yuxin Ren and Benyou Wang and Lifeng Shang and Xin Jiang and Qun Liu,http://arxiv.org/pdf/2205.10036v1 | |
http://arxiv.org/abs/2210.09150v2,creativecommons.org/licenses/by/4.0/,Prompting GPT-3 To Be Reliable,Chenglei Si and Zhe Gan and Zhengyuan Yang and Shuohang Wang and Jianfeng Wang and Jordan Boyd-Graber and Lijuan Wang,http://arxiv.org/pdf/2210.09150v2 | |
http://arxiv.org/abs/2210.13578v1,creativecommons.org/licenses/by/4.0/,Speeding Up Question Answering Task of Language Models via Inverted Index,Xiang Ji and Yesim Sungu-Eryilmaz and Elaheh Momeni and Reza Rawassizadeh,http://arxiv.org/pdf/2210.13578v1 | |
http://arxiv.org/abs/2301.09211v1,creativecommons.org/licenses/by/4.0/,An Empirical Study of Metrics to Measure Representational Harms in Pre-Trained Language Models,Saghar Hosseini and Hamid Palangi and Ahmed Hassan Awadallah,http://arxiv.org/pdf/2301.09211v1 | |
http://arxiv.org/abs/2302.07926v1,creativecommons.org/licenses/by/4.0/,Commonsense Reasoning for Conversational AI: A Survey of the State of the Art,Christopher Richardson and Larry Heck,http://arxiv.org/pdf/2302.07926v1 | |
http://arxiv.org/abs/2303.03480v1,creativecommons.org/licenses/by/4.0/,"Can an Embodied Agent Find Your ""Cat-shaped Mug""? LLM-Based Zero-Shot Object Navigation",Vishnu Sashank Dorbala and James F. Mullen Jr. and Dinesh Manocha,http://arxiv.org/pdf/2303.03480v1 | |
http://arxiv.org/abs/2304.02886v1,creativecommons.org/licenses/by/4.0/,Automatic ICD-10 Code Association: A Challenging Task on French Clinical Texts,Yakini Tchouka and Jean-François Couchot and David Laiymani and Philippe Selles and Azzedine Rahmani,http://arxiv.org/pdf/2304.02886v1 | |
http://arxiv.org/abs/2006.00031v1,creativecommons.org/licenses/by/4.0/,A Comparative Study of Lexical Substitution Approaches based on Neural Language Models,Nikolay Arefyev and Boris Sheludko and Alexander Podolskiy and Alexander Panchenko,http://arxiv.org/pdf/2006.00031v1 | |
http://arxiv.org/abs/2102.04490v2,creativecommons.org/licenses/by/4.0/,Unsupervised Abstractive Summarization of Bengali Text Documents,Radia Rayan Chowdhury and Mir Tafseer Nayeem and Tahsin Tasnim Mim and Md. Saifur Rahman Chowdhury and Taufiqul Jannat,http://arxiv.org/pdf/2102.04490v2 | |
http://arxiv.org/abs/2104.04487v1,creativecommons.org/licenses/by/4.0/,Language model fusion for streaming end to end speech recognition,Rodrigo Cabrera and Xiaofeng Liu and Mohammadreza Ghodsi and Zebulun Matteson and Eugene Weinstein and Anjuli Kannan,http://arxiv.org/pdf/2104.04487v1 | |
http://arxiv.org/abs/2109.08259v1,creativecommons.org/licenses/by/4.0/,Self-training with Few-shot Rationalization: Teacher Explanations Aid Student in Few-shot NLU,Meghana Moorthy Bhat and Alessandro Sordoni and Subhabrata Mukherjee,http://arxiv.org/pdf/2109.08259v1 | |
http://arxiv.org/abs/2111.08284v2,creativecommons.org/licenses/by/4.0/,Few-Shot Self-Rationalization with Natural Language Prompts,Ana Marasović and Iz Beltagy and Doug Downey and Matthew E. Peters,http://arxiv.org/pdf/2111.08284v2 | |
http://arxiv.org/abs/2112.08547v2,creativecommons.org/licenses/by/4.0/,Learning Rich Representation of Keyphrases from Text,Mayank Kulkarni and Debanjan Mahata and Ravneet Arora and Rajarshi Bhowmik,http://arxiv.org/pdf/2112.08547v2 | |
http://arxiv.org/abs/2210.04873v2,creativecommons.org/licenses/by/4.0/,CORE: A Retrieve-then-Edit Framework for Counterfactual Data Generation,Tanay Dixit and Bhargavi Paranjape and Hannaneh Hajishirzi and Luke Zettlemoyer,http://arxiv.org/pdf/2210.04873v2 | |
http://arxiv.org/abs/2211.16198v2,creativecommons.org/licenses/by/4.0/,SuS-X: Training-Free Name-Only Transfer of Vision-Language Models,Vishaal Udandarao and Ankush Gupta and Samuel Albanie,http://arxiv.org/pdf/2211.16198v2 | |
http://arxiv.org/abs/2303.09384v1,creativecommons.org/licenses/by/4.0/,LLMSecEval: A Dataset of Natural Language Prompts for Security Evaluations,Catherine Tony and Markus Mutas and Nicolás E. Díaz Ferreyra and Riccardo Scandariato,http://arxiv.org/pdf/2303.09384v1 | |
http://arxiv.org/abs/2303.17760v1,creativecommons.org/licenses/by/4.0/,"CAMEL: Communicative Agents for ""Mind"" Exploration of Large Scale Language Model Society",Guohao Li and Hasan Abed Al Kader Hammoud and Hani Itani and Dmitrii Khizbullin and Bernard Ghanem,http://arxiv.org/pdf/2303.17760v1 | |
http://arxiv.org/abs/2011.03203v1,creativecommons.org/licenses/by/4.0/,Unleashing the Power of Neural Discourse Parsers -- A Context and Structure Aware Approach Using Large Scale Pretraining,Grigorii Guz and Patrick Huber and Giuseppe Carenini,http://arxiv.org/pdf/2011.03203v1 | |
http://arxiv.org/abs/2104.11070v2,creativecommons.org/licenses/by/4.0/,Adapting Long Context NLM for ASR Rescoring in Conversational Agents,Ashish Shenoy and Sravan Bodapati and Monica Sunkara and Srikanth Ronanki and Katrin Kirchhoff,http://arxiv.org/pdf/2104.11070v2 | |
http://arxiv.org/abs/2111.09564v2,creativecommons.org/licenses/by/4.0/,LAnoBERT : System Log Anomaly Detection based on BERT Masked Language Model,Yukyung Lee and Jina Kim and Pilsung Kang,http://arxiv.org/pdf/2111.09564v2 | |
http://arxiv.org/abs/2204.06745v1,creativecommons.org/licenses/by/4.0/,GPT-NeoX-20B: An Open-Source Autoregressive Language Model,Sid Black and Stella Biderman and Eric Hallahan and Quentin Anthony and Leo Gao and Laurence Golding and Horace He and Connor Leahy and Kyle McDonell and Jason Phang and Michael Pieler and USVSN Sai Prashanth and Shivanshu Purohit and Laria Reynolds and Jonathan Tow and Ben Wang and Samuel Weinbach,http://arxiv.org/pdf/2204.06745v1 | |
http://arxiv.org/abs/2302.06860v2,creativecommons.org/licenses/by/4.0/,BLIAM: Literature-based Data Synthesis for Synergistic Drug Combination Prediction,Cai Yang and Addie Woicik and Hoifung Poon and Sheng Wang,http://arxiv.org/pdf/2302.06860v2 | |
http://arxiv.org/abs/2303.17590v1,creativecommons.org/licenses/by/4.0/,Going Beyond Nouns With Vision & Language Models Using Synthetic Data,Paola Cascante-Bonilla and Khaled Shehada and James Seale Smith and Sivan Doveh and Donghyun Kim and Rameswar Panda and Gül Varol and Aude Oliva and Vicente Ordonez and Rogerio Feris and Leonid Karlinsky,http://arxiv.org/pdf/2303.17590v1 | |
http://arxiv.org/abs/2102.04887v2,creativecommons.org/licenses/by/4.0/,NewsBERT: Distilling Pre-trained Language Model for Intelligent News Application,Chuhan Wu and Fangzhao Wu and Yang Yu and Tao Qi and Yongfeng Huang and Qi Liu,http://arxiv.org/pdf/2102.04887v2 | |
http://arxiv.org/abs/2304.06762v1,creativecommons.org/licenses/by/4.0/,Shall We Pretrain Autoregressive Language Models with Retrieval? A Comprehensive Study,Boxin Wang and Wei Ping and Peng Xu and Lawrence McAfee and Zihan Liu and Mohammad Shoeybi and Yi Dong and Oleksii Kuchaiev and Bo Li and Chaowei Xiao and Anima Anandkumar and Bryan Catanzaro,http://arxiv.org/pdf/2304.06762v1 | |
http://arxiv.org/abs/1707.03762v4,creativecommons.org/licenses/by/4.0/,Revisiting Elementary Denotational Semantics,Jeremy G. Siek,http://arxiv.org/pdf/1707.03762v4 | |
http://arxiv.org/abs/2112.07868v2,creativecommons.org/licenses/by/4.0/,Few-shot Instruction Prompts for Pretrained Language Models to Detect Social Biases,Shrimai Prabhumoye and Rafal Kocielnik and Mohammad Shoeybi and Anima Anandkumar and Bryan Catanzaro,http://arxiv.org/pdf/2112.07868v2 | |
http://arxiv.org/abs/2303.09639v1,creativecommons.org/licenses/by/4.0/,Neural Architecture Search for Effective Teacher-Student Knowledge Transfer in Language Models,Aashka Trivedi and Takuma Udagawa and Michele Merler and Rameswar Panda and Yousef El-Kurdi and Bishwaranjan Bhattacharjee,http://arxiv.org/pdf/2303.09639v1 | |
http://arxiv.org/abs/2304.02210v1,creativecommons.org/licenses/by/4.0/,Document-Level Machine Translation with Large Language Models,Longyue Wang and Chenyang Lyu and Tianbo Ji and Zhirui Zhang and Dian Yu and Shuming Shi and Zhaopeng Tu,http://arxiv.org/pdf/2304.02210v1 | |
http://arxiv.org/abs/2108.02962v1,creativecommons.org/licenses/by/4.0/,Dezyne: Paving the Way to Practical Formal Software Engineering,Rutger van Beusekom and Bert de Jonge and Paul Hoogendijk and Jan Nieuwenhuizen,http://arxiv.org/pdf/2108.02962v1 | |
http://arxiv.org/abs/2204.06130v2,creativecommons.org/licenses/by/4.0/,Impossible Triangle: What's Next for Pre-trained Language Models?,Chenguang Zhu and Michael Zeng,http://arxiv.org/pdf/2204.06130v2 | |
http://arxiv.org/abs/2206.03931v3,creativecommons.org/licenses/by/4.0/,Learning to Generate Prompts for Dialogue Generation through Reinforcement Learning,Hsuan Su and Pohan Chi and Shih-Cheng Huang and Chung Ho Lam and Saurav Sahay and Shang-Tse Chen and Hung-yi Lee,http://arxiv.org/pdf/2206.03931v3 | |
http://arxiv.org/abs/2212.10474v1,creativecommons.org/licenses/by/4.0/,ByGPT5: End-to-End Style-conditioned Poetry Generation with Token-free Language Models,Jonas Belouadi and Steffen Eger,http://arxiv.org/pdf/2212.10474v1 | |
http://arxiv.org/abs/2102.04130v3,creativecommons.org/licenses/by/4.0/,Bias Out-of-the-Box: An Empirical Analysis of Intersectional Occupational Biases in Popular Generative Language Models,Hannah Kirk and Yennie Jun and Haider Iqbal and Elias Benussi and Filippo Volpin and Frederic A. Dreyer and Aleksandar Shtedritski and Yuki M. Asano,http://arxiv.org/pdf/2102.04130v3 | |
http://arxiv.org/abs/2105.03119v1,creativecommons.org/licenses/by/4.0/,Applying Model-based Requirements Engineering in Three Large European Collaborative Projects: An Experience Report,Andrey Sadovykh and Dragos Truscan and Hugo Bruneliere,http://arxiv.org/pdf/2105.03119v1 | |
http://arxiv.org/abs/2106.09449v1,creativecommons.org/licenses/by/4.0/,DocNLI: A Large-scale Dataset for Document-level Natural Language Inference,Wenpeng Yin and Dragomir Radev and Caiming Xiong,http://arxiv.org/pdf/2106.09449v1 | |
http://arxiv.org/abs/2205.10569v1,creativecommons.org/licenses/by/4.0/,HLATR: Enhance Multi-stage Text Retrieval with Hybrid List Aware Transformer Reranking,Yanzhao Zhang and Dingkun Long and Guangwei Xu and Pengjun Xie,http://arxiv.org/pdf/2205.10569v1 | |
http://arxiv.org/abs/2211.12508v1,creativecommons.org/licenses/by/4.0/,Time-Aware Datasets are Adaptive Knowledgebases for the New Normal,Abhijit Suprem and Sanjyot Vaidya and Joao Eduardo Ferreira and Calton Pu,http://arxiv.org/pdf/2211.12508v1 | |
http://arxiv.org/abs/2301.04788v1,creativecommons.org/licenses/by/4.0/,Language Cognition and Language Computation -- Human and Machine Language Understanding,Shaonan Wang and Nai Ding and Nan Lin and Jiajun Zhang and Chengqing Zong,http://arxiv.org/pdf/2301.04788v1 | |
http://arxiv.org/abs/2107.03141v1,creativecommons.org/licenses/by/4.0/,Hierarchical Text Classification of Urdu News using Deep Neural Network,Taimoor Ahmed Javed and Waseem Shahzad and Umair Arshad,http://arxiv.org/pdf/2107.03141v1 | |
http://arxiv.org/abs/1810.06635v1,creativecommons.org/licenses/by/4.0/,Semi-supervised and Active-learning Scenarios: Efficient Acoustic Model Refinement for a Low Resource Indian Language,Maharajan Chellapriyadharshini and Anoop Toffy and Srinivasa Raghavan K. M. and V Ramasubramanian,http://arxiv.org/pdf/1810.06635v1 | |
http://arxiv.org/abs/1910.00883v2,creativecommons.org/licenses/by/4.0/,Exploiting BERT for End-to-End Aspect-based Sentiment Analysis,Xin Li and Lidong Bing and Wenxuan Zhang and Wai Lam,http://arxiv.org/pdf/1910.00883v2 | |
http://arxiv.org/abs/2005.11768v2,creativecommons.org/licenses/by/4.0/,KaLM at SemEval-2020 Task 4: Knowledge-aware Language Models for Comprehension And Generation,Jiajing Wan and Xinting Huang,http://arxiv.org/pdf/2005.11768v2 | |
http://arxiv.org/abs/2104.05228v1,creativecommons.org/licenses/by/4.0/,SuperSim: a test set for word similarity and relatedness in Swedish,Simon Hengchen and Nina Tahmasebi,http://arxiv.org/pdf/2104.05228v1 | |
http://arxiv.org/abs/2106.03598v1,creativecommons.org/licenses/by/4.0/,SciFive: a text-to-text transformer model for biomedical literature,Long N. Phan and James T. Anibal and Hieu Tran and Shaurya Chanana and Erol Bahadroglu and Alec Peltekian and Grégoire Altan-Bonnet,http://arxiv.org/pdf/2106.03598v1 | |
http://arxiv.org/abs/2109.07465v1,creativecommons.org/licenses/by/4.0/,On the Limits of Minimal Pairs in Contrastive Evaluation,Jannis Vamvas and Rico Sennrich,http://arxiv.org/pdf/2109.07465v1 | |
http://arxiv.org/abs/2302.11773v1,creativecommons.org/licenses/by/4.0/,Detecting software vulnerabilities using Language Models,Marwan Omar,http://arxiv.org/pdf/2302.11773v1 | |
http://arxiv.org/abs/2110.10305v1,creativecommons.org/licenses/by/4.0/,"When in Doubt, Summon the Titans: Efficient Inference with Large Models",Ankit Singh Rawat and Manzil Zaheer and Aditya Krishna Menon and Amr Ahmed and Sanjiv Kumar,http://arxiv.org/pdf/2110.10305v1 | |
http://arxiv.org/abs/2108.13961v1,creativecommons.org/licenses/by/4.0/,Thermostat: A Large Collection of NLP Model Explanations and Analysis Tools,Nils Feldhus and Robert Schwarzenberg and Sebastian Möller,http://arxiv.org/pdf/2108.13961v1 | |
http://arxiv.org/abs/2110.01691v3,creativecommons.org/licenses/by/4.0/,AI Chains: Transparent and Controllable Human-AI Interaction by Chaining Large Language Model Prompts,Tongshuang Wu and Michael Terry and Carrie J. Cai,http://arxiv.org/pdf/2110.01691v3 | |
http://arxiv.org/abs/2110.08387v3,creativecommons.org/licenses/by/4.0/,Generated Knowledge Prompting for Commonsense Reasoning,Jiacheng Liu and Alisa Liu and Ximing Lu and Sean Welleck and Peter West and Ronan Le Bras and Yejin Choi and Hannaneh Hajishirzi,http://arxiv.org/pdf/2110.08387v3 | |
http://arxiv.org/abs/2205.00176v1,creativecommons.org/licenses/by/4.0/,Building a Role Specified Open-Domain Dialogue System Leveraging Large-Scale Language Models,Sanghwan Bae and Donghyun Kwak and Sungdong Kim and Donghoon Ham and Soyoung Kang and Sang-Woo Lee and Woomyoung Park,http://arxiv.org/pdf/2205.00176v1 | |
http://arxiv.org/abs/2206.11309v1,creativecommons.org/licenses/by/4.0/,GODEL: Large-Scale Pre-Training for Goal-Directed Dialog,Baolin Peng and Michel Galley and Pengcheng He and Chris Brockett and Lars Liden and Elnaz Nouri and Zhou Yu and Bill Dolan and Jianfeng Gao,http://arxiv.org/pdf/2206.11309v1 | |
http://arxiv.org/abs/2208.04417v2,creativecommons.org/licenses/by/4.0/,Debiased Large Language Models Still Associate Muslims with Uniquely Violent Acts,Babak Hemmatian and Lav R. Varshney,http://arxiv.org/pdf/2208.04417v2 | |
http://arxiv.org/abs/2208.10063v2,creativecommons.org/licenses/by/4.0/,Selection Collider Bias in Large Language Models,Emily McMilin,http://arxiv.org/pdf/2208.10063v2 | |
http://arxiv.org/abs/2208.14271v1,creativecommons.org/licenses/by/4.0/,Faithful Reasoning Using Large Language Models,Antonia Creswell and Murray Shanahan,http://arxiv.org/pdf/2208.14271v1 | |
http://arxiv.org/abs/2210.12022v1,creativecommons.org/licenses/by/4.0/,Performance-Efficiency Trade-Offs in Adapting Language Models to Text Classification Tasks,Laura Aina and Nikos Voskarides and Roi Blanco,http://arxiv.org/pdf/2210.12022v1 | |
http://arxiv.org/abs/2211.08466v1,creativecommons.org/licenses/by/4.0/,Reasoning Circuits: Few-shot Multihop Question Generation with Structured Rationales,Saurabh Kulshreshtha and Anna Rumshisky,http://arxiv.org/pdf/2211.08466v1 | |
http://arxiv.org/abs/2212.10509v1,creativecommons.org/licenses/by/4.0/,Interleaving Retrieval with Chain-of-Thought Reasoning for Knowledge-Intensive Multi-Step Questions,Harsh Trivedi and Niranjan Balasubramanian and Tushar Khot and Ashish Sabharwal,http://arxiv.org/pdf/2212.10509v1 | |
http://arxiv.org/abs/2212.10726v1,creativecommons.org/licenses/by/4.0/,Beyond Contrastive Learning: A Variational Generative Model for Multilingual Retrieval,John Wieting and Jonathan H. Clark and William W. Cohen and Graham Neubig and Taylor Berg-Kirkpatrick,http://arxiv.org/pdf/2212.10726v1 | |
http://arxiv.org/abs/2301.13848v1,creativecommons.org/licenses/by/4.0/,Benchmarking Large Language Models for News Summarization,Tianyi Zhang and Faisal Ladhak and Esin Durmus and Percy Liang and Kathleen McKeown and Tatsunori B. Hashimoto,http://arxiv.org/pdf/2301.13848v1 | |
http://arxiv.org/abs/2303.15125v1,creativecommons.org/licenses/by/4.0/,LMCanvas: Object-Oriented Interaction to Personalize Large Language Model-Powered Writing Environments,Tae Soo Kim and Arghya Sarkar and Yoonjoo Lee and Minsuk Chang and Juho Kim,http://arxiv.org/pdf/2303.15125v1 | |
http://arxiv.org/abs/2010.05731v1,creativecommons.org/licenses/by/4.0/,Probing Pretrained Language Models for Lexical Semantics,Ivan Vulić and Edoardo Maria Ponti and Robert Litschko and Goran Glavaš and Anna Korhonen,http://arxiv.org/pdf/2010.05731v1 | |
http://arxiv.org/abs/2109.06050v2,creativecommons.org/licenses/by/4.0/,Few-Shot Cross-Lingual Stance Detection with Sentiment-Based Pre-Training,Momchil Hardalov and Arnav Arora and Preslav Nakov and Isabelle Augenstein,http://arxiv.org/pdf/2109.06050v2 | |
http://arxiv.org/abs/2201.13405v1,creativecommons.org/licenses/by/4.0/,Cross-Lingual Dialogue Dataset Creation via Outline-Based Generation,Olga Majewska and Evgeniia Razumovskaia and Edoardo Maria Ponti and Ivan Vulić and Anna Korhonen,http://arxiv.org/pdf/2201.13405v1 | |
http://arxiv.org/abs/2210.01848v2,creativecommons.org/licenses/by/4.0/,Explaining Patterns in Data with Language Models via Interpretable Autoprompting,Chandan Singh and John X. Morris and Jyoti Aneja and Alexander M. Rush and Jianfeng Gao,http://arxiv.org/pdf/2210.01848v2 | |
http://arxiv.org/abs/2212.01944v3,creativecommons.org/licenses/by/4.0/,Automaton-Based Representations of Task Knowledge from Generative Language Models,Yunhao Yang and Jean-Raphaël Gaglione and Cyrus Neary and Ufuk Topcu,http://arxiv.org/pdf/2212.01944v3 | |
http://arxiv.org/abs/2211.14920v1,creativecommons.org/licenses/by/4.0/,EPIK: Eliminating multi-model Pipelines with Knowledge-distillation,Bhavesh Laddagiri and Yash Raj and Anshuman Dash,http://arxiv.org/pdf/2211.14920v1 | |
http://arxiv.org/abs/2205.03695v1,creativecommons.org/licenses/by/4.0/,AKI-BERT: a Pre-trained Clinical Language Model for Early Prediction of Acute Kidney Injury,Chengsheng Mao and Liang Yao and Yuan Luo,http://arxiv.org/pdf/2205.03695v1 | |
http://arxiv.org/abs/2212.10873v2,creativecommons.org/licenses/by/4.0/,Prompt-Augmented Linear Probing: Scaling Beyond The Limit of Few-shot In-Context Learners,Hyunsoo Cho and Hyuhng Joon Kim and Junyeob Kim and Sang-Woo Lee and Sang-goo Lee and Kang Min Yoo and Taeuk Kim,http://arxiv.org/pdf/2212.10873v2 | |
http://arxiv.org/abs/2301.02828v2,creativecommons.org/licenses/by/4.0/,Why do Nearest Neighbor Language Models Work?,Frank F. Xu and Uri Alon and Graham Neubig,http://arxiv.org/pdf/2301.02828v2 | |
http://arxiv.org/abs/2302.11713v2,creativecommons.org/licenses/by/4.0/,Can Pre-trained Vision and Language Models Answer Visual Information-Seeking Questions?,Yang Chen and Hexiang Hu and Yi Luan and Haitian Sun and Soravit Changpinyo and Alan Ritter and Ming-Wei Chang,http://arxiv.org/pdf/2302.11713v2 | |
http://arxiv.org/abs/2210.03251v1,creativecommons.org/licenses/by/4.0/,Small Character Models Match Large Word Models for Autocomplete Under Memory Constraints,Ganesh Jawahar and Subhabrata Mukherjee and Debadeepta Dey and Muhammad Abdul-Mageed and Laks V. S. Lakshmanan and Caio Cesar Teodoro Mendes and Gustavo Henrique de Rosa and Shital Shah,http://arxiv.org/pdf/2210.03251v1 | |
http://arxiv.org/abs/1807.10311v1,creativecommons.org/licenses/by/4.0/,Open Source Automatic Speech Recognition for German,Benjamin Milde and Arne Köhn,http://arxiv.org/pdf/1807.10311v1 | |
http://arxiv.org/abs/2004.14848v2,creativecommons.org/licenses/by/4.0/,Enriched Pre-trained Transformers for Joint Slot Filling and Intent Detection,Momchil Hardalov and Ivan Koychev and Preslav Nakov,http://arxiv.org/pdf/2004.14848v2 | |
http://arxiv.org/abs/2109.11295v1,creativecommons.org/licenses/by/4.0/,Dynamic Knowledge Distillation for Pre-trained Language Models,Lei Li and Yankai Lin and Shuhuai Ren and Peng Li and Jie Zhou and Xu Sun,http://arxiv.org/pdf/2109.11295v1 | |
http://arxiv.org/abs/2210.00045v1,creativecommons.org/licenses/by/4.0/,Calibrating Sequence likelihood Improves Conditional Language Generation,Yao Zhao and Misha Khalman and Rishabh Joshi and Shashi Narayan and Mohammad Saleh and Peter J. Liu,http://arxiv.org/pdf/2210.00045v1 | |
http://arxiv.org/abs/2304.06975v1,creativecommons.org/licenses/by/4.0/,HuaTuo: Tuning LLaMA Model with Chinese Medical Knowledge,Haochun Wang and Chi Liu and Nuwa Xi and Zewen Qiang and Sendong Zhao and Bing Qin and Ting Liu,http://arxiv.org/pdf/2304.06975v1 | |
http://arxiv.org/abs/2304.03153v1,creativecommons.org/licenses/by/4.0/,Zero-Shot Next-Item Recommendation using Large Pretrained Language Models,Lei Wang and Ee-Peng Lim,http://arxiv.org/pdf/2304.03153v1 | |
http://arxiv.org/abs/1803.02324v2,creativecommons.org/licenses/by/4.0/,Annotation Artifacts in Natural Language Inference Data,Suchin Gururangan and Swabha Swayamdipta and Omer Levy and Roy Schwartz and Samuel R. Bowman and Noah A. Smith,http://arxiv.org/pdf/1803.02324v2 | |
http://arxiv.org/abs/2003.01200v4,creativecommons.org/licenses/by/4.0/,Natural Language Processing Advancements By Deep Learning: A Survey,Amirsina Torfi and Rouzbeh A. Shirvani and Yaser Keneshloo and Nader Tavaf and Edward A. Fox,http://arxiv.org/pdf/2003.01200v4 | |
http://arxiv.org/abs/2108.08252v1,creativecommons.org/licenses/by/4.0/,Deep Natural Language Processing for LinkedIn Search Systems,Weiwei Guo and Xiaowei Liu and Sida Wang and Michaeel Kazi and Zhoutong Fu and Huiji Gao and Jun Jia and Liang Zhang and Bo Long,http://arxiv.org/pdf/2108.08252v1 | |
http://arxiv.org/abs/2109.10475v1,creativecommons.org/licenses/by/4.0/,Salience-Aware Event Chain Modeling for Narrative Understanding,Xiyang Zhang and Muhao Chen and Jonathan May,http://arxiv.org/pdf/2109.10475v1 | |
http://arxiv.org/abs/2203.06228v1,creativecommons.org/licenses/by/4.0/,CoDA21: Evaluating Language Understanding Capabilities of NLP Models With Context-Definition Alignment,Lütfi Kerem Senel and Timo Schick and Hinrich Schütze,http://arxiv.org/pdf/2203.06228v1 | |
http://arxiv.org/abs/2207.14525v1,creativecommons.org/licenses/by/4.0/,Curriculum Learning for Data-Efficient Vision-Language Alignment,Tejas Srinivasan and Xiang Ren and Jesse Thomason,http://arxiv.org/pdf/2207.14525v1 | |
http://arxiv.org/abs/2210.01091v2,creativecommons.org/licenses/by/4.0/,The (In)Effectiveness of Intermediate Task Training For Domain Adaptation and Cross-Lingual Transfer Learning,Sovesh Mohapatra and Somesh Mohapatra,http://arxiv.org/pdf/2210.01091v2 | |
http://arxiv.org/abs/2211.09102v2,creativecommons.org/licenses/by/4.0/,Prompting PaLM for Translation: Assessing Strategies and Performance,David Vilar and Markus Freitag and Colin Cherry and Jiaming Luo and Viresh Ratnakar and George Foster,http://arxiv.org/pdf/2211.09102v2 | |
http://arxiv.org/abs/2212.00616v1,creativecommons.org/licenses/by/4.0/,Extensible Prompts for Language Models,Tao Ge and Jing Hu and Li Dong and Shaoguang Mao and Yan Xia and Xun Wang and Si-Qing Chen and Furu Wei,http://arxiv.org/pdf/2212.00616v1 | |
http://arxiv.org/abs/2212.02712v1,creativecommons.org/licenses/by/4.0/,Improved Beam Search for Hallucination Mitigation in Abstractive Summarization,Arvind Krishna Sridhar and Erik Visser,http://arxiv.org/pdf/2212.02712v1 | |
http://arxiv.org/abs/2212.10408v1,creativecommons.org/licenses/by/4.0/,Geographic and Geopolitical Biases of Language Models,Fahim Faisal and Antonios Anastasopoulos,http://arxiv.org/pdf/2212.10408v1 | |
http://arxiv.org/abs/2302.03194v2,creativecommons.org/licenses/by/4.0/,UDApter -- Efficient Domain Adaptation Using Adapters,Bhavitvya Malik and Abhinav Ramesh Kashyap and Min-Yen Kan and Soujanya Poria,http://arxiv.org/pdf/2302.03194v2 | |
http://arxiv.org/abs/2303.00001v1,creativecommons.org/licenses/by/4.0/,Reward Design with Language Models,Minae Kwon and Sang Michael Xie and Kalesha Bullard and Dorsa Sadigh,http://arxiv.org/pdf/2303.00001v1 | |
http://arxiv.org/abs/2304.00557v1,creativecommons.org/licenses/by/4.0/,Semi-supervised Neural Machine Translation with Consistency Regularization for Low-Resource Languages,Viet H. Pham and Thang M. Pham and Giang Nguyen and Long Nguyen and Dien Dinh,http://arxiv.org/pdf/2304.00557v1 | |
http://arxiv.org/abs/2210.10692v2,creativecommons.org/licenses/by/4.0/,Separating Grains from the Chaff: Using Data Filtering to Improve Multilingual Translation for Low-Resourced African Languages,Idris Abdulmumin and Michael Beukman and Jesujoba O. Alabi and Chris Emezue and Everlyn Asiko and Tosin Adewumi and Shamsuddeen Hassan Muhammad and Mofetoluwa Adeyemi and Oreen Yousuf and Sahib Singh and Tajuddeen Rabiu Gwadabe,http://arxiv.org/pdf/2210.10692v2 | |
http://arxiv.org/abs/2301.05272v1,creativecommons.org/licenses/by/4.0/,Inaccessible Neural Language Models Could Reinvigorate Linguistic Nativism,Patrick Perrine,http://arxiv.org/pdf/2301.05272v1 | |
http://arxiv.org/abs/2207.11906v2,creativecommons.org/licenses/by/4.0/,Learning a Dual-Mode Speech Recognition Model via Self-Pruning,Chunxi Liu and Yuan Shangguan and Haichuan Yang and Yangyang Shi and Raghuraman Krishnamoorthi and Ozlem Kalinli,http://arxiv.org/pdf/2207.11906v2 | |
http://arxiv.org/abs/2202.02635v1,creativecommons.org/licenses/by/4.0/,Multilingual Hate Speech and Offensive Content Detection using Modified Cross-entropy Loss,Arka Mitra and Priyanshu Sankhala,http://arxiv.org/pdf/2202.02635v1 | |
http://arxiv.org/abs/2203.10692v1,creativecommons.org/licenses/by/4.0/,Better Language Model with Hypernym Class Prediction,He Bai and Tong Wang and Alessandro Sordoni and Peng Shi,http://arxiv.org/pdf/2203.10692v1 | |
http://arxiv.org/abs/2206.14576v1,creativecommons.org/licenses/by/4.0/,Using cognitive psychology to understand GPT-3,Marcel Binz and Eric Schulz,http://arxiv.org/pdf/2206.14576v1 | |
http://arxiv.org/abs/2208.02957v2,creativecommons.org/licenses/by/4.0/,Meaning without reference in large language models,Steven T. Piantadosi and Felix Hill,http://arxiv.org/pdf/2208.02957v2 | |
http://arxiv.org/abs/2210.05598v3,creativecommons.org/licenses/by/4.0/,Enriching Biomedical Knowledge for Low-resource Language Through Large-Scale Translation,Long Phan and Tai Dang and Hieu Tran and Trieu H. Trinh and Vy Phan and Lam D. Chau and Minh-Thang Luong,http://arxiv.org/pdf/2210.05598v3 | |
http://arxiv.org/abs/2210.09658v1,creativecommons.org/licenses/by/4.0/,ROSE: Robust Selective Fine-tuning for Pre-trained Language Models,Lan Jiang and Hao Zhou and Yankai Lin and Peng Li and Jie Zhou and Rui Jiang,http://arxiv.org/pdf/2210.09658v1 | |
http://arxiv.org/abs/2210.13979v2,creativecommons.org/licenses/by/4.0/,Meta-learning Pathologies from Radiology Reports using Variance Aware Prototypical Networks,Arijit Sehanobish and Kawshik Kannan and Nabila Abraham and Anasuya Das and Benjamin Odry,http://arxiv.org/pdf/2210.13979v2 | |
http://arxiv.org/abs/2212.01217v1,creativecommons.org/licenses/by/4.0/,Using Large Pre-Trained Language Model to Assist FDA in Premarket Medical Device,Zongzhe Xu,http://arxiv.org/pdf/2212.01217v1 | |
http://arxiv.org/abs/2212.13392v1,creativecommons.org/licenses/by/4.0/,DeepCuts: Single-Shot Interpretability based Pruning for BERT,Jasdeep Singh Grover and Bhavesh Gawri and Ruskin Raj Manku,http://arxiv.org/pdf/2212.13392v1 | |
http://arxiv.org/abs/2303.07678v1,creativecommons.org/licenses/by/4.0/,Query2doc: Query Expansion with Large Language Models,Liang Wang and Nan Yang and Furu Wei,http://arxiv.org/pdf/2303.07678v1 | |
http://arxiv.org/abs/2304.08637v1,creativecommons.org/licenses/by/4.0/,An Evaluation on Large Language Model Outputs: Discourse and Memorization,Adrian de Wynter and Xun Wang and Alex Sokolov and Qilong Gu and Si-Qing Chen,http://arxiv.org/pdf/2304.08637v1 | |
http://arxiv.org/abs/2304.11490v1,creativecommons.org/licenses/by/4.0/,Boosting Theory-of-Mind Performance in Large Language Models via Prompting,Shima Rahimi Moghaddam and Christopher J. Honey,http://arxiv.org/pdf/2304.11490v1 | |
http://arxiv.org/abs/2206.04105v3,creativecommons.org/licenses/by/4.0/,Words are all you need? Language as an approximation for human similarity judgments,Raja Marjieh and Pol van Rijn and Ilia Sucholutsky and Theodore R. Sumers and Harin Lee and Thomas L. Griffiths and Nori Jacoby,http://arxiv.org/pdf/2206.04105v3 | |
http://arxiv.org/abs/2211.05015v1,creativecommons.org/licenses/by/4.0/,Detecting Languages Unintelligible to Multilingual Models through Local Structure Probes,Louis Clouâtre and Prasanna Parthasarathi and Amal Zouaq and Sarath Chandar,http://arxiv.org/pdf/2211.05015v1 | |
http://arxiv.org/abs/2206.06888v1,creativecommons.org/licenses/by/4.0/,CERT: Continual Pre-Training on Sketches for Library-Oriented Code Generation,Daoguang Zan and Bei Chen and Dejian Yang and Zeqi Lin and Minsu Kim and Bei Guan and Yongji Wang and Weizhu Chen and Jian-Guang Lou,http://arxiv.org/pdf/2206.06888v1 | |
http://arxiv.org/abs/2207.14000v1,creativecommons.org/licenses/by/4.0/,Multi-Step Deductive Reasoning Over Natural Language: An Empirical Study on Out-of-Distribution Generalisation,Qiming Bao and Alex Yuxuan Peng and Tim Hartill and Neset Tan and Zhenyun Deng and Michael Witbrock and Jiamou Liu,http://arxiv.org/pdf/2207.14000v1 | |
http://arxiv.org/abs/2211.00384v2,creativecommons.org/licenses/by/4.0/,The future is different: Large pre-trained language models fail in prediction tasks,Kostadin Cvejoski and Ramsés J. Sánchez and César Ojeda,http://arxiv.org/pdf/2211.00384v2 | |
http://arxiv.org/abs/2303.18116v1,creativecommons.org/licenses/by/4.0/,Pair Programming with Large Language Models for Sampling and Estimation of Copulas,Jan Górecki,http://arxiv.org/pdf/2303.18116v1 | |
http://arxiv.org/abs/2005.11197v2,creativecommons.org/licenses/by/4.0/,Simplify-then-Translate: Automatic Preprocessing for Black-Box Machine Translation,Sneha Mehta and Bahareh Azarnoush and Boris Chen and Avneesh Saluja and Vinith Misra and Ballav Bihani and Ritwik Kumar,http://arxiv.org/pdf/2005.11197v2 | |
http://arxiv.org/abs/2203.17247v3,creativecommons.org/licenses/by/4.0/,VL-InterpreT: An Interactive Visualization Tool for Interpreting Vision-Language Transformers,Estelle Aflalo and Meng Du and Shao-Yen Tseng and Yongfei Liu and Chenfei Wu and Nan Duan and Vasudev Lal,http://arxiv.org/pdf/2203.17247v3 | |
http://arxiv.org/abs/2212.07798v1,creativecommons.org/licenses/by/4.0/,Utilizing Background Knowledge for Robust Reasoning over Traffic Situations,Jiarui Zhang and Filip Ilievski and Aravinda Kollaa and Jonathan Francis and Kaixin Ma and Alessandro Oltramari,http://arxiv.org/pdf/2212.07798v1 | |
http://arxiv.org/abs/2302.11521v1,creativecommons.org/licenses/by/4.0/,How Does In-Context Learning Help Prompt Tuning?,Simeng Sun and Yang Liu and Dan Iter and Chenguang Zhu and Mohit Iyyer,http://arxiv.org/pdf/2302.11521v1 | |
http://arxiv.org/abs/2110.13995v1,creativecommons.org/licenses/by/4.0/,Software Engineering Meets Systems Engineering: Conceptual Modeling Applied to Engineering Operations,Sabah Al-Fedaghi and Mahdi Modhaffar,http://arxiv.org/pdf/2110.13995v1 | |
http://arxiv.org/abs/2302.03900v1,creativecommons.org/licenses/by/4.0/,Zero-shot Generation of Coherent Storybook from Plain Text Story using Diffusion Models,Hyeonho Jeong and Gihyun Kwon and Jong Chul Ye,http://arxiv.org/pdf/2302.03900v1 | |
http://arxiv.org/abs/2201.11838v3,creativecommons.org/licenses/by/4.0/,Clinical-Longformer and Clinical-BigBird: Transformers for long clinical sequences,Yikuan Li and Ramsey M. Wehbe and Faraz S. Ahmad and Hanyin Wang and Yuan Luo,http://arxiv.org/pdf/2201.11838v3 | |
http://arxiv.org/abs/2301.01181v7,creativecommons.org/licenses/by/4.0/,Large Language Models as Corporate Lobbyists,John J. Nay,http://arxiv.org/pdf/2301.01181v7 | |
http://arxiv.org/abs/1912.09582v1,creativecommons.org/licenses/by/4.0/,BERTje: A Dutch BERT Model,Wietse de Vries and Andreas van Cranenburgh and Arianna Bisazza and Tommaso Caselli and Gertjan van Noord and Malvina Nissim,http://arxiv.org/pdf/1912.09582v1 | |
http://arxiv.org/abs/2101.00297v3,creativecommons.org/licenses/by/4.0/,Analyzing Commonsense Emergence in Few-shot Knowledge Models,Jeff Da and Ronan Le Bras and Ximing Lu and Yejin Choi and Antoine Bosselut,http://arxiv.org/pdf/2101.00297v3 | |
http://arxiv.org/abs/2104.08860v2,creativecommons.org/licenses/by/4.0/,CLIP4Clip: An Empirical Study of CLIP for End to End Video Clip Retrieval,Huaishao Luo and Lei Ji and Ming Zhong and Yang Chen and Wen Lei and Nan Duan and Tianrui Li,http://arxiv.org/pdf/2104.08860v2 | |
http://arxiv.org/abs/2106.00851v1,creativecommons.org/licenses/by/4.0/,Parameter-Efficient Neural Question Answering Models via Graph-Enriched Document Representations,Louis Castricato and Stephen Fitz and Won Young Shin,http://arxiv.org/pdf/2106.00851v1 | |
http://arxiv.org/abs/2110.02402v1,creativecommons.org/licenses/by/4.0/,Language Modeling using LMUs: 10x Better Data Efficiency or Improved Scaling Compared to Transformers,Narsimha Chilkuri and Eric Hunsberger and Aaron Voelker and Gurshaant Malik and Chris Eliasmith,http://arxiv.org/pdf/2110.02402v1 | |
http://arxiv.org/abs/2111.08210v1,creativecommons.org/licenses/by/4.0/,Meeting Summarization with Pre-training and Clustering Methods,Andras Huebner and Wei Ji and Xiang Xiao,http://arxiv.org/pdf/2111.08210v1 | |
http://arxiv.org/abs/2206.07593v1,creativecommons.org/licenses/by/4.0/,HICEM: A High-Coverage Emotion Model for Artificial Emotional Intelligence,Benjamin Wortman and James Z. Wang,http://arxiv.org/pdf/2206.07593v1 | |
http://arxiv.org/abs/2209.10505v1,creativecommons.org/licenses/by/4.0/,Text Revealer: Private Text Reconstruction via Model Inversion Attacks against Transformers,Ruisi Zhang and Seira Hidano and Farinaz Koushanfar,http://arxiv.org/pdf/2209.10505v1 | |
http://arxiv.org/abs/2210.02833v1,creativecommons.org/licenses/by/4.0/,Matching Text and Audio Embeddings: Exploring Transfer-learning Strategies for Language-based Audio Retrieval,Benno Weck and Miguel Pérez Fernández and Holger Kirchhoff and Xavier Serra,http://arxiv.org/pdf/2210.02833v1 | |
http://arxiv.org/abs/2210.11468v1,creativecommons.org/licenses/by/4.0/,ObSynth: An Interactive Synthesis System for Generating Object Models from Natural Language Specifications,Alex Gu and Tamara Mitrovska and Daniela Velez and Jacob Andreas and Armando Solar-Lezama,http://arxiv.org/pdf/2210.11468v1 | |
http://arxiv.org/abs/2211.00593v1,creativecommons.org/licenses/by/4.0/,Interpretability in the Wild: a Circuit for Indirect Object Identification in GPT-2 small,Kevin Wang and Alexandre Variengien and Arthur Conmy and Buck Shlegeris and Jacob Steinhardt,http://arxiv.org/pdf/2211.00593v1 | |
http://arxiv.org/abs/2211.08987v1,creativecommons.org/licenses/by/4.0/,TSMind: Alibaba and Soochow University's Submission to the WMT22 Translation Suggestion Task,Xin Ge and Ke Wang and Jiayi Wang and Nini Xiao and Xiangyu Duan and Yu Zhao and Yuqi Zhang,http://arxiv.org/pdf/2211.08987v1 | |
http://arxiv.org/abs/2302.01318v1,creativecommons.org/licenses/by/4.0/,Accelerating Large Language Model Decoding with Speculative Sampling,Charlie Chen and Sebastian Borgeaud and Geoffrey Irving and Jean-Baptiste Lespiau and Laurent Sifre and John Jumper,http://arxiv.org/pdf/2302.01318v1 | |
http://arxiv.org/abs/2302.05932v1,creativecommons.org/licenses/by/4.0/,Stabilized In-Context Learning with Pre-trained Language Models for Few Shot Dialogue State Tracking,Derek Chen and Kun Qian and Zhou Yu,http://arxiv.org/pdf/2302.05932v1 | |
http://arxiv.org/abs/2302.12128v1,creativecommons.org/licenses/by/4.0/,On the Generalization Ability of Retrieval-Enhanced Transformers,Tobias Norlund and Ehsan Doostmohammadi and Richard Johansson and Marco Kuhlmann,http://arxiv.org/pdf/2302.12128v1 | |
http://arxiv.org/abs/2304.11721v1,creativecommons.org/licenses/by/4.0/,A Lightweight Constrained Generation Alternative for Query-focused Summarization,Zhichao Xu and Daniel Cohen,http://arxiv.org/pdf/2304.11721v1 | |
http://arxiv.org/abs/2103.02432v2,creativecommons.org/licenses/by/4.0/,FuncADL: Functional Analysis Description Language,Mason Proffitt and Gordon Watts,http://arxiv.org/pdf/2103.02432v2 | |
http://arxiv.org/abs/2203.10744v1,creativecommons.org/licenses/by/4.0/,Programming Language Agnostic Mining of Code and Language Pairs with Sequence Labeling Based Question Answering,Changran Hu and Akshara Reddi Methukupalli and Yutong Zhou and Chen Wu and Yubo Chen,http://arxiv.org/pdf/2203.10744v1 | |
http://arxiv.org/abs/2304.03816v1,creativecommons.org/licenses/by/4.0/,Towards Generating Functionally Correct Code Edits from Natural Language Issue Descriptions,Sarah Fakhoury and Saikat Chakraborty and Madan Musuvathi and Shuvendu K. Lahiri,http://arxiv.org/pdf/2304.03816v1 | |
http://arxiv.org/abs/2207.00112v1,creativecommons.org/licenses/by/4.0/,Language model compression with weighted low-rank factorization,Yen-Chang Hsu and Ting Hua and Sungen Chang and Qian Lou and Yilin Shen and Hongxia Jin,http://arxiv.org/pdf/2207.00112v1 | |
http://arxiv.org/abs/2008.05055v1,creativecommons.org/licenses/by/4.0/,The Annotation Guideline of LST20 Corpus,Prachya Boonkwan and Vorapon Luantangsrisuk and Sitthaa Phaholphinyo and Kanyanat Kriengket and Dhanon Leenoi and Charun Phrombut and Monthika Boriboon and Krit Kosawat and Thepchai Supnithi,http://arxiv.org/pdf/2008.05055v1 | |
http://arxiv.org/abs/2205.07407v1,creativecommons.org/licenses/by/4.0/,What GPT Knows About Who is Who,Xiaohan Yang and Eduardo Peynetti and Vasco Meerman and Chris Tanner,http://arxiv.org/pdf/2205.07407v1 | |
http://arxiv.org/abs/2206.13517v1,creativecommons.org/licenses/by/4.0/,ProGen2: Exploring the Boundaries of Protein Language Models,Erik Nijkamp and Jeffrey Ruffolo and Eli N. Weinstein and Nikhil Naik and Ali Madani,http://arxiv.org/pdf/2206.13517v1 | |
http://arxiv.org/abs/2210.10332v2,creativecommons.org/licenses/by/4.0/,Revision Transformers: Getting RiT of No-Nos,Felix Friedrich and Wolfgang Stammer and Patrick Schramowski and Kristian Kersting,http://arxiv.org/pdf/2210.10332v2 | |
http://arxiv.org/abs/2304.06861v1,creativecommons.org/licenses/by/4.0/,Evaluation of Social Biases in Recent Large Pre-Trained Models,Swapnil Sharma and Nikita Anand and Kranthi Kiran G. V. and Alind Jain,http://arxiv.org/pdf/2304.06861v1 | |
http://arxiv.org/abs/2104.05146v1,creativecommons.org/licenses/by/4.0/,Assessing Reference-Free Peer Evaluation for Machine Translation,Sweta Agrawal and George Foster and Markus Freitag and Colin Cherry,http://arxiv.org/pdf/2104.05146v1 | |
http://arxiv.org/abs/2108.10580v1,creativecommons.org/licenses/by/4.0/,Detection of Criminal Texts for the Polish State Border Guard,Artur Nowakowski and Krzysztof Jassem,http://arxiv.org/pdf/2108.10580v1 | |
http://arxiv.org/abs/2210.06928v2,creativecommons.org/licenses/by/4.0/,"Sentence Ambiguity, Grammaticality and Complexity Probes",Sunit Bhattacharya and Vilém Zouhar and Ondřej Bojar,http://arxiv.org/pdf/2210.06928v2 | |
http://arxiv.org/abs/2303.15642v1,creativecommons.org/licenses/by/4.0/,Graph Sequence Learning for Premise Selection,Edvard K. Holden and Konstantin Korovin,http://arxiv.org/pdf/2303.15642v1 | |
http://arxiv.org/abs/2304.05012v1,creativecommons.org/licenses/by/4.0/,Human-machine cooperation for semantic feature listing,Kushin Mukherjee and Siddharth Suresh and Timothy T. Rogers,http://arxiv.org/pdf/2304.05012v1 | |
http://arxiv.org/abs/2304.08823v1,creativecommons.org/licenses/by/4.0/,Transfer to a Low-Resource Language via Close Relatives: The Case Study on Faroese,Vésteinn Snæbjarnarson and Annika Simonsen and Goran Glavaš and Ivan Vulić,http://arxiv.org/pdf/2304.08823v1 | |
http://arxiv.org/abs/2303.10464v1,creativecommons.org/licenses/by/4.0/,SPDF: Sparse Pre-training and Dense Fine-tuning for Large Language Models,Vithursan Thangarasa and Abhay Gupta and William Marshall and Tianda Li and Kevin Leong and Dennis DeCoste and Sean Lie and Shreyas Saxena,http://arxiv.org/pdf/2303.10464v1 | |
http://arxiv.org/abs/2203.09509v4,creativecommons.org/licenses/by/4.0/,ToxiGen: A Large-Scale Machine-Generated Dataset for Adversarial and Implicit Hate Speech Detection,Thomas Hartvigsen and Saadia Gabriel and Hamid Palangi and Maarten Sap and Dipankar Ray and Ece Kamar,http://arxiv.org/pdf/2203.09509v4 | |
http://arxiv.org/abs/2208.03030v1,creativecommons.org/licenses/by/4.0/,ChiQA: A Large Scale Image-based Real-World Question Answering Dataset for Multi-Modal Understanding,Bingning Wang and Feiyang Lv and Ting Yao and Yiming Yuan and Jin Ma and Yu Luo and Haijin Liang,http://arxiv.org/pdf/2208.03030v1 | |
http://arxiv.org/abs/2012.03084v1,creativecommons.org/licenses/by/4.0/,Pre-training Protein Language Models with Label-Agnostic Binding Pairs Enhances Performance in Downstream Tasks,Modestas Filipavicius and Matteo Manica and Joris Cadow and Maria Rodriguez Martinez,http://arxiv.org/pdf/2012.03084v1 | |
http://arxiv.org/abs/2203.05115v2,creativecommons.org/licenses/by/4.0/,Internet-augmented language models through few-shot prompting for open-domain question answering,Angeliki Lazaridou and Elena Gribovskaya and Wojciech Stokowiec and Nikolai Grigorev,http://arxiv.org/pdf/2203.05115v2 | |
http://arxiv.org/abs/2302.01588v1,creativecommons.org/licenses/by/4.0/,Bioformer: an efficient transformer language model for biomedical text mining,Li Fang and Qingyu Chen and Chih-Hsuan Wei and Zhiyong Lu and Kai Wang,http://arxiv.org/pdf/2302.01588v1 | |
http://arxiv.org/abs/2302.14233v1,creativecommons.org/licenses/by/4.0/,Goal Driven Discovery of Distributional Differences via Language Descriptions,Ruiqi Zhong and Peter Zhang and Steve Li and Jinwoo Ahn and Dan Klein and Jacob Steinhardt,http://arxiv.org/pdf/2302.14233v1 | |
http://arxiv.org/abs/2302.14828v1,creativecommons.org/licenses/by/4.0/,Automatic Scoring of Dream Reports' Emotional Content with Large Language Models,Lorenzo Bertolini and Valentina Elce and Adriana Michalak and Giulio Bernardi and Julie Weeds,http://arxiv.org/pdf/2302.14828v1 | |
http://arxiv.org/abs/2304.01852v2,creativecommons.org/licenses/by/4.0/,Summary of ChatGPT/GPT-4 Research and Perspective Towards the Future of Large Language Models,Yiheng Liu and Tianle Han and Siyuan Ma and Jiayue Zhang and Yuanyuan Yang and Jiaming Tian and Hao He and Antong Li and Mengshen He and Zhengliang Liu and Zihao Wu and Dajiang Zhu and Xiang Li and Ning Qiang and Dinggang Shen and Tianming Liu and Bao Ge,http://arxiv.org/pdf/2304.01852v2 | |
http://arxiv.org/abs/2302.11520v2,creativecommons.org/licenses/by/4.0/,Guiding Large Language Models via Directional Stimulus Prompting,Zekun Li and Baolin Peng and Pengcheng He and Michel Galley and Jianfeng Gao and Xifeng Yan,http://arxiv.org/pdf/2302.11520v2 | |
http://arxiv.org/abs/2303.17071v1,creativecommons.org/licenses/by/4.0/,DERA: Enhancing Large Language Model Completions with Dialog-Enabled Resolving Agents,Varun Nair and Elliot Schumacher and Geoffrey Tso and Anitha Kannan,http://arxiv.org/pdf/2303.17071v1 | |
http://arxiv.org/abs/2304.00457v1,creativecommons.org/licenses/by/4.0/,LLMMaps -- A Visual Metaphor for Stratified Evaluation of Large Language Models,Patrik Puchert and Poonam Poonam and Christian van Onzenoodt and Timo Ropinski,http://arxiv.org/pdf/2304.00457v1 | |
http://arxiv.org/abs/2304.09406v1,creativecommons.org/licenses/by/4.0/,How to Do Things with Deep Learning Code,Minh Hua and Rita Raley,http://arxiv.org/pdf/2304.09406v1 | |
http://arxiv.org/abs/2211.11720v3,creativecommons.org/licenses/by/4.0/,Multitask Vision-Language Prompt Tuning,Sheng Shen and Shijia Yang and Tianjun Zhang and Bohan Zhai and Joseph E. Gonzalez and Kurt Keutzer and Trevor Darrell,http://arxiv.org/pdf/2211.11720v3 | |
http://arxiv.org/abs/2212.00193v1,creativecommons.org/licenses/by/4.0/,Distilling Multi-Step Reasoning Capabilities of Large Language Models into Smaller Models via Semantic Decompositions,Kumar Shridhar and Alessandro Stolfo and Mrinmaya Sachan,http://arxiv.org/pdf/2212.00193v1 | |
http://arxiv.org/abs/2212.10537v2,creativecommons.org/licenses/by/4.0/,Does CLIP Bind Concepts? Probing Compositionality in Large Image Models,Martha Lewis and Nihal V. Nayak and Peilin Yu and Qinan Yu and Jack Merullo and Stephen H. Bach and Ellie Pavlick,http://arxiv.org/pdf/2212.10537v2 | |
http://arxiv.org/abs/2106.07225v1,creativecommons.org/licenses/by/4.0/,English to Bangla Machine Translation Using Recurrent Neural Network,Shaykh Siddique and Tahmid Ahmed and Md. Rifayet Azam Talukder and Md. Mohsin Uddin,http://arxiv.org/pdf/2106.07225v1 | |
http://arxiv.org/abs/2107.07651v2,creativecommons.org/licenses/by/4.0/,Align before Fuse: Vision and Language Representation Learning with Momentum Distillation,Junnan Li and Ramprasaath R. Selvaraju and Akhilesh Deepak Gotmare and Shafiq Joty and Caiming Xiong and Steven Hoi,http://arxiv.org/pdf/2107.07651v2 | |
http://arxiv.org/abs/2108.09105v1,creativecommons.org/licenses/by/4.0/,Airbert: In-domain Pretraining for Vision-and-Language Navigation,Pierre-Louis Guhur and Makarand Tapaswi and Shizhe Chen and Ivan Laptev and Cordelia Schmid,http://arxiv.org/pdf/2108.09105v1 | |
http://arxiv.org/abs/2202.12814v1,creativecommons.org/licenses/by/4.0/,The Reality of Multi-Lingual Machine Translation,Tom Kocmi and Dominik Macháček and Ondřej Bojar,http://arxiv.org/pdf/2202.12814v1 | |
http://arxiv.org/abs/2204.11454v2,creativecommons.org/licenses/by/4.0/,Natural Language to Code Translation with Execution,Freda Shi and Daniel Fried and Marjan Ghazvininejad and Luke Zettlemoyer and Sida I. Wang,http://arxiv.org/pdf/2204.11454v2 | |
http://arxiv.org/abs/2205.15503v3,creativecommons.org/licenses/by/4.0/,Leveraging Pre-Trained Language Models to Streamline Natural Language Interaction for Self-Tracking,Young-Ho Kim and Sungdong Kim and Minsuk Chang and Sang-Woo Lee,http://arxiv.org/pdf/2205.15503v3 | |
http://arxiv.org/abs/2210.03629v3,creativecommons.org/licenses/by/4.0/,ReAct: Synergizing Reasoning and Acting in Language Models,Shunyu Yao and Jeffrey Zhao and Dian Yu and Nan Du and Izhak Shafran and Karthik Narasimhan and Yuan Cao,http://arxiv.org/pdf/2210.03629v3 | |
http://arxiv.org/abs/2303.18027v2,creativecommons.org/licenses/by/4.0/,Evaluating GPT-4 and ChatGPT on Japanese Medical Licensing Examinations,Jungo Kasai and Yuhei Kasai and Keisuke Sakaguchi and Yutaro Yamada and Dragomir Radev,http://arxiv.org/pdf/2303.18027v2 | |
http://arxiv.org/abs/2304.02754v1,creativecommons.org/licenses/by/4.0/,Behavioral estimates of conceptual structure are robust across tasks in humans but not large language models,Siddharth Suresh and Lisa Padua and Kushin Mukherjee and Timothy T Rogers,http://arxiv.org/pdf/2304.02754v1 | |
http://arxiv.org/abs/2007.15211v2,creativecommons.org/licenses/by/4.0/,NeuralQA: A Usable Library for Question Answering (Contextual Query Expansion + BERT) on Large Datasets,Victor Dibia,http://arxiv.org/pdf/2007.15211v2 | |
http://arxiv.org/abs/2203.07259v3,creativecommons.org/licenses/by/4.0/,The Optimal BERT Surgeon: Scalable and Accurate Second-Order Pruning for Large Language Models,Eldar Kurtic and Daniel Campos and Tuan Nguyen and Elias Frantar and Mark Kurtz and Benjamin Fineran and Michael Goin and Dan Alistarh,http://arxiv.org/pdf/2203.07259v3 | |
http://arxiv.org/abs/2211.17192v1,creativecommons.org/licenses/by/4.0/,Fast Inference from Transformers via Speculative Decoding,Yaniv Leviathan and Matan Kalman and Yossi Matias,http://arxiv.org/pdf/2211.17192v1 | |
http://arxiv.org/abs/2302.06321v1,creativecommons.org/licenses/by/4.0/,Parameter-efficient Modularised Bias Mitigation via AdapterFusion,Deepak Kumar and Oleg Lesota and George Zerveas and Daniel Cohen and Carsten Eickhoff and Markus Schedl and Navid Rekabsaz,http://arxiv.org/pdf/2302.06321v1 | |
http://arxiv.org/abs/2011.06195v1,creativecommons.org/licenses/by/4.0/,Towards Semi-Supervised Semantics Understanding from Speech,Cheng-I Lai and Jin Cao and Sravan Bodapati and Shang-Wen Li,http://arxiv.org/pdf/2011.06195v1 | |
http://arxiv.org/abs/2101.04998v1,creativecommons.org/licenses/by/4.0/,Coarse and Fine-Grained Hostility Detection in Hindi Posts using Fine Tuned Multilingual Embeddings,Arkadipta De and Venkatesh E and Kaushal Kumar Maurya and Maunendra Sankar Desarkar,http://arxiv.org/pdf/2101.04998v1 | |
http://arxiv.org/abs/2104.06378v5,creativecommons.org/licenses/by/4.0/,QA-GNN: Reasoning with Language Models and Knowledge Graphs for Question Answering,Michihiro Yasunaga and Hongyu Ren and Antoine Bosselut and Percy Liang and Jure Leskovec,http://arxiv.org/pdf/2104.06378v5 | |
http://arxiv.org/abs/2106.10619v1,creativecommons.org/licenses/by/4.0/,A Brief Study on the Effects of Training Generative Dialogue Models with a Semantic loss,Prasanna Parthasarathi and Mohamed Abdelsalam and Joelle Pineau and Sarath Chandar,http://arxiv.org/pdf/2106.10619v1 | |
http://arxiv.org/abs/2108.00946v2,creativecommons.org/licenses/by/4.0/,StyleGAN-NADA: CLIP-Guided Domain Adaptation of Image Generators,Rinon Gal and Or Patashnik and Haggai Maron and Gal Chechik and Daniel Cohen-Or,http://arxiv.org/pdf/2108.00946v2 | |
http://arxiv.org/abs/2109.07953v1,creativecommons.org/licenses/by/4.0/,Efficient Attribute Injection for Pretrained Language Models,Reinald Kim Amplayo and Kang Min Yoo and Sang-Woo Lee,http://arxiv.org/pdf/2109.07953v1 | |
http://arxiv.org/abs/2110.15836v2,creativecommons.org/licenses/by/4.0/,Combining Unsupervised and Text Augmented Semi-Supervised Learning for Low Resourced Autoregressive Speech Recognition,Chak-Fai Li and Francis Keith and William Hartmann and Matthew Snover,http://arxiv.org/pdf/2110.15836v2 | |
http://arxiv.org/abs/2202.03753v2,creativecommons.org/licenses/by/4.0/,Semantic features of object concepts generated with GPT-3,Hannes Hansen and Martin N. Hebart,http://arxiv.org/pdf/2202.03753v2 | |
http://arxiv.org/abs/2205.05448v2,creativecommons.org/licenses/by/4.0/,Symphony Generation with Permutation Invariant Language Model,Jiafeng Liu and Yuanliang Dong and Zehua Cheng and Xinran Zhang and Xiaobing Li and Feng Yu and Maosong Sun,http://arxiv.org/pdf/2205.05448v2 | |
http://arxiv.org/abs/2206.15014v1,creativecommons.org/licenses/by/4.0/,Compressing Pre-trained Transformers via Low-Bit NxM Sparsity for Natural Language Understanding,Connor Holmes and Minjia Zhang and Yuxiong He and Bo Wu,http://arxiv.org/pdf/2206.15014v1 | |
http://arxiv.org/abs/2209.11068v1,creativecommons.org/licenses/by/4.0/,Prompting for a conversation: How to control a dialog model?,Josef Valvoda and Yimai Fang and David Vandyke,http://arxiv.org/pdf/2209.11068v1 | |
http://arxiv.org/abs/2210.09132v1,creativecommons.org/licenses/by/4.0/,Pseudo-OOD training for robust language models,Dhanasekar Sundararaman and Nikhil Mehta and Lawrence Carin,http://arxiv.org/pdf/2210.09132v1 | |
http://arxiv.org/abs/2210.15452v1,creativecommons.org/licenses/by/4.0/,Exploring Predictive Uncertainty and Calibration in NLP: A Study on the Impact of Method & Data Scarcity,Dennis Ulmer and Jes Frellsen and Christian Hardmeier,http://arxiv.org/pdf/2210.15452v1 | |
http://arxiv.org/abs/2211.08989v1,creativecommons.org/licenses/by/4.0/,Avoid Overthinking in Self-Supervised Models for Speech Recognition,Dan Berrebbi and Brian Yan and Shinji Watanabe,http://arxiv.org/pdf/2211.08989v1 | |
http://arxiv.org/abs/1709.09443v1,creativecommons.org/licenses/by/4.0/,Prosodic Features from Large Corpora of Child-Directed Speech as Predictors of the Age of Acquisition of Words,Lea Frermann and Michael C. Frank,http://arxiv.org/pdf/1709.09443v1 | |
http://arxiv.org/abs/2101.00376v2,creativecommons.org/licenses/by/4.0/,RiddleSense: Reasoning about Riddle Questions Featuring Linguistic Creativity and Commonsense Knowledge,Bill Yuchen Lin and Ziyi Wu and Yichi Yang and Dong-Ho Lee and Xiang Ren,http://arxiv.org/pdf/2101.00376v2 | |
http://arxiv.org/abs/2112.12926v1,creativecommons.org/licenses/by/4.0/,nvBench: A Large-Scale Synthesized Dataset for Cross-Domain Natural Language to Visualization Task,Yuyu Luo and Jiawei Tang and Guoliang Li,http://arxiv.org/pdf/2112.12926v1 | |
http://arxiv.org/abs/2202.04742v1,creativecommons.org/licenses/by/4.0/,FedQAS: Privacy-aware machine reading comprehension with federated learning,Addi Ait-Mlouk and Sadi Alawadi and Salman Toor and Andreas Hellander,http://arxiv.org/pdf/2202.04742v1 | |
http://arxiv.org/abs/2204.13309v1,creativecommons.org/licenses/by/4.0/,Improving robustness of language models from a geometry-aware perspective,Bin Zhu and Zhaoquan Gu and Le Wang and Jinyin Chen and Qi Xuan,http://arxiv.org/pdf/2204.13309v1 | |
http://arxiv.org/abs/2205.10981v1,creativecommons.org/licenses/by/4.0/,Improving Short Text Classification With Augmented Data Using GPT-3,Salvador Balkus and Donghui Yan,http://arxiv.org/pdf/2205.10981v1 | |
http://arxiv.org/abs/2205.12615v1,creativecommons.org/licenses/by/4.0/,Autoformalization with Large Language Models,Yuhuai Wu and Albert Q. Jiang and Wenda Li and Markus N. Rabe and Charles Staats and Mateja Jamnik and Christian Szegedy,http://arxiv.org/pdf/2205.12615v1 | |
http://arxiv.org/abs/2206.04585v2,creativecommons.org/licenses/by/4.0/,Extracting Zero-shot Common Sense from Large Language Models for Robot 3D Scene Understanding,William Chen and Siyi Hu and Rajat Talak and Luca Carlone,http://arxiv.org/pdf/2206.04585v2 | |
http://arxiv.org/abs/2209.04811v1,creativecommons.org/licenses/by/4.0/,Probing for Understanding of English Verb Classes and Alternations in Large Pre-trained Language Models,David K. Yi and James V. Bruno and Jiayu Han and Peter Zukerman and Shane Steinert-Threlkeld,http://arxiv.org/pdf/2209.04811v1 | |
http://arxiv.org/abs/2209.12711v1,creativecommons.org/licenses/by/4.0/,Can Large Language Models Truly Understand Prompts? A Case Study with Negated Prompts,Joel Jang and Seonghyeon Ye and Minjoon Seo,http://arxiv.org/pdf/2209.12711v1 | |
http://arxiv.org/abs/2211.04699v1,creativecommons.org/licenses/by/4.0/,FF2: A Feature Fusion Two-Stream Framework for Punctuation Restoration,Yangjun Wu and Kebin Fang and Yao Zhao and Hao Zhang and Lifeng Shi and Mengqi Zhang,http://arxiv.org/pdf/2211.04699v1 | |
http://arxiv.org/abs/2211.04898v2,creativecommons.org/licenses/by/4.0/,Mask More and Mask Later: Efficient Pre-training of Masked Language Models by Disentangling the [MASK] Token,Baohao Liao and David Thulke and Sanjika Hewavitharana and Hermann Ney and Christof Monz,http://arxiv.org/pdf/2211.04898v2 | |
http://arxiv.org/abs/2212.05113v1,creativecommons.org/licenses/by/4.0/,Automatically Generating CS Learning Materials with Large Language Models,Stephen MacNeil and Andrew Tran and Juho Leinonen and Paul Denny and Joanne Kim and Arto Hellas and Seth Bernstein and Sami Sarsa,http://arxiv.org/pdf/2212.05113v1 | |
http://arxiv.org/abs/2212.11456v1,creativecommons.org/licenses/by/4.0/,CAMeMBERT: Cascading Assistant-Mediated Multilingual BERT,Dan DeGenaro and Jugal Kalita,http://arxiv.org/pdf/2212.11456v1 | |
http://arxiv.org/abs/2212.14047v1,creativecommons.org/licenses/by/4.0/,Using Large Language Models to Generate Engaging Captions for Data Visualizations,Ashley Liew and Klaus Mueller,http://arxiv.org/pdf/2212.14047v1 | |
http://arxiv.org/abs/2301.08721v1,creativecommons.org/licenses/by/4.0/,Batch Prompting: Efficient Inference with Large Language Model APIs,Zhoujun Cheng and Jungo Kasai and Tao Yu,http://arxiv.org/pdf/2301.08721v1 | |
http://arxiv.org/abs/2302.07856v1,creativecommons.org/licenses/by/4.0/,Dictionary-based Phrase-level Prompting of Large Language Models for Machine Translation,Marjan Ghazvininejad and Hila Gonen and Luke Zettlemoyer,http://arxiv.org/pdf/2302.07856v1 | |
http://arxiv.org/abs/2302.12832v1,creativecommons.org/licenses/by/4.0/,Fluid Transformers and Creative Analogies: Exploring Large Language Models' Capacity for Augmenting Cross-Domain Analogical Creativity,Zijian Ding and Arvind Srinivasan and Stephen MacNeil and Joel Chan,http://arxiv.org/pdf/2302.12832v1 | |
http://arxiv.org/abs/2303.01580v1,creativecommons.org/licenses/by/4.0/,Mixture of Soft Prompts for Controllable Data Generation,Derek Chen and Celine Lee and Yunan Lu and Domenic Rosati and Zhou Yu,http://arxiv.org/pdf/2303.01580v1 | |
http://arxiv.org/abs/2303.06247v2,creativecommons.org/licenses/by/4.0/,Task and Motion Planning with Large Language Models for Object Rearrangement,Yan Ding and Xiaohan Zhang and Chris Paxton and Shiqi Zhang,http://arxiv.org/pdf/2303.06247v2 | |
http://arxiv.org/abs/2303.15473v1,creativecommons.org/licenses/by/4.0/,Can Large Language Models assist in Hazard Analysis?,Simon Diemert and Jens H Weber,http://arxiv.org/pdf/2303.15473v1 | |
http://arxiv.org/abs/2303.16421v1,creativecommons.org/licenses/by/4.0/,ChatGPT is a Knowledgeable but Inexperienced Solver: An Investigation of Commonsense Problem in Large Language Models,Ning Bian and Xianpei Han and Le Sun and Hongyu Lin and Yaojie Lu and Ben He,http://arxiv.org/pdf/2303.16421v1 | |
http://arxiv.org/abs/2304.04487v1,creativecommons.org/licenses/by/4.0/,Inference with Reference: Lossless Acceleration of Large Language Models,Nan Yang and Tao Ge and Liang Wang and Binxing Jiao and Daxin Jiang and Linjun Yang and Rangan Majumder and Furu Wei,http://arxiv.org/pdf/2304.04487v1 | |
http://arxiv.org/abs/2304.06638v1,creativecommons.org/licenses/by/4.0/,How Useful are Educational Questions Generated by Large Language Models?,Sabina Elkins and Ekaterina Kochmar and Jackie C. K. Cheung and Iulian Serban,http://arxiv.org/pdf/2304.06638v1 | |
http://arxiv.org/abs/2012.13354v2,creativecommons.org/licenses/by/4.0/,To what extent do human explanations of model behavior align with actual model behavior?,Grusha Prasad and Yixin Nie and Mohit Bansal and Robin Jia and Douwe Kiela and Adina Williams,http://arxiv.org/pdf/2012.13354v2 | |
http://arxiv.org/abs/2303.02939v3,creativecommons.org/licenses/by/4.0/,FoundationTTS: Text-to-Speech for ASR Customization with Generative Language Model,Ruiqing Xue and Yanqing Liu and Lei He and Xu Tan and Linquan Liu and Edward Lin and Sheng Zhao,http://arxiv.org/pdf/2303.02939v3 | |
http://arxiv.org/abs/2209.07686v2,creativecommons.org/licenses/by/4.0/,"Text and Patterns: For Effective Chain of Thought, It Takes Two to Tango",Aman Madaan and Amir Yazdanbakhsh,http://arxiv.org/pdf/2209.07686v2 | |
http://arxiv.org/abs/1604.05372v1,creativecommons.org/licenses/by/4.0/,Clustering Comparable Corpora of Russian and Ukrainian Academic Texts: Word Embeddings and Semantic Fingerprints,Andrey Kutuzov and Mikhail Kopotev and Tatyana Sviridenko and Lyubov Ivanova,http://arxiv.org/pdf/1604.05372v1 | |
http://arxiv.org/abs/2103.01273v2,creativecommons.org/licenses/by/4.0/,"On the Effectiveness of Dataset Embeddings in Mono-lingual, Multi-lingual and Zero-shot Conditions",Rob van der Goot and Ahmet Üstün and Barbara Plank,http://arxiv.org/pdf/2103.01273v2 | |
http://arxiv.org/abs/2105.09081v1,creativecommons.org/licenses/by/4.0/,Essay-BR: a Brazilian Corpus of Essays,Jeziel C. Marinho and Rafael T. Anchieta and Raimundo S. Moura,http://arxiv.org/pdf/2105.09081v1 | |
http://arxiv.org/abs/2109.05357v1,creativecommons.org/licenses/by/4.0/,Learning from Language Description: Low-shot Named Entity Recognition via Decomposed Framework,Yaqing Wang and Haoda Chu and Chao Zhang and Jing Gao,http://arxiv.org/pdf/2109.05357v1 | |
http://arxiv.org/abs/2109.10147v1,creativecommons.org/licenses/by/4.0/,Knowledge Distillation with Noisy Labels for Natural Language Understanding,Shivendra Bhardwaj and Abbas Ghaddar and Ahmad Rashid and Khalil Bibi and Chengyang Li and Ali Ghodsi and Philippe Langlais and Mehdi Rezagholizadeh,http://arxiv.org/pdf/2109.10147v1 | |
http://arxiv.org/abs/2110.09635v1,creativecommons.org/licenses/by/4.0/,A ground-truth dataset of real security patches,Sofia Reis and Rui Abreu,http://arxiv.org/pdf/2110.09635v1 | |
http://arxiv.org/abs/2110.11790v1,creativecommons.org/licenses/by/4.0/,Automatic Guide Generation for Stan via NumPyro,Guillaume Baudart and Louis Mandel,http://arxiv.org/pdf/2110.11790v1 | |
http://arxiv.org/abs/2112.08633v2,creativecommons.org/licenses/by/4.0/,Learning To Retrieve Prompts for In-Context Learning,Ohad Rubin and Jonathan Herzig and Jonathan Berant,http://arxiv.org/pdf/2112.08633v2 | |
http://arxiv.org/abs/2201.05613v2,creativecommons.org/licenses/by/4.0/,The Dark Side of the Language: Pre-trained Transformers in the DarkNet,Leonardo Ranaldi and Aria Nourbakhsh and Arianna Patrizi and Elena Sofia Ruzzetti and Dario Onorati and Francesca Fallucchi and Fabio Massimo Zanzotto,http://arxiv.org/pdf/2201.05613v2 | |
http://arxiv.org/abs/2202.07991v1,creativecommons.org/licenses/by/4.0/,ADIMA: Abuse Detection In Multilingual Audio,Vikram Gupta and Rini Sharon and Ramit Sawhney and Debdoot Mukherjee,http://arxiv.org/pdf/2202.07991v1 | |
http://arxiv.org/abs/2204.14243v2,creativecommons.org/licenses/by/4.0/,Training Naturalized Semantic Parsers with Very Little Data,Subendhu Rongali and Konstantine Arkoudas and Melanie Rubino and Wael Hamza,http://arxiv.org/pdf/2204.14243v2 | |
http://arxiv.org/abs/2209.00731v2,creativecommons.org/licenses/by/4.0/,In conversation with Artificial Intelligence: aligning language models with human values,Atoosa Kasirzadeh and Iason Gabriel,http://arxiv.org/pdf/2209.00731v2 | |
http://arxiv.org/abs/2210.13838v2,creativecommons.org/licenses/by/4.0/,Multilingual Relation Classification via Efficient and Effective Prompting,Yuxuan Chen and David Harbecke and Leonhard Hennig,http://arxiv.org/pdf/2210.13838v2 | |
http://arxiv.org/abs/2211.00046v1,creativecommons.org/licenses/by/4.0/,Very Low Resource Sentence Alignment: Luhya and Swahili,Everlyn Asiko Chimoto and Bruce A. Bassett,http://arxiv.org/pdf/2211.00046v1 | |
http://arxiv.org/abs/2212.10539v1,creativecommons.org/licenses/by/4.0/,"Toward Human Readable Prompt Tuning: Kubrick's The Shining is a good movie, and a good prompt too?",Weijia Shi and Xiaochuang Han and Hila Gonen and Ari Holtzman and Yulia Tsvetkov and Luke Zettlemoyer,http://arxiv.org/pdf/2212.10539v1 | |
http://arxiv.org/abs/2303.00733v1,creativecommons.org/licenses/by/4.0/,SpeechPrompt v2: Prompt Tuning for Speech Classification Tasks,Kai-Wei Chang and Yu-Kai Wang and Hua Shen and Iu-thing Kang and Wei-Cheng Tseng and Shang-Wen Li and Hung-yi Lee,http://arxiv.org/pdf/2303.00733v1 | |
http://arxiv.org/abs/2304.01746v1,creativecommons.org/licenses/by/4.0/,Is ChatGPT a Highly Fluent Grammatical Error Correction System? A Comprehensive Evaluation,Tao Fang and Shu Yang and Kaixin Lan and Derek F. Wong and Jinpeng Hu and Lidia S. Chao and Yue Zhang,http://arxiv.org/pdf/2304.01746v1 | |
http://arxiv.org/abs/2304.03682v1,creativecommons.org/licenses/by/4.0/,BenCoref: A Multi-Domain Dataset of Nominal Phrases and Pronominal Reference Annotations,Shadman Rohan and Mojammel Hossain and Mohammad Mamun Or Rashid and Nabeel Mohammed,http://arxiv.org/pdf/2304.03682v1 | |
http://arxiv.org/abs/1512.08823v2,creativecommons.org/licenses/by/4.0/,Reduction of Nondeterministic Tree Automata,Ricardo Almeida and Lukáš Holík and Richard Mayr,http://arxiv.org/pdf/1512.08823v2 | |
http://arxiv.org/abs/2004.02077v1,creativecommons.org/licenses/by/4.0/,Machine Translation Pre-training for Data-to-Text Generation -- A Case Study in Czech,Mihir Kale and Scott Roy,http://arxiv.org/pdf/2004.02077v1 | |
http://arxiv.org/abs/2112.12750v1,creativecommons.org/licenses/by/4.0/,SLIP: Self-supervision meets Language-Image Pre-training,Norman Mu and Alexander Kirillov and David Wagner and Saining Xie,http://arxiv.org/pdf/2112.12750v1 | |
http://arxiv.org/abs/2111.09543v4,creativecommons.org/licenses/by/4.0/,DeBERTaV3: Improving DeBERTa using ELECTRA-Style Pre-Training with Gradient-Disentangled Embedding Sharing,Pengcheng He and Jianfeng Gao and Weizhu Chen,http://arxiv.org/pdf/2111.09543v4 | |
http://arxiv.org/abs/2106.06297v1,creativecommons.org/licenses/by/4.0/,Dynamic Language Models for Continuously Evolving Content,Spurthi Amba Hombaiah and Tao Chen and Mingyang Zhang and Michael Bendersky and Marc Najork,http://arxiv.org/pdf/2106.06297v1 | |
http://arxiv.org/abs/2303.06135v2,creativecommons.org/licenses/by/4.0/,Rewarding Chatbots for Real-World Engagement with Millions of Users,Robert Irvine and Douglas Boubert and Vyas Raina and Adian Liusie and Ziyi Zhu and Vineet Mudupalli and Aliaksei Korshuk and Zongyi Liu and Fritz Cremer and Valentin Assassi and Christie-Carol Beauchamp and Xiaoding Lu and Thomas Rialan and William Beauchamp,http://arxiv.org/pdf/2303.06135v2 | |
http://arxiv.org/abs/2301.09626v1,creativecommons.org/licenses/by/4.0/,Efficient Language Model Training through Cross-Lingual and Progressive Transfer Learning,Malte Ostendorff and Georg Rehm,http://arxiv.org/pdf/2301.09626v1 | |
http://arxiv.org/abs/2012.07534v1,creativecommons.org/licenses/by/4.0/,Effect of Word Embedding Models on Hate and Offensive Speech Detection,Safa Alsafari and Samira Sadaoui and Malek Mouhoub,http://arxiv.org/pdf/2012.07534v1 | |
http://arxiv.org/abs/2105.07465v3,creativecommons.org/licenses/by/4.0/,SLGPT: Using Transfer Learning to Directly Generate Simulink Model Files and Find Bugs in the Simulink Toolchain,Sohil Lal Shrestha and Christoph Csallner,http://arxiv.org/pdf/2105.07465v3 | |
http://arxiv.org/abs/2106.00840v1,creativecommons.org/licenses/by/4.0/,Comparing Test Sets with Item Response Theory,Clara Vania and Phu Mon Htut and William Huang and Dhara Mungra and Richard Yuanzhe Pang and Jason Phang and Haokun Liu and Kyunghyun Cho and Samuel R. Bowman,http://arxiv.org/pdf/2106.00840v1 | |
http://arxiv.org/abs/2112.00791v2,creativecommons.org/licenses/by/4.0/,Controlling Conditional Language Models without Catastrophic Forgetting,Tomasz Korbak and Hady Elsahar and German Kruszewski and Marc Dymetman,http://arxiv.org/pdf/2112.00791v2 | |
http://arxiv.org/abs/2204.05979v1,creativecommons.org/licenses/by/4.0/,Discovering material information using hierarchical Reformer model on financial regulatory filings,Francois Mercier and Makesh Narsimhan,http://arxiv.org/pdf/2204.05979v1 | |
http://arxiv.org/abs/2211.09800v2,creativecommons.org/licenses/by/4.0/,InstructPix2Pix: Learning to Follow Image Editing Instructions,Tim Brooks and Aleksander Holynski and Alexei A. Efros,http://arxiv.org/pdf/2211.09800v2 | |
http://arxiv.org/abs/2211.15199v1,creativecommons.org/licenses/by/4.0/,Large Pre-Trained Models with Extra-Large Vocabularies: A Contrastive Analysis of Hebrew BERT Models and a New One to Outperform Them All,Eylon Guetta and Avi Shmidman and Shaltiel Shmidman and Cheyn Shmuel Shmidman and Joshua Guedalia and Moshe Koppel and Dan Bareket and Amit Seker and Reut Tsarfaty,http://arxiv.org/pdf/2211.15199v1 | |
http://arxiv.org/abs/2112.08726v1,creativecommons.org/licenses/by/4.0/,NeuroLogic A*esque Decoding: Constrained Text Generation with Lookahead Heuristics,Ximing Lu and Sean Welleck and Peter West and Liwei Jiang and Jungo Kasai and Daniel Khashabi and Ronan Le Bras and Lianhui Qin and Youngjae Yu and Rowan Zellers and Noah A. Smith and Yejin Choi,http://arxiv.org/pdf/2112.08726v1 | |
http://arxiv.org/abs/2201.06723v2,creativecommons.org/licenses/by/4.0/,Emojis as Anchors to Detect Arabic Offensive Language and Hate Speech,Hamdy Mubarak and Sabit Hassan and Shammur Absar Chowdhury,http://arxiv.org/pdf/2201.06723v2 | |
http://arxiv.org/abs/2202.01374v1,creativecommons.org/licenses/by/4.0/,mSLAM: Massively multilingual joint pre-training for speech and text,Ankur Bapna and Colin Cherry and Yu Zhang and Ye Jia and Melvin Johnson and Yong Cheng and Simran Khanuja and Jason Riesa and Alexis Conneau,http://arxiv.org/pdf/2202.01374v1 | |
http://arxiv.org/abs/2205.04652v1,creativecommons.org/licenses/by/4.0/,SuMe: A Dataset Towards Summarizing Biomedical Mechanisms,Mohaddeseh Bastan and Nishant Shankar and Mihai Surdeanu and Niranjan Balasubramanian,http://arxiv.org/pdf/2205.04652v1 | |
http://arxiv.org/abs/2206.08932v1,creativecommons.org/licenses/by/4.0/,Putting GPT-3's Creativity to the (Alternative Uses) Test,Claire Stevenson and Iris Smal and Matthijs Baas and Raoul Grasman and Han van der Maas,http://arxiv.org/pdf/2206.08932v1 | |
http://arxiv.org/abs/2212.10846v2,creativecommons.org/licenses/by/4.0/,From Images to Textual Prompts: Zero-shot VQA with Frozen Large Language Models,Jiaxian Guo and Junnan Li and Dongxu Li and Anthony Meng Huat Tiong and Boyang Li and Dacheng Tao and Steven C. H. Hoi,http://arxiv.org/pdf/2212.10846v2 | |
http://arxiv.org/abs/2302.04931v1,creativecommons.org/licenses/by/4.0/,In-Context Learning with Many Demonstration Examples,Mukai Li and Shansan Gong and Jiangtao Feng and Yiheng Xu and Jun Zhang and Zhiyong Wu and Lingpeng Kong,http://arxiv.org/pdf/2302.04931v1 | |
http://arxiv.org/abs/2111.00276v2,creativecommons.org/licenses/by/4.0/,EventNarrative: A large-scale Event-centric Dataset for Knowledge Graph-to-Text Generation,Anthony Colas and Ali Sadeghian and Yue Wang and Daisy Zhe Wang,http://arxiv.org/pdf/2111.00276v2 | |
http://arxiv.org/abs/2208.11701v1,creativecommons.org/licenses/by/4.0/,Ontology-Driven Self-Supervision for Adverse Childhood Experiences Identification Using Social Media Datasets,Jinge Wu and Rowena Smith and Honghan Wu,http://arxiv.org/pdf/2208.11701v1 | |
http://arxiv.org/abs/2205.05055v6,creativecommons.org/licenses/by/4.0/,Data Distributional Properties Drive Emergent In-Context Learning in Transformers,Stephanie C. Y. Chan and Adam Santoro and Andrew K. Lampinen and Jane X. Wang and Aaditya Singh and Pierre H. Richemond and Jay McClelland and Felix Hill,http://arxiv.org/pdf/2205.05055v6 | |
http://arxiv.org/abs/2205.11005v1,creativecommons.org/licenses/by/4.0/,Parameter-Efficient Sparsity for Large Language Models Fine-Tuning,Yuchao Li and Fuli Luo and Chuanqi Tan and Mengdi Wang and Songfang Huang and Shen Li and Junjie Bai,http://arxiv.org/pdf/2205.11005v1 | |
http://arxiv.org/abs/2110.12201v1,creativecommons.org/licenses/by/4.0/,Spanish Legalese Language Model and Corpora,Asier Gutiérrez-Fandiño and Jordi Armengol-Estapé and Aitor Gonzalez-Agirre and Marta Villegas,http://arxiv.org/pdf/2110.12201v1 | |
http://arxiv.org/abs/1911.00461v1,creativecommons.org/licenses/by/4.0/,On the Unintended Social Bias of Training Language Generation Models with Data from Local Media,Omar U. Florez,http://arxiv.org/pdf/1911.00461v1 | |
http://arxiv.org/abs/2010.04897v1,creativecommons.org/licenses/by/4.0/,Information Extraction from Swedish Medical Prescriptions with Sig-Transformer Encoder,John Pougue Biyong and Bo Wang and Terry Lyons and Alejo J Nevado-Holgado,http://arxiv.org/pdf/2010.04897v1 | |
http://arxiv.org/abs/2102.03551v1,creativecommons.org/licenses/by/4.0/,Jointly Improving Language Understanding and Generation with Quality-Weighted Weak Supervision of Automatic Labeling,Ernie Chang and Vera Demberg and Alex Marin,http://arxiv.org/pdf/2102.03551v1
http://arxiv.org/abs/2104.01394v1,creativecommons.org/licenses/by/4.0/,MMBERT: Multimodal BERT Pretraining for Improved Medical VQA,Yash Khare and Viraj Bagal and Minesh Mathew and Adithi Devi and U Deva Priyakumar and CV Jawahar,http://arxiv.org/pdf/2104.01394v1
http://arxiv.org/abs/2106.00590v2,creativecommons.org/licenses/by/4.0/,NewsEmbed: Modeling News through Pre-trained Document Representations,Jialu Liu and Tianqi Liu and Cong Yu,http://arxiv.org/pdf/2106.00590v2
http://arxiv.org/abs/2109.12036v1,creativecommons.org/licenses/by/4.0/,Transformers Generalize Linearly,Jackson Petty and Robert Frank,http://arxiv.org/pdf/2109.12036v1
http://arxiv.org/abs/2109.12406v1,creativecommons.org/licenses/by/4.0/,MINIMAL: Mining Models for Data Free Universal Adversarial Triggers,Swapnil Parekh and Yaman Singla Kumar and Somesh Singh and Changyou Chen and Balaji Krishnamurthy and Rajiv Ratn Shah,http://arxiv.org/pdf/2109.12406v1
http://arxiv.org/abs/2111.00526v2,creativecommons.org/licenses/by/4.0/,FinEAS: Financial Embedding Analysis of Sentiment,Asier Gutiérrez-Fandiño and Miquel Noguer i Alonso and Petter Kolm and Jordi Armengol-Estapé,http://arxiv.org/pdf/2111.00526v2
http://arxiv.org/abs/2111.01543v1,creativecommons.org/licenses/by/4.0/,UQuAD1.0: Development of an Urdu Question Answering Training Data for Machine Reading Comprehension,Samreen Kazi and Shakeel Khoja,http://arxiv.org/pdf/2111.01543v1
http://arxiv.org/abs/2203.10378v1,creativecommons.org/licenses/by/4.0/,On Robust Prefix-Tuning for Text Classification,Zonghan Yang and Yang Liu,http://arxiv.org/pdf/2203.10378v1
http://arxiv.org/abs/2204.05185v2,creativecommons.org/licenses/by/4.0/,Uniform Complexity for Text Generation,Joseph Marvin Imperial,http://arxiv.org/pdf/2204.05185v2
http://arxiv.org/abs/2205.00363v3,creativecommons.org/licenses/by/4.0/,Visual Spatial Reasoning,Fangyu Liu and Guy Emerson and Nigel Collier,http://arxiv.org/pdf/2205.00363v3
http://arxiv.org/abs/2205.09246v1,creativecommons.org/licenses/by/4.0/,Transformer-based Program Synthesis for Low-Data Environments,Jack Roper,http://arxiv.org/pdf/2205.09246v1
http://arxiv.org/abs/2208.05798v1,creativecommons.org/licenses/by/4.0/,Aesthetic Visual Question Answering of Photographs,Xin Jin and Wu Zhou and Xinghui Zhou and Shuai Cui and Le Zhang and Jianwen Lv and Shu Zhao,http://arxiv.org/pdf/2208.05798v1
http://arxiv.org/abs/2210.01240v4,creativecommons.org/licenses/by/4.0/,Language Models Are Greedy Reasoners: A Systematic Formal Analysis of Chain-of-Thought,Abulhair Saparov and He He,http://arxiv.org/pdf/2210.01240v4
http://arxiv.org/abs/2210.05839v1,creativecommons.org/licenses/by/4.0/,SEAL: Interactive Tool for Systematic Error Analysis and Labeling,Nazneen Rajani and Weixin Liang and Lingjiao Chen and Meg Mitchell and James Zou,http://arxiv.org/pdf/2210.05839v1
http://arxiv.org/abs/2211.07954v1,creativecommons.org/licenses/by/4.0/,An Overview on Controllable Text Generation via Variational Auto-Encoders,Haoqin Tu and Yitong Li,http://arxiv.org/pdf/2211.07954v1
http://arxiv.org/abs/2212.10114v1,creativecommons.org/licenses/by/4.0/,True Detective: A Challenging Benchmark for Deep Abductive Reasoning in Foundation Models,Maksym Del and Mark Fishel,http://arxiv.org/pdf/2212.10114v1
http://arxiv.org/abs/2212.10773v1,creativecommons.org/licenses/by/4.0/,MultiInstruct: Improving Multi-Modal Zero-Shot Learning via Instruction Tuning,Zhiyang Xu and Ying Shen and Lifu Huang,http://arxiv.org/pdf/2212.10773v1
http://arxiv.org/abs/2301.10166v1,creativecommons.org/licenses/by/4.0/,Leveraging Vision-Language Models for Granular Market Change Prediction,Christopher Wimmer and Navid Rekabsaz,http://arxiv.org/pdf/2301.10166v1
http://arxiv.org/abs/2301.12314v1,creativecommons.org/licenses/by/4.0/,Progressive Prompts: Continual Learning for Language Models,Anastasia Razdaibiedina and Yuning Mao and Rui Hou and Madian Khabsa and Mike Lewis and Amjad Almahairi,http://arxiv.org/pdf/2301.12314v1
http://arxiv.org/abs/2302.02463v3,creativecommons.org/licenses/by/4.0/,Nationality Bias in Text Generation,Pranav Narayanan Venkit and Sanjana Gautam and Ruchi Panchanadikar and Ting-Hao 'Kenneth' Huang and Shomir Wilson,http://arxiv.org/pdf/2302.02463v3
http://arxiv.org/abs/2303.03840v2,creativecommons.org/licenses/by/4.0/,A Challenging Benchmark for Low-Resource Learning,Yudong Wang and Chang Ma and Qingxiu Dong and Lingpeng Kong and Jingjing Xu,http://arxiv.org/pdf/2303.03840v2
http://arxiv.org/abs/2303.04497v1,creativecommons.org/licenses/by/4.0/,Exploiting the Textual Potential from Vision-Language Pre-training for Text-based Person Search,Guanshuo Wang and Fufu Yu and Junjie Li and Qiong Jia and Shouhong Ding,http://arxiv.org/pdf/2303.04497v1
http://arxiv.org/abs/2304.08243v1,creativecommons.org/licenses/by/4.0/,Stochastic Code Generation,Swapnil Sharma and Nikita Anand and Kranthi Kiran G. V,http://arxiv.org/pdf/2304.08243v1
http://arxiv.org/abs/2201.07520v1,creativecommons.org/licenses/by/4.0/,CM3: A Causal Masked Multimodal Model of the Internet,Armen Aghajanyan and Bernie Huang and Candace Ross and Vladimir Karpukhin and Hu Xu and Naman Goyal and Dmytro Okhonko and Mandar Joshi and Gargi Ghosh and Mike Lewis and Luke Zettlemoyer,http://arxiv.org/pdf/2201.07520v1
http://arxiv.org/abs/2009.09223v1,creativecommons.org/licenses/by/4.0/,BioALBERT: A Simple and Effective Pre-trained Language Model for Biomedical Named Entity Recognition,Usman Naseem and Matloob Khushi and Vinay Reddy and Sakthivel Rajendran and Imran Razzak and Jinman Kim,http://arxiv.org/pdf/2009.09223v1
http://arxiv.org/abs/2105.02605v2,creativecommons.org/licenses/by/4.0/,GraphFormers: GNN-nested Transformers for Representation Learning on Textual Graph,Junhan Yang and Zheng Liu and Shitao Xiao and Chaozhuo Li and Defu Lian and Sanjay Agrawal and Amit Singh and Guangzhong Sun and Xing Xie,http://arxiv.org/pdf/2105.02605v2
http://arxiv.org/abs/2301.02120v1,creativecommons.org/licenses/by/4.0/,Reprogramming Pretrained Language Models for Protein Sequence Representation Learning,Ria Vinod and Pin-Yu Chen and Payel Das,http://arxiv.org/pdf/2301.02120v1
http://arxiv.org/abs/2301.07851v1,creativecommons.org/licenses/by/4.0/,From English to More Languages: Parameter-Efficient Model Reprogramming for Cross-Lingual Speech Recognition,Chao-Han Huck Yang and Bo Li and Yu Zhang and Nanxin Chen and Rohit Prabhavalkar and Tara N. Sainath and Trevor Strohman,http://arxiv.org/pdf/2301.07851v1
http://arxiv.org/abs/2302.05527v1,creativecommons.org/licenses/by/4.0/,CodeBERTScore: Evaluating Code Generation with Pretrained Models of Code,Shuyan Zhou and Uri Alon and Sumit Agarwal and Graham Neubig,http://arxiv.org/pdf/2302.05527v1
http://arxiv.org/abs/2106.12797v1,creativecommons.org/licenses/by/4.0/,A comprehensive empirical analysis on cross-domain semantic enrichment for detection of depressive language,Nawshad Farruque and Randy Goebel and Osmar Zaiane,http://arxiv.org/pdf/2106.12797v1
http://arxiv.org/abs/2109.09707v1,creativecommons.org/licenses/by/4.0/,A Plug-and-Play Method for Controlled Text Generation,Damian Pascual and Beni Egressy and Clara Meister and Ryan Cotterell and Roger Wattenhofer,http://arxiv.org/pdf/2109.09707v1
http://arxiv.org/abs/2110.04544v1,creativecommons.org/licenses/by/4.0/,CLIP-Adapter: Better Vision-Language Models with Feature Adapters,Peng Gao and Shijie Geng and Renrui Zhang and Teli Ma and Rongyao Fang and Yongfeng Zhang and Hongsheng Li and Yu Qiao,http://arxiv.org/pdf/2110.04544v1
http://arxiv.org/abs/2210.12810v1,creativecommons.org/licenses/by/4.0/,Code4Struct: Code Generation for Few-Shot Structured Prediction from Natural Language,Xingyao Wang and Sha Li and Heng Ji,http://arxiv.org/pdf/2210.12810v1
http://arxiv.org/abs/2302.05852v1,creativecommons.org/licenses/by/4.0/,"""Why is this misleading?"": Detecting News Headline Hallucinations with Explanations",Jiaming Shen and Jialu Liu and Dan Finnie and Negar Rahmati and Michael Bendersky and Marc Najork,http://arxiv.org/pdf/2302.05852v1
http://arxiv.org/abs/2304.11107v1,creativecommons.org/licenses/by/4.0/,ChatABL: Abductive Learning via Natural Language Interaction with ChatGPT,Tianyang Zhong and Yaonai Wei and Li Yang and Zihao Wu and Zhengliang Liu and Xiaozheng Wei and Wenjun Li and Junjie Yao and Chong Ma and Xiang Li and Dajiang Zhu and Xi Jiang and Junwei Han and Dinggang Shen and Tianming Liu and Tuo Zhang,http://arxiv.org/pdf/2304.11107v1
http://arxiv.org/abs/2102.06991v2,creativecommons.org/licenses/by/4.0/,The first large scale collection of diverse Hausa language datasets,Isa Inuwa-Dutse,http://arxiv.org/pdf/2102.06991v2
http://arxiv.org/abs/2107.09948v4,creativecommons.org/licenses/by/4.0/,A Statistical Model of Word Rank Evolution,Alex John Quijano and Rick Dale and Suzanne Sindi,http://arxiv.org/pdf/2107.09948v4
http://arxiv.org/abs/2303.00807v1,creativecommons.org/licenses/by/4.0/,UDAPDR: Unsupervised Domain Adaptation via LLM Prompting and Distillation of Rerankers,Jon Saad-Falcon and Omar Khattab and Keshav Santhanam and Radu Florian and Martin Franz and Salim Roukos and Avirup Sil and Md Arafat Sultan and Christopher Potts,http://arxiv.org/pdf/2303.00807v1
http://arxiv.org/abs/1902.06092v1,creativecommons.org/licenses/by/4.0/,Exploring Language Similarities with Dimensionality Reduction Technique,Sangarshanan Veeraraghavan,http://arxiv.org/pdf/1902.06092v1
http://arxiv.org/abs/2205.12672v2,creativecommons.org/licenses/by/4.0/,Discovering Language-neutral Sub-networks in Multilingual Language Models,Negar Foroutan and Mohammadreza Banaei and Remi Lebret and Antoine Bosselut and Karl Aberer,http://arxiv.org/pdf/2205.12672v2
http://arxiv.org/abs/2303.15350v1,creativecommons.org/licenses/by/4.0/,Improving Neural Topic Models with Wasserstein Knowledge Distillation,Suman Adhya and Debarshi Kumar Sanyal,http://arxiv.org/pdf/2303.15350v1
http://arxiv.org/abs/2205.01204v1,creativecommons.org/licenses/by/4.0/,Multi-Task Text Classification using Graph Convolutional Networks for Large-Scale Low Resource Language,Mounika Marreddy and Subba Reddy Oota and Lakshmi Sireesha Vakada and Venkata Charan Chinni and Radhika Mamidi,http://arxiv.org/pdf/2205.01204v1
http://arxiv.org/abs/1904.02036v1,creativecommons.org/licenses/by/4.0/,A Large-Scale Comparison of Historical Text Normalization Systems,Marcel Bollmann,http://arxiv.org/pdf/1904.02036v1
http://arxiv.org/abs/2104.04243v1,creativecommons.org/licenses/by/4.0/,Incorporating External Knowledge to Enhance Tabular Reasoning,J. Neeraja and Vivek Gupta and Vivek Srikumar,http://arxiv.org/pdf/2104.04243v1
http://arxiv.org/abs/2204.10483v1,creativecommons.org/licenses/by/4.0/,NLP Based Anomaly Detection for Categorical Time Series,Matthew Horak and Sowmya Chandrasekaran and Giovanni Tobar,http://arxiv.org/pdf/2204.10483v1
http://arxiv.org/abs/2210.00131v2,creativecommons.org/licenses/by/4.0/,Selection Induced Collider Bias: A Gender Pronoun Uncertainty Case Study,Emily McMilin,http://arxiv.org/pdf/2210.00131v2
http://arxiv.org/abs/2304.05591v1,creativecommons.org/licenses/by/4.0/,Semantic Feature Verification in FLAN-T5,Siddharth Suresh and Kushin Mukherjee and Timothy T. Rogers,http://arxiv.org/pdf/2304.05591v1
http://arxiv.org/abs/2304.11094v1,creativecommons.org/licenses/by/4.0/,Effectiveness of Debiasing Techniques: An Indigenous Qualitative Analysis,Vithya Yogarajan and Gillian Dobbie and Henry Gouk,http://arxiv.org/pdf/2304.11094v1
http://arxiv.org/abs/2109.02797v1,creativecommons.org/licenses/by/4.0/,Puzzle Solving without Search or Human Knowledge: An Unnatural Language Approach,David Noever and Ryerson Burdick,http://arxiv.org/pdf/2109.02797v1
http://arxiv.org/abs/2109.11577v2,creativecommons.org/licenses/by/4.0/,Text Ranking and Classification using Data Compression,Nitya Kasturi and Igor L. Markov,http://arxiv.org/pdf/2109.11577v2
http://arxiv.org/abs/2205.12643v1,creativecommons.org/licenses/by/4.0/,Asking the Right Questions in Low Resource Template Extraction,Nils Holzenberger and Yunmo Chen and Benjamin Van Durme,http://arxiv.org/pdf/2205.12643v1
http://arxiv.org/abs/2210.06384v3,creativecommons.org/licenses/by/4.0/,GMP*: Well-Tuned Gradual Magnitude Pruning Can Outperform Most BERT-Pruning Methods,Eldar Kurtic and Dan Alistarh,http://arxiv.org/pdf/2210.06384v3
http://arxiv.org/abs/2210.14852v2,creativecommons.org/licenses/by/4.0/,Causality Detection using Multiple Annotation Decisions,Quynh Anh Nguyen and Arka Mitra,http://arxiv.org/pdf/2210.14852v2
http://arxiv.org/abs/2211.17201v1,creativecommons.org/licenses/by/4.0/,ExtremeBERT: A Toolkit for Accelerating Pretraining of Customized BERT,Rui Pan and Shizhe Diao and Jianlin Chen and Tong Zhang,http://arxiv.org/pdf/2211.17201v1
http://arxiv.org/abs/2302.11042v1,creativecommons.org/licenses/by/4.0/,In-context Example Selection with Influences,Tai Nguyen and Eric Wong,http://arxiv.org/pdf/2302.11042v1
http://arxiv.org/abs/2304.12102v1,creativecommons.org/licenses/by/4.0/,Unlocking Context Constraints of LLMs: Enhancing Context Efficiency of LLMs with Self-Information-Based Content Filtering,Yucheng Li,http://arxiv.org/pdf/2304.12102v1
http://arxiv.org/abs/2109.14259v1,creativecommons.org/licenses/by/4.0/,Hierarchical Character Tagger for Short Text Spelling Error Correction,Mengyi Gao and Canran Xu and Peng Shi,http://arxiv.org/pdf/2109.14259v1
http://arxiv.org/abs/2111.14447v2,creativecommons.org/licenses/by/4.0/,ZeroCap: Zero-Shot Image-to-Text Generation for Visual-Semantic Arithmetic,Yoad Tewel and Yoav Shalev and Idan Schwartz and Lior Wolf,http://arxiv.org/pdf/2111.14447v2
http://arxiv.org/abs/2206.01838v1,creativecommons.org/licenses/by/4.0/,Differentially Private Model Compression,Fatemehsadat Mireshghallah and Arturs Backurs and Huseyin A Inan and Lukas Wutschitz and Janardhan Kulkarni,http://arxiv.org/pdf/2206.01838v1
http://arxiv.org/abs/2212.01365v1,creativecommons.org/licenses/by/4.0/,An Information-Theoretic Analysis of Compute-Optimal Neural Scaling Laws,Hong Jun Jeon and Benjamin Van Roy,http://arxiv.org/pdf/2212.01365v1
http://arxiv.org/abs/2302.09185v1,creativecommons.org/licenses/by/4.0/,Bounding the Capabilities of Large Language Models in Open Text Generation with Prompt Constraints,Albert Lu and Hongxin Zhang and Yanzhe Zhang and Xuezhi Wang and Diyi Yang,http://arxiv.org/pdf/2302.09185v1
http://arxiv.org/abs/2212.14834v4,creativecommons.org/licenses/by/4.0/,Large Language Models are Zero-Shot Fuzzers: Fuzzing Deep-Learning Libraries via Large Language Models,Yinlin Deng and Chunqiu Steven Xia and Haoran Peng and Chenyuan Yang and Lingming Zhang,http://arxiv.org/pdf/2212.14834v4
http://arxiv.org/abs/2212.07143v1,creativecommons.org/licenses/by/4.0/,Reproducible scaling laws for contrastive language-image learning,Mehdi Cherti and Romain Beaumont and Ross Wightman and Mitchell Wortsman and Gabriel Ilharco and Cade Gordon and Christoph Schuhmann and Ludwig Schmidt and Jenia Jitsev,http://arxiv.org/pdf/2212.07143v1
http://arxiv.org/abs/2204.02311v5,creativecommons.org/licenses/by/4.0/,PaLM: Scaling Language Modeling with Pathways,Aakanksha Chowdhery and Sharan Narang and Jacob Devlin and Maarten Bosma and Gaurav Mishra and Adam Roberts and Paul Barham and Hyung Won Chung and Charles Sutton and Sebastian Gehrmann and Parker Schuh and Kensen Shi and Sasha Tsvyashchenko and Joshua Maynez and Abhishek Rao and Parker Barnes and Yi Tay and Noam Shazeer and Vinodkumar Prabhakaran and Emily Reif and Nan Du and Ben Hutchinson and Reiner Pope and James Bradbury and Jacob Austin and Michael Isard and Guy Gur-Ari and Pengcheng Yin and Toju Duke and Anselm Levskaya and Sanjay Ghemawat and Sunipa Dev and Henryk Michalewski and Xavier Garcia and Vedant Misra and Kevin Robinson and Liam Fedus and Denny Zhou and Daphne Ippolito and David Luan and Hyeontaek Lim and Barret Zoph and Alexander Spiridonov and Ryan Sepassi and David Dohan and Shivani Agrawal and Mark Omernick and Andrew M. Dai and Thanumalayan Sankaranarayana Pillai and Marie Pellat and Aitor Lewkowycz and Erica Moreira and Rewon Child and Oleksandr Polozov and Katherine Lee and Zongwei Zhou and Xuezhi Wang and Brennan Saeta and Mark Diaz and Orhan Firat and Michele Catasta and Jason Wei and Kathy Meier-Hellstern and Douglas Eck and Jeff Dean and Slav Petrov and Noah Fiedel,http://arxiv.org/pdf/2204.02311v5
http://arxiv.org/abs/2111.09791v1,creativecommons.org/licenses/by/4.0/,Supporting Undotted Arabic with Pre-trained Language Models,Aviad Rom and Kfir Bar,http://arxiv.org/pdf/2111.09791v1
http://arxiv.org/abs/2212.10502v1,creativecommons.org/licenses/by/4.0/,A Measure-Theoretic Characterization of Tight Language Models,Li Du and Lucas Torroba Hennigen and Tiago Pimentel and Clara Meister and Jason Eisner and Ryan Cotterell,http://arxiv.org/pdf/2212.10502v1
http://arxiv.org/abs/2301.06527v1,creativecommons.org/licenses/by/4.0/,XNLI 2.0: Improving XNLI dataset and performance on Cross Lingual Understanding (XLU),Ankit Kumar Upadhyay and Harsit Kumar Upadhya,http://arxiv.org/pdf/2301.06527v1
http://arxiv.org/abs/1910.06426v1,creativecommons.org/licenses/by/4.0/,Tell-the-difference: Fine-grained Visual Descriptor via a Discriminating Referee,Shuangjie Xu and Feng Xu and Yu Cheng and Pan Zhou,http://arxiv.org/pdf/1910.06426v1
http://arxiv.org/abs/2111.08267v1,creativecommons.org/licenses/by/4.0/,Solving Probability and Statistics Problems by Program Synthesis,Leonard Tang and Elizabeth Ke and Nikhil Singh and Nakul Verma and Iddo Drori,http://arxiv.org/pdf/2111.08267v1
http://arxiv.org/abs/2205.01287v3,creativecommons.org/licenses/by/4.0/,SemAttack: Natural Textual Attacks via Different Semantic Spaces,Boxin Wang and Chejian Xu and Xiangyu Liu and Yu Cheng and Bo Li,http://arxiv.org/pdf/2205.01287v3
http://arxiv.org/abs/2301.05318v1,creativecommons.org/licenses/by/4.0/,Language-Informed Transfer Learning for Embodied Household Activities,Yuqian Jiang and Qiaozi Gao and Govind Thattai and Gaurav Sukhatme,http://arxiv.org/pdf/2301.05318v1
http://arxiv.org/abs/2304.11163v1,creativecommons.org/licenses/by/4.0/,"ChatGPT, Large Language Technologies, and the Bumpy Road of Benefiting Humanity",Atoosa Kasirzadeh,http://arxiv.org/pdf/2304.11163v1
http://arxiv.org/abs/2302.07257v1,creativecommons.org/licenses/by/4.0/,ChatCAD: Interactive Computer-Aided Diagnosis on Medical Image using Large Language Models,Sheng Wang and Zihao Zhao and Xi Ouyang and Qian Wang and Dinggang Shen,http://arxiv.org/pdf/2302.07257v1
http://arxiv.org/abs/1908.05672v5,creativecommons.org/licenses/by/4.0/,Towards Making the Most of BERT in Neural Machine Translation,Jiacheng Yang and Mingxuan Wang and Hao Zhou and Chengqi Zhao and Yong Yu and Weinan Zhang and Lei Li,http://arxiv.org/pdf/1908.05672v5
http://arxiv.org/abs/2006.00671v2,creativecommons.org/licenses/by/4.0/,Conversational Machine Comprehension: a Literature Review,Somil Gupta and Bhanu Pratap Singh Rawat and Hong Yu,http://arxiv.org/pdf/2006.00671v2
http://arxiv.org/abs/2104.08251v1,creativecommons.org/licenses/by/4.0/,proScript: Partially Ordered Scripts Generation via Pre-trained Language Models,Keisuke Sakaguchi and Chandra Bhagavatula and Ronan Le Bras and Niket Tandon and Peter Clark and Yejin Choi,http://arxiv.org/pdf/2104.08251v1
http://arxiv.org/abs/2202.05993v1,creativecommons.org/licenses/by/4.0/,Wav2Vec2.0 on the Edge: Performance Evaluation,Santosh Gondi,http://arxiv.org/pdf/2202.05993v1
http://arxiv.org/abs/2205.11505v1,creativecommons.org/licenses/by/4.0/,What Makes Data-to-Text Generation Hard for Pretrained Language Models?,Moniba Keymanesh and Adrian Benton and Mark Dredze,http://arxiv.org/pdf/2205.11505v1
http://arxiv.org/abs/2205.07065v1,creativecommons.org/licenses/by/4.0/,What do Models Learn From Training on More Than Text? Measuring Visual Commonsense Knowledge,Lovisa Hagström and Richard Johansson,http://arxiv.org/pdf/2205.07065v1
http://arxiv.org/abs/2104.10658v1,creativecommons.org/licenses/by/4.0/,Using GPT-2 to Create Synthetic Data to Improve the Prediction Performance of NLP Machine Learning Classification Models,Dewayne Whitfield,http://arxiv.org/pdf/2104.10658v1
http://arxiv.org/abs/2211.13317v1,creativecommons.org/licenses/by/4.0/,Rank-One Editing of Encoder-Decoder Models,Vikas Raunak and Arul Menezes,http://arxiv.org/pdf/2211.13317v1
http://arxiv.org/abs/2211.04508v1,creativecommons.org/licenses/by/4.0/,SpeechMatrix: A Large-Scale Mined Corpus of Multilingual Speech-to-Speech Translations,Paul-Ambroise Duquenne and Hongyu Gong and Ning Dong and Jingfei Du and Ann Lee and Vedanuj Goswani and Changhan Wang and Juan Pino and Benoît Sagot and Holger Schwenk,http://arxiv.org/pdf/2211.04508v1
http://arxiv.org/abs/2211.14133v1,creativecommons.org/licenses/by/4.0/,PipeFisher: Efficient Training of Large Language Models Using Pipelining and Fisher Information Matrices,Kazuki Osawa and Shigang Li and Torsten Hoefler,http://arxiv.org/pdf/2211.14133v1
http://arxiv.org/abs/2302.08399v5,creativecommons.org/licenses/by/4.0/,Large Language Models Fail on Trivial Alterations to Theory-of-Mind Tasks,Tomer Ullman,http://arxiv.org/pdf/2302.08399v5
http://arxiv.org/abs/1812.01250v1,creativecommons.org/licenses/by/4.0/,Quantification and Analysis of Scientific Language Variation Across Research Fields,Pei Zhou and Muhao Chen and Kai-Wei Chang and Carlo Zaniolo,http://arxiv.org/pdf/1812.01250v1
http://arxiv.org/abs/2003.07019v1,creativecommons.org/licenses/by/4.0/,Key Phrase Classification in Complex Assignments,Manikandan Ravikiran,http://arxiv.org/pdf/2003.07019v1
http://arxiv.org/abs/2005.08314v1,creativecommons.org/licenses/by/4.0/,TaBERT: Pretraining for Joint Understanding of Textual and Tabular Data,Pengcheng Yin and Graham Neubig and Wen-tau Yih and Sebastian Riedel,http://arxiv.org/pdf/2005.08314v1
http://arxiv.org/abs/2009.09870v2,creativecommons.org/licenses/by/4.0/,Content Planning for Neural Story Generation with Aristotelian Rescoring,Seraphina Goldfarb-Tarrant and Tuhin Chakrabarty and Ralph Weischedel and Nanyun Peng,http://arxiv.org/pdf/2009.09870v2
http://arxiv.org/abs/2011.05197v1,creativecommons.org/licenses/by/4.0/,UmBERTo-MTSA @ AcCompl-It: Improving Complexity and Acceptability Prediction with Multi-task Learning on Self-Supervised Annotations,Gabriele Sarti,http://arxiv.org/pdf/2011.05197v1
http://arxiv.org/abs/2012.04332v1,creativecommons.org/licenses/by/4.0/,Facts2Story: Controlling Text Generation by Key Facts,Eyal Orbach and Yoav Goldberg,http://arxiv.org/pdf/2012.04332v1
http://arxiv.org/abs/2102.05766v2,creativecommons.org/licenses/by/4.0/,Fused Acoustic and Text Encoding for Multimodal Bilingual Pretraining and Speech Translation,Renjie Zheng and Junkun Chen and Mingbo Ma and Liang Huang,http://arxiv.org/pdf/2102.05766v2
http://arxiv.org/abs/2103.09535v1,creativecommons.org/licenses/by/4.0/,Towards Few-Shot Fact-Checking via Perplexity,Nayeon Lee and Yejin Bang and Andrea Madotto and Madian Khabsa and Pascale Fung,http://arxiv.org/pdf/2103.09535v1
http://arxiv.org/abs/2104.06999v2,creativecommons.org/licenses/by/4.0/,Detecting Cross-Geographic Biases in Toxicity Modeling on Social Media,Sayan Ghosh and Dylan Baker and David Jurgens and Vinodkumar Prabhakaran,http://arxiv.org/pdf/2104.06999v2
http://arxiv.org/abs/2104.07885v2,creativecommons.org/licenses/by/4.0/,Probing Across Time: What Does RoBERTa Know and When?,Leo Z. Liu and Yizhong Wang and Jungo Kasai and Hannaneh Hajishirzi and Noah A. Smith,http://arxiv.org/pdf/2104.07885v2
http://arxiv.org/abs/2105.11601v2,creativecommons.org/licenses/by/4.0/,Personalized Transformer for Explainable Recommendation,Lei Li and Yongfeng Zhang and Li Chen,http://arxiv.org/pdf/2105.11601v2
http://arxiv.org/abs/2105.14277v3,creativecommons.org/licenses/by/4.0/,Grammar Accuracy Evaluation (GAE): Quantifiable Quantitative Evaluation of Machine Translation Models,Dojun Park and Youngjin Jang and Harksoo Kim,http://arxiv.org/pdf/2105.14277v3
http://arxiv.org/abs/2106.07207v1,creativecommons.org/licenses/by/4.0/,Straight to the Gradient: Learning to Use Novel Tokens for Neural Text Generation,Xiang Lin and Simeng Han and Shafiq Joty,http://arxiv.org/pdf/2106.07207v1
http://arxiv.org/abs/2107.07610v3,creativecommons.org/licenses/by/4.0/,Self-Supervised Contrastive Learning with Adversarial Perturbations for Defending Word Substitution-based Attacks,Zhao Meng and Yihan Dong and Mrinmaya Sachan and Roger Wattenhofer,http://arxiv.org/pdf/2107.07610v3
http://arxiv.org/abs/2107.08582v1,creativecommons.org/licenses/by/4.0/,Bridging the Gap between Language Model and Reading Comprehension: Unsupervised MRC via Self-Supervision,Ning Bian and Xianpei Han and Bo Chen and Hongyu Lin and Ben He and Le Sun,http://arxiv.org/pdf/2107.08582v1
http://arxiv.org/abs/2107.09622v1,creativecommons.org/licenses/by/4.0/,More Parameters? No Thanks!,Zeeshan Khan and Kartheek Akella and Vinay P. Namboodiri and C V Jawahar,http://arxiv.org/pdf/2107.09622v1
http://arxiv.org/abs/2109.06822v2,creativecommons.org/licenses/by/4.0/,LM-Critic: Language Models for Unsupervised Grammatical Error Correction,Michihiro Yasunaga and Jure Leskovec and Percy Liang,http://arxiv.org/pdf/2109.06822v2
http://arxiv.org/abs/2109.10274v2,creativecommons.org/licenses/by/4.0/,The Trade-offs of Domain Adaptation for Neural Language Models,David Grangier and Dan Iter,http://arxiv.org/pdf/2109.10274v2
http://arxiv.org/abs/2109.12788v1,creativecommons.org/licenses/by/4.0/,Multiplicative Position-aware Transformer Models for Language Understanding,Zhiheng Huang and Davis Liang and Peng Xu and Bing Xiang,http://arxiv.org/pdf/2109.12788v1
http://arxiv.org/abs/2110.03111v3,creativecommons.org/licenses/by/4.0/,Cut the CARP: Fishing for zero-shot story evaluation,Shahbuland Matiana and JR Smith and Ryan Teehan and Louis Castricato and Stella Biderman and Leo Gao and Spencer Frazier,http://arxiv.org/pdf/2110.03111v3
http://arxiv.org/abs/2111.13611v1,creativecommons.org/licenses/by/4.0/,Predicting Document Coverage for Relation Extraction,Sneha Singhania and Simon Razniewski and Gerhard Weikum,http://arxiv.org/pdf/2111.13611v1
http://arxiv.org/abs/2111.15417v1,creativecommons.org/licenses/by/4.0/,A Comparative Study of Transformers on Word Sense Disambiguation,Avi Chawla and Nidhi Mulay and Vikas Bishnoi and Gaurav Dhama and Dr. Anil Kumar Singh,http://arxiv.org/pdf/2111.15417v1
http://arxiv.org/abs/2112.00283v1,creativecommons.org/licenses/by/4.0/,Wiki to Automotive: Understanding the Distribution Shift and its impact on Named Entity Recognition,Anmol Nayak and Hari Prasad Timmapathini,http://arxiv.org/pdf/2112.00283v1
http://arxiv.org/abs/2112.07522v2,creativecommons.org/licenses/by/4.0/,LMTurk: Few-Shot Learners as Crowdsourcing Workers in a Language-Model-as-a-Service Framework,Mengjie Zhao and Fei Mi and Yasheng Wang and Minglei Li and Xin Jiang and Qun Liu and Hinrich Schütze,http://arxiv.org/pdf/2112.07522v2
http://arxiv.org/abs/2112.11480v1,creativecommons.org/licenses/by/4.0/,On the Compression of Natural Language Models,Saeed Damadi,http://arxiv.org/pdf/2112.11480v1
http://arxiv.org/abs/2202.02617v1,creativecommons.org/licenses/by/4.0/,Adaptive Fine-Tuning of Transformer-Based Language Models for Named Entity Recognition,Felix Stollenwerk,http://arxiv.org/pdf/2202.02617v1
http://arxiv.org/abs/2202.04728v1,creativecommons.org/licenses/by/4.0/,Predicting Human Similarity Judgments Using Large Language Models,Raja Marjieh and Ilia Sucholutsky and Theodore R. Sumers and Nori Jacoby and Thomas L. Griffiths,http://arxiv.org/pdf/2202.04728v1
http://arxiv.org/abs/2204.07289v1,creativecommons.org/licenses/by/4.0/,Identifying and Measuring Token-Level Sentiment Bias in Pre-trained Language Models with Prompts,Apoorv Garg and Deval Srivastava and Zhiyang Xu and Lifu Huang,http://arxiv.org/pdf/2204.07289v1
http://arxiv.org/abs/2205.06160v2,creativecommons.org/licenses/by/4.0/,Localized Vision-Language Matching for Open-vocabulary Object Detection,Maria A. Bravo and Sudhanshu Mittal and Thomas Brox,http://arxiv.org/pdf/2205.06160v2
http://arxiv.org/abs/2205.07081v1,creativecommons.org/licenses/by/4.0/,GoalNet: Inferring Conjunctive Goal Predicates from Human Plan Demonstrations for Robot Instruction Following,Shreya Sharma and Jigyasa Gupta and Shreshth Tuli and Rohan Paul and Mausam,http://arxiv.org/pdf/2205.07081v1
http://arxiv.org/abs/2208.03711v1,creativecommons.org/licenses/by/4.0/,Vernacular Search Query Translation with Unsupervised Domain Adaptation,Mandar Kulkarni and Nikesh Garera,http://arxiv.org/pdf/2208.03711v1
http://arxiv.org/abs/2209.11000v1,creativecommons.org/licenses/by/4.0/,Selecting Better Samples from Pre-trained LLMs: A Case Study on Question Generation,Xingdi Yuan and Tong Wang and Yen-Hsiang Wang and Emery Fine and Rania Abdelghani and Pauline Lucas and Hélène Sauzéon and Pierre-Yves Oudeyer,http://arxiv.org/pdf/2209.11000v1
http://arxiv.org/abs/2210.04726v1,creativecommons.org/licenses/by/4.0/,Knowledge Prompts: Injecting World Knowledge into Language Models through Soft Prompts,Cicero Nogueira dos Santos and Zhe Dong and Daniel Cer and John Nham and Siamak Shakeri and Jianmo Ni and Yun-hsuan Sung,http://arxiv.org/pdf/2210.04726v1
http://arxiv.org/abs/2210.04964v1,creativecommons.org/licenses/by/4.0/,Generating Executable Action Plans with Environmentally-Aware Language Models,Maitrey Gramopadhye and Daniel Szafir,http://arxiv.org/pdf/2210.04964v1
http://arxiv.org/abs/2210.07323v3,creativecommons.org/licenses/by/4.0/,Experiments on Turkish ASR with Self-Supervised Speech Representation Learning,Ali Safaya and Engin Erzin,http://arxiv.org/pdf/2210.07323v3
http://arxiv.org/abs/2210.07688v2,creativecommons.org/licenses/by/4.0/,Plausible May Not Be Faithful: Probing Object Hallucination in Vision-Language Pre-training,Wenliang Dai and Zihan Liu and Ziwei Ji and Dan Su and Pascale Fung,http://arxiv.org/pdf/2210.07688v2
http://arxiv.org/abs/2211.04126v1,creativecommons.org/licenses/by/4.0/,Conciseness: An Overlooked Language Task,Felix Stahlberg and Aashish Kumar and Chris Alberti and Shankar Kumar,http://arxiv.org/pdf/2211.04126v1
http://arxiv.org/abs/2211.07828v1,creativecommons.org/licenses/by/4.0/,Adaptation Approaches for Nearest Neighbor Language Models,Rishabh Bhardwaj and George Polovets and Monica Sunkara,http://arxiv.org/pdf/2211.07828v1
http://arxiv.org/abs/2212.02437v1,creativecommons.org/licenses/by/4.0/,In-context Examples Selection for Machine Translation,Sweta Agrawal and Chunting Zhou and Mike Lewis and Luke Zettlemoyer and Marjan Ghazvininejad,http://arxiv.org/pdf/2212.02437v1
http://arxiv.org/abs/2212.08120v1,creativecommons.org/licenses/by/4.0/,Injecting Domain Knowledge in Language Models for Task-Oriented Dialogue Systems,Denis Emelin and Daniele Bonadiman and Sawsan Alqahtani and Yi Zhang and Saab Mansour,http://arxiv.org/pdf/2212.08120v1
http://arxiv.org/abs/2212.10535v1,creativecommons.org/licenses/by/4.0/,A Survey of Deep Learning for Mathematical Reasoning,Pan Lu and Liang Qiu and Wenhao Yu and Sean Welleck and Kai-Wei Chang,http://arxiv.org/pdf/2212.10535v1
http://arxiv.org/abs/2212.11185v1,creativecommons.org/licenses/by/4.0/,Entropy- and Distance-Based Predictors From GPT-2 Attention Patterns Predict Reading Times Over and Above GPT-2 Surprisal,Byung-Doh Oh and William Schuler,http://arxiv.org/pdf/2212.11185v1
http://arxiv.org/abs/2212.14882v1,creativecommons.org/licenses/by/4.0/,ChatGPT Makes Medicine Easy to Swallow: An Exploratory Case Study on Simplified Radiology Reports,Katharina Jeblick and Balthasar Schachtner and Jakob Dexl and Andreas Mittermeier and Anna Theresa Stüber and Johanna Topalis and Tobias Weber and Philipp Wesp and Bastian Sabel and Jens Ricke and Michael Ingrisch,http://arxiv.org/pdf/2212.14882v1
http://arxiv.org/abs/2302.06541v1,creativecommons.org/licenses/by/4.0/,Towards Agile Text Classifiers for Everyone,Maximilian Mozes and Jessica Hoffmann and Katrin Tomanek and Muhamed Kouate and Nithum Thain and Ann Yuan and Tolga Bolukbasi and Lucas Dixon,http://arxiv.org/pdf/2302.06541v1
http://arxiv.org/abs/2302.07371v1,creativecommons.org/licenses/by/4.0/,AutoBiasTest: Controllable Sentence Generation for Automated and Open-Ended Social Bias Testing in Language Models,Rafal Kocielnik and Shrimai Prabhumoye and Vivian Zhang and R. Michael Alvarez and Anima Anandkumar,http://arxiv.org/pdf/2302.07371v1
http://arxiv.org/abs/2302.11054v1,creativecommons.org/licenses/by/4.0/,Conversational Text-to-SQL: An Odyssey into State-of-the-Art and Challenges Ahead,Sree Hari Krishnan Parthasarathi and Lu Zeng and Dilek Hakkani-Tur,http://arxiv.org/pdf/2302.11054v1
http://arxiv.org/abs/2303.13547v1,creativecommons.org/licenses/by/4.0/,A comprehensive evaluation of ChatGPT's zero-shot Text-to-SQL capability,Aiwei Liu and Xuming Hu and Lijie Wen and Philip S. Yu,http://arxiv.org/pdf/2303.13547v1
http://arxiv.org/abs/2303.15422v1,creativecommons.org/licenses/by/4.0/,KPEval: Towards Fine-grained Semantic-based Evaluation of Keyphrase Extraction and Generation Systems,Di Wu and Da Yin and Kai-Wei Chang,http://arxiv.org/pdf/2303.15422v1
http://arxiv.org/abs/2102.10958v1,creativecommons.org/licenses/by/4.0/,"Bilingual Language Modeling, A transfer learning technique for Roman Urdu",Usama Khalid and Mirza Omer Beg and Muhammad Umair Arshad,http://arxiv.org/pdf/2102.10958v1
http://arxiv.org/abs/2212.10785v1,creativecommons.org/licenses/by/4.0/,SERENGETI: Massively Multilingual Language Models for Africa,Ife Adebara and AbdelRahim Elmadany and Muhammad Abdul-Mageed and Alcides Alcoba Inciarte,http://arxiv.org/pdf/2212.10785v1
http://arxiv.org/abs/2206.11719v2,creativecommons.org/licenses/by/4.0/,AST-Probe: Recovering abstract syntax trees from hidden representations of pre-trained language models,José Antonio Hernández López and Martin Weyssow and Jesús Sánchez Cuadrado and Houari Sahraoui,http://arxiv.org/pdf/2206.11719v2
http://arxiv.org/abs/2102.04472v1,creativecommons.org/licenses/by/4.0/,PyAutoFit: A Classy Probabilistic Programming Language for Model Composition and Fitting,James. W. Nightingale and Richard G. Hayes and Matthew Griffiths,http://arxiv.org/pdf/2102.04472v1
http://arxiv.org/abs/2201.01845v2,creativecommons.org/licenses/by/4.0/,Data-driven Model Generalizability in Crosslinguistic Low-resource Morphological Segmentation,Zoey Liu and Emily Prud'hommeaux,http://arxiv.org/pdf/2201.01845v2
http://arxiv.org/abs/2201.06170v2,creativecommons.org/licenses/by/4.0/,Evaluation of HTR models without Ground Truth Material,Phillip Benjamin Ströbel and Simon Clematide and Martin Volk and Raphael Schwitter and Tobias Hodel and David Schoch,http://arxiv.org/pdf/2201.06170v2
http://arxiv.org/abs/2302.04870v1,creativecommons.org/licenses/by/4.0/,Offsite-Tuning: Transfer Learning without Full Model,Guangxuan Xiao and Ji Lin and Song Han,http://arxiv.org/pdf/2302.04870v1
http://arxiv.org/abs/2011.04732v1,creativecommons.org/licenses/by/4.0/,CLAR: A Cross-Lingual Argument Regularizer for Semantic Role Labeling,Ishan Jindal and Yunyao Li and Siddhartha Brahma and Huaiyu Zhu,http://arxiv.org/pdf/2011.04732v1
http://arxiv.org/abs/2102.10957v1,creativecommons.org/licenses/by/4.0/,Co-occurrences using Fasttext embeddings for word similarity tasks in Urdu,Usama Khalid and Aizaz Hussain and Muhammad Umair Arshad and Waseem Shahzad and Mirza Omer Beg,http://arxiv.org/pdf/2102.10957v1
http://arxiv.org/abs/2012.07974v3,creativecommons.org/licenses/by/4.0/,A review of on-device fully neural end-to-end automatic speech recognition algorithms,Chanwoo Kim and Dhananjaya Gowda and Dongsoo Lee and Jiyeon Kim and Ankur Kumar and Sungsoo Kim and Abhinav Garg and Changwoo Han,http://arxiv.org/pdf/2012.07974v3 | |
http://arxiv.org/abs/2210.01293v1,creativecommons.org/licenses/by/4.0/,ThinkSum: Probabilistic reasoning over sets using large language models,Batu Ozturkler and Nikolay Malkin and Zhen Wang and Nebojsa Jojic,http://arxiv.org/pdf/2210.01293v1 | |
http://arxiv.org/abs/2101.06351v1,creativecommons.org/licenses/by/4.0/,Weakly-Supervised Hierarchical Models for Predicting Persuasive Strategies in Good-faith Textual Requests,Jiaao Chen and Diyi Yang,http://arxiv.org/pdf/2101.06351v1 | |
http://arxiv.org/abs/2211.15603v3,creativecommons.org/licenses/by/4.0/,Action-GPT: Leveraging Large-scale Language Models for Improved and Generalized Action Generation,Sai Shashank Kalakonda and Shubh Maheshwari and Ravi Kiran Sarvadevabhatla,http://arxiv.org/pdf/2211.15603v3 | |
http://arxiv.org/abs/2302.08722v3,creativecommons.org/licenses/by/4.0/,GPT4MIA: Utilizing Generative Pre-trained Transformer (GPT-3) as A Plug-and-Play Transductive Model for Medical Image Analysis,Yizhe Zhang and Danny Z. Chen,http://arxiv.org/pdf/2302.08722v3 | |
http://arxiv.org/abs/2304.09337v1,creativecommons.org/licenses/by/4.0/,Promptify: Text-to-Image Generation through Interactive Prompt Exploration with Large Language Models,Stephen Brade and Bryan Wang and Mauricio Sousa and Sageev Oore and Tovi Grossman,http://arxiv.org/pdf/2304.09337v1 | |
http://arxiv.org/abs/1908.10747v1,creativecommons.org/licenses/by/4.0/,Language Tasks and Language Games: On Methodology in Current Natural Language Processing Research,David Schlangen,http://arxiv.org/pdf/1908.10747v1 | |
http://arxiv.org/abs/2104.01294v1,creativecommons.org/licenses/by/4.0/,Representations of Language Varieties Are Reliable Given Corpus Similarity Measures,Jonathan Dunn,http://arxiv.org/pdf/2104.01294v1 | |
http://arxiv.org/abs/2108.09814v1,creativecommons.org/licenses/by/4.0/,UzBERT: pretraining a BERT model for Uzbek,B. Mansurov and A. Mansurov,http://arxiv.org/pdf/2108.09814v1 | |
http://arxiv.org/abs/2109.15254v2,creativecommons.org/licenses/by/4.0/,SlovakBERT: Slovak Masked Language Model,Matúš Pikuliak and Štefan Grivalský and Martin Konôpka and Miroslav Blšták and Martin Tamajka and Viktor Bachratý and Marián Šimko and Pavol Balážik and Michal Trnka and Filip Uhlárik,http://arxiv.org/pdf/2109.15254v2 | |
http://arxiv.org/abs/2108.03070v1,creativecommons.org/licenses/by/4.0/,SWSR: A Chinese Dataset and Lexicon for Online Sexism Detection,Aiqi Jiang and Xiaohan Yang and Yang Liu and Arkaitz Zubiaga,http://arxiv.org/pdf/2108.03070v1 | |
http://arxiv.org/abs/2209.04725v1,creativecommons.org/licenses/by/4.0/,Anticipating the Unseen Discrepancy for Vision and Language Navigation,Yujie Lu and Huiliang Zhang and Ping Nie and Weixi Feng and Wenda Xu and Xin Eric Wang and William Yang Wang,http://arxiv.org/pdf/2209.04725v1 | |
http://arxiv.org/abs/2304.04498v2,creativecommons.org/licenses/by/4.0/,Towards Digital Nature: Bridging the Gap between Turing Machine Objects and Linguistic Objects in LLMMs for Universal Interaction of Object-Oriented Descriptions,Yoichi Ochiai and Naruya Kondo and Tatsuki Fushimi,http://arxiv.org/pdf/2304.04498v2 | |
http://arxiv.org/abs/2208.05577v1,creativecommons.org/licenses/by/4.0/,Reducing Retraining by Recycling Parameter-Efficient Prompts,Brian Lester and Joshua Yurtsever and Siamak Shakeri and Noah Constant,http://arxiv.org/pdf/2208.05577v1 | |
http://arxiv.org/abs/2210.16298v1,creativecommons.org/licenses/by/4.0/,Investigating Ensemble Methods for Model Robustness Improvement of Text Classifiers,Jieyu Zhao and Xuezhi Wang and Yao Qin and Jilin Chen and Kai-Wei Chang,http://arxiv.org/pdf/2210.16298v1 | |
http://arxiv.org/abs/2112.01753v2,creativecommons.org/licenses/by/4.0/,Probing Linguistic Information For Logical Inference In Pre-trained Language Models,Zeming Chen and Qiyue Gao,http://arxiv.org/pdf/2112.01753v2 | |
http://arxiv.org/abs/2203.05936v2,creativecommons.org/licenses/by/4.0/,Are discrete units necessary for Spoken Language Modeling?,Tu Anh Nguyen and Benoit Sagot and Emmanuel Dupoux,http://arxiv.org/pdf/2203.05936v2 | |
http://arxiv.org/abs/2108.05652v1,creativecommons.org/licenses/by/4.0/,Modeling Relevance Ranking under the Pre-training and Fine-tuning Paradigm,Lin Bo and Liang Pang and Gang Wang and Jun Xu and XiuQiang He and Ji-Rong Wen,http://arxiv.org/pdf/2108.05652v1 | |
http://arxiv.org/abs/2009.08445v2,creativecommons.org/licenses/by/4.0/,Self-Supervised Meta-Learning for Few-Shot Natural Language Classification Tasks,Trapit Bansal and Rishikesh Jha and Tsendsuren Munkhdalai and Andrew McCallum,http://arxiv.org/pdf/2009.08445v2 | |
http://arxiv.org/abs/2104.06591v2,creativecommons.org/licenses/by/4.0/,Zero-Resource Multi-Dialectal Arabic Natural Language Understanding,Muhammad Khalifa and Hesham Hassan and Aly Fahmy,http://arxiv.org/pdf/2104.06591v2 | |
http://arxiv.org/abs/2108.10561v1,creativecommons.org/licenses/by/4.0/,Taming the Beast: Learning to Control Neural Conversational Models,Andrea Madotto,http://arxiv.org/pdf/2108.10561v1 | |
http://arxiv.org/abs/2109.13620v1,creativecommons.org/licenses/by/4.0/,Cross-lingual Intermediate Fine-tuning improves Dialogue State Tracking,Nikita Moghe and Mark Steedman and Alexandra Birch,http://arxiv.org/pdf/2109.13620v1 | |
http://arxiv.org/abs/2112.02945v1,creativecommons.org/licenses/by/4.0/,Configuration Space Exploration for Digital Printing Systems,Jasper Denkers and Marvin Brunner and Louis van Gool and Eelco Visser,http://arxiv.org/pdf/2112.02945v1 | |
http://arxiv.org/abs/2201.03425v1,creativecommons.org/licenses/by/4.0/,"Towards Trustworthy AutoGrading of Short, Multi-lingual, Multi-type Answers",Johannes Schneider and Robin Richner and Micha Riser,http://arxiv.org/pdf/2201.03425v1 | |
http://arxiv.org/abs/2202.13529v1,creativecommons.org/licenses/by/4.0/,"KMIR: A Benchmark for Evaluating Knowledge Memorization, Identification and Reasoning Abilities of Language Models",Daniel Gao and Yantao Jia and Lei Li and Chengzhen Fu and Zhicheng Dou and Hao Jiang and Xinyu Zhang and Lei Chen and Zhao Cao,http://arxiv.org/pdf/2202.13529v1 | |
http://arxiv.org/abs/2203.07731v1,creativecommons.org/licenses/by/4.0/,Evaluating BERT-based Pre-training Language Models for Detecting Misinformation,Rini Anggrainingsih and Ghulam Mubashar Hassan and Amitava Datta,http://arxiv.org/pdf/2203.07731v1 | |
http://arxiv.org/abs/2204.08405v1,creativecommons.org/licenses/by/4.0/,Zero-shot Entity and Tweet Characterization with Designed Conditional Prompts and Contexts,Sharath Srivatsa and Tushar Mohan and Kumari Neha and Nishchay Malakar and Ponnurangam Kumaraguru and Srinath Srinivasa,http://arxiv.org/pdf/2204.08405v1 | |
http://arxiv.org/abs/2206.00761v2,creativecommons.org/licenses/by/4.0/,On Reinforcement Learning and Distribution Matching for Fine-Tuning Language Models with no Catastrophic Forgetting,Tomasz Korbak and Hady Elsahar and Germán Kruszewski and Marc Dymetman,http://arxiv.org/pdf/2206.00761v2 | |
http://arxiv.org/abs/2209.12786v1,creativecommons.org/licenses/by/4.0/,Do ever larger octopi still amplify reporting biases? Evidence from judgments of typical colour,Fangyu Liu and Julian Martin Eisenschlos and Jeremy R. Cole and Nigel Collier,http://arxiv.org/pdf/2209.12786v1 | |
http://arxiv.org/abs/2303.08033v1,creativecommons.org/licenses/by/4.0/,Large Language Models (GPT) Struggle to Answer Multiple-Choice Questions about Code,Jaromir Savelka and Arav Agarwal and Christopher Bogart and Majd Sakr,http://arxiv.org/pdf/2303.08033v1 | |
http://arxiv.org/abs/2010.01063v1,creativecommons.org/licenses/by/4.0/,Syntax Representation in Word Embeddings and Neural Networks -- A Survey,Tomasz Limisiewicz and David Mareček,http://arxiv.org/pdf/2010.01063v1 | |
http://arxiv.org/abs/2205.06644v1,creativecommons.org/licenses/by/4.0/,Controlling Translation Formality Using Pre-trained Multilingual Language Models,Elijah Rippeth and Sweta Agrawal and Marine Carpuat,http://arxiv.org/pdf/2205.06644v1 | |
http://arxiv.org/abs/2012.08673v2,creativecommons.org/licenses/by/4.0/,A Closer Look at the Robustness of Vision-and-Language Pre-trained Models,Linjie Li and Zhe Gan and Jingjing Liu,http://arxiv.org/pdf/2012.08673v2 | |
http://arxiv.org/abs/2106.12230v1,creativecommons.org/licenses/by/4.0/,Recognising Biomedical Names: Challenges and Solutions,Xiang Dai,http://arxiv.org/pdf/2106.12230v1 | |
http://arxiv.org/abs/2205.11388v1,creativecommons.org/licenses/by/4.0/,StreamingQA: A Benchmark for Adaptation to New Knowledge over Time in Question Answering Models,Adam Liška and Tomáš Kočiský and Elena Gribovskaya and Tayfun Terzi and Eren Sezener and Devang Agrawal and Cyprien de Masson d'Autume and Tim Scholtes and Manzil Zaheer and Susannah Young and Ellen Gilsenan-McMahon and Sophia Austin and Phil Blunsom and Angeliki Lazaridou,http://arxiv.org/pdf/2205.11388v1 | |
http://arxiv.org/abs/2210.07144v1,creativecommons.org/licenses/by/4.0/,Reprogramming Large Pretrained Language Models for Antibody Sequence Infilling,Igor Melnyk and Vijil Chenthamarakshan and Pin-Yu Chen and Payel Das and Amit Dhurandhar and Inkit Padhi and Devleena Das,http://arxiv.org/pdf/2210.07144v1 | |
http://arxiv.org/abs/2303.14070v4,creativecommons.org/licenses/by/4.0/,ChatDoctor: A Medical Chat Model Fine-tuned on LLaMA Model using Medical Domain Knowledge,Yunxiang Li and Zihan Li and Kai Zhang and Ruilong Dan and You Zhang,http://arxiv.org/pdf/2303.14070v4 | |
http://arxiv.org/abs/2212.10947v1,creativecommons.org/licenses/by/4.0/,Parallel Context Windows Improve In-Context Learning of Large Language Models,Nir Ratner and Yoav Levine and Yonatan Belinkov and Ori Ram and Omri Abend and Ehud Karpas and Amnon Shashua and Kevin Leyton-Brown and Yoav Shoham,http://arxiv.org/pdf/2212.10947v1 | |
http://arxiv.org/abs/2304.10014v1,creativecommons.org/licenses/by/4.0/,Physics task development of prospective physics teachers using ChatGPT,Stefan Küchemann and Steffen Steinert and Natalia Revenga and Matthias Schweinberger and Yavuz Dinc and Karina E. Avila and Jochen Kuhn,http://arxiv.org/pdf/2304.10014v1 | |
http://arxiv.org/abs/2304.10691v1,creativecommons.org/licenses/by/4.0/,SkinGPT: A Dermatology Diagnostic System with Vision Large Language Model,Juexiao Zhou and Xin Gao,http://arxiv.org/pdf/2304.10691v1 | |
http://arxiv.org/abs/2203.10326v2,creativecommons.org/licenses/by/4.0/,Pretraining with Artificial Language: Studying Transferable Knowledge in Language Models,Ryokan Ri and Yoshimasa Tsuruoka,http://arxiv.org/pdf/2203.10326v2 | |
http://arxiv.org/abs/2101.09368v2,creativecommons.org/licenses/by/4.0/,Effects of Pre- and Post-Processing on type-based Embeddings in Lexical Semantic Change Detection,Jens Kaiser and Sinan Kurtyigit and Serge Kotchourko and Dominik Schlechtweg,http://arxiv.org/pdf/2101.09368v2 | |
http://arxiv.org/abs/2304.05403v1,creativecommons.org/licenses/by/4.0/,Isolated Sign Language Recognition based on Tree Structure Skeleton Images,David Laines and Gissella Bejarano and Miguel Gonzalez-Mendoza and Gilberto Ochoa-Ruiz,http://arxiv.org/pdf/2304.05403v1 | |
http://arxiv.org/abs/1806.02557v2,creativecommons.org/licenses/by/4.0/,Emoji-Powered Representation Learning for Cross-Lingual Sentiment Classification,Zhenpeng Chen and Sheng Shen and Ziniu Hu and Xuan Lu and Qiaozhu Mei and Xuanzhe Liu,http://arxiv.org/pdf/1806.02557v2 | |
http://arxiv.org/abs/2107.06569v1,creativecommons.org/licenses/by/4.0/,Importance-based Neuron Allocation for Multilingual Neural Machine Translation,Wanying Xie and Yang Feng and Shuhao Gu and Dong Yu,http://arxiv.org/pdf/2107.06569v1 | |
http://arxiv.org/abs/2303.01249v1,creativecommons.org/licenses/by/4.0/,Language-Universal Adapter Learning with Knowledge Distillation for End-to-End Multilingual Speech Recognition,Zhijie Shen and Wu Guo and Bin Gu,http://arxiv.org/pdf/2303.01249v1 | |
http://arxiv.org/abs/2001.05315v1,creativecommons.org/licenses/by/4.0/,A Continuous Space Neural Language Model for Bengali Language,Hemayet Ahmed Chowdhury and Md. Azizul Haque Imon and Anisur Rahman and Aisha Khatun and Md. Saiful Islam,http://arxiv.org/pdf/2001.05315v1 | |
http://arxiv.org/abs/2211.00635v1,creativecommons.org/licenses/by/4.0/,Preserving In-Context Learning ability in Large Language Model Fine-tuning,Yihan Wang and Si Si and Daliang Li and Michal Lukasik and Felix Yu and Cho-Jui Hsieh and Inderjit S Dhillon and Sanjiv Kumar,http://arxiv.org/pdf/2211.00635v1 | |
http://arxiv.org/abs/2210.14353v2,creativecommons.org/licenses/by/4.0/,"RoMQA: A Benchmark for Robust, Multi-evidence, Multi-answer Question Answering",Victor Zhong and Weijia Shi and Wen-tau Yih and Luke Zettlemoyer,http://arxiv.org/pdf/2210.14353v2 | |
http://arxiv.org/abs/2301.11293v1,creativecommons.org/licenses/by/4.0/,Understanding Finetuning for Factual Knowledge Extraction from Language Models,Mehran Kazemi and Sid Mittal and Deepak Ramachandran,http://arxiv.org/pdf/2301.11293v1 | |
http://arxiv.org/abs/2303.01903v2,creativecommons.org/licenses/by/4.0/,Prompting Large Language Models with Answer Heuristics for Knowledge-based Visual Question Answering,Zhenwei Shao and Zhou Yu and Meng Wang and Jun Yu,http://arxiv.org/pdf/2303.01903v2 | |
http://arxiv.org/abs/2303.09128v1,creativecommons.org/licenses/by/4.0/,Exploring Distributional Shifts in Large Language Models for Code Analysis,Shushan Arakelyan and Rocktim Jyoti Das and Yi Mao and Xiang Ren,http://arxiv.org/pdf/2303.09128v1 | |
http://arxiv.org/abs/2303.10431v1,creativecommons.org/licenses/by/4.0/,DeAR: Debiasing Vision-Language Models with Additive Residuals,Ashish Seth and Mayur Hemani and Chirag Agarwal,http://arxiv.org/pdf/2303.10431v1 | |
http://arxiv.org/abs/2303.13217v3,creativecommons.org/licenses/by/4.0/,Fairness-guided Few-shot Prompting for Large Language Models,Huan Ma and Changqing Zhang and Yatao Bian and Lemao Liu and Zhirui Zhang and Peilin Zhao and Shu Zhang and Huazhu Fu and Qinghua Hu and Bingzhe Wu,http://arxiv.org/pdf/2303.13217v3 | |
http://arxiv.org/abs/2303.13379v1,creativecommons.org/licenses/by/4.0/,Practical and Ethical Challenges of Large Language Models in Education: A Systematic Literature Review,Lixiang Yan and Lele Sha and Linxuan Zhao and Yuheng Li and Roberto Martinez-Maldonado and Guanliang Chen and Xinyu Li and Yueqiao Jin and Dragan Gašević,http://arxiv.org/pdf/2303.13379v1 | |
http://arxiv.org/abs/2108.06665v1,creativecommons.org/licenses/by/4.0/,"Accurate, yet inconsistent? Consistency Analysis on Language Understanding Models",Myeongjun Jang and Deuk Sin Kwon and Thomas Lukasiewicz,http://arxiv.org/pdf/2108.06665v1 | |
http://arxiv.org/abs/2012.14005v1,creativecommons.org/licenses/by/4.0/,Neural document expansion for ad-hoc information retrieval,Cheng Tang and Andrew Arnold,http://arxiv.org/pdf/2012.14005v1 | |
http://arxiv.org/abs/2103.15760v2,creativecommons.org/licenses/by/4.0/,Shrinking Bigfoot: Reducing wav2vec 2.0 footprint,Zilun Peng and Akshay Budhkar and Ilana Tuil and Jason Levy and Parinaz Sobhani and Raphael Cohen and Jumana Nassour,http://arxiv.org/pdf/2103.15760v2 | |
http://arxiv.org/abs/2109.08627v1,creativecommons.org/licenses/by/4.0/,Classification-based Quality Estimation: Small and Efficient Models for Real-world Applications,Shuo Sun and Ahmed El-Kishky and Vishrav Chaudhary and James Cross and Francisco Guzmán and Lucia Specia,http://arxiv.org/pdf/2109.08627v1 | |
http://arxiv.org/abs/2205.12113v2,creativecommons.org/licenses/by/4.0/,The Curious Case of Control,Elias Stengel-Eskin and Benjamin Van Durme,http://arxiv.org/pdf/2205.12113v2 | |
http://arxiv.org/abs/2211.02941v3,creativecommons.org/licenses/by/4.0/,Small Language Models for Tabular Data,Benjamin L. Badger,http://arxiv.org/pdf/2211.02941v3 | |
http://arxiv.org/abs/2210.06475v2,creativecommons.org/licenses/by/4.0/,Equi-Tuning: Group Equivariant Fine-Tuning of Pretrained Models,Sourya Basu and Prasanna Sattigeri and Karthikeyan Natesan Ramamurthy and Vijil Chenthamarakshan and Kush R. Varshney and Lav R. Varshney and Payel Das,http://arxiv.org/pdf/2210.06475v2 | |
http://arxiv.org/abs/2212.10696v1,creativecommons.org/licenses/by/4.0/,Analyzing Semantic Faithfulness of Language Models via Input Intervention on Conversational Question Answering,Akshay Chaturvedi and Swarnadeep Bhar and Soumadeep Saha and Utpal Garain and Nicholas Asher,http://arxiv.org/pdf/2212.10696v1 | |
http://arxiv.org/abs/1803.06456v1,creativecommons.org/licenses/by/4.0/,Experiments with Neural Networks for Small and Large Scale Authorship Verification,Marjan Hosseinia and Arjun Mukherjee,http://arxiv.org/pdf/1803.06456v1 | |
http://arxiv.org/abs/2104.05022v2,creativecommons.org/licenses/by/4.0/,WEC: Deriving a Large-scale Cross-document Event Coreference dataset from Wikipedia,Alon Eirew and Arie Cattan and Ido Dagan,http://arxiv.org/pdf/2104.05022v2 | |
http://arxiv.org/abs/2203.08118v3,creativecommons.org/licenses/by/4.0/,Representation Learning for Resource-Constrained Keyphrase Generation,Di Wu and Wasi Uddin Ahmad and Sunipa Dev and Kai-Wei Chang,http://arxiv.org/pdf/2203.08118v3 | |
http://arxiv.org/abs/2210.05793v2,creativecommons.org/licenses/by/4.0/,Comparison of Soft and Hard Target RNN-T Distillation for Large-scale ASR,Dongseong Hwang and Khe Chai Sim and Yu Zhang and Trevor Strohman,http://arxiv.org/pdf/2210.05793v2 | |
http://arxiv.org/abs/1904.00585v2,creativecommons.org/licenses/by/4.0/,Using Similarity Measures to Select Pretraining Data for NER,Xiang Dai and Sarvnaz Karimi and Ben Hachey and Cecile Paris,http://arxiv.org/pdf/1904.00585v2 | |
http://arxiv.org/abs/1911.04286v1,creativecommons.org/licenses/by/4.0/,Deep Contextualized Self-training for Low Resource Dependency Parsing,Guy Rotman and Roi Reichart,http://arxiv.org/pdf/1911.04286v1 | |
http://arxiv.org/abs/2004.12198v2,creativecommons.org/licenses/by/4.0/,Quantifying the Contextualization of Word Representations with Semantic Class Probing,Mengjie Zhao and Philipp Dufter and Yadollah Yaghoobzadeh and Hinrich Schütze,http://arxiv.org/pdf/2004.12198v2 | |
http://arxiv.org/abs/2010.07261v2,creativecommons.org/licenses/by/4.0/,Learning Improvised Chatbots from Adversarial Modifications of Natural Language Feedback,Makesh Narsimhan Sreedhar and Kun Ni and Siva Reddy,http://arxiv.org/pdf/2010.07261v2 | |
http://arxiv.org/abs/2012.00124v1,creativecommons.org/licenses/by/4.0/,Extreme Model Compression for On-device Natural Language Understanding,Kanthashree Mysore Sathyendra and Samridhi Choudhary and Leah Nicolich-Henkin,http://arxiv.org/pdf/2012.00124v1 | |
http://arxiv.org/abs/2012.04446v1,creativecommons.org/licenses/by/4.0/,LAMP: Label Augmented Multimodal Pretraining,Jia Guo and Chen Zhu and Yilun Zhao and Heda Wang and Yao Hu and Xiaofei He and Deng Cai,http://arxiv.org/pdf/2012.04446v1 | |
http://arxiv.org/abs/2103.15335v1,creativecommons.org/licenses/by/4.0/,Changing the Mind of Transformers for Topically-Controllable Language Generation,Haw-Shiuan Chang and Jiaming Yuan and Mohit Iyyer and Andrew McCallum,http://arxiv.org/pdf/2103.15335v1 | |
http://arxiv.org/abs/2108.12848v2,creativecommons.org/licenses/by/4.0/,Span Fine-tuning for Pre-trained Language Models,Rongzhou Bao and Zhuosheng Zhang and Hai Zhao,http://arxiv.org/pdf/2108.12848v2 | |
http://arxiv.org/abs/2109.05190v3,creativecommons.org/licenses/by/4.0/,PoKE: A Prompt-based Knowledge Eliciting Approach for Event Argument Extraction,Jiaju Lin and Qin Chen,http://arxiv.org/pdf/2109.05190v3 | |
http://arxiv.org/abs/2109.07765v1,creativecommons.org/licenses/by/4.0/,"Spanish Biomedical Crawled Corpus: A Large, Diverse Dataset for Spanish Biomedical Language Models",Casimiro Pio Carrino and Jordi Armengol-Estapé and Ona de Gibert Bonet and Asier Gutiérrez-Fandiño and Aitor Gonzalez-Agirre and Martin Krallinger and Marta Villegas,http://arxiv.org/pdf/2109.07765v1 | |
http://arxiv.org/abs/2109.08648v1,creativecommons.org/licenses/by/4.0/,Efficient Measuring of Readability to Improve Documents Accessibility for Arabic Language Learners,Sadik Bessou and Ghozlane Chenni,http://arxiv.org/pdf/2109.08648v1 | |
http://arxiv.org/abs/2110.01799v1,creativecommons.org/licenses/by/4.0/,ContractNLI: A Dataset for Document-level Natural Language Inference for Contracts,Yuta Koreeda and Christopher D. Manning,http://arxiv.org/pdf/2110.01799v1 | |
http://arxiv.org/abs/2110.04541v3,creativecommons.org/licenses/by/4.0/,The Inductive Bias of In-Context Learning: Rethinking Pretraining Example Design,Yoav Levine and Noam Wies and Daniel Jannai and Dan Navon and Yedid Hoshen and Amnon Shashua,http://arxiv.org/pdf/2110.04541v3 | |
http://arxiv.org/abs/2110.08329v2,creativecommons.org/licenses/by/4.0/,Control Prefixes for Parameter-Efficient Text Generation,Jordan Clive and Kris Cao and Marek Rei,http://arxiv.org/pdf/2110.08329v2 | |
http://arxiv.org/abs/2111.06467v2,creativecommons.org/licenses/by/4.0/,SynthBio: A Case Study in Human-AI Collaborative Curation of Text Datasets,Ann Yuan and Daphne Ippolito and Vitaly Nikolaev and Chris Callison-Burch and Andy Coenen and Sebastian Gehrmann,http://arxiv.org/pdf/2111.06467v2 | |
http://arxiv.org/abs/2111.08249v1,creativecommons.org/licenses/by/4.0/,Bengali Handwritten Grapheme Classification: Deep Learning Approach,Tarun Roy and Hasib Hasan and Kowsar Hossain and Masuma Akter Rumi,http://arxiv.org/pdf/2111.08249v1 | |
http://arxiv.org/abs/2203.02167v1,creativecommons.org/licenses/by/4.0/,SimKGC: Simple Contrastive Knowledge Graph Completion with Pre-trained Language Models,Liang Wang and Wei Zhao and Zhuoyu Wei and Jingming Liu,http://arxiv.org/pdf/2203.02167v1 | |
http://arxiv.org/abs/2203.05648v1,creativecommons.org/licenses/by/4.0/,"Contextualized Sensorimotor Norms: multi-dimensional measures of sensorimotor strength for ambiguous English words, in context",Sean Trott and Benjamin Bergen,http://arxiv.org/pdf/2203.05648v1 | |
http://arxiv.org/abs/2203.11171v4,creativecommons.org/licenses/by/4.0/,Self-Consistency Improves Chain of Thought Reasoning in Language Models,Xuezhi Wang and Jason Wei and Dale Schuurmans and Quoc Le and Ed Chi and Sharan Narang and Aakanksha Chowdhery and Denny Zhou,http://arxiv.org/pdf/2203.11171v4 | |
http://arxiv.org/abs/2204.01959v1,creativecommons.org/licenses/by/4.0/,Data Augmentation for Intent Classification with Off-the-shelf Large Language Models,Gaurav Sahu and Pau Rodriguez and Issam H. Laradji and Parmida Atighehchian and David Vazquez and Dzmitry Bahdanau,http://arxiv.org/pdf/2204.01959v1 | |
http://arxiv.org/abs/2204.03031v2,creativecommons.org/licenses/by/4.0/,VALUE: Understanding Dialect Disparity in NLU,Caleb Ziems and Jiaao Chen and Camille Harris and Jessica Anderson and Diyi Yang,http://arxiv.org/pdf/2204.03031v2 | |
http://arxiv.org/abs/2206.11815v1,creativecommons.org/licenses/by/4.0/,Always Keep your Target in Mind: Studying Semantics and Improving Performance of Neural Lexical Substitution,Nikolay Arefyev and Boris Sheludko and Alexander Podolskiy and Alexander Panchenko,http://arxiv.org/pdf/2206.11815v1 | |
http://arxiv.org/abs/2206.14607v1,creativecommons.org/licenses/by/4.0/,NERDA-Con: Extending NER models for Continual Learning -- Integrating Distinct Tasks and Updating Distribution Shifts,Supriti Vijay and Aman Priyanshu,http://arxiv.org/pdf/2206.14607v1 | |
http://arxiv.org/abs/2208.05393v1,creativecommons.org/licenses/by/4.0/,A Quantum Natural Language Processing Approach to Pronoun Resolution,Hadi Wazni and Kin Ian Lo and Lachlan McPheat and Mehrnoosh Sadrzadeh,http://arxiv.org/pdf/2208.05393v1 | |
http://arxiv.org/abs/2208.05596v1,creativecommons.org/licenses/by/4.0/,Finding Reusable Machine Learning Components to Build Programming Language Processing Pipelines,Patrick Flynn and Tristan Vanderbruggen and Chunhua Liao and Pei-Hung Lin and Murali Emani and Xipeng Shen,http://arxiv.org/pdf/2208.05596v1 | |
http://arxiv.org/abs/2208.08195v2,creativecommons.org/licenses/by/4.0/,Benchmarking Compositionality with Formal Languages,Josef Valvoda and Naomi Saphra and Jonathan Rawski and Adina Williams and Ryan Cotterell,http://arxiv.org/pdf/2208.08195v2 | |
http://arxiv.org/abs/2208.12367v2,creativecommons.org/licenses/by/4.0/,A Compact Pretraining Approach for Neural Language Models,Shahriar Golchin and Mihai Surdeanu and Nazgol Tavabi and Ata Kiapour,http://arxiv.org/pdf/2208.12367v2 | |
http://arxiv.org/abs/2208.14652v1,creativecommons.org/licenses/by/4.0/,Unified Knowledge Prompt Pre-training for Customer Service Dialogues,Keqing He and Jingang Wang and Chaobo Sun and Wei Wu,http://arxiv.org/pdf/2208.14652v1 | |
http://arxiv.org/abs/2209.14161v1,creativecommons.org/licenses/by/4.0/,Supervised Contrastive Learning as Multi-Objective Optimization for Fine-Tuning Large Pre-trained Language Models,Youness Moukafih and Mounir Ghogho and Kamel Smaili,http://arxiv.org/pdf/2209.14161v1 | |
http://arxiv.org/abs/2210.02969v3,creativecommons.org/licenses/by/4.0/,Guess the Instruction! Flipped Learning Makes Language Models Stronger Zero-Shot Learners,Seonghyeon Ye and Doyoung Kim and Joel Jang and Joongbo Shin and Minjoon Seo,http://arxiv.org/pdf/2210.02969v3 | |
http://arxiv.org/abs/2210.05245v2,creativecommons.org/licenses/by/4.0/,PatternRank: Leveraging Pretrained Language Models and Part of Speech for Unsupervised Keyphrase Extraction,Tim Schopf and Simon Klimek and Florian Matthes,http://arxiv.org/pdf/2210.05245v2 | |
http://arxiv.org/abs/2210.10325v1,creativecommons.org/licenses/by/4.0/,Improving Stability of Fine-Tuning Pretrained Language Models via Component-Wise Gradient Norm Clipping,Chenghao Yang and Xuezhe Ma,http://arxiv.org/pdf/2210.10325v1 | |
http://arxiv.org/abs/2210.11255v1,creativecommons.org/licenses/by/4.0/,Evidence > Intuition: Transferability Estimation for Encoder Selection,Elisa Bassignana and Max Müller-Eberstein and Mike Zhang and Barbara Plank,http://arxiv.org/pdf/2210.11255v1 | |
http://arxiv.org/abs/2210.11617v1,creativecommons.org/licenses/by/4.0/,Boosting Natural Language Generation from Instructions with Meta-Learning,Budhaditya Deb and Guoqing Zheng and Ahmed Hassan Awadallah,http://arxiv.org/pdf/2210.11617v1 | |
http://arxiv.org/abs/2210.13144v1,creativecommons.org/licenses/by/4.0/,Weak-Supervised Dysarthria-invariant Features for Spoken Language Understanding using an FHVAE and Adversarial Training,Jinzi Qi and Hugo Van hamme,http://arxiv.org/pdf/2210.13144v1 | |
http://arxiv.org/abs/2210.14128v1,creativecommons.org/licenses/by/4.0/,IELM: An Open Information Extraction Benchmark for Pre-Trained Language Models,Chenguang Wang and Xiao Liu and Dawn Song,http://arxiv.org/pdf/2210.14128v1 | |
http://arxiv.org/abs/2210.15224v1,creativecommons.org/licenses/by/4.0/,The Effect of Normalization for Bi-directional Amharic-English Neural Machine Translation,Tadesse Destaw Belay and Atnafu Lambebo Tonja and Olga Kolesnikova and Seid Muhie Yimam and Abinew Ali Ayele and Silesh Bogale Haile and Grigori Sidorov and Alexander Gelbukh,http://arxiv.org/pdf/2210.15224v1 | |
http://arxiv.org/abs/2211.01071v1,creativecommons.org/licenses/by/4.0/,Gradient Knowledge Distillation for Pre-trained Language Models,Lean Wang and Lei Li and Xu Sun,http://arxiv.org/pdf/2211.01071v1 | |
http://arxiv.org/abs/2211.17121v1,creativecommons.org/licenses/by/4.0/,sEHR-CE: Language modelling of structured EHR data for efficient and generalizable patient cohort expansion,Anna Munoz-Farre and Harry Rose and Sera Aylin Cakiroglu,http://arxiv.org/pdf/2211.17121v1 | |
http://arxiv.org/abs/2212.11680v1,creativecommons.org/licenses/by/4.0/,Smooth Sailing: Improving Active Learning for Pre-trained Language Models with Representation Smoothness Analysis,Josip Jukić and Jan Šnajder,http://arxiv.org/pdf/2212.11680v1 | |
http://arxiv.org/abs/2302.03494v8,creativecommons.org/licenses/by/4.0/,A Categorical Archive of ChatGPT Failures,Ali Borji,http://arxiv.org/pdf/2302.03494v8 | |
http://arxiv.org/abs/2303.03628v1,creativecommons.org/licenses/by/4.0/,CoTEVer: Chain of Thought Prompting Annotation Toolkit for Explanation Verification,Seungone Kim and Se June Joo and Yul Jang and Hyungjoo Chae and Jinyoung Yeo,http://arxiv.org/pdf/2303.03628v1 | |
http://arxiv.org/abs/2303.08991v2,creativecommons.org/licenses/by/4.0/,DeltaScore: Story Evaluation with Perturbations,Zhuohan Xie and Miao Li and Trevor Cohn and Jey Han Lau,http://arxiv.org/pdf/2303.08991v2 | |
http://arxiv.org/abs/2303.17650v1,creativecommons.org/licenses/by/4.0/,Comparing Abstractive Summaries Generated by ChatGPT to Real Summaries Through Blinded Reviewers and Text Classification Algorithms,Mayank Soni and Vincent Wade,http://arxiv.org/pdf/2303.17650v1 | |
http://arxiv.org/abs/2304.02828v1,creativecommons.org/licenses/by/4.0/,Uncurated Image-Text Datasets: Shedding Light on Demographic Bias,Noa Garcia and Yusuke Hirota and Yankun Wu and Yuta Nakashima,http://arxiv.org/pdf/2304.02828v1 | |
http://arxiv.org/abs/2304.07830v1,creativecommons.org/licenses/by/4.0/,How does ChatGPT rate sound semantics?,Kai Siedenburg and Charalampos Saitis,http://arxiv.org/pdf/2304.07830v1 | |
http://arxiv.org/abs/2304.10346v1,creativecommons.org/licenses/by/4.0/,Interventional Probing in High Dimensions: An NLI Case Study,Julia Rozanova and Marco Valentino and Lucas Cordeiro and Andre Freitas,http://arxiv.org/pdf/2304.10346v1 | |
http://arxiv.org/abs/2303.12417v2,creativecommons.org/licenses/by/4.0/,CLIP$^2$: Contrastive Language-Image-Point Pretraining from Real-World Point Cloud Data,Yihan Zeng and Chenhan Jiang and Jiageng Mao and Jianhua Han and Chaoqiang Ye and Qingqiu Huang and Dit-Yan Yeung and Zhen Yang and Xiaodan Liang and Hang Xu,http://arxiv.org/pdf/2303.12417v2 | |
http://arxiv.org/abs/1810.11960v2,creativecommons.org/licenses/by/4.0/,Investigation of enhanced Tacotron text-to-speech synthesis systems with self-attention for pitch accent language,Yusuke Yasuda and Xin Wang and Shinji Takaki and Junichi Yamagishi,http://arxiv.org/pdf/1810.11960v2 | |
http://arxiv.org/abs/2106.06090v2,creativecommons.org/licenses/by/4.0/,Graph Neural Networks for Natural Language Processing: A Survey,Lingfei Wu and Yu Chen and Kai Shen and Xiaojie Guo and Hanning Gao and Shucheng Li and Jian Pei and Bo Long,http://arxiv.org/pdf/2106.06090v2 | |
http://arxiv.org/abs/2205.11024v2,creativecommons.org/licenses/by/4.0/,Vector-Quantized Input-Contextualized Soft Prompts for Natural Language Understanding,Rishabh Bhardwaj and Amrita Saha and Steven C. H. Hoi and Soujanya Poria,http://arxiv.org/pdf/2205.11024v2 | |
http://arxiv.org/abs/2010.07665v1,creativecommons.org/licenses/by/4.0/,Diverse Keyphrase Generation with Neural Unlikelihood Training,Hareesh Bahuleyan and Layla El Asri,http://arxiv.org/pdf/2010.07665v1 | |
http://arxiv.org/abs/2102.07983v1,creativecommons.org/licenses/by/4.0/,"FEWS: Large-Scale, Low-Shot Word Sense Disambiguation with the Dictionary",Terra Blevins and Mandar Joshi and Luke Zettlemoyer,http://arxiv.org/pdf/2102.07983v1 | |
http://arxiv.org/abs/2105.07623v2,creativecommons.org/licenses/by/4.0/,Sentence Similarity Based on Contexts,Xiaofei Sun and Yuxian Meng and Xiang Ao and Fei Wu and Tianwei Zhang and Jiwei Li and Chun Fan,http://arxiv.org/pdf/2105.07623v2 | |
http://arxiv.org/abs/2109.06264v3,creativecommons.org/licenses/by/4.0/,Post-OCR Document Correction with large Ensembles of Character Sequence-to-Sequence Models,Juan Ramirez-Orta and Eduardo Xamena and Ana Maguitman and Evangelos Milios and Axel J. Soto,http://arxiv.org/pdf/2109.06264v3 | |
http://arxiv.org/abs/2301.07093v2,creativecommons.org/licenses/by/4.0/,GLIGEN: Open-Set Grounded Text-to-Image Generation,Yuheng Li and Haotian Liu and Qingyang Wu and Fangzhou Mu and Jianwei Yang and Jianfeng Gao and Chunyuan Li and Yong Jae Lee,http://arxiv.org/pdf/2301.07093v2 | |
http://arxiv.org/abs/2301.07543v1,creativecommons.org/licenses/by/4.0/,Large Language Models as Simulated Economic Agents: What Can We Learn from Homo Silicus?,John J. Horton,http://arxiv.org/pdf/2301.07543v1 | |
http://arxiv.org/abs/2109.03537v2,creativecommons.org/licenses/by/4.0/,On the Transferability of Pre-trained Language Models: A Study from Artificial Datasets,Cheng-Han Chiang and Hung-yi Lee,http://arxiv.org/pdf/2109.03537v2 | |
http://arxiv.org/abs/2209.06422v1,creativecommons.org/licenses/by/4.0/,Language Chameleon: Transformation analysis between languages using Cross-lingual Post-training based on Pre-trained language models,Suhyune Son and Chanjun Park and Jungseob Lee and Midan Shim and Chanhee Lee and Yoonna Jang and Jaehyung Seo and Heuiseok Lim,http://arxiv.org/pdf/2209.06422v1 | |
http://arxiv.org/abs/2303.18232v1,creativecommons.org/licenses/by/4.0/,DIME-FM: DIstilling Multimodal and Efficient Foundation Models,Ximeng Sun and Pengchuan Zhang and Peizhao Zhang and Hardik Shah and Kate Saenko and Xide Xia,http://arxiv.org/pdf/2303.18232v1 | |
http://arxiv.org/abs/1606.06361v2,creativecommons.org/licenses/by/4.0/,A Probabilistic Generative Grammar for Semantic Parsing,Abulhair Saparov,http://arxiv.org/pdf/1606.06361v2 | |
http://arxiv.org/abs/2107.00650v2,creativecommons.org/licenses/by/4.0/,CLIP-It! Language-Guided Video Summarization,Medhini Narasimhan and Anna Rohrbach and Trevor Darrell,http://arxiv.org/pdf/2107.00650v2 | |
http://arxiv.org/abs/2112.04426v3,creativecommons.org/licenses/by/4.0/,Improving language models by retrieving from trillions of tokens,Sebastian Borgeaud and Arthur Mensch and Jordan Hoffmann and Trevor Cai and Eliza Rutherford and Katie Millican and George van den Driessche and Jean-Baptiste Lespiau and Bogdan Damoc and Aidan Clark and Diego de Las Casas and Aurelia Guy and Jacob Menick and Roman Ring and Tom Hennigan and Saffron Huang and Loren Maggiore and Chris Jones and Albin Cassirer and Andy Brock and Michela Paganini and Geoffrey Irving and Oriol Vinyals and Simon Osindero and Karen Simonyan and Jack W. Rae and Erich Elsen and Laurent Sifre,http://arxiv.org/pdf/2112.04426v3 | |
http://arxiv.org/abs/2203.09590v4,creativecommons.org/licenses/by/4.0/,Enhanced Temporal Knowledge Embeddings with Contextualized Language Representations,Zhen Han and Ruotong Liao and Beiyan Liu and Yao Zhang and Zifeng Ding and Jindong Gu and Heinz Köppl and Hinrich Schütze and Volker Tresp,http://arxiv.org/pdf/2203.09590v4 | |
http://arxiv.org/abs/2203.14267v2,creativecommons.org/licenses/by/4.0/,bitsa_nlp@LT-EDI-ACL2022: Leveraging Pretrained Language Models for Detecting Homophobia and Transphobia in Social Media Comments,Vitthal Bhandari and Poonam Goyal,http://arxiv.org/pdf/2203.14267v2 | |
http://arxiv.org/abs/2204.02123v1,creativecommons.org/licenses/by/4.0/,Improved and Efficient Conversational Slot Labeling through Question Answering,Gabor Fuisz and Ivan Vulić and Samuel Gibbons and Inigo Casanueva and Paweł Budzianowski,http://arxiv.org/pdf/2204.02123v1 | |
http://arxiv.org/abs/2303.03697v1,creativecommons.org/licenses/by/4.0/,Stylometric Detection of AI-Generated Text in Twitter Timelines,Tharindu Kumarage and Joshua Garland and Amrita Bhattacharjee and Kirill Trapeznikov and Scott Ruston and Huan Liu,http://arxiv.org/pdf/2303.03697v1 | |
http://arxiv.org/abs/2303.03836v2,creativecommons.org/licenses/by/4.0/,Exploring the Feasibility of ChatGPT for Event Extraction,Jun Gao and Huan Zhao and Changlong Yu and Ruifeng Xu,http://arxiv.org/pdf/2303.03836v2 | |
http://arxiv.org/abs/2303.14822v1,creativecommons.org/licenses/by/4.0/,MGTBench: Benchmarking Machine-Generated Text Detection,Xinlei He and Xinyue Shen and Zeyuan Chen and Michael Backes and Yang Zhang,http://arxiv.org/pdf/2303.14822v1 | |
http://arxiv.org/abs/2304.03843v1,creativecommons.org/licenses/by/4.0/,Why think step-by-step? Reasoning emerges from the locality of experience,Ben Prystawski and Noah D. Goodman,http://arxiv.org/pdf/2304.03843v1 | |
http://arxiv.org/abs/2212.13138v1,creativecommons.org/licenses/by/4.0/,Large Language Models Encode Clinical Knowledge,Karan Singhal and Shekoofeh Azizi and Tao Tu and S. Sara Mahdavi and Jason Wei and Hyung Won Chung and Nathan Scales and Ajay Tanwani and Heather Cole-Lewis and Stephen Pfohl and Perry Payne and Martin Seneviratne and Paul Gamble and Chris Kelly and Nathanael Scharli and Aakanksha Chowdhery and Philip Mansfield and Blaise Aguera y Arcas and Dale Webster and Greg S. Corrado and Yossi Matias and Katherine Chou and Juraj Gottweis and Nenad Tomasev and Yun Liu and Alvin Rajkomar and Joelle Barral and Christopher Semturs and Alan Karthikesalingam and Vivek Natarajan,http://arxiv.org/pdf/2212.13138v1 | |
http://arxiv.org/abs/2208.02402v2,creativecommons.org/licenses/by/4.0/,Fusing Sentence Embeddings Into LSTM-based Autoregressive Language Models,Vilém Zouhar and Marius Mosbach and Dietrich Klakow,http://arxiv.org/pdf/2208.02402v2 | |
http://arxiv.org/abs/2103.13275v1,creativecommons.org/licenses/by/4.0/,When Word Embeddings Become Endangered,Khalid Alnajjar,http://arxiv.org/pdf/2103.13275v1 | |
http://arxiv.org/abs/2111.07180v1,creativecommons.org/licenses/by/4.0/,Explainable Semantic Space by Grounding Language to Vision with Cross-Modal Contrastive Learning,Yizhen Zhang and Minkyu Choi and Kuan Han and Zhongming Liu,http://arxiv.org/pdf/2111.07180v1 | |
http://arxiv.org/abs/2107.10137v2,creativecommons.org/licenses/by/4.0/,Improved Text Classification via Contrastive Adversarial Training,Lin Pan and Chung-Wei Hang and Avirup Sil and Saloni Potdar,http://arxiv.org/pdf/2107.10137v2 | |
http://arxiv.org/abs/1903.04739v1,creativecommons.org/licenses/by/4.0/,Syllable-based Neural Named Entity Recognition for Myanmar Language,Hsu Myat Mo and Khin Mar Soe,http://arxiv.org/pdf/1903.04739v1 | |
http://arxiv.org/abs/2102.00894v1,creativecommons.org/licenses/by/4.0/,Multilingual LAMA: Investigating Knowledge in Multilingual Pretrained Language Models,Nora Kassner and Philipp Dufter and Hinrich Schütze,http://arxiv.org/pdf/2102.00894v1 | |
http://arxiv.org/abs/2106.06683v2,creativecommons.org/licenses/by/4.0/,Assessing Multilingual Fairness in Pre-trained Multimodal Representations,Jialu Wang and Yang Liu and Xin Eric Wang,http://arxiv.org/pdf/2106.06683v2 | |
http://arxiv.org/abs/2210.00066v1,creativecommons.org/licenses/by/4.0/,Improving Policy Learning via Language Dynamics Distillation,Victor Zhong and Jesse Mu and Luke Zettlemoyer and Edward Grefenstette and Tim Rocktäschel,http://arxiv.org/pdf/2210.00066v1 | |
http://arxiv.org/abs/2108.03739v2,creativecommons.org/licenses/by/4.0/,Machine Translation of Low-Resource Indo-European Languages,Wei-Rui Chen and Muhammad Abdul-Mageed,http://arxiv.org/pdf/2108.03739v2 | |
http://arxiv.org/abs/2104.09400v1,creativecommons.org/licenses/by/4.0/,Probing for Bridging Inference in Transformer Language Models,Onkar Pandit and Yufang Hou,http://arxiv.org/pdf/2104.09400v1 | |
http://arxiv.org/abs/2202.09955v2,creativecommons.org/licenses/by/4.0/,StyleBERT: Chinese pretraining by font style information,Chao Lv and Han Zhang and XinKai Du and Yunhao Zhang and Ying Huang and Wenhao Li and Jia Han and Shanshan Gu,http://arxiv.org/pdf/2202.09955v2 | |
http://arxiv.org/abs/2303.13310v1,creativecommons.org/licenses/by/4.0/,SwissBERT: The Multilingual Language Model for Switzerland,Jannis Vamvas and Johannes Graën and Rico Sennrich,http://arxiv.org/pdf/2303.13310v1 | |
http://arxiv.org/abs/2110.07304v1,creativecommons.org/licenses/by/4.0/,An Empirical Investigation of Multi-bridge Multilingual NMT models,Anoop Kunchukuttan,http://arxiv.org/pdf/2110.07304v1 | |
http://arxiv.org/abs/2208.11194v1,creativecommons.org/licenses/by/4.0/,Bitext Mining for Low-Resource Languages via Contrastive Learning,Weiting Tan and Philipp Koehn,http://arxiv.org/pdf/2208.11194v1 | |
http://arxiv.org/abs/2103.15877v1,creativecommons.org/licenses/by/4.0/,Unsupervised Machine Translation On Dravidian Languages,Sai Koneru and Danni Liu and Jan Niehues,http://arxiv.org/pdf/2103.15877v1 | |
http://arxiv.org/abs/2110.15943v2,creativecommons.org/licenses/by/4.0/,MetaICL: Learning to Learn In Context,Sewon Min and Mike Lewis and Luke Zettlemoyer and Hannaneh Hajishirzi,http://arxiv.org/pdf/2110.15943v2 | |
http://arxiv.org/abs/2111.10337v2,creativecommons.org/licenses/by/4.0/,Advancing High-Resolution Video-Language Representation with Large-Scale Video Transcriptions,Hongwei Xue and Tiankai Hang and Yanhong Zeng and Yuchong Sun and Bei Liu and Huan Yang and Jianlong Fu and Baining Guo,http://arxiv.org/pdf/2111.10337v2 | |
http://arxiv.org/abs/2202.12837v2,creativecommons.org/licenses/by/4.0/,Rethinking the Role of Demonstrations: What Makes In-Context Learning Work?,Sewon Min and Xinxi Lyu and Ari Holtzman and Mikel Artetxe and Mike Lewis and Hannaneh Hajishirzi and Luke Zettlemoyer,http://arxiv.org/pdf/2202.12837v2 | |
http://arxiv.org/abs/2302.12813v3,creativecommons.org/licenses/by/4.0/,Check Your Facts and Try Again: Improving Large Language Models with External Knowledge and Automated Feedback,Baolin Peng and Michel Galley and Pengcheng He and Hao Cheng and Yujia Xie and Yu Hu and Qiuyuan Huang and Lars Liden and Zhou Yu and Weizhu Chen and Jianfeng Gao,http://arxiv.org/pdf/2302.12813v3 | |
http://arxiv.org/abs/2304.10428v1,creativecommons.org/licenses/by/4.0/,GPT-NER: Named Entity Recognition via Large Language Models,Shuhe Wang and Xiaofei Sun and Xiaoya Li and Rongbin Ouyang and Fei Wu and Tianwei Zhang and Jiwei Li and Guoyin Wang,http://arxiv.org/pdf/2304.10428v1 | |
http://arxiv.org/abs/2101.00419v2,creativecommons.org/licenses/by/4.0/,KM-BART: Knowledge Enhanced Multimodal BART for Visual Commonsense Generation,Yiran Xing and Zai Shi and Zhao Meng and Gerhard Lakemeyer and Yunpu Ma and Roger Wattenhofer,http://arxiv.org/pdf/2101.00419v2 | |
http://arxiv.org/abs/2110.02067v1,creativecommons.org/licenses/by/4.0/,Teach Me What to Say and I Will Learn What to Pick: Unsupervised Knowledge Selection Through Response Generation with Pretrained Generative Models,Ehsan Lotfi and Maxime De Bruyn and Jeska Buhmann and Walter Daelemans,http://arxiv.org/pdf/2110.02067v1 | |
http://arxiv.org/abs/2112.08616v1,creativecommons.org/licenses/by/4.0/,Masked Measurement Prediction: Learning to Jointly Predict Quantities and Units from Textual Context,Daniel Spokoyny and Ivan Lee and Zhao Jin and Taylor Berg-Kirkpatrick,http://arxiv.org/pdf/2112.08616v1 | |
http://arxiv.org/abs/2112.11909v2,creativecommons.org/licenses/by/4.0/,Few-shot Multi-hop Question Answering over Knowledge Base,Meihao Fan and Lei Zhang and Siyao Xiao and Yuru Liang,http://arxiv.org/pdf/2112.11909v2 | |
http://arxiv.org/abs/2211.08473v1,creativecommons.org/licenses/by/4.0/,On the Compositional Generalization Gap of In-Context Learning,Arian Hosseini and Ankit Vani and Dzmitry Bahdanau and Alessandro Sordoni and Aaron Courville,http://arxiv.org/pdf/2211.08473v1 | |
http://arxiv.org/abs/2211.14865v2,creativecommons.org/licenses/by/4.0/,Understanding BLOOM: An empirical study on diverse NLP tasks,Parag Pravin Dakle and SaiKrishna Rallabandi and Preethi Raghavan,http://arxiv.org/pdf/2211.14865v2 | |
http://arxiv.org/abs/2210.08402v1,creativecommons.org/licenses/by/4.0/,LAION-5B: An open large-scale dataset for training next generation image-text models,Christoph Schuhmann and Romain Beaumont and Richard Vencu and Cade Gordon and Ross Wightman and Mehdi Cherti and Theo Coombes and Aarush Katta and Clayton Mullis and Mitchell Wortsman and Patrick Schramowski and Srivatsa Kundurthy and Katherine Crowson and Ludwig Schmidt and Robert Kaczmarczyk and Jenia Jitsev,http://arxiv.org/pdf/2210.08402v1 | |
http://arxiv.org/abs/1707.06480v1,creativecommons.org/licenses/by/4.0/,Syllable-aware Neural Language Models: A Failure to Beat Character-aware Ones,Zhenisbek Assylbekov and Rustem Takhanov and Bagdat Myrzakhmetov and Jonathan N. Washington,http://arxiv.org/pdf/1707.06480v1 | |
http://arxiv.org/abs/2012.05776v3,creativecommons.org/licenses/by/4.0/,Multi-Sense Language Modelling,Andrea Lekkas and Peter Schneider-Kamp and Isabelle Augenstein,http://arxiv.org/pdf/2012.05776v3 | |
http://arxiv.org/abs/2012.12543v2,creativecommons.org/licenses/by/4.0/,Code Switching Language Model Using Monolingual Training Data,Asad Ullah and Tauseef Ahmed,http://arxiv.org/pdf/2012.12543v2 | |
http://arxiv.org/abs/2203.16512v2,creativecommons.org/licenses/by/4.0/,Vakyansh: ASR Toolkit for Low Resource Indic languages,Harveen Singh Chadha and Anirudh Gupta and Priyanshi Shah and Neeraj Chhimwal and Ankur Dhuriya and Rishabh Gaur and Vivek Raghavan,http://arxiv.org/pdf/2203.16512v2 | |
http://arxiv.org/abs/2212.10440v1,creativecommons.org/licenses/by/4.0/,Perplexed by Quality: A Perplexity-based Method for Adult and Harmful Content Detection in Multilingual Heterogeneous Web Data,Tim Jansen and Yangling Tong and Victoria Zevallos and Pedro Ortiz Suarez,http://arxiv.org/pdf/2212.10440v1 | |
http://arxiv.org/abs/2303.05453v1,creativecommons.org/licenses/by/4.0/,Personalisation within bounds: A risk taxonomy and policy framework for the alignment of large language models with personalised feedback,Hannah Rose Kirk and Bertie Vidgen and Paul Röttger and Scott A. Hale,http://arxiv.org/pdf/2303.05453v1 | |
http://arxiv.org/abs/2210.15042v3,creativecommons.org/licenses/by/4.0/,Privately Fine-Tuning Large Language Models with Differential Privacy,Rouzbeh Behnia and Mohammadreza Ebrahimi and Jason Pacheco and Balaji Padmanabhan,http://arxiv.org/pdf/2210.15042v3 | |
http://arxiv.org/abs/2110.08207v3,creativecommons.org/licenses/by/4.0/,Multitask Prompted Training Enables Zero-Shot Task Generalization,Victor Sanh and Albert Webson and Colin Raffel and Stephen H. Bach and Lintang Sutawika and Zaid Alyafeai and Antoine Chaffin and Arnaud Stiegler and Teven Le Scao and Arun Raja and Manan Dey and M Saiful Bari and Canwen Xu and Urmish Thakker and Shanya Sharma Sharma and Eliza Szczechla and Taewoon Kim and Gunjan Chhablani and Nihal Nayak and Debajyoti Datta and Jonathan Chang and Mike Tian-Jian Jiang and Han Wang and Matteo Manica and Sheng Shen and Zheng Xin Yong and Harshit Pandey and Rachel Bawden and Thomas Wang and Trishala Neeraj and Jos Rozen and Abheesht Sharma and Andrea Santilli and Thibault Fevry and Jason Alan Fries and Ryan Teehan and Tali Bers and Stella Biderman and Leo Gao and Thomas Wolf and Alexander M. Rush,http://arxiv.org/pdf/2110.08207v3 | |
http://arxiv.org/abs/2207.07411v1,creativecommons.org/licenses/by/4.0/,Plex: Towards Reliability using Pretrained Large Model Extensions,Dustin Tran and Jeremiah Liu and Michael W. Dusenberry and Du Phan and Mark Collier and Jie Ren and Kehang Han and Zi Wang and Zelda Mariet and Huiyi Hu and Neil Band and Tim G. J. Rudner and Karan Singhal and Zachary Nado and Joost van Amersfoort and Andreas Kirsch and Rodolphe Jenatton and Nithum Thain and Honglin Yuan and Kelly Buchanan and Kevin Murphy and D. Sculley and Yarin Gal and Zoubin Ghahramani and Jasper Snoek and Balaji Lakshminarayanan,http://arxiv.org/pdf/2207.07411v1 | |
http://arxiv.org/abs/2110.05877v1,creativecommons.org/licenses/by/4.0/,OpenHands: Making Sign Language Recognition Accessible with Pose-based Pretrained Models across Languages,Prem Selvaraj and Gokul NC and Pratyush Kumar and Mitesh Khapra,http://arxiv.org/pdf/2110.05877v1 | |
http://arxiv.org/abs/2302.05016v2,creativecommons.org/licenses/by/4.0/,Is Multimodal Vision Supervision Beneficial to Language?,Avinash Madasu and Vasudev Lal,http://arxiv.org/pdf/2302.05016v2 | |
http://arxiv.org/abs/2205.07307v1,creativecommons.org/licenses/by/4.0/,Optimization of Decision Tree Evaluation Using SIMD Instructions,Alexey Mironov and Ilnur Khuziev,http://arxiv.org/pdf/2205.07307v1 | |
http://arxiv.org/abs/2211.15271v2,creativecommons.org/licenses/by/4.0/,The Myth of Culturally Agnostic AI Models,Eva Cetinic,http://arxiv.org/pdf/2211.15271v2 | |
http://arxiv.org/abs/1708.00415v2,creativecommons.org/licenses/by/4.0/,A Generative Parser with a Discriminative Recognition Algorithm,Jianpeng Cheng and Adam Lopez and Mirella Lapata,http://arxiv.org/pdf/1708.00415v2 | |
http://arxiv.org/abs/2103.16716v1,creativecommons.org/licenses/by/4.0/,"BASE Layers: Simplifying Training of Large, Sparse Models",Mike Lewis and Shruti Bhosale and Tim Dettmers and Naman Goyal and Luke Zettlemoyer,http://arxiv.org/pdf/2103.16716v1 | |
http://arxiv.org/abs/2107.03176v1,creativecommons.org/licenses/by/4.0/,On Training Instance Selection for Few-Shot Neural Text Generation,Ernie Chang and Xiaoyu Shen and Hui-Syuan Yeh and Vera Demberg,http://arxiv.org/pdf/2107.03176v1 | |
http://arxiv.org/abs/2203.08568v3,creativecommons.org/licenses/by/4.0/,In-Context Learning for Few-Shot Dialogue State Tracking,Yushi Hu and Chia-Hsuan Lee and Tianbao Xie and Tao Yu and Noah A. Smith and Mari Ostendorf,http://arxiv.org/pdf/2203.08568v3 | |
http://arxiv.org/abs/2205.09685v1,creativecommons.org/licenses/by/4.0/,ArabGlossBERT: Fine-Tuning BERT on Context-Gloss Pairs for WSD,Moustafa Al-Hajj and Mustafa Jarrar,http://arxiv.org/pdf/2205.09685v1 | |
http://arxiv.org/abs/2205.11194v1,creativecommons.org/licenses/by/4.0/,UnifieR: A Unified Retriever for Large-Scale Retrieval,Tao Shen and Xiubo Geng and Chongyang Tao and Can Xu and Kai Zhang and Daxin Jiang,http://arxiv.org/pdf/2205.11194v1 | |
http://arxiv.org/abs/2212.01558v1,creativecommons.org/licenses/by/4.0/,PartSLIP: Low-Shot Part Segmentation for 3D Point Clouds via Pretrained Image-Language Models,Minghua Liu and Yinhao Zhu and Hong Cai and Shizhong Han and Zhan Ling and Fatih Porikli and Hao Su,http://arxiv.org/pdf/2212.01558v1 | |
http://arxiv.org/abs/2212.14149v1,creativecommons.org/licenses/by/4.0/,Macro-block dropout for improved regularization in training end-to-end speech recognition models,Chanwoo Kim and Sathish Indurti and Jinhwan Park and Wonyong Sung,http://arxiv.org/pdf/2212.14149v1 | |
http://arxiv.org/abs/2301.01820v3,creativecommons.org/licenses/by/4.0/,InPars-v2: Large Language Models as Efficient Dataset Generators for Information Retrieval,Vitor Jeronymo and Luiz Bonifacio and Hugo Abonizio and Marzieh Fadaee and Roberto Lotufo and Jakub Zavrel and Rodrigo Nogueira,http://arxiv.org/pdf/2301.01820v3 | |
http://arxiv.org/abs/2304.11015v1,creativecommons.org/licenses/by/4.0/,DIN-SQL: Decomposed In-Context Learning of Text-to-SQL with Self-Correction,Mohammadreza Pourreza and Davood Rafiei,http://arxiv.org/pdf/2304.11015v1 | |
http://arxiv.org/abs/2101.04566v1,creativecommons.org/licenses/by/4.0/,Frequency Limited $\mathcal{H}_2$ Optimal Model Reduction of Large-Scale Sparse Dynamical Systems,Xin Du and M. Monir Uddin and A. Mostakim Fony and Md. Tanzim Hossain and Mohammad Sahadat-Hossain,http://arxiv.org/pdf/2101.04566v1 | |
http://arxiv.org/abs/1910.03806v1,creativecommons.org/licenses/by/4.0/,Is Multilingual BERT Fluent in Language Generation?,Samuel Rönnqvist and Jenna Kanerva and Tapio Salakoski and Filip Ginter,http://arxiv.org/pdf/1910.03806v1 | |
http://arxiv.org/abs/1911.07613v1,creativecommons.org/licenses/by/4.0/,A Subword Level Language Model for Bangla Language,Aisha Khatun and Anisur Rahman and Hemayet Ahmed Chowdhury and Md. Saiful Islam and Ayesha Tasnim,http://arxiv.org/pdf/1911.07613v1 | |
http://arxiv.org/abs/2206.10668v1,creativecommons.org/licenses/by/4.0/,BenchCLAMP: A Benchmark for Evaluating Language Models on Semantic Parsing,Subhro Roy and Sam Thomson and Tongfei Chen and Richard Shin and Adam Pauls and Jason Eisner and Benjamin Van Durme,http://arxiv.org/pdf/2206.10668v1 | |
http://arxiv.org/abs/2105.09938v3,creativecommons.org/licenses/by/4.0/,Measuring Coding Challenge Competence With APPS,Dan Hendrycks and Steven Basart and Saurav Kadavath and Mantas Mazeika and Akul Arora and Ethan Guo and Collin Burns and Samir Puranik and Horace He and Dawn Song and Jacob Steinhardt,http://arxiv.org/pdf/2105.09938v3 | |
http://arxiv.org/abs/2205.12302v2,creativecommons.org/licenses/by/4.0/,Garden-Path Traversal in GPT-2,William Jurayj and William Rudman and Carsten Eickhoff,http://arxiv.org/pdf/2205.12302v2 | |
http://arxiv.org/abs/2303.15846v1,creativecommons.org/licenses/by/4.0/,Soft-prompt tuning to predict lung cancer using primary care free-text Dutch medical notes,Auke Elfrink and Iacopo Vagliano and Ameen Abu-Hanna and Iacer Calixto,http://arxiv.org/pdf/2303.15846v1 | |
http://arxiv.org/abs/2304.07258v1,creativecommons.org/licenses/by/4.0/,"Learn What Is Possible, Then Choose What Is Best: Disentangling One-To-Many Relations in Language Through Text-based Games",Benjamin Towle and Ke Zhou,http://arxiv.org/pdf/2304.07258v1 | |
http://arxiv.org/abs/2104.08384v2,creativecommons.org/licenses/by/4.0/,"""Wikily"" Supervised Neural Translation Tailored to Cross-Lingual Tasks",Mohammad Sadegh Rasooli and Chris Callison-Burch and Derry Tanti Wijaya,http://arxiv.org/pdf/2104.08384v2 | |
http://arxiv.org/abs/2105.14839v2,creativecommons.org/licenses/by/4.0/,Greedy-layer Pruning: Speeding up Transformer Models for Natural Language Processing,David Peer and Sebastian Stabinger and Stefan Engl and Antonio Rodriguez-Sanchez,http://arxiv.org/pdf/2105.14839v2 | |
http://arxiv.org/abs/2212.10233v1,creativecommons.org/licenses/by/4.0/,Pre-trained Language Models for Keyphrase Generation: A Thorough Empirical Study,Di Wu and Wasi Uddin Ahmad and Kai-Wei Chang,http://arxiv.org/pdf/2212.10233v1 | |
http://arxiv.org/abs/2302.08583v1,creativecommons.org/licenses/by/4.0/,JEIT: Joint End-to-End Model and Internal Language Model Training for Speech Recognition,Zhong Meng and Weiran Wang and Rohit Prabhavalkar and Tara N. Sainath and Tongzhou Chen and Ehsan Variani and Yu Zhang and Bo Li and Andrew Rosenberg and Bhuvana Ramabhadran,http://arxiv.org/pdf/2302.08583v1 | |
http://arxiv.org/abs/2209.09513v2,creativecommons.org/licenses/by/4.0/,Learn to Explain: Multimodal Reasoning via Thought Chains for Science Question Answering,Pan Lu and Swaroop Mishra and Tony Xia and Liang Qiu and Kai-Wei Chang and Song-Chun Zhu and Oyvind Tafjord and Peter Clark and Ashwin Kalyan,http://arxiv.org/pdf/2209.09513v2 | |
http://arxiv.org/abs/2302.07459v2,creativecommons.org/licenses/by/4.0/,The Capacity for Moral Self-Correction in Large Language Models,Deep Ganguli and Amanda Askell and Nicholas Schiefer and Thomas I. Liao and Kamilė Lukošiūtė and Anna Chen and Anna Goldie and Azalia Mirhoseini and Catherine Olsson and Danny Hernandez and Dawn Drain and Dustin Li and Eli Tran-Johnson and Ethan Perez and Jackson Kernion and Jamie Kerr and Jared Mueller and Joshua Landau and Kamal Ndousse and Karina Nguyen and Liane Lovitt and Michael Sellitto and Nelson Elhage and Noemi Mercado and Nova DasSarma and Oliver Rausch and Robert Lasenby and Robin Larson and Sam Ringer and Sandipan Kundu and Saurav Kadavath and Scott Johnston and Shauna Kravec and Sheer El Showk and Tamera Lanham and Timothy Telleen-Lawton and Tom Henighan and Tristan Hume and Yuntao Bai and Zac Hatfield-Dodds and Ben Mann and Dario Amodei and Nicholas Joseph and Sam McCandlish and Tom Brown and Christopher Olah and Jack Clark and Samuel R. Bowman and Jared Kaplan,http://arxiv.org/pdf/2302.07459v2 | |
http://arxiv.org/abs/2201.00075v1,creativecommons.org/licenses/by/4.0/,How do lexical semantics affect translation? An empirical study,Vivek Subramanian and Dhanasekar Sundararaman,http://arxiv.org/pdf/2201.00075v1 | |
http://arxiv.org/abs/2208.10347v1,creativecommons.org/licenses/by/4.0/,A robust class of languages of 2-nested words,Séverine Fratani and Guillaume Maurras and Pierre-Alain Reynier,http://arxiv.org/pdf/2208.10347v1 | |
http://arxiv.org/abs/2211.09110v1,creativecommons.org/licenses/by/4.0/,Holistic Evaluation of Language Models,Percy Liang and Rishi Bommasani and Tony Lee and Dimitris Tsipras and Dilara Soylu and Michihiro Yasunaga and Yian Zhang and Deepak Narayanan and Yuhuai Wu and Ananya Kumar and Benjamin Newman and Binhang Yuan and Bobby Yan and Ce Zhang and Christian Cosgrove and Christopher D. Manning and Christopher Ré and Diana Acosta-Navas and Drew A. Hudson and Eric Zelikman and Esin Durmus and Faisal Ladhak and Frieda Rong and Hongyu Ren and Huaxiu Yao and Jue Wang and Keshav Santhanam and Laurel Orr and Lucia Zheng and Mert Yuksekgonul and Mirac Suzgun and Nathan Kim and Neel Guha and Niladri Chatterji and Omar Khattab and Peter Henderson and Qian Huang and Ryan Chi and Sang Michael Xie and Shibani Santurkar and Surya Ganguli and Tatsunori Hashimoto and Thomas Icard and Tianyi Zhang and Vishrav Chaudhary and William Wang and Xuechen Li and Yifan Mai and Yuhui Zhang and Yuta Koreeda,http://arxiv.org/pdf/2211.09110v1 | |
http://arxiv.org/abs/1812.11549v1,creativecommons.org/licenses/by/4.0/,Visibly Pushdown Languages over Sliding Windows,Moses Ganardi,http://arxiv.org/pdf/1812.11549v1 | |
http://arxiv.org/abs/2102.10535v1,creativecommons.org/licenses/by/4.0/,Automatic Code Generation using Pre-Trained Language Models,Luis Perez and Lizi Ottens and Sudharshan Viswanathan,http://arxiv.org/pdf/2102.10535v1 | |
http://arxiv.org/abs/2204.03951v1,creativecommons.org/licenses/by/4.0/,RuBioRoBERTa: a pre-trained biomedical language model for Russian language biomedical text mining,Alexander Yalunin and Alexander Nesterov and Dmitriy Umerenkov,http://arxiv.org/pdf/2204.03951v1 | |
http://arxiv.org/abs/2005.12656v3,creativecommons.org/licenses/by/4.0/,An open-source voice type classifier for child-centered daylong recordings,Marvin Lavechin and Ruben Bousbib and Hervé Bredin and Emmanuel Dupoux and Alejandrina Cristia,http://arxiv.org/pdf/2005.12656v3 | |
http://arxiv.org/abs/2104.08815v3,creativecommons.org/licenses/by/4.0/,FedNLP: Benchmarking Federated Learning Methods for Natural Language Processing Tasks,Bill Yuchen Lin and Chaoyang He and Zihang Zeng and Hulin Wang and Yufen Huang and Christophe Dupuy and Rahul Gupta and Mahdi Soltanolkotabi and Xiang Ren and Salman Avestimehr,http://arxiv.org/pdf/2104.08815v3 | |
http://arxiv.org/abs/2105.11407v1,creativecommons.org/licenses/by/4.0/,VANiLLa : Verbalized Answers in Natural Language at Large Scale,Debanjali Biswas and Mohnish Dubey and Md Rashad Al Hasan Rony and Jens Lehmann,http://arxiv.org/pdf/2105.11407v1 | |
http://arxiv.org/abs/2111.02574v2,creativecommons.org/licenses/by/4.0/,Contextual Semantic Parsing for Multilingual Task-Oriented Dialogues,Mehrad Moradshahi and Victoria Tsai and Giovanni Campagna and Monica S. Lam,http://arxiv.org/pdf/2111.02574v2 | |
http://arxiv.org/abs/2111.06741v2,creativecommons.org/licenses/by/4.0/,A Quantum Natural Language Processing Approach to Musical Intelligence,Eduardo Reck Miranda and Richie Yeung and Anna Pearson and Konstantinos Meichanetzidis and Bob Coecke,http://arxiv.org/pdf/2111.06741v2 | |
http://arxiv.org/abs/2112.02333v1,creativecommons.org/licenses/by/4.0/,LoNLI: An Extensible Framework for Testing Diverse Logical Reasoning Capabilities for NLI,Ishan Tarunesh and Somak Aditya and Monojit Choudhury,http://arxiv.org/pdf/2112.02333v1 | |
http://arxiv.org/abs/2302.13007v3,creativecommons.org/licenses/by/4.0/,AugGPT: Leveraging ChatGPT for Text Data Augmentation,Haixing Dai and Zhengliang Liu and Wenxiong Liao and Xiaoke Huang and Yihan Cao and Zihao Wu and Lin Zhao and Shaochen Xu and Wei Liu and Ninghao Liu and Sheng Li and Dajiang Zhu and Hongmin Cai and Lichao Sun and Quanzheng Li and Dinggang Shen and Tianming Liu and Xiang Li,http://arxiv.org/pdf/2302.13007v3 | |
http://arxiv.org/abs/2303.12093v2,creativecommons.org/licenses/by/4.0/,ChatGPT for Programming Numerical Methods,Ali Kashefi and Tapan Mukerji,http://arxiv.org/pdf/2303.12093v2 | |
http://arxiv.org/abs/2304.03347v1,creativecommons.org/licenses/by/4.0/,On the Evaluations of ChatGPT and Emotion-enhanced Prompting for Mental Health Analysis,Kailai Yang and Shaoxiong Ji and Tianlin Zhang and Qianqian Xie and Sophia Ananiadou,http://arxiv.org/pdf/2304.03347v1 | |
http://arxiv.org/abs/2111.13792v3,creativecommons.org/licenses/by/4.0/,LAFITE: Towards Language-Free Training for Text-to-Image Generation,Yufan Zhou and Ruiyi Zhang and Changyou Chen and Chunyuan Li and Chris Tensmeyer and Tong Yu and Jiuxiang Gu and Jinhui Xu and Tong Sun,http://arxiv.org/pdf/2111.13792v3
http://arxiv.org/abs/2203.00056v1,creativecommons.org/licenses/by/4.0/,An Empirical Study on Explanations in Out-of-Domain Settings,George Chrysostomou and Nikolaos Aletras,http://arxiv.org/pdf/2203.00056v1
http://arxiv.org/abs/2211.13196v1,creativecommons.org/licenses/by/4.0/,SeedBERT: Recovering Annotator Rating Distributions from an Aggregated Label,Aneesha Sampath and Victoria Lin and Louis-Philippe Morency,http://arxiv.org/pdf/2211.13196v1
http://arxiv.org/abs/1907.09038v1,creativecommons.org/licenses/by/4.0/,Augmenting a BiLSTM tagger with a Morphological Lexicon and a Lexical Category Identification Step,Steinþór Steingrímsson and Örvar Kárason and Hrafn Loftsson,http://arxiv.org/pdf/1907.09038v1
http://arxiv.org/abs/2007.12544v3,creativecommons.org/licenses/by/4.0/,FiSSA at SemEval-2020 Task 9: Fine-tuned For Feelings,Bertelt Braaksma and Richard Scholtens and Stan van Suijlekom and Remy Wang and Ahmet Üstün,http://arxiv.org/pdf/2007.12544v3
http://arxiv.org/abs/2101.00416v2,creativecommons.org/licenses/by/4.0/,Improving Sequence-to-Sequence Pre-training via Sequence Span Rewriting,Wangchunshu Zhou and Tao Ge and Canwen Xu and Ke Xu and Furu Wei,http://arxiv.org/pdf/2101.00416v2
http://arxiv.org/abs/2101.11423v1,creativecommons.org/licenses/by/4.0/,A More Efficient Chinese Named Entity Recognition base on BERT and Syntactic Analysis,Xiao Fu and Guijun Zhang,http://arxiv.org/pdf/2101.11423v1
http://arxiv.org/abs/2106.08770v1,creativecommons.org/licenses/by/4.0/,TSSuBERT: Tweet Stream Summarization Using BERT,Alexis Dusart and Karen Pinel-Sauvagnat and Gilles Hubert,http://arxiv.org/pdf/2106.08770v1
http://arxiv.org/abs/2111.15588v4,creativecommons.org/licenses/by/4.0/,SimpleTRON: Simple Transformer with O(N) Complexity,Uladzislau Yorsh and Alexander Kovalenko and Vojtěch Vančura and Daniel Vašata and Pavel Kordík and Tomáš Mikolov,http://arxiv.org/pdf/2111.15588v4
http://arxiv.org/abs/2303.13506v1,creativecommons.org/licenses/by/4.0/,The Quantization Model of Neural Scaling,Eric J. Michaud and Ziming Liu and Uzay Girit and Max Tegmark,http://arxiv.org/pdf/2303.13506v1
http://arxiv.org/abs/2203.11480v5,creativecommons.org/licenses/by/4.0/,WuDaoMM: A large-scale Multi-Modal Dataset for Pre-training models,Sha Yuan and Shuai Zhao and Jiahong Leng and Zhao Xue and Hanyu Zhao and Peiyu Liu and Zheng Gong and Wayne Xin Zhao and Junyi Li and Jie Tang,http://arxiv.org/pdf/2203.11480v5
http://arxiv.org/abs/2209.08982v1,creativecommons.org/licenses/by/4.0/,How to Adapt Pre-trained Vision-and-Language Models to a Text-only Input?,Lovisa Hagström and Richard Johansson,http://arxiv.org/pdf/2209.08982v1
http://arxiv.org/abs/2212.10923v1,creativecommons.org/licenses/by/4.0/,Language Models as Inductive Reasoners,Zonglin Yang and Li Dong and Xinya Du and Hao Cheng and Erik Cambria and Xiaodong Liu and Jianfeng Gao and Furu Wei,http://arxiv.org/pdf/2212.10923v1
http://arxiv.org/abs/2104.07483v2,creativecommons.org/licenses/by/4.0/,IndT5: A Text-to-Text Transformer for 10 Indigenous Languages,El Moatez Billah Nagoudi and Wei-Rui Chen and Muhammad Abdul-Mageed and Hasan Cavusogl,http://arxiv.org/pdf/2104.07483v2
http://arxiv.org/abs/2107.11976v2,creativecommons.org/licenses/by/4.0/,One Question Answering Model for Many Languages with Cross-lingual Dense Passage Retrieval,Akari Asai and Xinyan Yu and Jungo Kasai and Hannaneh Hajishirzi,http://arxiv.org/pdf/2107.11976v2
http://arxiv.org/abs/2103.11790v3,creativecommons.org/licenses/by/4.0/,Large Pre-trained Language Models Contain Human-like Biases of What is Right and Wrong to Do,Patrick Schramowski and Cigdem Turan and Nico Andersen and Constantin A. Rothkopf and Kristian Kersting,http://arxiv.org/pdf/2103.11790v3
http://arxiv.org/abs/2103.13020v3,creativecommons.org/licenses/by/4.0/,deGraphCS: Embedding Variable-based Flow Graph for Neural Code Search,Chen Zeng and Yue Yu and Shanshan Li and Xin Xia and Zhiming Wang and Mingyang Geng and Bailin Xiao and Wei Dong and Xiangke Liao,http://arxiv.org/pdf/2103.13020v3
http://arxiv.org/abs/2203.08410v3,creativecommons.org/licenses/by/4.0/,Thinking about GPT-3 In-Context Learning for Biomedical IE? Think Again,Bernal Jiménez Gutiérrez and Nikolas McNeal and Clay Washington and You Chen and Lang Li and Huan Sun and Yu Su,http://arxiv.org/pdf/2203.08410v3
http://arxiv.org/abs/2204.00458v2,creativecommons.org/licenses/by/4.0/,Evaluation of Fake News Detection with Knowledge-Enhanced Language Models,Chenxi Whitehouse and Tillman Weyde and Pranava Madhyastha and Nikos Komninos,http://arxiv.org/pdf/2204.00458v2
http://arxiv.org/abs/2205.03767v3,creativecommons.org/licenses/by/4.0/,Context-Aware Abbreviation Expansion Using Large Language Models,Shanqing Cai and Subhashini Venugopalan and Katrin Tomanek and Ajit Narayanan and Meredith Ringel Morris and Michael P. Brenner,http://arxiv.org/pdf/2205.03767v3
http://arxiv.org/abs/2205.12105v2,creativecommons.org/licenses/by/4.0/,HiVLP: Hierarchical Vision-Language Pre-Training for Fast Image-Text Retrieval,Feilong Chen and Xiuyi Chen and Jiaxin Shi and Duzhen Zhang and Jianlong Chang and Qi Tian,http://arxiv.org/pdf/2205.12105v2
http://arxiv.org/abs/2209.02812v1,creativecommons.org/licenses/by/4.0/,Increasing Adverse Drug Events extraction robustness on social media: case study on negation and speculation,Simone Scaboro and Beatrice Portelli and Emmanuele Chersoni and Enrico Santus and Giuseppe Serra,http://arxiv.org/pdf/2209.02812v1
http://arxiv.org/abs/2209.09900v1,creativecommons.org/licenses/by/4.0/,LINGUIST: Language Model Instruction Tuning to Generate Annotated Utterances for Intent Classification and Slot Tagging,Andy Rosenbaum and Saleh Soltan and Wael Hamza and Yannick Versley and Markus Boese,http://arxiv.org/pdf/2209.09900v1
http://arxiv.org/abs/2210.00720v2,creativecommons.org/licenses/by/4.0/,Complexity-Based Prompting for Multi-Step Reasoning,Yao Fu and Hao Peng and Ashish Sabharwal and Peter Clark and Tushar Khot,http://arxiv.org/pdf/2210.00720v2
http://arxiv.org/abs/2210.05675v2,creativecommons.org/licenses/by/4.0/,Transformers generalize differently from information stored in context vs in weights,Stephanie C. Y. Chan and Ishita Dasgupta and Junkyung Kim and Dharshan Kumaran and Andrew K. Lampinen and Felix Hill,http://arxiv.org/pdf/2210.05675v2
http://arxiv.org/abs/2211.09699v3,creativecommons.org/licenses/by/4.0/,PromptCap: Prompt-Guided Task-Aware Image Captioning,Yushi Hu and Hang Hua and Zhengyuan Yang and Weijia Shi and Noah A. Smith and Jiebo Luo,http://arxiv.org/pdf/2211.09699v3
http://arxiv.org/abs/2212.05856v1,creativecommons.org/licenses/by/4.0/,"""I think this is the most disruptive technology"": Exploring Sentiments of ChatGPT Early Adopters using Twitter Data",Mubin Ul Haque and Isuru Dharmadasa and Zarrin Tasnim Sworna and Roshan Namal Rajapakse and Hussain Ahmad,http://arxiv.org/pdf/2212.05856v1
http://arxiv.org/abs/2212.08037v2,creativecommons.org/licenses/by/4.0/,Attributed Question Answering: Evaluation and Modeling for Attributed Large Language Models,Bernd Bohnet and Vinh Q. Tran and Pat Verga and Roee Aharoni and Daniel Andor and Livio Baldini Soares and Massimiliano Ciaramita and Jacob Eisenstein and Kuzman Ganchev and Jonathan Herzig and Kai Hui and Tom Kwiatkowski and Ji Ma and Jianmo Ni and Lierni Sestorain Saralegui and Tal Schuster and William W. Cohen and Michael Collins and Dipanjan Das and Donald Metzler and Slav Petrov and Kellie Webster,http://arxiv.org/pdf/2212.08037v2
http://arxiv.org/abs/2302.08500v1,creativecommons.org/licenses/by/4.0/,Auditing large language models: a three-layered approach,Jakob Mökander and Jonas Schuett and Hannah Rose Kirk and Luciano Floridi,http://arxiv.org/pdf/2302.08500v1
http://arxiv.org/abs/2303.08896v1,creativecommons.org/licenses/by/4.0/,SelfCheckGPT: Zero-Resource Black-Box Hallucination Detection for Generative Large Language Models,Potsawee Manakul and Adian Liusie and Mark J. F. Gales,http://arxiv.org/pdf/2303.08896v1
http://arxiv.org/abs/2107.11020v3,creativecommons.org/licenses/by/4.0/,Emotion analysis and detection during COVID-19,Tiberiu Sosea and Chau Pham and Alexander Tekle and Cornelia Caragea and Junyi Jessy Li,http://arxiv.org/pdf/2107.11020v3
http://arxiv.org/abs/2212.01476v1,creativecommons.org/licenses/by/4.0/,NarraSum: A Large-Scale Dataset for Abstractive Narrative Summarization,Chao Zhao and Faeze Brahman and Kaiqiang Song and Wenlin Yao and Dian Yu and Snigdha Chaturvedi,http://arxiv.org/pdf/2212.01476v1
http://arxiv.org/abs/2212.09257v1,creativecommons.org/licenses/by/4.0/,PromptBoosting: Black-Box Text Classification with Ten Forward Passes,Bairu Hou and Joe O'Connor and Jacob Andreas and Shiyu Chang and Yang Zhang,http://arxiv.org/pdf/2212.09257v1
http://arxiv.org/abs/2303.14770v1,creativecommons.org/licenses/by/4.0/,Koala: An Index for Quantifying Overlaps with Pre-training Corpora,Thuy-Trang Vu and Xuanli He and Gholamreza Haffari and Ehsan Shareghi,http://arxiv.org/pdf/2303.14770v1
http://arxiv.org/abs/2109.01942v2,creativecommons.org/licenses/by/4.0/,On the ability of monolingual models to learn language-agnostic representations,Leandro Rodrigues de Souza and Rodrigo Nogueira and Roberto Lotufo,http://arxiv.org/pdf/2109.01942v2
http://arxiv.org/abs/2301.08913v1,creativecommons.org/licenses/by/4.0/,Unifying Structure Reasoning and Language Model Pre-training for Complex Reasoning,Siyuan Wang and Zhongyu Wei and Jiarong Xu and Zhihao Fan,http://arxiv.org/pdf/2301.08913v1
http://arxiv.org/abs/1912.01072v2,creativecommons.org/licenses/by/4.0/,Leveraging Contextual Embeddings for Detecting Diachronic Semantic Shift,Matej Martinc and Petra Kralj Novak and Senja Pollak,http://arxiv.org/pdf/1912.01072v2
http://arxiv.org/abs/2004.11449v1,creativecommons.org/licenses/by/4.0/,Upgrading the Newsroom: An Automated Image Selection System for News Articles,Fangyu Liu and Rémi Lebret and Didier Orel and Philippe Sordet and Karl Aberer,http://arxiv.org/pdf/2004.11449v1
http://arxiv.org/abs/2008.08547v1,creativecommons.org/licenses/by/4.0/,UoB at SemEval-2020 Task 12: Boosting BERT with Corpus Level Information,Wah Meng Lim and Harish Tayyar Madabushi,http://arxiv.org/pdf/2008.08547v1
http://arxiv.org/abs/2101.11360v1,creativecommons.org/licenses/by/4.0/,An Empirical Study of Cross-Lingual Transferability in Generative Dialogue State Tracker,Yen-Ting Lin and Yun-Nung Chen,http://arxiv.org/pdf/2101.11360v1
http://arxiv.org/abs/2103.01620v2,creativecommons.org/licenses/by/4.0/,Disentangling Syntax and Semantics in the Brain with Deep Networks,Charlotte Caucheteux and Alexandre Gramfort and Jean-Remi King,http://arxiv.org/pdf/2103.01620v2
http://arxiv.org/abs/2103.08993v1,creativecommons.org/licenses/by/4.0/,Fast Development of ASR in African Languages using Self Supervised Speech Representation Learning,Jama Hussein Mohamud and Lloyd Acquaye Thompson and Aissatou Ndoye and Laurent Besacier,http://arxiv.org/pdf/2103.08993v1
http://arxiv.org/abs/2103.10198v2,creativecommons.org/licenses/by/4.0/,Phylogenetic typology,Gerhard Jäger and Johannes Wahle,http://arxiv.org/pdf/2103.10198v2
http://arxiv.org/abs/2105.06027v1,creativecommons.org/licenses/by/4.0/,Towards Human-Free Automatic Quality Evaluation of German Summarization,Neslihan Iskender and Oleg Vasilyev and Tim Polzehl and John Bohannon and Sebastian Möller,http://arxiv.org/pdf/2105.06027v1
http://arxiv.org/abs/2107.01982v1,creativecommons.org/licenses/by/4.0/,The DCU-EPFL Enhanced Dependency Parser at the IWPT 2021 Shared Task,James Barry and Alireza Mohammadshahi and Joachim Wagner and Jennifer Foster and James Henderson,http://arxiv.org/pdf/2107.01982v1
http://arxiv.org/abs/2107.09710v1,creativecommons.org/licenses/by/4.0/,TLA: Twitter Linguistic Analysis,Tushar Sarkar and Nishant Rajadhyaksha,http://arxiv.org/pdf/2107.09710v1
http://arxiv.org/abs/2109.15196v2,creativecommons.org/licenses/by/4.0/,Multilingual AMR Parsing with Noisy Knowledge Distillation,Deng Cai and Xin Li and Jackie Chun-Sing Ho and Lidong Bing and Wai Lam,http://arxiv.org/pdf/2109.15196v2
http://arxiv.org/abs/2110.08559v1,creativecommons.org/licenses/by/4.0/,"FrugalScore: Learning Cheaper, Lighter and Faster Evaluation Metrics for Automatic Text Generation",Moussa Kamal Eddine and Guokan Shang and Antoine J. -P. Tixier and Michalis Vazirgiannis,http://arxiv.org/pdf/2110.08559v1
http://arxiv.org/abs/2111.13662v2,creativecommons.org/licenses/by/4.0/,Modular Information Flow through Ownership,Will Crichton and Marco Patrignani and Maneesh Agrawala and Pat Hanrahan,http://arxiv.org/pdf/2111.13662v2
http://arxiv.org/abs/2204.00743v2,creativecommons.org/licenses/by/4.0/,Entity-Centric Query Refinement,David Wadden and Nikita Gupta and Kenton Lee and Kristina Toutanova,http://arxiv.org/pdf/2204.00743v2
http://arxiv.org/abs/2207.08286v1,creativecommons.org/licenses/by/4.0/,An Overview of Distant Supervision for Relation Extraction with a Focus on Denoising and Pre-training Methods,William Hogan,http://arxiv.org/pdf/2207.08286v1
http://arxiv.org/abs/2208.01009v2,creativecommons.org/licenses/by/4.0/,Few-shot Adaptation Works with UnpredicTable Data,Jun Shern Chan and Michael Pieler and Jonathan Jao and Jérémy Scheurer and Ethan Perez,http://arxiv.org/pdf/2208.01009v2
http://arxiv.org/abs/2208.05559v1,creativecommons.org/licenses/by/4.0/,"Comparing Channel Restrictions of Communicating State Machines, High-level Message Sequence Charts, and Multiparty Session Types",Felix Stutz and Damien Zufferey,http://arxiv.org/pdf/2208.05559v1
http://arxiv.org/abs/2208.14610v2,creativecommons.org/licenses/by/4.0/,The Sparse Abstract Machine,Olivia Hsu and Maxwell Strange and Ritvik Sharma and Jaeyeon Won and Kunle Olukotun and Joel Emer and Mark Horowitz and Fredrik Kjolstad,http://arxiv.org/pdf/2208.14610v2
http://arxiv.org/abs/2209.06049v3,creativecommons.org/licenses/by/4.0/,Pre-training Transformers on Indian Legal Text,Shounak Paul and Arpan Mandal and Pawan Goyal and Saptarshi Ghosh,http://arxiv.org/pdf/2209.06049v3
http://arxiv.org/abs/2211.05596v1,creativecommons.org/licenses/by/4.0/,Prompt Learning for Domain Adaptation in Task-Oriented Dialogue,Makesh Narsimhan Sreedhar and Christopher Parisien,http://arxiv.org/pdf/2211.05596v1
http://arxiv.org/abs/2211.11890v1,creativecommons.org/licenses/by/4.0/,TEMPERA: Test-Time Prompting via Reinforcement Learning,Tianjun Zhang and Xuezhi Wang and Denny Zhou and Dale Schuurmans and Joseph E. Gonzalez,http://arxiv.org/pdf/2211.11890v1
http://arxiv.org/abs/2301.12074v1,creativecommons.org/licenses/by/4.0/,Comparing Intrinsic Gender Bias Evaluation Measures without using Human Annotated Examples,Masahiro Kaneko and Danushka Bollegala and Naoaki Okazaki,http://arxiv.org/pdf/2301.12074v1
http://arxiv.org/abs/2302.00493v1,creativecommons.org/licenses/by/4.0/,You Are What You Talk About: Inducing Evaluative Topics for Personality Analysis,Josip Jukić and Iva Vukojević and Jan Šnajder,http://arxiv.org/pdf/2302.00493v1
http://arxiv.org/abs/2302.00739v1,creativecommons.org/licenses/by/4.0/,Inference of Partial Colexifications from Multilingual Wordlists,Johann-Mattis List,http://arxiv.org/pdf/2302.00739v1
http://arxiv.org/abs/2302.03694v1,creativecommons.org/licenses/by/4.0/,Characterizing Financial Market Coverage using Artificial Intelligence,Jean Marie Tshimula and D'Jeff K. Nkashama and Patrick Owusu and Marc Frappier and Pierre-Martin Tardif and Froduald Kabanza and Armelle Brun and Jean-Marc Patenaude and Shengrui Wang and Belkacem Chikhaoui,http://arxiv.org/pdf/2302.03694v1
http://arxiv.org/abs/2303.12810v1,creativecommons.org/licenses/by/4.0/,Are LLMs the Master of All Trades? : Exploring Domain-Agnostic Reasoning Skills of LLMs,Shrivats Agrawal,http://arxiv.org/pdf/2303.12810v1
http://arxiv.org/abs/2303.16145v1,creativecommons.org/licenses/by/4.0/,NeuralMind-UNICAMP at 2022 TREC NeuCLIR: Large Boring Rerankers for Cross-lingual Retrieval,Vitor Jeronymo and Roberto Lotufo and Rodrigo Nogueira,http://arxiv.org/pdf/2303.16145v1
http://arxiv.org/abs/2304.01240v2,creativecommons.org/licenses/by/4.0/,Identifying Mentions of Pain in Mental Health Records Text: A Natural Language Processing Approach,Jaya Chaturvedi and Sumithra Velupillai and Robert Stewart and Angus Roberts,http://arxiv.org/pdf/2304.01240v2
http://arxiv.org/abs/2304.06028v1,creativecommons.org/licenses/by/4.0/,RECLIP: Resource-efficient CLIP by Training with Small Images,Runze Li and Dahun Kim and Bir Bhanu and Weicheng Kuo,http://arxiv.org/pdf/2304.06028v1
http://arxiv.org/abs/2205.11656v1,creativecommons.org/licenses/by/4.0/,FlexiBERT: Are Current Transformer Architectures too Homogeneous and Rigid?,Shikhar Tuli and Bhishma Dedhia and Shreshth Tuli and Niraj K. Jha,http://arxiv.org/pdf/2205.11656v1
http://arxiv.org/abs/1911.04118v2,creativecommons.org/licenses/by/4.0/,TANDA: Transfer and Adapt Pre-Trained Transformer Models for Answer Sentence Selection,Siddhant Garg and Thuy Vu and Alessandro Moschitti,http://arxiv.org/pdf/1911.04118v2
http://arxiv.org/abs/2303.07514v1,creativecommons.org/licenses/by/4.0/,Handwritten Word Recognition using Deep Learning Approach: A Novel Way of Generating Handwritten Words,Mst Shapna Akter and Hossain Shahriar and Alfredo Cuzzocrea and Nova Ahmed and Carson Leung,http://arxiv.org/pdf/2303.07514v1
http://arxiv.org/abs/1412.7415v2,creativecommons.org/licenses/by/4.0/,A prototype Malayalam to Sign Language Automatic Translator,Jestin Joy and Kannan Balakrishnan,http://arxiv.org/pdf/1412.7415v2
http://arxiv.org/abs/1709.02076v1,creativecommons.org/licenses/by/4.0/,Composition by Conversation,Donya Quick and Clayton T. Morrison,http://arxiv.org/pdf/1709.02076v1
http://arxiv.org/abs/2007.11865v1,creativecommons.org/licenses/by/4.0/,AI4D -- African Language Dataset Challenge,Kathleen Siminyu and Sackey Freshia and Jade Abbott and Vukosi Marivate,http://arxiv.org/pdf/2007.11865v1
http://arxiv.org/abs/2103.14698v1,creativecommons.org/licenses/by/4.0/,Implementing G-Machine in HyperLMNtal,Jin Sano,http://arxiv.org/pdf/2103.14698v1
http://arxiv.org/abs/2109.01628v1,creativecommons.org/licenses/by/4.0/,Cross-Lingual Training with Dense Retrieval for Document Retrieval,Peng Shi and Rui Zhang and He Bai and Jimmy Lin,http://arxiv.org/pdf/2109.01628v1
http://arxiv.org/abs/2102.09268v2,creativecommons.org/licenses/by/4.0/,Training Large-Scale News Recommenders with Pretrained Language Models in the Loop,Shitao Xiao and Zheng Liu and Yingxia Shao and Tao Di and Xing Xie,http://arxiv.org/pdf/2102.09268v2
http://arxiv.org/abs/2106.03373v4,creativecommons.org/licenses/by/4.0/,Pre-trained Language Model for Web-scale Retrieval in Baidu Search,Yiding Liu and Guan Huang and Jiaxiang Liu and Weixue Lu and Suqi Cheng and Yukun Li and Daiting Shi and Shuaiqiang Wang and Zhicong Cheng and Dawei Yin,http://arxiv.org/pdf/2106.03373v4
http://arxiv.org/abs/2109.02401v4,creativecommons.org/licenses/by/4.0/,Vision Guided Generative Pre-trained Language Models for Multimodal Abstractive Summarization,Tiezheng Yu and Wenliang Dai and Zihan Liu and Pascale Fung,http://arxiv.org/pdf/2109.02401v4
http://arxiv.org/abs/2204.12130v2,creativecommons.org/licenses/by/4.0/,LM-Debugger: An Interactive Tool for Inspection and Intervention in Transformer-Based Language Models,Mor Geva and Avi Caciularu and Guy Dar and Paul Roit and Shoval Sadde and Micah Shlain and Bar Tamir and Yoav Goldberg,http://arxiv.org/pdf/2204.12130v2
http://arxiv.org/abs/2208.13916v1,creativecommons.org/licenses/by/4.0/,A Language Agnostic Multilingual Streaming On-Device ASR System,Bo Li and Tara N. Sainath and Ruoming Pang and Shuo-yiin Chang and Qiumin Xu and Trevor Strohman and Vince Chen and Qiao Liang and Heguang Liu and Yanzhang He and Parisa Haghani and Sameer Bidichandani,http://arxiv.org/pdf/2208.13916v1
http://arxiv.org/abs/2304.08448v1,creativecommons.org/licenses/by/4.0/,ImpressionGPT: An Iterative Optimizing Framework for Radiology Report Summarization with ChatGPT,Chong Ma and Zihao Wu and Jiaqi Wang and Shaochen Xu and Yaonai Wei and Zhengliang Liu and Lei Guo and Xiaoyan Cai and Shu Zhang and Tuo Zhang and Dajiang Zhu and Dinggang Shen and Tianming Liu and Xiang Li,http://arxiv.org/pdf/2304.08448v1
http://arxiv.org/abs/2212.01488v2,creativecommons.org/licenses/by/4.0/,Event knowledge in large language models: the gap between the impossible and the unlikely,Carina Kauf and Anna A. Ivanova and Giulia Rambelli and Emmanuele Chersoni and Jingyuan S. She and Zawad Chowdhury and Evelina Fedorenko and Alessandro Lenci,http://arxiv.org/pdf/2212.01488v2
http://arxiv.org/abs/2208.10091v2,creativecommons.org/licenses/by/4.0/,Incorporating Domain Knowledge through Task Augmentation for Front-End JavaScript Code Generation,Sijie Shen and Xiang Zhu and Yihong Dong and Qizhi Guo and Yankun Zhen and Ge Li,http://arxiv.org/pdf/2208.10091v2
http://arxiv.org/abs/2004.01221v1,creativecommons.org/licenses/by/4.0/,Towards Relevance and Sequence Modeling in Language Recognition,Bharat Padi and Anand Mohan and Sriram Ganapathy,http://arxiv.org/pdf/2004.01221v1
http://arxiv.org/abs/2011.03965v1,creativecommons.org/licenses/by/4.0/,On the Practical Ability of Recurrent Neural Networks to Recognize Hierarchical Languages,Satwik Bhattamishra and Kabir Ahuja and Navin Goyal,http://arxiv.org/pdf/2011.03965v1
http://arxiv.org/abs/2105.07144v3,creativecommons.org/licenses/by/4.0/,A Cognitive Regularizer for Language Modeling,Jason Wei and Clara Meister and Ryan Cotterell,http://arxiv.org/pdf/2105.07144v3
http://arxiv.org/abs/2212.09662v1,creativecommons.org/licenses/by/4.0/,MatCha: Enhancing Visual Language Pretraining with Math Reasoning and Chart Derendering,Fangyu Liu and Francesco Piccinno and Syrine Krichene and Chenxi Pang and Kenton Lee and Mandar Joshi and Yasemin Altun and Nigel Collier and Julian Martin Eisenschlos,http://arxiv.org/pdf/2212.09662v1
http://arxiv.org/abs/2105.14444v1,creativecommons.org/licenses/by/4.0/,NAS-BERT: Task-Agnostic and Adaptive-Size BERT Compression with Neural Architecture Search,Jin Xu and Xu Tan and Renqian Luo and Kaitao Song and Jian Li and Tao Qin and Tie-Yan Liu,http://arxiv.org/pdf/2105.14444v1
http://arxiv.org/abs/2112.11805v2,creativecommons.org/licenses/by/4.0/,Neural-Symbolic Integration for Interactive Learning and Conceptual Grounding,Benedikt Wagner and Artur d'Avila Garcez,http://arxiv.org/pdf/2112.11805v2
http://arxiv.org/abs/2205.01941v1,creativecommons.org/licenses/by/4.0/,Lexical Knowledge Internalization for Neural Dialog Generation,Zhiyong Wu and Wei Bi and Xiang Li and Lingpeng Kong and Ben Kao,http://arxiv.org/pdf/2205.01941v1
http://arxiv.org/abs/1903.10915v1,creativecommons.org/licenses/by/4.0/,Language Model Adaptation for Language and Dialect Identification of Text,Tommi Jauhiainen and Krister Lindén and Heidi Jauhiainen,http://arxiv.org/pdf/1903.10915v1
http://arxiv.org/abs/2010.03542v1,creativecommons.org/licenses/by/4.0/,Galileo at SemEval-2020 Task 12: Multi-lingual Learning for Offensive Language Identification using Pre-trained Language Models,Shuohuan Wang and Jiaxiang Liu and Xuan Ouyang and Yu Sun,http://arxiv.org/pdf/2010.03542v1
http://arxiv.org/abs/2106.02232v1,creativecommons.org/licenses/by/4.0/,Language Scaling for Universal Suggested Replies Model,Qianlan Ying and Payal Bajaj and Budhaditya Deb and Yu Yang and Wei Wang and Bojia Lin and Milad Shokouhi and Xia Song and Yang Yang and Daxin Jiang,http://arxiv.org/pdf/2106.02232v1
http://arxiv.org/abs/2201.06469v1,creativecommons.org/licenses/by/4.0/,Handling Compounding in Mobile Keyboard Input,Andreas Kabel and Keith Hall and Tom Ouyang and David Rybach and Daan van Esch and Françoise Beaufays,http://arxiv.org/pdf/2201.06469v1
http://arxiv.org/abs/2201.10707v1,creativecommons.org/licenses/by/4.0/,A Unified Strategy for Multilingual Grammatical Error Correction with Pre-trained Cross-Lingual Language Model,Xin Sun and Tao Ge and Shuming Ma and Jingjing Li and Furu Wei and Houfeng Wang,http://arxiv.org/pdf/2201.10707v1
http://arxiv.org/abs/2210.12246v1,creativecommons.org/licenses/by/4.0/,A General Architecture for Client-Agnostic Hybrid Model Editors as a Service,Liam Walsh and Juergen Dingel and Karim Jahed,http://arxiv.org/pdf/2210.12246v1
http://arxiv.org/abs/2303.17517v1,creativecommons.org/licenses/by/4.0/,Hindi as a Second Language: Improving Visually Grounded Speech with Semantically Similar Samples,Hyeonggon Ryu and Arda Senocak and In So Kweon and Joon Son Chung,http://arxiv.org/pdf/2303.17517v1
http://arxiv.org/abs/2102.00405v2,creativecommons.org/licenses/by/4.0/,BNLP: Natural language processing toolkit for Bengali language,Sagor Sarker,http://arxiv.org/pdf/2102.00405v2
http://arxiv.org/abs/2102.07150v1,creativecommons.org/licenses/by/4.0/,indicnlp@kgp at DravidianLangTech-EACL2021: Offensive Language Identification in Dravidian Languages,Kushal Kedia and Abhilash Nandy,http://arxiv.org/pdf/2102.07150v1
http://arxiv.org/abs/2106.01195v2,creativecommons.org/licenses/by/4.0/,Figurative Language in Recognizing Textual Entailment,Tuhin Chakrabarty and Debanjan Ghosh and Adam Poliak and Smaranda Muresan,http://arxiv.org/pdf/2106.01195v2
http://arxiv.org/abs/2201.13072v1,creativecommons.org/licenses/by/4.0/,Are Mutually Intelligible Languages Easier to Translate?,Avital Friedland and Jonathan Zeltser and Omer Levy,http://arxiv.org/pdf/2201.13072v1
http://arxiv.org/abs/2203.09313v2,creativecommons.org/licenses/by/4.0/,EVA2.0: Investigating Open-Domain Chinese Dialogue Systems with Large-Scale Pre-Training,Yuxian Gu and Jiaxin Wen and Hao Sun and Yi Song and Pei Ke and Chujie Zheng and Zheng Zhang and Jianzhu Yao and Xiaoyan Zhu and Jie Tang and Minlie Huang,http://arxiv.org/pdf/2203.09313v2
http://arxiv.org/abs/2111.00830v2,creativecommons.org/licenses/by/4.0/,Deep Learning Transformer Architecture for Named Entity Recognition on Low Resourced Languages: State of the art results,Ridewaan Hanslo,http://arxiv.org/pdf/2111.00830v2
http://arxiv.org/abs/1908.03837v2,creativecommons.org/licenses/by/4.0/,Modeling Graphs with Vertex Replacement Grammars,Satyaki Sikdar and Justus Hibshman and Tim Weninger,http://arxiv.org/pdf/1908.03837v2
http://arxiv.org/abs/1707.04678v1,creativecommons.org/licenses/by/4.0/,Lyrics-Based Music Genre Classification Using a Hierarchical Attention Network,Alexandros Tsaptsinos,http://arxiv.org/pdf/1707.04678v1
http://arxiv.org/abs/1906.01543v2,creativecommons.org/licenses/by/4.0/,Training Neural Response Selection for Task-Oriented Dialogue Systems,Matthew Henderson and Ivan Vulić and Daniela Gerz and Iñigo Casanueva and Paweł Budzianowski and Sam Coope and Georgios Spithourakis and Tsung-Hsien Wen and Nikola Mrkšić and Pei-Hao Su,http://arxiv.org/pdf/1906.01543v2
http://arxiv.org/abs/2012.15419v1,creativecommons.org/licenses/by/4.0/,An Experimental Evaluation of Transformer-based Language Models in the Biomedical Domain,Paul Grouchy and Shobhit Jain and Michael Liu and Kuhan Wang and Max Tian and Nidhi Arora and Hillary Ngai and Faiza Khan Khattak and Elham Dolatabadi and Sedef Akinli Kocak,http://arxiv.org/pdf/2012.15419v1
http://arxiv.org/abs/2106.02497v1,creativecommons.org/licenses/by/4.0/,COINS: Dynamically Generating COntextualized Inference Rules for Narrative Story Completion,Debjit Paul and Anette Frank,http://arxiv.org/pdf/2106.02497v1
http://arxiv.org/abs/2106.05634v1,creativecommons.org/licenses/by/4.0/,Exploring Unsupervised Pretraining Objectives for Machine Translation,Christos Baziotis and Ivan Titov and Alexandra Birch and Barry Haddow,http://arxiv.org/pdf/2106.05634v1
http://arxiv.org/abs/2106.15332v1,creativecommons.org/licenses/by/4.0/,Winner Team Mia at TextVQA Challenge 2021: Vision-and-Language Representation Learning with Pre-trained Sequence-to-Sequence Model,Yixuan Qiao and Hao Chen and Jun Wang and Yihao Chen and Xianbin Ye and Ziliang Li and Xianbiao Qi and Peng Gao and Guotong Xie,http://arxiv.org/pdf/2106.15332v1
http://arxiv.org/abs/2109.10164v2,creativecommons.org/licenses/by/4.0/,RAIL-KD: RAndom Intermediate Layer Mapping for Knowledge Distillation,Md Akmal Haidar and Nithin Anchuri and Mehdi Rezagholizadeh and Abbas Ghaddar and Philippe Langlais and Pascal Poupart,http://arxiv.org/pdf/2109.10164v2
http://arxiv.org/abs/2109.11745v1,creativecommons.org/licenses/by/4.0/,DACT-BERT: Differentiable Adaptive Computation Time for an Efficient BERT Inference,Cristóbal Eyzaguirre and Felipe del Río and Vladimir Araujo and Álvaro Soto,http://arxiv.org/pdf/2109.11745v1
http://arxiv.org/abs/2110.06176v2,creativecommons.org/licenses/by/4.0/,Mention Memory: incorporating textual knowledge into Transformers through entity mention attention,Michiel de Jong and Yury Zemlyanskiy and Nicholas FitzGerald and Fei Sha and William Cohen,http://arxiv.org/pdf/2110.06176v2
http://arxiv.org/abs/2112.01147v2,creativecommons.org/licenses/by/4.0/,CO2Sum:Contrastive Learning for Factual-Consistent Abstractive Summarization,Wei Liu and Huanqin Wu and Wenjing Mu and Zhen Li and Tao Chen and Dan Nie,http://arxiv.org/pdf/2112.01147v2
http://arxiv.org/abs/2205.09224v2,creativecommons.org/licenses/by/4.0/,Entailment Tree Explanations via Iterative Retrieval-Generation Reasoner,Danilo Ribeiro and Shen Wang and Xiaofei Ma and Rui Dong and Xiaokai Wei and Henry Zhu and Xinchi Chen and Zhiheng Huang and Peng Xu and Andrew Arnold and Dan Roth,http://arxiv.org/pdf/2205.09224v2
http://arxiv.org/abs/2205.13657v3,creativecommons.org/licenses/by/4.0/,An enhanced Conv-TasNet model for speech separation using a speaker distance-based loss function,Jose A. Arango-Sánchez and Julián D. Arias-Londoño,http://arxiv.org/pdf/2205.13657v3
http://arxiv.org/abs/2207.10802v2,creativecommons.org/licenses/by/4.0/,Combing for Credentials: Active Pattern Extraction from Smart Reply,Bargav Jayaraman and Esha Ghosh and Melissa Chase and Sambuddha Roy and Huseyin Inan and Wei Dai and David Evans,http://arxiv.org/pdf/2207.10802v2
http://arxiv.org/abs/2211.01542v2,creativecommons.org/licenses/by/4.0/,Continual Learning of Neural Machine Translation within Low Forgetting Risk Regions,Shuhao Gu and Bojie Hu and Yang Feng,http://arxiv.org/pdf/2211.01542v2
http://arxiv.org/abs/2211.07712v1,creativecommons.org/licenses/by/4.0/,Cloning Ideology and Style using Deep Learning,Dr. Omer Beg and Muhammad Nasir Zafar and Waleed Anjum,http://arxiv.org/pdf/2211.07712v1
http://arxiv.org/abs/2211.16912v1,creativecommons.org/licenses/by/4.0/,Quadapter: Adapter for GPT-2 Quantization,Minseop Park and Jaeseong You and Markus Nagel and Simyung Chang,http://arxiv.org/pdf/2211.16912v1
http://arxiv.org/abs/2212.01692v1,creativecommons.org/licenses/by/4.0/,What is Not in the Context? Evaluation of Few-shot Learners with Informative Demonstrations,Michal Štefánik and Marek Kadlčík,http://arxiv.org/pdf/2212.01692v1
http://arxiv.org/abs/2301.10521v1,creativecommons.org/licenses/by/4.0/,ExaRanker: Explanation-Augmented Neural Ranker,Fernando Ferraretto and Thiago Laitz and Roberto Lotufo and Rodrigo Nogueira,http://arxiv.org/pdf/2301.10521v1
http://arxiv.org/abs/2301.11845v2,creativecommons.org/licenses/by/4.0/,Learning the Effects of Physical Actions in a Multi-modal Environment,Gautier Dagan and Frank Keller and Alex Lascarides,http://arxiv.org/pdf/2301.11845v2
http://arxiv.org/abs/2302.05608v1,creativecommons.org/licenses/by/4.0/,Differentiable Outlier Detection Enable Robust Deep Multimodal Analysis,Zhu Wang and Sourav Medya and Sathya N. Ravi,http://arxiv.org/pdf/2302.05608v1
http://arxiv.org/abs/2302.05888v1,creativecommons.org/licenses/by/4.0/,Position Matters! Empirical Study of Order Effect in Knowledge-grounded Dialogue,Hsuan Su and Shachi H Kumar and Sahisnu Mazumder and Wenda Chen and Ramesh Manuvinakurike and Eda Okur and Saurav Sahay and Lama Nachman and Shang-Tse Chen and Hung-yi Lee,http://arxiv.org/pdf/2302.05888v1
http://arxiv.org/abs/2302.12367v1,creativecommons.org/licenses/by/4.0/,Extracting Victim Counts from Text,Mian Zhong and Shehzaad Dhuliawala and Niklas Stoehr,http://arxiv.org/pdf/2302.12367v1
http://arxiv.org/abs/2303.17003v1,creativecommons.org/licenses/by/4.0/,Evaluating GPT-3.5 and GPT-4 Models on Brazilian University Admission Exams,Desnes Nunes and Ricardo Primi and Ramon Pires and Roberto Lotufo and Rodrigo Nogueira,http://arxiv.org/pdf/2303.17003v1
http://arxiv.org/abs/2303.17972v2,creativecommons.org/licenses/by/4.0/,$\varepsilon$ KÚ <MASK>: Integrating Yorùbá cultural greetings into machine translation,Idris Akinade and Jesujoba Alabi and David Adelani and Clement Odoje and Dietrich Klakow,http://arxiv.org/pdf/2303.17972v2
http://arxiv.org/abs/2304.01830v1,creativecommons.org/licenses/by/4.0/,Learning to Name Classes for Vision and Language Models,Sarah Parisot and Yongxin Yang and Steven McDonagh,http://arxiv.org/pdf/2304.01830v1
http://arxiv.org/abs/2304.08247v1,creativecommons.org/licenses/by/4.0/,MedAlpaca -- An Open-Source Collection of Medical Conversational AI Models and Training Data,Tianyu Han and Lisa C. Adams and Jens-Michalis Papaioannou and Paul Grundmann and Tom Oberhauser and Alexander Löser and Daniel Truhn and Keno K. Bressem,http://arxiv.org/pdf/2304.08247v1
http://arxiv.org/abs/2304.11075v1,creativecommons.org/licenses/by/4.0/,Spaiche: Extending State-of-the-Art ASR Models to Swiss German Dialects,Clément Sicard and Kajetan Pyszkowski and Victor Gillioz,http://arxiv.org/pdf/2304.11075v1
http://arxiv.org/abs/2110.09574v1,creativecommons.org/licenses/by/4.0/,Multilingual Domain Adaptation for NMT: Decoupling Language and Domain Information with Adapters,Asa Cooper Stickland and Alexandre Bérard and Vassilina Nikoulina,http://arxiv.org/pdf/2110.09574v1
http://arxiv.org/abs/2205.08755v1,creativecommons.org/licenses/by/4.0/,Persian Natural Language Inference: A Meta-learning approach,Heydar Soudani and Mohammad Hassan Mojab and Hamid Beigy,http://arxiv.org/pdf/2205.08755v1
http://arxiv.org/abs/2101.11038v1,creativecommons.org/licenses/by/4.0/,Muppet: Massive Multi-task Representations with Pre-Finetuning,Armen Aghajanyan and Anchit Gupta and Akshat Shrivastava and Xilun Chen and Luke Zettlemoyer and Sonal Gupta,http://arxiv.org/pdf/2101.11038v1
http://arxiv.org/abs/2201.12799v1,creativecommons.org/licenses/by/4.0/,Recognition of Implicit Geographic Movement in Text,Scott Pezanowski and Prasenjit Mitra,http://arxiv.org/pdf/2201.12799v1
http://arxiv.org/abs/2204.04711v1,creativecommons.org/licenses/by/4.0/,Data Augmentation for Biomedical Factoid Question Answering,Dimitris Pappas and Prodromos Malakasiotis and Ion Androutsopoulos,http://arxiv.org/pdf/2204.04711v1
http://arxiv.org/abs/2211.15914v1,creativecommons.org/licenses/by/4.0/,Zero-Shot Opinion Summarization with GPT-3,Adithya Bhaskar and Alexander R. Fabbri and Greg Durrett,http://arxiv.org/pdf/2211.15914v1
http://arxiv.org/abs/2109.05812v2,creativecommons.org/licenses/by/4.0/,UniMS: A Unified Framework for Multimodal Summarization with Knowledge Distillation,Zhengkun Zhang and Xiaojun Meng and Yasheng Wang and Xin Jiang and Qun Liu and Zhenglu Yang,http://arxiv.org/pdf/2109.05812v2
http://arxiv.org/abs/2203.06486v3,creativecommons.org/licenses/by/4.0/,Chart-to-Text: A Large-Scale Benchmark for Chart Summarization,Shankar Kantharaj and Rixie Tiffany Ko Leong and Xiang Lin and Ahmed Masry and Megh Thakkar and Enamul Hoque and Shafiq Joty,http://arxiv.org/pdf/2203.06486v3
http://arxiv.org/abs/2205.11409v1,creativecommons.org/licenses/by/4.0/,Many-Class Text Classification with Matching,Yi Song and Yuxian Gu and Minlie Huang,http://arxiv.org/pdf/2205.11409v1
http://arxiv.org/abs/2212.01588v1,creativecommons.org/licenses/by/4.0/,RHO ($ρ$): Reducing Hallucination in Open-domain Dialogues with Knowledge Grounding,Ziwei Ji and Zihan Liu and Nayeon Lee and Tiezheng Yu and Bryan Wilie and Min Zeng and Pascale Fung,http://arxiv.org/pdf/2212.01588v1
http://arxiv.org/abs/2212.08153v1,creativecommons.org/licenses/by/4.0/,FiDO: Fusion-in-Decoder optimized for stronger performance and faster inference,Michiel de Jong and Yury Zemlyanskiy and Joshua Ainslie and Nicholas FitzGerald and Sumit Sanghai and Fei Sha and William Cohen,http://arxiv.org/pdf/2212.08153v1 | |
http://arxiv.org/abs/2303.10138v1,creativecommons.org/licenses/by/4.0/,"Generate, Transform, Answer: Question Specific Tool Synthesis for Tabular Data",Carlos Gemmell and Jeffrey Dalton,http://arxiv.org/pdf/2303.10138v1 | |
http://arxiv.org/abs/2304.09386v1,creativecommons.org/licenses/by/4.0/,Towards Objective-Tailored Genetic Improvement Through Large Language Models,Sungmin Kang and Shin Yoo,http://arxiv.org/pdf/2304.09386v1 | |
http://arxiv.org/abs/2112.08491v1,creativecommons.org/licenses/by/4.0/,"Human Languages with Greater Information Density Increase Communication Speed, but Decrease Conversation Breadth",Pedro Aceves and James A. Evans,http://arxiv.org/pdf/2112.08491v1 | |
http://arxiv.org/abs/2303.03915v1,creativecommons.org/licenses/by/4.0/,The BigScience ROOTS Corpus: A 1.6TB Composite Multilingual Dataset,Hugo Laurençon and Lucile Saulnier and Thomas Wang and Christopher Akiki and Albert Villanova del Moral and Teven Le Scao and Leandro Von Werra and Chenghao Mou and Eduardo González Ponferrada and Huu Nguyen and Jörg Frohberg and Mario Šaško and Quentin Lhoest and Angelina McMillan-Major and Gerard Dupont and Stella Biderman and Anna Rogers and Loubna Ben allal and Francesco De Toni and Giada Pistilli and Olivier Nguyen and Somaieh Nikpoor and Maraim Masoud and Pierre Colombo and Javier de la Rosa and Paulo Villegas and Tristan Thrush and Shayne Longpre and Sebastian Nagel and Leon Weber and Manuel Muñoz and Jian Zhu and Daniel Van Strien and Zaid Alyafeai and Khalid Almubarak and Minh Chien Vu and Itziar Gonzalez-Dios and Aitor Soroa and Kyle Lo and Manan Dey and Pedro Ortiz Suarez and Aaron Gokaslan and Shamik Bose and David Adelani and Long Phan and Hieu Tran and Ian Yu and Suhas Pai and Jenny Chim and Violette Lepercq and Suzana Ilic and Margaret Mitchell and Sasha Alexandra Luccioni and Yacine Jernite,http://arxiv.org/pdf/2303.03915v1 | |
http://arxiv.org/abs/1912.10308v1,creativecommons.org/licenses/by/4.0/,Candidate Fusion: Integrating Language Modelling into a Sequence-to-Sequence Handwritten Word Recognition Architecture,Lei Kang and Pau Riba and Mauricio Villegas and Alicia Fornés and Marçal Rusiñol,http://arxiv.org/pdf/1912.10308v1 | |
http://arxiv.org/abs/2204.03498v1,creativecommons.org/licenses/by/4.0/,On the Effectiveness of Pretrained Models for API Learning,Mohammad Abdul Hadi and Imam Nur Bani Yusuf and Ferdian Thung and Kien Gia Luong and Jiang Lingxiao and Fatemeh H. Fard and David Lo,http://arxiv.org/pdf/2204.03498v1 | |
http://arxiv.org/abs/2204.07288v1,creativecommons.org/licenses/by/4.0/,Characterizing the Efficiency vs. Accuracy Trade-off for Long-Context NLP Models,Phyllis Ang and Bhuwan Dhingra and Lisa Wu Wills,http://arxiv.org/pdf/2204.07288v1 | |
http://arxiv.org/abs/2109.14728v1,creativecommons.org/licenses/by/4.0/,Collaborative Storytelling with Human Actors and AI Narrators,Boyd Branch and Piotr Mirowski and Kory W. Mathewson,http://arxiv.org/pdf/2109.14728v1 | |
http://arxiv.org/abs/2112.14757v2,creativecommons.org/licenses/by/4.0/,A Simple Baseline for Open-Vocabulary Semantic Segmentation with Pre-trained Vision-language Model,Mengde Xu and Zheng Zhang and Fangyun Wei and Yutong Lin and Yue Cao and Han Hu and Xiang Bai,http://arxiv.org/pdf/2112.14757v2 | |
http://arxiv.org/abs/2208.00636v2,creativecommons.org/licenses/by/4.0/,Interacting with next-phrase suggestions: How suggestion systems aid and influence the cognitive processes of writing,Advait Bhat and Saaket Agashe and Niharika Mohile and Parth Oberoi and Ravi Jangir and Anirudha Joshi,http://arxiv.org/pdf/2208.00636v2 | |
http://arxiv.org/abs/2301.11305v1,creativecommons.org/licenses/by/4.0/,DetectGPT: Zero-Shot Machine-Generated Text Detection using Probability Curvature,Eric Mitchell and Yoonho Lee and Alexander Khazatsky and Christopher D. Manning and Chelsea Finn,http://arxiv.org/pdf/2301.11305v1 | |
http://arxiv.org/abs/2111.00640v2,creativecommons.org/licenses/by/4.0/,VSEC: Transformer-based Model for Vietnamese Spelling Correction,Dinh-Truong Do and Ha Thanh Nguyen and Thang Ngoc Bui and Dinh Hieu Vo,http://arxiv.org/pdf/2111.00640v2 | |
http://arxiv.org/abs/2205.01068v4,creativecommons.org/licenses/by/4.0/,OPT: Open Pre-trained Transformer Language Models,Susan Zhang and Stephen Roller and Naman Goyal and Mikel Artetxe and Moya Chen and Shuohui Chen and Christopher Dewan and Mona Diab and Xian Li and Xi Victoria Lin and Todor Mihaylov and Myle Ott and Sam Shleifer and Kurt Shuster and Daniel Simig and Punit Singh Koura and Anjali Sridhar and Tianlu Wang and Luke Zettlemoyer,http://arxiv.org/pdf/2205.01068v4 | |
http://arxiv.org/abs/2303.11156v1,creativecommons.org/licenses/by/4.0/,Can AI-Generated Text be Reliably Detected?,Vinu Sankar Sadasivan and Aounon Kumar and Sriram Balasubramanian and Wenxiao Wang and Soheil Feizi,http://arxiv.org/pdf/2303.11156v1 | |
http://arxiv.org/abs/2303.08448v1,creativecommons.org/licenses/by/4.0/,A Cross-institutional Evaluation on Breast Cancer Phenotyping NLP Algorithms on Electronic Health Records,Sicheng Zhou and Nan Wang and Liwei Wang and Ju Sun and Anne Blaes and Hongfang Liu and Rui Zhang,http://arxiv.org/pdf/2303.08448v1 | |
http://arxiv.org/abs/2203.13590v1,creativecommons.org/licenses/by/4.0/,Impact of Dataset on Acoustic Models for Automatic Speech Recognition,Siddhesh Singh,http://arxiv.org/pdf/2203.13590v1 | |
http://arxiv.org/abs/2206.08853v2,creativecommons.org/licenses/by/4.0/,MineDojo: Building Open-Ended Embodied Agents with Internet-Scale Knowledge,Linxi Fan and Guanzhi Wang and Yunfan Jiang and Ajay Mandlekar and Yuncong Yang and Haoyi Zhu and Andrew Tang and De-An Huang and Yuke Zhu and Anima Anandkumar,http://arxiv.org/pdf/2206.08853v2 | |
http://arxiv.org/abs/2211.10438v4,creativecommons.org/licenses/by/4.0/,SmoothQuant: Accurate and Efficient Post-Training Quantization for Large Language Models,Guangxuan Xiao and Ji Lin and Mickael Seznec and Hao Wu and Julien Demouth and Song Han,http://arxiv.org/pdf/2211.10438v4 | |
http://arxiv.org/abs/2301.13268v2,creativecommons.org/licenses/by/4.0/,Contextual Dynamic Prompting for Response Generation in Task-oriented Dialog Systems,Sandesh Swamy and Narges Tabari and Chacha Chen and Rashmi Gangadharaiah,http://arxiv.org/pdf/2301.13268v2 | |
http://arxiv.org/abs/2302.05319v1,creativecommons.org/licenses/by/4.0/,Controlling Large Language Models to Generate Secure and Vulnerable Code,Jingxuan He and Martin Vechev,http://arxiv.org/pdf/2302.05319v1 | |
http://arxiv.org/abs/2302.08659v1,creativecommons.org/licenses/by/4.0/,Uncertainty-aware Self-training for Low-resource Neural Sequence Labeling,Jianing Wang and Chengyu Wang and Jun Huang and Ming Gao and Aoying Zhou,http://arxiv.org/pdf/2302.08659v1 | |
http://arxiv.org/abs/2304.01246v1,creativecommons.org/licenses/by/4.0/,Safety Analysis in the Era of Large Language Models: A Case Study of STPA using ChatGPT,Yi Qi and Xingyu Zhao and Xiaowei Huang,http://arxiv.org/pdf/2304.01246v1 | |
http://arxiv.org/abs/2304.09667v2,creativecommons.org/licenses/by/4.0/,GeneGPT: Augmenting Large Language Models with Domain Tools for Improved Access to Biomedical Information,Qiao Jin and Yifan Yang and Qingyu Chen and Zhiyong Lu,http://arxiv.org/pdf/2304.09667v2 | |
http://arxiv.org/abs/2112.01742v1,creativecommons.org/licenses/by/4.0/,Multitask Finetuning for Improving Neural Machine Translation in Indian Languages,Shaily Desai and Atharva Kshirsagar and Manisha Marathe,http://arxiv.org/pdf/2112.01742v1 | |
http://arxiv.org/abs/2107.05476v1,creativecommons.org/licenses/by/4.0/,Technical Report of Team GraphMIRAcles in the WikiKG90M-LSC Track of OGB-LSC @ KDD Cup 2021,Jianyu Cai and Jiajun Chen and Taoxing Pan and Zhanqiu Zhang and Jie Wang,http://arxiv.org/pdf/2107.05476v1 | |
http://arxiv.org/abs/2110.01963v1,creativecommons.org/licenses/by/4.0/,"Multimodal datasets: misogyny, pornography, and malignant stereotypes",Abeba Birhane and Vinay Uday Prabhu and Emmanuel Kahembwe,http://arxiv.org/pdf/2110.01963v1 | |
http://arxiv.org/abs/2109.09405v1,creativecommons.org/licenses/by/4.0/,Assessing the quality of sources in Wikidata across languages: a hybrid approach,Gabriel Amaral and Alessandro Piscopo and Lucie-Aimée Kaffee and Odinaldo Rodrigues and Elena Simperl,http://arxiv.org/pdf/2109.09405v1 | |
http://arxiv.org/abs/2109.09475v1,creativecommons.org/licenses/by/4.0/,Knowledge Graph Question Answering via SPARQL Silhouette Generation,Sukannya Purkayastha and Saswati Dana and Dinesh Garg and Dinesh Khandelwal and G P Shrivatsa Bhargav,http://arxiv.org/pdf/2109.09475v1 | |
http://arxiv.org/abs/2109.10504v3,creativecommons.org/licenses/by/4.0/,KD-VLP: Improving End-to-End Vision-and-Language Pretraining with Object Knowledge Distillation,Yongfei Liu and Chenfei Wu and Shao-yen Tseng and Vasudev Lal and Xuming He and Nan Duan,http://arxiv.org/pdf/2109.10504v3 | |
http://arxiv.org/abs/2205.09229v3,creativecommons.org/licenses/by/4.0/,PromptDA: Label-guided Data Augmentation for Prompt-based Few-shot Learners,Canyu Chen and Kai Shu,http://arxiv.org/pdf/2205.09229v3 | |
http://arxiv.org/abs/2205.12089v2,creativecommons.org/licenses/by/4.0/,Sim-To-Real Transfer of Visual Grounding for Human-Aided Ambiguity Resolution,Georgios Tziafas and Hamidreza Kasaei,http://arxiv.org/pdf/2205.12089v2 | |
http://arxiv.org/abs/2210.12328v1,creativecommons.org/licenses/by/4.0/,"R$^2$F: A General Retrieval, Reading and Fusion Framework for Document-level Natural Language Inference",Hao Wang and Yixin Cao and Yangguang Li and Zhen Huang and Kun Wang and Jing Shao,http://arxiv.org/pdf/2210.12328v1 | |
http://arxiv.org/abs/2211.08462v1,creativecommons.org/licenses/by/4.0/,Navigating Connected Memories with a Task-oriented Dialog System,Seungwhan Moon and Satwik Kottur and Alborz Geramifard and Babak Damavandi,http://arxiv.org/pdf/2211.08462v1 | |
http://arxiv.org/abs/2303.04142v1,creativecommons.org/licenses/by/4.0/,From Copilot to Pilot: Towards AI Supported Software Development,Rohith Pudari and Neil A. Ernst,http://arxiv.org/pdf/2303.04142v1 | |
http://arxiv.org/abs/2211.10086v1,creativecommons.org/licenses/by/4.0/,Metadata Might Make Language Models Better,Kaspar Beelen and Daniel van Strien,http://arxiv.org/pdf/2211.10086v1 | |
http://arxiv.org/abs/2204.09168v2,creativecommons.org/licenses/by/4.0/,Analyzing Gender Representation in Multilingual Models,Hila Gonen and Shauli Ravfogel and Yoav Goldberg,http://arxiv.org/pdf/2204.09168v2 | |
http://arxiv.org/abs/2103.00854v3,creativecommons.org/licenses/by/4.0/,Vyākarana: A Colorless Green Benchmark for Syntactic Evaluation in Indic Languages,Rajaswa Patil and Jasleen Dhillon and Siddhant Mahurkar and Saumitra Kulkarni and Manav Malhotra and Veeky Baths,http://arxiv.org/pdf/2103.00854v3 | |
http://arxiv.org/abs/2108.04080v2,creativecommons.org/licenses/by/4.0/,Aspect-based Sentiment Analysis in Document -- FOMC Meeting Minutes on Economic Projection,Sarah-Yifei-Wang,http://arxiv.org/pdf/2108.04080v2 | |
http://arxiv.org/abs/2110.03142v1,creativecommons.org/licenses/by/4.0/,A Comparative Study of Transformer-Based Language Models on Extractive Question Answering,Kate Pearce and Tiffany Zhan and Aneesh Komanduri and Justin Zhan,http://arxiv.org/pdf/2110.03142v1 | |
http://arxiv.org/abs/1909.08135v3,creativecommons.org/licenses/by/4.0/,SUPP.AI: Finding Evidence for Supplement-Drug Interactions,Lucy Lu Wang and Oyvind Tafjord and Arman Cohan and Sarthak Jain and Sam Skjonsberg and Carissa Schoenick and Nick Botner and Waleed Ammar,http://arxiv.org/pdf/1909.08135v3 | |
http://arxiv.org/abs/2011.04767v1,creativecommons.org/licenses/by/4.0/,An Analysis of Dataset Overlap on Winograd-Style Tasks,Ali Emami and Adam Trischler and Kaheer Suleman and Jackie Chi Kit Cheung,http://arxiv.org/pdf/2011.04767v1 | |
http://arxiv.org/abs/2011.10285v1,creativecommons.org/licenses/by/4.0/,Learning Informative Representations of Biomedical Relations with Latent Variable Models,Harshil Shah and Julien Fauqueur,http://arxiv.org/pdf/2011.10285v1 | |
http://arxiv.org/abs/2012.15353v1,creativecommons.org/licenses/by/4.0/,Deriving Contextualised Semantic Features from BERT (and Other Transformer Model) Embeddings,Jacob Turton and David Vinson and Robert Elliott Smith,http://arxiv.org/pdf/2012.15353v1 | |
http://arxiv.org/abs/2103.06779v2,creativecommons.org/licenses/by/4.0/,MERMAID: Metaphor Generation with Symbolism and Discriminative Decoding,Tuhin Chakrabarty and Xurui Zhang and Smaranda Muresan and Nanyun Peng,http://arxiv.org/pdf/2103.06779v2 | |
http://arxiv.org/abs/2103.16102v1,creativecommons.org/licenses/by/4.0/,XRJL-HKUST at SemEval-2021 Task 4: WordNet-Enhanced Dual Multi-head Co-Attention for Reading Comprehension of Abstract Meaning,Yuxin Jiang and Ziyi Shou and Qijun Wang and Hao Wu and Fangzhen Lin,http://arxiv.org/pdf/2103.16102v1 | |
http://arxiv.org/abs/2104.03465v1,creativecommons.org/licenses/by/4.0/,Nutribullets Hybrid: Multi-document Health Summarization,Darsh J Shah and Lili Yu and Tao Lei and Regina Barzilay,http://arxiv.org/pdf/2104.03465v1 | |
http://arxiv.org/abs/2109.00239v1,creativecommons.org/licenses/by/4.0/,OptAGAN: Entropy-based finetuning on text VAE-GAN,Paolo Tirotta and Stefano Lodi,http://arxiv.org/pdf/2109.00239v1 | |
http://arxiv.org/abs/2109.13067v1,creativecommons.org/licenses/by/4.0/,Multi-Task and Multi-Corpora Training Strategies to Enhance Argumentative Sentence Linking Performance,Jan Wira Gotama Putra and Simone Teufel and Takenobu Tokunaga,http://arxiv.org/pdf/2109.13067v1 | |
http://arxiv.org/abs/2110.08355v2,creativecommons.org/licenses/by/4.0/,Clean or Annotate: How to Spend a Limited Data Collection Budget,Derek Chen and Zhou Yu and Samuel R. Bowman,http://arxiv.org/pdf/2110.08355v2 | |
http://arxiv.org/abs/2112.11670v1,creativecommons.org/licenses/by/4.0/,Domain Adaptation with Pre-trained Transformers for Query Focused Abstractive Text Summarization,Md Tahmid Rahman Laskar and Enamul Hoque and Jimmy Xiangji Huang,http://arxiv.org/pdf/2112.11670v1 | |
http://arxiv.org/abs/2204.10989v3,creativecommons.org/licenses/by/4.0/,Dialogue Meaning Representation for Task-Oriented Dialogue Systems,Xiangkun Hu and Junqi Dai and Hang Yan and Yi Zhang and Qipeng Guo and Xipeng Qiu and Zheng Zhang,http://arxiv.org/pdf/2204.10989v3 | |
http://arxiv.org/abs/2204.12061v2,creativecommons.org/licenses/by/4.0/,PLOD: An Abbreviation Detection Dataset for Scientific Documents,Leonardo Zilio and Hadeel Saadany and Prashant Sharma and Diptesh Kanojia and Constantin Orăsan,http://arxiv.org/pdf/2204.12061v2 | |
http://arxiv.org/abs/2205.12506v2,creativecommons.org/licenses/by/4.0/,Memorization in NLP Fine-tuning Methods,Fatemehsadat Mireshghallah and Archit Uniyal and Tianhao Wang and David Evans and Taylor Berg-Kirkpatrick,http://arxiv.org/pdf/2205.12506v2 | |
http://arxiv.org/abs/2206.02770v1,creativecommons.org/licenses/by/4.0/,Multimodal Contrastive Learning with LIMoE: the Language-Image Mixture of Experts,Basil Mustafa and Carlos Riquelme and Joan Puigcerver and Rodolphe Jenatton and Neil Houlsby,http://arxiv.org/pdf/2206.02770v1 | |
http://arxiv.org/abs/2209.12616v1,creativecommons.org/licenses/by/4.0/,T-NER: An All-Round Python Library for Transformer-based Named Entity Recognition,Asahi Ushio and Jose Camacho-Collados,http://arxiv.org/pdf/2209.12616v1 | |
http://arxiv.org/abs/2210.02570v1,creativecommons.org/licenses/by/4.0/,Revisiting Structured Dropout,Yiren Zhao and Oluwatomisin Dada and Xitong Gao and Robert D Mullins,http://arxiv.org/pdf/2210.02570v1 | |
http://arxiv.org/abs/2210.07469v2,creativecommons.org/licenses/by/4.0/,StyLEx: Explaining Style Using Human Lexical Annotations,Shirley Anugrah Hayati and Kyumin Park and Dheeraj Rajagopal and Lyle Ungar and Dongyeop Kang,http://arxiv.org/pdf/2210.07469v2 | |
http://arxiv.org/abs/2210.12929v1,creativecommons.org/licenses/by/4.0/,Finding Memo: Extractive Memorization in Constrained Sequence Generation Tasks,Vikas Raunak and Arul Menezes,http://arxiv.org/pdf/2210.12929v1 | |
http://arxiv.org/abs/2211.07716v1,creativecommons.org/licenses/by/4.0/,Zero-Shot Text Matching for Automated Auditing using Sentence Transformers,David Biesner and Maren Pielka and Rajkumar Ramamurthy and Tim Dilmaghani and Bernd Kliem and Rüdiger Loitz and Rafet Sifa,http://arxiv.org/pdf/2211.07716v1 | |
http://arxiv.org/abs/2211.13638v1,creativecommons.org/licenses/by/4.0/,Prototypical Fine-tuning: Towards Robust Performance Under Varying Data Sizes,Yiqiao Jin and Xiting Wang and Yaru Hao and Yizhou Sun and Xing Xie,http://arxiv.org/pdf/2211.13638v1 | |
http://arxiv.org/abs/2212.10515v1,creativecommons.org/licenses/by/4.0/,CausalDialogue: Modeling Utterance-level Causality in Conversations,Yi-Lin Tuan and Alon Albalak and Wenda Xu and Michael Saxon and Connor Pryor and Lise Getoor and William Yang Wang,http://arxiv.org/pdf/2212.10515v1 | |
http://arxiv.org/abs/2302.05454v1,creativecommons.org/licenses/by/4.0/,Distillation of encoder-decoder transformers for sequence labelling,Marco Farina and Duccio Pappadopulo and Anant Gupta and Leslie Huang and Ozan İrsoy and Thamar Solorio,http://arxiv.org/pdf/2302.05454v1 | |
http://arxiv.org/abs/2302.06598v1,creativecommons.org/licenses/by/4.0/,Gradient-Based Automated Iterative Recovery for Parameter-Efficient Tuning,Maximilian Mozes and Tolga Bolukbasi and Ann Yuan and Frederick Liu and Nithum Thain and Lucas Dixon,http://arxiv.org/pdf/2302.06598v1 | |
http://arxiv.org/abs/2303.12135v1,creativecommons.org/licenses/by/4.0/,Understand Legal Documents with Contextualized Large Language Models,Xin Jin and Yuchen Wang,http://arxiv.org/pdf/2303.12135v1 | |
http://arxiv.org/abs/2303.13314v1,creativecommons.org/licenses/by/4.0/,Leveraging Foundation Models for Clinical Text Analysis,Shaina Raza and Syed Raza Bashir,http://arxiv.org/pdf/2303.13314v1 | |
http://arxiv.org/abs/2304.06148v1,creativecommons.org/licenses/by/4.0/,Detection of Fake Generated Scientific Abstracts,Panagiotis C. Theocharopoulos and Panagiotis Anagnostou and Anastasia Tsoukala and Spiros V. Georgakopoulos and Sotiris K. Tasoulis and Vassilis P. Plagianakos,http://arxiv.org/pdf/2304.06148v1 | |
http://arxiv.org/abs/2212.08094v1,creativecommons.org/licenses/by/4.0/,Joint processing of linguistic properties in brains and language models,Subba Reddy Oota and Manish Gupta and Mariya Toneva,http://arxiv.org/pdf/2212.08094v1 | |
http://arxiv.org/abs/2108.03089v1,creativecommons.org/licenses/by/4.0/,Cross-lingual Capsule Network for Hate Speech Detection in Social Media,Aiqi Jiang and Arkaitz Zubiaga,http://arxiv.org/pdf/2108.03089v1 | |
http://arxiv.org/abs/2203.09299v1,creativecommons.org/licenses/by/4.0/,Proceedings Fifth Workshop on Models for Formal Analysis of Real Systems,Clemens Dubslaff and Bas Luttik,http://arxiv.org/pdf/2203.09299v1 | |
http://arxiv.org/abs/2207.04648v1,creativecommons.org/licenses/by/4.0/,Learning Large-scale Universal User Representation with Sparse Mixture of Experts,Caigao Jiang and Siqiao Xue and James Zhang and Lingyue Liu and Zhibo Zhu and Hongyan Hao,http://arxiv.org/pdf/2207.04648v1 | |
http://arxiv.org/abs/2012.03837v2,creativecommons.org/licenses/by/4.0/,Parallel Training of Deep Networks with Local Updates,Michael Laskin and Luke Metz and Seth Nabarro and Mark Saroufim and Badreddine Noune and Carlo Luschi and Jascha Sohl-Dickstein and Pieter Abbeel,http://arxiv.org/pdf/2012.03837v2 | |
http://arxiv.org/abs/1910.04269v1,creativecommons.org/licenses/by/4.0/,Spoken Language Identification using ConvNets,Sarthak and Shikhar Shukla and Govind Mittal,http://arxiv.org/pdf/1910.04269v1 | |
http://arxiv.org/abs/2104.00767v1,creativecommons.org/licenses/by/4.0/,Canonical and Surface Morphological Segmentation for Nguni Languages,Tumi Moeng and Sheldon Reay and Aaron Daniels and Jan Buys,http://arxiv.org/pdf/2104.00767v1 | |
http://arxiv.org/abs/2010.05953v2,creativecommons.org/licenses/by/4.0/,COMET-ATOMIC 2020: On Symbolic and Neural Commonsense Knowledge Graphs,Jena D. Hwang and Chandra Bhagavatula and Ronan Le Bras and Jeff Da and Keisuke Sakaguchi and Antoine Bosselut and Yejin Choi,http://arxiv.org/pdf/2010.05953v2 | |
http://arxiv.org/abs/2107.06955v1,creativecommons.org/licenses/by/4.0/,HTLM: Hyper-Text Pre-Training and Prompting of Language Models,Armen Aghajanyan and Dmytro Okhonko and Mike Lewis and Mandar Joshi and Hu Xu and Gargi Ghosh and Luke Zettlemoyer,http://arxiv.org/pdf/2107.06955v1 | |
http://arxiv.org/abs/2210.08773v3,creativecommons.org/licenses/by/4.0/,Plug-and-Play VQA: Zero-shot VQA by Conjoining Large Pretrained Models with Zero Training,Anthony Meng Huat Tiong and Junnan Li and Boyang Li and Silvio Savarese and Steven C. H. Hoi,http://arxiv.org/pdf/2210.08773v3 | |
http://arxiv.org/abs/2301.03238v1,creativecommons.org/licenses/by/4.0/,MAQA: A Multimodal QA Benchmark for Negation,Judith Yue Li and Aren Jansen and Qingqing Huang and Joonseok Lee and Ravi Ganti and Dima Kuzmin,http://arxiv.org/pdf/2301.03238v1 | |
http://arxiv.org/abs/2304.11029v2,creativecommons.org/licenses/by/4.0/,CLaMP: Contrastive Language-Music Pre-training for Cross-Modal Symbolic Music Information Retrieval,Shangda Wu and Dingyao Yu and Xu Tan and Maosong Sun,http://arxiv.org/pdf/2304.11029v2 | |
http://arxiv.org/abs/2110.02600v3,creativecommons.org/licenses/by/4.0/,Sequential Reptile: Inter-Task Gradient Alignment for Multilingual Learning,Seanie Lee and Hae Beom Lee and Juho Lee and Sung Ju Hwang,http://arxiv.org/pdf/2110.02600v3 | |
http://arxiv.org/abs/2203.13550v1,creativecommons.org/licenses/by/4.0/,Modeling Target-Side Morphology in Neural Machine Translation: A Comparison of Strategies,Marion Weller-Di Marco and Matthias Huck and Alexander Fraser,http://arxiv.org/pdf/2203.13550v1 | |
http://arxiv.org/abs/2205.12542v3,creativecommons.org/licenses/by/4.0/,ER-Test: Evaluating Explanation Regularization Methods for Language Models,Brihi Joshi and Aaron Chan and Ziyi Liu and Shaoliang Nie and Maziar Sanjabi and Hamed Firooz and Xiang Ren,http://arxiv.org/pdf/2205.12542v3 | |
http://arxiv.org/abs/2209.09815v2,creativecommons.org/licenses/by/4.0/,Towards Fine-tuning Pre-trained Language Models with Integer Forward and Backward Propagation,Mohammadreza Tayaranian and Alireza Ghaffari and Marzieh S. Tahaei and Mehdi Rezagholizadeh and Masoud Asgharian and Vahid Partovi Nia,http://arxiv.org/pdf/2209.09815v2 | |
http://arxiv.org/abs/2210.02914v2,creativecommons.org/licenses/by/4.0/,Generative Entity Typing with Curriculum Learning,Siyu Yuan and Deqing Yang and Jiaqing Liang and Zhixu Li and Jinxi Liu and Jingyue Huang and Yanghua Xiao,http://arxiv.org/pdf/2210.02914v2 | |
http://arxiv.org/abs/2210.17049v2,creativecommons.org/licenses/by/4.0/,Modular Hybrid Autoregressive Transducer,Zhong Meng and Tongzhou Chen and Rohit Prabhavalkar and Yu Zhang and Gary Wang and Kartik Audhkhasi and Jesse Emond and Trevor Strohman and Bhuvana Ramabhadran and W. Ronny Huang and Ehsan Variani and Yinghui Huang and Pedro J. Moreno,http://arxiv.org/pdf/2210.17049v2 | |
http://arxiv.org/abs/2212.07127v4,creativecommons.org/licenses/by/4.0/,Towards mapping the contemporary art world with ArtLM: an art-specific NLP model,Qinkai Chen and Mohamed El-Mennaoui and Antoine Fosset and Amine Rebei and Haoyang Cao and Philine Bouscasse and Christy Eóin O'Beirne and Sasha Shevchenko and Mathieu Rosenbaum,http://arxiv.org/pdf/2212.07127v4 | |
http://arxiv.org/abs/2303.14480v1,creativecommons.org/licenses/by/4.0/,GANTEE: Generative Adversatial Network for Taxonomy Entering Evaluation,Zhouhong Gu and Sihang Jiang and Jingping Liu and Yanghua Xiao and Hongwei Feng and Zhixu Li and Jiaqing Liang and Jian Zhong,http://arxiv.org/pdf/2303.14480v1 | |
http://arxiv.org/abs/1808.03570v1,creativecommons.org/licenses/by/4.0/,Densely Connected Convolutional Networks for Speech Recognition,Chia Yu Li and Ngoc Thang Vu,http://arxiv.org/pdf/1808.03570v1 | |
http://arxiv.org/abs/1811.06968v2,creativecommons.org/licenses/by/4.0/,Symbolic Register Automata,Loris D'Antoni and Tiago Ferreira and Matteo Sammartino and Alexandra Silva,http://arxiv.org/pdf/1811.06968v2 | |
http://arxiv.org/abs/1909.03526v3,creativecommons.org/licenses/by/4.0/,Multi-Task Bidirectional Transformer Representations for Irony Detection,Chiyu Zhang and Muhammad Abdul-Mageed,http://arxiv.org/pdf/1909.03526v3 | |
http://arxiv.org/abs/2109.10441v1,creativecommons.org/licenses/by/4.0/,Evaluating Debiasing Techniques for Intersectional Biases,Shivashankar Subramanian and Xudong Han and Timothy Baldwin and Trevor Cohn and Lea Frermann,http://arxiv.org/pdf/2109.10441v1 | |
http://arxiv.org/abs/2203.07890v1,creativecommons.org/licenses/by/4.0/,K-VQG: Knowledge-aware Visual Question Generation for Common-sense Acquisition,Kohei Uehara and Tatsuya Harada,http://arxiv.org/pdf/2203.07890v1 | |
http://arxiv.org/abs/2206.02661v1,creativecommons.org/licenses/by/4.0/,Evaluating Deep Taylor Decomposition for Reliability Assessment in the Wild,Stephanie Brandl and Daniel Hershcovich and Anders Søgaard,http://arxiv.org/pdf/2206.02661v1 | |
http://arxiv.org/abs/2209.14981v2,creativecommons.org/licenses/by/4.0/,Stop Wasting My Time! Saving Days of ImageNet and BERT Training with Latest Weight Averaging,Jean Kaddour,http://arxiv.org/pdf/2209.14981v2 | |
http://arxiv.org/abs/2210.05103v1,creativecommons.org/licenses/by/4.0/,Leveraging Artificial Intelligence on Binary Code Comprehension,Yifan Zhang,http://arxiv.org/pdf/2210.05103v1 | |
http://arxiv.org/abs/2212.08926v1,creativecommons.org/licenses/by/4.0/,A Simple Baseline for Beam Search Reranking,Lior Vassertail and Omer Levy,http://arxiv.org/pdf/2212.08926v1 | |
http://arxiv.org/abs/2212.10544v1,creativecommons.org/licenses/by/4.0/,Pretraining Without Attention,Junxiong Wang and Jing Nathan Yan and Albert Gu and Alexander M. Rush,http://arxiv.org/pdf/2212.10544v1 | |
http://arxiv.org/abs/2204.05541v1,creativecommons.org/licenses/by/4.0/,Not always about you: Prioritizing community needs when developing endangered language technology,Zoey Liu and Crystal Richardson and Richard Hatcher Jr and Emily Prud'hommeaux,http://arxiv.org/pdf/2204.05541v1 | |
http://arxiv.org/abs/2201.09680v1,creativecommons.org/licenses/by/4.0/,Relational Memory Augmented Language Models,Qi Liu and Dani Yogatama and Phil Blunsom,http://arxiv.org/pdf/2201.09680v1 | |
http://arxiv.org/abs/2106.06017v2,creativecommons.org/licenses/by/4.0/,Cross-lingual Emotion Detection,Sabit Hassan and Shaden Shaar and Kareem Darwish,http://arxiv.org/pdf/2106.06017v2 | |
http://arxiv.org/abs/1906.11608v2,creativecommons.org/licenses/by/4.0/,Simple Natural Language Processing Tools for Danish,Leon Derczynski,http://arxiv.org/pdf/1906.11608v2 | |
http://arxiv.org/abs/2210.11416v5,creativecommons.org/licenses/by/4.0/,Scaling Instruction-Finetuned Language Models,Hyung Won Chung and Le Hou and Shayne Longpre and Barret Zoph and Yi Tay and William Fedus and Yunxuan Li and Xuezhi Wang and Mostafa Dehghani and Siddhartha Brahma and Albert Webson and Shixiang Shane Gu and Zhuyun Dai and Mirac Suzgun and Xinyun Chen and Aakanksha Chowdhery and Alex Castro-Ros and Marie Pellat and Kevin Robinson and Dasha Valter and Sharan Narang and Gaurav Mishra and Adams Yu and Vincent Zhao and Yanping Huang and Andrew Dai and Hongkun Yu and Slav Petrov and Ed H. Chi and Jeff Dean and Jacob Devlin and Adam Roberts and Denny Zhou and Quoc V. Le and Jason Wei,http://arxiv.org/pdf/2210.11416v5 | |
http://arxiv.org/abs/2104.12470v5,creativecommons.org/licenses/by/4.0/,Easy and Efficient Transformer : Scalable Inference Solution For large NLP model,Gongzheng Li and Yadong Xi and Jingzhen Ding and Duan Wang and Bai Liu and Changjie Fan and Xiaoxi Mao and Zeng Zhao,http://arxiv.org/pdf/2104.12470v5
http://arxiv.org/abs/2111.14247v2,creativecommons.org/licenses/by/4.0/,A Survey of Large-Scale Deep Learning Serving System Optimization: Challenges and Opportunities,Fuxun Yu and Di Wang and Longfei Shangguan and Minjia Zhang and Xulong Tang and Chenchen Liu and Xiang Chen,http://arxiv.org/pdf/2111.14247v2
http://arxiv.org/abs/2304.07493v1,creativecommons.org/licenses/by/4.0/,OliVe: Accelerating Large Language Models via Hardware-friendly Outlier-Victim Pair Quantization,Cong Guo and Jiaming Tang and Weiming Hu and Jingwen Leng and Chen Zhang and Fan Yang and Yunxin Liu and Minyi Guo and Yuhao Zhu,http://arxiv.org/pdf/2304.07493v1
http://arxiv.org/abs/2304.12198v1,creativecommons.org/licenses/by/4.0/,Performance of ChatGPT on the US Fundamentals of Engineering Exam: Comprehensive Assessment of Proficiency and Potential Implications for Professional Environmental Engineering Practice,Vinay Pursnani and Yusuf Sermet and Ibrahim Demir,http://arxiv.org/pdf/2304.12198v1
http://arxiv.org/abs/2301.01701v2,creativecommons.org/licenses/by/4.0/,Extending Source Code Pre-Trained Language Models to Summarise Decompiled Binaries,Ali Al-Kaswan and Toufique Ahmed and Maliheh Izadi and Anand Ashok Sawant and Premkumar Devanbu and Arie van Deursen,http://arxiv.org/pdf/2301.01701v2
http://arxiv.org/abs/2204.06271v1,creativecommons.org/licenses/by/4.0/,TangoBERT: Reducing Inference Cost by using Cascaded Architecture,Jonathan Mamou and Oren Pereg and Moshe Wasserblat and Roy Schwartz,http://arxiv.org/pdf/2204.06271v1
http://arxiv.org/abs/2211.04939v1,creativecommons.org/licenses/by/4.0/,Efficient Speech Translation with Pre-trained Models,Zhaolin Li and Jan Niehues,http://arxiv.org/pdf/2211.04939v1
http://arxiv.org/abs/2111.02687v1,creativecommons.org/licenses/by/4.0/,CoreLM: Coreference-aware Language Model Fine-Tuning,Nikolaos Stylianou and Ioannis Vlahavas,http://arxiv.org/pdf/2111.02687v1
http://arxiv.org/abs/2303.18190v1,creativecommons.org/licenses/by/4.0/,Assessing Language Model Deployment with Risk Cards,Leon Derczynski and Hannah Rose Kirk and Vidhisha Balachandran and Sachin Kumar and Yulia Tsvetkov and M. R. Leiser and Saif Mohammad,http://arxiv.org/pdf/2303.18190v1
http://arxiv.org/abs/1911.12559v1,creativecommons.org/licenses/by/4.0/,KPTimes: A Large-Scale Dataset for Keyphrase Generation on News Documents,Ygor Gallina and Florian Boudin and Béatrice Daille,http://arxiv.org/pdf/1911.12559v1
http://arxiv.org/abs/2012.07575v2,creativecommons.org/licenses/by/4.0/,Large-scale Quantitative Evidence of Media Impact on Public Opinion toward China,Junming Huang and Gavin Cook and Yu Xie,http://arxiv.org/pdf/2012.07575v2
http://arxiv.org/abs/2103.01242v2,creativecommons.org/licenses/by/4.0/,Cryptonite: A Cryptic Crossword Benchmark for Extreme Ambiguity in Language,Avia Efrat and Uri Shaham and Dan Kilman and Omer Levy,http://arxiv.org/pdf/2103.01242v2
http://arxiv.org/abs/2110.10328v1,creativecommons.org/licenses/by/4.0/,R$^3$Net:Relation-embedded Representation Reconstruction Network for Change Captioning,Yunbin Tu and Liang Li and Chenggang Yan and Shengxiang Gao and Zhengtao Yu,http://arxiv.org/pdf/2110.10328v1
http://arxiv.org/abs/2111.11431v1,creativecommons.org/licenses/by/4.0/,"RedCaps: web-curated image-text data created by the people, for the people",Karan Desai and Gaurav Kaul and Zubin Aysola and Justin Johnson,http://arxiv.org/pdf/2111.11431v1
http://arxiv.org/abs/2203.14371v1,creativecommons.org/licenses/by/4.0/,MedMCQA : A Large-scale Multi-Subject Multi-Choice Dataset for Medical domain Question Answering,Ankit Pal and Logesh Kumar Umapathi and Malaikannan Sankarasubbu,http://arxiv.org/pdf/2203.14371v1
http://arxiv.org/abs/2204.12785v1,creativecommons.org/licenses/by/4.0/,Plug-and-Play Adaptation for Continuously-updated QA,Kyungjae Lee and Wookje Han and Seung-won Hwang and Hwaran Lee and Joonsuk Park and Sang-Woo Lee,http://arxiv.org/pdf/2204.12785v1
http://arxiv.org/abs/2205.04050v1,creativecommons.org/licenses/by/4.0/,Few-shot Mining of Naturally Occurring Inputs and Outputs,Mandar Joshi and Terra Blevins and Mike Lewis and Daniel S. Weld and Luke Zettlemoyer,http://arxiv.org/pdf/2205.04050v1
http://arxiv.org/abs/2205.10153v1,creativecommons.org/licenses/by/4.0/,Mapping Complex Technologies via Science-Technology Linkages; The Case of Neuroscience -- A transformer based keyword extraction approach,Daniel Hain and Roman Jurowetzki and Mariagrazia Squicciarini,http://arxiv.org/pdf/2205.10153v1
http://arxiv.org/abs/2206.07106v1,creativecommons.org/licenses/by/4.0/,NewsEdits: A News Article Revision Dataset and a Document-Level Reasoning Challenge,Alexander Spangher and Xiang Ren and Jonathan May and Nanyun Peng,http://arxiv.org/pdf/2206.07106v1
http://arxiv.org/abs/2210.05257v1,creativecommons.org/licenses/by/4.0/,Rethinking the Event Coding Pipeline with Prompt Entailment,Clément Lefebvre and Niklas Stoehr,http://arxiv.org/pdf/2210.05257v1
http://arxiv.org/abs/2302.14534v1,creativecommons.org/licenses/by/4.0/,Spacerini: Plug-and-play Search Engines with Pyserini and Hugging Face,Christopher Akiki and Odunayo Ogundepo and Aleksandra Piktus and Xinyu Zhang and Akintunde Oladipo and Jimmy Lin and Martin Potthast,http://arxiv.org/pdf/2302.14534v1
http://arxiv.org/abs/2212.08967v1,creativecommons.org/licenses/by/4.0/,"Foundation models in brief: A historical, socio-technical focus",Johannes Schneider,http://arxiv.org/pdf/2212.08967v1
http://arxiv.org/abs/1711.05159v5,creativecommons.org/licenses/by/4.0/,"Classical Control, Quantum Circuits and Linear Logic in Enriched Category Theory",Mathys Rennela and Sam Staton,http://arxiv.org/pdf/1711.05159v5
http://arxiv.org/abs/2109.04727v1,creativecommons.org/licenses/by/4.0/,A Simple and Effective Method To Eliminate the Self Language Bias in Multilingual Representations,Ziyi Yang and Yinfei Yang and Daniel Cer and Eric Darve,http://arxiv.org/pdf/2109.04727v1
http://arxiv.org/abs/2205.12148v3,creativecommons.org/licenses/by/4.0/,Hyper-X: A Unified Hypernetwork for Multi-Task Multilingual Transfer,Ahmet Üstün and Arianna Bisazza and Gosse Bouma and Gertjan van Noord and Sebastian Ruder,http://arxiv.org/pdf/2205.12148v3
http://arxiv.org/abs/2210.11359v1,creativecommons.org/licenses/by/4.0/,Data-Efficient Strategies for Expanding Hate Speech Detection into Under-Resourced Languages,Paul Röttger and Debora Nozza and Federico Bianchi and Dirk Hovy,http://arxiv.org/pdf/2210.11359v1
http://arxiv.org/abs/2211.02136v1,creativecommons.org/licenses/by/4.0/,Logographic Information Aids Learning Better Representations for Natural Language Inference,Zijian Jin and Duygu Ataman,http://arxiv.org/pdf/2211.02136v1
http://arxiv.org/abs/2301.08115v1,creativecommons.org/licenses/by/4.0/,Language Embeddings Sometimes Contain Typological Generalizations,Robert Östling and Murathan Kurfalı,http://arxiv.org/pdf/2301.08115v1
http://arxiv.org/abs/2303.01616v1,creativecommons.org/licenses/by/4.0/,Separated and Shared Effects in Higher-Order Languages,Pedro H. Azevedo de Amorim and Justin Hsu,http://arxiv.org/pdf/2303.01616v1
http://arxiv.org/abs/2104.09777v2,creativecommons.org/licenses/by/4.0/,Subsentence Extraction from Text Using Coverage-Based Deep Learning Language Models,JongYoon Lim and Inkyu Sa and Ho Seok Ahn and Norina Gasteiger and Sanghyub John Lee and Bruce MacDonald,http://arxiv.org/pdf/2104.09777v2
http://arxiv.org/abs/2110.07592v3,creativecommons.org/licenses/by/4.0/,DeToxy: A Large-Scale Multimodal Dataset for Toxicity Classification in Spoken Utterances,Sreyan Ghosh and Samden Lepcha and S Sakshi and Rajiv Ratn Shah and S. Umesh,http://arxiv.org/pdf/2110.07592v3
http://arxiv.org/abs/2201.02662v2,creativecommons.org/licenses/by/4.0/,Imagined versus Remembered Stories: Quantifying Differences in Narrative Flow,Maarten Sap and Anna Jafarpour and Yejin Choi and Noah A. Smith and James W. Pennebaker and Eric Horvitz,http://arxiv.org/pdf/2201.02662v2
http://arxiv.org/abs/2211.12164v1,creativecommons.org/licenses/by/4.0/,OLGA : An Ontology and LSTM-based approach for generating Arithmetic Word Problems (AWPs) of transfer type,Suresh Kumar and P Sreenivasa Kumar,http://arxiv.org/pdf/2211.12164v1
http://arxiv.org/abs/2301.12030v1,creativecommons.org/licenses/by/4.0/,TiLT: A Time-Centric Approach for Stream Query Optimization and Parallelization,Anand Jayarajan and Wei Zhao and Yudi Sun and Gennady Pekhimenko,http://arxiv.org/pdf/2301.12030v1
http://arxiv.org/abs/2301.12867v3,creativecommons.org/licenses/by/4.0/,Exploring AI Ethics of ChatGPT: A Diagnostic Analysis,Terry Yue Zhuo and Yujin Huang and Chunyang Chen and Zhenchang Xing,http://arxiv.org/pdf/2301.12867v3
http://arxiv.org/abs/1507.01701v1,creativecommons.org/licenses/by/4.0/,A Survey and Classification of Controlled Natural Languages,Tobias Kuhn,http://arxiv.org/pdf/1507.01701v1
http://arxiv.org/abs/1711.05468v1,creativecommons.org/licenses/by/4.0/,Tracking Typological Traits of Uralic Languages in Distributed Language Representations,Johannes Bjerva and Isabelle Augenstein,http://arxiv.org/pdf/1711.05468v1
http://arxiv.org/abs/2210.05726v1,creativecommons.org/licenses/by/4.0/,Automatic Speech Recognition of Low-Resource Languages Based on Chukchi,Anastasia Safonova and Tatiana Yudina and Emil Nadimanov and Cydnie Davenport,http://arxiv.org/pdf/2210.05726v1
http://arxiv.org/abs/2106.00143v1,creativecommons.org/licenses/by/4.0/,An Exploratory Analysis of Multilingual Word-Level Quality Estimation with Cross-Lingual Transformers,Tharindu Ranasinghe and Constantin Orasan and Ruslan Mitkov,http://arxiv.org/pdf/2106.00143v1
http://arxiv.org/abs/2110.15621v1,creativecommons.org/licenses/by/4.0/,MentalBERT: Publicly Available Pretrained Language Models for Mental Healthcare,Shaoxiong Ji and Tianlin Zhang and Luna Ansari and Jie Fu and Prayag Tiwari and Erik Cambria,http://arxiv.org/pdf/2110.15621v1
http://arxiv.org/abs/2203.09679v1,creativecommons.org/licenses/by/4.0/,Modeling Intensification for Sign Language Generation: A Computational Approach,Mert İnan and Yang Zhong and Sabit Hassan and Lorna Quandt and Malihe Alikhani,http://arxiv.org/pdf/2203.09679v1
http://arxiv.org/abs/2207.00560v1,creativecommons.org/licenses/by/4.0/,Is neural language acquisition similar to natural? A chronological probing study,Ekaterina Voloshina and Oleg Serikov and Tatiana Shavrina,http://arxiv.org/pdf/2207.00560v1
http://arxiv.org/abs/2207.05666v1,creativecommons.org/licenses/by/4.0/,Zero-shot Cross-lingual Transfer is Under-specified Optimization,Shijie Wu and Benjamin Van Durme and Mark Dredze,http://arxiv.org/pdf/2207.05666v1
http://arxiv.org/abs/2207.10576v2,creativecommons.org/licenses/by/4.0/,Democratizing Ethical Assessment of Natural Language Generation Models,Amin Rasekh and Ian Eisenberg,http://arxiv.org/pdf/2207.10576v2
http://arxiv.org/abs/2208.10801v1,creativecommons.org/licenses/by/4.0/,MATra: A Multilingual Attentive Transliteration System for Indian Scripts,Yash Raj and Bhavesh Laddagiri,http://arxiv.org/pdf/2208.10801v1
http://arxiv.org/abs/2211.08237v2,creativecommons.org/licenses/by/4.0/,Multilingual Speech Emotion Recognition With Multi-Gating Mechanism and Neural Architecture Search,Zihan Wang and Qi Meng and HaiFeng Lan and XinRui Zhang and KeHao Guo and Akshat Gupta,http://arxiv.org/pdf/2211.08237v2
http://arxiv.org/abs/2302.09345v1,creativecommons.org/licenses/by/4.0/,Improving the Out-Of-Distribution Generalization Capability of Language Models: Counterfactually-Augmented Data is not Enough,Caoyun Fan and Wenqing Chen and Jidong Tian and Yitian Li and Hao He and Yaohui Jin,http://arxiv.org/pdf/2302.09345v1
http://arxiv.org/abs/2303.01157v2,creativecommons.org/licenses/by/4.0/,How will Language Modelers like ChatGPT Affect Occupations and Industries?,Ed Felten and Manav Raj and Robert Seamans,http://arxiv.org/pdf/2303.01157v2
http://arxiv.org/abs/2212.00006v1,creativecommons.org/licenses/by/4.0/,"Operationalizing Specifications, In Addition to Test Sets for Evaluating Constrained Generative Models",Vikas Raunak and Matt Post and Arul Menezes,http://arxiv.org/pdf/2212.00006v1
http://arxiv.org/abs/2302.07867v3,creativecommons.org/licenses/by/4.0/,Learning Performance-Improving Code Edits,Aman Madaan and Alexander Shypula and Uri Alon and Milad Hashemi and Parthasarathy Ranganathan and Yiming Yang and Graham Neubig and Amir Yazdanbakhsh,http://arxiv.org/pdf/2302.07867v3
http://arxiv.org/abs/2012.04080v1,creativecommons.org/licenses/by/4.0/,A Taxonomy of Empathetic Response Intents in Human Social Conversations,Anuradha Welivita and Pearl Pu,http://arxiv.org/pdf/2012.04080v1
http://arxiv.org/abs/2303.16985v1,creativecommons.org/licenses/by/4.0/,Adapting to the Low-Resource Double-Bind: Investigating Low-Compute Methods on Low-Resource African Languages,Colin Leong and Herumb Shandilya and Bonaventure F. P. Dossou and Atnafu Lambebo Tonja and Joel Mathew and Abdul-Hakeem Omotayo and Oreen Yousuf and Zainab Akinjobi and Chris Chinenye Emezue and Shamsudeen Muhammad and Steven Kolawole and Younwoo Choi and Tosin Adewumi,http://arxiv.org/pdf/2303.16985v1
http://arxiv.org/abs/2102.00291v1,creativecommons.org/licenses/by/4.0/,Speech Recognition by Simply Fine-tuning BERT,Wen-Chin Huang and Chia-Hua Wu and Shang-Bao Luo and Kuan-Yu Chen and Hsin-Min Wang and Tomoki Toda,http://arxiv.org/pdf/2102.00291v1
http://arxiv.org/abs/2103.05070v1,creativecommons.org/licenses/by/4.0/,Text Simplification by Tagging,Kostiantyn Omelianchuk and Vipul Raheja and Oleksandr Skurzhanskyi,http://arxiv.org/pdf/2103.05070v1
http://arxiv.org/abs/2103.09548v1,creativecommons.org/licenses/by/4.0/,ENCONTER: Entity Constrained Progressive Sequence Generation via Insertion-based Transformer,Lee-Hsun Hsieh and Yang-Yin Lee and Ee-Peng Lim,http://arxiv.org/pdf/2103.09548v1
http://arxiv.org/abs/2104.05696v1,creativecommons.org/licenses/by/4.0/,Joint Universal Syntactic and Semantic Parsing,Elias Stengel-Eskin and Kenton Murray and Sheng Zhang and Aaron Steven White and Benjamin Van Durme,http://arxiv.org/pdf/2104.05696v1
http://arxiv.org/abs/2104.08459v3,creativecommons.org/licenses/by/4.0/,KazakhTTS: An Open-Source Kazakh Text-to-Speech Synthesis Dataset,Saida Mussakhojayeva and Aigerim Janaliyeva and Almas Mirzakhmetov and Yerbolat Khassanov and Huseyin Atakan Varol,http://arxiv.org/pdf/2104.08459v3
http://arxiv.org/abs/2106.00149v1,creativecommons.org/licenses/by/4.0/,HiddenCut: Simple Data Augmentation for Natural Language Understanding with Better Generalization,Jiaao Chen and Dinghan Shen and Weizhu Chen and Diyi Yang,http://arxiv.org/pdf/2106.00149v1
http://arxiv.org/abs/2107.12460v1,creativecommons.org/licenses/by/4.0/,Don't Sweep your Learning Rate under the Rug: A Closer Look at Cross-modal Transfer of Pretrained Transformers,Danielle Rothermel and Margaret Li and Tim Rocktäschel and Jakob Foerster,http://arxiv.org/pdf/2107.12460v1
http://arxiv.org/abs/2107.14600v2,creativecommons.org/licenses/by/4.0/,MDQE: A More Accurate Direct Pretraining for Machine Translation Quality Estimation,Lei Lin,http://arxiv.org/pdf/2107.14600v2
http://arxiv.org/abs/2108.01377v1,creativecommons.org/licenses/by/4.0/,A Dynamic Head Importance Computation Mechanism for Neural Machine Translation,Akshay Goindani and Manish Shrivastava,http://arxiv.org/pdf/2108.01377v1
http://arxiv.org/abs/2108.06614v1,creativecommons.org/licenses/by/4.0/,The SelectGen Challenge: Finding the Best Training Samples for Few-Shot Neural Text Generation,Ernie Chang and Xiaoyu Shen and Alex Marin and Vera Demberg,http://arxiv.org/pdf/2108.06614v1
http://arxiv.org/abs/2108.09164v1,creativecommons.org/licenses/by/4.0/,A Neural Conversation Generation Model via Equivalent Shared Memory Investigation,Changzhen Ji and Yating Zhang and Xiaozhong Liu and Adam Jatowt and Changlong Sun and Conghui Zhu and Tiejun Zhao,http://arxiv.org/pdf/2108.09164v1
http://arxiv.org/abs/2110.01643v1,creativecommons.org/licenses/by/4.0/,Privacy enabled Financial Text Classification using Differential Privacy and Federated Learning,Priyam Basu and Tiasa Singha Roy and Rakshit Naidu and Zumrut Muftuoglu,http://arxiv.org/pdf/2110.01643v1
http://arxiv.org/abs/2110.13495v1,creativecommons.org/licenses/by/4.0/,Assessing the Sufficiency of Arguments through Conclusion Generation,Timon Gurcke and Milad Alshomary and Henning Wachsmuth,http://arxiv.org/pdf/2110.13495v1
http://arxiv.org/abs/2112.01187v1,creativecommons.org/licenses/by/4.0/,Computing Class Hierarchies from Classifiers,Kai Kang and Fangzhen Lin,http://arxiv.org/pdf/2112.01187v1
http://arxiv.org/abs/2201.09523v1,creativecommons.org/licenses/by/4.0/,BTPK-based learning: An Interpretable Method for Named Entity Recognition,Yulin Chen and Zelai Yao and Haixiao Chi and Dov Gabbay and Bo Yuan and Bruno Bentzen and Beishui Liao,http://arxiv.org/pdf/2201.09523v1
http://arxiv.org/abs/2203.09178v1,creativecommons.org/licenses/by/4.0/,Multilingual Detection of Personal Employment Status on Twitter,Manuel Tonneau and Dhaval Adjodah and João Palotti and Nir Grinberg and Samuel Fraiberger,http://arxiv.org/pdf/2203.09178v1
http://arxiv.org/abs/2204.08952v3,creativecommons.org/licenses/by/4.0/,Retrieval Enhanced Data Augmentation for Question Answering on Privacy Policies,Md Rizwan Parvez and Jianfeng Chi and Wasi Uddin Ahmad and Yuan Tian and Kai-Wei Chang,http://arxiv.org/pdf/2204.08952v3
http://arxiv.org/abs/2205.00498v2,creativecommons.org/licenses/by/4.0/,CUP: Curriculum Learning based Prompt Tuning for Implicit Event Argument Extraction,Jiaju Lin and Qin Chen and Jie Zhou and Jian Jin and Liang He,http://arxiv.org/pdf/2205.00498v2
http://arxiv.org/abs/2205.13792v2,creativecommons.org/licenses/by/4.0/,kNN-Prompt: Nearest Neighbor Zero-Shot Inference,Weijia Shi and Julian Michael and Suchin Gururangan and Luke Zettlemoyer,http://arxiv.org/pdf/2205.13792v2
http://arxiv.org/abs/2205.15661v1,creativecommons.org/licenses/by/4.0/,NEWTS: A Corpus for News Topic-Focused Summarization,Seyed Ali Bahrainian and Sheridan Feucht and Carsten Eickhoff,http://arxiv.org/pdf/2205.15661v1
http://arxiv.org/abs/2206.05696v1,creativecommons.org/licenses/by/4.0/,Grounding in social media: An approach to building a chit-chat dialogue model,Ritvik Choudhary and Daisuke Kawahara,http://arxiv.org/pdf/2206.05696v1
http://arxiv.org/abs/2209.01712v1,creativecommons.org/licenses/by/4.0/,ChemBERTa-2: Towards Chemical Foundation Models,Walid Ahmad and Elana Simon and Seyone Chithrananda and Gabriel Grand and Bharath Ramsundar,http://arxiv.org/pdf/2209.01712v1
http://arxiv.org/abs/2209.12687v1,creativecommons.org/licenses/by/4.0/,"A Case Report On ""The A.I. Locked-In Problem"": social concerns with modern NLP",Yoshija Walter,http://arxiv.org/pdf/2209.12687v1
http://arxiv.org/abs/2209.12953v1,creativecommons.org/licenses/by/4.0/,Dialog Acts for Task-Driven Embodied Agents,Spandana Gella and Aishwarya Padmakumar and Patrick Lange and Dilek Hakkani-Tur,http://arxiv.org/pdf/2209.12953v1
http://arxiv.org/abs/2210.00705v2,creativecommons.org/licenses/by/4.0/,SpeechCLIP: Integrating Speech with Pre-Trained Vision and Language Model,Yi-Jen Shih and Hsuan-Fu Wang and Heng-Jui Chang and Layne Berry and Hung-yi Lee and David Harwath,http://arxiv.org/pdf/2210.00705v2
http://arxiv.org/abs/2210.13952v5,creativecommons.org/licenses/by/4.0/,KnowGL: Knowledge Generation and Linking from Text,Gaetano Rossiello and Md Faisal Mahbub Chowdhury and Nandana Mihindukulasooriya and Owen Cornec and Alfio Massimiliano Gliozzo,http://arxiv.org/pdf/2210.13952v5
http://arxiv.org/abs/2210.17525v1,creativecommons.org/licenses/by/4.0/,Query Refinement Prompts for Closed-Book Long-Form Question Answering,Reinald Kim Amplayo and Kellie Webster and Michael Collins and Dipanjan Das and Shashi Narayan,http://arxiv.org/pdf/2210.17525v1
http://arxiv.org/abs/2212.01453v1,creativecommons.org/licenses/by/4.0/,Twitter Data Analysis: Izmir Earthquake Case,Özgür Agrali and Hakan Sökün and Enis Karaarslan,http://arxiv.org/pdf/2212.01453v1
http://arxiv.org/abs/2212.09865v1,creativecommons.org/licenses/by/4.0/,Z-ICL: Zero-Shot In-Context Learning with Pseudo-Demonstrations,Xinxi Lyu and Sewon Min and Iz Beltagy and Luke Zettlemoyer and Hannaneh Hajishirzi,http://arxiv.org/pdf/2212.09865v1
http://arxiv.org/abs/2212.10504v1,creativecommons.org/licenses/by/4.0/,Can Current Task-oriented Dialogue Models Automate Real-world Scenarios in the Wild?,Sang-Woo Lee and Sungdong Kim and Donghyeon Ko and Donghoon Ham and Youngki Hong and Shin Ah Oh and Hyunhoon Jung and Wangkyo Jung and Kyunghyun Cho and Donghyun Kwak and Hyungsuk Noh and Woomyoung Park,http://arxiv.org/pdf/2212.10504v1
http://arxiv.org/abs/2304.01097v2,creativecommons.org/licenses/by/4.0/,DoctorGLM: Fine-tuning your Chinese Doctor is not a Herculean Task,Honglin Xiong and Sheng Wang and Yitao Zhu and Zihao Zhao and Yuxiao Liu and Linlin Huang and Qian Wang and Dinggang Shen,http://arxiv.org/pdf/2304.01097v2
http://arxiv.org/abs/2304.02822v1,creativecommons.org/licenses/by/4.0/,Approach Intelligent Writing Assistants Usability with Seven Stages of Action,Avinash Bhat and Disha Shrivastava and Jin L. C. Guo,http://arxiv.org/pdf/2304.02822v1
http://arxiv.org/abs/2304.09865v1,creativecommons.org/licenses/by/4.0/,Safer Conversational AI as a Source of User Delight,Xiaoding Lu and Aleksey Korshuk and Zongyi Liu and William Beauchamp and Chai Research,http://arxiv.org/pdf/2304.09865v1
http://arxiv.org/abs/2201.01549v4,creativecommons.org/licenses/by/4.0/,SPT-Code: Sequence-to-Sequence Pre-Training for Learning Source Code Representations,Changan Niu and Chuanyi Li and Vincent Ng and Jidong Ge and Liguo Huang and Bin Luo,http://arxiv.org/pdf/2201.01549v4
http://arxiv.org/abs/2206.10265v2,creativecommons.org/licenses/by/4.0/,KnowDA: All-in-One Knowledge Mixture Model for Data Augmentation in Low-Resource NLP,Yufei Wang and Jiayi Zheng and Can Xu and Xiubo Geng and Tao Shen and Chongyang Tao and Daxin Jiang,http://arxiv.org/pdf/2206.10265v2
http://arxiv.org/abs/1805.01952v2,creativecommons.org/licenses/by/4.0/,A Coherent Unsupervised Model for Toponym Resolution,Ehsan Kamalloo and Davood Rafiei,http://arxiv.org/pdf/1805.01952v2
http://arxiv.org/abs/2106.13876v4,creativecommons.org/licenses/by/4.0/,Knowledge-Grounded Self-Rationalization via Extractive and Natural Language Explanations,Bodhisattwa Prasad Majumder and Oana-Maria Camburu and Thomas Lukasiewicz and Julian McAuley,http://arxiv.org/pdf/2106.13876v4
http://arxiv.org/abs/2204.13874v1,creativecommons.org/licenses/by/4.0/,OA-Mine: Open-World Attribute Mining for E-Commerce Products with Weak Supervision,Xinyang Zhang and Chenwei Zhang and Xian Li and Xin Luna Dong and Jingbo Shang and Christos Faloutsos and Jiawei Han,http://arxiv.org/pdf/2204.13874v1
http://arxiv.org/abs/2206.02369v2,creativecommons.org/licenses/by/4.0/,Learning to Break the Loop: Analyzing and Mitigating Repetitions for Neural Text Generation,Jin Xu and Xiaojiang Liu and Jianhao Yan and Deng Cai and Huayang Li and Jian Li,http://arxiv.org/pdf/2206.02369v2
http://arxiv.org/abs/2208.08080v1,creativecommons.org/licenses/by/4.0/,Multimodal Lecture Presentations Dataset: Understanding Multimodality in Educational Slides,Dong Won Lee and Chaitanya Ahuja and Paul Pu Liang and Sanika Natu and Louis-Philippe Morency,http://arxiv.org/pdf/2208.08080v1
http://arxiv.org/abs/2209.05433v2,creativecommons.org/licenses/by/4.0/,FP8 Formats for Deep Learning,Paulius Micikevicius and Dusan Stosic and Neil Burgess and Marius Cornea and Pradeep Dubey and Richard Grisenthwaite and Sangwon Ha and Alexander Heinecke and Patrick Judd and John Kamalu and Naveen Mellempudi and Stuart Oberman and Mohammad Shoeybi and Michael Siu and Hao Wu,http://arxiv.org/pdf/2209.05433v2
http://arxiv.org/abs/2209.15197v1,creativecommons.org/licenses/by/4.0/,Evaluation of taxonomic and neural embedding methods for calculating semantic similarity,Dongqiang Yang and Yanqin Yin,http://arxiv.org/pdf/2209.15197v1
http://arxiv.org/abs/2302.07324v1,creativecommons.org/licenses/by/4.0/,READIN: A Chinese Multi-Task Benchmark with Realistic and Diverse Input Noises,Chenglei Si and Zhengyan Zhang and Yingfa Chen and Xiaozhi Wang and Zhiyuan Liu and Maosong Sun,http://arxiv.org/pdf/2302.07324v1
http://arxiv.org/abs/2303.15078v2,creativecommons.org/licenses/by/4.0/,Large Language Models are Diverse Role-Players for Summarization Evaluation,Ning Wu and Ming Gong and Linjun Shou and Shining Liang and Daxin Jiang,http://arxiv.org/pdf/2303.15078v2
http://arxiv.org/abs/2304.02080v1,creativecommons.org/licenses/by/4.0/,Scalable and Accurate Self-supervised Multimodal Representation Learning without Aligned Video and Text Data,Vladislav Lialin and Stephen Rawls and David Chan and Shalini Ghosh and Anna Rumshisky and Wael Hamza,http://arxiv.org/pdf/2304.02080v1
http://arxiv.org/abs/2111.09453v3,creativecommons.org/licenses/by/4.0/,RoBERTuito: a pre-trained language model for social media text in Spanish,Juan Manuel Pérez and Damián A. Furman and Laura Alonso Alemany and Franco Luque,http://arxiv.org/pdf/2111.09453v3
http://arxiv.org/abs/2104.08666v2,creativecommons.org/licenses/by/4.0/,Worst of Both Worlds: Biases Compound in Pre-trained Vision-and-Language Models,Tejas Srinivasan and Yonatan Bisk,http://arxiv.org/pdf/2104.08666v2
http://arxiv.org/abs/2206.11993v1,creativecommons.org/licenses/by/4.0/,A Disability Lens towards Biases in GPT-3 Generated Open-Ended Languages,Akhter Al Amin and Kazi Sinthia Kabir,http://arxiv.org/pdf/2206.11993v1
http://arxiv.org/abs/2206.01685v2,creativecommons.org/licenses/by/4.0/,Toward a realistic model of speech processing in the brain with self-supervised learning,Juliette Millet and Charlotte Caucheteux and Pierre Orhan and Yves Boubenec and Alexandre Gramfort and Ewan Dunbar and Christophe Pallier and Jean-Remi King,http://arxiv.org/pdf/2206.01685v2
http://arxiv.org/abs/2211.16044v1,creativecommons.org/licenses/by/4.0/,Model Extraction Attack against Self-supervised Speech Models,Tsu-Yuan Hsu and Chen-An Li and Tung-Yu Wu and Hung-yi Lee,http://arxiv.org/pdf/2211.16044v1
http://arxiv.org/abs/2103.07762v2,creativecommons.org/licenses/by/4.0/,OkwuGbé: End-to-End Speech Recognition for Fon and Igbo,Bonaventure F. P. Dossou and Chris C. Emezue,http://arxiv.org/pdf/2103.07762v2
http://arxiv.org/abs/2205.03983v3,creativecommons.org/licenses/by/4.0/,Building Machine Translation Systems for the Next Thousand Languages,Ankur Bapna and Isaac Caswell and Julia Kreutzer and Orhan Firat and Daan van Esch and Aditya Siddhant and Mengmeng Niu and Pallavi Baljekar and Xavier Garcia and Wolfgang Macherey and Theresa Breiner and Vera Axelrod and Jason Riesa and Yuan Cao and Mia Xu Chen and Klaus Macherey and Maxim Krikun and Pidong Wang and Alexander Gutkin and Apurva Shah and Yanping Huang and Zhifeng Chen and Yonghui Wu and Macduff Hughes,http://arxiv.org/pdf/2205.03983v3
http://arxiv.org/abs/2302.01496v1,creativecommons.org/licenses/by/4.0/,Efficient Domain Adaptation for Speech Foundation Models,Bo Li and Dongseong Hwang and Zhouyuan Huo and Junwen Bai and Guru Prakash and Tara N. Sainath and Khe Chai Sim and Yu Zhang and Wei Han and Trevor Strohman and Francoise Beaufays,http://arxiv.org/pdf/2302.01496v1
http://arxiv.org/abs/2303.16133v1,creativecommons.org/licenses/by/4.0/,Exposing and Addressing Cross-Task Inconsistency in Unified Vision-Language Models,Adyasha Maharana and Amita Kamath and Christopher Clark and Mohit Bansal and Aniruddha Kembhavi,http://arxiv.org/pdf/2303.16133v1
http://arxiv.org/abs/2010.02429v3,creativecommons.org/licenses/by/4.0/,Modeling Preconditions in Text with a Crowd-sourced Dataset,Heeyoung Kwon and Mahnaz Koupaee and Pratyush Singh and Gargi Sawhney and Anmol Shukla and Keerthi Kumar Kallur and Nathanael Chambers and Niranjan Balasubramanian,http://arxiv.org/pdf/2010.02429v3
http://arxiv.org/abs/2106.13802v1,creativecommons.org/licenses/by/4.0/,Efficient Document Image Classification Using Region-Based Graph Neural Network,Jaya Krishna Mandivarapu and Eric Bunch and Qian You and Glenn Fung,http://arxiv.org/pdf/2106.13802v1
http://arxiv.org/abs/2107.06785v2,creativecommons.org/licenses/by/4.0/,Large-Scale News Classification using BERT Language Model: Spark NLP Approach,Kuncahyo Setyo Nugroho and Anantha Yullian Sukmadewa and Novanto Yudistira,http://arxiv.org/pdf/2107.06785v2
http://arxiv.org/abs/2112.09118v4,creativecommons.org/licenses/by/4.0/,Unsupervised Dense Information Retrieval with Contrastive Learning,Gautier Izacard and Mathilde Caron and Lucas Hosseini and Sebastian Riedel and Piotr Bojanowski and Armand Joulin and Edouard Grave,http://arxiv.org/pdf/2112.09118v4
http://arxiv.org/abs/2205.15241v2,creativecommons.org/licenses/by/4.0/,Multi-Game Decision Transformers,Kuang-Huei Lee and Ofir Nachum and Mengjiao Yang and Lisa Lee and Daniel Freeman and Winnie Xu and Sergio Guadarrama and Ian Fischer and Eric Jang and Henryk Michalewski and Igor Mordatch,http://arxiv.org/pdf/2205.15241v2
http://arxiv.org/abs/2211.15388v2,creativecommons.org/licenses/by/4.0/,Shifted Diffusion for Text-to-image Generation,Yufan Zhou and Bingchen Liu and Yizhe Zhu and Xiao Yang and Changyou Chen and Jinhui Xu,http://arxiv.org/pdf/2211.15388v2
http://arxiv.org/abs/2102.10094v3,creativecommons.org/licenses/by/4.0/,Formal Language Theory Meets Modern NLP,William Merrill,http://arxiv.org/pdf/2102.10094v3
http://arxiv.org/abs/2203.10321v1,creativecommons.org/licenses/by/4.0/,Sequence-to-Sequence Knowledge Graph Completion and Question Answering,Apoorv Saxena and Adrian Kochsiek and Rainer Gemulla,http://arxiv.org/pdf/2203.10321v1
http://arxiv.org/abs/2206.12866v1,creativecommons.org/licenses/by/4.0/,Contextual embedding and model weighting by fusing domain knowledge on Biomedical Question Answering,Yuxuan Lu and Jingya Yan and Zhixuan Qi and Zhongzheng Ge and Yongping Du,http://arxiv.org/pdf/2206.12866v1
http://arxiv.org/abs/2211.03270v1,creativecommons.org/licenses/by/4.0/,Reconciliation of Pre-trained Models and Prototypical Neural Networks in Few-shot Named Entity Recognition,Youcheng Huang and Wenqiang Lei and Jie Fu and Jiancheng Lv,http://arxiv.org/pdf/2211.03270v1
http://arxiv.org/abs/2011.08242v1,creativecommons.org/licenses/by/4.0/,Opportunities and Challenges for Circuit Board Level Hardware Description Languages,Richard Lin and Björn Hartmann,http://arxiv.org/pdf/2011.08242v1
http://arxiv.org/abs/2012.11995v1,creativecommons.org/licenses/by/4.0/,Pre-Training a Language Model Without Human Language,Cheng-Han Chiang and Hung-yi Lee,http://arxiv.org/pdf/2012.11995v1
http://arxiv.org/abs/2109.00087v3,creativecommons.org/licenses/by/4.0/,It's not Rocket Science : Interpreting Figurative Language in Narratives,Tuhin Chakrabarty and Yejin Choi and Vered Shwartz,http://arxiv.org/pdf/2109.00087v3
http://arxiv.org/abs/2304.00906v1,creativecommons.org/licenses/by/4.0/,ScandEval: A Benchmark for Scandinavian Natural Language Processing,Dan Saattrup Nielsen,http://arxiv.org/pdf/2304.00906v1
http://arxiv.org/abs/2108.09931v1,creativecommons.org/licenses/by/4.0/,"Towards a Formal Modelling, Analysis, and Verification of a Clone Node Attack Detection Scheme in the Internet of Things",Khizar Hameed and Saurabh Garg and Muhammad Bilal Amin and Byeong Kang,http://arxiv.org/pdf/2108.09931v1
http://arxiv.org/abs/2301.00704v1,creativecommons.org/licenses/by/4.0/,Muse: Text-To-Image Generation via Masked Generative Transformers,Huiwen Chang and Han Zhang and Jarred Barber and AJ Maschinot and Jose Lezama and Lu Jiang and Ming-Hsuan Yang and Kevin Murphy and William T. Freeman and Michael Rubinstein and Yuanzhen Li and Dilip Krishnan,http://arxiv.org/pdf/2301.00704v1
http://arxiv.org/abs/2010.13347v1,creativecommons.org/licenses/by/4.0/,"A Language and Methodology based on Scenarios, Grammars and Views, for Administrative Business Processes Modelling",Milliam Maxime Zekeng Ndadji and Maurice Tchoupé Tchendji and Clémentin Tayou Djamegni and Didier Parigot,http://arxiv.org/pdf/2010.13347v1
http://arxiv.org/abs/2012.03864v1,creativecommons.org/licenses/by/4.0/,Evaluating Cross-Lingual Transfer Learning Approaches in Multilingual Conversational Agent Models,Lizhen Tan and Olga Golovneva,http://arxiv.org/pdf/2012.03864v1
http://arxiv.org/abs/2210.15184v2,creativecommons.org/licenses/by/4.0/,Too Brittle To Touch: Comparing the Stability of Quantization and Distillation Towards Developing Lightweight Low-Resource MT Models,Harshita Diddee and Sandipan Dandapat and Monojit Choudhury and Tanuja Ganu and Kalika Bali,http://arxiv.org/pdf/2210.15184v2
http://arxiv.org/abs/2106.03441v3,creativecommons.org/licenses/by/4.0/,Attention Temperature Matters in Abstractive Summarization Distillation,Shengqiang Zhang and Xingxing Zhang and Hangbo Bao and Furu Wei,http://arxiv.org/pdf/2106.03441v3
http://arxiv.org/abs/2006.16370v1,creativecommons.org/licenses/by/4.0/,Classification of cancer pathology reports: a large-scale comparative study,Stefano Martina and Leonardo Ventura and Paolo Frasconi,http://arxiv.org/pdf/2006.16370v1
http://arxiv.org/abs/2210.13312v2,creativecommons.org/licenses/by/4.0/,Neural Theory-of-Mind? On the Limits of Social Intelligence in Large LMs,Maarten Sap and Ronan LeBras and Daniel Fried and Yejin Choi,http://arxiv.org/pdf/2210.13312v2
http://arxiv.org/abs/2112.09866v1,creativecommons.org/licenses/by/4.0/,Cascading Adaptors to Leverage English Data to Improve Performance of Question Answering for Low-Resource Languages,Hariom A. Pandya and Bhavik Ardeshna and Dr. Brijesh S. Bhatt,http://arxiv.org/pdf/2112.09866v1
http://arxiv.org/abs/2203.09435v2,creativecommons.org/licenses/by/4.0/,Expanding Pretrained Models to Thousands More Languages via Lexicon-based Adaptation,Xinyi Wang and Sebastian Ruder and Graham Neubig,http://arxiv.org/pdf/2203.09435v2
http://arxiv.org/abs/2207.05259v1,creativecommons.org/licenses/by/4.0/,Language-Based Causal Representation Learning,Blai Bonet and Hector Geffner,http://arxiv.org/pdf/2207.05259v1
http://arxiv.org/abs/2302.04087v1,creativecommons.org/licenses/by/4.0/,Büchi-like characterizations for Parikh-recognizable omega-languages,Mario Grobler and Sebastian Siebertz,http://arxiv.org/pdf/2302.04087v1
http://arxiv.org/abs/2103.11441v3,creativecommons.org/licenses/by/4.0/,TextFlint: Unified Multilingual Robustness Evaluation Toolkit for Natural Language Processing,Tao Gui and Xiao Wang and Qi Zhang and Qin Liu and Yicheng Zou and Xin Zhou and Rui Zheng and Chong Zhang and Qinzhuo Wu and Jiacheng Ye and Zexiong Pang and Yongxin Zhang and Zhengyan Li and Ruotian Ma and Zichu Fei and Ruijian Cai and Jun Zhao and Xingwu Hu and Zhiheng Yan and Yiding Tan and Yuan Hu and Qiyuan Bian and Zhihua Liu and Bolin Zhu and Shan Qin and Xiaoyu Xing and Jinlan Fu and Yue Zhang and Minlong Peng and Xiaoqing Zheng and Yaqian Zhou and Zhongyu Wei and Xipeng Qiu and Xuanjing Huang,http://arxiv.org/pdf/2103.11441v3
http://arxiv.org/abs/2204.13743v1,creativecommons.org/licenses/by/4.0/,HiNER: A Large Hindi Named Entity Recognition Dataset,Rudra Murthy and Pallab Bhattacharjee and Rahul Sharnagat and Jyotsana Khatri and Diptesh Kanojia and Pushpak Bhattacharyya,http://arxiv.org/pdf/2204.13743v1
http://arxiv.org/abs/2203.02094v2,creativecommons.org/licenses/by/4.0/,LiteTransformerSearch: Training-free Neural Architecture Search for Efficient Language Models,Mojan Javaheripi and Gustavo H. de Rosa and Subhabrata Mukherjee and Shital Shah and Tomasz L. Religa and Caio C. T. Mendes and Sebastien Bubeck and Farinaz Koushanfar and Debadeepta Dey,http://arxiv.org/pdf/2203.02094v2
http://arxiv.org/abs/2011.08626v2,creativecommons.org/licenses/by/4.0/,Neural Semi-supervised Learning for Text Classification Under Large-Scale Pretraining,Zijun Sun and Chun Fan and Xiaofei Sun and Yuxian Meng and Fei Wu and Jiwei Li,http://arxiv.org/pdf/2011.08626v2 | |
http://arxiv.org/abs/2304.03589v1,creativecommons.org/licenses/by/4.0/,On Efficient Training of Large-Scale Deep Learning Models: A Literature Review,Li Shen and Yan Sun and Zhiyuan Yu and Liang Ding and Xinmei Tian and Dacheng Tao,http://arxiv.org/pdf/2304.03589v1 | |
http://arxiv.org/abs/2202.00540v1,creativecommons.org/licenses/by/4.0/,Dominant Set-based Active Learning for Text Classification and its Application to Online Social Media,Toktam A. Oghaz and Ivan Garibay,http://arxiv.org/pdf/2202.00540v1 | |
http://arxiv.org/abs/2205.10487v1,creativecommons.org/licenses/by/4.0/,Scaling Laws and Interpretability of Learning from Repeated Data,Danny Hernandez and Tom Brown and Tom Conerly and Nova DasSarma and Dawn Drain and Sheer El-Showk and Nelson Elhage and Zac Hatfield-Dodds and Tom Henighan and Tristan Hume and Scott Johnston and Ben Mann and Chris Olah and Catherine Olsson and Dario Amodei and Nicholas Joseph and Jared Kaplan and Sam McCandlish,http://arxiv.org/pdf/2205.10487v1 | |
http://arxiv.org/abs/1706.03499v1,creativecommons.org/licenses/by/4.0/,SU-RUG at the CoNLL-SIGMORPHON 2017 shared task: Morphological Inflection with Attentional Sequence-to-Sequence Models,Robert Östling and Johannes Bjerva,http://arxiv.org/pdf/1706.03499v1 | |
http://arxiv.org/abs/1611.02988v1,creativecommons.org/licenses/by/4.0/,Distant supervision for emotion detection using Facebook reactions,Chris Pool and Malvina Nissim,http://arxiv.org/pdf/1611.02988v1 | |
http://arxiv.org/abs/2001.03216v1,creativecommons.org/licenses/by/4.0/,Simulating Lexical Semantic Change from Sense-Annotated Data,Dominik Schlechtweg and Sabine Schulte im Walde,http://arxiv.org/pdf/2001.03216v1 | |
http://arxiv.org/abs/2001.05314v2,creativecommons.org/licenses/by/4.0/,Embedding Compression with Isotropic Iterative Quantization,Siyu Liao and Jie Chen and Yanzhi Wang and Qinru Qiu and Bo Yuan,http://arxiv.org/pdf/2001.05314v2 | |
http://arxiv.org/abs/2009.12695v1,creativecommons.org/licenses/by/4.0/,Techniques to Improve Q&A Accuracy with Transformer-based models on Large Complex Documents,Chejui Liao and Tabish Maniar and Sravanajyothi N and Anantha Sharma,http://arxiv.org/pdf/2009.12695v1 | |
http://arxiv.org/abs/2011.03138v1,creativecommons.org/licenses/by/4.0/,Semi-supervised URL Segmentation with Recurrent Neural Networks Pre-trained on Knowledge Graph Entities,Hao Zhang and Jae Ro and Richard Sproat,http://arxiv.org/pdf/2011.03138v1 | |
http://arxiv.org/abs/2103.07259v1,creativecommons.org/licenses/by/4.0/,Explaining and Improving BERT Performance on Lexical Semantic Change Detection,Severin Laicher and Sinan Kurtyigit and Dominik Schlechtweg and Jonas Kuhn and Sabine Schulte im Walde,http://arxiv.org/pdf/2103.07259v1 | |
http://arxiv.org/abs/2106.03111v1,creativecommons.org/licenses/by/4.0/,Lexical Semantic Change Discovery,Sinan Kurtyigit and Maike Park and Dominik Schlechtweg and Jonas Kuhn and Sabine Schulte im Walde,http://arxiv.org/pdf/2106.03111v1 | |
http://arxiv.org/abs/2106.03161v2,creativecommons.org/licenses/by/4.0/,Identifying Populist Paragraphs in Text: A machine-learning approach,Jogilė Ulinskaitė and Lukas Pukelis,http://arxiv.org/pdf/2106.03161v2 | |
http://arxiv.org/abs/2107.03474v1,creativecommons.org/licenses/by/4.0/,Differentiable Random Access Memory using Lattices,Adam P. Goucher and Rajan Troll,http://arxiv.org/pdf/2107.03474v1 | |
http://arxiv.org/abs/2109.03200v2,creativecommons.org/licenses/by/4.0/,ExCode-Mixed: Explainable Approaches towards Sentiment Analysis on Code-Mixed Data using BERT models,Aman Priyanshu and Aleti Vardhan and Sudarshan Sivakumar and Supriti Vijay and Nipuna Chhabra,http://arxiv.org/pdf/2109.03200v2 | |
http://arxiv.org/abs/2112.02810v1,creativecommons.org/licenses/by/4.0/,An Effective GCN-based Hierarchical Multi-label classification for Protein Function Prediction,Kyudam Choi and Yurim Lee and Cheongwon Kim and Minsung Yoon,http://arxiv.org/pdf/2112.02810v1 | |
http://arxiv.org/abs/2203.01282v2,creativecommons.org/licenses/by/4.0/,py-irt: A Scalable Item Response Theory Library for Python,John P. Lalor and Pedro Rodriguez,http://arxiv.org/pdf/2203.01282v2 | |
http://arxiv.org/abs/2205.01825v1,creativecommons.org/licenses/by/4.0/,AmbiPun: Generating Humorous Puns with Ambiguous Context,Anirudh Mittal and Yufei Tian and Nanyun Peng,http://arxiv.org/pdf/2205.01825v1 | |
http://arxiv.org/abs/2209.00797v2,creativecommons.org/licenses/by/4.0/,"Random Text Perturbations Work, but not Always",Zhengxiang Wang,http://arxiv.org/pdf/2209.00797v2 | |
http://arxiv.org/abs/2209.12614v1,creativecommons.org/licenses/by/4.0/,Identifying epidemic related Tweets using noisy learning,Ramya Tekumalla and Juan M. Banda,http://arxiv.org/pdf/2209.12614v1 | |
http://arxiv.org/abs/2211.04417v1,creativecommons.org/licenses/by/4.0/,nBIIG: A Neural BI Insights Generation System for Table Reporting,Yotam Perlitz and Dafna Sheinwald and Noam Slonim and Michal Shmueli-Scheuer,http://arxiv.org/pdf/2211.04417v1 | |
http://arxiv.org/abs/2304.09172v1,creativecommons.org/licenses/by/4.0/,Hyperbolic Image-Text Representations,Karan Desai and Maximilian Nickel and Tanmay Rajpurohit and Justin Johnson and Ramakrishna Vedantam,http://arxiv.org/pdf/2304.09172v1 | |
http://arxiv.org/abs/2106.05426v4,creativecommons.org/licenses/by/4.0/,Low-Dimensional Structure in the Space of Language Representations is Reflected in Brain Responses,Richard Antonello and Javier Turek and Vy Vo and Alexander Huth,http://arxiv.org/pdf/2106.05426v4 | |
http://arxiv.org/abs/1906.12230v1,creativecommons.org/licenses/by/4.0/,FIESTA: Fast IdEntification of State-of-The-Art models using adaptive bandit algorithms,Henry B. Moss and Andrew Moore and David S. Leslie and Paul Rayson,http://arxiv.org/pdf/1906.12230v1 | |
http://arxiv.org/abs/2003.13198v4,creativecommons.org/licenses/by/4.0/,InterBERT: Vision-and-Language Interaction for Multi-modal Pretraining,Junyang Lin and An Yang and Yichang Zhang and Jie Liu and Jingren Zhou and Hongxia Yang,http://arxiv.org/pdf/2003.13198v4 | |
http://arxiv.org/abs/1911.03268v1,creativecommons.org/licenses/by/4.0/,Inducing brain-relevant bias in natural language processing models,Dan Schwartz and Mariya Toneva and Leila Wehbe,http://arxiv.org/pdf/1911.03268v1 | |
http://arxiv.org/abs/2103.05132v2,creativecommons.org/licenses/by/4.0/,AfriVEC: Word Embedding Models for African Languages. Case Study of Fon and Nobiin,Bonaventure F. P. Dossou and Mohammed Sabry,http://arxiv.org/pdf/2103.05132v2 | |
http://arxiv.org/abs/2210.11621v1,creativecommons.org/licenses/by/4.0/,SMaLL-100: Introducing Shallow Multilingual Machine Translation Model for Low-Resource Languages,Alireza Mohammadshahi and Vassilina Nikoulina and Alexandre Berard and Caroline Brun and James Henderson and Laurent Besacier,http://arxiv.org/pdf/2210.11621v1 | |
http://arxiv.org/abs/2211.11678v1,creativecommons.org/licenses/by/4.0/,Measuring Harmful Representations in Scandinavian Language Models,Samia Touileb and Debora Nozza,http://arxiv.org/pdf/2211.11678v1 | |
http://arxiv.org/abs/2301.12652v2,creativecommons.org/licenses/by/4.0/,REPLUG: Retrieval-Augmented Black-Box Language Models,Weijia Shi and Sewon Min and Michihiro Yasunaga and Minjoon Seo and Rich James and Mike Lewis and Luke Zettlemoyer and Wen-tau Yih,http://arxiv.org/pdf/2301.12652v2 | |
http://arxiv.org/abs/2011.07347v1,creativecommons.org/licenses/by/4.0/,Conditioned Natural Language Generation using only Unconditioned Language Model: An Exploration,Fan-Keng Sun and Cheng-I Lai,http://arxiv.org/pdf/2011.07347v1 | |
http://arxiv.org/abs/2206.11146v1,creativecommons.org/licenses/by/4.0/,Modeling Emergent Lexicon Formation with a Self-Reinforcing Stochastic Process,Brendon Boldt and David Mortensen,http://arxiv.org/pdf/2206.11146v1 | |
http://arxiv.org/abs/2303.15714v1,creativecommons.org/licenses/by/4.0/,Explicit Planning Helps Language Models in Logical Reasoning,Hongyu Zhao and Kangrui Wang and Mo Yu and Hongyuan Mei,http://arxiv.org/pdf/2303.15714v1 | |
http://arxiv.org/abs/1909.05088v1,creativecommons.org/licenses/by/4.0/,Getting Gender Right in Neural Machine Translation,Eva Vanmassenhove and Christian Hardmeier and Andy Way,http://arxiv.org/pdf/1909.05088v1 | |
http://arxiv.org/abs/1807.01784v1,creativecommons.org/licenses/by/4.0/,Program Language Translation Using a Grammar-Driven Tree-to-Tree Model,Mehdi Drissi and Olivia Watkins and Aditya Khant and Vivaswat Ojha and Pedro Sandoval and Rakia Segev and Eric Weiner and Robert Keller,http://arxiv.org/pdf/1807.01784v1 | |
http://arxiv.org/abs/2007.02629v2,creativecommons.org/licenses/by/4.0/,Learning Spoken Language Representations with Neural Lattice Language Modeling,Chao-Wei Huang and Yun-Nung Chen,http://arxiv.org/pdf/2007.02629v2 | |
http://arxiv.org/abs/2010.04482v1,creativecommons.org/licenses/by/4.0/,Word Level Language Identification in English Telugu Code Mixed Data,Sunil Gundapu and Radhika Mamidi,http://arxiv.org/pdf/2010.04482v1 | |
http://arxiv.org/abs/2110.06128v3,creativecommons.org/licenses/by/4.0/,Regionalized models for Spanish language variations based on Twitter,Eric S. Tellez and Daniela Moctezuma and Sabino Miranda and Mario Graff and Guillermo Ruiz,http://arxiv.org/pdf/2110.06128v3 | |
http://arxiv.org/abs/2112.10543v1,creativecommons.org/licenses/by/4.0/,Spiral Language Modeling,Yong Cao and Yukun Feng and Shaohui Kuang and Gu Xu,http://arxiv.org/pdf/2112.10543v1 | |
http://arxiv.org/abs/2202.00794v2,creativecommons.org/licenses/by/4.0/,Learning to pronounce as measuring cross-lingual joint orthography-phonology complexity,Domenic Rosati,http://arxiv.org/pdf/2202.00794v2 | |
http://arxiv.org/abs/2205.01620v2,creativecommons.org/licenses/by/4.0/,Unifying the Convergences in Multilingual Neural Machine Translation,Yichong Huang and Xiaocheng Feng and Xinwei Geng and Bing Qin,http://arxiv.org/pdf/2205.01620v2 | |
http://arxiv.org/abs/2205.12404v3,creativecommons.org/licenses/by/4.0/,FLUTE: Figurative Language Understanding through Textual Explanations,Tuhin Chakrabarty and Arkadiy Saakyan and Debanjan Ghosh and Smaranda Muresan,http://arxiv.org/pdf/2205.12404v3 | |
http://arxiv.org/abs/2207.09157v1,creativecommons.org/licenses/by/4.0/,On the cross-lingual transferability of multilingual prototypical models across NLU tasks,Oralie Cattan and Christophe Servan and Sophie Rosset,http://arxiv.org/pdf/2207.09157v1 | |
http://arxiv.org/abs/2211.03818v1,creativecommons.org/licenses/by/4.0/,CELLS: A Parallel Corpus for Biomedical Lay Language Generation,Yue Guo and Wei Qiu and Gondy Leroy and Sheng Wang and Trevor Cohen,http://arxiv.org/pdf/2211.03818v1 | |
http://arxiv.org/abs/2302.07974v1,creativecommons.org/licenses/by/4.0/,Tree-Based Representation and Generation of Natural and Mathematical Language,Alexander Scarlatos and Andrew Lan,http://arxiv.org/pdf/2302.07974v1 | |
http://arxiv.org/abs/2304.05764v1,creativecommons.org/licenses/by/4.0/,Measuring Normative and Descriptive Biases in Language Models Using Census Data,Samia Touileb and Lilja Øvrelid and Erik Velldal,http://arxiv.org/pdf/2304.05764v1 | |
http://arxiv.org/abs/2011.09567v1,creativecommons.org/licenses/by/4.0/,Predicting metrical patterns in Spanish poetry with language models,Javier de la Rosa and Salvador Ros and Elena González-Blanco,http://arxiv.org/pdf/2011.09567v1 | |
http://arxiv.org/abs/2211.02011v4,creativecommons.org/licenses/by/4.0/,Inverse scaling can become U-shaped,Jason Wei and Najoung Kim and Yi Tay and Quoc V. Le,http://arxiv.org/pdf/2211.02011v4 | |
http://arxiv.org/abs/2103.06922v3,creativecommons.org/licenses/by/4.0/,Towards Interpreting and Mitigating Shortcut Learning Behavior of NLU Models,Mengnan Du and Varun Manjunatha and Rajiv Jain and Ruchi Deshpande and Franck Dernoncourt and Jiuxiang Gu and Tong Sun and Xia Hu,http://arxiv.org/pdf/2103.06922v3 | |
http://arxiv.org/abs/2205.01541v1,creativecommons.org/licenses/by/4.0/,Efficient Fine-Tuning of BERT Models on the Edge,Danilo Vucetic and Mohammadreza Tayaranian and Maryam Ziaeefard and James J. Clark and Brett H. Meyer and Warren J. Gross,http://arxiv.org/pdf/2205.01541v1 | |
http://arxiv.org/abs/2212.09747v1,creativecommons.org/licenses/by/4.0/,Do CoNLL-2003 Named Entity Taggers Still Work Well in 2023?,Shuheng Liu and Alan Ritter,http://arxiv.org/pdf/2212.09747v1 | |
http://arxiv.org/abs/2204.02531v1,creativecommons.org/licenses/by/4.0/,Improving Zero-Shot Event Extraction via Sentence Simplification,Sneha Mehta and Huzefa Rangwala and Naren Ramakrishnan,http://arxiv.org/pdf/2204.02531v1 | |
http://arxiv.org/abs/2202.11844v3,creativecommons.org/licenses/by/4.0/,First is Better Than Last for Language Data Influence,Chih-Kuan Yeh and Ankur Taly and Mukund Sundararajan and Frederick Liu and Pradeep Ravikumar,http://arxiv.org/pdf/2202.11844v3 | |
http://arxiv.org/abs/2212.12017v3,creativecommons.org/licenses/by/4.0/,OPT-IML: Scaling Language Model Instruction Meta Learning through the Lens of Generalization,Srinivasan Iyer and Xi Victoria Lin and Ramakanth Pasunuru and Todor Mihaylov and Daniel Simig and Ping Yu and Kurt Shuster and Tianlu Wang and Qing Liu and Punit Singh Koura and Xian Li and Brian O'Horo and Gabriel Pereyra and Jeff Wang and Christopher Dewan and Asli Celikyilmaz and Luke Zettlemoyer and Ves Stoyanov,http://arxiv.org/pdf/2212.12017v3 | |
http://arxiv.org/abs/2006.07698v2,creativecommons.org/publicdomain/zero/1.0/,Transferring Monolingual Model to Low-Resource Language: The Case of Tigrinya,Abrhalei Tela and Abraham Woubie and Ville Hautamaki,http://arxiv.org/pdf/2006.07698v2 | |
http://arxiv.org/abs/2109.06327v2,creativecommons.org/publicdomain/zero/1.0/,Evaluating Transferability of BERT Models on Uralic Languages,Judit Ács and Dániel Lévai and András Kornai,http://arxiv.org/pdf/2109.06327v2 | |
http://arxiv.org/abs/2205.06885v1,creativecommons.org/publicdomain/zero/1.0/,PathologyBERT -- Pre-trained Vs. A New Transformer Language Model for Pathology Domain,Thiago Santos and Amara Tariq and Susmita Das and Kavyasree Vayalpati and Geoffrey H. Smith and Hari Trivedi and Imon Banerjee,http://arxiv.org/pdf/2205.06885v1 | |
http://arxiv.org/abs/2209.10583v1,creativecommons.org/publicdomain/zero/1.0/,Representing Affect Information in Word Embeddings,Yuhan Zhang and Wenqi Chen and Ruihan Zhang and Xiajie Zhang,http://arxiv.org/pdf/2209.10583v1 | |
http://arxiv.org/abs/2204.04748v1,creativecommons.org/publicdomain/zero/1.0/,Breaking Character: Are Subwords Good Enough for MRLs After All?,Omri Keren and Tal Avinari and Reut Tsarfaty and Omer Levy,http://arxiv.org/pdf/2204.04748v1 | |
http://arxiv.org/abs/1912.13415v1,creativecommons.org/publicdomain/zero/1.0/,End-to-end Named Entity Recognition and Relation Extraction using Pre-trained Language Models,John Giorgi and Xindi Wang and Nicola Sahar and Won Young Shin and Gary D. Bader and Bo Wang,http://arxiv.org/pdf/1912.13415v1 | |
http://arxiv.org/abs/2011.08724v2,creativecommons.org/publicdomain/zero/1.0/,Multi-SQL: An extensible multi-model data query language,Yu Yan and Nan Jiang and Hongzhi Wang and Yutong Wang and Chang Liu and Yuzhuo Wang,http://arxiv.org/pdf/2011.08724v2 | |
http://arxiv.org/abs/2302.00083v1,creativecommons.org/publicdomain/zero/1.0/,In-Context Retrieval-Augmented Language Models,Ori Ram and Yoav Levine and Itay Dalmedigos and Dor Muhlgay and Amnon Shashua and Kevin Leyton-Brown and Yoav Shoham,http://arxiv.org/pdf/2302.00083v1 | |
http://arxiv.org/abs/2303.10131v1,creativecommons.org/publicdomain/zero/1.0/,She Elicits Requirements and He Tests: Software Engineering Gender Bias in Large Language Models,Christoph Treude and Hideaki Hata,http://arxiv.org/pdf/2303.10131v1 | |
http://arxiv.org/abs/2004.13819v1,creativecommons.org/publicdomain/zero/1.0/,Neural Machine Translation for Low-Resourced Indian Languages,Himanshu Choudhary and Shivansh Rao and Rajesh Rohilla,http://arxiv.org/pdf/2004.13819v1 | |
http://arxiv.org/abs/2104.04670v5,creativecommons.org/publicdomain/zero/1.0/,Adapting Language Models for Zero-shot Learning by Meta-tuning on Dataset and Prompt Collections,Ruiqi Zhong and Kristy Lee and Zheng Zhang and Dan Klein,http://arxiv.org/pdf/2104.04670v5 | |
http://arxiv.org/abs/2108.02170v1,creativecommons.org/publicdomain/zero/1.0/,Curriculum learning for language modeling,Daniel Campos,http://arxiv.org/pdf/2108.02170v1 | |
http://arxiv.org/abs/2209.10792v2,creativecommons.org/publicdomain/zero/1.0/,Deep Learning Based Page Creation for Improving E-Commerce Organic Search Traffic,Cheng Jie and Da Xu and Zigeng Wang and Wei Shen,http://arxiv.org/pdf/2209.10792v2 | |
http://arxiv.org/abs/1810.08606v1,creativecommons.org/publicdomain/zero/1.0/,An Exploration of Dropout with RNNs for Natural Language Inference,Amit Gajbhiye and Sardar Jaf and Noura Al Moubayed and A. Stephen McGough and Steven Bradley,http://arxiv.org/pdf/1810.08606v1 | |
http://arxiv.org/abs/2302.13344v1,creativecommons.org/publicdomain/zero/1.0/,Tailoring Language Generation Models under Total Variation Distance,Haozhe Ji and Pei Ke and Zhipeng Hu and Rongsheng Zhang and Minlie Huang,http://arxiv.org/pdf/2302.13344v1 | |
http://arxiv.org/abs/2203.01104v4,creativecommons.org/publicdomain/zero/1.0/,Parameter-Efficient Mixture-of-Experts Architecture for Pre-trained Language Models,Ze-Feng Gao and Peiyu Liu and Wayne Xin Zhao and Zhong-Yi Lu and Ji-Rong Wen,http://arxiv.org/pdf/2203.01104v4 | |
http://arxiv.org/abs/2204.10624v1,creativecommons.org/publicdomain/zero/1.0/,Learning Functional Distributional Semantics with Visual Data,Yinhong Liu and Guy Emerson,http://arxiv.org/pdf/2204.10624v1 | |
http://arxiv.org/abs/2301.05226v1,creativecommons.org/publicdomain/zero/1.0/,"See, Think, Confirm: Interactive Prompting Between Vision and Language Models for Knowledge-based Visual Reasoning",Zhenfang Chen and Qinhong Zhou and Yikang Shen and Yining Hong and Hao Zhang and Chuang Gan,http://arxiv.org/pdf/2301.05226v1 | |
http://arxiv.org/abs/2001.08896v5,creativecommons.org/publicdomain/zero/1.0/,Compressing Language Models using Doped Kronecker Products,Urmish Thakker and Paul N. Whatmough and Zhi-Gang Liu and Matthew Mattina and Jesse Beu,http://arxiv.org/pdf/2001.08896v5 | |
http://arxiv.org/abs/2006.03659v4,creativecommons.org/publicdomain/zero/1.0/,DeCLUTR: Deep Contrastive Learning for Unsupervised Textual Representations,John Giorgi and Osvald Nitski and Bo Wang and Gary Bader,http://arxiv.org/pdf/2006.03659v4 | |
http://arxiv.org/abs/2206.04575v1,creativecommons.org/publicdomain/zero/1.0/,Transformer based Urdu Handwritten Text Optical Character Reader,Mohammad Daniyal Shaiq and Musa Dildar Ahmed Cheema and Ali Kamal,http://arxiv.org/pdf/2206.04575v1 | |
http://arxiv.org/abs/2212.10179v1,creativecommons.org/publicdomain/zero/1.0/,Toward Human-Like Evaluation for Natural Language Generation with Error Analysis,Qingyu Lu and Liang Ding and Liping Xie and Kanjian Zhang and Derek F. Wong and Dacheng Tao,http://arxiv.org/pdf/2212.10179v1 | |
http://arxiv.org/abs/2107.00157v5,creativecommons.org/publicdomain/zero/1.0/,Cross-Lingual Transfer Learning for Statistical Type Inference,Zhiming Li and Xiaofei Xie and Haoliang Li and Zhengzi Xu and Yi Li and Yang Liu,http://arxiv.org/pdf/2107.00157v5 | |
http://arxiv.org/abs/2302.14338v3,creativecommons.org/publicdomain/zero/1.0/,Turning a CLIP Model into a Scene Text Detector,Wenwen Yu and Yuliang Liu and Wei Hua and Deqiang Jiang and Bo Ren and Xiang Bai,http://arxiv.org/pdf/2302.14338v3 | |
http://arxiv.org/abs/2211.10435v2,creativecommons.org/publicdomain/zero/1.0/,PAL: Program-aided Language Models,Luyu Gao and Aman Madaan and Shuyan Zhou and Uri Alon and Pengfei Liu and Yiming Yang and Jamie Callan and Graham Neubig,http://arxiv.org/pdf/2211.10435v2 | |
http://arxiv.org/abs/2109.11314v1,creativecommons.org/publicdomain/zero/1.0/,ParaShoot: A Hebrew Question Answering Dataset,Omri Keren and Omer Levy,http://arxiv.org/pdf/2109.11314v1 | |
http://arxiv.org/abs/2301.09072v1,creativecommons.org/publicdomain/zero/1.0/,ContraBERT: Enhancing Code Pre-trained Models via Contrastive Learning,Shangqing Liu and Bozhi Wu and Xiaofei Xie and Guozhu Meng and Yang Liu,http://arxiv.org/pdf/2301.09072v1 | |
http://arxiv.org/abs/2302.08901v1,creativecommons.org/publicdomain/zero/1.0/,Exploring External Knowledge for Accurate modeling of Visual and Language Problems,Xuewen Yang,http://arxiv.org/pdf/2302.08901v1 | |
http://arxiv.org/abs/2303.13809v1,creativecommons.org/publicdomain/zero/1.0/,Error Analysis Prompting Enables Human-Like Translation Evaluation in Large Language Models: A Case Study on ChatGPT,Qingyu Lu and Baopu Qiu and Liang Ding and Liping Xie and Dacheng Tao,http://arxiv.org/pdf/2303.13809v1 | |
http://arxiv.org/abs/2103.03493v1,creativecommons.org/publicdomain/zero/1.0/,Causal Attention for Vision-Language Tasks,Xu Yang and Hanwang Zhang and Guojun Qi and Jianfei Cai,http://arxiv.org/pdf/2103.03493v1 | |
http://arxiv.org/abs/2207.07039v3,creativecommons.org/publicdomain/zero/1.0/,Convolutional Bypasses Are Better Vision Transformer Adapters,Shibo Jie and Zhi-Hong Deng,http://arxiv.org/pdf/2207.07039v3 | |
http://arxiv.org/abs/2109.13348v2,creativecommons.org/publicdomain/zero/1.0/,Evaluating Biomedical BERT Models for Vocabulary Alignment at Scale in the UMLS Metathesaurus,Goonmeet Bajaj and Vinh Nguyen and Thilini Wijesiriwardene and Hong Yung Yip and Vishesh Javangula and Srinivasan Parthasarathy and Amit Sheth and Olivier Bodenreider,http://arxiv.org/pdf/2109.13348v2 | |
http://arxiv.org/abs/2010.04823v1,creativecommons.org/publicdomain/zero/1.0/,On some representations of context-free languages,Krasimir Yordzhev,http://arxiv.org/pdf/2010.04823v1 | |
http://arxiv.org/abs/2303.09062v1,creativecommons.org/publicdomain/zero/1.0/,Knowledge Transfer for Pseudo-code Generation from Low Resource Programming Language,Ankita Sontakke and Kanika Kalra and Manasi Patwardhan and Lovekesh Vig and Raveendra Kumar Medicherla and Ravindra Naik and Shrishti Pradhan,http://arxiv.org/pdf/2303.09062v1 | |
http://arxiv.org/abs/2205.08514v2,creativecommons.org/publicdomain/zero/1.0/,Recovering Private Text in Federated Learning of Language Models,Samyak Gupta and Yangsibo Huang and Zexuan Zhong and Tianyu Gao and Kai Li and Danqi Chen,http://arxiv.org/pdf/2205.08514v2 | |
http://arxiv.org/abs/2210.05549v1,creativecommons.org/publicdomain/zero/1.0/,Continual Training of Language Models for Few-Shot Learning,Zixuan Ke and Haowei Lin and Yijia Shao and Hu Xu and Lei Shu and Bing Liu,http://arxiv.org/pdf/2210.05549v1 | |
http://arxiv.org/abs/2006.06434v1,creativecommons.org/publicdomain/zero/1.0/,TableQA: a Large-Scale Chinese Text-to-SQL Dataset for Table-Aware SQL Generation,Ningyuan Sun and Xuefeng Yang and Yunfeng Liu,http://arxiv.org/pdf/2006.06434v1 | |
http://arxiv.org/abs/2205.00484v1,creativecommons.org/publicdomain/zero/1.0/,Dynamic Programming in Rank Space: Scaling Structured Inference with Low-Rank HMMs and PCFGs,Songlin Yang and Wei Liu and Kewei Tu,http://arxiv.org/pdf/2205.00484v1 | |
http://arxiv.org/abs/2207.14157v1,creativecommons.org/publicdomain/zero/1.0/,A Hazard Analysis Framework for Code Synthesis Large Language Models,Heidy Khlaaf and Pamela Mishkin and Joshua Achiam and Gretchen Krueger and Miles Brundage,http://arxiv.org/pdf/2207.14157v1 | |
http://arxiv.org/abs/1909.09491v1,creativecommons.org/publicdomain/zero/1.0/,A simple discriminative training method for machine translation with large-scale features,Tian Xia and Shaodan Zhai and Shaojun Wang,http://arxiv.org/pdf/1909.09491v1 | |
http://arxiv.org/abs/2101.00434v2,creativecommons.org/publicdomain/zero/1.0/,Coreference Resolution without Span Representations,Yuval Kirstain and Ori Ram and Omer Levy,http://arxiv.org/pdf/2101.00434v2 | |
http://arxiv.org/abs/2210.02441v3,creativecommons.org/publicdomain/zero/1.0/,Ask Me Anything: A simple strategy for prompting language models,Simran Arora and Avanika Narayan and Mayee F. Chen and Laurel Orr and Neel Guha and Kush Bhatia and Ines Chami and Frederic Sala and Christopher Ré,http://arxiv.org/pdf/2210.02441v3 | |
http://arxiv.org/abs/1912.03334v1,creativecommons.org/publicdomain/zero/1.0/,Explaining Sequence-Level Knowledge Distillation as Data-Augmentation for Neural Machine Translation,Mitchell A. Gordon and Kevin Duh,http://arxiv.org/pdf/1912.03334v1 | |
http://arxiv.org/abs/2201.05337v2,creativecommons.org/publicdomain/zero/1.0/,A Survey of Controllable Text Generation using Transformer-based Pre-trained Language Models,Hanqing Zhang and Haolin Song and Shaoyu Li and Ming Zhou and Dawei Song,http://arxiv.org/pdf/2201.05337v2 | |
http://arxiv.org/abs/2304.09433v2,creativecommons.org/publicdomain/zero/1.0/,Language Models Enable Simple Systems for Generating Structured Views of Heterogeneous Data Lakes,Simran Arora and Brandon Yang and Sabri Eyuboglu and Avanika Narayan and Andrew Hojel and Immanuel Trummer and Christopher Ré,http://arxiv.org/pdf/2304.09433v2 | |
http://arxiv.org/abs/2004.05986v3,creativecommons.org/publicdomain/zero/1.0/,CLUE: A Chinese Language Understanding Evaluation Benchmark,Liang Xu and Hai Hu and Xuanwei Zhang and Lu Li and Chenjie Cao and Yudong Li and Yechen Xu and Kai Sun and Dian Yu and Cong Yu and Yin Tian and Qianqian Dong and Weitang Liu and Bo Shi and Yiming Cui and Junyi Li and Jun Zeng and Rongzhao Wang and Weijian Xie and Yanting Li and Yina Patterson and Zuoyu Tian and Yiwen Zhang and He Zhou and Shaoweihua Liu and Zhe Zhao and Qipeng Zhao and Cong Yue and Xinrui Zhang and Zhengliang Yang and Kyle Richardson and Zhenzhong Lan,http://arxiv.org/pdf/2004.05986v3 | |
http://arxiv.org/abs/2110.05287v1,creativecommons.org/publicdomain/zero/1.0/,TEET! Tunisian Dataset for Toxic Speech Detection,Slim Gharbi and Heger Arfaoui and Hatem Haddad and Mayssa Kchaou,http://arxiv.org/pdf/2110.05287v1 | |
http://arxiv.org/abs/2203.15754v1,creativecommons.org/publicdomain/zero/1.0/,Evaluating Prompts Across Multiple Choice Tasks In a Zero-Shot Setting,Gabriel Orlanski,http://arxiv.org/pdf/2203.15754v1 | |
http://arxiv.org/abs/2210.09304v1,creativecommons.org/publicdomain/zero/1.0/,Non-Contrastive Learning Meets Language-Image Pre-Training,Jinghao Zhou and Li Dong and Zhe Gan and Lijuan Wang and Furu Wei,http://arxiv.org/pdf/2210.09304v1 | |
http://arxiv.org/abs/2209.09444v3,creativecommons.org/publicdomain/zero/1.0/,Vega-MT: The JD Explore Academy Translation System for WMT22,Changtong Zan and Keqin Peng and Liang Ding and Baopu Qiu and Boan Liu and Shwai He and Qingyu Lu and Zheng Zhang and Chuang Liu and Weifeng Liu and Yibing Zhan and Dacheng Tao,http://arxiv.org/pdf/2209.09444v3 | |
http://arxiv.org/abs/2304.11276v1,creativecommons.org/publicdomain/zero/1.0/,The Role of AI in Human-AI Creative Writing for Hong Kong Secondary Students,Hengky Susanto and David James Woo and Kai Guo,http://arxiv.org/pdf/2304.11276v1 | |
http://arxiv.org/abs/2212.09656v1,creativecommons.org/publicdomain/zero/1.0/,Visconde: Multi-document QA with GPT-3 and Neural Reranking,Jayr Pereira and Robson Fidalgo and Roberto Lotufo and Rodrigo Nogueira,http://arxiv.org/pdf/2212.09656v1 | |
http://arxiv.org/abs/2205.09911v2,creativecommons.org/publicdomain/zero/1.0/,Can Foundation Models Wrangle Your Data?,Avanika Narayan and Ines Chami and Laurel Orr and Simran Arora and Christopher Ré,http://arxiv.org/pdf/2205.09911v2 | |
http://arxiv.org/abs/2103.14620v2,creativecommons.org/publicdomain/zero/1.0/,LiGCN: Label-interpretable Graph Convolutional Networks for Multi-label Text Classification,Irene Li and Aosong Feng and Hao Wu and Tianxiao Li and Toyotaro Suzumura and Ruihai Dong,http://arxiv.org/pdf/2103.14620v2
http://arxiv.org/abs/1803.10299v3,creativecommons.org/publicdomain/zero/1.0/,Multi-Modal Data Augmentation for End-to-End ASR,Adithya Renduchintala and Shuoyang Ding and Matthew Wiesner and Shinji Watanabe,http://arxiv.org/pdf/1803.10299v3
http://arxiv.org/abs/2211.07855v1,creativecommons.org/publicdomain/zero/1.0/,Relationship of the language distance to English ability of a country,Cao Xinxin and Lei Xiaolan and Murtadha Ahmed,http://arxiv.org/pdf/2211.07855v1
http://arxiv.org/abs/1908.01841v2,creativecommons.org/publicdomain/zero/1.0/,DLGNet: A Transformer-based Model for Dialogue Response Generation,Oluwatobi Olabiyi and Erik T. Mueller,http://arxiv.org/pdf/1908.01841v2
http://arxiv.org/abs/2207.11716v1,creativecommons.org/publicdomain/zero/1.0/,A Cognitive Study on Semantic Similarity Analysis of Large Corpora: A Transformer-based Approach,Praneeth Nemani and Satyanarayana Vollala,http://arxiv.org/pdf/2207.11716v1
http://arxiv.org/abs/2202.13871v2,creativecommons.org/publicdomain/zero/1.0/,Wastewater Pipe Rating Model Using Natural Language Processing,Sai Nethra Betgeri and Shashank Reddy Vadyala and Dr. John C. Mattews and Dr. Hongfang Lu,http://arxiv.org/pdf/2202.13871v2
http://arxiv.org/abs/2109.12997v1,creativecommons.org/publicdomain/zero/1.0/,Benchmarking the Status of Default Pseudorandom Number Generators in Common Programming Languages,Nils van den Honert and Diederick Vermetten and Anna V. Kononova,http://arxiv.org/pdf/2109.12997v1
http://arxiv.org/abs/1212.6183v1,creativecommons.org/licenses/by-nc-sa/3.0/,The Buffered π-Calculus: A Model for Concurrent Languages,Xiaojie Deng and Yu Zhang and Yuxin Deng and Farong Zhong,http://arxiv.org/pdf/1212.6183v1
http://arxiv.org/abs/2304.11406v1,creativecommons.org/licenses/by-nc-sa/4.0/,LaMP: When Large Language Models Meet Personalization,Alireza Salemi and Sheshera Mysore and Michael Bendersky and Hamed Zamani,http://arxiv.org/pdf/2304.11406v1
http://arxiv.org/abs/1810.12387v1,creativecommons.org/licenses/by-nc-sa/4.0/,Language Modeling with Sparse Product of Sememe Experts,Yihong Gu and Jun Yan and Hao Zhu and Zhiyuan Liu and Ruobing Xie and Maosong Sun and Fen Lin and Leyu Lin,http://arxiv.org/pdf/1810.12387v1
http://arxiv.org/abs/2303.01229v1,creativecommons.org/licenses/by-nc-sa/4.0/,Almanac: Knowledge-Grounded Language Models for Clinical Medicine,Cyril Zakka and Akash Chaurasia and Rohan Shad and William Hiesinger,http://arxiv.org/pdf/2303.01229v1
http://arxiv.org/abs/2006.02144v1,creativecommons.org/licenses/by-nc-sa/4.0/,Transfer Learning for British Sign Language Modelling,Boris Mocialov and Graham Turner and Helen Hastie,http://arxiv.org/pdf/2006.02144v1
http://arxiv.org/abs/2202.07138v2,creativecommons.org/licenses/by-nc-sa/4.0/,Integrating AI Planning with Natural Language Processing: A Combination of Explicit and Tacit Knowledge,Kebing Jin and Hankz Hankui Zhuo,http://arxiv.org/pdf/2202.07138v2
http://arxiv.org/abs/2109.05522v1,creativecommons.org/licenses/by-nc-sa/4.0/,TEASEL: A Transformer-Based Speech-Prefixed Language Model,Mehdi Arjmand and Mohammad Javad Dousti and Hadi Moradi,http://arxiv.org/pdf/2109.05522v1
http://arxiv.org/abs/2210.07054v2,creativecommons.org/licenses/by-nc-sa/4.0/,Scaling Back-Translation with Domain Text Generation for Sign Language Gloss Translation,Jinhui Ye and Wenxiang Jiao and Xing Wang and Zhaopeng Tu,http://arxiv.org/pdf/2210.07054v2
http://arxiv.org/abs/1709.03759v1,creativecommons.org/licenses/by-nc-sa/4.0/,Language Models of Spoken Dutch,Lyan Verwimp and Joris Pelemans and Marieke Lycke and Hugo Van hamme and Patrick Wambacq,http://arxiv.org/pdf/1709.03759v1
http://arxiv.org/abs/2006.02120v1,creativecommons.org/licenses/by-nc-sa/4.0/,Towards Large-Scale Data Mining for Data-Driven Analysis of Sign Languages,Boris Mocialov and Graham Turner and Helen Hastie,http://arxiv.org/pdf/2006.02120v1
http://arxiv.org/abs/2302.09432v2,creativecommons.org/licenses/by-nc-sa/4.0/,"BBT-Fin: Comprehensive Construction of Chinese Financial Domain Pre-trained Language Model, Corpus and Benchmark",Dakuan Lu and Hengkui Wu and Jiaqing Liang and Yipei Xu and Qianyu He and Yipeng Geng and Mengkun Han and Yingsi Xin and Yanghua Xiao,http://arxiv.org/pdf/2302.09432v2
http://arxiv.org/abs/1805.09016v1,creativecommons.org/licenses/by-nc-sa/4.0/,Bilingual Sentiment Embeddings: Joint Projection of Sentiment Across Languages,Jeremy Barnes and Roman Klinger and Sabine Schulte im Walde,http://arxiv.org/pdf/1805.09016v1
http://arxiv.org/abs/2203.13397v1,creativecommons.org/licenses/by-nc-sa/4.0/,GPT-D: Inducing Dementia-related Linguistic Anomalies by Deliberate Degradation of Artificial Neural Language Models,Changye Li and David Knopman and Weizhe Xu and Trevor Cohen and Serguei Pakhomov,http://arxiv.org/pdf/2203.13397v1
http://arxiv.org/abs/2210.05359v1,creativecommons.org/licenses/by-nc-sa/4.0/,Mind's Eye: Grounded Language Model Reasoning through Simulation,Ruibo Liu and Jason Wei and Shixiang Shane Gu and Te-Yen Wu and Soroush Vosoughi and Claire Cui and Denny Zhou and Andrew M. Dai,http://arxiv.org/pdf/2210.05359v1
http://arxiv.org/abs/2112.11070v1,creativecommons.org/licenses/by-nc-sa/4.0/,An Inference Approach To Question Answering Over Knowledge Graphs,Aayushee Gupta and K. M. Annervaz and Ambedkar Dukkipati and Shubhashis Sengupta,http://arxiv.org/pdf/2112.11070v1
http://arxiv.org/abs/2010.12472v2,creativecommons.org/licenses/by-nc-sa/4.0/,HateBERT: Retraining BERT for Abusive Language Detection in English,Tommaso Caselli and Valerio Basile and Jelena Mitrović and Michael Granitzer,http://arxiv.org/pdf/2010.12472v2
http://arxiv.org/abs/2203.06906v1,creativecommons.org/licenses/by-nc-sa/4.0/,PERT: Pre-training BERT with Permuted Language Model,Yiming Cui and Ziqing Yang and Ting Liu,http://arxiv.org/pdf/2203.06906v1
http://arxiv.org/abs/2304.05368v2,creativecommons.org/licenses/by-nc-sa/4.0/,Are Large Language Models Ready for Healthcare? A Comparative Study on Clinical Language Understanding,Yuqing Wang and Yun Zhao and Linda Petzold,http://arxiv.org/pdf/2304.05368v2
http://arxiv.org/abs/2008.06788v2,creativecommons.org/licenses/by-nc-sa/4.0/,Is Supervised Syntactic Parsing Beneficial for Language Understanding? An Empirical Investigation,Goran Glavaš and Ivan Vulić,http://arxiv.org/pdf/2008.06788v2
http://arxiv.org/abs/2211.00083v1,creativecommons.org/licenses/by-nc-sa/4.0/,WHEN FLUE MEETS FLANG: Benchmarks and Large Pre-trained Language Model for Financial Domain,Raj Sanjay Shah and Kunal Chawla and Dheeraj Eidnani and Agam Shah and Wendi Du and Sudheer Chava and Natraj Raman and Charese Smiley and Jiaao Chen and Diyi Yang,http://arxiv.org/pdf/2211.00083v1
http://arxiv.org/abs/2109.03646v1,creativecommons.org/licenses/by-nc-sa/4.0/,Sustainable Modular Debiasing of Language Models,Anne Lauscher and Tobias Lüken and Goran Glavaš,http://arxiv.org/pdf/2109.03646v1
http://arxiv.org/abs/2103.12801v1,creativecommons.org/licenses/by-nc-sa/4.0/,Variable Name Recovery in Decompiled Binary Code using Constrained Masked Language Modeling,Pratyay Banerjee and Kuntal Kumar Pal and Fish Wang and Chitta Baral,http://arxiv.org/pdf/2103.12801v1
http://arxiv.org/abs/2109.05093v1,creativecommons.org/licenses/by-nc-sa/4.0/,PICARD: Parsing Incrementally for Constrained Auto-Regressive Decoding from Language Models,Torsten Scholak and Nathan Schucher and Dzmitry Bahdanau,http://arxiv.org/pdf/2109.05093v1
http://arxiv.org/abs/2205.10661v1,creativecommons.org/licenses/by-nc-sa/4.0/,An Empirical Investigation of Commonsense Self-Supervision with Knowledge Graphs,Jiarui Zhang and Filip Ilievski and Kaixin Ma and Jonathan Francis and Alessandro Oltramari,http://arxiv.org/pdf/2205.10661v1
http://arxiv.org/abs/2208.14493v1,creativecommons.org/licenses/by-nc-sa/4.0/,Annotated Dataset Creation through General Purpose Language Models for non-English Medical NLP,Johann Frei and Frank Kramer,http://arxiv.org/pdf/2208.14493v1
http://arxiv.org/abs/1801.06436v1,creativecommons.org/licenses/by-nc-sa/4.0/,A Resource-Light Method for Cross-Lingual Semantic Textual Similarity,Goran Glavaš and Marc Franco-Salvador and Simone Paolo Ponzetto and Paolo Rosso,http://arxiv.org/pdf/1801.06436v1
http://arxiv.org/abs/2110.03546v2,creativecommons.org/licenses/by-nc-sa/4.0/,mRAT-SQL+GAP:A Portuguese Text-to-SQL Transformer,Marcelo Archanjo José and Fabio Gagliardi Cozman,http://arxiv.org/pdf/2110.03546v2
http://arxiv.org/abs/2108.00801v2,creativecommons.org/licenses/by-nc-sa/4.0/,LICHEE: Improving Language Model Pre-training with Multi-grained Tokenization,Weidong Guo and Mingjun Zhao and Lusheng Zhang and Di Niu and Jinwen Luo and Zhenhua Liu and Zhenyang Li and Jianbo Tang,http://arxiv.org/pdf/2108.00801v2
http://arxiv.org/abs/2210.15133v1,creativecommons.org/licenses/by-nc-sa/4.0/,Retrieval Oriented Masking Pre-training Language Model for Dense Passage Retrieval,Dingkun Long and Yanzhao Zhang and Guangwei Xu and Pengjun Xie,http://arxiv.org/pdf/2210.15133v1
http://arxiv.org/abs/2202.11558v1,creativecommons.org/licenses/by-nc-sa/4.0/,Short-answer scoring with ensembles of pretrained language models,Christopher Ormerod,http://arxiv.org/pdf/2202.11558v1
http://arxiv.org/abs/2205.10012v3,creativecommons.org/licenses/by-nc-sa/4.0/,Descartes: Generating Short Descriptions of Wikipedia Articles,Marija Sakota and Maxime Peyrard and Robert West,http://arxiv.org/pdf/2205.10012v3
http://arxiv.org/abs/2112.08709v2,creativecommons.org/licenses/by-nc-sa/4.0/,DOCmT5: Document-Level Pretraining of Multilingual Language Models,Chia-Hsuan Lee and Aditya Siddhant and Viresh Ratnakar and Melvin Johnson,http://arxiv.org/pdf/2112.08709v2
http://arxiv.org/abs/2204.05939v1,creativecommons.org/licenses/by-nc-sa/4.0/,Mining Logical Event Schemas From Pre-Trained Language Models,Lane Lawley and Lenhart Schubert,http://arxiv.org/pdf/2204.05939v1
http://arxiv.org/abs/2302.05578v2,creativecommons.org/licenses/by-nc-sa/4.0/,Characterizing Attribution and Fluency Tradeoffs for Retrieval-Augmented Large Language Models,Renat Aksitov and Chung-Ching Chang and David Reitter and Siamak Shakeri and Yunhsuan Sung,http://arxiv.org/pdf/2302.05578v2
http://arxiv.org/abs/1907.12009v1,creativecommons.org/licenses/by-nc-sa/4.0/,Representation Degeneration Problem in Training Natural Language Generation Models,Jun Gao and Di He and Xu Tan and Tao Qin and Liwei Wang and Tie-Yan Liu,http://arxiv.org/pdf/1907.12009v1
http://arxiv.org/abs/2111.01582v1,creativecommons.org/licenses/by-nc-sa/4.0/,LMdiff: A Visual Diff Tool to Compare Language Models,Hendrik Strobelt and Benjamin Hoover and Arvind Satyanarayan and Sebastian Gehrmann,http://arxiv.org/pdf/2111.01582v1
http://arxiv.org/abs/2102.10275v1,creativecommons.org/licenses/by-nc-sa/4.0/,An Attention Ensemble Approach for Efficient Text Classification of Indian Languages,Atharva Kulkarni and Amey Hengle and Rutuja Udyawar,http://arxiv.org/pdf/2102.10275v1
http://arxiv.org/abs/2211.16594v3,creativecommons.org/licenses/by-nc-sa/4.0/,Exploiting Category Names for Few-Shot Classification with Vision-Language Models,Taihong Xiao and Zirui Wang and Liangliang Cao and Jiahui Yu and Shengyang Dai and Ming-Hsuan Yang,http://arxiv.org/pdf/2211.16594v3
http://arxiv.org/abs/2202.04173v3,creativecommons.org/licenses/by-nc-sa/4.0/,Exploring the Limits of Domain-Adaptive Training for Detoxifying Large-Scale Language Models,Boxin Wang and Wei Ping and Chaowei Xiao and Peng Xu and Mostofa Patwary and Mohammad Shoeybi and Bo Li and Anima Anandkumar and Bryan Catanzaro,http://arxiv.org/pdf/2202.04173v3
http://arxiv.org/abs/2211.09527v1,creativecommons.org/licenses/by-nc-sa/4.0/,Ignore Previous Prompt: Attack Techniques For Language Models,Fábio Perez and Ian Ribeiro,http://arxiv.org/pdf/2211.09527v1
http://arxiv.org/abs/2108.13169v1,creativecommons.org/licenses/by-nc-sa/4.0/,Enterprise Architecture Model Transformation Engine,Erik Heiland and Peter Hillmann and Andreas Karcher,http://arxiv.org/pdf/2108.13169v1
http://arxiv.org/abs/2109.10234v1,creativecommons.org/licenses/by-nc-sa/4.0/,BERTweetFR : Domain Adaptation of Pre-Trained Language Models for French Tweets,Yanzhu Guo and Virgile Rennard and Christos Xypolopoulos and Michalis Vazirgiannis,http://arxiv.org/pdf/2109.10234v1
http://arxiv.org/abs/2102.12516v1,creativecommons.org/licenses/by-nc-sa/4.0/,"A Large-Scale, Automated Study of Language Surrounding Artificial Intelligence",Autumn Toney,http://arxiv.org/pdf/2102.12516v1
http://arxiv.org/abs/2209.14901v2,creativecommons.org/licenses/by-nc-sa/4.0/,DR.BENCH: Diagnostic Reasoning Benchmark for Clinical Natural Language Processing,Yanjun Gao and Dmitriy Dligach and Timothy Miller and John Caskey and Brihat Sharma and Matthew M Churpek and Majid Afshar,http://arxiv.org/pdf/2209.14901v2
http://arxiv.org/abs/2302.08150v1,creativecommons.org/licenses/by-nc-sa/4.0/,Reanalyzing L2 Preposition Learning with Bayesian Mixed Effects and a Large Language Model,Jakob Prange and Man Ho Ivy Wong,http://arxiv.org/pdf/2302.08150v1
http://arxiv.org/abs/2303.08288v1,creativecommons.org/licenses/by-nc-sa/4.0/,Attention-likelihood relationship in transformers,Valeria Ruscio and Valentino Maiorca and Fabrizio Silvestri,http://arxiv.org/pdf/2303.08288v1
http://arxiv.org/abs/2007.10629v1,creativecommons.org/licenses/by-nc-sa/4.0/,SLNSpeech: solving extended speech separation problem by the help of sign language,Jiasong Wu and Taotao Li and Youyong Kong and Guanyu Yang and Lotfi Senhadji and Huazhong Shu,http://arxiv.org/pdf/2007.10629v1
http://arxiv.org/abs/1912.01580v2,creativecommons.org/licenses/by-nc-sa/4.0/,A Comparative Study of Pretrained Language Models on Thai Social Text Categorization,Thanapapas Horsuwan and Kasidis Kanwatchara and Peerapon Vateekul and Boonserm Kijsirikul,http://arxiv.org/pdf/1912.01580v2
http://arxiv.org/abs/2002.05955v1,creativecommons.org/licenses/by-nc-sa/4.0/,A Data Efficient End-To-End Spoken Language Understanding Architecture,Marco Dinarelli and Nikita Kapoor and Bassam Jabaian and Laurent Besacier,http://arxiv.org/pdf/2002.05955v1
http://arxiv.org/abs/2212.03404v1,creativecommons.org/licenses/by-nc-sa/4.0/,Towards using Few-Shot Prompt Learning for Automating Model Completion,Meriem Ben Chaaben and Lola Burgueño and Houari Sahraoui,http://arxiv.org/pdf/2212.03404v1
http://arxiv.org/abs/2004.15006v2,creativecommons.org/licenses/by-nc-sa/4.0/,Template Guided Text Generation for Task-Oriented Dialogue,Mihir Kale and Abhinav Rastogi,http://arxiv.org/pdf/2004.15006v2
http://arxiv.org/abs/2304.00228v1,creativecommons.org/licenses/by-nc-sa/4.0/,Large language models can rate news outlet credibility,Kai-Cheng Yang and Filippo Menczer,http://arxiv.org/pdf/2304.00228v1
http://arxiv.org/abs/2110.12396v2,creativecommons.org/licenses/by-nc-sa/4.0/,Using Motion History Images with 3D Convolutional Networks in Isolated Sign Language Recognition,Ozge Mercanoglu Sincan and Hacer Yalim Keles,http://arxiv.org/pdf/2110.12396v2
http://arxiv.org/abs/1907.12763v2,creativecommons.org/licenses/by-nc-sa/4.0/,Finding Moments in Video Collections Using Natural Language,Victor Escorcia and Mattia Soldan and Josef Sivic and Bernard Ghanem and Bryan Russell,http://arxiv.org/pdf/1907.12763v2
http://arxiv.org/abs/2211.00609v1,creativecommons.org/licenses/by-nc-sa/4.0/,"A Simple, Yet Effective Approach to Finding Biases in Code Generation",Spyridon Mouselinos and Mateusz Malinowski and Henryk Michalewski,http://arxiv.org/pdf/2211.00609v1
http://arxiv.org/abs/2204.08975v1,creativecommons.org/licenses/by-nc-sa/4.0/,Detecting Text Formality: A Study of Text Classification Approaches,Daryna Dementieva and Ivan Trifinov and Andrey Likhachev and Alexander Panchenko,http://arxiv.org/pdf/2204.08975v1
http://arxiv.org/abs/2302.13939v2,creativecommons.org/licenses/by-nc-sa/4.0/,SpikeGPT: Generative Pre-trained Language Model with Spiking Neural Networks,Rui-Jie Zhu and Qihang Zhao and Jason K. Eshraghian,http://arxiv.org/pdf/2302.13939v2
http://arxiv.org/abs/2106.00241v1,creativecommons.org/licenses/by-nc-sa/4.0/,Reinforced Iterative Knowledge Distillation for Cross-Lingual Named Entity Recognition,Shining Liang and Ming Gong and Jian Pei and Linjun Shou and Wanli Zuo and Xianglin Zuo and Daxin Jiang,http://arxiv.org/pdf/2106.00241v1
http://arxiv.org/abs/2212.05740v1,creativecommons.org/licenses/by-nc-sa/4.0/,Searching for Effective Multilingual Fine-Tuning Methods: A Case Study in Summarization,Yiwei Qin and Graham Neubig and Pengfei Liu,http://arxiv.org/pdf/2212.05740v1
http://arxiv.org/abs/2106.05852v2,creativecommons.org/licenses/by-nc-sa/4.0/,Automatic Speech Recognition in Sanskrit: A New Speech Corpus and Modelling Insights,Devaraja Adiga and Rishabh Kumar and Amrith Krishna and Preethi Jyothi and Ganesh Ramakrishnan and Pawan Goyal,http://arxiv.org/pdf/2106.05852v2
http://arxiv.org/abs/2112.00405v1,creativecommons.org/licenses/by-nc-sa/4.0/,NER-BERT: A Pre-trained Model for Low-Resource Entity Tagging,Zihan Liu and Feijun Jiang and Yuxiang Hu and Chen Shi and Pascale Fung,http://arxiv.org/pdf/2112.00405v1
http://arxiv.org/abs/2206.14366v3,creativecommons.org/licenses/by-nc-sa/4.0/,Knowledge Distillation of Transformer-based Language Models Revisited,Chengqiang Lu and Jianwei Zhang and Yunfei Chu and Zhengyu Chen and Jingren Zhou and Fei Wu and Haiqing Chen and Hongxia Yang,http://arxiv.org/pdf/2206.14366v3
http://arxiv.org/abs/1901.00297v1,creativecommons.org/licenses/by-nc-sa/4.0/,"A Deep Learning Approach for Similar Languages, Varieties and Dialects",Vidya Prasad K and Akarsh S and Vinayakumar R and Soman KP,http://arxiv.org/pdf/1901.00297v1
http://arxiv.org/abs/2108.00356v4,creativecommons.org/licenses/by-nc-sa/4.0/,Improving Social Meaning Detection with Pragmatic Masking and Surrogate Fine-Tuning,Chiyu Zhang and Muhammad Abdul-Mageed,http://arxiv.org/pdf/2108.00356v4
http://arxiv.org/abs/2110.00428v1,creativecommons.org/licenses/by-nc-sa/4.0/,Zero-shot Natural Language Video Localization,Jinwoo Nam and Daechul Ahn and Dongyeop Kang and Seong Jong Ha and Jonghyun Choi,http://arxiv.org/pdf/2110.00428v1
http://arxiv.org/abs/2110.08455v1,creativecommons.org/licenses/by-nc-sa/4.0/,Knowledge Enhanced Pretrained Language Models: A Compreshensive Survey,Xiaokai Wei and Shen Wang and Dejiao Zhang and Parminder Bhatia and Andrew Arnold,http://arxiv.org/pdf/2110.08455v1
http://arxiv.org/abs/2303.07142v3,creativecommons.org/licenses/by-nc-sa/4.0/,Large Language Models in the Workplace: A Case Study on Prompt Engineering for Job Type Classification,Benjamin Clavié and Alexandru Ciceu and Frederick Naylor and Guillaume Soulié and Thomas Brightwell,http://arxiv.org/pdf/2303.07142v3
http://arxiv.org/abs/1909.11687v2,creativecommons.org/licenses/by-nc-sa/4.0/,Extremely Small BERT Models from Mixed-Vocabulary Training,Sanqiang Zhao and Raghav Gupta and Yang Song and Denny Zhou,http://arxiv.org/pdf/1909.11687v2
http://arxiv.org/abs/2105.00824v1,creativecommons.org/licenses/by-nc-sa/4.0/,A Survey of Recent Abstract Summarization Techniques,Diyah Puspitaningrum,http://arxiv.org/pdf/2105.00824v1
http://arxiv.org/abs/2212.08635v1,creativecommons.org/licenses/by-nc-sa/4.0/,Self-Prompting Large Language Models for Open-Domain QA,Junlong Li and Zhuosheng Zhang and Hai Zhao,http://arxiv.org/pdf/2212.08635v1
http://arxiv.org/abs/1701.09123v1,creativecommons.org/licenses/by-nc-sa/4.0/,Robust Multilingual Named Entity Recognition with Shallow Semi-Supervised Features,Rodrigo Agerri and German Rigau,http://arxiv.org/pdf/1701.09123v1
http://arxiv.org/abs/2202.13257v1,creativecommons.org/licenses/by-nc-sa/4.0/,Controllable Natural Language Generation with Contrastive Prefixes,Jing Qian and Li Dong and Yelong Shen and Furu Wei and Weizhu Chen,http://arxiv.org/pdf/2202.13257v1
http://arxiv.org/abs/2303.03290v1,creativecommons.org/licenses/by-nc-sa/4.0/,AmQA: Amharic Question Answering Dataset,Tilahun Abedissa and Ricardo Usbeck and Yaregal Assabie,http://arxiv.org/pdf/2303.03290v1
http://arxiv.org/abs/2304.00634v1,creativecommons.org/licenses/by-nc-sa/4.0/,MMT: A Multilingual and Multi-Topic Indian Social Media Dataset,Dwip Dalal and Vivek Srivastava and Mayank Singh,http://arxiv.org/pdf/2304.00634v1
http://arxiv.org/abs/2203.09866v1,creativecommons.org/licenses/by-nc-sa/4.0/,Under the Morphosyntactic Lens: A Multifaceted Evaluation of Gender Bias in Speech Translation,Beatrice Savoldi and Marco Gaido and Luisa Bentivogli and Matteo Negri and Marco Turchi,http://arxiv.org/pdf/2203.09866v1
http://arxiv.org/abs/2304.01238v2,creativecommons.org/licenses/by-nc-sa/4.0/,Spam-T5: Benchmarking Large Language Models for Few-Shot Email Spam Detection,Maxime Labonne and Sean Moran,http://arxiv.org/pdf/2304.01238v2
http://arxiv.org/abs/2010.01611v2,creativecommons.org/licenses/by-nc-sa/4.0/,"When in Doubt, Ask: Generating Answerable and Unanswerable Questions, Unsupervised",Liubov Nikolenko and Pouya Rezazadeh Kalehbasti,http://arxiv.org/pdf/2010.01611v2
http://arxiv.org/abs/2109.00796v1,creativecommons.org/licenses/by-nc-sa/4.0/,Multi-Modal Zero-Shot Sign Language Recognition,Razieh Rastgoo and Kourosh Kiani and Sergio Escalera and Mohammad Sabokrou,http://arxiv.org/pdf/2109.00796v1
http://arxiv.org/abs/2211.06398v1,creativecommons.org/licenses/by-nc-sa/4.0/,Investigating Fairness Disparities in Peer Review: A Language Model Enhanced Approach,Jiayao Zhang and Hongming Zhang and Zhun Deng and Dan Roth,http://arxiv.org/pdf/2211.06398v1
http://arxiv.org/abs/2210.12770v2,creativecommons.org/licenses/by-nc-sa/4.0/,On Cross-Domain Pre-Trained Language Models for Clinical Text Mining: How Do They Perform on Data-Constrained Fine-Tuning?,Yuping Wu and Lifeng Han and Valerio Antonini and Goran Nenadic,http://arxiv.org/pdf/2210.12770v2
http://arxiv.org/abs/2204.02685v3,creativecommons.org/licenses/by-nc-sa/4.0/,SecureBERT: A Domain-Specific Language Model for Cybersecurity,Ehsan Aghaei and Xi Niu and Waseem Shadid and Ehab Al-Shaer,http://arxiv.org/pdf/2204.02685v3
http://arxiv.org/abs/2203.01111v2,creativecommons.org/licenses/by-nc-sa/4.0/,Large-Scale Hate Speech Detection with Cross-Domain Transfer,Cagri Toraman and Furkan Şahinuç and Eyup Halit Yilmaz,http://arxiv.org/pdf/2203.01111v2
http://arxiv.org/abs/2301.02071v1,creativecommons.org/licenses/by-nc-sa/4.0/,Towards Table-to-Text Generation with Pretrained Language Model: A Table Structure Understanding and Text Deliberating Approach,Miao Chen and Xinjiang Lu and Tong Xu and Yanyan Li and Jingbo Zhou and Dejing Dou and Hui Xiong,http://arxiv.org/pdf/2301.02071v1
http://arxiv.org/abs/2210.01185v1,creativecommons.org/licenses/by-nc-sa/4.0/,ContraGen: Effective Contrastive Learning For Causal Language Model,Nihal Jain and Dejiao Zhang and Wasi Uddin Ahmad and Zijian Wang and Feng Nan and Xiaopeng Li and Ming Tan and Ramesh Nallapati and Baishakhi Ray and Parminder Bhatia and Xiaofei Ma and Bing Xiang,http://arxiv.org/pdf/2210.01185v1
http://arxiv.org/abs/2010.07835v3,creativecommons.org/licenses/by-nc-sa/4.0/,Fine-Tuning Pre-trained Language Model with Weak Supervision: A Contrastive-Regularized Self-Training Approach,Yue Yu and Simiao Zuo and Haoming Jiang and Wendi Ren and Tuo Zhao and Chao Zhang,http://arxiv.org/pdf/2010.07835v3
http://arxiv.org/abs/2202.13758v3,creativecommons.org/licenses/by-nc-sa/4.0/,Logical Fallacy Detection,Zhijing Jin and Abhinav Lalwani and Tejas Vaidhya and Xiaoyu Shen and Yiwen Ding and Zhiheng Lyu and Mrinmaya Sachan and Rada Mihalcea and Bernhard Schölkopf,http://arxiv.org/pdf/2202.13758v3
http://arxiv.org/abs/2304.02697v1,creativecommons.org/licenses/by-nc-sa/4.0/,Revolutionizing Single Cell Analysis: The Power of Large Language Models for Cell Type Annotation,Zehua Zeng and Hongwu Du,http://arxiv.org/pdf/2304.02697v1
http://arxiv.org/abs/2301.04528v1,creativecommons.org/licenses/by-nc-sa/4.0/,The Role of Interactive Visualization in Explaining (Large) NLP Models: from Data to Inference,Richard Brath and Daniel Keim and Johannes Knittel and Shimei Pan and Pia Sommerauer and Hendrik Strobelt,http://arxiv.org/pdf/2301.04528v1
http://arxiv.org/abs/1510.01717v1,creativecommons.org/licenses/by-nc-sa/4.0/,Language Segmentation,David Alfter,http://arxiv.org/pdf/1510.01717v1
http://arxiv.org/abs/2109.03009v1,creativecommons.org/licenses/by-nc-sa/4.0/,Sequential Attention Module for Natural Language Processing,Mengyuan Zhou and Jian Ma and Haiqin Yang and Lianxin Jiang and Yang Mo,http://arxiv.org/pdf/2109.03009v1
http://arxiv.org/abs/2105.05066v1,creativecommons.org/licenses/by-nc-sa/4.0/,"ChaLearn LAP Large Scale Signer Independent Isolated Sign Language Recognition Challenge: Design, Results and Future Research",Ozge Mercanoglu Sincan and Julio C. S. Jacques Junior and Sergio Escalera and Hacer Yalim Keles,http://arxiv.org/pdf/2105.05066v1
http://arxiv.org/abs/2301.02241v1,creativecommons.org/licenses/by-nc-sa/4.0/,CiT: Curation in Training for Effective Vision-Language Data,Hu Xu and Saining Xie and Po-Yao Huang and Licheng Yu and Russell Howes and Gargi Ghosh and Luke Zettlemoyer and Christoph Feichtenhofer,http://arxiv.org/pdf/2301.02241v1
http://arxiv.org/abs/1812.06624v1,creativecommons.org/licenses/by-nc-sa/4.0/,Feature Fusion Effects of Tensor Product Representation on (De)Compositional Network for Caption Generation for Images,Chiranjib Sur,http://arxiv.org/pdf/1812.06624v1
http://arxiv.org/abs/2204.04327v2,creativecommons.org/licenses/by-nc-sa/4.0/,"Show, Don't Tell: Demonstrations Outperform Descriptions for Schema-Guided Task-Oriented Dialogue",Raghav Gupta and Harrison Lee and Jeffrey Zhao and Abhinav Rastogi and Yuan Cao and Yonghui Wu,http://arxiv.org/pdf/2204.04327v2
http://arxiv.org/abs/2304.03472v2,creativecommons.org/licenses/by-nc-sa/4.0/,Does Prompt-Tuning Language Model Ensure Privacy?,Shangyu Xie and Wei Dai and Esha Ghosh and Sambuddha Roy and Dan Schwartz and Kim Laine,http://arxiv.org/pdf/2304.03472v2
http://arxiv.org/abs/2102.07818v2,creativecommons.org/licenses/by-nc-sa/4.0/,Certified Robustness to Programmable Transformations in LSTMs,Yuhao Zhang and Aws Albarghouthi and Loris D'Antoni,http://arxiv.org/pdf/2102.07818v2
http://arxiv.org/abs/2009.06054v1,creativecommons.org/licenses/by-nc-sa/4.0/,Deconstructing Legal Text_Object Oriented Design in Legal Adjudication,Megan Ma and Dmitriy Podkopaev and Avalon Campbell-Cousins and Adam Nicholas,http://arxiv.org/pdf/2009.06054v1
http://arxiv.org/abs/2211.13112v1,creativecommons.org/licenses/by-nc-sa/4.0/,"This is the way: designing and compiling LEPISZCZE, a comprehensive NLP benchmark for Polish",Łukasz Augustyniak and Kamil Tagowski and Albert Sawczyn and Denis Janiak and Roman Bartusiak and Adrian Szymczak and Marcin Wątroba and Arkadiusz Janz and Piotr Szymański and Mikołaj Morzy and Tomasz Kajdanowicz and Maciej Piasecki,http://arxiv.org/pdf/2211.13112v1
http://arxiv.org/abs/2302.04012v1,creativecommons.org/licenses/by-nc-sa/4.0/,Systematically Finding Security Vulnerabilities in Black-Box Code Generation Models,Hossein Hajipour and Thorsten Holz and Lea Schönherr and Mario Fritz,http://arxiv.org/pdf/2302.04012v1
http://arxiv.org/abs/2203.05008v2,creativecommons.org/licenses/by-nc-sa/4.0/,Sentence-Select: Large-Scale Language Model Data Selection for Rare-Word Speech Recognition,W. Ronny Huang and Cal Peyser and Tara N. Sainath and Ruoming Pang and Trevor Strohman and Shankar Kumar,http://arxiv.org/pdf/2203.05008v2
http://arxiv.org/abs/2301.13126v1,creativecommons.org/licenses/by-nc-sa/4.0/,LEXTREME: A Multi-Lingual and Multi-Task Benchmark for the Legal Domain,Joel Niklaus and Veton Matoshi and Pooja Rani and Andrea Galassi and Matthias Stürmer and Ilias Chalkidis,http://arxiv.org/pdf/2301.13126v1
http://arxiv.org/abs/2107.00430v3,creativecommons.org/licenses/by-nc-sa/4.0/,Language-Level Semantics Conditioned 3D Point Cloud Segmentation,Bo Liu and Shuang Deng and Qiulei Dong and Zhanyi Hu,http://arxiv.org/pdf/2107.00430v3
http://arxiv.org/abs/1902.00508v1,creativecommons.org/licenses/by-nc-sa/4.0/,"How to (Properly) Evaluate Cross-Lingual Word Embeddings: On Strong Baselines, Comparative Analyses, and Some Misconceptions",Goran Glavas and Robert Litschko and Sebastian Ruder and Ivan Vulic,http://arxiv.org/pdf/1902.00508v1
http://arxiv.org/abs/2302.02029v1,creativecommons.org/licenses/by-nc-sa/4.0/,Towards Few-Shot Identification of Morality Frames using In-Context Learning,Shamik Roy and Nishanth Sridhar Nakshatri and Dan Goldwasser,http://arxiv.org/pdf/2302.02029v1
http://arxiv.org/abs/2304.08592v1,creativecommons.org/licenses/by-nc-sa/4.0/,Improving Scene Text Recognition for Character-Level Long-Tailed Distribution,Sunghyun Park and Sunghyo Chung and Jungsoo Lee and Jaegul Choo,http://arxiv.org/pdf/2304.08592v1
http://arxiv.org/abs/2302.08575v1,creativecommons.org/licenses/by-nc-sa/4.0/,Foundation Models for Natural Language Processing -- Pre-trained Language Models Integrating Media,Gerhard Paaß and Sven Giesselbach,http://arxiv.org/pdf/2302.08575v1
http://arxiv.org/abs/2008.06121v1,creativecommons.org/licenses/by-nc-sa/4.0/,LSTM Acoustic Models Learn to Align and Pronounce with Graphemes,Arindrima Datta and Guanlong Zhao and Bhuvana Ramabhadran and Eugene Weinstein,http://arxiv.org/pdf/2008.06121v1
http://arxiv.org/abs/2203.11239v1,creativecommons.org/licenses/by-nc-sa/4.0/,DQ-BART: Efficient Sequence-to-Sequence Model via Joint Distillation and Quantization,Zheng Li and Zijian Wang and Ming Tan and Ramesh Nallapati and Parminder Bhatia and Andrew Arnold and Bing Xiang and Dan Roth,http://arxiv.org/pdf/2203.11239v1
http://arxiv.org/abs/2106.08898v1,creativecommons.org/licenses/by-nc-sa/4.0/,RefBERT: Compressing BERT by Referencing to Pre-computed Representations,Xinyi Wang and Haiqin Yang and Liang Zhao and Yang Mo and Jianping Shen,http://arxiv.org/pdf/2106.08898v1
http://arxiv.org/abs/1906.09379v1,creativecommons.org/licenses/by-nc-sa/4.0/,Evaluating Computational Language Models with Scaling Properties of Natural Language,Shuntaro Takahashi and Kumiko Tanaka-Ishii,http://arxiv.org/pdf/1906.09379v1
http://arxiv.org/abs/2303.15669v1,creativecommons.org/licenses/by-nc-sa/4.0/,Unsupervised Pre-Training For Data-Efficient Text-to-Speech On Low Resource Languages,Seongyeon Park and Myungseo Song and Bohyung Kim and Tae-Hyun Oh,http://arxiv.org/pdf/2303.15669v1
http://arxiv.org/abs/2104.00933v1,creativecommons.org/licenses/by-nc-sa/4.0/,Humor@IITK at SemEval-2021 Task 7: Large Language Models for Quantifying Humor and Offensiveness,Aishwarya Gupta and Avik Pal and Bholeshwar Khurana and Lakshay Tyagi and Ashutosh Modi,http://arxiv.org/pdf/2104.00933v1
http://arxiv.org/abs/2209.00981v1,creativecommons.org/licenses/by-nc-sa/4.0/,Exploiting Pretrained Biochemical Language Models for Targeted Drug Design,Gökçe Uludoğan and Elif Ozkirimli and Kutlu O. Ulgen and Nilgün Karalı and Arzucan Özgür,http://arxiv.org/pdf/2209.00981v1
http://arxiv.org/abs/2209.05034v1,creativecommons.org/licenses/by-nc-sa/4.0/,CSL: A Large-scale Chinese Scientific Literature Dataset,Yudong Li and Yuqing Zhang and Zhe Zhao and Linlin Shen and Weijie Liu and Weiquan Mao and Hui Zhang,http://arxiv.org/pdf/2209.05034v1
http://arxiv.org/abs/2304.11060v1,creativecommons.org/licenses/by-nc-sa/4.0/,SkillGPT: a RESTful API service for skill extraction and standardization using a Large Language Model,Nan Li and Bo Kang and Tijl De Bie,http://arxiv.org/pdf/2304.11060v1
http://arxiv.org/abs/2105.05605v1,creativecommons.org/licenses/by-nc-sa/4.0/,Priberam Labs at the NTCIR-15 SHINRA2020-ML: Classification Task,Ruben Cardoso and Afonso Mendes and Andre Lamurias,http://arxiv.org/pdf/2105.05605v1
http://arxiv.org/abs/2209.02267v1,creativecommons.org/licenses/by-nc-sa/4.0/,Entity Aware Syntax Tree Based Data Augmentation for Natural Language Understanding,Jiaxing Xu and Jianbin Cui and Jiangneng Li and Wenge Rong and Noboru Matsuda,http://arxiv.org/pdf/2209.02267v1
http://arxiv.org/abs/2301.10075v1,creativecommons.org/licenses/by-nc-sa/4.0/,From Inclusive Language to Gender-Neutral Machine Translation,Andrea Piergentili and Dennis Fucci and Beatrice Savoldi and Luisa Bentivogli and Matteo Negri,http://arxiv.org/pdf/2301.10075v1
http://arxiv.org/abs/2109.00859v1,creativecommons.org/licenses/by-nc-sa/4.0/,CodeT5: Identifier-aware Unified Pre-trained Encoder-Decoder Models for Code Understanding and Generation,Yue Wang and Weishi Wang and Shafiq Joty and Steven C. H. Hoi,http://arxiv.org/pdf/2109.00859v1
http://arxiv.org/abs/2012.01266v2,creativecommons.org/licenses/by-nc-sa/4.0/,Meta-KD: A Meta Knowledge Distillation Framework for Language Model Compression across Domains,Haojie Pan and Chengyu Wang and Minghui Qiu and Yichang Zhang and Yaliang Li and Jun Huang,http://arxiv.org/pdf/2012.01266v2
http://arxiv.org/abs/2207.06591v3,creativecommons.org/licenses/by-nc-sa/4.0/,A methodology to characterize bias and harmful stereotypes in natural language processing in Latin America,Laura Alonso Alemany and Luciana Benotti and Hernán Maina and Lucía González and Mariela Rajngewerc and Lautaro Martínez and Jorge Sánchez and Mauro Schilman and Guido Ivetta and Alexia Halvorsen and Amanda Mata Rojo and Matías Bordone and Beatriz Busaniche,http://arxiv.org/pdf/2207.06591v3
http://arxiv.org/abs/1903.09442v2,creativecommons.org/licenses/by-nc-sa/4.0/,LINSPECTOR: Multilingual Probing Tasks for Word Representations,Gözde Gül Şahin and Clara Vania and Ilia Kuznetsov and Iryna Gurevych,http://arxiv.org/pdf/1903.09442v2
http://arxiv.org/abs/2301.11564v1,creativecommons.org/licenses/by-nc-sa/4.0/,Learning 6-DoF Fine-grained Grasp Detection Based on Part Affordance Grounding,Yaoxian Song and Penglei Sun and Yi Ren and Yu Zheng and Yue Zhang,http://arxiv.org/pdf/2301.11564v1
http://arxiv.org/abs/2304.02213v5,creativecommons.org/licenses/by-nc-sa/4.0/,Large Language Models as Master Key: Unlocking the Secrets of Materials Science with GPT,Tong Xie and Yuwei Wan and Wei Huang and Yufei Zhou and Yixuan Liu and Qingyuan Linghu and Shaozhou Wang and Chunyu Kit and Clara Grazian and Wenjie Zhang and Bram Hoex,http://arxiv.org/pdf/2304.02213v5
http://arxiv.org/abs/2004.03636v1,creativecommons.org/licenses/by-nc-sa/4.0/,Efficient long-distance relation extraction with DG-SpanBERT,Jun Chen and Robert Hoehndorf and Mohamed Elhoseiny and Xiangliang Zhang,http://arxiv.org/pdf/2004.03636v1
http://arxiv.org/abs/2204.00923v3,creativecommons.org/licenses/by-nc-sa/4.0/,Word separation in continuous sign language using isolated signs and post-processing,Razieh Rastgoo and Kourosh Kiani and Sergio Escalera,http://arxiv.org/pdf/2204.00923v3
http://arxiv.org/abs/2211.07713v1,creativecommons.org/licenses/by-nc-sa/4.0/,How Long Is Enough? Exploring the Optimal Intervals of Long-Range Clinical Note Language Modeling,Samuel Cahyawijaya and Bryan Wilie and Holy Lovenia and Huan Zhong and MingQian Zhong and Yuk-Yu Nancy Ip and Pascale Fung,http://arxiv.org/pdf/2211.07713v1
http://arxiv.org/abs/2301.02773v1,creativecommons.org/licenses/by-nc-sa/4.0/,Building a Parallel Corpus and Training Translation Models Between Luganda and English,Richard Kimera and Daniela N. Rim and Heeyoul Choi,http://arxiv.org/pdf/2301.02773v1
http://arxiv.org/abs/2301.12066v1,creativecommons.org/licenses/by-nc-sa/4.0/,Truth Machines: Synthesizing Veracity in AI Language Models,Luke Munn and Liam Magee and Vanicka Arora,http://arxiv.org/pdf/2301.12066v1
http://arxiv.org/abs/2210.11431v1,creativecommons.org/licenses/by-nc-sa/4.0/,Counterfactual Recipe Generation: Exploring Compositional Generalization in a Realistic Scenario,Xiao Liu and Yansong Feng and Jizhi Tang and Chengang Hu and Dongyan Zhao,http://arxiv.org/pdf/2210.11431v1
http://arxiv.org/abs/1609.06649v1,creativecommons.org/licenses/by-nc-sa/4.0/,Minimally Supervised Written-to-Spoken Text Normalization,Ke Wu and Kyle Gorman and Richard Sproat,http://arxiv.org/pdf/1609.06649v1
http://arxiv.org/abs/2109.13582v2,creativecommons.org/licenses/by-nc-sa/4.0/,PPL-MCTS: Constrained Textual Generation Through Discriminator-Guided MCTS Decoding,Antoine Chaffin and Vincent Claveau and Ewa Kijak,http://arxiv.org/pdf/2109.13582v2 | |
http://arxiv.org/abs/2201.07338v2,creativecommons.org/licenses/by-nc-sa/4.0/,Controllable Protein Design with Language Models,Noelia Ferruz and Birte Höcker,http://arxiv.org/pdf/2201.07338v2 | |
http://arxiv.org/abs/2205.05391v1,creativecommons.org/licenses/by-nc-sa/4.0/,Query-Based Keyphrase Extraction from Long Documents,Martin Docekal and Pavel Smrz,http://arxiv.org/pdf/2205.05391v1 | |
http://arxiv.org/abs/2304.06337v1,creativecommons.org/licenses/by-nc-sa/4.0/,Computational modeling of semantic change,Nina Tahmasebi and Haim Dubossarsky,http://arxiv.org/pdf/2304.06337v1 | |
http://arxiv.org/abs/2101.08106v2,creativecommons.org/licenses/by-nc-sa/4.0/,Learning to Augment for Data-Scarce Domain BERT Knowledge Distillation,Lingyun Feng and Minghui Qiu and Yaliang Li and Hai-Tao Zheng and Ying Shen,http://arxiv.org/pdf/2101.08106v2 | |
http://arxiv.org/abs/2106.04403v2,creativecommons.org/licenses/by-nc-sa/4.0/,SynthRef: Generation of Synthetic Referring Expressions for Object Segmentation,Ioannis Kazakos and Carles Ventura and Miriam Bellver and Carina Silberer and Xavier Giro-i-Nieto,http://arxiv.org/pdf/2106.04403v2 | |
http://arxiv.org/abs/2102.10407v5,creativecommons.org/licenses/by-nc-sa/4.0/,VisualGPT: Data-efficient Adaptation of Pretrained Language Models for Image Captioning,Jun Chen and Han Guo and Kai Yi and Boyang Li and Mohamed Elhoseiny,http://arxiv.org/pdf/2102.10407v5 | |
http://arxiv.org/abs/2301.03029v6,creativecommons.org/licenses/by-nc-sa/4.0/,Topic Modelling of Swedish Newspaper Articles about Coronavirus: a Case Study using Latent Dirichlet Allocation Method,Bernadeta Griciūtė and Lifeng Han and Goran Nenadic,http://arxiv.org/pdf/2301.03029v6 | |
http://arxiv.org/abs/2302.10593v1,creativecommons.org/licenses/by-nc-sa/4.0/,Connecting Humanities and Social Sciences: Applying Language and Speech Technology to Online Panel Surveys,Henk van den Heuvel and Martijn Bentum and Simone Wills and Judith C. Koops,http://arxiv.org/pdf/2302.10593v1 | |
http://arxiv.org/abs/2107.07653v3,creativecommons.org/licenses/by-nc-sa/4.0/,TAPEX: Table Pre-training via Learning a Neural SQL Executor,Qian Liu and Bei Chen and Jiaqi Guo and Morteza Ziyadi and Zeqi Lin and Weizhu Chen and Jian-Guang Lou,http://arxiv.org/pdf/2107.07653v3 | |
http://arxiv.org/abs/1805.06087v1,creativecommons.org/licenses/by-nc-sa/4.0/,Learning to Write with Cooperative Discriminators,Ari Holtzman and Jan Buys and Maxwell Forbes and Antoine Bosselut and David Golub and Yejin Choi,http://arxiv.org/pdf/1805.06087v1 | |
http://arxiv.org/abs/2106.06566v1,creativecommons.org/licenses/by-nc-sa/4.0/,Sample-efficient Linguistic Generalizations through Program Synthesis: Experiments with Phonology Problems,Saujas Vaduguru and Aalok Sathe and Monojit Choudhury and Dipti Misra Sharma,http://arxiv.org/pdf/2106.06566v1 | |
http://arxiv.org/abs/2204.04914v1,creativecommons.org/licenses/by-nc-sa/4.0/,Zero-shot Cross-lingual Conversational Semantic Role Labeling,Han Wu and Haochen Tan and Kun Xu and Shuqi Liu and Lianwei Wu and Linqi Song,http://arxiv.org/pdf/2204.04914v1 | |
http://arxiv.org/abs/2205.00258v2,creativecommons.org/licenses/by-nc-sa/4.0/,EasyNLP: A Comprehensive and Easy-to-use Toolkit for Natural Language Processing,Chengyu Wang and Minghui Qiu and Chen Shi and Taolin Zhang and Tingting Liu and Lei Li and Jianing Wang and Ming Wang and Jun Huang and Wei Lin,http://arxiv.org/pdf/2205.00258v2 | |
http://arxiv.org/abs/2206.07666v1,creativecommons.org/licenses/by-nc-sa/4.0/,Transformer-based Automatic Speech Recognition of Formal and Colloquial Czech in MALACH Project,Jan Lehečka and Josef V. Psutka and Josef Psutka,http://arxiv.org/pdf/2206.07666v1 | |
http://arxiv.org/abs/2210.05883v1,creativecommons.org/licenses/by-nc-sa/4.0/,AD-DROP: Attribution-Driven Dropout for Robust Language Model Fine-Tuning,Tao Yang and Jinghao Deng and Xiaojun Quan and Qifan Wang and Shaoliang Nie,http://arxiv.org/pdf/2210.05883v1 | |
http://arxiv.org/abs/2301.13816v2,creativecommons.org/licenses/by-nc-sa/4.0/,Execution-based Code Generation using Deep Reinforcement Learning,Parshin Shojaee and Aneesh Jain and Sindhu Tipirneni and Chandan K. Reddy,http://arxiv.org/pdf/2301.13816v2 | |
http://arxiv.org/abs/2303.09306v2,creativecommons.org/licenses/by-nc-sa/4.0/,BanglaCoNER: Towards Robust Bangla Complex Named Entity Recognition,HAZ Sameen Shahgir and Ramisa Alam and Md. Zarif Ul Alam,http://arxiv.org/pdf/2303.09306v2 | |
http://arxiv.org/abs/2205.06983v2,creativecommons.org/licenses/by-nc-sa/4.0/,RASAT: Integrating Relational Structures into Pretrained Seq2Seq Model for Text-to-SQL,Jiexing Qi and Jingyao Tang and Ziwei He and Xiangpeng Wan and Yu Cheng and Chenghu Zhou and Xinbing Wang and Quanshi Zhang and Zhouhan Lin,http://arxiv.org/pdf/2205.06983v2 | |
http://arxiv.org/abs/2202.07630v1,creativecommons.org/licenses/by-nc-sa/4.0/,Delving Deeper into Cross-lingual Visual Question Answering,Chen Liu and Jonas Pfeiffer and Anna Korhonen and Ivan Vulic and Iryna Gurevych,http://arxiv.org/pdf/2202.07630v1 | |
http://arxiv.org/abs/2203.10250v1,creativecommons.org/licenses/by-nc-sa/4.0/,Meta-X$_{NLG}$: A Meta-Learning Approach Based on Language Clustering for Zero-Shot Cross-Lingual Transfer and Generation,Kaushal Kumar Maurya and Maunendra Sankar Desarkar,http://arxiv.org/pdf/2203.10250v1 | |
http://arxiv.org/abs/1810.09699v1,creativecommons.org/licenses/by-nc-sa/4.0/,Semi-supervised acoustic model training for speech with code-switching,Emre Yılmaz and Mitchell McLaren and Henk van den Heuvel and David A. van Leeuwen,http://arxiv.org/pdf/1810.09699v1 | |
http://arxiv.org/abs/2002.12683v2,creativecommons.org/licenses/by-nc-sa/4.0/,RP-DNN: A Tweet level propagation context based deep neural networks for early rumor detection in Social Media,Jie Gao and Sooji Han and Xingyi Song and Fabio Ciravegna,http://arxiv.org/pdf/2002.12683v2 | |
http://arxiv.org/abs/2206.07278v1,creativecommons.org/licenses/by-nc-sa/4.0/,Nebula Graph: An open source distributed graph database,Min Wu and Xinglu Yi and Hui Yu and Yu Liu and Yujue Wang,http://arxiv.org/pdf/2206.07278v1 | |
http://arxiv.org/abs/1901.03116v2,creativecommons.org/licenses/by-nc-sa/4.0/,Equalizing Gender Biases in Neural Machine Translation with Word Embeddings Techniques,Joel Escudé Font and Marta R. Costa-jussà,http://arxiv.org/pdf/1901.03116v2 | |
http://arxiv.org/abs/2110.08518v2,creativecommons.org/licenses/by-nc-sa/4.0/,MarkupLM: Pre-training of Text and Markup Language for Visually-rich Document Understanding,Junlong Li and Yiheng Xu and Lei Cui and Furu Wei,http://arxiv.org/pdf/2110.08518v2 | |
http://arxiv.org/abs/2112.06953v1,creativecommons.org/licenses/by-nc-sa/4.0/,Controlled Cue Generation for Play Scripts,Alara Dirik and Hilal Donmez and Pinar Yanardag,http://arxiv.org/pdf/2112.06953v1 | |
http://arxiv.org/abs/2202.00535v2,creativecommons.org/licenses/by-nc-sa/4.0/,Novelty Controlled Paraphrase Generation with Retrieval Augmented Conditional Prompt Tuning,Jishnu Ray Chowdhury and Yong Zhuang and Shuyi Wang,http://arxiv.org/pdf/2202.00535v2 | |
http://arxiv.org/abs/2304.06594v1,creativecommons.org/licenses/by-nc-sa/4.0/,Solving Tensor Low Cycle Rank Approximation,Yichuan Deng and Yeqi Gao and Zhao Song,http://arxiv.org/pdf/2304.06594v1 | |
http://arxiv.org/abs/2212.07016v2,creativecommons.org/licenses/by-nc-sa/4.0/,Understanding Zero-Shot Adversarial Robustness for Large-Scale Models,Chengzhi Mao and Scott Geng and Junfeng Yang and Xin Wang and Carl Vondrick,http://arxiv.org/pdf/2212.07016v2 | |
http://arxiv.org/abs/2104.01619v1,creativecommons.org/licenses/by-nc-sa/4.0/,KnowGraph@IITK at SemEval-2021 Task 11: Building KnowledgeGraph for NLP Research,Shashank Shailabh and Sajal Chaurasia and Ashutosh Modi,http://arxiv.org/pdf/2104.01619v1 | |
http://arxiv.org/abs/2110.08395v2,creativecommons.org/licenses/by-nc-sa/4.0/,DS-TOD: Efficient Domain Specialization for Task Oriented Dialog,Chia-Chien Hung and Anne Lauscher and Simone Paolo Ponzetto and Goran Glavaš,http://arxiv.org/pdf/2110.08395v2 | |
http://arxiv.org/abs/2302.09051v4,creativecommons.org/licenses/by-nc-sa/4.0/,"Complex QA and language models hybrid architectures, Survey",Xavier Daull and Patrice Bellot and Emmanuel Bruno and Vincent Martin and Elisabeth Murisasco,http://arxiv.org/pdf/2302.09051v4 | |
http://arxiv.org/abs/2110.07816v1,creativecommons.org/licenses/by-nc-sa/4.0/,Multilingual Neural Machine Translation: Can Linguistic Hierarchies Help?,Fahimeh Saleh and Wray Buntine and Gholamreza Haffari and Lan Du,http://arxiv.org/pdf/2110.07816v1 | |
http://arxiv.org/abs/2304.00472v1,creativecommons.org/licenses/by-nc-sa/4.0/,Querying Large Language Models with SQL,Mohammed Saeed and Nicola De Cao and Paolo Papotti,http://arxiv.org/pdf/2304.00472v1 | |
http://arxiv.org/abs/1712.05972v2,creativecommons.org/licenses/by-nc-sa/4.0/,"Train Once, Test Anywhere: Zero-Shot Learning for Text Classification",Pushpankar Kumar Pushp and Muktabh Mayank Srivastava,http://arxiv.org/pdf/1712.05972v2 | |
http://arxiv.org/abs/2109.10282v5,creativecommons.org/licenses/by-nc-sa/4.0/,TrOCR: Transformer-based Optical Character Recognition with Pre-trained Models,Minghao Li and Tengchao Lv and Jingye Chen and Lei Cui and Yijuan Lu and Dinei Florencio and Cha Zhang and Zhoujun Li and Furu Wei,http://arxiv.org/pdf/2109.10282v5 | |
http://arxiv.org/abs/2203.07996v2,creativecommons.org/licenses/by-nc-sa/4.0/,Leveraging Unimodal Self-Supervised Learning for Multimodal Audio-Visual Speech Recognition,Xichen Pan and Peiyu Chen and Yichen Gong and Helong Zhou and Xinbing Wang and Zhouhan Lin,http://arxiv.org/pdf/2203.07996v2 | |
http://arxiv.org/abs/2109.08270v3,creativecommons.org/licenses/by-nc-sa/4.0/,Language Models as a Knowledge Source for Cognitive Agents,"Robert E. Wray, III and James R. Kirk and John E. Laird",http://arxiv.org/pdf/2109.08270v3 | |
http://arxiv.org/abs/2210.12460v1,creativecommons.org/licenses/by-nc-sa/4.0/,Collaborative Reasoning on Multi-Modal Semantic Graphs for Video-Grounded Dialogue Generation,Xueliang Zhao and Yuxuan Wang and Chongyang Tao and Chenshuo Wang and Dongyan Zhao,http://arxiv.org/pdf/2210.12460v1 | |
http://arxiv.org/abs/2108.03533v3,creativecommons.org/licenses/by-nc-sa/4.0/,Improving Similar Language Translation With Transfer Learning,Ife Adebara and Muhammad Abdul-Mageed,http://arxiv.org/pdf/2108.03533v3 | |
http://arxiv.org/abs/2004.00139v1,creativecommons.org/licenses/by-nc-sa/4.0/,A Swiss German Dictionary: Variation in Speech and Writing,Larissa Schmidt and Lucy Linder and Sandra Djambazovska and Alexandros Lazaridis and Tanja Samardžić and Claudiu Musat,http://arxiv.org/pdf/2004.00139v1 | |
http://arxiv.org/abs/2105.09043v2,creativecommons.org/licenses/by-nc-sa/4.0/,Sentence Extraction-Based Machine Reading Comprehension for Vietnamese,Phong Nguyen-Thuan Do and Nhat Duy Nguyen and Tin Van Huynh and Kiet Van Nguyen and Anh Gia-Tuan Nguyen and Ngan Luu-Thuy Nguyen,http://arxiv.org/pdf/2105.09043v2 | |
http://arxiv.org/abs/2105.14779v2,creativecommons.org/licenses/by-nc-sa/4.0/,Towards One Model to Rule All: Multilingual Strategy for Dialectal Code-Switching Arabic ASR,Shammur Absar Chowdhury and Amir Hussein and Ahmed Abdelali and Ahmed Ali,http://arxiv.org/pdf/2105.14779v2 | |
http://arxiv.org/abs/1904.02818v1,creativecommons.org/licenses/by-nc-sa/4.0/,Neural Networks for Modeling Source Code Edits,Rui Zhao and David Bieber and Kevin Swersky and Daniel Tarlow,http://arxiv.org/pdf/1904.02818v1 | |
http://arxiv.org/abs/2205.14583v2,creativecommons.org/licenses/by-nc-sa/4.0/,Learning Locality and Isotropy in Dialogue Modeling,Han Wu and Haochen Tan and Mingjie Zhan and Gangming Zhao and Shaoqing Lu and Ding Liang and Linqi Song,http://arxiv.org/pdf/2205.14583v2 | |
http://arxiv.org/abs/2107.00281v3,creativecommons.org/licenses/by-nc-sa/4.0/,Scientia Potentia Est -- On the Role of Knowledge in Computational Argumentation,Anne Lauscher and Henning Wachsmuth and Iryna Gurevych and Goran Glavaš,http://arxiv.org/pdf/2107.00281v3 | |
http://arxiv.org/abs/1908.01341v1,creativecommons.org/licenses/by-nc-sa/4.0/,SF-Net: Structured Feature Network for Continuous Sign Language Recognition,Zhaoyang Yang and Zhenmei Shi and Xiaoyong Shen and Yu-Wing Tai,http://arxiv.org/pdf/1908.01341v1 | |
http://arxiv.org/abs/2110.07938v1,creativecommons.org/licenses/by-nc-sa/4.0/,Identifying Causal Influences on Publication Trends and Behavior: A Case Study of the Computational Linguistics Community,Maria Glenski and Svitlana Volkova,http://arxiv.org/pdf/2110.07938v1 | |
http://arxiv.org/abs/2111.01340v2,creativecommons.org/licenses/by-nc-sa/4.0/,Adapting to the Long Tail: A Meta-Analysis of Transfer Learning Research for Language Understanding Tasks,Aakanksha Naik and Jill Lehman and Carolyn Rose,http://arxiv.org/pdf/2111.01340v2 | |
http://arxiv.org/abs/2111.08545v1,creativecommons.org/licenses/by-nc-sa/4.0/,Coral: An Approach for Conversational Agents in Mental Health Applications,Harsh Sakhrani and Saloni Parekh and Shubham Mahajan,http://arxiv.org/pdf/2111.08545v1 | |
http://arxiv.org/abs/2206.08474v1,creativecommons.org/licenses/by-nc-sa/4.0/,XLCoST: A Benchmark Dataset for Cross-lingual Code Intelligence,Ming Zhu and Aneesh Jain and Karthik Suresh and Roshan Ravindran and Sindhu Tipirneni and Chandan K. Reddy,http://arxiv.org/pdf/2206.08474v1 | |
http://arxiv.org/abs/2210.07095v2,creativecommons.org/licenses/by-nc-sa/4.0/,Incorporating Context into Subword Vocabularies,Shaked Yehezkel and Yuval Pinter,http://arxiv.org/pdf/2210.07095v2 | |
http://arxiv.org/abs/2303.06273v1,creativecommons.org/licenses/by-nc-sa/4.0/,Consistency Analysis of ChatGPT,Myeongjun Jang and Thomas Lukasiewicz,http://arxiv.org/pdf/2303.06273v1 | |
http://arxiv.org/abs/2212.10015v1,creativecommons.org/licenses/by-nc-sa/4.0/,Benchmarking Spatial Relationships in Text-to-Image Generation,Tejas Gokhale and Hamid Palangi and Besmira Nushi and Vibhav Vineet and Eric Horvitz and Ece Kamar and Chitta Baral and Yezhou Yang,http://arxiv.org/pdf/2212.10015v1 | |
http://arxiv.org/abs/2301.11749v1,creativecommons.org/licenses/by-nc-sa/4.0/,A Multi-task Multi-stage Transitional Training Framework for Neural Chat Translation,Chulun Zhou and Yunlong Liang and Fandong Meng and Jie Zhou and Jinan Xu and Hongji Wang and Min Zhang and Jinsong Su,http://arxiv.org/pdf/2301.11749v1 | |
http://arxiv.org/abs/1909.05855v2,creativecommons.org/licenses/by-nc-sa/4.0/,Towards Scalable Multi-domain Conversational Agents: The Schema-Guided Dialogue Dataset,Abhinav Rastogi and Xiaoxue Zang and Srinivas Sunkara and Raghav Gupta and Pranav Khaitan,http://arxiv.org/pdf/1909.05855v2 | |
http://arxiv.org/abs/2106.00245v2,creativecommons.org/licenses/by-nc-sa/4.0/,Adversarial VQA: A New Benchmark for Evaluating the Robustness of VQA Models,Linjie Li and Jie Lei and Zhe Gan and Jingjing Liu,http://arxiv.org/pdf/2106.00245v2 | |
http://arxiv.org/abs/2006.07490v1,creativecommons.org/licenses/by-nc-sa/4.0/,Understanding Unintended Memorization in Federated Learning,Om Thakkar and Swaroop Ramaswamy and Rajiv Mathews and Françoise Beaufays,http://arxiv.org/pdf/2006.07490v1 | |
http://arxiv.org/abs/2210.04802v1,creativecommons.org/licenses/by-nc-sa/4.0/,SimSCOOD: Systematic Analysis of Out-of-Distribution Behavior of Source Code Models,Hossein Hajipour and Ning Yu and Cristian-Alexandru Staicu and Mario Fritz,http://arxiv.org/pdf/2210.04802v1 | |
http://arxiv.org/abs/2304.03277v1,creativecommons.org/licenses/by-nc-sa/4.0/,Instruction Tuning with GPT-4,Baolin Peng and Chunyuan Li and Pengcheng He and Michel Galley and Jianfeng Gao,http://arxiv.org/pdf/2304.03277v1 | |
http://arxiv.org/abs/2112.11914v1,creativecommons.org/licenses/by-nc-sa/4.0/,Assisted Text Annotation Using Active Learning to Achieve High Quality with Little Effort,Franziska Weeber and Felix Hamborg and Karsten Donnay and Bela Gipp,http://arxiv.org/pdf/2112.11914v1 | |
http://arxiv.org/abs/2210.03235v3,creativecommons.org/licenses/by-nc-sa/4.0/,Improving Large-scale Paraphrase Acquisition and Generation,Yao Dou and Chao Jiang and Wei Xu,http://arxiv.org/pdf/2210.03235v3 | |
http://arxiv.org/abs/2303.15725v1,creativecommons.org/licenses/by-nc-sa/4.0/,"Solving Regularized Exp, Cosh and Sinh Regression Problems",Zhihang Li and Zhao Song and Tianyi Zhou,http://arxiv.org/pdf/2303.15725v1 | |
http://arxiv.org/abs/2304.02839v1,creativecommons.org/licenses/by-nc-sa/4.0/,"Whose Text Is It Anyway? Exploring BigCode, Intellectual Property, and Ethics",Madiha Zahrah Choksi and David Goedicke,http://arxiv.org/pdf/2304.02839v1 | |
http://arxiv.org/abs/1604.03184v1,creativecommons.org/licenses/by-nc-sa/4.0/,Desiree - a Refinement Calculus for Requirements Engineering,Feng-Lin Li and John Mylopoulos,http://arxiv.org/pdf/1604.03184v1 | |
http://arxiv.org/abs/1912.03768v2,creativecommons.org/licenses/by-nc-sa/4.0/,TypeWriter: Neural Type Prediction with Search-based Validation,Michael Pradel and Georgios Gousios and Jason Liu and Satish Chandra,http://arxiv.org/pdf/1912.03768v2 | |
http://arxiv.org/abs/2104.08087v1,creativecommons.org/licenses/by-nc-sa/4.0/,Citations are not opinions: a corpus linguistics approach to understanding how citations are made,Domenic Rosati,http://arxiv.org/pdf/2104.08087v1 | |
http://arxiv.org/abs/2303.01490v1,creativecommons.org/licenses/by-nc-sa/4.0/,Language Variety Identification with True Labels,Marcos Zampieri and Kai North and Tommi Jauhiainen and Mariano Felice and Neha Kumari and Nishant Nair and Yash Bangera,http://arxiv.org/pdf/2303.01490v1 | |
http://arxiv.org/abs/2111.05193v2,creativecommons.org/licenses/by-nc-sa/4.0/,A Survey on Green Deep Learning,Jingjing Xu and Wangchunshu Zhou and Zhiyi Fu and Hao Zhou and Lei Li,http://arxiv.org/pdf/2111.05193v2 | |
http://arxiv.org/abs/2304.07772v1,creativecommons.org/licenses/by-nc-sa/4.0/,A Comprehensive Evaluation of the Copy Mechanism for Natural Language to SPARQL Query Generation,Samuel Reyd and Amal Zouaq and Papa Abdou Karim Karou Diallo,http://arxiv.org/pdf/2304.07772v1 | |
http://arxiv.org/abs/2212.01757v1,creativecommons.org/licenses/by-nc-sa/4.0/,Languages You Know Influence Those You Learn: Impact of Language Characteristics on Multi-Lingual Text-to-Text Transfer,Benjamin Muller and Deepanshu Gupta and Siddharth Patwardhan and Jean-Philippe Fauconnier and David Vandyke and Sachin Agarwal,http://arxiv.org/pdf/2212.01757v1 | |
http://arxiv.org/abs/1809.01229v1,creativecommons.org/licenses/by-nc-sa/4.0/,t-Exponential Memory Networks for Question-Answering Machines,Kyriakos Tolias and Sotirios Chatzis,http://arxiv.org/pdf/1809.01229v1 | |
http://arxiv.org/abs/2202.13623v1,creativecommons.org/licenses/by-nc-sa/4.0/,Interactive Machine Learning for Image Captioning,Mareike Hartmann and Aliki Anagnostopoulou and Daniel Sonntag,http://arxiv.org/pdf/2202.13623v1 | |
http://arxiv.org/abs/2109.03926v2,creativecommons.org/licenses/by-nc-sa/4.0/,Transformers in the loop: Polarity in neural models of language,Lisa Bylinina and Alexey Tikhonov,http://arxiv.org/pdf/2109.03926v2 | |
http://arxiv.org/abs/2206.07627v1,creativecommons.org/licenses/by-nc-sa/4.0/,Exploring Capabilities of Monolingual Audio Transformers using Large Datasets in Automatic Speech Recognition of Czech,Jan Lehečka and Jan Švec and Aleš Pražák and Josef V. Psutka,http://arxiv.org/pdf/2206.07627v1 | |
http://arxiv.org/abs/2106.07876v3,creativecommons.org/licenses/by-nc-sa/4.0/,Vision-Language Navigation with Random Environmental Mixup,Chong Liu and Fengda Zhu and Xiaojun Chang and Xiaodan Liang and Zongyuan Ge and Yi-Dong Shen,http://arxiv.org/pdf/2106.07876v3 | |
http://arxiv.org/abs/1809.06471v1,creativecommons.org/licenses/by-nc-sa/4.0/,A Language for Large-Scale Collaboration in Economics: A Streamlined Computational Representation of Financial Models,Jorge Faleiro,http://arxiv.org/pdf/1809.06471v1 | |
http://arxiv.org/abs/2212.01218v1,creativecommons.org/licenses/by-nc-sa/4.0/,Answer ranking in Community Question Answering: a deep learning approach,Lucas Valentin,http://arxiv.org/pdf/2212.01218v1 | |
http://arxiv.org/abs/1610.08431v3,creativecommons.org/licenses/by-nc-sa/4.0/,Broad Context Language Modeling as Reading Comprehension,Zewei Chu and Hai Wang and Kevin Gimpel and David McAllester,http://arxiv.org/pdf/1610.08431v3 | |
http://arxiv.org/abs/2105.15065v1,creativecommons.org/licenses/by-nc-sa/4.0/,Picking Pearl From Seabed: Extracting Artefacts from Noisy Issue Triaging Collaborative Conversations for Hybrid Cloud Services,Amar Prakash Azad and Supriyo Ghosh and Ajay Gupta and Harshit Kumar and Prateeti Mohapatra,http://arxiv.org/pdf/2105.15065v1 | |
http://arxiv.org/abs/2111.08374v3,creativecommons.org/licenses/by-nc-sa/4.0/,Literature-Augmented Clinical Outcome Prediction,Aakanksha Naik and Sravanthi Parasa and Sergey Feldman and Lucy Lu Wang and Tom Hope,http://arxiv.org/pdf/2111.08374v3 | |
http://arxiv.org/abs/2205.02293v3,creativecommons.org/licenses/by-nc-sa/4.0/,Original or Translated? A Causal Analysis of the Impact of Translationese on Machine Translation Performance,Jingwei Ni and Zhijing Jin and Markus Freitag and Mrinmaya Sachan and Bernhard Schölkopf,http://arxiv.org/pdf/2205.02293v3 | |
http://arxiv.org/abs/2209.08966v2,creativecommons.org/licenses/by-nc-sa/4.0/,Will It Blend? Mixing Training Paradigms & Prompting for Argument Quality Prediction,Michiel van der Meer and Myrthe Reuver and Urja Khurana and Lea Krause and Selene Báez Santamaría,http://arxiv.org/pdf/2209.08966v2 | |
http://arxiv.org/abs/2212.02851v1,creativecommons.org/licenses/by-nc-sa/4.0/,DiSTRICT: Dialogue State Tracking with Retriever Driven In-Context Tuning,Praveen Venkateswaran and Evelyn Duesterwald and Vatche Isahagian,http://arxiv.org/pdf/2212.02851v1 | |
http://arxiv.org/abs/2303.07240v1,creativecommons.org/licenses/by-nc-sa/4.0/,PMC-CLIP: Contrastive Language-Image Pre-training using Biomedical Documents,Weixiong Lin and Ziheng Zhao and Xiaoman Zhang and Chaoyi Wu and Ya Zhang and Yanfeng Wang and Weidi Xie,http://arxiv.org/pdf/2303.07240v1 | |
http://arxiv.org/abs/2304.05973v1,creativecommons.org/licenses/by-nc-sa/4.0/,HiPrompt: Few-Shot Biomedical Knowledge Fusion via Hierarchy-Oriented Prompting,Jiaying Lu and Jiaming Shen and Bo Xiong and Wenjing Ma and Steffen Staab and Carl Yang,http://arxiv.org/pdf/2304.05973v1 | |
http://arxiv.org/abs/2206.05224v2,creativecommons.org/licenses/by-nc-sa/4.0/,A Multi-Task Benchmark for Korean Legal Language Understanding and Judgement Prediction,Wonseok Hwang and Dongjun Lee and Kyoungyeon Cho and Hanuhl Lee and Minjoon Seo,http://arxiv.org/pdf/2206.05224v2 | |
http://arxiv.org/abs/2201.07614v1,creativecommons.org/licenses/by-nc-sa/4.0/,Uncovering More Shallow Heuristics: Probing the Natural Language Inference Capacities of Transformer-Based Pre-Trained Language Models Using Syllogistic Patterns,Reto Gubelmann and Siegfried Handschuh,http://arxiv.org/pdf/2201.07614v1 | |
http://arxiv.org/abs/2203.03191v1,creativecommons.org/licenses/by-nc-sa/4.0/,Language-Agnostic Meta-Learning for Low-Resource Text-to-Speech with Articulatory Features,Florian Lux and Ngoc Thang Vu,http://arxiv.org/pdf/2203.03191v1 | |
http://arxiv.org/abs/2012.14740v4,creativecommons.org/licenses/by-nc-sa/4.0/,LayoutLMv2: Multi-modal Pre-training for Visually-Rich Document Understanding,Yang Xu and Yiheng Xu and Tengchao Lv and Lei Cui and Furu Wei and Guoxin Wang and Yijuan Lu and Dinei Florencio and Cha Zhang and Wanxiang Che and Min Zhang and Lidong Zhou,http://arxiv.org/pdf/2012.14740v4 | |
http://arxiv.org/abs/2204.04504v1,creativecommons.org/licenses/by-nc-sa/4.0/,TANet: Thread-Aware Pretraining for Abstractive Conversational Summarization,Ze Yang and Liran Wang and Zhoujin Tian and Wei Wu and Zhoujun Li,http://arxiv.org/pdf/2204.04504v1 | |
http://arxiv.org/abs/2011.13633v2,creativecommons.org/licenses/by-nc-sa/4.0/,CoRe: An Efficient Coarse-refined Training Framework for BERT,Cheng Yang and Shengnan Wang and Yuechuan Li and Chao Yang and Ming Yan and Jingqiao Zhang and Fangquan Lin,http://arxiv.org/pdf/2011.13633v2 | |
http://arxiv.org/abs/1806.10423v2,creativecommons.org/licenses/by-nc-sa/4.0/,Implementing Convex Optimization in R: Two Econometric Examples,Zhan Gao and Zhentao Shi,http://arxiv.org/pdf/1806.10423v2 | |
http://arxiv.org/abs/2210.09472v2,creativecommons.org/licenses/by-nc-sa/4.0/,Multi-granularity Argument Mining in Legal Texts,Huihui Xu and Kevin Ashley,http://arxiv.org/pdf/2210.09472v2 | |
http://arxiv.org/abs/2211.06774v2,creativecommons.org/licenses/by-nc-sa/4.0/,Large-Scale Bidirectional Training for Zero-Shot Image Captioning,Taehoon Kim and Mark Marsden and Pyunghwan Ahn and Sangyun Kim and Sihaeng Lee and Alessandra Sala and Seung Hwan Kim,http://arxiv.org/pdf/2211.06774v2 | |
http://arxiv.org/abs/1908.05828v1,creativecommons.org/licenses/by-nc-sa/4.0/,Named Entity Recognition for Nepali Language,Oyesh Mann Singh and Ankur Padia and Anupam Joshi,http://arxiv.org/pdf/1908.05828v1 | |
http://arxiv.org/abs/2002.00759v2,creativecommons.org/licenses/by-nc-sa/4.0/,Comparison Between Traditional Machine Learning Models And Neural Network Models For Vietnamese Hate Speech Detection,Son T. Luu and Hung P. Nguyen and Kiet Van Nguyen and Ngan Luu-Thuy Nguyen,http://arxiv.org/pdf/2002.00759v2 | |
http://arxiv.org/abs/2005.06752v1,creativecommons.org/licenses/by-nc-sa/4.0/,Large Scale Font Independent Urdu Text Recognition System,Atique Ur Rehman and Sibt Ul Hussain,http://arxiv.org/pdf/2005.06752v1 |