
The State of Transfer Learning in NLP

Sebastian Ruder

Later approaches then scaled these representations to sentences and documents (Le and Mikolov, 2014; Conneau et al., 2017). In contrast, current models like BERT-Large and GPT-2 consist of 24 Transformer blocks, and recent models are even deeper. Multilingual BERT in particular has been the subject of much recent attention (Pires et al., 2019).
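As a rough illustration of the depth figures quoted above, here is a minimal sketch that reads the layer counts from public pretrained configurations; it assumes the Hugging Face transformers library and the checkpoint names bert-large-uncased and gpt2-medium, none of which are named in the article excerpt.

```python
# Minimal sketch: read the number of Transformer blocks from public
# pretrained configurations. Assumes the Hugging Face `transformers`
# package; the checkpoint names are illustrative, not from the article.
from transformers import AutoConfig

for name in ["bert-large-uncased", "gpt2-medium"]:
    cfg = AutoConfig.from_pretrained(name)
    # BERT-style configs expose `num_hidden_layers`; GPT-2 configs expose `n_layer`.
    depth = getattr(cfg, "num_hidden_layers", None) or getattr(cfg, "n_layer", None)
    print(f"{name}: {depth} Transformer blocks")  # both report 24
```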


The State of Multilingual AI

Sebastian Ruder

Research models such as BERT and T5 have become much more accessible, while the latest generation of language and multi-modal models is demonstrating increasingly powerful capabilities.
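As a hedged sketch of the kind of accessibility described here, the snippet below loads pretrained BERT and T5 checkpoints through the Hugging Face transformers pipeline API; the library and the checkpoint names (bert-base-multilingual-cased, t5-small) are assumptions for illustration, not details given in the article.

```python
# Sketch only: loading pretrained research models in a few lines.
# Assumes the Hugging Face `transformers` library; checkpoint names are
# illustrative and not taken from the article.
from transformers import pipeline

# Masked-language-model inference with a multilingual BERT checkpoint.
fill_mask = pipeline("fill-mask", model="bert-base-multilingual-cased")
print(fill_mask("The capital of Italy is [MASK].")[0]["token_str"])

# Text-to-text inference with a small T5 checkpoint.
summarizer = pipeline("summarization", model="t5-small")
text = ("Research models such as BERT and T5 have become much more "
        "accessible, and multilingual checkpoints now cover many languages.")
print(summarizer(text, max_length=16, min_length=5)[0]["summary_text"])
```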


Major trends in NLP: a review of 20 years of ACL research

NLP People

The 57th Annual Meeting of the Association for Computational Linguistics (ACL 2019) is starting this week in Florence, Italy. Pre-trained word embeddings such as Word2Vec, FastText and BERT in particular allow NLP developers to jump to the next level.
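As a small sketch of what working with off-the-shelf pre-trained embeddings can look like in practice, the following assumes the gensim library and its downloader; the vector set name word2vec-google-news-300 is an illustrative choice, not something the review specifies.

```python
# Sketch: querying off-the-shelf pretrained word embeddings.
# Assumes the `gensim` library; the vector set name is an illustrative
# choice (a large download on first use), not specified in the review.
import gensim.downloader as api

vectors = api.load("word2vec-google-news-300")  # pretrained Word2Vec vectors
print(vectors.most_similar("linguistics", topn=3))
print(vectors.similarity("transfer", "learning"))
```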
