
All Languages Are NOT Created (Tokenized) Equal

Topbots

70% of research papers published in a computational linguistics conference only evaluated English. [In Findings of the Association for Computational Linguistics: ACL 2022, pages 2340–2354, Dublin, Ireland. Association for Computational Linguistics.]


The State of Transfer Learning in NLP

Sebastian Ruder

Major themes: several major themes can be observed in how this paradigm has been applied. From words to words-in-context: over time, representations incorporate more context. Early approaches such as word2vec (Mikolov et al., 2013) learned a single representation for every word independent of its context.
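The "single representation per word" point can be made concrete with a toy sketch (illustrative vectors only, not the actual word2vec training procedure or API):

```python
# Toy illustration: a static embedding table assigns every word exactly
# one vector, regardless of the sentence it appears in.
import random

random.seed(0)
vocab = ["the", "river", "bank", "loan"]
# One fixed 4-dimensional vector per word, word2vec-style.
embeddings = {w: [random.random() for _ in range(4)] for w in vocab}

def embed(sentence):
    """Look up each token's single, context-independent vector."""
    return [embeddings[w] for w in sentence.split()]

# "bank" gets the identical vector in both sentences -- the limitation
# that later contextual models (ELMo, BERT) were designed to address.
v_river_bank = embed("the river bank")[2]
v_bank_loan = embed("the bank loan")[1]
assert v_river_bank == v_bank_loan
```

Contextual models instead produce a different vector for each occurrence, conditioned on the surrounding words.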



Multi-domain Multilingual Question Answering

Sebastian Ruder

RC Olympics: the many domains of reading comprehension. Datasets in the Fiction domain typically require processing narratives in books, such as NarrativeQA (Kočiský et al., 2018), the Children's Book Test (Hill et al., 2016), and BookTest (Bajgar et al., 2016), as well as MCScript (Modi et al.) and the Polish 'Did you know?'


Parsing English in 500 Lines of Python

Explosion

I wrote this blog post in 2013, describing an exciting advance in natural language understanding technology. The derivation for the transition system we're using, Arc-Hybrid, is in Goldberg and Nivre (TACL 2013). However, I wrote my own features for it. This prevents us from doing a bunch of costly copy operations.
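The Arc-Hybrid transition system the excerpt mentions can be sketched in a few lines (a minimal illustration of the Goldberg and Nivre formulation, not the post's own code; function and variable names here are made up):

```python
# Arc-Hybrid: three transitions over a stack and a buffer of word indices.

def shift(stack, buffer, arcs):
    """Move the first buffer word onto the stack."""
    stack.append(buffer.pop(0))

def left_arc(stack, buffer, arcs):
    """Top of stack becomes a dependent of the first buffer word."""
    arcs.append((buffer[0], stack.pop()))   # (head, dependent)

def right_arc(stack, buffer, arcs):
    """Top of stack becomes a dependent of the word beneath it."""
    dep = stack.pop()
    arcs.append((stack[-1], dep))

# Parse "ROOT She ate fish" (indices 0..3); gold heads:
# ate -> She, ate -> fish, ROOT -> ate.
stack, buffer, arcs = [], [0, 1, 2, 3], []
shift(stack, buffer, arcs)        # stack=[0]       buffer=[1,2,3]
shift(stack, buffer, arcs)        # stack=[0,1]     buffer=[2,3]
left_arc(stack, buffer, arcs)     # ate -> She      stack=[0]
shift(stack, buffer, arcs)        # stack=[0,2]     buffer=[3]
shift(stack, buffer, arcs)        # stack=[0,2,3]   buffer=[]
right_arc(stack, buffer, arcs)    # ate -> fish     stack=[0,2]
right_arc(stack, buffer, arcs)    # ROOT -> ate     stack=[0]
assert sorted(arcs) == [(0, 2), (2, 1), (2, 3)]
```

A real parser scores which of these transitions to take at each step with a trained classifier; the transition system itself is this small.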


AI Distillery (Part 2): Distilling by Embedding

ML Review

What you can cram into a single vector: Probing sentence embeddings for linguistic properties. In Proceedings of the 56th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers). Star our repo: ai-distillery. And clap your little hearts out for MTank! References: Harris, Z. (1954). Distributional structure. Word, 10(2–3), 146–162.


Major trends in NLP: a review of 20 years of ACL research

NLP People

The 57th Annual Meeting of the Association for Computational Linguistics (ACL 2019) is starting this week in Florence, Italy. The universal linguistic principle behind word embeddings is distributional similarity: a word can be characterized by the contexts in which it occurs (Goldberg and Hirst, 2017; Mikolov et al.).
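The distributional-similarity principle can be demonstrated directly, without any training, by counting contexts (a toy sketch on a three-sentence corpus; the corpus and names are invented for illustration):

```python
# Characterize a word by its neighbouring words, then compare words
# by the cosine similarity of their context-count vectors.
from collections import Counter
from math import sqrt

corpus = [
    "the cat drinks milk",
    "the dog drinks water",
    "the cat chases the dog",
]

def context_vector(word, window=1):
    """Count words appearing within +/-window positions of `word`."""
    counts = Counter()
    for sent in corpus:
        toks = sent.split()
        for i, t in enumerate(toks):
            if t == word:
                lo, hi = max(0, i - window), min(len(toks), i + window + 1)
                for j in range(lo, hi):
                    if j != i:
                        counts[toks[j]] += 1
    return counts

def cosine(a, b):
    dot = sum(a[k] * b[k] for k in a.keys() & b.keys())
    return dot / (sqrt(sum(v * v for v in a.values()))
                  * sqrt(sum(v * v for v in b.values())))

# "cat" and "dog" share contexts ("the", "drinks"), so they come out
# more similar to each other than "cat" is to "milk".
sim_cat_dog = cosine(context_vector("cat"), context_vector("dog"))
sim_cat_milk = cosine(context_vector("cat"), context_vector("milk"))
assert sim_cat_dog > sim_cat_milk
```

Word2vec and its successors can be viewed as learned, compressed versions of exactly this kind of co-occurrence statistic.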
