Anna Rogers (Gladkova)


I am a post-doctoral associate at the Text Machine Lab in the Computer Science Department of the University of Massachusetts Lowell. I work at the intersection of linguistics, natural language processing, and machine learning. I hold a Ph.D. from the Department of Language and Information Sciences at the University of Tokyo (Japan).

My current projects span intrinsic evaluation of word embeddings, compositionality, and temporal and analogical reasoning. I also lead annotation projects for sentiment analysis and temporal reasoning.

Romanov, A., Rumshisky, A., Rogers, A., Donahue, D. Adversarial Decomposition of Text Representation. Accepted for NAACL 2019
Rogers, A., Hosur Ananthakrishna, Sh., & Rumshisky, A. (2018). What's in Your Embedding, And How It Predicts Task Performance. In Proceedings of the 27th International Conference on Computational Linguistics (pp. 2690–2703).
Rogers, A., Romanov, A., Rumshisky, A., Volkova, S., Gronas, M., & Gribov, A. (2018). RuSentiment: An Enriched Sentiment Analysis Dataset for Social Media in Russian. In Proceedings of the 27th International Conference on Computational Linguistics (pp. 755–763).
Karpinska, M., Li, B., Rogers, A. & Drozd, A. Subcharacter Information in Japanese Embeddings: When Is It Worth It? In Proceedings of the Workshop on the Relevance of Linguistic Structure in Neural Architectures for NLP (pp. 28–37).
Rogers, A., Drozd, A., & Li, B. (2017). The (Too Many) Problems of Analogical Reasoning with Word Vectors. In Proceedings of the 6th Joint Conference on Lexical and Computational Semantics (*SEM 2017) (pp. 135–148).
Li, B., Liu, T., Zhao, Z., Tang, B., Drozd, A., Rogers, A., & Du, X. (2017). Investigating different syntactic context types and context representations for learning word embeddings. In Proceedings of the 2017 Conference on Empirical Methods in Natural Language Processing (pp. 2411–2421).
Rogers, A. (2017). Multilingual computational lexicography: frame semantics meets distributional semantics (Ph.D. dissertation). University of Tokyo, Tokyo.

Gladkova, A., Drozd, A., & Matsuoka, S. (2016). Analogy-based detection of morphological and semantic relations with word embeddings: what works and what doesn’t. In Proceedings of the NAACL-HLT SRW (pp. 47–54). San Diego, California, June 12-17, 2016: ACL.
Gladkova, A., & Drozd, A. (2016). Intrinsic evaluations of word embeddings: what can we do better? In Proceedings of The 1st Workshop on Evaluating Vector Space Representations for NLP (pp. 36–42). Berlin, Germany: ACL.
Drozd, A., Gladkova, A., & Matsuoka, S. (2016). Word embeddings, analogies, and machine learning: beyond king - man + woman = queen. In Proceedings of COLING 2016, the 26th International Conference on Computational Linguistics: Technical Papers (pp. 3519–3530). Osaka, Japan, December 11-17.
Santus, E., Gladkova, A., Evert, S., & Lenci, A. (2016). The CogALex-V shared task on the corpus-based identification of semantic relations. In Proceedings of the Workshop on Cognitive Aspects of the Lexicon (pp. 69–79). Osaka, Japan, December 11-17: ACL.

Drozd, A., Gladkova, A., & Matsuoka, S. (2015). Discovering aspectual classes of Russian verbs in untagged large corpora. In Proceedings of 2015 IEEE International Conference on Data Science and Data Intensive Systems (DSDIS) (pp. 61–68).
Drozd, A., Gladkova, A., & Matsuoka, S. (2015). Python, performance, and Natural Language Processing. In Proceedings of the 5th Workshop on Python for High-Performance and Scientific Computing (pp. 1:1–1:10). New York, NY, USA: ACM.

What's in your embedding, and how it predicts task performance.

Linguistic Diagnostics (LD) is a new methodology for evaluation, error analysis, and development of word embedding models, implemented in an open-source Python library. In a large-scale experiment with 14 datasets, LD successfully highlights the differences in the output of the GloVe and word2vec algorithms that correlate with their performance on different NLP tasks.

Project page

Analogical reasoning with word embeddings: why king - man + woman does NOT equal queen.

A series of 5 papers demonstrates that the famous linear vector offset model fails for most linguistic relations, is biased by cosine similarity, and underestimates the amount of information captured in word embeddings (which makes word analogies a dubious benchmark). Incorporating subword information is shown to be beneficial for morphological relations in English and Japanese.

Project page
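
For readers unfamiliar with the vector offset method, here is a minimal sketch of how an analogy such as man : woman :: king : ? is typically answered. The 2-dimensional vectors are invented toy values, not real embeddings, and the names cosine, vector_offset, and TOY are hypothetical. Excluding the three input words from the candidates is assumed here as the common evaluation practice; the toy values are chosen so that the answer changes when that exclusion is switched off.

```python
import math

def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)

def vector_offset(vectors, a, a_star, b, exclude_inputs=True):
    """Answer a : a_star :: b : ? with the word nearest to b - a + a_star."""
    target = [vb - va + vs
              for va, vs, vb in zip(vectors[a], vectors[a_star], vectors[b])]
    inputs = (a, a_star, b)
    candidates = [w for w in vectors
                  if not (exclude_inputs and w in inputs)]
    return max(candidates, key=lambda w: cosine(vectors[w], target))

# Toy vectors invented for illustration only (assumption, not real data).
TOY = {
    "man":   [1.0, 0.0],
    "woman": [1.0, 0.3],
    "king":  [3.0, 0.1],
    "queen": [2.5, 1.0],
    "apple": [0.0, -1.0],
}

print(vector_offset(TOY, "man", "woman", "king"))
print(vector_offset(TOY, "man", "woman", "king", exclude_inputs=False))
```

In this toy setup the first call returns "queen", while the second returns "king": once the input words are allowed as candidates, the nearest neighbour of the target vector is simply the source word itself.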

Vecto: a new open-source Python library for training, evaluating, and working with word embeddings

Vecto is an ongoing project that aims to provide a one-stop toolkit for working with word embeddings. A major part of the project is a framework for reproducible research on distributional semantic representations, with experiment metadata collected and logged automatically.

Project page

RuSentiment: the largest sentiment analysis dataset for Russian social media, enriched with active learning.

RuSentiment is currently the largest openly available sentiment dataset for Russian social media (~30K posts), diversified with active learning. We also present a lightweight 5-class annotation scheme that enables speedy and consistent annotation (250-350 posts per hour with Fleiss' kappa 0.654), with ready-to-use sentiment annotation guidelines for English and Russian.

Project page
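
As a reminder of what the agreement figure measures, here is a minimal sketch of Fleiss' kappa for a fixed number of raters per item. The function name fleiss_kappa and the example counts are hypothetical and invented for illustration; they are not RuSentiment data.

```python
def fleiss_kappa(counts):
    """Fleiss' kappa; counts[i][j] = raters assigning item i to category j.

    Assumes every item was rated by the same number of raters (at least 2).
    """
    n_items = len(counts)
    n_raters = sum(counts[0])
    n_cats = len(counts[0])

    # Observed agreement: share of agreeing rater pairs, averaged over items.
    per_item = [
        (sum(c * c for c in row) - n_raters) / (n_raters * (n_raters - 1))
        for row in counts
    ]
    p_observed = sum(per_item) / n_items

    # Chance agreement from the overall category proportions.
    proportions = [
        sum(row[j] for row in counts) / (n_items * n_raters)
        for j in range(n_cats)
    ]
    p_expected = sum(p * p for p in proportions)

    return (p_observed - p_expected) / (1 - p_expected)

# Invented example: 4 posts, 3 raters, 3 sentiment classes.
example = [
    [3, 0, 0],
    [0, 3, 0],
    [2, 1, 0],
    [1, 1, 1],
]
print(round(fleiss_kappa(example), 3))
```

A value of 1 means perfect agreement and 0 means agreement at chance level.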

Programming & scripting

Python, JavaScript, Matlab/Octave, Bash

Machine learning

scikit-learn, PyTorch



Theoretical frameworks

Distributional semantics, frame semantics, sociolinguistics, pragmatics, discourse analysis, diachronic analysis of languages


English, Japanese, French, Ukrainian, Russian

Word embeddings: 6 years later.
22 May 2019: UMass Amherst (USA). [SLIDES]

What's in your embedding, and how it predicts task performance.
27 September 2018: UMass Amherst (USA). [SLIDES] [VIDEO]
A version of this talk was also presented on 30 August 2018 at the IT University of Copenhagen (Denmark).

Distributional compositional semantics in the age of word embeddings.
7 May 2018: Tutorial T4 at LREC 2018, Miyazaki, Japan.
Tutorial website:

Detecting linguistic relations with analogies: what works and what doesn't.
15 July 2016: Google Tokyo seminar, Tokyo, Japan. [SLIDES]


RepEval 2019: The Third Workshop on Evaluating Vector Space Representations for NLP (URL)
June 6, 2019: Minneapolis, USA (co-located with NAACL 2019)

T4 LREC 2018 tutorial: Distributional compositional semantics in the age of word embeddings: tasks, resources and methodology (URL)
May 7, 2018: Miyazaki, Japan (LREC 2018)

CogALex-V Shared Task on the Corpus-Based Identification of Semantic Relations (URL)
December 12, 2016: Osaka, Japan (Cognitive Aspects of the Lexicon Workshop, co-located with COLING 2016)


NAACL, COLING, *SEM, RepEval, Language Resources and Evaluation

COMP-1005: Introduction to Programming for Data Science (URL)
University of Massachusetts Lowell, Computer Science department, spring 2019

NLP in Python @ ESSLLI: Introduction to NLP with Python (beginner & advanced - a suite of two 1-week courses) (URL)
Riga, Latvia, August 5-16, 2019 (European Summer School in Logic, Language and Information 2019)
This course is supported by a grant from the US Embassy in Latvia.