Our paper “An Artificial Language Evaluation of Distributional Semantic Models” has been accepted for presentation at CoNLL (The SIGNLL Conference on Computational Natural Language Learning). The 2017 conference is co-located with ACL 2017 in Vancouver and will be held during the first week of August. Here is the abstract:
Recent studies of distributional semantic models have set up a competition between word embeddings obtained from predictive neural networks and word vectors obtained from abstractive count-based models. This paper is an attempt to reveal the underlying contribution of additional training data and post-processing steps to each type of model in word similarity and relatedness inference tasks. We do so by designing an artificial language framework, training a predictive and a count-based model on data sampled from this grammar, and evaluating the resulting word vectors in paradigmatic and syntagmatic tasks defined with respect to the grammar.
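To give a flavor of the methodology, here is a minimal, hypothetical sketch of the count-based side of such a setup (the toy grammar, window size, and PPMI weighting below are illustrative assumptions, not the paper's actual grammar or models): sentences are sampled from a small artificial grammar, co-occurrence counts are collected, and PPMI-weighted vectors are compared in a paradigmatic test (words of the same grammatical category should end up with similar vectors).

```python
# Hypothetical sketch: sample sentences from a toy artificial grammar,
# build count-based word vectors via PPMI-weighted co-occurrence counts,
# and run a simple paradigmatic similarity check. Not the paper's setup.
import math
import random
from collections import defaultdict

random.seed(0)

# Toy grammar: every sentence is DET NOUN VERB DET NOUN.
GRAMMAR = {
    "DET": ["a", "the"],
    "NOUN": ["cat", "dog", "bird"],
    "VERB": ["sees", "chases"],
}

def sample_sentence():
    """Sample one sentence from the fixed sentence template."""
    return [random.choice(GRAMMAR[c]) for c in ("DET", "NOUN", "VERB", "DET", "NOUN")]

corpus = [sample_sentence() for _ in range(2000)]

# Symmetric co-occurrence counts within a +/-2 word window.
cooc = defaultdict(lambda: defaultdict(int))
for sent in corpus:
    for i, w in enumerate(sent):
        for j in range(max(0, i - 2), min(len(sent), i + 3)):
            if i != j:
                cooc[w][sent[j]] += 1

total = sum(sum(row.values()) for row in cooc.values())
w_count = {w: sum(row.values()) for w, row in cooc.items()}

def ppmi(w, c):
    """Positive pointwise mutual information of a (word, context) pair."""
    if cooc[w][c] == 0:
        return 0.0
    pmi = math.log2((cooc[w][c] * total) / (w_count[w] * w_count[c]))
    return max(0.0, pmi)

vocab = sorted(w_count)
vectors = {w: [ppmi(w, c) for c in vocab] for w in vocab}

def cosine(u, v):
    dot = sum(a * b for a, b in zip(u, v))
    return dot / (math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v)))

# Paradigmatic check: two nouns share contexts (determiners and verbs),
# so their vectors should be closer than a noun/verb pair's.
sim_same = cosine(vectors["cat"], vectors["dog"])
sim_diff = cosine(vectors["cat"], vectors["sees"])
```

A predictive model (e.g. a word2vec-style network) would be trained on the same sampled corpus, and both vector sets would then be scored on similarity tasks whose gold answers are defined directly by the grammar's categories.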