Automatic frequency dictionary of connectivity by Lina Kostenko and Mykola Vinhranovsky

UDC 811.161.2’42:004
DOI: https://doi.org/10.17721/2520-6397.2023.1.01

Nataliia DARCHUK, Dr Hab., Prof., Taras Shevchenko National University of Kyiv, Kyiv, Ukraine

ORCID: 0000-0001-8932-9301


The article describes the linguistic support for automatically compiling the word combinations of a text into an electronic dictionary, and presents a comparative linguistic analysis of the resulting author’s dictionaries of word combinations for the poetic texts of Lina Kostenko (30,057 word usages in total) and Mykola Vinhranovskyi (20,317 word usages in total), both held in the Corpus of the Modern Ukrainian Language. The purpose of the analysis is to identify the phrases that the two authors share and those that differ in their functioning, and on that basis to parameterize each author’s style. The topic is relevant because of the need to establish the grammatical and lexical valence of words, the typical combinability of parts of speech, and the combinatorial regularities of word combinations of various types and degrees. The novelty lies both in the approach itself, namely the possibility of automatically creating an alphabetic-frequency dictionary, and in the method of implementation: the phrase dictionary is part of the syntactic representation of a sentence as a model, a graphic representation of a dependency tree, which is also a useful tool for characterizing syntactic categories such as predicativeness and order. The task of the parser was to identify all types of syntactic connection (predicative, subordinate and coordinate) for each word in the text. Since the lexical-grammatical nature of a word determines its ability to combine with other words, word combinations are divided into noun, adjective, pronoun, numeral, verb and adverb combinations. The article deals only with simple binary phrases, with or without a preposition; these can be extended into complex phrases, but determining the composition of complex phrases requires analysis of the content structure. In prospect, the results can be used in the semantic analysis of text and as a finished product for linguistic research on the syntax of the Ukrainian language.
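The step from a dependency tree to an alphabetic-frequency dictionary of binary phrases can be pictured with a minimal sketch. It assumes that the parser output is available as a list of tokens, each carrying a lemma, a part of speech, the index of its head and an optional preposition. The names Token, phrase_dictionary and alphabetic_frequency_list, the grouping of entries by the part of speech of the head word, and the toy English sentences are illustrative assumptions only; they do not reproduce the parser or the data of the Corpus.

```python
from collections import Counter
from typing import NamedTuple, Optional


class Token(NamedTuple):
    """Hypothetical parser output for one word (not the Corpus format)."""
    lemma: str
    pos: str                    # part of speech of the word
    head: Optional[int]         # index of the head word; None for the root
    prep: Optional[str] = None  # preposition linking the word to its head


def phrase_dictionary(sentences):
    """Count binary phrases (head + optional preposition + dependent)."""
    counts = Counter()
    for tokens in sentences:
        for tok in tokens:
            if tok.head is None:
                continue  # the root has no head, so it forms no phrase here
            head = tokens[tok.head]
            parts = [head.lemma, tok.lemma]
            if tok.prep:
                parts.insert(1, tok.prep)
            # Assumption: entries are keyed by the head's part of speech.
            counts[(head.pos, " ".join(parts))] += 1
    return counts


def alphabetic_frequency_list(counts, pos):
    """Entries for one head part of speech, in alphabetical order."""
    return sorted((phrase, n) for (p, phrase), n in counts.items() if p == pos)


# Toy data, invented for illustration: two tiny dependency trees.
sentences = [
    [Token("fly", "VERB", None),
     Token("bird", "NOUN", 0),
     Token("field", "NOUN", 0, prep="over")],
    [Token("white", "ADJ", 1),
     Token("bird", "NOUN", 2),
     Token("fly", "VERB", None)],
]

counts = phrase_dictionary(sentences)
for phrase, n in alphabetic_frequency_list(counts, "VERB"):
    print(phrase, n)
```

Building one such dictionary per author and comparing the keys of the two counters would then give the shared and the author-specific phrases, which is the kind of comparison the abstract describes.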

Keywords: parser, phrase, dependency tree, frequency.