Daria M. Golikova
Ural Federal University
Ekaterinburg, Russia
Proper Names and Named Entities Recognition in the Automatic Text Processing.
Review of the book: Nouvel, D., Ehrmann, M., & Rosset, S. (2016). Named Entities for Computational Linguistics. London; Hoboken: ISTE Ltd; John Wiley & Sons, Inc., 2016.
Voprosy onomastiki, 2018, Volume 15, Issue 1, pp. 207–215 (in Russian)
DOI: 10.15826/vopr_onom.2018.15.1.012
Received 15 January 2018
Abstract: The reviewed book by Damien Nouvel, Maud Ehrmann, and Sophie Rosset, Named Entities for Computational Linguistics, deals with the automatic processing of natural-language texts and with named entity recognition, which aims to extract the most important information from these texts. The notion of named entities here extends to the entire set of linguistic units referring to an object. The authors examine the concept of named entities in detail, contrasting this category with that of proper names and comparing their definitions, and describe all the stages of creating and implementing automatic text annotation algorithms, as well as different ways of evaluating their performance. Proper names, in this context, are seen as a particular instance of named entities, one of the typical sources of reference to real objects to be electronically recognized in the text. The book provides a detailed overview and analysis of previous studies in the field, based mainly on English-language data. It presents the instruments and resources required to create and implement the algorithms in question; these may include typologies, knowledge bases or databases, and various types of corpora. The authors' theoretical considerations are supported by a significant number of examples, with the operating principles of the algorithms presented in charts. The reviewed book gives a fairly comprehensive picture of modern computational linguistic studies focused on named entity recognition and indicates some problems that remain unresolved.
Keywords: computational linguistics, proper names, automatic text processing, annotation, named entities, corpus, knowledge base