From Word Alignment to Word Senses, via Multilingual Wordnets
SM ISO690:2012
TUFIŞ, Dan. From Word Alignment to Word Senses, via Multilingual Wordnets. In: Computer Science Journal of Moldova, 2006, nr. 1(40), pp. 3-33. ISSN 1561-4042.
Computer Science Journal of Moldova
Issue 1(40) / 2006 / ISSN 1561-4042 / eISSN 2587-4330

UDC: 004.43+004.912:81'33

Pages 3-33

Tufiş Dan
 
Institute for Artificial Intelligence, Romanian Academy
 
Available in IBN: 6 December 2013


Abstract

Most successful commercial applications in language processing (text and/or speech) dispense with any explicit concern for semantics, usually on the grounds that dealing with semantics in large volumes of data carries a high computational cost. With recent advances in corpus linguistics and statistical methods in NLP, revealing useful semantic features of linguistic data is becoming steadily cheaper, and the accuracy of this process is steadily improving. Lately, there seems to be a growing acceptance of the idea that multilingual lexical ontologies might be the key to aligning different views on the atomic semantic units used to characterize the general meaning of varied, multilingual documents. Depending on the granularity at which semantic distinctions are necessary, the accuracy of basic semantic processing (such as word sense disambiguation) can be very high at relatively low computational complexity. The paper substantiates this statement by presenting a statistical system for word alignment and word sense disambiguation in parallel corpora. We describe a word alignment platform that provides the text pre-processing (tokenization, POS-tagging, lemmatization, chunking, sentence and word alignment) required for accurate word sense disambiguation.