Text Classification Using Word-Based PPM Models
BOBICEV, Victoria. Text Classification Using Word-Based PPM Models. In: Computer Science Journal of Moldova, 2006, no. 2(41), pp. 183-201. ISSN 1561-4042.
Computer Science Journal of Moldova
Issue 2(41) / 2006 / ISSN 1561-4042 / eISSN 2587-4330

UDC: 004.915

Pages 183-201

Bobicev Victoria
 
Technical University of Moldova
 
 
Available in IBN: 8 December 2013


Abstract

Text classification is one of the most topical problems in natural language processing. This paper describes the application of a word-based PPM (Prediction by Partial Matching) model to automatic content-based text classification. Our main idea is that words, and especially word combinations, are more relevant features for many text classification tasks. In most cases the keywords of a document are not single words but combinations of two or three words. The experiments demonstrated that word-based PPM models are applicable to content-based text classification. Although in some cases the entropy difference that determined the choice was rather small (several hundredths), most of the documents (up to 97%) were classified correctly.
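
The abstract does not give implementation details, so the following is only a minimal sketch of how entropy-based classification with a word-level PPM-style model might look. It assumes one model is trained per class and that a document is assigned to the class whose model gives the lowest per-word cross-entropy. The class name `WordPPM`, the function `classify`, the maximum context of one preceding word, escape method C without exclusion, and the `vocab_size` used for unseen words are all illustrative assumptions, not the author's method.

```python
import math
from collections import Counter, defaultdict


class WordPPM:
    """Simplified word-level PPM-style model (hypothetical sketch).

    Uses contexts of one preceding word (order 1), backs off to word
    frequencies (order 0) and then to a uniform distribution (order -1).
    Escape probabilities follow method C; exclusion is omitted for brevity.
    """

    def __init__(self, vocab_size=100_000):
        self.bigrams = defaultdict(Counter)  # previous word -> next-word counts
        self.unigrams = Counter()            # word -> count
        self.vocab_size = vocab_size         # assumed size of the order -1 alphabet

    def train(self, text):
        words = text.lower().split()
        self.unigrams.update(words)
        for prev, cur in zip(words, words[1:]):
            self.bigrams[prev][cur] += 1

    def _prob(self, prev, word):
        p = 1.0
        ctx = self.bigrams.get(prev)
        if ctx:
            total, distinct = sum(ctx.values()), len(ctx)
            if word in ctx:
                return ctx[word] / (total + distinct)
            p *= distinct / (total + distinct)  # escape to order 0
        total, distinct = sum(self.unigrams.values()), len(self.unigrams)
        if total:
            if word in self.unigrams:
                return p * self.unigrams[word] / (total + distinct)
            p *= distinct / (total + distinct)  # escape to order -1
        return p / self.vocab_size

    def cross_entropy(self, text):
        """Average number of bits per word needed to encode `text`."""
        words = text.lower().split()
        if not words:
            raise ValueError("cannot score an empty document")
        bits, prev = 0.0, None
        for word in words:
            bits -= math.log2(self._prob(prev, word))
            prev = word
        return bits / len(words)


def classify(models, text):
    """Return the label of the model with the lowest cross-entropy, plus all scores."""
    scores = {label: model.cross_entropy(text) for label, model in models.items()}
    return min(scores, key=scores.get), scores
```

A usage example on toy data (the training sentences are invented for illustration only):

```python
sports, finance = WordPPM(), WordPPM()
sports.train("the team won the match and the coach praised the players")
finance.train("the bank raised interest rates and the market fell sharply")

label, scores = classify({"sports": sports, "finance": finance},
                         "the coach praised the team")
print(label, scores)
```

Word pairs seen in training, such as "coach praised", are predicted directly from the order-1 context, which is how word combinations contribute to the score in this sketch.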