Scientific articles and publications found: 1, for the research topic: Web mining
1.
Krizhanovsky A., Smirnov A.
- International Journal of Computer and Systems Sciences, 2009
A new type of document, the "wiki page", is winning over the Internet. This shows not only in the growing number of Internet pages of this type, but also in the popularity of wiki projects (in particular, Wikipedia); the problem of parsing wiki texts is therefore becoming more and more topical. A new method for indexing Wikipedia texts in three languages (Russian, English, and German) is proposed and implemented. The architecture of the indexing system, including the software components GATE and Lemmatizer, is considered. The rules for converting wiki texts into natural-language texts are described. Index databases for the Russian Wikipedia and Simple English Wikipedia are constructed. The validity of Zipf's laws is tested for the Russian Wikipedia and Simple English Wikipedia.
International Journal of Computer and Systems Sciences, 2009, 48(4). P. 616-624.