Word Ordering with Minimal Resources

  • Райчо Мукелов: Machine Learning, Natural Language Processing, Statistics, Reliability

Abstract

This paper presents a methodology for ordering words with minimal resources, based on general-specific relations extracted from a Web corpus. To this end, it uses the TextRank algorithm, asymmetric association measures, word frequencies measured with a Web search engine, and an iterative k-means clustering algorithm. A reliability assessment of the method is also proposed.
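To make the pipeline more concrete, here is a minimal sketch of how an asymmetric association measure can feed a TextRank-style ranking. It is an illustration under stated assumptions, not the paper's implementation. The abstract does not say which measure or edge direction is used, so the sketch assumes conditional probability estimated from hit counts and edges pointing from the more specific word to the more general one. The counts in `HITS` and `CO_HITS` are invented toy values, not real search-engine results, and the k-means clustering step is not covered.

```python
from itertools import combinations

# Hypothetical hit counts, invented for illustration only (not real Web data).
HITS = {"animal": 1000, "dog": 400, "poodle": 50}
CO_HITS = {
    frozenset(("animal", "dog")): 200,
    frozenset(("animal", "poodle")): 20,
    frozenset(("dog", "poodle")): 40,
}


def cond_prob(x, y):
    """Estimate P(x | y) as hits(x, y) / hits(y)."""
    return CO_HITS.get(frozenset((x, y)), 0) / HITS[y]


def build_graph(words):
    """Assumed convention: the edge runs from the specific word to the general one."""
    edges = {}
    for x, y in combinations(words, 2):
        p_x_given_y = cond_prob(x, y)
        p_y_given_x = cond_prob(y, x)
        if p_x_given_y > p_y_given_x:
            edges[(y, x)] = p_x_given_y  # y implies x more strongly: y -> x
        elif p_y_given_x > p_x_given_y:
            edges[(x, y)] = p_y_given_x  # x implies y more strongly: x -> y
    return edges


def textrank(words, edges, d=0.85, iterations=50):
    """Weighted TextRank update: S(v) = (1 - d) + d * sum(w_uv / out(u) * S(u))."""
    out_weight = {w: 0.0 for w in words}
    for (src, _dst), weight in edges.items():
        out_weight[src] += weight
    scores = {w: 1.0 for w in words}
    for _ in range(iterations):
        updated = {}
        for w in words:
            incoming = sum(
                weight / out_weight[src] * scores[src]
                for (src, dst), weight in edges.items()
                if dst == w
            )
            updated[w] = (1 - d) + d * incoming
        scores = updated
    return scores


words = list(HITS)
scores = textrank(words, build_graph(words))
# Higher score is read here as "more general".
ranking = sorted(words, key=scores.get, reverse=True)
print(ranking)
```

With these toy counts the ranking runs from the most general word to the most specific one. Reversing the edge direction would reverse that reading, so the direction should be treated as a design choice rather than a fixed part of the method.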

Author Biography

Райчо Мукелов, Machine Learning, Natural Language Processing, Statistics, Reliability

Райчо Мукелов

Technical University of Varna (TU-Varna)

Published
2017-11-28
How to Cite
МУКЕЛОВ, Райчо. Подреждане на Думи с Минимум Ресурси [Word Ordering with Minimal Resources]. Списание ХайТек / HiTech Journal, [S.l.], v. 1, n. 1, p. 56-68, Nov. 2017. ISSN 2534-9996. Available at: <https://hit.hit-tech.eu/index.php/hit/article/view/22>. Date accessed: 26 June 2025.
Section
HiTech. Peer-reviewed scientific and technical publications