Document Type

Honors Thesis

Abstract

Resource-poor, morphologically complex languages are at a disadvantage in natural language processing tasks, such as automatic text summarization and machine translation, because little high-quality linguistic data is available in these languages. Recently, Rossiello et al. introduced a language-independent, centroid-based method for automatic text summarization that garnered international attention for its success. This thesis explores methods for improving that approach on resource-poor, morphologically complex languages by adding preprocessing steps to the data. Stemming is shown to marginally improve benchmark ROUGE scores for summaries in German, a relatively morphologically complex language, and in Turkish, an agglutinative language. In addition, a manual semantic analysis of the Word2Vec models used in this approach showed improved accuracy when the models were built on stemmed corpora. This result has implications for research on word embeddings in low-resource and morphologically complex languages.

Publication Date

5-1-2018

Language

English

License

This work is licensed under a Creative Commons Attribution-NonCommercial-ShareAlike 4.0 International License.

Recommended Citation

Goss Manshack, Kalen, "IMPROVING AUTOMATIC SUMMARIZATION FOR LOW- AND MODERATE-RESOURCE, MORPHOLOGICALLY COMPLEX LANGUAGES" (2018). 2018 Spring Honors Capstone Projects. 28.
https://mavmatrix.uta.edu/honors_spring2018/28
