Document Summarization Based on the Semantic Sentence Distribution Method
DOI:
https://doi.org/10.24002/jbi.v8i1.1073
Abstract.
The sentence distribution method weights sentences based on their distribution without taking the semantic meaning of the distributed sentences into account. However, semantic relations between sentences have been shown to increase the relevance of results in document retrieval. This study proposes a new strategy for summarizing documents using the semantic sentence distribution method in an effort to improve summary quality. The experimental results show that the proposed method performs better, with an average ROUGE-1 of 0.412, an increase of 1.9% over the sentence distribution method, and an average ROUGE-2 of 0.127, an increase of 4.7% over the sentence distribution method.
Keywords: Semantic Sentence Distribution, Document Summarization, ROUGE.
Abstract (translated from the Indonesian).
Document summarization using the sentence distribution method has been shown to produce better results than earlier studies. The method weights sentences based on their distribution without considering the semantic meaning of the distributed sentences. In fact, semantic relations between sentences have been shown to improve the relevance of results in document retrieval. This study proposes a new strategy for document summarization, namely the semantic sentence distribution method, in an effort to improve the quality of the resulting summaries. Experiments show that the proposed method performs better, achieving an average ROUGE-1 of 0.412, an increase of 1.9% over the sentence distribution method, and an average ROUGE-2 of 0.127, an increase of 4.7% over the sentence distribution method.
Keywords: Semantic Sentence Distribution, Document Summarization, ROUGE.
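The abstract reports results as ROUGE-1 and ROUGE-2 scores. As a rough illustration of what those scores measure, the following is a minimal sketch of recall-oriented ROUGE-N under simplifying assumptions: whitespace tokenization, lowercasing, a single reference summary, and no stemming or stopword removal. The function names (ngrams, rouge_n_recall) and the example sentences are hypothetical and are not taken from the paper or from the official ROUGE package, whose preprocessing may differ.

```python
from collections import Counter


def ngrams(tokens, n):
    """Count the n-grams (as tuples) in a list of tokens."""
    return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))


def rouge_n_recall(candidate, reference, n=1):
    """Fraction of reference n-grams that also appear in the candidate.

    Counts are clipped, so a repeated n-gram in the reference is matched
    at most as many times as it occurs in the candidate.
    """
    cand = ngrams(candidate.lower().split(), n)
    ref = ngrams(reference.lower().split(), n)
    total = sum(ref.values())
    if total == 0:
        # Reference too short to contain any n-gram of this size.
        return 0.0
    overlap = sum(min(count, cand[gram]) for gram, count in ref.items())
    return overlap / total


# Hypothetical example sentences, not data from the study.
reference = "the cat sat on the mat"
candidate = "the cat lay on the mat"
rouge_1 = rouge_n_recall(candidate, reference, n=1)  # 5 of 6 unigrams match
rouge_2 = rouge_n_recall(candidate, reference, n=2)  # 3 of 5 bigrams match
print(round(rouge_1, 3), round(rouge_2, 3))
```

ROUGE-2 is typically much lower than ROUGE-1 for the same summary pair because matching two consecutive words is a stricter condition than matching single words, which is consistent with the gap between the 0.412 and 0.127 figures reported above.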
License
Copyright of this journal is assigned to Jurnal Buana Informatika as the journal publisher with the knowledge of the author, while the moral rights to the publication belong to the author. All printed and electronic publications are open access for educational, research, and library purposes. The editorial board is not responsible for copyright violations arising from uses other than those mentioned above. Reproduction of any part of this journal (printed or online) is allowed only with written permission from Jurnal Buana Informatika.
This work is licensed under a Creative Commons Attribution-ShareAlike 4.0 International License.