Pengaruh Part of Speech Tagging Berbasis Aturan dan Distribusi Probabilitas Maximum Entropy untuk Bahasa Jawa Krama

Authors

  • Hafiz Ridha Pramudita
  • Ema Utami
  • Armadyah Amborowati

DOI:

https://doi.org/10.24002/jbi.v7i4.764

Abstract

Javanese is one of Indonesia's regional languages and is used by a large part of the country's population. Its grammar is complex because it encodes values of politeness, which are expressed through the choice of words that carry courtesy, known as raos alus. As in other languages, every Javanese word belongs to a particular part of speech. Part-of-speech (POS) tagging is the process of assigning a syntactic category, such as noun, verb, or adjective, to every word in a document or text. This study examined POS tagging for Javanese Krama (High Javanese) with Maximum Entropy and a rule-based approach, using the OpenNLP library for the maximum entropy computation. The results show that Maximum Entropy and rule-based tagging can be used for POS tagging of Javanese Krama, with a highest accuracy of 97.67%.
Keywords: POS Tagging, NLP, Maximum Entropy, Rule Based, Javanese Krama

Abstract (translated from the Indonesian). Javanese is one of the regional languages of Indonesia and is spoken by a large part of the Indonesian population. Javanese has a complex grammar because it upholds values of politeness that are determined by the use of words carrying raos alus (a sense of politeness). As in other languages, every Javanese word has a particular word class, or part of speech. POS tagging is an important part of the field of Natural Language Processing (NLP). This study tested POS tagging with a rule-based approach and the Maximum Entropy probability distribution on Javanese Krama, using the OpenNLP library to compute the maximum entropy. The results show that Maximum Entropy and Rule Based can be used for POS tagging of Javanese Krama, with a highest accuracy of 97.67%.
Keywords: POS Tagging, NLP, Maximum Entropy, Rule Based, Javanese Krama
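
The abstract names two ingredients, a rule-based component and a maximum entropy model, without saying how they are wired together. The sketch below shows one possible arrangement, purely as an illustration: a small closed-class lexicon acts as the rule layer, and a conditional maximum entropy (softmax) model tags every other word. Everything in it is an assumption rather than the authors' method: the toy sentences and their tags are hand-made and not taken from the study's corpus, the `RULES` lexicon and the feature set are hypothetical, and training uses plain stochastic gradient ascent in pure Python rather than the OpenNLP library used in the study.

```python
import math
from collections import defaultdict

# Toy, hand-made Krama sentences with assumed tags (NOT the study's corpus).
TRAIN = [
    [("kula", "PRON"), ("tumbas", "VERB"), ("sekul", "NOUN")],
    [("panjenengan", "PRON"), ("maos", "VERB"), ("buku", "NOUN")],
    [("panjenengan", "PRON"), ("tindak", "VERB"),
     ("dhateng", "PREP"), ("peken", "NOUN")],
    [("kula", "PRON"), ("maos", "VERB"), ("buku", "NOUN")],
]
TAGS = sorted({tag for sent in TRAIN for _, tag in sent})

# Hypothetical rule layer: closed-class words receive a fixed tag.
RULES = {"kula": "PRON", "panjenengan": "PRON", "dhateng": "PREP"}


def features(words, i, prev_tag):
    """Binary context features for the word at position i (assumed feature set)."""
    w = words[i]
    return [f"w={w}", f"suf2={w[-2:]}", f"prev={prev_tag}", "bias"]


def probs(weights, feats):
    """Maximum entropy distribution: p(tag | context) = softmax of summed weights."""
    scores = [sum(weights[(f, t)] for f in feats) for t in TAGS]
    top = max(scores)  # subtract the max for numerical stability
    exps = [math.exp(s - top) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def train(data, epochs=50, lr=0.5):
    """Stochastic gradient ascent on the conditional log-likelihood."""
    weights = defaultdict(float)
    for _ in range(epochs):
        for sent in data:
            words = [w for w, _ in sent]
            prev = "<s>"
            for i, (_, gold) in enumerate(sent):
                feats = features(words, i, prev)
                p = probs(weights, feats)
                for t, pt in zip(TAGS, p):
                    # Gradient = observed feature count - expected feature count.
                    grad = (1.0 if t == gold else 0.0) - pt
                    for f in feats:
                        weights[(f, t)] += lr * grad
                prev = gold
    return weights


def tag(weights, words):
    """Greedy left-to-right tagging: rules first, maximum entropy model otherwise."""
    out, prev = [], "<s>"
    for i, w in enumerate(words):
        if w in RULES:
            t = RULES[w]
        else:
            p = probs(weights, features(words, i, prev))
            t = TAGS[p.index(max(p))]
        out.append(t)
        prev = t
    return out


weights = train(TRAIN)
print(tag(weights, ["kula", "maos", "sekul"]))
```

Letting the rules take precedence is only one design choice; the rule output could instead be fed to the model as an extra feature, or be used only when the model's top probability is low.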

Published

2016-10-25