Preserving Meher and Woirata Corpus Languages using Neural Machine Translation

Yulius Prabowo; Marthen Gabriel; Nazarudin; Tanwey Ratumanan; Martinus Maslim

doi:10.24002/ijis.v6i2.8542

Authors

Yulius Denny Prabowo, Universitas Bina Nusantara
Marthen, Texas A&M University
Nazarudin, Universitas Indonesia
Ratumanan, Universitas Pattimura
Martinus, Universitas Atma Jaya Yogyakarta

DOI:

https://doi.org/10.24002/ijis.v6i2.8542

Abstract

Research on regional languages is difficult to conduct because little or no corpus is available for them, and this is especially true of Indonesia's regional languages. This project seeks to build a machine translation system between Indonesian and the Meher and Woirata languages, in both directions. To achieve this, a corpus of the Meher and Woirata languages must first be developed. The corpus was produced through field studies: the researchers asked several speakers of each language to translate manually, then compared the results from the different translators in focus group discussions to identify the appropriate use of words. The outcomes of this translation process were recorded as a database of Indonesian-Meher and Indonesian-Woirata language pairs, which serves as the training data for the translation system. This research collected 714,000 words in the Meher language and 805,000 words in the Woirata language. These results were then used as the training corpus for machine translation, and the output of the system was validated through direct assessment by speakers of the two languages. The testing indicated an accuracy above 80% for translation into both the Meher language and the Woirata language. It can be concluded that the field research succeeded in gathering and establishing a corpus for these two languages. In addition, the experimental results suggest that translation algorithms can convert Indonesian into regional languages and vice versa with acceptable accuracy.
The contribution of this research is the establishment of the Meher and Woirata language corpus so that it can be accessed by anyone who requires it.

Author Biographies

Yulius Denny Prabowo, Universitas Bina Nusantara

Computer Science Department, Binus Online Learning, Universitas Bina Nusantara, Jakarta, Indonesia

Marthen, Texas A&M University

Nuclear Engineering Department, Texas A&M University, Texas, United States of America

Nazarudin, Universitas Indonesia

Departemen Linguistik, Fakultas Ilmu Pengetahuan Budaya, Universitas Indonesia, Depok, Jawa Barat

Ratumanan, Universitas Pattimura

Fakultas Keguruan dan Ilmu Kependidikan, Universitas Pattimura, Ambon, Maluku, Indonesia

Martinus, Universitas Atma Jaya Yogyakarta

Program Studi Informatika, Fakultas Teknologi Industri, Universitas Atma Jaya Yogyakarta, Daerah Istimewa Yogyakarta, Indonesia
