Application of Optical Character Recognition for Recognizing Text Variations in Learning Presentation Media

Authors

Kristian Adi Nugraha

Keywords:

learning media, optical character recognition, presentation, Tesseract, text processing

Abstract

Digital media are now the main form of learning media used for teaching and learning in the classroom. Digital learning media are generally stored as images because they contain visual elements. One weakness of image data is that all of its content is treated as a picture, even though learning media also contain text. An OCR method is therefore needed to read that text so the media can be processed further, for example for categorization (indexing) or for use by other systems such as chatbots. OCR is generally used to recognize text of uniform appearance in an image, whereas the text in learning media comes in many variations. This study applies OCR using Tesseract to 30 learning-media samples, each an image containing a range of text variations. Testing showed a text recognition accuracy of 91.11%.
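The evaluation described above can be sketched roughly as follows. This is a minimal illustration, not the study's actual procedure: the abstract does not say how accuracy was computed, so the word-level scoring used here is an assumption, as are the helper names (`ocr_image`, `word_accuracy`, `mean_accuracy`) and the default `lang="eng"`. The OCR step assumes the `pytesseract` wrapper and Pillow are installed alongside Tesseract.

```python
from difflib import SequenceMatcher


def ocr_image(path, lang="eng"):
    """Read the text in one slide image with Tesseract (hypothetical helper)."""
    # Imported lazily so the scoring helpers below work without Tesseract installed.
    from PIL import Image
    import pytesseract

    return pytesseract.image_to_string(Image.open(path), lang=lang)


def tokenize(text):
    """Lowercase and split on whitespace so layout differences are ignored."""
    return text.lower().split()


def word_accuracy(expected, recognized):
    """Share of expected words that also appear, in order, in the OCR output.

    Assumption: word-level matching; the paper may have scored differently.
    """
    ref, hyp = tokenize(expected), tokenize(recognized)
    if not ref:
        return 1.0 if not hyp else 0.0
    matcher = SequenceMatcher(None, ref, hyp, autojunk=False)
    matched = sum(block.size for block in matcher.get_matching_blocks())
    return matched / len(ref)


def mean_accuracy(pairs):
    """Average per-image accuracy over (expected_text, recognized_text) pairs."""
    scores = [word_accuracy(expected, recognized) for expected, recognized in pairs]
    return sum(scores) / len(scores)
```

In use, each image's ground-truth text would be paired with the output of `ocr_image` for that image, and `mean_accuracy` would summarize the result over the whole test set.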

Author Biography

Kristian Adi Nugraha, Universitas Kristen Duta Wacana

Faculty of Information Technology

Published

2024-04-01