LexIndoLLM: Large Language Model untuk Konsultasi Regulasi Daerah di Indonesia

LexIndoLLM: A Large Language Model for Consulting Local Regulations in Indonesia

DOI: https://doi.org/10.24002/jbi.v17i1.14326

Keywords: Large Language Model, domain fine-tuning, Retrieval-Augmented Generation, local regulations, RAGAS

Abstract

Large Language Model (LLM) berpotensi meningkatkan akses terhadap layanan konsultasi regulasi daerah, tetapi model generik masih sering menghasilkan jawaban yang kurang akurat pada dokumen hukum Indonesia yang panjang, formal, dan kontekstual. Penelitian ini mengembangkan LexIndoLLM, model ringan berbasis Llama 3.2-1B, melalui fine-tuning bertahap pada 393 dokumen regulasi Kabupaten Kutai Kartanegara dan integrasi Retrieval-Augmented Generation (RAG) berbasis FAISS. Evaluasi dilakukan menggunakan RAGAS, perplexity, ROUGE-L, dan metrik efisiensi inferensi. Hasil menunjukkan bahwa pendekatan yang diusulkan meningkatkan kualitas jawaban, ditandai dengan penurunan perplexity dari 9,13 menjadi 1,74, peningkatan ROUGE-L dari 0,2058 menjadi 0,4429, serta nilai faithfulness 0,77 dan answer correctness 0,66. Waktu respons rata-rata di bawah 3,4 detik sehingga cocok untuk deployment lokal. Temuan ini menunjukkan bahwa model ringan yang dipadukan dengan retrieval layak digunakan untuk konsultasi regulasi daerah pada lingkungan komputasi terbatas.

 

Large Language Models (LLMs) have the potential to improve access to local regulatory consultation services, yet general-purpose models often produce inaccurate responses when handling Indonesian legal documents that are lengthy, formal, and highly contextual. This study develops LexIndoLLM, a lightweight model based on Llama 3.2-1B, through staged fine-tuning on 393 local regulatory documents from Kutai Kartanegara Regency and the integration of FAISS-based Retrieval-Augmented Generation (RAG). The model was evaluated using RAGAS, perplexity, ROUGE-L, and inference efficiency metrics. The results show that the proposed approach improves answer quality, as indicated by a reduction in perplexity from 9.13 to 1.74, an increase in ROUGE-L from 0.2058 to 0.4429, and faithfulness and answer correctness scores of 0.77 and 0.66, respectively. The system maintains an average response time under 3.4 seconds, suitable for local deployment. These findings indicate that a lightweight model combined with retrieval is feasible for local regulatory consultation in resource-constrained environments.

References

[1] T. B. Brown et al., “Language models are few-shot learners,” Adv. Neural Inf. Process. Syst., vol. 33, pp. 1877–1901, Jul. 2020.

[2] P. Colombo et al., “SaulLM-7B: A Pioneering Large Language Model for Law,” arXiv preprint arXiv:2403.03883, 2024.

[3] Badan Pusat Statistik, Statistik Indonesia 2025. Jakarta, Indonesia: Badan Pusat Statistik, 2025.

[4] J. Lai, W. Gan, J. Wu, Z. Qi, and P. S. Yu, “Large Language Models in Law: A Survey,” AI Open, vol. 5, pp. 181–196, 2024, doi: 10.1016/j.aiopen.2024.09.002.

[5] Y. Wu, C. Wang, E. Gumusel, and X. Liu, “Knowledge-Infused Legal Wisdom: Navigating LLM Consultation through the Lens of Diagnostics and Positive-Unlabeled Reinforcement Learning,” in Findings of the Association for Computational Linguistics: ACL 2024, Bangkok, Thailand, 2024, pp. 15542–15555. doi: 10.18653/v1/2024.findings-acl.918.

[6] Y. Hu, L. Gan, W. Xiao, K. Kuang, and F. Wu, “Fine-tuning Large Language Models for Improving Factuality in Legal Question Answering,” in Proceedings of the 31st International Conference on Computational Linguistics, 2025, pp. 4410–4427.

[7] S. Yue et al., “LawLLM: Intelligent Legal System with Legal Reasoning and Verifiable Retrieval,” in The 29th International Conference on Database Systems for Advanced Applications (DASFAA 2024), 2024, pp. 304–321. doi: 10.1007/978-981-97-5569-1_19.

[8] R. Dominguez-Olmedo et al., “Lawma: The Power of Specialization for Legal Annotation,” in International Conference on Learning Representations (ICLR), 2025.

[9] P. Lewis et al., “Retrieval-Augmented Generation for Knowledge-Intensive NLP Tasks,” Adv. Neural Inf. Process. Syst., vol. 33, pp. 9459–9474, 2020.

[10] N. Setiawan, M. Akbar, and A. A. T. Susilo, “Pengembangan Chatbot Untuk Layanan Informasi Keanggotaan Guru Metode Support Vector Machine,” Jurnal Buana Informatika, vol. 16, no. 2, pp. 166–175, Oct. 2025.

[11] B. Arham and Sukasih, “Sistem Tanya Jawab Berbasis Artificial Intelligence untuk Akses Informasi RANPERDA di Kabupaten Kampar,” JEKIN - Jurnal Teknik Informatika, vol. 5, no. 2, pp. 928–935, Aug. 2025, doi: 10.58794/jekin.v5i2.1632.

[12] A. Prihartono and A. U. Priantoro, “Graph RAG untuk memahami peraturan tentang pajak kendaraan bermotor di Provinsi Banten,” Al-Ihtiram: Multidisciplinary Journal of Counseling and Social Research, vol. 4, no. 1, pp. 293–300, 2025.

[13] R. Al-Qaesm, M. Hendi, and B. Tantour, “Alkafi-llama3: fine-tuning LLMs for precise legal understanding in Palestine,” Discover Artificial Intelligence, vol. 5, no. 1, p. 107, Jun. 2025, doi: 10.1007/s44163-025-00313-w.

[14] M. Douze et al., “The Faiss Library,” IEEE Trans. Big Data, vol. 12, no. 2, pp. 346–361, Apr. 2026, doi: 10.1109/TBDATA.2025.3618474.

[15] H. Soudani, E. Kanoulas, and F. Hasibi, “Fine Tuning vs. Retrieval Augmented Generation for Less Popular Knowledge,” in SIGIR-AP 2024 - Proceedings of the 2024 Annual International ACM SIGIR Conference on Research and Development in Information Retrieval in the Asia Pacific Region, Tokyo, Japan: Association for Computing Machinery, Inc, Dec. 2024, pp. 12–22. doi: 10.1145/3673791.3698415.

[16] K. Seo and T. Utsuro, “RAG based Question Answering of Korean Laws and Precedents,” in Proceedings of the Eighth Fact Extraction and VERification Workshop (FEVER), 2025, pp. 91–100. doi: 10.18653/v1/2025.fever-1.7.

[17] P. Martín-Chozas, P. Calleja, and C. R. Limón, “Terminology Enhanced Retrieval Augmented Generation for Spanish Legal Corpora,” in Proceedings of the 5th Conference on Language, Data and Knowledge, 2025, pp. 147–152.

[18] A. Fadillah, N. Athahirah, and K.-T. Lai, “LawRAG: Indonesian legal document retrieval-augmented generation with specialized chunking and reranking strategies,” Data Technologies and Applications, vol. 60, no. 2, pp. 330–347, Apr. 2026, doi: 10.1108/DTA-03-2025-0195.

[19] S. Es, J. James, L. Espinosa-Anke, and S. Schockaert, “RAGAs: Automated Evaluation of Retrieval Augmented Generation,” in Proceedings of the 18th Conference of the European Chapter of the Association for Computational Linguistics: System Demonstrations, 2024, pp. 150–158. doi: 10.18653/v1/2024.eacl-demo.16.

[20] E. J. Hu et al., “LoRA: Low-Rank Adaptation of Large Language Models,” in International Conference on Learning Representations (ICLR), 2022.

[21] T. Dettmers, A. Pagnoni, A. Holtzman, and L. Zettlemoyer, “QLoRA: Efficient Finetuning of Quantized LLMs,” in Advances in Neural Information Processing Systems, 2023. doi: 10.5555/3666122.3666563.

[22] F. Kautsar et al., “Transfer Learning Menggunakan LoRA+ pada Llama 3.2 untuk Percakapan Bahasa Indonesia,” Techno.Com, vol. 24, no. 2, pp. 332–343, May 2025, doi: 10.62411/tc.v24i2.12508.

[23] T. Wolf et al., “Transformers: State-of-the-Art Natural Language Processing,” in Proceedings of the 2020 Conference on Empirical Methods in Natural Language Processing: System Demonstrations, 2020, pp. 38–45. doi: 10.18653/v1/2020.emnlp-demos.6.

[24] L. Wang, N. Yang, X. Huang, L. Yang, R. Majumder, and F. Wei, “Improving Text Embeddings with Large Language Models,” in Proceedings of the 62nd Annual Meeting of the Association for Computational Linguistics, Bangkok, Thailand, 2024, pp. 11897–11916. doi: 10.18653/v1/2024.acl-long.642.

[25] H. Lijaya, P. Ho, and H. Santoso, “Comparative Analysis of RAG-Based Open-Source LLMs for Indonesian Banking Customer Service Optimization Using Simulated Data,” Jurnal Sisfokom (Sistem Informasi dan Komputer), vol. 14, no. 3, pp. 330–341, Jul. 2025, doi: 10.32736/sisfokom.v14i3.2383.

Published

2026-04-27