Causal and Masked Language Modeling of Javanese Language using Transformer-based Architectures
Published in 2021 International Conference on Advanced Computer Science and Information Systems (ICACSIS), 2021
Abstract
Most natural language understanding breakthroughs occur in widely spoken languages, while low-resource languages are rarely examined. We pre-trained and compared different Transformer-based architectures on the Javanese language. They were trained on causal and masked language modeling tasks, with Javanese Wikipedia documents as the corpus, and can then be fine-tuned on downstream natural language understanding tasks. To speed up pre-training, we transferred English word embeddings, utilized gradual unfreezing of layers, and applied discriminative fine-tuning. We further fine-tuned our models to classify binary movie reviews and found that they were on par with multilingual/cross-lingual Transformers. We release our pre-trained models for others to use, in hopes of encouraging other researchers to work on low-resource languages like Javanese.
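As a concrete illustration of two of the pre-training tricks named above, the sketch below shows gradual unfreezing and discriminative fine-tuning applied to a generic Hugging Face masked language model. This is not the paper's code: the model identifier, layer-decay factor, and learning rates are illustrative assumptions, not values from the paper.

# A minimal sketch (not the authors' implementation) of gradual unfreezing
# and discriminative fine-tuning for a Hugging Face masked-LM.
# The checkpoint name and hyperparameters below are assumptions.
import torch
from transformers import AutoModelForMaskedLM

model = AutoModelForMaskedLM.from_pretrained("bert-base-multilingual-cased")

def freeze_all_but_head(model):
    # Gradual unfreezing, step 0: freeze the whole encoder, train only the LM head.
    for param in model.base_model.parameters():
        param.requires_grad = False

def unfreeze_top_layers(model, n):
    # Unfreeze the top n encoder layers; call with n+1 after each epoch
    # so layers thaw from the top down.
    layers = model.base_model.encoder.layer
    for layer in list(layers)[-n:]:
        for param in layer.parameters():
            param.requires_grad = True

def layerwise_param_groups(model, base_lr=2e-5, decay=0.9):
    # Discriminative fine-tuning: the top layer gets base_lr, and each layer
    # below it is scaled down by `decay`, so earlier layers change more slowly.
    # (Embedding and LM-head parameter groups are omitted for brevity.)
    layers = list(model.base_model.encoder.layer)
    return [
        {"params": layer.parameters(),
         "lr": base_lr * decay ** (len(layers) - 1 - i)}
        for i, layer in enumerate(layers)
    ]

freeze_all_but_head(model)
unfreeze_top_layers(model, n=1)  # epoch 0: only the top layer trains
optimizer = torch.optim.AdamW(layerwise_param_groups(model))

In a training loop, one would call unfreeze_top_layers with a growing n at each epoch boundary and rebuild or reuse the optimizer groups; the decayed per-layer learning rates mean the freshly unfrozen lower layers move cautiously while the top layers adapt fastest.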
BibTeX Citation
@INPROCEEDINGS{9631331,
  author    = {Wongso, Wilson and Setiawan, David Samuel and Suhartono, Derwin},
  booktitle = {2021 International Conference on Advanced Computer Science and Information Systems (ICACSIS)},
  title     = {Causal and Masked Language Modeling of Javanese Language using Transformer-based Architectures},
  year      = {2021},
  pages     = {1--7},
  doi       = {10.1109/ICACSIS53237.2021.9631331}
}
Recommended citation: W. Wongso, D. S. Setiawan and D. Suhartono, "Causal and Masked Language Modeling of Javanese Language using Transformer-based Architectures," 2021 International Conference on Advanced Computer Science and Information Systems (ICACSIS), Depok, Indonesia, 2021, pp. 1-7, doi: 10.1109/ICACSIS53237.2021.9631331.
Download Paper