Causal and Masked Language Modeling of Javanese Language using Transformer-based Architectures

Published in 2021 International Conference on Advanced Computer Science and Information Systems (ICACSIS), 2021

Recommended citation: W. Wongso, D. S. Setiawan and D. Suhartono, "Causal and Masked Language Modeling of Javanese Language using Transformer-based Architectures," 2021 International Conference on Advanced Computer Science and Information Systems (ICACSIS), Depok, Indonesia, 2021, pp. 1-7, doi: 10.1109/ICACSIS53237.2021.9631331. https://ieeexplore.ieee.org/abstract/document/9631331

Abstract

Most natural language understanding breakthroughs occur in widely spoken languages, while low-resource languages are rarely examined. We pre-trained and compared several Transformer-based architectures on the Javanese language. They were trained on causal and masked language modeling tasks, with Javanese Wikipedia documents as the corpus, and can then be fine-tuned for downstream natural language understanding tasks. To speed up pre-training, we transferred English word embeddings, utilized gradual unfreezing of layers, and applied discriminative fine-tuning. We further fine-tuned our models to classify binary movie reviews and found that they were on par with multilingual/cross-lingual Transformers. We release our pre-trained models for others to use, in the hope of encouraging other researchers to work on low-resource languages like Javanese.
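
The fine-tuning recipe mentioned in the abstract (gradual unfreezing of layers plus discriminative fine-tuning, i.e. per-layer learning rates) can be sketched with the Hugging Face `transformers` API. The sketch below is illustrative only, not the paper's exact code: the checkpoint name `javanese-bert-small` is a placeholder, and the attribute paths (`model.bert`, `model.classifier`) assume a BERT-style sequence-classification model.

```python
import torch
from transformers import AutoModelForSequenceClassification, AutoTokenizer

# Placeholder checkpoint name (an assumption): substitute the actual
# released Javanese checkpoint when reproducing the paper.
MODEL_NAME = "javanese-bert-small"

tokenizer = AutoTokenizer.from_pretrained(MODEL_NAME)
model = AutoModelForSequenceClassification.from_pretrained(MODEL_NAME, num_labels=2)

# Layer groups ordered from input (embeddings) to output (top encoder layer).
layer_groups = [model.bert.embeddings] + list(model.bert.encoder.layer)

# Discriminative fine-tuning: each group closer to the input gets a
# geometrically smaller learning rate than the group above it.
base_lr, decay = 2e-5, 0.95
param_groups = [{"params": model.classifier.parameters(), "lr": base_lr}]
for depth, group in enumerate(reversed(layer_groups), start=1):
    param_groups.append({"params": group.parameters(), "lr": base_lr * decay**depth})
optimizer = torch.optim.AdamW(param_groups)

# Gradual unfreezing: freeze everything except the classification head,
# then unfreeze one more group from the top at each training stage.
for p in model.parameters():
    p.requires_grad = False
for p in model.classifier.parameters():
    p.requires_grad = True

def unfreeze_top(n_groups: int) -> None:
    """Make the `n_groups` top-most layer groups trainable."""
    for group in list(reversed(layer_groups))[:n_groups]:
        for p in group.parameters():
            p.requires_grad = True

# e.g. stage 1 trains only the head; stage 2 also trains the top encoder layer.
unfreeze_top(1)
```

In this scheme, each stage trains for a few epochs before the next group is unfrozen, so the randomly initialized classification head adapts before the pre-trained layers are perturbed.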

BibTeX Citation

@INPROCEEDINGS{9631331,
  author = {Wongso, Wilson and Setiawan, David Samuel and Suhartono, Derwin},
  booktitle = {2021 International Conference on Advanced Computer Science and Information Systems (ICACSIS)}, 
  title = {Causal and Masked Language Modeling of Javanese Language using Transformer-based Architectures}, 
  year = {2021},
  pages = {1-7},
  doi = {10.1109/ICACSIS53237.2021.9631331}
}