Training a better GPT model: Learnings from PaLM

Revisiting the model released in 2018

Anuj Arora
Published in ML@ABEJA
6 min read · Jun 16, 2022


Since the release of GPT-2, state-of-the-art NLP has come a long way. At ABEJA, we tackle our clients’ use cases, solving their business AI needs with state-of-the-art practices. Until now, we used open-source language models, but we found them lacking for Japanese. Hence, we decided to train our own GPT-2 model on a Japanese-language corpus.

However, in order not to waste the progress made since the original release, we decided to follow the best…
