Hi Marcus, thanks for your clap!
I haven't run into this issue before, though I did find that fp16=True is the default.
In our experience, a BERT model can be trained for 1 epoch in less than a minute on 500 records with a single RTX 2080 Ti.
My advice is to either debug this TypeError or switch to a smaller model (bert-small, albert).