FinBERT: A Pretrained Language Model for Financial Communications
Paper • 2006.08097 • Published
How to use philschmid/finbert-pretrain-yiyanghkust with Transformers:
# Use a pipeline as a high-level helper
from transformers import pipeline
pipe = pipeline("fill-mask", model="philschmid/finbert-pretrain-yiyanghkust")

# Load model directly
from transformers import AutoTokenizer, AutoModelForMaskedLM
tokenizer = AutoTokenizer.from_pretrained("philschmid/finbert-pretrain-yiyanghkust")
model = AutoModelForMaskedLM.from_pretrained("philschmid/finbert-pretrain-yiyanghkust")

All credits to @yiyanghkust.
I added the TensorFlow model and a proper tokenizer.json.
FinBERT is a BERT model pre-trained on financial communication text, intended to support financial NLP research and practice. It was trained on three financial communication corpora totaling 4.9B tokens.
More details on FinBERT's pre-training process can be found at: https://arxiv.org/abs/2006.08097
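Because this checkpoint is a pretrained masked language model, one quick sanity check is to have it fill in a masked token in a finance-flavored sentence. The following is a minimal sketch, not part of the original card: the helper names `summarize_fill_mask` and `run_demo` and the example sentence are illustrative assumptions, and no particular predictions are implied.

```python
from __future__ import annotations

from typing import Any


def summarize_fill_mask(results: list[dict[str, Any]], top_k: int = 3) -> list[tuple[str, float]]:
    """Return the top_k (token, score) pairs from fill-mask pipeline output.

    `results` is expected to be the list of dicts the fill-mask pipeline
    returns for a single input with one mask (keys "token_str" and "score").
    """
    ranked = sorted(results, key=lambda r: r["score"], reverse=True)
    return [(r["token_str"].strip(), round(float(r["score"]), 4)) for r in ranked[:top_k]]


def run_demo(template: str = "The company reported a net {mask} for the quarter.") -> list[tuple[str, float]]:
    """Download the checkpoint and fill the masked token (hypothetical helper)."""
    # Imported lazily so summarize_fill_mask can be used without transformers installed.
    from transformers import pipeline

    fill_mask = pipeline("fill-mask", model="philschmid/finbert-pretrain-yiyanghkust")
    # Use the tokenizer's own mask token instead of hard-coding it.
    text = template.format(mask=fill_mask.tokenizer.mask_token)
    return summarize_fill_mask(fill_mask(text))
```

Calling `run_demo()` downloads the model on first use and returns the highest-scoring candidate tokens with their probabilities.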
FinBERT can be further fine-tuned on downstream tasks. Specifically, we have fine-tuned FinBERT on an analyst sentiment classification task, and the fine-tuned model is shared at https://huggingface.co/yiyanghkust/finbert-tone
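To fine-tune on your own downstream classification task, a common starting point is to load the pretrained weights into a sequence-classification model with a freshly initialized head. The sketch below shows only that setup step. The three label names and the helper names `build_label_maps` and `load_for_classification` are assumptions for illustration, not the labels or procedure used by the authors.

```python
from __future__ import annotations


def build_label_maps(labels: list[str]) -> tuple[dict[int, str], dict[str, int]]:
    """Build the id2label / label2id mappings stored in the model config."""
    id2label = {i: name for i, name in enumerate(labels)}
    label2id = {name: i for i, name in id2label.items()}
    return id2label, label2id


def load_for_classification(labels: list[str]):
    """Load FinBERT with a new, untrained classification head (hypothetical helper)."""
    # Imported lazily so build_label_maps can be used without transformers installed.
    from transformers import AutoModelForSequenceClassification, AutoTokenizer

    checkpoint = "philschmid/finbert-pretrain-yiyanghkust"
    id2label, label2id = build_label_maps(labels)
    tokenizer = AutoTokenizer.from_pretrained(checkpoint)
    model = AutoModelForSequenceClassification.from_pretrained(
        checkpoint,
        num_labels=len(labels),
        id2label=id2label,
        label2id=label2id,
    )
    return tokenizer, model


# Assumed label set for illustration only; replace with your task's labels.
EXAMPLE_LABELS = ["neutral", "positive", "negative"]
```

Calling `load_for_classification(EXAMPLE_LABELS)` returns a tokenizer and a model whose classification head is randomly initialized, so it needs training on labeled data before its predictions are meaningful.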