# Model Card for gefero/test-bert-sentiment-spanish-wwm-cased
This is a test model for an introductory NLP course.
## Model Details

### Model Description
This model is a fine-tuned version of dccuchile/bert-base-spanish-wwm-cased for sentiment analysis in Spanish. It classifies text into two categories: 'Positive' and 'Negative'. This model card has been automatically generated.
- Developed by: Germán Rosati (for an introductory NLP course)
- Model type: Sequence Classification (Sentiment Analysis)
- Language(s) (NLP): Spanish
- License: Apache-2.0
- Finetuned from model: `dccuchile/bert-base-spanish-wwm-cased`
### Model Sources
- Course Page: https://gefero.github.io/ecyt_lcd_intro_nlp/
## Uses

### Direct Use
This model is intended for direct use in classifying Spanish text into 'Positive' or 'Negative' sentiment. It is suitable for academic exploration, rapid prototyping, and use cases where a pre-trained and fine-tuned sentiment analysis model for Spanish is required.
### Out-of-Scope Use
The model was trained on Amazon reviews and may not generalize perfectly to other domains or highly specialized language. It is not suitable for critical applications without further domain-specific fine-tuning and rigorous evaluation. It does not handle nuances like sarcasm or complex sentiment expressions.
## Bias, Risks, and Limitations
The model's performance is dependent on the biases present in the original Amazon reviews dataset. It might reflect societal biases present in the training data, and its performance might vary across different demographic groups or types of Spanish dialects.
### Recommendations
Users (both direct and downstream) should be made aware of the risks, biases and limitations of the model. Further testing with diverse datasets and domains is recommended before deployment in production environments.
## How to Get Started with the Model
Use the code below to get started with the model.
```python
from transformers import pipeline

classifier = pipeline(
    "sentiment-analysis",
    model="gefero/test-bert-sentiment-spanish-wwm-cased",
)

text1 = "Me encanta este producto, es excelente."          # "I love this product, it's excellent."
text2 = "No me gusta nada, es una basura."                 # "I don't like it at all, it's garbage."
text3 = "Este es un producto promedio, ni bueno ni malo."  # "This is an average product, neither good nor bad."

print(classifier(text1))
print(classifier(text2))
print(classifier(text3))
```
## Training Details

### Training Data
The model was fine-tuned on a subset of the Amazon reviews dataset, specifically for Spanish reviews. The dataset was preprocessed to remove neutral sentiments, resulting in 12,000 samples (6,000 Positive, 6,000 Negative) for training, validation, and testing.
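As a rough illustration, the balancing step described above can be sketched with pandas. This is a hypothetical reconstruction, not the author's actual code: the column names (`stars`, `label`), the 3-star neutral cut-off, and the random seed are all assumptions.

```python
import pandas as pd

def make_balanced_binary(df: pd.DataFrame, n_per_class: int, seed: int = 42) -> pd.DataFrame:
    """Drop neutral reviews, derive binary labels, and sample a balanced subset."""
    df = df[df["stars"] != 3].copy()  # remove neutral (3-star) reviews
    df["label"] = df["stars"].apply(lambda s: "Positive" if s > 3 else "Negative")
    # Sample up to n_per_class rows from each label group.
    parts = [
        group.sample(n=min(n_per_class, len(group)), random_state=seed)
        for _, group in df.groupby("label")
    ]
    return pd.concat(parts).reset_index(drop=True)
```

With `n_per_class=6000` on the Spanish Amazon reviews, this would yield the 12,000-sample balanced set mentioned above.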
### Training Procedure

#### Preprocessing
The text data underwent preprocessing including converting to lowercase, replacing punctuation with spaces, replacing numbers with 'DIGITO', and handling non-ASCII characters. The text was then tokenized using dccuchile/bert-base-spanish-wwm-cased's tokenizer, with truncation and padding to a max_length of 128. Labels were mapped from 'Negative' and 'Positive' to numerical 0 and 1.
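A minimal reimplementation of the text-cleaning steps described above might look like the following; the exact regular expressions and the accent-stripping strategy for non-ASCII characters are assumptions, not the author's original code.

```python
import re
import unicodedata

def preprocess(text: str) -> str:
    """Lowercase, map digits to 'DIGITO', drop punctuation, and strip accents."""
    text = text.lower()
    # Replace any run of digits with the placeholder token (after lowercasing,
    # so the token itself stays uppercase).
    text = re.sub(r"\d+", "DIGITO", text)
    # Replace punctuation with spaces.
    text = re.sub(r"[^\w\s]", " ", text)
    # One way to "handle non-ASCII characters": decompose accents and drop them.
    text = unicodedata.normalize("NFKD", text).encode("ascii", "ignore").decode("ascii")
    # Collapse repeated whitespace.
    return re.sub(r"\s+", " ", text).strip()
```

The cleaned text would then be passed to the `dccuchile/bert-base-spanish-wwm-cased` tokenizer with `truncation=True, padding="max_length", max_length=128`.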
#### Training Hyperparameters
- Learning Rate: 2e-5
- Batch Size (per device): 16
- Number of Epochs: 5
- Weight Decay: 0.01
- Warmup Ratio: 0.1 (fraction of total training steps used for learning-rate warmup)
- Max Sequence Length: 128
- Evaluation Strategy: Epoch
- Save Strategy: Epoch
- Metric for Best Model: F1-score
Training regime:
- Default Precision: fp32 (no mixed precision specified in training arguments).
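The hyperparameters above map onto Hugging Face `TrainingArguments` roughly as follows. This is a sketch, not the author's actual configuration; in particular, `output_dir` is a placeholder.

```python
from transformers import TrainingArguments

training_args = TrainingArguments(
    output_dir="bert-sentiment-spanish",  # hypothetical output path
    learning_rate=2e-5,
    per_device_train_batch_size=16,
    per_device_eval_batch_size=16,
    num_train_epochs=5,
    weight_decay=0.01,
    warmup_ratio=0.1,        # 10% of total training steps used for warmup
    eval_strategy="epoch",   # named `evaluation_strategy` in older transformers versions
    save_strategy="epoch",
    load_best_model_at_end=True,
    metric_for_best_model="f1",
)
```

No `fp16`/`bf16` flag is set, matching the default fp32 precision noted above.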
#### Speeds, Sizes, Times
- Total Training Time: Approximately 477.56 seconds (around 7.96 minutes).
- Training Samples per Second: 15.076
- Total Evaluation Time: Approximately 18.53 seconds.
- Evaluation Samples per Second: 129.489
- Checkpoint Size: the `model.safetensors` file is saved as part of the final model artifact.
## Evaluation

### Testing Data, Factors & Metrics

#### Testing Data
The model was evaluated on a test set consisting of 2,400 samples (1,200 Positive, 1,200 Negative) derived from the Amazon reviews dataset.
#### Metrics
Accuracy, Precision, Recall, and F1-score were used to evaluate the model's performance, with F1-score being the primary metric for selecting the best model during training.
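These metrics are typically supplied to the Hugging Face `Trainer` through a `compute_metrics` callback; a scikit-learn sketch is shown below. The averaging mode (`macro`) is an assumption, since the card does not state how precision and recall were aggregated across the two classes.

```python
import numpy as np
from sklearn.metrics import accuracy_score, precision_recall_fscore_support

def compute_metrics(eval_pred):
    """Compute accuracy, precision, recall, and F1 from Trainer predictions."""
    logits, labels = eval_pred
    preds = np.argmax(logits, axis=-1)  # pick the higher-scoring class
    precision, recall, f1, _ = precision_recall_fscore_support(
        labels, preds, average="macro"
    )
    return {
        "accuracy": accuracy_score(labels, preds),
        "precision": precision,
        "recall": recall,
        "f1": f1,
    }
```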
### Results

#### Summary
On the test set, the model achieved the following performance:
- Accuracy: 0.9225
- Precision: 0.9232
- Recall: 0.9217
- F1-score: 0.9224
## Environmental Impact
- Hardware Type: GPU (Colab environment)
- Hours used: Approximately 0.13 hours for training (7.96 minutes)
- Cloud Provider: Google Cloud (Colab)
- Compute Region: [More Information Needed] (typically a region where Colab is hosted, e.g., US-Central1)
- Carbon Emitted: [More Information Needed]
## Technical Specifications

### Model Architecture and Objective
The model's architecture is based on BERT (Bidirectional Encoder Representations from Transformers), specifically dccuchile/bert-base-spanish-wwm-cased. It is a sequence classification model, meaning its objective is to classify input text sequences into predefined categories (Positive/Negative sentiment).
### Compute Infrastructure

#### Hardware
Training was performed on a GPU provided by Google Colab (specific model not explicitly stated, typically NVIDIA Tesla T4 or V100).
#### Software
Training was conducted in Python within the Google Colab environment, using the Hugging Face transformers library together with pandas, scikit-learn, torch, and datasets.
## Model Card Authors
Germán Rosati
## Model Card Contact
Germán Rosati (grosati@unsam.edu.ar)