# 12b7eea334e195972fda879fccaa3d4a
This model is a fine-tuned version of google/umt5-base on the Helsinki-NLP/opus_books [de-fr] dataset. It achieves the following results on the evaluation set:
- Loss: 1.7793
- Data Size: 1.0
- Epoch Runtime: 205.0323 s
- BLEU: 9.2706
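The sketch below shows one way to load this checkpoint for German-to-French translation. It assumes the repository id above and that fine-tuning used raw German source sentences with no task prefix; adjust the input format if a prefix was used.

```python
# Minimal usage sketch (assumptions: the checkpoint id below is the
# hosted repo, and no task prefix was used during fine-tuning).
from transformers import AutoModelForSeq2SeqLM, AutoTokenizer

model_id = "contemmcm/12b7eea334e195972fda879fccaa3d4a"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForSeq2SeqLM.from_pretrained(model_id)

inputs = tokenizer("Das Buch liegt auf dem Tisch.", return_tensors="pt")
outputs = model.generate(**inputs, max_new_tokens=64, num_beams=4)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```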
## Model description
More information needed
## Intended uses & limitations
More information needed
## Training and evaluation data
More information needed
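While the data preparation is not documented, the summary above names the corpus. A minimal, hedged loading sketch follows; note that opus_books ships only a `train` split, so the evaluation set was presumably carved out separately (the split ratio below is an assumption).

```python
# Hedged sketch: load the de-fr pair of the named corpus and derive
# an evaluation split. The 10% test_size is an assumption, not a
# documented detail of this run.
from datasets import load_dataset

raw = load_dataset("Helsinki-NLP/opus_books", "de-fr")
splits = raw["train"].train_test_split(test_size=0.1, seed=42)
print(splits["train"][0]["translation"])  # {'de': '...', 'fr': '...'}
```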
## Training procedure

### Training hyperparameters
The following hyperparameters were used during training (a `Seq2SeqTrainingArguments` sketch reproducing them follows the list):
- learning_rate: 5e-05
- train_batch_size: 8
- eval_batch_size: 8
- seed: 42
- distributed_type: multi-GPU
- num_devices: 4
- total_train_batch_size: 32
- total_eval_batch_size: 32
- optimizer: adamw_torch with betas=(0.9, 0.999), epsilon=1e-08, and no additional optimizer arguments
- lr_scheduler_type: constant
- num_epochs: 50
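As a hedged sketch, the listed values map onto `Seq2SeqTrainingArguments` roughly as below; `output_dir` and `predict_with_generate` are illustrative assumptions, not taken from the original run.

```python
# Sketch of training arguments mirroring the reported hyperparameters.
from transformers import Seq2SeqTrainingArguments

training_args = Seq2SeqTrainingArguments(
    output_dir="umt5-base-opus-books-de-fr",  # assumed name
    learning_rate=5e-5,
    per_device_train_batch_size=8,   # x 4 GPUs = total batch size 32
    per_device_eval_batch_size=8,
    seed=42,
    lr_scheduler_type="constant",
    num_train_epochs=50,
    optim="adamw_torch",
    adam_beta1=0.9,
    adam_beta2=0.999,
    adam_epsilon=1e-8,
    predict_with_generate=True,      # assumption: needed for BLEU eval
)
```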
### Training results
| Training Loss | Epoch | Step | Validation Loss | Data Size | Epoch Runtime (s) | BLEU |
|---|---|---|---|---|---|---|
| No log | 0 | 0 | 11.1853 | 0 | 16.8775 | 0.0954 |
| No log | 1 | 872 | 10.7783 | 0.0078 | 18.1996 | 0.1123 |
| No log | 2 | 1744 | 10.1798 | 0.0156 | 20.4251 | 0.1555 |
| 0.2703 | 3 | 2616 | 9.7975 | 0.0312 | 23.4899 | 0.1377 |
| 0.8809 | 4 | 3488 | 7.7114 | 0.0625 | 29.8361 | 0.2158 |
| 9.1948 | 5 | 4360 | 4.6778 | 0.125 | 41.7857 | 0.9647 |
| 4.2417 | 6 | 5232 | 2.7966 | 0.25 | 64.4291 | 2.8274 |
| 3.2175 | 7 | 6104 | 2.4069 | 0.5 | 110.3889 | 4.3369 |
| 2.8296 | 8 | 6976 | 2.1887 | 1.0 | 201.2492 | 5.4576 |
| 2.6175 | 9 | 7848 | 2.0855 | 1.0 | 201.5218 | 6.0483 |
| 2.5123 | 10 | 8720 | 2.0231 | 1.0 | 203.7766 | 6.5073 |
| 2.4066 | 11 | 9592 | 1.9783 | 1.0 | 204.0477 | 6.8182 |
| 2.2775 | 12 | 10464 | 1.9493 | 1.0 | 205.4439 | 7.0689 |
| 2.2363 | 13 | 11336 | 1.9083 | 1.0 | 204.6456 | 7.2838 |
| 2.1466 | 14 | 12208 | 1.8907 | 1.0 | 204.6194 | 7.4938 |
| 2.1032 | 15 | 13080 | 1.8707 | 1.0 | 204.2757 | 7.6754 |
| 2.0502 | 16 | 13952 | 1.8485 | 1.0 | 206.7366 | 7.8382 |
| 1.9568 | 17 | 14824 | 1.8301 | 1.0 | 207.7873 | 7.9788 |
| 1.9315 | 18 | 15696 | 1.8225 | 1.0 | 203.7663 | 8.1180 |
| 1.9148 | 19 | 16568 | 1.8116 | 1.0 | 205.0227 | 8.2364 |
| 1.8543 | 20 | 17440 | 1.7997 | 1.0 | 201.9450 | 8.3445 |
| 1.8677 | 21 | 18312 | 1.7845 | 1.0 | 203.1910 | 8.4587 |
| 1.8084 | 22 | 19184 | 1.7864 | 1.0 | 202.1538 | 8.5304 |
| 1.7365 | 23 | 20056 | 1.7742 | 1.0 | 204.7126 | 8.5680 |
| 1.7111 | 24 | 20928 | 1.7745 | 1.0 | 202.5110 | 8.6804 |
| 1.6884 | 25 | 21800 | 1.7700 | 1.0 | 206.2581 | 8.7424 |
| 1.668 | 26 | 22672 | 1.7731 | 1.0 | 203.0778 | 8.8157 |
| 1.631 | 27 | 23544 | 1.7631 | 1.0 | 201.7815 | 8.9063 |
| 1.5975 | 28 | 24416 | 1.7713 | 1.0 | 204.1630 | 8.9118 |
| 1.5381 | 29 | 25288 | 1.7637 | 1.0 | 202.4205 | 8.9410 |
| 1.5759 | 30 | 26160 | 1.7693 | 1.0 | 203.7057 | 8.9786 |
| 1.5469 | 31 | 27032 | 1.7613 | 1.0 | 202.3730 | 9.0594 |
| 1.4868 | 32 | 27904 | 1.7612 | 1.0 | 203.5100 | 9.0807 |
| 1.4895 | 33 | 28776 | 1.7609 | 1.0 | 203.5876 | 9.1370 |
| 1.4205 | 34 | 29648 | 1.7661 | 1.0 | 202.8884 | 9.2094 |
| 1.4164 | 35 | 30520 | 1.7723 | 1.0 | 202.6136 | 9.2228 |
| 1.4079 | 36 | 31392 | 1.7753 | 1.0 | 202.4356 | 9.2698 |
| 1.3664 | 37 | 32264 | 1.7793 | 1.0 | 205.0323 | 9.2706 |
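The card does not state which BLEU implementation produced the scores above; a common choice is sacrebleu via the `evaluate` library, sketched below, so exact numbers may differ from this setup.

```python
# Hedged sketch: corpus-level BLEU with sacrebleu via `evaluate`.
# The example prediction/reference pair is illustrative only.
import evaluate

bleu = evaluate.load("sacrebleu")
predictions = ["Le livre est sur la table."]        # model outputs
references = [["Le livre est posé sur la table."]]  # one list per prediction
print(bleu.compute(predictions=predictions, references=references)["score"])
```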
### Framework versions
- Transformers 4.57.0
- Pytorch 2.8.0+cu128
- Datasets 4.2.0
- Tokenizers 0.22.1