95e6fb0b421155557131cb6416f81fc9

This model is a fine-tuned version of google/umt5-base on the Helsinki-NLP/opus_books [en-es] dataset. It achieves the following results on the evaluation set:

  • Loss: 1.9183
  • Data Size: 1.0
  • Epoch Runtime: 556.8716
  • BLEU: 9.3177
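For quick sanity checks, the checkpoint can be loaded with the standard transformers seq2seq classes, since umT5 is an ordinary encoder-decoder model. This is a minimal sketch, not a documented inference recipe: the repository id is taken from the model tree at the bottom of this card, and the beam-search decoding settings are assumptions.

```python
# Minimal inference sketch. Assumptions: the checkpoint is published as
# "contemmcm/95e6fb0b421155557131cb6416f81fc9" (per the model tree below),
# and beam search with 4 beams is a reasonable default for translation.
from transformers import AutoTokenizer, AutoModelForSeq2SeqLM

model_id = "contemmcm/95e6fb0b421155557131cb6416f81fc9"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForSeq2SeqLM.from_pretrained(model_id)

# English source sentence; the model was fine-tuned for en->es translation.
inputs = tokenizer(
    "The old man closed the book and looked at the sea.",
    return_tensors="pt",
)
outputs = model.generate(**inputs, max_new_tokens=64, num_beams=4)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```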

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed
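The dataset itself is named in the summary above (Helsinki-NLP/opus_books, en-es configuration). A hedged loading sketch follows; note that opus_books ships only a train split, so the 90/10 split below is an assumption rather than the split actually used for the results on this card.

```python
# Sketch of loading the dataset named above. Assumption: a manual 90/10
# train/eval split, since opus_books provides only a "train" split and the
# card does not document how its evaluation set was carved out.
from datasets import load_dataset

raw = load_dataset("Helsinki-NLP/opus_books", "en-es")
split = raw["train"].train_test_split(test_size=0.1, seed=42)
print(split["train"][0]["translation"])  # {'en': '...', 'es': '...'}
```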

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

  • learning_rate: 5e-05
  • train_batch_size: 8
  • eval_batch_size: 8
  • seed: 42
  • distributed_type: multi-GPU
  • num_devices: 4
  • total_train_batch_size: 32
  • total_eval_batch_size: 32
  • optimizer: adamw_torch with betas=(0.9, 0.999), epsilon=1e-08, and no additional optimizer arguments
  • lr_scheduler_type: constant
  • num_epochs: 50
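For readers who want to approximate this configuration, the sketch below maps the listed values onto transformers' Seq2SeqTrainingArguments. The output directory and the predict_with_generate flag are assumptions, and the progressive data-size schedule visible in the results table below is not reproduced here.

```python
# Hedged reconstruction of the training setup from the hyperparameters above.
# Assumptions: output_dir is a placeholder name, and predict_with_generate is
# inferred from the fact that BLEU was computed during evaluation.
from transformers import Seq2SeqTrainingArguments

training_args = Seq2SeqTrainingArguments(
    output_dir="umt5-base-opus-books-en-es",  # assumed name
    learning_rate=5e-5,
    per_device_train_batch_size=8,   # x4 GPUs -> total train batch size 32
    per_device_eval_batch_size=8,    # x4 GPUs -> total eval batch size 32
    seed=42,
    optim="adamw_torch",
    adam_beta1=0.9,
    adam_beta2=0.999,
    adam_epsilon=1e-8,
    lr_scheduler_type="constant",
    num_train_epochs=50,
    predict_with_generate=True,      # assumption: needed for BLEU at eval time
)
```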

Training results

| Training Loss | Epoch | Step  | Validation Loss | Data Size | Epoch Runtime | BLEU   |
|:-------------:|:-----:|:-----:|:---------------:|:---------:|:-------------:|:------:|
| No log        | 0     | 0     | 11.2254         | 0         | 43.5460       | 0.1240 |
| No log        | 1     | 2336  | 11.3814         | 0.0078    | 48.2702       | 0.1357 |
| 0.2569        | 2     | 4672  | 11.0145         | 0.0156    | 51.8513       | 0.0612 |
| 0.3397        | 3     | 7008  | 7.3138          | 0.0312    | 59.5372       | 0.2779 |
| 5.2549        | 4     | 9344  | 3.5021          | 0.0625    | 74.6569       | 6.6485 |
| 3.7693        | 5     | 11680 | 2.7844          | 0.125     | 105.4480      | 4.3756 |
| 3.2656        | 6     | 14016 | 2.5733          | 0.25      | 169.3362      | 5.1932 |
| 3.0231        | 7     | 16352 | 2.4195          | 0.5       | 287.3175      | 5.9284 |
| 2.7689        | 8     | 18688 | 2.2880          | 1.0       | 533.6306      | 6.7111 |
| 2.5939        | 9     | 21024 | 2.1946          | 1.0       | 530.1038      | 7.1317 |
| 2.4848        | 10    | 23360 | 2.1425          | 1.0       | 525.1951      | 7.4805 |
| 2.4079        | 11    | 25696 | 2.0975          | 1.0       | 538.2567      | 7.7418 |
| 2.2988        | 12    | 28032 | 2.0706          | 1.0       | 539.8792      | 7.9399 |
| 2.2335        | 13    | 30368 | 2.0410          | 1.0       | 530.4865      | 8.1118 |
| 2.2005        | 14    | 32704 | 2.0216          | 1.0       | 541.4448      | 8.2441 |
| 2.1301        | 15    | 35040 | 2.0048          | 1.0       | 535.1475      | 8.3626 |
| 2.1463        | 16    | 37376 | 1.9886          | 1.0       | 537.2098      | 8.5055 |
| 2.0191        | 17    | 39712 | 1.9742          | 1.0       | 555.6616      | 8.5951 |
| 2.033         | 18    | 42048 | 1.9648          | 1.0       | 556.6508      | 8.6929 |
| 1.9855        | 19    | 44384 | 1.9515          | 1.0       | 556.0582      | 8.7545 |
| 1.9422        | 20    | 46720 | 1.9380          | 1.0       | 557.8315      | 8.8192 |
| 1.8887        | 21    | 49056 | 1.9424          | 1.0       | 556.0074      | 8.8741 |
| 1.8697        | 22    | 51392 | 1.9395          | 1.0       | 556.3364      | 8.9136 |
| 1.8313        | 23    | 53728 | 1.9322          | 1.0       | 556.7007      | 8.9796 |
| 1.768         | 24    | 56064 | 1.9310          | 1.0       | 554.8824      | 9.0449 |
| 1.7831        | 25    | 58400 | 1.9234          | 1.0       | 556.1053      | 9.1002 |
| 1.7492        | 26    | 60736 | 1.9130          | 1.0       | 555.6687      | 9.1191 |
| 1.7422        | 27    | 63072 | 1.9195          | 1.0       | 555.2957      | 9.1718 |
| 1.7031        | 28    | 65408 | 1.9127          | 1.0       | 557.4359      | 9.1818 |
| 1.6613        | 29    | 67744 | 1.9185          | 1.0       | 558.5520      | 9.2350 |
| 1.6515        | 30    | 70080 | 1.9189          | 1.0       | 557.0092      | 9.2685 |
| 1.6051        | 31    | 72416 | 1.9168          | 1.0       | 555.6120      | 9.2974 |
| 1.6352        | 32    | 74752 | 1.9183          | 1.0       | 556.8716      | 9.3177 |
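The card does not state how the BLEU column was computed. A common convention in transformers translation examples is sacreBLEU via the evaluate library; the sketch below shows that convention on a made-up sentence pair and is an assumption about this card's setup, not a confirmed detail.

```python
# Hedged sketch of how the BLEU column could be computed. Assumption: sacreBLEU
# via the `evaluate` library, the usual choice in transformers translation
# examples; the sentence pair is invented for illustration.
import evaluate

bleu = evaluate.load("sacrebleu")
predictions = ["El viejo cerró el libro y miró el mar."]            # model output
references = [["El anciano cerró el libro y miró hacia el mar."]]   # gold reference(s)
print(bleu.compute(predictions=predictions, references=references)["score"])
```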

Framework versions

  • Transformers 4.57.0
  • PyTorch 2.8.0+cu128
  • Datasets 4.2.0
  • Tokenizers 0.22.1
Model size

  • 1.0B params (F32, Safetensors)