95e6fb0b421155557131cb6416f81fc9

This model is a fine-tuned version of google/umt5-base on the Helsinki-NLP/opus_books [en-es] dataset. It achieves the following results on the evaluation set:

  • Loss: 1.9183
  • Data Size: 1.0
  • Epoch Runtime: 556.8716
  • BLEU: 9.3177
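For quick sanity checks, the checkpoint can be loaded with the standard transformers seq2seq classes, since umT5 is an ordinary encoder-decoder model. This is a minimal sketch, not a documented inference recipe: the repository id is taken from the model tree at the bottom of this card, and the beam-search decoding settings are assumptions.

```python
# Minimal inference sketch. Assumptions: the checkpoint is published as
# "contemmcm/95e6fb0b421155557131cb6416f81fc9" (per the model tree below),
# and beam search with 4 beams is a reasonable default for translation.
from transformers import AutoTokenizer, AutoModelForSeq2SeqLM

model_id = "contemmcm/95e6fb0b421155557131cb6416f81fc9"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForSeq2SeqLM.from_pretrained(model_id)

# English source sentence; the model was fine-tuned for en->es translation.
inputs = tokenizer(
    "The old man closed the book and looked at the sea.",
    return_tensors="pt",
)
outputs = model.generate(**inputs, max_new_tokens=64, num_beams=4)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```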

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed
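The dataset itself is named in the summary above (Helsinki-NLP/opus_books, en-es configuration). A hedged loading sketch follows; note that opus_books ships only a train split, so the 90/10 split below is an assumption rather than the split actually used for the results on this card.

```python
# Sketch of loading the dataset named above. Assumption: a manual 90/10
# train/eval split, since opus_books provides only a "train" split and the
# card does not document how its evaluation set was carved out.
from datasets import load_dataset

raw = load_dataset("Helsinki-NLP/opus_books", "en-es")
split = raw["train"].train_test_split(test_size=0.1, seed=42)
print(split["train"][0]["translation"])  # {'en': '...', 'es': '...'}
```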

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

  • learning_rate: 5e-05
  • train_batch_size: 8
  • eval_batch_size: 8
  • seed: 42
  • distributed_type: multi-GPU
  • num_devices: 4
  • total_train_batch_size: 32
  • total_eval_batch_size: 32
  • optimizer: adamw_torch with betas=(0.9, 0.999), epsilon=1e-08, and no additional optimizer arguments
  • lr_scheduler_type: constant
  • num_epochs: 50
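For readers who want to approximate this configuration, the sketch below maps the listed values onto transformers' Seq2SeqTrainingArguments. The output directory and the predict_with_generate flag are assumptions, and the progressive data-size schedule visible in the results table below is not reproduced here.

```python
# Hedged reconstruction of the training setup from the hyperparameters above.
# Assumptions: output_dir is a placeholder name, and predict_with_generate is
# inferred from the fact that BLEU was computed during evaluation.
from transformers import Seq2SeqTrainingArguments

training_args = Seq2SeqTrainingArguments(
    output_dir="umt5-base-opus-books-en-es",  # assumed name
    learning_rate=5e-5,
    per_device_train_batch_size=8,   # x4 GPUs -> total train batch size 32
    per_device_eval_batch_size=8,    # x4 GPUs -> total eval batch size 32
    seed=42,
    optim="adamw_torch",
    adam_beta1=0.9,
    adam_beta2=0.999,
    adam_epsilon=1e-8,
    lr_scheduler_type="constant",
    num_train_epochs=50,
    predict_with_generate=True,      # assumption: needed for BLEU at eval time
)
```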

Training results

| Training Loss | Epoch | Step  | Validation Loss | Data Size | Epoch Runtime | BLEU   |
|:-------------:|:-----:|:-----:|:---------------:|:---------:|:-------------:|:------:|
| No log        | 0     | 0     | 11.2254         | 0         | 43.5460       | 0.1240 |
| No log        | 1     | 2336  | 11.3814         | 0.0078    | 48.2702       | 0.1357 |
| 0.2569        | 2     | 4672  | 11.0145         | 0.0156    | 51.8513       | 0.0612 |
| 0.3397        | 3     | 7008  | 7.3138          | 0.0312    | 59.5372       | 0.2779 |
| 5.2549        | 4     | 9344  | 3.5021          | 0.0625    | 74.6569       | 6.6485 |
| 3.7693        | 5     | 11680 | 2.7844          | 0.125     | 105.4480      | 4.3756 |
| 3.2656        | 6     | 14016 | 2.5733          | 0.25      | 169.3362      | 5.1932 |
| 3.0231        | 7     | 16352 | 2.4195          | 0.5       | 287.3175      | 5.9284 |
| 2.7689        | 8     | 18688 | 2.2880          | 1.0       | 533.6306      | 6.7111 |
| 2.5939        | 9     | 21024 | 2.1946          | 1.0       | 530.1038      | 7.1317 |
| 2.4848        | 10    | 23360 | 2.1425          | 1.0       | 525.1951      | 7.4805 |
| 2.4079        | 11    | 25696 | 2.0975          | 1.0       | 538.2567      | 7.7418 |
| 2.2988        | 12    | 28032 | 2.0706          | 1.0       | 539.8792      | 7.9399 |
| 2.2335        | 13    | 30368 | 2.0410          | 1.0       | 530.4865      | 8.1118 |
| 2.2005        | 14    | 32704 | 2.0216          | 1.0       | 541.4448      | 8.2441 |
| 2.1301        | 15    | 35040 | 2.0048          | 1.0       | 535.1475      | 8.3626 |
| 2.1463        | 16    | 37376 | 1.9886          | 1.0       | 537.2098      | 8.5055 |
| 2.0191        | 17    | 39712 | 1.9742          | 1.0       | 555.6616      | 8.5951 |
| 2.033         | 18    | 42048 | 1.9648          | 1.0       | 556.6508      | 8.6929 |
| 1.9855        | 19    | 44384 | 1.9515          | 1.0       | 556.0582      | 8.7545 |
| 1.9422        | 20    | 46720 | 1.9380          | 1.0       | 557.8315      | 8.8192 |
| 1.8887        | 21    | 49056 | 1.9424          | 1.0       | 556.0074      | 8.8741 |
| 1.8697        | 22    | 51392 | 1.9395          | 1.0       | 556.3364      | 8.9136 |
| 1.8313        | 23    | 53728 | 1.9322          | 1.0       | 556.7007      | 8.9796 |
| 1.768         | 24    | 56064 | 1.9310          | 1.0       | 554.8824      | 9.0449 |
| 1.7831        | 25    | 58400 | 1.9234          | 1.0       | 556.1053      | 9.1002 |
| 1.7492        | 26    | 60736 | 1.9130          | 1.0       | 555.6687      | 9.1191 |
| 1.7422        | 27    | 63072 | 1.9195          | 1.0       | 555.2957      | 9.1718 |
| 1.7031        | 28    | 65408 | 1.9127          | 1.0       | 557.4359      | 9.1818 |
| 1.6613        | 29    | 67744 | 1.9185          | 1.0       | 558.5520      | 9.2350 |
| 1.6515        | 30    | 70080 | 1.9189          | 1.0       | 557.0092      | 9.2685 |
| 1.6051        | 31    | 72416 | 1.9168          | 1.0       | 555.6120      | 9.2974 |
| 1.6352        | 32    | 74752 | 1.9183          | 1.0       | 556.8716      | 9.3177 |
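The card does not state how the BLEU column was computed. A common convention in transformers translation examples is sacreBLEU via the evaluate library; the sketch below shows that convention on a made-up sentence pair and is an assumption about this card's setup, not a confirmed detail.

```python
# Hedged sketch of how the BLEU column could be computed. Assumption: sacreBLEU
# via the `evaluate` library, the usual choice in transformers translation
# examples; the sentence pair is invented for illustration.
import evaluate

bleu = evaluate.load("sacrebleu")
predictions = ["El viejo cerró el libro y miró el mar."]            # model output
references = [["El anciano cerró el libro y miró hacia el mar."]]   # gold reference(s)
print(bleu.compute(predictions=predictions, references=references)["score"])
```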

Framework versions

  • Transformers 4.57.0
  • PyTorch 2.8.0+cu128
  • Datasets 4.2.0
  • Tokenizers 0.22.1
Model size

  • 1.0B params (F32, Safetensors)