Add files using upload-large-folder tool
- .gitattributes +1 -0
- README.md +6 -0
- plots/eval_loss_all_folds.png +3 -0
- plots/loss_curves_fold_1.png +0 -0
- plots/token_accuracy_curves_fold_1.png +0 -0
.gitattributes
CHANGED
```diff
@@ -34,3 +34,4 @@ saved_model/**/* filter=lfs diff=lfs merge=lfs -text
 *.zst filter=lfs diff=lfs merge=lfs -text
 *tfevents* filter=lfs diff=lfs merge=lfs -text
 tokenizer.json filter=lfs diff=lfs merge=lfs -text
+plots/eval_loss_all_folds.png filter=lfs diff=lfs merge=lfs -text
```
README.md
CHANGED
```diff
@@ -79,14 +79,20 @@ The model was fine-tuned using **QLoRA** (4-bit quantization with LoRA) and then
 
 ### Evaluation Loss — All Folds
 
+![](plots/eval_loss_all_folds.png)
+
 All five folds show consistent and monotonically decreasing evaluation loss throughout training. By step 200, every fold converges to a final eval loss in the range of **0.19–0.24**, demonstrating stable learning without signs of overfitting across different data splits.
 
 ### Loss Curves — Fold 1
 
+![](plots/loss_curves_fold_1.png)
+
 For Fold 1 (the best-performing fold), training loss drops steeply from ~2.4 at initialization and quickly converges near the evaluation loss by step 10. Both train and eval loss then decrease together steadily, with no divergence — indicating no overfitting.
 
 ### Token Accuracy — Fold 1
 
+![](plots/token_accuracy_curves_fold_1.png)
+
 Token-level accuracy for Fold 1 climbs from ~0.51 at the start to **~0.94** by the final step. Train and eval accuracy track each other closely throughout, with eval accuracy slightly above train accuracy in the later steps.
 
 ## Usage
```
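As an aside on the "token accuracy" metric described in the README diff: the repository does not show how it is computed, but a common definition is the fraction of non-masked target tokens whose argmax prediction matches the label, with padding/prompt positions excluded via an ignore index (HF-style `-100`). A minimal sketch under that assumption:

```python
import numpy as np

def token_accuracy(logits, labels, ignore_index=-100):
    """Fraction of non-ignored tokens whose argmax prediction matches the label.

    logits: (batch, seq_len, vocab) array of model outputs.
    labels: (batch, seq_len) target token ids; positions equal to
            ignore_index (padding / prompt tokens) are excluded.
    """
    preds = logits.argmax(axis=-1)          # predicted token id per position
    mask = labels != ignore_index           # keep only scored positions
    return (preds[mask] == labels[mask]).mean()

# Tiny worked example: 3 of the 4 unmasked tokens are predicted correctly.
logits = np.zeros((1, 5, 4))
logits[0, np.arange(5), [1, 2, 3, 0, 1]] = 1.0   # argmax per position
labels = np.array([[1, 2, 0, 0, -100]])          # last position masked out
print(token_accuracy(logits, labels))            # 0.75
```

This is a sketch of the conventional metric, not necessarily the exact computation used for the plotted curves.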
plots/eval_loss_all_folds.png
ADDED (Git LFS)

plots/loss_curves_fold_1.png
ADDED

plots/token_accuracy_curves_fold_1.png
ADDED
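The per-fold eval-loss values behind a plot like `eval_loss_all_folds.png` are typically read from each run's `trainer_state.json`, which the Hugging Face `Trainer` writes with an `eval_loss` entry in `log_history` at every evaluation step. A hedged sketch of extracting one fold's curve (the file name and layout below follow the standard `Trainer` output, not anything shown in this commit):

```python
import json
from pathlib import Path

def eval_loss_curve(trainer_state_path):
    """Return (steps, eval_losses) recorded in a Trainer's trainer_state.json."""
    state = json.loads(Path(trainer_state_path).read_text())
    points = [(entry["step"], entry["eval_loss"])
              for entry in state["log_history"] if "eval_loss" in entry]
    steps, losses = zip(*points)
    return list(steps), list(losses)

# Demo with a minimal fake log (a real file is written by Trainer automatically).
fake = {"log_history": [
    {"step": 10, "loss": 2.4},        # training-loss entries lack "eval_loss"
    {"step": 10, "eval_loss": 1.1},
    {"step": 20, "eval_loss": 0.9},
]}
Path("trainer_state.json").write_text(json.dumps(fake))
print(eval_loss_curve("trainer_state.json"))  # ([10, 20], [1.1, 0.9])
```

Running this per fold and plotting each returned curve on one axis reproduces the shape of the "Evaluation Loss — All Folds" figure.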