Commit ea33424 (verified) by mehti · Parent: 34fe3da

Add files using upload-large-folder tool

.gitattributes CHANGED

```diff
@@ -34,3 +34,4 @@ saved_model/**/* filter=lfs diff=lfs merge=lfs -text
 *.zst filter=lfs diff=lfs merge=lfs -text
 *tfevents* filter=lfs diff=lfs merge=lfs -text
 tokenizer.json filter=lfs diff=lfs merge=lfs -text
+plots/eval_loss_all_folds.png filter=lfs diff=lfs merge=lfs -text
```
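These attribute lines route matching files through Git LFS, so the repository itself stores only a small text pointer for each one (the 131-byte pointers described below). As an aside, a pointer file in the Git LFS v1 format can be parsed in a few lines; `parse_lfs_pointer` is a hypothetical helper, not part of this repository, and the sample reuses the SHA-256 reported for `plots/eval_loss_all_folds.png` with an approximate byte size:

```python
# Sketch: parse a Git LFS v1 pointer file ("key value" lines) into a dict.
# The sample content mirrors the pointer for plots/eval_loss_all_folds.png;
# the size is approximate (the page reports 117 kB).

def parse_lfs_pointer(text: str) -> dict:
    """Split each non-empty line at the first space into key/value pairs."""
    fields = {}
    for line in text.strip().splitlines():
        key, _, value = line.partition(" ")
        fields[key] = value
    return fields

sample = (
    "version https://git-lfs.github.com/spec/v1\n"
    "oid sha256:558cf6531dd185ff59a737342c6852e245026bb27e9197ba55b8a5652a7b3c4b\n"
    "size 117000\n"
)
info = parse_lfs_pointer(sample)
```

When Git checks the file out, the `filter=lfs` attribute swaps this pointer for the real blob fetched from LFS storage.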
README.md CHANGED

```diff
@@ -79,14 +79,20 @@ The model was fine-tuned using **QLoRA** (4-bit quantization with LoRA) and then
 
 ### Evaluation Loss — All Folds
 
+![Eval Loss All Folds](plots/eval_loss_all_folds.png)
+
 All five folds show consistent and monotonically decreasing evaluation loss throughout training. By step 200, every fold converges to a final eval loss in the range of **0.19–0.24**, demonstrating stable learning without signs of overfitting across different data splits.
 
 ### Loss Curves — Fold 1
 
+![Loss Curves Fold 1](plots/loss_curves_fold_1.png)
+
 For Fold 1 (the best-performing fold), training loss drops steeply from ~2.4 at initialization and quickly converges near the evaluation loss by step 10. Both train and eval loss then decrease together steadily, with no divergence — indicating no overfitting.
 
 ### Token Accuracy — Fold 1
 
+![Token Accuracy Fold 1](plots/token_accuracy_curves_fold_1.png)
+
 Token-level accuracy for Fold 1 climbs from ~0.51 at the start to **~0.94** by the final step. Train and eval accuracy track each other closely throughout, with eval accuracy slightly above train accuracy in the later steps.
 
 ## Usage
```
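The README's "no divergence" reading of the Fold 1 curves can also be expressed as a mechanical check. A minimal sketch with illustrative loss values — the actual per-step logs are not part of this commit, and `diverges` is a hypothetical helper, not the repository's evaluation code:

```python
# Sketch: flag train/eval divergence as a crude overfitting signal.
# The loss lists below are illustrative stand-ins, not the real Fold 1 logs.

def diverges(train_loss, eval_loss, tol=0.05):
    """Return True if, over the final quarter of training, eval loss rises
    by more than `tol` while train loss keeps falling."""
    tail = max(2, len(train_loss) // 4)
    train_tail = train_loss[-tail:]
    eval_tail = eval_loss[-tail:]
    train_falling = train_tail[-1] < train_tail[0]
    eval_rising = eval_tail[-1] > eval_tail[0] + tol
    return train_falling and eval_rising

# Curves shaped like the ones described above: both fall together.
train = [2.4, 1.1, 0.6, 0.35, 0.25, 0.21, 0.20, 0.19]
evals = [2.3, 1.0, 0.6, 0.36, 0.27, 0.23, 0.21, 0.20]
```

With these values `diverges(train, evals)` is False, matching the "no overfitting" conclusion; a run whose eval loss turns upward in the tail would trip the check.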
plots/eval_loss_all_folds.png ADDED

Git LFS Details

  • SHA256: 558cf6531dd185ff59a737342c6852e245026bb27e9197ba55b8a5652a7b3c4b
  • Pointer size: 131 Bytes
  • Size of remote file: 117 kB
plots/loss_curves_fold_1.png ADDED
plots/token_accuracy_curves_fold_1.png ADDED
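The commit adds only the rendered PNGs, not the script or logs that produced them. As a rough sketch of how a figure like `plots/eval_loss_all_folds.png` could be generated, the following uses synthetic stand-in curves whose endpoints match the 0.19–0.24 range stated in the README; none of this code is from the repository:

```python
# Sketch: produce an "eval loss, all folds" figure similar to the one added
# in this commit. The five curves are synthetic decays, not real logs.
import matplotlib
matplotlib.use("Agg")  # render without a display
import matplotlib.pyplot as plt

steps = list(range(0, 201, 20))
floors = [0.19, 0.20, 0.22, 0.23, 0.24]  # final eval losses per the README
curves = {
    f"fold {i}": [floor + (2.4 - floor) * 0.75 ** (s / 10) for s in steps]
    for i, floor in enumerate(floors, start=1)
}

fig, ax = plt.subplots()
for name, losses in curves.items():
    ax.plot(steps, losses, label=name)
ax.set_xlabel("step")
ax.set_ylabel("eval loss")
ax.set_title("Evaluation Loss - All Folds")
ax.legend()
fig.savefig("eval_loss_all_folds.png")
```

Since every file under `plots/` matches the new `.gitattributes` rule, the saved PNG would be stored via Git LFS on commit.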