BarraHome committed on
Commit 95df930 · verified · 1 parent: fd3c4e4

Update README.md

Files changed (1)
  1. README.md +0 -27
README.md CHANGED

````diff
@@ -101,25 +101,6 @@ deepspeed train.py \
  --trust_remote_code
  ```

- ## 🔧 Regenerating This Model
-
- To recreate this converted model:
-
- ```bash
- # From the TransMLA root directory
- bash scripts/convert/llama3.2-1B.sh
- ```
-
- Or manually using the converter:
-
- ```bash
- python transmla/converter.py \
- --model-path meta-llama/Llama-3.2-1B \
- --save-path BarraHome/llama3_2-1B-deepseek \
- --freqfold 4 \
- --ppl-eval-batch-size 16
- ```
-
  ## 💡 Key Benefits

  - **Memory Efficiency**: ~50% reduction in KV cache memory usage
@@ -128,14 +109,6 @@ python transmla/converter.py \
  - **Quality Preservation**: Maintains comparable performance to original model
  - **Hardware Optimization**: Optimized for H100 and similar accelerators

- ## ⚠️ Requirements
-
- - **Python**: 3.8+
- - **PyTorch**: 2.0+
- - **Transformers**: 4.30+
- - **CUDA**: 11.7+ (for GPU acceleration)
- - **Memory**: 8GB+ GPU memory recommended
-
  Optional:
  - **vLLM**: For optimized inference
  - **DeepSpeed**: For distributed training
````
 