The GGUFs have been augmented with the NEO Imatrix dataset - including the Q8s.
There are THREE versions of NEO GGUFs in this repo as well, to take advantage of the unique properties of this model.

As odd as this sounds, lower to mid quants work best because of the stronger Imatrix effect at these quant levels.

The model codes better, seems to make better decisions (rather than hesitating a lot), and sometimes generates SMALLER reasoning blocks.

Likewise, lower quants often come up with "outside the box" solutions.

Higher quants work well too, but may generate longer reasoning blocks; in some cases they can come up with better solutions (relative to smaller quants).

For these reasons I suggest you download at least two quants and compare how they operate for your use case(s).

IQ3_M will work well for many use cases, at over 150 T/S; IQ4s/Q4s are the best of Imatrix and "bits".