@brando90
Created November 29, 2024 21:59
nothing below 16 bits for training

Training Guidelines Summary

  • SFT: Train in bf16 or fp32; avoid 8-bit. For evaluation, fp16, bf16, or fp32 are all fine. Follow established scripts for reliability (a precision sketch follows this list).
  • Unsloth: Train LoRA in fp16, bf16, or fp32. Avoid 8-bit or lower unless you have validated it by replicating the original experiments. No QLoRA until the core setups are stable and everything above has worked (see the Unsloth sketch below).
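
A minimal sketch of the SFT precision settings, assuming Hugging Face Transformers. The checkpoint name, batch size, and epoch count are placeholders; only the dtype flags reflect the guideline above.

```python
# Sketch: precision settings for SFT (bf16 for training, nothing below 16 bits).
import torch
from transformers import AutoModelForCausalLM, TrainingArguments

# Load weights in bf16 -- never load in 8-bit for training.
model = AutoModelForCausalLM.from_pretrained(
    "gpt2",                       # placeholder checkpoint
    torch_dtype=torch.bfloat16,
)

args = TrainingArguments(
    output_dir="sft_out",
    bf16=True,    # mixed-precision training in bf16 (fall back to fp32 if unsupported)
    fp16=False,   # fp16 is acceptable for evaluation, but prefer bf16/fp32 for training
    num_train_epochs=1,              # placeholder
    per_device_train_batch_size=4,   # placeholder
)
# Pass `args` to transformers.Trainer (or trl.SFTTrainer) together with the
# tokenizer and dataset from your established SFT script.
```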
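For the Unsloth case, a sketch assuming Unsloth's `FastLanguageModel` API; the model name and LoRA hyperparameters are placeholders. The key choices are a 16-bit dtype and `load_in_4bit=False` (i.e., no QLoRA until the 16-bit setup is validated).

```python
# Sketch: LoRA with Unsloth in 16 bits, per the guideline above.
import torch
from unsloth import FastLanguageModel

model, tokenizer = FastLanguageModel.from_pretrained(
    model_name="unsloth/llama-3-8b",  # placeholder checkpoint
    max_seq_length=2048,              # placeholder
    dtype=torch.bfloat16,   # bf16 or fp16; never 8-bit or lower for training
    load_in_4bit=False,     # keep QLoRA off until the core setup is stable
)

model = FastLanguageModel.get_peft_model(
    model,
    r=16,                   # placeholder LoRA rank
    lora_alpha=16,          # placeholder
    target_modules=["q_proj", "k_proj", "v_proj", "o_proj"],
)
# Train with your usual trainer (e.g., trl.SFTTrainer) in bf16.
```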