research ∙ 06/12/2023
NF4 Isn't Information Theoretically Optimal (and that's Good)
This note shares some simple calculations and experiments related to abs...
          
research ∙ 12/16/2021
Reconsidering the Past: Optimizing Hidden States in Language Models
We present Hidden-State Optimization (HSO), a gradient-based method for ...
          
research ∙ 08/16/2020
     
             
  
  
     
                             