Pruna AI Open Sources Its AI Model Optimization Framework for Faster, Efficient Models
Pruna AI open-sources its framework for AI model compression, enhancing efficiency with pruning, quantization, and distillation.
Matilda
Pruna AI, a European startup that has been working on compression algorithms for AI models, is making its optimization framework open source on Thursday.

Image: Pruna AI

Pruna AI has been building a framework that applies several efficiency methods, such as caching, pruning, quantization, and distillation, to a given AI model. “We also standardize saving and loading the compressed models, applying combinations of these compression methods, and also evaluating your compressed model after you compress it,” Pruna AI co-founder and CTO John Rachwan said.

In particular, Pruna AI’s framework can evaluate whether there is significant quality loss after compressing a model, as well as the performance gains you get in return. “If I were to use a metaphor, we are similar to how Hugging Face standardized transformers and diffusers — how to call them, how to save them, load them, etc. We are doing the same, but for efficiency methods,” he added.

Big AI labs have already been using various compression methods. For i…
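To give a flavor of one of the compression methods the article mentions, here is a minimal, self-contained sketch of post-training int8 quantization: mapping floating-point weights to 8-bit integers and then checking the reconstruction error, much like the evaluation step the framework standardizes. This is not Pruna AI's API; the function names and the symmetric quantization scheme here are illustrative assumptions only.

```python
# Illustrative sketch of int8 quantization -- NOT Pruna AI's actual API.

def quantize_int8(weights):
    """Symmetric int8 quantization: scale floats into [-127, 127]."""
    scale = max(abs(w) for w in weights) / 127 or 1.0  # avoid zero scale
    q = [round(w / scale) for w in weights]
    return q, scale

def dequantize(q, scale):
    """Map the int8 values back to approximate floats."""
    return [v * scale for v in q]

weights = [0.12, -0.98, 0.45, 0.0, 0.77]
q, scale = quantize_int8(weights)
restored = dequantize(q, scale)

# Quality check, analogous to evaluating a compressed model:
# the max reconstruction error is bounded by half a quantization step.
max_err = max(abs(a - b) for a, b in zip(weights, restored))
assert max_err <= scale / 2 + 1e-9
```

Real frameworks apply the same idea per tensor (or per channel) across an entire network, and combine it with the other methods listed above, which is where a standardized interface for saving, loading, and evaluating the compressed model becomes useful.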