At AlphaEdge, we aim to offer AI models that run flexibly on any type of hardware, whether through an API or at the edge, on GPU as well as CPU. Our goal is to deliver high performance on complex tasks while significantly reducing latency, memory consumption, and inference costs. We pursue this vision through two channels: our open models and our proprietary models.
The open-source models available in this Hugging Face organization are improved versions of existing models published under permissive licenses. The goal here is to showcase our expertise, particularly in state-of-the-art compression techniques, applied to model classes well known to the community.
Our proprietary models are built on a new architecture that we call ELM (Efficient Language Models). They offer optimized performance for professional use cases that require full sovereignty and real-time usage. Because they are resource-efficient, they can be accessed through our API or deployed on-premises on your own hardware.
Trimming is a key technique for effectively reducing the size of language models while preserving their performance, which makes them easier to deploy. We invite you to read our blog post on the subject to learn about all the benefits of this method, and to explore the Space to browse the more than 5,000 models based on the trimming we propose.
Blog post | 🤗 HF Space