Instella
Open 3B language models from AMD
Listed in categories:
Open SourceArtificial IntelligenceGitHub


Description
Instella is a family of fully open state-of-the-art 3 billion parameter language models developed by AMD, trained from scratch on AMD Instinct MI300X GPUs. These models significantly outperform existing fully open models of similar sizes and achieve competitive performance compared to state-of-the-art open-weight models. Instella models are designed to foster innovation and collaboration within the AI community by providing open-source access to model weights, training configurations, datasets, and code.
How to use Instella?
To use Instella models, developers and researchers can access the model weights and training configurations from the provided GitHub repository. Users can implement the models in their applications by following the guidelines and examples available in the documentation.
Core features of Instella:
1️⃣
3 billion parameters for advanced language processing
2️⃣
Trained on AMD Instinct MI300X GPUs for high performance
3️⃣
Fully open-source with accessible model weights and training data
4️⃣
Supports efficient training techniques like FlashAttention2 and Fully Sharded Data Parallelism
5️⃣
Competitive performance against state-of-the-art models like Llama and Qwen.
Why could be used Instella?
# | Use case | Status | |
---|---|---|---|
# 1 | Natural language understanding and generation | ✅ | |
# 2 | Instruction following and interactive AI applications | ✅ | |
# 3 | Research and development in AI and machine learning. | ✅ |
Who developed Instella?
AMD (Advanced Micro Devices) is a leading semiconductor company that develops computer processors and related technologies for business and consumer markets. The company is committed to open-source initiatives and fostering innovation in the AI community through its advanced hardware and software solutions.