IONOS Cloud's AI Model Hub experienced increased error rates with the Llama 3.1 405B Instruct model, caused by hardware degradation and by capacity constraints that followed a hardware failure. The incident affected the model's performance and reliability for 58.9 hours. Service was restored, although some capacity constraints remain; users still experiencing issues were advised to use GPT-OSS 120B as a temporary alternative while optimizations are deployed.
We are marking this incident as resolved. The incident was caused by capacity constraints following a hardware failure. While capacity has been restored, we still see some usage-specific constraints with the Llama 3.1 405B Instruct model. Our AI Model Hub team will deploy optimizations to the model to improve its performance and reliability. We recommend that users who are still experiencing issues with the model try GPT-OSS 120B as a potential (temporary) replacement.
Our AI Model Hub team has mitigated the incident. While the underlying root cause is not yet fully established or resolved, the model service should be stable. We are monitoring the situation while the investigation is ongoing.
The team has identified the root cause: hardware degradation affecting this model's hosting environment is causing backend instability. We are currently implementing a fix.
Our AI Model Hub team is currently working on resolving errors related to an instance running the Llama 405B model.