DataRobot experienced an issue where batch jobs completed successfully but their completion status was not updating correctly in the system. This affected multiple components including AutoML, Website, Predictions, AI Catalog and Data Ingest, AI Apps, and API over a period of approximately 5 days. The issue was resolved through manual workarounds initially, followed by a remediation script, and finally a permanent fix deployed to production.
A fix has been deployed to production and the issue is now resolved. All systems are operating normally.
We are continuing to monitor for any further issues.
The remediation script has been deployed and Engineering is actively monitoring the situation. Batch jobs may take longer than usual to show as completed until a permanent fix is rolled out in the next production deployment.
The affected jobs have been resolved and the issue is mitigated. Engineering is actively working on a permanent fix, and testing is currently underway.
Batch jobs currently in the queue are completing successfully; however, their completion status is not being updated correctly. As a temporary workaround, we are manually marking these jobs as completed while we work on a solution.