KumpeApps LLC - Primary Data Server Offline – Incident details

Primary Data Server Offline

Resolved
Major outage
Started 12 months ago
Lasted 1 day

Affected

Websites

Major outage from 4:12 PM to 4:27 PM

A Creative Collection

Major outage from 4:12 PM to 4:27 PM

KumpeApps SSO

Major outage from 4:12 PM to 4:27 PM

Kumpe3D

Major outage from 4:12 PM to 4:27 PM

UVT Web

Major outage from 4:12 PM to 4:27 PM

KHome Portal

Major outage from 4:12 PM to 4:27 PM

Updates
  • Resolved

    This incident has been resolved.

    After one hour of monitoring, the servers are stable and operating as expected.

    To resolve the issue, we migrated our systems to a new server this morning. During the migration, we also upgraded the new server's memory and processor. The new server has much better memory and CPU performance as well as an upgraded network, which should also resolve past issues with our servers locking up during high-traffic events such as Black Friday sales and new releases.

  • Monitoring

    We implemented a fix and are currently monitoring the result.

    We ended up migrating our server to a new hosting company, which also experienced an outage. We have now successfully migrated our systems to a new server and restored backups. The backups are from 7 PM Central Time on Dec 9th. Any data created between then and now may be lost, but we do not believe any data was written during that period because our servers were mostly down after that time.

  • Update

    We are continuing to work on a fix for this incident.

    The database server, email, and BOTs have been restored. We are still working on the other applications.

  • Update

    We are continuing to work on a fix for this incident.

    We have started migrating to a new hosting provider, which will be very labor-intensive. We expect the migration to take several hours.

  • Update

    We are continuing to work on a fix for this incident.

    As a temporary measure, we have scheduled a reboot of our servers every 45 minutes. This appears to keep the servers working apart from a few seconds of downtime at each reboot. Services should continue to function in a degraded state while we attempt a permanent fix.

  • Update

    We are continuing to work on a fix for this incident.

    Restoring backups to a temporary server has failed because the server we had reserved for this is experiencing more problems than our primary server. We are now looking into a new hosting provider that can handle our servers better than our current one. We hope to have a resolution within the next 24 hours. Hopefully, in the meantime, our hosting provider will resolve their issues.

    We apologize for the inconvenience and hope to have a resolution as quickly as possible. We have found that restarting our servers every 30 minutes keeps systems working with degraded service. Because these restarts are a manual process, service will remain degraded in the meantime.

  • Update

    We are continuing to work on a fix for this incident.

    We have asked our hosting provider to migrate our server to a new cluster and are waiting for them to complete the migration. We are also looking into restoring our backups to a temporary server to get services back up.

  • Identified

    Our primary data server is unstable. The cause has been identified as a memory issue on our hosted server. We are working with our hosting company to resolve the issue.