                          GitLab's database incident

                          No one who works in the tech industry should feel any schadenfreude over GitLab’s outage yesterday, as reported by Business Insider and TechCrunch.

                          According to the incredibly open notes that GitLab published while the incident was still being worked on, the initial trigger for the problem was:

                          Spike in database load due to spam users

                          In response, they took a series of actions to try to resolve the spam problem, but at 11pm an admin referred to as team-member-1 confused which machine they were on and ran an rm -rf command against the wrong one. This deleted a live production PostgreSQL data directory. By the time the mistake was noticed, only 1.5% of approximately 300GB of data remained.

                          The situation was compounded by a series of problems with their backups. According to an update they posted, some of the backups did not appear to have worked, producing:

                          files only a few bytes in size

                          They have since managed to restore their service, but 6 hours of data were lost. They have promised to publish their 5 whys analysis of the cause of the incident and the steps they will take to prevent it from happening again.
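Backup files "only a few bytes in size" are a symptom that a scheduled check can catch. The following is a minimal sketch, not GitLab's tooling: the function name, the directory layout (one file per backup) and the 1 MiB threshold are all assumptions made for illustration.

```python
from pathlib import Path

# Assumed threshold: anything under 1 MiB is treated as suspect.
# A sensible value depends on the size of the database being backed up.
MIN_BACKUP_BYTES = 1024 * 1024


def find_suspect_backups(backup_dir, min_bytes=MIN_BACKUP_BYTES):
    """Return the files in backup_dir that are smaller than min_bytes."""
    suspects = []
    for path in sorted(Path(backup_dir).iterdir()):
        # Only regular files are checked; subdirectories are ignored.
        if path.is_file() and path.stat().st_size < min_bytes:
            suspects.append(path)
    return suspects
```

A size check like this only flags obviously empty backups. It does not show that a backup will restore, which still needs a real test restore.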

                          In another interesting blog post, 2ndQuadrant, the original authors of core PostgreSQL’s backup technologies, responded to the incident with their observations and suggestions for tools to consider. Well worth a read.

                          As we said at the beginning, there is no room for any schadenfreude. Today this is GitLab, tomorrow it could be anybody. Admins are people, and people make mistakes. The only solution is to make any mistake that risks production data as difficult as possible through scripting and automation, and to check regularly both that backups are completing successfully and that they will actually restore in practice.
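One way to make this kind of mistake harder is to wrap destructive operations in a script that checks which machine it is running on first. This is a minimal sketch under stated assumptions: guarded_rmtree and its parameters are hypothetical names, and it is not a description of what GitLab does.

```python
import shutil
import socket


def guarded_rmtree(path, expected_host, current_host=None):
    """Delete the directory tree at path only when run on expected_host.

    current_host is optional and exists so the check can be exercised
    without changing machines; by default the real hostname is used.
    """
    actual = current_host if current_host is not None else socket.gethostname()
    if actual != expected_host:
        # Refuse before touching the filesystem.
        raise RuntimeError(
            f"refusing to delete {path!r}: running on {actual!r}, "
            f"expected {expected_host!r}"
        )
    shutil.rmtree(path)
```

The operator has to name the host they believe they are on, so a session opened on the wrong machine fails loudly instead of deleting data.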

                          One positive thing to come out of this is that lots of people in the tech industry will be checking their backups today (I know we are). Another was the #HugOps hashtag, where people sent their best wishes to GitLab on Twitter. We certainly echo that sentiment.