From e0f70f32c6c2c8648eb0eb65c9f9cff2d66c8295 Mon Sep 17 00:00:00 2001
From: =?UTF-8?q?Lo=C3=AFc=20Dachary?=
Date: Sat, 7 May 2022 17:58:08 +0100
Subject: [PATCH 1/3] Troubleshooting Gitea upgrades showcase

---
 content/post/troubleshooting-upgrades.md | 59 ++++++++++++++++++++++++
 1 file changed, 59 insertions(+)
 create mode 100644 content/post/troubleshooting-upgrades.md

diff --git a/content/post/troubleshooting-upgrades.md b/content/post/troubleshooting-upgrades.md
new file mode 100644
index 0000000..059ac37
--- /dev/null
+++ b/content/post/troubleshooting-upgrades.md
@@ -0,0 +1,59 @@
+---
+date: "2022-05-05T12:00:00+00:00"
+author: "dachary"
+title: "Troubleshooting Gitea upgrades showcase"
+tags: ["gitea", "upgrade", "troubleshoot", "fix", "problem", "hostea"]
+draft: true
+---
+
+# Troubleshooting Gitea upgrades showcase
+---
+
+The [instructions to upgrade a Gitea instance](https://docs.gitea.io/en-us/upgrade-from-gitea/#upgrade-from-binary) only require three to four steps. They work fine most of the time but the documentation is lacking a "Troubleshooting" section to help out when something goes wrong. Maintaining instructions on how to diagnose and fix upgrade problems is an ambitious undertaking and requires updates every time a new case is discovered.
+
+An [inventory of the known upgrade issues](https://forum.hostea.org/t/things-to-know-about-gitea-upgrades/39) was started to figure out how to structure such a section in the documentation. The [release notes](https://blog.gitea.io/) were analyzed all the way back to [Gitea 1.9.6](https://github.com/go-gitea/gitea/releases/tag/v1.9.6) and the work is still in progress. Here is a sample of the tips that will be included:
+
+* Upgrade directly to the latest Gitea version, there is no need to upgrade to intermediate versions.
+* If the upgrade from version x.y to version x.y+2 fails and there is a need to narrow down the problem, try upgrading to the latest minor version of each major version and verify it works.
+* etc.
+
+However, even with the best documentation, someone will eventually **run into an new problem** and fixing it without compromising the integrity of the data will be challenging. This is best demonstrated by a real world example that was concluded a few days ago.
+
+# Getting help from the community
+
+After [upgrading a Gitea intsance from 1.9.6 to 1.16.5](https://discourse.gitea.io/t/blank-page-after-login/5051) the tests conducted manually did not uncover any problem. However, after going to production, some users saw a blank page after login and had to manually type the URL of the project they wanted to see in the browser. The person in charge of the upgrade never had to diagnose Gitea problem and [reached out in the Gitea forum](https://discourse.gitea.io/t/blank-page-after-login/5051).
+
+> **Tip: explain the problem in a public forum as early as possible to get help fro the community**
+
+In their post in the forum they explained how they attempted to diagnose the problem and how why they thought that only users created a few years ago were impacted. It was a detailed analysis that was concluded with a partial copy of the logs. It was unfortunately missing [key information](https://discourse.gitea.io/t/blank-page-after-login/5051/12) that was provided only three days later. In the meantime, as they could not figure out the source of the problem, they were on the **verge of [accepting the loss of all the Gitea database](https://discourse.gitea.io/t/blank-page-after-login/5051/11) and start over from the repositories**. However, once all the details were available, [a workaround](https://discourse.gitea.io/t/blank-page-after-login/5051/13) was suggested in the forum.
+
+> **Tip: focus more on providing detailed facts than exposing the attempted diagnostic**
+
+There was hope to fix Gitea and in the following days they applied the workaround. They also tried to improve it but without success and eventually accepted a **partial data loss** as inevitable and [reported their success back to the forum](https://discourse.gitea.io/t/blank-page-after-login/5051/14).
+
+> **Tip: when getting support from the community, providing feedback is the best token of appreciation**
+
+# Getting professional help
+
+The [Hostea Clinic](https://hostea.org/gitea-clinic/) is a collective of individuals and companies that provides professional services to Gitea admins. They are active members of the Gitea community who [help out](https://discourse.gitea.io/u/dachary/activity) as volunteers. They can also be hired to resolve the more complicated cases.
+
+The Gitea instance that was in trouble required more than a few minutes of work and access to the database content for a proper diagnostic. They [proposed their assistance](https://discourse.gitea.io/t/blank-page-after-login/5051/13) but although [well received](https://discourse.gitea.io/t/user-research-about-gitea-upgrade-experiences-call-for-volunteers/5063/2), it was not accepted.
+
+When the Gitea admin explained how they chose to resolve the problem [on the forum](https://discourse.gitea.io/t/blank-page-after-login/5051/14), it confirmed the workaround was viable and the root problem was identified. That was enough to figure out a fix for the underlying bug with [a rather simple patch](https://discourse.gitea.io/t/blank-page-after-login/5051/17) that was merged [and backported](https://github.com/go-gitea/gitea/pull/19629) in the following days. But it happened too late to avoid the data loss.
+
+To summarize with a timeline, here is what happened:
+
+* J+1: The **problem is discovered** by users who see a blank page after login and the Gitea admin tries to diagnose the problem
+* J+2: A message is sent **to ask for help in the community**
+* J+2 to J+6: Three people in the community suggest ideas but **the Gitea admin cannot figure out the root cause and is on the verge of accepting the loss of all Gitea data** and restarting from the git repositories
+* J+6: A **workaround is suggested by the community**
+* J+7 to J+17: The Gitea admin applies the **workaround and only loses part of the Gitea data**
+
+And in retrospect, here is what could have happened instead:
+
+* J+1: The **problem is discovered** by users who see a blank page after login
+* J+1: The Gitea admin **[reaches out to someone at the Hostea Clinic](https://hostea.org/gitea-clinic/)**
+* J+2: The [logs of the Gitea instance](https://discourse.gitea.io/t/blank-page-after-login/5051/12) are analyzed, **the root cause diagnosed** and [a patch](https://discourse.gitea.io/t/blank-page-after-login/5051/17) is created to fix it.
+* J+3: If necessary a Gitea binary is created with the patch and used as a temporary replacement until the next point release is published with [the backport](https://github.com/go-gitea/gitea/pull/19629). The Gitea admin runs the patched Gitea binary in the meantime. **There is no data loss**.
+
+It does not mean all upgrade problems can be resolved so easily. But it shows, with an example, that in some cases it makes sense to get professional help.
-- 
2.40.1

From 231224a2916e059f732022c220b79bbc54165e42 Mon Sep 17 00:00:00 2001
From: =?UTF-8?q?Lo=C3=AFc=20Dachary?=
Date: Sat, 7 May 2022 20:26:15 +0100
Subject: [PATCH 2/3] fix the date

---
 content/post/troubleshooting-upgrades.md | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/content/post/troubleshooting-upgrades.md b/content/post/troubleshooting-upgrades.md
index 059ac37..f5316a9 100644
--- a/content/post/troubleshooting-upgrades.md
+++ b/content/post/troubleshooting-upgrades.md
@@ -1,5 +1,5 @@
 ---
-date: "2022-05-05T12:00:00+00:00"
+date: "2022-05-07T12:00:00+00:00"
 author: "dachary"
 title: "Troubleshooting Gitea upgrades showcase"
 tags: ["gitea", "upgrade", "troubleshoot", "fix", "problem", "hostea"]
-- 
2.40.1

From 6e527fbbffc6958304d47cfb9d3d0d379af444d5 Mon Sep 17 00:00:00 2001
From: =?UTF-8?q?Lo=C3=AFc=20Dachary?=
Date: Sun, 8 May 2022 07:23:25 +0100
Subject: [PATCH 3/3] s/fro /from /

---
 content/post/troubleshooting-upgrades.md | 3 +--
 1 file changed, 1 insertion(+), 2 deletions(-)

diff --git a/content/post/troubleshooting-upgrades.md b/content/post/troubleshooting-upgrades.md
index f5316a9..0f692c7 100644
--- a/content/post/troubleshooting-upgrades.md
+++ b/content/post/troubleshooting-upgrades.md
@@ -23,8 +23,7 @@ However, even with the best documentation, someone will eventually **run into an
 
 After [upgrading a Gitea intsance from 1.9.6 to 1.16.5](https://discourse.gitea.io/t/blank-page-after-login/5051) the tests conducted manually did not uncover any problem. However, after going to production, some users saw a blank page after login and had to manually type the URL of the project they wanted to see in the browser. The person in charge of the upgrade never had to diagnose Gitea problem and [reached out in the Gitea forum](https://discourse.gitea.io/t/blank-page-after-login/5051).
 
-> **Tip: explain the problem in a public forum as early as possible to get help fro the community**
-
+> **Tip: explain the problem in a public forum as early as possible to get help from the community**
 In their post in the forum they explained how they attempted to diagnose the problem and how why they thought that only users created a few years ago were impacted. It was a detailed analysis that was concluded with a partial copy of the logs. It was unfortunately missing [key information](https://discourse.gitea.io/t/blank-page-after-login/5051/12) that was provided only three days later. In the meantime, as they could not figure out the source of the problem, they were on the **verge of [accepting the loss of all the Gitea database](https://discourse.gitea.io/t/blank-page-after-login/5051/11) and start over from the repositories**. However, once all the details were available, [a workaround](https://discourse.gitea.io/t/blank-page-after-login/5051/13) was suggested in the forum.
 
 > **Tip: focus more on providing detailed facts than exposing the attempted diagnostic**
-- 
2.40.1