replicaCount fails on other values than 1 #604
Is this line intentional?
https://gitea.com/gitea/helm-chart/blame/branch/main/templates/_helpers.tpl#L8
I'm not sure about the bleve check a few lines below, but I think it could work correctly when there is a RWX PV. Updates are synchronized via a queue and run on a single instance.
Multiple replicas with individual RWO PVs won't work, as there will be write conflicts between the different sessions. There is no central management within Gitea that would handle/queue those parallel writes, so conflicts or failures can occur. That is why we require a single RWX volume with multiple replicas.
This is hard to test on your own, as you would need to mock near-parallel write requests from multiple sessions to trigger the issue.
(Yet I am not fully sure I understand your second comment: what "could work if there is a RWX PVC"?)
Closing as likely stale, feel free to continue commenting if needed.
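For illustration, a multi-replica deployment with a shared volume as described above would look roughly like the following values fragment. This is only a sketch; the exact key names may differ between chart versions:

```yaml
# sketch: several replicas sharing one ReadWriteMany volume;
# key names are illustrative and may differ between chart versions
replicaCount: 3
persistence:
  enabled: true
  accessModes:
    - ReadWriteMany
```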
Probably I'm wrong. Please explain.
Lines 9-23 seem to be dead code. I might have misunderstood their purpose, but I don't think you can actually trigger those lines.
If I'm wrong, what would a values.yaml file look like that triggers the error messages from lines 9-23?
My suggestion is to remove the dead code or refactor the misplaced failure (I can prepare that change, but I'm not sure if I'm missing something).
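To make the shape of the problem concrete, here is an illustrative sketch (not the chart's actual template code) of an unconditional `fail` sitting above guarded assertions. The unconditional call aborts every render, which would make everything after it unreachable:

```gotmpl
{{- /* illustrative sketch only, not the real _helpers.tpl */ -}}

{{- /* an L8-style unconditional fail: aborts every render,
       so the guarded checks below it become dead code */ -}}
{{- fail "replicaCount > 1 is not supported" -}}

{{- /* how a guarded assertion would look instead */ -}}
{{- if gt (int .Values.replicaCount) 1 -}}
  {{- fail "replicaCount > 1 requires a ReadWriteMany volume" -}}
{{- end -}}
```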
In the case of the second comment, I meant that Gitea uses jobs and queues to manage indexing, and it is designed to handle multiple instances. As far as I know this is well-tested behavior, and I have used it in the past without errors (although maybe I was just lucky and there are some known bugs).
Or let me put it in other words: this line means that if replicas is more than 1, fail with an irrelevant message.
What is the point of the rest of the checks if this chart is designed to work with a single instance only?
To answer the initial question: yep, looks like dead code. `replicaCount: 2` should flow into lines 9-23, and then all the other conditions apply. L8 is a mistake and should be removed, yes.
EDIT: I just realized that all assertions in `helpers.tpl` are not effective; only the ones in `config.yaml` are. They were duplicated anyhow. I did a cleanup and added tests in the linked PR.
Given that I am running Gitea in HA and not encountering the failure, something seems off, yes.
I've tested all of the assertions in a DEV instance back when I added the code, so they work in general. I will add some tests.
As far as we/I know, bleve does not "simply" work in multi-instance mode, which is why there has been an effort to push for another alternative besides Elasticsearch that works in such scenarios. I cannot fully prove it though. It surely would be great if bleve definitely worked in HA.
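For context, the indexer backend is selected in Gitea's app.ini. A minimal sketch of switching away from bleve, assuming the standard `[indexer]` keys from the Gitea config cheat sheet (the connection string below is a placeholder):

```ini
; sketch: use an external indexer instead of bleve for
; multi-instance setups; the URL is a placeholder
[indexer]
ISSUE_INDEXER_TYPE = elasticsearch
ISSUE_INDEXER_CONN_STR = http://elastic:changeme@elasticsearch:9200
```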
I don't understand that comment. The chart is clearly aiming for HA, see the README etc. L8 is just a mistake.
I wasn't sure if the line I pointed to was there on purpose, and I wasn't sure how to understand your previous comment (I was pointing to the line that forces this chart to work as a single instance, and you were saying it is correct, and I wasn't sure why).
I think I was misunderstood. I'm not saying bleve can handle multiple instances. As far as I know, Gitea uses queues and does not update bleve on multiple instances.
You do need queues that can handle multiple instances (like Redis).
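A minimal sketch of such a shared queue in app.ini, assuming Gitea's standard `[queue]` keys (the Redis address is a placeholder):

```ini
; sketch: a shared Redis-backed queue so multiple instances
; coordinate background work; the address is a placeholder
[queue]
TYPE = redis
CONN_STR = redis://:password@redis-shared:6379/0
```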
Is there any documentation that states this is not thread safe and that those assertions are actually needed?
Classic "works on my machine"? 😆
It fails for me. That's why I created this issue.
Otherwise dead code should probably be removed.
I think this was a double-sided misunderstanding.
As the README prominently states, the chart provides (and aims for) a HA setup. And the setup works and has been in good use for many months (also on different clusters I manage).
There were some initial discussions in issues of this repo and in Gitea core where people stated that `bleve` won't work in HA, which is why all the Elasticsearch & friends efforts were pushed. I don't have the links right now though.
Hmm strange. I couldn't trigger the failure when creating the unittest, and if it were effective, nobody should ever have been able to use >1 replica, since the failure was not conditioned on anything (it was likely a c/p mistake).
So you're saying `helm install` fails for you with `replicaCount: 2` and the error points to the line in `helpers.tpl`?
I am using ArgoCD, so it is not a direct helm install.
It was failing when I created the issue.
I will set up a test environment, try to reproduce it once again, and compare it with plain helm. I will come back with the result.
Hmm interesting, this would mean that the code in `helpers.tpl` is not getting ignored. Wondering what the difference is for Argo compared to `helm install` or the unittest plugin we're using, since I couldn't trigger it there and I also did not hit the failure in my own k8s install, which uses multiple replicas.
Let's see what you can find out!
So far I have had no luck reproducing exactly the same issue.
Here is what ArgoCD says when I try to set replicaCount to 2:
I am able to reproduce the same result with
Same result for:
This is gitea-10.1.1
EDIT:
It seems like this is a different issue. I should probably create a separate ticket for it.
The error might come from `Values.gitea.config.cron.GIT_GC_REPOS` not being defined, and hence helm cannot check it.
In #611, specifically in
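If that undefined key is indeed the cause, defining it explicitly in values.yaml might sidestep the failed lookup. A hypothetical fragment, with the nesting guessed purely from the `Values.gitea.config.cron.GIT_GC_REPOS` path mentioned above:

```yaml
# hypothetical: define the key so the template check has a value
# to inspect; schema guessed from the key path in the comment above
gitea:
  config:
    cron:
      GIT_GC_REPOS:
        ENABLED: false
```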
Can you run the checks on the PR? Otherwise we merge and you can check on the next release.