Skip to content

Redis sentinel sentinels k8s-service#441

Open
klml wants to merge 5 commits into
zammad:mainfrom
it-at-m:redis-sentinel-sentinels-svc
Open

Redis sentinel sentinels k8s-service#441
klml wants to merge 5 commits into
zammad:mainfrom
it-at-m:redis-sentinel-sentinels-svc

Conversation

@klml

@klml klml commented Jun 17, 2026

Copy link
Copy Markdown
Contributor

What this PR does / why we need it

Removes the switch to use the vanilla k8s-service as "sentinels" when sentinel is activated.

  • if redis was enabled, what is mandatory for redis-sentinel, the else (with the configurable "sentinels") could never be used.
  • if you want to use sentinel, there is no reason to use the vanilla not-sentinel "sentinels"

Checklist

  • Chart Version bumped
  • Upgrading instructions are documented in the zammad/README.md

@mgruner

mgruner commented Jun 18, 2026

Copy link
Copy Markdown
Collaborator

@klml CI seems to have issues with this change, can you check please?

@mgruner

mgruner commented Jun 29, 2026

Copy link
Copy Markdown
Collaborator

@klml For some reason this CI job times out when testing the "full" scenarios with sentinel enabled. I'm not sure why.

Here's a comparison job that works ok: https://github.com/zammad/zammad-helm/actions/runs/28149865051/job/83365177413

Installing it locally seems to work well.

@mgruner

mgruner commented Jun 29, 2026

Copy link
Copy Markdown
Collaborator

Here's an AI analysis of the problem. Can you please rebase and go over your fix? You need to revert the dropping of internal sentinel handling, and instead only correct the name that gets used.

Root cause

This is a release-name mismatch introduced by the _helpers.tpl change in this PR.

What the PR changed (_helpers.tpl): the REDIS_SENTINELS env var used to be release-name-aware when the bundled Redis subchart was enabled:

- name: REDIS_SENTINELS
{{- if .Values.zammadConfig.redis.enabled }}
  value: "{{ .Release.Name }}-redis"      # ← adapted to the actual release name
{{- else }}
  value: "{{ join "," .Values.zammadConfig.redis.sentinel.sentinels }}"
{{- end }}

Now it always uses the static list, which defaults to the hardcoded zammad-redis-sentinel:26379 (values.yaml:302).

Why that breaks only in CI: ct install does not install the release as zammad. It generates a release name of the form zammad-<random10> (the namespace is pinned to zammad via ct.yaml, but the release name is randomized). I rendered the full scenario with a ct-style name to demonstrate:

Thing | Value -- | -- Sentinel Service actually created by the redis subchart | zammad-abc1234567-redis-sentinel (it's {{ include "redis.fullname" . }}-sentinel, release-name-prefixed) REDIS_SENTINELS injected into Zammad pods | zammad-redis-sentinel:26379 (hardcoded)

So in CI the hostname zammad-redis-sentinel doesn't exist → Zammad can't resolve/reach a sentinel → Rails init job and pods never become ready → ct install hangs until the --timeout 900s is hit → the job appears "stuck."

Why it works for you locally: you install with the release name zammad, so zammad-redis-sentinel resolves to the real zammad-redis-sentinel service. The mismatch is invisible at release name zammad — it only surfaces under any other release name, which is exactly what ct uses.

Note all the other subchart hosts in the same file (postgres {{ .Release.Name }}-postgresql, memcached {{ .Release.Name }}-memcached, and the standalone REDIS_URL at line 224 which still uses {{ .Release.Name }}-redis) remain release-name-aware — which is why everything except sentinel passes in CI.

Fix

When the bundled Redis subchart is enabled, the sentinel host must be derived from .Release.Name, mirroring the old behavior and the rest of the file. Something like:

- name: REDIS_SENTINELS
{{- if .Values.zammadConfig.redis.enabled }}
  value: "{{ .Release.Name }}-redis-sentinel:{{ .Values.zammadConfig.redis.sentinel.port | default 26379 }}"
{{- else }}
  value: "{{ join "," .Values.zammadConfig.redis.sentinel.sentinels }}"
{{- end }}

(The -sentinel suffix is the only real change vs. the old code — commit 397f837 correctly moved the target from the plain -redis service to the new -redis-sentinel service; the regression was dropping the .Release.Name branch entirely.) The static sentinels list then only applies to the external-Redis case, where its hardcoded default also makes sense.

Root cause This is a release-name mismatch introduced by the _helpers.tpl change in this PR.

What the PR changed (_helpers.tpl): the REDIS_SENTINELS env var used to be release-name-aware when the bundled Redis subchart was enabled:

  • name: REDIS_SENTINELS
    {{- if .Values.zammadConfig.redis.enabled }}
    value: "{{ .Release.Name }}-redis" # ← adapted to the actual release name
    {{- else }}
    value: "{{ join "," .Values.zammadConfig.redis.sentinel.sentinels }}"
    {{- end }}
    Now it always uses the static list, which defaults to the hardcoded zammad-redis-sentinel:26379 (values.yaml:302).

Why that breaks only in CI: ct install does not install the release as zammad. It generates a release name of the form zammad- (the namespace is pinned to zammad via ct.yaml, but the release name is randomized). I rendered the full scenario with a ct-style name to demonstrate:

Thing Value
Sentinel Service actually created by the redis subchart zammad-abc1234567-redis-sentinel (it's {{ include "redis.fullname" . }}-sentinel, release-name-prefixed)
REDIS_SENTINELS injected into Zammad pods zammad-redis-sentinel:26379 (hardcoded)
So in CI the hostname zammad-redis-sentinel doesn't exist → Zammad can't resolve/reach a sentinel → Rails init job and pods never become ready → ct install hangs until the --timeout 900s is hit → the job appears "stuck."

Why it works for you locally: you install with the release name zammad, so zammad-redis-sentinel resolves to the real zammad-redis-sentinel service. The mismatch is invisible at release name zammad — it only surfaces under any other release name, which is exactly what ct uses.

Note all the other subchart hosts in the same file (postgres {{ .Release.Name }}-postgresql, memcached {{ .Release.Name }}-memcached, and the standalone REDIS_URL at line 224 which still uses {{ .Release.Name }}-redis) remain release-name-aware — which is why everything except sentinel passes in CI.

Fix
When the bundled Redis subchart is enabled, the sentinel host must be derived from .Release.Name, mirroring the old behavior and the rest of the file. Something like:

  • name: REDIS_SENTINELS
    {{- if .Values.zammadConfig.redis.enabled }}
    value: "{{ .Release.Name }}-redis-sentinel:{{ .Values.zammadConfig.redis.sentinel.port | default 26379 }}"
    {{- else }}
    value: "{{ join "," .Values.zammadConfig.redis.sentinel.sentinels }}"
    {{- end }}
    (The -sentinel suffix is the only real change vs. the old code — commit 397f837 correctly moved the target from the plain -redis service to the new -redis-sentinel service; the regression was dropping the .Release.Name branch entirely.) The static sentinels list then only applies to the external-Redis case, where its hardcoded default also makes sense.

{{- if .Values.zammadConfig.redis.sentinel.enabled }}
- name: REDIS_SENTINELS
{{- if .Values.zammadConfig.redis.enabled }}
value: "{{ .Release.Name }}-redis"

Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This cannot get dropped, instead you can just change the name that is used here to point to another service.

Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Probably value: "{{ .Release.Name }}-redis-sentinel"

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants