Skip to content
Open
Show file tree
Hide file tree
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension


Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
5 changes: 3 additions & 2 deletions .github/workflows/deploy.yml
Original file line number Diff line number Diff line change
Expand Up @@ -45,6 +45,7 @@ on:
- web
- sidekiq
- ops
- metrics
default: all
run_pre_deploy_migrations:
description: Run `pre-deploy` job (schema-migrations) before service deployment.
Expand Down Expand Up @@ -103,7 +104,7 @@ jobs:
fail-fast: true
matrix:
service: >-
${{ inputs.server_types == 'all' && fromJSON('["web", "sidekiq", "ops"]') ||
${{ inputs.server_types == 'all' && fromJSON('["web", "sidekiq", "ops", "metrics"]') ||
fromJSON(format('["{0}"]', inputs.server_types)) }}
env:
repository_name: ${{ matrix.service == 'ops' && 'mavis/ops' || 'mavis/webapp' }}
Expand Down Expand Up @@ -328,7 +329,7 @@ jobs:
fail-fast: false
matrix:
service: >-
${{ inputs.server_types == 'all' && fromJSON('["web", "sidekiq", "ops"]') ||
${{ inputs.server_types == 'all' && fromJSON('["web", "sidekiq", "ops", "metrics"]') ||
fromJSON(format('["{0}"]', inputs.server_types)) }}
steps:
- name: Configure AWS Credentials
Expand Down
7 changes: 6 additions & 1 deletion bin/docker-start
Original file line number Diff line number Diff line change
Expand Up @@ -12,10 +12,15 @@ elif [ "$SERVER_TYPE" == "sidekiq" ]; then
"$BIN_DIR"/prometheus_exporter &
sleep 5
exec "$BIN_DIR"/sidekiq
elif [ "$SERVER_TYPE" == "metrics" ]; then
echo "Starting metrics publisher..."
"$BIN_DIR"/prometheus_exporter skip-server-labels &
sleep 5
exec "$BIN_DIR"/metrics-publisher
elif [ "$SERVER_TYPE" == "none" ]; then
echo "No server started"
exec tail -f /dev/null # Keep container running
else
echo "SERVER_TYPE variable: '$SERVER_TYPE' unknown. Allowed values: web, sidekiq, none"
echo "SERVER_TYPE variable: '$SERVER_TYPE' unknown. Allowed values: web, sidekiq, metrics, none"
exit 1
fi
12 changes: 12 additions & 0 deletions bin/metrics-publisher
Original file line number Diff line number Diff line change
@@ -0,0 +1,12 @@
#!/usr/bin/env ruby
# frozen_string_literal: true

require_relative "../config/environment"
Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do we actually need to load the entire Rails env? Or is this just to load bundler so we can require prometheus_exporter/instrumentation? Just looking at PrometheusExporter::CustomActiveRecordCollector it's not clear to me that all of Rails is needed.

require "prometheus_exporter/instrumentation"

PrometheusExporter::Instrumentation::SidekiqStats.start
PrometheusExporter::Instrumentation::SidekiqQueue.start(all_queues: true)

at_exit { PrometheusExporter::Client.default.stop(wait_timeout_seconds: 10) }

Kernel.sleep
27 changes: 20 additions & 7 deletions bin/prometheus_exporter
Original file line number Diff line number Diff line change
Expand Up @@ -27,18 +27,31 @@ def fetch_ecs_task_id
end
end

if ARGV.include?("local")
task_id = "N/A"
service_name = "N/A"
else
task_id = fetch_ecs_task_id
service_name = ENV["SERVICE_NAME"]
# An explicitly-set env var (including empty string) overrides the default.
# An empty value omits the label entirely — useful for the metrics task,
# whose series identity must stay stable across container replacements.
def resolve_label(env_key, &default)
if ENV.key?(env_key)
value = ENV[env_key]
return value.empty? ? nil : value
end

default.call
end

labels = {
"TaskId" => resolve_label("PROMETHEUS_TASK_ID_LABEL") do
ARGV.include?("skip-server-labels") ? nil : fetch_ecs_task_id
end,
"ServiceName" => resolve_label("PROMETHEUS_SERVICE_NAME_LABEL") do
ARGV.include?("skip-server-labels") ? nil : ENV["SERVICE_NAME"]
end
}.compact
Comment on lines +42 to +49
Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think I would've expected skip-server-labels to override the env vars? For example, I would expect that a shell could have those env vars set from previous testing / work, but if you ran this script with skip-server-labels it wouldn't send those. I think the way resolve_label works is that it'll ignore this block, and the ARGV check inside it, if the env var is set.

If I'm right, simply swapping the ARGV check with the fetch_ecs_task_id call would fix it, although this syntax might be invalid because of the block, probably will have to tweak it.

Suggested change
labels = {
"TaskId" => resolve_label("PROMETHEUS_TASK_ID_LABEL") do
ARGV.include?("skip-server-labels") ? nil : fetch_ecs_task_id
end,
"ServiceName" => resolve_label("PROMETHEUS_SERVICE_NAME_LABEL") do
ARGV.include?("skip-server-labels") ? nil : ENV["SERVICE_NAME"]
end
}.compact
labels = {
"TaskId" => ARGV.include?("skip-server-labels") ? nil : resolve_label("PROMETHEUS_TASK_ID_LABEL") do
fetch_ecs_task_id
end,
"ServiceName" => ARGV.include?("skip-server-labels") ? nil : resolve_label("PROMETHEUS_SERVICE_NAME_LABEL") do
ENV["SERVICE_NAME"]
end
}.compact


runner = PrometheusExporter::Server::Runner.new(
port: 9394,
bind: "0.0.0.0",
label: { "TaskId" => task_id, "ServiceName" => service_name },
label: labels,
type_collectors: [PrometheusExporter::CustomActiveRecordCollector],
)

Expand Down
5 changes: 3 additions & 2 deletions config/initializers/sidekiq.rb
Original file line number Diff line number Diff line change
Expand Up @@ -49,8 +49,9 @@ def call(_worker, job, _queue)
PrometheusExporter::Instrumentation::Process.start type: "sidekiq"
PrometheusExporter::Instrumentation::ActiveRecord.start
PrometheusExporter::Instrumentation::SidekiqProcess.start
PrometheusExporter::Instrumentation::SidekiqQueue.start
PrometheusExporter::Instrumentation::SidekiqStats.start
# SidekiqStats and SidekiqQueue are global Redis-backed metrics published
# by the metrics task (bin/metrics-publisher) to avoid duplicate
# series across Sidekiq containers.
end

at_exit do
Expand Down
Loading