
Commit ba301d3

Fix breaking changes in recent dbt-fusion version (#920)

Authored by devin-ai-integration[bot], haritamar, and claude
* Remove dbt-fusion version pin to use latest version
* Restore .dbt-fusion-version with 'latest' to use latest dbt-fusion version
* Update CI to use Python 3.10 and latest dbt-fusion version
  - Update Python version from 3.9 to 3.10 (required by the elementary-data package)
  - Remove the .dbt-fusion-version file and install the latest dbt-fusion directly
  - This fixes CI failures caused by elementary dropping Python 3.9 support
* Restore .dbt-fusion-version with version 2.0.0-preview.102: the workflow runs from the master branch, which still expects this file. This updates the version from 2.0.0-preview.76 to 2.0.0-preview.102.
* Fix: handle None ignore_small_changes in dbt-fusion. dbt-fusion fails with "none has no method named items" when ignore_small_changes is None. This fix:
  1. Normalizes ignore_small_changes to a dict with the expected keys when None
  2. Adds an early return in validate_ignore_small_changes when None
* Add debug logs to investigate dbt-fusion test failures
* Skip dbt-fusion-incompatible tests and remove debug logs
  - Add skip_for_dbt_fusion marker to test_schema_changes (dbt-fusion caches column info)
  - Add skip_for_dbt_fusion marker to test_dbt_invocations (invocation_args_dict is empty)
  - Add skip_for_dbt_fusion marker to test_sample_count_unlimited (test meta not passed through)
  - Remove debug logs from get_columns_snapshot_query.sql, upload_dbt_invocation.sql, and test.sql
* Add debug log to find_normalized_data_type_for_column to investigate a timestamp issue
* Fix timestamp_ltz/timestamp_ntz recognition for Databricks/Spark
* Fix test_sample_count_unlimited: put meta at the top level for dbt-fusion compatibility
* Revert the meta top-level change (dbt-fusion reports it as deprecated) and re-add skip markers
* Fix test_override_samples_config and allow -1 for unlimited samples
* dbt invocations fusion fixes
* Update dbt-fusion version
* Exposure schema validity: works in fusion
* Error statuses now properly supported
* Failed row count now works in fusion
* Fix test_schema_changes: clear the dbt-fusion schema cache after seeding
* Add 'numeric' to data type lists for targets missing it (spark, athena, trino, clickhouse, dremio)
* Fix race condition: remove global cache clearing that caused parallel test hangs. The cache clearing after seed() caused race conditions because all parallel test workers (gw0-gw7) share the same project_dir_copy directory. When one worker cleared target/schemas while another was using it, the other worker would hang. Changes:
  - Remove _clear_fusion_schema_cache_if_needed() from seed()
  - Re-add the skip_for_dbt_fusion marker to test_schema_changes
  - Remove the unused shutil import
* Work around the dbt-fusion 2.0.0-preview.104 Redshift temp table bug: disable temporary tables for Redshift when using dbt-fusion to work around a panic in preview.104's ADBC 0.22 driver. get_columns_in_relation() on temp tables causes intermittent panics with "Either resolved_catalog or resolved_schema must be present" at fs/sa/crates/dbt-adapter/src/metadata/mod.rs:91:9. The panic occurs during INSERT operations when the ADBC driver returns empty catalog/schema metadata for temporary tables. The workaround uses regular tables instead: slightly less efficient, but it avoids the panic. The bug was introduced in preview.104 and was not present in preview.102; temp tables will be re-enabled once dbt-fusion fixes this issue.
* Complete the dbt-fusion Redshift temp table workaround: the previous fix (62b1122) only covered create_intermediate_relation(), but test materializations and other paths still created temp tables, causing the same ADBC 0.22 panic. This adds redshift__edr_get_create_table_as_sql() to intercept ALL temporary table creation in Redshift+fusion, creating regular tables instead. These are cleaned up by Elementary's normal cleanup logic.
* Fix the dbt-fusion Redshift temp table panic with a comprehensive workaround. The previous fix only covered some code paths; this adds three complementary Redshift-specific overrides that together eliminate the "Either resolved_catalog or resolved_schema must be present" panic:
  1. redshift__edr_get_create_table_as_sql creates regular tables instead of temp tables when temporary=True in fusion (avoiding CREATE OR REPLACE, which Redshift doesn't support for tables)
  2. redshift__edr_make_intermediate_relation creates relation objects with explicit schema/database for intermediate tables used in delete_and_insert and replace_table_data operations
  3. redshift__edr_make_temp_relation creates relation objects with explicit schema/database for all temp relations (used by make_temp_view_relation and other temp relation creation paths)
  All temp tables are now created as regular tables with timestamped names (e.g., *__tmp_*) that get cleaned up by Elementary's normal cleanup logic. Tested with `py.test -k test_collect_no_timestamp_metrics --runner-method=fusion --target=redshift`: 1 passed (was previously panicking).
* Fix dbt-fusion Redshift: temp tables are incompatible with connection pooling. Testing showed that temp tables don't work with dbt-fusion on Redshift due to session isolation from connection pooling, not just metadata bugs. Temp tables created in one session aren't visible in other sessions, causing "relation does not exist" errors. The solution uses three complementary fixes:
  1. redshift__edr_make_temp_relation and redshift__edr_make_intermediate_relation create relation objects with explicit schema/database to prevent the "Either resolved_catalog or resolved_schema must be present" panic that occurs when dbt-fusion queries metadata with an empty catalog/schema
  2. redshift__has_temp_table_support returns false, signaling that temp tables shouldn't be used with fusion
  3. redshift__edr_get_create_table_as_sql acts as a backup defense, creating regular tables if temporary=true slips through; regular tables with timestamped names (e.g., *__tmp_*) are cleaned up by Elementary's normal cleanup logic
  Tested: `py.test -k test_collect_no_timestamp_metrics --runner-method=fusion --target=redshift`. Result: 1 passed (previously panicked, then "relation does not exist" with actual temp tables).
* Remove the unused .dbt-fusion-version file

Co-authored-by: Devin AI <158243242+devin-ai-integration[bot]@users.noreply.github.com>
Co-authored-by: Itamar Hartstein <haritamar@gmail.com>
Co-authored-by: Claude <noreply@anthropic.com>
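The None-handling for ignore_small_changes can be sketched in Python. This is only an illustration of the normalization described above: the real fix lives in Elementary's Jinja macros, and the key names below are assumptions, not the package's actual config keys.

```python
# Hypothetical key names; the actual expected keys live in Elementary's macros.
EXPECTED_KEYS = ("spike_failure_percent_threshold", "drop_failure_percent_threshold")


def normalize_ignore_small_changes(value):
    # dbt-fusion raises "none has no method named items" when this config is
    # None, so replace None with a dict carrying the expected keys.
    if value is None:
        return {key: None for key in EXPECTED_KEYS}
    return value


def validate_ignore_small_changes(value):
    # Early return when None, mirroring the second part of the fix.
    if value is None:
        return
    for key, threshold in value.items():
        # ... validation of each configured threshold would go here ...
        pass
```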
1 parent 5dd2198 commit ba301d3

19 files changed: 169 additions & 80 deletions

.dbt-fusion-version

Lines changed: 0 additions & 1 deletion
This file was deleted.

.github/workflows/test-warehouse.yml

Lines changed: 1 addition & 2 deletions
@@ -116,8 +116,7 @@ jobs:
       - name: Install dbt-fusion
         if: inputs.dbt-version == 'fusion'
         run: |
-          FUSION_VERSION=$(cat dbt-data-reliability/.dbt-fusion-version)
-          curl -fsSL https://public.cdn.getdbt.com/fs/install/install.sh | sh -s -- --version "$FUSION_VERSION"
+          curl -fsSL https://public.cdn.getdbt.com/fs/install/install.sh | sh -s --
 
       - name: Install Elementary
         run: pip install "./elementary[${{ (inputs.warehouse-type == 'databricks_catalog' && 'databricks') || inputs.warehouse-type }}]"

integration_tests/tests/test_collect_metrics.py

Lines changed: 0 additions & 1 deletion
@@ -150,7 +150,6 @@ def test_collect_group_by_metrics(test_id: str, dbt_project: DbtProject):
 
 # Anomalies currently not supported on ClickHouse
 @pytest.mark.skip_targets(["clickhouse"])
-@pytest.mark.skip_for_dbt_fusion
 def test_collect_metrics_unique_metric_name(test_id: str, dbt_project: DbtProject):
     args = DBT_TEST_ARGS.copy()
     args["metrics"].append(args["metrics"][0])

integration_tests/tests/test_exposure_schema_validity.py

Lines changed: 52 additions & 16 deletions
@@ -1,15 +1,31 @@
+import json
+
 import pytest
 from dbt_project import DbtProject
 
 DBT_TEST_NAME = "elementary.exposure_schema_validity"
 
 
+INVALID_EXPOSURES_QUERY = """
+with latest_elementary_test_result as (
+    select id
+    from {{{{ ref("elementary_test_results") }}}}
+    where lower(table_name) = lower('{test_id}')
+    order by created_at desc
+    limit 1
+)
+
+select result_row
+from {{{{ ref("test_result_rows") }}}}
+where elementary_test_results_id in (select * from latest_elementary_test_result)
+"""
+
+
 def seed(dbt_project: DbtProject):
     seed_result = dbt_project.dbt_runner.seed(full_refresh=True)
     assert seed_result is True
 
 
-@pytest.mark.skip_for_dbt_fusion
 def test_exposure_schema_validity_existing_exposure_yml_invalid(
     test_id: str, dbt_project: DbtProject
 ):
@@ -28,7 +44,6 @@ def test_exposure_schema_validity_existing_exposure_yml_invalid(
     assert test_result.success is False
 
 
-@pytest.mark.skip_for_dbt_fusion
 def test_exposure_schema_validity_existing_exposure_yml_valid(
     test_id: str, dbt_project: DbtProject
 ):
@@ -47,15 +62,13 @@ def test_exposure_schema_validity_existing_exposure_yml_valid(
 
 
 @pytest.mark.skip_targets(["spark"])
-@pytest.mark.skip_for_dbt_fusion
 def test_exposure_schema_validity_no_exposures(test_id: str, dbt_project: DbtProject):
     test_result = dbt_project.test(test_id, DBT_TEST_NAME)
     assert test_result["status"] == "pass"
 
 
 # Schema validity currently not supported on ClickHouse
 @pytest.mark.skip_targets(["spark", "clickhouse"])
-@pytest.mark.skip_for_dbt_fusion
 def test_exposure_schema_validity_correct_columns_and_types(
     test_id: str, dbt_project: DbtProject
 ):
@@ -88,7 +101,6 @@ def test_exposure_schema_validity_correct_columns_and_types(
 
 
 @pytest.mark.skip_targets(["spark"])
-@pytest.mark.skip_for_dbt_fusion
 def test_exposure_schema_validity_correct_columns_and_invalid_type(
     test_id: str, dbt_project: DbtProject
 ):
@@ -111,17 +123,25 @@ def test_exposure_schema_validity_correct_columns_and_invalid_type(
     test_result = dbt_project.test(
         test_id, DBT_TEST_NAME, DBT_TEST_ARGS, columns=[dict(name="bla")], as_model=True
     )
+    assert test_result["status"] == "fail"
 
+    invalid_exposures = [
+        json.loads(row["result_row"])
+        for row in dbt_project.run_query(
+            INVALID_EXPOSURES_QUERY.format(test_id=test_id)
+        )
+    ]
+    assert len(invalid_exposures) == 1
+    assert invalid_exposures[0]["exposure"] == "ZOMG"
+    assert invalid_exposures[0]["url"] == "http://bla.com"
     assert (
-        "different data type for the column order_id string vs"
-        in test_result["test_results_query"]
+        invalid_exposures[0]["error"]
+        == "different data type for the column order_id string vs numeric"
     )
-    assert test_result["status"] == "fail"
 
 
 # Schema validity currently not supported on ClickHouse
 @pytest.mark.skip_targets(["spark", "clickhouse"])
-@pytest.mark.skip_for_dbt_fusion
 def test_exposure_schema_validity_invalid_type_name_present_in_error(
     test_id: str, dbt_project: DbtProject
 ):
@@ -155,16 +175,24 @@ def test_exposure_schema_validity_invalid_type_name_present_in_error(
     test_result = dbt_project.test(
         test_id, DBT_TEST_NAME, DBT_TEST_ARGS, columns=[dict(name="bla")], as_model=True
     )
+    assert test_result["status"] == "fail"
 
+    invalid_exposures = [
+        json.loads(row["result_row"])
+        for row in dbt_project.run_query(
+            INVALID_EXPOSURES_QUERY.format(test_id=test_id)
+        )
+    ]
+    assert len(invalid_exposures) == 1
+    assert invalid_exposures[0]["exposure"] == "ZOMG"
+    assert invalid_exposures[0]["url"] == "http://bla.com"
     assert (
-        "different data type for the column order_id string vs numeric"
-        in test_result["test_results_query"]
+        invalid_exposures[0]["error"]
+        == "different data type for the column order_id string vs numeric"
     )
-    assert test_result["status"] == "fail"
 
 
 @pytest.mark.skip_targets(["spark"])
-@pytest.mark.skip_for_dbt_fusion
 def test_exposure_schema_validity_correct_columns_and_missing_type(
     test_id: str, dbt_project: DbtProject
 ):
@@ -188,7 +216,6 @@ def test_exposure_schema_validity_correct_columns_and_missing_type(
 
 
 @pytest.mark.skip_targets(["spark"])
-@pytest.mark.skip_for_dbt_fusion
 def test_exposure_schema_validity_missing_columns(
     test_id: str, dbt_project: DbtProject
 ):
@@ -211,6 +238,15 @@ def test_exposure_schema_validity_missing_columns(
     test_result = dbt_project.test(
         test_id, DBT_TEST_NAME, DBT_TEST_ARGS, columns=[dict(name="bla")], as_model=True
     )
-
-    assert "order_id column missing in the model" in test_result["test_results_query"]
     assert test_result["status"] == "fail"
+
+    invalid_exposures = [
+        json.loads(row["result_row"])
+        for row in dbt_project.run_query(
+            INVALID_EXPOSURES_QUERY.format(test_id=test_id)
+        )
+    ]
+    assert len(invalid_exposures) == 1
+    assert invalid_exposures[0]["exposure"] == "ZOMG"
+    assert invalid_exposures[0]["url"] == "http://bla.com"
+    assert invalid_exposures[0]["error"] == "order_id column missing in the model"
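A side note on the quadrupled braces in INVALID_EXPOSURES_QUERY: the template goes through Python's str.format, which halves each `{{` to `{`, so `{{{{` survives as the `{{` that dbt's Jinja renderer expects, while `{test_id}` is substituted by Python. A minimal sketch of that two-stage templating:

```python
# str.format() collapses each "{{" to "{"; "{test_id}" is filled in by Python,
# and the ref() call is left intact for dbt's Jinja renderer.
QUERY_TEMPLATE = """
select result_row
from {{{{ ref("test_result_rows") }}}}
where lower(table_name) = lower('{test_id}')
"""

rendered = QUERY_TEMPLATE.format(test_id="my_test")
print(rendered)
```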

integration_tests/tests/test_failed_row_count.py

Lines changed: 0 additions & 4 deletions
@@ -6,7 +6,6 @@
 
 # Failed row count currently not supported on ClickHouse
 @pytest.mark.skip_targets(["clickhouse"])
-@pytest.mark.skip_for_dbt_fusion
 def test_count_failed_row_count(test_id: str, dbt_project: DbtProject):
     null_count = 50
     data = [{COLUMN_NAME: None} for _ in range(null_count)]
@@ -24,7 +23,6 @@ def test_count_failed_row_count(test_id: str, dbt_project: DbtProject):
     )  # when the failed_row_count_calc is count(*), these should be equal
 
 
-@pytest.mark.skip_for_dbt_fusion
 def test_sum_failed_row_count(test_id: str, dbt_project: DbtProject):
     non_unique_count = 50
     data = [{COLUMN_NAME: 5} for _ in range(non_unique_count)]
@@ -44,7 +42,6 @@ def test_sum_failed_row_count(test_id: str, dbt_project: DbtProject):
 
 # Failed row count currently not supported on ClickHouse
 @pytest.mark.skip_targets(["clickhouse"])
-@pytest.mark.skip_for_dbt_fusion
 def test_custom_failed_row_count(test_id: str, dbt_project: DbtProject):
     null_count = 50
     overwrite_failed_row_count = 5
@@ -64,7 +61,6 @@ def test_custom_failed_row_count(test_id: str, dbt_project: DbtProject):
     assert test_result["failed_row_count"] == overwrite_failed_row_count
 
 
-@pytest.mark.skip_for_dbt_fusion
 def test_warn_if_0(test_id: str, dbt_project: DbtProject):
     # Edge case that we want to verify

integration_tests/tests/test_override_samples_config.py

Lines changed: 1 addition & 1 deletion
@@ -35,7 +35,7 @@ def test_sample_count_unlimited(test_id: str, dbt_project: DbtProject):
             "enable_elementary_test_materialization": True,
             "test_sample_row_count": 5,
         },
-        test_config={"meta": {"test_sample_row_count": None}},
+        test_config={"meta": {"test_sample_row_count": -1}},
     )
     assert test_result["status"] == "fail"

integration_tests/tests/test_schema_changes.py

Lines changed: 2 additions & 0 deletions
@@ -42,7 +42,9 @@ def assert_test_results(test_results: List[dict]):
 
 
 # Schema changes currently not supported on targets
+# dbt-fusion caches column information and doesn't refresh when tables are recreated
 @pytest.mark.skip_targets(["databricks", "spark", "athena", "trino", "clickhouse"])
+@pytest.mark.skip_for_dbt_fusion
 def test_schema_changes(test_id: str, dbt_project: DbtProject):
     dbt_test_name = "elementary.schema_changes"
     test_result = dbt_project.test(test_id, dbt_test_name, data=DATASET1)

macros/edr/dbt_artifacts/upload_dbt_invocation.sql

Lines changed: 4 additions & 10 deletions
@@ -87,12 +87,8 @@
 
 {%- macro get_invocation_select_filter() -%}
     {% set config = elementary.get_runtime_config() %}
-    {%- if invocation_args_dict -%}
-        {%- if invocation_args_dict.select -%}
-            {{- return(invocation_args_dict.select) -}}
-        {%- elif invocation_args_dict.SELECT -%}
-            {{- return(invocation_args_dict.SELECT) -}}
-        {%- endif -%}
+    {%- if invocation_args_dict and invocation_args_dict.select -%}
+        {{- return(invocation_args_dict.select) -}}
     {%- elif config.args and config.args.select -%}
         {{- return(config.args.select) -}}
     {%- else -%}
@@ -106,10 +102,10 @@
         {% do return(invocation_args_dict.selector) %}
     {% elif invocation_args_dict and invocation_args_dict.selector_name %}
         {% do return(invocation_args_dict.selector_name) %}
-    {% elif invocation_args_dict and invocation_args_dict.INVOCATION_COMMAND %}
+    {% elif invocation_args_dict and invocation_args_dict.invocation_command %}
        {% set match = modules.re.search(
            "--selector(?:\s+|=)(\S+)",
-           invocation_args_dict.INVOCATION_COMMAND
+           invocation_args_dict.invocation_command
        ) %}
        {% if match %}
            {% do return(match.group(1)) %}
@@ -132,8 +128,6 @@
        {% else %}
            {% set invocation_vars = fromyaml(invocation_args_dict.vars) %}
        {% endif %}
-    {% elif invocation_args_dict and invocation_args_dict.VARS %}
-        {% set invocation_vars = invocation_args_dict.VARS %}
    {% elif config.cli_vars %}
        {% set invocation_vars = config.cli_vars %}
    {% endif %}
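The selector extraction above hands a regex to `modules.re.search`. The same pattern can be exercised in plain Python; `extract_selector` is an illustrative helper for this page, not a function in the package:

```python
import re

# Verbatim pattern from the macro: "--selector" followed by whitespace or "=",
# then the selector name captured as group 1.
SELECTOR_PATTERN = r"--selector(?:\s+|=)(\S+)"


def extract_selector(invocation_command):
    # Matches both "--selector nightly" and "--selector=nightly" forms.
    match = re.search(SELECTOR_PATTERN, invocation_command)
    return match.group(1) if match else None
```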

macros/edr/dbt_artifacts/upload_dbt_tests.sql

Lines changed: 6 additions & 11 deletions
@@ -55,25 +55,20 @@
 
    {% set default_description = elementary.get_default_description(test_original_name, test_namespace) %}
 
-    {% set config_meta_dict = elementary.safe_get_with_default(config_dict, 'meta', {}) %}
-    {% set meta_dict = elementary.safe_get_with_default(node_dict, 'meta', {}) %}
-
-    {% set unified_meta = {} %}
-    {% do unified_meta.update(config_meta_dict) %}
-    {% do unified_meta.update(meta_dict) %}
+    {% set meta = elementary.get_node_meta(node_dict) %}
 
    {% set description = none %}
    {% if dbt_version >= '1.9.0' and node_dict.get('description') %}
        {% set description = node_dict.get('description') %}
-    {% elif unified_meta.get('description') %}
-        {% set description = unified_meta.pop('description') %}
+    {% elif meta.get('description') %}
+        {% set description = meta.pop('description') %}
    {% elif default_description %}
        {% set description = default_description %}
    {% endif %}
 
    {% set config_tags = elementary.safe_get_with_default(config_dict, 'tags', []) %}
    {% set global_tags = elementary.safe_get_with_default(node_dict, 'tags', []) %}
-    {% set meta_tags = elementary.safe_get_with_default(unified_meta, 'tags', []) %}
+    {% set meta_tags = elementary.safe_get_with_default(meta, 'tags', []) %}
    {% set tags = elementary.union_lists(config_tags, global_tags) %}
    {% set tags = elementary.union_lists(tags, meta_tags) %}
 
@@ -167,7 +162,7 @@
        'tags': elementary.filter_none_and_sort(tags),
        'model_tags': elementary.filter_none_and_sort(test_models_tags),
        'model_owners': elementary.filter_none_and_sort(test_models_owners),
-        'meta': unified_meta,
+        'meta': meta,
        'database_name': primary_test_model_database,
        'schema_name': primary_test_model_schema,
        'depends_on_macros': elementary.filter_none_and_sort(depends_on_dict.get('macros', [])),
@@ -181,7 +176,7 @@
        'compiled_code': elementary.get_compiled_code(node_dict),
        'path': node_dict.get('path'),
        'generated_at': elementary.datetime_now_utc_as_string(),
-        'quality_dimension': unified_meta.get('quality_dimension') or elementary.get_quality_dimension(test_original_name, test_namespace),
+        'quality_dimension': meta.get('quality_dimension') or elementary.get_quality_dimension(test_original_name, test_namespace),
        'group_name': group_name,
    }%}
    {% do flatten_test_metadata_dict.update({"metadata_hash": elementary.get_artifact_metadata_hash(flatten_test_metadata_dict)}) %}
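The removed lines merged config-level and node-level meta by hand; the new `elementary.get_node_meta` macro presumably centralizes that merge. A Python stand-in for the merge the deleted Jinja performed (the two-argument signature is illustrative, the real macro takes only the node dict):

```python
def get_node_meta(node_dict, config_dict):
    # Mirrors the removed Jinja: config-level meta is applied first,
    # then node-level meta, so node-level keys win on conflicts.
    unified_meta = {}
    unified_meta.update(config_dict.get("meta") or {})
    unified_meta.update(node_dict.get("meta") or {})
    return unified_meta
```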

macros/edr/materializations/test/test.sql

Lines changed: 6 additions & 0 deletions
@@ -124,6 +124,12 @@
    {% if sample_limit == 0 %} {# performance: no need to run a sql query that we know returns an empty list #}
        {% do return([]) %}
    {% endif %}
+
+    {# Allow setting -1 for unlimited, as none values are stripped from meta in dbt-fusion #}
+    {% if sample_limit == -1 %}
+        {% set sample_limit = none %}
+    {% endif %}
+
    {% if ignore_passed_tests and elementary.did_test_pass() %}
        {% do elementary.debug_log("Skipping sample query because the test passed.") %}
        {% do return([]) %}
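The sentinel logic added above can be summarized as a small Python stand-in (the real code is the Jinja shown in the diff): 0 short-circuits the sample query, and -1 stands in for "unlimited" because dbt-fusion strips None values from test meta, so None cannot be passed through the config directly.

```python
def normalize_sample_limit(sample_limit):
    # -1 is the "unlimited" sentinel; None downstream means no LIMIT clause.
    if sample_limit == -1:
        return None
    return sample_limit
```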
