Skip to content

Cherry-pick series of pg_dump related work.#1005

Merged
reshke merged 7 commits intoapache:mainfrom
robozmey:cherry-pick-pg_dump
Apr 26, 2025
Merged

Cherry-pick series of pg_dump related work.#1005
reshke merged 7 commits intoapache:mainfrom
robozmey:cherry-pick-pg_dump

Conversation

@robozmey
Copy link
Copy Markdown
Contributor

@robozmey robozmey commented Mar 25, 2025

greenplum-db/gpdb-archive@1e39d4d311
greenplum-db/gpdb-archive@47e33baf5c
greenplum-db/gpdb-archive@d5725632c4
greenplum-db/gpdb-archive@b250064d83
greenplum-db/gpdb-archive@e57503f55d
greenplum-db/gpdb-archive@f6dcaddd35
greenplum-db/gpdb-archive@8317413885

Fixes #ISSUE_Number

What does this PR do?

Type of Change

  • Bug fix (non-breaking change)
  • New feature (non-breaking change)
  • Breaking change (fix or feature with breaking changes)
  • Documentation update

Breaking Changes

Test Plan

  • Unit tests added/updated
  • Integration tests added/updated
  • Passed make installcheck
  • Passed make -C src/test installcheck-cbdb-parallel

Impact

Performance:

User-facing changes:

Dependencies:

Checklist

Additional Context

CI Skip Instructions


@robozmey robozmey force-pushed the cherry-pick-pg_dump branch 2 times, most recently from f90244b to 43d943d Compare March 25, 2025 10:01
@robozmey robozmey marked this pull request as ready for review March 25, 2025 12:02
@tuhaihe
Copy link
Copy Markdown
Member

tuhaihe commented Apr 9, 2025

Hi @robozmey it would be better to rebase the codebase instead of merging commits to keep the commit history clear, fyi.

@robozmey robozmey force-pushed the cherry-pick-pg_dump branch 2 times, most recently from d4304dc to fb38794 Compare April 11, 2025 08:06
@tuhaihe
Copy link
Copy Markdown
Member

tuhaihe commented Apr 23, 2025

Hi @robozmey, it needs you to rebase this PR again. I see @yjhjstz update this PR with a merge commit, which might result in the nonlinear commit history.

robozmey and others added 7 commits April 24, 2025 10:59
Upstream has removed support for server versions older than 9.2.[1]

The motivation for `error_unsupported_server_version`
was to reduce diff footprint when upstream supported much older
versions.[2]

We still need to support GPDB5 (PostgreSQL 8.3), so
remove version checks for PostgreSQL 8.2 and below.

1: postgres/postgres@30e7c17
2: bmdoil/gpdb@b4c5a61

Authored-by: Brent Doil <bdoil@vmware.com>
Removes the flags:

- pre-data-schema-only
- post-data-schema-only

These are undocumented and unused. Removing them cleans
up the code and reduces the diff with upstream.

They were implemented to support `gpcrondump` which
is no longer supported.

Authored-by: Brent Doil <bdoil@vmware.com>
Backported from PG15 92316a4

Conflicts resolved:
* Move GPDB specific numTypeStorageOptions to
  getSchemaData function
* Refactor getTypeStorageOptions query to return
  tableoid and oid fields, which are required to
  correctly index the hash table

Original commit message follows:

Create a hash table that indexes dumpable objects by CatalogId
(that is, catalog OID + object OID).  Use this to replace the
former catalogIdMap array, as well as various other single-
catalog index arrays, and also the extension membership map.

In principle this should be faster for databases with many objects,
since lookups are now O(1) not O(log N).  However, it seems that these
lookups are pretty much negligible in context, so that no overall
performance change can be measured.  But having only one lookup
data structure to maintain makes the code simpler and more flexible,
so let's do it anyway.

Discussion: https://postgr.es/m/2595220.1634855245@sss.pgh.pa.us
I found these by running pg_dump under "valgrind --leak-check=full".

The changes in flagInhIndexes() and getIndexes() replace allocation of
an array of which we use only some elements by individual allocations
of just the actually-needed objects.  The previous coding wasted some
memory, but more importantly it confused valgrind's leak tracking.

collectComments() and collectSecLabels() remain major blots on
the valgrind report, because they don't PQclear their query
results, in order to avoid a lot of strdup's.  That's a dubious
tradeoff, but I'll leave it alone here; an upcoming patch will
modify those functions enough to justify changing the tradeoff.
(cherry-picked from 70bef49)
Checking for RELKIND_MATVIEW was forgotten in
guessConstraintInheritance().  This isn't a live problem, since it is
checked in flagInhTables() which relkinds can have parents, and those
entries will have numParents==0 after that.  But after discussion it
was felt that this place should be kept consistent with
flagInhTables() and flagInhAttrs().

Reviewed-by: Michael Paquier <michael@paquier.xyz>
Discussion: https://www.postgresql.org/message-id/flat/a574c8f1-9c84-93ad-a9e5-65233d6fc00f@enterprisedb.com
(cherry picked from commit a22d6a2)
Authored-by: Brent Doil <bdoil@vmware.com>
This reduces the potentially significant command and
communication overhead between frontend and backend
when sending a LOCK TABLE query for each table.

Discussion: https://www.postgresql.org/message-id/20120530.180620.600165924826262795.t-ishii%40sraoss.co.jp

Authored-by: Brent Doil <bdoil@vmware.com>
@robozmey robozmey force-pushed the cherry-pick-pg_dump branch from 893a364 to 2dc796a Compare April 24, 2025 08:00
@tuhaihe
Copy link
Copy Markdown
Member

tuhaihe commented Apr 24, 2025

Hi @robozmey thanks for your updates! I added this PR to the 2.0 release candidate PR list. If not ok, let me know.

@tuhaihe
Copy link
Copy Markdown
Member

tuhaihe commented Apr 25, 2025

Hi, now that all the checks have passed, we can merge this PR into the main branch. cc @robozmey @reshke

@reshke reshke merged commit b1cf523 into apache:main Apr 26, 2025
26 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

7 participants