spoc-504: 5.x -> 6.x upgrade path and version safety nets #421

Open
danolivo wants to merge 7 commits into main from spoc-504

@danolivo danolivo commented Apr 16, 2026

Summary

Lays the groundwork for upgrading a spock 5.x cluster to 6.0.0-devel via pg_upgrade. Six logically separable commits:

  1. Remove unused apply-heap helper functions
    fill_missing_defaults(), init_apply_exec_state(), finish_apply_exec_state() had no callers.

  2. Add core-patchset version safety net
    Defines SPOCK_CORE_PATCHSET_VERSION (compile-time) and SpockCorePatchsetVersion (runtime) in core via patches/{15..18}/..., and _PG_init ereport(ERROR)s on mismatch. An unpatched server fails earlier on the missing extern symbol.

  3. Add per-node version safety net
    New int4 NOT NULL column spock.local_node.node_version carries SPOCK_VERSION_NUM at create. get_local_node() looks the column up by name (DROP COLUMN / VACUUM FULL safe), and ereport(ERROR)s with hint
    "Run ALTER EXTENSION spock UPDATE." if the stamp doesn't match the running binary. Always errors regardless of missing_ok. Covered by tests/regress/sql/version_guard.sql and tests/tap/t/020_version_safety_net.pl.

  4. Restructure 5.x -> 6.0.0-devel SQL upgrade chain
    Splits the previous combined spock--5.0.6--6.0.0-devel.sql into:

    • sql/spock--5.0.0.sql — full 5.0.0 install matching v5_STABLE
    • sql/spock--5.0.6--5.0.7.sql — pause/resume_apply_workers,
      wait_for_sync_event(wait_if_disabled), sync_event(transactional), and the sub_skip_schema text->text[] relabel; matches v5_STABLE byte-for-byte
    • sql/spock--5.0.7--6.0.0-devel.sql — only the 6.0.0-devel deltas (apply-group progress, lag_tracker rework, conflict stats, delta_apply helper, sub_alter_options, node_version)

    Result: a v5_STABLE 5.0.7 user upgrading via ALTER EXTENSION spock UPDATE TO '6.0.0-devel' runs only the 5.0.7 -> 6.0.0-devel step, which DROP-IF-EXISTS-then-CREATEs every signature it changes so collisions with v5_STABLE-installed objects are handled cleanly. In passing, also adapts tests/tap/t/002_create_subscriber.pl to the new sync_event() / wait_for_sync_event() signatures.

  5. Add 5.x -> 6.x binary-upgrade compatibility shim
    ProcessUtility hook installed only under IsBinaryUpgrade that intercepts ALTER TABLE ... SET (log_old_value=..., delta_apply_function=...) from pg_dump --binary-upgrade and rewrites it to the canonical 6.x SECURITY LABEL FOR spock ON COLUMN ... IS 'spock.delta_apply'.
    Self-contained in src/spock_bucompat_5x.c (~450 lines); retirement is two edits (rm the file + remove the register call). Design doc at docs/internals-doc/binary-upgrade-compat-shim.md covers contracts C1-C10. The security-label provider registration is moved before the IsBinaryUpgrade early-return so synthesised statements find the provider during pg_restore. In passing converts the "spock extension is not created yet" elog to a proper ereport with errcode.

  6. Update user docs for 6.x SECURITY LABEL form
    docs/conflicts.md and docs/troubleshooting.md: replaces legacy reloption examples with spock.delta_apply() helper calls; adds an "Upgrading from spock 5.x" subsection cross-linking to the bucompat design doc.
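
For readers who want a concrete picture of the translation in item 5, here is a minimal string-level sketch. It is not the shim's code: per the description, the real hook works on parsed statements inside ProcessUtility, whereas this standalone function only formats the target statement. The name format_delta_apply_label is hypothetical, and identifier quoting is assumed to be the caller's job.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/*
 * Hypothetical helper (not part of spock): build the 6.x statement that
 * stands in for a legacy 5.x column reloption.  Identifiers are assumed
 * to be already quoted by the caller.
 * Returns 0 on success, -1 if the buffer is too small.
 */
static int
format_delta_apply_label(char *buf, size_t buflen,
                         const char *table, const char *column)
{
    int n = snprintf(buf, buflen,
                     "SECURITY LABEL FOR spock ON COLUMN %s.%s IS 'spock.delta_apply'",
                     table, column);

    return (n < 0 || (size_t) n >= buflen) ? -1 : 0;
}
```

Called with a table such as public.t1 and a column such as balance (both made-up names), the function fills the buffer with the SECURITY LABEL form quoted in item 5.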

@danolivo danolivo self-assigned this Apr 16, 2026
@danolivo danolivo added the enhancement and feature labels Apr 16, 2026

coderabbitai Bot commented Apr 16, 2026

No actionable comments were generated in the recent review. 🎉

ℹ️ Recent review info
⚙️ Run configuration

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro

Run ID: 4f702a69-36e0-4f2c-8d75-62ae4cae3d37

📥 Commits

Reviewing files that changed from the base of the PR and between 3ce8a02 and 7f0c171.

📒 Files selected for processing (1)
  • docs/conflicts.md

📝 Walkthrough

Adds a runtime Spock patchset identity and initialization-time consistency check, records a per-local-node node_version in spock.local_node, validates that value in get_local_node(), installs a binary-upgrade hook to translate legacy reloptions to SECURITY LABELs, and adds tests and docs for the version-guard behavior.

Changes

Version Guard, Patchset Identity, and 5.x Binary-Upgrade Compatibility

| Layer / File(s) | Summary |
|---|---|
| PostgreSQL export wiring: patches/*/pg*-000-spock-patchset-version.diff, src/include/miscadmin.h, src/backend/utils/init/globals.c | Adds compile-time SPOCK_CORE_PATCHSET_VERSION macro and exports runtime SpockCorePatchsetVersion global initialized to that macro (applied for PG 15–18 patch files). |
| Extension init / runtime guard: src/spock.c | Declares register_spock_compat_5x() extern, registers security-label provider earlier, installs binary-upgrade compatibility hook, and aborts init if SpockCorePatchsetVersion mismatches SPOCK_CORE_PATCHSET_VERSION; adjusts an error path to use ereport with ERRCODE_OBJECT_NOT_IN_PREREQUISITE_STATE. |
| Binary-upgrade shim implementation: src/spock_bucompat_5x.c | Adds register_spock_compat_5x() and a ProcessUtility hook that rewrites legacy column reloptions (log_old_value, delta_apply_function) into SecLabelStmt(s), trimming/dropping original AlterTable commands as needed and emitting NOTICEs. |
| Catalog schema, add node_version: sql/spock--5.0.0.sql, sql/spock--5.0.7--6.0.0-devel.sql, sql/spock--6.0.0-devel.sql | Introduces node_version int4 NOT NULL DEFAULT 0 to spock.local_node; upgrade script adds column if missing and initializes rows to spock.spock_version_num(). |
| Catalog constants & accessors: src/spock_node.c | Extends Natts_local_node/Anum_node_local_node_version, sets node_version in create_local_node(), and updates get_local_node() to locate "node_version" by name, assert INT4OID, and error if missing/NULL/mismatched (with proper cleanup and update hint). |
| Apply/executor refactor (wiring): src/spock_apply_heap.c | Removes fill_missing_defaults() and init_apply_exec_state() helpers; apply now uses existing slot-based default filling and create_edata_for_relation/finish_edata lifecycle. |
| Tests scheduling & regression: Makefile, tests/tap/schedule | Inserts version_guard into REGRESS order (resolutions_retention version_guard drop) and adds TAP schedule entry test: 020_version_safety_net. |
| Regression & TAP tests: tests/regress/sql/version_guard.sql, tests/tap/t/020_version_safety_net.pl, tests/tap/t/002_create_subscriber.pl | Adds SQL regression to verify node_version presence/NOT NULL and tamper detection; TAP test exercises zero/future/missing-column failures and restoration; subscriber test changed to wait on captured spock.sync_event() LSN. |
| Documentation: docs/conflicts.md, docs/troubleshooting.md | Replaces legacy reloption guidance with SECURITY LABEL spock.delta_apply() workflow and documents upgrade translation from Spock 5.x via the binary-upgrade compatibility shim. |
| Upgrade scripts / extension SQL surface: sql/spock--5.0.0.sql, sql/spock--5.0.6--5.0.7.sql, sql/spock--5.0.7--6.0.0-devel.sql | SQL additions and upgrades: new objects and functions remain; 5.0.6→5.0.7 introduced wait/pause/resume APIs noted and 5.0.7→6.0.0-devel adds idempotent node_version population while omitting redundant creations of previously added functions. |

"I twitch my whiskers at the patch, so neat,
node_version tucked safe beneath each seat;
old options become labels, tidy and bright,
versions checked before the server takes flight;
hop forward—binary-upgrade makes all right." 🐇

🚥 Pre-merge checks | ✅ 5
✅ Passed checks (5 passed)
| Check name | Status | Explanation |
|---|---|---|
| Title check | ✅ Passed | The title clearly summarizes the main objective: preparing a 5.x to 6.x upgrade path with version safety nets. |
| Description check | ✅ Passed | The description is comprehensive and directly related to the changeset, detailing all six logical commits and their purposes. |
| Docstring Coverage | ✅ Passed | Docstring coverage is 100.00% which is sufficient. The required threshold is 80.00%. |
| Linked Issues check | ✅ Passed | Check skipped because no linked issues were found for this pull request. |
| Out of Scope Changes check | ✅ Passed | Check skipped because no linked issues were found for this pull request. |

@danolivo danolivo requested a review from mason-sharp April 16, 2026 13:08

codacy-production Bot commented Apr 16, 2026

Up to standards ✅

🟢 Issues 1 medium

Results:
1 new issue

Category Results
Complexity 1 medium

🟢 Metrics 35 complexity · -2 duplication

Metric Results
Complexity 35
Duplication -2

@coderabbitai coderabbitai Bot left a comment

Actionable comments posted: 1

🧹 Nitpick comments (2)
src/spock_node.c (1)

607-607: Consider upgrading Assert to a runtime check for defense-in-depth.

The Assert validates the type only in debug builds. While the schema guarantees int4, a corrupt catalog or manual tampering could cause silent misbehavior in release builds if the type is unexpected.

🔧 Suggested defensive check
-		Assert(TupleDescAttr(desc, ver_attnum - 1)->atttypid == INT4OID);
+		if (TupleDescAttr(desc, ver_attnum - 1)->atttypid != INT4OID)
+		{
+			systable_endscan(scan);
+			table_close(rel, for_update ? NoLock : RowExclusiveLock);
+			ereport(ERROR,
+					(errcode(ERRCODE_OBJECT_NOT_IN_PREREQUISITE_STATE),
+					 errmsg("spock.local_node.node_version has unexpected type"),
+					 errhint("Run ALTER EXTENSION spock UPDATE.")));
+		}
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@src/spock_node.c` at line 607, The Assert call TupleDescAttr(desc, ver_attnum
- 1)->atttypid == INT4OID should be converted to a runtime defensive check: read
the attribute type via TupleDescAttr(desc, ver_attnum - 1)->atttypid, compare it
to INT4OID, and if it does not match raise a proper error (e.g., elog(ERROR,
...)) or return a failure with a clear message referencing desc and ver_attnum
instead of relying on Assert; update the surrounding function (wherever desc and
ver_attnum are used) to handle the error path appropriately so callers don't
proceed with an unexpected type.
tests/tap/t/020_version_safety_net.pl (1)

125-130: Shell command construction could be safer.

The $sql variable is interpolated directly into the shell command. While all callers in this test use hardcoded SQL strings, this pattern is fragile if the test is later extended with dynamic SQL.

🔧 Safer alternative using list form
 sub psql_expect_error {
     my ($node_num, $sql) = @_;
     my $port = $cfg->{node_ports}[$node_num - 1];
-    my $result = `$PG_BIN/psql -X -p $port -d regression -t -c "$sql" 2>&1`;
+    my @cmd = ("$PG_BIN/psql", "-X", "-p", $port, "-d", "regression", "-t", "-c", $sql);
+    # Fork with an implicit pipe; the child merges stderr into stdout and execs psql without a shell.
+    my $pid = open(my $fh, "-|") // die "Cannot fork: $!";
+    if ($pid == 0) { open(STDERR, ">&", \*STDOUT); exec { $cmd[0] } @cmd or exit 127; }
+    my $result = do { local $/; <$fh> };
+    close($fh);
     return $result;
 }

Alternatively, consider using IPC::Run or Perl's qx{} with proper escaping.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@tests/tap/t/020_version_safety_net.pl` around lines 125 - 130, In
psql_expect_error, avoid interpolating $sql into a single-shell backtick
command; instead construct the psql invocation without the shell by using
IPC::Run (e.g., IPC::Run::run) or Perl's list form system/open3 to pass
arguments (including "-c", $sql) so the SQL isn't interpreted by the shell, or
at minimum escape $sql with quotemeta if switching to IPC::Run isn't possible;
update psql_expect_error to call $PG_BIN/psql with arguments (port, database,
"-t", "-c", $sql) via IPC::Run or a safe argument list to eliminate shell
interpolation.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@tests/tap/schedule`:
- Line 44: The entry "020_version_safety_net" in the TAP schedule is missing the
required "test:" prefix so the schedule parser skips it; update the schedule so
the line reads with the prefix (e.g. "test: 020_version_safety_net") so the
schedule parser and check_prove will pick up and execute the test.

ℹ️ Review info
⚙️ Run configuration

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro

Run ID: cd68e04c-3e47-4656-994f-972c26f5a0ea

📥 Commits

Reviewing files that changed from the base of the PR and between 3125a09 and cc1f889.

⛔ Files ignored due to path filters (1)
  • tests/regress/expected/version_guard.out is excluded by !**/*.out
📒 Files selected for processing (12)
  • Makefile
  • patches/15/pg15-000-spock-patchset-version.diff
  • patches/16/pg16-000-spock-patchset-version.diff
  • patches/17/pg17-000-spock-patchset-version.diff
  • patches/18/pg18-000-spock-patchset-version.diff
  • sql/spock--5.0.6--6.0.0-devel.sql
  • sql/spock--6.0.0-devel.sql
  • src/spock.c
  • src/spock_node.c
  • tests/regress/sql/version_guard.sql
  • tests/tap/schedule
  • tests/tap/t/020_version_safety_net.pl

Comment thread tests/tap/schedule Outdated

@coderabbitai coderabbitai Bot left a comment

Actionable comments posted: 1

🧹 Nitpick comments (1)
sql/spock--5.0.6--6.0.0-devel.sql (1)

435-437: Prefer dropping the DEFAULT 0 after the backfill.

The column definition leaves node_version with DEFAULT 0, but runtime validation (in src/spock_node.c:569-622) rejects any value that is NULL or not equal to SPOCK_VERSION_NUM. While the only existing insert path (C code in src/spock_node.c:459) explicitly provides SPOCK_VERSION_NUM, the default is misleading and creates a trap for future code: any new insert path that omits the column would silently get 0 and immediately fail validation.

Dropping the DEFAULT after the backfill enforces explicit provision at all insert sites:

Suggested migration shape
ALTER TABLE spock.local_node
  ADD COLUMN IF NOT EXISTS node_version int4 NOT NULL DEFAULT 0;
UPDATE spock.local_node SET node_version = spock.spock_version_num();
+ALTER TABLE spock.local_node
+  ALTER COLUMN node_version DROP DEFAULT;
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@sql/spock--5.0.6--6.0.0-devel.sql` around lines 435 - 437, Add a DDL step to
remove the misleading default after the backfill: after updating
spock.local_node.node_version with spock.spock_version_num(), run an ALTER TABLE
on spock.local_node to DROP DEFAULT for the node_version column so future
inserts must explicitly provide a value; reference the table/column names
(spock.local_node, node_version) and the backfill call
(spock.spock_version_num()) so the DROP DEFAULT is applied immediately after the
UPDATE.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@patches/18/pg18-000-spock-patchset-version.diff`:
- Around line 26-36: The definition of SpockCorePatchsetVersion in globals.c
must match the declaration's const qualifier; change the definition to "const
int SpockCorePatchsetVersion = SPOCK_CORE_PATCHSET_VERSION;" so the symbol
SpockCorePatchsetVersion exactly matches the extern PGDLLIMPORT const int
declaration from miscadmin.h (also check and apply the same const fix in the
other PG15–17 patch files where SpockCorePatchsetVersion is defined).
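
For illustration, a minimal standalone sketch of the qualifier match this comment asks for. It is not the patch itself: PGDLLIMPORT is defined away, the version value 1 is made up, and in the real patch the two lines live in different files (miscadmin.h and globals.c).

```c
#include <assert.h>

/* Placeholder: in PostgreSQL, PGDLLIMPORT expands to a platform export marker. */
#define PGDLLIMPORT
/* Made-up value for illustration only. */
#define SPOCK_CORE_PATCHSET_VERSION 1

/* Declaration, as the review comment quotes it from miscadmin.h. */
extern PGDLLIMPORT const int SpockCorePatchsetVersion;

/*
 * Definition with the same const qualifier.  Writing plain "int" here,
 * with the declaration above visible in the same translation unit, is
 * rejected by the compiler as a conflicting declaration.
 */
const int SpockCorePatchsetVersion = SPOCK_CORE_PATCHSET_VERSION;
```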

ℹ️ Review info
⚙️ Run configuration

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro

Run ID: 14cd7344-6be0-4e8d-9f9d-d4a909643982

📥 Commits

Reviewing files that changed from the base of the PR and between 904c3cb and 04f525b.

📒 Files selected for processing (2)
  • patches/18/pg18-000-spock-patchset-version.diff
  • sql/spock--5.0.6--6.0.0-devel.sql

Comment thread patches/18/pg18-000-spock-patchset-version.diff Outdated

@coderabbitai coderabbitai Bot left a comment

Actionable comments posted: 2

🧹 Nitpick comments (2)
tests/tap/t/020_version_safety_net.pl (2)

125-130: psql_expect_error does not verify that psql actually failed.

The helper returns combined output but never inspects the exit status, so a scenario where spock.node_info() unexpectedly succeeds would still pass as long as the stdout happens to match the pattern (e.g., unlikely but masked regressions). Consider capturing $? and asserting non-zero, or using IPC::Run/Test::More::ok on the exit status in addition to the message-pattern checks.

♻️ Optional hardening
 sub psql_expect_error {
     my ($node_num, $sql) = @_;
     my $port = $cfg->{node_ports}[$node_num - 1];
     my $result = `$PG_BIN/psql -X -p $port -d regression -t -c "$sql" 2>&1`;
+    my $rc = $? >> 8;
+    note("psql exited with $rc; output: $result") if $rc == 0;
     return $result;
 }
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@tests/tap/t/020_version_safety_net.pl` around lines 125 - 130,
psql_expect_error currently returns psql's combined output but never checks the
exit status, so modify the function (psql_expect_error) to capture the exit
status after the backtick call (check $?) and ensure it is non-zero; if the
status is zero, fail/assert (e.g., croak/die or use Test::More::ok) so tests
don't silently accept a successful psql run, and otherwise return the output (or
return both output and status) so callers can still inspect the error message
from $PG_BIN/psql using the configured $cfg->{node_ports}[$node_num - 1].

106-113: Scenario 5 restores the column with DEFAULT 0, which diverges from the canonical schema.

sql/spock--6.0.0-devel.sql defines node_version int4 NOT NULL DEFAULT 0, so the literal column definition here matches. However, on a broader note: if the canonical schema ever changes the default (e.g., to remove DEFAULT 0 once upgrade is complete), this test will silently drift. Consider a short comment pointing at the authoritative definition, or dropping the default after the UPDATE, so the restored table matches production state more closely. Non-blocking.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@tests/tap/t/020_version_safety_net.pl` around lines 106 - 113, The test
restores spock.local_node.node_version with DEFAULT 0 which may diverge from the
authoritative schema; after the ALTER TABLE/UPDATE block (the ALTER TABLE
spock.local_node ADD COLUMN node_version ... and UPDATE spock.local_node SET
node_version = spock.spock_version_num()), remove the literal DEFAULT by issuing
an ALTER TABLE ... ALTER COLUMN node_version DROP DEFAULT so the test leaves the
column in the same default state as production, and add a short inline comment
referencing sql/spock--6.0.0-devel.sql to indicate the canonical definition
being mirrored.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@patches/16/pg16-000-spock-patchset-version.diff`:
- Around line 17-28: The declaration in the header uses "extern PGDLLIMPORT
const int SpockCorePatchsetVersion;" but the definition in globals.c is
non-const; update the definition in globals.c (symbol SpockCorePatchsetVersion)
to be const to match the header (i.e., define it as a const int initialized to
SPOCK_CORE_PATCHSET_VERSION), and apply the same const-fix to any sibling
patches (PG15/17/18) where SpockCorePatchsetVersion is defined.

In `@tests/tap/t/020_version_safety_net.pl`:
- Line 3: The test file declares Test::More tests => 10 but only runs 9
assertions causing a TAP failure; either change the plan to tests => 9 or add
the missing assertion in Scenario 3: insert a second assertion mirroring
Scenarios 2 and 4 that checks for the "ALTER EXTENSION spock UPDATE" hint (e.g.,
a like() on the Scenario 3 output similar to the existing like() calls),
ensuring the total assertion count matches the plan.

ℹ️ Review info
⚙️ Run configuration

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro

Run ID: 3a1c47b6-54b2-4ad8-a237-52c194b7cdd8

📥 Commits

Reviewing files that changed from the base of the PR and between 04f525b and 76d5624.

⛔ Files ignored due to path filters (1)
  • tests/regress/expected/version_guard.out is excluded by !**/*.out
📒 Files selected for processing (13)
  • Makefile
  • patches/15/pg15-000-spock-patchset-version.diff
  • patches/16/pg16-000-spock-patchset-version.diff
  • patches/17/pg17-000-spock-patchset-version.diff
  • patches/18/pg18-000-spock-patchset-version.diff
  • sql/spock--5.0.6--6.0.0-devel.sql
  • sql/spock--6.0.0-devel.sql
  • src/spock.c
  • src/spock_node.c
  • tests/regress/sql/version_guard.sql
  • tests/tap/schedule
  • tests/tap/t/002_create_subscriber.pl
  • tests/tap/t/020_version_safety_net.pl
✅ Files skipped from review due to trivial changes (2)
  • sql/spock--6.0.0-devel.sql
  • tests/regress/sql/version_guard.sql
🚧 Files skipped from review as they are similar to previous changes (8)
  • tests/tap/schedule
  • Makefile
  • src/spock.c
  • patches/15/pg15-000-spock-patchset-version.diff
  • sql/spock--5.0.6--6.0.0-devel.sql
  • patches/17/pg17-000-spock-patchset-version.diff
  • patches/18/pg18-000-spock-patchset-version.diff
  • src/spock_node.c

Comment thread patches/16/pg16-000-spock-patchset-version.diff
Comment thread tests/tap/t/020_version_safety_net.pl
@danolivo danolivo force-pushed the spoc-504 branch 5 times, most recently from 9cd28e3 to d40434a April 22, 2026 10:35
@danolivo danolivo force-pushed the spoc-504 branch 4 times, most recently from 3aeba5c to d0043ea May 5, 2026 14:51
danolivo added 4 commits May 5, 2026 18:01
fill_missing_defaults(), init_apply_exec_state() and
finish_apply_exec_state() in spock_apply_heap.c are no longer called
from anywhere in the tree.  Drop them.

In passing also drops the build_delta_tuple() static-helper grouping
comment that was specific to those callers.

When spock is loaded into a server binary that was built from a
different generation of the spock core patchset than this extension
expects, the result is a silent ABI mismatch -- new code reads stale
struct layouts, missing symbols may resolve to null, etc.

Make the version coupling explicit:

  * The core patchset (patches/{15,16,17,18}/pg{N}-000-...) defines
    SPOCK_CORE_PATCHSET_VERSION as a compile-time constant in
    miscadmin.h and SpockCorePatchsetVersion as a runtime int in
    globals.c.  Bump the macro when a patchset change is visible to
    the extension binary.

  * spock _PG_init reads both values and ereport(ERROR)s when they
    disagree.  An unpatched server never reaches this check -- the
    dynamic linker fails first on the missing SpockCorePatchsetVersion
    symbol.
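
The comparison described in the second bullet can be sketched in standalone C. This is only an illustration under stated assumptions: the macro value 1 is made up, check_core_patchset is a hypothetical name, the message wording is invented, and the real _PG_init would ereport(ERROR) rather than return a status.

```c
#include <assert.h>
#include <stdio.h>

/* Stand-in for the macro the core patch adds to miscadmin.h; the value is made up. */
#define SPOCK_CORE_PATCHSET_VERSION 1

/* Stand-in for the runtime variable the core patch defines in globals.c. */
static int SpockCorePatchsetVersion = SPOCK_CORE_PATCHSET_VERSION;

/*
 * Hypothetical check: 'expected' is what the extension was compiled
 * against, 'actual' is what the running server reports.  Returns 0 when
 * they agree and -1 otherwise.
 */
static int
check_core_patchset(int expected, int actual)
{
    if (expected != actual)
    {
        fprintf(stderr,
                "spock core patchset mismatch: built against %d, server has %d\n",
                expected, actual);
        return -1;
    }
    return 0;
}
```

The point of keeping one value at compile time and one at run time is that the two can only diverge when the extension and the server were built from different patchset generations.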

The constant and the runtime variable live in core, so a future
patchset bump is one number change in two places, with no spock-side
churn.

Detect "binary upgraded but ALTER EXTENSION spock UPDATE not run" by
stamping each node with the spock version that wrote it and checking
the stamp on every read.

A new int4 NOT NULL column spock.local_node.node_version carries the
binary version (SPOCK_VERSION_NUM) at create_local_node() time.
get_local_node() looks up the column by name (not by Anum: DROP COLUMN
leaves a gap, and a re-added column receives a new attribute number), then
ereport(ERROR)s if it is missing or does not match the running binary.
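
A minimal sketch of the by-name lookup, using a simplified stand-in for the tuple descriptor. SketchAttr and find_attnum_by_name are hypothetical names, not PostgreSQL or spock APIs; the real code walks a TupleDesc.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Simplified stand-in for one attribute entry; not PostgreSQL's real struct. */
typedef struct SketchAttr
{
    const char *name;
    bool        is_dropped;
} SketchAttr;

/*
 * Return the 1-based attribute number of the live column called 'name',
 * or 0 if no such column exists.  Dropped columns keep their slot, so
 * they are skipped rather than compacted away.
 */
static int
find_attnum_by_name(const SketchAttr *attrs, size_t natts, const char *name)
{
    for (size_t i = 0; i < natts; i++)
    {
        if (!attrs[i].is_dropped && strcmp(attrs[i].name, name) == 0)
            return (int) i + 1;
    }
    return 0;
}
```

A return value of 0 corresponds to the "column is missing" error path; any other value is the position from which the stamp would then be read.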

Both error paths suggest "Run ALTER EXTENSION spock UPDATE." and use
ERRCODE_OBJECT_NOT_IN_PREREQUISITE_STATE so callers can distinguish
schema-mismatch from operational failures.

The check fires regardless of the missing_ok argument: returning NULL
would conflate "node not configured yet" with "node misconfigured",
and not all callers check the return value.

Coverage:
  * tests/regress/sql/version_guard.sql exercises both directions
    (node_version below and above the binary) plus the column-shape
    invariants.  Wired into REGRESS in Makefile.
  * tests/tap/t/020_version_safety_net.pl drives the same scenarios
    end-to-end on a live cluster.  Wired into the tap schedule.

The 5.x -> 6.x upgrade story previously required a single
spock--5.0.6--6.0.0-devel.sql file that bundled all 5.x patch updates
plus the 6.x changes.  That collapsed two unrelated concerns:
  1. picking up patches v5_STABLE shipped in 5.0.7 (wait_for_sync_event
     with wait_if_disabled, sync_event(transactional), the
     sub_skip_schema text->text[] relabel);
  2. landing 6.0.0-devel features (new conflict types, apply-group
     progress, security-label-based delta_apply, etc.).

Split the chain so the 5.x patch level is reached first, then the
6.x-specific changes are applied as a separate step.  Concretely:

  * sql/spock--5.0.0.sql is now a true full-install file at the 5.0.0
    schema level (matches what v5_STABLE ships).
  * sql/spock--5.0.6--5.0.7.sql (also matches v5_STABLE) brings a 5.0.6
    install up to 5.0.7 -- pause/resume_apply_workers,
    wait_for_sync_event(wait_if_disabled), sync_event(transactional),
    and the sub_skip_schema relabel with the LOCK TABLE +
    pg_statistic cleanup that v5_STABLE uses.
  * sql/spock--5.0.7--6.0.0-devel.sql carries only the 6.0.0-devel
    deltas on top of 5.0.7, replacing the previous combined script.
  * sql/spock--5.0.6--6.0.0-devel.sql is removed -- the chain now
    routes through 5.0.7.

A v5_STABLE 5.0.7 user upgrading via "ALTER EXTENSION spock UPDATE TO
'6.0.0-devel'" will run only spock--5.0.7--6.0.0-devel.sql, which
DROP-IF-EXISTS-then-CREATEs every signature it changes so collisions
with v5_STABLE-installed objects are handled cleanly.

In passing also adapts tests/tap/t/002_create_subscriber.pl to use the
new sync_event() / wait_for_sync_event(...) signatures; the previous
spock.sub_wait_for_sync('test_subscription') call is no longer the
canonical way to synchronise.
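
To make the "routes through 5.0.7" point concrete, here is a small sketch that counts how many update scripts get applied along the chain. It models only the two scripts named in this commit message and assumes at most one script leaves each version; PostgreSQL's own resolution picks the shortest path among all available scripts, which this sketch does not attempt. UpdateStep and count_update_steps are hypothetical names.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* One update script "from--to", e.g. spock--5.0.6--5.0.7.sql. */
typedef struct UpdateStep
{
    const char *from;
    const char *to;
} UpdateStep;

/*
 * Count the scripts applied when walking from 'from' to 'to'.
 * Simplification: a plain chain, so no shortest-path search is needed.
 * Returns -1 if 'to' is unreachable.
 */
static int
count_update_steps(const UpdateStep *steps, size_t nsteps,
                   const char *from, const char *to)
{
    int         applied = 0;
    const char *cur = from;

    while (strcmp(cur, to) != 0)
    {
        size_t i;

        for (i = 0; i < nsteps; i++)
        {
            if (strcmp(steps[i].from, cur) == 0)
                break;
        }
        if (i == nsteps || applied >= (int) nsteps)
            return -1;          /* dead end or cycle */
        cur = steps[i].to;
        applied++;
    }
    return applied;
}
```

With the two-script chain, a 5.0.6 install takes two steps to reach 6.0.0-devel and a 5.0.7 install takes one, which matches the behaviour described above.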

@coderabbitai coderabbitai Bot left a comment

Actionable comments posted: 4

🧹 Nitpick comments (2)
src/spock.c (1)

56-57: 💤 Low value

Optional: move register_spock_compat_5x declaration into a shared header.

An inline extern in spock.c works but spreads the function's contract between the .c file and its single caller. Putting the prototype in spock.h (or a small spock_bucompat_5x.h) keeps function declarations centralized and avoids drift if a second caller is ever added.

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@src/spock.c` around lines 56 - 57, Move the inline extern prototype for
register_spock_compat_5x out of src/spock.c and into a shared header (e.g.,
declare it in spock.h or create spock_bucompat_5x.h) and then `#include` that
header from src/spock.c; remove the redundant extern declaration from spock.c so
the function contract is centralized and available to any future callers.
tests/regress/sql/version_guard.sql (1)

1-43: ⚡ Quick win

Consider adding the missing-column scenario for full guard coverage.

get_local_node() has two distinct error paths: value mismatch ("spock version mismatch") and missing column ("spock extension schema outdated"). This test covers only the value-mismatch path (via node_version = 0 and 999999). The DROP COLUMN path is the more interesting safety case: it is what protects against node_version being lost through VACUUM FULL renumbering or an explicit DROP, and src/spock_node.c documents this as the reason for the name-based lookup instead of positional access.

The TAP test reportedly covers it, but exercising both paths in regress keeps the schedule self-contained and keeps the safety-net coverage close to the schema migration that introduces the column.

♻️ Suggested addition
 -- Restore before DDL.
 UPDATE spock.local_node SET node_version = spock.spock_version_num();
+
+-- ---------------------------------------------------------------
+-- Scenario: schema outdated -- node_version column dropped
+-- (simulates pre-6.0 schema with a 6.x binary).
+-- ---------------------------------------------------------------
+ALTER TABLE spock.local_node DROP COLUMN node_version;
+
+\set VERBOSITY terse
+SELECT * FROM spock.node_info();
+\set VERBOSITY default
+
+-- Restore the column for any subsequent regression tests.
+ALTER TABLE spock.local_node
+    ADD COLUMN node_version int4 NOT NULL DEFAULT 0;
+UPDATE spock.local_node SET node_version = spock.spock_version_num();
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@tests/regress/sql/version_guard.sql` around lines 1 - 43, Add a regression
case that exercises the "missing-column" error path for get_local_node():
simulate dropping the node_version column from spock.local_node (or otherwise
making it absent) and then call spock.node_info() to verify it raises the "spock
extension schema outdated" error; after the check, restore the schema by
recreating or resetting node_version to spock.spock_version_num() so subsequent
DDL tests run. Place the new steps near the existing version-tampering scenarios
in tests/regress/sql/version_guard.sql and reference
get_local_node()/spock.node_info(), spock.local_node.node_version, and the
rationale in src/spock_node.c when adding the test. Ensure verbosity is set to
terse around the failing call as done for the other scenarios.
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In `@docs/conflicts.md`:
- Around line 41-61: Replace the fenced SQL code blocks with indented (4-space)
code blocks to satisfy MD046: convert the three SELECT spock.delta_apply(...)
snippets, the to_drop => true example, and the final SELECT * FROM pg_seclabel
... snippet by removing the triple-backtick fences and indenting each SQL line
with four spaces; keep the SQL text unchanged (including spock.delta_apply,
provider = 'spock', label = 'spock.delta_apply') so linting passes.
- Line 64: The heading "#### Upgrading from spock 5.x" is one level too deep
(MD001); change that heading to "### Upgrading from spock 5.x" to restore proper
hierarchy so it follows the surrounding section structure.

In `@docs/internals-doc/binary-upgrade-compat-shim.md`:
- Around line 34-40: In docs/internals-doc/binary-upgrade-compat-shim.md update
all fenced code blocks to include language identifiers (e.g., use ```text for
file layout/log output or appropriate language for snippets) so markdownlint
MD040 is satisfied; specifically tag the block showing the file list (the one
containing src/spock_bucompat_5x.c, src/spock.c, sql/spock--6.0.0-devel.sql,
docs/... ) and the other fenced blocks referenced around the same area
(including the blocks corresponding to the content at lines ~109-111) with the
correct language identifiers.

In `@sql/spock--5.0.6--5.0.7.sql`:
- Around line 140-163: Before performing the direct catalog mutations on
pg_catalog.pg_attribute and pg_catalog.pg_statistic for table spock.subscription
and column sub_skip_schema, add a session-local GUC by issuing "SET LOCAL
allow_system_table_mods = on;" so the UPDATE and DELETE are permitted for
superusers; place this SET LOCAL immediately before the LOCK TABLE ... and the
catalog statements and ensure it applies to the same transaction/scope so no
other permission changes are required.


ℹ️ Review info
⚙️ Run configuration

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro

Run ID: c6568bd7-650f-4b51-aab7-5558abe2375d

📥 Commits

Reviewing files that changed from the base of the PR and between 76d5624 and ef6ada3.

⛔ Files ignored due to path filters (1)
  • tests/regress/expected/version_guard.out is excluded by !**/*.out
📒 Files selected for processing (20)
  • Makefile
  • docs/conflicts.md
  • docs/internals-doc/binary-upgrade-compat-shim.md
  • docs/troubleshooting.md
  • patches/15/pg15-000-spock-patchset-version.diff
  • patches/16/pg16-000-spock-patchset-version.diff
  • patches/17/pg17-000-spock-patchset-version.diff
  • patches/18/pg18-000-spock-patchset-version.diff
  • sql/spock--5.0.0.sql
  • sql/spock--5.0.6--5.0.7.sql
  • sql/spock--5.0.7--6.0.0-devel.sql
  • sql/spock--6.0.0-devel.sql
  • src/spock.c
  • src/spock_apply_heap.c
  • src/spock_bucompat_5x.c
  • src/spock_node.c
  • tests/regress/sql/version_guard.sql
  • tests/tap/schedule
  • tests/tap/t/002_create_subscriber.pl
  • tests/tap/t/020_version_safety_net.pl
💤 Files with no reviewable changes (1)
  • src/spock_apply_heap.c
✅ Files skipped from review due to trivial changes (4)
  • sql/spock--6.0.0-devel.sql
  • tests/tap/schedule
  • Makefile
  • patches/16/pg16-000-spock-patchset-version.diff
🚧 Files skipped from review as they are similar to previous changes (5)
  • patches/17/pg17-000-spock-patchset-version.diff
  • patches/18/pg18-000-spock-patchset-version.diff
  • tests/tap/t/002_create_subscriber.pl
  • src/spock_node.c
  • tests/tap/t/020_version_safety_net.pl

danolivo added 2 commits May 6, 2026 09:52
During pg_upgrade from a spock-5.x cluster to a spock-6.x cluster,
pg_dump --binary-upgrade emits the legacy spock-5.x form for
delta-apply markers:

  ALTER TABLE t ALTER COLUMN c SET (log_old_value=true,
      delta_apply_function=spock.delta_apply);

spock 6.x records the same intent as a security label with provider
'spock':

  SECURITY LABEL FOR spock ON COLUMN t.c IS 'spock.delta_apply';

Install a ProcessUtility hook in the new cluster that intercepts the
legacy form during pg_restore and rewrites it on the fly:

  - the legacy DefElems are stripped from the AlterTableCmd;
  - if the stripped cmd has unrelated keys (e.g. fillfactor) they
    survive;
  - if the stripped cmd has nothing left, the cmd is dropped entirely;
  - a synthetic SECURITY LABEL statement is emitted in the same xact;
  - one NOTICE per rewritten column lands in pg_upgrade.log.
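
The rewrite rules above can be illustrated with a before/after pair. This is only a sketch: n_distinct stands in for any unrelated attribute option and is an assumption chosen for illustration, and the first statement is expected to be accepted only while the hook is installed, that is, during pg_upgrade.

```sql
-- Legacy 5.x form as it might appear in a --binary-upgrade dump, mixed
-- with one unrelated attribute option (n_distinct, hypothetical here).
ALTER TABLE t ALTER COLUMN c SET (n_distinct = 100, log_old_value = true,
    delta_apply_function = spock.delta_apply);

-- Equivalent effect after the rewrite: the legacy keys are stripped,
-- the unrelated key survives, and a security label records the marker.
ALTER TABLE t ALTER COLUMN c SET (n_distinct = 100);
SECURITY LABEL FOR spock ON COLUMN t.c IS 'spock.delta_apply';
```

If the legacy keys were the only options present, the first of the two resulting statements would disappear and only the SECURITY LABEL would remain.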

Outside pg_upgrade the hook is not installed and the normal DDL path
pays nothing.  The mechanism is one self-contained file
(src/spock_bucompat_5x.c, ~450 lines) plus one call from spock.c
_PG_init.  Retirement is two edits: git rm the .c file and remove
the register_spock_compat_5x() call.

Registration of the security label provider is moved ahead of the
IsBinaryUpgrade early return in _PG_init, so the synthesised
SECURITY LABEL statements find the provider during pg_restore.

In passing, convert the "spock extension is not created yet" elog in
spock_object_relabel() to a proper ereport with
ERRCODE_OBJECT_NOT_IN_PREREQUISITE_STATE; the message is reachable
from user-visible code paths (any SECURITY LABEL FOR spock ... before
CREATE EXTENSION), so it deserves an errcode.

Design contract (C1-C10) is documented in
docs/internals-doc/binary-upgrade-compat-shim.md.

A TAP test driving real pg_upgrade from a 5.x dump fixture is still
to do.

Replace the legacy 5.x reloption examples in docs/conflicts.md and
docs/troubleshooting.md with calls to the spock.delta_apply() helper.
The reloption form is still accepted during pg_upgrade, via the
binary-upgrade compatibility shim, but the documented way to mark a
delta-apply column on 6.x is:

  SELECT spock.delta_apply('t'::regclass, 'c');

Add an "Upgrading from spock 5.x" subsection to conflicts.md pointing
at the binary-upgrade compatibility shim doc and showing operators
how to audit the post-upgrade catalog via pg_seclabel.
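
An audit of that kind could look roughly like the query below. It is a sketch, assuming delta-apply markers are stored with provider 'spock' and label 'spock.delta_apply' as described above; the join to pg_attribute is added here only to resolve each label to a readable table and column name.

```sql
-- List every column currently marked for delta-apply.
SELECT s.objoid::regclass AS table_name,
       a.attname          AS column_name
FROM pg_seclabel AS s
JOIN pg_attribute AS a
  ON a.attrelid = s.objoid
 AND a.attnum   = s.objsubid          -- objsubid is the column number
WHERE s.classoid = 'pg_class'::regclass
  AND s.provider = 'spock'
  AND s.label    = 'spock.delta_apply'
ORDER BY 1, 2;
```

Comparing this list with the columns that carried the legacy reloptions on the 5.x cluster is one way to check that nothing was lost during the rewrite.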

@coderabbitai coderabbitai Bot left a comment

Actionable comments posted: 2

♻️ Duplicate comments (1)
docs/conflicts.md (1)

41-62: ⚠️ Potential issue | 🟡 Minor | ⚡ Quick win

These fenced SQL blocks still trip the repo’s MD046 rule.

The three SQL examples here are still fenced, so markdownlint will keep flagging this section until they are converted to indented code blocks.

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@docs/conflicts.md` around lines 41 - 62, Replace the fenced ```sql blocks
with indented code blocks (remove the triple-backticks and indent each line by
four spaces) for the examples that call spock.delta_apply(...) (including the
variant with to_drop => true) and the final SELECT from pg_seclabel so they
become indented code blocks that satisfy MD046; ensure indentation is applied to
every line of those three SQL snippets and no language fence remains.
🧹 Nitpick comments (1)
src/spock.c (1)

965-971: ⚡ Quick win

Add a remediation hint to the patchset-mismatch error.

This new guard will fail very early, so operators need the next step in the error itself. An errhint pointing them to install the matching patched PostgreSQL build or rebuild the extension would make upgrade failures much less opaque.

🛠️ Suggested improvement
 	if (SpockCorePatchsetVersion != SPOCK_CORE_PATCHSET_VERSION)
 		ereport(ERROR,
 				(errcode(ERRCODE_FEATURE_NOT_SUPPORTED),
 				 errmsg("spock core patchset version mismatch: "
 						"server has v%d, extension expects v%d",
 						SpockCorePatchsetVersion,
-						SPOCK_CORE_PATCHSET_VERSION)));
+						SPOCK_CORE_PATCHSET_VERSION),
+				 errhint("Install the matching Spock-patched PostgreSQL binaries for this extension build, or rebuild the extension against the running server.")));
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@src/spock.c` around lines 965 - 971, The error raised when
SpockCorePatchsetVersion != SPOCK_CORE_PATCHSET_VERSION lacks a remediation
hint; update the ereport call in src/spock.c (the block using
SpockCorePatchsetVersion, SPOCK_CORE_PATCHSET_VERSION and ereport(ERROR,...)) to
include an errhint guiding operators to either install the matching patched
PostgreSQL build or rebuild/reinstall the extension against the server's
patchset version so upgrade failures show the next steps directly in the error
message.
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In `@docs/conflicts.md`:
- Around line 72-73: The documented canonical audit query is too broad: the
current query "SELECT * FROM pg_seclabel WHERE provider = 'spock'" omits the
label restriction and will over-report; update the query to include the label
predicate (e.g., add "AND label = 'spock.delta_apply'") so it matches the
definition of delta-apply columns and only returns rows where provider = 'spock'
and label = 'spock.delta_apply'.

In `@src/spock_bucompat_5x.c`:
- Around line 160-171: The ProcessUtility call uses a hardcoded
PROCESS_UTILITY_SUBCOMMAND which prevents top-level-only hooks from observing
the synthetic statement; change that call so it passes the original context
parameter instead of PROCESS_UTILITY_SUBCOMMAND (i.e. invoke
ProcessUtility(synth_pstmt, NULL, false, context, params, queryEnv, dest,
NULL)), keeping synth_pstmt, params, queryEnv and dest as-is so other registered
hooks (e.g. spock_autoddl.c checks) will see the statement in the same context
as the caller.


ℹ️ Review info
⚙️ Run configuration

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro

Run ID: bcde85df-9252-4bfa-92a8-159a8b1ffc0a

📥 Commits

Reviewing files that changed from the base of the PR and between ef6ada3 and 3ce8a02.

📒 Files selected for processing (4)
  • docs/conflicts.md
  • docs/troubleshooting.md
  • src/spock.c
  • src/spock_bucompat_5x.c

@danolivo changed the title from "Spoc 504: Add version safety net to detect server/extension binary mismatches" to "spoc-504: 5.x -> 6.x upgrade path and version safety nets" May 6, 2026

Labels: enhancement (New feature or request), feature (New feature)
