Stamp graph.json with generated_at, warn on stale graphify query #1624

Open
edgestack-ai wants to merge 3 commits into Graphify-Labs:v8 from edgestack-ai:fix/graphify-staleness

Conversation

@edgestack-ai

Summary

  • export.to_json (the single write chokepoint every build/update/watch path funnels through) now stamps generated_at (ISO UTC) into graph.json's top level, alongside the existing built_at_commit.
  • graphify query compares that stamp against the last commit time of the repo the graph indexes (resolved via the same .graphify_root sidecar convention build.py::_infer_merge_root already uses) and prints one warning line to stderr when the graph is older than the repo's latest commit.
  • Missing/unreadable stamps (graphs built by an older graphify version) warn too, telling the caller to regenerate.
  • Outside a git repo, or when git isn't available, the check is a silent no-op — it never raises, and query output/exit code is otherwise unchanged.
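The comparison described in the list above can be sketched roughly as follows. This is a minimal illustration, not the PR's code: `last_commit_time` and `staleness_warning` are hypothetical names, the warning text is paraphrased, and checking for git before checking the stamp is an assumption made to match the "silent no-op outside a git repo" behavior.

```python
import subprocess
from datetime import datetime, timezone
from pathlib import Path
from typing import Optional


def last_commit_time(repo_root: Path) -> Optional[datetime]:
    """Committer time of HEAD in UTC, or None when git cannot answer."""
    try:
        out = subprocess.run(
            ["git", "-C", str(repo_root), "log", "-1", "--format=%ct"],
            capture_output=True, text=True, check=True, timeout=10,
        ).stdout.strip()
        return datetime.fromtimestamp(int(out), tz=timezone.utc)
    except (OSError, subprocess.SubprocessError, ValueError):
        # git missing, not a repo, no commits yet, or unparsable output.
        return None


def staleness_warning(stamp: Optional[str],
                      commit_time: Optional[datetime]) -> Optional[str]:
    """Return one warning line, or None when there is nothing to report."""
    if commit_time is None:
        return None  # no usable git history: skip the check silently
    try:
        # Accept a trailing "Z" as UTC (older Pythons reject it).
        text = stamp[:-1] + "+00:00" if stamp.endswith("Z") else stamp
        generated = datetime.fromisoformat(text)
    except (AttributeError, TypeError, ValueError):
        return ("[graphify] warning: graph.json has no readable "
                "generated_at stamp, regenerate it")
    if generated.tzinfo is None:
        generated = generated.replace(tzinfo=timezone.utc)  # assume UTC
    if generated < commit_time:
        return (f"[graphify] warning: graph.json was generated "
                f"{generated.isoformat()} but the indexed repo's last commit "
                f"is {commit_time.isoformat()} - graph is stale")
    return None
```

A caller would print any returned line to stderr and leave the query output and exit code untouched.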

Test plan

  • uv run ruff check graphify/export.py graphify/__main__.py — clean
  • uv run pytest tests/test_export.py tests/test_query_cli.py tests/test_cli_export.py — 73 passed
  • uv run pytest tests/ — 2769 passed / 27 pre-existing failures (all Windows path-separator flakiness in unrelated tests, reproduced identically on v8 HEAD without this change)
  • Manual end-to-end verification: built a tiny demo repo + graph.json and confirmed that (a) a fresh graph produces no warning; (b) advancing the demo repo's history (a new commit after the stamp) produces `[graphify] warning: graph.json was generated ... but the indexed repo's last commit is ... - graph is stale, run "graphify ." (or "graphify update") to refresh` on stderr; (c) a stamp-less graph produces a "no stamp, regenerate" warning; and (d) there is no crash or warning outside a git repo.

graph.json now records an ISO generated_at timestamp alongside the
existing built_at_commit, written in export.to_json (the single
chokepoint every build/update/watch path already funnels through).

graphify query compares that stamp against the last commit time of the
repo the graph indexes (resolved the same way build.py's
_infer_merge_root already does, via the .graphify_root sidecar) and
prints one warning line to stderr when the graph predates it. Missing
or unreadable stamps (graphs built by an older graphify version) warn
too, telling the caller to regenerate. Outside a git repo, or if git
isn't available, the check is silently skipped rather than raising -
query output is otherwise unchanged.
…eness

Root-cause fix for the review round on the generated_at/staleness feature:

1. graph.json stamping was only in export.to_json(), so --no-cluster extract
   (__main__.py) and --no-cluster update (watch.py) wrote unstamped graphs
   that check_staleness could never flag. Extracted the stamping logic into
   one chokepoint, stamp_graph_metadata(), that every writer now calls.

2. check_staleness inferred the indexed repo's root from the graph file's
   own location, so a graph written via --out <elsewhere> could never be
   compared against the right repo. The root is now recorded IN the graph
   at write time (indexed_repo_root) and check_staleness prefers that,
   falling back to location-based inference only for legacy graphs that
   predate the field.

3. _canonical_graph_for_compare/_canonical_topology_for_compare (watch.py)
   now also exclude generated_at/indexed_repo_root from same-graph/topology
   comparisons, alongside the existing built_at_commit exclusion - required
   so the --no-cluster update path's "no changes, left untouched" detection
   keeps working now that every write is timestamped.
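Points 1 and 3 interact: once every write carries a fresh timestamp, a naive equality check between the old and new graph would always report a change. A minimal sketch of the idea, assuming a dict-shaped graph with `nodes` and `links` keys (an assumption about the file layout); `stamp_metadata`, `canonical_for_compare`, and `needs_rewrite` are hypothetical stand-ins, not the PR's `stamp_graph_metadata()` or the watch.py helpers:

```python
from datetime import datetime, timezone
from pathlib import Path
from typing import Optional

# Metadata that changes on every write and says nothing about graph content.
VOLATILE_KEYS = ("built_at_commit", "generated_at", "indexed_repo_root")


def stamp_metadata(graph: dict, repo_root: Path,
                   now: Optional[datetime] = None) -> dict:
    """Return a copy of the graph with write-time metadata added."""
    stamped = dict(graph)
    stamped["generated_at"] = (now or datetime.now(timezone.utc)).isoformat()
    stamped["indexed_repo_root"] = str(Path(repo_root).resolve())
    return stamped


def canonical_for_compare(graph: dict) -> dict:
    """Drop volatile metadata so only real content is compared."""
    return {k: v for k, v in graph.items() if k not in VOLATILE_KEYS}


def needs_rewrite(old: dict, new: dict) -> bool:
    """True only when something other than the stamps differs."""
    return canonical_for_compare(old) != canonical_for_compare(new)
```

With this shape, two builds of an unchanged repo differ only in their stamps, so `needs_rewrite` returns False and the file on disk can be left untouched.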

Added coverage for all 3 reviewer repro cases: --no-cluster extract stamps
(test_extract_cli.py), --no-cluster update stamps (test_watch.py), and
--out-elsewhere staleness detection via the recorded root (test_export.py).

The merge-driver command wrote graph.json directly (__main__.py), bypassing
stamp_graph_metadata() (d1692f4's chokepoint), so a merge-committed
graph.json carried no generated_at/indexed_repo_root and check_staleness
could never flag it as stale.

merge-driver has no natural indexed-root argument - git invokes it as
`graphify merge-driver %O %A %B` with three throwaway temp file paths, not
the real graphify-out/graph.json location. Resolve the actual repo root via
`git rev-parse --show-toplevel` (git runs merge drivers with cwd at the top
of the work tree), falling back to the current side's previously recorded
indexed_repo_root, then to cwd.
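The three-step fallback described above could look something like this. It is a sketch under assumptions: `git_toplevel` and `resolve_indexed_root` are hypothetical names, and the injectable `toplevel` parameter exists only to make the fallback order easy to test without a real repository.

```python
import os
import subprocess
from typing import Callable, Optional


def git_toplevel(cwd: str) -> Optional[str]:
    """Work-tree root reported by git, or None outside a repo / without git."""
    try:
        out = subprocess.run(
            ["git", "rev-parse", "--show-toplevel"],
            cwd=cwd, capture_output=True, text=True, check=True, timeout=10,
        ).stdout.strip()
    except (OSError, subprocess.SubprocessError):
        return None
    return out or None


def resolve_indexed_root(
    current_graph: dict,
    cwd: Optional[str] = None,
    toplevel: Callable[[str], Optional[str]] = git_toplevel,
) -> str:
    """Pick the repo root: git first, then the recorded root, then cwd."""
    cwd = cwd or os.getcwd()
    return toplevel(cwd) or current_graph.get("indexed_repo_root") or cwd
```

Here `current_graph` stands for the parsed contents of the current side's file (the `%A` argument), which may already carry an `indexed_repo_root` from an earlier write.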

Added tests/test_merge_driver_cli.py covering both the normal path and the
outside-a-git-repo fallback.