Implement "go mod tidy" equivalent recipe #8102

Open
sambsnyd wants to merge 19 commits into main from go-mod-tidy

@sambsnyd

Member

This branch implements org.openrewrite.golang.GoModTidy, an OpenRewrite recipe that reproduces go mod tidy's effect on go.mod, validated end-to-end through the moderne-cli.

There was a cross-language RPC bug where editing .go sources dropped whitespace from unchanged subtrees.
Our visit and with methods were instantiating new structs even when nothing had changed. Both are fixed.

Implemented downloading of dependencies so that we have the complete dependency graph required for tidy (and for other dependency manipulations in the future). Added an HttpSender to the RPC mechanism so that we don't have to shell out to `go` for this, and so that proxy settings, credentials, etc. come from the execution context per OpenRewrite idiom.

Validated on 27 real Go repositories via the CLI.

// dependency go.mod files from a GOPROXY) perform an HTTP GET through
// the configured HttpSender, so proxy/auth/TLS are honored. Returns the
// status code and base64-encoded body.
jsonRpc.rpc("Http", new JsonRpcMethod<HttpRequest>() {

Member

Want to take a close look at this, and any other modification to the Rewrite RPC spec.

Member Author

I'm totally open to other ideas about how to enable HTTP communication in the ecosystems we support. Figured we would want to reuse the http sender we have on the java side rather than wiring through the individual proxy settings / credentials / etc.

Member Author

@jkschneider I've moved the go dependency resolver to java. There are no longer any changes to RewriteRpc itself

sambsnyd added 16 commits June 23, 2026 19:40
Fixes whitespace loss when a recipe edits .go sources through the
moderne-cli RPC pipeline, and hardens parsing/serialization for the
GoModTidy recipe end-to-end.

Whitespace round-trip fix (pkg/rpc/java_receiver.go): for J nodes whose
Go model holds a direct child where Java's model wraps it
(JRightPadded/Container) — J.If then/else parts, for/forEach bodies,
J.Switch selector, J.Case statements — the receiver passed a nil
baseline to q.Receive. On a CHANGE delta that materialized a fresh empty
instance, so unchanged inner Spaces resolved NO_CHANGE -> empty and the
subtree collapsed (`if x == 1 {`  ->  `if x == 1{`). Now each site passes
the baseline wrapped exactly as the sender wraps it, so inner spaces diff
against the real baseline. Captured the real failing wire as a replayable
regression test (print_collapse_repro_test.go + testdata).

Immutable LST: withX methods return the receiver unchanged when the
argument matches the current value, and the Go visitor copies-on-write,
so unchanged subtrees keep pointer identity and produce no spurious RPC
deltas (the root cause of over-broad patches).
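
The identity-preserving pattern described above can be sketched as follows. This is a minimal illustration, not the branch's code: `Identifier`, `Binary`, and `visitBinary` are hypothetical stand-ins for real LST node types and visitor methods.

```go
package main

import "fmt"

// Identifier is a hypothetical, simplified LST leaf used only for illustration.
type Identifier struct {
	Prefix string // leading whitespace
	Name   string
}

// WithName returns the receiver itself when the name is unchanged, so callers
// can detect "no change" by pointer comparison; otherwise it returns a copy.
func (i *Identifier) WithName(name string) *Identifier {
	if i.Name == name {
		return i
	}
	c := *i
	c.Name = name
	return &c
}

// Binary is a hypothetical parent node with two children.
type Binary struct {
	Left, Right *Identifier
}

// visitBinary applies f to both children and allocates a new parent only when
// at least one child pointer actually changed (copy-on-write).
func visitBinary(b *Binary, f func(*Identifier) *Identifier) *Binary {
	l, r := f(b.Left), f(b.Right)
	if l == b.Left && r == b.Right {
		return b
	}
	return &Binary{Left: l, Right: r}
}

func main() {
	b := &Binary{
		Left:  &Identifier{Prefix: " ", Name: "x"},
		Right: &Identifier{Prefix: " ", Name: "y"},
	}
	same := visitBinary(b, func(i *Identifier) *Identifier { return i.WithName(i.Name) })
	renamed := visitBinary(b, func(i *Identifier) *Identifier {
		if i.Name == "x" {
			return i.WithName("z")
		}
		return i
	})
	// Prints: true false true
	fmt.Println(same == b, renamed == b, renamed.Right == b.Right)
}
```

Because the untouched right child keeps its pointer, a diffing sender could skip it entirely, which is the property the commit relies on to avoid spurious deltas.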

Unified RPC state to match the other parsers (Java/JS/C#/Python): a
single remoteObjects baseline + single ref tables shared by both
directions, replacing the split reverse* maps. handleVisit stores results
in localObjects only.

Parsing: files that fail to parse become a ParseError inline rather than
being silently dropped or downgraded to PlainText; literal normalization
avoids NumberFormatException on Go integer syntax (hex/octal/underscore/
runes); GoModTidy harvests imports from PlainText .go files for
build-excluded/other-arch sources.

Validated on 12 real projects (~3500 .go files): 0 ParseError, 0
PlainText, recipe touches 0 .go files, and direct-dependency parity with
`go mod tidy` in every project.

Makes GoModTidy resolve the full transitive module graph by leaning on the
standard Go module cache as a persistent, shared backbone.

Write-through cache (modgraph/source.go): ProxyWriteThroughSource persists every
fetched .mod and .zip (plus a computed h1: .ziphash) into the standard
$GOMODCACHE/cache/download/<esc>/@v/<ver>.* layout `go mod download` produces,
via atomic temp-file+rename. The first fetch of a module@version costs a network
round-trip through the CLI HttpSender; thereafter CacheSource — and the real go
toolchain — serve it offline. CacheSource.PackageGoFiles gains a fallback that
reads .go files straight from the cached .zip when the extracted tree is absent,
so a clean clone needs no `go` extraction step.

Unified wiring (cmd/rpc/main.go): moduleSource builds
TieredSource(CacheSource, ProxyWriteThroughSource) used by BOTH parse-time graph
resolution and the recipe ExecutionContext. Network resolution is on by default;
opt out for air-gapped runs with MODERNE_GO_OFFLINE, GOPROXY=off, or
MODERNE_GO_PROXY_RESOLVE=0.

Recipe-time re-resolution (recipe/golang/go_mod_tidy.go): computeTidySet no
longer hard-bails when the parse-time graph was incomplete. When the marker's
graph is incomplete it re-resolves now against the network-backed source, so a
cold parse no longer pins the recipe to the incomplete LST-only fallback. When
resolution genuinely cannot complete (offline + cold cache, private module), it
falls back to the LST-only set, which PRESERVES the existing require / `// indirect`
block rather than dropping unconfirmed deps.

Net effect on the corpus: direct-dependency parity with `go mod tidy` in every
project, and the recipe now prunes spurious indirect entries it previously kept
(e.g. caddy reaches exact parity). The remaining test-transitive indirect deps
some projects list (e.g. kr/text, go.uber.org/mock) come from go 1.17+'s
module-graph pruning-completeness rule (derived from the build list, not package
imports) and are a separate, bounded follow-up.

NeededModules computes the modules that PROVIDE a package in `all` (direct +
import-reachable indirect). For a go>=1.17 main module, `go mod tidy` also
records indirect roots for test-transitive dependencies that the pruned module
graph would under-select — e.g. gin's kr/text (via gopkg.in/check.v1) and
go.uber.org/mock (via a dependency's test), cli's gotest.tools/v3 and
jedisct1/go-minisign. NeededModules misses these because they are not reachable
by walking ordinary package imports.

TidyRequireSet (modgraph/tidy.go) adds them, mirroring
cmd/go/internal/modload.tidyPrunedRoots: start from the import-reachable roots,
walk imports AND tests outward from `all`, and promote a module to an explicit
root whenever the pruned graph under the current roots selects a lower version
than the one actually loaded (Selected(path) < loadedVersion). The version gate
is essential — it adds genuinely under-selected modules while leaving testify-
style clusters out when the pruned graph already selects them correctly, so it
does not over-include.
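
The version gate can be illustrated with a small sketch. Assumptions: `compareVersions` is a deliberately simplified stand-in for golang.org/x/mod/semver (plain vMAJOR.MINOR.PATCH only, no pre-release handling), `underSelected` is a hypothetical helper, the module paths are made up, and a module absent from the pruned selection is treated as under-selected.

```go
package main

import (
	"fmt"
	"sort"
	"strconv"
	"strings"
)

// compareVersions orders plain vMAJOR.MINOR.PATCH strings.
// Simplification: pre-release and build metadata are not handled.
func compareVersions(a, b string) int {
	pa := strings.Split(strings.TrimPrefix(a, "v"), ".")
	pb := strings.Split(strings.TrimPrefix(b, "v"), ".")
	for i := 0; i < 3; i++ {
		x, y := numAt(pa, i), numAt(pb, i)
		if x < y {
			return -1
		}
		if x > y {
			return 1
		}
	}
	return 0
}

// numAt returns the i-th numeric component, or 0 when it is missing.
func numAt(parts []string, i int) int {
	if i >= len(parts) {
		return 0
	}
	n, _ := strconv.Atoi(parts[i])
	return n
}

// underSelected returns, sorted, the modules whose version in the pruned
// selection is lower than the version actually loaded (or missing entirely).
func underSelected(loaded, pruned map[string]string) []string {
	var out []string
	for path, want := range loaded {
		got, ok := pruned[path]
		if !ok || compareVersions(got, want) < 0 {
			out = append(out, path)
		}
	}
	sort.Strings(out)
	return out
}

func main() {
	loaded := map[string]string{
		"example.com/assert": "v1.9.0",
		"example.com/text":   "v0.2.0",
		"example.com/mock":   "v0.4.0",
	}
	pruned := map[string]string{
		"example.com/assert": "v1.9.0", // already selected correctly: left alone
		"example.com/text":   "v0.1.0", // lower than loaded: promoted
		// example.com/mock is absent from the pruned graph: promoted
	}
	// Prints: [example.com/mock example.com/text]
	fmt.Println(underSelected(loaded, pruned))
}
```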

The pruned selection under a candidate root set is computed by building a
synthetic go.mod (preserving the main module's go directive and replaces) that
requires exactly those roots and re-resolving; go.mod fetches are served from
the write-through cache. Iterates to a fixpoint. No-ops for go<1.17.

Validated: new golden test TidyRequireSet-via-proxy (no-extras and testify
cases) matches `go mod tidy`; live CLI runs bring gin and cli to exact parity
(0 missing, 0 extra), with cobra/testify/mux/uuid unchanged. All 12 corpus
projects now match `go mod tidy` exactly.

…olution

The pruning-completeness pass decided whether a candidate module was under-
selected by re-resolving a synthetic go.mod (Resolve) on every fixpoint
iteration — re-reading and re-parsing every dependency go.mod each time.

Replace that with an in-memory pruned MVS over a requirement index. The index
is seeded for free from the already-resolved graph (res.Graph carries every
LOADED module's require edges) and lazily fetches the go.mod only for the few
pruned modules that pruning left unloaded AND that get promoted to roots —
typically a handful of cache reads total, versus O(deps) parses per iteration.
prunedSelectInMemory mirrors Resolve's traversal exactly (a module's requires
are recursed only when it is unpruned, i.e. go<1.17), so results are identical;
main-module version replacements are carried into the index.
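
A rough sketch of a pruned selection over an in-memory requirement index follows. It only illustrates the traversal rule stated above (requirements of a loaded module become nodes, recursion continues only through unpruned modules). Assumptions: versions are modeled as plain integers to keep the example short, `mod`, `info`, and `prunedSelect` are hypothetical names, and replacements and lazy fetching are omitted.

```go
package main

import "fmt"

// mod identifies a module version. Versions are plain ints here purely to
// keep the sketch short; real code would compare semver strings.
type mod struct {
	Path    string
	Version int
}

// info is one entry of a hypothetical in-memory requirement index.
type info struct {
	Unpruned bool // true when the module's go.mod declares go < 1.17
	Requires []mod
}

// prunedSelect returns the highest version seen per module path when walking
// from roots under the pruning rule.
func prunedSelect(roots []mod, index map[mod]info) map[string]int {
	selected := map[string]int{}
	loaded := map[mod]bool{}
	raise := func(m mod) {
		if m.Version > selected[m.Path] {
			selected[m.Path] = m.Version
		}
	}
	var load func(m mod)
	load = func(m mod) {
		if loaded[m] {
			return
		}
		loaded[m] = true
		mi := index[m]
		for _, r := range mi.Requires {
			raise(r) // every requirement of a loaded module becomes a node
			if mi.Unpruned {
				load(r) // recurse only through unpruned modules
			}
		}
	}
	for _, r := range roots {
		raise(r)
		load(r)
	}
	return selected
}

func main() {
	index := map[mod]info{
		{"example.com/a", 1}: {Requires: []mod{{"example.com/b", 2}}},
		{"example.com/b", 2}: {Requires: []mod{{"example.com/c", 3}}}, // not read: a is pruned
		{"example.com/d", 1}: {Unpruned: true, Requires: []mod{{"example.com/e", 1}}},
		{"example.com/e", 1}: {Requires: []mod{{"example.com/c", 1}}}, // read: d is unpruned
	}
	roots := []mod{{"example.com/a", 1}, {"example.com/d", 1}}
	sel := prunedSelect(roots, index)
	// c is selected at 1 even though b requires 3, because b's go.mod is pruned out.
	fmt.Println(sel["example.com/b"], sel["example.com/c"], len(sel))
}
```

In this toy graph `example.com/c` would count as under-selected if the loaded build actually used version 3, which is the situation the promotion pass is meant to detect.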

Validated: golden TidyRequireSet-via-proxy cases (no-extras, testify, gin app)
still match `go mod tidy`; live CLI keeps exact parity on gin and cli (the
promotion cases) and cobra (clean), with no over/under-inclusion.

The recipe's scan accumulator was global: a single modulePath, rawImports, and
requireMods shared across every go.mod in a repository. In a multi-module repo
that conflates modules: a nested module's file leaks its imports into the root
module's require set. Observed on prometheus, whose root go.mod gained a direct
requirement on github.com/grpc-ecosystem/grpc-gateway/v2 — imported only by the
nested internal/tools module's `//go:build tools` tools.go — where `go mod tidy`
correctly keeps it indirect.

Scope the accumulator per module by source path. The scanner now records
fileImports keyed by each .go file's source path, plus per-directory module
paths and require sets and the set of go.mod directories. The editor attributes
each file to its nearest-ancestor go.mod (ownerDir = the longest go.mod
directory that is the file's directory or a prefix of it) and tidies each go.mod
against only the files it owns.
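
The nearest-ancestor attribution could be sketched like this. `ownerDir` here is a hypothetical reimplementation for illustration, assuming slash-separated paths relative to the repository root, with "." standing for the root directory.

```go
package main

import (
	"fmt"
	"path"
	"strings"
)

// ownerDir returns the longest directory in modDirs that is the file's
// directory or an ancestor of it, and false when no go.mod directory
// contains the file.
func ownerDir(filePath string, modDirs []string) (string, bool) {
	dir := path.Dir(filePath) // "." for root-level files
	best, found := "", false
	for _, d := range modDirs {
		d = path.Clean(d)
		if !containsDir(d, dir) {
			continue
		}
		if !found || weight(d) > weight(best) {
			best, found = d, true
		}
	}
	return best, found
}

// containsDir reports whether dir equals modDir or lies beneath it. The match
// is segment-aware, so "internal/tools" does not own "internal/toolsx".
func containsDir(modDir, dir string) bool {
	if modDir == "." {
		return true
	}
	return dir == modDir || strings.HasPrefix(dir, modDir+"/")
}

// weight ranks candidates by path length, with the root ranked lowest.
func weight(d string) int {
	if d == "." {
		return 0
	}
	return len(d)
}

func main() {
	mods := []string{".", "internal/tools"}
	for _, f := range []string{"main.go", "internal/tools/tools.go", "internal/toolsx/x.go"} {
		owner, _ := ownerDir(f, mods)
		fmt.Printf("%s -> %s\n", f, owner)
	}
}
```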

Single-module repositories are unaffected: every file maps to the one root
module, identical to before. Validated end to end — prometheus reaches exact
parity with `go mod tidy` (grpc-gateway back to indirect) and gin (single
module) is unchanged. Unit test TestOwnedImportsScopesByModule guards the
attribution.

Known remaining nuance: a root-level `//go:build tools` file is still harvested
(go mod tidy excludes custom-tag files); a build-constraint-aware import filter
is a separate follow-up.

…der)

The pruning-completeness pass promoted every under-selected reachable module in
one pass and iterated to a fixpoint. That over-includes: a module reached only
through a deeper dependency (e.g. github.com/kr/text, required by kr/pretty,
required by gopkg.in/check.v1) looks under-selected until its requirer is itself
promoted — so promoting them together wrongly keeps the deeper one. `go mod
tidy` records kr/pretty but not kr/text; the old pass recorded both (observed on
sourcegraph/conc, and version/cache-state dependent because the result hinged on
iteration order).

Mirror cmd/go/internal/modload.tidyPrunedRoots: walk the package import graph
frontier by frontier in increasing import-stack depth, recomputing the pruned
selection between frontiers, and promote a module only when it is still
under-selected at its depth. Promoting a shallow root (kr/pretty) then pins its
requirements (kr/text) before they are examined, so they are not promoted. A
package's test imports are deferred one frontier deeper (go's `<pkg>.test`
node), keeping test-transitive deps below ordinary ones. The in-memory pruned
selection is recomputed per frontier (cheap; no re-resolution).

Validated: new golden case conc_depth_ordering matches `go mod tidy` (kr/pretty
kept, kr/text excluded); gin/cli promotion cases and live CLI runs on
conc/gin/cli/zap all reach exact parity.

Housekeeping for code paths obsoleted by this branch's work; no behavior change.

- Remove looksLikeModulePath, dead since the scanner moved to parseGoModDeclared
  (which classifies require-block entries via moduleOf) for per-module scoping.
- Drop findResolution, an exact duplicate of GetResolutionResult in the same
  package; use the latter.
- Rewrite the GoModTidy type doc and Description: they still claimed the recipe
  does NOT add missing requires, remove unused ones, or do MVS version
  selection. It now does all of that via TidyRequireSet; only go.sum is left
  alone. DisplayName drops the now-inaccurate "(LST-only)" suffix.
- Clarify the editor's three-tier fallback comment (the marker fallback is a
  parse-time-resolution path, not test-only).

…files

Two related changes to GoModTidy's view of the source.

Remove the dead PlainText handling. The recipe registered org.openrewrite.text.
PlainText and harvested imports from PlainText .go files on the assumption that
the CLI's Go build step backfilled a PlainText for any file the parser omitted.
It does not — a build-excluded file simply vanishes from the LST (verified). So
the language registration, the scanner's PlainText branch, the receiver's
PlainText codec, the PlainText tree type, its value-type factory, and
parser.FileImports were all unreachable. Removed.

Recover platform-gated imports in the parser instead (the robust fix the
PlainText path was meant to be). `go mod tidy` unions imports across every
GOOS/GOARCH and tag, so a module imported only by, say, a //go:build windows
file (cobra's mousetrap) must stay visible. ParsePackage now parses
build-excluded files IMPORTS-ONLY and emits a small CompilationUnit carrying
just their imports, so the recipe counts them via cu.Imports with no marker or
cross-repo plumbing. Imports-only is essential: a full body cannot be mapped
without type info (the mapper needs types to tell a `(T)` conversion from a
parenthesized expression, which otherwise yields a J$Parentheses where Java
expects a TypeTree). Excluded files are type-checked only with the included set;
those with no imports, or that fail the imports-only parse, are dropped rather
than surfaced as spurious ParseErrors. They are never modified by the recipe, so
they are not written back.
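
For reference, an imports-only parse with the standard library looks roughly like this. It is a sketch, not the ParsePackage code: `importsOf` is a hypothetical helper and `example.com/winonly` is a made-up module path. Note that go/parser does not evaluate build constraints, so the `//go:build windows` line does not stop the file from being read.

```go
package main

import (
	"fmt"
	"go/parser"
	"go/token"
	"strconv"
)

// importsOf parses only the package clause and import declarations of src
// and returns the unquoted import paths in source order.
func importsOf(filename, src string) ([]string, error) {
	f, err := parser.ParseFile(token.NewFileSet(), filename, src, parser.ImportsOnly)
	if err != nil {
		return nil, err
	}
	var paths []string
	for _, spec := range f.Imports {
		p, err := strconv.Unquote(spec.Path.Value)
		if err != nil {
			return nil, err
		}
		paths = append(paths, p)
	}
	return paths, nil
}

func main() {
	src := `//go:build windows

package app

import (
	"os"

	"example.com/winonly"
)

func init() { _ = os.Args; winonly.Check() }
`
	paths, err := importsOf("app_windows.go", src)
	if err != nil {
		panic(err)
	}
	fmt.Println(paths) // [os example.com/winonly]
}
```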

Tests updated: build-constraint evaluation is now exercised via MatchBuildContext
(ParsePackage no longer omits), and the omit-test becomes an emit-test asserting
a //go:build windows file's import survives.

proxyResolveEnabled gated network module resolution on two Moderne-specific
environment variables — MODERNE_GO_OFFLINE (newly invented) and
MODERNE_GO_PROXY_RESOLVE — which have no analog in how rewrite handles Maven or
other ecosystems. rewrite attempts the network via the CLI HttpSender and
degrades gracefully when it is unavailable; it does not expose per-ecosystem
offline toggles.

Follow that pattern: resolution is on by default and disabled the Go-native way
with GOPROXY=off (the standard mechanism for air-gapped builds). When the proxy
is unreachable for any other reason, resolution already falls back to the local
cache and the existing require set. Removed both env vars and the isTruthy
helper they needed. GOPROXY=off is honored as full-offline; a GOPROXY list like
"https://corp,off" still enables the proxy.
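
One way to express that GOPROXY rule is sketched below. Assumptions: `proxyBases` is a hypothetical helper, not the removed proxyResolveEnabled; an empty value falls back to the go command's documented default; and because this sketch has no direct VCS fetch, both `off` and `direct` simply end the list.

```go
package main

import (
	"fmt"
	"strings"
)

// defaultGoproxy is the go command's default when GOPROXY is unset.
const defaultGoproxy = "https://proxy.golang.org,direct"

// proxyBases returns the proxy base URLs to try, in order. An empty result
// means network resolution is disabled and only the local cache is usable.
func proxyBases(goproxy string) []string {
	if strings.TrimSpace(goproxy) == "" {
		goproxy = defaultGoproxy
	}
	isSep := func(r rune) bool { return r == ',' || r == '|' }
	var bases []string
	for _, entry := range strings.FieldsFunc(goproxy, isSep) {
		entry = strings.TrimSpace(entry)
		switch entry {
		case "off", "direct":
			return bases // nothing after these keywords is consulted
		case "":
			continue
		default:
			bases = append(bases, strings.TrimSuffix(entry, "/"))
		}
	}
	return bases
}

func main() {
	for _, v := range []string{"off", "https://corp,off", ""} {
		bases := proxyBases(v)
		fmt.Printf("GOPROXY=%q enabled=%v bases=%v\n", v, len(bases) > 0, bases)
	}
}
```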

Verified: GOPROXY=off builds and runs cleanly (cobra degrades gracefully, no
crash, requires preserved); default remains network-on.

First half of porting the Go dependency resolver to pure Java so the generic
Http RPC method can eventually be removed from core RewriteRpc (HTTP moves
entirely into the host, no peer-initiated fetch).

Foundations (faithful ports, with tests):
- GoSemver: golang.org/x/mod/semver Compare, the ordering MVS and the pruning
  version gate depend on.
- ModulePath: module path/version escaping for the proxy URL and cache layout.
- GoModFile: a light go.mod reader (module/go/require/replace) for the project
  and the many dependency go.mods.
- GoImports: an imports-only Go source scanner (replaces go/parser ImportsOnly),
  comment/string aware.
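
The escaping that ModulePath ports can be sketched as below, shown in Go for consistency with the rest of the branch. It covers only the case encoding (an upper-case ASCII letter becomes `!` plus its lower-case form) and skips path validation; `escape` and `modFilePath` are hypothetical names and the module path is made up.

```go
package main

import (
	"fmt"
	"strings"
)

// escape applies the case encoding used by the module proxy protocol and the
// module cache: each upper-case ASCII letter becomes '!' followed by its
// lower-case form. Validation of the input is omitted in this sketch.
func escape(s string) string {
	var b strings.Builder
	for _, r := range s {
		if r >= 'A' && r <= 'Z' {
			b.WriteByte('!')
			b.WriteRune(r + ('a' - 'A'))
		} else {
			b.WriteRune(r)
		}
	}
	return b.String()
}

// modFilePath returns the relative path of a version's go.mod, which has the
// same shape under a proxy base URL and under $GOMODCACHE/cache/download.
func modFilePath(module, version string) string {
	return escape(module) + "/@v/" + escape(version) + ".mod"
}

func main() {
	rel := modFilePath("example.com/Some/Module", "v1.0.0")
	fmt.Println("https://proxy.golang.org/" + rel)
	// Prints: https://proxy.golang.org/example.com/!some/!module/@v/v1.0.0.mod
}
```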

ModSource (the network layer, in Java):
- ModSource interface; CacheSource ($GOMODCACHE read, extracted tree or cached
  zip); ProxySource (fetch via HttpSender + write-through to the standard cache
  layout, atomic writes); TieredSource; Zips (module-zip extraction/filtering).
  ProxySource calls HttpSender directly — no RPC for fetching.

Not yet wired in: Resolve (pruned MVS), NeededModules/TidyRequireSet, and the
RPC method that replaces the Go-side resolver. The existing Go resolver remains
the active path until those land. All new code compiles at Java 8 and the
modgraph unit tests pass.

Port modgraph.Resolve to Java: the go1.17+ pruned module graph and MVS build
list. Every loaded module's requirements become build-list nodes, but recursion
only continues through unpruned (go<1.17) modules; iterative MVS raises a node's
selected version and re-loads it as needed. Dependency go.mods come entirely
from the ModSource (no process execution).

ResolverTest validates against the real toolchain end to end: it resolves a
cobra-requiring module by fetching every dependency go.mod from the GOPROXY via
HttpUrlConnectionSender, and asserts the build list equals `go list -m all`.
Ran green (not skipped) against go1.26 + the live proxy.

Module/zip hashes (go.sum material) are intentionally omitted — go mod tidy's
require-set computation does not need them; that can follow when go.sum support
is ported.

…eness)

Port NeededModules and TidyRequireSet to Java. NeededModules walks the package
import graph from the main module's imports (tests included) and classifies
direct vs indirect against the build list. TidyRequireSet adds the go1.17+
pruning-completeness roots: it walks imports+tests frontier-by-frontier, and at
each frontier promotes any module the pruned in-memory MVS under-selects —
mirroring cmd/go/internal/modload.tidyPrunedRoots. ReqIndex seeds the pruned MVS
from the resolved graph and lazily fetches only promoted roots; prunedSelectInMemory
runs the selection with no per-iteration re-resolution.

TidyTest validates against `go mod tidy` for the five scenarios that stress the
pruning: no-extras, testify test-transitive (kr/text via check.v1), gin's real
promotions, conc's "must not over-promote", and conc's depth-ordering (promote
kr/pretty but not kr/text). All match the toolchain exactly.

Also fixes GoImports to read backtick (raw-string) import paths — bytedance/sonic
writes its imports that way, and missing them dropped sonic's six assembly-related
indirect deps from the gin result. Caught by the gin golden case.

…aph + Http

Complete the port: the GoModTidy recipe no longer resolves dependencies in the
Go peer. It now delegates the entire `go mod tidy` require-set computation to the
pure-Java resolver on the host via a new domain RPC method, GoModResolveTidy.

- core RewriteRpc: replace the generic, network-performing `Http` RPC method
  with a `registerLanguageMethods(JsonRpc)` extension hook (called before bind)
  and a protected getHttpSender(). The generic "host, fetch this URL for the
  peer" capability — the SSRF/coupling concern raised in review — is gone from
  the shared protocol.
- GoRewriteRpc: register GoModResolveTidy, which builds a CacheSource+ProxySource
  from the request and runs Resolver + Tidy. All GOPROXY HTTP happens here, in
  the host, through the configured HttpSender; GOPROXY=off resolves cache-only.
- Go side: resolveTidyViaJava sends {goMod, mainImports, modulePath,
  separateIndirect, goproxy, gomodcache} and applies the returned require set;
  computeTidySet calls it instead of the in-process resolver, falling back to the
  LST-only pass when no resolver is installed (offline). The parse-time marker now
  carries only the declared model; the resolved build list is computed on demand.
- Delete the Go pkg/parser/modgraph package (its algorithm now lives in Java) and
  the parse-time resolveModuleGraph/moduleSource/fetchHTTP plumbing.

Tested: GoModResolveTidyTest drives the exact handler entry point (resolveTidy)
and matches `go mod tidy` for gin's pruning-completeness case, fetching over the
proxy via HttpSender. Full rewrite-go Java suite, rewrite-core rpc tests, and the
Go unit suite are green.

Note: a full modw corpus sweep could not be run here — the moderne-cli
core/serialization module does not compile against the workspace rewrite
(pre-existing API skew, 128 errors in V3LstReader: LstMetadata, ChangesetFilter,
EditPage, UsesMethod.getMethodPattern, …), so the dev fat jar cannot be rebuilt.
This is unrelated to these changes (none touch those files) and predates them
(the on-disk fat jar is from before this work).

Drive resolveTidyViaJava against a canned host response: assert it parses
{direct, indirect, complete} correctly and that the request it writes carries the
exact method name and param field names (goMod, mainImports, modulePath,
separateIndirect, goproxy, gomodcache) the Java GoModResolveTidyRequest expects.

Together with GoModResolveTidyTest (the Java handler half, validated against
`go mod tidy`), this pins both ends of the cross-process contract that neither
single-sided test can see on its own.

sambsnyd added 2 commits June 23, 2026 23:14
The httpSender field, setHttpSender/getHttpSender, and setHttpSenderFrom were
added to core RewriteRpc to feed the Go module-graph resolver. With the generic
Http RPC method gone and the resolver now the sole consumer, core no longer needs
any HttpSender knowledge.

Core RewriteRpc keeps only a generic, language-agnostic beforeSend(Object) hook
invoked before each visit/batchVisit/generate is dispatched to the peer.
GoRewriteRpc overrides it to capture the operation's ExecutionContext HttpSender
into its own field, which the GoModResolveTidy handler uses for GOPROXY fetches;
parseProject captures it the same way. setHttpSender had no callers outside this
initiative (the CLI configures the sender on the ExecutionContext, not the RPC
peer), so removing it is safe.

Verified: rewrite-core rpc tests, the rewrite-go resolver + handler tests, and a
full modw end-to-end run (gin, exact `go mod tidy` parity) all pass.

… GoRewriteRpc

Revert the registerLanguageMethods/beforeSend hooks from core RewriteRpc, which
is now byte-for-byte origin/main again. Everything the Go resolver needs lives in
GoRewriteRpc:

- It registers GoModResolveTidy itself, on the JsonRpc obtained from
  process.getRpcClient(), in its constructor. The channel's method table is a
  ConcurrentHashMap consulted at dispatch time and bind() only starts a read
  loop, so registering after super()'s bind() is safe; the method is only ever
  invoked during a recipe run, long after construction.
- It captures the operation's HttpSender by overriding the public
  visit/batchVisit/generate and reading it from the ExecutionContext before
  delegating to super (parseProject already captured it for parse time).

No new extension points in the shared core protocol. Verified: rewrite-core rpc
tests, rewrite-go resolver/handler tests, and a full modw end-to-end run (gin,
exact `go mod tidy` parity) all pass.

@greg-at-moderne

Contributor

BTW, shouldn't this recipe land in https://github.com/moderneinc/recipes-go ?
