feat: add search_docs tool for searching Plane documentation by sriramveeraghanta · Pull Request #158 · makeplane/plane-mcp-server

sriramveeraghanta · 2026-06-19T15:30:58Z

Summary

Adds a search_docs MCP tool that runs local full-text search over Plane's two public documentation sites and returns ranked results with snippets and URLs.

search_docs(query: str, source: "all" | "help" | "developer" = "all", limit: int = 5) -> dict

- help → docs.plane.so (product/usage) · developer → developers.plane.so (API reference, self-hosting, OAuth, webhooks, MCP) · all → both.
- Fetches each site's Mintlify llms-full.txt, caches it in-process (1h TTL, with a double-checked lock so concurrent calls don't double-fetch), and splits it into per-page records.
- Ranks pages by saturated, field-weighted term frequency (title > description > body, plus a coverage boost when all query terms match), so a focused page beats a long page that merely repeats a term.
- Returns each match as {title, url, source, score, snippet}. Fetch failures are surfaced as a non-fatal error/warnings field rather than crashing the tool.
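
To make the ranking idea concrete, here is a minimal sketch of saturated, field-weighted term frequency. The field weights (6.0 / 3.0 / 2.0) mirror the diff quoted later in the review; the saturation curve `hits / (hits + k)`, the `coverage_boost` value, and the helper names `saturate`, `tokenize`, and `score_page` are assumptions for illustration and may differ from what plane_mcp/tools/docs.py actually does.

```python
import re


def saturate(hits: int, k: float = 1.0) -> float:
    """Diminishing returns: each extra hit adds less (assumed curve)."""
    return hits / (hits + k)


def tokenize(text: str) -> list[str]:
    """Lowercase alphanumeric tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def score_page(query: str, title: str, description: str, body: str,
               coverage_boost: float = 1.5) -> float:
    """Score one page; coverage_boost is a hypothetical multiplier."""
    tokens = tokenize(query)
    if not tokens:
        return 0.0
    # (weight, tokens) per field: title > description > body.
    fields = [
        (6.0, tokenize(title)),
        (3.0, tokenize(description)),
        (2.0, tokenize(body)),
    ]
    score = 0.0
    matched_terms = 0
    for token in tokens:
        hits = [words.count(token) for _, words in fields]
        if any(hits):
            matched_terms += 1
        score += sum(weight * saturate(h) for (weight, _), h in zip(fields, hits))
    # Reward pages that cover every query term.
    if matched_terms == len(tokens):
        score *= coverage_boost
    return score
```

Because each field's contribution is capped by the saturation curve, a page that repeats one term fifty times in its body cannot outscore a page whose title matches both query terms.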

Why

Lets assistants answer "how do I…" (product) and "how do I build on Plane…" (API/self-host) questions by citing the official docs, instead of relying on stale model knowledge.

Implementation notes

- New module plane_mcp/tools/docs.py, registered via tools/__init__.py (matches the existing per-domain pattern). No Plane auth is needed because the docs are public.
- Pure functions (_parse_llms_full, _parse_yaml_fields, _score_page, _make_snippet, _search) are split from I/O so the parser and ranker are unit-tested with no network.
- Handles YAML block-scalar frontmatter (e.g. url: >- on API-reference pages), strips .html suffixes for clean browsable URLs, and decodes HTML entities in titles/snippets.
- Adds httpx as an explicit dependency (already transitive via fastmcp).
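
The frontmatter handling described above can be sketched roughly as follows. This is an illustrative sketch, not the PR's `_parse_yaml_fields`: the names `parse_fields` and `clean_url` are hypothetical, the sample frontmatter is made up, and only flat `key: value` pairs plus `>`/`|` block scalars are handled.

```python
import html


def parse_fields(frontmatter: str) -> dict[str, str]:
    """Parse flat `key: value` lines, folding YAML block scalars."""
    fields: dict[str, str] = {}
    lines = frontmatter.splitlines()
    i = 0
    while i < len(lines):
        line = lines[i]
        i += 1
        key, sep, value = line.partition(":")
        # Skip lines without a key and stray indented lines.
        if not sep or line[:1].isspace():
            continue
        key, value = key.strip(), value.strip()
        if value in (">", ">-", "|", "|-"):
            # Block scalar: gather the indented continuation lines.
            parts = []
            while i < len(lines) and (lines[i][:1].isspace() or not lines[i].strip()):
                parts.append(lines[i].strip())
                i += 1
            joiner = " " if value.startswith(">") else "\n"
            value = joiner.join(p for p in parts if p)
        fields[key] = value.strip("'\"")
    return fields


def clean_url(url: str) -> str:
    """Drop a trailing .html so the URL is browsable."""
    return url.removesuffix(".html")


sample = (
    "title: Add work item &amp; labels\n"
    "url: >-\n"
    "  https://developers.plane.so/api-reference/issue/add-issue.html\n"
    "description: Create a work item"
)
fields = parse_fields(sample)
page_title = html.unescape(fields["title"])
page_url = clean_url(fields["url"])
```

A hand-rolled parser like this keeps the tool free of a YAML dependency, at the cost of supporting only the subset of YAML that the docs corpus is assumed to use.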

Testing

- 15 new unit tests in tests/test_docs_search.py (parser, tokenizer, ranking, source filter, limits, error handling); no network required.
- Full suite passes (excluding the live-credential integration test); ruff check and ruff format clean.
- Verified end-to-end against the live sites. Example results:
  - "rest api create work item" → /api-reference/issue/add-issue
  - "self host docker compose" → /self-hosting/methods/docker-compose
  - "how to create a cycle" → /core-concepts/cycles

Summary by CodeRabbit

New Features
- Added a documentation search tool that can return ranked results, snippets, or full page text for Plane docs.
- Documentation pages can now be searched across help and developer content sources.
Bug Fixes
- Improved guidance so documentation is checked first for how-to, why, and what questions before action-oriented requests.
- Added more reliable handling for partial documentation fetch issues.
Tests
- Added offline coverage for parsing, ranking, filtering, and full-text search behavior.

Add a `search_docs` MCP tool that runs local full-text search over Plane's two public documentation sites and returns ranked results with snippets.

- docs.plane.so (product/help) and developers.plane.so (API, self-hosting, OAuth, webhooks, MCP), selectable via a `source` filter ("all" by default).
- Fetches each site's Mintlify llms-full.txt, caches it in-process (1h TTL), splits it into per-page records, and ranks by saturated, field-weighted term frequency (title > description > body) so coverage beats repetition.
- Handles YAML block-scalar frontmatter, strips .html suffixes, and decodes HTML entities in titles/snippets.
- Fetch failures are surfaced as a non-fatal error/warnings field.

Adds httpx as an explicit dependency and unit tests for the parser and ranker (no network required).

coderabbitai · 2026-06-19T15:31:07Z

Walkthrough

Adds a new documentation search tool, wires it into MCP registration and server instructions, updates the README, adds an httpx dependency, and includes offline tests for parsing, search, and full-text output.

Changes

Documentation Search Feature

| Layer / File(s) | Summary |
| --- | --- |
| Corpus parsing: `plane_mcp/tools/docs.py` | Parses `llms-full.txt` pages into `DocPage` records, extracting URLs, titles, descriptions, and body content from frontmatter blocks. |
| Search scoring, snippets, and caching: `plane_mcp/tools/docs.py` | Tokenizes queries, scores pages, builds snippets or full content, fetches the docs corpora with `httpx`, and caches parsed results in memory. |
| MCP registration and guidance: `plane_mcp/tools/docs.py`, `plane_mcp/tools/__init__.py`, `plane_mcp/instructions.py`, `README.md`, `pyproject.toml` | Registers `search_docs` with `FastMCP`, adds the tool to the server instruction flow and README, and adds `httpx` as a runtime dependency. |
| Offline search tests: `tests/test_docs_search.py` | Adds offline tests for parsing, tokenization, ranking, source filtering, fetch errors, and `full_text` response shape. |

Sequence Diagram(s)

sequenceDiagram
  participant MCPClient
  participant FastMCP
  participant search_docs
  participant DocsPlane as docs.plane.so/llms-full.txt
  participant DevPlane as developers.plane.so/llms-full.txt

  MCPClient->>FastMCP: ask for a how/what/why answer
  FastMCP->>search_docs: invoke query, source, limit, full_text
  search_docs->>DocsPlane: GET llms-full.txt
  search_docs->>DevPlane: GET llms-full.txt
  search_docs->>search_docs: parse pages, score matches, build snippet or content
  search_docs-->>FastMCP: ranked results with optional error/warnings
  FastMCP-->>MCPClient: tool response

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~45 minutes

Possibly related PRs

makeplane/plane-mcp-server#149: Also updates plane_mcp/instructions.py and the shared SERVER_INSTRUCTIONS constant.

Suggested reviewers

Prashant-Surya

Poem

A bunny bounced through docs by moonlight,
Sniffing snippets snug and bright.
search_docs chirped, “Hop this way!”
Full text, ranks, and pages at play.
Now Plane lore nibbles neat and light 🐰

🚥 Pre-merge checks | ✅ 4 | ❌ 1

❌ Failed checks (1 warning)

| Check name | Status | Explanation | Resolution |
| --- | --- | --- | --- |
| Docstring Coverage | ⚠️ Warning | Docstring coverage is 43.59%, which is insufficient. The required threshold is 80.00%. | Write docstrings for the functions missing them to satisfy the coverage threshold. |

✅ Passed checks (4 passed)

| Check name | Status | Explanation |
| --- | --- | --- |
| Description Check | ✅ Passed | Check skipped: CodeRabbit's high-level summary is enabled. |
| Title check | ✅ Passed | The title clearly and concisely describes the main change: adding the search_docs tool for Plane documentation search. |
| Linked Issues check | ✅ Passed | Check skipped because no linked issues were found for this pull request. |
| Out of Scope Changes check | ✅ Passed | Check skipped because no linked issues were found for this pull request. |

coderabbitai

Actionable comments posted: 3

🤖 Prompt for all review comments with AI agents

Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In `@plane_mcp/tools/docs.py`:
- Around line 347-377: `search_docs` currently returns a raw dict and does not
follow the `plane_mcp/tools` contract. Update
`register_docs_tools`/`search_docs` to call `get_plane_client_context()` first,
then use the returned client and workspace slug when performing the docs lookup.
Replace the dict return with a plane-sdk Pydantic response model for the search
results so tool registration and schema stay consistent with the rest of
`plane_mcp/tools`.
- Around line 233-239: The scoring logic in docs.py is using raw substring
counting in the query matching path, so terms like api can match unrelated words
and skew ranking/snippets. Update the matching in the token loop inside the
search/ranking code (including the related excerpt logic around the referenced
matching helper) to use tokenized comparisons or word-boundary matching instead
of str.count/find on the full lowercase text, and keep the scoring based on true
word hits only.
- Around line 291-303: The _get_corpus cache only stores successful fetches, so
repeated _search() calls will keep retrying a failing docs source on every
request. Update _get_corpus to record a short-lived negative cache/backoff entry
for _fetch_corpus failures, and have the cached lookup in _CACHE distinguish
between a normal pages result and a recent failure so search_docs can skip
hammering the same source during outages while still retrying after the backoff
expires.

ℹ️ Review info

⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro

Run ID: abca5c50-9436-4c3a-b4b2-7f1ed7016ddf

📥 Commits

Reviewing files that changed from the base of the PR and between d3ab7cc and 34f477b.

📒 Files selected for processing (6)

README.md
plane_mcp/instructions.py
plane_mcp/tools/__init__.py
plane_mcp/tools/docs.py
pyproject.toml
tests/test_docs_search.py

coderabbitai · 2026-06-26T05:36:47Z

+    for token in query_tokens:
+        title_hits = title_l.count(token)
+        desc_hits = desc_l.count(token)
+        body_hits = body_l.count(token)
+        if title_hits or desc_hits or body_hits:
+            matched_terms += 1
+        score += 6.0 * _saturate(title_hits) + 3.0 * _saturate(desc_hits) + 2.0 * _saturate(body_hits)


🎯 Functional Correctness | 🟠 Major | ⚡ Quick win

Use token or word-boundary matches instead of raw substrings.

str.count() / find() on the raw text makes any substring a hit, so a query like api will also score and excerpt unrelated words that merely contain api. That skews the core ranking behavior and can generate misleading snippets. Count against tokenized fields or use word-boundary matching instead.

Also applies to: 260-262
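
As a rough illustration of the suggestion above, word-boundary matching with the standard `re` module could look like this. The helper name `count_word_hits` is hypothetical and is not taken from the PR; it only shows how the counts differ from `str.count()`.

```python
import re


def count_word_hits(token: str, text: str) -> int:
    """Count whole-word occurrences of token in text, case-insensitively."""
    pattern = rf"\b{re.escape(token)}\b"
    return len(re.findall(pattern, text, flags=re.IGNORECASE))


text = "Rapid capital apiary API api."
substring_hits = text.lower().count("api")   # counts every embedded "api"
word_hits = count_word_hits("api", text)     # counts only the standalone word
```

Here the raw substring count also picks up "Rapid", "capital", and "apiary", while the word-boundary version counts only the two standalone occurrences.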

coderabbitai · 2026-06-26T05:36:47Z

+def _get_corpus(source: str) -> list[DocPage]:
+    """Return the cached corpus for a source, fetching it if stale or missing."""
+    cached = _CACHE.get(source)
+    if cached and time.time() - cached[0] < _CACHE_TTL:
+        return cached[1]
+    with _CACHE_LOCK:
+        # Re-check under the lock so concurrent callers don't double-fetch.
+        cached = _CACHE.get(source)
+        if cached and time.time() - cached[0] < _CACHE_TTL:
+            return cached[1]
+        pages = _fetch_corpus(source)
+        _CACHE[source] = (time.time(), pages)
+        return pages


🩺 Stability & Availability | 🟠 Major | ⚡ Quick win

Cache fetch failures instead of retrying them on every request.

Right now only successful fetches populate _CACHE. If either docs host is slow or down, every search_docs call pays the full network timeout again and keeps hammering the failing upstream, even though _search() already treats that failure as non-fatal. A short negative-cache/backoff entry would keep the tool responsive during outages.
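
One possible shape for such a negative cache is sketched below. It is not the PR's code: `get_corpus`, `CorpusUnavailable`, `FAILURE_BACKOFF`, and the injected `fetch`/`clock` parameters are assumptions made so the example is self-contained and testable without a network.

```python
import threading
import time
from typing import Callable, Optional

CACHE_TTL = 3600.0       # successful fetches live for 1 hour
FAILURE_BACKOFF = 60.0   # assumed short backoff for failed fetches

# source -> (stored_at, pages or None, error message or None)
_cache: dict[str, tuple[float, Optional[list[str]], Optional[str]]] = {}
_lock = threading.Lock()


class CorpusUnavailable(RuntimeError):
    """Raised when a source failed recently and is still in backoff."""


def _lookup(source: str, now: float):
    """Return the cache entry if it is still fresh, else None."""
    entry = _cache.get(source)
    if entry is None:
        return None
    stored_at, _pages, error = entry
    ttl = CACHE_TTL if error is None else FAILURE_BACKOFF
    return entry if now - stored_at < ttl else None


def get_corpus(source: str, fetch: Callable[[str], list[str]],
               clock: Callable[[], float] = time.monotonic) -> list[str]:
    entry = _lookup(source, clock())
    if entry is None:
        with _lock:
            # Re-check under the lock so concurrent callers don't double-fetch.
            entry = _lookup(source, clock())
            if entry is None:
                try:
                    entry = (clock(), fetch(source), None)
                except Exception as exc:  # remember the failure briefly
                    entry = (clock(), None, str(exc))
                _cache[source] = entry
    _stored_at, pages, error = entry
    if error is not None:
        raise CorpusUnavailable(error)
    return pages
```

With this shape, callers that already treat a fetch failure as non-fatal can catch `CorpusUnavailable` and emit a warning, while the failing host is contacted at most once per backoff window.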

coderabbitai · 2026-06-26T05:36:47Z

+def register_docs_tools(mcp: FastMCP) -> None:
+    """Register documentation-search tools with the MCP server."""
+
+    @mcp.tool()
+    def search_docs(
+        query: str,
+        source: Literal["all", "help", "developer"] = "all",
+        limit: int = 5,
+        full_text: bool = False,
+    ) -> dict:
+        """
+        Search Plane's official docs for how-to and conceptual answers.
+
+        Use for any how / what / why question about Plane — product usage
+        (docs.plane.so) or building on Plane: REST API, self-hosting, OAuth,
+        webhooks, MCP (developers.plane.so). Prefer over action tools, which act but
+        do not explain. Find a page with the default snippets, then re-call with
+        full_text=True, limit=1 to read it in full from cache (no URL fetch needed).
+
+        Args:
+            query: Question or keywords, e.g. "how to create a cycle".
+            source: "help" (product), "developer" (API / build), or "all" (default).
+            limit: Max results, 1-20 (default 5).
+            full_text: True returns each page's full "content" instead of a
+                "snippet"; use with limit=1.
+
+        Returns:
+            {"query", "results": [{"title", "url", "source", "score", and "snippet"
+            or "content"}]}; "error"/"warnings" only if a docs site fetch failed.
+        """
+        return _search(query, source, limit, full_text)


🗄️ Data Integrity & Integration | 🟠 Major | 🏗️ Heavy lift

Align search_docs with the plane_mcp/tools contract.

This tool is registered from plane_mcp/tools/, but it returns a raw dict and skips get_plane_client_context(). That breaks the repo’s tool-level contract for schema/registration consistency; either move this out of the contract-bound package or add the standard context lookup plus a plane-sdk Pydantic response model. As per coding guidelines, plane_mcp/tools/**/*.py: Tool functions must return Pydantic models from plane-sdk and Each tool must call get_plane_client_context() to obtain client and workspace_slug.

Source: Coding guidelines

sriramveeraghanta changed the base branch from canary to main June 19, 2026 15:31

feat: enhance search_docs tool with full-text retrieval option.

34f477b

coderabbitai Bot reviewed Jun 26, 2026

sriramveeraghanta wants to merge 2 commits into main from feat/search-docs-tool
