Add validation with Qwen3-VL and Orchard-GUI #1

Open
QianhuiWu wants to merge 2 commits into main from add-validation-section

@QianhuiWu
Collaborator

No description provided.

QianhuiWu and others added 2 commits May 15, 2026 23:19
Add section "04 · Validation" with a Plotly bar chart showing that
WebHarbor-WebVoyager preserves the model ranking observed on three
independent live-web benchmarks (WebVoyager, Online-Mind2Web, DeepShop).
Three models evaluated and linked out: Qwen3-VL-4B-Thinking,
Qwen3-VL-235B-A22B-Thinking, Orchard-GUI-4B. Chart styling matches the
page palette (accent-blue family, warm-soft highlight band, Instrument
Sans font).

Also add a teal "Validated" pill above the hero quickstart command that
links to the new section, and update the outline list and nav order.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
- Validation lede: unbold the three model names; bold the closing claim
  "WebHarbor ranks models consistently with prior evaluations" so the
  takeaway lands.
- Move the "Our vision" hinge block back to follow Evolving Environments
  (it was inadvertently left after Validation when the sections were
  reordered).
- Hero "Validated" pill: move below the docker command, expand wording
  to "model rankings on the docked WebVoyager sites match live-web
  benchmarks", and flip the row margin to match its new position.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
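
The ranking-preservation claim in the first commit can be checked mechanically. The following is a minimal sketch, not code from this PR: the function names `ranking` and `rankings_agree` are hypothetical, and the scores are placeholders with generic model names rather than the results plotted in the chart. It only tests whether every benchmark orders the models identically.

```python
def ranking(scores):
    """Return model names ordered from highest to lowest score.

    Ties keep the dict's insertion order because sorted() is stable.
    """
    return sorted(scores, key=scores.get, reverse=True)


def rankings_agree(benchmarks):
    """Return True if every benchmark orders the models the same way."""
    orders = [ranking(scores) for scores in benchmarks.values()]
    return all(order == orders[0] for order in orders)


# Placeholder values for illustration only; these are NOT results from the PR.
placeholder_scores = {
    "WebHarbor-WebVoyager": {"model_a": 0.30, "model_b": 0.50, "model_c": 0.40},
    "WebVoyager": {"model_a": 0.35, "model_b": 0.60, "model_c": 0.45},
    "Online-Mind2Web": {"model_a": 0.10, "model_b": 0.25, "model_c": 0.20},
    "DeepShop": {"model_a": 0.05, "model_b": 0.15, "model_c": 0.12},
}

if __name__ == "__main__":
    for name, scores in placeholder_scores.items():
        print(name, ranking(scores))
    print("rankings agree:", rankings_agree(placeholder_scores))
```

With only three models, exact order equality is a simple criterion; a rank-correlation statistic would be the usual choice for a larger model set.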