Real-time insider-trading detector for Polymarket prediction markets on Polygon. Every OrderFilled event is ingested, every trader is scored 0–100 across five behavioural signals, and suspicious wallets surface on a live leaderboard.
Full product requirements:
objective.pdf
| Package | Role |
|---|---|
packages/common |
Shared library. Drizzle schema + migrations, ERC20 contracts, AppError, shared TypeScript types (Wallet, Trade, Market, WalletStats, …). Sub-path exports: common/database, common/contracts, common/errors, common/types. Must be built before backend or frontend. |
packages/backend |
Express API on port 8000. Static singleton services, WebSocket server (socket.io), scoring engine, and the full subgraph ingestion pipeline. |
packages/frontend |
Next.js single-page leaderboard UI. TanStack React Query polling + socket.io WebSocket subscriptions. No client-side routing. |
| Database | Role |
|---|---|
| Supabase / PostgreSQL | Source of truth. All persistent state: markets, trades, wallets, score history. Queried directly via Drizzle ORM — no views, no JSONB, no materialized caches. |
| Redis | Ephemeral operational layer. Never the source of truth — always rebuildable from Postgres. Holds job queues and sync cursors. |
| Source | What is queried | Purpose |
|---|---|---|
| Polymarket Subgraph — trades | orderFilledEvents · marketDatas |
All trade history. Resolves token IDs → condition IDs + outcome indices. Requires GRAPH_API_KEY. |
| Polymarket Subgraph — metadata | questions |
Market titles, creation timestamps (blockTimestamp), neg-risk flags. Requires GRAPH_API_KEY. |
| Alchemy RPC — Polygon mainnet (chain 137) | alchemy_getAssetTransfers |
Wallet's first-ever USDC.e receipt on Polygon — used for the Wallet Age signal (first_tx_at). Requires RPC_URL_137. |
Both subgraph endpoints sit behind The Graph Gateway and use GRAPH_API_KEY as a Bearer token.
All jobs are managed by JobsService and must be started before SubgraphService.init() so consumers are ready before data arrives.
| Job | Type | Interval | What it does |
|---|---|---|---|
| Enrichment Loop | Continuous | BRPOP (1 s timeout) | Pops raw OrderFilled events from raw:events:queue, resolves token IDs → condition IDs via Subgraph, upserts markets / wallets / trades to Postgres, queues wallets for scoring and RPC backfill. |
| Scoring Loop | Continuous | SPOP batch (1 s idle) | Pops up to 250 addresses from scoring:pending, calls score_wallets() PostgreSQL function, writes results to wallets + wallet_scores_history. |
| Realtime Sync | Interval | 1 min | Fetches new orderFilledEvents from the Subgraph since the last cursor, pushes them to raw:events:queue. |
| Volume Refresh | Interval | 5 min | Recomputes total_volume_usdc for every market in Postgres. |
| RPC Backfill Sweep | Interval | 15 min | Pops wallet addresses from rpc:backfill:queue, fetches first_tx_at via Alchemy, updates wallets, re-queues updated wallets for scoring. |
| Rescore Sweep | Interval | 6 hrs | Pushes all Watchlist+ wallets back onto scoring:pending to keep scores current as market lifecycles progress. |
What it is: A Polymarket prediction market, identified by its condition ID (0x…).
Source: Two Subgraph calls — the trades subgraph resolves marketDatas (token ID → condition ID + outcome index); the metadata subgraph resolves questions (title, blockTimestamp, isNegRisk). Fetched during enrichment as new token IDs appear.
Storage: markets table — id (condition ID, text PK), question, started_at, is_neg_risk, total_volume_usdc.
Used for: Entry timing signal (market lifecycle boundaries: started_at → MAX(filled_at)). Volume refresh. Enriching trade records with market context for the UI.
What it is: An EVM address that has placed at least one Polymarket trade.
Source: Extracted from orderFilledEvents.maker. first_tx_at is fetched asynchronously by the RPC backfill sweep via alchemy_getAssetTransfers.
Storage: wallets table — address (unique), current_score, score_tier, first_tx_at, first_trade_at, total_volume_usdc, plus one denormalized column per signal (signal_concentration_pts, signal_market_count_pts, signal_minimum_size_pts, signal_entry_timing_pts, signal_wallet_age_pts). Signal columns are duplicated here so the leaderboard query requires no joins.
Used for: Scoring. Leaderboard ranking.
What it is: A single OrderFilled event from the Polymarket CTF Exchange (0x4bFb41d5B3570DeFd03C39a9A4D8dE6Bd8B8982E).
Source: Polymarket Subgraph (trades endpoint) — orderFilledEvents paginated ascending by ID. Deduplicated on tx_hash.
Storage: trades table — wallet_address, market_id, size_usdc, entry_price (0–1 fraction), side (YES/NO), filled_at, tx_hash.
Used for: Computing all five scoring signals. Market volume aggregation. Trade history panel in the UI.
What it is: A point-in-time snapshot of a wallet's insider score and per-signal breakdown, recorded whenever the score is recomputed.
Source: Produced by the scoring engine after every score_wallets() call.
Storage: wallet_scores_history table — wallet_address, score, one column per signal, trigger (new_trade | cron), recorded_at.
Used for: Score sparkline chart in the wallet detail panel. Auditing why a score changed.
flowchart TD
A["pnpm dev:backend"] --> B
subgraph B["Parallel Init"]
direction LR
C["DatabaseService<br/>Supabase / Postgres"]
D["RedisService<br/>2 named ioredis clients"]
E["ViemService<br/>Polygon RPC"]
F["WSService<br/>socket.io namespaces"]
end
B --> G["JobsService.start()<br/>4 intervals + 2 loops"]
G --> H["Express listen :8000"]
H --> I["SubgraphService.init()<br/>historical bulk sync — paginate all OrderFilled events"]
I --> J["syncMode = realtime<br/>JobsService 1-min sync takes over"]
flowchart TD
SG1["Subgraph<br/>trades"] -->|"orderFilledEvents<br/>bulk on boot + delta 1 min"| Q1[/"raw:events:queue<br/>Redis List"/]
Q1 -->|"BRPOP"| EL["Enrichment Loop"]
SG2["Subgraph<br/>metadata"] -->|"resolve tokenId<br/>→ conditionId + title"| EL
EL -->|"upsert<br/>markets · wallets<br/>trades"| PG[("PostgreSQL")]
EL -->|"SADD"| Q2[/"scoring:pending<br/>Redis Set"/]
EL -->|"SADD"| Q3[/"rpc:backfill:queue<br/>Redis Set"/]
Q3 -->|"sweep 15 min"| RPC["Alchemy RPC<br/>Polygon"]
RPC -->|"first_tx_at"| PG
RPC -->|"re-queue"| Q2
Q2 -->|"SPOP 250"| SL["Scoring Loop"]
SL -->|"score_wallets()<br/>PostgreSQL fn"| PG
PG -->|"every 15 s<br/>+ on subscribe"| STAT["StatsService<br/>→ /stats WS"]
STAT -->|"WS data event<br/>tier distribution"| FE["Frontend"]
PG -->|"REST poll 30 s<br/>GET /wallets"| FE
PG -->|"REST on-demand<br/>GET /wallets/:address"| FE
The frontend polls every 30 seconds for leaderboard data. Wallet detail data is fetched on demand when a row is clicked.
| Endpoint | When | Returns |
|---|---|---|
GET /api/v1/wallets |
Poll every 30 s | Wallet[] — paginated leaderboard sorted by score desc. Filter by ?tier= |
GET /api/v1/wallets/:address |
On wallet select | WalletDetail (Wallet + trade_count, market_count) |
GET /api/v1/wallets/:address/history |
On wallet select | WalletScoresHistory[] — score snapshots for sparkline |
GET /api/v1/wallets/:address/trades |
On wallet select | TradeWithMarket[] — trades + market_question via LEFT JOIN |
GET /healthcheck |
Monitor | { success, timestamp, uptime, stream } |
Rate limit: 100 requests / 60 seconds per IP.
One namespace. Clients emit "subscribe" / "unsubscribe"; server confirms with "subscribed" / "unsubscribed" { success: boolean } then pushes "data" events.
Subscribe once. The server emits a full aggregate snapshot immediately on subscribe (so clients see data before the first tick) and then every 15 seconds.
socket.on("data", {
total_wallets: number,
normal: number,
watchlist: number,
suspicious: number,
flagged: number,
// Per-signal point-bucket counts
conc_25: number,
conc_15: number,
conc_5: number,
conc_0: number,
mkt_25: number,
mkt_15: number,
mkt_5: number,
mkt_0: number,
size_25: number,
size_15: number,
size_5: number,
size_0: number,
timing_25: number,
timing_15: number,
timing_5: number,
timing_0: number,
age_25: number,
age_15: number,
age_5: number,
age_0: number,
age_null: number,
all_max: number, // wallets with all five signals at max (25 pts each)
});Five signals, 25 pts each. Max raw = 125, clamped to 100. Scoring runs entirely inside PostgreSQL via score_wallets(p_addresses text[]) — a set-returning function that joins trades, markets, and wallets in five CTEs and returns one scored row per address. The service layer calls this function and wraps the upsert + history insert in a single transaction.
Final Score = min( H_C + H_MC + H_MS + H_ET + H_WA , 100 ) → 0–100
| Score | Tier |
|---|---|
| 80–100 | 🚨 Flagged Insider |
| 60–79 | |
| 30–59 | 👁️ Watchlist |
| 0–29 | Normal |
A wallet enters scoring:pending when: (1) a new trade is ingested, or (2) the 6-hour rescore sweep runs for all Watchlist+ wallets.
Definition: wallet's USDC in its top market / wallet's total lifetime USDC on Polymarket.
Where top market = the single market with the highest SUM(size_usdc) across all of the wallet's trades — where it spent the most, not where it traded most frequently.
Answers "what fraction of this wallet's own lifetime Polymarket activity is dominated by a single market?" A wallet that put everything into one market scores 100 %; a wallet spread evenly across 50 markets scores ~2 %.
| Volume Concentration | Points |
|---|---|
| > 90 % | 25 |
| 70–90 % | 15 |
| 50–70 % | 5 |
| < 50 % | 0 |
Alternative interpretations not used:
- Market share concentration (
wallet's USDC in market / total USDC volume of that market) — measures whether the wallet is a significant fraction of a thin market's liquidity. A property of the market, not the wallet's behaviour pattern.- Size-gated concentration — only scoring concentration when the top-market position exceeds a threshold (e.g. $1k) to filter out trivial all-in bets. Not implemented — noise from tiny wallets is acceptable at current scale.
Definition: number of distinct Polymarket markets the wallet has ever traded.
Answers "how narrowly focused is this wallet?" Insiders act on specific information about specific events — they trade very few markets. A broad trader spread across many markets is less likely to be acting on inside information.
| Distinct Markets | Points |
|---|---|
| 1 | 25 |
| 2–3 | 15 |
| 4–5 | 5 |
| ≥ 6 | 0 |
Definition: wallet's total lifetime USDC volume across all Polymarket trades.
Answers "did this wallet deploy meaningful capital?" Insiders drop large amounts of money on their bets. Tiny wallets are noise and should score low regardless of other signals.
| Lifetime Volume | Points |
|---|---|
| ≥ $10 000 | 25 |
| ≥ $1 000 | 15 |
| ≥ $100 | 5 |
| < $100 | 0 |
Definition: how far through the market's active lifecycle the wallet first entered its primary market (same top market used for H_C).
market_open = COALESCE(markets.started_at, MIN(all trades in market).filled_at)
market_close = MAX(all trades in market).filled_at -- proxy for resolution
wallet_entry = MIN(wallet's trades in primary market).filled_at
entry_ratio = (wallet_entry − market_open) / (market_close − market_open)
Answers "did this wallet enter late, when insiders typically act?" A wallet that opens a position in the last 10 % of a market's active life is highly suspicious. Early entrants (before the 50 % mark) are not penalized.
If market_open = market_close (single-timestamp market) → 0 pts (undefined, treated as neutral).
| Entry into Market Lifecycle | Points |
|---|---|
| > 90 % | 25 |
| > 70 % | 15 |
| > 50 % | 5 |
| ≤ 50 % or undefined | 0 |
Resolution proxy:
market_close = MAX(filled_at)is an accurate proxy for resolved markets (trading stops at resolution). For still-open markets it tracks the most recent trade, making scores for those markets less reliable. Since insider detection is primarily retrospective this is acceptable.
Definition: gap between the wallet's first-ever USDC.e receipt on Polygon and its first Polymarket trade.
Answers "was this wallet created specifically to make this trade?" A wallet funded and trading within hours is a strong insider signal. Established wallets with a long history of on-chain activity are less suspicious.
Sourced via alchemy_getAssetTransfers (RPC_URL_137). Stored as first_tx_at on the wallets table. Scores 0 if first_tx_at IS NULL — RPC unavailable, or wallet never received USDC.e on Polygon before its first trade.
| Gap: first USDC.e receipt → first Polymarket trade | Points |
|---|---|
| < 1 hour | 25 |
| 1–24 hours | 15 |
| 1–7 days | 5 |
| > 7 days or NULL | 0 |
Two additional signals were prototyped in the original scoring function but removed when the schema was simplified.
Definition: who funded this wallet's USDC on Polygon, and whether that funder is already flagged.
Answers "is this wallet part of a known funding cluster?" A wallet that was directly funded by a flagged insider is almost certainly related.
| Condition | Points |
|---|---|
Funded directly by a flagged wallet |
30 |
Shares a funder with a suspicious/flagged wallet |
20 |
| Funded by an unrecognised non-CEX wallet | 10 |
| CEX-funded or funder unknown | 0 |
Additionally, being funded by a flagged wallet triggered a hard override floor — the final score could not go below 70 regardless of other signals.
The original H_ET also had a 1.5× retroactive multiplier: if the wallet bet YES at a price < 0.30 on a market that resolved YES, or bet NO at a price > 0.70 on a market that resolved NO, the raw timing score was multiplied by 1.5 (capped at 45 pts). This rewards cases where the wallet not only entered late but also took a contrarian position at long odds — and was right.
- Node.js ≥ 20
- pnpm ≥ 9
- A Supabase project (PostgreSQL)
- A Redis instance
- The Graph API key (
GRAPH_API_KEY) - An Alchemy RPC URL for Polygon mainnet (
RPC_URL_137)
Copy the example files and fill in values:
cp packages/backend/.env.example packages/backend/.env
cp packages/frontend/.env.example packages/frontend/.env.localpackages/backend/.env — see packages/backend/.env.example
| Variable | Required | Description |
|---|---|---|
DATABASE_URL |
Yes | PostgreSQL connection string (Supabase) |
REDIS_URL |
Yes | Redis connection string |
GRAPH_API_KEY |
Yes | The Graph Gateway API key |
RPC_URL_137 |
Yes | Alchemy RPC URL for Polygon mainnet |
PORT |
No | Express port (default 8000) |
NODE_ENV |
No | development | production |
LOG_LEVEL |
No | Comma-separated log levels |
packages/frontend/.env.local — see packages/frontend/.env.example
| Variable | Required | Description |
|---|---|---|
NEXT_PUBLIC_API_BASE_URL |
Yes | Backend base URL (e.g. http://localhost:8000) |
pnpm install
# Development
pnpm dev:backend # builds common, starts backend with watch mode
pnpm dev:frontend # builds common, starts Next.js dev server
# Production build
pnpm build:backend
pnpm build:frontend
# Quality checks
pnpm check # lint + typecheck together
pnpm format:checkApply the Drizzle schema to your Supabase project and run any SQL migrations (including the score_wallets PostgreSQL function):
cd packages/common
pnpm exec drizzle-kit pushDB reset: When wiping the database, also clear both Redis sync cursors — on restart,
subgraph:last_synced_atis used as thetimestamp_gtbound for the bulk query, so a stale timestamp will skip all historical data even ifsubgraph:last_event_idis cleared:redis-cli DEL subgraph:last_event_id subgraph:last_synced_at