
feat: download meeting transcripts via SharePoint Stream #20

Merged: mingnz merged 4 commits into main from feat/meeting-transcripts on May 29, 2026

mingnz (Owner) commented May 29, 2026

What

Adds two commands to download Microsoft Teams meeting transcripts, without using Microsoft Graph (consistent with the repo's design rule):

  • teams recordings <chat> — lists meeting recordings shared in a chat (index, name, date)
  • teams transcript <chat> [index] — downloads a recording's transcript as WebVTT (default), speaker-grouped text, or raw JSON (--format vtt|grouped|json, -o <path>, - for stdout)
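
To make the vtt and grouped formats concrete, here is a minimal sketch of the two conversions. The `Cue` shape (millisecond offsets, `speaker`, `text`) and the function names are assumptions for illustration; the real transcript JSON and the repo's converters may differ.

```typescript
// Hypothetical cue shape; the real transcript JSON may use other field names.
interface Cue {
  speaker: string;
  text: string;
  startMs: number;
  endMs: number;
}

// Format milliseconds as a WebVTT timestamp (HH:MM:SS.mmm).
function vttTimestamp(ms: number): string {
  const pad = (n: number, width: number): string => String(n).padStart(width, "0");
  const hours = Math.floor(ms / 3600000);
  const minutes = Math.floor((ms % 3600000) / 60000);
  const seconds = Math.floor((ms % 60000) / 1000);
  return `${pad(hours, 2)}:${pad(minutes, 2)}:${pad(seconds, 2)}.${pad(ms % 1000, 3)}`;
}

// One WebVTT cue per entry, with the speaker carried in a <v> voice span.
function toVtt(cues: Cue[]): string {
  const blocks = cues.map(
    (c) => `${vttTimestamp(c.startMs)} --> ${vttTimestamp(c.endMs)}\n<v ${c.speaker}>${c.text}`,
  );
  return ["WEBVTT", ...blocks].join("\n\n") + "\n";
}

// Merge consecutive cues from the same speaker into one paragraph.
function toGrouped(cues: Cue[]): string {
  const groups: { speaker: string; lines: string[] }[] = [];
  for (const cue of cues) {
    const last = groups[groups.length - 1];
    if (last && last.speaker === cue.speaker) {
      last.lines.push(cue.text);
    } else {
      groups.push({ speaker: cue.speaker, lines: [cue.text] });
    }
  }
  return groups.map((g) => `${g.speaker}:\n${g.lines.join(" ")}`).join("\n\n") + "\n";
}
```

Grouping only merges adjacent cues, so a speaker who talks again later starts a new paragraph and the conversation order is preserved.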

How it works

Teams stores recordings/transcripts in SharePoint/Stream, so this adds a fourth API surface:

  1. parseRecordings() scans the meeting chat for RichText/Media_CallRecording messages (a <URIObject> carrying a SharePoint sharing link).
  2. resolveDriveItem(): GET /_api/v2.0/shares/u!{base64url}/driveItem to obtain {driveId, itemId}.
  3. getTranscriptMetadata(): GET /_api/v2.1/drives/.../items/...?$expand=media/transcripts to obtain the temporaryDownloadUrl.
  4. downloadTranscriptJson() fetches it with ?format=json; converters produce VTT / grouped text.
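
The u!{base64url} segment in step 2 follows the usual sharing-URL encoding: base64-encode the link, strip the = padding, swap / for _ and + for -, and prefix u!. A minimal sketch, assuming Node's Buffer is available; the helper names and the host parameter are hypothetical and not taken from the repo.

```typescript
// Encode a sharing link into the "u!" token used by the shares endpoint.
function encodeShareToken(sharingUrl: string): string {
  const base64 = Buffer.from(sharingUrl, "utf8").toString("base64");
  const base64url = base64
    .replace(/=+$/, "") // drop trailing padding
    .replace(/\//g, "_") // make the value URL-safe
    .replace(/\+/g, "-");
  return `u!${base64url}`;
}

// Build the step 2 request URL (hypothetical helper; the host is assumed to be
// the SharePoint host that serves the recording).
function driveItemUrl(host: string, sharingUrl: string): string {
  return `https://${host}/_api/v2.0/shares/${encodeShareToken(sharingUrl)}/driveItem`;
}
```

The token is reversible, which makes it easy to confirm in a debugger which link a failing request was built from.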

SharePoint token acquisition

SharePoint tokens aren't in localStorage (the Stream player holds them in memory). On the first transcript request for a host, the recording is therefore opened headlessly in the persistent browser profile, and the Bearer token is intercepted from the player's /_api/ request and cached per host in tokens.sharepoint[host].
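
The per-host cache can be pictured roughly as below. This is a sketch under assumptions: only the tokens.sharepoint[host] mapping comes from the description above, while the `TokenStore` type, the function names, and the `acquire` callback (standing in for the headless-browser interception) are hypothetical.

```typescript
// Hypothetical shape of the persisted token store; only the
// sharepoint[host] mapping is taken from the PR description.
interface TokenStore {
  sharepoint?: Record<string, string>;
}

// Look up a previously intercepted token for this SharePoint host.
function getCachedToken(tokens: TokenStore, host: string): string | undefined {
  return tokens.sharepoint?.[host];
}

// Remember a freshly intercepted token for this host.
function cacheToken(tokens: TokenStore, host: string, token: string): void {
  tokens.sharepoint = { ...tokens.sharepoint, [host]: token };
}

// Return the cached token, or run the (expensive) acquisition once and cache it.
// `acquire` is a placeholder for the headless-browser step.
async function getSharePointToken(
  tokens: TokenStore,
  host: string,
  acquire: (host: string) => Promise<string>,
): Promise<string> {
  const cached = getCachedToken(tokens, host);
  if (cached) return cached;
  const fresh = await acquire(host);
  cacheToken(tokens, host, fresh);
  return fresh;
}
```

Keying by host means a transcript stored on a different SharePoint host triggers its own acquisition instead of reusing a token that would be rejected.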

Testing

  • 16 new unit + mocked-fetch tests (converters, recording parser, filename helper, the three API functions). All 93 tests pass; tsc + biome clean.
  • Verified end-to-end against a real meeting: recordings lists 3 recordings; transcript produces valid VTT (313 cues), grouped text, and raw JSON. Old recordings with cleaned-up share links fail gracefully with a clear message.

Notes

  • Added a top-level error handler so commands print a clean message instead of a stack trace.
  • Docs updated: CLAUDE.md, ARCHITECTURE.md, README.md, and the skill.
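
A top-level handler of the kind mentioned in the first note might look like the sketch below. The names `formatCliError` and `runCli` are hypothetical, and the actual handler in index.ts may format messages differently.

```typescript
// Reduce any thrown value to a single readable line.
function formatCliError(err: unknown): string {
  const message = err instanceof Error ? err.message : String(err);
  return `Error: ${message}`;
}

// Run the CLI entry point; on failure print the message only (no stack trace)
// and signal failure through the exit code.
async function runCli(main: () => Promise<void>): Promise<void> {
  try {
    await main();
  } catch (err) {
    console.error(formatCliError(err));
    process.exitCode = 1;
  }
}
```

Setting process.exitCode rather than calling process.exit() lets pending output flush before the process ends.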

Add `recordings` and `transcript` commands to list meeting recordings
shared in a chat and download their transcripts, without using Graph.

- recordings: parse RichText/Media_CallRecording messages (URIObject with
  a SharePoint sharing link); list with index, name, and date
- transcript: resolve the recording (shares API -> media/transcripts
  expand -> JSON download) and convert to WebVTT, speaker-grouped text, or
  raw JSON (--format vtt|grouped|json, -o output, '-' for stdout)
- SharePoint tokens aren't in localStorage; acquire on demand by opening
  the recording headlessly and intercepting the Bearer token off the
  player's /_api/ request, cached per host in tokens.sharepoint[host]
- index.ts: print a clean error message instead of a raw stack trace
- unit + mocked-fetch tests for converters, parser, and API functions
- docs: CLAUDE.md, ARCHITECTURE.md, README.md, skill

github-actions (bot) commented May 29, 2026

Coverage Report

| Status | Category | Percentage | Covered / Total |
|---|---|---|---|
| 🔵 | Lines | 90.96% | 282 / 310 |
| 🔵 | Statements | 90.51% | 315 / 348 |
| 🔵 | Functions | 92.72% | 51 / 55 |
| 🔵 | Branches | 75.55% | 204 / 270 |

File Coverage (changed files)

| File | Stmts | Branches | Functions | Lines | Uncovered Lines |
|---|---|---|---|---|---|
| src/api.ts | 100% | 81.94% | 100% | 100% | |
| src/auth.ts | 56.33% | 47.16% | 66.66% | 55.93% | 32, 144-160, 275, 280-282, 292-311 |
| src/config.ts | 100% | 100% | 100% | 100% | |
| src/formatting.ts | 98.76% | 82.75% | 100% | 98.56% | 199, 228 |

Generated in workflow #71 for commit 6639bd4 by the Vitest Coverage Report Action

mingnz added 3 commits May 29, 2026 13:06
Acquiring a SharePoint token by navigating straight to the recording's
share link fails from a clean login: the browser profile has no
SharePoint session (the Teams login never visits SharePoint), so the
:v: link returns 'cannot access' and no /_api/ call fires.

Navigate to the host root first, which triggers the SSO login redirect
that establishes the session; the OneDrive/site SPA then makes an
authenticated /_api/ call we can intercept. Keep the recording link as a
fallback warmup, and ignore sub-100-char placeholder auth headers.
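
The placeholder filter described in this commit could be sketched as follows. Assumptions: header names arrive lower-cased (as browser automation libraries such as Playwright report them), the 100-character threshold is applied to the whole header value, and `extractBearerToken` is a hypothetical name.

```typescript
// Threshold from the commit message: shorter auth headers are placeholders.
const MIN_AUTH_HEADER_LENGTH = 100;

// Return the Bearer token from an intercepted request's headers, or null when
// the header is missing, is not a Bearer credential, or is too short to be real.
function extractBearerToken(headers: Record<string, string>): string | null {
  const auth = headers["authorization"];
  if (!auth || auth.length < MIN_AUTH_HEADER_LENGTH) return null;
  const match = /^Bearer\s+(\S+)$/i.exec(auth.trim());
  const token = match?.[1];
  return token ?? null;
}
```

An interception loop would call this on each /_api/ request and stop at the first non-null result, so placeholder headers from early page loads are skipped.
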

mingnz merged commit 9d5b020 into main May 29, 2026
4 checks passed
mingnz deleted the feat/meeting-transcripts branch May 29, 2026 01:42

🎉 This PR is included in version 1.1.0 🎉

The release is available on:

Your semantic-release bot 📦🚀
