Add 10 trending model architectures across all task families #67

Merged: divyasinghds merged 5 commits into develop from refactor_model_zoo on May 19, 2026

@divyasinghds (Contributor) commented May 18, 2026

Summary

Closes the most prominent gaps surfaced by the late-2025 architecture audit. Each new file follows the existing single-file template (framework metadata, main_class / main_method, task-specific fields) plus the new license field convention.

New models

| Task | File | Source / loader |
| --- | --- | --- |
| image_classification | convnext_v2.py | timm.create_model("convnextv2_tiny") |
| image_classification | dinov3.py | transformers AutoModel (frozen backbone + linear head) |
| object_detection | rt_detr.py | transformers.RTDetrForObjectDetection |
| semantic_segmentation | mask2former.py | transformers.Mask2FormerForUniversalSegmentation |
| keypoint_detection | vitpose.py | transformers.VitPoseForPoseEstimation |
| text_classification | modernbert.py | answerdotai/ModernBERT-base |
| text_classification | eurobert.py | EuroBERT/EuroBERT-210m (multilingual, trust_remote_code) |
| tabular_classification | mitra.py | autogluon/mitra-classifier (foundation model) |
| tabular_regression | mitra.py | autogluon/mitra-regressor |
| time_series_forecasting | chronos_bolt.py | amazon/chronos-bolt-base (T5 via transformers) |
| time_to_event_prediction | random_survival_forest.py | sksurv.ensemble.RandomSurvivalForest |

CLAUDE.md updates

  • Adds optional license SPDX-style metadata field so downstream tooling can filter models by license.
  • New "Federated averaging conventions" section: BatchNorm handling, EMA buffers, LoRA-only fine-tuning for foundation models.
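
A minimal sketch of how downstream tooling might filter on the new field, assuming each template declares a module-level string named `license` that holds an SPDX identifier. The helper names, the example file names, and the allowlist are hypothetical and not part of this PR. The file is parsed rather than imported, so no model weights are loaded.

```python
import ast
from typing import Dict, List, Optional, Set


def read_license(source: str) -> Optional[str]:
    """Return the module-level `license` string, or None if it is absent."""
    for node in ast.parse(source).body:
        if not isinstance(node, ast.Assign):
            continue
        for target in node.targets:
            if isinstance(target, ast.Name) and target.id == "license":
                value = node.value
                if isinstance(value, ast.Constant) and isinstance(value.value, str):
                    return value.value
    return None


def filter_by_license(sources: Dict[str, str], allowed: Set[str]) -> List[str]:
    """Keep file names whose declared license is in `allowed`.

    Files without a `license` field are dropped: the field is optional,
    and a missing value cannot be assumed to be permissive.
    """
    return sorted(name for name, src in sources.items() if read_license(src) in allowed)


# Hypothetical template sources, for illustration only.
EXAMPLE_SOURCES = {
    "example_a.py": 'framework = "pytorch"\nlicense = "Apache-2.0"\n',
    "example_b.py": 'framework = "pytorch"\nlicense = "CC-BY-NC-4.0"\n',
    "example_c.py": 'framework = "pytorch"\n',
}

permissive = filter_by_license(EXAMPLE_SOURCES, {"Apache-2.0", "MIT"})
print(permissive)
```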

Infra prerequisites (separate PRs in tracebloc-client / averaging-service)

  • DINOv3 requires transformers >= 4.56. Existing infra is pinned to 4.51.3; that bump is being handled separately. If shipping ahead of the bump, swap DINOv3 → DINOv2.
  • All other models work on transformers == 4.51.3.
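
The DINOv3 → DINOv2 fallback could be expressed as a version gate. This is a sketch only: the function names are hypothetical, the gate returns a family name rather than a concrete checkpoint id, and pre-release suffixes such as `.dev0` are ignored when comparing.

```python
from typing import Tuple

# Minimum transformers release that supports DINOv3, per the note above.
DINOV3_MIN_TRANSFORMERS = (4, 56)


def parse_version(version: str) -> Tuple[int, ...]:
    """Parse the leading numeric components of a version such as '4.51.3'."""
    parts = []
    for piece in version.split("."):
        digits = ""
        for ch in piece:
            if not ch.isdigit():
                break
            digits += ch
        if not digits:
            break  # stop at the first non-numeric component, e.g. 'dev0'
        parts.append(int(digits))
    return tuple(parts)


def pick_dino_family(transformers_version: str) -> str:
    """Return 'dinov3' when the installed transformers is new enough, else 'dinov2'."""
    if parse_version(transformers_version) >= DINOV3_MIN_TRANSFORMERS:
        return "dinov3"
    return "dinov2"


print(pick_dino_family("4.51.3"))
print(pick_dino_family("4.56.0"))
```

In the training container the argument would be `transformers.__version__`.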

Test plan

  • CI passes (ruff, pytest matrix across pytorch / tensorflow / sklearn / survival jobs).
  • Each new file passes tests/test_model_contract.py in the pytorch job (the CI matrix installs latest transformers, which has RT-DETR + ViTPose).
  • In the training container (transformers==4.51.3), run `from tracebloc_package import User; User().uploadModel(<path>)` for each new file to confirm the SDK forward-pass check.
  • Verify averaging-service can deserialize a checkpoint from one of the foundation models (Mitra or Chronos-Bolt) when LoRA-only fine-tuning is enabled.
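
The contents of tests/test_model_contract.py are not shown in this PR. The sketch below illustrates the kind of static check such a test could make, assuming the template declares a module-level `framework` string and a `main_class` or `main_method` string that names a class or function defined in the same file. The field name `framework` and the helper `contract_problems` are assumptions.

```python
import ast
from typing import List


def contract_problems(source: str) -> List[str]:
    """Return a list of contract violations found in a template's source."""
    assigned = {}    # module-level NAME = <constant> assignments
    defined = set()  # module-level class and function names
    for node in ast.parse(source).body:
        if isinstance(node, (ast.ClassDef, ast.FunctionDef)):
            defined.add(node.name)
        elif isinstance(node, ast.Assign) and isinstance(node.value, ast.Constant):
            for target in node.targets:
                if isinstance(target, ast.Name):
                    assigned[target.id] = node.value.value

    problems = []
    if "framework" not in assigned:
        problems.append("missing framework")
    entry = assigned.get("main_class") or assigned.get("main_method")
    if entry is None:
        problems.append("missing main_class / main_method")
    elif entry not in defined:
        problems.append(f"entry point {entry!r} is not defined in the file")
    return problems


# Hypothetical template sources, for illustration only.
GOOD = 'framework = "pytorch"\nmain_class = "MyModel"\nclass MyModel:\n    pass\n'
BAD = 'framework = "pytorch"\nmain_class = "MyModel"\nclass Other:\n    pass\n'

print(contract_problems(GOOD))
print(contract_problems(BAD))
```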

🤖 Generated with Claude Code


Note

Medium Risk
Mostly additive model templates, but several new entries load Hugging Face models with trust_remote_code=True and introduce new third-party dependencies/model IDs that can affect security review and runtime compatibility.

Overview
Adds a set of new single-file model templates across task families (image classification, detection, segmentation, keypoints, text, tabular, time-series, and survival), primarily by loading pretrained backbones from transformers (plus timm for ConvNeXtV2) and exposing them via the existing main_class/main_method contract.

Updates CLAUDE.md to recommend a new license metadata field and to document federated-averaging-oriented authoring conventions (BatchNorm/EMA handling and LoRA-only fine-tuning for large backbones).
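
The PR names the federated-averaging conventions without spelling them out. As a rough sketch under stated assumptions: "BatchNorm handling" is read here as keeping BatchNorm running statistics local to each client, and "LoRA-only" as averaging only parameters whose names contain `lora_`. Both readings, and every name below, are assumptions rather than the CLAUDE.md text. EMA buffers are not covered. Plain lists of floats stand in for tensors.

```python
from typing import Dict, List

State = Dict[str, List[float]]

# PyTorch's buffer names for BatchNorm running statistics.
BN_BUFFER_SUFFIXES = ("running_mean", "running_var", "num_batches_tracked")


def is_shared(name: str, lora_only: bool) -> bool:
    """Decide whether a parameter takes part in averaging (assumed policy)."""
    if name.endswith(BN_BUFFER_SUFFIXES):
        return False  # assumption: BatchNorm statistics stay on the client
    if lora_only:
        return "lora_" in name  # assumption: only adapter weights are exchanged
    return True


def federated_average(states: List[State], weights: List[float], lora_only: bool = False) -> State:
    """Weighted average of the shared parameters across client states."""
    total = sum(weights)
    averaged: State = {}
    for name in states[0]:
        if not is_shared(name, lora_only):
            continue
        size = len(states[0][name])
        averaged[name] = [
            sum(w * s[name][i] for s, w in zip(states, weights)) / total
            for i in range(size)
        ]
    return averaged


# Two hypothetical clients, weighted by their sample counts.
client_a = {"head.weight": [1.0, 2.0], "bn.running_mean": [0.5], "enc.lora_A.weight": [0.0]}
client_b = {"head.weight": [3.0, 6.0], "bn.running_mean": [1.5], "enc.lora_A.weight": [4.0]}

full = federated_average([client_a, client_b], [1.0, 3.0])
adapters = federated_average([client_a, client_b], [1.0, 3.0], lora_only=True)
print(full)
print(adapters)
```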

Reviewed by Cursor Bugbot for commit 4555c12. Bugbot is set up for automated code reviews on this repo.

divyasinghds and others added 2 commits May 18, 2026 12:54
Closes the most prominent gaps surfaced by the late-2025 architecture
audit. Each file follows the existing single-file template (framework
metadata, main_class / main_method, task-specific fields) and adds a
`license` field per the new CLAUDE.md convention.

- image_classification: convnext_v2, dinov3 (frozen backbone + linear head)
- object_detection: rt_detr (first transformer detector in the zoo)
- semantic_segmentation: mask2former (universal segmentation)
- keypoint_detection: vitpose (first transformer pose model)
- text_classification: modernbert, eurobert (multilingual)
- tabular_classification + tabular_regression: mitra (foundation model)
- time_series_forecasting: chronos_bolt (foundation model)
- time_to_event_prediction: random_survival_forest

CLAUDE.md gains a `license` metadata field and a federated-averaging
conventions section (BatchNorm handling, EMA buffers, LoRA-only for
foundation models).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
@divyasinghds self-assigned this May 18, 2026

@LukasWodka (Contributor)

👋 Heads-up — Code review queue is at 16 / 8

Above the WIP limit. The team convention is to review existing PRs before opening new work.

Open PRs currently in Code review (oldest first):

Pull from review before opening new work. (This is a nudge from the kanban WIP check, not a block.)

Comment thread model_zoo/semantic_segmentation/pytorch/fcn.py Outdated
Comment thread model_zoo/tabular_classification/pytorch/mitra.py Outdated
The backend's model-upload bandit gate (TBT001) now allows
trust_remote_code=True only when the model id is a STRING LITERAL at
the from_pretrained call site, matched against a small vetted-repos
allowlist (see tracebloc/backend follow-up PR).

Move the EuroBERT and Mitra repo ids from a module-level `model_id`
variable into the from_pretrained() call itself so they pass the new
check. Behaviour is unchanged; only the AST shape moves.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
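
The string-literal rule described in the commit above can be illustrated with Python's `ast` module. This is a sketch, not the backend's TBT001 implementation: the allowlist contents and the function name are hypothetical, and any `trust_remote_code` value other than a literal `False` is conservatively treated as enabling remote code.

```python
import ast
from typing import List

# Hypothetical allowlist; the real one lives in the backend and is not shown in this PR.
VETTED_REPOS = {"EuroBERT/EuroBERT-210m", "autogluon/mitra-classifier"}


def trust_remote_code_violations(source: str) -> List[int]:
    """Return line numbers of from_pretrained calls that enable remote code
    without a vetted string-literal model id as the first positional argument."""
    violations = []
    for node in ast.walk(ast.parse(source)):
        if not isinstance(node, ast.Call):
            continue
        func = node.func
        if not (isinstance(func, ast.Attribute) and func.attr == "from_pretrained"):
            continue
        trusts = any(
            kw.arg == "trust_remote_code"
            and not (isinstance(kw.value, ast.Constant) and kw.value.value is False)
            for kw in node.keywords
        )
        if not trusts:
            continue
        first = node.args[0] if node.args else None
        literal_ok = (
            isinstance(first, ast.Constant)
            and isinstance(first.value, str)
            and first.value in VETTED_REPOS
        )
        if not literal_ok:
            violations.append(node.lineno)
    return sorted(violations)


# The "before" shape (module-level variable) fails; the "after" shape (inline literal) passes.
BEFORE = 'model_id = "EuroBERT/EuroBERT-210m"\nAutoModel.from_pretrained(model_id, trust_remote_code=True)\n'
AFTER = 'AutoModel.from_pretrained("EuroBERT/EuroBERT-210m", trust_remote_code=True)\n'

print(trust_remote_code_violations(BEFORE))
print(trust_remote_code_violations(AFTER))
```

The source is only parsed, never executed, which is why a variable that happens to hold a vetted id still fails: the check looks at the shape of the call, not its runtime value.
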
fix(model-zoo): correct FCNResNet input channels and wire mitra num_classes

- semantic_segmentation/pytorch/fcn.py: FCNResNet.__init__ defaulted
  n_channels to image_size (256), which made the first Conv2d expect
  256-channel input instead of 3-channel RGB. Reverted to n_channels=3
  to match the other FCN class in the same file.
- tabular_classification/pytorch/mitra.py: MyModel accepted num_classes
  but never forwarded it; AutoModel.from_pretrained now receives it as
  num_classes so the Mitra config picks up the requested head size.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

@cursor (Bot) left a comment

Cursor Bugbot has reviewed your changes and found 2 potential issues.

Reviewed by Cursor Bugbot for commit 5c6c95a.

Comment thread model_zoo/time_series_forecasting/pytorch/chronos_bolt.py Outdated
Comment thread model_zoo/semantic_segmentation/pytorch/hrnet.py Outdated
- time_series_forecasting/pytorch/chronos_bolt.py: amazon/chronos-bolt-base
  ships a custom ChronosBoltModelForForecasting architecture; without
  trust_remote_code=True, transformers falls back to a plain T5 head,
  silently loading the wrong model. Mirrors the pattern already used by
  mitra.py and eurobert.py.
- semantic_segmentation/pytorch/hrnet.py: HRNet.__init__ stored
  input_size/batch_size as instance attrs but never read them; duplicates
  module-level metadata and adds confusion about where config lives.
  Removed both params.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
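
Because the failure described in the commit above is silent, a template could assert which class was actually instantiated after loading. A minimal sketch, assuming the expected class name is known in advance; the helper name is hypothetical, and the two empty classes are stand-ins so the example runs without downloading a model.

```python
def check_loaded_architecture(model, expected: str) -> None:
    """Raise if the loaded model's class name differs from the expected one."""
    actual = type(model).__name__
    if actual != expected:
        raise RuntimeError(f"expected {expected}, but loaded {actual}")


# Stand-in classes for illustration; real code would pass the object
# returned by from_pretrained.
class ChronosBoltModelForForecasting:
    pass


class T5ForConditionalGeneration:
    pass


# Passes: the custom architecture was loaded.
check_loaded_architecture(ChronosBoltModelForForecasting(), "ChronosBoltModelForForecasting")

# Fails loudly instead of silently training the wrong model.
try:
    check_loaded_architecture(T5ForConditionalGeneration(), "ChronosBoltModelForForecasting")
except RuntimeError as err:
    print(err)
```
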
@divyasinghds merged commit 773c0e1 into develop May 19, 2026
1 check passed
@divyasinghds deleted the refactor_model_zoo branch May 19, 2026 07:54
shujaatTracebloc pushed a commit that referenced this pull request May 22, 2026
* fixed hrnet

* feat: add 10 trending model architectures across all task families

* fix(security): inline trust_remote_code model ids as string literals

* fix(model-zoo): correct FCNResNet input channels and wire mitra num_classes

* fix(model-zoo): chronos_bolt trust_remote_code, drop hrnet dead attrs

---------

Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>