feat: --extended_orf_analysis routes hybrid GTF into genome-BAM ORF callers #183

Draft
pinin4fjords wants to merge 6 commits into feat/164-stringtie-novel from feat/165-extended-orf-analysis
@pinin4fjords

Member

Summary

Adds --extended_orf_analysis (default false) to opt into novel-ORF discovery on the hybrid annotation from #164. When the flag is on and a novel-transcript source is configured (--skip_stringtie false or --novel_gtf <path>), the hybrid GTF is routed to:

- Ribo-TISH predict (hybrid GTF on -g)
- Ribotricer prepare-orfs (hybrid GTF passed directly)

RiboCode, riboWaltz, plastid, and Salmon-based quantification stay on the canonical backbone regardless of the flag. RiboCode in particular is gated by the transcriptome-BAM constraint and gets its own hybrid path in #171.
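
The per-tool routing can be pictured with a small Python sketch. This is not the pipeline's Nextflow code: the function name, the lowercase tool keys, and the "hybrid"/"canonical" labels are hypothetical, and only the grouping of tools follows this PR (Ribo-TISH and Ribotricer take the hybrid GTF; everything else stays canonical).

```python
# Illustrative sketch only; names and labels are hypothetical.
# Tool groupings follow the PR description and commit message.
HYBRID_AWARE_TOOLS = {"ribotish", "ribotricer"}
CANONICAL_ONLY_TOOLS = {"ribocode", "ribowaltz", "plastid", "salmon"}


def annotation_for(tool: str, extended_active: bool) -> str:
    """Return which annotation a tool should receive: 'hybrid' or 'canonical'."""
    name = tool.lower()
    if name not in HYBRID_AWARE_TOOLS | CANONICAL_ONLY_TOOLS:
        raise ValueError(f"unknown tool: {tool}")
    # Only the genome-BAM ORF callers switch, and only on the extended path.
    if extended_active and name in HYBRID_AWARE_TOOLS:
        return "hybrid"
    return "canonical"
```

With the extended path inactive, every tool in the sketch resolves to the canonical annotation, which mirrors the "default behaviour is unchanged" guarantee.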

When --extended_orf_analysis true is set without a novel-transcript source, the pipeline warns and falls back to canonical (no-op rather than error, so users can compose flags incrementally).
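
The warn-and-fall-back rule could be expressed roughly as follows. This is a sketch under assumptions, not the pipeline's implementation: the function name and warning text are hypothetical, and the parameters simply mirror the three flags named above.

```python
import warnings
from typing import Optional


def extended_path_active(
    extended_orf_analysis: bool,
    skip_stringtie: bool,
    novel_gtf: Optional[str],
) -> bool:
    """Decide whether the hybrid GTF should be used (hypothetical helper)."""
    # A novel-transcript source exists if StringTie runs or a GTF was supplied.
    has_novel_source = (not skip_stringtie) or bool(novel_gtf)
    if extended_orf_analysis and not has_novel_source:
        # No-op rather than error: warn and stay on the canonical annotation.
        warnings.warn(
            "--extended_orf_analysis set without a novel-transcript source; "
            "falling back to the canonical annotation"
        )
        return False
    return extended_orf_analysis
```

Returning a boolean instead of raising keeps partially composed flag sets runnable, which is the design choice the PR describes.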

Default behaviour is unchanged

With --extended_orf_analysis false, the workflow graph is identical to PR6 (feat/164-stringtie-novel).

Stacked PR notes

Seventh in the stack splitting #174. Targets #182 (feat/164-stringtie-novel).

Closes #165

🤖 Generated with Claude Code

…allers (#165)

Adds --extended_orf_analysis (default false). When enabled and a
novel-transcript source is configured (--skip_stringtie false or
--novel_gtf), the hybrid GTF from #164 is passed to:

- Ribo-TISH predict: hybrid GTF on -g. The optional -a secondary
  annotation is left empty in the extended path to avoid a known
  Ribo-TISH NoneType+=int bug (zhpn1024/ribotish#33, #24) hit when
  the hybrid and secondary annotations share CDS rows; the hybrid
  GTF preserves canonical CDS so background calibration is intact.
- Ribotricer prepare-orfs: hybrid GTF directly. Ribotricer has no
  secondary-annotation concept; CDS-absent novel transcripts are
  auto-labelled 'novel' in its ORF_type column.

RiboCode, riboWaltz, plastid and Salmon-based quantification stay
on the canonical backbone regardless of the flag - transcriptome-BAM
constraint, addressed separately in #171.

When --extended_orf_analysis true is set without a novel-transcript
source, the pipeline warns and falls back to canonical so users can
compose flags incrementally.

Default behaviour is unchanged: with --extended_orf_analysis false
the workflow graph is identical to the pre-#165 state.
@nf-core-bot

Member

Warning

Newer version of the nf-core template is available.

Your pipeline is using an old version of the nf-core template: 3.5.1.
Please update your pipeline to the latest version.

For more documentation on how to update your pipeline, please see the nf-core documentation and Synchronisation documentation.

2 participants