feat: --extended_orf_analysis routes hybrid GTF into genome-BAM ORF callers #183
Draft
pinin4fjords wants to merge 6 commits into
…allers (#165)
Adds --extended_orf_analysis (default false). When enabled and a novel-transcript source is configured (--skip_stringtie false or --novel_gtf), the hybrid GTF from #164 is passed to:
- Ribo-TISH predict: hybrid GTF on -g. The optional -a secondary annotation is left empty in the extended path to avoid a known Ribo-TISH NoneType+=int bug (zhpn1024/ribotish#33, #24) hit when the hybrid and secondary annotations share CDS rows; the hybrid GTF preserves canonical CDS so background calibration is intact.
- Ribotricer prepare-orfs: hybrid GTF directly. Ribotricer has no secondary-annotation concept; CDS-absent novel transcripts are auto-labelled 'novel' in its ORF_type column.
RiboCode, riboWaltz, plastid and Salmon-based quantification stay on the canonical backbone regardless of the flag (transcriptome-BAM constraint, addressed separately in #171).
When --extended_orf_analysis true is set without a novel-transcript source, the pipeline warns and falls back to canonical so users can compose flags incrementally.
Default behaviour is unchanged: with --extended_orf_analysis false the workflow graph is identical to the pre-#165 state.
Warning: Newer version of the nf-core template is available. Your pipeline is using an old version of the nf-core template: 3.5.1. For more documentation on how to update your pipeline, please see the nf-core documentation and Synchronisation documentation.
Summary
Adds `--extended_orf_analysis` (default `false`) to opt into novel-ORF discovery on the hybrid annotation from #164. When the flag is on and a novel-transcript source is configured (`--skip_stringtie false` or `--novel_gtf <path>`), the hybrid GTF is routed to:
- Ribo-TISH `predict`: hybrid GTF on `-g`. The optional `-a` secondary annotation (added by #179, "feat(ribotish/predict): wire optional -a reference annotation through pipeline") is left empty in the extended path to avoid a known Ribo-TISH bug (`TypeError: unsupported operand type(s) for +=: 'NoneType' and 'int'`; zhpn1024/ribotish#33, #24) hit when the `-g` and `-a` annotations share CDS rows. The hybrid GTF preserves canonical CDS records, so background calibration is intact.
- Ribotricer `prepare-orfs`: hybrid GTF directly. Ribotricer has no secondary-annotation concept; CDS-absent novel transcripts are auto-labelled `novel` in its `ORF_type` column.
RiboCode, riboWaltz, plastid, and Salmon-based quantification stay on the canonical backbone regardless of the flag. RiboCode in particular is gated by the transcriptome-BAM constraint and gets its own hybrid path in #171.
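The routing rules in this PR, including the warn-and-fall-back behaviour when no novel-transcript source is configured, can be sketched in a few lines. This is an illustrative Python model only, not the pipeline's Nextflow code: the function `route_annotations`, its arguments, and the dictionary keys are hypothetical names chosen for the example, and it assumes a novel-transcript source exists exactly when StringTie is not skipped or a novel GTF path is given.

```python
import warnings
from typing import Dict, Optional

# Hypothetical labels for the tools named in this PR description.
GENOME_BAM_CALLERS = ("ribotish_predict_g", "ribotricer_prepare_orfs")
CANONICAL_ONLY = ("ribocode", "ribowaltz", "plastid", "salmon")


def route_annotations(
    extended_orf_analysis: bool,
    skip_stringtie: bool,
    novel_gtf: Optional[str],
    canonical_gtf: str,
    hybrid_gtf: str,
    secondary_gtf: Optional[str] = None,
) -> Dict[str, Optional[str]]:
    """Return the annotation each tool would receive (illustrative only)."""
    # Assumption: a novel-transcript source means StringTie runs
    # or a novel GTF was supplied by the user.
    has_novel_source = (not skip_stringtie) or (novel_gtf is not None)
    use_hybrid = extended_orf_analysis and has_novel_source

    if extended_orf_analysis and not has_novel_source:
        # No-op rather than an error, so flags can be composed incrementally.
        warnings.warn(
            "extended ORF analysis requested without a novel-transcript "
            "source; falling back to the canonical annotation"
        )

    # Transcriptome-BAM tools always stay on the canonical backbone.
    routes: Dict[str, Optional[str]] = {t: canonical_gtf for t in CANONICAL_ONLY}

    # Genome-BAM ORF callers get the hybrid GTF only in the extended path.
    for tool in GENOME_BAM_CALLERS:
        routes[tool] = hybrid_gtf if use_hybrid else canonical_gtf

    # Ribo-TISH -a is left empty in the extended path so that -g and -a
    # never share CDS rows; otherwise it passes through unchanged.
    routes["ribotish_predict_a"] = None if use_hybrid else secondary_gtf
    return routes
```

Calling `route_annotations(True, False, None, "canonical.gtf", "hybrid.gtf")` models the extended path with StringTie enabled; passing `skip_stringtie=True` with no `novel_gtf` models the fallback, where every tool stays on the canonical annotation and a warning is emitted.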
When `--extended_orf_analysis true` is set without a novel-transcript source, the pipeline warns and falls back to the canonical annotation (a no-op rather than an error, so users can compose flags incrementally).
Default behaviour is unchanged
With `--extended_orf_analysis false`, the workflow graph is identical to PR6 (`feat/164-stringtie-novel`).
Stacked PR notes
Seventh in the stack splitting #174. Targets #182 (`feat/164-stringtie-novel`).
Closes #165
🤖 Generated with Claude Code