feat(output): add optional timed molecule output #2436

Merged
njzjz merged 10 commits into deepmodeling:master from hcustc:feat/timed-molecule-output
May 27, 2026

Conversation

@hcustc

@hcustc hcustc commented May 15, 2026

Contributor

This PR adds optional time-aware molecule output for tracking molecule structures across analyzed frames.

Changes include:

  • add optional frame/timestep columns to .moname via --show-molecule-time
  • support filtering .moname entries by analyzed frame via --molecule-frame
  • support filtering .moname entries by original timestep via --molecule-timestep
  • add optional .reactionevent JSONL output via --reaction-event
  • keep .reactionevent disabled by default to avoid extra per-event overhead

Tests

  • python -m black --check reacnetgenerator/reacnetgen.py reacnetgenerator/_path.py reacnetgenerator/_reaction.py reacnetgenerator/commandline.py tests/test_reacnetgen.py
  • python -m pytest tests/test_reacnetgen.py::TestReacNetGen::test_reaction_event_details tests/test_reacnetgen.py::TestReacNetGen::test_reaction_event_default_is_off tests/test_reacnetgen.py::TestReacNetGen::test_molecule_time_formatting tests/test_reacnetgen.py::TestReacNetGen::test_molecule_time_filter_by_timestep tests/test_reacnetgen.py::TestReacNetGen::test_commandline_help
  • git diff --check

Summary by CodeRabbit

  • New Features

    • New CLI options to emit an optional molecule-timeline CSV (with frame/timestep filters) and to output per-reaction event CSV rows; both outputs are opt-in.
  • Documentation

    • Clarified molecule timeline and name formats, documented molecule-timeline CSV columns, timestep/frame filtering behavior, and reaction-event CSV columns/indexing.
  • Tests

    • Added unit and integration tests for molecule timeline generation, filtering, sorting/merging, and reaction-event CSV output.

@coderabbitai

coderabbitai Bot commented May 15, 2026

Contributor

Note

Reviews paused

It looks like this branch is under active development. To avoid overwhelming you with review comments due to an influx of new commits, CodeRabbit has automatically paused this review. You can configure this behavior by changing the reviews.auto_review.auto_pause_after_reviewed_commits setting.

Use the following commands to manage reviews:

  • @coderabbitai resume to resume automatic reviews.
  • @coderabbitai review to trigger a single review.

Walkthrough

Adds CLI flags and ReacNetGenerator options to emit optional per-molecule timeline CSV and optional per-reaction-event CSV; implements collection helpers and chunked spool merging for molecule timelines, extends reaction processing to emit timed event rows, adds timestep normalization, tests, and documentation.

Changes

Molecule-time and reaction-event output

| Layer | File(s) | Summary |
|---|---|---|
| CLI argument parsing and ReacNetGenerator configuration | reacnetgenerator/commandline.py, reacnetgenerator/reacnetgen.py | Adds --show-molecule-time, --molecule-frame, --molecule-timestep, and --reaction-event CLI flags; wires parsed options into ReacNetGenerator; adds kwargs printmoleculetime, moleculeframes, moleculetimesteps, printreactionevent; adds parm2cmd normalization and filter coercion; tweaks CPU detection. |
| Molecule-time helpers and output formatting | reacnetgenerator/_path.py | Implements _MoleculeTimelineSpool, frame/timestep extraction, filter sets, decision logic for timeline rows, and updates _CollectMolPaths/_CollectSMILESPaths to emit molecule-name lines plus optional filtered timeline CSV rows (chunked buffering and final merge/sort). Also tweaks filename/progress labels. |
| Reaction-event extraction and CSV output | reacnetgenerator/_reaction.py | ReactionsFinder.findreactions() conditionally changes worker payloads to include step index and, when enabled, writes per-event CSV rows (Timestep_Index, Reactant, Product) to reactioneventfilename; _getstepreaction() supports both modes and returns either reaction strings or event dicts. |
| Timestep normalization utility | reacnetgenerator/utils.py | Adds get_timestep_value() to normalize tuple-based or NumPy scalar timestep metadata into a plain Python value. |
| Test coverage for new features | tests/test_reacnetgen.py | Adds unit and integration tests for reaction-event CSV emission and default-off behavior, timestep normalization, molecule timeline formatting/filtering, spool merging, filter normalization, and parm2cmd CLI behavior. |
| User-facing documentation for output files and options | docs/guide/report.md | Fixes a .species typo; documents .moname three-column format, optional .molecules.csv timeline (columns, sort, and how --molecule-frame/--molecule-timestep filter rows), and documents .reactionevent.csv columns (Timestep_Index, Reactant, Product). |
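
To make the reaction-event layout summarized above concrete, here is a minimal sketch that writes rows with the three documented columns (Timestep_Index, Reactant, Product) using the standard csv module. The helper name `write_reaction_events` and the sample events are hypothetical and are not taken from the PR; the actual writer in `_reaction.py` may differ.

```python
import csv
import io

# Column names as documented for .reactionevent.csv in this PR.
FIELDNAMES = ["Timestep_Index", "Reactant", "Product"]


def write_reaction_events(events, handle):
    """Write event dicts as CSV rows (hypothetical helper, not the PR's code)."""
    writer = csv.DictWriter(handle, fieldnames=FIELDNAMES, lineterminator="\n")
    writer.writeheader()
    for event in events:
        writer.writerow(event)


# Made-up events, for illustration only.
events = [
    {"Timestep_Index": 0, "Reactant": "[H][H]", "Product": "[H]"},
    {"Timestep_Index": 3, "Reactant": "[H]", "Product": "[H]O[H]"},
]
buffer = io.StringIO()
write_reaction_events(events, buffer)
csv_text = buffer.getvalue()
```

Writing to an in-memory buffer keeps the sketch self-contained; a real run would open the reaction-event file instead.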

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~75 minutes

Suggested reviewers

  • njzjz
🚥 Pre-merge checks | ✅ 4 | ❌ 1

❌ Failed checks (1 warning)

| Check name | Status | Explanation | Resolution |
|---|---|---|---|
| Docstring Coverage | ⚠️ Warning | Docstring coverage is 38.33%, which is insufficient. The required threshold is 80.00%. | Write docstrings for the functions missing them to satisfy the coverage threshold. |

✅ Passed checks (4 passed)

| Check name | Status | Explanation |
|---|---|---|
| Description Check | ✅ Passed | Check skipped - CodeRabbit’s high-level summary is enabled. |
| Title check | ✅ Passed | The PR title 'feat(output): add optional timed molecule output' directly aligns with the main changes, which add time-aware molecule timeline output and per-reaction-event CSV files via new CLI flags. |
| Linked Issues check | ✅ Passed | Check skipped because no linked issues were found for this pull request. |
| Out of Scope Changes check | ✅ Passed | Check skipped because no linked issues were found for this pull request. |

@codecov

codecov Bot commented May 15, 2026


Codecov Report

❌ Patch coverage is 90.45936% with 27 lines in your changes missing coverage. Please review.
✅ Project coverage is 94.88%. Comparing base (6092d9d) to head (f750792).

| Files with missing lines | Patch % | Lines |
|---|---|---|
| reacnetgenerator/_path.py | 88.29% | 22 Missing ⚠️ |
| reacnetgenerator/commandline.py | 88.00% | 3 Missing ⚠️ |
| reacnetgenerator/_reaction.py | 97.67% | 1 Missing ⚠️ |
| reacnetgenerator/reacnetgen.py | 95.23% | 1 Missing ⚠️ |
Additional details and impacted files
@@            Coverage Diff             @@
##           master    #2436      +/-   ##
==========================================
- Coverage   95.22%   94.88%   -0.35%     
==========================================
  Files          17       17              
  Lines        1528     1758     +230     
==========================================
+ Hits         1455     1668     +213     
- Misses         73       90      +17     

@codspeed-hq

codspeed-hq Bot commented May 15, 2026


Merging this PR will improve performance by 10.6%

⚠️ Unknown Walltime execution environment detected

Using the Walltime instrument on standard Hosted Runners will lead to inconsistent data.

For the most accurate results, we recommend using CodSpeed Macro Runners: bare-metal machines fine-tuned for performance measurement consistency.

⚠️ Different runtime environments detected

Some benchmarks with significant performance changes were compared across different runtime environments,
which may affect the accuracy of the results.

Open the report in CodSpeed to investigate

⚡ 3 improved benchmarks
✅ 6 untouched benchmarks
⏩ 8 skipped benchmarks (see footnote 1)

Performance Changes

| Mode | Benchmark | BASE | HEAD | Efficiency |
|---|---|---|---|---|
| WallTime | test_benchmark_hmm[reacnetgen_param3] | 12.9 ms | 11.7 ms | +10.05% |
| WallTime | test_benchmark_hmm[reacnetgen_param1] | 2.2 ms | 2 ms | +10.7% |
| WallTime | test_benchmark_hmm[reacnetgen_param2] | 2.2 ms | 2 ms | +11.06% |

Comparing hcustc:feat/timed-molecule-output (f750792) with master (6092d9d)

Footnotes

  1. 8 benchmarks were skipped, so the baseline results were used instead. If they were deleted from the codebase, click here and archive them to remove them from the performance reports.

@coderabbitai coderabbitai Bot left a comment

Contributor

Actionable comments posted: 1

🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In `@reacnetgenerator/commandline.py`:
- Around line 275-286: The current checks for presence of numeric filters use
truthiness so numeric 0 is treated as missing; change the conditions that read
if pp.get("moleculeframes", None): and if pp.get("moleculetimesteps", None): to
explicit "is not None" checks (e.g. if pp.get("moleculeframes", None) is not
None:) so that 0 and other falsy but valid values are accepted; keep the rest of
the logic that normalizes to a list (moleculeframes / moleculetimesteps) and
appends flags to commands unchanged.
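
The truthiness problem described above can be reproduced in isolation. The following is a minimal sketch, assuming a simplified stand-in for the relevant part of parm2cmd(); the function name `build_filter_args` is hypothetical, and only the two filter keys and their flags come from this PR.

```python
def build_filter_args(params):
    """Return CLI arguments for the molecule filters in ``params``.

    Hypothetical stand-in for the filter handling in parm2cmd().
    """
    commands = []
    for key, flag in (
        ("moleculeframes", "--molecule-frame"),
        ("moleculetimesteps", "--molecule-timestep"),
    ):
        value = params.get(key)
        # ``if value:`` would skip 0 here, because 0 is falsy in Python.
        if value is not None:
            if not isinstance(value, (list, tuple)):
                value = [value]
            commands.append(flag)
            commands.extend(str(x) for x in value)
    return commands


args_zero = build_filter_args({"moleculeframes": 0, "moleculetimesteps": 0})
args_none = build_filter_args({"moleculeframes": None})
```

With the explicit `is not None` check, frame 0 and timestep 0 survive as valid filters, while a missing or `None` filter still produces no flags.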

ℹ️ Review info
⚙️ Run configuration

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro

Run ID: 509114a2-f26c-462f-9af1-f2730f42aede

📥 Commits

Reviewing files that changed from the base of the PR and between 2c3fa77 and 16ba49d.

📒 Files selected for processing (6)
  • docs/guide/report.md
  • reacnetgenerator/_path.py
  • reacnetgenerator/_reaction.py
  • reacnetgenerator/commandline.py
  • reacnetgenerator/reacnetgen.py
  • tests/test_reacnetgen.py

Comment thread reacnetgenerator/commandline.py Outdated
@hcustc hcustc force-pushed the feat/timed-molecule-output branch from 16ba49d to d01d5a4 Compare May 15, 2026 05:57
@hcustc hcustc force-pushed the feat/timed-molecule-output branch from 3bc3cdc to fdc2507 Compare May 15, 2026 05:59

Copilot AI left a comment

Contributor

Pull request overview

This PR adds optional time-aware molecule output and an optional per-event reaction JSONL output, enabling users to track molecules and reactions across analyzed frames and original timesteps while keeping additional per-event overhead disabled by default.

Changes:

  • Add optional frame/timestep columns to molecule output, plus filtering by analyzed frames or original timesteps.
  • Add optional .reactionevent JSONL output with per-event reaction metadata (frame/timestep range + atom ids).
  • Extend CLI and tests to cover the new options and output formats.

Reviewed changes

Copilot reviewed 6 out of 6 changed files in this pull request and generated 3 comments.

Show a summary per file
| File | Description |
|---|---|
| tests/test_reacnetgen.py | Adds unit tests for reaction-event details, default-off behavior, and molecule-time formatting/filtering. |
| reacnetgenerator/reacnetgen.py | Adds new configuration knobs and auto-enables molecule-time output when molecule frame/timestep filters are provided. |
| reacnetgenerator/commandline.py | Introduces CLI flags for molecule-time display/filtering and reaction-event output; updates parm2cmd accordingly. |
| reacnetgenerator/_reaction.py | Implements optional reaction-event JSONL writing and enriches reaction processing to emit per-event details when enabled. |
| reacnetgenerator/_path.py | Extends molecule printing to optionally include frame/timestep columns and filter entries by selected frames/timesteps. |
| docs/guide/report.md | Documents the extended molecule file format and the new optional reaction-event JSONL output; fixes minor typos. |

Comment thread reacnetgenerator/reacnetgen.py Outdated
Comment thread reacnetgenerator/_path.py Outdated
Comment on lines +292 to +298
@staticmethod
def _gettimestepvalue(timestep):
    if isinstance(timestep, tuple):
        timestep = timestep[-1]
    if isinstance(timestep, np.generic):
        return timestep.item()
    return timestep
Comment thread reacnetgenerator/_reaction.py Outdated
Comment on lines 145 to 150
if isinstance(timestep, tuple):
    timestep = timestep[-1]
if isinstance(timestep, np.generic):
    return timestep.item()
return timestep

@njzjz-bot

njzjz-bot commented May 15, 2026

Collaborator

Superseded by inline review comment: #2436 (comment)

— OpenClaw 2026.4.22 (model: gpt-5.5)

Comment thread reacnetgenerator/commandline.py Outdated
    commands.extend((f"--{ii}", str(pp[ii])))
if pp.get("printmoleculetime", False):
    commands.append("--show-molecule-time")
if pp.get("moleculeframes", None):

Collaborator

Agreed — this is a concrete correctness issue. parm2cmd() should use explicit is not None checks here so scalar 0 is preserved for valid frame/timestep filters. A small regression test for moleculeframes=0 / moleculetimesteps=0 would also be useful.

— OpenClaw 2026.4.22 (model: gpt-5.5)

Co-authored-by: Copilot Autofix powered by AI <175728472+Copilot@users.noreply.github.com>
Signed-off-by: HuangChen <121350288+hcustc@users.noreply.github.com>

@coderabbitai coderabbitai Bot left a comment

Contributor

Actionable comments posted: 1

🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In `@reacnetgenerator/_path.py`:
- Around line 325-337: The _shouldprintmolecule method currently treats
moleculeframes and moleculetimesteps independently causing false positives;
change it to match (frame, timestep) pairs when both filters are set: in
_shouldprintmolecule, if both self.moleculeframes and self.moleculetimesteps are
non-None, ensure timesteps is populated (call self._getmoleculetimesteps(frames)
if needed) and iterate frames with their corresponding timesteps (zip frames and
timesteps) returning True only if any (int(frame) in self.moleculeframes and
int(timestep) in self.moleculetimesteps); keep the existing behavior when only
one of the filters is set.
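
The difference between independent checks and paired matching is easiest to see on a small case. The sketch below is a standalone illustration, assuming `frames` and `timesteps` are parallel sequences for one molecule; the function `should_print_molecule` is a hypothetical free-function rewrite and not the method from `_path.py`.

```python
def should_print_molecule(frames, timesteps, frame_filter=None, timestep_filter=None):
    """Decide whether a molecule matches the filters (hypothetical sketch).

    ``frames`` and ``timesteps`` are parallel: timesteps[i] is the original
    timestep of analyzed frame frames[i].
    """
    if frame_filter is None and timestep_filter is None:
        return True
    if frame_filter is not None and timestep_filter is not None:
        # Both filters set: a single occurrence must satisfy both.
        return any(
            int(f) in frame_filter and int(t) in timestep_filter
            for f, t in zip(frames, timesteps)
        )
    if frame_filter is not None:
        return any(int(f) in frame_filter for f in frames)
    return any(int(t) in timestep_filter for t in timesteps)


# A molecule seen at frame 0 (timestep 0) and frame 2 (timestep 200).
frames, timesteps = [0, 2], [0, 200]
frame_filter, timestep_filter = {0}, {200}

# Independent checks: each filter matches somewhere, so this reports a hit.
independent = bool(frame_filter & set(frames)) or bool(timestep_filter & set(timesteps))

# Paired check: no single occurrence is (frame 0, timestep 200).
paired = should_print_molecule(frames, timesteps, frame_filter, timestep_filter)
```

Here the independent version accepts the molecule even though it never appears at frame 0 with timestep 200, which is the false positive the review comment points at.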

ℹ️ Review info
⚙️ Run configuration

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro

Run ID: 6fbe8b4e-ddd2-477e-b213-05fd37c8b8e3

📥 Commits

Reviewing files that changed from the base of the PR and between 16ba49d and e989735.

📒 Files selected for processing (6)
  • docs/guide/report.md
  • reacnetgenerator/_path.py
  • reacnetgenerator/_reaction.py
  • reacnetgenerator/commandline.py
  • reacnetgenerator/reacnetgen.py
  • tests/test_reacnetgen.py
✅ Files skipped from review due to trivial changes (1)
  • docs/guide/report.md
🚧 Files skipped from review as they are similar to previous changes (4)
  • reacnetgenerator/commandline.py
  • reacnetgenerator/_reaction.py
  • reacnetgenerator/reacnetgen.py
  • tests/test_reacnetgen.py

Comment thread reacnetgenerator/_path.py Outdated
@hcustc hcustc requested review from Copilot and njzjz-bot May 15, 2026 14:55

Copilot AI left a comment

Contributor

Pull request overview

Copilot reviewed 7 out of 7 changed files in this pull request and generated 6 comments.

Comment on lines +276 to +286
    commands.append("--molecule-frame")
    moleculeframes = pp["moleculeframes"]
    if not isinstance(moleculeframes, (list, tuple)):
        moleculeframes = [moleculeframes]
    commands.extend(str(x) for x in moleculeframes)
if pp.get("moleculetimesteps") is not None:
    commands.append("--molecule-timestep")
    moleculetimesteps = pp["moleculetimesteps"]
    if not isinstance(moleculetimesteps, (list, tuple)):
        moleculetimesteps = [moleculetimesteps]
    commands.extend(str(x) for x in moleculetimesteps)
Comment on lines +250 to +262
for kk in ("moleculeframes", "moleculetimesteps"):
    if kwargs[kk] is not None:
        values = (
            list(kwargs[kk])
            if isinstance(kwargs[kk], (list, tuple, np.ndarray))
            else [kwargs[kk]]
        )
        kwargs[kk] = [int(x) for x in values]
if (
    kwargs["moleculeframes"] is not None
    or kwargs["moleculetimesteps"] is not None
):
    kwargs["printmoleculetime"] = True
# reaction with SMILES name like A+B->C+D
return [self._filterspec(reaction) for reaction in new_networks]
events = []
assert stepidx is not None
Comment thread reacnetgenerator/_path.py Outdated
Comment on lines +318 to +328
def _shouldprintmolecule(self, frames, timesteps=None):
    if self.moleculeframes is None and self.moleculetimesteps is None:
        return True
    assert frames is not None
    if self.moleculeframes is not None:
        if set(map(int, frames)).intersection(self.moleculeframes):
            return True
    if self.moleculetimesteps is not None:
        if timesteps is None:
            timesteps = self._getmoleculetimesteps(frames)
        if set(timesteps).intersection(self.moleculetimesteps):
Comment thread reacnetgenerator/utils.py
"""Normalize stored timestep metadata to the timestep value."""
if isinstance(timestep, tuple):
    timestep = timestep[-1]
if isinstance(timestep, np.generic):
Comment thread tests/test_reacnetgen.py
Comment on lines +278 to +288
def test_parm2cmd_preserves_zero_molecule_filters(self):
    """Frame and timestep 0 are valid molecule filters."""
    cmd = parm2cmd(
        {
            "inputfilename": "dummy",
            "inputfiletype": "lammpsbondfile",
            "atomname": ["H", "O"],
            "moleculeframes": 0,
            "moleculetimesteps": 0,
        }
    )

Contributor Author

@copilot apply changes based on this feedback

@njzjz-bot njzjz-bot left a comment

Collaborator

I re-reviewed the latest updates. The main correctness issue I still see is the unresolved combined frame/timestep filtering behavior in _shouldprintmolecule(): when both filters are set, it should match them on the same occurrence rather than as independent OR checks.

The empty-list filter handling comments are also worth addressing before merge, because parm2cmd() can currently emit --molecule-frame / --molecule-timestep without values, and ReacNetGenerator(..., moleculeframes=[]) is treated as an active filter.
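
One way to address the empty-list concern is to normalize every filter to either `None` or a non-empty list of integers before it is used. The helper below is a sketch under that assumption; `normalize_filter` is a hypothetical name, and it handles only built-in containers (the PR itself also accepts NumPy arrays, which are left out here to keep the example dependency-free).

```python
def normalize_filter(value):
    """Coerce a frame/timestep filter to a list of ints, or None if unset.

    Hypothetical helper: empty containers are treated as "no filter".
    """
    if value is None:
        return None
    if isinstance(value, (list, tuple, set, range)):
        values = list(value)
    else:
        values = [value]  # a scalar such as 0 is a valid filter
    if not values:
        return None
    return [int(x) for x in values]
```

With this shape, a command builder only needs one `is not None` test per filter, and it can never emit a flag without values.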

CI is green and I do not see additional blockers beyond the existing inline comments.

— OpenClaw 2026.4.22 (model: gpt-5.5)

@njzjz-bot njzjz-bot left a comment

Collaborator

Re-reviewed latest head 9fa09c6.

The previous correctness blockers look addressed now:

  • scalar 0 frame/timestep filters are preserved in parm2cmd();
  • empty frame/timestep filter lists are normalized/skipped instead of producing value-less CLI flags or active empty filters;
  • combined frame+timestep filtering now matches the same (frame, timestep) occurrence;
  • tuple/NumPy-array filter inputs are normalized;
  • timestep normalization has been centralized in get_timestep_value().

CI is green and the PR is mergeable. I do not see a remaining merge blocker. The only things still worth considering are non-blocking polish items already noted inline, such as replacing the assert stepidx is not None in reaction-event generation with an explicit exception and optionally handling 0-d NumPy arrays in get_timestep_value().
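
For the first polish item, the usual reason to prefer an explicit exception is that `assert` statements are removed when Python runs with `-O`, so the check would silently disappear. A minimal sketch of the replacement follows; the helper `build_event` and its error message are hypothetical and only mirror the documented event columns.

```python
def build_event(stepidx, reactant, product):
    """Build one reaction-event row (hypothetical helper, not the PR's code)."""
    if stepidx is None:
        # Unlike ``assert``, this check is not removed under ``python -O``.
        raise ValueError("stepidx is required when reaction-event output is enabled")
    return {"Timestep_Index": stepidx, "Reactant": reactant, "Product": product}


row = build_event(0, "[H][H]", "[H]")
try:
    build_event(None, "[H][H]", "[H]")
    raised = False
except ValueError:
    raised = True
```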

— OpenClaw 2026.4.22 (model: gpt-5.5)

Comment thread reacnetgenerator/_path.py Fixed
hcustc-bot and others added 3 commits May 27, 2026 20:43
Convert list comprehension to explicit loop for better static analysis.
Each file is now immediately registered with ExitStack for guaranteed cleanup,
even if an exception occurs during iteration.
Signed-off-by: HuangChen <121350288+hcustc@users.noreply.github.com>
Comment thread reacnetgenerator/_path.py
with ExitStack() as stack:
    readers = []
    for path in paths:
        readers.append(csv.reader(stack.enter_context(open(path, newline=""))))
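
To show how this pattern fits a chunked merge, here is a self-contained sketch that merges several pre-sorted CSV chunk files into one sorted file. It assumes each chunk is already sorted by an integer first column and uses `heapq.merge` for the streaming merge; that choice, the function name `merge_sorted_chunks`, and the sample rows are assumptions for illustration, not the implementation of `_MoleculeTimelineSpool`.

```python
import csv
import heapq
import os
import tempfile
from contextlib import ExitStack


def merge_sorted_chunks(paths, out_path):
    """Merge CSV chunk files, each already sorted by its integer first column."""
    with ExitStack() as stack:
        readers = []
        for path in paths:
            # Register each file right away so it is closed even on error.
            handle = stack.enter_context(open(path, newline=""))
            readers.append(csv.reader(handle))
        out = stack.enter_context(open(out_path, "w", newline=""))
        writer = csv.writer(out)
        # heapq.merge streams rows lazily instead of loading every chunk.
        for row in heapq.merge(*readers, key=lambda r: int(r[0])):
            writer.writerow(row)


# Demo with two small, pre-sorted chunks in a temporary directory.
with tempfile.TemporaryDirectory() as tmp:
    chunks = {"a.csv": [[0, "H2"], [2, "H2O"]], "b.csv": [[1, "O2"], [3, "OH"]]}
    paths = []
    for name, rows in chunks.items():
        path = os.path.join(tmp, name)
        with open(path, "w", newline="") as fh:
            csv.writer(fh).writerows(rows)
        paths.append(path)
    merged_path = os.path.join(tmp, "merged.csv")
    merge_sorted_chunks(paths, merged_path)
    with open(merged_path, newline="") as fh:
        merged = list(csv.reader(fh))
```

Because every file handle is entered into the ExitStack as soon as it is opened, a failure while opening a later chunk still closes the earlier ones.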
@njzjz njzjz added this pull request to the merge queue May 27, 2026
Merged via the queue into deepmodeling:master with commit f75f77e May 27, 2026
22 checks passed

6 participants