Skip to content

Add support for subset by value in the DAP4 CE syntax#322

Draft
jgallagher59701 wants to merge 23 commits into
masterfrom
jhrg/subset-by-value
Draft

Add support for subset by value in the DAP4 CE syntax#322
jgallagher59701 wants to merge 23 commits into
masterfrom
jhrg/subset-by-value

Conversation

@jgallagher59701

Copy link
Copy Markdown
Member

We have had a long-standing back-and-forth regarding the ERDDAP notation for value based subsetting. I thought I'd revisiting the issue and take this opportunity to update the grid() and maybe geogrid() server functions so they work with DAP4. This branch is the start of that, looking at the syntax that ERDDAP uses and modifying the DAP4 CE parser to recognize it.

What is not so good is that a problem shows up. Using the new
syntax [(45.6)] fails because the '.' scans as a PATH_SEP, and
that breaks everything. This is why I added NUMBER before. But,
that causes its own set of parsing issues. Try FLOAT and add
rules in the parser for both [(FLOAT)] and [(WORD)]?
@jgallagher59701 jgallagher59701 marked this pull request as draft December 22, 2025 18:18
@ndp-opendap

Copy link
Copy Markdown
Contributor

To be clear, are you talking about implementing what is described on the ERDDAP/GridDap web page?

Also, added to the cmake README more info about actually using
cmake for the builds.
This just sets any value-based dimension subset to 0:1:0. This shows
that parsing works. The next step - to determine the indices based on
values, is going to be left until the grid/geogrid functions are done.
This does not affect any other parts of the parser and all
current behavior should work.
Shared dimensions support the 'shared subset' syntax, but will not
support subset by value _unless_ there is a Map array in the
dataset.
@sonarqubecloud

Copy link
Copy Markdown

@coderabbitai

coderabbitai Bot commented Jun 17, 2026

Copy link
Copy Markdown

Important

Review skipped

Draft detected.

Please check the settings in the CodeRabbit UI or the .coderabbit.yaml file in this repository. To trigger a single review, invoke the @coderabbitai review command.

⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro Plus

Run ID: 8c16aa7b-b519-4ec3-8a8f-84203517e07b

You can disable this status message by setting the reviews.review_status to false in the CodeRabbit configuration file.

Use the checkbox below for a quick retry:

  • ✅ Review completed - (🔄 Check again to review again)
✨ Finishing Touches
🧪 Generate unit tests (beta)
  • Create PR with unit tests

Comment @coderabbitai help to get the list of available commands and usage tips.

@jgallagher59701

Copy link
Copy Markdown
Member Author

@coderabbitai review

@coderabbitai

coderabbitai Bot commented Jun 17, 2026

Copy link
Copy Markdown
✅ Action performed

Review finished.

Note: CodeRabbit is an incremental review system and does not re-review already reviewed commits. This command is applicable only when automatic reviews are paused.

@coderabbitai coderabbitai Bot left a comment

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 5

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (1)
tests/DMRTest.at (1)

719-739: ⚠️ Potential issue | 🟠 Major | 🏗️ Heavy lift

VBS regression cases are asserting placeholder behavior, not VBS semantics.

The new block marks multiple distinct value-based expressions as pass against the same baseline outputs, which can pass even when value-based mapping is incorrect. Before merge, these should assert semantic outcomes (or be clearly segregated as parse-only transitional coverage) so evaluator regressions are detectable.

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@tests/DMRTest.at` around lines 719 - 739, The VBS test cases in the
DMR_TRANS_CE_NO_CRC test block are currently using placeholder baseline outputs
and marked as pass without validating actual value-based subsetting semantics.
Multiple distinct value-based subsetting expressions are being tested against
identical baseline output files (like test_array_4.xml.vbs.1a.trans_base), which
allows tests to pass even when the VBS mapping logic is incorrect. Either
replace these tests with proper semantic assertions that validate correct
value-based subsetting behavior once the VBS logic is fully implemented, or
explicitly restructure and mark these tests as transitional parse-only
validation tests that do not expect full value-based subsetting functionality,
so that future regressions in actual VBS evaluation logic will be detectable.
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In `@d4_ce/D4ConstraintEvaluator.h`:
- Around line 86-94: The two overloaded make_value_based_index static methods
are silently discarding their parsed string parameters and always returning a
fixed index of (0, 1, 0, false, false, "") regardless of input, which causes
value-based constraint expressions to return incorrect data while appearing to
succeed. Replace this silent coercion by either implementing the actual logic to
map the parsed string values to real indices, or add a meaningful error or
exception that explicitly indicates value-based subset evaluation is not yet
implemented. This way incorrect results are prevented and the limitation is made
explicit rather than hidden.

In `@README.cmake.md`:
- Line 74: The heading "Using Cmake Presets" uses incorrect capitalization for
the CMake project name. Change "Cmake" to "CMake" (with capital M) in the
heading to align with the official CMake naming convention and maintain
consistency with other references throughout the document.
- Around line 85-97: The fenced code block containing the Preset CMake variables
section is missing a language identifier on the opening fence. Add a language
identifier such as `text`, `bash`, or `makefile` immediately after the three
backticks on the line before "Preset CMake variables:" to comply with markdown
best practices (markdownlint MD040). This will properly specify the language
type for the code block.
- Line 125: In the README.cmake.md file, locate the phrase "by mistake" in the
sentence about using the ```--parallel``` flag and simplify it by replacing "by
mistake then" with "incorrectly," to make the wording more concise and direct
while maintaining clarity about the conditional action of cleaning directories
and re-running tests.

In `@tests/cmake/dmr-tests.cmake`:
- Around line 481-513: The VBS tests are currently asserting passing status for
all cases using placeholder parser behavior, which masks incomplete
implementation. For the test_array_4.xml cases in the dmr_trans_test calls
(tests 180-182 and 183-185), mark them as expected to fail (xfail) since VBS
logic is not yet coded and these should fail without Maps. For the
vol_1_ce_7.xml cases in dmr_trans_test calls (tests 186-188), either update the
expected baseline files to reflect actual value-based subsetting evaluation
results rather than temporary placeholder outputs, or mark them with xfail
status and create new test entries with corrected baselines once VBS evaluation
is properly implemented. This ensures the test suite accurately reflects
implementation status rather than masking incomplete work.

---

Outside diff comments:
In `@tests/DMRTest.at`:
- Around line 719-739: The VBS test cases in the DMR_TRANS_CE_NO_CRC test block
are currently using placeholder baseline outputs and marked as pass without
validating actual value-based subsetting semantics. Multiple distinct
value-based subsetting expressions are being tested against identical baseline
output files (like test_array_4.xml.vbs.1a.trans_base), which allows tests to
pass even when the VBS mapping logic is incorrect. Either replace these tests
with proper semantic assertions that validate correct value-based subsetting
behavior once the VBS logic is fully implemented, or explicitly restructure and
mark these tests as transitional parse-only validation tests that do not expect
full value-based subsetting functionality, so that future regressions in actual
VBS evaluation logic will be detectable.
🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

  • Push a commit to this branch (recommended)
  • Create a new PR with the fixes

ℹ️ Review info
⚙️ Run configuration

Configuration used: Organization UI

Review profile: ASSERTIVE

Plan: Pro Plus

Run ID: b4c334af-0c06-4503-84e4-e6c9aff7cf5d

📥 Commits

Reviewing files that changed from the base of the PR and between 83ce4f0 and 01ff095.

📒 Files selected for processing (19)
  • .pre-commit-config.yaml
  • .prettierignore
  • README.cmake.md
  • d4_ce/D4CEScanner.h
  • d4_ce/D4ConstraintEvaluator.h
  • d4_ce/d4_ce_parser.yy
  • d4_ce/d4_ce_scanner.ll
  • d4_ce/unit-tests/D4ConstraintEvaluatorTest.cc
  • retired/d4_ce/D4ConstraintEvaluator.cc.nat_axes
  • retired/d4_ce/D4ConstraintEvaluator.h.nat_axes
  • retired/d4_ce/d4_ce_parser.yy.nat_axes
  • retired/d4_ce/d4_ce_scanner.ll.nat_axes
  • tests/CMakeLists.txt
  • tests/DMRTest.at
  • tests/cmake/dmr-tests.cmake
  • tests/dmr-test.cc
  • tests/dmr-testsuite/universal/test_array_4.xml.vbs.1a.trans_base
  • tests/dmr-testsuite/universal/test_array_4.xml.vbs.1x.trans_base
  • tests/dmr-testsuite/universal/vol_1_ce_7.xml.vbs.1.trans_base

Comment on lines +86 to +94
// For now, if a value-based subset is given, return the entire dimension. jhrg 12/23/25
// start and end
static index make_value_based_index(const std::string &, const std::string &) {
return index(0, 1, 0, false, false, "");
}
// start, stride, end
static index make_value_based_index(const std::string &, const std::string &, const std::string &) {
return index(0, 1, 0, false, false, "");
}

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟠 Major | ⚡ Quick win

Don’t silently coerce value-based slices to a fixed 0:1:0 index.

Both overloads discard parsed VBS values and return a concrete fixed index (start=0, stop=0, rest=false), so value-based CEs can return incorrect data while still succeeding. This should either resolve to real mapped indices or fail fast until evaluation is implemented.

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@d4_ce/D4ConstraintEvaluator.h` around lines 86 - 94, The two overloaded
make_value_based_index static methods are silently discarding their parsed
string parameters and always returning a fixed index of (0, 1, 0, false, false,
"") regardless of input, which causes value-based constraint expressions to
return incorrect data while appearing to succeed. Replace this silent coercion
by either implementing the actual logic to map the parsed string values to real
indices, or add a meaningful error or exception that explicitly indicates
value-based subset evaluation is not yet implemented. This way incorrect results
are prevented and the limitation is made explicit rather than hidden.

Comment thread README.cmake.md
make test -j20 2.20s user 0.14s system 98% cpu 2.381 total
```

## Using Cmake Presets

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🧹 Nitpick | 🔵 Trivial | 💤 Low value

Align CMake capitalization with conventions.

The heading should use "CMake" (capital M) per the CMake project naming standard and consistency with other references in the document.

Fix capitalization
-## Using Cmake Presets
+## Using CMake Presets
📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
## Using Cmake Presets
## Using CMake Presets
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@README.cmake.md` at line 74, The heading "Using Cmake Presets" uses incorrect
capitalization for the CMake project name. Change "Cmake" to "CMake" (with
capital M) in the heading to align with the official CMake naming convention and
maintain consistency with other references throughout the document.

Source: Linters/SAST tools

Comment thread README.cmake.md
Comment on lines +85 to +97
```
Preset CMake variables:

BIG_ARRAY_TEST="OFF"
BUILD_DEVELOPER="ON"
BUILD_TESTING="ON"
CMAKE_BUILD_TYPE="Debug"
CMAKE_CXX_STANDARD="14"
CMAKE_CXX_STANDARD_REQUIRED="ON"
CMAKE_INSTALL_PREFIX="$prefix"
USE_ASAN="OFF"
USE_CPP_11_REGEX="ON"
```

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🧹 Nitpick | 🔵 Trivial | 💤 Low value

Add language identifier to fenced code block.

Per markdown best practices (markdownlint MD040), fenced code blocks should specify a language identifier. The preset variables block shows text assignments and could use text, bash, or makefile.

Add language specifier
-```
+```text
 Preset CMake variables:
📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
```
Preset CMake variables:
BIG_ARRAY_TEST="OFF"
BUILD_DEVELOPER="ON"
BUILD_TESTING="ON"
CMAKE_BUILD_TYPE="Debug"
CMAKE_CXX_STANDARD="14"
CMAKE_CXX_STANDARD_REQUIRED="ON"
CMAKE_INSTALL_PREFIX="$prefix"
USE_ASAN="OFF"
USE_CPP_11_REGEX="ON"
```
🧰 Tools
🪛 markdownlint-cli2 (0.22.1)

[warning] 85-85: Fenced code blocks should have a language specified

(MD040, fenced-code-language)

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@README.cmake.md` around lines 85 - 97, The fenced code block containing the
Preset CMake variables section is missing a language identifier on the opening
fence. Add a language identifier such as `text`, `bash`, or `makefile`
immediately after the three backticks on the line before "Preset CMake
variables:" to comply with markdown best practices (markdownlint MD040). This
will properly specify the language type for the code block.

Source: Linters/SAST tools

Comment thread README.cmake.md
```

The ```ctest``` program does that ```--parallel``` but don't use that with libdap
until the test tolls are made thread safe. If you do use ```--parallel``` by mistake

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🧹 Nitpick | 🔵 Trivial | 💤 Low value

Simplify "by mistake" phrasing.

The phrase "by mistake" is wordy; consider a more concise alternative like "incorrectly" or "accidentally."

Simplify wording
-If you do use ```--parallel``` by mistake then clean the dirs and re-run the tests.
+If you do use ```--parallel``` incorrectly, clean the dirs and re-run the tests.
📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
until the test tolls are made thread safe. If you do use ```--parallel``` by mistake
until the test tolls are made thread safe. If you do use
🧰 Tools
🪛 LanguageTool

[style] ~125-~125: ‘by mistake’ might be wordy. Consider a shorter alternative.
Context: ...ad safe. If you do use --parallel by mistake then clean the dirs and re-run the test...

(EN_WORDINESS_PREMIUM_BY_MISTAKE)

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@README.cmake.md` at line 125, In the README.cmake.md file, locate the phrase
"by mistake" in the sentence about using the ```--parallel``` flag and simplify
it by replacing "by mistake then" with "incorrectly," to make the wording more
concise and direct while maintaining clarity about the conditional action of
cleaning directories and re-running tests.

Source: Linters/SAST tools

Comment on lines +481 to +513
# Test the value-based subsetting syntax addition. NB: vbs (value-based subsetting). jhrg 12/29/25
# This is the baseline version to start with - the temporary hack for the parsing uses an 'index'
# of 0:1:0. Thus, _for now_, all these CEs return the same values. That will change once the work
# moves forward.

# NB: test_array_4.xml uses shared dimensions, to the indicial syntax parses, but it does not
# have Maps, so this kind of subsetting will actually fail. Make sure that these six following
# tests _do_ fail once the VBS logic is coded. jhrg 12/29/25
dmr_trans_test(180 test_array_4.xml "/row=[0];/col=[0];a[][]" "" test_array_4.xml.vbs.1a.trans_base "universal")
dmr_trans_test(181 test_array_4.xml "/row=[(0):(1)];/col=[(0):(1)];a[][]" "" test_array_4.xml.vbs.1a.trans_base "universal")
dmr_trans_test(182 test_array_4.xml "/row=[(10):(11.11)];/col=[(0.4):(-10.7)];a[][]" "" test_array_4.xml.vbs.1a.trans_base "universal")

# The above tests were derived from, and use the same baselines as, these three below. jhrg 12/29/25
# DMR_TRANS_CE_NO_CRC([test_array_4.xml], [/row=[0];/col=[0];a[][] ], [test_array_4.xml.vbs.1a.trans_base], [pass])
# DMR_TRANS_CE_NO_CRC([test_array_4.xml], [/row=[(0):(1)];/col=[(0):(1)];a[][] ], [test_array_4.xml.vbs.1a.trans_base], [pass])
# DMR_TRANS_CE_NO_CRC([test_array_4.xml], [/row=[(10):(11.11)];/col=[(0.4):(-10.7)];a[][] ], [test_array_4.xml.vbs.1a.trans_base], [pass])

dmr_trans_test(183 test_array_4.xml "/row=[0];/col=[0];x[][]" "" test_array_4.xml.vbs.1x.trans_base "universal")
dmr_trans_test(184 test_array_4.xml "/row=[(0):(1)];/col=[(0):(1)];x[][]" "" test_array_4.xml.vbs.1x.trans_base "universal")
dmr_trans_test(185 test_array_4.xml "/row=[(10):(11.11)];/col=[(0.4):(-10.7)];x[][]" "" test_array_4.xml.vbs.1x.trans_base "universal")

# DMR_TRANS_CE_NO_CRC([test_array_4.xml], [/row=[0];/col=[0];x[][] ], [test_array_4.xml.vbs.1x.trans_base], [pass])
# DMR_TRANS_CE_NO_CRC([test_array_4.xml], [/row=[(0):(1)];/col=[(0):(1)];x[][] ], [test_array_4.xml.vbs.1x.trans_base], [pass])
# DMR_TRANS_CE_NO_CRC([test_array_4.xml], [/row=[(10):(11.11)];/col=[(0.4):(-10.7)];x[][] ], [test_array_4.xml.vbs.1x.trans_base], [pass])

# NB: vol_1_ce_7.xml has Maps and this should not only parse but also subset by value. jhrg 12/29/25
dmr_trans_test(186 vol_1_ce_7.xml "nlon=[0:2];nlat=[0];temp[][]" "" vol_1_ce_7.xml.vbs.1.trans_base "universal")
dmr_trans_test(187 vol_1_ce_7.xml "nlon=[0:2];nlat=[(10.0):(-17.9)];temp[][]" "" vol_1_ce_7.xml.vbs.1.trans_base "universal")
dmr_trans_test(188 vol_1_ce_7.xml "nlon=[0:2];nlat=[(0.09):(-17.9)];temp[][]" "" vol_1_ce_7.xml.vbs.1.trans_base "universal")

# DMR_TRANS_CE_NO_CRC([vol_1_ce_7.xml], [nlon=[0:2];nlat=[0];temp[][] ], [vol_1_ce_7.xml.vbs.1.trans_base], [pass])
# DMR_TRANS_CE_NO_CRC([vol_1_ce_7.xml], [nlon=[0:2];nlat=[(10.0):(-17.9)];temp[][] ], [vol_1_ce_7.xml.vbs.1.trans_base], [pass])
# DMR_TRANS_CE_NO_CRC([vol_1_ce_7.xml], [nlon=[0:2];nlat=[(0.09):(-17.9)];temp[][] ], [vol_1_ce_7.xml.vbs.1.trans_base], [pass])

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟠 Major | 🏗️ Heavy lift

VBS tests currently lock in placeholder behavior instead of semantic correctness.

Line 481–Line 513 document that parsing is still using a temporary index hack, but the tests are still asserted as normal passing baseline matches. This makes CI green even when value-based subsetting is not actually being validated (especially for the map-backed vol_1_ce_7.xml cases), which undermines the PR objective.

Before merge, switch these to semantic expectations (e.g., explicit xfail/pass split and distinct expected outputs for value-ranged cases) so the suite fails when VBS evaluation is wrong rather than preserving the temporary parser-only behavior.

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@tests/cmake/dmr-tests.cmake` around lines 481 - 513, The VBS tests are
currently asserting passing status for all cases using placeholder parser
behavior, which masks incomplete implementation. For the test_array_4.xml cases
in the dmr_trans_test calls (tests 180-182 and 183-185), mark them as expected
to fail (xfail) since VBS logic is not yet coded and these should fail without
Maps. For the vol_1_ce_7.xml cases in dmr_trans_test calls (tests 186-188),
either update the expected baseline files to reflect actual value-based
subsetting evaluation results rather than temporary placeholder outputs, or mark
them with xfail status and create new test entries with corrected baselines once
VBS evaluation is properly implemented. This ensures the test suite accurately
reflects implementation status rather than masking incomplete work.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants