MSG_31 parser drops 2 radials from sweep_10 of KILX20230629_154426_V06 (358 instead of 360) #376

@aladinor

Summary

On a real NEXRAD Level 2 Archive II file (KILX20230629_154426_V06), xradar.io.open_nexradlevel2_datatree returns sweep_10 with 358 azimuths, although the file contains 360. The 2 missing radials carry real precipitation data and fit cleanly into the 1° azimuth grid. They are silently dropped by the MSG_31 parser, not by the remove_duplicate_rays / reindex_angle post-processing.

Every other sweep in the volume (13 sweeps total) is decoded with the expected ray count. Only sweep_10 is short.

Environment

  • xradar 0.12.0
  • Python 3.12
  • Linux x86_64
  • File: KILX20230629_154426_V06 (10,398,582 bytes; KILX, Lincoln IL; 2023-06-29 15:44 UTC; VCP with mixed 720-/360-ray sweeps)

Reproduction

import xradar as xd
import fsspec
import numpy as np

filepath = "s3://unidata-nexrad-level2/2023/06/29/KILX/KILX20230629_154426_V06"
stream = fsspec.open(filepath, mode="rb", anon=True).open()
dtree = xd.io.open_nexradlevel2_datatree(stream.read())

sw10 = dtree["sweep_10"].ds
az = np.sort(sw10.azimuth.values)
diffs = np.diff(az)
big = np.where(diffs > 1.5 * np.median(diffs))[0]

print(f"sweep_10 ray count : {sw10.sizes['azimuth']}")     # 358 (expected 360)
print(f"azimuth gaps       : {len(big)}")                  # 1
for i in big:
    print(f"  az[{i}]={az[i]:.3f}° → az[{i+1}]={az[i+1]:.3f}°  Δ={diffs[i]:.3f}°")
# az[146]=146.560° → az[147]=149.562°  Δ=3.002°
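
The gap check above does not measure the wrap-around interval between the largest and the smallest azimuth. A minimal sketch of a variant that includes it, assuming azimuths in degrees; `find_azimuth_gaps` is a hypothetical helper name used for illustration, not part of xradar:

```python
import numpy as np


def find_azimuth_gaps(azimuth, factor=1.5):
    """Return (start, end, width) for each gap wider than factor * median spacing.

    Hypothetical helper, not part of xradar. The wrap-around interval from
    the last sorted azimuth back to the first one is checked as well.
    """
    az = np.sort(np.asarray(azimuth, dtype=float) % 360.0)
    # Append (first azimuth + 360) so the closing interval is measured too.
    diffs = np.diff(np.append(az, az[0] + 360.0))
    threshold = factor * np.median(diffs)
    idx = np.where(diffs > threshold)[0]
    return [(az[i], az[(i + 1) % az.size], diffs[i]) for i in idx]
```

With the reproduction above, it could be called as `find_azimuth_gaps(sw10.azimuth.values)`.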

The full volume layout (per-sweep ray count) shows that only sweep_10 is affected:

sweep_0:  720, sweep_1:  720, sweep_2:  720, sweep_3:  720,
sweep_4:  720, sweep_5:  720, sweep_6:  360, sweep_7:  360,
sweep_8:  360, sweep_9:  360, sweep_10: 358,  ← short by 2
sweep_11: 360, sweep_12: 360

Confirming the dropped radials are real

I cross-decoded the same file with radish (Rust + nexrad-model 1.0.0-rc.4 backend, available as pip install radish-rs). It surfaces 360 rays in sweep_10. The 2 extra radials sit at azimuths 147.571° and 148.571°, exactly where xradar shows the 3° gap.

Probing those rows shows real measurement data, not synthesized:

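| row | az (°) | non-NaN gates | DBZH range (dBZ) | elevation (°) | timestamp (s) |
|-----|---------|---------------|------------------|---------------|----------------|
| 146 | 146.560 | 146 / 828 | -18.5 .. +14.5 | 5.098 | 1688053652.211 |
| 147 | 147.571 | 127 / 828 | -15.0 .. +12.0 | 5.098 | 1688053652.246 |
| 148 | 148.571 | 138 / 828 | -19.0 .. +11.5 | 5.098 | 1688053652.282 |
| 149 | 149.562 | 138 / 828 | -17.0 .. +8.5 | 5.098 | 1688053652.317 |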

Several signatures rule out interpolation/padding:

  • Timestamp cadence: 35.6 ms between successive rays (the ~36 ms NEXRAD radial pace for this VCP). Synthesized rays would carry timestamps that are zero, identical to a neighbour's, or linearly interpolated.
  • Elevation: 5.098° matches the surrounding rows exactly, so the antenna was at the same physical position.
  • Azimuth jitter: spacings span [0.9586°, 1.0437°] with std 0.0096°, which is natural antenna-servo variation. Interpolation would produce machine-perfect 1.0000° steps.
  • Per-row gate counts vary (127 vs 138 valid gates) with different dBZ ranges, which is characteristic of independent measurements rather than a fill pattern.
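
The cadence and jitter arguments can be checked directly on the four probed rows. The sketch below only uses the azimuth and timestamp values copied from the table above (which are rounded to three decimals); the variable names are illustrative.

```python
import numpy as np

# Values copied from the four probed rows (146..149) in the table above.
azimuth = np.array([146.560, 147.571, 148.571, 149.562])    # degrees
timestamp = np.array([1688053652.211, 1688053652.246,
                      1688053652.282, 1688053652.317])       # seconds

az_step = np.diff(azimuth)            # degrees between successive rays
dt_ms = np.diff(timestamp) * 1e3      # milliseconds between successive rays

# Interpolated or padded rays would show identical steps or repeated timestamps.
print("azimuth steps (deg):", np.round(az_step, 3))
print("time steps (ms)    :", np.round(dt_ms, 1))
```

At the table's precision the azimuth steps are close to, but not exactly, 1° and differ from one another, while the time steps stay at roughly 35 to 36 ms.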

The drop is in the parser, not post-processing

Disabling all post-processing still produces 358 rays:

stream.seek(0)  # rewind: the earlier stream.read() consumed the stream
dtree = xd.io.open_nexradlevel2_datatree(stream.read(),
                                         reindex_angle=False,
                                         decode_coords=False)
# sweep_10 ray count: 358  (same; confirms the loss is upstream)

util.remove_duplicate_rays only collapses exact azimuth duplicates (np.unique), and the surviving azimuths are all distinct. util.reindex_angle snaps to a 1° grid with tolerance = angle_res / 2 = 0.5°; the missing radials at 147.571° and 148.571° are only about 0.07° away from grid bins 147.5° and 148.5°, so reindex_angle would place them without difficulty. Neither helper is responsible.
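
The tolerance argument can be verified with plain NumPy. This is a standalone nearest-bin check under the assumption that the grid has bin centres at 0.5°, 1.5°, ..., 359.5°; it does not call or reproduce xradar's reindex_angle.

```python
import numpy as np

angle_res = 1.0                  # degrees, as stated above
tolerance = angle_res / 2        # 0.5 degrees
# Assumed grid: bin centres at 0.5, 1.5, ..., 359.5 degrees.
grid = np.arange(angle_res / 2, 360.0, angle_res)

missing = np.array([147.571, 148.571])    # azimuths recovered by radish

# Nearest grid bin for each missing azimuth and its distance from that bin.
nearest = grid[np.abs(grid[None, :] - missing[:, None]).argmin(axis=1)]
offset = np.abs(missing - nearest)

for az, g, d in zip(missing, nearest, offset):
    print(f"{az:.3f} -> bin {g:.1f}, offset {d:.3f}, within tolerance: {d <= tolerance}")
```

Both offsets come out near 0.071°, well inside the 0.5° tolerance.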

The two rays must already be missing from self._data[current_sweep] by the time those helpers run, which points at MSG_31 ingestion in nexrad_level2.py, roughly lines 820-870. The most likely culprit is the radial_status switch at lines 840-862 misclassifying these specific records. Two candidate failure modes are a mid-sweep status==2 ("end of elevation") that closes the sweep early, and a status==1 ("intermediate radial") branch that silently drops the data instead of appending it.

Suggested investigation

A targeted print of radial_status, radial_number, azimuth_angle, and elevation_angle for MSG_31 records 145..150 of the affected sweep would identify which branch is dropping the records. Happy to wire that diagnostic up if it would help.
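
A rough sketch of what that diagnostic could look like. It assumes the MSG_31 headers of the affected sweep are already available as a sequence of dicts keyed by the four field names above; `summarize_radials` is a hypothetical helper, and how the dicts are pulled out of xradar's reader is deliberately left open.

```python
def summarize_radials(records, start=145, stop=150):
    """Print selected MSG_31 header fields and flag unexpected statuses.

    Hypothetical helper: `records` is assumed to be a sequence of dicts,
    one per MSG_31 record of the affected sweep, keyed by radial_status,
    radial_number, azimuth_angle and elevation_angle.
    """
    flagged = []
    last = len(records) - 1
    for i, rec in enumerate(records):
        status = rec["radial_status"]
        # Between the first and last record of a sweep, every radial is
        # expected to be an "intermediate radial" (status 1).
        if 0 < i < last and status != 1:
            flagged.append(i)
        if start <= i <= stop:
            print(
                f"{i:4d} status={status} number={rec['radial_number']} "
                f"az={rec['azimuth_angle']:.3f} el={rec['elevation_angle']:.3f}"
            )
    return flagged
```

The returned list holds the indices of mid-sweep records whose status is not 1, which would be the first place to look for an early sweep close.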
