Integration of MEC workflow by andreaspauling · Pull Request #110 · MeteoSwiss/evalml

andreaspauling · 2026-02-12T14:42:42Z

Add the MEC workflow. The new parts are in green in the DAG: snakemake_dag.pdf

For each valid date a MEC case is set up and run. This includes:

- creating the directory structure
- adding the observations
- organizing the model input, including past runs depending on the config
- rendering the MEC namelist
- executing MEC for all dates with complete data for all lead times (this excludes the first dates of the period)
- storing the final feedback file in a separate place
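
To illustrate why the first dates of a period drop out, here is a minimal sketch of the "complete data for all lead times" selection. It is not taken from the PR: the helper name `complete_valid_times` and the 6-hourly example inits are assumptions made for illustration only.

```python
from datetime import datetime, timedelta


def complete_valid_times(init_times, lead_hours):
    """Return the valid times for which every lead time has a matching init.

    Hypothetical helper, not part of evalml: a valid time is kept only if
    (valid time - lead time) is an available init for all lead times.
    """
    inits = set(init_times)
    leads = [timedelta(hours=h) for h in lead_hours]
    # Every valid time that is reachable from at least one init.
    candidates = {init + lead for init in inits for lead in leads}
    # Keep only those covered by a forecast at every lead time.
    return sorted(
        v for v in candidates if all(v - lead in inits for lead in leads)
    )


inits = [datetime(2026, 1, 1, h) for h in (0, 6, 12, 18)]
valid = complete_valid_times(inits, lead_hours=[0, 6, 12])
print([v.strftime("%Y-%m-%d %H:%M") for v in valid])
# ['2026-01-01 12:00', '2026-01-01 18:00']
```

With inits at 00, 06, 12 and 18 UTC and lead times of 0, 6 and 12 h, the valid times 00 and 06 UTC are dropped because no earlier init covers their longer lead times.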

All MEC cases can be removed once the final feedback file is produced (removal not yet implemented).
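
Since the removal is not implemented yet, the following is only a sketch of how a guarded cleanup could look. The function name and the paths passed to it are hypothetical and do not reflect the actual directory layout of the workflow.

```python
import shutil
from pathlib import Path


def remove_mec_cases(mec_root: Path, feedback_file: Path) -> bool:
    """Delete mec_root only if the final feedback file exists and is non-empty.

    Hypothetical helper: returns True if the MEC cases were removed.
    """
    if not feedback_file.is_file() or feedback_file.stat().st_size == 0:
        return False  # keep the cases, e.g. for debugging or a rerun
    # Refuse to delete the directory that holds the feedback file itself.
    if feedback_file.resolve().is_relative_to(mec_root.resolve()):
        raise ValueError("feedback file lives inside the directory to remove")
    shutil.rmtree(mec_root, ignore_errors=True)
    return True
```

The guard on the feedback file is the design point here: the cases are only expendable once the product they were built for is safely stored elsewhere.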

Topics already raised by Francesco:
- put folder mec/ in data/mec in order not to mix up init and valid time (MEC is valid time oriented)
- check globbing options in the MEC namelist with DWD (not documented; as far as I know only FCR_TIME is supported, wildcards such as * are not). The aim is to avoid copying data.

* Adopt forecast intervals including the end point * Fix parsing * Experiments work * Update config/forecasters.yaml * Align init times to availabiliy of COE * run pre-commit * Change README to COSMO-E availability --------- Co-authored-by: Jonas Bhend <jonasbhend@users.noreply.github.com> Co-authored-by: Jonas Bhend <jonas.bhend@meteoswiss.ch>

* add region averages * add regions to config * Add regions to verification module, scripts, and rules * add stratification to forecaster config and fix typo * fix dict indexing * fix append error * read lon/lat from obs dataset * Add inner verification domain * Add missing dependency * add plots by region * Add regions to dashboard * Fix dashboard * Add region name and initializations to plot title (and remove header div) * Add support for multiple regions * Fix legend

andreaspauling · 2026-04-01T09:26:16Z

Is this really necessary? We are effectively duplicating the entire output data.

Random thought. What if we used a named pipe with cat <*.grib> as a replacement for actually creating the large file?
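
A rough sketch of the named-pipe idea, under two untested assumptions: that byte-wise concatenation of the GRIB files is what the consumer expects, and that the consumer (MEC here) reads the file sequentially without seeking. The helper names are hypothetical, and `os.mkfifo` is POSIX-only.

```python
import os
import shutil
import tempfile
import threading
from pathlib import Path


def stream_files_into_fifo(sources, fifo_path):
    """Write each source file, in order, into the FIFO (like `cat a b > fifo`).

    Opening the FIFO for writing blocks until a reader opens the other end.
    """
    with open(fifo_path, "wb") as sink:
        for src in sources:
            with open(src, "rb") as fh:
                shutil.copyfileobj(fh, sink)


def demo():
    """Concatenate two small files through a FIFO and return what a reader sees."""
    with tempfile.TemporaryDirectory() as tmp:
        tmp = Path(tmp)
        parts = []
        for i, payload in enumerate([b"AAA", b"BBB"]):
            part = tmp / f"part{i}.grib"
            part.write_bytes(payload)  # placeholder bytes, not real GRIB
            parts.append(part)
        fifo = tmp / "combined.grib"
        os.mkfifo(fifo)  # no data is stored on disk for this path
        writer = threading.Thread(
            target=stream_files_into_fifo, args=(parts, fifo)
        )
        writer.start()
        data = fifo.read_bytes()  # the reader side; stands in for MEC
        writer.join()
        return data


print(demo())  # b'AAABBB'
```

The pipe can be read only once per writer run, so this would only help if the consumer opens the file a single time.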

andreaspauling · 2026-05-21T08:34:57Z

- FFV2 in this PR as well
- evalml options --mec --ffv2 added (default: no mec/ffv2)
- support of lists in config
- mec running outside forecast run directories
- support ver-files as source for observations
- Paths moved to config
- minor fixes / cleanup

frazane · 2026-05-21T09:30:07Z

Are these changes to the accumulation logic for total precipitation needed here? If not, I would remove these.

Exactly. MEC needs precipitation accumulated from the beginning of the run.
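
For reference, the relation between the two precipitation conventions can be sketched as follows. This is not the code from the PR; the function names are hypothetical, and whether the model output arrives as per-interval totals or as run-accumulated values is an assumption here.

```python
from itertools import accumulate


def accumulate_from_start(interval_totals):
    """Per-interval totals -> totals accumulated since the start of the run."""
    return list(accumulate(interval_totals))


def to_interval_totals(accumulated):
    """Inverse operation: difference run-accumulated totals step by step."""
    previous = [0.0, *accumulated[:-1]]
    return [curr - prev for prev, curr in zip(previous, accumulated)]


per_step = [1.0, 0.5, 2.0]              # precipitation per lead-time interval
since_start = accumulate_from_start(per_step)
print(since_start)                       # [1.0, 1.5, 3.5]
print(to_interval_totals(since_start))   # [1.0, 0.5, 2.0]
```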

frazane · 2026-05-21T09:33:57Z

        config=Path(OUT_ROOT / "data/runs/{run_id}/{init_time}/config.yaml"),
        resources=directory(OUT_ROOT / "data/runs/{run_id}/{init_time}/resources"),
        grib_out_dir=directory(OUT_ROOT / "data/runs/{run_id}/{init_time}/grib"),
-        okfile=touch(


Why was this change made?

With Claude's help: the okfile is necessary. inference_execute needs to depend on whichever of the two prepare rules ran, but it cannot reference them directly by output path because both produce the same three outputs (config.yaml, resources/, grib/). The _inference_routing_fn function selects the correct prepare rule by model type, and to do so it must reference a path that is unique per rule. The okfile provides that.

The okfile is used in _inference_routing_fn. The routing function returns the okfile path of whichever prepare rule ran (forecaster or interpolator), and inference_execute declares it as its input. This is how Snakemake knows to wait for the correct prepare rule to finish before launching inference.
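
The routing idea described above can be sketched in plain Python. This is not the actual `_inference_routing_fn`: the function name `route_okfile`, the marker file names and the argument list are assumptions chosen for illustration. Only the directory pattern `data/runs/{run_id}/{init_time}` is taken from the diff.

```python
from pathlib import Path

# Hypothetical marker names: one unique file per prepare rule, because both
# rules otherwise write the same config.yaml, resources/ and grib/ outputs.
OKFILE_BY_MODEL_TYPE = {
    "forecaster": "prepare_forecaster.ok",
    "interpolator": "prepare_interpolator.ok",
}


def route_okfile(out_root: Path, run_id: str, init_time: str, model_type: str) -> Path:
    """Return the marker path of the prepare rule that matches model_type."""
    try:
        name = OKFILE_BY_MODEL_TYPE[model_type]
    except KeyError:
        raise ValueError(f"unknown model type: {model_type!r}") from None
    return out_root / "data/runs" / run_id / init_time / name


print(route_okfile(Path("/out"), "abc", "202601010000", "interpolator"))
# /out/data/runs/abc/202601010000/prepare_interpolator.ok
```

A function of this shape could serve as the input of the execute rule, so that only the prepare rule producing the returned path is scheduled.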

Sounds plausible to me.

What I mean is that touch("/some/file") already automatically generates the file when the rule succeeds.

I could use Snakemake's touch() on line 199 in inference.smk and then remove those three lines from the script in each function. Would that address your point? I could do that and test it.

I tried to use touch(.../ok-file) in inference.smk instead of touching it in inference_prepare.py, but I found no solution that worked. Can we keep the current working solution, or shall we have a look at it together?

frazane · 2026-05-21T09:35:35Z

+from datetime import timedelta
+
+
+def _parse_steps(steps: str) -> list[int]:


Isn't this a duplicate of get_leadtimes(wc) in evalml/workflow/rules/plot.smk (line 119 in e4af0a6)?

They have different inputs and outputs. It may be possible to merge them, but that would take some time and result in one more complicated function.
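
For readers without the diff at hand, a parser with the signature shown above could look like the sketch below. The body is an assumption, not the code from the PR: it presumes a "start/stop/step" string in hours with an inclusive end point, and uses a different name to make clear it is illustrative.

```python
def parse_steps_sketch(steps: str) -> list[int]:
    """Expand an assumed 'start/stop/step' string into a list of lead times.

    The end point is included, e.g. '0/12/6' -> [0, 6, 12].
    """
    parts = steps.split("/")
    if len(parts) != 3:
        raise ValueError(f"expected 'start/stop/step', got {steps!r}")
    start, stop, step = (int(p) for p in parts)
    if step <= 0:
        raise ValueError("step must be a positive integer")
    return list(range(start, stop + 1, step))


print(parse_steps_sketch("0/12/6"))  # [0, 6, 12]
```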

frazane · 2026-05-21T10:12:10Z

+        """
+
+
+# link_mec_input: create the input_mod dir with symlinks to all fc files from all source inits


This rule is not creating symlinks, but copies. Didn't we want to avoid this?

I implemented a version with only symlinks. However, this did not work because all fields needed to calculate precipitation must be in one file. This is a consequence of the basic way MEC works: it reads a GRIB file, does all the processing, and then reads the next file. The current version copies only the data that is really needed, which reduces the amount of data considerably. In the first version all inference output was copied.

If we want to save disk space, we could simply remove the mec directory. This could be done once the workflow is consolidated. Then no disk space is used unnecessarily at the end of the workflow. The feedback files are stored separately.

If the grib writing will be in one file - that would solve this as well.

I added a docstring explaining what this rule does.

dnerini and others added 30 commits October 7, 2025 14:01

Initial draft (pseudo code)

c1375ab

add namelist as resource

9f608f2

add verif_obs.smk to Snakefile

e82bd94

Add rules for observation data and namelist generation (using fake data)

c3ab651

add newline to namelist template

7512d96

somewhat working version of run_mec (with fake data)

13301a5

correct typo and add optional script for generating namelist, in case…

e722e5f

… we want to factor it out of the rule

fix: add localrule to inference_interpolator rule (#57)

3d9e3c1

Fix for interpolator rule

918913f

Consolidate multi packages into unique src/ dir (#58)

179eb4d

Update configs (#63)

e791a30

Adopt 'steps' instead of 'lead_time' (#62)

d197712

Update example config for experiment with interpolators (#70)

9568987

Distinguish between primary runs ('candidates') and secondary runs (#64)

128eb91

* Distinguish between primary runs ('candidates') and secondary runs * Docstrings

Mrb 550 inconcsistent forecast initializations in evalml (#72)

e028f59

Update vega-lite spec (#69)

5406777

Decouple inference preparation and execution (#68)

8d01490

* draft changes * rename workspace resources dir * working for config/forecasters.yaml * improve logging * works for interpolators.yaml * re-add get_leadtime function * refactor run directives into script

input data and namelist for MEC

04c4cf1

Merge remote-tracking branch 'origin/main' into MRB-534-Implement-rul…

b1959dc

…e-to-generate-namelist

Cleanup

23c9599

Refactor MEC namelist generation

804455a

setup MEC case

f793d85

add use of local MEC executable and cleaning

3839476

Support of mec in a sarus container

5b58b7a

First draft of FFV2 rules

569d713

change some params to fix wildcard issue

e6eb2cc

change name of nl file

ce90890

make note about ver ens member

29ab980

andreaspauling closed this Apr 1, 2026

andreaspauling reopened this Apr 1, 2026

Andreas Pauling and others added 18 commits April 7, 2026 11:39

Avoid duplicating model data and update to inn env

4845b6e

Merge branch 'main' into MRB-534-Implement-rule-to-generate-namelist

ef5fb82

fixes after merge with main, support for ICON

097c58f

Merge branch 'main' into MRB-534-Implement-rule-to-generate-namelist

5532469

wildcard fixes

331e67b

support precipitation differencing

bdf12f6

Merge branch 'main' into MRB-534-Implement-rule-to-generate-namelist

a572bbe

Merge remote-tracking branch 'refs/remotes/origin/MRB-534-Implement-r…

634e2d7

…ule-to-generate-namelist' into MRB-536-for-review

Attempt to merge from MRB-534-Implement-rule-to-generate-namelist again

2d66b40

cleanup

d3cc2a5

support ver-files as observation source

f3d45c7

add --mec --ffv2 options, paths in config, support of date lists

21d70b4

logging, cleaning

829932c

Run MEC outside the forecast run directory

8ab3393

cleanup

d5afdf2

Remove trailing whitespace

5396f20

ffv2 config update

332bd1e

Merge branch 'main' into MRB-534-Implement-rule-to-generate-namelist

0d65d78

andreaspauling requested review from dnerini and frazane May 21, 2026 08:35

frazane reviewed May 21, 2026

andreaspauling added 2 commits May 21, 2026 16:15

updates PR review

4595921

Merge branch 'main' into MRB-534-Implement-rule-to-generate-namelist

1f60ccf

andreaspauling requested a review from frazane May 21, 2026 14:37

andreaspauling added 2 commits May 27, 2026 14:28

Merge branch 'main' into MRB-534-Implement-rule-to-generate-namelist

93d9354

formatting

a3b1aba
