`sc.get.aggregate` memory leak for Dask array #4074

### Please make sure these conditions are met

- [x] I have checked that this issue has not already been reported.
- [x] I have confirmed this bug exists on the latest version of scanpy.
- [x] (optional) I have confirmed this bug exists on the main branch of scanpy.

### What happened?

When aggregating a large dataset (>8M cells) into many pseudobulks (>100k groups) with `sc.get.aggregate`, I noticed that memory usage explodes. This defeats the purpose of using Dask instead of loading everything into memory.


### Minimal code sample

```python
# /// script
# requires-python = ">=3.12"
# dependencies = [
#   "scanpy@git+https://github.com/scverse/scanpy.git@main",
# ]
# ///
#
# This script automatically imports the development branch of scanpy to check for issues

import scanpy as sc
import anndata as ad

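# Placeholder path: substitute any large on-disk AnnData file (>8M cells).
adata = ad.experimental.read_lazy("<large file with >8M cells>")
# Load only the obs table into memory; X stays a lazy Dask array.
adata.obs = adata.obs.to_memory()

# One pseudobulk group per donor/cluster combination (>100k groups in my data).
adata.obs['group'] = adata.obs[['donor_id', 'cluster']].astype(str).agg('-'.join, axis=1)
pb_data = sc.get.aggregate(adata, 'group', 'sum')

# Materialize the aggregated layer (triggers the Dask computation).
pb_data.layers['sum'].compute()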
```
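
To put a number on the memory growth, peak resident memory can be recorded around the `compute()` call. The following is a minimal sketch using only the standard library, not something taken from scanpy: `peak_rss_mib` is a hypothetical helper name, the unit handling assumes Linux or macOS, and the allocation in the middle is a stand-in for `pb_data.layers['sum'].compute()`.

```python
import resource
import sys


def peak_rss_mib() -> float:
    """Return the peak resident set size of this process so far, in MiB."""
    peak = resource.getrusage(resource.RUSAGE_SELF).ru_maxrss
    # ru_maxrss is reported in KiB on Linux and in bytes on macOS.
    divisor = 1024 * 1024 if sys.platform == "darwin" else 1024
    return peak / divisor


before = peak_rss_mib()

# Stand-in workload (assumption): allocate and fill 64 MiB.
# In the report above this line would be `pb_data.layers['sum'].compute()`.
buffer = bytes(range(256)) * (256 * 1024)

after = peak_rss_mib()
print(f"peak RSS before: {before:.0f} MiB, after: {after:.0f} MiB")
```

Because `ru_maxrss` is a high-water mark, the difference between the two readings only shows growth beyond the earlier peak, which is the quantity of interest when comparing against the expected size of the aggregated result.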

### Error output

```pytb
There appear to be 2 leaked semaphore objects to clean up at shutdown
```

### Versions

<details>

```
scanpy  1.12.1
----    ----
wrapt   2.1.2
stack_data      0.6.3
fast-array-utils        1.4.1
matplotlib      3.10.8
anndata 0.12.10
scikit-learn    1.8.0
packaging       26.1
parso   0.8.6
h5py    3.16.0
traitlets       5.14.3
cycler  0.12.1
six     1.17.0
pyarrow 23.0.1
psutil  6.1.1
legacy-api-wrap 1.5
typing_extensions       4.15.0
MarkupSafe      3.0.3
prompt_toolkit  3.0.52
scipy   1.16.3
Deprecated      1.3.1
PyYAML  6.0.3
ipython 9.12.0
executing       2.2.1
pytz    2026.1.post1
numcodecs       0.15.1
msgpack 1.1.2
sparse  0.18.0
colorama        0.4.6
tblib   3.2.2
natsort 8.4.0
joblib  1.5.3
threadpoolctl   3.6.0
setuptools      82.0.1
decorator       5.2.1
pure_eval       0.2.3
python-dateutil 2.9.0.post0
jedi    0.19.2
pyparsing       3.3.2
asttokens       3.0.1
asciitree       0.3.3
pillow  12.2.0
xarray  2026.4.0
numba   0.65.0
Pygments        2.20.0
cloudpickle     3.1.2
kiwisolver      1.5.0
llvmlite        0.47.0
numpy   2.4.4
session-info2   0.4.1
zarr    2.18.7
pandas  2.3.3
Jinja2  3.1.6
wcwidth 0.6.0
dask    2024.7.1
toolz   1.1.0
----    ----
Python  3.12.13 | packaged by conda-forge | (main, Mar  5 2026, 16:50:00) [GCC 14.3.0]
OS      Linux-4.18.0-553.33.1.el8_10.x86_64-x86_64-with-glibc2.28
CPU     128/128 logical CPU cores, x86_64
GPU     No GPU found
```

</details>

