Promptolution is a unified, modular framework for prompt optimization, built for researchers and advanced practitioners who want full control over their experimental setup. Unlike end-to-end application frameworks with high levels of abstraction, promptolution focuses exclusively on the optimization stage and provides a clean, transparent, and extensible API. It scales from optimizing a prompt for a single task to large-scale, reproducible benchmark experiments.
- Implementation of many current prompt optimizers out of the box.
- Unified LLM backend supporting API-based models, local LLMs, and vLLM clusters.
- Built-in response caching to save costs and parallelized inference for speed.
- Detailed logging and token usage tracking for granular post-hoc analysis.
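
To make the caching and parallel-inference bullet concrete, here is a minimal sketch of the general idea. It is not promptolution's implementation: `CachedLLM`, `get_responses`, and `fake_model` are hypothetical names invented for this illustration, and the "model" is a plain function standing in for a paid API call.

```python
from concurrent.futures import ThreadPoolExecutor
from threading import Lock


class CachedLLM:
    """Toy wrapper (hypothetical): caches responses and runs cache misses in parallel."""

    def __init__(self, generate_fn, max_workers=4):
        self.generate_fn = generate_fn  # callable: prompt -> response
        self.max_workers = max_workers
        self.cache = {}
        self.calls = 0  # number of real (uncached) model calls
        self._lock = Lock()

    def _call(self, prompt):
        response = self.generate_fn(prompt)
        with self._lock:
            self.calls += 1
            self.cache[prompt] = response

    def get_responses(self, prompts):
        # Deduplicate while keeping order, and skip prompts already cached.
        missing = [p for p in dict.fromkeys(prompts) if p not in self.cache]
        with ThreadPoolExecutor(max_workers=self.max_workers) as pool:
            list(pool.map(self._call, missing))
        return [self.cache[p] for p in prompts]


def fake_model(prompt):
    # Stand-in for an expensive API request.
    return prompt.upper()


llm = CachedLLM(fake_model)
first = llm.get_responses(["a", "b", "a"])   # two real calls: "a" and "b"
second = llm.get_responses(["a", "b", "c"])  # one real call: only "c" is new
print(first, second, llm.calls)
```

Repeated prompts are common during optimization because the same candidate is often re-evaluated on the same examples, which is where a cache of this kind saves cost.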
Have a look at our Release Notes for the latest updates to promptolution.
- CANTANTE: Optimizing Agentic Systems via Contrastive Credit Attribution - Zehle, 2026. arXiv
- MO-CAPO: Multi-Objective Cost-Aware Prompt Optimization - Büssing et al., 2026. arXiv
- promptolution: A Unified, Modular Framework for Prompt Optimization - Zehle et al., 2026. EACL 2026
- Can Calibration of Positional Encodings Enhance Long Context Utilization? - Zehle & Aßenmacher, 2026. EACL 2026
- Disambiguation-Centric Finetuning Makes Enterprise Tool-Calling LLMs More Realistic and Less Risky - Hathidara et al., 2025. arXiv
- CAPO: Cost-Aware Prompt Optimization - Zehle et al., 2025. AutoML 2025
pip install "promptolution[api]"
For local inference, add [transformers] (HuggingFace) or [vllm] (vLLM serving), or both.
import pandas as pd
from promptolution.utils import ExperimentConfig
from promptolution.helpers import run_experiment
# DataFrame with columns "x" (input) and "y" (label)
df = pd.read_csv("your_data.csv")

config = ExperimentConfig(
    optimizer="capo",
    task_description="Classify each sentence as subjective or objective.",
    prompts=["Classify the text as objective or subjective."],
    n_steps=10,
    api_url="https://api.openai.com/v1",
    model_id="gpt-4o-mini",
    api_key="YOUR_API_KEY",
)

best_prompts = run_experiment(df, config)
print(best_prompts)

Full tutorial: Getting Started notebook · Docs
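
The quick-start snippet expects a CSV with an input column "x" and a label column "y". As a minimal sketch of producing such a file with only the standard library, the example below writes two made-up sentences to `your_data.csv` in a temporary directory. The example rows and the helpers `write_dataset` and `read_dataset` are assumptions for illustration and are not part of promptolution.

```python
import csv
import os
import tempfile

# Hypothetical raw examples: (sentence, label) pairs.
examples = [
    ("The film runs for 120 minutes.", "objective"),
    ("The film is a tedious mess.", "subjective"),
]


def write_dataset(path, pairs):
    """Write (input, label) pairs to a CSV with the header x,y."""
    with open(path, "w", newline="", encoding="utf-8") as f:
        writer = csv.writer(f)
        writer.writerow(["x", "y"])
        writer.writerows(pairs)


def read_dataset(path):
    """Read the CSV back as a list of {"x": ..., "y": ...} dicts."""
    with open(path, newline="", encoding="utf-8") as f:
        return list(csv.DictReader(f))


path = os.path.join(tempfile.mkdtemp(), "your_data.csv")
write_dataset(path, examples)
rows = read_dataset(path)
print(rows[0]["x"], "->", rows[0]["y"])
```

A file written this way has the column layout that the `pd.read_csv` call in the quick start reads into `df`.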
| Name | Paper | Init prompts | Exploration | Costs | Parallelizable | Few-shot |
|---|---|---|---|---|---|---|
| CAPO | Zehle et al., 2025 | required | ? | 💲 | ? | ? |
| EvoPromptDE | Guo et al., 2023 | required | ? | 💲💲 | ? | ? |
| EvoPromptGA | Guo et al., 2023 | required | ? | 💲💲 | ? | ? |
| OPRO | Yang et al., 2023 | optional | ? | 💲💲 | ? | ? |
- `Task`: Manages the dataset, evaluation metrics, and subsampling.
- `Predictor`: Defines how to extract the answer from the model's response.
- `LLM`: A unified interface handling inference, token counting, and concurrency.
- `Optimizer`: The core component that implements the algorithms that refine prompts.
- `ExperimentConfig`: A configuration abstraction to streamline and parametrize large-scale scientific experiments.
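
The division of labor between these components can be sketched with toy stand-ins. Everything below is hypothetical and deliberately simplified: the `Toy*` classes and their method names are invented for this illustration and do not reflect promptolution's actual class signatures, and the "model" is a rule-based stub rather than a real LLM.

```python
class ToyLLM:
    """Stub model: only answers usefully if the prompt mentions 'subjective'."""

    def get_response(self, prompt, text):
        if "subjective" not in prompt.lower():
            return "I am not sure."
        label = "subjective" if "!" in text else "objective"
        return f"Answer: {label}"


class ToyPredictor:
    """Extracts the final answer from the raw model response."""

    def __init__(self, llm):
        self.llm = llm

    def predict(self, prompt, text):
        response = self.llm.get_response(prompt, text)
        if "Answer:" not in response:
            return ""
        return response.split("Answer:")[-1].strip()


class ToyTask:
    """Holds the data and computes accuracy for a given prompt."""

    def __init__(self, xs, ys):
        self.xs, self.ys = xs, ys

    def evaluate(self, prompt, predictor):
        hits = sum(predictor.predict(prompt, x) == y for x, y in zip(self.xs, self.ys))
        return hits / len(self.xs)


class ToyOptimizer:
    """Scores fixed candidates and keeps the best; real optimizers also generate new ones."""

    def __init__(self, task, predictor, prompts):
        self.task, self.predictor, self.prompts = task, predictor, prompts

    def optimize(self):
        scored = [(self.task.evaluate(p, self.predictor), p) for p in self.prompts]
        return max(scored, key=lambda pair: pair[0])  # (score, prompt)


task = ToyTask(["What a great day!", "Water boils at 100 C."], ["subjective", "objective"])
predictor = ToyPredictor(ToyLLM())
candidates = ["Label the text.", "Classify the text as objective or subjective."]
best_score, best_prompt = ToyOptimizer(task, predictor, candidates).optimize()
print(best_score, best_prompt)
```

Keeping answer extraction, scoring, and search in separate objects is what lets each piece be swapped independently, for example trying a different optimizer on the same task and predictor.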
Contributions are welcome! See CONTRIBUTING.md for the workflow, code quality guidelines, and how to run tests.
If you use Promptolution in your research, please cite:
@inproceedings{zehle2026promptolution,
  title={promptolution: A unified, modular framework for prompt optimization},
  author={Zehle, Tom and Hei{\ss}, Timo and Schlager, Moritz and A{\ss}enmacher, Matthias and Feurer, Matthias},
  booktitle={Proceedings of the 19th Conference of the European Chapter of the Association for Computational Linguistics (Volume 3: System Demonstrations)},
  pages={282--296},
  year={2026}
}

Developed by Timo Heiß, Moritz Schlager, Tom Zehle, and Henri Oberpaur (LMU Munich, MCML, ELLIS, TUM, Uni Freiburg).





