UQLM (Uncertainty Quantification for Language Models) is a Python package for UQ-based LLM hallucination detection
Updated Jun 8, 2026 - Python
Lightweight hallucination detection framework for RAG applications
Up-to-date curated list of state-of-the-art research, papers, and resources on hallucination in large vision-language models
[ACL 2024] User-friendly evaluation framework: Eval Suite & Benchmarks: UHGEval, HaluEval, HalluQA, etc.
HaluMem is the first operation-level hallucination evaluation benchmark tailored to agent memory systems.
🔎 Official code for our paper: "VL-Uncertainty: Detecting Hallucination in Large Vision-Language Model via Uncertainty Estimation".
[ACL 2026] Paper list of Video LLM hallucination. Welcome to Star and Contribute!
Unofficial implementation of Microsoft’s Claimify paper: extracts specific, verifiable, decontextualized claims from LLM Q&A for use in hallucination, groundedness, relevancy, and truthfulness detection
A benchmark for evaluating hallucinations in large visual language models
Code release for THRONE, a CVPR 2024 paper on measuring object hallucinations in LVLM-generated text.
Geometric LLM grounding verification — deterministic, auditable, no second LLM. Python library for measuring how faithfully model outputs reflect their sources.
Repository for the paper "A Unified Definition of Hallucination: It’s The World Model, Stupid!" https://arxiv.org/abs/2512.21577
When AI makes $10M decisions, hallucinations aren't bugs—they're business risks. We built the verification infrastructure that makes AI agents accountable without slowing them down.
MCP server for Eruka — anti-hallucination context memory for AI agents
TrustScoreEval: Trust Scores for AI/LLM Responses. Detect hallucinations, flag misinformation, and validate outputs. Build trustworthy AI.
A blind benchmark for legal citation verification — 4-label classification over IL + federal primary law
A robust hybrid pipeline for detecting hallucinated citations in academic papers and research documents. The system combines exact bibliographic lookup, fuzzy matching, and optional LLM verification to classify citations as valid, partially valid, or hallucinated.
A comprehensive study on reducing hallucinations in Large Language Models through strategic prompt engineering techniques. (COV + COT + Hybrid)
HALLUCINATED BY CURSOR WITH CODEX PLUGIN:::BEWARE:::::BaseX Coding Language - Revolutionary Base 5.10 Quantum Teleportation & Infinite Storage System by Joshua Hendricks Cole
Moving away from binary "hallucination" evals toward a more nuanced, context-dependent evaluation technique.