Recent Updates to the CTTAF Benchmark: Refining a Key Tool for Theological AI Alignment

June 5, 2026

The Christian Theological Triage Alignment Framework (CTTAF) Benchmark received a round of updates as of June 3, 2026. The project refined its core dataset, documentation, and supporting materials to improve quality and usability.

What Changed?

Question Dataset Improvements: The benchmark now centers on 732 high-quality theological questions (refined from prior versions). Updates emphasize richer scenarios, precision probes, natural language, self-contained prompts, and strong alignment with Mohler/Ortlund-style theological triage (Primary/Secondary/Tertiary). Files like cttaf_questions_full.csv, along with samples (_sample_10.csv and _sample_50.csv), were enhanced for variety across doctrines, denominations, and question types (objective/pastoral).
Documentation & Guides: The README, how-to-run guides, judge instructions, whitepaper sections, and evaluation scripts (e.g., evaluate_model.py and a mock evaluator for free/local runs) were updated. These changes improve reproducibility, onboarding, and the handling of dual-LLM judging with geometric mean scoring.
Overall Polish: Metadata, triage transparency, and support for multi-provider models were improved, making the benchmark more robust for researchers, theologians, and AI developers testing LLM alignment with Christian values.
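
To make the geometric mean scoring mentioned above concrete, here is a minimal sketch of how two judge scores could be combined. It is an illustration only: the function name combined_score and the 0-10 scale are assumptions, and the repository's actual scoring code may differ.

```python
import math


def combined_score(judge_a: float, judge_b: float) -> float:
    """Geometric mean of two judge scores.

    Hypothetical helper: assumes both scores are non-negative and share
    one scale (0-10 is used here purely for illustration).
    """
    if judge_a < 0 or judge_b < 0:
        raise ValueError("judge scores must be non-negative")
    return math.sqrt(judge_a * judge_b)


# When the judges disagree, the geometric mean sits below the plain average,
# so a single generous judge cannot carry a weak answer.
disagreeing = combined_score(8.0, 2.0)   # geometric mean: 4.0
plain_average = (8.0 + 2.0) / 2          # arithmetic mean: 5.0
agreeing = combined_score(5.0, 5.0)      # equal scores pass through unchanged
```

One reason a benchmark might prefer this over a plain average is that the result drops sharply whenever either judge scores low, which rewards answers both judges accept.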

This benchmark stands out for its pluralistic yet triage-informed approach: it evaluates models across foundational doctrines, secondary debates, and tertiary distinctives using dual judges and geometric mean scoring. It is openly licensed (CC-BY-SA-4.0) and includes a growing whitepaper.
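
As a rough illustration of what a per-tier view of results might look like, the sketch below averages already-combined scores within each triage tier. The function name mean_by_tier, the (tier, score) input shape, and the sample numbers are all hypothetical; they are not taken from the benchmark's files or scripts.

```python
from collections import defaultdict
from statistics import mean


def mean_by_tier(results):
    """Average scores per triage tier.

    `results` is an iterable of (tier, score) pairs; this shape is an
    assumption made for the example, not the benchmark's real output format.
    """
    buckets = defaultdict(list)
    for tier, score in results:
        buckets[tier].append(score)
    return {tier: mean(scores) for tier, scores in buckets.items()}


# Made-up scores for illustration only.
sample = [
    ("Primary", 4.0),
    ("Primary", 6.0),
    ("Secondary", 5.0),
    ("Tertiary", 3.0),
]
report = mean_by_tier(sample)
for tier, avg in report.items():
    print(f"{tier}: {avg:.2f}")
```

Reporting tiers separately keeps a weak Primary-tier result from being hidden behind strong Tertiary-tier results.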

If you’re working in AI safety, alignment, or faith-tech intersections, this is worth exploring, especially with the free local/mock modes for quick starts.

Check it out: https://github.com/thebytebar/cttaf-benchmark

Contributions are welcome via the guidelines!