Entrants' System Descriptions
CASC-J13
ProoVer 2026
CASC-J13
LEO-II 1.7.0
Alexander Steen
University of Greifswald, Germany
Architecture
LEO-II
[BP+08],
the successor of LEO
[BK98],
is a higher-order ATP system based on extensional higher-order resolution.
More precisely, LEO-II employs a refinement of extensional higher-order
RUE resolution
[Ben99].
LEO-II is designed to cooperate with specialist systems for fragments of
higher-order logic.
By default, LEO-II cooperates with the first-order ATP system E
[Sch02].
LEO-II is often too weak to find a refutation amongst the steadily growing
set of clauses on its own.
However, some of the clauses in LEO-II's search space attain a special
status: they are first-order clauses modulo the application of an
appropriate transformation function.
Therefore, LEO-II launches a cooperating first-order ATP system every n
iterations (e.g., n = 10) of its (standard) resolution proof search loop.
If the first-order ATP system finds a refutation, it communicates its success
to LEO-II in the standard SZS format.
Communication between LEO-II and the cooperating first-order ATP system
uses the TPTP language and standards.
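The cooperation scheme described above can be sketched roughly as follows. This is only an illustration, not LEO-II's actual code (LEO-II is written in OCaml): the callables `step`, `is_first_order`, and `call_fo_prover` are hypothetical stand-ins, and the sketch assumes the first-order prover reports success with the SZS status `Unsatisfiable`.

```python
from typing import Callable, Iterable, List, Optional


def cooperative_search(
    step: Callable[[List[str]], List[str]],
    is_first_order: Callable[[str], bool],
    call_fo_prover: Callable[[List[str]], Optional[str]],
    clauses: Iterable[str],
    n: int = 10,
    max_iterations: int = 1000,
) -> Optional[str]:
    """Run a search loop and consult a first-order prover every n iterations.

    All three callables are hypothetical placeholders:
      step            performs one iteration of the higher-order loop and
                      returns the updated clause set
      is_first_order  tests whether a clause is first-order after translation
      call_fo_prover  returns an SZS status string, or None if it gave up
    """
    clause_set = list(clauses)
    for iteration in range(1, max_iterations + 1):
        clause_set = step(clause_set)
        if iteration % n == 0:
            # Hand only the first-order fragment to the external prover.
            fo_fragment = [c for c in clause_set if is_first_order(c)]
            status = call_fo_prover(fo_fragment)
            if status == "Unsatisfiable":  # assumed success status
                return status
    return None
```

The sketch is sequential, matching the simple collaboration model the description mentions: the main loop waits for each external call to return.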
Strategies
LEO-II employs an adapted "Otter loop".
Moreover, LEO-II uses some basic strategy scheduling to try different
search strategies or flag settings.
These search strategies also include some different relevance filters.
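As a rough illustration of basic strategy scheduling, the sketch below splits a time budget across strategies in proportion to assigned shares and stops at the first success. The strategy names, the shares, and the `run_strategy` callable are all hypothetical; the description does not state how LEO-II divides its time.

```python
from typing import Callable, List, Optional, Tuple


def run_schedule(
    schedule: List[Tuple[str, float]],
    run_strategy: Callable[[str, float], Optional[str]],
    total_time: float,
) -> Optional[Tuple[str, str]]:
    """Try each (strategy, share) in order; return the first success.

    run_strategy is a hypothetical callable taking a strategy name and a
    time budget in seconds, returning a result or None on failure.
    """
    total_share = sum(share for _, share in schedule)
    for name, share in schedule:
        budget = total_time * share / total_share
        result = run_strategy(name, budget)
        if result is not None:
            return name, result
    return None
```

A real scheduler would typically also pass unused time from a strategy that fails early on to the remaining ones; that is left out here for brevity.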
Implementation
LEO-II is implemented in OCaml 4, and its problem representation language
is the TPTP THF language
[BRS08].
In fact, the development of LEO-II has largely paralleled the development
of the TPTP THF language and related infrastructure
[SB10].
LEO-II's parser supports the TPTP THF0 language and also the TPTP languages
FOF and CNF.
Unfortunately, the LEO-II system still uses only a very simple
sequential collaboration model with first-order ATPs instead of using
the more advanced, concurrent and resource-adaptive OANTS architecture
[BS+08]
as exploited by its predecessor LEO.
The LEO-II system is distributed under a BSD style license, and it is
available from
http://www.leoprover.org
Expected Competition Performance
LEO-II is no longer actively developed, so no improvement over last year's CASC results is
expected.
Prover9 1109a
Bob Veroff on behalf of William McCune
University of New Mexico, USA
Architecture
Prover9, Version 2009-11A, is a resolution/paramodulation prover for first-order logic with
equality.
Its overall architecture is very similar to that of Otter-3.3
[McC03].
It uses the "given clause algorithm", in which not-yet-given clauses are available for rewriting
and for other inference operations (sometimes called the "Otter loop").
Prover9 has available positive ordered (and nonordered) resolution and paramodulation, negative
ordered (and nonordered) resolution, factoring, positive and negative hyperresolution,
UR-resolution, and demodulation (term rewriting).
Terms can be ordered with LPO, RPO, or KBO.
Selection of the "given clause" is by an age-weight ratio.
Proofs can be given at two levels of detail:
(1) standard, in which each line of the proof is a stored clause with detailed justification, and
(2) expanded, with a separate line for each operation.
When FOF problems are input, the proof of the transformation to clauses is not given.
Completeness is not guaranteed, so termination does not indicate satisfiability.
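To make the given clause algorithm and age-weight selection concrete, here is a toy sketch for propositional clauses with binary resolution only. It is not Prover9's implementation: the clause weight is assumed to be the number of literals, the ratio is applied as a simple cycle, and simplification of not-yet-given clauses (rewriting, subsumption) is omitted.

```python
from typing import FrozenSet, Iterable, List, Optional

Clause = FrozenSet[int]  # a literal is a nonzero int; -p is the negation of p


def resolvents(c1: Clause, c2: Clause) -> List[Clause]:
    """All non-tautological binary resolvents of c1 and c2."""
    out = []
    for lit in c1:
        if -lit in c2:
            res = (c1 - {lit}) | (c2 - {-lit})
            if not any(-l in res for l in res):  # drop tautologies
                out.append(frozenset(res))
    return out


def select_given(passive: List[Clause], turn: int, age: int, weight: int) -> Clause:
    """Pick the oldest clause for `age` turns, then the lightest for `weight` turns."""
    if turn % (age + weight) < age:
        index = 0  # passive is kept in insertion order, so index 0 is oldest
    else:
        index = min(range(len(passive)), key=lambda i: len(passive[i]))
    return passive.pop(index)


def refute(clauses: Iterable[Iterable[int]], age: int = 1, weight: int = 4,
           max_given: int = 10_000) -> Optional[bool]:
    """True if the empty clause is derived, False if saturated, None if cut off."""
    passive = list(dict.fromkeys(frozenset(c) for c in clauses))
    if frozenset() in passive:
        return True
    seen = set(passive)
    active: List[Clause] = []
    for turn in range(max_given):
        if not passive:
            return False
        given = select_given(passive, turn, age, weight)
        active.append(given)
        for other in active:
            for res in resolvents(given, other):
                if not res:
                    return True
                if res not in seen:
                    seen.add(res)
                    passive.append(res)
    return None
```

For example, `refute([[1, 2], [-1, 2], [-2]])` derives the empty clause, while `refute([[1, 2], [-1]])` saturates without it. In this propositional toy, saturation can be read as satisfiability; as noted above, that conclusion does not carry over to Prover9 as configured for CASC.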
Strategies
Prover9 has available many strategies; the following statements apply to CASC.
Given a problem, Prover9 adjusts its inference rules and strategy according to syntactic
properties of the input clauses such as the presence of equality and non-Horn clauses.
Prover9 also does some preprocessing, for example, to eliminate predicates.
For CASC Prover9 uses KBO to order terms for demodulation and for the inference rules, with a
simple rule for determining symbol precedence.
For the FOF problems, a preprocessing step attempts to reduce the problem to independent
subproblems by a miniscope transformation; if the problem reduction succeeds, each
subproblem is clausified and given to the ordinary search procedure; if the problem reduction
fails, the original problem is clausified and given to the search procedure.
Implementation
Prover9 is coded in C, and it uses the LADR libraries.
Some of the code descended from EQP
[McC97].
(LADR has some AC functions, but Prover9 does not use them.)
Term data structures are not shared (as they are in Otter).
Term indexing is used extensively, with discrimination tree indexing for finding rewrite rules
and subsuming units, and FPA/Path indexing for finding subsumed units, rewritable terms, and
resolvable literals.
Feature vector indexing
[Sch04]
is used for forward and backward nonunit subsumption.
Prover9 is available from
http://www.cs.unm.edu/~mccune/prover9/
Expected Competition Performance
Prover9 is the CASC fixed point, against which progress can be judged.
Each year it is expected to do worse than the previous year, relative to the other systems.
Vampire 5.0
Michael Rawson
University of Southampton, United Kingdom
Vampire 5.0 remains similar in spirit to all previous versions, but a bumper crop of changes have
been merged this competition cycle.
Various non-competition improvements to Vampire including a program synthesis mode
[HA+24]
and partial support for the polymorphic SMT-LIB 2.7 standard landed, but for the competition we
mention:
- ALASCA
[KK+23]
for reasoning with linear arithmetic, with further VIRAS extensions
[SKK24]
for quantifier elimination.
- Partial redundancy calculi
[HKV25]
- Stabilised and greatly enhanced runtime-specialised unidirectional term ordering checks
[HC+25]
- A variant of the ground joinability redundancy elimination rule, used in forward
simplification.
- Subsumption (resolution) via code trees.
- Integration of the CaDiCaL SAT solver
[BF+24]
alongside Minisat.
- More detailed output, including proofs that are (more) TSTP-compliant, reporting non-trivial
preprocessing in saturations, and producing completely faithful finite models of the input.
- Portability: Vampire is much more standards-compliant and portable than previously, with
much-reduced dependence on platform-specific APIs and hardware architectures, aided by C++17.
Vampire's higher-order support remains very similar to last year, although a re-implementation
intended for mainline Vampire is being merged in stages.
Architecture
Vampire
[BB+25]
is an automatic theorem prover for first-order logic with extensions to theory-reasoning and
higher-order logic.
Vampire implements several extensions of a core superposition calculus.
It also implements a MACE-style finite model builder for finding finite counter-examples
[RSV16].
Splitting in saturation-based proof search is controlled by the AVATAR architecture which uses a
SAT or SMT solver to make splitting decisions
[Vor14,
RB+16].
A number of standard redundancy criteria and simplification techniques are used for pruning the
search space: subsumption, tautology deletion, subsumption resolution and rewriting by ordered
unit equalities.
Substitution tree and code tree indices are used to implement all major operations on sets of
terms, literals and clauses.
Internally, Vampire works only with clausal normal form: problems not already in CNF are clausified
during preprocessing
[RSV16].
Vampire implements many preprocessing transformations, including the SInE axiom selection algorithm
for large theories and blocked clause elimination.
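Three of the redundancy criteria named above (tautology deletion, subsumption, and subsumption resolution) can be illustrated on ground clauses, where they reduce to set operations. This is a simplified sketch under that ground-only assumption: Vampire works with first-order clauses, where these checks involve matching substitutions and are supported by term indexing.

```python
from typing import FrozenSet, Optional

Clause = FrozenSet[str]  # ground literals as strings; "~p" is the negation of "p"


def negate(lit: str) -> str:
    return lit[1:] if lit.startswith("~") else "~" + lit


def is_tautology(clause: Clause) -> bool:
    """A clause containing a literal and its negation is always true."""
    return any(negate(lit) in clause for lit in clause)


def subsumes(c: Clause, d: Clause) -> bool:
    """In the ground case, c subsumes d when every literal of c occurs in d."""
    return c <= d


def subsumption_resolution(side: Clause, main: Clause) -> Optional[Clause]:
    """If side is (C or L) and main is (D or not-L) with C a subset of D,
    return D, which can replace main; otherwise return None."""
    for lit in side:
        comp = negate(lit)
        if comp in main and (side - {lit}) <= (main - {comp}):
            return main - {comp}
    return None
```

For instance, with side clause `{"p", "q"}` and main clause `{"~p", "q", "r"}`, subsumption resolution yields `{"q", "r"}`, which subsumes the main clause and can therefore replace it.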
Strategies
Vampire 5.0 provides a very large number of options for strategy selection.
The most important ones are:
- Choices of saturation algorithm:
- Limited Resource Strategy
[RV03]
- DISCOUNT loop
- Otter loop
- MACE-style finite model building with sort inference
- Splitting via AVATAR
[Vor14]
- A variety of optional simplifications.
- Parameterized reduction orderings KBO and LPO.
- A number of built-in literal selection functions and different modes of comparing literals
[HR+16].
- Age-weight ratio that specifies how strongly lighter clauses are preferred for inference
selection.
This has been extended with a layered clause selection approach
[GS20].
- The set-of-support strategy with extensions for theory reasoning.
- For theory reasoning:
- Specialised calculi such as ALASCA.
- Addition of theory axioms and evaluation of interpreted functions
[RSV21].
- Use of Z3 with AVATAR to restrict search to ground-theory-consistent splitting branches
[RB+16].
- Specialised theory instantiation and unification
[RSV18].
- Extensionality resolution with detection of extensionality axioms
Implementation
Vampire 5.0 is implemented in C++.
It makes use of fixed versions of Minisat, CaDiCaL, GMP, VIRAS, and Z3.
See the GitHub repository and associated wiki
for more information.
Expected Competition Performance
Vampire 5.0 is the CASC-30 THF, FOF, and UEQ winner.
ProoVer 2026
GDV 2.0
Geoff Sutcliffe
University of Miami, USA
Overview
GDV 2.0 [Sut06] is
a verifier for derivations in classical first-order and typed first-order logic, written in
the TPTP format.
GDV checks a derivation in four verification phases: structural verification, leaf verification,
rule-specific verification, and inference verification.
- Structural verification deals with non-logical aspects of a proof, including checking the
syntax, that formulae are uniquely named, the derivation is acyclic, refutations have
false roots, etc.
- Leaf verification ensures that the leaves of the derivation match formulae in the original
input problem, and that introduced formulae such as definitions meet requirements.
- Rule-specific verification deals with special cases, e.g., splitting as implemented in SPASS
[Wei01].
Special techniques are available for verifying correctly documented Skolemization steps; see
Section 2.3 of
[SBB25].
- Inference verification uses external trusted ATP systems to verify each inference step, based
on the SZS status
[Sut08]
of the inference record.
For example, inference steps often have the SZS status thm indicating that the
inferred formula is (supposed to be) a theorem of the parent formulae.
In this case a proof obligation with the parent formulae as the axioms and the inferred
formula as the conjecture is created, and discharged (or not) using a trusted theorem proving
ATP system.
Other SZS status values are treated with variants of that process.
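Two of the phases above can be sketched briefly: the acyclicity check from structural verification, and building a proof obligation for an inference with SZS status thm. This is an illustration only, not GDV's code (GDV is written in C). The `Derivation` dictionary is a hypothetical in-memory representation, and the obligation is written as plain FOF annotated formulae.

```python
from typing import Dict, List, Tuple

# Hypothetical representation: name -> (formula text, names of parent formulae).
# Leaves of the derivation have an empty parent list.
Derivation = Dict[str, Tuple[str, List[str]]]


def is_acyclic(derivation: Derivation) -> bool:
    """Depth-first search; reaching a node still in progress means a cycle."""
    UNSEEN, IN_PROGRESS, DONE = 0, 1, 2
    state = {name: UNSEEN for name in derivation}

    def visit(name: str) -> bool:
        state[name] = IN_PROGRESS
        for parent in derivation[name][1]:
            if state.get(parent) == IN_PROGRESS:
                return False
            if state.get(parent) == UNSEEN and not visit(parent):
                return False
        state[name] = DONE
        return True

    return all(state[name] != UNSEEN or visit(name) for name in derivation)


def thm_obligation(derivation: Derivation, name: str) -> str:
    """Parents become axioms and the inferred formula becomes the conjecture."""
    formula, parents = derivation[name]
    lines = [f"fof({p},axiom,{derivation[p][0]})." for p in parents]
    lines.append(f"fof({name},conjecture,{formula}).")
    return "\n".join(lines)
```

The resulting problem text would then be handed to a trusted ATP system; the inference step is accepted if that system proves the conjecture from the axioms.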
Implementation
GDV is implemented in C.
It is available from:
https://github.com/TPTPWorld/GDV
GDV relies heavily on the JJParser library, which has to be downloaded separately into the same
directory as GDV:
https://github.com/TPTPWorld/JJParser
The external ATP systems run remotely, through the SystemOnTPTP service
[Sut00].
Expected Competition Performance
The short time limit of 30s per proof, and the relatively slow process of running the remote
ATP systems, mean that GDV is likely to time out often.
GDV is sound, slow, but very sure of itself.
GDV-LP 2.0
Frédéric Blanqui
ENS Paris-Saclay, INRIA, France
Overview
GDV-LP 2.0
[SBB25]
is a verifier for derivations in classical first-order and typed first-order logic, written in
the TPTP format.
GDV-LP checks a derivation in two steps:
- Standard GDV (as described above) is run, using ZenonModulo
[DD+13]
as the trusted ATP system for discharging proof obligations.
ZenonModulo is configured to output a LambdaPi term
[HB20]
for each discharge proof.
- GDV-LP produces the necessary files that declare the formulae, the signatures of the symbols
in the terms, a LambdaPi term for the root of the proof, and a lambdapi.pkg package
file.
ZenonModulo's LambdaPi terms are chained together from the root term, and passed to the
lambdapi checker to be checked.
The key strength of this added layer is that it is not necessary to trust the "trusted
ATP system", here ZenonModulo.
Additionally, tools other than lambdapi can be used to check the LambdaPi terms,
e.g., dkcheck
[Sai15]
and kontroli
[Far22].
Implementation
GDV-LP is implemented in C.
It is available from:
https://github.com/TPTPWorld/GDV
GDV relies heavily on the JJParser library, which has to be downloaded separately into the same
directory as GDV:
https://github.com/TPTPWorld/JJParser
ZenonModulo is implemented in OCaml.
It is available from:
https://github.com/Deducteam/zenon_modulo
lambdapi is written in OCaml.
It is available from:
https://github.com/Deducteam/lambdapi.git
ZenonModulo, other external ATP systems, and lambdapi run remotely, through the
SystemOnTPTP service
[Sut00].
Expected Competition Performance
The short time limit of 30s per proof, and the relatively slow process of running the remote
ATP systems, mean that GDV-LP is likely to time out often, probably in the first non-LambdaPi
step.
GDV-LP is sound, slow, but very very sure of itself.