Beagle first converts the input formulas into clause normal form.
Pure arithmetic (sub-)formulas are treated by eager application of quantifier
elimination.
The core reasoning component implements the Hierarchic Superposition Calculus
with Weak Abstraction (HSPWA)
[BW13].
Extensions are a splitting rule for clauses that can be divided into variable
disjoint parts, and a chaining inference rule for reasoning with inequalities.
The HSPWA calculus generalizes the superposition calculus by integrating
theory reasoning in a black-box style.
For the theories mentioned above, Beagle combines quantifier elimination
procedures and other solvers to dispatch proof obligations over these theories.
The default solvers are an improved version of Cooper's algorithm for linear
integer arithmetic, and the Fourier-Motzkin algorithm for linear real/rational
arithmetic.
Non-linear integer arithmetic is treated by partial instantiation.
Beagle uses strategy scheduling by trying (at most) two flag settings
sequentially.
Beagle's web site is
Like other SMT solvers, CVC4 treats quantified formulas using a two-tiered
approach.
First, quantified formulas are replaced by fresh boolean predicates and the
ground theory solver(s) are used in conjunction with the underlying SAT
solver to determine satisfiability.
If the problem is unsatisfiable at the ground level, then the solver answers
"unsatisfiable".
Otherwise, the quantifier instantiation module is invoked, and will either
add instances of quantified formulas to the problem, answer "satisfiable",
or return unknown.
The finite model finding has been developed to target problems containing
background theories, whose quantification is limited to finite and
uninterpreted sorts.
In finite model finding mode, CVC4 uses a ground theory of finite cardinality
constraints that minimizes the number of ground equivalence classes, as
described in [RT+13].
When the problem is satisfiable at the ground level, a candidate model is
constructed that contains complete interpretations for all predicate and
function symbols.
Quantifier instantiation strategies are then invoked to add instances of
quantified formulas that are in conflict with the candidate model, as
described in [RT+13].
If no instances are added, then the solver reports "satisfiable".
Beagle 0.9.22
Peter Baumgartner
NICTA and Australian National University, Australia
Architecture
Beagle is an automated theorem prover for sorted first-order logic with
equality over built-in theories.
The theories currently supported are integer arithmetic, linear rational
arithmetic and linear real arithmetic.
It accepts formulas in the FOF and TFF formats of the TPTP syntax, and
formulas in the SMT-LIB version 2 format.
Strategies
Beagles uses the Discount loop for saturating a clause set under the calculus'
inference rules.
Simplification techniques include standard ones, such as subsumption deletion,
demodulation by ordered unit equations, and tautology deletion.
It also includes theory specific simplification rules for evaluating
ground (sub)terms, and for exploiting cancellation laws and properties of
neutral elements, among others.
In the competition an aggressive form of arithmetic simplification is used,
which seems to perform best in practice.
Implementation
Beagle is implemented in Scala.
It is a full implementation of the HSPWA calculus.
It uses a simple form of indexing, essentially top-symbol hashes, stored
with each term and computed in a lazy way.
Fairness is achieved through a combination of measuring clause weights
and their derivation-age.
It can be fine-tuned with a weight-age ratio parameter, as in other provers.
https://bitbucket.org/peba123/beagle
Expected Competition Performance
Beagle is implemented in a straightforward way and would benefit from
optimized data structures.
We do not expect it to come in among the first.
CVC4 1.4
Andrew Reynolds
EPFL, Switzerland
Architecture
CVC4 [BC+11] is an SMT solver based on the DPLL(T) architecture [NOT06] that
includes built-in support for many theories including linear arithmetic,
arrays, bit vectors, datatypes and strings.
It incorporates various approaches for handling universally quantified
formulas.
In particular, CVC4 uses primarily uses heuristic approaches based on
E-matching for answering "unsatisfiable", and finite model finding approaches
for answering "satisfiable".
Strategies
For handling theorems, CVC4 primarily uses various configurations of
E-matching.
This year, CVC4 incorporates new methods for finding conflicting instances
of quantified formulas [RTd14], which have been shown to lead to improved
performance on unsatisfiable TPTP benchmarks.
CVC4 also incorporates a model-based heuristic for handling quantified
formulas containing only pure arithmetic, which will be used in the TFA
division.
For handling non-theorems, the finite model finding feature of CVC4 will use a number of orthogonal quantifier instantiation strategies. This year, it will incorporate several new features, including an optimized implementation of model-based quantifier instantiation which improves upon [RT+13], as well as techniques for sort inference. Since CVC4 with finite model finding is also capable of answering "unsatisfiable", it will be used as a strategy for theorems as well.
https://github.com/CVC4
CVC4 1.5
Andrew Reynolds
EPFL, Switzerland
Architecture
CVC4 [BC+11] is an SMT solver based on the DPLL(T) architecture [NOT06] that
includes built-in support for many theories, including linear arithmetic,
arrays, bit vectors, datatypes and strings.
It incorporates approaches for handling universally quantified formulas.
CVC4 primarily uses heuristic approaches based on E-matching for theorems,
and finite model finding approaches for non-theorems.
Like other SMT solvers, CVC4 treats quantified formulas using a two-tiered
approach. First, quantified formulas are replaced by fresh Boolean
predicates and the ground theory solver(s) are used in conjunction with the
underlying SAT solver to determine satisfiability.
If the problem is unsatisfiable at the ground level, then the solver answers
"unsatisfiable".
Otherwise, the quantifier instantiation module is invoked, and will either
add instances of quantified formulas to the problem, answer "satisfiable",
or return unknown.
Finite model finding in CVC4 targets problems containing background
theories whose quantification is limited to finite and uninterpreted sorts.
In finite model finding mode, CVC4 uses a ground theory of finite cardinality
constraints that minimizes the number of ground equivalence classes, as
described in [RT+13].
When the problem is satisfiable at the ground level, a candidate model is
constructed that contains complete interpretations for all predicate and
function symbols.
It then adds instances of quantified
formulas that are in conflict with the candidate model, as described in
[RT+13].
If no instances are added, it reports "satisfiable".
https://github.com/CVC4
E 1.9.1
Stephan Schulz
DHBW Stuttgart, Germany
Architecture
E
[Sch02,
Sch13]
is a purely equational theorem prover for full first-order logic with equality.
It consists of an (optional) clausifier for pre-processing full first-order
formulae into clausal form, and a saturation algorithm implementing an instance
of the superposition calculus with negative literal selection and a number of
redundancy elimination techniques.
E is based on the DISCOUNT-loop variant of the given-clause algorithm,
i.e., a strict separation of active and passive facts.
No special rules for non-equational literals have been implemented.
Resolution is effectively simulated by paramodulation and equality resolution.
For the LTB divisions, a control program uses a SInE-like analysis to
extract reduced axiomatizations that are handed to several instances
of E.
E will not use the on-the-fly learning introduced this year.
For CASC-25, E implements a strategy-scheduling automatic mode. The total CPU time available is broken into 8 (unequal) time slices. For each time slice, the problem is classified into one of several classes, based on a number of simple features (number of clauses, maximal symbol arity, presence of equality, presence of non-unit and non-Horn clauses,...). For each class, a schedule of strategies is greedily constructed from experimental data as follows: The first strategy assigned to a schedule is the the one that solves the most problems from this class in the first time slice. Each subsequent strategy is selected based on the number of solutions on problems not already solved by a preceding strategy.
About 210 different strategies have been evaluated on all untyped first-order problems from TPTP 6.0.0, and about 180 of these strategies are used in the automatic mode. A few new strategies may be added.
http://www.eprover.org
ePrincess 1.0
Peter Backeman
Uppsala University, Sweden
Architecture
Princess
[Rue08,Rue12]
is a theorem prover for first-order logic modulo linear integer arithmetic.
The prover uses a combination of techniques from the areas of first-order
reasoning and SMT solving.
The main underlying calculus is a free-variable tableau calculus,
which is extended with constraints to enable backtracking-free proof
expansion, and positive unit hyper-resolution for lightweight
instantiation of quantified formulae.
Linear integer arithmetic is handled using a set of built-in proof rules
resembling the Omega test, which altogether yields a calculus that is
complete for full Presburger arithmetic, for first-order logic, and for a
number of further fragments.
In addition, some built-in procedures for nonlinear integer arithmetic are
available.
The internal calculus of Princess only supports uninterpreted
predicates; uninterpreted functions are encoded as predicates,
together with the usual axioms. Through appropriate translation of
quantified formulae with functions, the e-matching technique common in
SMT solvers can be simulated; triggers in quantified formulae are
chosen based on heuristics similar to those in the Simplify prover.
ePrincess is an extension of Princess using a calculus presented in [BR15a] to handle first-order formulas with equality. A restriced version of simultaneous rigid e-unification is utilised to handle equalities with free variables. The solving procedures presented in [BR15b] have been implemented and constitue the foundation of the unification step.
http://user.it.uu.se/~petba168/breu/
ET 0.2
Josef Urban
Radboud University Nijmegen, The Netherlands
Architecture
E.T. 0.1 is a metasystem using E prover with specific strategies [Urb13,KU13]
and preprocessing tools [KU13,KU13,KU13] that are targeted mainly at
problems with many redundant axioms.
Its design is motivated by the recent experiments in the Large-Theory Batch
division [KUV14] and on the Flyspeck, Mizar and Isabelle datasets, however,
E.T. does no learning from related proofs.
Geo-III 2015E
Hans de Nivelle
University of Wroclaw, Poland
Architecture
Geo III is a theorem prover for Partial Classical Logic
[deN11],
based on reduction to Kleene Logic
[deN14].
Currently, only Sections 4 and 5 of
[deN14]
are implemented.
Since Kleene logic generalizes 2-valued logic, it is still possible to take
part in CASC.
Apart from being 3-valued, the main differences with earlier
versions of Geo are (1) more sophisticated learning schemes,
(2) improved proof logging, and (3) replacement of
recursion by explicit use of a stack.
iProver 0.9
Konstantin Korovin
University of Manchester, United Kingdom
Architecture
iProver is an automated theorem prover based on an instantiation calculus
Inst-Gen [GK03,Kor09] which is complete for first-order logic.
One of the distinctive features of iProver is a modular combination of
first-order reasoning with ground reasoning.
In particular, iProver currently integrates MiniSat [ES04] for reasoning
with ground abstractions of first-order clauses.
In addition to instantiation, iProver implements ordered resolution calculus
and a combination of instantiation and ordered resolution; see
[Kor08] for the implementation details.
The saturation process is implemented as a modification of a given
clause algorithm.
iProver uses non-perfect discrimination trees for the unification indexes,
priority queues for passive clauses, and a compressed vector index for
subsumption and subsumption resolution (both forward and backward).
The following redundancy eliminations are implemented: blocking non-proper
instantiations; dismatching constraints [GK04,Kor08]; global subsumption
[Kor08]; resolution-based simplifications and propositional-based
simplifications.
A compressed feature vector index is used
for efficient forward/backward subsumption and subsumption resolution.
Equality is dealt with (internally) by adding the necessary axioms of equality
with an option of using Brand's transformation.
In the LTB division, iProver-SInE uses axiom selection based
on the SInE algorithm [HV11] as implemented in Vampire [HKV10],
i.e., axiom selection is done by Vampire and
proof attempts are done by iProver.
Major additions in the current version are:
iProver is available from
http://www.cs.man.ac.uk/~korovink/iprover/
iProver 1.0
Konstantin Korovin, Christoph Sticksel
University of Manchester, United Kingdom
Architecture
iProver is an automated theorem prover based on an instantiation calculus
Inst-Gen
[GK03,Kor13] ,
which is complete for first-order logic.
iProver combines first-order reasoning with ground reasoning for which it
uses MiniSat
[ES04]
and was recently extended with PicoSAT
[Bie08]
and Lingeling
[Bie12]
(only MiniSat will be used at this CASC).
iProver also combines instantiation with ordered resolution; see
[Kor08]
for the implementation details.
The proof search is implemented using a saturation process based on
the given clause algorithm. iProver uses non-perfect discrimination trees
for the unification indexes, priority queues for passive clauses, and
a compressed vector index for subsumption and subsumption resolution
(both forward and backward).
The following redundancy eliminations are implemented: blocking non-proper
instantiations; dismatching constraints
[GK04,Kor08];
global subsumption
[Kor08];
resolution-based simplifications and propositional-based simplifications.
A compressed feature vector index is used for efficient forward/backward
subsumption and subsumption resolution.
Equality is dealt with (internally) by adding the necessary axioms of equality
with an option of using Brand's transformation.
In the LTB division, iProver uses axiom selection based on the SInE
algorithm
[HV11]
as implemented in Vampire
[HKV11],
i.e., axiom selection is done by Vampire and proof attempts are done by
iProver.
Some of iProver features are summarised below.
iProver is available from
http://www.cs.man.ac.uk/~korovink/iprover/
iProver 2.0
Konstantin Korovin
University of Manchester, United Kingdom
Architecture
iProver is an automated theorem prover based on an instantiation calculus
Inst-Gen
[GK03,Kor13],
which is complete for first-order logic.
iProver combines first-order reasoning with ground reasoning for which it
uses MiniSat
[ES04]
and optionally PicoSAT
[Bie08]
and Lingeling
[Bie12]
(only MiniSat will be used at this CASC).
iProver also combines instantiation with ordered resolution; see
[Kor08,Kor13]
for the implementation details.
The proof search is implemented using a saturation process based on
the given clause algorithm.
iProver uses non-perfect discrimination trees for the unification indexes,
priority queues for passive clauses, and a compressed vector index for
The following redundancy eliminations are implemented: blocking non-proper
instantiations; dismatching constraints
[GK04,Kor08];
global subsumption
[Kor08];
resolution-based simplifications and propositional-based simplifications.
A compressed feature vector index is used for efficient forward/backward
subsumption and subsumption resolution.
Equality is dealt with (internally) by adding the necessary axioms of
equality.
Recent changes in iProver include improved preprocessing and incremental
finite model finding; support of the AIG format for hardware verification
and model-checking (implemented by Dmitry Tsarkov).
In the LTB division, iProver uses axiom selection based on the SInE algorithm [HV11] as implemented in Vampire [HKV11], i.e., axiom selection is done by Vampire and proof attempts are done by iProver.
Some of iProver features are summarised below.
http://www.cs.man.ac.uk/~korovink/iprover/
iProverModulo 0.7-0.3
Guillaume Burel
ENSIIE/Cedric, France
Architecture
iProverModulo
[Bur11]
is an extension of iProver
[Kor08]
to integrate Polarized resolution modulo
[Dow10].
Polarized resolution modulo consists in presenting the theory in which the
problem has to be solved by means of polarized rewriting rules instead of
axioms.
It can also be seen as a combination of the set-of-support strategy and
selection of literals.
iProverModulo consists of two tools: First, autotheo is a theory preprocessor that converts the axioms of the input into rewriting rules that can be used by Polarized resolution modulo. Second, these rewriting rules are handled by a patched version of iProver 0.7 which integrates Polarized resolution modulo. The integration of polarized resolution modulo in iProver only affects its ordered resolution calculus, so that the instantiation calculus is untouched.
iProverModulo 0.7-0.3 outputs a proof that is made of two parts: First, autotheo print a derivation of the transformation of the axioms into rewriting rules. This derivation is in TSTP format and includes the CNF conversions obtained from E. Second, the modified version of iProver outputs a Dedukti proof from this rewriting rules and the non-axiom formulas, following the ideas of [Bur13].
The patched version of iProver is run on the remaining formulas modulo the rewriting rules produced by autotheo. No scheduling is performed. To be compatible with Polarized resolution modulo, literals are selected only when they are maximal w.r.t. a KBO ordering, and orphans are not eliminated. To take advantage of Polarized resolution modulo, the resolution calculus is triggered more often than the instantiation calculus, on the contrary to the original iProver.
Normalization of clauses w.r.t. the term rewriting system produced by autotheo is performed by transforming these rules into an OCaml program, compiling this program, and dynamically linking it with the prover.
http://www.ensiie.fr/~guillaume.burel/blackandwhite_iProverModulo.html.enSince iProverModulo needs to compile rewriting rules, an OCaml compiler is also provided.
Autotheo is available independently from iProverModulo from
http://www.ensiie.fr/~guillaume.burel/blackandwhite_autotheo.html.enAutotheo uses E to compute clausal normal form of formula. The version of E it uses is very slightly modified to make it print the CNF derivation even if no proof is found.
Both of autotheo and iProver are written in OCaml.
Isabelle 2015
Jasmin Blanchette
Inria Nancy, France
Architecture
Isabelle/HOL 2015
[NPW02]
is the higher-order logic incarnation of the generic proof assistant
Isabelle2015.
Isabelle/HOL provides several automatic proof tactics, notably an equational
reasoner
[Nip89],
a classical reasoner
[PN94],
and a tableau prover
[Pau99].
It also integrates external first- and higher-order provers via its subsystem
Sledgehammer
[PB10,BBP11].
Isabelle includes a parser for the TPTP syntaxes CNF, FOF, TFF0, and THF0, due
to Nik Sultana. It also includes TPTP versions of its popular tools, invokable
on the command line as isabelle tptp_tool max_secs
file.p. For example:
Isabelle is available in two versions. The HOT version (which is not participating in CASC-25) includes LEO-II [BP+08] and Satallax [Bro12] as Sledgehammer backends, whereas the competition version leaves them out.isabelle tptp_isabelle_hot 100 SEU/SEU824^3.p
The implementation of Isabelle relies on a small LCF-style kernel, meaning that inferences are implemented as operations on an abstract theorem datatype. Assuming the kernel is correct, all values of type theorem are correct by construction.
Most of the code for Isabelle was written by the Isabelle teams at the University of Cambridge and the Technische Universität München. Isabelle/HOL is available for all major platforms under a BSD-style license from
http://www.cl.cam.ac.uk/research/hvg/Isabelle
leanCoP 2.2
Jens Otten
University of Potsdam, Germany
Architecture
leanCoP [OB03,Ott08] is an automated theorem prover for classical first-order
logic with equality.
It is a very compact implementation of the connection (tableau)
calculus [Bib87,LS01].
leanCoP can read formulae in leanCoP syntax and in TPTP first-order syntax. Equality axioms and axioms to support distinct objects are automatically added if required. The leanCoP core prover returns a very compact connection proof, which is then translated into a more comprehensive output format, e.g., into a lean (TPTP-style) connection proof or into a readable text proof.
The source code of leanCoP 2.2 is available under the GNU general public license. It can be downloaded from the leanCoP website at:
http://www.leancop.deThe website also contains information about ileanCoP [Ott08] and MleanCoP [Ott12,Ott14], two versions of leanCoP for first-order intuitionistic logic and first-order modal logic, respectively.
LEO-II 1.6.2
Christoph Benzmüller
Freie Universität Berlin, Germany
Architecture
LEO-II [BP+08], the successor of LEO [BK98], is a higher-order ATP system
based on extensional higher-order resolution.
More precisely, LEO-II employs a refinement of extensional higher-order
RUE resolution [Ben99].
LEO-II is designed to cooperate with specialist systems for fragments of
higher-order logic.
By default, LEO-II cooperates with the first-order ATP system E [Sch02].
LEO-II is often too weak to find a refutation amongst the steadily growing
set of clauses on its own.
However, some of the clauses in LEO-II's search space attain a special
status: they are first-order clauses modulo the application of an
appropriate transformation function.
Therefore, LEO-II launches a cooperating first-order ATP system every n
iterations of its (standard) resolution proof search loop (e.g., 10).
If the first-order ATP system finds a refutation, it communicates its success
to LEO-II in the standard SZS format.
Communication between LEO-II and the cooperating first-order ATP system
uses the TPTP language and standards.
Unfortunately the LEO-II system still uses only a very simple sequential collaboration model with first-order ATPs instead of using the more advanced, concurrent and resource-adaptive OANTS architecture [BS+08] as exploited by its predecessor LEO.
The LEO-II system is distributed under a BSD style license, and it is available from
http://www.leoprover.org
MaLARea is available from
https://github.com/JUrban/MPTP2/tree/master/MaLAReaThe metasystem's Perl code is released under GPL2
Muscadet 4.5
Dominique Pastre
University Paris Descartes, France
Architecture
The Muscadet theorem prover is a knowledge-based system.
It is based on Natural Deduction, following the terminology of [Ble71] and
[Pas78], and uses methods which resembles those used by humans.
It is composed of an inference engine, which interprets and executes rules,
and of one or several bases of facts,
which are the internal representation of "theorems to be proved".
Rules are either universal and put into the system, or built by the system
itself by metarules from data (definitions and lemmas).
Rules may add new hypotheses, modify the conclusion, create objects,
split theorems into two or more subtheorems
or build new rules which are local for a (sub-)theorem.
The system is also able to work with second order statements. It may also receive knowledge and know-how for a specific domain from a human user; see [Pas89] and [Pas93]. These two possibilities are not used while working with the TPTP Library.
Muscadet is available from
http://www.normalesup.org/~pastre/muscadet/muscadet.html
Nitpick 2015
Jasmin Blanchette
Inria Nancy, France
Architecture
Nitpick
[BN10]
is an open source counterexample generator for Isabelle/HOL
[NPW02].
It builds on Kodkod
[TJ07],
a highly optimized first-order relational model finder based on SAT.
The name Nitpick is appropriated from a now retired Alloy precursor.
In a case study, it was applied successfully to a formalization of the C++
memory model
[BW+11].
SAT solvers are particularly sensitive to the encoding of problems, so special care is needed when translating HOL formulas. As a rule, HOL scalars are mapped to FORL singletons and functions are mapped to FORL relations accompanied by a constraint. More specifically, an n-ary first-order function (curried or not) can be coded as an (n + 1)-ary relation accompanied by a constraint. However, if the return type is the type of Booleans, the function is more efficiently coded as an unconstrained n-ary relation. Higher-order quantification and functions bring complications of their own. A function from σ to τ cannot be directly passed as an argument in FORL; Nitpick's workaround is to pass |σ| arguments of type τ that encode a function table.
Nitpick is available as part of Isabelle/HOL for all major platforms under a BSD-style license from
http://www.cl.cam.ac.uk/research/hvg/Isabelle
Princess 20150706
Philipp Rümmer
Uppsala University, Sweden
Architecture
Princess
[Rue08,Rue12]
is a theorem prover for first-order logic modulo linear integer arithmetic.
The prover uses a combination of techniques from the areas of first-order
reasoning and SMT solving.
The main underlying calculus is a free-variable tableau calculus,
which is extended with constraints to enable backtracking-free proof
expansion, and positive unit hyper-resolution for lightweight
instantiation of quantified formulae. Linear integer arithmetic is
handled using a set of built-in proof rules resembling the Omega test,
which altogether yields a calculus that is complete for full
Presburger arithmetic, for first-order logic, and for a number of
further fragments. In addition, some built-in procedures for nonlinear
integer arithmetic are available.
The internal calculus of Princess only supports uninterpreted
predicates; uninterpreted functions are encoded as predicates,
together with the usual axioms. Through appropriate translation of
quantified formulae with functions, the e-matching technique common in
SMT solvers can be simulated; triggers in quantified formulae are
chosen based on heuristics similar to those in the Simplify prover.
http://www.philipp.ruemmer.org/princess.shtml
Refute 2015
Jasmin Blanchette
Inria Nancy, France
Architecture
Refute
[Web08]
is an open source counterexample generator for Isabelle/HOL
[NPW02] based on a
SAT solver, and Nitpick's
[BN10]
precursor.
Refute employs a SAT solver to find a finite model of the negated conjecture. The translation from HOL to propositional logic is parameterized by the cardinalities of the atomic types occurring in the conjecture. Refute enumerates the possible cardinalities for each atomic type. If a formula has a finite counterexample, the tool eventually finds it, unless it runs out of resources.
Refute is available as part of Isabelle/HOL for all major platforms under a BSD-style license from
http://www.cl.cam.ac.uk/research/hvg/Isabelle
Satallax 2.8
Nik Sultana
Cambridge University, United Kingdom
Architecture
Satallax 2.8
[Bro12]
is an automated theorem prover for higher-order logic.
The particular form of higher-order logic supported by Satallax is Church's
simple type theory with extensionality and choice operators.
The SAT solver MiniSat
[ES04]
is responsible for much of the search for a proof.
The theoretical basis of search is a complete ground tableau calculus for
higher-order logic
[BS10]
with a choice operator
[BB11].
A problem is given in the THF format.
A branch is formed from the axioms of the problem and the negation of the
conjecture (if any is given).
From this point on, Satallax tries to determine unsatisfiability or
satisfiability of this branch.
Satallax progressively generates higher-order formulae and corresponding
propositional clauses
[Bro13].
These formulae and propositional clauses correspond to instances of the
tableau rules.
Satallax uses the SAT solver MiniSat as an engine to test the current set
of propositional clauses for unsatisfiability.
If the clauses are unsatisfiable, then the original branch is unsatisfiable.
Additionally, Satallax may optionally generate first-order formulas in
addition to the propositional clauses.
If this option is used, then Satallax peroidically calls the first-order
theorem prover E to test for first-order unsatisfiability.
If the set of first-order formulas is unsatisfiable, then the original branch
is unsatisfiable.
http://mathgate.info/cebrown/satallax/http://mathgate.info/cebrown/satallax/
Substitution tree and code tree indexes are used to implement all major
operations on sets of terms, literals and clauses.
Although the kernel of the system works only with clausal normal form, the
shell accepts a problem in the full first-order logic syntax, clausifies it
and performs a number of useful transformations before passing the result
to the kernel.
Also the axiom selection algorithm Sine [HV11] can be enabled as part of
the preprocessing.
When a theorem is proved, the system produces a verifiable proof, which
validates both the clausification phase and the refutation of the CNF.
A number of standard redundancy criteria and simplification techniques are used
for pruning the search space: subsumption, tautology deletion, subsumption
resolution and rewriting by ordered unit equalities.
The reduction ordering is the Knuth-Bendix Ordering.
Substitution tree and code tree indexes are used to implement all major
operations on sets of terms, literals and clauses.
Internally, Vampire works only with clausal normal form.
Problems in the full first-order logic syntax are clausified during
preprocessing.
Vampire implements many useful preprocessing transformations including the
Sine axiom selection algorithm.
When a theorem is proved, the system produces a verifiable proof, which
validates both the clausification phase and the refutation of the CNF.
This extension is also able to handle real linear arithmetic, which actually
coincides with the rational fragment of linear arithmetic due to the
syntactical restrictions of the TPTP input format for reals.
Satallax-MaLeS 1.3
Daniel Kuehlwein
Radboud University Nijmegen, The Netherlands
Architecture
Satallax-MaLeS 1.3 improves Satallax's automatic mode with machine learning
techniques to predict which search strategy is most likely to find a proof.
Strategies
Satallax-MaLeS 1.3 relies on Geoff Sutcliffe's MakeListStat to classify
problems.
It uses the same data as Satallax_MaLeS 1.2 as basis for the learning
algorithms.
Implementation
Satallax-MaLeS is based on Python, in particular the Sklearn library that
contains many machine learning algorithms.
After CASC Satallax-MaLeS will be available from
http://www.cs.ru.nl/~kuehlwein/
Expected Competition Performance
Satallax-MaLes 1.3 is the CASC-J7 THF division winner.
SPASS+T 2.2.22
Uwe Waldmann
Max-Planck-Institut für Informatik, Germany
Architecture
SPASS+T is an extension of the superposition-based theorem prover SPASS
that integrates algebraic knowledge into SPASS in three complementary ways:
by passing derived formulas to an external SMT procedure
(currently Yices or CVC3), by adding standard axioms,
and by built-in arithmetic simplification and inference rules.
A first version of the system has been described in
[PW06];
later a much more sophisticated coupling of the SMT procedure has been
added [Zim07]. The latest version fixes some bugs related to
memory management and to name clashes with the SMT procedure
and provides improved support for non-arithmetic problems.
Strategies
Standard axioms and built-in arithmetic simplification and inference rules
are integrated into the standard main loop of SPASS.
Inferences between standard axioms are excluded, so the
user-supplied formulas are taken as set of support.
The external SMT procedure runs in parallel in a separate process,
leading occasionally to non-deterministic behaviour.
Implementation
SPASS+T is implemented in C. The system is available from
http://people.mpi-inf.mpg.de/~uwe/software/#TSPASS
Expected Competition Performance
We expect SPASS+T to be among the leading systems in the TFA division
at CASC-25.
Vampire 2.6
Krystof Hoder, Andrei Voronkov
University of Manchester, England
Architecture
Vampire 2.6 is an automatic theorem prover for first-order classical logic.
It consists of a shell and a kernel.
The kernel implements the calculi of ordered binary resolution and
superposition for handling equality.
It also implements the Inst-gen calculus.
The splitting rule in kernel adds propositional parts to clauses, which are
being manipulated using binary decision diagrams and a SAT solver.
A number of standard redundancy criteria and simplification techniques are
used for pruning the search space: subsumption, tautology deletion,
subsumption resolution and rewriting by ordered unit equalities.
The reduction ordering is the Knuth-Bendix Ordering.
Strategies
The Vampire 2.6 kernel provides a fairly large number of options for strategy se
lection. The most important ones are:
Implementation
Vampire 2.6 is implemented in C++.
Expected Competition Performance
Vampire 2.6 is the CASC-J7 FOF division winner.
Vampire 4.0
Giles Reger
University of Manchester, United Kingdom
Architecture
Vampire 4.0 is an automatic theorem prover for first-order logic.
Vampire implements the calculi of ordered binary resolution and superposition
for handling equality.
It also implements the Inst-gen calculus and a MACE-style finite model builder.
Splitting in resolution-based proof search is controlled by the AVATAR
architecture, which uses a SAT solver to make splitting decisions.
Both resolution and instantiation based proof search make use of global
subsumption.
Strategies
Vampire 4.0 provides a very large number of options for strategy selection. The most important ones are:
Implementation
Vampire 4.0 is implemented in C++.
Expected Competition Performance
Vampire 4.0 should be an improvement on Vampire 2.6, which is the CASC-J7
FOF division winner.
Improvements have also been made to the parts of the prover relevant to
satisfiability checking and theory reasoning.
VampireZ3 1.0
Giles Reger
University of Manchester, United Kingdom
Architecture
VampireZ3 version 1.0 is a combination of Vampire version 4.0 (see
other description) and Z3 version 4.3.1.
Vampire 4.0 uses the AVATAR architecture to make splitting decisions.
Briefly, the first-order search space is represented in the SAT solver
with propositional symbols consistently naming variable-disjoint components.
A SAT solver is then used to (iteratively) select a subset of components to
search.
In VampireZ3 the Z3 SMT solver is used in place of a SAT solver and
ground components are translated into Z3 terms.
This means Z3’s efficient methods for ground reasoning with equality and
theories are exposed by AVATAR, as the SMT solver only produces
theory-consistent models.
Strategies
All strategies of Vampire 4.0 are available. Z3 is only used when splitting
is selected.
Implementation
Vampire and Z3 are both implemented in C++.
Expected Competition Performance
VampireZ3 1.0 should perform better than Vampire 4.0 on theory problems.
The combination of Z3 for reasoning with ground theories and Vampire for
reasoning with quantifiers should be very powerful.
ZenonArith 0.1.0
Guillaume Bury
Inria, France
Architecture
Zenon Arith 0.1.0
[BD15]
is an extension of the tableaux-based Zenon
[BDD07], to linear arithmetic.
This extension uses the simplex algorithm as a decision procedure to solve
problems over rationals, and a branch-and-bound approach to solve problems
over integers.
Strategies
This extension consists of a smooth integration of arithmetic deductive rules
to the basic tableau rules, so that there is a natural interleaving between
arithmetic and regular analytic rules.
This extension is able to deal with problems that are (exclusively)
universally quantified, or existentially quantified formulas.
Universally quantified formulas are handled by translating unsatisfiability
explanation from the simplex algorithm into inference rules.
Instantiation for existentially quantified formulas is achieved by trying to
solve sets of formulas that cover all branches of a derivation tree.
Implementation
Zenon Arith is implemented in OCaml, and uses the Zarith (which is a binding
for the gmp library) library to handle arbitrary precision integers and
rationals.
It uses an incremental implementation of the simplex algorithm in OCaml as
a decision procedure over linear arithmetic.
This version of Zenon comes with a built-in notion of types for terms, and
inference rules that discriminate over the types of expressions.
Zenon Arith can automatically output Coq
[CH88]
proof scripts, that can then be checked by the Coq proof assistant.
The main developper is Guillaume Bury.
Zenon Arith is available at:
https://www.rocq.inria.fr/deducteam/ZenonArith/
Expected Competition Performance
We do not expect to be among the best provers in terms of number of solved
problems, but we aim to solve a large part of the problems, with the lowest
average CPU time.