agsyHOL 1.0
Fredrik Lindblad
University of Gothenburg, Sweden
Architecture
AgsyHOL 1.0 [agsyHOL] is based on intuitionistic sequent calculus extended
with rules for classical and equality reasoning.
There is an introduction and elimination rule for each logical construct,
plus rules for RAA, AC and extensionality.
There are special inference modes for elimination and equality reasoning.
A backward search is applied to find proof derivations in the calculus. Search is controlled by locally assigning priorities to different kind of constraints, and refining that part of the half-finished proof which is blocking a constraint with the highest priority.
https://github.com/frelindb/agsyHOL/
Beagle first converts the input formulas into clause normal form.
Pure arithmetic (sub-)formulas are treated by eager application of quantifier
elimination.
The core reasoning component implements the Hierarchic Superposition Calculus
with Weak Abstraction (HSPWA) [BW13].
Extensions are a splitting rule for clauses that can be divided into variable
disjoint parts, and a chaining inference rule for reasoning with inequalities.
The HSPWA calculus generalizes the superposition calculus by integrating
theory reasoning in a black-box style.
For the theories mentioned above, Beagle combines quantifier elimination
procedures and other solvers to dispatch proof obligations over these theories.
The default solvers are an improved version of Cooper's algorithm for linear
integer arithmetic, and the Fourier-Motzkin algorithm for linear real/rational
arithmetic.
Non-linear integer arithmetic is treated by partial instantiation.
Beagle uses strategy scheduling by trying (at most) two flag settings
sequentially.
Beagle's web site is
At difference of most of the other provers, cocATP does not rely on SAT or
first-order solver.
All reasoning is done in the high-order logic that is the calculus of
constructions.
There is a partially implemented second strategy more similar to the ones
found in other theorem provers.
It consist a saturation algorithm with the high-order modus_ponens.
As example, given a hypothesis (P) and another (or the same) hypothesis
(forall x:T, Q x), unficate P with T generating a new hypothesis Q.
This generalizes the typical saturation rules, as resolution and superposition.
The strategy yet lacks a good conditions for subsumption and simplifcation.
Probably it will not be used in competition.
Includes a type-verifier of Calculus of Constructions without inductive
constructions, which must be defined with axioms.
That is, a buggy partial clone of Coq.
There is support for most of the TPTP syntax, problems are translated to
a set of calculus of constructions terms.
cocATP has been specially prepared for THF, but all the other TPTP formulae
without numbers should be accepted.
However cocATP does NOT include a SAT solver, thus it will probably not solve
your trivial CNF problems.
More info at:
Like other SMT solvers, CVC4 treats quantified formulas using a two-tiered
approach.
First, quantified formulas are replaced by fresh boolean predicates and the
ground theory solver(s) are used in conjunction with the underlying SAT
solver to determine satisfiability.
If the problem is unsatisfiable at the ground level, then the solver answers
"unsatisfiable".
Otherwise, the quantifier instantiation module is invoked, and will either
add instances of quantified formulas to the problem, answer "satisfiable",
or return unknown.
The finite model finding has been developed to target problems containing
background theories, whose quantification is limited to finite and
uninterpreted sorts.
In finite model finding mode, CVC4 uses a ground theory of finite cardinality
constraints that minimizes the number of ground equivalence classes, as
described in [RT+13].
When the problem is satisfiable at the ground level, a candidate model is
constructed that contains complete interpretations for all predicate and
function symbols.
Quantifier instantiation strategies are then invoked to add instances of
quantified formulas that are in conflict with the candidate model, as
described in [RT+13].
If no instances are added, then the solver reports "satisfiable".
Beagle 0.9
Peter Baumgartner, Josh Bax
NICTA and Australian National University, Australia
Architecture
Beagle is an automated theorem prover for sorted first-order logic with
equality over built-in theories.
The theories currently supported are integer arithmetic, linear rational
arithmetic and linear real arithmetic.
It accepts formulas in the FOF and TFF formats of the TPTP syntax, and
formulas in the SMT-LIB version 2 format.
Strategies
Beagles uses the Discount loop for saturating a clause set under the calculus'
inference rules.
Simplification techniques include standard ones, such as subsumption deletion,
demodulation by ordered unit equations, and tautology deletion.
It also includes theory specific simplification rules for evaluating
ground (sub)terms, and for exploiting cancellation laws and properties of
neutral elements, among others.
In the competition an aggressive form of arithmetic simplification is used,
which seems to perform best in practice.
Implementation
Beagle is implemented in Scala.
It is a full implementation of the HSPWA calculus.
It uses a simple form of indexing, essentially top-symbol hashes, stored
with each term and computed in a lazy way.
Fairness is achieved through a combination of measuring clause weights
and their derivation-age.
It can be fine-tuned with a weight-age ratio parameter, as in other provers.
http://users.cecs.anu.edu.au/~baumgart/systems/beagle/
Expected Competition Performance
Beagle should perform reasonably well.
cocATP 0.2.0
Cristóbal Camarero
University of Cantabria, Spain
Architecture
cocATP is a Coq-inspired [BC04] automated theorem prover made on the free
time of the author.
It implements (the non-inductive) part of Coq's logic (calculus of
constructions) and syntax.
The proof terms it creates are accepted as proofs by Coq by the addition of
a few definitions (substituting the respective inductive definitions of Coq).
As in Coq, equality and logical connectives other than implication (which is
a dependent product of the logic) are defined adding the proper axioms.
The reasoning is tried to be done the more general possible, avoiding to
give any special treatment to both equality and logical connectives.
Strategies
The first of the rules is: for a node with conjecture (forall x:T, P x)
create a new node with hypothesis (x:T) and conecture (P x).
Other of the major rules consist in: when having a node with conjecture (C)
and some hypothesis (H:forall (x1:T1) (x2:T2) ... (xn:Tn), P x1 x2 ... xn),
to unificate P with C and prove the Ti necessary.
The unification is a subset of the possible high-order unifications.
And in this case is a one-sided unification.
There is a lot of other ad-hoc rules and conditions to apply them.
Some of the rules can create existential terms (x:=??:T) alike to Coq's
tactic evar.
With these rules resulting proofs are very similar to human-made proofs.
Implementation
cocATP is implemented in Python-2.7 and C.
It uses the Ply-3.4 library to build the parsers for both the Coq and TPTP
syntaxes.
Recently, the processing core has been reimplemented in C as a Python module,
and a Python variable controls if using a pure Python or the C version.
Using the C module can be an order of magnitude faster, but it is less
tested and introduces the possibility of SEGFAULTS.
http://www.alumnos.unican.es/ccc66/cocATP.htm
Expected Competition Performance
With the reimplementation of the kernel in C it is expected a great
reduction in the execution time, but not a lot of newly solved problems.
Crossbow 0.1
Radek Micek
Charles University in Prague, Czech Republic
Architecture
Crossbow 0.1 is a MACE2-style finite model finder similar to Paradox.
It is an implementation of techniques from [CS03] plus
improved flattening and built-in support for commutativity.
Strategies
Same strategy is used for all problems:
Implementation
Crossbow is implemented in OCaml.
Clausification and lemma generation is done by E.
MiniSat is used for incremental SAT solving.
Source code is available from:
https://github.com/radekm/crossbow
Expected Competition Performance
It should perform slightly better than Paradox.
CVC4 1.4
Andrew Reynolds
EPFL, Switzerland
Architecture
CVC4 [BC+11] is an SMT solver based on the DPLL(T) architecture [NOT06] that
includes built-in support for many theories including linear arithmetic,
arrays, bit vectors, datatypes and strings.
It incorporates various approaches for handling universally quantified
formulas.
In particular, CVC4 uses primarily uses heuristic approaches based on
E-matching for answering "unsatisfiable", and finite model finding approaches
for answering "satisfiable".
Strategies
For handling theorems, CVC4 primarily uses various configurations of
E-matching.
This year, CVC4 incorporates new methods for finding conflicting instances
of quantified formulas [RTd14], which have been shown to lead to improved
performance on unsatisfiable TPTP benchmarks.
CVC4 also incorporates a model-based heuristic for handling quantified
formulas containing only pure arithmetic, which will be used in the TFA
division.
For handling non-theorems, the finite model finding feature of CVC4 will use a number of orthogonal quantifier instantiation strategies. This year, it will incorporate several new features, including an optimized implementation of model-based quantifier instantiation which improves upon [RT+13], as well as techniques for sort inference. Since CVC4 with finite model finding is also capable of answering "unsatisfiable", it will be used as a strategy for theorems as well.
https://github.com/CVC4
E 1.9
Stephan Schulz
DHBW Stuttgart, Germany
Architecture
E [Sch02,Sch13] is a purely equational theorem prover for full
first-order logic with equality.
It consists of an (optional) clausifier for pre-processing full first-order
formulae into clausal form, and a saturation algorithm implementing an
instance of the superposition calculus with negative literal selection and
a number of redundancy elimination techniques.
E is based on the DISCOUNT-loop variant of the given-clause algorithm,
i.e., a strict separation of active and passive facts.
No special rules for non-equational literals have been implemented.
Resolution is effectively simulated by paramodulation and equality resolution.
For CASC-J7, E implements a strategy-scheduling automatic mode. The total CPU time available is broken into 8 (unequal) time slices. For each time slice, the problem is classified into one of several classes, based on a number of simple features (number of clauses, maximal symbol arity, presence of equality, presence of non-unit and non-Horn clauses, ...). For each class, a schedule of strategies is greedily constructed from experimental data as follows: The first strategy assigned to a schedule is the the one that solves the most problems from this class in the first time slice. Each subsequent strategy is selected based on the number of solutions on problems not already solved by a preceding strategy.
About 210 different strategies have been evaluated on all untyped first-order problems from TPTP 6.0.0, and about 180 of these strategies are used in the automatic mode.
http://www.eprover.org
E.T. 0.1
Josef Urban1,
Cezary Kaliszyk2,
Stephan Schulz3,
Jiri Vyskocil4
1Radboud University Nijmegen, The Netherlands,
2University of Innsbruck, Austria,
3DHBW Stuttgart, Germany,
4Czech Technical University, Czech Republic
Architecture
E.T. 0.1 is a metasystem using E prover with specific strategies [Urb13,KU13]
and preprocessing tools [KU13,KU13,KU13] that are targeted mainly at
problems with many redundant axioms.
Its design is motivated by the recent experiments in the Large-Theory Batch
division [KUV14] and on the Flyspeck, Mizar and Isabelle datasets, however,
E.T. does no learning from related proofs.
HOLyHammer 140616
Cezary Kaliszyk1,
Josef Urban2,
Stephan Schulz3
1University of Innsbruck, Austria,
2Radboud University Nijmegen, The Netherlands,
3DHBW Stuttgart, Germany
Architecture
HOLyHammer [KU14,KU13] is a metasystem for premise selection and proof
translation from polymorphic HOL to FOF ATP provers.
This is a version restricted to THF0 and to using E prover with specific
strategies.
iProver 0.9
Konstantin Korovin
University of Manchester, United Kingdom
Architecture
iProver is an automated theorem prover based on an instantiation calculus
Inst-Gen [GK03,Kor09] which is complete for first-order logic.
One of the distinctive features of iProver is a modular combination of
first-order reasoning with ground reasoning.
In particular, iProver currently integrates MiniSat [ES04] for reasoning
with ground abstractions of first-order clauses.
In addition to instantiation, iProver implements ordered resolution calculus
and a combination of instantiation and ordered resolution; see
[Kor08] for the implementation details.
The saturation process is implemented as a modification of a given
clause algorithm.
iProver uses non-perfect discrimination trees for the unification indexes,
priority queues for passive clauses, and a compressed vector index for
subsumption and subsumption resolution (both forward and backward).
The following redundancy eliminations are implemented: blocking non-proper
instantiations; dismatching constraints [GK04,Kor08]; global subsumption
[Kor08]; resolution-based simplifications and propositional-based
simplifications.
A compressed feature vector index is used
for efficient forward/backward subsumption and subsumption resolution.
Equality is dealt with (internally) by adding the necessary axioms of equality
with an option of using Brand's transformation.
In the LTB division, iProver-SInE uses axiom selection based
on the SInE algorithm [HV11] as implemented in Vampire [HKV10],
i.e., axiom selection is done by Vampire and
proof attempts are done by iProver.
Major additions in the current version are:
iProver is available from
http://www.cs.man.ac.uk/~korovink/iprover/
iProver 1.0
Konstantin Korovin, Christoph Sticksel
University of Manchester, United Kingdom
Architecture
iProver is an automated theorem prover based on an instantiation calculus
Inst-Gen [GK03,Kor13] which is complete for first-order logic.
iProver combines first-order reasoning with ground reasoning for which it
uses MiniSat [ES04] and was recently extended with PicoSAT [Bie08] and
Lingeling [Bie12] (only MiniSat will be used at this CASC).
iProver also combines instantiation with ordered resolution;
see [Kor08] for the implementation details.
The proof search is implemented using a saturation process based on
the given clause algorithm. iProver uses non-perfect discrimination trees
for the unification indexes, priority queues for passive clauses, and
a compressed vector index for subsumption and subsumption resolution
(both forward and backward).
The following redundancy eliminations are implemented: blocking non-proper
instantiations; dismatching constraints [GK04,Kor08]; global subsumption
[Kor08]; resolution-based simplifications and propositional-based
simplifications. A compressed feature vector index is used
for efficient forward/backward subsumption and subsumption resolution.
Equality is dealt with (internally) by adding the necessary axioms of equality
with an option of using Brand's transformation.
In the LTB division, iProver uses axiom selection based on the SInE
algorithm [HV11] as implemented in Vampire [HKV11], i.e., axiom selection
is done by Vampire and proof attempts are done by iProver.
Some of iProver features are summarised below.
iProver is available from
http://www.cs.man.ac.uk/~korovink/iprover/
Isabelle 2013
Jasmin C. Blanchette1, Lawrence C. Paulson2,
Tobias Nipkow1, Makarius Wenzel3
1Technische Universität München, Germany,
2University of Cambridge, United Kingdom,
3Université Paris Sud, France
Architecture
Isabelle/HOL 2013 [NPW12] is the higher-order logic incarnation of the generic
proof assistant Isabelle2013.
Isabelle/HOL provides several automatic proof tactics, notably an equational
reasoner [Nip89], a classical reasoner [PN94], and a tableau prover [Pau99].
It also integrates external first- and higher-order provers via its subsystem
Sledgehammer [PB10,BBP11].
Isabelle includes a parser for the TPTP syntaxes CNF, FOF, TFF0, and THF0, due to Nik Sultana. It also includes TPTP versions of its popular tools, invokable on the command line as isabelle tptp_tool max_secs file.p. For example:
isabelle tptp_isabelle_hot 100 SEU/SEU824^3.p
The implementation of Isabelle relies on a small LCF-style kernel, meaning that inferences are implemented as operations on an abstract theorem datatype. Assuming the kernel is correct, all values of type theorem are correct by construction.
Most of the code for Isabelle was written by the Isabelle teams at the University of Cambridge and the Technische Universität München. Isabelle/HOL is available for all major platforms under a BSD-style license from
http://www.cl.cam.ac.uk/research/hvg/Isabelle
leanCoP 2.2
Jens Otten
University of Potsdam, Germany
Architecture
leanCoP [OB03,Ott08] is an automated theorem prover for classical first-order
logic with equality.
It is a very compact implementation of the connection (tableau)
calculus [Bib87,LS01].
leanCoP can read formulae in leanCoP syntax and in TPTP first-order syntax. Equality axioms and axioms to support distinct objects are automatically added if required. The leanCoP core prover returns a very compact connection proof, which is then translated into a more comprehensive output format, e.g., into a lean (TPTP-style) connection proof or into a readable text proof.
The source code of leanCoP 2.2 is available under the GNU general public license. It can be downloaded from the leanCoP website at:
http://www.leancop.deThe website also contains information about ileanCoP [Ott08] and MleanCoP [Ott12,Ott14], two versions of leanCoP for first-order intuitionistic logic and first-order modal logic, respectively.
LEO-II 1.6.2
Christoph Benzmüller
Freie Universität Berlin, Germany
Architecture
LEO-II [BP+08], the successor of LEO [BK98], is a higher-order ATP system
based on extensional higher-order resolution.
More precisely, LEO-II employs a refinement of extensional higher-order
RUE resolution [Ben99].
LEO-II is designed to cooperate with specialist systems for fragments of
higher-order logic.
By default, LEO-II cooperates with the first-order ATP system E [Sch02].
LEO-II is often too weak to find a refutation amongst the steadily growing
set of clauses on its own.
However, some of the clauses in LEO-II's search space attain a special
status: they are first-order clauses modulo the application of an
appropriate transformation function.
Therefore, LEO-II launches a cooperating first-order ATP system every n
iterations of its (standard) resolution proof search loop (e.g., 10).
If the first-order ATP system finds a refutation, it communicates its success
to LEO-II in the standard SZS format.
Communication between LEO-II and the cooperating first-order ATP system
uses the TPTP language and standards.
Unfortunately the LEO-II system still uses only a very simple sequential collaboration model with first-order ATPs instead of using the more advanced, concurrent and resource-adaptive OANTS architecture [BS+08] as exploited by its predecessor LEO.
The LEO-II system is distributed under a BSD style license, and it is available from
http://www.leoprover.org
Muscadet 4.4
Dominique Pastre
University Paris Descartes, France
Architecture
The Muscadet theorem prover is a knowledge-based system.
It is based on Natural Deduction, following the terminology of [Ble71] and
[Pas78], and uses methods which resembles those used by humans.
It is composed of an inference engine, which interprets and executes rules,
and of one or several bases of facts,
which are the internal representation of "theorems to be proved".
Rules are either universal and put into the system, or built by the system
itself by metarules from data (definitions and lemmas).
Rules may add new hypotheses, modify the conclusion, create objects,
split theorems into two or more subtheorems
or build new rules which are local for a (sub-)theorem.
The system is also able to work with second order statements. It may also receive knowledge and know-how for a specific domain from a human user; see [Pas89] and [Pas93]. These two possibilities are not used while working with the TPTP Library.
Muscadet is available from
http://www.math-info.univ-paris5.fr/~pastre/muscadet/muscadet.html
Princess 140704
Philipp Rümmer, Peter Backeman
Uppsala University, Sweden
Architecture
Princess [Rue08,Rue12] is a theorem prover for first-order logic
modulo linear integer arithmetic.
The prover uses a combination of techniques from the areas of first-order
reasoning and SMT solving.
The main underlying calculus is a free-variable tableau calculus,
which is extended with constraints to enable backtracking-free proof
expansion, and positive unit hyper-resolution for lightweight
instantiation of quantified formulae.
Linear integer arithmetic is handled using a set of built-in proof rules
resembling the Omega test, which altogether yields a calculus that is
complete for full Presburger arithmetic, for first-order logic, and for
a number of further fragments.
In addition, some built-in procedures for nonlinear integer arithmetic are
available.
The internal calculus of Princess only supports uninterpreted predicates; uninterpreted functions are encoded as predicates, together with the usual axioms. Through appropriate translation of quantified formulae with functions, the e-matching technique common in SMT solvers can be simulated; triggers in quantified formulae are chosen based on heuristics similar to those in the Simplify prover.
http://www.philipp.ruemmer.org/princess.shtml
Prover9 2009-11A
Bob Veroff on behalf of William McCune
University of New Mexico, USA
Architecture
Prover9, Version 2009-11A, is a resolution/paramodulation prover for
first-order logic with equality.
Its overall architecture is very similar to that of Otter-3.3 [McC03].
It uses the "given clause algorithm", in which not-yet-given clauses are
available for rewriting and for other inference operations (sometimes called
the "Otter loop").
Prover9 has available positive ordered (and nonordered) resolution and paramodulation, negative ordered (and nonordered) resolution, factoring, positive and negative hyperresolution, UR-resolution, and demodulation (term rewriting). Terms can be ordered with LPO, RPO, or KBO. Selection of the "given clause" is by an age-weight ratio.
Proofs can be given at two levels of detail: (1) standard, in which each line of the proof is a stored clause with detailed justification, and (2) expanded, with a separate line for each operation. When FOF problems are input, proof of transformation to clauses is not given.
Completeness is not guaranteed, so termination does not indicate satisfiability.
Given a problem, Prover9 adjusts its inference rules and strategy according to syntactic properties of the input clauses such as the presence of equality and non-Horn clauses. Prover9 also does some preprocessing, for example, to eliminate predicates.
In previous CASC competitions, Prover9 has used LPO to order terms for demodulation and for the inference rules, with a simple rule for determining symbol precedence. For CASC 2012, we are going to use KBO instead.
For the FOF problems, a preprocessing step attempts to reduce the problem to independent subproblems by a miniscope transformation; if the problem reduction succeeds, each subproblem is clausified and given to the ordinary search procedure; if the problem reduction fails, the original problem is clausified and given to the search procedure.
http://www.cs.unm.edu/~mccune/prover9/
Finishes in the middle of the pack are anticipated in all categories in which Prover9 competes.
Satallax 2.7
Chad E. Brown
Saarland University, Germany
Architecture
Satallax 2.7 [Bro12] is an automated theorem prover for higher-order logic.
The particular form of higher-order logic supported by Satallax is Church's
simple type theory with extensionality and choice operators.
The SAT solver MiniSat [ES04] is responsible for much of the search for a
proof.
The theoretical basis of search is a complete ground tableau calculus for
higher-order logic [BS10] with a choice operator [BB11].
A problem is given in the THF format.
A branch is formed from the axioms of the problem and the negation of the
conjecture (if any is given).
From this point on, Satallax tries to determine unsatisfiability or
satisfiability of this branch.
Satallax progressively generates higher-order formulae and corresponding propositional clauses [Bro13]. These formulae and propositional clauses correspond to instances of the tableau rules. Satallax uses the SAT solver MiniSat as an engine to test the current set of propositional clauses for unsatisfiability. If the clauses are unsatisfiable, then the original branch is unsatisfiable.
Additionally, Satallax may optionally generate first-order formulas in addition to the propositional clauses. If this option is used, then Satallax peroidically calls the first-order theorem prover E to test for first-order unsatisfiability. If the set of first-order formulas is unsatisfiable, then the original branch is unsatisfiable.
http://satallax.com
Substitution tree and code tree indexes are used to implement all major
operations on sets of terms, literals and clauses.
Although the kernel of the system works only with clausal normal form, the
shell accepts a problem in the full first-order logic syntax, clausifies it
and performs a number of useful transformations before passing the result
to the kernel.
Also the axiom selection algorithm Sine [HV11] can be enabled as part of
the preprocessing.
When a theorem is proved, the system produces a verifiable proof, which
validates both the clausification phase and the refutation of the CNF.
The new version has experimental support for linear integer arithmetic.
It features a new calculus (not published yet), inspired from the work
of Waldmann on superposition modulo rational arithmetic [Wal01].
The calculus is an extension of superposition that doesn't use a
black-box solver, but has special inferences to deal with arithmetic
equations, inequations and divisibility literals.
Additional redundancy criteria are used to make the search space more
tractable.
As we don't have completeness results yet, the solver will fail on
satisfiable problems, even if it finds a saturated set of clauses.
In TFA, since we feature a new calculus, performance is hard to predict.
We expect the prover to perform well on problems SMT solvers will struggle
with, and conversely; best performance should be reached on unsatisfiable
problems with few inequations but possibly function symbols and quantifiers;
worst will be on combinatorial ground problems that require little symbolic
reasoning.
Satallax_MaLeS 1.3
Daniel Kuehlwein
Radboud University Nijmegen, The Netherlands
Architecture
Satallax_MaLeS 1.3 improves Satallax's automatic mode with machine learning
techniques to predict which search strategy is most likely to find a proof.
Strategies
Satallax_MaLeS 1.3 relies on Geoff Sutcliffe's MakeListStat to classify
problems.
It uses the same data as Satallax_MaLeS 1.2 as basis for the learning
algorithms.
Implementation
Satallax_MaLeS is based on Python, in particular the Sklearn library that
contains many machine learning algorithms.
After CASC Satallax_MaLeS will be available from
http://www.cs.ru.nl/~kuehlwein/
Expected Competition Performance
We expect Satallax_MaLeS 1.3 to perform slightly better than
Satallax_MaLeS 1.2.
SPASS+T 2.2.19
Uwe Waldmann
Max-Planck-Insitut für Informatik, Germany
Architecture
SPASS+T is an extension of the superposition-based theorem prover SPASS
that integrates algebraic knowledge into SPASS in three complementary ways:
by passing derived formulas to an external SMT procedure
(currently Yices or CVC3), by adding standard axioms,
and by built-in arithmetic simplification and inference rules.
A first version of the system has been described in [PW06];
later a much more sophisticated coupling of the SMT procedure has been
added [Zim07].
The latest version provides improved support for isint/1, israt/1, floor/1
and ceiling/1 and adds partial input abstraction and history-dependent weights
for numerical constants.
Strategies
Standard axioms and built-in arithmetic simplification and inference rules
are integrated into the standard main loop of SPASS.
Inferences between standard axioms are excluded, so the
user-supplied formulas are taken as set of support.
The external SMT procedure runs in parallel in a separate process,
leading occasionally to non-deterministic behaviour.
Implementation
SPASS+T is implemented in C.
SPASS+T is available from
http://www.mpi-inf.mpg.de/~uwe/software/#TSPASS
Expected Competition Performance
SPASS+T 2.2.16 came second in the TFA division of last CASC;
we expect a similar performance in CASC 2013.
SPASS+T 2.2.20
Uwe Waldmann
Max-Planck-Insitut für Informatik, Germany
Architecture
SPASS+T is an extension of the superposition-based theorem prover SPASS
that integrates algebraic knowledge into SPASS in three complementary ways:
by passing derived formulas to an external SMT procedure
(currently Yices or CVC3), by adding standard axioms,
and by built-in arithmetic simplification and inference rules.
A first version of the system has been described in [PW06];
later a much more sophisticated coupling of the SMT procedure has been
added [Zim07].
The latest version provides improved support for isint/1, israt/1, floor/1
and ceiling/1 and adds partial input abstraction and history-dependent weights
for numerical constants.
Strategies
Standard axioms and built-in arithmetic simplification and inference rules
are integrated into the standard main loop of SPASS.
Inferences between standard axioms are excluded, so the
user-supplied formulas are taken as set of support.
The external SMT procedure runs in parallel in a separate process,
leading occasionally to non-deterministic behaviour.
Implementation
SPASS+T is implemented in C.
SPASS+T is available from
http://www.mpi-inf.mpg.de/~uwe/software/#TSPASS
Expected Competition Performance
SPASS+T 2.2.19 came first in the TFA division of last CASC;
we expect a similar performance in CASC 2014.
Vampire 2.6
Krystof Hoder, Andrei Voronkov
University of Manchester, England
Architecture
Vampire 2.6 is an automatic theorem prover for first-order classical logic.
It consists of a shell and a kernel.
The kernel implements the calculi of ordered binary resolution and
superposition for handling equality.
It also implements the Inst-gen calculus.
The splitting rule in kernel adds propositional parts to clauses, which are
being manipulated using binary decision diagrams and a SAT solver.
A number of standard redundancy criteria and simplification techniques are
used for pruning the search space: subsumption, tautology deletion,
subsumption resolution and rewriting by ordered unit equalities.
The reduction ordering is the Knuth-Bendix Ordering.
Strategies
The Vampire 2.6 kernel provides a fairly large number of options for strategy se
lection. The most important ones are:
Implementation
Vampire 2.6 is implemented in C++.
Expected Competition Performance
Vampire 2.6 is the CASC-24 FOF division winner.
Waldmeister 710
Thomas Hillenbrand
Max-Planck-Institut für Informatik, Germany
Architecture
Waldmeister 710 [Hil03] is a system for unit equational deduction.
Its theoretical basis is unfailing completion in the sense of
[BDP89] with refinements towards ordered completion (cf. [AHL03]).
The system saturates the input axiomatization, distinguishing active facts,
which induce a rewrite relation, and passive facts, which are the one-step
conclusions of the active ones up to redundancy.
The saturation process is parameterized by a reduction ordering and a heuristic
assessment of passive facts [HJL99].
This year's version is the result of polishing and fixing a few things in
last year's.
Strategies
The prover is coded in ANSI-C. It runs on Solaris, Linux,
MacOS X, and Windows/Cygwin. The central data strucures are: perfect
discrimination trees for the active facts; group-wise compressions for
the passive ones; and sets of rewrite successors for the conjectures.
Visit the Waldmeister web pages at:
http://www.waldmeister.org
Implementation
The approach taken to control the proof search is to choose the search
parameters – reduction ordering and heuristic assessment –
according to the algebraic structure given in the problem specification [HJL99].
This is based on the observation that proof tasks sharing major parts of
their axiomatization often behave similarly.
Expected Competition Performance
Waldmeister 710 is the CASC-23 UEQ division winner.
VanHElsing 1.0
Daniel Kuehlwein
Radboud University Nijmegen, The Netherlands
Architecture
VanHElsing improves E's automatic mode with machine learning techniques to
predict which search strategy is most likely to find a proof.
Strategies
VanHElsing relies on E's features to classify problems.
It uses the same data as E 1.8 as basis for the learning algorithms.
Implementation
VanHElsing is based on Python, in particular the Sklearn library that
contains many machine learning algorithms.
After CASC VanHElsing will be availble from
http://www.cs.ru.nl/~kuehlwein/
Expected Competition Performance
We expect VanHElsing to perform slightly better than E 1.8.
Zipperposition 0.4
Simon Cruanes
INRIA, France
Architecture
Zipperposition is a typed first-order superposition theorem prover written in
OCaml for prototyping.
It deals with hashconsed polymorphic terms, has a Hindley-Milner-like
type-inference algorithm and accepts TFF1 input.
Its core superposition calculus is strongly inspired from E (discount loop,
mostly same inferences including condensation and contextual literal cutting).
Strategies
Zipperposition uses several clause queues, but unlike E it has no
heuristic to choose them based on the input.
The strategy is therefore a simple alternance of old clauses, and
"small" clauses, in one run.
Implementation
Zipperposition is in pure OCaml, with a mix of imperative and functional
style, including iterators, zippers, etc.
It uses Non-perfect Discrimination Trees and Feature Vectors [Sch04] for
indexing.
Numbers are handled by Zarith, a frontend to GMP.
Expected Competition Performance
In FOF, we expect decent performance.
The results should be quite similar to last year as the superposition core
isn't the focus of our research.