The system is available from
After the competition, Currahee(E,iProver) can be obtained from:
The current version of Darwin implements first-order versions of unit
propagation inference rules, analogous to a restricted form of unit
resolution and subsumption by unit clauses. To retain completeness,
it includes a first-order version of the (binary) propositional
splitting inference rule.
Darwin is implemented in OCaml and has been tested under various Linux
distributions. It is available from
E is based on the DISCOUNT-loop variant of the given-clause
algorithm, i.e., a strict separation of active and passive facts.
Proof search in E is primarily controlled by a literal selection
strategy, a clause evaluation heuristic, and a simplification ordering.
The prover supports a large number of preprogrammed literal selection
strategies.
Clause evaluation heuristics can be constructed on the fly by combining
various parameterized primitive evaluation functions, or can be selected
from a set of predefined heuristics.
Supported term orderings are several parameterized instances of
Knuth-Bendix-Ordering (KBO) and Lexicographic Path Ordering (LPO).
The prover uses a preprocessing step to convert formulae in full first
order format to clause normal form.
This step may introduce (first-order) definitions to avoid an exponential
growth of formulae.
Preprocessing also unfolds equational definitions and performs some
simplifications on the clause level.
EP 1.2pre is just a combination of E 1.2pre in verbose mode and
a proof analysis tool extracting the used inference steps.
E distinguishes problem classes based on a number of features, all of
which have between 2 and 4 possible values.
The most important ones are:
For the LTB part of the competition, E will use a relevancy-based pruning
approach and attempt to solve the problems with successively more complete
specifications until it succeeds or runs out of time.
The program has been successfully installed under SunOS 4.3.x,
Solaris 2.x, HP-UX B 10.20, MacOS-X, and various
versions of Linux. Sources of the latest released version are
available freely from:
EP 1.2p will be hampered by the fact that it has to analyse the
inference step listing, an operation that typically is about as
expensive as the proof search itself.
Nevertheless, it should be competitive among the FOF systems.
An E-hyper tableau is a tree whose nodes are labeled with clauses, and which
is built up by the application of the inference rules of the E-hyper tableau
calculus.
The calculus rules are designed such that most of the reasoning is
performed using positive unit clauses.
A branch can be extended with new clauses that have been derived from the
clauses of that branch.
A positive disjunction can be used to split a branch, creating a new branch
for each disjunct.
No variables may be shared between branches, and if a case-split creates
branches with shared variables, then these are immediately substituted
by ground terms.
The grounding substitution is arbitrary as long as the terms in its range
are irreducible: the branch being split may not contain a positive equational
unit which can simplify a substituting term, i.e., rewrite it with one that
is smaller according to a reduction ordering. When multiple irreducible
substitutions are possible, each of them must be applied in consecutive
splittings in order to preserve completeness.
Redundancy rules allow the detection and removal of clauses that are redundant
with respect to a branch.
The hyper extension inference from the original hyper tableau calculus is
equivalent to a series of E-hyper tableau calculus inference applications.
Therefore the implementation of the hyper extension in KRHyper by a variant
of semi-naive evaluation [Ull89] is retained in
E-KRHyper, where it serves as a shortcut inference for the resolution of
non-equational literals.
Ayane 2
Russell Wallace
Independent researcher, Ireland
Architecture
Ayane is an automated theorem prover for first-order logic.
It uses refutation by saturation, converting the input to clause normal
form then generating additional clauses until a contradiction is found or
until no further inference can be performed.
Inference is based on the superposition calculus.
Strategies
The current version uses a straightforward implementation of the given
clause algorithm, with clauses selected for processing in order of size
(smallest first) and basic filtering of generated clauses for duplication
and tautologies.
Implementation
Ayane is written in C#, and has been tested on Linux (Mono 2.4.4, should
work with later or somewhat earlier versions) and Windows (.Net); it is
expected to work on any platform that supports a reasonably up to date
version of the CLR.
http://code.google.com/p/ayane/
Expected Competition Performance
Ayane is not expected to win the competition yet, as it is still in early
development and much work on efficient inference remains to be done.
Currahee(E,iProver) 0.1
Matthias Schmalz, Jann Röder
ETH Zurich, Switzerland
Architecture
Currahee(E,iProver) 0.1 applies a number of axiom selection strategies
to a problem in parallel, and passes the resulting smaller problems to
E and iProver.
Currahee(E,iProver) relies on the assumption that several axiom selection
strategies and theorem provers are more successful than one axiom selection
strategy with one theorem prover.
The challenge was to find out the strengths of the individual selection
strategies, i.e., when to apply which strategy.
Strategies
Currahee(E,iProver) applies the following axiom selection strategies (with
slight adjustments):
For a given problem the strategies are chosen depending on the problem's
size.
Implementation
Problems are parsed using Andrei Tchaltsev's TPTP parser (in Java).
The axiom selection strategies have been reimplemented in Java.
The underlying theorem provers are E 1.1 and iProver 0.8pre.
http://www.infsec.ethz.ch/people/mschmalz
Expected Competition Performance
Based on preliminary measurements, we expect Currahee(E,iProver) to solve at
least half of the problems from last year's LTB division (applying this year's
resource restrictions).
Darwin 1.4.5
Peter Baumgartner
NICTA, Australia
Architecture
Darwin [BFT04,BFT06]
is an automated theorem prover for first order clausal logic.
It is an implementation of the Model Evolution calculus
[BT03].
The Model Evolution calculus lifts the propositional DPLL procedure to
first-order logic.
One of the main motivations for this approach was the possibility of
migrating to the first-order level some of those very effective search
techniques developed by the SAT community for the DPLL procedure.
Strategies
Proof search in Darwin starts with a default interpretation for a
given clause set, which is evolved towards a model or until a
refutation is found.
Darwin traverses the search space by iterative deepening over the term
depth of candidate literals. Darwin employs a
uniform search strategy for all problem classes.
Implementation
The central data structure is the context. A context
represents an interpretation as a set of first-order literals.
The context is grown by using instances of literals from the input clauses.
The implementation of Darwin is intended to support basic operations
on contexts in an efficient way. This involves the handling of large
sets of candidate literals for modifying the current context. The
candidate literals are computed via simultaneous unification between
given clauses and context literals. This process is sped up by
storing partial unifiers for each given clause and merging them for
different combinations of context literals, instead of redoing
whole unifier computations. For efficient filtering of unneeded
candidates against context literals, discrimination tree or substitution
tree indexing is employed.
The splitting rule generates choice points in the derivation which are
backtracked using a form of backjumping, similar to the one used in
DPLL-based SAT solvers.
http://goedel.cs.uiowa.edu/Darwin/
Changes to the previous version consist of minor bug fixes.
Expected Competition Performance
We expect its performance to be very strong in the EPR division.
E/EP 1.2pre
Stephan Schulz
Technische Universität München, Germany
Architecture
E 1.2pre [Sch02,Sch04] is a purely
equational theorem prover.
The core proof procedure operates on formulae in clause normal form, using
a calculus that combines superposition (with selection of negative literals)
and rewriting.
No special rules for non-equational literals have been implemented, i.e.,
resolution is simulated via paramodulation and equality resolution.
The basic calculus is extended with rules for AC redundancy elimination,
some contextual simplification, and pseudo-splitting with definition caching.
The latest versions of E also supports simultaneous paramodulation, either
for all inferences or for selected inferences.
Strategies
The automatic mode determines literal selection strategy, term ordering,
and search heuristic based on simple problem characteristics of the
preprocessed clausal problem.
E has been optimized for performance over the TPTP.
The automatic mode of E 1.2pre is partially inherited from previous
version and is based on about 60 test runs over TPTP 4.0.1.
It consists of the selection of one of about 40 different strategies for each
problem.
All test runs have been performed on Linux/Intel machines with a time
limit roughly equivalent to 300 seconds on 300MHz Sun SPARC machines,
i.e., around 30 seconds on 2Ghz class machines.
All individual strategies are refutationally complete.
For classes above a threshold size, we assign the absolute best heuristic to
the class.
For smaller, non-empty classes, we assign the globally best heuristic that
solves the same number of problems on this class as the best heuristic on
this class does.
Empty classes are assigned the globally best heuristic.
Typically, most selected heuristics are assigned to more than one class.
Implementation
E is implemented in ANSI C, using the GNU C compiler.
At the core is an implementation of aggressively shared first-order terms
in a term bank data structure.
Based on this, E supports the global sharing of rewrite steps.
Rewriting is implemented in the form of rewrite links from
rewritten to new terms.
In effect, E is caching rewrite operations as long as sufficient memory is
available.
E uses perfect discrimination trees with age and size constraints
for rewriting and unit-subsumption, feature vector indexing
[Sch04] for forward- and backward subsumption and
contextual literal cutting, and a new technique called fingerprint
indexing for backward rewriting and (hopefully) paramodulation.
Knuth-Bendix Ordering and Lexicographic Path Ordering are implemented
using the linear and polynomial algorithms described by Bernd Löchner
[Loe04,Loe06].
http://www.eprover.org
EP 1.2pre is a simple Bourne shell script calling E and the
postprocessor in a pipeline.
Expected Competition Performance
In the last years, E performed well in most proof categories.
We believe that E will again be among the stronger provers in the FOF and
CNF categories.
We hope that E will at least be a useful complement to dedicated systems
in the other categories.
E-Darwin 1.3
Björn Pelzer
Universität Koblenz-Landau, Germany
Architecture
E-Darwin is an automated theorem prover for first order clausal logic with
equality.
It is a modified version of the Darwin prover
[BFT04], intended as a testbed for variants of
the Model Evolution calculus [BT03].
Among other things, it implements the Model Evolution with Equality calculus
[BT05].
Also, since the last version, a new calculus has been implemented, using a
different approach to incorporating equality reasoning into Model Evolution.
This new calculus will be used for CASC this year.
Three principal data structures are used: the context (a set of rewrite
literals), the set of constrained clauses, and the set of derived candidates.
The prover always selects one candidate, which may be a new clause or a
new context literal, and exhaustively computes inferences with this
candidate and the context and clause set, moving the results to the
candidate set.
Afterwards the candidate is inserted into one of the context or the clause
set, respectively, and the next candidate is selected.
The inferences are superposition-based. Demodulation and various means of
redundancy detection are used as well.
Strategies
The uniform search strategy is identical to the one employed in the original
Darwin, slightly adapted to account for derived clauses.
Implementation
E-Darwin is implemented in the functional/imperative language OCaml.
The system has been tested on Unix and is available under the GNU Public
License, from:
http://www.uni-koblenz.de/~bpelzer/edarwin
Darwin's method of storing partial unifiers has been adapted to equations
and subterm positions for the superposition inferences in E-Darwin.
A combination of perfect and non-perfect discrimination tree indexes is
used to store the context and the clauses.
Expected Competition Performance
The new calculus used this year differs significantly from the previous
versions , which still had a lot in common with the original Model Evolution.
The implementation was completed very recently.
As such we do not yet know what performance to expect.
E-KRHyper 1.1.4
Björn Pelzer
Universität Koblenz-Landau, Germany
Architecture
E-KRHyper [PW07] is a theorem proving and model
generation system for first-order logic with equality.
It is an implementation of the E-hyper tableau calculus
[BFP07], which integrates a superposition-based
handling of equality [BG98] into the hyper
tableau calculus [BFN96].
The system is an extension of the KRHyper theorem prover
[Wer03], which implements the original hyper
tableau calculus.
Strategies
E-KRHyper uses a uniform search strategy for all problems.
The E-hyper tableau is generated depth-first, with E-KRHyper always working
on a single branch.
Refutational completeness and a fair search control are ensured by an
iterative deepening strategy with a limit on the maximum term weight of
generated clauses.
Implementation
E-KRHyper is implemented in the functional/imperative language OCaml.
The system runs on Unix and MS-Windows platforms and is available under the
GNU Public License, from:
http://www.uni-koblenz.de/~bpelzer/ekrhyper
The system accepts input in the TPTP-format and in the TPTP-supported
Protein-format.
The calculus implemented by E-KRHyper works on clauses, so first order
formula input is converted into CNF by an algorithm which follows the
transformation steps as used in the clausification of Otter
[McC03].
E-KRHyper operates on an E-hyper tableau which is represented by linked
node records.
Several layered discrimination-tree based indexes (both perfect and
non-perfect) provide access to the clauses in the tableau and support
backtracking.
Geo 2010C
Hans de Nivelle
Uniwersytetu Wroclawskiego, Poland
Architecture
Geo is based on geometric resolution
[dNM06,deN07,deN10].
A geometric formula is a formula without function symbols
and constants, which has form:
FORALL x1, ..., xn [ A1 /\ ... /\ Ap /\ v1 != w1 /\ ... /\ vq != wq -> Z(x1,...,xn) ].The Ai are atoms, which are not equality atoms or inequality atoms. The arguments of the atoms Ai and the inequalities vj != wj must be among the variables x1, ..., xn. Z(x1,...,xn) must have one of the following three forms:
Geo accepts formulae either in its own syntax or in TPTP syntax. Because CNF has no special status for Geo, TPTP formulae in CNF form are treated as ordinary first-order formulae.
During search, Geo searches for a model of the geometric formulae by a combination of backtracking and learning. Whenever it cannot extend an interpretation into a model, Geo learns a new rule of Type 1, which ensures that Geo will not explore similar models in the future.
In case Geo finds a finite model, it simply prints the model. In case no model exists, it will eventually learn the rule -> FALSE, so that it is able to output a proof.
The final aim of geometric resolution, and of Geo, is to obtain a prover that is both good at finding proofs, and at finding finite models.
iProver 0.7
Konstantin Korovin
University of Manchester, United Kingdom
Architecture
iProver is an automated theorem prover based on an instantiation calculus
Inst-Gen [GK03,Kor09]
which is complete for first-order logic.
One of the distinctive features of iProver is a modular combination of
first-order reasoning with ground reasoning.
In particular, iProver currently integrates MiniSat
[ES04] for reasoning with ground abstractions
of first-order clauses.
In addition to instantiation, iProver implements ordered resolution calculus
and a combination of instantiation and ordered resolution; see
[Kor08] for the implementation details.
The saturation process is implemented as a modification of a given
clause algorithm.
We use non-perfect discrimination trees for the unification indexes,
priority queues for passive clauses, and a compressed vector index for
subsumption and subsumption resolution (both forward and backward).
The following redundancy eliminations are implemented: blocking non-proper
instantiations; dismatching constraints
[GK04,Kor08];
global subsumption [Kor08]; resolution-based
simplifications and propositional-based simplifications.
We implemented a compressed feature vector index
for efficient forward/backward subsumption and subsumption resolution.
Equality is dealt with (internally) by adding the necessary axioms of equality.
http://www.cs.man.ac.uk/~korovink/iprover/
iProver(-SInE) 0.8
Konstantin Korovin
University of Manchester, United Kingdom
Architecture
iProver is an automated instantiation-based theorem prover for first-order
logic.
We refer to description of iProver 0.7 for
general information.
Major additions in the current version are:
http://www.cs.man.ac.uk/~korovink/iprover/
iProver-Eq(-SInE) 0.6
Konstantin Korovin, Christoph Sticksel
University of Manchester, United Kingdom
Architecture
iProver-Eq [KS10] extends the iProver system
[Kor08] with built-in equational reasoning,
along the lines of [GK04].
As in the iProver system, first-order reasoning is combined with ground
satisfiability checking, where the latter is delegated to an off-the-shelf
ground solver.
iProver-Eq consists of three core components: i) ground reasoning by an SMT solver, ii) first-order equational reasoning on literals in a candidate model by a labelled unit superposition calculus [KS10] and iii) instantiation of clauses with substitutions obtained by ii).
Given a set of first-order clauses, iProver-Eq first abstracts it to a set of ground clauses which are passed to the ground solver. If the ground abstraction is unsatisfiable, then the set of first-order clauses is also unsatisfiable. Otherwise, literals are selected from the first-order clauses based on the model of the ground solver. The labelled unit superposition calculus checks whether the selected literals are conflicting. If they are conflicting, then clauses are instantiated such that the ground solver has to refine its model in order to resolve the conflict. Otherwise, satisfiability of the initial first-order clause set is shown.
Clause selection and literal selection in the unit superposition calculus are implemented in separate given clause algorithms. Relevant substitutions are accumulated in labels during unit superposition inferences and then used to instantiate clauses. For redundancy elimination iProver-Eq uses demodulation and dismatching constraints. In order to efficiently propagate redundancy elimination from instantiation into unit superposition, we implemented different representations of labels based on sets, AND/OR-trees and OBDDs. Non-equational resolution inferences provide further simplifications.
For the LTB division, axiom selection and clausification is done by Vampire 0.6. SInE scripts are used for problem scheduling and iProver-Eq for reasoning.
If no equational literals occur in the input, iProver-Eq falls back to the inference rules of iProver, otherwise the latter are disabled and only unit superposition is used. If all clauses are unit equations, no instantiations need to be generated and the calculus is run without the otherwise necessary bookkeeping.
http://www.cs.man.ac.uk/~korovink/iprover
Isabelle/HOL 2009-2
Jasmin C. Blanchette1, Lawrence C. Paulson2,
Tobias Nipkow1, Makarius Wenzel1,
Stefan Berghofer1
1Technische Universität München, Germany
2University of Cambridge, United Kingdom
Architecture
Isabelle/HOL 2009-2 [NPW02] is the higher-order
logic incarnation of the generic proof assistant
Isabelle2009-2.
Isabelle/HOL provides several automatic proof tactics, notably an equational
reasoner [Nip89], a classical reasoner
[PN94], a tableau prover
[Pau99], and a first-order resolution-based prover
[Hur03].
Although Isabelle is designed for interactive proof development, it is a little known fact that it is possible to run Isabelle from the command line, passing in a theory file with a formula to solve. Isabelle theory files can include Standard ML code to be executed when the file is processed. The TPTP2X Isabelle format module outputs a THF problem in Isabelle/HOL syntax, augmented with ML code that (1) runs the ten tactics in sequence, each with a CPU time limit, until one succeeds or all fail, and (2) reports the result and proof (if found) using the SZS standards. A Perl script is used to insert the CPU time limit (equally divided over the ten tactics) into TPTP2X's Isabelle format output, and then run the command line isabelle-process on the resulting theory file.
The implementation of Isabelle relies on a small LCF-style kernel, meaning that inferences are implemented as operations on an abstract theorem data type. Assuming the kernel is correct, all values of type theorem are correct by construction.
Most of the code for Isabelle was written by the Isabelle teams at the University of Cambridge and the Technische Universität München. A notable exception is the metis proof method, which was taken from the HOL4 theorem prover (also implemented in ML).
Isabelle/HOL is available for all major platforms under a BSD-style license, from
http://www.cl.cam.ac.uk/research/hvg/Isabelle
leanCoP(-SInE) 2.2
Jens Otten
University of Potsdam, Germany
Architecture
leanCoP [OB03,Ott08]
is an automated theorem prover for classical first-order logic with equality.
It is a very compact implementation of the connection (tableau) calculus
[Bib87,LS01].
For the LTB division, leanCoP-SInE runs leanCoP as the underlying inference engine of SInE 0.4, i.e., axiom selection is done by SInE, and proof attempts are done by leanCoP. SInE is developed by Kryštof Hoder; leanCoP-SInE is co-developed by Thomas Raths.
leanCoP can read formulae using the leanCoP syntax as well as the (raw) TPTP syntax format. Equality axioms are automatically added if required. The core leanCoP prover returns a very compact connection proof, which is translated into a readable proof. Several output formats are available. As the main enhancement, leanCoP 2.2 now supports the output of proofs in the (proposed) TPTP syntax for representing derivations in connection (tableau) calculi [OS10].
The source code of leanCoP 2.2, which is available under the GNU general public license, together with more information can be found on the leanCoP web site:
http://www.leancop.de
leanCoP-Omega 0.1
Jens Otten, Holger Trölenberg, Thomas Raths
University of Potsdam, Germany
Architecture
leanCoP-Ω is an automated theorem prover for classical first-order
logic with equality and interpreted functions and predicates for integer
arithmetic.
It combines version 2.1 of the compact connection prover leanCoP
[Ott08,Ott10]
with version 1.2 of the Omega test system [Pug92].
LEO-II 1.2
Christoph Benzmüller1,
Frank Theiss2
1Articulate Software, USA,
2Saarland University, Germany
Architecture
LEO-II [BP+08], the successor of LEO
[BK98], is a higher-order ATP system based
on extensional higher-order resolution.
More precisely, LEO-II employs a refinement of extensional higher-order
RUE resolution [Ben99].
LEO-II is designed to cooperate with specialist systems for fragments of
higher-order logic.
By default, LEO-II cooperates with the first-order ATP systems E
[Sch02].
LEO-II is often too weak to find a refutation amongst the steadily growing
set of clauses on its own.
However, some of the clauses in LEO-II's search space attain a special
status: they are first-order clauses modulo the application of an
appropriate transformation function.
The default transformation is Hurd's fully typed translation
[Hur03].
Therefore, LEO-II launches a cooperating first-order ATP system every n
iterations of its (standard) resolution proof search loop (e.g.,
n = 10).
If the first-order ATP system finds a refutation, it communicates its success
to LEO-II in the standard SZS format.
Communication between LEO-II and the cooperating first-order ATP system
uses the TPTP language and standards.
The improved performance of LEO-II in comparison to its predecessor LEO (implemented in LISP) is due to several novel features including the exploitation of term sharing and term indexing techniques [TB06], support for primitive equality reasoning (extensional higher-order RUE resolution), and improved heuristics at the calculus level. One recent development is LEO-II's new parser: in addition to the TPTP THF language, this parser now also supports the TPTP FOF and CNF languages. Hence, LEO-II can now also be used for FOF and CNF problems. Unfortunately the LEO-II system still uses only a very simple sequential collaboration model with first-order ATPs instead of using the more advanced, concurrent and resource-adaptive OANTS architecture [BS+08] as exploited by its predecessor LEO.
The LEO-II system is distributed under a BSD style license, and it is available from:
http://leoprover.org
This year LEO-II will also participate in the FOF and CNF categories in order to evaluate its performance for these fragments. For this, note that LEO-II still employs its own input processing and normalization techniques, and that calls to prover E are applied only modulo Hurd's fully typed translation. The evaluation will thus reveal the performance loss in comparison to E and it will likely point to relevant future work. Another reason for entering THF, FOF, and CNF is to demonstrate that integrated higher-order-first-order theorem provers are generally feasible.
Metis 2.2
Joe Hurd
Galois, Inc., USA
Architecture
Metis 2.2 [Hur03] is a proof tactic used in the
HOL4 interactive theorem prover.
It works by converting a higher order logic goal to a set of clauses in
first order logic, with the property that a refutation of the clause set
can be translated to a higher order logic proof of the original goal.
Experiments with various first order calculi [Hur03] have shown a given clause algorithm and ordered resolution best suit this application, and that is what Metis 2.2 implements. Since equality often appears in interactive theorem prover goals, Metis 2.2 also implements the ordered paramodulation calculus.
In addition to standard age and size measures, Metis 2.2 uses finite models to weight clauses in the Passive set. When integrated with higher order logic, an interpretation of known functions and relations is manually constructed to make many of their standard properties valid in the finite model. For example, if the domain of the model is the set {0,...,7}, and the higher order logic arithmetic functions are interpreted in the model modulo 8. Unknown functions and relations are interpreted randomly, but with a bias towards making supporting theorems valid in the model. The finite model strategy carries over to TPTP problems, by manually interpreting a collection of functions and relations that appear in TPTP axiom files in such a way as to make the axioms valid in the model.
Metis 2.2 reads problems in TPTP format and outputs detailed proofs in TSTP format, where each refutation step is one of 6 simple inference rules. Metis 2.2 implements a complete calculus, so when the set of clauses is saturated it can soundly declare the input problem to be unprovable (and outputs the saturation set).
Metis 2.2 is free software, released under the GPL. It is available from
http://www.gilith.com/software/metis
MetiTarski 1.3
Larry Paulson
University of Cambridge, United Kingdom
Architecture
MetiTarski [AP10,Bro03]
is an automatic theorem prover based on a combination of resolution and
QEPCAD-B [Bro03], a decision procedure for the
theory of real closed fields.
It is designed to prove theorems involving real-valued special functions
such as log, exp, sin, cos and sqrt.
In particular, it is designed to prove universally quantified inequalities
involving such functions.
This problem is undecidable: MetiTarski is incomplete.
MetiTarski is a modified version of Joe Hurd's theorem prover, Metis
[Hur03].
http://www.cl.cam.ac.uk/~lp15/papers/Arith/
Muscadet 4.0
Dominique Pastre
University Paris Descartes, France
Architecture
The Muscadet theorem prover is a knowledge-based system.
It is based on Natural Deduction, following the terminology of
[Ble71] and [Pas78],
and uses methods that resemble those used by humans.
It is composed of an inference engine, which interprets and executes rules,
and of one or several bases of facts,
which are the internal representation of "theorems to be proved".
Rules are either universal and put into the system, or built by the system
itself by metarules derived from data (definitions and lemmas).
Rules may add new hypotheses, modify the conclusion, create objects,
split theorems into two or more subtheorems,
or build new rules that are local for a (sub-)theorem.
The system is also able to work with second order statements. It may also receive knowledge and know-how for a specific domain from a human user; see [Pas89] and [Pas93]. These two possibilities are not used while working with the TPTP Library.
Muscadet is available from:
http://www.math-info.univ-paris5.fr/~pastre/muscadet/muscadet.html
omkbTT 1.0
Sarah Winkler
University of Innsbruck, Austria
Architecture
Based on ordered completion [BDP89], omkbTT 1.0
constitutes a theorem prover for unit equality problems.
It stands out from classical completion tools in two respects:
(1) automatic termination tools replace a fixed reduction order, and
(2) omkbTT employs a multi-completion approach
[KK99]
to explore multiple branches of the search tree at once.
A detailed description of the underlying inference system can be found
in [WM10].
omkbTT is parameterized by a termination strategy for the termination checks performed internally, and a selection strategy to choose the equation to be processed next. To ensure that a saturated set of equations and rules is ground-complete, the default termination strategy comprises a number of termination techniques inducing total termination such as Knuth-Bendix orders, lexicographic path orders, polynomial interpretations, and multiset path orders. The default selection strategy first chooses a process for which the size of its current equation and constraint set is minimal. Then a node for this process is selected by considering the term size and timestamp, the latter to ensure fairness of the deduction.
Since in the setting with termination tools the reduction order developed along the way is not known in advance, the set of ordered critical pairs is not computable during the deduction. Instead, critical consequences are over-approximated using the embedding relation [WM10]. Furthermore, omkbTT uses multi-completion variants of critical pair criteria [BD88] to restrict the number of equational consequences, and isomorphic processes are filtered out to restrict the search space. Details about these optimizations can be found in [WS+10].
omkbTT is written in OCaml. It interfaces the termination tool TTT2 [KS+09] internally to perform termination checks. Version 1.0 is available from:
http://cl-informatik.uibk.ac.at/software/omkbtt
Otter 3.3
William McCune
Argonne National Laboratory, USA
Architecture
Otter 3.3 [McC03] is an ATP system for
statements in first-order (unsorted) logic with equality.
Otter is based on resolution and paramodulation applied to clauses.
An Otter search uses the "given clause algorithm", and typically involves
a large database of clauses; subsumption and demodulation play an important
role.
http://www.cs.unm.edu/~mccune/otter/
Acknowledgments: Ross Overbeek, Larry Wos, Bob Veroff, and Rusty Lusk contributed to the development of Otter.
Paradox 3.0
Koen Claessen, Niklas Sörensson
Chalmers University of Technology, Sweden
Architecture
Paradox [CS03] is a
finite-domain model generator. It is based on a MACE-style
[McC03] flattening and instantiating of the
first-order clauses into propositional clauses, and then the use of
a SAT solver to solve the resulting problem.
Paradox incorporates the following features: Polynomial-time clause splitting heuristics, the use of incremental SAT, static symmetry reduction techniques, and the use of sort inference.
Satallax 1.4
Chad E. Brown
Saarland University, Germany
Architecture
Satallax is an automated theorem prover for higher-order logic.
The particular form of higher-order logic supported by Satallax is
Church's simple type theory with extensionality and choice
operators.
The SAT solver MiniSat [ES04]
is responsible for much of the search for a proof.
The theoretical basis of search is a complete ground tableau calculus for higher-order logic [BS10] with a choice operator [BB10]. A problem is given using an S-expression variant of the THF format. A branch is formed from the axioms of the problem and negation of the conjecture (if any is given). From this point on, Satallax tries to determine unsatisfiability or satisfiability of this branch.
Satallax progressively generates higher-order formulae and corresponding propositional clauses. These formulae and propositional clauses correspond to instances of the tableau rules. Satallax uses the SAT solver MiniSat as an engine to test the current set of propositional clauses for unsatisfiability. If the clauses are unsatisfiable, then the original branch is unsatisfiable. If there are no quantifiers at function types, the generation of higher-order formulae and corresponding clauses may terminate [BS09,BS09]. In such a case, if MiniSat reports the final set of clauses as satisfiable, then the original set of higher-order formulae is satisfiable (by a standard model in which all types are interpreted as finite sets).
A strategy is an ordered collection of modes with information about how much time the mode should be allotted. Satallax tries each of the modes for a certain amount of time sequentially. Satallax 1.4 uses a default strategy consisting of 22 modes determined through experimentation using the THF problems in the TPTP library.
http://satallax.com
Procedural attachment is used to evaluate arithmetic expressions that are
fully instantiated with numeric arguments.
For example, ($$SUM 2 2) and ($$LESS 2 3) can be rewritten to 4 and TRUE.
The equality relation is also procedurally attached when an argument is
numeric so that, for example, (= ($$SUM ?X 2) 4) can be solved.
For the TFA division, the TFF formulae are converted to FOF using the
standard approach [Wal83], but without predicates
to check the types of numeric variables.
This means SPASS-XDB is unsound for certain types of conjectures that mix
different types of numbers (e.g., integers with reals), but those problems
will not arise in CASC.
Until recently a simple computation system implemented in Prolog was
used (and it's still available) to obtain ground arithmetic facts,
as required to resolve against arithmetic relations (i.e., all arithmetic
function evaluation had to be done inside an $evaleq/2 relation).
More recently this has been replaced by internal evaluation of ground
arithmetic relations and functions.
The external mechanism is now being used to develop more sophisticated
arithmetic capabilities, e.g., solving for a variable in an equation
involving arithmetic functions, e.g., solving for X in
$sum(2,X) = 5.
It is hoped this will be ready in time for CASC.
SPASS-XDB supports integer, rational, and finite precision real arithmetic.
SNARK---20080805r027
Mark Stickel
SRI International, USA
Architecture
SNARK 20080805r027 is a resolution and paramodulation theorem prover
for first-order predicate calculus.
Procedural attachment is used to evaluate arithmetic expressions.
The objective to date is to add some arithmetic calculation in support of
SNARK's applications, not to duplicate or incorporate symbolic algebra
systems to extend SNARK's mathematical range.
SNARK has a rudimentary type system that includes subtypes but not polymorphism.
Thus, the mapping onto TPTP arithmetic types, which lack subtypes, and TPTP
arithmetic operations, which are polymorphic, is approximate.
It uses Common Lisp's unlimited precision rational arithmetic.
Integers are a subtype.
Inexact floating-point arithmetic is not used; real number inputs are
converted to rationals.
Real numbers are a supertype so that irrational numbers can be represented
symbolically.
Implementation
SNARK is written in Common Lisp that is also used as its extension and
scripting language.
SNARK is available from
http://www.ai.sri.com/~stickel/snark.html
SPASS+T 2.2.12
Uwe Waldmann1,
Stephan Zimmer2
1Max-Planck-Institut für Informatik, Germany,
2AbsInt GmbH, Germany
Architecture
SPASS+T is an extension of the superposition-based theorem prover SPASS
that integrates algebraic knowledge into SPASS in three complementary ways:
by passing derived formulae to an external SMT procedure
(currently Yices or CVC3), by adding standard axioms,
and by built-in arithmetic simplification and inference rules.
A first version of the system has been described in
[PW06].
In the current version, a much more sophisticated coupling of
the SMT procedure has been added [Zim07].
Strategies
Standard axioms and built-in arithmetic simplification and inference rules
are integrated into the standard main loop of SPASS.
The external SMT procedure runs in parallel in a separate process,
leading occasionally to non-deterministic behaviour.
Implementation
SPASS+T is implemented in C. The system is available from
http://www.mpi-inf.mpg.de/~uwe/software/#TSPASS
SPASS-XDB 3.01X0.6
Geoff Sutcliffe1,
Martin Suda2
1University of Miami, USA,
2Charles University in Prague, Czech Republic
Architecture
SPASS-XDB [SS+09,SS+10]
is an extended version of the well-known, state-of-the-art, SPASS automated
theorem proving system [WF+09].
The original SPASS reads a problem, consisting of axioms and a conjecture,
in TPTP format from a file, and searches for a proof by refutation for the
conjecture.
SPASS-XDB adds the capability of retrieving extra positive unit axioms
(facts) from external sources during the proof search (hence the "XDB",
standing for eXternal DataBases).
The axioms are retrieved asynchronously, on-demand, based on an expectation
that they will contribute to completing the proof.
The axioms are retrieved from a range of external sources, including SQL
databases, SPARQL endpoints, WWW services, computation sources (e.g.,
computer algebra systems), etc., using a TPTP standard protocol.
Strategies
Generally, SPASS-XDB follows SPASS' strategies.
However, SPASS, like most (all?) ATP systems, was designed under the
assumption that all formulae are in the problem file, i.e., it is ignorant
that external axioms might be delivered.
To regain completeness, constraints on SPASS’ search had to be relaxed in
SPASS-XDB.
This increases the search space, i.e., so the constraints were relaxed
in a controlled, incremental fashion. [SS+10]
The search space is also affected by the number of external axioms that can
be delivered, and mechanisms to control the delivery and focus the consequent
search have been implemented. [SS+10]
http://www.tptp.org/cgi-bin/SystemOnTPTP
TPS 3.080227G1d
Peter B. Andrews1,
Chad E. Brown2
1Carnegie Mellon University, USA,
2Saarland University, Germany
Architecture
TPS is a higher-order theorem proving system that has been developed over
several decades under the supervision of Peter B. Andrews with substantial
work by Eve Longini Cohen, Dale A. Miller, Frank Pfenning, Sunil Issar, Carl
Klapper, Dan Nesmith, Hongwei Xi, Matthew Bishop, Chad E. Brown, and Mark
Kaminski. TPS can be used to prove theorems of Church's type theory
automatically, interactively, or semi-automatically
[AB+96,AB06].
When searching for a proof, TPS first searches for an expansion proof [Mil87] or an extensional expansion proof [Bro07] of the theorem. Part of this process involves searching for acceptable matings [And81]. Using higher-order unification, a pair of occurrences of subformulae (which are usually literals) is mated appropriately on each vertical path through an expanded form of the theorem to be proved. The expansion proof thus obtained is then translated [Pfe87] without further search into a proof of the theorem in natural deduction style. The translation process provides an effective soundness check, but it is not actually used during the competition.
TPS is implemented in Common Lisp, and is available from:
http://gtps.math.cmu.edu/tps.html
No system description supplied.
Substitution tree indexing is used to implement all major operations on
sets of terms and literals.
Although the kernel of the system works only with clausal normal forms,
the shell accepts a problem in the full first-order logic syntax, clausifies
it and performs a number of useful transformations before passing the result
to the kernel.
Also the axiom selection algorithm of SInE can be enabled as part of the
preprocessing.
When a theorem is proved, the system produces a verifiable proof, which
validates both the clausification phase and the refutation of the CNF.
Waldmeister C09a is a minor upgrade of Waldmeister 806, the best feature
of which is TPTP format output.
Vampire 10.0
Andrei Voronkov
University of Manchester, United Kingdom
Expected Competition Performance
Vampire 10.0 is the CASC-22 CNF division winner.
Vampire 11.0 and Vampire-LTB 11.0
Andrei Voronkov, Kryštof Hoder
University of Manchester, United Kingdom
Architecture
Vampire 11.0 is an automatic theorem prover for first-order classical logic.
It consists of a shell and a kernel.
The kernel implements the calculi of ordered binary resolution and
superposition for handling equality.
The splitting rule in kernel adds propositional parts to clauses, which
are manipulated using binary decision diagrams (BDDs).
A number of standard redundancy criteria and simplification techniques are
used for pruning the search space: subsumption, tautology deletion,
subsumption resolution and rewriting by ordered unit equalities.
The reduction ordering is the Knuth-Bendix Ordering.
Strategies
The Vampire 11.0 kernel provides a fairly large number of features for
strategy selection.
The most important ones are:
The Vampire 11.0 core is a completely new Vampire, and it shares virtually
no code with the previous versions.
The automatic mode of Vampire 11.0, however, uses both the Vampire 11.0
core and Vampire 10.0 to solve problems, based on input problem
classification that takes into account simple syntactic properties, such
as being Horn or non-Horn, presence of equality, etc.
Implementation
Vampire 11.0 is implemented in C++. The supported compilers are gcc 4.1.x,
gcc 4.3.x.
Expected Competition Performance
Vampire 11.0 is the CASC-22 FOF division winner.
Vampire-LTB 11.0 is the CASC-22 LTB division winner.
Waldmeister C09a
Thomas Hillenbrand
Max-Planck-Institut für Informatik, Germany
Architecture
Waldmeister C09a [Hil03] is a system for unit
equational deduction.
Its theoretical basis is unfailing completion in the sense of
[BDP89] with refinements towards ordered
completion (cf. [AHL03]).
The system saturates the input axiomatization, distinguishing active facts,
which induce a rewrite relation, and passive facts, which are the one-step
conclusions of the active ones up to redundancy.
The saturation process is parameterized by a reduction ordering and a heuristic
assessment of passive facts [HJL99].
Strategies
The approach taken to control the proof search is to choose the search
parameters according to the algebraic structure given in the problem
specification [HJL99].
This is based on the observation that proof tasks sharing major parts of
their axiomatization often behave similar.
Hence, for a number of domains, the influence of different reduction orderings
and heuristic assessments has been analyzed experimentally; and in most cases
it has been possible to distinguish a strategy uniformly superior on the whole
domain.
In essence, every such strategy consists of an instantiation of the first
parameter to a Knuth-Bendix ordering or to a lexicographic path ordering, and
an instantiation of the second parameter to one of the weighting functions
addweight, gtweight, or mixweight, which, if
called on an equation , return
,
, or
,
respectively, where denotes the number of symbols
in .
Implementation
The prover is coded in ANSI-C. It runs on Solaris, Linux, and newly
also on MacOS X. In addition, it is now available for Windows users
via the Cygwin platform. The central data structures are: perfect
discrimination trees for the active facts; group-wise compressions for
the passive ones; and sets of rewrite successors for the conjectures.
Visit the Waldmeister web pages at:
http://www.waldmeister.org
Expected Competition Performance
Waldmeister C09a is the CASC-22 UEQ division winner.
Waldmeister 710
Thomas Hillenbrand
Max-Planck-Institut für Informatik, Germany
Architecture
Waldmeister 710 [Hil03] is a system for unit
equational deduction.
Its theoretical basis is unfailing completion in the sense of
[BDP89] with refinements towards ordered
completion (cf. [AHL03]).
The system saturates the input axiomatization, distinguishing active facts,
which induce a rewrite relation, and passive facts, which are the one-step
conclusions of the active ones up to redundancy.
The saturation process is parameterized by a reduction ordering and a heuristic
assessment of passive facts [HJL99].
This year's version is the result of polishing and fixing a few things in
last year's.
Implementation
The approach taken to control the proof search is to choose the search
parameters – reduction ordering and heuristic assessment –
according to the algebraic structure given in the problem specification
[HJL99].
This is based on the observation that proof tasks sharing major parts of
their axiomatization often behave similarly.
http://www.waldmeister.org
Zenon 0.6.3
Damien Doligez
INRIA, France
Architecture
Zenon 0.6.3 [BDD07] is a theorem prover based on
a proof-confluent version of analytic tableaux.
It uses all the usual tableau rules for first-order logic with a rule-based
handling of equality.
Zenon outputs formal proofs that can be checked by Coq or Isabelle.
http://zenon-prover.org