Entrants' System Descriptions


Darwin 1.3

Peter Baumgartner1, Alexander Fuchs2, Cesare Tinelli2
1National ICT Australia, Australia,
2The University of Iowa, USA

Architecture

Darwin [
BFT04, BFT06a] is an automated theorem prover for first order clausal logic. It is an implementation of the Model Evolution Calculus [BT03]. The Model Evolution Calculus lifts the propositional DPLL procedure to first-order logic. One of the main motivations for this approach was the possibility of migrating to the first-order level some of those very effective search techniques developed by the SAT community for the DPLL procedure.

The current version of Darwin implements first-order versions of unit propagation inference rules analogously to a restricted form of unit resolution and subsumption by unit clauses. To retain completeness, it includes a first-order version of the (binary) propositional splitting inference rule.

Proof search in Darwin starts with a default interpretation for a given clause set, which is evolved towards a model or until a refutation is found.

Implementation

The central data structure is the context. A context represents an interpretation as a set of first-order literals. The context is grown by using instances of literals from the input clauses. The implementation of Darwin is intended to support basic operations on contexts in an efficient way. This involves the handling of large sets of candidate literals for modifying the current context. The candidate literals are computed via simultaneous unification between given clauses and context literals. This process is sped up by storing partial unifiers for each given clause and merging them for different combinations of context literals, instead of redoing the whole unifier computations. For efficient filtering of unneeded candidates against context literals, discrimination tree or substitution tree indexing is employed. The splitting rule generates choice points in the derivation which are backtracked using a form of backjumping similar to the one used in DPLL-based SAT solvers.

Improvements to the previous version include additional preprocessing steps, less memory requirements, and lemma learning [BFT06b].

Darwin is implemented in OCaml and has been tested under various Linux distributions. It is available from:

    http://goedel.cs.uiowa.edu/Darwin/

Strategies

Darwin traverses the search space by iterative deepening over the term depth of candidate literals. Darwin employs a uniform search strategy for all problem classes.

Expected Competition Performance

Darwin 1.3 is the CASC-J3 EPR division winner.


Darwin 1.4.1

Peter Baumgartner1, Alexander Fuchs2, Cesare Tinelli2
1NICTA, Australia,
2The University of Iowa, USA

Architecture

Darwin [
BFT04, BFT06a] is an automated theorem prover for first order clausal logic. It is an implementation of the Model Evolution Calculus [BT03]. The Model Evolution Calculus lifts the propositional DPLL procedure to first-order logic. One of the main motivations for this approach was the possibility of migrating to the first-order level some of those very effective search techniques developed by the SAT community for the DPLL procedure.

The current version of Darwin implements first-order versions of unit propagation inference rules analogously to a restricted form of unit resolution and subsumption by unit clauses. To retain completeness, it includes a first-order version of the (binary) propositional splitting inference rule.

Proof search in Darwin starts with a default interpretation for a given clause set, which is evolved towards a model or until a refutation is found.

Implementation

The central data structure is the context. A context represents an interpretation as a set of first-order literals. The context is grown by using instances of literals from the input clauses. The implementation of Darwin is intended to support basic operations on contexts in an efficient way. This involves the handling of large sets of candidate literals for modifying the current context. The candidate literals are computed via simultaneous unification between given clauses and context literals. This process is sped up by storing partial unifiers for each given clause and merging them for different combinations of context literals, instead of redoing the whole unifier computations. For efficient filtering of unneeded candidates against context literals, discrimination tree or substitution tree indexing is employed. The splitting rule generates choice points in the derivation which are backtracked using a form of backjumping similar to the one used in DPLL-based SAT solvers.

Darwin is implemented in OCaml and has been tested under various Linux distributions. It is available from:

    http://goedel.cs.uiowa.edu/Darwin/

Strategies

Darwin traverses the search space by iterative deepening over the term depth of candidate literals. Darwin employs a uniform search strategy for all problem classes.

Expected Competition Performance

We expect its performance to be very strong in the EPR division.


FM-Darwin 1.4.1

Peter Baumgartner1, Alexander Fuchs2, Cesare Tinelli2
1NICTA, Australia,
2The University of Iowa, USA

Architecture

FM-Darwin [
BF+06] is a finite model finder for first order clausal logic with equality, in the spirit of Paradox [CS03].

For each domain size the problem is transformed into an equisatisfiable function-free clause set, which is decided by the Darwin prover [BFT06a].

Implementation

FM-Darwin is implemented as an extension to Darwin, which can now be used in its default mode or as a finite model finder.

Like Paradox, the system uses clause splitting, term definitions, static symmetry reduction, sort inference, and lemmas. In contrast, clause splitting and term definitions are only applied in a restricted way, that is for variable disjunct clause and ground terms, as ground instantiation is not performed and thus the exponential increase in the size of the clause set does not happen.

FM-Darwin is available from:

    http://goedel.cs.uiowa.edu/Darwin/

Strategies

FM-Darwin follows a popular strategy of finite domain finders.

The first-order clausal problem is transformed into a decidable logic, here function-free clause logic. Then, the system tries to find a finite domain model for sizes 1 .. n. A given domain is encoded by adding the axioms of totality and functionality for the symbols which have been converted from function to predicate symbols, such that any finite model of the original problem can be translated into a model of the transformed problem, and vice versa.

Like Paradox, FM-Darwin is a complete method for function-free clause sets, and can also detect unsatisfiability if the totality axioms are not used in a refutation.

Expected Competition Performance

We expect its performance to be good in the SAT and FNT divisions.


E and EP 0.999

Stephan Schulz
Institut für Informatik, Technische Universität, Germany

Architecture

E 0.999 [
Sch02,Sch04a] is a purely equational theorem prover. The core proof procedure operates on formulas in clause normal form, using a calculus that combines superposition (with selection of negative literals) and rewriting. No special rules for non-equational literals have been implemented, i.e., resolution is simulated via paramodulation and equality resolution. The basic calculus is extended with rules for AC redundancy elimination, some contextual simplification, and pseudo-splitting. The latest versions of E also supports simultaneous paramodulation, either for all inferences or for selected inferences.

E is based on the DISCOUNT-loop variant of the given-clause algorithm, i.e. a strict separation of active and passive facts. Proof search in E is primarily controlled by a literal selection strategy, a clause evaluation heuristic, and a simplification ordering. The prover supports a large number of preprogrammed literal selection strategies, many of which are only experimental. Clause evaluation heuristics can be constructed on the fly by combining various parameterized primitive evaluation functions, or can be selected from a set of predefined heuristics. Supported term orderings are several parameterized instances of Knuth-Bendix-Ordering (KBO) and Lexicographic Path Ordering (LPO).

The prover uses a preprocessing step to convert formulas in full first order format to clause normal form. This step may introduce (first-order) definitions to avoid an exponential growth of the formula. Preprocessing also unfolds equational definitions and performs some simplifications on the clause level.

The automatic mode determines literal selection strategy, term ordering, and search heuristic based on simple problem characteristics of the preprocessed clausal problem.

EP 0.999 is just a combination of E 0.999 in verbose mode and a proof analysis tool extracting the used inference steps.

Implementation

E is implemented in ANSI C, using the GNU C compiler. The most outstanding feature is the global sharing of rewrite steps. Current versions of E add rewrite links from rewritten to new terms. In effect, E is caching rewrite operations as long as sufficient memory is available. Other important features are the use of perfect discrimination trees with age and size constraints for rewriting and unit-subsumption, feature vector indexing [Sch04b] for forward- and backward subsumption and contextual literal cutting, and a new polynomial implementation of LPO [Loe04].

The program has been successfully installed under SunOS 4.3.x, Solaris 2.x, HP-UX B 10.20, MacOS-X, and various versions of Linux. Sources of the latest released version are available freely from:

    http://www.eprover.org
EP 0.999 is a simple Bourne shell script calling E and the postprocessor in a pipeline.

Strategies

E's automatic mode is optimized for performance on TPTP. The optimization of E 0.999 is based on about 90 test runs over the library (and previous experience) and consists of the selection of one of about 40 different strategies for each problem. All test runs have been performed on SUN Ultra 60/300 machines with a time limit of 300 seconds (or roughly equivalent configurations). All individual strategies are refutationally complete. The worst one solves about 49% of TPTP 3.0.1, the best one about 60%. We expect similar optimization for E 0.999.

E distinguishes problem classes based on a number of features, all of which have between 2 and 4 possible values. The most important ones are:

Wherever there is a three-way split on a numerical feature value, the limits are selected automatically with the aim of splitting the set of problems into approximately equal sized parts based on this one feature.

For classes above a threshold size, we assign the absolute best heuristic to the class. For smaller, non-empty classes, we assign the globally best heuristic that solves the same number of problems on this class as the best heuristic on this class does. Empty classes are assigned the globally best heuristic. Typically, most selected heuristics are assigned to more than one class.

Expected Competition Performance

In the last years, E performed well in most proof categories. We believe that E will again be among the stronger provers in the CNF category. Performance on FOF problems should be competitive. We hope that E will at least be a useful complement to dedicated systems in the other categories.

EP 0.999 will be hampered by the fact that it has to analyse the inference step listing, an operation that typically is about as expensive as the proof search itself. Nevertheless, it should be competitive among the FOF* and CNF* systems.


E-KRHyper 1.0

Björn Pelzer, Christoph Wernhard
Universität Koblenz-Landau, Germany

Architecture

E-KRHyper 1.0 [
PW07] is a theorem proving and model generation system for first-order logic with equality. It is an implementation of the E-hyper tableau calculus [BFP07], which integrates a superposition-based handling of equality [BG98] into the hyper tableau calculus [BFN96]. The system is an extension of the KRHyper theorem prover [Wer03], which implements the original hyper tableau calculus.

An E-hyper tableau is a tree whose nodes are labeled with clauses and which is built up by the application of the inference rules of the E-hyper tableau calculus. The calculus rules are designed such that most of the reasoning is performed using positive unit clauses. A branch can be extended with new clauses that have been derived from the clauses of that branch.

A positive disjunction can be used to split a branch, creating a new branch for each disjunct. No variables may be shared between branches, and if a case-split creates branches with shared variables, then these are immediately substituted by ground terms. The grounding substitution is arbitrary as long as the terms in its range are irreducible: the branch being split may not contain a positive equational unit which can simplify a substituting term, i.e. rewrite it with one that is smaller according to a reduction ordering. When multiple irreducible substitutions are possible, each of them must be applied in consecutive splittings in order to preserve completeness.

Redundancy rules allow the detection and removal of clauses that are redundant with respect to a branch.

The hyper extension inference from the original hyper tableau calculus is equivalent to a series of E-hyper tableau calculus inference applications. Therefore the implementation of the hyper extension in KRHyper by a variant of semi-naive evaluation [Ull89] is retained in E-KRHyper, where it serves as a shortcut inference for the resolution of non-equational literals.

Implementation

E-KRHyper is implemented in the functional/imperative language OCaml with additional preprocessing scripts in Prolog. The system runs on Unix and MS-Windows platforms and is available under the GNU Public License from the E-KRHyper website at:
   http://www.uni-koblenz.de/~bpelzer/ekrhyper
The system accepts clause input in the TPTP-supported Protein-format. E-KRHyper operates on an E-hyper tableau that is represented by linked node records. A layered discrimination-tree based index provides access to the clauses in the tableau and supports backtracking.

Strategies

This first version of E-KRHyper uses a uniform search strategy for all problems. The E-hyper tableau is generated depth-first, with E-KRHyper always working on a single branch. Refutational completeness and a fair search control are ensured by an iterative deepening strategy with a limit on the maximum term weight of generated clauses.

Expected Competition Performance

E-KRHyper has been completed recently, with limited testing and optimization done, so the expectations are humble.


Equinox 1.2

Koen Claessen
Chalmers University of Technology, Sweden

Architecture

Equinox is an experimental theorem prover for pure first-order logic with equality. It finds ground proofs of the input theory, by solving successive ground instantiations of the theory using an incremental SAT-solver. Equality is dealt with using a Nelson-Oppen framework.

Implementation

The main part of Equinox is implemented in Haskell using the GHC compiler. Equinox also has a built-in incremental SAT solver (MiniSat) which is written in C++. The two parts are linked together on the object level using Haskell's Foreign Function Interface.

Strategies

There is only one strategy in Equinox:
  1. Give all ground clauses in the problem to a SAT solver.
  2. Run the SAT-solver.
  3. If a contradiction is found, we have a proof and we terminate.
  4. If a model is found, we use the model to indicate which new ground instantiations should be added to the SAT-solver.
  5. Goto 2.

Expected Competition Performance

Equinox should perform reasonably well. There should be problems that it can solve that few other provers can handle.


Fampire 1.3

Josef Urban
Charles University of Prague, Czech Republic

Architecture

Fampire 1.3. is the Vampire 8.1 prover using the FLOTTER (SPASS) clausifier.

Implementation

Vampire and (a slightly modified) FLOTTER are linked together using a simple
FlotterOnTPTP.pl Perl script, Geoff Sutcliffe's tptp4X utility, and slightly modified dfg2tptp utility.

Strategies

The strategy is to use the FLOTTER clausifier, and run Vampire on the CNF problem.

Expected Competition Performance

The system outperforms Vampire 8.1 on Mizar problems, however it is weaker than Vampire 8.1 on the TPTP library (for which is Vampire pre-optimized). It could solve some problems that Vampire 8.1 will not solve.


Geo 2007f

Hans de Nivelle
University of Wroclaw, Poland

Architecture

Geo [
dNM06,deN07] is based on geometric resolution. A geometric formula is a formula of form:
   FORALL x1, ..., xn
   [ A1 /\ ... /\ Ap /\ v1 != w1 /\ ... /\ vq != wq -> Z(x1,...,xn) ].
The Ai are atoms not containing (dis)equality. The arguments of the atoms Ai and the inequalities vj != wj are among the variables x1, ..., xn. Function symbols and constant symbols are not allowed.

Z(x1,...,xn) must have one of the following three forms:

  1. The constant FALSE.
  2. A disjunction B1 \/ ... \/ Br. The atoms Bk must be non-equality atoms. The arguments of the atoms Bk are among the variables x1, ..., xn. There are no function symbols, and no constants in the atoms Bk.
  3. An existential quantification EXISTS y B. y is a variable distinct from x1,...,xn. B is an atom which has all arguments among x1,...,xn,y. There are no function symbols and constants in B.
As input, Geo accepts geometric formulas and first-order formulas. First-order formulas are transformed into geometric formulas. During this translation, function symbols and constants are removed, and replaced by relations.

Geo accepts formulas either in its own syntax or in TPTP-syntax. Because CNF has no special status for Geo, TPTP-formulas in CNF form are treated as ordinary first-order formulas.

During search, Geo searches for a model of the geometric formulas by a combination of backtracking and learning. Whenever it cannot extend an interpretation into a model, Geo learns a new rule of Type 1, which ensures that Geo will not explore similar models in the future.

In case Geo finds a finite model, it simply prints the model. In case no model exists, it will eventually learn the rule -> FALSE, so that it is able to output a proof.

The final aim of geometric resolution, and of Geo, is to obtain a prover that is both good at finding proofs, and at finding finite models.

Differences with last year (2006)

Implementation

Geo is written in C++. We try to keep the code readable and reasonably efficient at the same time. Geo has no sophisticated datastructures. At this moment the priority lies with the theoretical understanding of the calculus.

Strategies

Currently, Geo has only one strategy: It searches for a model by exhaustive search, and learns a new rule at each backtracking point. Lemmas do not expire, but they can be simplified.

Expected Competition Performance

Geo has become up to a hundred times faster on hard problems, due to the added simplification rules and the improved selection rules. So we expect that it will solve some more problems than last year.


leanCoP 2.0

Jens Otten
University of Potsdam, Germany

Architecture

leanCoP is an automated theorem prover for classical first-order logic. It is a very compact implementation of the connection calculus as described in the original leanCoP paper [
OB03], which is similar to the connection tableau calculus [LS01].

Implementation

leanCoP is implemented in Prolog. The source code of the core prover is only a few lines long and fits on half a page. In a preprocessing step first-order formulas are translated into clausal form; equality can be handled by adding the equality axioms. leanCoP 2.0 keeps the input clauses in Prolog's database thus integrating Prolog's indexing mechanism into the lean theorem proving framework. leanCoP is available at
    http://www.leancop.de

Strategies

The reduction rule of the connection calculus is applied before the extension rule is applied. Open branches are selected in a depth-first way. Iterative deepening on the proof depth is used to achieve completeness. leanCoP 2.0 integrates inference rules for regularity, factorization and an additional technique to limit backtracking.

Expected Competition Performance

We expect the performance of leanCoP 2.0 to be state-of-the-art according to the TPTP standards, i.e. it is not subsumed by any other theorem prover on the given problem set.

Acknowledgments: Thomas Raths contributed to the development of leanCoP 2.0 by performing comprehensive benchmark tests on several problem libraries.


ileanCoP 1.2

Jens Otten
University of Potsdam, Germany

Architecture

ileanCoP is an automated theorem prover for intuitionistic first-order logic. It is a very compact implementation of the clausal connection calculus for intuitionistic logic [
Ott05], which uses prefixes to capture the specific restrictions of intuitionistic logic.

Implementation

ileanCoP is implemented in Prolog and based on the classical prover leanCoP. Only a few minor additions are necessary to turn leanCoP into a theorem prover for intuitionistic logic. The source code of the core prover, which includes the prefix unification algorithm, is only a few lines long and fits on one page. In a preprocessing step formulas are translated into clausal form and a prefix is added to each literal. Equality can be handled by adding the equality axioms. ileanCoP is available at
    http://www.leancop.de/ileancop

Strategies

ileanCoP uses a classical proof search within the connection calculus to prove classical validity first, since every formula that is valid in intuitionistic logic is valid in classical logic as well. After a classical proof has been found the prefixes of the connected literals are unified using a prefix unification algorithm [OK96]. If unification of all prefixes succeeds the formula is valid in intuitionistic logic. ileanCoP 1.2 integrates some additional techniques from leanCoP 2.0, i.e. inference rules for regularity, factorization and an additional technique to limit backtracking.

Expected Competition Performance

This is the first time an automated theorem prover for intuitionistic logic participates. We expect the performance of ileanCoP 1.2 to be similar to former state-of-the-art theorem provers for classical logic. This assumes that most of the given problems are intuitionistically valid as well. Comprehensive tests on the TPTP library and the ILTP library [ROK07] have shown that this is the case for many problems. Whenever the classical component finds a proof for a given problem and the problem is intuitionistically valid, ileanCoP should find a proof.

Acknowledgments: Thomas Raths contributed to the development of ileanCoP by performing comprehensive benchmark tests on several problem libraries.


iProver 0.2

Konstantin Korovin
The University of Manchester, England

Architecture

iProver is based on the instantiation calculus [
GK03], which is complete for first-order logic.

iProver successively instantiates a given set of clauses, and uses a ground SAT solver for checking whether the current ground abstraction of clauses is satisfiable. If the ground solver finds unsatisfiability then the prover terminates, proving unsatisfiability of the initial set of clauses; otherwise the model for ground clauses is used to guide further instantiations. The saturation process is implemented as a modification of a given clause algorithm. We use non-perfect discrimination trees for the unification index. The following redundancy eliminations are implemented: blocking non-proper instantiations; dismatching constraints [GK04]; resolution-based simplifications; propositional-based simplifications, and removing duplicates. Equality is dealt with (internally) by adding the necessary axioms of equality.

Implementation

iProver is implemented in OCaml and for the ground reasoning uses MiniSat. iProver accepts cnf and fof formats. In the case of fof format, E prover (http://www.eprover.org/) is used for clausification.

Strategies

iProver has parameters to control literal selection, passive clause selection, frequency of calling the SAT solver, interleaving instantiation with resolution.

Expected Competition Performance

Since the instantiation method behind iProver is a decision procedure for the EPR class, we expect a reasonable performance in this class.


Metis 2.0

Joe Hurd
Oxford University, UK

Architecture

Metis 2.0 [
Hur03] is a proof tactic used in the HOL4 interactive theorem prover. It works by converting a higher order logic goal to a set of clauses in first order logic, with the property that a refutation of the clause set can be translated to a higher order logic proof of the original goal.

Experiments with various first order calculi [Hur03] have shown a given clause algorithm and ordered resolution to best suit this application, and that is what Metis 2.0 implements. Since equality often appears in higher order logic goals, Metis 2.0 also implements the ordered paramodulation calculus.

Implementation

Metis 2.0 is written in Standard ML, for ease of integration with HOL4. It uses indexes for resolution, paramodulation, (forward) subsumption and demodulation. It keeps the Active clause set reduced with respect to all the unit equalities so far derived.

In addition to standard size and distance measures, Metis 2.0 uses finite models to weight clauses in the Waiting set. When integrated with higher order logic, a finite model is manually constructed to interpret standard functions and relations in such a way as to make many axioms true and negated goals false. Non-standard functions and relations are interpreted randomly. Since it is part of the CASC competition rules that standard functions and relations are obfuscated, Metis 2.0 will back-off to interpreting all functions and relations randomly (except equality).

Metis 2.0 is free software, released under the GPL. It can be downloaded from:

    http://www.gilith.com/software/metis

Strategies

Metis 2.0 uses a fixed strategy for every input problem. Negative literals are always chosen in favour of positive literals, and terms are ordered using the Knuth-Bendix ordering with uniform symbol weight and precedence favouring reduced arity.

Expected Competition Performance

Running previous versions of Metis on previous CNF problem sets has scored in the bottom half of the table, so that is the expected performance. Metis 2.0 always keeps proofs, so the performance will be no different in the proof classes.

Testing on previous UEQ problem sets has shown Metis to be weak on equality problems, so I expect Metis 2.0 to perform relatively worse on the equality divisions, particularly UEQ.


Muscadet 2.7

Dominique Pastre
Université René Descartes (Paris 5), France

Architecture

The Muscadet theorem prover [
Pas01] is a knowledge-based system. It is based on Natural Deduction, following the terminology of [Ble71] and [Pas78], and uses methods which resembles those used by humans. It is composed of an inference engine, which interprets and executes rules, and of one or several bases of facts, which are the internal representation of "theorems to be proved". Rules are either universal and put into the system, or built by the system itself by metarules from data (definitions and lemmas). Rules may add new hypotheses, modify the conclusion, create objects, split theorems into two or more subtheorems or build new rules which are local for a (sub-)theorem.

Implementation

Muscadet 2 [Pas01] is implemented in SWI-Prolog. Rules are written as declarative Prolog clauses. Metarules are written as sets of Prolog clauses, more or less declarative. The inference engine includes the Prolog interpreter and some procedural Prolog clauses. Proofs can be obtained in natural style. Muscadet 2 is available from:
  http://www.math-info.univ-paris5.fr/~pastre/muscadet/muscadet.html

Strategies

There are specific strategies for existential, universal, conjunctive or disjunctive hypotheses and conclusions. Functional symbols may be used, but an automatic creation of intermediate objects allows deep subformulae to be flattened and treated as if the concepts were defined by predicate symbols. The successive steps of a proof may be forward deduction (deduce new hypotheses from old ones), backward deduction (replace the conclusion by a new one) or refutation (only if the conclusion is a negation).

The system is also able to work with second order statements. It may also receive knowledge and know-how for a specific domain from a human user; see [Pas89, Pas93]. These two possibilities are not used while working with the TPTP Library.

Expected Competition Performance

The best performances of Muscadet will be for problems manipulating many concepts in which all statements (conjectures, definitions, axioms) are expressed in a manner similar to the practice of humans, especially of mathematicians [Pas02,Pas07]. It will have poor performances for problems using few concepts but large and deep formulas leading to many splittings.

Otter 3.3

William McCune
Argonne National Laboratory, USA

Architecture

Otter 3.3 [McC03a] is an ATP system for statements in first-order (unsorted) logic with equality. Otter is based on resolution and paramodulation applied to clauses. An Otter search uses the "given clause algorithm", and typically involves a large database of clauses; subsumption and demodulation play an important role.

Implementation

Otter is written in C. Otter uses shared data structures for clauses and terms, and it uses indexing for resolution, paramodulation, forward and backward subsumption, forward and backward demodulation, and unit conflict. Otter is available from:

    http://www-unix.mcs.anl.gov/AR/otter/

Strategies

Otter's original automatic mode, which reflects no tuning to the TPTP problems, will be used.

Expected Competition Performance

Otter has been entered into CASC as a stable benchmark against which progress can be judged (there have been only minor changes to Otter since 1996 [MW97], nothing that really affects its performace in CASC). This is not an ordinary entry, and we do not hope for Otter to do well in the competition.

Acknowledgments: Ross Overbeek, Larry Wos, Bob Veroff, and Rusty Lusk contributed to the development of Otter.


Paradox 1.3

Koen Claessen, Niklas Sörensson
Chalmers University of Technology and Gothenburg University, Sweden

Architecture

Paradox [
CS03] is a finite-domain model generator. It is based on a MACE-style [McC94] flattening and instantiating of the first-order clauses into propositional clauses, and then the use of a SAT solver to solve the resulting problem.

Paradox incorporates the following features: Polynomial-time clause splitting heuristics, the use of incremental SAT, static symmetry reduction techniques, and the use of sort inference.

The main differences with Paradox 1.0 are: a better SAT-solver, better memory behaviour, and a faster clause instantiation algorithm.

Implementation

The main part of Paradox is implemented in Haskell using the GHC compiler. Paradox also has a built-in incremental SAT solver which is written in C++. The two parts are linked together on the object level using Haskell's Foreign Function Interface.

Strategies

There is only one strategy in Paradox:
  1. Analyze the problem, finding an upper bound N on the domain size of models, where N is possibly infinite. A finite such upper bound can be found, for example, for EPR problems.
  2. Flatten the problem, and split clauses and simplify as much as possible.
  3. Instantiate the problem for domain sizes 1 up to N, applying the SAT solver incrementally for each size. Report "SATISFIABLE" when a model is found.
  4. When no model of sizes smaller or equal to N is found, report "CONTRADICTION".
In this way, Paradox can be used both as a model finder and as an EPR solver.

Expected Competition Performance

Paradox 1.3 is the CASC-J3 SAT division winner.


Paradox 2.2

Koen Claessen, Niklas Sörensson
Chalmers University of Technology, Sweden

Architecture

Paradox 2.2 is a rewrite of
Paradox 1.3. Paradox 2.2 does not have all the features yet that Paradox 1.3 has. Some experimental features, such as type-based model finding, have been added. See the description of Paradox 1.3 for general information.

Expected Competition Performance

Paradox 2.2 will probably perform comparably to Paradox 1.3.


Vampire 8.1

Andrei Voronkov
University of Manchester, England

No system description supplied. However, see the description of Vampire 8.0 for general information. Minor changes have been made, including a bugfix in the FOF to CNF conversion.

Expected Competition Performance

Vampire 8.1 is the CASC-J3 FOF and CNF division winner.


Vampire 9.0

Andrei Voronkov
University of Manchester, England

None supplied.


Waldmeister 806

Thomas Hillenbrand1, Bernd Löchner2
1Max-Planck-Institut für Informatik Saarbrücken, Germany
2Technische Universität Kaiserslautern, Germany,

No system description supplied. However, see the description of Waldmeister 704 for general information.

Expected Competition Performance

Waldmeister 806 is the CASC-J3 UEQ division winner.


References

BG98
Bachmair L., Ganzinger H. (1998), Equational Reasoning in Saturation-Based Theorem Proving, Bibel W., Schmitt P.H., Automated Deduction, A Basis for Applications, pp.352-397, Applied Logic Series 10, Kluwer Academic Publishers.
BFN96
Baumgartner P., Furbach U., Niemelä I. (1996), Hyper Tableaux, Alferes J., Pereira L., Orlowska E., Proceedings of JELIA'96: European Workshop on Logic in Artificial Intelligence (Evora, Portugal), pp.1-17, Lecture Notes in Artificial Intelligence 1126, Springer-Verlag.
BT03
Baumgartner P., Tinelli C. (2003), The Model Evolution Calculus, Baader F., Proceedings of the 19th International Conference on Automated Deduction (Miami, USA), pp.350-364, Lecture Notes in Artificial Intelligence 2741, Springer-Verlag.
BFT04
Baumgartner P., Fuchs A., Tinelli C. (2004), Darwin - A Theorem Prover for the Model Evolution Calculus, Sutcliffe G., Schulz S., Tammet T., Proceedings of the Workshop on Empirically Successful First Order Reasoning, 2nd International Joint Conference on Automated Reasoning (Cork, Ireland).
BFT06a
Baumgartner P., Fuchs A., Tinelli C. (2006), Implementing the Model Evolution Calculus, International Journal on Artificial Intelligence Tools 15(1), pp.21-52.
BFT06b
Baumgartner P., Fuchs A., Tinelli C. (2006), Lemma Learning in the Model Evolution Calculus, Hermann M., Voronkov A., Proceedings of the 13th International Conference on Logic for Programming, Artificial Intelligence, and Reasoning (Phnom Penh, Cambodia), Lecture Notes in Artificial Intelligence.
BFP07
Baumgartner P., Furbach U., Pelzer B. (2007), Hyper Tableaux with Equality, Pfenning F., Proceedings of the 21st International Conference on Automated Deduction (Bremen, Germany), Lecture Notes in Artificial Intelligence 4603, Springer-Verlag.
CS03
Claessen K., Sorensson N. (2003), New Techniques that Improve MACE-style Finite Model Finding, Baumgartner P., Fermueller C., Proceedings of the CADE-19 Workshop: Model Computation - Principles, Algorithms, Applications (Miami, USA).
dNM06
de Nivelle H., Meng J. (2006), Geometric Resolution: A Proof Procedure Based on Finite Model Search, Furbach U., Shankar N., Proceedings of the 3rd International Joint Conference on Automated Reasoning (Seattle, USA), pp.303-317, Lecture Notes in Artificial Intelligence 4130, Springer-Verlag.
deN07
de Nivelle H. (2007), Redundancy for Geometric Resolution, Ahrendt W., Baumgartner P., de Nivelle H., Proceedings of CADE-21 Workshop on DISPROVING - Non-Theorems, Non-Validity, Non-Provability (Bremen, Germany).
GK03
Ganzinger H., Korovin K. (2003), New Directions in Instantiation-Based Theorem Proving, Kolaitis P., Proceedings of the 18th IEEE Symposium on Logic in Computer Science (Ottawa, Canada), pp.55-64.

GK04
Ganzinger H., Korovin K. (2004), Integrating Equational Reasoning into Instantiation-Based Theorem Proving, Marcinkowski J., Tarlecki A., Proceedings of the 18th International Workshop on Computer Science Logic, 13th Annual Conference of the EACSL (Karpacz, Poland), pp.71-84, Lecture Notes in Computer Science 3210.
Hur03
Hurd J. (2003), First-Order Proof Tactics in Higher-Order Logic Theorem Provers, Archer M., Di Vito B., Munoz C., Proceedings of the 1st International Workshop on Design and Application of Strategies/Tactics in Higher Order Logics (Rome, Italy), pp.56-68, NASA Technical Reports NASA/CP-2003-212448.
LS01
Letz R., Stenz G. (2001), Model Elimination and Connection Tableau Procedures, Robinson A., Voronkov A., Handbook of Automated Reasoning, pp.2015-2114, Elsevier Science.
Loe04
Loechner B. (2004), What to Know When Implementing LPO, Sutcliffe G., Schulz S., Tammet T., Proceedings of the Workshop on Empirically Successful First Order Reasoning, 2nd International Joint Conference on Automated Reasoning (Cork, Ireland).
MW97
McCune W.W., Wos L. (1997), Otter: The CADE-13 Competition Incarnations, Journal of Automated Reasoning 18(2), pp.211-220.
McC94
McCune W.W. (1994), A Davis-Putnam Program and its Application to Finite First-Order Model Search: Quasigroup Existence Problems, ANL/MCS-TM-194, Argonne National Laboratory, Argonne, USA.
McC03a
McCune W.W. (2003), Otter 3.3 Reference Manual, ANL/MSC-TM-263, Argonne National Laboratory, Argonne, USA.
OB03
Otten J., Bibel W. (2003), leanCoP: Lean Connection-Based Theorem Proving, Journal of Symbolic Computation 36(1-2), pp.139-161.
OK96
Otten J., Kreitz C. (1996), T-String-Unification: Unifying Prefixes in Non-Classical Proof Methods, Miglioli P., Moscato U., Mundici D., Ornaghi M., Proceedings of the 5th International Conference on Automated Reasoning with Analytic Tableaux and Related Methods (Palermo, Italy), pp.244-20, Lecture Notes in Computer Science 1071, Springer-Verlag.
Ott05
Otten J. (2005), Clausal Connection-Based Theorem Proving in Intuitionistic First-Order Logic, Beckert B., Proceedings of the 14th International Conference on Automated Reasoning with Analytic Tableaux and Related Methods (Koblenz, Germany), pp.245-261, Lecture Notes in Artificial Intelligence 3702, Springer-Verlag.
Pas78
Pastre D. (1978), Automatic Theorem Proving in Set Theory, Artificial Intelligence 10, pp.1-27.
Pas89
Pastre D. (1989), Muscadet : An Automatic Theorem Proving System using Knowledge and Metaknowledge in Mathematics, Artificial Intelligence 38, pp.257-318.
Pas93
Pastre D. (1993), Automated Theorem Proving in Mathematics, Annals of Mathematics and Artificial Intelligence 8, pp.425-447.
Pas01
Pastre D. (2001), Muscadet 2.3 : A Knowledge-based Theorem Prover based on Natural Deduction, Gore R., Leitsch A., Nipkow T., Proceedings of the International Joint Conference on Automated Reasoning (Siena, Italy), pp.685-689, Lecture Notes in Artificial Intelligence 2083, Springer-Verlag.
Pas02
Pastre D. (2002), Strong and Weak Points of the Muscadet Theorem Prover, AI Communications 15(2-3), pp.147-160.
Pas07
Pastre D. (2007), Complementarity of a Natural Deduction Knowledge-based Prover and Resolution-based Provers in Automated Theorem Proving, http://www.math-info.univ-paris5.fr/~pastre/compl-NDKB-RB.pdf.
PW07
Pelzer B., Wernhard C. (2007), System Description: E-KRHyper, Pfenning F., Proceedings of the 21st International Conference on Automated Deduction (Bremen, Germany), Lecture Notes in Artificial Intelligence 4603, Springer-Verlag.
ROK07
Raths T., Otten J., Kreitz C. (2007), The ILTP Problem Library for Intuitionistic Logic - Release v1.1, Journal of Automated Reasoning 38(1-2), pp.261-271.
Sch02
Schulz S. (2002), E: A Brainiac Theorem Prover, AI Communications 15(2-3), pp.111-126.
Sch04a
Schulz S. (2004), System Abstract: E 0.81, Rusinowitch M., Basin D., Proceedings of the 2nd International Joint Conference on Automated Reasoning (Cork, Ireland), pp.223-228, Lecture Notes in Artificial Intelligence 3097.
Sch04b
Schulz S. (2004), Simple and Efficient Clause Subsumption with Feature Vector Indexing, Sutcliffe G., Schulz S., Tammet T., Proceedings of the Workshop on Empirically Successful First Order Reasoning, 2nd International Joint Conference on Automated Reasoning (Cork, Ireland).
Ull89
Ullman J. (1989), Principles of Database and Knowledge-Base Bystems, Computer Science Press, Inc..
Wer03
Wernhard C. (2003), System Description: KRHyper, Fachberichte Informatik 14--2003, Universität Koblenz-Landau, Koblenz, Germany.