CASC-JC Entrants' System Descriptions


Bliksem 1.12

H. de Nivelle
Max Planck Institut für Informatik, Germany
nivelle@mpi-sb.mpg.de

Architecture

Bliksem is a theorem prover implementing resolution and paramodulation. It implements many different orderings, including many that define decision procedures. The automatic selection of strategies is poor. A special feature of Bliksem is that it is able to output totally specified proofs. After processing, these proofs can be checked by Coq.

Implementation

Bliksem is written in portable C. High priority has been given to portability. Bliksem has been compiled successfully using cc, gcc, and djpcc. We plan to rewrite Bliksem, but at this moment we are not certain about the direction in which Bliksem should develop. Bliksem is available from:
    http://www.mpi-sb.mpg.de/~bliksem

Expected Competition Performance

Bliksem has not changed significantly in the last one and a half years, but the competition has changed, and so have the other systems. We will see ...


DCTP 0.1

R. Letz, G. Stenz
Institut für Informatik, Technische Universität München, Germany
{letz,stenz}@informatik.tu-muenchen.de

Architecture

DCTP 0.1 [LS01] is an automated theorem prover for first order clause logic. It is an implementation of the disconnection calculus described in [Bil96]. The disconnection calculus is a proof confluent and inherently cut-free tableau calculus with a weak connectedness condition. Initially, an active path is selected through the input clause set; the path is then expanded by repeated application of the single inference rule, the linking rule: two clauses that are linked via the active path are unified and appended to the bottom of the active path. The new path containing the linked literals is closed, and the remaining new paths are solved in the same manner, until either no open path remains in the tableau, in which case the input set is proven, or no further links can be applied, in which case a model for the input set can be derived from the open path. The concept of branch closure differs from that of most other tableau calculi: a branch is considered closed iff it is ground closed, i.e. it contains a pair of contradicting ground literals when all variables on the branch are instantiated to the same new constant symbol.
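The ground-closure test described above can be sketched in a few lines (a minimal illustration, not DCTP's actual code; the literal representation with uppercase strings as variables is an assumption):

```python
# Literals are (sign, predicate, args) tuples; variables are strings
# starting with an uppercase letter, Prolog-style.

def ground(term, const="$c"):
    """Instantiate every variable in a term to the same new constant."""
    if isinstance(term, str):
        return const if term[0].isupper() else term
    return tuple(ground(t, const) for t in term)

def ground_literal(lit):
    sign, pred, args = lit
    return (sign, pred, tuple(ground(a) for a in args))

def branch_ground_closed(branch):
    """A branch is closed iff its ground instantiation contains a
    complementary pair of literals."""
    seen = set()
    for sign, pred, args in map(ground_literal, branch):
        if (not sign, pred, args) in seen:
            return True
        seen.add((sign, pred, args))
    return False
```

For example, the branch p(X), ~p(Y) is ground closed (both literals collapse to p($c)), whereas p(X), ~p(f(Y)) is not.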

The inherently depth-first proof search is guided by a literal selection based on literal instantiatedness and a heavily parameterized link selection. The pruning mechanisms, indispensable for a proof confluent calculus, mostly rely on different forms of variant deletion and unit-based strategies. Additionally, the calculus has been augmented with relevancy-based backtracking.

The current version has no built-in equality treatment; that is, equality problems must be augmented with the full set of equality axioms to preserve completeness. A version of ordered paramodulation, different from the proposals in Billon's paper, is currently being integrated into DCTP, but it will not be ready for the competition.

Implementation

DCTP has been implemented in the Bigloo dialect of the Scheme language. Major emphasis has been placed on memory preserving techniques, as a proof confluent calculus otherwise easily gobbles up all available resources. All parts of the tableau no longer needed are immediately discarded. DCTP uses a kind of weak term sharing via bound variables to prevent excessive replication of terms during unification.

Expected Competition Performance

As DCTP is a very new system and little experience has been gathered on the effects of the existing guidance heuristics, DCTP is generally not expected to perform on par with the current state-of-the-art systems. Slightly better performance might be expected in the new EPR division.

References

Bil96
Billon J-P. (1996), The Disconnection Method: A Confluent Integration of Unification in the Analytic Framework, Miglioli P., Moscato U., Mundici D., Ornaghi M., Proceedings of TABLEAUX'96: the 5th Workshop on Theorem Proving with Analytic Tableaux and Related Methods (Palermo, Italy), pp.110-126, Lecture Notes in Artificial Intelligence 1071, Springer-Verlag.
LS01
Letz R., Stenz G. (2001), System Description: DCTP - A Disconnection Calculus Theorem Prover, Gore R., Leitsch A., Nipkow T., Proceedings of the International Joint Conference on Automated Reasoning (Siena, Italy), Lecture Notes in Artificial Intelligence, Springer-Verlag.


E 0.6

S. Schulz
Institut für Informatik, Technische Universität München, Germany
schulz@informatik.tu-muenchen.de

Architecture

E 0.6 [Sch99] is a purely equational theorem prover. The calculus used by E combines superposition (with selection of negative literals) and rewriting. For the special case that only unit clauses are present, it degenerates into unfailing completion. For the Horn case E can simulate various positive unit strategies. No special rules for non-equational literals have been implemented, i.e., resolution is simulated via paramodulation and equality resolution.

Proof search in E is controlled by a literal selection strategy and an evaluation heuristic. A variety of selection strategies (including e.g., no selection, or select all negative literals) is available. Evaluation heuristics can be constructed on the fly by combining various parameterized primitive evaluation functions. A simple automatic mode can select strategy, term ordering, and heuristic based on simple problem characteristics.

New features since last year include a wider variety of literal selection strategies, and an improved automatic mode that now also switches between two different Knuth-Bendix orderings. A proof checker is now available as well.

Implementation

E is implemented in ANSI C, using the GNU C compiler. The most distinctive feature of the implementation is the maximally shared term representation. This includes parallel rewriting for all instances of a particular subterm. A second important feature is the use of perfect discrimination trees with age and size constraints for rewriting and unit-subsumption.

The program has been successfully installed under SunOS 4.3.x, Solaris 2.x, HP-UX B 10.20, and various versions of Linux. Sources are available freely from:

    http://wwwjessen.informatik.tu-muenchen.de/~schulz/WORK/eprover.html

References

Sch99
Schulz S. (1999), System Abstract: E 0.3, Ganzinger H., Proceedings of the 16th International Conference on Automated Deduction (Trento, Italy), pp.297-301, Lecture Notes in Artificial Intelligence 1632, Springer-Verlag.


E 0.62 and EP 0.62

S. Schulz
Institut für Informatik, Technische Universität München, Germany
schulz@informatik.tu-muenchen.de

Architecture

E 0.62 [Sch01] is a purely equational theorem prover. The calculus used by E combines superposition (with selection of negative literals) and rewriting. No special rules for non-equational literals have been implemented, i.e. resolution is simulated via paramodulation and equality resolution. New additions to the calculus since E 0.6 include AC redundancy elimination and AC simplification for dynamically recognized associative and commutative equational theories, as well as simulated clause splitting (inspired by Vampire).

E is based on the DISCOUNT-loop variant of the given-clause algorithm, i.e. a strict separation of active and passive facts. Proof search in E is primarily controlled by a literal selection strategy, a clause evaluation heuristic, and a simplification ordering. The prover supports a large number of preprogrammed literal selection strategies, many of which are only experimental. Clause evaluation heuristics can be constructed on the fly by combining various parameterized primitive evaluation functions, or can be selected from a set of predefined heuristics. Supported term orderings are several parameterized instances of Knuth-Bendix-Ordering (KBO) and Lexicographic Path Ordering (LPO).
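The DISCOUNT-loop variant of the given-clause algorithm mentioned above can be sketched generically; `evaluate`, `generate` and `is_empty` below are placeholders standing in for E's heuristics and inference rules, not E's actual interfaces:

```python
import heapq

def discount_loop(clauses, evaluate, generate, is_empty, limit=10000):
    """Given-clause loop with strict separation of active and passive
    facts: passive clauses are only evaluated, never inter-simplified."""
    active = []
    passive = [(evaluate(c), i, c) for i, c in enumerate(clauses)]
    heapq.heapify(passive)
    count = len(clauses)
    while passive and count < limit:
        _, _, given = heapq.heappop(passive)   # best passive clause
        if is_empty(given):
            return "proof found"
        for new in generate(given, active):    # infer with active set only
            heapq.heappush(passive, (evaluate(new), count, new))
            count += 1
        active.append(given)
    return "saturation / limit reached"
```

With a toy propositional `generate` performing binary resolution and `evaluate = len`, the loop derives the empty clause from {p} and {~p} in two iterations.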

An automatic mode can select literal selection strategy, term ordering (different versions of KBO and LPO), and search heuristic based on simple problem characteristics.

EP 0.62 is just a combination of E 0.62 in full verbose mode and a proof generation tool. Proof object generation is handled by complex postprocessing of the full search protocol.

Implementation

E is implemented in ANSI C, using the GNU C compiler. The most distinctive feature of the implementation is the maximally shared term representation. This includes parallel rewriting for all instances of a particular subterm. A second important feature is the use of perfect discrimination trees with age and size constraints for rewriting and unit-subsumption.

The program has been successfully installed under SunOS 4.3.x, Solaris 2.x, HP-UX B 10.20, and various versions of Linux. Sources of the latest released version and a current snapshot are available freely from:

    http://wwwjessen.informatik.tu-muenchen.de/~schulz/WORK/eprover.html
EP 0.62 is a simple Bourne shell script calling E and the postprocessor in a pipeline.

Expected Competition Performance

In the last year, E won the MIX category of CASC and came in fourth in the UEQ division. We believe that E will again be among the strongest provers in the MIX category, in particular due to its good performance for Horn problems. We also hope to improve in the UEQ category, although we probably will not nearly reach the performance of Waldmeister.

EP 0.62 will be hampered by the fact that it has to reconstruct proofs in a post-processing step that typically uses at least the same amount of time as the proof search itself. This is particularly grave because E finds a relatively large number of proofs late in the proof search.

E's auto mode is optimized for performance on the TPTP 2.3.0. The optimization is based on a fairly large number of test runs and consists of the selection of one of about 50 different strategies for each problem.

References

Sch01
Schulz S. (2001), System Abstract: E 0.61, Gore R., Leitsch A., Nipkow T., Proceedings of the International Joint Conference on Automated Reasoning (Siena, Italy), Lecture Notes in Artificial Intelligence, Springer-Verlag.


E-SETHEO csp01

R. Letz, S. Schulz, G. Stenz
Technische Universität München, Germany
{letz,schulz,stenz}@informatik.tu-muenchen.de

Architecture

E-SETHEO is a compositional theorem prover for formulae in first order clause logic, combining the systems E [Sch99] and SETHEO [MI+97]. It incorporates different calculi and proof procedures like superposition, model elimination and semantic trees (the DPLL procedure). Furthermore, the system contains transformation techniques which may split the formula into independent subparts or which may perform a ground instantiation. Finally, advanced methods for controlling and optimizing the combination of the subsystems are applied. E-SETHEO additionally includes the program Flotter [WGR96] as a preprocessing module for transforming non-clausal formulae to clausal form.

Since version 99csp of E-SETHEO, the different strategies are run sequentially, one after the other, depending on the allocation of resources to the different strategies, so-called schedules, which have been computed from experimental data using machine learning techniques as described in [SW99]. Schedule selection depends on syntactic characteristics of the input formula. E-SETHEO csp01 incorporates the disconnection prover DCTP [LS01] as a new strategy as well as a new version of the E prover.
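Schedule-based strategy composition of this kind can be illustrated as follows (a hypothetical sketch; the feature predicates, strategy names and time slices are invented, not E-SETHEO's actual schedules):

```python
def select_schedule(problem, schedules):
    """Pick the first schedule whose feature predicate matches the
    problem; schedules is a list of (predicate, slices) pairs."""
    for predicate, slices in schedules:
        if predicate(problem):
            return slices
    return schedules[-1][1]          # fall back to the last schedule

def run_schedule(problem, slices, run_strategy):
    """Run each strategy for its allotted time slice, one after the
    other, until one of them succeeds."""
    for strategy, seconds in slices:
        result = run_strategy(strategy, problem, seconds)
        if result is not None:
            return strategy, result
    return None
```

A schedule here is just a list of (strategy, seconds) pairs; in the real system the slices are computed from experimental data by machine learning, as described in [SW99].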

Implementation

Reflecting the diversity of the contained systems, the modules of E-SETHEO are implemented in different programming languages, including C, Prolog, Scheme, and shell tools.

The program runs under Solaris 2.6. Sources are available freely.

Expected Competition Performance

We expect E-SETHEO to perform well in all categories it participates in.

References

LS01
Letz R., Stenz G. (2001), System Description: DCTP - A Disconnection Calculus Theorem Prover, Gore R., Leitsch A., Nipkow T., Proceedings of the International Joint Conference on Automated Reasoning (Siena, Italy), Lecture Notes in Artificial Intelligence, Springer-Verlag.
MI+97
Moser M., Ibens O., Letz R., Steinbach J., Goller C., Schumann J., Mayr K. (1997), SETHEO and E-SETHEO: The CADE-13 Systems, Journal of Automated Reasoning 18(2), pp.237-246.
Sch99
Schulz S. (1999), System Abstract: E 0.3, Ganzinger H., Proceedings of the 16th International Conference on Automated Deduction (Trento, Italy), pp.297-301, Lecture Notes in Artificial Intelligence 1632, Springer-Verlag.
SW99
Stenz G., Wolf A. (1999), E-SETHEO: Design, Configuration and Use of a Parallel Automated Theorem Prover, Foo N., Proceedings of AI'99: The 12th Australian Joint Conference on Artificial Intelligence (Sydney, Australia), pp.231-243, Lecture Notes in Artificial Intelligence 1747, Springer-Verlag.
WGR96
Weidenbach C., Gaede B., Rock G. (1996), SPASS and FLOTTER, McRobbie M., Slaney J.K., Proceedings of the 13th International Conference on Automated Deduction (New Brunswick, USA), pp.141-145, Lecture Notes in Artificial Intelligence 1104, Springer-Verlag.


Gandalf and GandalfFOF c-2.3

T. Tammet
Tallinn Technical University, Estonia
tammet@cc.ttu.ee

Architecture

Gandalf version "c" is the version of Gandalf for classical logic. Special versions for intuitionistic logic and type theory exist but are not covered here.

Gandalf is a resolution prover which implements a number of different basic strategies: binary ordered resolution for several orderings, versions of set-of-support resolution, binary unit resolution, and hyperresolution. Enhancements are used for cutting off literals and for combining different strategies, in particular forward and backward reasoning, into one run. Equality is handled by ordered paramodulation and demodulation.

The basic idea of the automatic mode in Gandalf is time-slicing: Gandalf selects a set of different search strategies, allocates time to these, and finally runs the strategies one after another. The motivation is the following: since it is very hard to determine a suitable strategy by some heuristic procedure, it pays off to try a number of possibly suitable strategies. The selection of strategies and the percentage of time they receive is based, in a somewhat ad hoc way, on the characteristics of the problem: the percentage of Horn and almost-Horn clauses, the number of clauses, and the percentage of clauses whose literals can be ordered using term and variable depths.
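The time allocation step might look roughly like this; the concrete weights, thresholds and strategy names below are invented for illustration and are not Gandalf's actual values:

```python
def allocate_slices(clauses, total_seconds):
    """Split the total time budget across strategies, based on the
    fraction of Horn clauses in the input (a clause is a list of
    (sign, atom) literals; Horn means at most one positive literal)."""
    n = len(clauses)
    horn = sum(1 for c in clauses if sum(1 for lit in c if lit[0]) <= 1)
    horn_ratio = horn / n if n else 0.0
    if horn_ratio > 0.9:       # (almost) Horn: favour hyperresolution
        weights = {"hyper": 0.5, "binary-ordered": 0.3, "sos": 0.2}
    else:
        weights = {"binary-ordered": 0.5, "sos": 0.3, "hyper": 0.2}
    return [(s, w * total_seconds) for s, w in weights.items()]
```

The resulting (strategy, seconds) list is then run sequentially, as in the time-slicing loop described above.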

GandalfFOF c-2.3 is a pipeline of Otter [McC94], which converts the problem to CNF, and Gandalf c-2.3.

Some relevant publications are: [Tam97, Tam98, TS98].

Implementation

Gandalf is implemented in Scheme, using the scm interpreter developed by A. Jaffer and the Scheme-to-C compiler Hobbit developed by T. Tammet for the scm system.

Expected Competition Performance

Gandalf is likely to do well in the non-Horn-with-equality subclass of problems. It is likely to have mediocre performance in the general mixed class, and poor performance in the pure equality class.

References

McC94
McCune W.W. (1994), Otter 3.0 Reference Manual and Guide, ANL-94/6, Argonne National Laboratory, Argonne, USA.
Tam97
Tammet T. (1997), Gandalf, Journal of Automated Reasoning 18(2), pp.199-204.
Tam98
Tammet T. (1998), Towards Efficient Subsumption, Kirchner C., Kirchner H., Proceedings of the 15th International Conference on Automated Deduction (Lindau, Germany), pp.427-440, Lecture Notes in Artificial Intelligence 1421, Springer-Verlag.
TS98
Tammet T., Smith J. (1998), Optimised Encodings of Fragments of Type Theory in First Order Logic, Journal of Logic and Computation 8(6), pp.713-744.


GandalfSat 1.0

T. Tammet
Tallinn Technical University, Estonia
tammet@cc.ttu.ee

Architecture

GandalfSat is a special version of the Gandalf theorem prover, optimised for satisfiability checking. Essentially it contains the same code as Gandalf, but uses different default search strategies from Gandalf.

Ordinary Gandalf is a resolution prover which implements a number of different basic strategies: binary ordered resolution for several orderings, versions of set-of-support resolution, binary unit resolution, hyperresolution. Enhancements are used for cutting off literals and combining different strategies, in particular forward and backward reasoning, into one run. Equality is handled by ordered paramodulation and demodulation.

GandalfSat uses time-slicing similarly to ordinary Gandalf. It attempts hyperresolution, ordered resolution based on [FL+93] results, and two kinds of finite model building algorithms (Falcon-like and MACE-like) for attempting to show satisfiability.

Some relevant publications are: [FL+93, Tam97, TS98].

Implementation

Gandalf is implemented in Scheme, using the scm interpreter developed by A. Jaffer and the Scheme-to-C compiler Hobbit developed by T. Tammet for the scm system.

References

FL+93
Fermüller C., Leitsch A., Tammet T., Zamov N. (1993), Resolution Methods for the Decision Problem, Lecture Notes in Computer Science 679, Springer-Verlag.
Tam97
Tammet T. (1997), Gandalf, Journal of Automated Reasoning 18(2), pp.199-204.
TS98
Tammet T., Smith J. (1998), Optimised Encodings of Fragments of Type Theory in First Order Logic, Journal of Logic and Computation 8(6), pp.713-744.


GandalfSat 1.1

T. Tammet
Tallinn Technical University, Estonia
tammet@cc.ttu.ee

Architecture

GandalfSat is a special version of the Gandalf theorem prover, optimised for satisfiability checking. Essentially it contains the same code as Gandalf, but uses different default search strategies from Gandalf.

Ordinary Gandalf is a resolution prover which implements a number of different basic strategies: binary ordered resolution for several orderings, versions of set-of-support resolution, binary unit resolution, hyperresolution. Enhancements are used for cutting off literals and combining different strategies, in particular forward and backward reasoning, into one run. Equality is handled by ordered paramodulation and demodulation.

GandalfSat uses time-slicing similarly to ordinary Gandalf. It attempts hyperresolution, ordered resolution based on [FL+93] results, and two kinds of finite model building algorithms (Falcon-like and MACE-like) for attempting to show satisfiability.

Some relevant publications are: [FL+93, Tam97, TS98].

Implementation

Gandalf is implemented in Scheme, using the scm interpreter developed by A. Jaffer and the Scheme-to-C compiler Hobbit developed by T. Tammet for the scm system.

Expected Competition Performance

We expect it to be among the better provers for satisfiability checking.

References

FL+93
Fermüller C., Leitsch A., Tammet T., Zamov N. (1993), Resolution Methods for the Decision Problem, Lecture Notes in Computer Science 679, Springer-Verlag.
Tam97
Tammet T. (1997), Gandalf, Journal of Automated Reasoning 18(2), pp.199-204.
TS98
Tammet T., Smith J. (1998), Optimised Encodings of Fragments of Type Theory in First Order Logic, Journal of Logic and Computation 8(6), pp.713-744.


MUSCADET 2.3

D. Pastre
Université René Descartes (Paris 5), France
pastre@math-info.univ-paris5.fr

Architecture

The MUSCADET theorem prover is a knowledge-based system. It is based on Natural Deduction, following the terminology of [Ble71] and [Pas78], and uses methods which resemble those used by humans. It is composed of an inference engine, which interprets and executes rules, and of one or several bases of facts, which are the internal representation of "theorems to be proved". Rules are either universal and built into the system, or built by the system itself by metarules from data (definitions and lemmas). Rules may add new hypotheses, modify the conclusion, create objects, split theorems into two or more subtheorems, or build new rules which are local to a (sub-)theorem.

There are specific strategies for existential, universal, conjunctive or disjunctive hypotheses and conclusions. Functional symbols may be used, but an automatic creation of intermediate objects allows deep subformulae to be flattened and treated as if the concepts were defined by predicate symbols. The successive steps of a proof may be forward deduction (deducing new hypotheses from old ones), backward deduction (replacing the conclusion by a new one) or refutation (only if the conclusion is a negation).
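The automatic creation of intermediate objects can be illustrated by a small flattening routine (our own sketch in Python, not MUSCADET's Prolog rules; the object naming scheme is an assumption):

```python
import itertools

fresh = (f"o{i}" for i in itertools.count(1))   # new object names o1, o2, ...

def flatten(term, defs):
    """Replace nested compound subterms by new intermediate objects,
    recording definitions like ('o1', ('g', 'x')) along the way.
    Terms are strings (constants/objects) or (functor, args...) tuples."""
    if isinstance(term, str):
        return term
    f, *args = term
    flat_args = []
    for a in args:
        fa = flatten(a, defs)
        if not isinstance(fa, str):      # still compound: name it
            name = next(fresh)
            defs.append((name, fa))
            fa = name
        flat_args.append(fa)
    return (f, *flat_args)

defs = []
top = flatten(("f", ("g", "x")), defs)
# top is ('f', 'o1'), with the definition o1 = g(x) recorded in defs
```

After flattening, a statement about f(g(x)) becomes a statement about f(o1) together with the hypothesis that o1 is g(x), so the concept can be handled as if defined by a predicate symbol.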

The system is also able to work with second order statements. It may also receive knowledge and know-how for a specific domain from a human user; see [Pas89] and [Pas93]. These two possibilities are not used while working with the TPTP Library.

Implementation

MUSCADET 2 [Pas01] is implemented in SWI-Prolog. Rules are written as declarative Prolog clauses. Metarules are written as sets of Prolog clauses, more or less declarative. The inference engine includes the Prolog interpreter and some procedural Prolog clauses. MUSCADET 2.3 can be obtained from:
      http://www.math-info.univ-paris5.fr/~pastre/muscadet/muscadet.html

Expected Competition Performance

MUSCADET will perform best on problems manipulating many concepts in which all statements (conjectures, definitions, axioms) are expressed in a manner similar to the practice of humans, especially of mathematicians. It will perform poorly on problems using few concepts but large and deep formulas leading to many splittings.

References

Ble71
Bledsoe W.W. (1971), Splitting and Reduction Heuristics in Automatic Theorem Proving, Artificial Intelligence 2, pp.55-77.
Pas78
Pastre D. (1978), Automatic Theorem Proving in Set Theory, Artificial Intelligence 10, pp.1-27.
Pas89
Pastre D. (1989), MUSCADET: An Automatic Theorem Proving System using Knowledge and Metaknowledge in Mathematics, Artificial Intelligence 38, pp.257-318.
Pas93
Pastre D. (1993), Automated Theorem Proving in Mathematics, Annals of Mathematics and Artificial Intelligence 8, pp.425-447.
Pas01
Pastre D. (2001), MUSCADET version 2.3: User's Manual, http://www.math-info.univ-paris5.fr/~pastre/muscadet/manual-en.ps.


Otter-MACE 3.2-2.0

W. McCune
Argonne National Laboratory, USA
mccune@mcs.anl.gov

Architecture

Otter 3.2 [McC94, McC00b, MW97] and MACE 2.0 [McC01, McC00a] are described in this section, including the calculus, inference rules, and search control used. Both programs apply to statements in first-order (unsorted) logic with equality. The two programs are independent and work by completely different methods; in practice, they are complementary. Otter is based on resolution and paramodulation applied to clauses; an Otter search uses the ``given clause algorithm'' and typically involves a large database of clauses; subsumption and demodulation play an important role. Roughly speaking, MACE works as follows: for a given domain size, an equivalent propositional problem is constructed and a Davis-Putnam-Loveland-Logemann decision procedure is applied. MACE typically cannot handle clauses with many variables or deeply nested terms.
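MACE's reduction to propositional logic can be sketched roughly as follows (our illustration, not MACE's code; function symbols and their flattening are omitted): for a fixed domain size n, every clause is instantiated over the domain and each ground atom becomes a propositional variable.

```python
from itertools import product

def propositionalize(clauses, variables, n):
    """clauses: lists of (sign, pred, args) literals whose args are
    variables drawn from `variables`. Returns DIMACS-style clauses
    over integer propositional variables, plus the atom-to-variable map."""
    atom_ids, prop_clauses = {}, []
    for clause in clauses:
        for assignment in product(range(n), repeat=len(variables)):
            env = dict(zip(variables, assignment))
            pc = []
            for sign, pred, args in clause:
                atom = (pred, tuple(env[a] for a in args))
                var = atom_ids.setdefault(atom, len(atom_ids) + 1)
                pc.append(var if sign else -var)
            prop_clauses.append(pc)
    return prop_clauses, atom_ids
```

For example, the single clause p(X) over a two-element domain yields the two unit clauses p(0) and p(1), encoded as propositional variables 1 and 2. The resulting clause set is then handed to the DPLL procedure.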

Implementation

The programs are written in C and accept the same input files. Otter uses shared data structures for clauses and terms, and it uses indexing for resolution, paramodulation, forward and backward subsumption, forward and backward demodulation, and unit conflict. MACE is much simpler than Otter; its Davis-Putnam-Loveland-Logemann procedure uses a counting algorithm to do the unit resolutions, and forward subsumption is not applied.

Expected Competition Performance

Otter's original automatic mode, which reflects no tuning to the TPTP problems, will be used. Otter is especially ineffective on non-Horn problems, so it is not expected to do well on the NNE or NEQ categories. Otter's performance in the other categories should be acceptable. MACE won the SAT division at the 1999 CASC and should do well on the SAT problems this year.

References

McC94
McCune W.W. (1994), Otter 3.0 Reference Manual and Guide, ANL-94/6, Argonne National Laboratory, Argonne, USA.
MW97
McCune W.W., Wos L. (1997), Otter: The CADE-13 Competition Incarnations, Journal of Automated Reasoning 18(2), pp.211-220.
McC00a
McCune W.W. (2000), MACE: Models and Counterexamples, http://www-unix.mcs.anl.gov/AR/mace/.
McC00b
McCune W.W. (2000), Otter: An Automated Deduction System, http://www-unix.mcs.anl.gov/AR/otter/.
McC01
McCune W.W. (2001), MACE 2.0 Reference Manual and Guide, ANL/MCS-TM-249, Argonne National Laboratory, Argonne, USA.

Acknowledgments: Ross Overbeek, Larry Wos, Bob Veroff, and Rusty Lusk contributed to the development of Otter. MACE was motivated by previous work of John Slaney, Mark Stickel, and Hantao Zhang.


PizEAndSATO 0.1

G. Sutcliffe1, S. Schulz2
1University of Miami, USA, 2Technische Universität München
geoff@cs.miami.edu, schulz@informatik.tu-muenchen.de

Architecture

PizEAndSATO is a system for CNF first order problems with a finite Herbrand universe. It is implemented by combining the ground-instance generating program PizE with the propositional prover SATO.

PizE is currently a very simple program. It systematically enumerates all ground instances of a clause set with a finite Herbrand universe. The program is not yet heavily optimized. Only two techniques are implemented to cut down on the number of instances generated. First, PizE splits input clauses that can be partitioned into variable-disjoint subsets. Second, it performs forward unit subsumption and forward unit resolution on the generated ground instances to simplify the formula and to reduce the amount of memory used.
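The variable-disjoint splitting step can be sketched as follows (an assumed illustration, not PizE's code): literals that share no variables, directly or transitively, are collected into separate groups that can be grounded independently.

```python
def split_clause(clause, vars_of):
    """clause: list of literals; vars_of(lit) returns the literal's
    variable set. Returns a partition of the clause into
    variable-disjoint groups (ground literals form singleton groups)."""
    groups = []                       # list of (variable set, literals)
    for lit in clause:
        lvars = set(vars_of(lit))
        merged_vars, merged_lits, rest = set(lvars), [lit], []
        for gvars, glits in groups:
            if gvars & lvars:         # shares a variable: merge groups
                merged_vars |= gvars
                merged_lits = glits + merged_lits
            else:
                rest.append((gvars, glits))
        groups = rest + [(merged_vars, merged_lits)]
    return [lits for _, lits in groups]
```

For instance, the clause p(X) v q(X,Y) v r(Z) splits into the two parts {p(X), q(X,Y)} and {r(Z)}, which multiplies out to far fewer ground instances than the original clause.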

SATO [ZS00] is a highly optimized implementation of the Davis-Putnam-Loveland-Logemann procedure. SATO is well known and has been successfully applied in a variety of challenging propositional domains.

PizE and SATO are combined with a shell script called And.

Implementation

PizE is implemented in ANSI C. It is based on the E [Sch01] libraries, shares a large amount of code with E, and will be bundled with future releases of E. Since E is optimized for the first order case, clause representation in PizE is far from optimal for propositional clauses. SATO is implemented in C. It implements its own highly efficient data structures and algorithms.

Expected Competition Performance

We expect reasonable performance in the EPR division. Testing indicates that only problems with very many clause instances are beyond PizEAndSATO's reach, because PizE runs out of memory.

References

Sch01
Schulz S. (2001), System Abstract: E 0.61, Gore R., Leitsch A., Nipkow T., Proceedings of the International Joint Conference on Automated Reasoning (Siena, Italy), Lecture Notes in Artificial Intelligence, Springer-Verlag.
ZS00
Zhang H., Stickel M. (2000), Implementing the Davis-Putnam Method, Journal of Automated Reasoning 24(1/2), pp.277-296.


SCOTT 6

K. Hodgson, J. Slaney
Australian National University, Australia
{kahlil,jks}@arp.anu.edu.au

Architecture

SCOTT (Semantically Constrained OTTER) [HS99] is an extension of OTTER [McC94] and uses the same inference rules and overall proof search structure. The extension incorporates FINDER [Sla94] (a finite model generator) as a subsystem of OTTER in order to determine consistent subsets of the clause space. These consistent subsets are used to guide the proof search by increasing the likelihood that certain necessary inferences are made; this can be seen as a loose generalisation of the set of support strategy. Not all consistent subsets are created equal, with different collections providing different levels of guidance, and so a number of strategies have been developed for determining which collections provide the best guidance. These incorporate theoretical estimates of the guidance provided and the corresponding cost incurred, enabling us to manage the trade-off much more effectively than in previous versions of SCOTT.
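The core idea of semantic guidance can be sketched as a clause selection function (an illustration only, not SCOTT's code; `true_in_model` stands in for the consistency and model checks delegated to FINDER):

```python
def select_given(clauses, true_in_model, weight=len):
    """Prefer the lightest clause that is false in the current finite
    model: clauses true in the model belong to a consistent subset and
    cannot by themselves contribute a contradiction. Fall back to the
    lightest clause overall if the model satisfies everything."""
    false_clauses = [c for c in clauses if not true_in_model(c)]
    pool = false_clauses or clauses
    return min(pool, key=weight)
```

When the model falsifies some clause, that clause is chosen for inference first, which loosely generalises the set of support strategy: the model plays the role of the satisfiable "axiom" part, and the false clauses play the role of the support set.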

Implementation

SCOTT is written in C. It uses the existing program OTTER to perform the proof search and additional code to determine which clauses should be chosen for inference. This additional code calls the existing program FINDER in order to witness the consistency of particular sets of clauses. The system may be obtained from:
    http://discus.anu.edu.au/software/scott/

Expected Competition Performance

In principle SCOTT should perform at least as well as OTTER across all parts of the MIX and UEQ divisions, if only because it has OTTER as a subroutine. However, searching for models and testing clauses against them can be costly, and so SCOTT may perform poorly if the time limit is too severe. Preliminary results show that SCOTT 6 outperforms SCOTT 5, so we expect an improvement on the performance in CASC-17. Overall, we expect good performance in comparison with cognate systems that employ relatively naive guidance.

References

HS99
Hodgson K., Slaney J.K. (1999), Semantic Guidance with SCOTT, TR-ARP-01-99, Automated Reasoning Project, Australian National University, Canberra, Australia.
McC94
McCune W.W. (1994), Otter 3.0 Reference Manual and Guide, ANL-94/6, Argonne National Laboratory, Argonne, USA.
Sla94
Slaney J.K. (1994), FINDER: Finite Domain Enumerator, System Description, Bundy A., Proceedings of the 12th International Conference on Automated Deduction (Nancy, France), pp.798-801, Lecture Notes in Artificial Intelligence 814, Springer-Verlag.


Vampire 1.0

A. Riazanov, A. Voronkov
Computer Science Department, The University of Manchester
{riazanoa,voronkov}@cs.man.ac.uk

Architecture

Vampire 1.0 is an automatic theorem prover for first-order classical logic. It implements the calculi of ordered binary resolution, hyperresolution and superposition for handling equality. The splitting rule is simulated by introducing new predicate symbols. A number of standard redundancy criteria and simplification techniques are used for pruning the search space: subsumption, tautology deletion, subsumption resolution and rewriting by ordered unit equalities. The only term ordering used in Vampire at the moment is a special version of the Knuth-Bendix ordering which allows efficient approximation algorithms for solving ordering constraints. The competition version will exploit a number of simple preprocessing techniques, such as elimination of simple predicate and function definitions. A number of efficient indexing techniques are used to implement all major operations on sets of terms and clauses, such as an improved version of code trees [Vor95] for forward subsumption and a combination of path indexing [Sti89] and database joins for backward subsumption.

Implementation

Vampire 1.0 is implemented in C++ and can be compiled by 2.91 or newer versions of gcc. The binaries for Solaris and Linux are available from the authors, for details see:
    http://www.cs.man.ac.uk/~riazanoa/Vampire/

References

Sti89
Stickel M.E. (1989), The Path Indexing Method for Indexing Terms, Technical Note 473, SRI International, Menlo Park, USA.
Vor95
Voronkov A. (1995), The Anatomy of Vampire, Journal of Automated Reasoning 15(2), pp.237-265.


Vampire 2.0 and Vampire CASC-JC

A. Riazanov, A. Voronkov
Computer Science Department, The University of Manchester
{riazanoa,voronkov}@cs.man.ac.uk

Architecture

Vampire 2.0 is an automatic theorem prover for first-order classical logic. It implements the calculi of ordered binary resolution, hyperresolution, and superposition for handling equality. The splitting rule is simulated by introducing new predicate symbols. A number of standard redundancy criteria and simplification techniques are used for pruning the search space: subsumption, tautology deletion, subsumption resolution and rewriting by ordered unit equalities. The only term ordering used in Vampire at the moment is a special version of the Knuth-Bendix ordering which allows efficient approximation algorithms for solving ordering constraints. A number of efficient indexing techniques are used to implement all major operations on sets of terms and clauses, such as an improved version [RV00] of code trees [Vor95] for forward subsumption, and a combination of path indexing [Sti89] and database joins for backward subsumption.

Compared to Vampire 1.0, which participated in the previous competition, this version has many more literal selection functions, more flexible splitting without backtracking, and improved memory management. The automatic mode of Vampire 2.0 is primitive: it recognises only very simple syntactic properties of the input problem, such as the presence of equality or of non-Horn clauses. In the preprocessing stage it exploits a number of primitive techniques, such as the elimination of simple predicate and function definitions.
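The simulation of splitting by naming, mentioned for both versions above, can be sketched as follows. This is an illustrative approximation, not Vampire's code: a clause whose literals fall into variable-disjoint components C1 ∨ C2 is replaced by C1 ∨ p and C2 ∨ ¬p for a fresh propositional symbol p. Literals are assumed to be (sign, predicate, args) triples with uppercase strings as variables:

```python
import itertools

_fresh = itertools.count()   # supply of fresh splitting symbols

def clause_vars(lits):
    """Variables occurring in a list of (sign, pred, args) literals."""
    vs = set()
    for _, _, args in lits:
        vs |= {a for a in args if a[:1].isupper()}
    return vs

def components(clause):
    """Partition literals into connected components sharing variables;
    ground literals form singleton components."""
    comps = []
    for lit in clause:
        vs = clause_vars([lit])
        touching = [c for c in comps if vs & clause_vars(c)]
        merged = [lit] + [l for c in touching for l in c]
        comps = [c for c in comps if c not in touching] + [merged]
    return comps

def name_split(clause):
    """Replace C1 ∨ C2 (variable-disjoint) by C1 ∨ p and C2 ∨ ¬p,
    chaining the construction when there are more than two components."""
    comps = components(clause)
    if len(comps) < 2:
        return [clause]          # nothing to split
    out = []
    rest = comps
    while len(rest) > 1:
        c1, rest = rest[0], rest[1:]
        p = "sp%d" % next(_fresh)
        out.append(c1 + [(True, p, ())])       # C1 ∨ p
        rest[0] = rest[0] + [(False, p, ())]   # ¬p guards the remainder
    out.append(rest[0])
    return out
```

The point of the encoding is that no backtracking over split branches is needed: the new propositional symbols let ordinary resolution recombine the pieces.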

Vampire CASC-JC is a version based on Vampire 2.0, specialised to win the competition. To adjust its strategy it estimates the problem by checking syntactic properties, such as the presence of multiliteral, non-Horn, and ground clauses, and of equations and non-equational literals. Additionally, it considers such quantitative characteristics as the number of axioms, literals, and small and large terms.
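The kind of syntactic estimation described above might be sketched as follows. The representation and feature names are assumptions for illustration, not Vampire's actual checks; clauses are lists of (positive, predicate, args) literals with flat string arguments, variables uppercase:

```python
def is_var(term):
    # variables are uppercase strings; constants lowercase (flat terms only)
    return term[:1].isupper()

def problem_features(clauses):
    """Rough syntactic profile of a clause set."""
    return {
        "axioms": len(clauses),
        "literals": sum(len(c) for c in clauses),
        # equality present anywhere in the problem?
        "has_equality": any(p == "=" for c in clauses for _, p, _ in c),
        # a clause with more than one positive literal makes the set non-Horn
        "non_horn": any(sum(s for s, _, _ in c) > 1 for c in clauses),
        "multiliteral": sum(1 for c in clauses if len(c) > 1),
        # clauses whose arguments contain no variables at all
        "ground_clauses": sum(1 for c in clauses
                              if all(not is_var(a)
                                     for _, _, args in c for a in args)),
    }
```

A strategy-selection layer would then map such a profile to a choice of selection function, splitting mode and time slice.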

Implementation

Vampire 2.0 is implemented in C++ and can be compiled with gcc version 2.91 or newer. Binaries for Solaris and Linux are available from the authors; for details see:
    http://www.cs.man.ac.uk/~riazanoa/Vampire

Expected Competition Performance

We expect Vampire CASC-JC to perform much better than Vampire 2.0.

References

RV00
Riazanov A., Voronkov A. (2000), Partially Adaptive Code Trees, Ojeda-Aciego M., de Guzman I., Brewka G., Pereira L., Proceedings of the 7th European Workshop on Logics in Artificial Intelligence (Malaga, Spain), pp.209-223, Lecture Notes in Artificial Intelligence 1919, Springer-Verlag.
Sti89
Stickel M.E. (1989), The Path Indexing Method for Indexing Terms, Technical Note 473, SRI International, Menlo Park, USA.
Vor95
Voronkov A. (1995), The Anatomy of Vampire, Journal of Automated Reasoning 15(2), pp.237-265.


Waldmeister 600

T. Hillenbrand, B. Löchner, A. Jaeger, and A. Buch
FB Informatik, Universität Kaiserslautern
waldmeister@informatik.uni-kl.de

Architecture

Waldmeister 600 [HJL99] is an implementation of unfailing Knuth-Bendix completion [BDP89]. The system architecture is built on a stringent distinction between active facts (selected rewrite rules) and passive facts (critical pairs): firstly, only the active facts induce the normalization relation; secondly, the passive facts are not normalized permanently, but on generation only. The saturation cycle is parameterized by a reduction ordering and a heuristic assessment of passive facts. The control parameters are chosen according to the algebraic structure given in the problem specification [HJL99]. Recent developments, still in progress, include the integration of the kernel strategy [HL00] as well as extensions towards ordered completion.
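The saturation cycle with its active/passive distinction can be sketched as a given-fact loop. The following toy instance over ground constants illustrates the control structure only; the helpers `assess`, `normalise`, `orient` and `critical_pairs` are stand-ins chosen for this sketch, not Waldmeister's machinery:

```python
import heapq

def assess(eq):                      # heuristic weight of a passive fact
    return len(eq[0]) + len(eq[1])

def normalise(eq, rules):            # rewrite both sides with active rules
    def nf(t):
        changed = True
        while changed:
            changed = False
            for l, r in rules:
                if t == l:
                    t, changed = r, True
        return t
    return (nf(eq[0]), nf(eq[1]))

def orient(eq):                      # reduction ordering: plain string order
    s, t = eq
    return (s, t) if s > t else (t, s)

def critical_pairs(rule, active):    # no overlaps arise between constants
    return []

def complete(axioms, max_steps=1000):
    """Completion loop with the active/passive split: only active facts
    rewrite, and passive facts are normalised on selection, not eagerly."""
    active, passive = [], [(assess(e), e) for e in axioms]
    heapq.heapify(passive)
    for _ in range(max_steps):
        if not passive:
            break                    # saturated: active set returned
        _, eq = heapq.heappop(passive)
        eq = normalise(eq, active)   # normalised only now, on selection
        if eq[0] == eq[1]:
            continue                 # trivial equation, discard
        rule = orient(eq)
        for cp in critical_pairs(rule, active):
            heapq.heappush(passive, (assess(cp), cp))
        active.append(rule)
    return active
```

From the axioms c = b and b = a, the loop produces the rules b → a and c → a; the second equation is interreduced against the already-active rule at selection time, exactly the "normalized on generation/selection only" discipline described above.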

Implementation

The system is implemented in ANSI-C and runs under SunOS, Solaris and Linux. Refined perfect discrimination trees are employed to index the active facts, whereas the passive facts are stored in a compressed representation. The hypotheses are blown up to sets of rewrite successors and sometimes predecessors. A step-by-step proof object is constructed at run-time. Waldmeister can be found on the Web at:
    http://www-avenhaus.informatik.uni-kl.de/waldmeister
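The indexing of active facts can be illustrated with a simplified discrimination tree, a trie over preorder-flattened terms. This sketch collapses all variables to a single wildcard, whereas the refined perfect discrimination trees used by Waldmeister also preserve variable identity; terms are nested tuples with uppercase strings as variables:

```python
def flatten(term):
    """Preorder symbol list; variables (uppercase strings) become '*'."""
    if isinstance(term, str):
        return ["*"] if term[:1].isupper() else [term]
    out = ["%s/%d" % (term[0], len(term) - 1)]   # head symbol with arity
    for arg in term[1:]:
        out += flatten(arg)
    return out

class DTree:
    """Trie over flattened terms; nodes are dicts keyed by symbols."""

    def __init__(self):
        self.root = {}

    def insert(self, term, value):
        node = self.root
        for sym in flatten(term):
            node = node.setdefault(sym, {})
        node["$"] = value                        # payload at the leaf

    def generalisations(self, query):
        """Values of indexed terms matching the (ground) query term."""
        def walk(node, todo):
            if not todo:
                if "$" in node:
                    yield node["$"]
                return
            t, rest = todo[0], todo[1:]
            if "*" in node:                      # indexed variable: skip t
                yield from walk(node["*"], rest)
            if isinstance(t, str):
                head, args = t, []
            else:
                head, args = "%s/%d" % (t[0], len(t) - 1), list(t[1:])
            if head in node:                     # exact symbol match
                yield from walk(node[head], args + rest)
        return list(walk(self.root, [query]))
```

Retrieving generalisations of a query this way is the core operation behind matching a term against all stored left-hand sides at once.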

References

BDP89
Bachmair L., Dershowitz N., Plaisted D.A. (1989), Completion Without Failure, Ait-Kaci H., Nivat M., Resolution of Equations in Algebraic Structures, pp.1-30, Academic Press.
HJL99
Hillenbrand T., Jaeger A., Löchner B. (1999), Waldmeister - Improvements in Performance and Ease of Use, Ganzinger H., Proceedings of the 16th International Conference on Automated Deduction (Trento, Italy), pp.232-236, Lecture Notes in Artificial Intelligence 1632, Springer-Verlag.
HL00
Hillenbrand T., Löchner B. (2000), Shorter Fixed Point Combinators with Waldmeister and the Kernel Strategy, AAR Newsletter 46.


Waldmeister 601

T. Hillenbrand (1), B. Löchner (2), A. Jaeger, and A. Buch
(1) Max-Planck-Institut für Informatik Saarbrücken, (2) Universität Kaiserslautern
waldmeister@informatik.uni-kl.de

Architecture

Waldmeister is a system for unit equational deduction. Its theoretical basis is unfailing completion in the sense of [BDP89], with refinements towards ordered completion. The prover saturates the input axiomatization in a repeated cycle that works on sets of active and passive facts. The selection of the reduction ordering and the heuristic guidance of the proof search are described in [HJL99].

Since last year's version, stronger redundancy criteria have been integrated, including ground joinability tests with ordering constraints on variables [AHL00]. In several problem domains this technique is especially helpful for harder proof tasks. Some restructuring of the prover is under way to also include an implementation of confluence trees with full ordering constraints [AL01], but further work is necessary before they speed up the proof search.

Implementation

The prover is coded in ANSI-C and available for SunOS, Solaris and Linux. The set of active facts is represented by perfect discrimination trees. Stimulated by the evaluation of indexing techniques reported in [NH+01], their implementation has been somewhat improved. The storage of passive facts and the treatment of hypotheses have remained unchanged. The Waldmeister Web page is located at:
    http://www-avenhaus.informatik.uni-kl.de/waldmeister

Expected Competition Performance

Although Waldmeister is already quite powerful, we hope to improve on last year's version in some problem domains.

References

AHL00
Avenhaus J., Hillenbrand T., Löchner B. (2000), On Using Ground Joinable Equations in Equational Theorem Proving, Baumgartner P., Zhang H., Proceedings of the 3rd International Workshop on First Order Theorem Proving (St Andrews, Scotland), pp.33-43.
AL01
Avenhaus J., Löchner B. (2001), System Description: CCE: Testing Ground Joinability, Gore R., Leitsch A., Nipkow T., Proceedings of the International Joint Conference on Automated Reasoning (Siena, Italy), Lecture Notes in Artificial Intelligence, Springer-Verlag.
BDP89
Bachmair L., Dershowitz N., Plaisted D.A. (1989), Completion Without Failure, Ait-Kaci H., Nivat M., Resolution of Equations in Algebraic Structures, pp.1-30, Academic Press.
HJL99
Hillenbrand T., Jaeger A., Löchner B. (1999), Waldmeister - Improvements in Performance and Ease of Use, Ganzinger H., Proceedings of the 16th International Conference on Automated Deduction (Trento, Italy), pp.232-236, Lecture Notes in Artificial Intelligence 1632, Springer-Verlag.
NH+01
Nieuwenhuis R., Hillenbrand T., Riazanov A., Voronkov A. (2001), On the Evaluation of Indexing Techniques for Theorem Proving, Gore R., Leitsch A., Nipkow T., Proceedings of the International Joint Conference on Automated Reasoning (Siena, Italy), Lecture Notes in Artificial Intelligence, Springer-Verlag.