CASC-17 Entrants' System Descriptions


Bliksem 1.10

H. de Nivelle
Max Planck Institut für Informatik, Germany
nivelle@mpi-sb.mpg.de

Architecture

Bliksem is a resolution and paramodulation based theorem prover. It implements many different orders, including many orders that define decision procedures. The automatic selection of strategies is poor. A special feature of Bliksem is that it is able to output totally specified proofs. After processing, these proofs can be checked by Coq.

Implementation

Bliksem is written in portable C. High priority has been given to portability. Bliksem has been compiled succcessfully using cc, gcc, and djpcc. The data structures have been carefully chosen after benchmark tests on the basic operations. Bliksem can be obtained from:
    http://www.mpi-sb.mpg.de/~bliksem

Expected Competition Performance

The system does not differ from last year's entry. So it is reasonable to expect the same competition performance as last year.

References


E 0.6

S. Schulz
Institut für Informatik, Technische Universität München, Germany
schulz@informatik.tu-muenchen.de

Architecture

E 0.6 [
Sch99] is a purely equational theorem prover. The calculus used by E combines superposition (with selection of negative literals) and rewriting. For the special case that only unit clauses are present, it degenerates into unfailing completion. For the Horn case E can simulate various positive unit strategies. No special rules for non-equational literals have been implemented, i.e., resolution is simulated via paramodulation and equality resolution.

Proof search in E is controlled by a literal selection strategy and an evaluation heuristic. A variety of selection strategies (including e.g., no selection, or select all negative literals) is available. Evaluation heuristics can be constructed on the fly by combining various parameterized primitive evaluation functions. A simple automatic mode can select strategy, term ordering, and heuristic based on simple problem characteristics.

New features since last year include a wider variety of literal selection strategies, and an improved automatic mode that now also switches between two different Knuth-Bendix orderings. A proof checker is now available as well.

Implementation

E is implemented in ANSI C, using the GNU C compiler. The most unique feature of the implementation is the maximally shared term representation. This includes parallel rewriting for all instances of a particular subterm. A second important feature is the use of perfect discrimination trees with age and size constraints for rewriting and unit-subsumption.

The program has been successfully installed under SunOS 4.3.x, Solaris 2.x, HP-UX B 10.20, and various versions of Linux. Sources are available freely from:

    http://wwwjessen.informatik.tu-muenchen.de/~schulz/WORK/eprover.html

Expected Competition Performance

As in the last year, E is expected to perform pretty well in all categories it participates in, with particularly strong showing for Horn problems. Very little effort has been spent on the performance for unit equality problems, however, and it is unlikely that E can come close to Waldmeister's performance in this category.

E's auto mode is optimized for performance on the complete TPTP 2.3.0. The optimization is based on a fairly large number of test runs and consists of the selection of one of about 50 different strategies for each problem.

References

Sch99
Schulz S. (1999), System Abstract: E 0.3, Ganzinger H., Proceedings of the 16th International Conference on Automated Deduction (Trento, Italy), pp.297-301, Lecture Notes in Artificial Intelligence 1632, Springer-Verlag.


E-SETHEO 2000csp

R. Letz, S. Schulz, G. Stenz
Institut für Informatik, Technische Universität München, Germany
{letz,schulz,stenz}@informatik.tu-muenchen.de

Architecture

E-SETHEO is a compositional theorem prover for formulae in first order clause logic, combining the systems E [
Sch99] and SETHEO [MI+97]. It incorporates different calculi and proof procedures like superposition, model elimination and semantic trees (the DPLL procedure). Furthermore, the system contains transformation techniques which may split the formula into independent subparts or which may perform a ground instantiation. Finally, advanced methods for controlling and optimizing the combination of the subsystems are applied.

E-SETHEO-FLOTTER additionally includes the program Flotter [WGR96] as a preprocessing module for transforming non-clausal formulae to clausal form.

In contrast to the former P-SETHEO [Wol98], since version 99csp of E-SETHEO, the different strategies are run sequentially, one after the other, depending on the allocation of resources to the different strategies, so-called schedules, which have been computed from experimental data using machine learning techniques as described in [SW99]. Schedule selection depends on syntactic characteristics of the input formula. Version 2000csp of E-SETHEO employs a new strategy interweaving the lemma mechanisms of E and SETHEO.

Implementation

According to the diversity of the contained systems, the modules of E-SETHEO are implemented in different programming languages like C, Prolog, Scheme, and Shell tools.

The program runs under Solaris 2.6. Sources are available freely.

Expected Competition Performance

We expect E-SETHEO to perform well in all categories it participates in.

References

MI+97
Moser M., Ibens O., Letz R., Steinbach J., Goller C., Schumann J., Mayr K. (1997), SETHEO and E-SETHEO: The CADE-13 Systems, Journal of Automated Reasoning 18(2), pp.237-246.
Sch99
Schulz S. (1999), System Abstract: E 0.3, Ganzinger H., Proceedings of the 16th International Conference on Automated Deduction (Trento, Italy), pp.297-301, Lecture Notes in Artificial Intelligence 1632, Springer-Verlag.
SW99
Stenz G., Wolf A. (1999), E-SETHEO: Design, Configuration and Use of a Parallel Automated Theorem Prover, Foo N., Proceedings of AI'99: The 12th Australian Joint Conference on Artificial Intelligence (Sydney, Australia), pp.231-243, Lecture Notes in Artificial Intelligence 1747, Springer-Verlag.
WGR96
Weidenbach C., Gaede B., Rock G. (1996), SPASS and FLOTTER, McRobbie M., Slaney J.K., Proceedings of the 13th International Conference on Automated Deduction (New Brunswick, USA), pp.141-145, Lecture Notes in Artificial Intelligence 1104, Springer-Verlag.
Wol98
Wolf A. (1998), p-SETHEO: Strategy Parallelism in Automated Theorem Proving, de Swart H., Proceedings of International Conference TABLEAUX'98: Analytic Tableaux and Related Methods (Oisterwijk, The Netherlands), pp.320-324, Lecture Notes in Artificial Intelligence 1397, Springer-Verlag.


Gandalf c-2.1

T. Tammet
Tallinn Technical University, Estonia
tammet@cc.ttu.ee

Architecture

Gandalf version "c" stands for the Gandalf for classical logic. Special versions for intuitionistic logic and type theory exist but are not covered here.

Gandalf is a reolsution prover which implements a number of different basic strategies: binary ordered resolution for several orderings, versions of set-of-support resolution, binary unit resolution, hyperresolution. Enhancements are used for cutting off literals and combining different strategies, in particular forward and backward reasoning, into one run. Equality is handled by ordered paramodulation and demodulation.

When compared to earlier versions, the new version adds special strategies for a limited kind of clause demodulation and combining a propositional prover with a first-order prover.

The basic idea of the automatic mode in Gandalf is time-slicing: Gandalf selects a set of different search strategies, allocates time to these and finally runs the strategies one after another. The motivation is the following: since it is very hard to determine a suitable strategy by some heuristic procedure, it pays off to try a number of possibly suitable strategies. The selection of strategies and the percentage of time they receive is in a some what ad hoc way based on the characteristics of the problem: percentage of Horn and almost-Horn clauses, the number of clauses, percentage of clauses where literals can be ordered using term and variable depths.

Some relevant publications are: [Tam97, TS98, Tam98].

Implementation

Gandalf is implemented in Scheme, using the scm interpreter developed by A. Jaffer and the Scheme-to-C compiler Hobbit developed by T. Tammet for the scm system.

Expected Competition Performance

Gandalf is likely to do well in the non-Horn-with-equality subclass of problems. It is likely to have mediocre performance in the general mixed class, and poor performance in the pure equality class.

References

TS98
Optimised Encodings of Fragments of Type Theory in First Order Logic, Journal of Logic and Computation 8(6), pp.713-744.
Tam97
Gandalf, Journal of Automated Reasoning 18(2), pp.199-204.
Tam98
Towards Efficient Subsumption, Kirchner C., Kirchner H., Proceedings of the 15th International Conference on Automated Deduction (Lindau, Germany), pp.427-440, Lecture Notes in Artificial Intelligence 1421, Springer-Verlag.


GandalfSat 1.0

T. Tammet
Tallinn Technical University, Estonia
tammet@cc.ttu.ee

Architecture

GandalfSat is a special version of the
Gandalf theorem prover, optimised for satisifiability checking. Essentially it contains the same code as Gandalf, but uses different default search strategies to Gandalf.

Ordinary Gandalf is a resolution prover which implements a number of different basic strategies: binary ordered resolution for several orderings, versions of set-of-support resolution, binary unit resolution, hyperresolution. Enhancements are used for cutting off literals and combining different strategies, in particular forward and backward reasoning, into one run. Equality is handled by ordered paramodulation and demodulation.

GandalfSat uses time-slicing similarly to ordinary Gandalf. It attempts hyperresolution, ordered resolution based on [FLTZ] results, and two kinds of finite model building algorithms (Falcon-like and MACE-like) for attempting to show satisfiability.

Some relevant publications are: [FL+93, Tam97, TS98].

Implementation

Gandalf is implemented in Scheme, using the scm interpreter developed by A. Jaffer and the Scheme-to-C compiler Hobbit developed by T. Tammet for the scm system.

Expected Competition Performance

Since GandalfSat has been developed recently, we cannot reliably predict its performance. However, we expect it to be among the better provers for satisfiability checking.

References

FL+93
Fermüller C., Leitsch A., Tammet T., Zamov N. (1993), Resolution Methods for the Decision Problem, Lecture Notes in Computer Science 679, Springer-Verlag.
Tam97
Gandalf, Journal of Automated Reasoning 18(2), pp.199-204.
Tam98
Towards Efficient Subsumption, Kirchner C., Kirchner H., Proceedings of the 15th International Conference on Automated Deduction (Lindau, Germany), pp.427-440, Lecture Notes in Artificial Intelligence 1421, Springer-Verlag.


MACE 1.4

W. McCune
Mathematics and Computer Science, Argonne National Laboratory
mccune@mcs.anl.gov

Architecture

MACE [
McC00] searches for small finite models. The program applies to statements in first-order (unsorted) logic with equality. Roughly speaking, MACE works as follows. For a given domain size, an equivalent propositional problem is constructed and a Davis-Putnam-Loveland decision procedure is applied. MACE typically cannot handle clauses with many variables or deeply nested terms.

Implementation

MACE is written in C. Its Davis-Putnam-Loveland procedure uses a counting algorithm to do the unit resolutions, and forward subsumption is not applied.

Expected Competition Performance

MACE unexpectedly won the SAT division at CASC-16, in part because the difficulty ratings were biased towards MACE. MACE is expected to do well at CASC-17.

References

McC00
McCune W.W. (2000), MACE: Models and Counterexamples, http://www.mcs.anl.gov/AR/mace/.
Acknowledgments: MACE was motivated by previous work of John Slaney, Mark Stickel, and Hantao Zhang.


MUSCADET 2.2

D. Pastre
Université René Descartes (Paris 5), UFR de Mathématiques et Informatique, France
pastre@math-info.univ-paris5.fr

Architecture

MUSCADET is a theorem prover based upon Natural Deduction, in the sense of [
Ble71] and [Pas78]. It is built as a knowledged-based system, that is, methods are expressed by rules interpreted by an inference engine. Some rules are universal mathematical knowledge and are given to the system. Other rules are specific to some concepts and are built by metarules from the definitions of these concepts. The notions of hypothesis and conclusion to be proved are privileged. The first conclusion is the first order statement of the conjecture.

Methods are those used by humans, and are executed according to a plan decided by metarules and heuristics. Methods include splitting theorems into two or more subtheorems (conjunctive conclusion or disjunctive hypotheses), treatment of universal, existential or disjunctive conclusion, treatment of existential or disjunctive hypotheses (one at a time when it is decided by heuristic rules), replacement of the conclusion by its definition. Hypotheses are never replaced by their definition. Instead, rules are built by metarules from definitions and used to deduce new hypotheses from old ones. Universal hypotheses are not added. Instead, rules are built as from the definitions. Functional symbols may be used, but an automatic creation of intermediate objects allows deep subformulae to be flattened and treated as if the concepts were defined by predicate symbols. Positive properties are privileged, in hypotheses and in conclusions. So, a special treatment of negation is done. The successive steps of a proof may be forward deduction (deduce new hypotheses from old ones), backward deduction (replace the conclusion by a new one) or refutation (only if the conclusion is a negation).

The system is also able to work with second order statements. It may also receive knowledge and know-how for a specific domain from a human user; see [Pas89] and [Pas93]. These two possibilities are not used while working with the TPTP Library.

Implementation

The first version of MUSCADET was a real rule-based system [Pas89]. Rules and metarules were written in a declarative language defined for this and interpreted by an inference engine written in Pascal. The competition version is entirely written in SWI-Prolog. Rules are written as declarative Prolog clauses. Metarules are written as sets of Prolog clauses, more or less declarative. The Pascal inference engine is replaced by the Prolog interpreter and some procedural Prolog clauses.

Expected Competition Performance

The best performances of MUSCADET will be for problems manipulating many concepts in which all statements (conjectures, definitions, axioms) are expressed in a manner similar to the practice of humans, especially of mathematicians. It will have poor performances for problems using few concepts but large and deep formulas leading to many splittings.

References

Ble71
Bledsoe W.W. (1971), Splitting and Reduction Heuristics in Automatic Theorem Proving, Artificial Intelligence 2, pp.55-77.
Pas78
Pastre D. (1978), Automatic Theorem Proving in Set Theory, Artificial Intelligence 10, pp.1-27.
Pas89
Pastre D. (1989), MUSCADET : An Automatic Theorem Proving System using Knowledge and Metaknowledge in Mathematics, Artificial Intelligence 38, pp.257-318.
Pas93
Pastre D. (1993), Automated Theorem Proving in Mathematics, Annals of Mathematics and Artificial Intelligence 8, pp.425-447.
Pas99
Pastre D. (1999), Le nouveau MUSCADET et la TPTP Problem Library, Colloque sur la métaconnaissance, Berder, report LIP6 2000/002, pp.54-98, Université René Descartes (Paris 5), UFR de Mathématiques et Informatique, Paris, France.


Otter-MACE 437

W. McCune
Mathematics and Computer Science, Argonne National Laboratory
mccune@mcs.anl.gov

Architecture

Otter [
McC94,McC98a,MW97] searches for proofs, and MACE [McC98b] searches for small finite models. Both programs apply to statements in first-order (unsorted) logic with equality. The two programs are independent and work by completely different methods. (In practice, the programs are complementary.) Version 3.0.5 of Otter has been used. Otter is based on resolution and paramodulation applied to clauses; an Otter search uses the "given clause algorithm" and typically involves a large database of clauses; subsumption and demodulation play an important role. Version 1.3.2 of MACE has been used. Roughly speaking, MACE works as follows. For a given domain size, an equivalent propositional problem is constructed and a Davis-Putnam-Loveland decision procedure is applied. MACE typically cannot handle clauses with many variables or deeply nested terms. Otter-MACE 437 (= Otter 3.0.5 + MACE 1.3.2) is a simple combination of the two programs.

Implementation

Both programs are written in C. Otter uses shared data structures for clauses and terms, and it uses indexing for resolution, paramodulation, forward and backward subsumption, forward and backward demodulation, and unit conflict. MACE is much simpler than Otter; its Davis-Putnam-Loveland procedure uses a counting algorithm to do the unit resolutions, and forward subsumption is not applied. Otter-MACE is a shell script that runs Otter for a few seconds, then (if Otter has not found a refutation) MACE for a few seconds, then (if MACE has not found a model) Otter again for the remaining time.

Expected Competition Performance

These versions of Otter and MACE are essentially the same as the versions entered in the previous two CASC competitions. Both programs generally perform well at CASC, but no top prizes have been won since CASC-13. We do not expect any winning performances in CASC-16. We see our entry as a benchmark for measuring improvements in other systems.

References

MW97
McCune W.W., Wos L. (1997), Otter: The CADE-13 Competition Incarnations, Journal of Automated Reasoning 18(2), pp.211-220.
McC94
McCune W.W. (1994), Otter 3.0 Reference Manual and Guide, Technical Report ANL-94/6, Argonne National Laboratory, Argonne, USA.
McC98a
McCune W.W. (1998), Otter: An Automated Deduction System, http://www.mcs.anl.gov/home/mccune/ar/otter/.
McC98b
McCune W.W. (1998), MACE: Models and Counterexamples, http://www.mcs.anl.gov/home/mccune/ar/mace/.
Acknowledgments: Ross Overbeek, Larry Wos, Bob Veroff, and Rusty Lusk contributed to the development of Otter. MACE was motivated by previous work of John Slaney, Mark Stickel, and Hantao Zhang.


Otter 3.1

W. McCune
Mathematics and Computer Science, Argonne National Laboratory
mccune@mcs.anl.gov

Architecture

Otter [
McC94, McC98, MW97] searches for proofs. Otter applies to statements in first-order (unsorted) logic with equality. Otter is based on resolution and paramodulation applied to clauses; an Otter search uses the ``given clause algorithm'' and typically involves a large database of clauses; subsumption and demodulation play an important role.

Implementation

Otter is written in C. Otter uses shared data structures for clauses and terms, and it uses indexing for resolution, paramodulation, forward and backward subsumption, forward and backward demodulation, and unit conflict.

Expected Competition Performance

Otter's original automatic mode was used for CASC-13 (the first CASC). A secondary automatic mode, derived from some tuning with respect to TPTP, was used For CASC-14, -15, and -16. The secondary automatic mode is less complete for nonunit equality problems but it proves a few more TPTP theorems. For CASC-17, we return to the original automatic mode, which reflects no tuning to TPTP. Otter did very well in CASC-13, but it has changed very little in automatic capability since then. Powerful new programs have been introduced in the past four years, and old competitors have been greatly improved, so no winning performances are expected from Otter.

References

McC94
McCune W.W. (1994), Otter 3.0 Reference Manual and Guide, Technical Report ANL-94/6, Argonne National Laboratory, Argonne, USA.
McC98
McCune W.W. (1998), Otter: An Automated Deduction System, http://www.mcs.anl.gov/AR/otter/.
MW97
McCune W.W., Wos L. (1997), Otter: The CADE-13 Competition Incarnations, Journal of Automated Reasoning 18(2), pp.211-220.
Acknowledgments: Ross Overbeek, Larry Wos, Bob Veroff, and Rusty Lusk contributed to the development of Otter.


SCOTT V

K. Hodgson and J. Slaney
Automated Reasoning Project, Australian National University, Australia
{kahlil,jks}@arp.anu.edu.au

Architecture

SCOTT (Semantically Constrained OTTER) [
HS99, HS00] is an extension of OTTER [McC94] and uses the same inference rules and overall proof search structure. This extension incorporates FINDER [Sla94] (a finite model generator) as a subsystem of OTTER in order to determine consistent subsets of the clause space. These consistent subsets are used to guide the proof search by increasing the likelihood that certain necessary inferences are made [HS00]. (This can be seen to loosely generalise the set of support strategy). Not all consistent subsets are created equal, with different collections providing different levels of guidance, and so a number of strategies for determining which collections provide the best guidance have been developed. These incorporate theoretical estimates of the guidance provided and the corresponding cost incurred, enabling us to manage the trade-off much more effectively than in previous versions of SCOTT [HS99].

Implementation

SCOTT is written in C. It uses the program OTTER to perform the proof search, the progam FINDER to witness the consistency of particular sets of clauses, and additional code to determine which clauses should be chosen for inference. The system may be obtained from:
     http://arp.anu.edu.au/software/scott/

Expected Competition Performance

Preliminary results show that, SCOTT-V out performs SCOTT-IV, so we expect an improvement on the performance in CASC-16. However, searching for models and testing clauses against them can be costly, and so SCOTT may perform poorly if the time-limit is too severe.

References

McC94
McCune W.W. (1994), Otter 3.0 Reference Manual and Guide, Technical Report ANL-94/6, Argonne National Laboratory, Argonne, USA.
Sla94
Slaney J.K. (1994), Finite Domain Enumerator, System Description, Bundy A., Proceedings of the 12th International Conference on Automated Deduction (Nancy, France), pp.764-768, Lecture Notes in Artificial Intelligence 814, Springer-Verlag.
HS99
Hodgson K., Slaney J.K. (1999), Semantic Guidance with SCOTT, Technical Report TR-ARP-01-99, Automated Reasoning Project, Australian National University, Canberra, Australia.
HS00
Hodgson K., Slaney J.K. (2000), Semantic Guidance for Saturation-based Theorem Proving, Technical Report TR-ARP-04-2000, Automated Reasoning Project, Australian National University, Canberra, Australia.


SPASS 1.0.0TPTP

C. Weidenbach
Max-Planck-Institut für Informatik
weidenb@mpi-sb.mpg.de

Architecture

SPASS [
WA+99] is an automated theorem prover for first-order logic with equality. It is a saturation based prover employing superposition, sorts and splitting [Wei99]. The current version 1.0.0 offers a variety of further inference and reduction rules including hyper resolution, unit resulting resolution, various variants of paramodulation and a terminator [AO83].

The prover is available for many platforms, in particular, a Windows 95/98/NT version with a neat graphical user interface is now also provided.

Implementation

SPASS is implemented in ANSI C and relies on our internal library supporting specific data structures and algorithms like, e.g., indexing or orderings (KBO, RPOS). The code is documented, flexible and was designed to be reusable. The code should compile on any platform supporting an ANSI C compiler.

Expected Competition Performance

The SPASS competition version 1.0.0TPTP is tuned towards TPTP problems. In particular, the standard automatic configuration module is replaced by a module adapted to the TPTP. We hope that the new version can improve over the version 0.95TPTP that entered the previous competition.

References

AO93
Antoniou G., Ohlbach H-J. (1983), Terminator, Bundy A., Proceedings of the 8th International Joint Conference on Artificial Intelligence (Karlsruhe, Germany), pp.916-919.
WA+99
Weidenbach C., Afshordel B., Brahm U., Cohrs C., Engel T., Keen E., Theobalt C., Tpoic D. (1999), System Description: SPASS Version 1.0.0, Ganzinger H., Proceedings of the 16th International Conference on Automated Deduction (Trento, Italy), Lecture Notes in Artificial Intelligence, Springer-Verlag.
Wei99
Weidenbach C. (1999), SPASS: Combining Superposition, Sorts and Splitting, Robinson A., Voronkov A., Handbook of Automated Reasoning, Elsevier Science.


Vampire 0.0

A. Riazanov, A. Voronkov
Computer Science Department, The University of Manchester
{riazanoa,voronkov}@cs.man.ac.uk

Architecture

Vampire is a resolution-based theorem prover for first-order classical logic. The current version implements ordered binary resolution and hyperresolution, superposition and some simplifications. For proof search Vampire uses a saturation-based procedure similar to that of Otter [
McC94]. One interesting feature of Vampire is the so-called limited resource strategy. The idea is as follows. When a time limit is given, Vampire makes estimations on what clauses (called currently eligible clauses) can be processed by the end of the time limit. For each newly generated clause it is checked whether it is currently eligible. If it is not, Vampire discards the clause. The estimation principle takes into consideration the current strategy and tries to calculate how the process of clause generation will develop by the time limit.

Implementation

To gain efficiency the most costly set-at-a-time operations, such as resolution, superposition and subsumption, are implemented using combination of indexing and compilation. We use a variant of code trees [Vor95] for forward subsumption and forward demodulation, path indexing [Sti89] for backward subsumption and a variant of discrimination trees [McC92] for resolution and superposition. All used indexing techniques provide so-called "perfect filtering" [Graf96,RSV99]. For storing persistent clauses a tree-like representation of terms is used. The sharing is agressive. For other operations different versions of flatterms are used. Vampire is implemented in C++ and compiled by gcc.

Expected Competition Performance

We expect Vampire to perform well, especially on the problems without equality.

References

Gra96
Graf P. (1996), Term Indexing, Lecture Notes in Computer Science 1053, Springer-Verlag.
McC92
McCune W.W. (1992), Experiments with Discrimination-Tree Indexing and Path Indexing for Term Retrieval, Journal of Automated Reasoning 9(2), pp.147-167.
McC94
McCune W.W. (1994), Otter 3.0 Reference Manual and Guide, Technical Report ANL-94/6, Argonne National Laboratory, Argonne, USA.
RSV99
Ramakrishnan I.V., Sekar R., Voronkov A. (1999), Term Indexing, Robinson A., Voronkov A., Handbook of Automated Reasoning, Elsevier Science.
Sti89
Stickel M.E. (1989), The Path Indexing Method for Indexing Terms, Technical Note 473, SRI International, Menlo Park, USA.
Vor95
Voronkov A. (1995), The Anatomy of Vampire, Journal of Automated Reasoning 15(2), pp.237-265.


Vampire 1.0

A. Riazanov, A. Voronkov
Computer Science Department, The University of Manchester
{riazanoa,voronkov}@cs.man.ac.uk

Architecture

Vampire 1.0 is an automatic theorem prover for first-order classical logic. It implements the calculi of ordered binary resolution, hyperresolution and superposition for handling equality. The splitting rule is simulated by introducing new predicate symbols. A number of standard redundancy criteria and simplification techniques is used for pruning the search space: subsumption, tautology deletion, subsumption resolution and rewriting by ordered unit equalities. The only term ordering used in Vampire at the moment is a special version of the Knuth-Bendix ordering which allows efficient approximation algorithms for solving ordering constraints. The competition version will exploit a number of simple preprocessing techniques, such as elimination of simple predicate and function definitions. A number of efficient indexing techniques is used to implement all major operations on sets of terms and clauses, such as an improved version of code trees [
Vor95] for forward subsumption and a combination of path indexing [Sti89] and database joins for backward subsumption.

Implementation

Vampire 1.0 is implemented in C++ and can be compiled by 2.91 or newer versions of gcc. The binaries for Solaris and Linux are available from the authors, for details see:
    http://www.cs.man.ac.uk/~riazanoa/Vampire/

Expected Competition Performance

We cannot make reliable predictions of Vampire's performance at CASC-17.

References

Sti89
Stickel M.E. (1989), The Path Indexing Method for Indexing Terms, Technical Note 473, SRI International, Menlo Park, USA.
Vor95
Voronkov A. (1995), The Anatomy of Vampire, Journal of Automated Reasoning 15(2), pp.237-265.


Waldmeister 799

T. Hillenbrand, A. Jaeger, B. Löchner, and A. Buch
FB Informatik, Universität Kaiserslautern
{hillen,jaeger,loechner}@informatik.uni-kl.de,arnim.buch@sdm.de

This year's version of our unit equality theorem prover is in most parts identical to Waldmeister 798 that won the CASC-15 UEQ competition with an enormous lead [SS99]. Since last year's description is included in this collection only the differences are reported here.

Architecture

Waldmeister is an implementation of unfailing Knuth-Bendix completion [BDP89]. Our approach to control the proof search is to choose selection strategy and reduction ordering according to the algebraic structure given in the problem specification [HJL99]. Since Waldmeister 798, the corresponding part of the system has been developed further into a robust component that is easy to maintain. In addition the derivation of useful control knowledge has been systematized, which also gave rise to progress on some problem domains.

Recently, Waldmeister has successfully been integrated into the Omega proof development environment [BC+97]. Moreover, a convenient WWW interface has been established. It is reachable via:

    http://www-avenhaus.informatik.uni-kl.de/waldmeister
From this URL further documents and the complete Waldmeister distribution containing an easy-to-use version together with problems and a manual are available.

Implementation

No major changes took place. Some portions of code have been slightly revised, e.g., a number of time-consuming inner loops have carefully been reworked.

Expected Competition Performance

We expect some improvements in comparison with last year's version, depending on problem domains and proof tasks.

References

BDP89
Bachmair L., Dershowitz N., Plaisted D.A. (1989), Completion Without Failure, Ait-Kaci H., Nivat M., Resolution of Equations in Algebraic Structures, pp.1-30, Academic Press.
BC+97
Benzmüller C., Cheikhrouhou L., Fehrer D., Fiedler A., Huang X., Kerber M., Kohlhase M., Konrad K., Meier A., Melis E., Schaarschmidt W., Siekmann J., Sorge V. (1997), OMEGA: Towards a Mathematical Assistant, McCune W.W., Proceedings of the 14th International Conference on Automated Deduction (Townsville, Australia), pp.146-160, Lecture Notes in Artificial Intelligence 1249, Springer-Verlag.
HJL99
Hillenbrand T., Jaeger A., Löchner B. (1999), Waldmeister - Improvements in Performance and Ease of Use, Ganzinger H., Proceedings of the 16th International Conference on Automated Deduction (Trento, Italy), Lecture Notes in Artificial Intelligence, Springer-Verlag.
SS99
Sutcliffe G., Suttner C.B. (1999), The CADE-15 ATP System Competition, Journal of Automated Reasoning.


Waldmeister 600

T. Hillenbrand, B. Löchner, A. Jaeger, and A. Buch
FB Informatik, Universität Kaiserslautern
waldmeister@informatik.uni-kl.de

Architecture

Waldmeister 600 [
HJL99] is an implementation of unfailing Knuth-Bendix completion [BDP89]. The system architecture is built on a stringent distinction of active facts (selected rewrite rules) and passive facts (critical pairs): Firstly, only the active facts induce the normalization relation. Secondly, the passive facts are normalized not permanently, but on generation only. The saturation cycle is parameterized by a reduction ordering and a heuristic assessment of passive facts. The control parameters are chosen according to the algebraic structure given in the problem specification [HJL99]. Recent developments include the integration of the kernel strategy [HL00] as well as extensions towards ordered completion, but are still in progress.

Implementation

The system is implemented in ANSI-C and runs under SunOS, Solaris and Linux. Refined perfect discrimination trees are employed to index the active facts, whereas the passive facts are stored in a compressed representation. The hypotheses are blown up to sets of rewrite successors and sometimes predecessors. A step-by-step proof object is constructed at run-time. Waldmeister can be found on the Web at:
    http://www-avenhaus.informatik.uni-kl.de/waldmeister

Expected Competition Performance

The Waldmeister system being as powerful as it is today, we hope to improve over last year's version in some problem domains.

References

BDP89
Bachmair L., Dershowitz N., Plaisted D.A. (1989), Completion Without Failure, Ait-Kaci H., Nivat M., Resolution of Equations in Algebraic Structures, pp.1-30, Academic Press.
HJL99
Hillenbrand T., Jaeger A., Löchner B. (1999), Waldmeister - Improvements in Performance and Ease of Use, Ganzinger H., Proceedings of the 16th International Conference on Automated Deduction (Trento, Italy), Lecture Notes in Artificial Intelligence, Springer-Verlag.
HL00
Hillenbrand T., Löchner B. (2000), Shorter Fixed Point Combinators with Waldmeister and the Kernel Strategy, AAR Newsletter 46.