Entrants' System Descriptions


Bliksem 1.12a

Hans de Nivelle
Max Planck Institut für Informatik, Germany
nivelle@mpi-sb.mpg.de

Architecture

Bliksem is a theorem prover implementing resolution and paramodulation. It implements many different strategies, including several orders that define decision procedures. It does not implement basicness or constraints. The automatic selection of strategies is poor, which partially explains the bad performance. A special feature of Bliksem is that it is able to output total proofs that can be easily verified [
BHN00]. Additional programs that can check the proofs, or convert them into Coq-format [COQ02], are available from the home page.

Bliksem 1.12a is a reduced version of Bliksem 1.12. It was planned to replace the CNF-transformer by a Java program that is able to generate correctness proofs. In order to prepare for this, the CNF-transformer of Bliksem has been reduced to a minimum. Unfortunately, the new CNF-transformer will not be completed before the competition.

Implementation

Bliksem is written in portable C. High priority has been given to portability. Bliksem has been compiled succcessfully using cc, gcc, and djpcc. As said above, the CNF-transformer is to be replaced by a Java program. Bliksem (together with the additional proof checking programs) can be downloaded from:
    http://www.mpi-sb.mpg.de/~bliksem

Strategies

Stategy selection is primitive. Bliksem 1.12a checks if the problem is Horn or non-Horn, and it checks if the problem contains equality. If the problem is Horn, then hyperresolution is selected. If the problem contains equality, then paramodulation and equality factoring are switched on.

Expected Competition Performance

It can be expected that Bliksem 1.12a will perform worse than Bliksem 1.12, because of the fact that part of the system was removed.

References

BHN00
Bezem M., Hendriks D., de Nivelle H. (2000), Automated Proof construction in Type Theory using Resolution, McAllester D., Proceedings of the 17th International Conference on Automated Deduction (Pittsburg, USA), pp.148-163, Lecture Notes in Artificial Intelligence 1831, Springer-Verlag.
COQ02
(2002), The Coq Proof Assistant, http://pauillac.inria.fr/coq.


CiME 2

Evelyne Contejean, Benjamin Monate
LRI, Universite Paris-Sud, France
{contejea,monate}@lri.fr

Architecture

CiME is intended to be a toolkit, which contains nowadays the following features: The ordered completion of term rewriting systems [
CM96] will be used during the competion to attempt to solve unification problems, that is problems in the UEQ division.

Implementation

CiME 2 [CM+00] is fully written in Objective CAML, a functional language of the ML family developed in the CRISTAL project at INRIA Rocquencourt. CiME 2 is available at:
    http://cime.lri.fr/
as binaries for SPARC workstations running Solaris (at least version 2.6) and for pentium PCs running Linux, and its sources are available by read-only anynomous CVS.

Strategies

There are two distinct kinds of strategies to perform completion:

Expected Competition Performance

This will be the first participation of CiME 2 in CASC, in the UEQ division.

References

CM+00
Contejean E., March C., Monate B., Urbain X. (2000), CiME version 2, http://cime.lri.fr.
CM96
Contejean E., March C. (1996), CiME: Completion Modulo E, Ganzinger H., Proceedings of the 7th International Conference on Rewriting Techniques and Applications (New Brunswick, USA), pp.416-419, Lecture Notes in Computer Science 1103, Springer-Verlag.
Acknowledgments: Claude Marché and Xavier Urbain contributed to the development of CiME 2.

DCTP 1.2 and DCTP 10.1p

Gernot Stenz
Technische Universität München, Germany
stenz@informatik.tu-muenchen.de

Architecture

DCTP 1.2 [
Ste02] is an automated theorem prover for first order clause logic. It is an implementation of the disconnection calculus described in [Bil96] and [LS01]. The disconnection calculus is a proof confluent and inherently cut-free tableau calculus with a weak connectedness condition. The inherently depth-first proof search is guided by a literal selection based on literal instantiatedness or literal complexity and a heavily parameterised link selection. The pruning mechanisms mostly rely on different forms of variant deletion and unit based strategies. Additionally the calculus has been augmented by full tableau pruning.

The new DCTP 1.2 features several methods of builtin equality treatment [LS02], based on ordered paramodulation and on a variant of lazy root paramodulation that is also featured in the current version of SETHEO. Also, a formula preprocessor has been integrated making use of clause factoring, full subsumption, demodulation and a number of other techniques.

DCTP 10.1p is a strategy parallel version using the technology of E-SETHEO [SW99] to combine several different strategies based on DCTP 1.2.

Implementation

DCTP 1.2 has been implemented as a monolithic system in the Bigloo dialect of the Scheme language. The most important data structures are perfect discrimination trees, of which heavy use is made. DCTP 10.1p has been derived from the Perl implementation of E-SETHEO and includes DCTP 1.2 as well as additional components written in Prolog and Shell tools. Both versions run under Solaris and Linux.

Strategies

DCTP 1.2 is a single strategy prover. Individual strategies are started by DCTP 10.1p using the schedule based resource allocation scheme known from the E-SETHEO system. Of course, different schedules have been precomputed for the syntactic problem classes. The problem classes are more or less identical with the sub-classes of the competition organisers. We have no idea whether or not this conflicts with the organisers' tuning restrictions.

Expected Competition Performance

We expect both DCTP 1.2 and DCTP 10.1p to perform reasonably well, in particular in the SAT and EPR categories.

References

Bil96
Billon J-P. (1996), The Disconnection Method: A Confluent Integration of Unification in the Analytic Framework, Miglioli P., Moscato U., Mundici D., Ornaghi M., Proceedings of TABLEAUX'96: the 5th Workshop on Theorem Proving with Analytic Tableaux and Related Methods (Palermo, Italy), pp.110-126, Lecture Notes in Artificial Intelligence 1071, Springer-Verlag.
LS01
Letz R., Stenz G. (2001), Proof and Model Generation with Disconnection Tableaux, Nieuwenhuis R., Voronkov A., Proceedings of the 8th International Conference on Logic for Programming, Artificial Intelligence, and Reasoning (Havana, Cuba), pp.142-156, Lecture Notes in Artificial Intelligence 2250, Springer-Verlag.
LS02
Letz R., Stenz G. (2002), Integration of Equality Reasoning into the Disconnection Calculus, Fermüller C., Egly U., Proceedings of TABLEAUX 2002: Automated Reasoning with Analytic Tableaux and Related Methods (Copenhagen, Denmark), To appear.
Ste02
Stenz G. (2002), DCTP 1.2 - System Abstract, Fermüller C., Egly U., Proceedings of TABLEAUX 2002: Automated Reasoning with Analytic Tableaux and Related Methods (Copenhagen, Denmark), To appear.
SW99
Stenz G., Wolf A. (1999), E-SETHEO: Design, Configuration and Use of a Parallel Automated Theorem Prover, Foo N., Proceedings of AI'99: The 12th Australian Joint Conference on Artificial Intelligence (Sydney, Australia), pp.231-243, Lecture Notes in Artificial Intelligence 1747, Springer-Verlag.


E 0.7 and EP 0.7

Stephan Schulz
Technische Universität München, Germany, and Safelogic A.B., Sweden
schulz@informatik.tu-muenchen.de

Architecture

E 0.7 [
Sch01,Sch02] is a purely equational theorem prover. The calculus used by E combines superposition (with selection of negative literals) and rewriting. No special rules for non-equational literals have been implemented, i.e. resolution is simulated via paramodulation and equality resolution. E also implements AC redundancy elimination and AC simplification for dynamically recognized associative and commutative equational theories, as well as pseudo-splitting for clauses.

E is based on the DISCOUNT-loop variant of the given-clause algorithm, i.e., a strict separation of active and passive facts. Proof search in E is primarily controlled by a literal selection strategy, a clause evaluation heuristic, and a simplification ordering. The prover supports a large number of preprogrammed literal selection strategies, many of which are only experimental. Clause evaluation heuristics can be constructed on the fly by combining various parameterized primitive evaluation functions, or can be selected from a set of predefined heuristics. Supported term orderings are several parameterized instances of Knuth-Bendix Ordering (KBO) and Lexicographic Path Ordering (LPO).

An automatic mode can select literal selection strategy, term ordering (different versions of KBO and LPO), and search heuristic based on simple problem characteristics.

EP 0.7 is just a combination of E 0.7 in verbose mode and a proof analysis tool extracting the used inference steps.

Implementation

E is implemented in ANSI C, using the GNU C compiler. The most outstanding feature is the global sharing of rewrite steps. Contrary to earlier versions of E, which destructively changed all shared instances of a term, the latest version adds only a rewrite link from the rewritten to the new term. In effect, E is caching rewrite operations as long as sufficient memory is available. A second important feature is the use of perfect discrimination trees with age and size constraints for rewriting and unit-subsumption.

The program has been successfully installed under SunOS 4.3.x, Solaris 2.x, HP-UX B 10.20, and various versions of Linux. Sources of the latest released version and a current snapshot are available freely from:

    http://wwwjessen.informatik.tu-muenchen.de/~schulz/WORK/eprover.html
EP 0.7 is a simple Bourne shell script calling E and the postprocessor in a pipeline.

Strategies

E's automatic mode is optimized for performance on TPTP 2.4.0. The optimization is based on a fairly large number of test runs and consists of the selection of one of about 50 different strategies for each problem. All test runs have been performed on SUN Ultra 60/300 machines with a time limit of 300 second (or roughly equivalent configurations).

E distinguishes four primary problem classes: Problems where all non-negative clauses are unit clauses, where all non-negative clauses are Horn clauses (and at least one is not a unit), and non-Horn problems without and with equality. For each of these classes a separate set of candidate heuristics is created and run over all problems in this class. To generate the automatic mode for any of the four primary classes, the class is partitioned into subclasses defined by (some) of the following nine features:

Wherever there is a selection of few, some, and many of a certain entity, the limits are selected automatically with the aim of splitting the set of clauses into three sets of approximately equal size based on this one feature.

For each non-empty class, we assign the most general candidate heuristic that solves the same number of problems on this class as the best heuristic on this class does. Typically, most selected heuristics are assigned to more than one class. As an example, the current intermediate version uses a total of eight heuristics to cover all 532 problems with unit axioms.

Expected Competition Performance

Last year E performed well in the MIX category of CASC and came in third in the UEQ division. We believe that E will again be among the strongest provers in the MIX category, in particular due to its good performance for Horn problems. In UEQ, E will probably be beaten by only Waldmeister.

EP 0.7 will be hampered by the fact that it has to analyse the inference step listing, an operation that typically is about as expensive as the proof search itself. Nevertheless, it should be competitive among the MIX* systems.

References

Sch01
Schulz S. (2001), System Abstract: E 0.61, Gore R., Leitsch A., Nipkow T., Proceedings of the International Joint Conference on Automated Reasoning (Siena, Italy), pp.370-375, Lecture Notes in Artificial Intelligence, Springer-Verlag.
Sch02
Schulz S. (2002), E: A Brainiac Theorem Prover, AI Communications, To appear.


E-SETHEO csp01

Reinhold Letz, Stephan Schulz, Gernot Stenz
Technische Universität München, Germany
{letz,schulz,stenz}@informatik.tu-muenchen.de

Architecture

E-SETHEO is a compositional theorem prover for formulae in first order clause logic, combining the systems E [
Sch99] and SETHEO [MI+97]. It incorporates different calculi and proof procedures like superposition, model elimination and semantic trees (the DPLL procedure). Furthermore, the system contains transformation techniques which may split the formula into independent subparts or which may perform a ground instantiation. Finally, advanced methods for controlling and optimizing the combination of the subsystems are applied. E-SETHEO additionally includes the program Flotter [WGR96] as a preprocessing module for transforming non-clausal formulae to clausal form.

Since version 99csp of E-SETHEO, the different strategies are run sequentially, one after the other, depending on the allocation of resources to the different strategies, so-called schedules, which have been computed from experimental data using machine learning techniques as described in [SW99]. Schedule selection depends on syntactic characteristics of the input formula. E-SETHEO csp01 incorporates the disconnection prover DCTP [LS01] as a new strategy as well as a new version of the E prover.

Implementation

According to the diversity of the contained systems, the modules of E-SETHEO are implemented in different programming languages like C, Prolog, Scheme, and Shell tools.

The program runs under Solaris 2.6. Sources are available freely.

References

LS01
Letz R., Stenz G. (2001), System Description: DCTP - A Disconnection Calculus Theorem Prover, Gore R., Leitsch A., Nipkow T., Proceedings of the International Joint Conference on Automated Reasoning (Siena, Italy), Lecture Notes in Artificial Intelligence, Springer-Verlag.
MI+97
Moser M., Ibens O., Letz R., Steinbach J., Goller C., Schumann J., Mayr K. (1997), SETHEO and E-SETHEO: The CADE-13 Systems, Journal of Automated Reasoning 18(2), pp.237-246.
Sch99
Schulz S. (1999), System Abstract: E 0.3, Ganzinger H., Proceedings of the 16th International Conference on Automated Deduction (Trento, Italy), pp.297-301, Lecture Notes in Artificial Intelligence 1632, Springer-Verlag.
SW99
Stenz G., Wolf A. (1999), E-SETHEO: Design, Configuration and Use of a Parallel Automated Theorem Prover, Foo N., Proceedings of AI'99: The 12th Australian Joint Conference on Artificial Intelligence (Sydney, Australia), pp.231-243, Lecture Notes in Artificial Intelligence 1747, Springer-Verlag.
WGR96
Weidenbach C., Gaede B., Rock G. (1996), SPASS and FLOTTER, McRobbie M., Slaney J.K., Proceedings of the 13th International Conference on Automated Deduction (New Brunswick, USA), pp.141-145, Lecture Notes in Artificial Intelligence 1104, Springer-Verlag.


E-SETHEO csp02

Reinhold Letz, Stephan Schulz, Gernot Stenz
Technische Universität München, Germany
{letz,schulz,stenz}@informatik.tu-muenchen.de

Architecture

E-SETHEO is a compositional theorem prover for formulae in first order clause logic, combining the systems E [
Sch01], DCTP [Ste02] and SETHEO [MI+97]. It incorporates different calculi and proof procedures like superposition, model elimination and semantic trees (the DPLL procedure). Furthermore, the system contains transformation techniques which may split the formula into independent subparts or which may perform a ground instantiation. Finally, advanced methods for controlling and optimizing the combination of the subsystems are applied. The first-order variant of E-SETHEO no longer uses Flotter [WGR96] as a preprocessing module for transforming non-clausal formulae to clausal form. Instead, a more primitive normal form transformation is employed.

Since version 99csp of E-SETHEO, the different strategies are run sequentially, one after the other. E-SETHEO csp02 incorporates the new version of the disconnection prover DCTP with integrated equality handling as a new strategy as well as a new version of the E prover. The new Scheme version of SETHEO that is in use features local unit failure caching [LS01] and lazy root paramodulation, an optimisation of lazy paramodulation which is complete in the Horn case [LS02].

Implementation

According to the diversity of the contained systems, the modules of E-SETHEO are implemented in different programming languages like C, Prolog, Scheme, and Shell tools.

The program runs under Solaris and, with a little luck, under Linux, too. Sources are available from the authors.

Strategies

Individual strategies are started be E-SETHEO depending on the allocation of resources to the different strategies, so-called schedules, which have been computed from experimental data using machine learning techniques as described in [SW99]. Schedule selection depends on syntactic characteristics of the input formula such as the Horn-ness of formulae, whether a problem contains equality literals or whether the formula is in the Bernays-Schönfinkel class. The problem classes are more or less identical with the sub-classes of the competition. We have no idea whether or not this conflicts with the organisers' tuning restrictions.

Expected Competition Performance

We expect E-SETHEO to perform well in all categories it participates in.

References

LS01
Letz R., Stenz G. (2001), Model Elimination and Connection Tableau Procedures, Robinson A., Voronkov A., Handbook of Automated Reasoning, pp.2015-2114, Elsevier Science.
LS02
Letz R., Stenz G. (2002), Integration of Equality Reasoning into the Disconnection Calculus, Fermüller C., Egly U., Proceedings of TABLEAUX 2002: Automated Reasoning with Analytic Tableaux and Related Methods (Copenhagen, Denmark), To appear.
MI+97
Moser M., Ibens O., Letz R., Steinbach J., Goller C., Schumann J., Mayr K. (1997), SETHEO and E-SETHEO: The CADE-13 Systems, Journal of Automated Reasoning 18(2), pp.237-246.
Sch01
Schulz S. (2001), System Abstract: E 0.61, Gore R., Leitsch A., Nipkow T., Proceedings of the International Joint Conference on Automated Reasoning (Siena, Italy), pp.370-375, Lecture Notes in Artificial Intelligence 2083, Springer-Verlag.
SW99
Stenz G., Wolf A. (1999), E-SETHEO: Design, Configuration and Use of a Parallel Automated Theorem Prover, Foo N., Proceedings of AI'99: The 12th Australian Joint Conference on Artificial Intelligence (Sydney, Australia), pp.231-243, Lecture Notes in Artificial Intelligence 1747, Springer-Verlag.
Ste02
Stenz G. (2002), DCTP 1.2 - System Abstract, Fermüller C., Egly U., Proceedings of TABLEAUX 2002: Automated Reasoning with Analytic Tableaux and Related Methods (Copenhagen, Denmark), To appear.
WGR96
Weidenbach C., Gaede B., Rock G. (1996), SPASS and FLOTTER, McRobbie M., Slaney J.K., Proceedings of the 13th International Conference on Automated Deduction (New Brunswick, USA), pp.141-145, Lecture Notes in Artificial Intelligence 1104, Springer-Verlag.


EXLOG 2

Ivan Kossey
Uzhgorod State University, Ukraine
Ivan.Kossey@t-online.de

Architecture

EXLOG, developed since 1999, is a theorem prover in first order logic. The basic features are binary resolution with factors [
Kos82], forward and backward subsumption, unit preference, a weak variant of set of support, and exclusion of predicates with satisfiable propositional abstraction. Features for equality are rewrite rules (created from unit equality clauses with ordering of terms on length/alphabetical), and commutative unification. A somewhat modified Knuth-Bendix algorithm is used for equations with commutative unification for symmetric predicates and commutative functions. Paramodulation and AC-unification are not implemented. EXLOG accepts FOF, clausal TPTP format and his own input format.

Implementation

Assembler (TASM32) under Windows32. EXLOG runs on PC under Win95, Win98, WinNT, WinME, Win2000, WinXP. Developed and tested under Win2000. The subsumption algorithm [Kos84] and its implementation seems to be fast.

Strategies

Different strategies are implemented: some heuristic techniques for simplifying, unit preference, a weak variant of set of support (the conjecture clauses and their decsendants have preference). Many "small" heuristics are implemented, e.g., clause sorting on literal number/length. The program is self-tuning and has no run parameters.

Expected Competition Performance

Top half for problems with equality, bottom half for problems without equality.

References

Kos82
Kossey I. (1982), Four Fast Algorithms in Resolution Method, Messages of the Academy of Sciences of USSR, Technical Cybernetics Series, Nr 5, Academy of Sciences of USSR, Moscow, Russia.
Kos84
Kossey I. (1984), Subsumption Algorithms in First and Finite Order Resolution, Mathematical Logic and its Applications, Moscow Institute of Pedagogics, Moscow, Russia.


Gandalf c-2.5

Tanel Tammet
Tallinn Technical University, Estonia
Safelogic AB, Sweden
tammet@cc.ttu.ee

Architecture

Gandalf [
Tam97,Tam98] is a family of automated theorem provers, including classical, type theory, intuitionistic and linear logic provers, plus finite a model builder. The version c-2.5 contains the classical logic prover for clause form input and the finite model builder. One distinguishing feature of Gandalf is that it contains a large number of different search strategies and is capable of automatically selecting suitable strategies and experimenting with these strategies.

Gandalf is available under GPL. There exists a separate commercial version of Gandalf, called G, developed and distributed by Safelogic AB (www.safelogic.se), which contains numerous additions, strategies, and optimisations, aimed specifically at verification of large systems.

The finite model building component of Gandalf uses the Zchaff propositional logic solver by L.Zhang [MM+01] as an external program called by Gandalf. Zchaff is not free, although it can be used freely for research purposes. Gandalf is not optimised for Zchaff or linked together with it: Zchaff can be freely replaced by other satisfiability checkers.

Implementation

Gandalf is implemented in Scheme and compiled to C using the Hobbit Scheme-to-C compiler. Version scm5d6 of the Scheme interpreter scm by A.Jaffer is used as the underlying Scheme system.

Gandalf has been tested on Linux, Solaris and MS Windows under Cygwin.

The propositional satisifiability checker Zchaff used by Gandalf during finite model building is implemented by L.Zhang in C++.

Gandalf should be publicly available at:

    http://www.ttu.ee/it/gandalf

Strategies

One of the basic ideas used in Gandalf is time-slicing: Gandalf typically runs a number of searches with different strategies one after another, until either the proof is found or time runs out. Also, during each specific run Gandalf typically modifies its strategy as the time limit for this run starts coming closer. Selected clauses from unsuccessful runs are sometimes used in later runs.

In the normal mode Gandalf attempts to find only unsatisfiability. It has to be called with a -sat flag to find satisfiability. Gandalf selects the strategy list according to the following criteria:

Expected Competition Performance

In the MIX and UEQ divisions, Gandalf will probably be among the better contestants but is not likely to win, except for (possibly) some categories of MIX.

In the SAT division, Gandalf version 2.5 should perform significantly better than older versions of Gandalf or any other satisfiability checker that participated at CASC during 2001. However, no prediction can be made about the relative performance at 2002.

In the EPR division, Gandalf now uses suitable strategies, differently from Gandalf in 2001, and is expected to be among the better provers in this class.

References

MM+01
Moskewicz M., Madigan C., Zhao Y., Zhang L., Malik S. (2001), Chaff: Engineering an Efficient SAT Solver, Blaauw D., Lavagno L., Proceedings of the 39th Design Automation Conference (Las Vegas, USA), pp.530-535.
Tam97
Gandalf, Journal of Automated Reasoning 18(2), pp.199-204.
Tam98
Towards Efficient Subsumption, Kirchner C., Kirchner H., Proceedings of the 15th International Conference on Automated Deduction (Lindau, Germany), pp.427-440, Lecture Notes in Artificial Intelligence 1421, Springer-Verlag.


GandalfSat 1.0

Tanel Tammet
Tallinn Technical University, Estonia
tammet@cc.ttu.ee

Architecture

GandalfSat is a special version of the
Gandalf theorem prover, optimised for satisifiability checking. Essentially it contains the same code as Gandalf, but uses different default search strategies to Gandalf.

Ordinary Gandalf is a resolution prover which implements a number of different basic strategies: binary ordered resolution for several orderings, versions of set-of-support resolution, binary unit resolution, hyperresolution. Enhancements are used for cutting off literals and combining different strategies, in particular forward and backward reasoning, into one run. Equality is handled by ordered paramodulation and demodulation.

GandalfSat uses time-slicing similarly to ordinary Gandalf. It attempts hyperresolution, ordered resolution based on [FL+93] results, and two kinds of finite model building algorithms (Falcon-like and MACE-like) for attempting to show satisfiability.

Some relevant publications are: [FL+93, Tam97, TS98].

Implementation

Gandalf is implemented in Scheme, using the scm interpreter developed by A. Jaffer and the Scheme-to-C compiler Hobbit developed by T. Tammet for the scm system.

References

FL+93
Fermüller C., Leitsch A., Tammet T., Zamov N. (1993), Resolution Methods for the Decision Problem, Lecture Notes in Computer Science 679, Springer-Verlag.
Tam97
Gandalf, Journal of Automated Reasoning 18(2), pp.199-204.
Tam98
Towards Efficient Subsumption, Kirchner C., Kirchner H., Proceedings of the 15th International Conference on Automated Deduction (Lindau, Germany), pp.427-440, Lecture Notes in Artificial Intelligence 1421, Springer-Verlag.


GrAnDe 1.1

Geoff Sutcliffe1, Stephan Schulz2
1University of Miami, USA, 2Technische Universität München, Germany
geoff@cs.miami.edu, schulz@informatik.tu-muenchen.de

Architecture

GrAnDe (Ground And Decide) [
SS02] is a decision procedure for CNF problems with a finite Herbrand universe. GrAnDe uses the grounding procedure eground [Sch02] to generate ground instances of such a problem, and the propositional decision system ZChaff [MM+01] to determine the satisfiability of the resulting propositional problem. The set of all ground instances of a set of clauses is often too large to compute with reasonable resources. Therefore eground tries to find a smaller set of ground instances that is still equiconsistent with the original set of clauses. Three techniques are combined: clause splitting, structural constraints on variable instantiation, and propositional simplification. Clause splitting separates a clause into variable disjoint parts, and replaces the orginal clause by new clauses formed from the parts and link literals. Structural constraints are used to prohibit variable instantiations that would lead to the generation of pure literals in the resulting ground clauses. Propositional simplification removes clauses and literals from the generated ground clause set, while maintaining satisfiability.

For details about ZChaff, see [MM+01].

Despite the optimizations in eground, there are problems where eground runs out of time or memory. In these cases eground outputs an incomplete ground clause set. If eground has output an incomplete clause set and ZChaff reports that it is satisfiable, then no result is reported by GrAnDe. If eground has generated an incomplete clause set and ZChaff reports that it is unsatisfiable, or if eground has generated a complete clause set, then ZChaff's result is reported by GrAnDe as the overall result.

Implementation

eground is implemented in ANSI-C and based on the E [Sch01] libraries. It uses the basic infrastructure up to the shared term representation, and adds a new, compact clause data type for the generated propositional clauses. Propositional unit-clauses, used for simplification, are represented by a simple array entry, making unit subsumption and unit simplification into O(1) operations (in the number of unit clauses). Clause splitting is implemented naively, using a flood-fill algorithm to find connected sub-clauses. Finally, variable constraints are directly computed in conjunctive normal form, and are used to directly construct possible instantiations.

For details about ZChaff, see [MM+01].

A Perl script called And is used to combine eground and ZChaff. The script invokes eground on the EPR problem, allowing it maximally 66% of the CPU time limit. The propositional clauses written by eground are captured into a temporary file, which is used as the input for ZChaff. ZChaff is allowed whatever CPU time has not been used by eground.

GrAnDe, including the separate components, is available from:

    http://www.cs.miami.edu/~tptp/ATPSystems/GrAnDe/
It should be widely portable between recent UNIX variants.

Strategies

GrAnDe has only one mode of operation, i.e., there are no different strategies for different types of problems. The overhead for the different techniques in eground is small enough that all optimizations can be applied in all cases without serious degradation of performance.

Expected Competition Performance

GrAnDe is expected to outperform all systems that use only first-order techniques.

References

MM+01
Moskewicz M., Madigan C., Zhao Y., Zhang L., Malik S. (2001), Chaff: Engineering an Efficient SAT Solver, Blaauw D., Lavagno L., Proceedings of the 39th Design Automation Conference (Las Vegas, USA), pp.530-535.
SS02
Schulz S., Sutcliffe G. (2002), System Description: GrAnDe 1.0, Voronkov A., Proceedings of the 18th International Conference on Automated Deduction (Copenhagen, Denmark), To appear, Lecture Notes in Artificial Intelligence, Springer-Verlag.
Sch01
Schulz S. (2001), System Abstract: E 0.61, Gore R., Leitsch A., Nipkow T., Proceedings of the International Joint Conference on Automated Reasoning (Siena, Italy), pp.370-375, Lecture Notes in Artificial Intelligence 2083, Springer-Verlag.
Sch02
Schulz S. (2002), A Comparison of Different Techniques for Grounding Near-Propositional CNF Formulae, Haller S., Simmons G., Proceedings of the 15th Florida Artificial Intelligence Research Symposium (Pensecola, USA), pp.72-76, AAAI Press.


ICGNS 2002

William McCune
Argonne National Laboratory, USA
mccune@mcs.anl.gov

Architecture

ICGNS 2002 [
McC02] searches for finite models of first-order (unsorted, with equality) statements. Given input clauses, it generates ground instances over a finite domain, then uses a decision procedure try to determine satisfiability. If there is no model of that size, it increments the domain size and tries again.

The ICGNS method is SEM-like [ZZ95] rather than MACE-like [McC01]; that is, the ground problem is propositional/equality rather than flattened and purely propositional. The two main parts of the ICGNS method are (1) selecting the next empty cell in the tables of functions being constructed and deciding which values need to be considered for that cell, and (2) propagating assignments. ICGNS uses the basic least number heuristic (LNH) to reduce isomorphism. The LNH was introduced in Falcon [Zha96] and is also used in SEM. Effective use of the LNH requires careful cell selection. Propagation is by ground rewriting and inference rules to derive negated equalities.

Implementation

ICGNS 2002 is a new program, started in the winter of 2002. It uses some pre-existing code from various sources for some of the low level operations such as parsing and term structure. ICGNS is coded in ANSI C and should be easily portable to recent variants of UNIX. The source code and test problems are available from:
    http://www.mcs.anl.gov/home/mccune/icgns/

Strategies

Several strategies are available for cell selection and for applying the negative inference rules. A single strategy is used for CASC-18: (1) the LNH, selecting cells with the fewest number of possible values, and (2) applying all negative propagation inference rules. The strategy was derived by experimenting on problems that arose in mathematics practice, mostly in abstract algebra. No tuning to the TPTP set has been done.

Expected Competition Performance

ICGNS seems to do well on equational problems, especially if the input set contains some simple equations. Satisfiable problems without reasonably sized finite models (including problems without finite models) are out of reach of ICGNS. Some of those problems can be done by satisfiability methods that work by saturation with a complete inference system. It is doubtful that ICGNS will do as well, overall, as programs that try various methods and various strategies.

References

McC01
McCune W.W. (2001), MACE 2.0 Reference Manual and Guide, ANL/MCS-TM-249, Argonne National Laboratory, Argonne, USA.
McC02
McCune W.W. (2002), ICGNS, http://www.mcs.anl.gov/~mccune/icgns.
ZZ95
Zhang J., Zhang H. (1995), SEM: A System for Enumerating Models, Mellish C.S., Proceedings of the 14th International Joint Conference on Artificial Intelligence, pp.298-303, Morgan Kaufmann.
Zha96
Zhang J. (1996), Constructing Finite Algebras with FALCON, Journal of Automated Reasoning 17(1), pp.1-22.
Acknowledgments: Olga Shumsky Matlin and Michael Rose assisted in the development of ICNGS 2002.


Otter 3.2

William McCune
Argonne National Laboratory, USA
mccune@mcs.anl.gov

Architecture

Otter 3.2 [
McC94] is an ATP system for statements in first-order (unsorted) logic with equality. Otter is based on resolution and paramodulation applied to clauses. An Otter search uses the "given clause algorithm", and typically involves a large database of clauses; subsumption and demodulation play an important role.

Implementation

Otter is written in C. Otter uses shared data structures for clauses and terms, and it uses indexing for resolution, paramodulation, forward and backward subsumption, forward and backward demodulation, and unit conflict.

Expected Competition Performance

Otter has been entered into CASC-18 as a stable benchmark against which progress can be judged (there have been only minor changes to Otter since 1996 [MW97], nothing that really affects its performace in CASC). Otter's original automatic mode, which reflects no tuning to the TPTP problems, will be used. This is not an ordinary entry, and we do not hope for Otter to do well in the competition.

References

McC94
McCune W.W. (1994), Otter 3.0 Reference Manual and Guide, ANL-94/6, Argonne National Laboratory, Argonne, USA.
MW97
McCune W.W., Wos L. (1997), Otter: The CADE-13 Competition Incarnations, Journal of Automated Reasoning 18(2), pp.211-220.
McC02
McCune W.W. (2002), Otter: An Automated Deduction System, http://www-unix.mcs.anl.gov/AR/otter/.

Acknowledgments: Ross Overbeek, Larry Wos, Bob Veroff, and Rusty Lusk contributed to the development of Otter.


SCOTT 6.1

John Slaney
Australian National University, Australia
John.Slaney@anu.edu.au

Architecture

SCOTT [
HS01, HS02] consists of two main components: a saturation-based theorem prover (Otter [McC02] in fact) and a finite domain constraint solver (FINDER [Sla94]) that generates models for subsets of the clauses deduced. The models are used to guide selection of given clauses, clauses which are false in the guiding models being given preference. Because model generation is computationally expensive, the critical issue for performance is the tradeoff between search quality due to semantic guidance and the time used in securing and refining it.

Successive versions of SCOTT have competed regularly in CASC. Version 6.1 is little changed from version 6.0 which competed in 2001, except for some amendments aimed at reducing the number of times a useless model is generated, deleted and re-generated, and slight tuning of the default values for parameters.

Implementation

The system is written in C, and consists of three main modules: the theorem prover is Otter, the model generator is FINDER, and there is a relatively small module linking the two into the combined system. The sources are available from
    http://arp.anu.edu.au/software/scott/

Strategies

The proof techniques are taken over directly from Otter, along with that system's weight reduction strategies, pick-given ratio and the like. The characteristic feature of SCOTT is semantic guidance, applied as a heuristic for selection of given clauses. This improves the quality of the choices overall, though the effect is far from uniform across problems and the overheads incurred are severe. Many problems from TPTP have been used in the development of the semantic guidance strategy, but there has been little attempt to tune it more specifically than that.

Expected Competition Performance

SCOTT 6.0 competed in 2001, and version 6.1 is essentially similar, so we expect similar performance. Much depends on the time limit, as there are over 1000 problems in TPTP that the system proves in more than one minute and less than ten minutes (taking timings on a Sun E250 UltraSPARC II, for instance) so it will perform significantly better with a 600 second limit than with a 300 second one. In the past it has done moderately well on UEQ problems and on essentially propositional ones. In the MIX division, however, it is hampered by the lack of a sufficiently adaptive autonomous mode and will not threaten the leading provers. We shall be interested to discover what it finds easy and what it finds hard, in comparison with other systems.

References

HS01
Hodgson K., Slaney J.K. (2001), System Description: SCOTT-5, Gore R., Leitsch A., Nipkow T., Proceedings of the International Joint Conference on Automated Reasoning (Siena, Italy), pp.443-447, Lecture Notes in Artificial Intelligence 2083, Springer-Verlag.
HS02
Hodgson K., Slaney J.K. (2002), TPTP, CASC and the Development of a Semantically Guided Theorem Prover, AI Communications, To appear.
McC02
McCune W.W. (2002), Otter: An Automated Deduction System, http://www-unix.mcs.anl.gov/AR/otter/.
Sla94
Slaney J.K. (1994), FINDER: Finite Domain Enumerator, System Description, Bundy A., Proceedings of the 12th International Conference on Automated Deduction (Nancy, France), pp.798-801, Lecture Notes in Artificial Intelligence 814, Springer-Verlag.

Vampire 2.0-CASC

Alexandre Riazanov, Andrei Voronkov
University of Manchester, England
{riazanoa,voronkov}@cs.man.ac.uk

Architecture

Vampire 2.0 is an automatic theorem prover for first-order classical logic. It implements the calculi of ordered binary resolution, hyperresolution, and superposition for handling equality. The splitting rule is simulated by introducing new predicate symbols. A number of standard redundancy criteria and simplification techniques are used for pruning the search space: subsumption, tautology deletion, subsumption resolution and rewriting by ordered unit equalities. The only term ordering used in Vampire at the moment is a special version of the Knuth-Bendix ordering which allows efficient approximation algorithms for solving ordering constraints. A number of efficient indexing techniques are used to implement all major operations on sets of terms and clauses, such as an improved version [
RV00] of code trees [Vor95] for forward subsumption, and a combination of path indexing [Sti89] and database joins for backward subsumption.

Compared to Vampire 1.0 that participated in the previous competition, this version has many more literal selection functions, more flexible splitting without backtracking and improved memory management. The automatic mode of Vampire 2.0 is primitive, it recognises only very simple syntactic properties of the input problem, like the presence of equality or non-Horn clauses. In the preprocessing stage it exploits a number of primitive techniques, such as elimination of simple predicate and function definitions.

Vampire 2.0-CASC is a version based on Vampire 2.0, specialised to win the competition. To adjust the strategy it estimates the problem by checking some syntactic properties, such as presence of multiliteral, non-Horn and ground clauses, equations and nonequational literals. Additionally we consider such quantitative characteristics as the number of axioms, literals, small and large terms.

Implementation

Vampire 2.0 is implemented in C++ and can be compiled by 2.91 or newer versions of gcc. The binaries for Solaris and Linux are available from the authors, for details see:
    http://www.cs.man.ac.uk/~riazanoa/Vampire

References

RV00
Riazanov A., Voronkov A. (2000), Partiallly Adaptive Code Trees, Ojeda-Aciego M., de Guzman I., Brewka G., Pereira L., Proceedings of the 7th European Workshop on Logics in Artificial Inteligence (Malaga, Spain), pp.209-223, Lecture Notes in Artificial Intelligence 1919, Springer-Verlag.
Sti89
Stickel M.E. (1989), The Path Indexing Method for Indexing Terms, Technical Note 473, SRI International, Menlo Park, USA.
Vor95
Voronkov A. (1995), The Anatomy of Vampire, Journal of Automated Reasoning 15(2), pp.237-265.


Vampire 5.0 and Vampire 5.0-CASC

Alexandre Riazanov, Andrei Voronkov
University of Manchester, England
{riazanoa,voronkov}@cs.man.ac.uk

Architecture

Vampire [
RV01, RV02] 5.0 is an automatic theorem prover for first-order classical logic. Its kernel implements the calculi of ordered binary resolution and superposition for handling equality. The splitting rule is simulated by introducing new predicate symbols. A number of standard redundancy criteria and simplification techniques are used for pruning the search space: subsumption, tautology deletion, subsumption resolution and rewriting by ordered unit equalities. The only term ordering used in Vampire at the moment is a special non-recursive version of the Knuth-Bendix ordering which allows efficient approximation algorithms for solving ordering constraints. By the system installation deadline we may implement the standard Knuth-Bendix ordering. A number of efficient indexing techniques are used to implement all major operations on sets of terms and clauses. Although the kernel of the system works only with clausal normal forms, the preprocessor component accepts a problem in the full first-order logic syntax, clausifies them and performs a number of useful transformations before passing the result to the kernel.

Implementation

Vampire 5.0 is implemented in C++. The main supported compiler version is gcc 2.95.3, although in the nearest future we are going to move to gcc 3.x. The system has been successfully compiled for Linux and Solaris. It is available from:
    http://www.cs.man.ac.uk/~riazanoa/Vampire/

Strategies

The Vampire kernel provides a fairly large number of features for strategy selection. The most important ones are: The standalone executables for Vampire 5.0 and Vampire 5.0-CASC use very simple time slicing to make sure that several kernel strategies are tried on a given problem.

The automatic mode of Vampire 5.0 is primitive. Seven problem classes are distinguished corresponding to the competition divisions HNE, HEQ, NNE, NEQ, PEQ, UEQ and EPR. Every class is assigned a fixed schedule consisting of a number of kernel strategies called one by one with different time limits. The automatic mode of Vampire 5.0-CASC is better tuned towards TPTP 2.4.0. It uses the same problem characteristics as E 0.7 to divide the problems into a large number of classes and may potentially assign a separate strategy to every such class.

Expected Competition Performance

We expect Vampire 5.0 to perform much better than Vampire 2.0 on problems with equality, especially on the unit equality problems.

References

RV01
Riazanov A., Voronkov A. (2001), Vampire 1.1 (System Description), Gore R., Leitsch A., Nipkow T., Proceedings of the International Joint Conference on Automated Reasoning (Siena, Italy), pp.376-380, Lecture Notes in Artificial Intelligence 2083, Springer-Verlag.
RV02
Riazanov A., Voronkov A. (2002), The Design and Implementation of Vampire, AI Communications, To appear.

Waldmeister 601

Thomas Hillenbrand1, Bernd Löchner2, Andreas Jaeger, and Arnim Buch
1Max-Planck-Institut für Informatik Saarbrücken, Germany, 2Universität Kaiserslautern, Germany
waldmeister@informatik.uni-kl.de

Architecture

Waldmeister is a system for unit equational deduction. Its theoretical basis is unfailing completion in the sense of [
BDP89] with refinements towards ordered completion. The prover saturates the input axiomatization in a repeated cycle that works on a set of active resp. passive facts. The selection of the reduction ordering and the heuristical guidance of the proof search are described in [HJL99].

Since last year's version, stronger redundancy criteria have been integrated, including ground joinability tests with ordering constraints on variables [AHL00]. In several problem domains this technique is helpful especially for harder proof tasks. Some restructuring of the prover is on the way to also include an implementation of confluence trees with full ordering constraints [AL01]; but further work is necessary to have them speed up the proof search.

Implementation

The prover is coded in ANSI-C and available for SunOS, Solaris and Linux. The set of active facts is represented by perfect discrimination trees. Stimulated by the evaluation of indexing techniques reported on in [NH+01], the implementation of these has somewhat been improved. The storage of passive facts, and the treatment of hypotheses have remained unchanged. The Waldmeister Web page is located at:
    http://www-avenhaus.informatik.uni-kl.de/waldmeister

References

AHL00
Avenhaus J., Hillenbrand T., Löchner B. (2000), On Using Ground Joinable Equations in Equational Theorem Proving, Baumgartner P., Zhang H., Proceedings of the 3rd International Workshop on First Order Theorem Proving (St Andrews, Scotland), pp.33-43.
AL01
Avenhaus J., Löchner B. (2001), System Description: CCE: Testing Ground Joinability, Gore R., Leitsch A., Nipkow T., Proceedings of the International Joint Conference on Automated Reasoning (Siena, Italy), Lecture Notes in Artificial Intelligence, Springer-Verlag.
BDP89
Bachmair L., Dershowitz N., Plaisted D.A. (1989), Completion Without Failure, Ait-Kaci H., Nivat M., Resolution of Equations in Algebraic Structures, pp.1-30, Academic Press.
HJL99
Hillenbrand T., Jaeger A., Löchner B. (1999), Waldmeister - Improvements in Performance and Ease of Use, Ganzinger H., Proceedings of the 16th International Conference on Automated Deduction (Trento, Italy), pp.232-236, Lecture Notes in Artificial Intelligence 1632, Springer-Verlag.
NH+01
Nieuwenhuis R., Hillenbrand T., Riazanov A., Voronkov A. (2001), On the Evaluation of Indexing Techniques for Theorem Proving, Gore R., Leitsch A., Nipkow T., Proceedings of the International Joint Conference on Automated Reasoning (Siena, Italy), Lecture Notes in Artificial Intelligence, Springer-Verlag.


Waldmeister 702

Thomas Hillenbrand1, Bernd Löchner2
1Max-Planck-Institut für Informatik Saarbrücken, Germany, 2Universität Kaiserslautern, Germany
waldmeister@informatik.uni-kl.de

Architecture

Waldmeister 702 is an implementation of unfailing Knuth-Bendix completion [
BDP89] with extensions towards ordered completion (see [AHL00]) and basicness [BG+92, NR92]. The system saturates the input axiomatization, distinguishing active facts, which induce a rewrite relation, and passive facts, which are the one-step conclusions of the active ones up to redundancy. The saturation process is parameterized by a reduction ordering and a heuristic assessment of passive facts.

Only recently, we have designed a thorough refinement of the system architecture concerning the representation of passive facts [HL02]. The aim of that work - the next Waldmeister loop - is, besides gaining more structural clarity, to cut down memory consumption especially for long-lasting proof attempts, and hence less relevant in the CASC setting.

Implementation

The system is implemented in ANSI-C and runs under Solaris and Linux. The central data strucures are: perfect discrimination trees for the active facts; element-wise compressions for the passive ones; and sets of rewrite successors for the conjectures. Waldmeister can be found on the Web at
    http://www-avenhaus.informatik.uni-kl.de/waldmeister

Strategies

Our approach to control the proof search is to choose the search parameters according to the algebraic structure given in the problem specification [HJL99]. This is based on the observation that proof tasks sharing major parts of their axiomatization often behave similar. Hence, for a number of domains, the influence of different reduction orderings and heuristic assessments has been analyzed experimentally; and in most cases it has been possible to distinguish a strategy uniformly superior on the whole domain. In essence, every such strategy consists of an instantiation of the first parameter to a Knuth-Bendix ordering or to a lexicographic path ordering, and an instantiation of the second parameter to one of the weighting functions addweight, gtweight, or mixweight, which, if called on an equation s = t, return |s| + |t|, |max>(s,t)|, or |max>(s,t)| · (|s| + |t| + 1) + |s| + |t|, respectively, where |s| denotes the number of symbols in s.

Expected Competition Performance

We expect Waldmeister 702 to be quite competitive. But since our current restructuring of the prover to implement the concepts of [HL02] may affect the inference behaviour, the outcome in comparison to last year's version is difficult to predict.

References

AHL00
Avenhaus J., Hillenbrand T., Löchner B. (2000), On Using Ground Joinable Equations in Equational Theorem Proving, Baumgartner P., Zhang H., Proceedings of the 3rd International Workshop on First Order Theorem Proving (St Andrews, Scotland), pp.33-43.
BDP89
Bachmair L., Dershowitz N., Plaisted D.A. (1989), Completion Without Failure, Ait-Kaci H., Nivat M., Resolution of Equations in Algebraic Structures, pp.1-30, Academic Press.
BG+92
Bachmair L., Ganzinger H., Lynch C., Snyder W. (1992), Basic Paramodulation and Superposition, Kapur D., Proceedings of the 11th International Conference on Automated Deduction (Saratoga Springs, USA), pp.462-476, Lecture Notes in Artificial Intelligence 607, Springer-Verlag.
HJL99
Hillenbrand T., Jaeger A., Löchner B. (1999), Waldmeister - Improvements in Performance and Ease of Use, Ganzinger H., Proceedings of the 16th International Conference on Automated Deduction (Trento, Italy), pp.232-236, Lecture Notes in Artificial Intelligence 1632, Springer-Verlag.
HL02
Hillenbrand T., Löchner B. (2002), The Next Waldmeister Loop, Voronkov A., Proceedings of the 18th International Conference on Automated Deduction (Copenhagen, Denmark), To appear, Lecture Notes in Artificial Intelligence, Springer-Verlag.
NR92
Nieuwenhuis R., Rivero J.M. (1992), Basic Superposition is Complete, Krieg-Brückner B., Proceedings of the 4th European Symposium on Programming (Rennes, France), pp.371-390, Lecture Notes in Computer Science 582, Springer-Verlag.