Entrants' System Descriptions
Bliksem 1.12a
Hans de Nivelle
Max Planck Institut für Informatik, Germany
nivelle@mpi-sb.mpg.de
Architecture
Bliksem is a theorem prover implementing resolution and paramodulation.
It implements many different strategies, including several orders that
define decision procedures.
It does not implement basicness or constraints.
The automatic selection of strategies is poor, which partially explains the
bad performance.
A special feature of Bliksem is that it is able to output total proofs
that can be easily verified [BHN00].
Additional programs that can check the proofs, or convert them into
Coq-format [COQ02], are available from the
home page.
Bliksem 1.12a is a reduced version of Bliksem 1.12.
It was planned to replace the CNF-transformer by a Java program that is able
to generate correctness proofs.
In order to prepare for this, the CNF-transformer of Bliksem has been
reduced to a minimum.
Unfortunately, the new CNF-transformer will not be completed before the
competition.
Implementation
Bliksem is written in portable C.
High priority has been given to portability.
Bliksem has been compiled succcessfully using cc, gcc, and djpcc.
As said above, the CNF-transformer is to be replaced by a Java program.
Bliksem (together with the additional proof checking programs) can be
downloaded from:
http://www.mpi-sb.mpg.de/~bliksem
Strategies
Stategy selection is primitive. Bliksem 1.12a checks if the problem is
Horn or non-Horn, and it checks if the problem contains equality.
If the problem is Horn, then hyperresolution is selected. If the problem
contains equality, then paramodulation and equality factoring are
switched on.
Expected Competition Performance
It can be expected that Bliksem 1.12a will perform worse than
Bliksem 1.12, because of the fact that part of the system
was removed.
References
- BHN00
- Bezem M., Hendriks D., de Nivelle H. (2000),
Automated Proof construction in Type Theory using
Resolution,
McAllester D.,
Proceedings of the 17th International Conference on Automated
Deduction
(Pittsburg, USA),
pp.148-163,
Lecture Notes in Artificial Intelligence 1831,
Springer-Verlag.
- COQ02
- (2002),
The Coq Proof Assistant,
http://pauillac.inria.fr/coq.
CiME 2
Evelyne Contejean, Benjamin Monate
LRI, Universite Paris-Sud, France
{contejea,monate}@lri.fr
Architecture
CiME is intended to be a toolkit, which contains nowadays the following
features:
- An interactive toplevel to allow naming of objects and call to
various functions.
- Solving Diophantine constraints over finite intervals
- Solving Presburger constraints
- String Rewriting Systems, KB completion.
- Parameterized String Rewriting Systems confluence checking
- Term Rewriting Systems, possibly with commutative or
associative-commutative symbols, KB or ordered completion.
- Termination of TRSs using standard or dependency pairs criteria,
automatic generation of termination orderings based on polynomial
interpretations, including weak orderings for dependency pairs
criteria.
The ordered completion of term rewriting systems
[CM96] will be used during
the competion to attempt to solve unification problems, that is
problems in the UEQ division.
Implementation
CiME 2 [CM+00] is fully written in
Objective CAML, a functional
language of the ML family developed in the
CRISTAL project at
INRIA Rocquencourt.
CiME 2 is available at:
http://cime.lri.fr/
as binaries for SPARC workstations running Solaris (at least version 2.6) and
for pentium PCs running Linux, and its sources are available by read-only
anynomous CVS.
Strategies
There are two distinct kinds of strategies to perform completion:
- The first one is, given an equation, how to choose its orientation
when it becomes a rule?
The choice is made thanks to an ordering which has usually to be
provided by the user.
During the competition, this ordering is arbitrarily fixed to be an
RPO ordering based on a precedence where the symbols are ordered
according to their arities.
- The second one is which inference rule has to be applied to the
system, among orienting an equation into a rule and computing critical
pairs.
Each of these choices is given a weight, and the lowest weighted choice
is made.
The weight depends on the size of the involved equations/rules and
on how "old" they are.
Some tuning may occur at this point by choosing the ratio between the
coefficients for the size and the age.
During the competition, this ratio will be fixed.
Expected Competition Performance
This will be the first participation of CiME 2 in CASC, in the UEQ division.
References
- CM+00
- Contejean E., March C., Monate B., Urbain X. (2000),
CiME version 2,
http://cime.lri.fr.
- CM96
- Contejean E., March C. (1996),
CiME: Completion Modulo E,
Ganzinger H.,
Proceedings of the 7th International Conference on Rewriting
Techniques and Applications
(New Brunswick, USA),
pp.416-419,
Lecture Notes in Computer Science 1103,
Springer-Verlag.
Acknowledgments: Claude Marché and Xavier Urbain contributed to the
development of CiME 2.
DCTP 1.2 and DCTP 10.1p
Gernot Stenz
Technische Universität München, Germany
stenz@informatik.tu-muenchen.de
Architecture
DCTP 1.2 [Ste02] is an automated theorem prover
for first order clause logic.
It is an implementation of the disconnection calculus described in
[Bil96] and [LS01].
The disconnection calculus is a proof confluent and inherently cut-free
tableau calculus with a weak connectedness condition.
The inherently depth-first proof search is guided by a literal selection
based on literal instantiatedness or literal complexity and a heavily
parameterised link selection.
The pruning mechanisms mostly rely on different forms of variant
deletion and unit based strategies.
Additionally the calculus has been augmented by full tableau pruning.
The new DCTP 1.2 features several methods of builtin equality treatment
[LS02], based on ordered paramodulation and
on a variant of lazy root paramodulation that is also featured in the current
version of SETHEO.
Also, a formula preprocessor has been integrated making use of clause
factoring, full subsumption, demodulation and a number of other techniques.
DCTP 10.1p is a strategy parallel version using the technology of E-SETHEO
[SW99] to combine several different strategies
based on DCTP 1.2.
Implementation
DCTP 1.2 has been implemented as a monolithic system in the Bigloo dialect
of the Scheme language.
The most important data structures are perfect discrimination trees,
of which heavy use is made.
DCTP 10.1p has been derived from the Perl implementation of E-SETHEO
and includes DCTP 1.2 as well as additional components written in Prolog and
Shell tools.
Both versions run under Solaris and Linux.
Strategies
DCTP 1.2 is a single strategy prover.
Individual strategies are started by DCTP 10.1p using the schedule based
resource allocation scheme known from the E-SETHEO system.
Of course, different schedules have been precomputed for the syntactic
problem classes.
The problem classes are more or less identical with the sub-classes of the
competition organisers.
We have no idea whether or not this conflicts with the organisers' tuning
restrictions.
Expected Competition Performance
We expect both DCTP 1.2 and DCTP 10.1p to perform reasonably well, in
particular in the SAT and EPR categories.
References
- Bil96
- Billon J-P. (1996),
The Disconnection Method: A Confluent Integration of Unification
in the Analytic Framework,
Miglioli P., Moscato U., Mundici D., Ornaghi M.,
Proceedings of TABLEAUX'96: the 5th Workshop on Theorem Proving with
Analytic Tableaux and Related Methods
(Palermo, Italy),
pp.110-126,
Lecture Notes in Artificial Intelligence 1071,
Springer-Verlag.
- LS01
- Letz R., Stenz G. (2001),
Proof and Model Generation with Disconnection Tableaux,
Nieuwenhuis R., Voronkov A.,
Proceedings of the 8th International Conference on Logic for
Programming, Artificial Intelligence, and Reasoning
(Havana, Cuba),
pp.142-156,
Lecture Notes in Artificial Intelligence 2250,
Springer-Verlag.
- LS02
- Letz R., Stenz G. (2002),
Integration of Equality Reasoning into the Disconnection
Calculus,
Fermüller C., Egly U.,
Proceedings of TABLEAUX 2002: Automated Reasoning with Analytic
Tableaux and Related Methods
(Copenhagen, Denmark),
To appear.
- Ste02
- Stenz G. (2002),
DCTP 1.2 - System Abstract,
Fermüller C., Egly U.,
Proceedings of TABLEAUX 2002: Automated Reasoning with Analytic
Tableaux and Related Methods
(Copenhagen, Denmark),
To appear.
- SW99
- Stenz G., Wolf A. (1999),
E-SETHEO: Design, Configuration and Use of a Parallel Automated
Theorem Prover,
Foo N.,
Proceedings of AI'99: The 12th Australian Joint Conference on
Artificial Intelligence
(Sydney, Australia),
pp.231-243,
Lecture Notes in Artificial Intelligence 1747,
Springer-Verlag.
E 0.7 and EP 0.7
Stephan Schulz
Technische Universität München, Germany, and
Safelogic A.B., Sweden
schulz@informatik.tu-muenchen.de
Architecture
E 0.7 [Sch01,Sch02]
is a purely equational theorem prover.
The calculus used by E combines superposition (with selection of negative
literals) and rewriting.
No special rules for non-equational literals have been implemented,
i.e. resolution is simulated via paramodulation and equality resolution.
E also implements AC redundancy elimination and AC simplification for
dynamically recognized associative and commutative equational theories,
as well as pseudo-splitting for clauses.
E is based on the DISCOUNT-loop variant of the given-clause algorithm,
i.e., a strict separation of active and passive facts.
Proof search in E is primarily controlled by a literal selection strategy,
a clause evaluation heuristic, and a simplification ordering.
The prover supports a large number of preprogrammed literal selection
strategies, many of which are only experimental.
Clause evaluation heuristics can be constructed on the fly by combining
various parameterized primitive evaluation functions, or can be selected
from a set of predefined heuristics.
Supported term orderings are several parameterized instances of Knuth-Bendix
Ordering (KBO) and Lexicographic Path Ordering (LPO).
An automatic mode can select literal selection strategy, term ordering
(different versions of KBO and LPO), and search heuristic based on simple
problem characteristics.
EP 0.7 is just a combination of E 0.7 in verbose mode and a proof
analysis tool extracting the used inference steps.
Implementation
E is implemented in ANSI C, using the GNU C compiler.
The most outstanding feature is the global sharing of rewrite steps.
Contrary to earlier versions of E, which destructively changed all shared
instances of a term, the latest version adds only a rewrite link
from the rewritten to the new term.
In effect, E is caching rewrite operations as long as sufficient memory is
available.
A second important feature is the use of perfect discrimination trees
with age and size constraints for rewriting and unit-subsumption.
The program has been successfully installed under SunOS 4.3.x,
Solaris 2.x, HP-UX B 10.20, and various versions of
Linux.
Sources of the latest released version and a current snapshot are available
freely from:
http://wwwjessen.informatik.tu-muenchen.de/~schulz/WORK/eprover.html
EP 0.7 is a simple Bourne shell script calling E and the postprocessor in
a pipeline.
Strategies
E's automatic mode is optimized for performance on TPTP 2.4.0.
The optimization is based on a fairly large number of test runs and
consists of the selection of one of about 50 different strategies for
each problem.
All test runs have been performed on SUN Ultra 60/300 machines with a time
limit of 300 second (or roughly equivalent configurations).
E distinguishes four primary problem classes: Problems where all
non-negative clauses are unit clauses, where all non-negative clauses
are Horn clauses (and at least one is not a unit), and non-Horn
problems without and with equality.
For each of these classes a separate set of candidate heuristics is created
and run over all problems in this class.
To generate the automatic mode for any of the four primary classes, the class
is partitioned into subclasses defined by (some) of the following nine
features:
- Are all negative clauses unit clauses?
- Are all literals equality literals, are some literas equality
literals, or is the problem non-equational?
- Are there only a few, some, or many positive non-ground
unit clauses among the axioms?
- Are all goals (negative clauses) ground?
- Are there a few, some, or many clauses in the problem?
- Are there a few, some, or many literals?
- Are there a few, some, or many (sub)terms?
- Are there a few, some or many positive ground
unit clauses among the axioms?
- Is the maximum arity of any function symbol 0, 1, 2, or greater?
Wherever there is a selection of few, some, and many of a certain
entity, the limits are selected automatically with the aim of
splitting the set of clauses into three sets of approximately equal
size based on this one feature.
For each non-empty class, we assign the most general candidate
heuristic that solves the same number of problems on this class as the
best heuristic on this class does.
Typically, most selected heuristics are assigned to more than one class.
As an example, the current intermediate version uses a total of eight
heuristics to cover all 532 problems with unit axioms.
Expected Competition Performance
Last year E performed well in the MIX category of CASC and came in third
in the UEQ division.
We believe that E will again be among the strongest provers in the MIX
category, in particular due to its good performance for Horn problems.
In UEQ, E will probably be beaten by only Waldmeister.
EP 0.7 will be hampered by the fact that it has to analyse the
inference step listing, an operation that typically is about as
expensive as the proof search itself.
Nevertheless, it should be competitive among the MIX* systems.
- Sch01
- Schulz S. (2001),
System Abstract: E 0.61,
Gore R., Leitsch A., Nipkow T.,
Proceedings of the International Joint Conference on Automated
Reasoning
(Siena, Italy),
pp.370-375,
Lecture Notes in Artificial Intelligence,
Springer-Verlag.
- Sch02
- Schulz S. (2002),
E: A Brainiac Theorem Prover,
AI Communications,
To appear.
E-SETHEO csp01
Reinhold Letz, Stephan Schulz, Gernot Stenz
Technische Universität München,
Germany
{letz,schulz,stenz}@informatik.tu-muenchen.de
Architecture
E-SETHEO is a compositional theorem prover for formulae in first order
clause logic, combining the systems E [Sch99]
and SETHEO [MI+97].
It incorporates different calculi and proof procedures
like superposition, model elimination and semantic trees (the DPLL
procedure).
Furthermore, the system contains transformation techniques
which may split the formula into independent subparts or which may
perform a ground instantiation.
Finally, advanced methods for controlling and optimizing the combination
of the subsystems are applied.
E-SETHEO additionally includes the program Flotter
[WGR96] as a
preprocessing module for transforming non-clausal formulae to clausal form.
Since version 99csp of E-SETHEO, the different strategies
are run sequentially, one after the other, depending on the allocation
of resources to the different strategies, so-called schedules,
which have been computed from experimental data using machine learning
techniques as described in [SW99].
Schedule selection depends on syntactic characteristics of the input formula.
E-SETHEO csp01 incorporates the disconnection prover DCTP
[LS01] as a new strategy as well as a
new version of the E prover.
Implementation
According to the diversity of the contained systems, the modules of
E-SETHEO are implemented in different programming languages like C,
Prolog, Scheme, and Shell tools.
The program runs under Solaris 2.6. Sources are available freely.
- LS01
- Letz R., Stenz G. (2001),
System Description: DCTP - A Disconnection Calculus Theorem
Prover,
Gore R., Leitsch A., Nipkow T.,
Proceedings of the International Joint Conference on Automated
Reasoning
(Siena, Italy),
Lecture Notes in Artificial Intelligence,
Springer-Verlag.
- MI+97
- Moser M., Ibens O., Letz R., Steinbach J., Goller C., Schumann J.,
Mayr K. (1997),
SETHEO and E-SETHEO: The CADE-13 Systems,
Journal of Automated Reasoning 18(2),
pp.237-246.
- Sch99
- Schulz S. (1999),
System Abstract: E 0.3,
Ganzinger H.,
Proceedings of the 16th International Conference on Automated
Deduction
(Trento, Italy),
pp.297-301,
Lecture Notes in Artificial Intelligence 1632,
Springer-Verlag.
- SW99
- Stenz G., Wolf A. (1999),
E-SETHEO: Design, Configuration and Use of a Parallel Automated
Theorem Prover,
Foo N.,
Proceedings of AI'99: The 12th Australian Joint Conference on
Artificial Intelligence
(Sydney, Australia),
pp.231-243,
Lecture Notes in Artificial Intelligence 1747,
Springer-Verlag.
- WGR96
- Weidenbach C., Gaede B., Rock G. (1996),
SPASS and FLOTTER,
McRobbie M., Slaney J.K.,
Proceedings of the 13th International Conference on Automated
Deduction
(New Brunswick, USA),
pp.141-145,
Lecture Notes in Artificial Intelligence 1104,
Springer-Verlag.
E-SETHEO csp02
Reinhold Letz, Stephan Schulz, Gernot Stenz
Technische Universität München,
Germany
{letz,schulz,stenz}@informatik.tu-muenchen.de
Architecture
E-SETHEO is a compositional theorem prover for formulae in first order
clause logic, combining the systems E [Sch01],
DCTP [Ste02] and SETHEO
[MI+97].
It incorporates different calculi and proof procedures like superposition,
model elimination and semantic trees (the DPLL procedure).
Furthermore, the system contains transformation techniques which may split
the formula into independent subparts or which may perform a ground
instantiation.
Finally, advanced methods for controlling and optimizing the combination
of the subsystems are applied.
The first-order variant of E-SETHEO no longer uses Flotter
[WGR96] as a preprocessing module for
transforming non-clausal formulae to clausal form.
Instead, a more primitive normal form transformation is employed.
Since version 99csp of E-SETHEO, the different strategies are run sequentially,
one after the other.
E-SETHEO csp02 incorporates the new version of the disconnection prover DCTP
with integrated equality handling as a new strategy as well as a new version
of the E prover.
The new Scheme version of SETHEO that is in use features local unit failure
caching [LS01] and lazy root paramodulation,
an optimisation of lazy paramodulation which is complete in the Horn case
[LS02].
Implementation
According to the diversity of the contained systems, the modules of
E-SETHEO are implemented in different programming languages like C,
Prolog, Scheme, and Shell tools.
The program runs under Solaris and, with a little luck, under Linux,
too. Sources are available from the authors.
Strategies
Individual strategies are started be E-SETHEO depending on the allocation
of resources to the different strategies, so-called schedules,
which have been computed from experimental data using machine learning
techniques as described in [SW99].
Schedule selection depends on syntactic characteristics of the input formula
such as the Horn-ness of formulae, whether a problem contains equality
literals or whether the formula is in the Bernays-Schönfinkel class.
The problem classes are more or less identical with the sub-classes of the
competition.
We have no idea whether or not this conflicts with the organisers' tuning
restrictions.
Expected Competition Performance
We expect E-SETHEO to perform well in all categories it participates in.
References
- LS01
- Letz R., Stenz G. (2001),
Model Elimination and Connection Tableau Procedures,
Robinson A., Voronkov A.,
Handbook of Automated Reasoning,
pp.2015-2114,
Elsevier Science.
- LS02
- Letz R., Stenz G. (2002),
Integration of Equality Reasoning into the Disconnection
Calculus,
Fermüller C., Egly U.,
Proceedings of TABLEAUX 2002: Automated Reasoning with Analytic
Tableaux and Related Methods
(Copenhagen, Denmark),
To appear.
- MI+97
- Moser M., Ibens O., Letz R., Steinbach J., Goller C., Schumann J.,
Mayr K. (1997),
SETHEO and E-SETHEO: The CADE-13 Systems,
Journal of Automated Reasoning 18(2),
pp.237-246.
- Sch01
- Schulz S. (2001),
System Abstract: E 0.61,
Gore R., Leitsch A., Nipkow T.,
Proceedings of the International Joint Conference on Automated
Reasoning
(Siena, Italy),
pp.370-375,
Lecture Notes in Artificial Intelligence 2083,
Springer-Verlag.
- SW99
- Stenz G., Wolf A. (1999),
E-SETHEO: Design, Configuration and Use of a Parallel Automated
Theorem Prover,
Foo N.,
Proceedings of AI'99: The 12th Australian Joint Conference on
Artificial Intelligence
(Sydney, Australia),
pp.231-243,
Lecture Notes in Artificial Intelligence 1747,
Springer-Verlag.
- Ste02
- Stenz G. (2002),
DCTP 1.2 - System Abstract,
Fermüller C., Egly U.,
Proceedings of TABLEAUX 2002: Automated Reasoning with Analytic
Tableaux and Related Methods
(Copenhagen, Denmark),
To appear.
- WGR96
- Weidenbach C., Gaede B., Rock G. (1996),
SPASS and FLOTTER,
McRobbie M., Slaney J.K.,
Proceedings of the 13th International Conference on Automated
Deduction
(New Brunswick, USA),
pp.141-145,
Lecture Notes in Artificial Intelligence 1104,
Springer-Verlag.
EXLOG 2
Ivan Kossey
Uzhgorod State University,
Ukraine
Ivan.Kossey@t-online.de
Architecture
EXLOG, developed since 1999, is a theorem prover in first order logic.
The basic features are binary resolution with factors
[Kos82], forward and
backward subsumption, unit preference, a weak variant of set of support,
and exclusion of predicates with satisfiable propositional abstraction.
Features for equality are rewrite rules (created from unit equality clauses
with ordering of terms on length/alphabetical), and commutative unification.
A somewhat modified Knuth-Bendix algorithm is used for equations with
commutative unification for symmetric predicates and commutative functions.
Paramodulation and AC-unification are not implemented.
EXLOG accepts FOF, clausal TPTP format and his own input format.
Implementation
Assembler (TASM32) under Windows32. EXLOG runs on PC under Win95, Win98,
WinNT, WinME, Win2000, WinXP. Developed and tested under Win2000.
The subsumption algorithm [Kos84]
and its implementation seems to be fast.
Strategies
Different strategies are implemented: some heuristic techniques for
simplifying, unit preference, a weak variant of set of support (the
conjecture clauses and their decsendants have preference).
Many "small" heuristics are implemented, e.g., clause sorting on literal
number/length.
The program is self-tuning and has no run parameters.
Expected Competition Performance
Top half for problems with equality, bottom half for problems without
equality.
References
- Kos82
- Kossey I. (1982),
Four Fast Algorithms in Resolution Method,
Messages of the Academy of Sciences of USSR, Technical Cybernetics
Series, Nr 5,
Academy of Sciences of USSR, Moscow, Russia.
- Kos84
- Kossey I. (1984),
Subsumption Algorithms in First and Finite Order
Resolution,
Mathematical Logic and its Applications,
Moscow Institute of Pedagogics, Moscow, Russia.
Gandalf c-2.5
Tanel Tammet
Tallinn Technical University, Estonia
Safelogic AB, Sweden
tammet@cc.ttu.ee
Architecture
Gandalf
[Tam97,Tam98]
is a family of automated theorem provers, including classical, type theory,
intuitionistic and linear logic provers, plus finite a model builder.
The version c-2.5 contains the classical logic prover for clause
form input and the finite model builder.
One distinguishing feature of Gandalf is that it contains a large
number of different search strategies and is capable of automatically
selecting suitable strategies and experimenting with these strategies.
Gandalf is available under GPL.
There exists a separate commercial version of Gandalf, called G,
developed and distributed by Safelogic AB
(www.safelogic.se), which
contains numerous additions, strategies, and optimisations, aimed specifically
at verification of large systems.
The finite model building component of Gandalf uses the Zchaff propositional
logic solver by L.Zhang [MM+01] as an external program called by Gandalf.
Zchaff is not free, although it can be used freely for research purposes.
Gandalf is not optimised for Zchaff or linked together with it: Zchaff
can be freely replaced by other satisfiability checkers.
Implementation
Gandalf is implemented in Scheme and compiled to C using the Hobbit
Scheme-to-C compiler. Version scm5d6 of the Scheme interpreter scm by A.Jaffer
is used as the underlying Scheme system.
Gandalf has been tested on Linux, Solaris and MS Windows under Cygwin.
The propositional satisifiability checker Zchaff used by Gandalf during
finite model building is implemented by L.Zhang in C++.
Gandalf should be publicly available at:
http://www.ttu.ee/it/gandalf
Strategies
One of the basic ideas used in Gandalf is time-slicing: Gandalf typically
runs a number of searches with different strategies one after another, until
either the proof is found or time runs out. Also, during each specific run
Gandalf typically modifies its strategy as the time limit for this run
starts coming closer. Selected clauses from unsuccessful runs are sometimes
used in later runs.
In the normal mode Gandalf attempts to find only unsatisfiability.
It has to be called with a -sat flag to find satisfiability.
Gandalf selects the strategy list according to the following criteria:
- Unsatisfiability checking.
Gandalf selects the basic strategies from the following list:
hyperresolution, binary sos resolution, unit resolution, ordered
resolution (term-depth based, literal size based and polarity plus
literal size and structure based).
Each strategy may be iterated over a limit on term depth.
For clause sets containing equality, some strategies are tried with both
the Knuth-Bendix ordering and recursive path ordering, as well as with
several different ordering principles of function symbols for these
orderings.
Typically Gandalf selects one or two strategies to iterate over the term
depth limit and one or two strategies to iterate over the selection of
equality orderings.
At the second half of each strategy run Gandalf imposes additional
restrictions, like introducing unit restriction and switching over
to strict best-first clause selection.
The strategy list selection criteria for a particular problem is based
on following:
- Problem class from TPTP: UEQ, PEQ, HNE, HEQ, NEQ, NNE.
This strictly determines the list of basic strategies.
The following criteria determine relative amount of time given to
each strategy.
- Problem size.
A problem is classified either as small, medium, or big, according
to the number of clauses in the problem.
For bigger problems, the set of support strategy gets relatively
more time than other strategies.
- Percentage of clauses which can be ordered by term depth:
small, medium, or all.
For bigger percentages, the term depth ordering gets relatively
more time than other strategies.
- Satisfiability checking.
The following strategies are run:
- Finite model building by incremental search through function symbol
interpretations.
- Ordered binary resolution (term depth): only for problems not
containing equality.
- Finite model building using MACE-style flattening and external
propositional prover.
- Satisfiability/unsatisfiability checking for essentially propositional
problems.
The following strategies are run:
- Unsatisfiability search by resolution.
- Satisfiability/unsatisfiability search by propositonal saturation.
- Satisfiability search for small models by propositonal saturation.
Expected Competition Performance
In the MIX and UEQ divisions, Gandalf will probably be among the better
contestants but is not likely to win, except for (possibly) some categories
of MIX.
In the SAT division, Gandalf version 2.5 should perform
significantly better than older versions of Gandalf or any other
satisfiability checker that participated at CASC during 2001.
However, no prediction can be made about the relative performance at 2002.
In the EPR division, Gandalf now uses suitable strategies, differently from
Gandalf in 2001, and is expected to be among the better provers in this class.
- MM+01
- Moskewicz M., Madigan C., Zhao Y., Zhang L., Malik S. (2001),
Chaff: Engineering an Efficient SAT Solver,
Blaauw D., Lavagno L.,
Proceedings of the 39th Design Automation Conference
(Las Vegas, USA),
pp.530-535.
- Tam97
- Gandalf,
Journal of Automated Reasoning 18(2),
pp.199-204.
- Tam98
- Towards Efficient Subsumption,
Kirchner C., Kirchner H.,
Proceedings of the 15th International Conference on Automated
Deduction
(Lindau, Germany),
pp.427-440,
Lecture Notes in Artificial Intelligence 1421,
Springer-Verlag.
GandalfSat 1.0
Tanel Tammet
Tallinn Technical University, Estonia
tammet@cc.ttu.ee
Architecture
GandalfSat is a special version of the
Gandalf
theorem prover, optimised for satisifiability checking.
Essentially it contains the same code as Gandalf, but uses different
default search strategies to Gandalf.
Ordinary Gandalf is a resolution prover which implements a number of
different basic strategies: binary ordered resolution for several orderings,
versions of set-of-support resolution, binary unit resolution, hyperresolution.
Enhancements are used for cutting off literals and combining different
strategies, in particular forward and backward reasoning, into one run.
Equality is handled by ordered paramodulation and demodulation.
GandalfSat uses time-slicing similarly to ordinary Gandalf.
It attempts hyperresolution, ordered resolution based on
[FL+93] results, and two kinds of
finite model building algorithms (Falcon-like and MACE-like) for attempting
to show satisfiability.
Some relevant publications are:
[FL+93,
Tam97,
TS98].
Implementation
Gandalf is implemented in Scheme, using the scm interpreter developed
by A. Jaffer and the Scheme-to-C compiler Hobbit developed by T. Tammet
for the scm system.
References
- FL+93
- Fermüller C., Leitsch A., Tammet T., Zamov N. (1993),
Resolution Methods for the Decision Problem,
Lecture Notes in Computer Science 679,
Springer-Verlag.
- Tam97
- Gandalf,
Journal of Automated Reasoning 18(2),
pp.199-204.
- Tam98
- Towards Efficient Subsumption,
Kirchner C., Kirchner H.,
Proceedings of the 15th International Conference on Automated
Deduction
(Lindau, Germany),
pp.427-440,
Lecture Notes in Artificial Intelligence 1421,
Springer-Verlag.
GrAnDe 1.1
Geoff Sutcliffe1, Stephan Schulz2
1University of Miami, USA,
2Technische Universität München, Germany
geoff@cs.miami.edu, schulz@informatik.tu-muenchen.de
Architecture
GrAnDe (Ground And Decide)
[SS02] is a decision procedure for
CNF problems with a finite Herbrand universe.
GrAnDe uses the grounding procedure eground
[Sch02] to generate ground instances of
such a problem, and the propositional decision system ZChaff
[MM+01] to determine the satisfiability of
the resulting propositional problem.
The set of all ground instances of a set of clauses is often too large
to compute with reasonable resources.
Therefore eground tries to find a smaller set of ground instances
that is still equiconsistent with the original set of clauses.
Three techniques are combined: clause splitting, structural constraints on
variable instantiation, and propositional simplification.
Clause splitting separates a clause into variable disjoint parts, and
replaces the orginal clause by new clauses formed from the parts and
link literals.
Structural constraints are used to prohibit variable instantiations that
would lead to the generation of pure literals in the resulting ground clauses.
Propositional simplification removes clauses and literals from the generated
ground clause set, while maintaining satisfiability.
For details about ZChaff, see [MM+01].
Despite the optimizations in eground, there are problems where
eground runs out of time or memory.
In these cases eground outputs an incomplete ground clause set.
If eground has output an incomplete clause set and ZChaff reports
that it is satisfiable, then no result is reported by GrAnDe.
If eground has generated an incomplete clause set and ZChaff
reports that it is unsatisfiable, or if eground has generated a
complete clause set, then ZChaff's result is reported by GrAnDe as the overall
result.
Implementation
eground is implemented in ANSI-C and based on the E
[Sch01] libraries.
It uses the basic infrastructure up to the shared term representation,
and adds a new, compact clause data type for the generated propositional
clauses.
Propositional unit-clauses, used for simplification, are represented by
a simple array entry, making unit subsumption and unit simplification into
O(1) operations (in the number of unit clauses).
Clause splitting is implemented naively, using a flood-fill algorithm to
find connected sub-clauses.
Finally, variable constraints are directly computed in conjunctive normal
form, and are used to directly construct possible instantiations.
For details about ZChaff, see [MM+01].
A Perl script called And is used to combine
eground and ZChaff.
The script invokes eground on the EPR problem, allowing it
maximally 66% of the CPU time limit.
The propositional clauses written by eground are captured
into a temporary file, which is used as the input for ZChaff.
ZChaff is allowed whatever CPU time has not been used by eground.
GrAnDe, including the separate components, is available from:
http://www.cs.miami.edu/~tptp/ATPSystems/GrAnDe/
It should be widely portable between recent UNIX variants.
Strategies
GrAnDe has only one mode of operation, i.e., there are no different
strategies for different types of problems.
The overhead for the different techniques in eground is small enough
that all optimizations can be applied in all cases without serious
degradation of performance.
Expected Competition Performance
GrAnDe is expected to outperform all systems that use only first-order
techniques.
References
- MM+01
- Moskewicz M., Madigan C., Zhao Y., Zhang L., Malik S. (2001),
Chaff: Engineering an Efficient SAT Solver,
Blaauw D., Lavagno L.,
Proceedings of the 39th Design Automation Conference
(Las Vegas, USA),
pp.530-535.
- SS02
- Schulz S., Sutcliffe G. (2002),
System Description: GrAnDe 1.0,
Voronkov A.,
Proceedings of the 18th International Conference on Automated
Deduction
(Copenhagen, Denmark),
To appear,
Lecture Notes in Artificial Intelligence,
Springer-Verlag.
- Sch01
- Schulz S. (2001),
System Abstract: E 0.61,
Gore R., Leitsch A., Nipkow T.,
Proceedings of the International Joint Conference on Automated
Reasoning
(Siena, Italy),
pp.370-375,
Lecture Notes in Artificial Intelligence 2083,
Springer-Verlag.
- Sch02
- Schulz S. (2002),
A Comparison of Different Techniques for Grounding
Near-Propositional CNF Formulae,
Haller S., Simmons G.,
Proceedings of the 15th Florida Artificial Intelligence Research
Symposium
(Pensecola, USA),
pp.72-76,
AAAI Press.
ICGNS 2002
William McCune
Argonne National Laboratory, USA
mccune@mcs.anl.gov
Architecture
ICGNS 2002 [McC02] searches for finite models
of first-order (unsorted, with equality) statements.
Given input clauses, it generates ground instances over a finite domain,
then uses a decision procedure try to determine satisfiability.
If there is no model of that size, it increments the domain size and
tries again.
The ICGNS method is SEM-like [ZZ95] rather
than MACE-like [McC01]; that is,
the ground problem is propositional/equality rather than flattened
and purely propositional.
The two main parts of the ICGNS method are (1) selecting the next empty
cell in the tables of functions being constructed and deciding which values
need to be considered for that cell, and (2) propagating assignments.
ICGNS uses the basic least number heuristic (LNH) to reduce isomorphism.
The LNH was introduced in Falcon [Zha96] and
is also used in SEM.
Effective use of the LNH requires careful cell selection.
Propagation is by ground rewriting and inference rules to derive negated
equalities.
Implementation
ICGNS 2002 is a new program, started in the winter of 2002.
It uses some pre-existing code from various sources for some
of the low level operations such as parsing and term structure.
ICGNS is coded in ANSI C and should be easily portable to recent
variants of UNIX.
The source code and test problems are available from:
http://www.mcs.anl.gov/home/mccune/icgns/
Strategies
Several strategies are available for cell selection and for applying the
negative inference rules.
A single strategy is used for CASC-18: (1) the LNH, selecting cells with
the fewest number of possible values, and (2) applying all negative
propagation inference rules.
The strategy was derived by experimenting on problems that
arose in mathematics practice, mostly in abstract algebra.
No tuning to the TPTP set has been done.
Expected Competition Performance
ICGNS seems to do well on equational problems, especially if the input set
contains some simple equations.
Satisfiable problems without reasonably sized finite models (including
problems without finite models) are out of reach of ICGNS.
Some of those problems can be done by satisfiability methods that work by
saturation with a complete inference system.
It is doubtful that ICGNS will do as well, overall, as programs that try
various methods and various strategies.
References
- McC01
- McCune W.W. (2001),
MACE 2.0 Reference Manual and Guide,
ANL/MCS-TM-249,
Argonne National Laboratory, Argonne, USA.
- McC02
- McCune W.W. (2002),
ICGNS,
http://www.mcs.anl.gov/~mccune/icgns.
- ZZ95
- Zhang J., Zhang H. (1995),
SEM: A System for Enumerating Models,
Mellish C.S.,
Proceedings of the 14th International Joint Conference on Artificial
Intelligence,
pp.298-303,
Morgan Kaufmann.
- Zha96
- Zhang J. (1996),
Constructing Finite Algebras with FALCON,
Journal of Automated Reasoning 17(1),
pp.1-22.
Acknowledgments: Olga Shumsky Matlin and Michael Rose assisted in the
development of ICNGS 2002.
Otter 3.2
William McCune
Argonne National Laboratory, USA
mccune@mcs.anl.gov
Architecture
Otter 3.2 [McC94] is an ATP system for
statements in first-order (unsorted) logic with equality.
Otter is based on resolution and paramodulation applied to clauses.
An Otter search uses the "given clause algorithm", and typically involves
a large database of clauses; subsumption and demodulation play an important
role.
Implementation
Otter is written in C.
Otter uses shared data structures for clauses and terms, and it uses
indexing for resolution, paramodulation, forward and backward subsumption,
forward and backward demodulation, and unit conflict.
Expected Competition Performance
Otter has been entered into CASC-18 as a stable benchmark against which
progress can be judged (there have been only minor changes to Otter since
1996 [MW97], nothing that really affects
its performace in CASC).
Otter's original automatic mode, which reflects no tuning to the TPTP
problems, will be used.
This is not an ordinary entry, and we do not hope for Otter to do well
in the competition.
References
- McC94
- McCune W.W. (1994),
Otter 3.0 Reference Manual and Guide,
ANL-94/6,
Argonne National Laboratory, Argonne, USA.
- MW97
- McCune W.W., Wos L. (1997),
Otter: The CADE-13 Competition Incarnations,
Journal of Automated Reasoning 18(2),
pp.211-220.
- McC02
- McCune W.W. (2002),
Otter: An Automated Deduction System,
http://www-unix.mcs.anl.gov/AR/otter/.
Acknowledgments: Ross Overbeek, Larry Wos, Bob Veroff, and
Rusty Lusk contributed to the development of Otter.
SCOTT 6.1
John Slaney
Australian National University, Australia
John.Slaney@anu.edu.au
Architecture
SCOTT [HS01,
HS02] consists of two main components:
a saturation-based theorem prover (Otter [McC02]
in fact) and a finite domain constraint solver
(FINDER [Sla94]) that generates models for
subsets of the clauses deduced.
The models are used to guide selection of given clauses, clauses which are
false in the guiding models being given preference.
Because model generation is computationally expensive, the critical issue
for performance is the tradeoff between search quality due to semantic
guidance and the time used in securing and refining it.
Successive versions of SCOTT have competed regularly in CASC.
Version 6.1 is little changed from version 6.0 which competed in 2001, except
for some amendments aimed at reducing the number of times a useless
model is generated, deleted and re-generated, and slight tuning of the
default values for parameters.
Implementation
The system is written in C, and consists of three main modules: the theorem
prover is Otter, the model generator is FINDER, and there is a relatively
small module linking the two into the combined system. The sources are
available from
http://arp.anu.edu.au/software/scott/
Strategies
The proof techniques are taken over directly from Otter, along with
that system's weight reduction strategies, pick-given ratio and the like.
The characteristic feature of SCOTT is semantic guidance, applied as a
heuristic for selection of given clauses.
This improves the quality of the choices overall, though the effect is far
from uniform across problems and the overheads incurred are severe.
Many problems from TPTP have been used in the development of the semantic
guidance strategy, but there has been little attempt to tune it more
specifically than that.
Expected Competition Performance
SCOTT 6.0 competed in 2001, and version 6.1 is essentially similar, so
we expect similar performance.
Much depends on the time limit, as there are over 1000 problems in TPTP
that the system proves in more than one minute and less than ten minutes
(taking timings on a Sun E250 UltraSPARC II, for instance) so it will
perform significantly better with a 600 second limit than with a 300
second one.
In the past it has done moderately well on UEQ problems and on essentially
propositional ones.
In the MIX division, however, it is hampered by the lack of a sufficiently
adaptive autonomous mode and will not threaten the leading provers.
We shall be interested to discover what it finds easy and what it finds
hard, in comparison with other systems.
References
- HS01
- Hodgson K., Slaney J.K. (2001),
System Description: SCOTT-5,
Gore R., Leitsch A., Nipkow T.,
Proceedings of the International Joint Conference on Automated
Reasoning
(Siena, Italy),
pp.443-447,
Lecture Notes in Artificial Intelligence 2083,
Springer-Verlag.
- HS02
- Hodgson K., Slaney J.K. (2002),
TPTP, CASC and the Development of a Semantically Guided Theorem
Prover,
AI Communications,
To appear.
- McC02
- McCune W.W. (2002),
Otter: An Automated Deduction System,
http://www-unix.mcs.anl.gov/AR/otter/.
- Sla94
- Slaney J.K. (1994),
FINDER: Finite Domain Enumerator, System Description,
Bundy A.,
Proceedings of the 12th International Conference on Automated
Deduction
(Nancy, France),
pp.798-801,
Lecture Notes in Artificial Intelligence 814,
Springer-Verlag.
Vampire 2.0-CASC
Alexandre Riazanov, Andrei Voronkov
University of Manchester, England
{riazanoa,voronkov}@cs.man.ac.uk
Architecture
Vampire 2.0 is an automatic theorem prover for first-order classical logic.
It implements the calculi of ordered binary
resolution, hyperresolution, and superposition for handling equality.
The splitting rule is simulated by introducing new
predicate symbols. A number of standard redundancy criteria and
simplification techniques are used for pruning the
search space: subsumption, tautology deletion, subsumption
resolution and rewriting by ordered unit equalities. The only
term ordering used in Vampire at the moment is a special version
of the Knuth-Bendix ordering which allows efficient
approximation algorithms for solving ordering constraints.
A number of efficient indexing techniques are used to implement all major
operations on sets of terms and clauses, such as an improved version
[RV00]
of code trees [Vor95] for forward
subsumption, and a combination
of path indexing [Sti89] and database
joins for backward subsumption.
Compared to Vampire 1.0 that participated in the previous competition,
this version has many more literal selection functions, more flexible
splitting without backtracking and improved memory management.
The automatic mode of Vampire 2.0 is primitive, it
recognises only very simple syntactic properties of the input
problem, like the presence of equality or non-Horn clauses.
In the preprocessing stage it exploits a number of primitive techniques,
such as elimination of simple predicate and function definitions.
Vampire 2.0-CASC is a version based on Vampire 2.0, specialised to
win the competition. To adjust the strategy it estimates the problem
by checking some syntactic properties, such as
presence of multiliteral, non-Horn and ground clauses, equations and
nonequational literals. Additionally we consider such quantitative
characteristics as the number of axioms, literals, small and large
terms.
Implementation
Vampire 2.0 is implemented in C++ and can be compiled
by 2.91 or newer versions of gcc. The binaries for Solaris and
Linux are available from the authors, for details see:
http://www.cs.man.ac.uk/~riazanoa/Vampire
References
- RV00
- Riazanov A., Voronkov A. (2000),
Partiallly Adaptive Code Trees,
Ojeda-Aciego M., de Guzman I., Brewka G., Pereira L.,
Proceedings of the 7th European Workshop on Logics in Artificial
Inteligence
(Malaga, Spain),
pp.209-223,
Lecture Notes in Artificial Intelligence 1919,
Springer-Verlag.
- Sti89
- Stickel M.E. (1989),
The Path Indexing Method for Indexing Terms,
Technical Note 473,
SRI International, Menlo Park, USA.
- Vor95
- Voronkov A. (1995),
The Anatomy of Vampire,
Journal of Automated Reasoning 15(2),
pp.237-265.
Vampire 5.0 and Vampire 5.0-CASC
Alexandre Riazanov, Andrei Voronkov
University of Manchester, England
{riazanoa,voronkov}@cs.man.ac.uk
Architecture
Vampire [RV01,
RV02] 5.0 is an automatic theorem prover
for first-order classical logic.
Its kernel implements the calculi of ordered binary resolution and
superposition for handling equality.
The splitting rule is simulated by introducing new predicate symbols.
A number of standard redundancy criteria and simplification techniques are
used for pruning the search space: subsumption, tautology deletion, subsumption
resolution and rewriting by ordered unit equalities.
The only term ordering used in Vampire at the moment is a special
non-recursive version of the Knuth-Bendix ordering which allows
efficient approximation algorithms for solving ordering constraints.
By the system installation deadline we may implement the standard
Knuth-Bendix ordering.
A number of efficient indexing techniques are used to implement all
major operations on sets of terms and clauses.
Although the kernel of the system works only with clausal normal
forms, the preprocessor component accepts a problem in the full
first-order logic syntax, clausifies them and performs a number of
useful transformations before passing the result to the kernel.
Implementation
Vampire 5.0 is implemented in C++.
The main supported compiler version is gcc 2.95.3, although in the nearest
future we are going to move to gcc 3.x.
The system has been successfully compiled for Linux and Solaris.
It is available from:
http://www.cs.man.ac.uk/~riazanoa/Vampire/
Strategies
The Vampire kernel provides a fairly large number of features
for strategy selection.
The most important ones are:
- Choice of the main saturation procedure : (i) OTTER loop, with or without
the Limited Resource Strategy, (ii) DISCOUNT loop.
- A variety of optional simplifications.
- Parameterised simplification ordering.
- A number of built-in literal selection functions and different modes of
comparing literals.
- Age-weight ratio that specifies how strongly lighter clauses are preferred
for inference selection.
The standalone executables for Vampire 5.0 and Vampire 5.0-CASC use very
simple time slicing to make sure that several kernel strategies are tried on
a given problem.
The automatic mode of Vampire 5.0 is primitive.
Seven problem classes are distinguished corresponding to the competition
divisions HNE, HEQ, NNE, NEQ, PEQ, UEQ and EPR.
Every class is assigned a fixed schedule consisting of a number of kernel
strategies called one by one with different time limits.
The automatic mode of Vampire 5.0-CASC is better tuned towards TPTP 2.4.0.
It uses the same problem characteristics as
E 0.7 to divide the problems into a large number of classes
and may potentially assign a separate strategy to every such class.
Expected Competition Performance
We expect Vampire 5.0 to perform much better than Vampire 2.0
on problems with equality, especially on the unit equality problems.
References
- RV01
- Riazanov A., Voronkov A. (2001),
Vampire 1.1 (System Description),
Gore R., Leitsch A., Nipkow T.,
Proceedings of the International Joint Conference on Automated
Reasoning
(Siena, Italy),
pp.376-380,
Lecture Notes in Artificial Intelligence 2083,
Springer-Verlag.
- RV02
- Riazanov A., Voronkov A. (2002),
The Design and Implementation of Vampire,
AI Communications,
To appear.
Waldmeister 601
Thomas Hillenbrand1, Bernd Löchner2,
Andreas Jaeger, and Arnim Buch
1Max-Planck-Institut für Informatik Saarbrücken, Germany,
2Universität Kaiserslautern, Germany
waldmeister@informatik.uni-kl.de
Architecture
Waldmeister is a system for unit equational deduction.
Its theoretical basis is unfailing completion in the sense of
[BDP89] with refinements towards
ordered completion. The prover saturates the input axiomatization in a repeated
cycle that works on a set of active resp. passive facts. The selection of the
reduction ordering and the heuristical guidance of the proof search are
described in [HJL99].
Since last year's version, stronger redundancy criteria have been integrated,
including ground joinability tests with ordering constraints on variables
[AHL00].
In several problem domains this technique is helpful especially
for harder proof tasks. Some restructuring of the prover is on the way to
also include an implementation of confluence trees with full ordering
constraints [AL01]; but further work
is necessary to have them speed up the proof search.
Implementation
The prover is coded in ANSI-C and available for SunOS, Solaris and Linux. The
set of active facts is represented by perfect discrimination trees.
Stimulated by the evaluation of indexing techniques reported on in
[NH+01], the implementation of these
has somewhat been improved. The storage of passive
facts, and the treatment of hypotheses have remained unchanged. The
Waldmeister Web page is located at:
http://www-avenhaus.informatik.uni-kl.de/waldmeister
References
- AHL00
- Avenhaus J., Hillenbrand T., Löchner B. (2000),
On Using Ground Joinable Equations in Equational Theorem
Proving,
Baumgartner P., Zhang H.,
Proceedings of the 3rd International Workshop on First Order Theorem
Proving
(St Andrews, Scotland),
pp.33-43.
- AL01
- Avenhaus J., Löchner B. (2001),
System Description: CCE: Testing Ground Joinability,
Gore R., Leitsch A., Nipkow T.,
Proceedings of the International Joint Conference on Automated
Reasoning
(Siena, Italy),
Lecture Notes in Artificial Intelligence,
Springer-Verlag.
- BDP89
- Bachmair L., Dershowitz N., Plaisted D.A. (1989),
Completion Without Failure,
Ait-Kaci H., Nivat M.,
Resolution of Equations in Algebraic Structures,
pp.1-30,
Academic Press.
- HJL99
- Hillenbrand T., Jaeger A., Löchner B. (1999),
Waldmeister - Improvements in Performance and Ease of Use,
Ganzinger H.,
Proceedings of the 16th International Conference on Automated
Deduction
(Trento, Italy),
pp.232-236,
Lecture Notes in Artificial Intelligence 1632,
Springer-Verlag.
- NH+01
- Nieuwenhuis R., Hillenbrand T., Riazanov A., Voronkov A. (2001),
On the Evaluation of Indexing Techniques for Theorem
Proving,
Gore R., Leitsch A., Nipkow T.,
Proceedings of the International Joint Conference on Automated
Reasoning
(Siena, Italy),
Lecture Notes in Artificial Intelligence,
Springer-Verlag.
Waldmeister 702
Thomas Hillenbrand1, Bernd Löchner2
1Max-Planck-Institut für Informatik Saarbrücken, Germany,
2Universität Kaiserslautern, Germany
waldmeister@informatik.uni-kl.de
Architecture
Waldmeister 702 is an implementation of unfailing Knuth-Bendix
completion [BDP89] with extensions
towards ordered completion (see [AHL00])
and basicness [BG+92,
NR92].
The system saturates the input axiomatization, distinguishing active facts,
which induce a rewrite relation, and passive facts, which are the one-step
conclusions of the active ones up to redundancy.
The saturation process is parameterized by a reduction ordering and a
heuristic assessment of passive facts.
Only recently, we have designed a thorough refinement of the system
architecture concerning the representation of passive facts
[HL02].
The aim of that work - the next Waldmeister loop - is, besides gaining
more structural clarity, to cut down memory consumption especially for
long-lasting proof attempts, and hence less relevant in the CASC setting.
Implementation
The system is implemented in ANSI-C and runs under Solaris and Linux.
The central data strucures are: perfect discrimination trees for the
active facts; element-wise compressions for the passive ones; and sets
of rewrite successors for the conjectures. Waldmeister can be found on
the Web at
http://www-avenhaus.informatik.uni-kl.de/waldmeister
Strategies
Our approach to control the proof search is to choose the search
parameters according to the algebraic structure given in the problem
specification [HJL99].
This is based on the observation that proof tasks sharing major parts of
their axiomatization often behave similar.
Hence, for a number of domains, the influence of different reduction orderings
and heuristic assessments has been analyzed experimentally; and in most cases
it has been possible to distinguish a strategy uniformly superior on the whole
domain.
In essence, every such strategy consists of an instantiation of the first
parameter to a Knuth-Bendix ordering or to a lexicographic path ordering, and
an instantiation of the second parameter to one of the weighting functions
addweight, gtweight, or mixweight, which, if
called on an equation s = t, return
s| + |t|,
>(s,t)|, or
>(s,t)| · (|s|
+ |t| + 1) + |s| + |t|,
respectively, where s| denotes the number of symbols
in s.
Expected Competition Performance
We expect Waldmeister 702 to be quite competitive.
But since our current restructuring of the prover to implement the concepts of
[HL02] may affect the inference behaviour,
the outcome in comparison to last year's version is difficult to predict.
References
- AHL00
- Avenhaus J., Hillenbrand T., Löchner B. (2000),
On Using Ground Joinable Equations in Equational Theorem
Proving,
Baumgartner P., Zhang H.,
Proceedings of the 3rd International Workshop on First Order Theorem
Proving
(St Andrews, Scotland),
pp.33-43.
- BDP89
- Bachmair L., Dershowitz N., Plaisted D.A. (1989),
Completion Without Failure,
Ait-Kaci H., Nivat M.,
Resolution of Equations in Algebraic Structures,
pp.1-30,
Academic Press.
- BG+92
- Bachmair L., Ganzinger H., Lynch C., Snyder W. (1992),
Basic Paramodulation and Superposition,
Kapur D.,
Proceedings of the 11th International Conference on Automated
Deduction (Saratoga Springs, USA),
pp.462-476,
Lecture Notes in Artificial Intelligence 607,
Springer-Verlag.
- HJL99
- Hillenbrand T., Jaeger A., Löchner B. (1999),
Waldmeister - Improvements in Performance and Ease of Use,
Ganzinger H.,
Proceedings of the 16th International Conference on Automated
Deduction
(Trento, Italy),
pp.232-236,
Lecture Notes in Artificial Intelligence 1632,
Springer-Verlag.
- HL02
- Hillenbrand T., Löchner B. (2002),
The Next Waldmeister Loop,
Voronkov A.,
Proceedings of the 18th International Conference on Automated
Deduction
(Copenhagen, Denmark),
To appear,
Lecture Notes in Artificial Intelligence,
Springer-Verlag.
- NR92
- Nieuwenhuis R., Rivero J.M. (1992),
Basic Superposition is Complete,
Krieg-Brückner B.,
Proceedings of the 4th European Symposium on Programming
(Rennes, France),
pp.371-390,
Lecture Notes in Computer Science 582,
Springer-Verlag.