Bliksem 1.10
H. de Nivelle
Max-Planck-Institut für Informatik
nivelle@mpi-sb.mpg.de
Architecture
Bliksem implements the ordered resolution + superposition calculus
[BG90,DeN94,Rob65].
It supports many different orders,
including reduction orders, A-orders,
and non-liftable orders. Special attention has been given to resolution
strategies that provide decision procedures for certain subsets
of first order logic [FL+93,DeN98].
Bliksem is able to transform first order formulae into clausal normal form,
using different structural or non-structural clause transformations.
Bliksem 1.10 improves over Bliksem 1.00 by having a posteriori orders,
(which will not be used during the competition), equality factoring +
tautology elimination, optimized normal form transformations,
non-unit demodulators, and equality subsumption (which takes into account
commutativity of =)
In the automode Bliksem 1.10 is able to recognize decidable classes. However these are unlikely to show up in the competition. Apart from that, the automode is still quite primitive. We don't know how to select the right strategy + order.
Bliksem 1.00 has had soundness problems. We believe that there is no way in which the correctness of a program with the complexity of Bliksem can be guaranteed. We do not give any such guarantees for Bliksem 1.10. Instead Bliksem 1.10 is able to output its proofs with such a level of specification that they can be easily checked by another computer program. In the future we plan to have the proofs checked by a type checker, namely COQ [BB+96].
http://www.mpi-sb.mpg.de/~nivelle
E 0.5
S. Schulz
Institut für Informatik, Technische Universität München
schulz@informatik.tu-muenchen.de
Architecture
E [Sch99] is a purely equational theorem prover.
It was not originally designed as an independent prover, but as an
preprocessor implementing the bottom-up part of the METOP calculus,
with SETHEO [MI+97] to implement the
top-down part in a future combined system.
The calculus used by E combines superposition [BG94] (with selection of negative literals) and rewriting. For the special case that only unit clauses are present, this calculus degenerates into unfailing completion [BDP89]. For the Horn case, E can simulate variants of the positive unit strategy described in [Der91]. No special rules for non-equational literals have been implemented, i.e., resolution is simulated via paramodulation and equality resolution.
Proof search in E is controlled by a literal selection strategy and an evaluation heuristic. A variety of selection strategies (including e.g., no selection, or select all negative literals) is available. Evaluation heuristics can be constructed on the fly by combining various parameterized primitive evaluation functions. We also aim at including some learning heuristics into the prover by the competition date. A simple automatic mode can select strategy and heuristic based on simple problem characteristics.
The program has been successfully installed under SunOS 4.3.x, Solaris 2.x, HP-UX B 10.20, and various versions of Linux. Sources are available freely from:
http://wwwjessen.informatik.tu-muenchen.de/~schulz/WORK/eprover.html
E-SETHEO 99csp and E-SETHEO-FLOTTER 99csp
R. Letz, S. Schulz, G. Stenz, A. Wolf
Institut für Informatik, Technische Universität München
{letz,schulz,stenzg,wolfa}@informatik.tu-muenchen.de
Architecture
E-SETHEO is a compositional theorem prover for formulae in first order
clause logic, combining the systems E [Sch99]
and SETHEO [MI+97].
It incorporates different calculi and proof procedures
like superposition, model elimination and semantic trees (DPLL
procedure). Furthermore, the system contains transformation techniques
which may split the formula into independent subparts or which may
perform a ground instantiation.
Finally, advanced methods for controlling and optimizing the combination
of the subsystems are applied.
E-SETHEO-FLOTTER includes the program FLOTTER [WGR96] as a preprocessing module for transforming non-clausal formulae to clausal form.
On the top level, the system performs a parallelization of different strategies stemming from the system P-SETHEO [Wol98a]. In contrast to P-SETHEO, in the version 99csp of E-SETHEO, the different strategies are run sequentially, one after the other, depending on the allocation of resources to the different strategies, so-called schedules. A schedule is determined from rough syntactic characteristics of the input formula (e.g., if it is a Horn formula without equality). The schedules have been computed from a large number of experimental data using a combination of a genetic algorithm [SW99] and a gradient procedure [Wol98b].
The program runs under Solaris 2.6. Sources are available freely.
The model construction technique is similar to the one of the
predecessor calculus "Hyper Tableaux - The Next Generation"
described in [Bau98]. Since there is no paper
on FDP yet, this is the only relevant publication so far.
Proof search in FDP can be semantically guided by supplying an
"initial interpretation", and proof search means to modify this
initial interpretation until success. By default, a "trivial"
initial interpretation is used that assign true to all positive
literals. This emphasizes the role of negative clauses.
There is no built-in equality-handling (yet!) and so far FDP is
restricted to clausal logic without equality. Currently, equality is
treated in a preprocessing transformation similar to Brand's
well-known STE-transformation.
The current implementation is available through the WWW at:
Comments etc. are highly welcome!
In the MIX division, I expect better performance for non-equality
problems than for equality problems, and better performance for
non-Horn problems than for Horn problems.
FDP decides function-free clause logic. Many SAT problems are of this
kind. Therefore, FDP participates in the SAT division as well.
The main characteristic of this system is that it does not adjust any
system parameters according to the given axiom set or conjecture.
This is of course not realistic w.r.t. real-world applications. But
we want to make the point that we strongly believe that this is also
the case for the other extreme, i.e., applying specific strategies
(for very small problem sets), discovered by studying proofs obtained
by running (sometimes several) provers on the competition problem set
without any time limit. This of course gives no real information
either about the behaviour of a prover in a five-minutes run on a new
problem.
The following are the fixed settings used at the competition (but of
course this does not mean that the prover includes no other
orderings or selection strategies;
all system parameters can be set in a flexible way):
Fiesta always uses a lexicographic path ordering with a precedence
where symbols are first compared by arity (a larger arities means
larger in the precedence) and second by order of appearance in the
input.
The selection of the next rule used for superposition consists
of picking every three cycles once the oldest one, and twice
the smallest one according to the size measure where symbols
are counted as their arity plus one, and variables as one.
The system has a self-adjusting policy for deleting too large equations:
initially the weight limit is small (27), but each time
completion terminates without a proof, it restarts with the limit
incremented by four (hence this policy is not a source of
incompleteness).
If a proof is found, it is output in a format readible by our
Prolog proof checker.
Terms sometimes carry additional marks used for indicating
that they are in normal form, or as an index in an LPO-cache
that avoids the repeated comparision of the same
terms with LPO.
This is quite useful for rewriting with rules that are
not orientable a priori, which is done eagerly.
We believe however, that the performance of Fiesta
at the competition will give some evidence for the fact
that with such a table we could prove most if not all
problems.
The basic idea of the automatic mode in Gandalf - also used during
the CASC-14 competition - is time-slicing: Gandalf selects a set of
different search strategies, allocates time to these and finally runs
the strategies one after another.
The selection of strategies and the percentage of time they receive is
in a somewhat ad hoc way based on the characteristics of the problem:
percentage of Horn and almost-Horn clauses, the number of clauses,
percentage of clauses where literals can be ordered using term and
variable depths.
There exist also versions of Gandalf for intuitionistic logic
and type theory, see [Tam96] and
[TS96].
More information about Gandalf can be found in
[Tam98].
The source, binaries, manual and other materials are available at:
Gandalf is likely to do well in the UEQ, SAT and FOF categories,
although it is unlikely it will win any of these.
The model generation method is a new method
[Zh99] which represents
models by rewrite rules. SATO
[Zh97], an implementation of the
Davis-Putnam method, is also built into HOTTER as a decision procedure
for the propositional logic.
For the SAT problems, McCune's MACE
[McC98b] is used to test if the
input clauses have small finite models. If it does not have, HOTTER
takes it over.
Methods are those used by humans, and are executed according to a plan
decided by metarules and heuristics.
Methods include splitting theorems into two or more subtheorems
(conjonctive conclusion or disjunctive hypotheses),
treatment of universal, existential or disjunctive conclusion,
treatment of existential or disjunctive hypotheseses (one at a time when it is
decided (by heuristic rules),
replacement of the conclusion by its definition.
Hypotheses are never replaced by their definition.
Instead, rules are built by metarules from definitions and used to deduce
new hypotheses from old ones.
Universal hypotheses are not added. Instead, rules are built as from the
definitions.
Functional symbols may be used, but an automatic creation of intermediate
objects allows to flatten deep subformulae and treat them as if concepts
were defined by predicate symbols.
Positive properties are privileged, in hypotheses and in conclusion.
So, a special treatment of negation is done.
The successive steps of a proof may be forward deduction (deduce new
hypotheses from old ones), backward deduction (replace the conclusion by
a new one) or refutation (only if the conclusion is a negation).
The system is also able to work with second order statements.
It may also receive knowledge and know-how for a specific domain from a human
user; see [Pas89] and
[Pas93].
These two possibilities are not used while working with the TPTP Library.
Unlike previous versions [Sla93,SLM94], the
current
version of SCOTT searches for maximally consistent subsets much more
aggressively, and manages the cost of this much more intelligently.
Most importantly, however, our naïve notions of semantic guidance
have been superseded by those with genuine theoretical support. In
addition, semantic resolution has been tentatively implemented and,
provided it can be made sufficiently stable, will be incorporated in
the competition version.
The choice of what syntactic characteristics are used to form the SPCs
is based on community input and analysis of system performance data.
Performance data for the available ATP systems, for problems in the
TPTP [SS98], is used to evaluate the
systems within each SPC.
SSCPA has five modes of operation, which differ in terms of systems
chosen, CPU allocation to each system, CPU priority for each system,
and start time for each system.
In CASC-16 SSCPA will run the eager mode, which uses three or
four recommended systems, CPU allocation halving from the time limit,
equal CPU priority, and all systems starting together.
The splitting rule splits clauses, if possible,
into variable disjoint parts that can be independently refuted.
This rule is not necessary for completeness of the calculus but it makes
SPASS an efficient decision procedure for several decidable subclasses
of first-order logic [HS97].
We put a lot of effort in a clause normal form translation that
produces small clause normal forms.
The major techniques are renaming and improved Skolemization
[NRW98].
The SPASS distribution containing the source code for SPASS is freely
available from the SPASS homepage. The code should compile and run on
any machine where the usual GNU tools are available. This has been
tested on SUN Sparc workstations under Solaris and SunOS, on personal
computers under Linux and Solaris and on Dec Alpha workstations under
Digital Unix.
The prover is available for many platforms, in particular, we now also provide
a Windows 95/98/NT version including a neat graphical user interface.
The main completion loop is parameterized by a selection heuristic and
a reduction ordering. Both are derived from the algebraic structure
described by the axioms. Since last year, the derivation process has
been lifted such that control knowledge is expressed declaratively,
which eases the integration of human experience. For full flexibility,
the search parameters may be re-instantiated during the proof search.
Recently, we also realized a database-driven mechanism which allows
simple extensions of the underlying inference calculus, e.g. for
reasoning on Horn clauses as described in
[BDP89:pp.24ff].
To incorporate goal orientation into the completion procedure, the
hypotheses are blown up to sets of rewrite successors and sometimes
predecessors. A step-by-step proof object is constructed at run-time.
Our ATP system is implemented in ANSI-C and runs under Sun-OS, Solaris
and Linux. The Waldmeister distribution contains an easy-to-use
version along with documentation material and problem sets. If you are
interested in tackling problems with Waldmeister, just send an
e-mail to the authors.
This year's version of our unit equality theorem prover is in most parts
identical to Waldmeister 798 that won the CASC-15 UEQ competition with
an enormous lead [SS99]. Since
last year's description is included in
this collection we will only report on the differences here.
Recently, Waldmeister has successfully been integrated into the Omega proof
development environment [BC+97].
Moreover, we have established a convenient WWW interface that is
reachable via:
FDP 0.9
P. Baumgartner
Institute for Computer Science, University of Koblenz-Landau
peter@uni-koblenz.de
Architecture
FDP is a complete theorem prover for first order clausal logic. The
underlying calculus is a generalisation of the well-known Davis-Putnam
procedure. The generalisation is the splitting wrt. complementary
literals, which may contain variables.
Implementation
FDP is written as an interpreter in Eclipse Prolog. The implementation
is highly experimental, preliminary and slow. Some, but not enough
effort has been spent on clause selection for the next inference.
This step is costly, but poorly implemented. The next implementation
will be significantly faster by using indexing techniques.
http://www.uni-koblenz.de/~peter/FDP/
Expected Competition Performance
No chance to seriously compete with the well-known, leading
systems. I am optimistic that the situation will improve one day with
a better implementation and when knowing the calculus better.
References
Fiesta 2
Robert Nieuwenhuis
Technical University of Catalonia (UPC)
roberto@lsi.upc.es
Architecture
Fiesta is essentially an implementation of
unfailing Knuth-Bendix completion with the basicness restriction.
The marks indicating subterms blocked for inferences are removed
in rules that are used in backward demodulation, but not in
forward demodulation (this is a potential source of incompleteness).
Implementation
The system is implemented on top of our collection Dedam of
data structures, which provide terms with all their operations
(input-output, matching, unification, etc.), as well as substitution
trees for efficient many-to-one matching and unification.
Expected Competition Performance
We are perfectly aware of the fact that this system cannot
compete with other participating systems that look up
in a table the right strategy for for a class
of problems. In our opinion, such a table may be crucial
for winning the competition, but it is probably far less useful
for real-world problems and,
even if we had the manpower for doing so,
we would not want to spend
the impressive amount of time needed for building it.
References
Gandalf c-1.8c
T. Tammet
Department of Computer Science, University of Göteborg
tammet@cs.chalmers.se
Architecture
Gandalf implements a number of different basic strategies: binary
ordered resolution for several orderings, versions of set-of-support
resolution, binary unit resolution, hyperresolution. Enhancements are
used for cutting off literals and combining different strategies, in
particular forward and backward reasoning, into one run. Equality is
handled by ordered paramodulation and demodulation.
Implementation
Gandalf is implemented in Scheme. We are using the scm
interpreter developed by A. Jaffer and the Scheme-to-C compiler
Hobbit developed by T. Tammet for the scm system.
http://www.cs.chalmers.se/~tammet/gandalf/
Expected Competition Performance
Since Gandalf won the MIX class of CASC-14, we expect Gandalf to be one of
the top performers in the MIX category during CASC-15. With some luck it
may even win the MIX category again.
References
HOTTER 0.0
H. Zhang
Department of Computer Science, University of Iowa
hzhang@cs.uiowa.edu
Architecture
HOTTER stands for Humble OTTER, an extension of OTTER 3.0.5
[McC94,McC98a,MW97] for experimenting
ordered resolution and model generation.
HOTTER inherits most inference rules and overall proof search
structure of OTTER, except that ordered binary resolution is used
hyperresolution is used usually in OTTER. Like SCOTT
[HS99], a method
for building models of a partial set of clauses is built into
HOTTER in order to determine consistent subsets of the clause set.
These models are used to guide the proof search in a hope to get the
proof or show its nonexistence quicker.
Implementation
Like OTTER and SATO, HOTTER is written in C and use the same input as
OTTER. The plan has been made to use the built-in SATO for the search
of small size finite models in the future release of HOTTER.
Expected Competition Performance
HOTTER has not been fully tested over TPTP problems yet. It's expected
the performance of HOTTER is different from OTTER when hyperresolution plays
an major role in the proof of OTTER. Overall, HOTTER's performance should
be comparable to that of OTTER, with the former being more efficient
on near-propositional non-Horn problems.
References
Acknowledgments: Without Bill McCune's OTTER, there
would be no HOTTER.
MUSCADET v2.1
D. Pastre
Université René Descartes (Paris 5), UFR de Mathématiques et Informatique
pastre@math-info.univ-paris5.fr
Architecture
MUSCADET is a theorem prover based upon Natural Deduction, in the sense of
[Ble71] and
[Pas78].
It is built as a knowledged-based system, that is, methods are expressed
by rules interpreted by an inference engine.
Some rules are universal mathematical knowledge and are given to the system.
Other rules are specific to some concepts and are built by metarules from the
definitions of these concepts.
The notions of hypothesis and conclusion to be proved are privileged.
The first conclusion is the first order statement of the conjecture.
Implementation
The first version of MUSCADET was a real rule-based system
[Pas89].
Rules and metarules were written in a declarative language defined for
this and interpreted by an inference engine written in Pascal.
The competition version is entirely written in SWI-Prolog.
Rules are written as declarative Prolog clauses.
Metarules are written as sets of Prolog clauses, more or less declarative.
The Pascal inference engine is replaced by the Prolog interpreter and some
procedural Prolog clauses.
Expected Competition Performance
The best performances of MUSCADET will be for problems manipulating many
concepts in which all statements (conjectures, definitions, axioms)
are expressed in a manner similar to the practice of humans, especially of
mathematicians.
It will have poor performances for problems using few concepts but large and
deep formulas leading to many splittings.
References
Otter-MACE 437
W. McCune
Mathematics and Computer Science, Argonne National Laboratory
mccune@mcs.anl.gov
Architecture
Otter [McC94,McC98a,MW97] searches for
proofs, and MACE [McC98b] searches for
small finite models. Both programs
apply to statements in first-order (unsorted) logic with equality. The
two programs are independent and work by completely different
methods. (In practice, the programs are complementary.) Version 3.0.5
of Otter has been used. Otter is based on resolution and
paramodulation applied to clauses; an Otter search uses the "given
clause algorithm" and typically involves a large database of clauses;
subsumption and demodulation play an important role. Version 1.3.2 of
MACE has been used. Roughly speaking, MACE works as follows. For a
given domain size, an equivalent propositional problem is constructed
and a Davis-Putnam-Loveland decision procedure is applied. MACE
typically cannot handle clauses with many variables or deeply nested
terms. Otter-MACE 437 (= Otter 3.0.5 + MACE 1.3.2) is a simple
combination of the two programs.
Implementation
Both programs are written in C. Otter uses shared data structures for
clauses and terms, and it uses indexing for resolution,
paramodulation, forward and backward subsumption, forward and backward
demodulation, and unit conflict. MACE is much simpler than Otter; its
Davis-Putnam-Loveland procedure uses a counting algorithm to do the
unit resolutions, and forward subsumption is not applied.
Otter-MACE is a shell script that runs Otter for a few seconds,
then (if Otter has not found a refutation) MACE for a few seconds,
then (if MACE has not found a model) Otter again for the remaining time.
Expected Competition Performance
These versions of Otter and MACE are essentially the same as the
versions entered in the previous two CASC competitions. Both programs
generally perform well at CASC, but no top prizes have been won since
CASC-13. We do not expect any winning performances in CASC-16. We
see our entry as a benchmark for measuring improvements in other
systems.
References
Acknowledgments: Ross Overbeek, Larry Wos, Bob Veroff,
and Rusty Lusk contributed to the development of Otter. MACE was motivated
by previous work of John Slaney, Mark Stickel, and Hantao Zhang.
SCOTT IV
K. Hodgson and J. Slaney
Automated Reasoning Project, Australian National University
{kahlil, jks}@arp.anu.edu.au
Architecture
SCOTT (Semantically Constrained OTTER) [HS99]
is an extension
of OTTER [McC94] and uses the same inference
rules and overall
proof search structure. This extension incorporates FINDER
[Sla94] (a finite model generator) as a
subsystem of OTTER in
order to determine (maximally) consistent subsets of the clause space.
These maximally consistent subsets are used to guide the proof search
by allowing us to increase the likelihood of certain necessary
inferences being made.
Implementation
SCOTT is written in C. It uses the existing program OTTER to
perform the proof search and additional code to determine which
clauses should be chosen for inference. This additional code calls
the existing program FINDER in order to witness the consistency of
particular sets of clauses.
Expected Competition Performance
In principle SCOTT should perform at least as well as OTTER across
all parts of the MIX and UEQ divisions, if only because it has OTTER
as a subroutine. Preliminary results show that, when SCOTT does
find a proof, it requires at least an order of magnitude fewer
inferences than OTTER. However, searching for models and testing
clauses against them can be costly, and so SCOTT may perform poorly
if the time-limit is too severe. Overall, we expect good performance
in comparison with cognate systems that employ relatively naïve
guidance.
References
SSCPA 1999.6
G. Sutcliffe
Department of Computer Science, James Cook University
geoff@cs.jcu.edu.au
Architecture
SSCPA [SS99a] is an uncooperative multiple
calculus competition
parallelism ATP system, that multitasks on a single CPU.
SSCPA uses the syntactic characteristics of the given problem to classify
it into one of 18 predefined specialist problem classes (SPCs).
SSCPA then uses performance data from available sequential ATP
systems to recommend those systems that perform well for that SPC.
The recommended systems are run in parallel, in one of several user
selectable modes.
Implementation
SSCPA uses perl} scripts that extract system recommendations from
the performance data anto d control the parallel execution of the systems.
The tptp2X utility is used to convert the problem to the formats
expected by the ATP systems.
A WWW interface to some SSCPA modes is publicly available at:
http://www.cs.jcu.edu.au/cgi-bin/tptp/SystemOnTPTPFormMaker
Expected Competition Performance
SSCPA is expected to perform well in the more heterogeneous divisions,
i.e., MIX and FOF.
The benefits of SSCPA's archtecture are less pronounced in divisions
where single systems could dominate, e.g., UEQ, where Waldmesiter
may dominate as it did in CASC-15 [SS99b].
References
SPASS 0.95TPTP
C. Weidenbach
Max-Planck-Institut für Informatik
weidenb@mpi-sb.mpg.de
Architecture
SPASS [Wei97] is an automated theorem
prover for full sorted first-order logic with equality.
Its calculus is an extension of superposition by sorts
[Wei98] and a splitting rule for case analysis.
In contrast to many approaches to order-sorted clausal reasoning,
the calculus enables sort predicates and equations
to occur arbitrarily within clauses. Therefore, the sort theory is not
separated from the problem clauses, but automatically
and dynamically extracted [GMW97].
Nevertheless, it can be used to significantly cut the search space.
Implementation
SPASS is written in ANSI-C and relies on a generic library that supports
data structures and efficient functions for automated theorem proving.
Expected Competition Performance
We hope to improve on the performance of SPASS at CASC-13 and CASC-14.
References
SPASS 1.0.0TPTP
C. Weidenbach
Max-Planck-Institut für Informatik
weidenb@mpi-sb.mpg.de
Architecture
SPASS [WA+99] is an automated theorem prover
for first-order logic with equality.
It is a saturation based prover employing superposition, sorts and
splitting [Wei99]. The current
version 1.0.0 offers a variety of further inference and reduction rules
including hyper resolution, unit resulting resolution, various variants
of paramodulation and a terminator [AO83].
Implementation
SPASS is implemented in ANSI C and relies on our internal
library supporting specific data structures and algorithms like, e.g., indexing
or orderings (KBO, RPOS). The code is documented, flexible and was
designed to be reusable. The code should compile on any platform
supporting an ANSI C compiler.
Expected Competition Performance
The SPASS competition version 1.0.0TPTP is tuned towards
TPTP problems. In particular, the standard automatic configuration module
is replaced by a module adapted to the TPTP. We hope that the new
version can improve over the version 0.95TPTP that entered the previous
competition.
References
Vampire 0.0
A. Riazanov, A. Voronkov
Computer Science Department, The University of Manchester
{riazanoa,voronkov}@cs.man.ac.uk
Architecture
Vampire is a resolution-based theorem prover for first-order classical
logic. The current version implements ordered binary resolution and
hyperresolution, superposition and some simplifications. For proof
search Vampire uses a saturation-based procedure similar
to that of Otter [McC94].
One interesting feature of Vampire is the so-called
limited resource strategy.
The idea is as follows. When a time limit is given, Vampire makes estimations
on what clauses (called currently eligible clauses) can be processed
by the end of the time limit. For each newly generated clause it is
checked whether it is currently eligible. If it is not, Vampire discards the
clause. The estimation principle takes into consideration the current
strategy and tries to calculate how the process of clause generation will
develop by the time limit.
Implementation
To gain efficiency we implement the most costly set-at-a-time
operations, such as resolution, superposition and subsumption,
using combination of indexing and compilation.
We use a variant of code trees [Vor95]
for forward subsumption and forward demodulation,
path indexing [Sti89] for backward subsumption
and a variant of discrimination trees [McC92]
for resolution and superposition.
All used indexing techniques provide so-called "perfect filtering"
[Graf96,RSV99].
For storing persistent clauses we use tree-like representation of
terms. The sharing is agressive. For other operations we use different
versions of flatterms.
Vampire is implemented in C++ and compiled by gcc.
Expected Competition Performance
We expect Vampire to perform well, especially on the problems without equality.
References
Waldmeister 798
T. Hillenbrand, A. Jaeger, B. Löchner, and A. Buch
FB Informatik, Universität Kaiserslautern
{hillen,jaeger,loechner}@informatik.uni-kl.de,arnim.buch@sdm.de
Architecture
Waldmeister is a theorem prover for unit equational logic. Its
proof procedure is unfailing Knuth-Bendix completion
[BDP89].
We started the development from a three-level model of saturation-based
ATP systems [BH96,
HB+96]. Since the efforts
of the last years have led to a satisfactory inference machine, we now
focus on the question how to guide the proof search.
Implementation
Reaching efficiency both in terms of time and space has been a major
goal in the development of our inference machine. As to the first,
fast rewriting is achieved by sophisticated indexing techniques. As to
the second, the set of critical pairs is represented in a compressed
data structure.
Expected Competition Performance
We conjecture that Waldmeister will solve the majority of the
competition problems in less time than one would need to study the
corresponding problem specifications.
References
Waldmeister 799
T. Hillenbrand, A. Jaeger, B. Löchner, and A. Buch
FB Informatik, Universität Kaiserslautern
{hillen,jaeger,loechner}@informatik.uni-kl.de,arnim.buch@sdm.de
Architecture
Waldmeister is an implementation of unfailing Knuth-Bendix
completion [BDP89].
Our approach to control the proof search is
to choose selection strategy and reduction ordering according to the
algebraic structure given in the problem
specification [HJL99].
Since Waldmeister 798, the
corresponding part of the system has been developed further into a robust
component that is easy to maintain. In addition we have systematized
the derivation of useful control knowledge, which also gave rise to
progress on some problem domains.
http://www-avenhaus.informatik.uni-kl.de/waldmeister
From
this URL further documents and the complete Waldmeister
distribution containing an easy-to-use version together with problems
and a manual are available.
Implementation
No major changes took place. Some portions of code have been slightly
revised, e.g., a number of time-consuming inner loops have carefully
been reworked.
Expected Competition Performance
We expect some improvements in comparison with last year's version,
depending on problem domains and proof tasks.
References