Entrants' System Descriptions


CVC4 0.0

Andrew Reynolds1, Cesare Tinelli1, Clark Barrett2
1University of Iowa, USA, 2New York University, USA

Architecture

CVC4 is a Satisfiability Modulo Theories (SMT) solver with support for many theories including linear arithmetic, arrays, bit vectors, and datatypes. Like most SMT solvers, it is based on the DPLL(T) framework [NOT06]. The latest development version of CVC4 supports reasoning for first-order formulas, including a finite model finding feature which can answer satisfiable in the presence of quantified (universal) formulas. Using a theory solver for EUF with cardinality constraints, the system finds a minimal candidate model with respect to uninterpreted sorts, that is, one with a minimal number of ground equivalence classes. When such a candidate model is found, the system will build interpretations for all uninterpreted functions and predicates, using heuristics for defining default values. Consistency is then checked by exhaustively instantiating all quantified formulas with ground instances that may falsify the candidate interpretation. The candidate interpretation is then refined according to these instantiations. This process is repeated until a fixed point is reached, at which point the candidate interpretation specifies a model.

Strategies

The EUF with cardinality constraints theory solver searches for candidate models for a sort S incrementally, i.e. it first postulates that the cardinality of S = 1, and if it fails will postulate the cardinality of S = 2, etc. Given cardinality constraint k for S, it maintains a disequality graph induced by the ground constraints between terms of sort S. The solver attempts to merge equivalence classes of type S using splitting on demand [BN+06] until no further equivalence classes can be merged. The solver will guide the search by adding conflict lemmas of the form ( C => card(S)>k ), where C is the explanation of a clique of size k+1 in the disequality graph of S. The strategies for model completion and quantifier instantiation are very preliminary at this point.

Implementation

CVC4 is implemented in C++. Much of the code uses context dependent data structures, whose values automatically update when the DPLL(T) search backtracks. The finite model finding module maintains discrimination-tree-like data structures for storing candidate interpretations for functions and predicates, as well as the set of instantiations produced. Since the CVC4 parser does not support the TPTP syntax, the system relies on a translation from TPTP to SMT2.

Expected Competition Performance

The finite model finding feature in CVC4 is still in development. Much can be done to improve the strategy for producing instantiations, including recognizing when certains axioms do not apply. Currently, the bottleneck for CVC4+finite model finding is in processing the large number of instantiations produced. This problem is made worse by the fact that CVC4 currently has no way of removing lemmas once they are added to the system. Preliminary results on TPTP problems are favorable compared to other SMT solvers, which however are not generally equipped to find models for universal formulas.


E-Darwin 1.5

Björn Pelzer
University Koblenz-Landau, Germany

Architecture

E-Darwin 1.5 [BFT04,BPT11] is an automated theorem prover for first order clausal logic with equality. It is a modified version of the Darwin prover [BFT04], intended as a testbed for variants of the Model Evolution calculus [BT03]. Among other things it implements several different approaches [BT05,BPT11] to incorporating equality reasoning into Model Evolution. Three principal data structures are used: the context (a set of rewrite literals), the set of constrained clauses, and the set of derived candidates. The prover always selects one candidate, which may be a new clause or a new context literal, and exhaustively computes inferences with this candidate and the context and clause set, moving the results to the candidate set. Afterwards the candidate is inserted into one of the context or the clause set, respectively, and the next candidate is selected. The inferences are superposition-based. Demodulation and various means of redundancy detection are used as well.

Strategies

The uniform search strategy is identical to the one employed in the original Darwin, slightly adapted to account for derived clauses.

Implementation

E-Darwin is implemented in the functional/imperative language OCaml. Darwin's method of storing partial unifiers has been adapted to equations and subterm positions for the superposition inferences in E-Darwin. A combination of perfect and non-perfect discrimination tree indexes is used to store the context and the clauses. The system has been tested on Unix and is available under the GNU Public License from
    http://userpages.uni-koblenz.de/~bpelzer/edarwin

Expected Competition Performance

After some bugfixes the performance should be slightly better than last year. While the original Darwin performs strongly in EPR, E-Darwin is more of a generalist, less effective in EPR, yet stronger in the other divisions.


E-KRHyper 1.3

Björn Pelzer
University Koblenz-Landau, Germany

Architecture

E-KRHyper [PW07] is a theorem proving and model generation system for first-order logic with equality. It is an implementation of the E-hyper tableau calculus [BFP07], which integrates a superposition-based handling of equality [BG98] into the hyper tableau calculus [BFN96]. The system is an extension of the KRHyper theorem prover [Wer03], which implements the original hyper tableau calculus.

An E-hyper tableau is a tree whose nodes are labeled with clauses and which is built up by the application of the inference rules of the E-hyper tableau calculus. The calculus rules are designed such that most of the reasoning is performed using positive unit clauses. Splitting is done without rigid variables. Instead, variables which would be shared between branches are prevented by ground substitutions, which are guessed from the Herbrand universe and constrained by rewrite rules. Redundancy rules allow the detection and removal of clauses that are redundant with respect to a branch. The hyper extension inference from the original hyper tableau calculus is equivalent to a series of E-hyper tableau calculus inference applications. Therefore the implementation of the hyper extension in KRHyper by a variant of semi-naive evaluation [Ull89] is retained in E-KRHyper, where it serves as a shortcut inference for the resolution of non-equational literals.

Strategies

E-KRHyper uses a uniform search strategy for all problems. The E-hyper tableau is generated depth-first, with E-KRHyper always working on a single branch. Refutational completeness and a fair search control are ensured by an iterative deepening strategy with a limit on the maximum term weight of generated clauses. In the LTB division E-KRHyper sequentially tries three axiom selection strategies: an implementation of Krystof Hoder's SInE algorithm, another incomplete selection based on the CNF representations of the axioms, and finally the complete axiom set.

Implementation

E-KRHyper is implemented in the functional/imperative language OCaml. The system accepts input in the TPTP-format and in the TPTP-supported Protein-format. The calculus implemented by E-KRHyper works on clauses, so first order formula input is converted into CNF by an algorithm similar to the one used by Otter [McC03], with some additional connector literals to prevent explosive clause growth when dealing with DNF-like structures. E-KRHyper operates on an E-hyper tableau which is represented by linked node records. Several layered discrimination-tree based indexes (both perfect and non-perfect) provide access to the clauses in the tableau and support backtracking. The system runs on Unix and MS-Windows platforms and is available under the GNU Public License from
    http://www.uni-koblenz.de/~bpelzer/ekrhyper

Expected Competition Performance

The redundancy handling has been improved, resulting in a slight improvement over last year. Overall E-KRHyper will remain in the middle ground.


E-MaLeS 1.1

Daniel Kuehlwein1, Josef Urban1, Stephan Schulz2
1Radboud University Nijmegen, The Netherlands, 2Technische Universität München, Germany

Architecture

E-MaLeS (Machine Learning of Strategies) uses kernel-based machine learning to improve the strategy selection of E prover.

From the known performance of different E strategies over the TPTP library, the system learns which strategy is most likely to solve a problem. Given a new problem, E-MaLeS extracts the problems features (number of terms, number of literals, etc) and applies the learned function to the features. The result is a (hopefully optimal) strategy. As a new addition in 1.1, E-MaLeS learns not only the optimal strategy, but also for how long it should run each strategy. This allows better strategy scheduling.

Strategies

Each E strategy is run for the predicted time, starting with the strategy with the lowest predicted time.

Implementation

E-MaLeS is implemented in python, using the numpy and scipy libraries. It is available from
    http://www.cs.ru.nl/~kuehlwein/

Expected Competition Performance

E-MaLeS should perform slightly better than E.


EP 1.4pre

Stephan Schulz
Technische Universität München, Germany

Architecture

E 1.4pre [Sch02,Sch04] is described in this section. E is a purely equational theorem prover for full first-order logic with equality. It consists of an (optional) clausifier for pre-processing full first-order formulae into clausal from, and a saturation algorithm implementing an instance of the superposition calculus with negative literal selection and a number of redundancy elimination techniques. E is based on the DISCOUNT-loop variant of the given-clause algorithm, i.e., a strict separation of active and passive facts. No special rules for non-equational literals have been implemented. Resolution is effectively simulated by paramodulation and equality resolution.

EP 1.4pre is just a combination of E 1.4pre in verbose mode and a proof analysis tool extracting the used inference steps. For the LTB division, a control program uses a SInE-like analysis to extract reduced axiomatizations that are handed to several instances of E.

Strategies

Proof search in E is primarily controlled by a literal selection strategy, a clause evaluation heuristic, and a simplification ordering. The prover supports a large number of pre-programmed literal selection strategies. Clause evaluation heuristics can be constructed on the fly by combining various parametrized primitive evaluation functions, or can be selected from a set of predefined heuristics. Clause evaluation heuristics are based on symbol-counting, but also take other clause properties into account. In particular, the search can prefer clauses from the set of support, or containing many symbols also present in the goal. Supported term orderings are several parametrized instances of Knuth-Bendix-Ordering (KBO) and Lexicographic Path Ordering (LPO).

The automatic mode is based on a static partition of the set of all clausal problems based on a number of simple features (number of clauses, maximal symbol arity, presence of equality, presence of non-unit and non-Horn clauses,...). Each class of clauses is automatically assigned a heuristic that performs well on problems from this class in test runs. About 100 different strategies have been evaluated on all untyped first-order problems from TPTP 4.1.0.

Implementation

E is build around perfectly shared terms, i.e. each distinct term is only represented once in a term bank. The whole set of terms thus consists of a number of interconnected directed acyclic graphs. Term memory is managed by a simple mark-and-sweep garbage collector. Unconditional (forward) rewriting using unit clauses is implemented using perfect discrimination trees with size and age constraints. Whenever a possible simplification is detected, it is added as a rewrite link in the term bank. As a result, not only terms, but also rewrite steps are shared. Subsumption and contextual literal cutting (also known as subsumption resolution) is supported using feature vector indexing [Sch04]. Superposition and backward rewriting use fingerprint indexing, a new technique combining ideas from feature vector indexing and path indexing. Finally, LPO and KBO are implemented using the elegant and efficient algorithms developed by Bernd Löchner in [Loe06,Loe06]. The prover and additional information are available at
    http://www.eprover.org

Expected Competition Performance

EP 1.4pre is the CASC-23 CNF division winner.


EP 1.6pre

Stephan Schulz
Technische Universität München, Germany

Architecture

E 1.6pre [Sch02] is a purely equational theorem prover for full first-order logic with equality. It consists of an (optional) clausifier for pre-processing full first-order formulae into clausal form, and a saturation algorithm implementing an instance of the superposition calculus with negative literal selection and a number of redundancy elimination techniques. E is based on the DISCOUNT-loop variant of the given-clause algorithm, i.e., a strict separation of active and passive facts. No special rules for non-equational literals have been implemented. Resolution is effectively simulated by paramodulation and equality resolution.

EP 1.6pre is just a combination of E 1.6pre in verbose mode and a proof analysis tool extracting the used inference steps. For the LTB and MZR divisions, a control program uses a SInE-like analysis to extract reduced axiomatizations that are handed to several instances of E.

Strategies

Proof search in E is primarily controlled by a literal selection strategy, a clause evaluation heuristic, and a simplification ordering. The prover supports a large number of pre-programmed literal selection strategies. Clause evaluation heuristics can be constructed on the fly by combining various parametrized primitive evaluation functions, or can be selected from a set of predefined heuristics. Clause evaluation heuristics are based on symbol-counting, but also take other clause properties into account. In particular, the search can prefer clauses from the set of support, or containing many symbols also present in the goal. Supported term orderings are several parametrized instances of Knuth-Bendix-Ordering (KBO) and Lexicographic Path Ordering (LPO).

The automatic mode is based on a static partition of the set of all clausal problems based on a number of simple features (number of clauses, maximal symbol arity, presence of equality, presence of non-unit and non-Horn clauses,...). Each class of clauses is automatically assigned a heuristic that performs well on problems from this class in test runs. About 60 different strategies have been evaluated on all bug-free untyped first-order problems from TPTP 5.3.0.

Implementation

E is build around perfectly shared terms, i.e. each distinct term is only represented once in a term bank. The whole set of terms thus consists of a number of interconnected directed acyclic graphs. Term memory is managed by a simple mark-and-sweep garbage collector. Unconditional (forward) rewriting using unit clauses is implemented using perfect discrimination trees with size and age constraints. Whenever a possible simplification is detected, it is added as a rewrite link in the term bank. As a result, not only terms, but also rewrite steps are shared. Subsumption and contextual literal cutting (also known as subsumption resolution) is supported using feature vector indexing [Sch04]. Superposition and backward rewriting use fingerprint indexing [Sch12], a new technique combining ideas from feature vector indexing and path indexing. Finally, LPO and KBO are implemented using the elegant and efficient algorithms developed by Bernd Löchner in [Loe06,Loe06]. The prover and additional information are available at
    http://www.eprover.org

Expected Competition Performance

E 1.6pre has improved indexing and now integrates optional SinE-like preprocessing for large FOF problems. The system is expected to perform well in most proof classes, but will at best complement top systems in the disproof classes.


FIMO 0.3

Orkunt Sabuncu
University of Potsdam, Germany

Architecture

Fimo is a system for computing finite models of first-order formulas by incremental Answer Set Programming (iASP). The input theory is transformed to an incremental logic program. If any, answer sets of this program represent finite models of the input theory. iClingo is used for computing answer sets of iASP programs.

Strategies

Fimo is the successor of the system fmc2iasp [GSS11]. Unlike fmc2iasp Fimo does not rely on flattening for translating the input theory to a satisfiability problem.

Fimo features symmetry breaking and incremental answer set solving provided by the underlying iASP system iClingo. Additionally, Fimo deskolemizes some of the Skolem functions by transforming them to aggregates in the resulting logic program.

Implementation

Fimo is developed in Python. It is available from
    http://potassco.sourceforge.net

Expected Competition Performance

We are expecting a similar performance to the version that competed in CASC 2011.


iProver 0.9

Konstantin Korovin
University of Manchester, United Kingdom

Architecture

iProver is an automated theorem prover based on an instantiation calculus Inst-Gen [GK03,Kor09] which is complete for first-order logic. One of the distinctive features of iProver is a modular combination of first-order reasoning with ground reasoning. In particular, iProver currently integrates MiniSat [ES04] for reasoning with ground abstractions of first-order clauses. In addition to instantiation, iProver implements ordered resolution calculus and a combination of instantiation and ordered resolution; see [Kor08] for the implementation details. The saturation process is implemented as a modification of a given clause algorithm. iProver uses non-perfect discrimination trees for the unification indexes, priority queues for passive clauses, and a compressed vector index for subsumption and subsumption resolution (both forward and backward). The following redundancy eliminations are implemented: blocking non-proper instantiations; dismatching constraints [GK04,Kor08]; global subsumption [Kor08]; resolution-based simplifications and propositional-based simplifications. A compressed feature vector index is used for efficient forward/backward subsumption and subsumption resolution. Equality is dealt with (internally) by adding the necessary axioms of equality with an option of using Brand's transformation. In the LTB division, iProver-SInE uses axiom selection based on the SInE algorithm [HV11] as implemented in Vampire [HKV10], i.e., axiom selection is done by Vampire and proof attempts are done by iProver.

Major additions in the current version are:

Strategies

iProver has around 40 options to control the proof search including options for literal selection, passive clause selection, frequency of calling the SAT solver, simplifications and options for combination of instantiation with resolution. At CASC iProver will execute a small number of fixed schedules of selected options depending on general syntactic properties such as Horn/non-Horn, equational/non-equational, and maximal term depth.

Implementation

iProver is implemented in OCaml and for the ground reasoning uses MiniSat. iProver accepts FOF and CNF formats, where Vampire [HKV10] is used for clausification of FOF problems.

iProver is available from

    http://www.cs.man.ac.uk/~korovink/iprover/

Expected Competition Performance

iProver 0.9 is the CASC-23 EPR division winner.


iProver 0.99

Konstantin Korovin, Christoph Sticksel
University of Manchester, United Kingdom

Architecture

iProver is an automated theorem prover based on an instantiation calculus Inst-Gen [GK03,Kor09] which is complete for first-order logic. iProver combines first-order reasoning with ground reasoning for which it uses MiniSat [ES04]. iProver also combines instantiation with ordered resolution; see [Kor08] for the implementation details. The proof search is implemented using a saturation process based on the given clause algorithm. iProver uses non-perfect discrimination trees for the unification indexes, priority queues for passive clauses, and a compressed vector index for subsumption and subsumption resolution (both forward and backward). The following redundancy eliminations are implemented: blocking non-proper instantiations; dismatching constraints [GK04,Kor08]; global subsumption [Kor08]; resolution-based simplifications and propositional-based simplifications. A compressed feature vector index is used for efficient forward/backward subsumption and subsumption resolution. Equality is dealt with (internally) by adding the necessary axioms of equality with an option of using Brand's transformation.

In the LTB division, iProver uses axiom selection based on the SInE algorithm [HV11] as implemented in Vampire [HKV11], i.e., axiom selection is done by Vampire and proof attempts are done by iProver.

iProver features are summaries below with recent additions marked with (*).

Type inference is targeted at improving finite model finding and symmetry breaking. Semantic filtering is used in preprocessing to eliminated irrelevant clauses. Proof extraction is challenging due to simplifications such global subsumption which involve global reasoning with the whole clause set and can be computationally expensive.

Strategies

iProver has around 60 options to control the proof search including options for literal selection, passive clause selection, frequency of calling the SAT solver, simplifications and options for combination of instantiation with resolution. At CASC iProver will execute a small number of fixed schedules of selected options depending on general syntactic properties such as Horn/non-Horn, equational/non-equational, and maximal term depth. The strategy for satisfiable problems (FNT division) includes finite model finding.

Implementation

iProver is implemented in OCaml and for the ground reasoning uses MiniSat [ES04]. iProver accepts FOF and CNF formats. Vampire [HKV11] is used for proof-producing clausification of FOF problems as well as for axiom selection [HV11] in the LTB division.

iProver is available from

    http://www.cs.man.ac.uk/~korovink/iprover/

Expected Competition Performance

We expect very good performance on satisfiable problems (FNT division) due to recent additions. In the EPR division we expect similar performance to the previous version but additional computations needed for the proof extraction can degrade the performance.


iProver-Eq 0.8

Christoph Sticksel, Konstantin Korovin
University of Manchester, United Kingdom

Architecture

iProver-Eq [KS10] extends the iProver system [Kor08] with built-in equational reasoning, along the lines of [GK04]. As in the iProver system, first-order reasoning is combined with ground satisfiability checking where the latter is delegated to an off-the-shelf ground solver.

iProver-Eq consists of three core components: i) ground reasoning by an SMT solver, ii) first-order equational reasoning on literals in a candidate model by a labelled unit superposition calculus [KS10,KS10] and iii) instantiation of clauses with substitutions obtained by ii).

Given a set of first-order clauses, iProver-Eq first abstracts it to a set of ground clauses which are then passed to the ground solver. If the ground abstraction is unsatisfiable, then the set of first-order clauses is also unsatisfiable. Otherwise, literals are selected from the first-order clauses based on the model of the ground solver. The labelled unit superposition calculus checks whether selected literals are conflicting. If they are conflicting, then clauses are instantiated such that the ground solver has to refine its model in order to resolve the conflict. Otherwise, satisfiability of the initial first-order clause set is shown.

Clause selection and literal selection in the unit superposition calculus are implemented in separate given clause algorithms. Relevant substitutions are accumulated in labels during unit superposition inferences and then used to instantiate clauses. For redundancy elimination iProver-Eq uses demodulation, dismatching constraints and global subsumption. In order to efficiently propagate redundancy elimination from instantiation into unit superposition, we implemented different representations of labels based on sets, AND/OR-trees and OBDDs. Non-equational resolution and equational superposition inferences provide further simplifications.

For the LTB division, iProver-Eq-SInE runs iProver-Eq as the underlying inference engine of SInE [HV11], i.e., axiom selection is done by SInE, and proof attempts are done by iProver-Eq.

Strategies

Proof search options in iProver-Eq control clause and literal selection in the respective given clause algorithms. Equally important is the global distribution of time between the inference engines and the ground solver. At CASC, iProver-Eq will execute a fixed schedule of selected options.

If no equational literals occur in the input, iProver-Eq falls back to the inference rules of iProver, otherwise the latter are disabled and only unit superposition is used. If all clauses are unit equations, no instances need to be generated and the calculus is run without the otherwise necessary bookkeeping.

Implementation

iProver-Eq is implemented in OCaml and uses a prototype of the CVC4 SMT solver, which is the successor of CVC3 [BT07], for the ground reasoning in the equational case and MiniSat [ES04] in the non-equational case. iProver-Eq accepts FOF and CNF formats, where Vampire [HKV11] is used for clausification and preprocessing of FOF problems.

iProver-Eq is available from

    http://www.cs.man.ac.uk/~sticksec/iprover-eq

Expected Competition Performance

iProver-Eq has seen many optimisations from the version in the previous CASCs, in particular regarding labelling and the integration of CVC4. We expect reasonably good performance in all divisions, including the EPR divisions where instantiation-based methods are particularly strong.


Isabelle 2012

Jasmin C. Blanchette1, Lawrence C. Paulson2, Tobias Nipkow1, Makarius Wenzel3
1Technische Universität München, Germany, 2University of Cambridge, United Kingdom, 3Université Paris Sud, France

Architecture

Isabelle/HOL 2012 [NPW12] is the higher-order logic incarnation of the generic proof assistant
Isabelle2012. Isabelle/HOL provides several automatic proof tactics, notably an equational reasoner [Nip89], a classical reasoner [PN94], and a tableau prover [Pau99]. It also integrates external first- and higher-order provers via its subsystem Sledgehammer [PB10,BBP11].

Previous versions of Isabelle relied on the TPTP2X tool to translate TPTP files to Isabelle theory files. Starting this year, Isabelle includes a parser for the TPTP syntaxes CNF, FOF, TFF0, and THF0 as well as TPTP versions of its popular tools, invokable on the command line as isabelle tptp_tool max_secs file.p. For example:

isabelle tptp_isabelle_demo 100 SEU/SEU824^3.p
Two versions of Isabelle participate this year. The demo version includes its competitors LEO-II [BP+08] and Satallax [Bro12] as Sledgehammer backends, whereas the competition version leaves them out.

Strategies

The Isabelle tactic submitted to the competition simply tries the following tactics sequentially:
sledgehammer
Invokes the following sequence of provers as oracles via Sledgehammer:
nitpick
For problems involving only the type $o of Booleans, checks whether a finite model exists using Nitpick [BN10].
simp
Performs equational reasoning using rewrite rules [Nip89].
blast
Searches for a proof using a fast untyped tableau prover and then attempts to reconstruct the proof using Isabelle tactics [Pau99].
auto+spass
Combines simplification and classical reasoning [PN94] under one roof; then invoke Sledgehammer with SPASS on any subgoals that emerge.
z3
Invokes the SMT solver Z3 3.2 [dMB08].
cvc3
Invokes the SMT solver CVC3 2.2 [BT07].
fast
Searches for a proof using sequent-style reasoning, performing a depth-first search [PN94]. Unlike blast, it construct proofs directly in Isabelle. That makes it slower but enables it to work in the presence of the more unusual features of HOL, such as type classes and function unknowns.
best
Similar to fast, except that it performs a best-first search.
force
Similar to auto, but more exhaustive.
meson
Implements Loveland's MESON procedure [Lov78]. Constructs proofs directly in Isabelle.
fastforce
Combines fast and force.

Implementation

Isabelle is a generic theorem prover written in Standard ML. Its meta-logic, Isabelle/Pure, provides an intuitionistic fragment of higher-order logic. The HOL object logic extends pure with a more elaborate version of higher-order logic, complete with the familiar connectives and quantifiers. Other object logics are available, notably FOL (first-order logic) and ZF (Zermelo-Fraenkel set theory).

The implementation of Isabelle relies on a small LCF-style kernel, meaning that inferences are implemented as operations on an abstract theorem datatype. Assuming the kernel is correct, all values of type theorem are correct by construction.

Most of the code for Isabelle was written by the Isabelle teams at the University of Cambridge and the Technische Universität München. Isabelle/HOL is available for all major platforms under a BSD-style license from

    http://www.cl.cam.ac.uk/research/hvg/Isabelle

Expected Competition Performance

By integrating LEO-II and Satallax as two of many of its tactics, Isabelle should have all the tools at its disposals to win the competition this year.


leanCoP---2.2

Jens Otten
University of Potsdam, Germany

Architecture

leanCoP [OB03,Ott08] is an automated theorem prover for classical first-order logic with equality. It is a very compact implementation of the connection (tableau) calculus [Bib87,LS01].

Strategies

The reduction rule of the connection calculus is applied before the extension rule. Open branches are selected in a depth-first way. Iterative deepening on the proof depth is used to achieve completeness. Additional inference rules and strategies include regularity, lemmata, and restricted backtracking [Ott10]. leanCoP uses an optimized structure-preserving transformation into clausal form [Ott10] and a fixed strategy scheduling, which is invoked by a shell script.

Implementation

leanCoP is implemented in Prolog. The source code of the core prover is only a few lines long and fits on half a page. Prolog's built-in indexing mechanism is used to quickly find connections.

leanCoP can read formulae in leanCoP syntax as well as in (raw) TPTP first-order syntax. Equality axioms and axioms to support distinct objects are automatically added if required. The leanCoP core prover returns a very compact connection proof, which can be translated into different proof formats, e.g., into a lean (unofficial) TPTP syntax format for representing connection proofs [OS10] or into a readable text proof.

The leanCoP core prover and all its additional components run on any (standard) Prolog system; ECLiPSe, SICStus and SWI Prolog are currently explicitly supported. The shell script that implements the fixed strategy scheduling runs on Linux/Unix and MacOS platforms as well as on most Windows platforms.

The source code of leanCoP 2.2 is available under the GNU general public license. Together with more information it can be found on the leanCoP website at

    http://www.leancop.de
The website also contains information about the leanCoP versions ileanCoP and MleanCoP, leading automated theorem provers for first-order intuitionistic logic and several first-order modal logics, respectively.

Expected Competition Performance

As the core prover has not changed, the performance of leanCoP 2.2 is expected to be similar to the performance of leanCoP 2.1.


leanCoP-ARDE---2.2

Mario Frank, Thomas Raths, Jens Otten
University of Potsdam, Germany

Architecture

ARDE (Axiom Relevance Decision Engine, current version 0.5) is a tool that processes large sets of axioms and categorizes them into classes. Given a conjecture it selects those axioms from the axiom set that are somehow connected to the conjecture and thus possibly needed for the proof of the conjecture. For the proof process itself it consults leanCoP [OB03,Ott08], a compact theorem prover for first-order classical logic.

Strategies

ARDE analyses all axioms and gathers information about included predicate symbols, their arity and additional information like their polarities. Based on these properties, ARDE tries to search for axioms that are possibly needed. These axioms together with the conjecture are then given to the leanCoP 2.2 theorem prover. ARDE is multi-threaded and has the ability to trigger multiple proof attemps concurrently with all threads having different axiom sets.

Implementation

ARDE is implemented in C++ using the C++11 standard. It uses version 1.42 of the Boost library (e.g. Spirit), G++ 4.4.5 and GNU Prolog 1.3.1. Other used tools include egrep and GNU make.

Expected Competition Performance

On large problems containing many axioms leanCoP-ARDE 2.2 is exptected to show a better performance than the leanCoP 2.2 prover on its own.


LEO-II---1.4

Christoph Benzmüller
Free University Berlin, Germany

Architecture

LEO-II [BP+08], the successor of LEO [BK98], is a higher-order ATP system based on extensional higher-order resolution. More precisely, LEO-II employs a refinement of extensional higher-order RUE resolution [Ben99]. LEO-II is designed to cooperate with specialist systems for fragments of higher-order logic. By default, LEO-II cooperates with the first-order ATP system E [Sch02]. LEO-II is often too weak to find a refutation amongst the steadily growing set of clauses on its own. However, some of the clauses in LEO-II's search space attain a special status: they are first-order clauses modulo the application of an appropriate transformation function. Therefore, LEO-II launches a cooperating first-order ATP system every n iterations of its (standard) resolution proof search loop (e.g., 10). If the first-order ATP system finds a refutation, it communicates its success to LEO-II in the standard SZS format. Communication between LEO-II and the cooperating first-order ATP system uses the TPTP language and standards.

Strategies

LEO-II employs an adapted "Otter loop". Moreover, LEO-II now also uses some very basic strategy scheduling to try different search strategies or flag settings. These search strategies also include some different relevance filters.

Implementation

LEO-II is implemented in Objective Caml version 3.12, and its problem representation language is the TPTP THF language [BRS08]. In fact, the development of LEO-II has largely paralleled the development of the TPTP THF language and related infrastructure [SB10].

LEO-II's parser supports the TPTP THF0 language and also the TPTP languages FOF and CNF.

Unfortunately the LEO-II system still uses only a very simple sequential collaboration model with first-order ATPs instead of using the more advanced, concurrent and resource-adaptive OANTS architecture [BS+08] as exploited by its predecessor LEO.

The LEO-II system is distributed under a BSD style license, and it is available from

    http://www.leoprover.org

Expected Competition Performance

LEO-II has not improved much since 2010. The main modifications concern proof output in order to enable proof reconstruction/verification of LEO-II proofs in other systems [SB12].


MaLARea 0.4

Josef Urban1, Daniel Kuehlwein1, Stephan Schulz2, Jiri Vyskocil3
1Radboud University Nijmegen, The Netherlands, 2Technische Universität München, Germany, 3Czech Technical University, Czech Republic

Architecture

MaLARea 0.4 [Urb07,USPV08] is a metasystem for ATP in large theories where symbol and formula names are used consistently. It uses several deductive systems (now E,SPASS,Vampire, Paradox, Mace), as well as complementary AI techniques like machine learning (the SNoW system) based on symbol-based similarity, model-based similarity, term-based similarity, and obviously previous successful proofs. The version for CASC 2012 will mainly use E prover, possibly also Prover9, Mace and Paradox. The premise selection methods will likely also use kernel-based learning and E's implementation of SInE.

Strategies

The basic strategy is to run ATPs on problems, then use the machine learner to learn axiom relevance for conjectures from solutions, and use the most relevant axioms for next ATP attempts. This is iterated, using different timelimits and axiom limits. Various features are used for learning, and the learning is complemented by other criteria like model-based reasoning, symbol and term-based similarity, etc.

Implementation

The metasystem is implemented in ca. 2500 lines of Perl. It uses many external programs - the above mentioned ATPs and machine learner, TPTP utilities, LADR utilities for work with models, and some standard Unix tools.

MaLARea is available from

    https://github.com/JUrban/MPTP2/tree/master/MaLARea
The metasystem's Perl code is released under GPL2.

Expected Competition Performance

Thanks to machine learning, MaLARea is strongest on batches of many related problems with many redundant axioms where some of the problems are easy to solve and can be used for learning the axiom relevance. MaLARea is not very good when all problems are too difficult (nothing to learn from), or the problems (are few and) have nothing in common. Some of its techniques (selection by symbol and term-based similarity, model-based reasoning) could however make it even there slightly stronger than standard ATPs. MaLARea has a very good performance on the MPTP Challenge, which is a predecessor of the LTB division, and it is the winner of the 2008 MZR LTB category which had the same rules as the MPTP Challenge. Since then the LTB rules changed and MaLARea did not compete.


Muscadet 4.2

Dominique Pastre
University Paris Descartes, France

Architecture

The Muscadet theorem prover is a knowledge-based system. It is based on Natural Deduction, following the terminology of [Ble71] and [Pas78], and uses methods which resembles those used by humans. It is composed of an inference engine, which interprets and executes rules, and of one or several bases of facts, which are the internal representation of "theorems to be proved". Rules are either universal and put into the system, or built by the system itself by metarules from data (definitions and lemmas). Rules may add new hypotheses, modify the conclusion, create objects, split theorems into two or more subtheorems or build new rules which are local for a (sub-)theorem.

Strategies

There are specific strategies for existential, universal, conjonctive or disjunctive hypotheses and conclusions, and equalities. Functional symbols may be used, but an automatic creation of intermediate objects allows deep subformulae to be flattened and treated as if the concepts were defined by predicate symbols. The successive steps of a proof may be forward deduction (deduce new hypotheses from old ones), backward deduction (replace the conclusion by a new one), refutation (only if the conclusion is a negation), search for objects satisfying the conclusion or dynamic building of new rules.

The system is also able to work with second order statements. It may also receive knowledge and know-how for a specific domain from a human user; see [Pas89] and [Pas93]. These two possibilities are not used while working with the TPTP Library.

Implementation

Muscadet [Pas01] is implemented in SWI-Prolog. Rules are written as more or less declarative Prolog clauses. Metarules are written as sets of Prolog clauses. The inference engine includes the Prolog interpreter and some procedural Prolog clauses. A theorem may be split into several subtheorems, structured as a tree with "and" and "or" nodes. All the proof search steps are memorized as facts including all the elements which will be necessary to extract later the useful steps (the name of the executed action or applied rule, the new facts added or rule dynamically built, the antecedents and a brief explanation).

Muscadet is available from

    http://www.math-info.univ-paris5.fr/~pastre/muscadet/muscadet.html

Expected Competition Performance

The best performances of Muscadet will be for problems manipulating many concepts in which all statements (conjectures, definitions, axioms) are expressed in a manner similar to the practice of humans, especially of mathematicians [Pas02,Pas07]. It will have poor performances for problems using few concepts but large and deep formulas leading to many splittings. Its best results will be in set theory, especially for functions and relations. Its originality is that proofs are given in natural style. This 4.2 version is a minor revision of 4.1 version. It differs only in that a number of bugs have been fixed.


Nitrox 2012

Jasmin C. Blanchette1, Emina Torlak2
1Technische Universität München, Germany, 2University of California, USA

Architecture

Nitrox is the first-order version of Nitpick [BN10], an open source counterexample generator for Isabelle/HOL [NPW12]. It builds on Kodkod [TJ07], a highly optimized first-order relational model finder based on SAT. The name Nitrox is a portmanteau of Nitpick and Paradox (clever, eh?).

Strategies

Nitrox employs Kodkod to find a finite model of the negated conjecture. It performs a few transformations on the input, such as pushing quantifiers inside, but 99% of the solving logic is in Kodkod and the underlying SAT solver.

The translation from HOL to Kodkod's first-order relational logic (FORL) is parameterized by the cardinalities of the atomic types occurring in it. Nitrox enumerates the possible cardinalities for the universe. If a formula has a finite counterexample, the tool eventually finds it, unless it runs out of resources.

Nitpick is optimized to work with higher-order logic (HOL) and its definitional principles (e.g., (co)inductive predicates, (co)inductive datatypes, (co)recursive functions). When invoked on untyped first-order problem, few of its optimizations come into play, and the problem handed to Kodkod is essentially a first-order relational logic (FORL) rendering of the TPTP FOF problem. There are two main exceptions:

Implementation

Nitrox, like most of Isabelle/HOL, is written in Standard ML. Unlike Isabelle itself, which adheres to the LCF small-kernel discipline, Nitrox does not certify its results and must be trusted. Kodkod is written in Java. MiniSat 1.14 is used as the SAT solver.

Expected Competition Performance

Since Nitpick was designed for HOL, it doesn't have any type inference à la Paradox. It also doesn't use the SAT solver incrementally, which penalizes it a bit (but not as much as the missing type inference). Kodkod itself is known to perform less well on FOF than Paradox, because it is designed and optimized for a somewhat different logic, FORL. On the other hand, Kodkod's symmetry breaking seems better calibrated than Paradox's. Hence, we expect Nitrox to end up in second place at best in the TNF category.


Paradox 3.0

Koen Claessen, Niklas Sörensson
Chalmers University of Technology, Sweden

Architecture

Paradox [CS03] is a finite-domain model generator. It is based on a MACE-style [McC03] flattening and instantiating of the first-order clauses into propositional clauses, and then the use of a SAT solver to solve the resulting problem.

Paradox incorporates the following features: Polynomial-time clause splitting heuristics, the use of incremental SAT, static symmetry reduction techniques, and the use of sort inference.

Strategies

There is only one strategy in Paradox:
  1. Analyze the problem, finding an upper bound N on the domain size of models, where N is possibly infinite. A finite such upper bound can be found, for example, for EPR problems.
  2. Flatten the problem, and split clauses and simplify as much as possible.
  3. Instantiate the problem for domain sizes 1 up to N, applying the SAT solver incrementally for each size. Report "SATISFIABLE" when a model is found.
  4. When no model of sizes smaller or equal to N is found, report "CONTRADICTION".
In this way, Paradox can be used both as a model finder and as an EPR solver.

Implementation

The main part of Paradox is implemented in Haskell using the GHC compiler. Paradox also has a built-in incremental SAT solver which is written in C++. The two parts are linked together on the object level using Haskell's Foreign Function Interface.

Expected Competition Performance

Paradox 3.0 is the CASC-23 FNT division winner.


Princess 2012-06-04

Philipp Rümmer, Aleksandar Zeljić
Uppsala University, Sweden

Architecture

Princess [Rue08,Rue12] is a theorem prover for first-order logic modulo linear integer arithmetic. The prover has been under development since 2007, and represents a combination of techniques from the areas of first-order reasoning and SMT solving. The main underlying calculus is a free-variable tableau calculus, which is extended with constraints to enable backtracking-free proof expansion, and positive unit hyper-resolution for lightweight instantiation of quantified formulae. Linear integer arithmetic is handled using a set of built-in proof rules resembling the Omega test, which altogether yields a calculus that is complete for full Presburger arithmetic, for first-order logic, and for a number of further fragments.

The internal calculus of Princess only supports uninterpreted predicates; uninterpreted functions are encoded as predicates, together with the usual axioms. Through appropriate translation of quantified formulae with functions, the e-matching technique common in SMT solvers can be simulated; triggers in quantified formulae are chosen based on heuristics similar to those in the Simplify prover.

Strategies

Princess supports a number of different proof expansion strategies (e.g., depth-first, breadth-first), which are chosen based on syntactic properties of a problem (in particular, the kind of quantifiers occurring in the problem). Further options exist to control, for instance, the selection of triggers in quantified formulae, clausification, and the handling of functions.

For CASC, Princess will run a schedule with a small number of configurations for each problem (portfolio method). The schedule is determined either statically, or dynamically using syntactic attributes of problems (such as number and kind of quantifiers, etc), based on training using a random sample of problems from the TPTP library.

Implementation

Princess is entirely written in Scala and runs on any recent Java virtual machine; besides the standard Scala and Java libraries, only the Cup parser library is employed. Princess is available from
    http://www.philipp.ruemmer.org/princess.shtml

Expected Competition Performance

Since Princess enters CASC for the first time, performance predictions are entirely speculative. Princess is mainly designed for the TFI category, and should perform reasonably well here. Support for rationals and reals is preliminary and incomplete, so that mediocre results are expected for TFR. Finally, although Princess is in theory complete for FOL, it is not designed for this logic and enters FOF just for fun.


Prover9 2009-11A

Bob Veroff on behalf of William McCune
University of New Mexico, USA

Architecture

Prover9, Version 2009-11A, is a resolution/paramodulation prover for first-order logic with equality. Its overall architecture is very similar to that of Otter-3.3 [McC03]. It uses the "given clause algorithm", in which not-yet-given clauses are available for rewriting and for other inference operations (sometimes called the "Otter loop").

Prover9 has available positive ordered (and nonordered) resolution and paramodulation, negative ordered (and nonordered) resolution, factoring, positive and negative hyperresolution, UR-resolution, and demodulation (term rewriting). Terms can be ordered with LPO, RPO, or KBO. Selection of the "given clause" is by an age-weight ratio.

Proofs can be given at two levels of detail: (1) standard, in which each line of the proof is a stored clause with detailed justification, and (2) expanded, with a separate line for each operation. When FOF problems are input, proof of transformation to clauses is not given.

Completeness is not guaranteed, so termination does not indicate satisfiability.

Strategies

Like Otter, Prover9 has available many strategies; the following statements apply to CASC-2012.

Given a problem, Prover9 adjusts its inference rules and strategy according to syntactic properties of the input clauses such as the presence of equality and non-Horn clauses. Prover9 also does some preprocessing, for example, to eliminate predicates.

In previous CASC competitions, Prover9 has used LPO to order terms for demodulation and for the inference rules, with a simple rule for determining symbol precedence. For CASC 2012, we are going to use KBO instead.

For the FOF problems, a preprocessing step attempts to reduce the problem to independent subproblems by a miniscope transformation; if the problem reduction succeeds, each subproblem is clausified and given to the ordinary search procedure; if the problem reduction fails, the original problem is clausified and given to the search procedure.

Implementation

Prover9 is coded in C, and it uses the LADR libraries. Some of the code descended from EQP [McC97]. (LADR has some AC functions, but Prover9 does not use them). Term data structures are not shared (as they are in Otter). Term indexing is used extensively, with discrimination tree indexing for finding rewrite rules and subsuming units, FPA/Path indexing for finding subsumed units, rewritable terms, and resolvable literals. Feature vector indexing [Sch04] is used for forward and backward nonunit subsumption. Prover9 is available from
    http://www.cs.unm.edu/~mccune/prover9/

Expected Competition Performance

Some of the strategy development for CASC was done by experimentation with the CASC-2004 competition "selected" problems. (Prover9 has not yet been run on other TPTP problems.) Prover9 is unlikely to challenge the CASC leaders, because (1) extensive testing and tuning over TPTP problems has not been done, (2) theories (e.g., ring, combinatory logic, set theory) are not recognized, (3) term orderings and symbol precedences are not fine-tuned, and (4) multiple searches with differing strategies are not run.

Finishes in the middle of the pack are anticipated in all categories in which Prover9 competes.


PS-E---1.0

Daniel Kuehlwein1, Josef Urban1, Stephan Schulz2
1Radboud University Nijmegen, The Netherlands, 2Technische Universität München, Germany

Architecture

PS-E (Premise Selector - E) [KvL+12] is a learning based metasystem for large theories where symbol and formula names are used consistently. It uses the MOR [KvL+12] algorithm for learning and is based upon E. PS-E learns how symbols and terms relate to used premises. Given a new conjecture, PS-E extracts its symbols, terms and subterms and predicts which premises are needed to solve it. When PS-E finds new proofs, it updates its learning algorithm to produce even more accurate predictions.

Strategies

The basic strategy is to run ATPs on problems, then use the machine learner to learn axiom relevance for conjectures from solutions, and use the most relevant axioms for next ATP attempts. This is iterated, using different timelimits and axiom limits. The learning parameters (regularization,the gaussian kernel parameters sigma, thresholds) are optimized via cross validation on the 1000 example problems and solutions.

Implementation

PS-E is implemented in python, using the numpy and scipy libraries. It is available from
    http://www.cs.ru.nl/~kuehlwein/

Expected Competition Performance

Based on the performance of the MOR algorithms in [KvL+12] PS-E is expected to be very competitive.


Satallax 2.1

Chad E. Brown
Saarland University, Germany

Architecture

Satallax [Bro11] is an automated theorem prover for higher-order logic. The particular form of higher-order logic supported by Satallax is Church's simple type theory with extensionality and choice operators. The SAT solver MiniSat [ES04] is responsible for much of the search for a proof.

The theoretical basis of search is a complete ground tableau calculus for higher-order logic [BS10] with a choice operator [BB11]. A problem is given in the THF format. A branch is formed from the axioms of the problem and the negation of the conjecture (if any is given). From this point on, Satallax tries to determine unsatisfiability or satisfiability of this branch.

Satallax progressively generates higher-order formulae and corresponding propositional clauses [Bro11]. These formulae and propositional clauses correspond to instances of the tableau rules. Satallax uses the SAT solver MiniSat as an engine to test the current set of propositional clauses for unsatisfiability. If the clauses are unsatisfiable, then the original branch is unsatisfiable. If there are no quantifiers at function types, the generation of higher-order formulae and corresponding clauses may terminate [BS09,BS09]. In such a case, if MiniSat reports the final set of clauses as satisfiable, then the original set of higher-order formulae is satisfiable (by a standard model in which all types are interpreted as finite sets).

Strategies

There are a number of flags that control the order in which formulas and instantiation terms are considered and propositional clauses are generated. Other flags activate some optional extensions to the basic proof procedure. A collection of flag settings is called a mode. Approximately 250 modes have been tried so far. Regardless of the mode, the search procedure is sound and complete for higher-order logic with choice. This implies that if search terminates with a particular mode, then we can conclude that the original set of formulae is unsatisfiable or satisfiable.

A strategy schedule is an ordered collection of modes with information about how much time the mode should be allotted. Satallax tries each of the modes for a certain amount of time sequentially. Satallax 2.1 has eight strategy schedules which were determined through experimentation using the THF problems in version 5.1.0 of the TPTP library. One of these eight strategy schedules is chosen based on the amount of time Satallax is given to solve the problem. For example, if Satallax is given 180 seconds to solve the problem, then a schedule with 38 modes is chosen.

Implementation

Satallax 2.1 is implemented in OCaml. A foreign function interface is used to interact with MiniSat. Satallax is available from
    http://satallax.com

Expected Competition Performance

Satallax 2.1 is the CASC-23 THF division winner.


Satallax 2.4

Chad E. Brown
Saarland University, Germany

Architecture

Satallax 2.4 is an automated theorem prover for higher-order logic. The particular form of higher-order logic supported by Satallax is Church's simple type theory with extensionality and choice operators. The SAT solver MiniSat [ES04] is responsible for much of the search for a proof. The theoretical basis of search is a complete ground tableau calculus for higher-order logic [BS10] with a choice operator [BB11]. A problem is given in the THF format. A branch is formed from the axioms of the problem and the negation of the conjecture (if any is given). From this point on, Satallax tries to determine unsatisfiability or satisfiability of this branch.

Satallax progressively generates higher-order formulae and corresponding propositional clauses [Bro11]. These formulae and propositional clauses correspond to instances of the tableau rules. Satallax uses the SAT solver MiniSat as an engine to test the current set of propositional clauses for unsatisfiability. If the clauses are unsatisfiable, then the original branch is unsatisfiable.

Strategies

There are about a hundred flags [Bro12] that control the order in which formulas and instantiation terms are considered and propositional clauses are generated. Other flags activate some optional extensions to the basic proof procedure. A collection of flag settings is called a mode. Approximately 300 modes have been defined and tested so far. A strategy schedule is an ordered collection of modes with information about how much time the mode should be allotted. Satallax tries each of the modes for a certain amount of time sequentially. Satallax 2.4 has strategy schedule consisting of 58 modes (3 of which may be tried twice - the second time with a longer timeout). Each mode is tried for time limits ranging from 0.1 seconds to 27.3 seconds. The strategy schedule was determined through experimentation using the THF problems in version 5.3.0 of the TPTP library.

Implementation

Satallax 2.4 is implemented in OCaml. A foreign function interface is used to interact with MiniSat 2.2.0 (Satallax 2.1 used MiniSat 2.0.) Satallax is available from
    http://satallax.com

Expected Competition Performance

Satallax 2.1 won the THF division last year by proving 246 of 300 theorems. Satallax 2.4 can prove 272 of these same 300 theorems indicating an improvement of roughly 10 percent. On the other hand, the Sledgehammer tactic in Isabelle has also improved in the past year and has a good chance of winning the THF division.


SPASS+T 2.2.14

Uwe Waldmann1, Stephan Zimmer2
1Max-Planck-Institut für Informatik, Germany, 2AbsInt GmbH, Germany

Architecture

SPASS+T is an extension of the superposition-based theorem prover SPASS that integrates algebraic knowledge into SPASS in three complementary ways: by passing derived formulas to an external SMT procedure (currently Yices or CVC3), by adding standard axioms, and by built-in arithmetic simplification and inference rules. A first version of the system has been described in [PW06]. In the current version, a much more sophisticated coupling of the SMT procedure has been added [Zim07].

Strategies

Standard axioms and built-in arithmetic simplification and inference rules are integrated into the standard main loop of SPASS. Inferences between standard axioms are excluded, so the user-supplied formulas are taken as set of support. The external SMT procedure runs in parallel in a separate process, leading occasionally to non-deterministic behaviour.

Implementation

SPASS+T is implemented in C. The system is available from
    http://www.mpi-inf.mpg.de/~uwe/software/#TSPASS

Expected Competition Performance

SPASS+T 2.2.14 is the CASC-23 TFA division winner.


SPASS+T 2.2.16

Uwe Waldmann1, Stephan Zimmer2
1Max-Planck-Institut für Informatik, Germany, 2AbsInt GmbH, Germany

Architecture

SPASS+T is an extension of the superposition-based theorem prover SPASS that integrates algebraic knowledge into SPASS in three complementary ways: by passing derived formulas to an external SMT procedure (currently Yices or CVC3), by adding standard axioms, and by built-in arithmetic simplification and inference rules. A first version of the system has been described in [PW06]; later a much more sophisticated coupling of the SMT procedure has been added [Zim07].

Strategies

Standard axioms and built-in arithmetic simplification and inference rules are integrated into the standard main loop of SPASS. Inferences between standard axioms are excluded, so the user-supplied formulas are taken as set of support. The external SMT procedure runs in parallel in a separate process, leading occasionally to non-deterministic behaviour.

Implementation

SPASS+T is implemented in C. The system is available from
    http://www.mpi-inf.mpg.de/~uwe/software/#TSPASS

Expected Competition Performance

SPASS+T 2.2.14 has been the winner of the TFA division of last CASC; we expect a similar performance in CASC 2012.


STP---1.0

Adam Pease1, Stephan Schulz2
1Articulate Software, USA, 2Technische Universität München, Germany

Architecture

STP is a minimalistic resolution theorem prover designed for educational purposes. The goal is to present a simple and clean series of provers starting with the most basic functional prover that can process TPTP problems, and gradually extending the system to implement more and more state-of-the-art techniques. STP is open source and it is hoped that it will provide a basis for many new derivative provers to participate in CASC.

The current system implements a complete calculus based on binary resolution.

Strategies

It includes subsumption and tautology elimination as simplification techniques, and a few simple strategies for axiom selection.

Implementation

It supports both CNF translation and proof object generation and output in TSTP format. STP is available in both Python and Java implementations.

Expected Competition Performance

Performance is expected to be poor compared to modern systems.


SuperZenon 0.0.1

David Delahaye1, Melanie Jacquel2
1CEDRIC/CNAM, France, 2Siemens IC-MOL, France

Architecture

SuperZenon is an experimental extension of the Zenon [BDD07] automated theorem prover, using the principles of superdeduction, among which the theory is used to enrich the deduction system with new deduction rules.

Superdeduction [BHK07] is a variant of deduction modulo [DHK03], which is an extension of logical deduction systems consisting in canonically adding ad hoc deduction rules translating axiomatic theories. This has several advantages:

A version of SuperZenon has been instantiated for the set theory of the B method [Abr96]. This allows us to provide another prover to
Atelier B, which can be used to verify B proof rules in particular. This version of SuperZenon has been successfully applied (with significant speed-ups both in terms of proof time and proof size) to the verification of B proof rules coming from the database maintained by Siemens IC-MOL [JB+12].

Strategies

The strategy of SuperZenon relies on the automatic orientation of some axioms of the theory. The axioms that can be oriented are either equivalences or implications where at least one member is atomic. The axioms which cannot be oriented are left as axioms. It should be noted that the preservation of completeness after this transformation of a part of the theory cannot be ensured in general.

Implementation

The implementation of SuperZenon relies on the one of Zenon, and has been realized thanks to the ability of Zenon to extend its core of deductive rules to match specific requirements. The compilation of superdeduction rules is performed using the proof search method of Zenon with a subset of rules. The implementation of Super Zenon is available from
    http://cedric.cnam.fr/~delahaye/super-zenon/

Expected Competition Performance

SuperZenon is expected to improve the performance of Zenon in each division in which Zenon competes (i.e., FOFT and FOF).


TPS 3.120601S1b

Chad E. Brown1, Peter Andrews2
1Saarland University, Germany, 2Carnegie Melon University, USA

Architecture

TPS is a higher-order theorem proving system that has been developed over several decades under the supervision of Peter B. Andrews with substantial work by Eve Longini Cohen, Dale A. Miller, Frank Pfenning, Sunil Issar, Carl Klapper, Dan Nesmith, Hongwei Xi, Matthew Bishop, Chad E. Brown, Mark Kaminski, Rémy Chrétien and Cris Perdue. TPS can be used to prove theorems of Church's type theory automatically, interactively, or semi-automatically [AB+96,AB06]. When searching for a proof, TPS first searches for an expansion proof [Mil87] or an extensional expansion proof [Bro07] of the theorem. Part of this process involves searching for acceptable matings [And81]. Using higher-order unification, a pair of occurrences of subformulae (which are usually literals) is mated appropriately on each vertical path through an expanded form of the theorem to be proved. The expansion proof thus obtained is then translated [Pfe87] without further search into a proof of the theorem in natural deduction style.

Strategies

Strategies used by TPS in the search process include:

Implementation

TPS has been developed as a research tool for developing, investigating, and refining a variety of methods of searching for expansion proofs, and variations of these methods. Its behavior is controlled by hundreds of flags. A set of flags, with values for them, is called a mode. The strategy of the current version of TPS consists of 71 modes. When searching for a proof in automatic mode, TPS tries each of these modes in turn for a specified amount of time (at least 1 second and at most 41 seconds). If TPS succeeds in finding an expansion proof, it translates the expansion proof to a natural deduction proof. This final step ensures that TPS will not incorrectly report that a formula has been proven. TPS is implemented in Common Lisp, and is available from
    http://gtps.math.cmu.edu/tps.html

Expected Competition Performance

TPS was the CASC-22 THF division winner in 2009. In the two CASC competitions since 2009 TPS has come in last behind Isabelle, LEO-II and Satallax. There have been no changes to the code of TPS since last year. However, TPS has been run with many modes on the THF problems from the TPTP. This information has been used to generate a new strategy schedule which should make TPS more competitive.


Vampire-LTB 1.8

Krystof Hoder, Andrei Voronkov
The University of Manchester, United Kingdom

Architecture

Vampire 1.8, is an automatic theorem prover for first-order classical logic. It consists of a shell and a kernel. The kernel implements the calculi of ordered binary resolution and superposition for handling equality. The splitting rule in kernel adds propositional parts to clauses, which are manipulated using binary decision diagrams and a SAT solver. A number of standard redundancy criteria and simplification techniques are used for pruning the search space: subsumption, tautology deletion, subsumption resolution and rewriting by ordered unit equalities. The reduction ordering is the Knuth-Bendix Ordering.

Substitution tree and code tree indexes are used to implement all major operations on sets of terms, literals and clauses. Although the kernel of the system works only with clausal normal form, the shell accepts a problem in the full first-order logic syntax, clausifies it and performs a number of useful transformations before passing the result to the kernel. Also the axiom selection algorithm Sine [HV11] can be enabled as part of the preprocessing.

When a theorem is proved, the system produces a verifiable proof, which validates both the clausification phase and the refutation of the CNF.

Strategies

The Vampire 1.8 kernel provides a fairly large number of options for strategy selection. The most important ones are:
  1. Choice of the main procedure:
    • Limited Resource Strategy
    • DISCOUNT loop
    • Otter loop
    • Goal oriented mode based on tabulation
  2. A variety of optional simplifications.
  3. Parameterized reduction orderings.
  4. A number of built-in literal selection functions and different modes of comparing literals.
  5. Age-weight ratio that specifies how strongly lighter clauses are preferred for inference selection.
  6. Set-of-support strategy.

Implementation

Vampire 1.8 is implemented in C++.

Expected Competition Performance

Vampire-LTB 1.8 is the CASC-23 LTB division winner.


Vampire 2.6

Krystof Hoder, Andrei Voronkov
University of Manchester, England

Architecture

Vampire 2.6 is an automatic theorem prover for first-order classical logic. It consists of a shell and a kernel. The kernel implements the calculi of ordered binary resolution and superposition for handling equality. It also implements the Inst-gen calculus. The splitting rule in kernel adds propositional parts to clauses, which are being manipulated using binary decision diagrams and a SAT solver. A number of standard redundancy criteria and simplification techniques are used for pruning the search space: subsumption, tautology deletion, subsumption resolution and rewriting by ordered unit equalities. The reduction ordering is the Knuth-Bendix Ordering.

Substitution tree and code tree indexes are used to implement all major operations on sets of terms, literals and clauses. Although the kernel of the system works only with clausal normal form, the shell accepts a problem in the full first-order logic syntax, clausifies it and performs a number of useful transformations before passing the result to the kernel. Also the axiom selection algorithm Sine [HV11] can be enabled as part of the preprocessing.

When a theorem is proved, the system produces a verifiable proof, which validates both the clausification phase and the refutation of the CNF.

Strategies

The Vampire 2.6 kernel provides a fairly large number of options for strategy se lection. The most important ones are:

Implementation

Vampire 2.6 is implemented in C++.

Expected Competition Performance

We expect Vampire 2.6 to outperform the last year's Vampire 1.8, especially in the satisfiability-checking divisions.


Zenon 0.7.1

Damien Doligez
INRIA, France

Architecture

Zenon 0.7.1 [BDD07] is a theorem prover based on a proof-confluent version of analytic tableaux. It uses all the usual tableau rules for first-order logic with a rule-based handling of equality.

Zenon outputs formal proofs that can be checked by Coq or Isabelle.

Strategies

Zenon is a fully automatic black-box design with no user-selectable strategies.

Implementation

Zenon is implemented in
OCaml. Its most interesting data structure is the representation of first-order formulas and terms: they are hash-consed modulo alpha conversion.

Zenon is available from

    http://zenon-prover.org

Expected Competition Performance

The first-order reasoning part of Zenon has not changed much since 2010, and the handling of equality is still really bad, so Zenon is again expected to rank in the lower half of the field.

The FOF division now includes some CNF problems; this will probably hinder Zenon's performance somewhat.


References

Abr96
Abrial J-R. (1996), The B-Book, Assigning Programs to Meanings, Cambridge University Press.

AB+96
Andrews P.B., Bishop M., Issar S., Nesmith. D., Pfenning F., Xi H. (1996), TPS: A Theorem-Proving System for Classical Type Theory, Journal of Automated Reasoning 16(3), 321-353.

AB06
Andrews P.B., Brown C.E. (2006), TPS: A Hybrid Automatic-Interactive System for Developing Proofs, Journal of Applied Logic 4(4), 367-395.

And81
Andrews P.B. (1981), Theorem Proving via General Matings, Journal of the ACM 28(2), 193-214.

And89
Andrews P.B. (1989), On Connections and Higher-Order Logic, Journal of Automated Reasoning 5(3), 257-291.

BG98
Bachmair L., Ganzinger H. (1998), Equational Reasoning in Saturation-Based Theorem Proving, Bibel W., Schmitt P.H., Automated Deduction, A Basis for Applications, 352-397, Applied Logic Series I Foundations - Calculi and Methods(10), Kluwer Academic Publishers.

BB11
Backes J., Brown C.E. (2011), Analytic Tableaux for Higher-Order Logic with Choice, Journal of Automated Reasoning 47(4), 451-479.

BT07
Barrett C., Tinelli C. (2007), CVC3, Damm W., Hermanns H., Proceedings of the 19th International Conference on Computer Aided Verification (Berlin, Germany), 298-302, Lecture Notes in Computer Science 4590, Springer-Verlag.

BN+06
Barrett C. Nieuwenhuis R., Oliveras A., Tinelli C. (2006), Splitting on Demand in SAT Modulo Theories, Hermann M., Voronkov A., Proceedings of the 13th International Conference on Logic for Programming, Artificial Intelligence, and Reasoning (Phnom Penh, Cambodia), 512-526, Lecture Notes in Artificial Intelligence 4246.

BFT04
Baumgartner P., Fuchs A., Tinelli C. (2004), Darwin - A Theorem Prover for the Model Evolution Calculus, Sutcliffe G., Schulz S., Tammet T., Proceedings of the Workshop on Empirically Successful First Order Reasoning, 2nd International Joint Conference on Automated Reasoning (Cork, Ireland).

BFN96
Baumgartner P., Furbach U., Niemelä I. (1996), Hyper Tableaux, Alferes J., Pereira L., Orlowska E., Proceedings of JELIA'96: European Workshop on Logic in Artificial Intelligence (Evora, Portugal), 1-17, Lecture Notes in Artificial Intelligence 1126, Springer-Verlag.

BFP07
Baumgartner P., Furbach U., Pelzer B. (2007), Hyper Tableaux with Equality, Pfenning F., Proceedings of the 21st International Conference on Automated Deduction (Bremen, Germany), 492-507, Lecture Notes in Artificial Intelligence 4603, Springer-Verlag.

BPT11
Baumgartner P., Pelzer B., Tinelli C. (2011), Model Evolution with Equality - Revised and Implemented, Journal of Symbolic Computation, To appear.

BT03
Baumgartner P., Tinelli C. (2003), The Model Evolution Calculus, Baader F., Proceedings of the 19th International Conference on Automated Deduction (Miami, USA), 350-364, Lecture Notes in Artificial Intelligence 2741, Springer-Verlag.

BT05
Baumgartner P., Tinelli C. (2005), The Model Evolution Calculus with Equality, Nieuwenhuis R., Proceedings of the 20th International Conference on Automated Deduction (Tallinn, Estonia), 392-408, Lecture Notes in Artificial Intelligence 3632, Springer-Verlag.

BP+08
Benzmüller C., Paulson L., Theiss F., Fietzke A. (2008), LEO-II - A Cooperative Automatic Theorem Prover for Higher-Order Logic, Baumgartner P., Armando A., Gilles D., Proceedings of the 4th International Joint Conference on Automated Reasoning (Sydney, Australia), 162-170, Lecture Notes in Artificial Intelligence 5195, Springer-Verlag.

BRS08
Benzmüller C., Rabe F., Sutcliffe G. (2008), THF0 - The Core TPTP Language for Classical Higher-Order Logic, Baumgartner P., Armando A., Gilles D., Proceedings of the 4th International Joint Conference on Automated Reasoning (Sydney, Australia), 491-506, Lecture Notes in Artificial Intelligence 5195, Springer-Verlag.

BS+08
Benzmüller C., Sorge V., Jamnik M., Kerber M. (2008), Combined Reasoning by Automated Cooperation, Journal of Applied Logic 6(3), 318-342.

Ben99
Benzmüller C. (1999), Extensional Higher-order Paramodulation and RUE-Resolution, Ganzinger H., Proceedings of the 16th International Conference on Automated Deduction (Trento, Italy), 399-413, Lecture Notes in Artificial Intelligence 1632, Springer-Verlag.

Bib87
Bibel W. (1987), Automated Theorem Proving, Vieweg and Sohn.

BA98
Bishop M., Andrews P.B. (1998), Selectively Instantiating Definitions, Kirchner C., Kirchner H., Proceedings of the 15th International Conference on Automated Deduction (Lindau, Germany), 365-380, Lecture Notes in Artificial Intelligence 1421, Springer-Verlag.

Bis99
Bishop M. (1999), A Breadth-First Strategy for Mating Search, Ganzinger H., Proceedings of the 16th International Conference on Automated Deduction (Trento, Italy), 359-373, Lecture Notes in Artificial Intelligence 1632, Springer-Verlag.

BBP11
Blanchette J., Boehme S., Paulson L. (2011), Extending Sledgehammer with SMT Solvers, Bjorner N., Sofronie-Stokkermans V., Proceedings of the 23rd International Conference on Automated Deduction (Wroclaw, Poland), 116-130, Lecture Notes in Artificial Intelligence 6803, Springer-Verlag.

BN10
Blanchette J., Nipkow T. (2010), Nitpick: A Counterexample Generator for Higher-Order Logic Based on a Relational Model Finder, Kaufmann M. Paulson L., Proceedings of the 1st International Conference on Interactive Theorem Proving (Edinburgh, United Kingdom), 131-146, Lecture Notes in Computer Science 6172, Springer-Verlag.

BP+12
Blanchette J., Popescu A., Wand D., Weidenbach C. (2012), More SPASS with Isabelle, Beringer L., Felty A., Proceedings of Interactive Theorem Proving 2012 (Princeton, United States), Lecture Notes in Artificial Intelligence.

Ble71
Bledsoe W.W. (1971), Splitting and Reduction Heuristics in Automatic Theorem Proving, Artificial Intelligence 2, 55-77.

BDD07
Bonichon R., Delahaye D., Doligez D. (2007), Zenon : An Extensible Automated Theorem Prover Producing Checkable Proofs, Dershowitz N., Voronkov A., Proceedings of the 14th International Conference on Logic for Programming, Artificial Intelligence, and Reasoning (Yerevan, Armenia), 151-165, Lecture Notes in Artificial Intelligence 4790.

BHK07
Brauner P., Houtmann C., Kirchner C. (2007), Principles of Superdeduction, Ong L., Proceedings of the 22nd Annual IEEE Symposium on Logic in Computer Science (Wroclaw, Poland), 41-50, IEEE Computer Society Press.

BS09
Brown C.E., Smolka G. (2009), Terminating Tableaux for the Basic Fragment of Simple Type Theory, Giese M., Waaler A., Proceedings of the 18th International Conference on Automated Reasoning with Analytic Tableaux and Related Methods (Oslo, Norway), 138-151, Lecture Notes in Artificial Intelligence 5697, Springer-Verlag.

BS09
Brown C.E., Smolka G. (2009), Extended First-Order Logic, Nipkow T., Urban C., Proceedings of the 22nd International Conference on Theorem Proving in Higher Order Logics (Munich, Germany), 164-179, Lecture Notes in Computer Science 5674, Springer-Verlag.

BS10
Brown C.E., Smolka G. (2010), Analytic Tableaux for Simple Type Theory and its First-Order Fragment, Logical Methods in Computer Science 6(2).

Bro02
Brown C.E. (2002), Solving for Set Variables in Higher-Order Theorem Proving, Voronkov A., Proceedings of the 18th International Conference on Automated Deduction (Copenhagen, Denmark), 408-422, Lecture Notes in Artificial Intelligence 2392, Springer-Verlag.

Bro07
Brown C.E. (2007), Automated Reasoning in Higher-Order Logic: Set Comprehension and Extensionality in Church's Type Theory, Studies in Logic: Logic and Cognitive Systems 10, College Publications.

Bro11
Brown C.E. (2011), Reducing Higher-Order Theorem Proving to a Sequence of SAT Problems, Bjorner N., Sofronie-Stokkermans V., Proceedings of the 23rd International Conference on Automated Deduction (Wroclaw, Poland), 147-161, Lecture Notes in Artificial Intelligence 6803, Springer-Verlag.

Bro12
Brown C.E. (2012), Satallax: An Automated Higher-Order Prover (System Description), Gramlich B., Miller D., Sattler U., Proceedings of the 6th International Joint Conference on Automated Reasoning (Manchester, United Kingdom), To appear, Lecture Notes in Artificial Intelligence.

CLS11
Claessen K., Lilliestrom A., Smallbone N. (2011), Sort It Out with Monotonicity - Translating between Many-Sorted and Unsorted First-Order Logic, Bjorner N., Sofronie-Stokkermans V., Proceedings of the 23rd International Conference on Automated Deduction (Wroclaw, Poland), 207-221, Lecture Notes in Artificial Intelligence 6803, Springer-Verlag.

CS03
Claessen K., Sörensson N. (2003), New Techniques that Improve MACE-style Finite Model Finding, Baumgartner P., Fermueller C., Proceedings of the CADE-19 Workshop: Model Computation - Principles, Algorithms, Applications (Miami, USA).

dMB08
de Moura L., Bjorner N. (2008), Z3: An Efficient SMT Solver, Ramakrishnan C., Rehof J., Proceedings of the 14th International Conference on Tools and Algorithms for the Construction and Analysis of Systems (Budapest, Hungary), 337-340, Lecture Notes in Artificial Intelligence 4963, Springer-Verlag.

DHK03
Dowek G., Hardin T., Kirchner C. (2003), Theorem Proving Modulo, Journal of Automated Reasoning 31(1), 33-72.

ES04
E{\'e}n N., Sörensson N. (2004), An Extensible SAT-solver, Giunchiglia E., Tacchella A., Proceedings of the 6th International Conference on Theory and Applications of Satisfiability Testing (Santa Margherita Ligure, Italy), 502-518, Lecture Notes in Computer Science 2919, Springer-Verlag.

GK03
Ganzinger H., Korovin K. (2003), New Directions in Instantiation-Based Theorem Proving, Kolaitis P., Proceedings of the 18th IEEE Symposium on Logic in Computer Science (Ottawa, Canada), 55-64, IEEE Press.

GK04
Ganzinger H., Korovin K. (2004), Integrating Equational Reasoning into Instantiation-Based Theorem Proving, Marcinkowski J., Tarlecki A., Proceedings of the 18th International Workshop on Computer Science Logic, 13th Annual Conference of the EACSL (Karpacz, Poland), 71-84, Lecture Notes in Computer Science 3210, Springer-Verlag.

GSS11
Gebser M., Sabuncu O., Schaub T. (2011), An Incremental Answer Set Programming Based System for Finite Model Computation, AI Communications 24(2), 195-212.

HKV10
Hoder K., Kovacs L., Voronkov A. (2010), Interpolation and Symbol Elimination in Vampire, Giesl J., Haehnle R., Proceedings of the 5th International Joint Conference on Automated Reasoning (Edinburgh, United Kingdom), 188-195, Lecture Notes in Artificial Intelligence 6173.

HKV11
Hoder K., Kovacs L., Voronkov A. (2011), Invariant Generation in Vampire, Abdulla P., Leino R., Proceedings of the 17th International Conference on Tools and Algorithms for the Construction and Analysis of Systems (Saarbruecken, Germany), 60-64, Lecture Notes in Computer Science 6605, Springer-Verlag.

HV11
Hoder K., Voronkov A. (2011), Sine Qua Non for Large Theory Reasoning, Sofronie-Stokkermans V., Bj{\oe}rner N., Proceedings of the 23rd International Conference on Automated Deduction (Wroclaw, Poland), 299-314, Lecture Notes in Artificial Intelligence 6803, Springer-Verlag.

Iss90
Issar S. (1990), Path-Focused Duplication: A Search Procedure for General Matings, Dietterich T. Swartout W., Proceedings of the 8th National Conference on Artificial Intelligence (Boston, USA), 221-226, American Association for Artificial Intelligence / MIT Press.

JB+12
Jacquel M., Berkani K., Delahaye D., Dubois C. (2012), Tableaux Modulo Theories using Superdeduction: An Application to the Verification of B Proof Rules with the Zenon Automated Theorem Prover, Gramlich B., Miller D., Sattler U., Proceedings of the 6th International Joint Conference on Automated Reasoning (Manchester, United Kingdom), To appear, Lecture Notes in Artificial Intelligence.

KS10
Korovin K., Sticksel C. (2010), iProver-Eq - An Instantiation-Based Theorem Prover with Equality, Giesl J., Haehnle R., Proceedings of the 5th International Joint Conference on Automated Reasoning (Edinburgh, United Kingdom), 196-202, Lecture Notes in Artificial Intelligence 6173.

KS10
Korovin K., Sticksel C. (2010), Labelled Unit Superposition Calculi for Instantiation-based Reasoning, Fermüller C., Voronkov A., Proceedings of the 17th International Conference on Logic for Programming, Artificial Intelligence, and Reasoning (Yogyakarta, Indonesia), 459-473, Lecture Notes in Computer Science 6397, Springer-Verlag.

Kor08
Korovin K. (2008), iProver - An Instantiation-Based Theorem Prover for First-order Logic (System Description), Baumgartner P., Armando A., Gilles D., Proceedings of the 4th International Joint Conference on Automated Reasoning (Sydney, Australia), 292-298, Lecture Notes in Artificial Intelligence 5195.

Kor09
Korovin K. (2009), An Invitation to Instantiation-Based Reasoning: From Theory to Practice, Podelski A., Voronkov A., Wilhelm R., Volume in Memoriam of Harald Ganzinger, 163-166, Lecture Notes in Computer Science 5663, Springer-Verlag.

Kv+12
Kuehlwein D., van Laarhoven T., Tsivtsivadze E., Urban J., Heskes T. (2012), Overview and Evaluation of Premise Selection Techniques for Large Theory Mathematics, Gramlich B., Miller D., Sattler U., Proceedings of the 6th International Joint Conference on Automated Reasoning (Manchester, United Kingdom), To appear, Lecture Notes in Artificial Intelligence.

LS01
Letz R., Stenz G. (2001), Model Elimination and Connection Tableau Procedures, Robinson A., Voronkov A., Handbook of Automated Reasoning, 2015-2114, Elsevier Science.

Loe06
Loechner B. (2006), Things to Know When Implementing KBO, Journal of Automated Reasoning 36(4), 289-310.

Loe06
Loechner B. (2006), Things to Know When Implementing LBO, Journal of Artificial Intelligence Tools 15(1), 53-80.

Lov78
Loveland D.W. (1978), Automated Theorem Proving : A Logical Basis, Elsevier Science.

McC03
McCune W.W. (2003), Mace4 Reference Manual and Guide, ANL/MCS-TM-264, Argonne National Laboratory (Argonne, USA).

McC03
McCune W.W. (2003), Otter 3.3 Reference Manual, ANL/MSC-TM-263, Argonne National Laboratory (Argonne, USA).

McC97
McCune W.W. (1997), 33 Basic Test Problems: A Practical Evaluation of Some Paramodulation Strategies, Veroff R., Automated Reasoning and its Applications: Essays in Honor of Larry Wos, 71-114, MIT Press.

Mil87
Miller D. (1987), A Compact Representation of Proofs, Studia Logica 46(4), 347-370.

NOT06
Nieuwenhuis R., Oliveras A., Tinelli C. (2006), Solving SAT and SAT Modulo Theories: from an Abstract Davis-Putnam-Logemann-Loveland Procedure to DPLL(T), Journal of the ACM 53(6), 937-977.

NPW12
Nipkow T., Paulson L., Wenzel M., Isabelle/HOL: A Proof Assistant for Higher-Order Logic, http://www.cl.cam.ac.uk/research/hvg/Isabelle/dist/Isabelle/doc/tutorial.pdf.

Nip89
Nipkow T. (1989), Equational Reasoning in Isabelle, Science of Computer Programming 12(2), 123-149.

OB03
Otten J., Bibel W. (2003), leanCoP: Lean Connection-Based Theorem Proving, Journal of Symbolic Computation 36(1-2), 139-161.

OS10
Otten J., Sutcliffe G. (2010), Using the TPTP Language for Representing Derivations in Tableau and Connection Calculi, Konev B., Schmidt R., Schulz S., Proceedings of the Workshop on Practical Aspects of Automated Reasoning, 5th International Joint Conference on Automated Reasoning (Edinburgh, United Kingdom), 90-100.

Ott08
Otten J. (2008), {\sf leanCoP 2.0} and {\sf ileancop 1.2}: High Performance Lean Theorem Proving in Classical and Intuitionistic Logic, Baumgartner P., Armando A., Gilles D., Proceedings of the 4th International Joint Conference on Automated Reasoning (Sydney, Australia), 283-291, Lecture Notes in Artificial Intelligence 5195.

Ott10
Otten J. (2010), Restricting Backtracking in Connection Calculi, AI Communications 23(2-3), 159-182.

Pas01
Pastre D. (2001), Muscadet 2.3 : A Knowledge-based Theorem Prover based on Natural Deduction, Gore R., Leitsch A., Nipkow T., Proceedings of the International Joint Conference on Automated Reasoning (Siena, Italy), 685-689, Lecture Notes in Artificial Intelligence 2083, Springer-Verlag.

Pas02
Pastre D. (2002), Strong and Weak Points of the Muscadet Theorem Prover, AI Communications 15(2-3), 147-160.

Pas07
Pastre D. (2007), Complementarity of a Natural Deduction Knowledge-based Prover and Resolution-based Provers in Automated Theorem Proving, http://www.math-info.univ-paris5.fr/~pastre/compl-NDKB-RB.pdf.

Pas78
Pastre D. (1978), Automatic Theorem Proving in Set Theory, Artificial Intelligence 10, 1-27.

Pas89
Pastre D. (1989), Muscadet : An Automatic Theorem Proving System using Knowledge and Metaknowledge in Mathematics, Artificial Intelligence 38, 257-318.

Pas93
Pastre D. (1993), Automated Theorem Proving in Mathematics, Annals of Mathematics and Artificial Intelligence 8, 425-447.

PB10
Paulson L., Blanchette J. (2010), Three Years of Experience with Sledgehammer, a Practical Link between Automatic and Interactive Theorem Provers, Sutcliffe G., Ternovska E., Schulz S., Proceedings of the 8th International Workshop on the Implementation of Logics (Yogyakarta, Indonesia), To appear.

PN94
Paulson L.C., Nipkow T. (1994), Isabelle: A Generic Theorem Prover, Lecture Notes in Computer Science 828, Springer-Verlag.

Pau99
Paulson L. (1999), A Generic Tableau Prover and its Integration with Isabelle, Artificial Intelligence 5(3), 73-87.

PW07
Pelzer B., Wernhard C. (2007), System Description: E-KRHyper, Pfenning F., Proceedings of the 21st International Conference on Automated Deduction (Bremen, Germany), 508-513, Lecture Notes in Artificial Intelligence 4603, Springer-Verlag.

Pfe87
Pfenning F. (1987), Proof Transformations in Higher-Order Logic, PhD thesis, Carnegie-Mellon University (Pittsburg, USA).

PW06
Prevosto V., Waldmann U. (2006), SPASS+T, Sutcliffe G., Schmidt R., Schulz S., Proceedings of the FLoC'06 Workshop on Empirically Successful Computerized Reasoning, 3rd International Joint Conference on Automated Reasoning (Seattle, USA), 19-33, CEUR Workshop Proceedings 192.

RV01
Riazanov A., Voronkov A. (2001), Vampire 1.1 (System Description), Gore R., Leitsch A., Nipkow T., Proceedings of the International Joint Conference on Automated Reasoning (Siena, Italy), 376-380, Lecture Notes in Artificial Intelligence 2083, Springer-Verlag.

RV02
Riazanov A., Voronkov A. (2002), The Design and Implementation of Vampire, AI Communications 15(2-3), 91-110.

Rue08
Rümmer P. (2008), A Constraint Sequent Calculus for First-Order Logic with Linear Integer Arithmetic, Cervesato I., Veith H., Voronkov A., Proceedings of the 15th International Conference on Logic for Programming Artificial Intelligence and Reasoning (Doha, Qatar), 274-289, Lecture Notes in Artificial Intelligence 5330, Springer-Verlag.

Rue12
Rümmer P. (2012), A Constraint Sequent Calculus for First-Order Logic with Linear Integer Arithmetic, Bjorner N., Voronkov A., Proceedings of the 18th International Conference on Logic for Programming Artificial Intelligence and Reasoning (Merida, Venezuela), 274-289, Lecture Notes in Artificial Intelligence 7180, Springer-Verlag.

Sch02
Schulz S. (2002), E: A Brainiac Theorem Prover, AI Communications 15(2-3), 111-126.

Sch04
Schulz S. (2004), Simple and Efficient Clause Subsumption with Feature Vector Indexing, Sutcliffe G., Schulz S., Tammet T., Proceedings of the Workshop on Empirically Successful First Order Reasoning, 2nd International Joint Conference on Automated Reasoning (Cork, Ireland).

Sch12
Schulz S. (2012), Fingerprint Indexing for Paramodulation and Rewriting, Gramlich B., Miller D., Sattler U., Proceedings of the 6th International Joint Conference on Automated Reasoning (Manchester, United Kingdom), To appear, Lecture Notes in Artificial Intelligence.

SB12
Sultana N., Benzmüller C. (2012), Automated Reasoning in Higher-Order Logic using the TPTP THF Infrastructure, Ternovska E., Korovin K., Schulz S., Proceedings of the 9th International Workshop on the Implementation of Logics (Merida, Venezuela).

SB10
Sutcliffe G., Benzmüller C. (2010), Automated Reasoning in Higher-Order Logic using the TPTP THF Infrastructure, Journal of Formalized Reasoning 3(1), 1-27.

TJ07
Torlak E., Jackson D. (2007), Kodkod: A Relational Model Finder, Grumberg O., Huth M., Proceedings of the 13th International Conference on Tools and Algorithms for the Construction and Analysis of Systems (Braga, Portugal), 632-647, Lecture Notes in Computer Science 4424, Springer-Verlag.

Ull89
Ullman J. (1989), Principles of Database and Knowledge-Base Bystems, Computer Science Press, Inc..

US+08
Urban J., Sutcliffe G., Pudlak P., Vyskocil J. (2008), MaLARea SG1: Machine Learner for Automated Reasoning with Semantic Guidance, Baumgartner P., Armando A., Gilles D., Proceedings of the 4th International Joint Conference on Automated Reasoning (Sydney, Australia), 441-456, Lecture Notes in Artificial Intelligence 5195, Springer-Verlag.

Urb07
Urban J. (2007), MaLARea: a Metasystem for Automated Reasoning in Large Theories, Urban J., Sutcliffe G., Schulz S., Proceedings of the CADE-21 Workshop on Empirically Successful Automated Reasoning in Large Theories (Bremen, German), 45-58, CEUR Workshop Proceedings 257.

Wer03
Wernhard C. (2003), System Description: KRHyper, Fachberichte Informatik 14--2003, Universität Koblenz-Landau (Koblenz, Germany).

Zim07
Zimmer S. (2007), Intelligent Combination of a First Order Theorem Prover and SMT Procedures, MSc thesis, Saarland University (Saarbruecken, Germany).