Entrants' System Descriptions

Beagle 0.4
cocATP 0.1.8
CVC4 1.2
E 1.8
E-KRHyper 1.4
E-MaLeS 1.2
iProver 0.9 (CASC-J6 EPR division winner)
iProver 1.0
iProver-Eq 0.85
iProverModulo 0.7-0.2
Isabelle 2012 (CASC-J6 THF division winner)
Isabelle 2013
LEO-II 1.6.0
MaLARea 0.5
Muscadet 4.3
Nitrox 2013
Paradox 3.0 (CASC-J6 FNT division winner)
PEPR 0.0ps
Princess 2012-06-04 (CASC-J6 TFA division winner)
Prover9 2011-11A
Satallax 2.7
Satallax-MaLeS 1.2
SPASS+T 2.2.19
TEMPLAR::leanCoP 0.8
TPS 3.120601S1b
Vampire 2.6 (CASC-J6 FOF and LTB division winner)

Beagle 0.4

Architecture

Beagle is an automated theorem prover for sorted first-order logic with equality over built-in theories. The theories currently supported are linear integer, linear rational and linear real arithmetic. It accepts formulas in the FOF, TFF, and TFF-INT formats of the TPTP syntax.

A first-order prover, beagle accepts arbitrarily quantified input formulas and converts them to clause normal form. The core reasoning component implements the Hierarchic Superposition Calculus with Weak Abstraction [BW13]. That calculus generalizes the well-known superposition calculus by integrating black-box reasoning for specific theories. For the theories mentioned above, beagle combines quantifier elimination procedures and other solvers to discard proof obligations over these theories.

Strategies

Beagles uses the well-know Discount loop for saturating a clause set under the calculus' inference rules. Simplification techniques include standard ones, such as subsumption deletion, rewriting by ordered unit equations, tautology deletion. It also includes theory specific simplification rules for evaluating ground (sub)terms, and for exploiting cancellation laws and properties of neutral elements, among others.

Beagle features a splitting rule for clauses that can be divided into variable disjoint parts. Splitting is particularly useful in combination with the quantifier elimination procedure mentioned above, as the theory reasoner then need to deal with conjunctions of quantifier-free unit clauses only.

Beagle uses a single, uniform strategy for every problem.

Implementation

Beagle has been implemented in Scala. It is a full implementation of the calculus mentioned above, albeit a slow one. Fairness is achieved through a combination of measuring clause lengths, depths and their derivation-age.

Beagle's web site is

    http://users.cecs.anu.edu.au/~baumgart/systems/beagle/

Expected Competition Performance

Beagle is implemented in a straightforward way and would benefit from optimized data structures. We do not expect it to come in among the first.

cocATP 0.1.8

Cristobal Camarero
University of Cantabria, Spain

Architecture

cocATP is a Coq-inspired [Coq] automated theorem prover made on the free time of the author. It implements (the non-inductive) part of Coq's logic (calculus of constructions) and syntax. The proof terms it creates are accepted as proofs by Coq by the addition of a few definitions (substituting the respective inductive definitions of Coq). As in Coq, equality and logical connectives other than implication (which is a dependent product of the logic) are defined adding the proper axioms. The reasoning is tried to be done the more general possible, avoiding to give any special treatment to both equality and logical connectives.

cocATP is currently very underdeveloped. Recently the axiom of choice and excluded middle have been added, but the latter is not actually exploited. The functional extensionality axiom have some records of appropriate usage. The propositional extensionality axiom is not included yet.

In addition to axioms, cocATP includes a little library of basic lemmas, most of which with the proof that cocATP has constructed.

CVC4 1.2

Andrew Reynolds
University of Iowa, USA

Architecture

CVC4 [BC+11] is an SMT solver based on the DPLL(T) architecture [NOT06] that includes built-in support for many theories including linear arithmetic, arrays, bit vectors and datatypes. Additionally, it supports universally quantified formulas. When quantified formulas are present, CVC4 typically uses E-matching for answering unsatisfiable, and a finite model finding method for answering satisfiable.

Like other SMT solvers, CVC4 treats quantified formulas using a two-tiered approach. First, quantified formulas are replaced by fresh boolean predicates and the ground theory solver(s) are used in conjunction with the underlying SAT solver to determine satisfiability. If the problem is unsatisfiable at the ground level, then the solver answers "unsatisfiable". Otherwise, the quantifier instantiation module is invoked, and will either add instances of quantified formulas to the problem, answer "satisfiable", or return unknown.

The finite model finding has been developed to target problems containing background theories, whose quantification is limited to finite and uninterpreted sorts. When finite model finding is turned on, CVC4 uses a ground theory of finite cardinality constraints that minimizes the number of ground equivalence classes, as described in [RT+13]. When the problem is satisfiable at the ground level, a candidate model is constructed that contains complete interpretations for all predicate and function symbols. Quantifier instantiation strategies are then invoked to add instances of quantified formulas that are in conflict with the candidate model, as described in [4]. If no instances are added, then the solver reports "satisfiable".

Strategies

For handling theorems, CVC4 uses a variation of E-matching. Quantifier instantiation based on E-matching is performed lazily in CVC4, after the ground solver reports satisfiable.

For handling non-theorems, the finite model finding feature of CVC4 will use a small number of orthogonal quantifier instantiation strategies. Since CVC4 with finite model finding is also capable of answering unsatisfiable, it will be used as a strategy for theorems as well.

Implementation

CVC4 is implemented in C++. The code is available from

    https://github.com/CVC4

Expected Competition Performance

This is the first year CVC4 has entered the FOF division, where it will likely finish around the middle. The finite model finding feature in CVC4 has undergone a lot of development, so it should perform better than last year.

E 1.8

Stephan Schulz
Technische Universität München, Germany

Architecture

E 1.8pre [Sch02] is a purely equational theorem prover for full first-order logic with equality. It consists of an (optional) clausifier for pre-processing full first-order formulae into clausal form, and a saturation algorithm implementing an instance of the superposition calculus with negative literal selection and a number of redundancy elimination techniques. E is based on the DISCOUNT-loop variant of the given-clause algorithm, i.e., a strict separation of active and passive facts. No special rules for non-equational literals have been implemented. Resolution is effectively simulated by paramodulation and equality resolution.

For the LTB divisions, a control program uses a SInE-like analysis to extract reduced axiomatizations that are handed to several instances of E.

Strategies

Proof search in E is primarily controlled by a literal selection strategy, a clause evaluation heuristic, and a simplification ordering. The prover supports a large number of pre-programmed literal selection strategies. Clause evaluation heuristics can be constructed on the fly by combining various parameterized primitive evaluation functions, or can be selected from a set of predefined heuristics. Clause evaluation heuristics are based on symbol-counting, but also take other clause properties into account. In particular, the search can prefer clauses from the set of support, or containing many symbols also present in the goal. Supported term orderings are several parameterized instances of Knuth-Bendix-Ordering (KBO) and Lexicographic Path Ordering (LPO).

For CASC-24, E implements a new strategy-scheduling automatic mode. The total CPU time available is broken into 5 (unequal) time slices. For each time slice, the problem is classified into one of several classes, based on a number of simple features (number of clauses, maximal symbol arity, presence of equality, presence of non-unit and non-Horn clauses,...). For each class, a schedule of strategies is greedily constructed from experimental data as follows: The first strategy assigned to a schedule is the the one that solves the most problems from this class in the first time slice. Each subsequent strategy is selected based on the number of solutions on problems not already solved by a preceding strategy.

About 170 different strategies have been evaluated on all untyped first-order problems from TPTP 5.4.0, and about 140 of these strategies are used in the automatic mode.

Implementation

E is build around perfectly shared terms, i.e. each distinct term is only represented once in a term bank. The whole set of terms thus consists of a number of interconnected directed acyclic graphs. Term memory is managed by a simple mark-and-sweep garbage collector. Unconditional (forward) rewriting using unit clauses is implemented using perfect discrimination trees with size and age constraints. Whenever a possible simplification is detected, it is added as a rewrite link in the term bank. As a result, not only terms, but also rewrite steps are shared. Subsumption and contextual literal cutting (also known as subsumption resolution) is supported using feature vector indexing [Sch04]. Superposition and backward rewriting use fingerprint indexing [Sch12], a new technique combining ideas from feature vector indexing and path indexing. Finally, LPO and KBO are implemented using the elegant and efficient algorithms developed by Bernd Löchner in [Loe06,Loe06]. The prover and additional information are available from

    http://www.eprover.org

Expected Competition Performance

E 1.8 has slightly better strategies than previous versions, and now can produce proof objects internally and much more efficiently. The system is expected to perform well in most proof classes, but will at best complement top systems in the disproof classes.

E-KRHyper 1.4

Björn Pelzer
University Koblenz-Landau, Germany

Architecture

E-KRHyper [PW07] is a theorem proving and model generation system for first-order logic with equality. It is an implementation of the E-hyper tableau calculus [BFP07], which integrates a superposition-based handling of equality [BG98] into the hyper tableau calculus [BFN96]. The system is an extension of the KRHyper theorem prover [Wer03], which implements the original hyper tableau calculus.

An E-hyper tableau is a tree whose nodes are labeled with clauses and which is built up by the application of the inference rules of the E-hyper tableau calculus. The calculus rules are designed such that most of the reasoning is performed using positive unit clauses. Splitting is done without rigid variables. Instead, variables which would be shared between branches are prevented by ground substitutions, which are guessed from the Herbrand universe and constrained by rewrite rules. Redundancy rules allow the detection and removal of clauses that are redundant with respect to a branch. The hyper extension inference from the original hyper tableau calculus is equivalent to a series of E-hyper tableau calculus inference applications. Therefore the implementation of the hyper extension in KRHyper by a variant of semi-naive evaluation [Ull89] is retained in E-KRHyper, where it serves as a shortcut inference for the resolution of non-equational literals.

Strategies

E-KRHyper has traditionally used a uniform search strategy for all problems, based on the aforementioned semi-naive evaluation. As of version 1.4 the prover also contains a prototypical implementation of the common Given-Clause-Algorithm ("Otter-loop") [McC03], which may be used as an alternative on problems of a certain size. The E-hyper tableau is generated depth-first, with E-KRHyper always working on a single branch. Refutational completeness and a fair search control are ensured by an iterative deepening strategy with a limit on the maximum term weight of generated clauses. In the LTB division E-KRHyper sequentially tries three axiom selection strategies: an implementation of Krystof Hoder's SInE algorithm, another incomplete selection based on the CNF representations of the axioms, and finally the complete axiom set.

Implementation

E-KRHyper is implemented in the functional/imperative language OCaml. The system runs on Unix and MS-Windows platforms. the GNU Public License from the E-KRHyper website at: The system accepts input in the TPTP-format and in the TPTP-supported Protein-format. The calculus implemented by E-KRHyper works on clauses, so first order formula input is converted into CNF by an algorithm similar to the one used by Otter [McC03], with some additional connector literals to prevent explosive clause growth when dealing with DNF-like structures. E-KRHyper operates on an E-hyper tableau which is represented by linked node records. Several layered discrimination-tree based indexes (both perfect and non-perfect) provide access to the clauses in the tableau and support backtracking.

E-KRHyper is available from

    http://www.uni-koblenz.de/~bpelzer/ekrhyper

Expected Competition Performance

There has been some streamlining regarding memory consumption, but overall we expect no big change over last year's performance.

E-MaLeS 1.2

Daniel Kühlwein¹, Josef Urban¹, Stephan Schulz²
¹Radboud University Nijmegen, The Netherlands
²Technische Universität München, Germany

Architecture

E-MaLeS 1.2 is the result of applying the MaLeS 1.2 framework to E prover (version 1.7) [Sch02].

MaLeS (Machine Learning of Strategies) is a framework for automatic tuning of ATPs. It combines strategy finding (similar to ParamILS [HH+09] and BliStr [Urb13]) with strategy scheduling (similar to SatZilla [XH+08]). A description of E can be found above.

Strategies

Given a set of problems, MaLeS 1.2 finds good parameter settings by using a local random search algorithm. The found settings are stored as strategies. For each strategy, MaLeS learns a function that given a new problem predicts the time the ATP needs to solve this problem when using this particular strategy. When trying to solve a new problem, MaLeS uses these prediction functions to create a strategy schedule.

The learning algorithm is described in [KUS13]. Further information can be found on the project website.

Implementation

MaLeS is implemented in python, using the numpy and scipy libraries. MaLeS is open source and available from

    https://code.google.com/p/males/

E prover is available from

 http://www.eprover.org

Expected Competition Performance

E-MaLeS 1.2 should perform better than E 1.7.

iProver 0.9

Konstantin Korovin
University of Manchester, United Kingdom

Architecture

iProver is an automated theorem prover based on an instantiation calculus Inst-Gen [GK03,Kor09] which is complete for first-order logic. One of the distinctive features of iProver is a modular combination of first-order reasoning with ground reasoning. In particular, iProver currently integrates MiniSat [ES04] for reasoning with ground abstractions of first-order clauses. In addition to instantiation, iProver implements ordered resolution calculus and a combination of instantiation and ordered resolution; see [Kor08] for the implementation details. The saturation process is implemented as a modification of a given clause algorithm. iProver uses non-perfect discrimination trees for the unification indexes, priority queues for passive clauses, and a compressed vector index for subsumption and subsumption resolution (both forward and backward). The following redundancy eliminations are implemented: blocking non-proper instantiations; dismatching constraints [GK04,Kor08]; global subsumption [Kor08]; resolution-based simplifications and propositional-based simplifications. A compressed feature vector index is used for efficient forward/backward subsumption and subsumption resolution. Equality is dealt with (internally) by adding the necessary axioms of equality with an option of using Brand's transformation. In the LTB division, iProver-SInE uses axiom selection based on the SInE algorithm [HV11] as implemented in Vampire [HKV10], i.e., axiom selection is done by Vampire and proof attempts are done by iProver.

Major additions in the current version are:

answer computation,
several modes for model output using first-order definitions in term algebra,
Brand's transformation.

Strategies

iProver has around 40 options to control the proof search including options for literal selection, passive clause selection, frequency of calling the SAT solver, simplifications and options for combination of instantiation with resolution. At CASC iProver will execute a small number of fixed schedules of selected options depending on general syntactic properties such as Horn/non-Horn, equational/non-equational, and maximal term depth.

Implementation

iProver is implemented in OCaml and for the ground reasoning uses MiniSat. iProver accepts FOF and CNF formats, where Vampire [HKV10] is used for clausification of FOF problems.

iProver is available from

    http://www.cs.man.ac.uk/~korovink/iprover/

Expected Competition Performance

iProver 0.9 is the CASC-23 EPR division winner.

iProver 1.0

Konstantin Korovin, Christoph Sticksel
University of Manchester, United Kingdom

Architecture

iProver is an automated theorem prover based on an instantiation calculus Inst-Gen [GK03,Kor13] which is complete for first-order logic. iProver combines first-order reasoning with ground reasoning for which it uses MiniSat [ES04] and was recently extended with PicoSAT [Bie08] and Lingeling [Bie12] (only MiniSat will be used at this CASC). iProver also combines instantiation with ordered resolution; see [Kor08] for the implementation details. The proof search is implemented using a saturation process based on the given clause algorithm. iProver uses non-perfect discrimination trees for the unification indexes, priority queues for passive clauses, and a compressed vector index for subsumption and subsumption resolution (both forward and backward). The following redundancy eliminations are implemented: blocking non-proper instantiations; dismatching constraints [GK04,Kor08]; global subsumption [Kor08]; resolution-based simplifications and propositional-based simplifications. A compressed feature vector index is used for efficient forward/backward subsumption and subsumption resolution. Equality is dealt with (internally) by adding the necessary axioms of equality with an option of using Brand's transformation. In the LTB division, iProver uses axiom selection based on the SInE algorithm [HV11] as implemented in Vampire [HKV11], i.e., axiom selection is done by Vampire and proof attempts are done by iProver.

Some of iProver features are summarised below.

proof extraction for both instantiation and resolution,
model representation, using first-order definitions in term algebra,
answer substitutions,
semantic filtering,
type inference, monotonic [CLS11] and non-cyclic types,
Brand's transformation.

Type inference is targeted at improving finite model finding and symmetry breaking. Semantic filtering is used in preprocessing to eliminated irrelevant clauses. Proof extraction is challenging due to simplifications such global subsumption which involve global reasoning with the whole clause set and can be computationally expensive.

Strategies

iProver has around 60 options to control the proof search including options for literal selection, passive clause selection, frequency of calling the SAT solver, simplifications and options for combination of instantiation with resolution. At CASC iProver will execute a small number of fixed schedules of selected options depending on general syntactic properties such as Horn/non-Horn, equational/non-equational, and maximal term depth. The strategy for satisfiable problems (FNT division) includes finite model finding.

Implementation

iProver is implemented in OCaml and for the ground reasoning uses MiniSat [ES04]. iProver accepts FOF and CNF formats. Vampire [HKV11,HK+12] is used for proof-producing clausification of FOF problems as well as for axiom selection [HV11] in the LTB division.

iProver is available from

    http://www.cs.man.ac.uk/~korovink/iprover/

Expected Competition Performance

Compared to the last year, iProver had a number of changes in internal datastructures affecting mainly extensibility of the code. We expect similar performance in the EPR division and good performance on satisfiable problems (FNT division).

iProver-Eq 0.85

Christoph Sticksel, Konstantin Korovin
University of Iowa, USA

Architecture

iProver-Eq [KS10] extends the iProver system [Kor08] with built-in equational reasoning, along the lines of [GK04]. As in the iProver system, first-order reasoning is combined with ground satisfiability checking where the latter is delegated to an off-the-shelf ground solver.

iProver-Eq consists of three core components: i) ground reasoning by an SMT solver, ii) first-order equational reasoning on literals in a candidate model by a labelled unit superposition calculus [KS10,KS10] and iii) instantiation of clauses with substitutions obtained by ii).

Given a set of first-order clauses, iProver-Eq first abstracts it to a set of ground clauses which are then passed to the ground solver. If the ground abstraction is unsatisfiable, then the set of first-order clauses is also unsatisfiable. Otherwise, literals are selected from the first-order clauses based on the model of the ground solver. The labelled unit superposition calculus checks whether selected literals are conflicting. If they are conflicting, then clauses are instantiated such that the ground solver has to refine its model in order to resolve the conflict. Otherwise, satisfiability of the initial first-order clause set is shown.

Clause selection and literal selection in the unit superposition calculus are implemented in separate given clause algorithms. Relevant substitutions are accumulated in labels during unit superposition inferences and then used to instantiate clauses. For redundancy elimination iProver-Eq uses demodulation, dismatching constraints and global subsumption. In order to efficiently propagate redundancy elimination from instantiation into unit superposition, we implemented different representations of labels based on sets, AND/OR-trees and OBDDs. Non-equational resolution and equational superposition inferences provide further simplifications.

Strategies

Proof search options in iProver-Eq control clause and literal selection in the respective given clause algorithms. Equally important is the global distribution of time between the inference engines and the ground solver. At CASC, iProver-Eq will execute a fixed schedule of selected options.

If no equational literals occur in the input, iProver-Eq falls back to the inference rules of iProver, otherwise the latter are disabled and only unit superposition is used. If all clauses are unit equations, no instances need to be generated and the calculus is run without the otherwise necessary bookkeeping.

Implementation

iProver-Eq is implemented in OCaml and the CVC4 SMT solver [BC+11] for the ground reasoning in the equational case and MiniSat [ES04] in the non-equational case. iProver-Eq accepts FOF and CNF formats, where Vampire [HKV11,HK+12] is used for clausification and preprocessing of FOF problems.

iProver-Eq is available from

    http://www.divms.uiowa.edu/~csticksel/iprover-eq

Expected Competition Performance

iProver-Eq has seen some optimisations from the version in the previous CASCs, in particular the integration of CVC4. We expect reasonably good performance in all divisions, including the EPR divisions where instantiation-based methods are particularly strong.

iProverModulo 0.7-0.2

Guillaume Burel
ENSIIE/Cedric, France

Architecture

iProverModulo [Bur11] is a patch to iProver [Kor08] to integrate polarized resolution modulo [Dow10]. Polarized resolution modulo consists in presenting the theory in which the problem has to be solved by means of polarized rewriting rules instead of axioms. It can also be seen as a combination of the set-of-support strategy and selection of literals.

The integration of polarized resolution modulo in iProver only affects its ordered resolution calculus, so that the instantiation calculus is untouched.

To be able to use polarized resolution modulo, the theory has to be presented as a rewriting system. A theory preprocessor is therefore used to convert the axioms of the problem into such a rewriting system.

Strategies

A theory preprocessor is first run to transform the formulas of the problem whose role is "axiom" into polarized rewriting rules. This preprocessor offers a set of strategies to that purpose. For the competition, the Equiv(ClausalAll) and the ClausalAll strategies will be used. The former strategy may be incomplete, depending on the shape of the axioms, so that the prover may give up in certain cases. However, it shows interesting results on some problems. The second strategy should be complete, at least when equality is not involved. The rewriting system for the first strategy is tried for half the time given for the problem, then the prover is restarted with the second strategy if no proof has been found.

The patched version of iProver is run on the remaining formulas modulo the rewriting rules produced by the preprocessor. No scheduling is performed. To be compatible with polarized resolution modulo, literals are selected only when they are maximal w.r.t. a KBO ordering, and orphans are not eliminated. To take advantage of polarized resolution modulo, the resolution calculus is triggered more often than the instantiation calculus, on the contrary to the original iProver.

Normalization of clauses w.r.t. the term rewriting system produced by the preprocessor is performed by transforming these rules into an OCaml program, compiling this program, and dynamically linking it with the prover.

Implementation

iProverModulo is available as a patch to iProver. The most important additions are the plugin-based normalization engine and the handling of polarized rewriting rules.

iProverModulo is available from

    http://www.ensiie.fr/~guillaume.burel/blackandwhite_iProverModulo.html.en

The theory preprocessor is available independently from iProverModulo from

    http://www.ensiie.fr/~guillaume.burel/blackandwhite_autotheo.html.en

Both of them are written in OCaml.

Expected Competition Performance

Although iProver Modulo is based on iProver, we do not expect to perform better than the original iProver. This can be justified by the following arguments: First, iProver Modulo is based on a somewhat old version of iProver (v0.7). Second, the theory preprocessor may produce rewriting systems for which polarized resolution modulo is incomplete. Third, we had to switch off scheduling to cope with the constraints of polarized resolution modulo. Nevertheless, we expect to solve some problems that are difficult for other provers.

Isabelle 2012

Jasmin C. Blanchette¹, Lawrence C. Paulson², Tobias Nipkow¹, Makarius Wenzel³
¹Technische Universität München, Germany, ²University of Cambridge, United Kingdom, ³Université Paris Sud, France

Architecture

Isabelle/HOL 2012 [NPW12] is the higher-order logic incarnation of the generic proof assistant Isabelle. Isabelle/HOL provides several automatic proof tactics, notably an equational reasoner [Nip89], a classical reasoner [PN94], and a tableau prover [Pau99]. It also integrates external first- and higher-order provers via its subsystem Sledgehammer [PB10,BBP11].

Previous versions of Isabelle relied on the TPTP2X tool to translate TPTP files to Isabelle theory files. Starting this year, Isabelle includes a parser for the TPTP syntaxes CNF, FOF, TFF0, and THF0 as well as TPTP versions of its popular tools, invokable on the command line as isabelle tptp_tool max_secs file.p. For example:

isabelle tptp_isabelle_demo 100 SEU/SEU824^3.p

Two versions of Isabelle participate this year. The demo version includes its competitors LEO-II [BP+08] and Satallax [Bro12] as Sledgehammer backends, whereas the competition version leaves them out.

Strategies

The Isabelle tactic submitted to the competition simply tries the following tactics sequentially:

sledgehammer

Invokes the following sequence of provers as oracles via Sledgehammer:

satallax - Satallax 2.4 [Bro12] (demo only);
leo2 - LEO-II 1.3.2 [BP+08] (demo only);
spass - SPASS 3.8ds [BP+12];
vampire - Vampire 1.8 (revision 1435) [RV02];
e - E 1.4 [Sch04];
z3_tptp - Z3 3.2 with TPTP syntax [dMB08].

nitpick

For problems involving only the type $o of Booleans, checks whether a finite model exists using Nitpick [BN10].

simp

Performs equational reasoning using rewrite rules [Nip89].

blast

Searches for a proof using a fast untyped tableau prover and then attempts to reconstruct the proof using Isabelle tactics [Pau99].

auto+spass

Combines simplification and classical reasoning [PN94] under one roof; then invoke Sledgehammer with SPASS on any subgoals that emerge.

z3

Invokes the SMT solver Z3 3.2 [dMB08].

cvc3

Invokes the SMT solver CVC3 2.2 [BT07].

fast

Searches for a proof using sequent-style reasoning, performing a depth-first search [PN94]. Unlike blast, it construct proofs directly in Isabelle. That makes it slower but enables it to work in the presence of the more unusual features of HOL, such as type classes and function unknowns.

best

force

Similar to auto, but more exhaustive.

meson

Implements Loveland's MESON procedure [Lov78]. Constructs proofs directly in Isabelle.

fastforce

Combines fast and force.

Implementation

Isabelle is a generic theorem prover written in Standard ML. Its meta-logic, Isabelle/Pure, provides an intuitionistic fragment of higher-order logic. The HOL object logic extends pure with a more elaborate version of higher-order logic, complete with the familiar connectives and quantifiers. Other object logics are available, notably FOL (first-order logic) and ZF (Zermelo-Fraenkel set theory).

The implementation of Isabelle relies on a small LCF-style kernel, meaning that inferences are implemented as operations on an abstract theorem datatype. Assuming the kernel is correct, all values of type theorem are correct by construction.

Most of the code for Isabelle was written by the Isabelle teams at the University of Cambridge and the Technische Universität München. Isabelle/HOL is available for all major platforms under a BSD-style license from

    http://www.cl.cam.ac.uk/research/hvg/Isabelle

Expected Competition Performance

Isabelle 2012 is the CASC-J6 THF division winner.

Isabelle 2013

Architecture

Isabelle/HOL 2013 [NPW12] is the higher-order logic incarnation of the generic proof assistant Isabelle. Isabelle/HOL provides several automatic proof tactics, notably an equational reasoner [Nip89], a classical reasoner [PN94], and a tableau prover [Pau99]. It also integrates external first- and higher-order provers via its subsystem Sledgehammer [PB10,BBP11].

Previous versions of Isabelle relied on the TPTP2X tool to translate TPTP files to Isabelle theory files. Isabelle includes a parser for the TPTP syntaxes CNF, FOF, TFF0, and THF0 as well as TPTP versions of its popular tools, invokable on the command line as isabelle tptp_tool max_secs file.p. For example:

isabelle tptp_isabelle_demo 100 SEU/SEU824^3.p

Strategies

The Isabelle tactic submitted to the competition simply tries the following tactics sequentially:

sledgehammer

Invokes the following sequence of provers as oracles via Sledgehammer:

spass - SPASS 3.8ds [BP+12];
vampire - Vampire 1.8 (revision 1435) [RV02];
e - E 1.4 [Sch04];
z3_tptp - Z3 3.2 with TPTP syntax [dMB08].

nitpick

For problems involving only the type $o of Booleans, checks whether a finite model exists using Nitpick [BN10].

simp

Performs equational reasoning using rewrite rules [Nip89].

blast

Searches for a proof using a fast untyped tableau prover and then attempts to reconstruct the proof using Isabelle tactics [Pau99].

auto+spass

Combines simplification and classical reasoning [PN94] under one roof; then invoke Sledgehammer with SPASS on any subgoals that emerge.

z3

Invokes the SMT solver Z3 3.2 [dMB08].

cvc3

Invokes the SMT solver CVC3 2.2 [BT07].

fast

best

force

Similar to auto, but more exhaustive.

meson

Implements Loveland's MESON procedure [Lov78]. Constructs proofs directly in Isabelle.

fastforce

Combines fast and force.

Implementation

    http://www.cl.cam.ac.uk/research/hvg/Isabelle

Expected Competition Performance

Isabelle won last year by a thin margin. This year: first or second place.

LEO-II 1.6.0

Christoph Benzmüller
Freie Universität Berlin

Architecture

LEO-II [BP+08], the successor of LEO [BK98], is a higher-order ATP system based on extensional higher-order resolution. More precisely, LEO-II employs a refinement of extensional higher-order RUE resolution [Ben99]. LEO-II is designed to cooperate with specialist systems for fragments of higher-order logic. By default, LEO-II cooperates with the first-order ATP system E [Sch02]. LEO-II is often too weak to find a refutation amongst the steadily growing set of clauses on its own. However, some of the clauses in LEO-II's search space attain a special status: they are first-order clauses modulo the application of an appropriate transformation function. Therefore, LEO-II launches a cooperating first-order ATP system every n iterations of its (standard) resolution proof search loop (e.g., 10). If the first-order ATP system finds a refutation, it communicates its success to LEO-II in the standard SZS format. Communication between LEO-II and the cooperating first-order ATP system uses the TPTP language and standards.

Strategies

LEO-II employs an adapted "Otter loop". Moreover, LEO-II uses some basic strategy scheduling to try different search strategies or flag settings. These search strategies also include some different relevance filters.

Implementation

LEO-II is implemented in OCaml 4, and its problem representation language is the TPTP THF language [BRS08]. In fact, the development of LEO-II has largely paralleled the development of the TPTP THF language and related infrastructure [SB10]. LEO-II's parser supports the TPTP THF0 language and also the TPTP languages FOF and CNF.

Unfortunately the LEO-II system still uses only a very simple sequential collaboration model with first-order ATPs instead of using the more advanced, concurrent and resource-adaptive OANTS architecture [BS+08] as exploited by its predecessor LEO.

The LEO-II system is distributed under a BSD style license, and it is available from

    http://www.leoprover.org

Expected Competition Performance

Recent improvements of LEO-II include a revised ATP interface, new translations into first-order logic, rule support for the axiom of choice, detection of defined equality, modified extensional unification, and more flexible strategy scheduling. LEO-II can now also classify some Satisfiable problems and detect some CounterSatisfiable problems. However, these improvements are very likely not strong enough to attack Satallax or Isabelle in CASC.

MaLARea 0.5

Josef Urban¹, Stephan Schulz², Jiri Vyskocil³
¹Radboud University Nijmegen, The Netherlands
²Technische Universität München, Germany
³Czech Technical University, Czech Republic

Architecture

MaLARea 0.5 [Urb07,US+08] is a metasystem for ATP in large theories where symbol and formula names are used consistently. It uses several deductive systems (now E, SPASS, Vampire, Paradox, Mace), as well as complementary AI techniques like machine learning (the SNoW system) based on symbol-based similarity, model-based similarity, term-based similarity, and obviously previous successful proofs. The version for CASC 2013 will mainly use E prover with the BliStr [Urb13] large-theory strategies, possibly also Prover9, Mace and Paradox. The premise selection methods will likely also use the distance-weighted k-nearest neighbor [KU13] and E's implementation of SInE.

Strategies

The basic strategy is to run ATPs on problems, then use the machine learner to learn axiom relevance for conjectures from solutions, and use the most relevant axioms for next ATP attempts. This is iterated, using different timelimits and axiom limits. Various features are used for learning, and the learning is complemented by other criteria like model-based reasoning, symbol and term-based similarity, etc.

Implementation

The metasystem is implemented in ca. 2500 lines of Perl. It uses many external programs - the above mentioned ATPs and machine learner, TPTP utilities, LADR utilities for work with models, and some standard Unix tools.

MaLARea is available from

    https://github.com/JUrban/MPTP2/tree/master/MaLARea

The metasystem's Perl code is released under GPL2

Expected Competition Performance

Thanks to machine learning, MaLARea is strongest on batches of many related problems with many redundant axioms where some of the problems are easy to solve and can be used for learning the axiom relevance. MaLARea is not very good when all problems are too difficult (nothing to learn from), or the problems (are few and) have nothing in common. Some of its techniques (selection by symbol and term-based similarity, model-based reasoning) could however make it even there slightly stronger than standard ATPs. MaLARea has a very good performance on the MPTP Challenge, which is a predecessor of the LTB division, and it is the winner of the 2008 MZR LTB category. MaLARea 0.4 came second in the 2012 MZR@Turing competition and solved most problems in the Assurance class.

Muscadet 4.3

Dominique Pastre
University Paris Descartes, France

Architecture

The Muscadet theorem prover is a knowledge-based system. It is based on Natural Deduction, following the terminology of [Ble71] and [Pas78], and uses methods which resembles those used by humans. It is composed of an inference engine, which interprets and executes rules, and of one or several bases of facts, which are the internal representation of "theorems to be proved". Rules are either universal and put into the system, or built by the system itself by metarules from data (definitions and lemmas). Rules may add new hypotheses, modify the conclusion, create objects, split theorems into two or more subtheorems or build new rules which are local for a (sub-)theorem.

Strategies

There are specific strategies for existential, universal, conjonctive or disjunctive hypotheses and conclusions, and equalities. Functional symbols may be used, but an automatic creation of intermediate objects allows deep subformulae to be flattened and treated as if the concepts were defined by predicate symbols. The successive steps of a proof may be forward deduction (deduce new hypotheses from old ones), backward deduction (replace the conclusion by a new one), refutation (only if the conclusion is a negation), search for objects satisfying the conclusion or dynamic building of new rules.

The system is also able to work with second order statements. It may also receive knowledge and know-how for a specific domain from a human user; see [Pas89] and [Pas93]. These two possibilities are not used while working with the TPTP Library.

Implementation

Muscadet [Pas01] is implemented in SWI-Prolog. Rules are written as more or less declarative Prolog clauses. Metarules are written as sets of Prolog clauses. The inference engine includes the Prolog interpreter and some procedural Prolog clauses. A theorem may be split into several subtheorems, structured as a tree with "and" and "or" nodes. All the proof search steps are memorized as facts including all the elements which will be necessary to extract later the useful steps (the name of the executed action or applied rule, the new facts added or rule dynamically built, the antecedents and a brief explanation).

Muscadet is available from

    http://www.math-info.univ-paris5.fr/~pastre/muscadet/muscadet.html

Expected Competition Performance

The best performances of Muscadet will be for problems manipulating many concepts in which all statements (conjectures, definitions, axioms) are expressed in a manner similar to the practice of humans, especially of mathematicians [Pas02,Pas07]. It will have poor performances for problems using few concepts but large and deep formulas leading to many splittings. Its best results will be in set theory, especially for functions and relations. Its originality is that proofs are given in natural style. Changes since last year are that some bugs have been fixed, also some imperfections in the writing of the proofs, and some heuristics have been improved which shorten a number of proofs.

Nitrox 2013

Jasmin C. Blanchette¹, Emina Torlak²
¹Technische Universität München, Germany, ²University of California, USA

Architecture

Nitrox is the first-order version of Nitpick [BN10], an open source counterexample generator for Isabelle/HOL [NPW12]. It builds on Kodkod [TJ07], a highly optimized first-order relational model finder based on SAT. The name Nitrox is a portmanteau of Nitpick and Paradox (clever, eh?).

Strategies

Nitrox employs Kodkod to find a finite model of the negated conjecture. It performs a few transformations on the input, such as pushing quantifiers inside, but 99% of the solving logic is in Kodkod and the underlying SAT solver.

The translation from HOL to Kodkod's first-order relational logic (FORL) is parameterized by the cardinalities of the atomic types occurring in it. Nitrox enumerates the possible cardinalities for the universe. If a formula has a finite counterexample, the tool eventually finds it, unless it runs out of resources.

Nitpick is optimized to work with higher-order logic (HOL) and its definitional principles (e.g., (co)inductive predicates, (co)inductive datatypes, (co)recursive functions). When invoked on untyped first-order problem, few of its optimizations come into play, and the problem handed to Kodkod is essentially a first-order relational logic (FORL) rendering of the TPTP FOF problem. There are two main exceptions:

Nested quantifiers are moved as far inside the formula as possible before Kodkod gets a chance to look at them [BN10].
Definitions invoked with fixed arguments are specialized.

Implementation

Nitrox, like most of Isabelle/HOL, is written in Standard ML. Unlike Isabelle itself, which adheres to the LCF small-kernel discipline, Nitrox does not certify its results and must be trusted. Kodkod is written in Java. MiniSat 1.14 is used as the SAT solver.

Expected Competition Performance

Since Nitpick was designed for HOL, it doesn't have any type inference à la Paradox. It also doesn't use the SAT solver incrementally, which penalizes it a bit (but not as much as the missing type inference). Kodkod itself is known to perform less well on FOF than Paradox, because it is designed and optimized for a somewhat different logic, FORL. On the other hand, Kodkod's symmetry breaking seems better calibrated than Paradox's. Hence, we expect Nitrox to end up in second or third place in the FNT division.

Paradox 3.0

Koen Claessen, Niklas Sörensson
Chalmers University of Technology, Sweden

Architecture

Paradox [CS03] is a finite-domain model generator. It is based on a MACE-style [McC03] flattening and instantiating of the first-order clauses into propositional clauses, and then the use of a SAT solver to solve the resulting problem.

Paradox incorporates the following features: Polynomial-time clause splitting heuristics, the use of incremental SAT, static symmetry reduction techniques, and the use of sort inference.

Strategies

There is only one strategy in Paradox:

Analyze the problem, finding an upper bound N on the domain size of models, where N is possibly infinite. A finite such upper bound can be found, for example, for EPR problems.
Flatten the problem, and split clauses and simplify as much as possible.
Instantiate the problem for domain sizes 1 up to N, applying the SAT solver incrementally for each size. Report "SATISFIABLE" when a model is found.
When no model of sizes smaller or equal to N is found, report "CONTRADICTION".

In this way, Paradox can be used both as a model finder and as an EPR solver.

Implementation

The main part of Paradox is implemented in Haskell using the GHC compiler. Paradox also has a built-in incremental SAT solver which is written in C++. The two parts are linked together on the object level using Haskell's Foreign Function Interface.

Expected Competition Performance

Paradox 3.0 is the CASC-J6 FNT division winner.

PEPR 0.0ps

Tianyi Liang, Cesare Tinelli
University of Iowa, USA

Architecture

PEPR is a hybrid parallel automated theorem prover for the effectively propositional fragment of first-order logic (EPR). Its core is based on a variant of the Model Evolution calculus [BT03,BT05], which is a decision procedure for the satisfiability of EPR formulas. During reasoning, the prover maintains a set of possibly non-ground literals, the context, to represent a candidate Herbrand model of the input clause set; it uses resolution-like operations between literals in the context and input clauses to identify instances of the latter that are falsified by the candidate model. These instances are then used to modify ("repair") the context non-deterministically in an attempt to satisfy them. If a context is not repairable, the input set has been proven to be unsatisfiable. A model is found when the context is saturated.

PEPR combines synergistically a portfolio approach, with multiple sub-provers that differ on the strategies they use to repair their context, and MapReduce-style clause-level parallelism for the computation of input clause instances [LT12]. The parallel architecture of PEPR is based on the Actor model with asynchronous message passing. To minimize communication all literals/clauses, including the input ones, are created locally in each sub-prover, which allows PEPR to run also in a distributed environment.

Because CASC measures performance in terms of total CPU time, the version of PEPR submitted to CASC-24 is actually sequential.

Strategies

In addition to the strategies implemented in Darwin, a theorem prover for the full ME calculus, PEPR implements conjecture distance, candidate coverage and predicate credits as literal selection heuristics, as well as thread scheduling for the parallel part.

Implementation

The sequential core of PEPR is fully implemented in C++, and it features new functionalities from the new C++11 standards. It requires either GCC 4.7 (or above) or MSVC 2012 to compile. To parse the TPTP syntax, it also requires the ANTLR 3.4 library. PEPR accepts the TPTP CNF format for EPR formula natively, and requires an external clausifier for the FOF format. The parallel part is built strictly upon the sequential core, and requires the Charm++ library to compile.

Although PEPR shows some promising results in its ability to exploit multiple cores, it still a prototype. Many configurations inside PEPR need to be further tuned.

Expected Competition Performance

PEPR will enter only the EPR division. Based on local tests with the CASC-23 benchmarks we expect it to rank in the middle range.

Princess 2012-06-04

Philipp Rümmer, Aleksandar Zeljić
Uppsala University, Sweden

Architecture

Princess [Rue08,Rue12] is a theorem prover for first-order logic modulo linear integer arithmetic. The prover has been under development since 2007, and represents a combination of techniques from the areas of first-order reasoning and SMT solving. The main underlying calculus is a free-variable tableau calculus, which is extended with constraints to enable backtracking-free proof expansion, and positive unit hyper-resolution for lightweight instantiation of quantified formulae. Linear integer arithmetic is handled using a set of built-in proof rules resembling the Omega test, which altogether yields a calculus that is complete for full Presburger arithmetic, for first-order logic, and for a number of further fragments.

The internal calculus of Princess only supports uninterpreted predicates; uninterpreted functions are encoded as predicates, together with the usual axioms. Through appropriate translation of quantified formulae with functions, the e-matching technique common in SMT solvers can be simulated; triggers in quantified formulae are chosen based on heuristics similar to those in the Simplify prover.

Strategies

Princess supports a number of different proof expansion strategies (e.g., depth-first, breadth-first), which are chosen based on syntactic properties of a problem (in particular, the kind of quantifiers occurring in the problem). Further options exist to control, for instance, the selection of triggers in quantified formulae, clausification, and the handling of functions.

For CASC, Princess will run a schedule with a small number of configurations for each problem (portfolio method). The schedule is determined either statically, or dynamically using syntactic attributes of problems (such as number and kind of quantifiers, etc), based on training using a random sample of problems from the TPTP library.

Implementation

Princess is entirely written in Scala and runs on any recent Java virtual machine; besides the standard Scala and Java libraries, only the Cup parser library is employed. Princess is available from

    http://www.philipp.ruemmer.org/princess.shtml

Expected Competition Performance

Princess 2012-06-04 is the CASC-J6 TFA division winner.

Prover9 2009-11A

Bob Veroff on behalf of William McCune
University of New Mexico, USA

Architecture

Prover9, Version 2009-11A, is a resolution/paramodulation prover for first-order logic with equality. Its overall architecture is very similar to that of Otter-3.3 [McC03]. It uses the "given clause algorithm", in which not-yet-given clauses are available for rewriting and for other inference operations (sometimes called the "Otter loop").

Prover9 has available positive ordered (and nonordered) resolution and paramodulation, negative ordered (and nonordered) resolution, factoring, positive and negative hyperresolution, UR-resolution, and demodulation (term rewriting). Terms can be ordered with LPO, RPO, or KBO. Selection of the "given clause" is by an age-weight ratio.

Proofs can be given at two levels of detail: (1) standard, in which each line of the proof is a stored clause with detailed justification, and (2) expanded, with a separate line for each operation. When FOF problems are input, proof of transformation to clauses is not given.

Completeness is not guaranteed, so termination does not indicate satisfiability.

Strategies

Like Otter, Prover9 has available many strategies; the following statements apply to CASC-2012.

Given a problem, Prover9 adjusts its inference rules and strategy according to syntactic properties of the input clauses such as the presence of equality and non-Horn clauses. Prover9 also does some preprocessing, for example, to eliminate predicates.

In previous CASC competitions, Prover9 has used LPO to order terms for demodulation and for the inference rules, with a simple rule for determining symbol precedence. For CASC 2012, we are going to use KBO instead.

For the FOF problems, a preprocessing step attempts to reduce the problem to independent subproblems by a miniscope transformation; if the problem reduction succeeds, each subproblem is clausified and given to the ordinary search procedure; if the problem reduction fails, the original problem is clausified and given to the search procedure.

Implementation

Prover9 is coded in C, and it uses the LADR libraries. Some of the code descended from EQP [McC97]. (LADR has some AC functions, but Prover9 does not use them). Term data structures are not shared (as they are in Otter). Term indexing is used extensively, with discrimination tree indexing for finding rewrite rules and subsuming units, FPA/Path indexing for finding subsumed units, rewritable terms, and resolvable literals. Feature vector indexing [Sch04] is used for forward and backward nonunit subsumption. Prover9 is available from

    http://www.cs.unm.edu/~mccune/prover9/

Expected Competition Performance

Some of the strategy development for CASC was done by experimentation with the CASC-2004 competition "selected" problems. (Prover9 has not yet been run on other TPTP problems.) Prover9 is unlikely to challenge the CASC leaders, because (1) extensive testing and tuning over TPTP problems has not been done, (2) theories (e.g., ring, combinatory logic, set theory) are not recognized, (3) term orderings and symbol precedences are not fine-tuned, and (4) multiple searches with differing strategies are not run.

Finishes in the middle of the pack are anticipated in all categories in which Prover9 competes.

Satallax 2.7

Chad E. Brown
Saarland University, Germany

Architecture

Satallax 2.7 [Bro12] is an automated theorem prover for higher-order logic. The particular form of higher-order logic supported by Satallax is Church's simple type theory with extensionality and choice operators. The SAT solver MiniSat [ES04] is responsible for much of the search for a proof. The theoretical basis of search is a complete ground tableau calculus for higher-order logic [BS10] with a choice operator [BB11]. A problem is given in the THF format. A branch is formed from the axioms of the problem and the negation of the conjecture (if any is given). From this point on, Satallax tries to determine unsatisfiability or satisfiability of this branch.

Satallax progressively generates higher-order formulae and corresponding propositional clauses [Bro13]. These formulae and propositional clauses correspond to instances of the tableau rules. Satallax uses the SAT solver MiniSat as an engine to test the current set of propositional clauses for unsatisfiability. If the clauses are unsatisfiable, then the original branch is unsatisfiable.

Additionally, Satallax may optionally generate first-order formulas in addition to the propositional clauses. If this option is used, then Satallax peroidically calls the first-order theorem prover E to test for first-order unsatisfiability. If the set of first-order formulas is unsatisfiable, then the original branch is unsatisfiable.

Strategies

There are about a hundred flags that control the order in which formulas and instantiation terms are considered and propositional clauses are generated. Other flags activate some optional extensions to the basic proof procedure (such as whether or not to call the theorem prover E). A collection of flag settings is called a mode. Approximately 500 modes have been defined and tested so far. A strategy schedule is an ordered collection of modes with information about how much time the mode should be allotted. Satallax tries each of the modes for a certain amount of time sequentially. Satallax 2.7 has strategy schedule consisting of 68 modes. Each mode is tried for time limits ranging from 0.1 seconds to 54.9 seconds. The strategy schedule was determined through experimentation using the THF problems in version 5.4.0 of the TPTP library.

Implementation

Satallax 2.7 is implemented in OCaml. A foreign function interface is used to in teract with MiniSat 2.2.0. Satallax is available from

    http://satallax.com

Expected Competition Performance

Satallax 2.1 won the THF division at CASC in 2011 and Satallax 2.4 came in second among the competing systems in 2012. Satallax has traditionally been weak on problems involving equational reasoning. Satallax 2.7 should be much better on such problems since most of these problems (in practice) are first-order and can be solved by E, so Satallax 2.7 should perform better than Satallax 2.4 would in the competition.

Satallax-MaLeS 1.2

Daniel Kühlwein¹, Josef Urban¹, Chad Brown²
¹Radboud University Nijmegen, The Netherlands
²Saarland University, Germany

Architecture

Satallax-MaLeS 1.2 is the result of applying the MaLeS 1.2 framework to Satallax (version 2.7) [Bro12]. MaLeS (Machine Learning of Strategies) is a framework for automatic tuning of ATPs. It combines strategy finding (similar to ParamILS [HH+09] and BliStr [Urb13]) with strategy scheduling (similar to SatZilla [XH+08]). A description of Satallax can be found above.

Strategies

The learning algorithm is described in [KUS13]. Further information can be found on the project website.

Implementation

MaLeS is implemented in python, using the numpy and scipy libraries. MaLeS is open source and available from

    https://code.google.com/p/males/

Satallax is available from

    http://satallax.com

Expected Competition Performance

Satallax-MaLeS 1.2 should perform similar to Satallax 2.7.

SPASS+T 2.2.19

Uwe Waldmann
Max-Planck-Insitut für Informatik, Germany

Architecture

SPASS+T is an extension of the superposition-based theorem prover SPASS that integrates algebraic knowledge into SPASS in three complementary ways: by passing derived formulas to an external SMT procedure (currently Yices or CVC3), by adding standard axioms, and by built-in arithmetic simplification and inference rules. A first version of the system has been described in [PW06]; later a much more sophisticated coupling of the SMT procedure has been added [Zim07]. The latest version provides improved support for isint/1, israt/1, floor/1 and ceiling/1 and adds partial input abstraction and history-dependent weights for numerical constants.

Strategies

Standard axioms and built-in arithmetic simplification and inference rules are integrated into the standard main loop of SPASS. Inferences between standard axioms are excluded, so the user-supplied formulas are taken as set of support. The external SMT procedure runs in parallel in a separate process, leading occasionally to non-deterministic behaviour.

Implementation

SPASS+T is implemented in C. SPASS+T is available from

    http://www.mpi-inf.mpg.de/~uwe/software/#TSPASS

Expected Competition Performance

SPASS+T 2.2.16 came second in the TFA division of last CASC; we expect a similar performance in CASC 2013.

TEMPLAR::leanCoP 0.8

Mario Frank, Jens Otten
University of Potsdam, Germany

Architecture

TEMPLAR is a formula set pruner which has the scope to reduce the set of given axioms to a size which can be handled by current state-of-the-art automated theorem provers. It reads and analyses the given axiom set and searches for a set of formulae which seem to be relevant for the proof of a given formula. The selected formulae can be given to an automated theorem prover with additional information and the theorem prover can attempt a proof. Multiple instances of a theorem prover can be started in parallel while TEMPLAR calculates other formula sets.

The input formulae are transformed into negation normal form (but not into clause normal form) and multiple indexing techniques are applied to speed up the formula selection. The selection itself is done by a dynamic module which uses two different algorithms and triggers them depending on the structure of the current formula set.

TEMPLAR conceptually supports every theorem prover for which an executor (a module) is implemented. Currently, one executor is existent, which is the leanCoP-Executor. leanCop [OB03,Ott08] 2.2, a compact connection calculus based theorem prover for classical first order logic is used as inference engine.

The CASC version of TEMPLAR is a beta stadium version of the project and lacks of some features and performance optimizations in order to reduce the memory consumption.

Strategies

TEMPLAR analyses all axioms and gathers information about included predicate symbols, their arity and additional information like their polarities and for instance the complete formula graph.

TEMPLAR uses - depending on the metrics of the formula set (e.g. size of the set) - two different search approaches which are chained, if appropriate. The one approach which is tailored for smaller sets is an unification based graph search with multiple search graph pruning techniques. This is an reconception and extension of the concept of ARDE which was described in [Fra12] and used in the CASC-J6. For this approach, the negation normal forms of all formulae are used. The other approach is an implementation of a frequency based concept commonly used for natural language processing (e.g. automated text summarization) and is tailored for bigger formula sets. This concept aims to recognise the topics of a set of documents and can recognise common symbols and thus should have similar effects like SInE [HV11] but uses another weighting approach.

Implementation

TEMPLAR is fully implemented in C++ conforming to the C++ standard of 2011. The TPTP and batch file parsing is done by a parser which is implemented in a descriptive meta-language (a grammar) conforming to the C++ syntax. This grammar is transformed into C++ instructions by the boost library (boost spirit Qi). The selected formulae can be output in the original form or the Skolem normal form, conforming to the TPTP syntax.

TEMPLAR can be compiled with the GNU Compiler Collection (GCC) with the minimum version 4.6.3 and the Boost library, minimum version 1.42 (1.46 is better). leanCop can be run with SWI Prolog or Eclipse Prolog (5.x) for example.

Currently, TEMPLAR is not available since it is a diploma thesis project but will be published afterwards in the final version.

Expected Competition Performance

TEMPLAR extends a theorem prover by the described pruning techniques but gives the prover the chance to attempt a proof on the original file, too. Thus, the performance is expected to be at least the performance of the underlying prover. Due to the restrictions, some proofs are expected to be found faster and some normally not (in the given time) provable problems are expected to become provable.

TPS 3.120601S1b

Chad E. Brown¹, Peter Andrews²
¹Saarland University, Germany, ²Carnegie Melon University, USA

Architecture

TPS is a higher-order theorem proving system that has been developed over several decades under the supervision of Peter B. Andrews with substantial work by Eve Longini Cohen, Dale A. Miller, Frank Pfenning, Sunil Issar, Carl Klapper, Dan Nesmith, Hongwei Xi, Matthew Bishop, Chad E. Brown, Mark Kaminski, Rémy Chrétien and Cris Perdue. TPS can be used to prove theorems of Church's type theory automatically, interactively, or semi-automatically [AB+96,AB06]. When searching for a proof, TPS first searches for an expansion proof [Mil87] or an extensional expansion proof [Bro07] of the theorem. Part of this process involves searching for acceptable matings [And81]. Using higher-order unification, a pair of occurrences of subformulae (which are usually literals) is mated appropriately on each vertical path through an expanded form of the theorem to be proved. The expansion proof thus obtained is then translated [Pfe87] without further search into a proof of the theorem in natural deduction style.

Strategies

Strategies used by TPS in the search process include:

Re-ordering conjunctions and disjunctions to alter the way paths through the formula are enumerated.
The use of primitive substitutions and gensubs [And89].
Path-focused duplication [Iss90].
Dual instantiation of definitions, and generating substitutions for higher-order variables which contain abbreviations already present in the theorem to be proved [BA98].
Component search [Bis99].
Generating and solving set constraints [Bro02].
Generating connections using extensional and equational reasoning [Bro07].

Implementation

TPS has been developed as a research tool for developing, investigating, and refining a variety of methods of searching for expansion proofs, and variations of these methods. Its behavior is controlled by hundreds of flags. A set of flags, with values for them, is called a mode. The strategy of the current version of TPS consists of 71 modes. When searching for a proof in automatic mode, TPS tries each of these modes in turn for a specified amount of time (at least 1 second and at most 41 seconds). If TPS succeeds in finding an expansion proof, it translates the expansion proof to a natural deduction proof. This final step ensures that TPS will not incorrectly report that a formula has been proven. TPS is implemented in Common Lisp, and is available from

    http://gtps.math.cmu.edu/tps.html

Expected Competition Performance

TPS was the CASC-22 THF division winner in 2009. In the CASC competitions since 2009 TPS has come in last behind Isabelle, LEO-II and Satallax. There have been no changes to TPS since last year, so there is no reason to believe TPS will win this year.

Vampire 2.6

Krystof Hoder, Andrei Voronkov
University of Manchester, England

Architecture

Vampire 2.6 is an automatic theorem prover for first-order classical logic. It consists of a shell and a kernel. The kernel implements the calculi of ordered binary resolution and superposition for handling equality. It also implements the Inst-gen calculus. The splitting rule in kernel adds propositional parts to clauses, which are being manipulated using binary decision diagrams and a SAT solver. A number of standard redundancy criteria and simplification techniques are used for pruning the search space: subsumption, tautology deletion, subsumption resolution and rewriting by ordered unit equalities. The reduction ordering is the Knuth-Bendix Ordering.

Substitution tree and code tree indexes are used to implement all major operations on sets of terms, literals and clauses. Although the kernel of the system works only with clausal normal form, the shell accepts a problem in the full first-order logic syntax, clausifies it and performs a number of useful transformations before passing the result to the kernel. Also the axiom selection algorithm Sine [HV11] can be enabled as part of the preprocessing.

When a theorem is proved, the system produces a verifiable proof, which validates both the clausification phase and the refutation of the CNF.

Strategies

The Vampire 2.6 kernel provides a fairly large number of options for strategy se lection. The most important ones are:

Choice of the main procedure:
- Limited Resource Strategy
- DISCOUNT loop
- Otter loop
- Goal oriented mode based on tabulation
- Instantiation using the Inst-gen calculus
A variety of optional simplifications.
Parameterized reduction orderings.
A number of built-in literal selection functions and different modes of comparing literals.
Age-weight ratio that specifies how strongly lighter clauses are preferred for inference selection.
Set-of-support strategy.

Implementation

Vampire 2.6 is implemented in C++.

Expected Competition Performance

Vampire 2.6 is the CASC-J6 FOF and LTB division winner.

Zipperposition 0.2

Guillaume Burel, Simon Cruanes
ENSIIE/Cedric, France

Architecture

Zipperposition 0.2 is a superposition prover for first order logic with equality. It is written in OCaml, and largely inspired from E [Sch02]. It features lazy transformation to CNF [GS03], the Superposition, Equality Resolution and Equality Factoring inference rules, and many simplification rules including unit rewriting, condensation, subsumption and contextual literal cutting.

The prover also uses a so-called knowledge base that contains description of algebraic theories, of individual axioms (such as associativity and commutativity), lemmas and theory-specific rewrite systems. This base currently contains few definitions and lemmas.

Strategies

Zipperposition allows one to choose a term ordering family among KBO or RPO (using RPO by default). Otherwise, it is totally automated and tries to select a symbol precedence based on its input (e.g., if it detect a symbol definition in the input, it will try to orient it so that demodulation will eliminate the symbol).

Implementation

The prover is written in OCaml. For indexing, it uses perfect discrimination trees for demodulation, feature vector indexes for subsumption, and fingerprint indexing for inferences. It depends on a few OCaml libraries to serialize/deserialize its knowledge base and to provide a Datalog reasoner at the meta-level. Since Zipperposition is designed to be a prototype for experimenting with superposition, its performance is average at best.

Zipperposition is available from

    http://www.rocq.inria.fr/deducteam/Zipperposition/

Expected Competition Performance

We expect Zipperposition to have decent performance on FOF problems with equality, and poor performance on large CNF problems without equality. The system is not tuned for CASC in any way.

References

AB+96: Andrews P.B., Bishop M., Issar S., Nesmith. D., Pfenning F., Xi H. (1996), TPS: A Theorem-Proving System for Classical Type Theory, Journal of Automated Reasoning 16(3), 321-353.
AB06: Andrews P.B., Brown C.E. (2006), TPS: A Hybrid Automatic-Interactive System for Developing Proofs, Journal of Applied Logic 4(4), 367-395.
And81: Andrews P.B. (1981), Theorem Proving via General Matings, Journal of the ACM 28(2), 193-214.
And89: Andrews P.B. (1989), On Connections and Higher-Order Logic, Journal of Automated Reasoning 5(3), 257-291.
BG98: Bachmair L., Ganzinger H. (1998), Equational Reasoning in Saturation-Based Theorem Proving, Bibel W., Schmitt P.H., Automated Deduction, A Basis for Applications, 352-397, Applied Logic Series I Foundations - Calculi and Methods(10), Kluwer Academic Publishers.
BB11: Backes J., Brown C.E. (2011), Analytic Tableaux for Higher-Order Logic with Choice, Journal of Automated Reasoning 47(4), 451-479.
BC+11: Barrett C., Conway C., Deters M., Hadarean L., Jovanovic D., King T., Reynolds A., Tinelli C. (2011), CVC4, Gopalakrishnan G., Qadeer S., Proceedings of the 23rd International Conference on Computer Aided Verification (Snowbird, USA), 171-177, Lecture Notes in Computer Science 6806, Springer-Verlag.
BT07: Barrett C., Tinelli C. (2007), CVC3, Damm W., Hermanns H., Proceedings of the 19th International Conference on Computer Aided Verification (Berlin, Germany), 298-302, Lecture Notes in Computer Science 4590, Springer-Verlag.
BFN96: Baumgartner P., Furbach U., Niemelä I. (1996), Hyper Tableaux, Alferes J., Pereira L., Orlowska E., Proceedings of JELIA'96: European Workshop on Logic in Artificial Intelligence (Evora, Portugal), 1-17, Lecture Notes in Artificial Intelligence 1126, Springer-Verlag.
BFP07: Baumgartner P., Furbach U., Pelzer B. (2007), Hyper Tableaux with Equality, Pfenning F., Proceedings of the 21st International Conference on Automated Deduction (Bremen, Germany), 492-507, Lecture Notes in Artificial Intelligence 4603, Springer-Verlag.
BT03: Baumgartner P., Tinelli C. (2003), The Model Evolution Calculus, Baader F., Proceedings of the 19th International Conference on Automated Deduction (Miami, USA), 350-364, Lecture Notes in Artificial Intelligence 2741, Springer-Verlag.
BT05: Baumgartner P., Tinelli C. (2005), The Model Evolution Calculus with Equality, Nieuwenhuis R., Proceedings of the 20th International Conference on Automated Deduction (Tallinn, Estonia), 392-408, Lecture Notes in Artificial Intelligence 3632, Springer-Verlag.
BW13: Baumgartner P., Waldmann U. (2013), Hierarchic Superposition With Weak Abstraction, Bonacina M.P., Proceedings of the 24t International Conference on Automated Deduction (Lake Placid, USA), Lecture Notes in Artificial Intelligence, Springer-Verlag.
BK98: Benzmüller C., Kohlhase M. (1998), LEO - A Higher-Order Theorem Prover, Kirchner C., Kirchner H., Proceedings of the 15th International Conference on Automated Deduction (Lindau, Germany), 139-143, Lecture Notes in Artificial Intelligence 1421, Springer-Verlag.
BP+08: Benzmüller C., Paulson L., Theiss F., Fietzke A. (2008), LEO-II - A Cooperative Automatic Theorem Prover for Higher-Order Logic, Baumgartner P., Armando A., Gilles D., Proceedings of the 4th International Joint Conference on Automated Reasoning (Sydney, Australia), 162-170, Lecture Notes in Artificial Intelligence 5195, Springer-Verlag.
BRS08: Benzmüller C., Rabe F., Sutcliffe G. (2008), THF0 - The Core TPTP Language for Classical Higher-Order Logic, Baumgartner P., Armando A., Gilles D., Proceedings of the 4th International Joint Conference on Automated Reasoning (Sydney, Australia), 491-506, Lecture Notes in Artificial Intelligence 5195, Springer-Verlag.
BS+08: Benzmüller C., Sorge V., Jamnik M., Kerber M. (2008), Combined Reasoning by Automated Cooperation, Journal of Applied Logic 6(3), 318-342.
Ben99: Benzmüller C. (1999), Extensional Higher-order Paramodulation and RUE-Resolution, Ganzinger H., Proceedings of the 16th International Conference on Automated Deduction (Trento, Italy), 399-413, Lecture Notes in Artificial Intelligence 1632, Springer-Verlag.
Bib87: Bibel W. (1987), Automated Theorem Proving, Vieweg and Sohn.
Bie08: Biere A. (2008), PicoSAT Essentials, Journal on Satisfiability, Boolean Modeling and Computation 4, 75-97.
Bie12: Biere A. (2012), Lingeling and Friends Entering the SAT Challenge 2012, Balint A., Belov A., Diepold A., Gerber S., Järvisalo M., Sinz C., Proceedings of SAT Challenge 2012: Solver and Benchmark Descriptions (Trento, Italy), 15-16, Department of Computer Science Series of Publications B B-2012-2, University of Helsinki.
BA98: Bishop M., Andrews P.B. (1998), Selectively Instantiating Definitions, Kirchner C., Kirchner H., Proceedings of the 15th International Conference on Automated Deduction (Lindau, Germany), 365-380, Lecture Notes in Artificial Intelligence 1421, Springer-Verlag.
Bis99: Bishop M. (1999), A Breadth-First Strategy for Mating Search, Ganzinger H., Proceedings of the 16th International Conference on Automated Deduction (Trento, Italy), 359-373, Lecture Notes in Artificial Intelligence 1632, Springer-Verlag.
BBP11: Blanchette J., Boehme S., Paulson L. (2011), Extending Sledgehammer with SMT Solvers, Bjorner N., Sofronie-Stokkermans V., Proceedings of the 23rd International Conference on Automated Deduction (Wroclaw, Poland), 116-130, Lecture Notes in Artificial Intelligence 6803, Springer-Verlag.
BN10: Blanchette J., Nipkow T. (2010), Nitpick: A Counterexample Generator for Higher-Order Logic Based on a Relational Model Finder, Kaufmann M. Paulson L., Proceedings of the 1st International Conference on Interactive Theorem Proving (Edinburgh, United Kingdom), 131-146, Lecture Notes in Computer Science 6172, Springer-Verlag.
Ble71: Bledsoe W.W. (1971), Splitting and Reduction Heuristics in Automatic Theorem Proving, Artificial Intelligence 2, 55-77.
BS10: Brown C.E., Smolka G. (2010), Analytic Tableaux for Simple Type Theory and its First-Order Fragment, Logical Methods in Computer Science 6(2).
Bro02: Brown C.E. (2002), Solving for Set Variables in Higher-Order Theorem Proving, Voronkov A., Proceedings of the 18th International Conference on Automated Deduction (Copenhagen, Denmark), 408-422, Lecture Notes in Artificial Intelligence 2392, Springer-Verlag.
Bro07: Brown C.E. (2007), Automated Reasoning in Higher-Order Logic: Set Comprehension and Extensionality in Church's Type Theory, Studies in Logic: Logic and Cognitive Systems 10, College Publications.
Bro11: Brown C.E. (2011), Reducing Higher-Order Theorem Proving to a Sequence of SAT Problems, Bjorner N., Sofronie-Stokkermans V., Proceedings of the 23rd International Conference on Automated Deduction (Wroclaw, Poland), 147-161, Lecture Notes in Artificial Intelligence 6803, Springer-Verlag.
Bro12: Brown C.E. (2012), Satallax: An Automated Higher-Order Prover (System Description), Gramlich B., Miller D., Sattler U., Proceedings of the 6th International Joint Conference on Automated Reasoning (Manchester, United Kingdom), 111-117, Lecture Notes in Artificial Intelligence 7364.
Bro13: Brown C.E. (2013), Reducing Higher-Order Theorem Proving to a Sequence of SAT Problems, Journal of Automated Reasoning 51(1), 57-77.
Bur11: Burel G. (2011), Experimenting with Deduction Modulo, Bjorner N., Sofronie-Stokkermans V., Proceedings of the 23rd International Conference on Automated Deduction (Wroclaw, Poland), 162-176, Lecture Notes in Artificial Intelligence 6803, Springer-Verlag.
CLS11: Claessen K., Lilliestrom A., Smallbone N. (2011), Sort It Out with Monotonicity - Translating between Many-Sorted and Unsorted First-Order Logic, Bjorner N., Sofronie-Stokkermans V., Proceedings of the 23rd International Conference on Automated Deduction (Wroclaw, Poland), 207-221, Lecture Notes in Artificial Intelligence 6803, Springer-Verlag.
CS03: Claessen K., Sörensson N. (2003), New Techniques that Improve MACE-style Finite Model Finding, Baumgartner P., Fermueller C., Proceedings of the CADE-19 Workshop: Model Computation - Principles, Algorithms, Applications (Miami, USA).
dMB08: de Moura L., Bjorner N. (2008), Z3: An Efficient SMT Solver, Ramakrishnan C., Rehof J., Proceedings of the 14th International Conference on Tools and Algorithms for the Construction and Analysis of Systems (Budapest, Hungary), 337-340, Lecture Notes in Artificial Intelligence 4963, Springer-Verlag.
Dow10: Dowek G. (2010), Polarized Resolution Modulo, Calude C., Sassone V., Theoretical Computer Science, 182-196, IFIP Advances in Information and Communication Technology, Springer-Verlag.
ES04: E{\'e}n N., Sörensson N. (2004), An Extensible SAT-solver, Giunchiglia E., Tacchella A., Proceedings of the 6th International Conference on Theory and Applications of Satisfiability Testing (Santa Margherita Ligure, Italy), 502-518, Lecture Notes in Computer Science 2919, Springer-Verlag.
Fra12: Frank M. (2012), Relevanzbasiertes Preprocessing für Automatische Theorembeweiser, Leipziger Beiträge zur Informatik - SKIL 2012 - Dritte Studentenkonferenz Informatik Leipzig 2012 (Leipzig, Germany), 87-98, Leipziger Informatik-Verbund.
GK03: Ganzinger H., Korovin K. (2003), New Directions in Instantiation-Based Theorem Proving, Kolaitis P., Proceedings of the 18th IEEE Symposium on Logic in Computer Science (Ottawa, Canada), 55-64, IEEE Press.
GK04: Ganzinger H., Korovin K. (2004), Integrating Equational Reasoning into Instantiation-Based Theorem Proving, Marcinkowski J., Tarlecki A., Proceedings of the 18th International Workshop on Computer Science Logic, 13th Annual Conference of the EACSL (Karpacz, Poland), 71-84, Lecture Notes in Computer Science 3210, Springer-Verlag.
GS03: Ganzinger H., Stuber J. (2003), Superposition with Equivalence Reasoning and Delayed Clause Normal Form Transformation, Baader F., Proceedings of the 19th International Conference on Automated Deduction (Miami, USA), 335-349, Lecture Notes in Artificial Intelligence 2741, Springer-Verlag.
HK+12: Hoder K., Khasidashvili Z., Korovin K., Voronkov A. (2012), Preprocessing Techniques for First-Order Clausification, Cabodi G., Singh S., Proceedings of the Formal Methods in Computer-Aided Design 2012 (Cambridge, United Kindom), 44-51, IEEE Press.
HKV10: Hoder K., Kovacs L., Voronkov A. (2010), Interpolation and Symbol Elimination in Vampire, Giesl J., Haehnle R., Proceedings of the 5th International Joint Conference on Automated Reasoning (Edinburgh, United Kingdom), 188-195, Lecture Notes in Artificial Intelligence 6173.
HKV11: Hoder K., Kovacs L., Voronkov A. (2011), Invariant Generation in Vampire, Abdulla P., Leino R., Proceedings of the 17th International Conference on Tools and Algorithms for the Construction and Analysis of Systems (Saarbruecken, Germany), 60-64, Lecture Notes in Computer Science 6605, Springer-Verlag.
HV11: Hoder K., Voronkov A. (2011), Sine Qua Non for Large Theory Reasoning, Sofronie-Stokkermans V., Bj{\oe}rner N., Proceedings of the 23rd International Conference on Automated Deduction (Wroclaw, Poland), 299-314, Lecture Notes in Artificial Intelligence 6803, Springer-Verlag.
HH+09: Hutter F., Hoos H., Leyton-Brown K., Stützle T. (2009), ParamILS: An Automatic Algorithm Configuration Framework, Journal of Artificial Intelligence Research 36, 267-306.
Iss90: Issar S. (1990), Path-Focused Duplication: A Search Procedure for General Matings, Dietterich T. Swartout W., Proceedings of the 8th National Conference on Artificial Intelligence (Boston, USA), 221-226, American Association for Artificial Intelligence / MIT Press.
KU13: Kaliszyk C., Urban J. (2013), Stronger Automation for Flyspeck by Feature Weighting and Strategy Evolution, Proceedings of the 3rd International Workshop on Proof Exchange for Theorem Proving (Lake Placid, USA), To appear, EasyChair Proceedings in Computing.
KS10: Korovin K., Sticksel C. (2010), iProver-Eq - An Instantiation-Based Theorem Prover with Equality, Giesl J., Haehnle R., Proceedings of the 5th International Joint Conference on Automated Reasoning (Edinburgh, United Kingdom), 196-202, Lecture Notes in Artificial Intelligence 6173.
KS10: Korovin K., Sticksel C. (2010), Labelled Unit Superposition Calculi for Instantiation-based Reasoning, Fermüller C., Voronkov A., Proceedings of the 17th International Conference on Logic for Programming, Artificial Intelligence, and Reasoning (Yogyakarta, Indonesia), 459-473, Lecture Notes in Computer Science 6397, Springer-Verlag.
Kor08: Korovin K. (2008), iProver - An Instantiation-Based Theorem Prover for First-order Logic (System Description), Baumgartner P., Armando A., Gilles D., Proceedings of the 4th International Joint Conference on Automated Reasoning (Sydney, Australia), 292-298, Lecture Notes in Artificial Intelligence 5195.
Kor09: Korovin K. (2009), Instantiation-Based Reasoning: From Theory to Practice, Schmidt R., Proceedings of the 22nd International Conference on Automated Deduction (Montreal, Canada), 163-166, Lecture Notes in Computer Science 5663, Springer-Verlag.
Kor13: Korovin K. (2013), Inst-Gen - A Modular Approach to Instantiation-Based Automated Reasoning, Voronkov A., Weidenbach C., Programming Logics, Essays in Memory of Harald Ganzinger, 239-270, Lecture Notes in Computer Science 7797, Springer-Verlag.
KUS13: Kuehlwein D., Urban J., Schulz S. (2013), E-MaLeS 1.1, Bonacina M.P., Proceedings of the 24th International Conference on Automated Deduction (Lake Placid USA), To appear, Lecture Notes in Artificial Intelligence 6803, Springer-Verlag.
LS01: Letz R., Stenz G. (2001), Proof and Model Generation with Disconnection Tableaux, Nieuwenhuis R., Voronkov A., Proceedings of the 8th International Conference on Logic for Programming, Artificial Intelligence, and Reasoning (Havana, Cuba), 142-156, Lecture Notes in Artificial Intelligence 2250, Springer-Verlag.
LT12: Liang T., Tinelli C. (2012), Exploiting Parallelism in the ME Calculus, Fontaine P., Schmidt R., Schulz S., Proceedings of the 3rd Workshop on Practical Aspects of Automated Reasoning, 6th International Joint Conference on Automated Reasoning (Manchester, England).
Loe06: Loechner B. (2006), Things to Know When Implementing KBO, Journal of Automated Reasoning 36(4), 289-310.
Loe06: Loechner B. (2006), Things to Know When Implementing LBO, Journal of Artificial Intelligence Tools 15(1), 53-80.
Lov78: Loveland D.W. (1978), Automated Theorem Proving : A Logical Basis, Elsevier Science.
McC03: McCune W.W. (2003), Mace4 Reference Manual and Guide, ANL/MCS-TM-264, Argonne National Laboratory (Argonne, USA).
McC03: McCune W.W. (2003), Otter 3.3 Reference Manual, ANL/MSC-TM-263, Argonne National Laboratory (Argonne, USA).
McC97: McCune W.W. (1997), Solution of the Robbins Problem, Journal of Automated Reasoning 19(3), 263-276.
Mil87: Miller D. (1987), A Compact Representation of Proofs, Studia Logica 46(4), 347-370.
NOT06: Nieuwenhuis R., Oliveras A., Tinelli C. (2006), Solving SAT and SAT Modulo Theories: from an Abstract Davis-Putnam-Logemann-Loveland Procedure to DPLL(T), Journal of the ACM 53(6), 937-977.
NPW12: Nipkow T., Paulson L., Wenzel M., Isabelle/HOL: A Proof Assistant for Higher-Order Logic, http://www.cl.cam.ac.uk/research/hvg/Isabelle/dist/Isabelle/doc/tutorial.pdf.
Nip89: Nipkow T. (1989), Equational Reasoning in Isabelle, Science of Computer Programming 12(2), 123-149.
OB03: Otten J., Bibel W. (2003), leanCoP: Lean Connection-Based Theorem Proving, Journal of Symbolic Computation 36(1-2), 139-161.
Ott08: Otten J. (2008), {\sf leanCoP 2.0} and {\sf ileancop 1.2}: High Performance Lean Theorem Proving in Classical and Intuitionistic Logic, Baumgartner P., Armando A., Gilles D., Proceedings of the 4th International Joint Conference on Automated Reasoning (Sydney, Australia), 283-291, Lecture Notes in Artificial Intelligence 5195.
Pas01: Pastre D. (2001), Implementation of Knowledge Bases for Natural Deduction, Nieuwenhuis R., Voronkov A., Proceedings of the 8th International Conference on Logic for Programming, Artificial Intelligence and Reasoning (Havana, Cuba), 29-68, Lecture Notes in Artificial Intelligence 2250, Springer-Verlag.
Pas02: Pastre D. (2002), Strong and Weak Points of the Muscadet Theorem Prover, AI Communications 15(2-3), 147-160.
Pas07: Pastre D. (2007), Complementarity of a Natural Deduction Knowledge-based Prover and Resolution-based Provers in Automated Theorem Proving, http://www.math-info.univ-paris5.fr/~pastre/compl-NDKB-RB.pdf.
Pas78: Pastre D. (1978), Automatic Theorem Proving in Set Theory, Artificial Intelligence 10, 1-27.
Pas89: Pastre D. (1989), Muscadet : An Automatic Theorem Proving System using Knowledge and Metaknowledge in Mathematics, Artificial Intelligence 38, 257-318.
Pas93: Pastre D. (1993), Automated Theorem Proving in Mathematics, Annals of Mathematics and Artificial Intelligence 8, 425-447.
PB10: Paulson L., Blanchette J. (2010), Three Years of Experience with Sledgehammer, a Practical Link between Automatic and Interactive Theorem Provers, Sutcliffe G., Ternovska E., Schulz S., Proceedings of the 8th International Workshop on the Implementation of Logics (Yogyakarta, Indonesia), To appear.
PN94: Paulson L.C., Nipkow T. (1994), Isabelle: A Generic Theorem Prover, Lecture Notes in Computer Science 828, Springer-Verlag.
Pau99: Paulson L. (1999), A Generic Tableau Prover and its Integration with Isabelle, Artificial Intelligence 5(3), 73-87.
PW07: Pelzer B., Wernhard C. (2007), System Description: E-KRHyper, Pfenning F., Proceedings of the 21st International Conference on Automated Deduction (Bremen, Germany), 508-513, Lecture Notes in Artificial Intelligence 4603, Springer-Verlag.
Pfe87: Pfenning F. (1987), Proof Transformations in Higher-Order Logic, PhD thesis, Carnegie-Mellon University (Pittsburg, USA).
PW06: Prevosto V., Waldmann U. (2006), SPASS+T, Sutcliffe G., Schmidt R., Schulz S., Proceedings of the FLoC'06 Workshop on Empirically Successful Computerized Reasoning, 3rd International Joint Conference on Automated Reasoning (Seattle, USA), 19-33, CEUR Workshop Proceedings 192.
RT+13: Reynolds A., Tinelli C., Goel A., Krstic S., Deters M., Barrett C. (2013), Quantifier Instantiation Techniques for Finite Model Finding in SMT, Bonacina M.P., Proceedings of the 24th International Conference on Automated Deduction (Lake Placid USA), To appear, Lecture Notes in Artificial Intelligence 6803, Springer-Verlag.
RT+13: Reynolds A., Tinelli C., Goel A., Krstic S. (2013), Finite Model Finding in SMT, Sharygina N., Veith H., Proceedings of the 25th International Conference on Computer Aided Verification (St Petersburg, Russia), To appear, Lecture Notes in Computer Science, Springer-Verlag.
RV02: Riazanov A., Voronkov A. (2002), The Design and Implementation of Vampire, AI Communications 15(2-3), 91-110.
Rue08: Rümmer P. (2008), A Constraint Sequent Calculus for First-Order Logic with Linear Integer Arithmetic, Cervesato I., Veith H., Voronkov A., Proceedings of the 15th International Conference on Logic for Programming Artificial Intelligence and Reasoning (Doha, Qatar), 274-289, Lecture Notes in Artificial Intelligence 5330, Springer-Verlag.
Rue12: Rümmer P. (2012), A Constraint Sequent Calculus for First-Order Logic with Linear Integer Arithmetic, Bjorner N., Voronkov A., Proceedings of the 18th International Conference on Logic for Programming Artificial Intelligence and Reasoning (Merida, Venezuela), 274-289, Lecture Notes in Artificial Intelligence 7180, Springer-Verlag.
Sch02: Schulz S. (2002), E: A Brainiac Theorem Prover, AI Communications 15(2-3), 111-126.
Sch04: Schulz S. (2004), Simple and Efficient Clause Subsumption with Feature Vector Indexing, Sutcliffe G., Schulz S., Tammet T., Proceedings of the Workshop on Empirically Successful First Order Reasoning, 2nd International Joint Conference on Automated Reasoning (Cork, Ireland).
Sch12: Schulz S. (2012), Fingerprint Indexing for Paramodulation and Rewriting, Gramlich B., Miller D., Sattler U., Proceedings of the 6th International Joint Conference on Automated Reasoning (Manchester, United Kingdom), 477-483, Lecture Notes in Artificial Intelligence 7364.
SB10: Sutcliffe G., Benzmüller C. (2010), Automated Reasoning in Higher-Order Logic using the TPTP THF Infrastructure, Journal of Formalized Reasoning 3(1), 1-27.
Ull89: Ullman J. (1989), Principles of Database and Knowledge-Base Bystems, Computer Science Press, Inc..
US+08: Urban J., Sutcliffe G., Pudlak P., Vyskocil J. (2008), MaLARea SG1: Machine Learner for Automated Reasoning with Semantic Guidance, Baumgartner P., Armando A., Gilles D., Proceedings of the 4th International Joint Conference on Automated Reasoning (Sydney, Australia), 441-456, Lecture Notes in Artificial Intelligence 5195, Springer-Verlag.
Urb07: Urban J. (2007), MaLARea: a Metasystem for Automated Reasoning in Large Theories, Urban J., Sutcliffe G., Schulz S., Proceedings of the CADE-21 Workshop on Empirically Successful Automated Reasoning in Large Theories (Bremen, German), 45-58, CEUR Workshop Proceedings 257.
Urb13: Urban J. (2013), BliStr: The Blind Strategymaker, Computing Research Repository - arXiv.
Wer03: Wernhard C. (2003), System Description: KRHyper, Fachberichte Informatik 14--2003, Universität Koblenz-Landau (Koblenz, Germany).
XH+08: Xu L., Hutter F., Hoos H., Leyton-Brown K. (2008), SATzilla: Portfolio-based Algorithm Selection for SAT, Journal of Artificial Intelligence Research 32, 565-606.
Zim07: Zimmer S. (2007), Intelligent Combination of a First Order Theorem Prover and SMT Procedures, MSc thesis, Saarland University (Saarbruecken, Germany).
Coq: (), The Coq Proof Assistant, http://coq.inria.fr.