http://userpages.uni-koblenz.de/~bpelzer/edarwin
An E-hyper tableau is a tree whose nodes are labeled with clauses and which is built up by the application of the inference rules of the E-hyper tableau calculus. The calculus rules are designed such that most of the reasoning is performed using positive unit clauses. Splitting is done without rigid variables. Instead, variables which would be shared between branches are prevented by ground substitutions, which are guessed from the Herbrand universe and constrained by rewrite rules. Redundancy rules allow the detection and removal of clauses that are redundant with respect to a branch. The hyper extension inference from the original hyper tableau calculus is equivalent to a series of E-hyper tableau calculus inference applications. Therefore the implementation of the hyper extension in KRHyper by a variant of semi-naive evaluation [Ull89] is retained in E-KRHyper, where it serves as a shortcut inference for the resolution of non-equational literals.
http://www.uni-koblenz.de/~bpelzer/ekrhyper
From the known performance of different E strategies over the TPTP library, the system learns which strategy is most likely to solve a problem. Given a new problem, E-MaLeS extracts the problems features (number of terms, number of literals, etc) and applies the learned function to the features. The result is a (hopefully optimal) strategy. As a new addition in 1.1, E-MaLeS learns not only the optimal strategy, but also for how long it should run each strategy. This allows better strategy scheduling.
http://www.cs.ru.nl/~kuehlwein/
EP 1.4pre is just a combination of E 1.4pre in verbose mode and a proof analysis tool extracting the used inference steps. For the LTB division, a control program uses a SInE-like analysis to extract reduced axiomatizations that are handed to several instances of E.
The automatic mode is based on a static partition of the set of all clausal problems based on a number of simple features (number of clauses, maximal symbol arity, presence of equality, presence of non-unit and non-Horn clauses,...). Each class of clauses is automatically assigned a heuristic that performs well on problems from this class in test runs. About 100 different strategies have been evaluated on all untyped first-order problems from TPTP 4.1.0.
http://www.eprover.org
EP 1.6pre is just a combination of E 1.6pre in verbose mode and a proof analysis tool extracting the used inference steps. For the LTB and MZR divisions, a control program uses a SInE-like analysis to extract reduced axiomatizations that are handed to several instances of E.
The automatic mode is based on a static partition of the set of all clausal problems based on a number of simple features (number of clauses, maximal symbol arity, presence of equality, presence of non-unit and non-Horn clauses,...). Each class of clauses is automatically assigned a heuristic that performs well on problems from this class in test runs. About 60 different strategies have been evaluated on all bug-free untyped first-order problems from TPTP 5.3.0.
http://www.eprover.org
Fimo features symmetry breaking and incremental answer set solving provided by the underlying iASP system iClingo. Additionally, Fimo deskolemizes some of the Skolem functions by transforming them to aggregates in the resulting logic program.
http://potassco.sourceforge.net
Major additions in the current version are:
iProver is available from
http://www.cs.man.ac.uk/~korovink/iprover/
In the LTB division, iProver uses axiom selection based on the SInE algorithm [HV11] as implemented in Vampire [HKV11], i.e., axiom selection is done by Vampire and proof attempts are done by iProver.
iProver features are summaries below with recent additions marked with (*).
iProver is available from
http://www.cs.man.ac.uk/~korovink/iprover/
iProver-Eq consists of three core components: i) ground reasoning by an SMT solver, ii) first-order equational reasoning on literals in a candidate model by a labelled unit superposition calculus [KS10,KS10] and iii) instantiation of clauses with substitutions obtained by ii).
Given a set of first-order clauses, iProver-Eq first abstracts it to a set of ground clauses which are then passed to the ground solver. If the ground abstraction is unsatisfiable, then the set of first-order clauses is also unsatisfiable. Otherwise, literals are selected from the first-order clauses based on the model of the ground solver. The labelled unit superposition calculus checks whether selected literals are conflicting. If they are conflicting, then clauses are instantiated such that the ground solver has to refine its model in order to resolve the conflict. Otherwise, satisfiability of the initial first-order clause set is shown.
Clause selection and literal selection in the unit superposition calculus are implemented in separate given clause algorithms. Relevant substitutions are accumulated in labels during unit superposition inferences and then used to instantiate clauses. For redundancy elimination iProver-Eq uses demodulation, dismatching constraints and global subsumption. In order to efficiently propagate redundancy elimination from instantiation into unit superposition, we implemented different representations of labels based on sets, AND/OR-trees and OBDDs. Non-equational resolution and equational superposition inferences provide further simplifications.
For the LTB division, iProver-Eq-SInE runs iProver-Eq as the underlying inference engine of SInE [HV11], i.e., axiom selection is done by SInE, and proof attempts are done by iProver-Eq.
If no equational literals occur in the input, iProver-Eq falls back to the inference rules of iProver, otherwise the latter are disabled and only unit superposition is used. If all clauses are unit equations, no instances need to be generated and the calculus is run without the otherwise necessary bookkeeping.
iProver-Eq is available from
http://www.cs.man.ac.uk/~sticksec/iprover-eq
Previous versions of Isabelle relied on the TPTP2X tool to translate TPTP files to Isabelle theory files. Starting this year, Isabelle includes a parser for the TPTP syntaxes CNF, FOF, TFF0, and THF0 as well as TPTP versions of its popular tools, invokable on the command line as isabelle tptp_tool max_secs file.p. For example:
Two versions of Isabelle participate this year. The demo version includes its competitors LEO-II [BP+08] and Satallax [Bro12] as Sledgehammer backends, whereas the competition version leaves them out.isabelle tptp_isabelle_demo 100 SEU/SEU824^3.p
The implementation of Isabelle relies on a small LCF-style kernel, meaning that inferences are implemented as operations on an abstract theorem datatype. Assuming the kernel is correct, all values of type theorem are correct by construction.
Most of the code for Isabelle was written by the Isabelle teams at the University of Cambridge and the Technische Universität München. Isabelle/HOL is available for all major platforms under a BSD-style license from
http://www.cl.cam.ac.uk/research/hvg/Isabelle
leanCoP can read formulae in leanCoP syntax as well as in (raw) TPTP first-order syntax. Equality axioms and axioms to support distinct objects are automatically added if required. The leanCoP core prover returns a very compact connection proof, which can be translated into different proof formats, e.g., into a lean (unofficial) TPTP syntax format for representing connection proofs [OS10] or into a readable text proof.
The leanCoP core prover and all its additional components run on any (standard) Prolog system; ECLiPSe, SICStus and SWI Prolog are currently explicitly supported. The shell script that implements the fixed strategy scheduling runs on Linux/Unix and MacOS platforms as well as on most Windows platforms.
The source code of leanCoP 2.2 is available under the GNU general public license. Together with more information it can be found on the leanCoP website at
http://www.leancop.deThe website also contains information about the leanCoP versions ileanCoP and MleanCoP, leading automated theorem provers for first-order intuitionistic logic and several first-order modal logics, respectively.
LEO-II's parser supports the TPTP THF0 language and also the TPTP languages FOF and CNF.
Unfortunately the LEO-II system still uses only a very simple sequential collaboration model with first-order ATPs instead of using the more advanced, concurrent and resource-adaptive OANTS architecture [BS+08] as exploited by its predecessor LEO.
The LEO-II system is distributed under a BSD style license, and it is available from
http://www.leoprover.org
MaLARea is available from
https://github.com/JUrban/MPTP2/tree/master/MaLAReaThe metasystem's Perl code is released under GPL2.
The system is also able to work with second order statements. It may also receive knowledge and know-how for a specific domain from a human user; see [Pas89] and [Pas93]. These two possibilities are not used while working with the TPTP Library.
Muscadet is available from
http://www.math-info.univ-paris5.fr/~pastre/muscadet/muscadet.html
The translation from HOL to Kodkod's first-order relational logic (FORL) is parameterized by the cardinalities of the atomic types occurring in it. Nitrox enumerates the possible cardinalities for the universe. If a formula has a finite counterexample, the tool eventually finds it, unless it runs out of resources.
Nitpick is optimized to work with higher-order logic (HOL) and its definitional principles (e.g., (co)inductive predicates, (co)inductive datatypes, (co)recursive functions). When invoked on untyped first-order problem, few of its optimizations come into play, and the problem handed to Kodkod is essentially a first-order relational logic (FORL) rendering of the TPTP FOF problem. There are two main exceptions:
Paradox incorporates the following features: Polynomial-time clause splitting heuristics, the use of incremental SAT, static symmetry reduction techniques, and the use of sort inference.
The internal calculus of Princess only supports uninterpreted predicates; uninterpreted functions are encoded as predicates, together with the usual axioms. Through appropriate translation of quantified formulae with functions, the e-matching technique common in SMT solvers can be simulated; triggers in quantified formulae are chosen based on heuristics similar to those in the Simplify prover.
For CASC, Princess will run a schedule with a small number of configurations for each problem (portfolio method). The schedule is determined either statically, or dynamically using syntactic attributes of problems (such as number and kind of quantifiers, etc), based on training using a random sample of problems from the TPTP library.
http://www.philipp.ruemmer.org/princess.shtml
Prover9 has available positive ordered (and nonordered) resolution and paramodulation, negative ordered (and nonordered) resolution, factoring, positive and negative hyperresolution, UR-resolution, and demodulation (term rewriting). Terms can be ordered with LPO, RPO, or KBO. Selection of the "given clause" is by an age-weight ratio.
Proofs can be given at two levels of detail: (1) standard, in which each line of the proof is a stored clause with detailed justification, and (2) expanded, with a separate line for each operation. When FOF problems are input, proof of transformation to clauses is not given.
Completeness is not guaranteed, so termination does not indicate satisfiability.
Given a problem, Prover9 adjusts its inference rules and strategy according to syntactic properties of the input clauses such as the presence of equality and non-Horn clauses. Prover9 also does some preprocessing, for example, to eliminate predicates.
In previous CASC competitions, Prover9 has used LPO to order terms for demodulation and for the inference rules, with a simple rule for determining symbol precedence. For CASC 2012, we are going to use KBO instead.
For the FOF problems, a preprocessing step attempts to reduce the problem to independent subproblems by a miniscope transformation; if the problem reduction succeeds, each subproblem is clausified and given to the ordinary search procedure; if the problem reduction fails, the original problem is clausified and given to the search procedure.
http://www.cs.unm.edu/~mccune/prover9/
Finishes in the middle of the pack are anticipated in all categories in which Prover9 competes.
http://www.cs.ru.nl/~kuehlwein/
The theoretical basis of search is a complete ground tableau calculus for higher-order logic [BS10] with a choice operator [BB11]. A problem is given in the THF format. A branch is formed from the axioms of the problem and the negation of the conjecture (if any is given). From this point on, Satallax tries to determine unsatisfiability or satisfiability of this branch.
Satallax progressively generates higher-order formulae and corresponding propositional clauses [Bro11]. These formulae and propositional clauses correspond to instances of the tableau rules. Satallax uses the SAT solver MiniSat as an engine to test the current set of propositional clauses for unsatisfiability. If the clauses are unsatisfiable, then the original branch is unsatisfiable. If there are no quantifiers at function types, the generation of higher-order formulae and corresponding clauses may terminate [BS09,BS09]. In such a case, if MiniSat reports the final set of clauses as satisfiable, then the original set of higher-order formulae is satisfiable (by a standard model in which all types are interpreted as finite sets).
A strategy schedule is an ordered collection of modes with information about how much time the mode should be allotted. Satallax tries each of the modes for a certain amount of time sequentially. Satallax 2.1 has eight strategy schedules which were determined through experimentation using the THF problems in version 5.1.0 of the TPTP library. One of these eight strategy schedules is chosen based on the amount of time Satallax is given to solve the problem. For example, if Satallax is given 180 seconds to solve the problem, then a schedule with 38 modes is chosen.
http://satallax.com
Satallax progressively generates higher-order formulae and corresponding propositional clauses [Bro11]. These formulae and propositional clauses correspond to instances of the tableau rules. Satallax uses the SAT solver MiniSat as an engine to test the current set of propositional clauses for unsatisfiability. If the clauses are unsatisfiable, then the original branch is unsatisfiable.
http://satallax.com
http://www.mpi-inf.mpg.de/~uwe/software/#TSPASS
http://www.mpi-inf.mpg.de/~uwe/software/#TSPASS
The current system implements a complete calculus based on binary resolution.
Superdeduction [BHK07] is a variant of deduction modulo [DHK03], which is an extension of logical deduction systems consisting in canonically adding ad hoc deduction rules translating axiomatic theories. This has several advantages:
http://cedric.cnam.fr/~delahaye/super-zenon/
http://gtps.math.cmu.edu/tps.html
Substitution tree and code tree indexes are used to implement all major operations on sets of terms, literals and clauses. Although the kernel of the system works only with clausal normal form, the shell accepts a problem in the full first-order logic syntax, clausifies it and performs a number of useful transformations before passing the result to the kernel. Also the axiom selection algorithm Sine [HV11] can be enabled as part of the preprocessing.
When a theorem is proved, the system produces a verifiable proof, which validates both the clausification phase and the refutation of the CNF.
Substitution tree and code tree indexes are used to implement all major operations on sets of terms, literals and clauses. Although the kernel of the system works only with clausal normal form, the shell accepts a problem in the full first-order logic syntax, clausifies it and performs a number of useful transformations before passing the result to the kernel. Also the axiom selection algorithm Sine [HV11] can be enabled as part of the preprocessing.
When a theorem is proved, the system produces a verifiable proof, which validates both the clausification phase and the refutation of the CNF.
Zenon outputs formal proofs that can be checked by Coq or Isabelle.
Zenon is available from
http://zenon-prover.org
The FOF division now includes some CNF problems; this will probably hinder Zenon's performance somewhat.