Sheraton Breakwater Hotel, Townsville, Australia
16th July, 1997
The CADE-14 ATP System Competition will evaluate the performance of sound, fully automatic, first-order ATP systems, in terms of:
The competition machines are SUN Ultra 140s, each with 64MB memory and running Solaris 2.5.1.
The competition is being organized by Christian Suttner and Geoff Sutcliffe. If you have any questions about the competition, please email the organizers. The competition will be overseen by a panel of knowledgeable researchers who are not participating in the event. It is planned to publish the competition results in a form that includes contributions written by entrants.
This document contains information about the:
The TPTP difficulty rating scheme identifies problems as:
The TPTP classifies each problem version as standard, non-standard,
incomplete, or special.
The competition will use standard TPTP problems.
The eligible problems for each competition division and category can be
extracted from the TPTP using the tptp1T script that is distributed with
the TPTP, as follows:
The number of problems to be used will be chosen between a minimum that
ensures sufficient confidence in the competition results (the competition
organizers will ensure that there are sufficient resources available), and
a maximum determined from
the number of workstations available,
the time allocated to the competition,
the number of ATP systems, and
the minimal time limit:
The problems to be used will be randomly selected on the day of the
competition, from the most recent TPTP release.
The tptp2X utility, distributed with the TPTP, will be used to:
The timing will be done in units of 1 second, and the minimal time to find
a solution is 1 second.
If an ATP system cannot solve a problem, the runtime will be set to the
time limit.
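The timing rules above can be sketched as a small function. This is only an illustration, not the organizers' script: the name `recorded_runtime` and its parameters are hypothetical, and rounding fractional times up to the next whole second is an assumption, since the rules state only the unit and the minimum.

```python
import math


def recorded_runtime(cpu_seconds, solved, time_limit):
    """Return the runtime, in whole seconds, recorded for one solution attempt.

    Assumptions (not stated in the rules): fractional times are rounded up,
    and a solved attempt is never recorded above the time limit.
    """
    if not solved:
        # An unsolved problem is charged the full time limit.
        return time_limit
    # Timing is in units of 1 second, and the minimal time is 1 second.
    return min(time_limit, max(1, math.ceil(cpu_seconds)))
```

Under these assumptions, a solution found in 0.2 seconds is recorded as 1 second, and an unsolved problem with a 300 second limit is recorded as 300 seconds.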
The ATP systems will be executed by a shell script that invokes each system
by a single command line, with the TPTP file name, the time limit (if required
by the entrant), and entrant specified system switches (the same for all
problems) as command line arguments.
The command line may not use UNIX shell features; for example, redirection
and piping cannot be used.
The time limit will be imposed by sending a SIGXCPU (signal 30) to the ATP
system.
The ATP systems must be interruptible by SIGXCPU.
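Entrants may want to check locally that their system behaves sensibly when it receives SIGXCPU. The sketch below is one possible way to do that, assuming a UNIX system with Python available. It relies on the operating system's CPU resource limit, which delivers SIGXCPU when the soft limit is exceeded; this is not necessarily how the competition script imposes the limit. The function name `run_with_cpu_limit` is hypothetical. The signal is referred to by name because its number differs between platforms.

```python
import resource
import signal
import subprocess


def run_with_cpu_limit(command, cpu_limit):
    """Run `command` (a list of arguments, no shell) under a CPU time limit.

    Returns True if the process was terminated by SIGXCPU, False otherwise.
    """

    def set_limit():
        # Runs in the child just before exec. Once the soft limit is
        # exceeded, the operating system sends SIGXCPU to the process.
        _, hard = resource.getrlimit(resource.RLIMIT_CPU)
        if hard == resource.RLIM_INFINITY:
            soft = cpu_limit
        else:
            soft = min(cpu_limit, hard)
        resource.setrlimit(resource.RLIMIT_CPU, (soft, hard))

    # A list of arguments with no shell means no redirection or piping.
    completed = subprocess.run(command, preexec_fn=set_limit)
    # subprocess reports termination by a signal as a negative return code.
    return completed.returncode == -signal.SIGXCPU
```

A system that installs its own SIGXCPU handler and exits normally will make this function return False, so the return value only detects the default (uncaught) case.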
When terminating, the ATP system must output a distinguished string (specified
by the entrant) to stdout, indicating the result:
For every problem solved, the solution process must be reproducible by running
the system again.
Note: If only one ATP system is registered for a particular division or
category, no winner can be announced for that division or category, but the
results for that system will still be presented.
The infrastructure for a FOF competition division will not be ready for
CASC-14.
However, systems that can deal with FOF syntax can demonstrate their abilities
in the FOF Demonstration division
(the tptp2X utility will contain a clausifier that may be prepended to a CNF
system to form a FOF system; the tptp2X runtime is included in the total
runtime).
Ideally the systems will run on locally provided standard UNIX workstations,
but use of any hardware supplied by the entrant or accessed via the
Internet is acceptable.
The FOF Demonstration division will use non-trivial FOF theorems randomly
selected from the TPTP.
A CPU time limit, equal to the one in the competition divisions, will be
imposed on each solution attempt.
The system execution will be controlled by a Perl script, provided
by the competition organizers.
The results will be presented, but no winner assessment will be made.
The rules for entry are the same as for the competition divisions, and
the same problems will be used.
A wall clock time limit, equal to the CPU time limit in the competition
divisions, will be imposed on each solution attempt.
The system execution will be controlled by a Perl script, provided
by the competition organizers.
The results will be presented along with the competition divisions' results,
but no winner assessment will be made.
Note: You have to be directly associated with an entered ATP system to attend
this dinner.
It's an exclusive event.
(Spouses are welcome to come along; please indicate this in your registration.)
Competition Divisions
Entry in the competition is subject to the following rules:
Mixed means Horn and non-Horn problems, with or without
equality, but not unit equality problems (see the UEQ division below).
The MIX division is divided into four categories:
Theorems containing only unit equality clauses.
Problem Selection and Preparation
The competition will use "difficult" TPTP problems.
The problems will be selected to ensure an even distribution of problem
difficulty.
tptp1T CNF NonProp Unsatisfiable Rating 0.01 0.99 Standard Horn NoEq
tptp1T CNF NonProp Unsatisfiable Rating 0.01 0.99 Standard NonHorn NoEq
tptp1T CNF NonProp Unsatisfiable Rating 0.01 0.99 Standard Horn SomeEq
and
tptp1T CNF NonProp Unsatisfiable Rating 0.01 0.99 Standard Horn NonUnitEq
tptp1T CNF NonProp Unsatisfiable Rating 0.01 0.99 Standard NonHorn SomeEq
tptp1T CNF NonProp Unsatisfiable Rating 0.01 0.99 Standard UnitEq
tptp1T CNF NonProp Satisfiable Rating 0.00 0.00 Standard
and
tptp1T CNF NonProp Satisfiable Rating 0.00 0.00 Special
Note that all the problems are "easy".
We have had to accept this to get some eligible problems.
Further, Special problems have been accepted, as there
are too few Standard problems.
Finally, there have been a few changes to problem version status since
the release of TPTP v2.0.0, so check the explicit eligible problems
list.
tptp1T FOF Standard NonProp Theorem
There is no Rating information available for
FOF problems.
          Number of workstations * Time for competition
Maximum = ---------------------------------------------
             Number of systems * Minimal time limit
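As a worked illustration of this bound, the following sketch computes the maximum number of problems. The function name and the example figures are hypothetical, and rounding down to a whole number of problems is an assumption.

```python
def max_problems(workstations, competition_seconds, systems, min_time_limit):
    """Upper bound on the number of problems, following the formula above.

    All times are in seconds. The result is rounded down, on the assumption
    that only whole problems can be run.
    """
    if systems <= 0 or min_time_limit <= 0:
        raise ValueError("systems and min_time_limit must be positive")
    return (workstations * competition_seconds) // (systems * min_time_limit)


# Hypothetical figures: 10 workstations, 6 hours, 12 systems, 300 s limit.
print(max_problems(10, 6 * 3600, 12, 300))  # prints 60
```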
Time Limits and Timing
          Number of workstations * Time for competition
Maximum = ----------------------------------------------
          Number of systems * Minimal number of problems
System Execution
The ATP systems are not required to output solutions (proofs or models).
However, systems that do output solutions will be highlighted in the
presentation of results.
Performance Evaluation
FOF (First Order Form) Demonstration Division
Special Hardware Demonstration Divisions
Conditions for Participation
Disclaimer
Every effort is being made to organize the competition in a fair and
constructive manner.
No responsibility will be taken if, for one reason or another, your system
does not win.
Competition Dinner