The CADE-14 ATP System Competition

to be held at

Sheraton Breakwater Hotel, Townsville, Australia
16th July, 1997

Call for Participation


The CADE-14 ATP System Competition will evaluate the performance of sound, fully automatic, 1st order ATP systems, in terms of:

in the context of:

The competition machines are SUN Ultra 140s, each with 64MB memory and running Solaris 2.5.1.

The competition is being organized by Christian Suttner and Geoff Sutcliffe. If you have any questions about the competition, please email the organizers. The competition will be overseen by a panel of knowledgeable researchers who are not participating in the event. It is planned to publish the competition results in a form that includes contributions written by entrants.

This document contains information about the:


Competition Divisions

The competition will be divided into divisions according to syntactic problem characteristics:

Entry in the competition is subject to the following rules:


Problem Selection and Preparation

The problems to be solved will be selected from the TPTP Problem Library, v2.0.0.

The TPTP difficulty rating scheme identifies problems as:

The competition will use "difficult" TPTP problems. The problems will be selected to ensure an even distribution of problem difficulty.

The TPTP distinguishes versions of problems as one of standard, non-standard, incomplete, or special. The competition will use standard TPTP problems.

The eligible problems for each competition division and category can be extracted from the TPTP using the tptp1T script that is distributed with the TPTP, as follows:

The number of problems to be used will be chosen between a minimum that ensures sufficient confidence in the competition results (the competition organizers will ensure that there are sufficient resources available), and a maximum determined from the number of workstations available, the time allocated to the competition, the number of ATP systems, and the minimal time limit:

          Number of workstations * Time for competition
Maximum = ---------------------------------------------
             Number of systems * Minimal time limit
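
As a rough illustration of the bound above, the following Python sketch computes the maximum number of problems. The function name and the example figures (10 workstations, a 6 hour session, 12 systems) are hypothetical; only the 3 minute minimal time limit is taken from this document.

```python
def max_problems(workstations, competition_minutes, systems, min_time_limit_minutes):
    """Upper bound on the number of problems, following the formula above.

    All arguments use the same time unit (minutes here). The result is
    rounded down, since only whole problems can be run.
    """
    if systems <= 0 or min_time_limit_minutes <= 0:
        raise ValueError("systems and time limit must be positive")
    total_machine_time = workstations * competition_minutes
    return int(total_machine_time // (systems * min_time_limit_minutes))


# Hypothetical figures: 10 workstations, 360 minutes, 12 systems, 3 minute limit.
print(max_problems(10, 360, 12, 3))  # 3600 / 36 = 100
```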

The problems to be used will be randomly selected on the day of the competition, from the most recent TPTP release. The tptp2X utility, distributed with the TPTP, will be used to:


Time Limits and Timing

A CPU time limit will be imposed on individual solution attempts. The time limit will be chosen between a minimum of 3 minutes (the competition organizers will ensure that there are sufficient resources available), and a maximum determined from the number of workstations available, the time allocated to the competition, the number of ATP systems, and the minimal number of problems to be used:

          Number of workstations * Time for competition
Maximum = ----------------------------------------------
          Number of systems * Minimal number of problems

The timing will be done in units of 1 second, and the minimal time to find a solution is 1 second. If an ATP system cannot solve a problem, the runtime will be set to the time limit.
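
The time limit bound and the timing rule above might be sketched in Python as follows. The function names and example figures are hypothetical, and rounding fractional CPU times up to the next whole second is an assumption: the text only says that timing is done in units of 1 second.

```python
import math

MIN_TIME_LIMIT_SECONDS = 3 * 60  # the 3 minute minimum stated above


def max_time_limit(workstations, competition_seconds, systems, min_problems):
    """Upper bound on the per-problem CPU time limit, in seconds."""
    return workstations * competition_seconds / (systems * min_problems)


def recorded_runtime(cpu_seconds, solved, time_limit):
    """Runtime entered in the results, in whole seconds.

    Unsolved problems are charged the full time limit. Solved problems
    are charged at least 1 second (rounding up is an assumption).
    """
    if not solved:
        return time_limit
    return max(1, math.ceil(cpu_seconds))


# Hypothetical figures: 10 workstations, 6 hours, 12 systems, 50 problems.
limit = max_time_limit(10, 6 * 3600, 12, 50)
print(limit >= MIN_TIME_LIMIT_SECONDS)
print(recorded_runtime(0.2, True, 300), recorded_runtime(50, False, 300))
```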


System Execution

It is the responsibility of each entrant to ensure that their ATP system is operational on the competition hardware, by 4th July 1997. The ATP systems will be tested for soundness before the competition. Systems which fail this test, or are found to be unsound at any time during or after the competition, will be disqualified.

The ATP systems will be executed by a shell script that invokes each system by a single command line, with the TPTP file name, the time limit (if required by the entrant), and entrant-specified system switches (the same for all problems) as command line arguments. The command line may not use UNIX shell features, e.g., redirection and piping cannot be used.
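
To make the command line constraints concrete, here is a minimal Python sketch of how such an invocation could be assembled and checked. It is not the organizers' shell script. The function name, the argument order, the set of rejected characters, and the example prover and problem names are all assumptions made for illustration.

```python
# Characters that would indicate redirection, piping, or other shell features.
SHELL_METACHARACTERS = set("|<>&;")


def build_command(executable, problem_file, switches, time_limit=None):
    """Assemble one command line as an argument list.

    The order (switches, optional time limit, problem file) is an
    assumption; the text does not prescribe one.
    """
    argv = [executable, *switches]
    if time_limit is not None:
        argv.append(str(time_limit))
    argv.append(problem_file)
    for arg in argv:
        if SHELL_METACHARACTERS & set(arg):
            raise ValueError(f"shell feature not allowed in argument: {arg!r}")
    return argv


# Hypothetical prover, switch, and problem file names.
print(build_command("./prover", "PUZ001-1.p", ["-auto"], time_limit=300))
```

Passing a list like this to `subprocess.run` without `shell=True` starts the program directly, so no shell is involved in interpreting the arguments.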

The time limit will be imposed by sending a SIGXCPU (signal 30) to the ATP system. The ATP systems must be interruptible by SIGXCPU.
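
One way a system could be made interruptible by SIGXCPU is sketched below in Python, assuming a UNIX platform. The names `TIMEOUT_STRING`, `TimeLimitReached`, and `run_with_sigxcpu_handling` are hypothetical, and the result string is a placeholder for whatever the entrant specifies.

```python
import signal

TIMEOUT_STRING = "RESULT: timeout"  # placeholder; the entrant chooses the real string


class TimeLimitReached(Exception):
    """Raised inside the search when SIGXCPU arrives."""


def _on_sigxcpu(signum, frame):
    raise TimeLimitReached()


def run_with_sigxcpu_handling(search):
    """Run search() and return its result, or TIMEOUT_STRING if interrupted.

    The previous SIGXCPU handler is restored afterwards.
    """
    previous = signal.signal(signal.SIGXCPU, _on_sigxcpu)
    try:
        return search()
    except TimeLimitReached:
        return TIMEOUT_STRING
    finally:
        signal.signal(signal.SIGXCPU, previous)
```

A system built this way would print the returned string to stdout before exiting. Python only allows signal handlers to be installed from the main thread, so the wrapper must be called there.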

When terminating, the ATP system must output a distinguished string (specified by the entrant) to stdout, indicating the result:

The ATP systems are not required to output solutions (proofs or models). However, systems that do output solutions will be highlighted in the presentation of results.

For every problem solved, the solution process must be reproducible by running the system again.


Performance Evaluation

The systems will be ranked within each competition division and category. The ranking will be according to the number of problems solved. If several systems solve the same number of problems, then those systems will be ranked according to their average runtimes over solutions found.

Note: If only one ATP system is registered for a particular division or category, no winner can be announced for that division or category, but the results for that system will still be presented.
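
The ranking rule above can be sketched in Python as follows. The class and function names and the sample data are hypothetical. Placing a system with no solutions last among its ties is an assumption, since the text does not cover that case.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class SystemResult:
    name: str
    runtimes: List[int]  # runtime in seconds for each problem solved


def average_runtime(result):
    """Average runtime over solutions found (infinite if none were found)."""
    if not result.runtimes:
        return float("inf")
    return sum(result.runtimes) / len(result.runtimes)


def rank(results):
    """Order by problems solved (descending), then average runtime (ascending)."""
    return sorted(results, key=lambda r: (-len(r.runtimes), average_runtime(r)))


# Hypothetical results: A and B both solve two problems, B is faster on average.
results = [
    SystemResult("A", [10, 20]),
    SystemResult("B", [5, 15]),
    SystemResult("C", [1]),
]
print([r.name for r in rank(results)])  # ['B', 'A', 'C']
```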


FOF (First Order Form) Demonstration Division

The infrastructure for a FOF competition division will not be ready for CASC-14. However, systems that can deal with FOF syntax can demonstrate their abilities in the FOF Demonstration division (the tptp2X utility will contain a clausifier that may be prepended to a CNF system to form a FOF system; the tptp2X runtime is included in the total runtime). Ideally the systems will run on locally provided standard UNIX workstations, but use of any hardware supplied by the entrant or accessed via the Internet is acceptable.

The FOF Demonstration division will use non-trivial FOF theorems randomly selected from the TPTP. A CPU time limit, equal to the one in the competition divisions, will be imposed on each solution attempt. The system execution will be controlled by a Perl script, provided by the competition organizers.

The results will be presented, but no winner assessment will be made.


Special Hardware Demonstration Divisions

CNF ATP systems that cannot run on the locally provided standard UNIX workstations may enter the Special Hardware Demonstration divisions. The hardware is supplied by the entrant or accessed via the Internet.

The rules for entry are the same as for the competition divisions, and the same problems will be used. A wall clock time limit, equal to the CPU time limit in the competition divisions, will be imposed on each solution attempt. The system execution will be controlled by a Perl script, provided by the competition organizers.

The results will be presented along with the competition divisions' results, but no winner assessment will be made.


Conditions for Participation

Entering CASC-14 is subject to the following rules:

Disclaimer

Every effort is being made to organize the competition in a fair and constructive manner. No responsibility will be taken if, for one reason or another, your system does not win.


Competition Dinner

A dinner for entrants and associates will be held on the evening of 14th July. To attend this dinner it is necessary to register for the competition, using the CADE-14 registration form.

Note: You have to be directly associated with an entered ATP system to attend this dinner. It's an exclusive event. (Spouses are welcome to come along; please indicate this in your registration.)


ATP System Registration Form

Registration Deadline: 15th June 1997

Register as early as possible, so that the organizers can ensure that sufficient resources are available. Do it now!


ATP System Name
Competition Divisions MIX UEQ SAT
FOF Demonstration Division
SH Demonstration Divisions MIX UEQ SAT FOF

Entrant (handling all competition participation issues)
Name
Institution
Email
Phone
FAX
System URL
Associates' names

There is a competition registration fee of AU$50, which includes the competition dinner and an elegant CADE-14 ATP System Competition T-shirt. The nominated entrant must register and pay the fee for the competition, using the CADE-14 registration form (you don't need to do that right now, but you should send in this system registration immediately). If you cannot come to Townsville, we'll post you a T-shirt and eat your dinner. Associates who wish to attend the competition dinner and get a T-shirt must also register.

Your system registration will be confirmed by email.