Introduction

The TPTP (Thousands of Problems for Theorem Provers) is a library of test problems for Automated Theorem Proving (ATP) systems, in classical and non-classical logics, with an interpreted equality symbol and assuming the axiom of choice. The TPTP contains problems in first-order form (FOF), clause normal form (CNF), monomorphic and polymorphic typed first-order form (TFF, which includes the extended first-order form (TXF)), monomorphic and polymorphic typed higher-order form (THF), and dependently typed monomorphic and polymorphic higher-order form (DHF). Interpreted arithmetic types and symbols are supported in the typed logics.

The TPTP supplies the automated reasoning community with a comprehensive library of the ATP test problems that are available today, in order to provide an overview and a simple, unambiguous reference mechanism. The principal motivation for the TPTP is to support the testing and evaluation of ATP systems, to help ensure that performance results accurately reflect the capabilities of the ATP system being considered. A common library of problems is necessary for meaningful system evaluations, meaningful system comparisons, repeatability of testing, and the production of statistically significant results. The TPTP problems are stored in a specifically designed, easy-to-understand format. Utilities for manipulating the problems, for converting the problems to other known ATP formats, and for finding problems with certain characteristics are provided. Since its first release in 1993, many researchers have used the TPTP as an appropriate and convenient basis for ATP system evaluation.

This technical manual explains the motivations and reasoning behind the development of the TPTP, and thus implicitly explains the design decisions made. It also serves as a manual explaining the structure and use of the TPTP: it provides a full description of the TPTP contents and organization, details of the utility programs, and guidelines for obtaining and using the TPTP.

What's New in TPTP v9.0.0 (since TPTP v8.0.0):

Previous Problem Collections

A large number of interesting problems had accumulated over the years in the ATP community. Besides publishing particularly interesting individual problems, from early on researchers collected problems in order to obtain a basis for experimentation. Problems in First Order Form (FOF), published by mathematicians and logicians prior to the mechanization of reasoning (e.g., [Chu56]), provided the first source for ATP researchers. The first major publication in this regard was [MOW76], which provides an explicit listing of clauses for 63 Clause Normal Form (CNF) problems, many of which are still relevant today (to our knowledge, the first circulation of problems for testing ATP systems was due to Larry Wos in the late sixties). In the same year, [WM76] documented 86 CNF problems that were commonly used for ATP testing. However, the problem clauses are not supplied in [WM76]. A second major thrust was provided by [Pel86], which lists 75 problems in both FOF and CNF. Other early papers are [BL+86], [Qua92], [MW92], and [McC93], to name a few. The Journal of Automated Reasoning's Problem Corner also provided interesting challenge problems. Problems published in hardcopy form are, however, often not suitable for testing ATP systems, because they have to be transcribed to electronic form. This is a cumbersome, error-prone process, and is feasible for only small numbers of small problems.

The sparseness of research into ATP systems for FOF problems meant that no electronic collections of FOF test problems had previously been commonly available. A CNF problem collection in electronic form was made publicly available by Argonne National Laboratories (in Otter format [McC94]) in 1988 [ANL]. This collection was a major source of problems for ATP researchers. Other electronic collections of CNF problems have been available, but not announced officially (e.g., that distributed with the SPRFN ATP system [SPRFN]). Although some of these collections provided significant support to researchers, and formed the early core of the TPTP library, none (with the possible exception of the ANL collection) was specifically designed to serve as a common basis for ATP research. Rather, these collections typically were built in the course of research into a particular ATP system. As a result, there are several factors that limited their usefulness as a common basis for research. In particular, previously existing problem collections:

The problem of meaningfully interpreting results can be even worse than indicated. A few problems might be selected and hand-tuned (formulae arranged in a special way, irrelevant formulae omitted, lemmas added in, etc.) specifically for the ATP system being tested. The presentation of a problem can significantly affect the nature of the problem, and changing the formulae clearly makes a different problem altogether. Nevertheless, the problem might be referenced under the same name as it was presented elsewhere. As a consequence, the experimental results reveal little. Some researchers avoid this ambiguity by listing the formulae explicitly, but obviously this usually cannot be done for a large number of problems or for large individual problems. The only satisfactory solution to these issues is a common and stable library of problems. The TPTP is such a library.

What is Required?

The goal for building the TPTP has been to overcome previous drawbacks, and to centralize the burden of problem collection and maintenance. The TPTP tries to address all relevant issues. In particular, the TPTP:

The development of the TPTP problem library is an ongoing project, with the aim of providing all of the desired properties.