Inside the TPTP

Scope.
The TPTP contains problems in classical and non-classical logic, with an interpreted equality symbol. Release v9.0.0 of the TPTP contains:

15888 abstract problems, which result in 25775 ATP problems (due to alternative presentations and Problem Generators).
25628 problems in classical logic, and 147 problems in non-classical logic
20172 problems with equality
3549 THF abstract problems, which result in 5042 THF ATP problems
2439 TFF abstract problems, which result in 3297 TFF ATP problems, of which 436 are TXF problems
6033 FOF abstract problems, which result in 9091 FOF ATP problems
6312 CNF abstract problems, which result in 8345 CNF ATP problems

The TPTP documents OverallSynopsis, THFSynopsis, TFFSynopsis, FOFSynopsis, and CNFSynopsis provide some overall statistics about the TPTP.

The problems in the TPTP are syntactically diverse, as is indicated by the ranges of the values in the tables. The problems in the TPTP are also semantically diverse, as is indicated by the range of domains that are covered. The problems are grouped into domains, covering topics in the fields of logic, mathematics, computer science, engineering, social sciences, arts and humanities, and others.

Sources
The problems have been collected from various sources. Early sources were existing electronic problem collections and the ATP literature. Other sources include logic programming, mathematics, puzzles, and correspondence with ATP researchers. Many people and organizations have contributed towards the TPTP: the foundations of the TPTP were laid with David Plaisted's SPRFN collection; many problems were taken from Argonne National Laboratory's ATP problem library (special thanks to Bill McCune here); Art Quaife provided several hundred problems in set theory and algebra; the Journal of Automated Reasoning, CADE Proceedings, and Association for Automated Reasoning Newsletters have provided a wealth of material; many problems have been provided by ATP users and system developers (see the Acknowledgements).

Releases
The TPTP is managed in the manner of a software product, in the sense that fixed releases are made. Each release of the TPTP is identified by a release number, in the form vVersion.Edition.Patch level. The Version number enumerates major new releases of the TPTP, in which important new features have been added. The Edition number is incremented each time new problems are added to the current version. The Patch level is incremented each time errors, found in the current edition, are corrected. All non-trivial changes are recorded in a history file, as well as in the file for an affected problem.

The Domain Structure

This section provides the structure according to which the problems are grouped into domains. Some information about the domains is also given.

An attempt has been made to classify the totality of the TPTP problems in a systematic and natural way. The resulting domain scheme reflects the natural hierarchy of scientific domains, as presented in standard subject classification literature. The current classification is based mainly on the Dewey Decimal Classification (DDC) [CB+89] and the Mathematics Subject Classification (MSC) [MSC92] used for the Mathematical Reviews by the American Mathematical Society. Seven fields are defined: logic, mathematics, computer science, engineering, social sciences, arts and humanities, and other. Each field contains further subdivisions, called domains. Each domain is identified by a three-letter mnemonic. These mnemonics are also part of the TPTP problem naming scheme. The TPTP domains constitute the basic units of the classification. The full classification scheme is shown in Figure The Domain Structure of the TPTP, and the numbers of abstract problems, problems, and generic problems in each domain are shown in the TPTP document OverallSynopsis

Logic	Combinatory logic		`COL`
	Logic calculi		`LCL`
	Henkin models		`HEN`
Mathematics	Set theory		`SET`, `SEU`, and `SEV`
	Graph theory		`GRA`
	Algebra	Relation algebra	`REL`
		MV Algebras	`MVA`
		Boolean algebra	`BOO`
		Robbins algebra	`ROB`
		Left distributive	`LDA`
		Lattices	`LAT`
		Quantales	`QUA`
		Kleene algebra	`KLE`
		Groups	`GRP`
		Rings	`RNG`
		Fields	`FLD`
		Homological algebra	`HAL`
		Real Algebra	`RAL`
		General algebra	`ALG`
	Arithmetic		`ARI`
	Number theory		`NUM` and `NUN`
	Topology		`TOP`
	Analysis		`ANA`
	Geometry		`GEO`
	Category theory		`CAT`
Computer science	Computing theory		`COM`
	Knowledge representation		`KRS`
	Natural Language Processing		`NLP`
	Planning		`PLA`
	Agents		`AGT`
	Commonsense Reasoning		`CSR`
	Semantic Web		`SWB`
	Interactive Theorem Proving		`ITP`
	Data Structures		`DAT`
	Software creation		`SWC`
	Software verification		`SWV` and `SWW` and `SWX`
Science and Engineering	Biology		`BIO`
	Hardware creation		`HWC`
	Hardware verification		`HWV`
	Medicine		`MED`
	Processes		`PRO`
	Products		`PRD`
	Time		`TIM`
Social sciences	Social Choice Theory		`SCT`
	Management		`MGT`
	Geography		`GEG`
Arts and Humanities	Philosophy		`PHI`
Other	Syntactic		`SYN` and `SYO` and `SYP`
	Puzzles		`PUZ`
	Miscellaneous		`MSC`

The Domain Structure of the TPTP.

A brief description of the domains, with a non-ATP reference for a general introduction and a generic ATP reference, is given below. For each domain, appropriate DDC and MSC numbers are also given if available:

AGT Agents.
An agent is an autonomous software component of a computer program, typically designed to act intelligently and communicate with other agents.
Indices: DDC ???; MSC 68T35.
References: General [RN95]; ATP --.
ALG Algebra.
Algebra is a branch of mathematics concerning the study of structure, relation, and quantity. An algebra is a set with a system of operations defined on it.
Indices: DDC 512; MSC 06XX, 20XX.
References: General [Bou89, BM65, BB70]; ATP --.
ARI Arithmetic.
Arithmetic problems that do arithmetic computations (but do not reason about arithmetic (see the NUM domain).
Indices: DDC 513; MSC 97FXX.
References: General [Hig10]; ATP [Sim86].
ANA Analysis.
Analysis is a branch of mathematics concerned with functions and limits. The main parts of analysis are differential calculus, integral calculus, and the theory of functions.
Indices: DDC 515; MSC 26XX.
References: General [Ros90]; ATP [Ble90].
BIO Biology.
Biology is a natural science concerned with the study of life and living organisms, including their structure, function, growth, evolution, distribution, and taxonomy.
Indices: DDC 570; MSC 92XX.
References: General [RU+13]; ATP [CC+13, CDI13].
BOO Boolean Algebra.
A Boolean algebra is a set of elements with two binary operations that are idempotent, commutative, and associative. These operations are mutually distributive, there exist universal bounds 0, 1, and there is a unary operation of complementation.
Indices: DDC 511.324, 512.89; MSC 06EXX.
References: General [Whi61, BM65, BB70]; ATP --.
CAT Category Theory.
A category is a mathematical structure together with the morphisms that preserve this structure.
Indices: DDC 512.55; MSC 18XX.
References: General [Mac71]; ATP [MOW76].
COL Combinatory Logic.
Combinatory logic is about applying one function to another. It can be viewed as an alternative foundation of mathematics (or, due to its Turing-completeness, as a programming language). More formally, it is a system satisfying two combinators and satisfying reflexivity, symmetry, transitivity, and two equality substitution axioms for the function that exists implicitly for applying one combinator to another.
Indices: DDC 510.101; MSC 03B40.
References: General [CF58, CHS72, Bar81]; ATP [WM88].
COM Computing Theory.
Computing theory is a subfield of computer science dealing with theoretical issues such as decidability (whether or not a given problem admits an algorithmic solution), completeness (does an algorithm always find a solution if one exists?), correctness (are only solutions produced?), and computational complexity (the resource requirements of algorithms).
Indices: DDC 004-006; MSC 68XX.
References: General [HU79]; ATP --.
CSR Commonsense Reasoning.
Commonsense reasoning is the branch of artificial intelligence concerned with replicating human thinking. There are several components to this problem, including: developing adequately broad and deep commonsense knowledge bases; developing reasoning methods that exhibit the features of human thinking, including the ability to reason with knowledge that is true by default, the ability to reason rapidly across a broad range of domains, and the ability to tolerate uncertainty in your knowledge; developing new kinds of cognitive architectures that support multiple reasoning methods and representations.
Indices: DDC 121.3; MSC 68TXX.
References: General [Sha97]; ATP [Lif95, McC59, SME04]
DAT Data Structures.
A data structure is a particular way of storing and organizing data in a computer so that it can be used efficiently.
Indices: DDC 005.73; MSC 68P05.
References: General [AHU87]; ATP --.
FLD Field Theory.
A field is ring (see below) in which the * operation is commutative, and for which there is an identity element in G, and for which each non-identity element of G has an inverse in G.
Indices: DDC 512.32; MSC 12XX.
References: General [Ada82]; ATP [Dra93].
GEG Geography.
Geography is the study of the Earth and its lands, features, inhabitants, and phenomena.
Indices: DDC 910; MSC 91C99.
References: General [NGS99]; ATP --.
GEO Geometry.
Geometry is a branch of mathematics that deals with the measurement, properties, and relationships of points, lines, angles, surfaces, and solids.
Indices: DDC 516; MSC 51.
References: General [Tar51, Tar59]; ATP [Qua92].
GRA Graph Theory.
A graph consists of a finite non-empty set of vertices together with a prescribed set of edges, each edge connecting a pair of vertices.
Indices: DDC 510.09; MSC 05CXX, 68R10.
References: General [Har69, BB70]; ATP --.
GRP Group Theory.
A group is a set G and a binary operation +:GxG -> G which is associative, and for which there is an identity element in G, and for which each element of G has an inverse in G.
Indices: DDC 512.2; MSC 20
References: General [Bou89, BM65]; ATP [MOW76].
HAL Homological Algebra.
Homological algebra is an abstract algebra concerned with results valid for many different kinds of spaces. Modules are the basic tools used in homological algebra.
Indices: DDC 512.55; MSC 18XX.
References: General [Wei94]; ATP --.
HEN Henkin Models.
Henkin models provide a generalized semantics for higher order logics. This leads to a larger class of models and, as a consequence, fewer true sentences. However, in contrast to standard semantics, complete and correct calculi can be found.
Indices: DDC 160; MSC 03CXX.
References: General [Hen50, Leb83]; ATP --.
HWC Hardware Creation.
Computer hardware is created by inter-connecting logic gates. Hardware creation is used to form a circuit that will transform given input patterns to required output patterns.
Indices: DDC 621.395; MSC 94CXX.
References: General [Hay93]; ATP [WW83].
HWV Hardware Verification.
Hardware verification is used to ensure that a previously designed circuit performs the desired transformation of input patterns to required output patterns. One approach is to check the performance of the circuit for every possible combination of given inputs. Other techniques are also used.
Indices: DDC 621.395; MSC 94CXX.
References: General [Hay93]; ATP [Woj83].
ITP Interactive Theorem Proving.
An interactive theorem prover is a software tool to assist with the development of formal proofs by human-machine collaboration. This involves some sort of interactive proof editor, or other interface, with which a human can guide the search for proofs, the details of which are stored in, and some steps provided by, a computer.
Indices:
References: General [Wie06, Geu09]; ATP [Urb06, BG+19].
KLE Kleene Algebra.
A Kleene Algebra is a bounded distributive lattice with an involution satisfying De Morgan's laws, and the inequality x∧−x ≤ y∨−y. Alternatively, a Kleene Algebra is an algebraic structure that generalizes the operations known from regular expressions.
Indices: DDC 512.74; MSC 06A99
References: General [Koz90, BMS03]; ATP [HS07].
KRS Knowledge Representation.
Knowledge Representation is concerned with writing down descriptions that can be manipulated by a machine.
Indices: DDC 006.3; MSC 68T30.
References: General [BL85, CM87]; ATP --.
LAT Lattice Theory.
A lattice is a set of elements, with two binary operations which are idempotent, commutative, and associative, and which satisfy the absorption law.
Indices: DDC 512.865; MSC 06BXX.
References: General [BM65]; ATP [McC88].
LCL Logic Calculi.
A logic calculus defines axioms and rules of inference that can be used to prove theorems. Indices: DDC 511.3; MSC 03XX. References: General [Luk63]; ATP [MW92].
LDA Left Distributive Algebra.
LD-algebras are related to large cardinals. Under a very strong large cardinal assumption, the free-monogenic LD-algebra can be represented by an algebra of elementary embeddings. Theorems about this algebra can be proven from a small number of properties, suggesting the definition of an embedding algebra.
Indices: DDC 512; MSC 20N02, 03E55, 08B20
References: General --; ATP [Jec93].
LIN Linear Algebra.
Linear algebra is the branch of mathematics concerning vector spaces and linear mappings between such spaces. Indices: DDC 512.5; MSC 15Axx.
References: General [LH+01]; ATP --.
MVA MV Algebras.
An MV algebra is an algebraic structure with a binary "plus" operation, a unary "negate" operation, and the constant 0, satisfying certain axioms. Indices: DDC 512; MSC 03G20
References: General [Mun11, GT75]; ATP --.
MED Medicine.
The science of diagnosing and treating illness, disease, and injury.
Indices: DDC 610; MSC --.
References: General [LH+01]; ATP [HLB05].
MGT Management.
Management is the study of systems, and their use and production of resources.
Indices: DDC 302-303; MSC 90XX.
References: General --; ATP [PM94, PB+94].
MSC Miscellaneous.
A collection of problems which do not fit well into any other domain.
NLP Natural Language Processing.
Natural language processing considers the automated generation and comprehension of languages used by humans.
Indices: DDC 006.3; MSC 68T50
References: General [Obe89i, GWS86]; ATP [Bos00].
NUM Number Theory.
Number theory is the study of integers and their properties.
Indices: DDC 512.7; MSC 11YXX.
References: General [HW92]; ATP [Qua92].
PHI Philosophy.
Philosophy is the study of general and fundamental problems, such as those connected with reality, existence, knowledge, values, reason, mind, and language.
Indices: DDC 100; MSC -
References: General [TE99]; ATP -.
PLA Planning.
Planning is the process of determining the sequence of actions to be performed by an agent, to reach a specified desired state from a specified initial state.
Indices: DDC 006.3; MSC 68T99.
References: General [AK+91]; ATP [Pla81, Pla82].
PRO Processes.
Process information includes production scheduling, process planning, workflow, business process reengineering, simulation, process realization process modeling, and project management.
Indices: DDC 670; MSC 93-XX.
References: General [Sch99]; ATP [Rei01, BG05].
PRD Products.
A product is anything that can be offered to a market that might satisfy a want or need.
Indices: DDC ???; MSC ??.
References: General [KA+06]; ATP --.
QUA Quantales.
Quantales are partially ordered algebraic structures that generalize locales (point free topologies) as well as various lattices of multiplicative ideals from ring theory and functional analysis. Quantales are sometimes referred to as complete residuated semigroups.
Indices: DDC 512; MSC 06F07.
References: General [Mul01, Con71]; ATP --.
PUZ Puzzles.
Puzzles are designed to test the ingenuity of humans.
Indices: DDC 510, 793.73; MSC --.
References: General [Car86, Smu78, Smu85]; ATP --.
RAL Real Algebra.
Real algebra is the study of "real" objects such as real rings, real places and real varieties. Indices: DDC 512; MSC 13J30
References: General [Lam84]; ATP --.
REL Relation Algebra.
A relation algebra is a residuated Boolean algebra supporting an involutary unary operation called converse. The motivating example of a relation algebra is the algebra 2^X^2 of all binary relations on a set X, with RoS interpreted as the usual composition of binary relations.
Indices: DDC 512; MSC 03G15, 06D99
References: General [SS93, Mad06]; ATP [HS08].
RNG Ring Theory.
A ring is a group (see above) in which the binary operation is commutative, with an associative and distributive operation *:GxG -> G for which there is an identity element in G.
Indices: DDC 200; MSC -.
References: General [Bou89, BB70]; ATP [MOW76].
ROB Robbins Algebra.
The Robbins Algebra domain revolves around the question "Is every Robbins algebra Boolean?". Most of the problems in this domain identify conditions that make a near-Boolean algebra Boolean.
Indices: DDC 512; MSC 03G15.
References: General [HMT71]; ATP [Win90].
SCT Social Choice Theory.
Social choice theory is a theoretical framework for measuring individual interests, values, or welfares as an aggregate towards collective decision..
Indices: DDC 330.1; MSC -.
References: General [Arr63]; ATP [Nip09].
SET, SEU, and SEV Set Theory.
Classically, a set is a totality of certain definite, distinguishable objects of our intuition or thought - called the elements of the set. Due to paradoxes that arise from such a naive definition, mathematicians now regard the notion of a set as an undefined, primitive concept [How72].
Indices: DDC 511.322, 512.817; MSC 03EXX, 04XX.
References: General [Neu25, Qui69]; ATP [Qua92].
SWB Semantic Web.
Semantic Web is a group of methods and technologies to allow machines to understand the meaning – or "semantics" – of information on the World Wide Web.
Indices: DDC ??.?; MSC --.
References: General [BHL01]; ATP [HV06].
SWC Software Creation.
Software creation is used to form a computer program that meets given specifications.
Indices: DDC ???.?; MSC 68N30.
References: General --; ATP --.
SWV and SWW and SWX Software Verification.
Software verification formally establishes that a computer program does the task it is designed for.
Indices: DDC 005.14; MSC 68Q60.
References: General --; ATP [WO+92, MOW76].
SYN and SYO Syntactic.
Syntactic problems have no obvious semantic interpretation.
Indices: DDC --; MSC --.
References: General [Chu56]; ATP [Pel86].
TIM Time.
Time is the continuous progression of existence that occurs in an apparently irreversible succession from the past, through the present, and into the future. Indices: DDC 115, 529, 389.17; MSC --.
References: General [Haw88, Har11]; ATP [All83, AH89].
TOP Topology.
Topology is the study of properties of geometric configurations (e.g., point sets) which are unaltered by elastic deformations (homeomorphisms, i.e., functions that are 1-1 mappings between sets, such that both the function and its inverse are continuous).
Indices: DDC 514; MSC 46AXX.
References: General [Kel55, Mun75]; ATP [WM89].

Problem Versions and Standard Axiomatizations

There are often many ways to formulate a problem for presentation to an ATP system. Thus, in the TPTP, there are often alternative presentations of a problem. The alternative presentations are called versions of the underlying abstract problem. As the problem versions are the objects that ATP systems must deal with, they are referred to simply as problems, and the abstract problems are referred to explicitly as such. Each problem is stored in a separate physical file. In the TPTP the most coarse grain differentiation between problem versions is whether the problem is presented in THF (with the subdivisions TH0 for monomorphic logic and TH1 for polymorphic logic), TFF (with the subdivisions TXF for extended logic (with the subdivisions TX0 for monomorphic logic and TX1 for polymorphic logic), TF0 for monomorphic logic, and TF1 for polymorphic logic), FOF, or CNF. Within a given presentation, the primary reason for different versions of an abstract problem is the use of different axiomatizations. This issue is discussed below. A secondary reason is different formulations of the problem to be solved, e.g., different clausal forms of a FOF problem.

Different Axiomatizations
Commonly, many different axiomatizations of a theory exist. By using different axiomatizations of a particular theory, different versions of a problem can be formed.

In the TPTP an axiomatization is standard if it is a complete axiomatization of an established theory, no lemmas have been added, and it is not designed specifically to be suited or ill-suited to any ATP system, calculus, or control strategy. A problem version is standard if it uses a standard axiomatization. (Note: A standard axiomatization might be redundant, because some axioms are dependent on others. In general, it is not known whether or not an axiomatization is minimal, and thus the possibility of redundancy in standard axiomatizations must be tolerated.) In the TPTP, standard axiomatizations are kept in separate axiom files, and are included in problems as appropriate. Sets of specialization axioms might be used to extend or constrain an axiomatization. Specialization axioms are also kept in separate axiom files.

Within the ATP community, some problems have been created with non-standard axiomatizations. A non-standard axiomatization can be formed by modifying a standard axiomatization: the standard axiomatization can be reduced (i.e., axioms are removed) and the result is an incomplete axiomatization, or it can be augmented (i.e., lemmas are added) and the result is a redundant axiomatization. Incomplete and redundant axiomatizations are typically used to find a proof of a conjecture (based on the axiomatization) using a particular ATP system. An axiomatization can also be non-standard because it is biased, i.e., designed specifically to be suited or ill-suited to some ATP system, calculus, or control strategy. A problem version is incomplete, redundant, or biased if its axiomatization is. Finally, an axiomatization can be non-standard because it does not capture any established theory, i.e., a standard axiomatization does not exist, but the axioms are not biased. A problem version with such an axiomatization is especial. Typically, the axioms in an especial problem are specific to that problem, and are not used in any other problem.

In any 'real' application of an ATP system, a standard or especial problem version would typically be used, at least initially. The use of standard axiomatizations is particularly desirable.

Equality Axiomatization
In the TPTP equality is represented in infix using = and != for equality and inequality. An inequality has the same meaning as a negated equality. If equality is present in a problem, axioms of equality are not provided; it is assumed that equality reasoning is builtin to every ATP system. If equality axioms are required by an ATP system they can be added using the TPTP2X or TPTP4X utility. If any axioms are added when the problems are submitted to an ATP system, then the addition must be explicitly noted in reported results/

Problem Generators

Some abstract problems have a generic nature, and particular instances of the abstract problem are formed according to some size parameter(s). An example of a generic problem is the N-queens problem: place N queens on a N by N chess board such that no queen attacks another. The formulae for any size of this problem can be generated automatically, for any size of N >= 2. Note that the problem status might depend on the problem size.

Up to TPTP v1.1.3, the TPTP contained problem files for particular sizes of generic problems. This, however, was undesirable. Firstly, only a finite number of different problem sizes could be included, and therefore a desired size might have been missing. Secondly, even a small number of different problem sizes for each generic problem could consume a considerable amount of disk space. To overcome these problems, the TPTP contains generator files. Generator files are used to generate instances of generic problems, according to user supplied size parameters. The generators are used in conjunction with the tptp2X utility.

For convenience, the TPTP still contains, where they exist, interesting size instances of each generic problem. The TPTP sizes were chosen to be non-trivial, The statistics in the TPTP documents take into account these instances of generic problems.

Problem and Axiomatization Naming

Providing unambiguous names for all problems is necessary in a problem library. A naming scheme has been developed for the TPTP, to provide unique, stable names for abstract problems, problems, axiomatizations, and generators. Files are assigned names according to the schemes depicted in Tables Problem Naming and Axiom Naming. The DDDNNN combination provides an unambiguous name for an abstract problem or axiomatization. The DDDNNNFV[.SSS] combination provides an unambiguous name for a problem or generator, and the DDDNNNFE combination provides an unambiguous name for a set of axioms. The complete file names are unique within the TPTP. For example, the file GRP135-1.002.p contains the group theory problem number 135, version number 1, generated size 2. Similarly, the file CAT001-0.ax contains the basic category theory axiomatization number 1.

A regular expression for recognizing problem file names is "[A-Z]{3}[0-9]{3}[-+^=_][1-9][0-9]*(\.[0-9]{3})*\.p". A regular expression for recognizing axiom file names is "[A-Z]{3}[0-9]{3}[-+^=_][1-9][0-9]*\.ax".

`DDD`	`NNN`	`F`	`V`	`.SSS`	`.T`
`DDD` - Domain name abbreviation.
	`NNN` - Abstract problem number. It is unique within the domain.
		`F` - Form. `^` for THF, `_` for TFF without arithmetic, `=` for TFF with arithmetic, `+` for FOF, and `-` for CNF.
			`V` - Version number. It differentiates between different versions of the abstract problem.
				`SSS` - Size parameter(s). These only occur for generated problems, and give the size parameter(s).
					`T` - File name extension. `p` for problem files, `g` for generator files.

Problem file naming scheme.

The version numbers used for each abstract problem do not always start at 1, and are not always successive. This is because the same version number is assigned (wherever possible) to all problems that come from the same source, within each domain.

`DDD`	`NNN`	`F`	`V`	`.TT`
`DDD` - Domain name abbreviation.
	`NNN` - Axiomatization number. It is unique within the domain.
		`F` - Form. `^` for THF, `_` for TFF without arithmetic, `=` for TFF with arithmetic, `+` for FOF, and `-` for CNF.
			`V` - Specialization number. It identifies sets of axioms that are used to specialize an axiomatization. Axiomatizations of basic theories are allocated the number `0`, and specialization axiom sets are numbered from `1` onwards.
				`T` - File name extension. An extension is always `ax`.

Axiom file naming scheme.

If a file is ever removed from or renamed in the TPTP, then its extension is changed to .rm. The file is not physically removed, and a comment is added to the file to explain what has happened. This mechanism maintains the unique identification of problems and axiomatizations.

Problem Presentation

The physical presentation of the TPTP problem library is such that ATP researchers can easily use the problems. The syntax of all files is that of Prolog (with some operators defined). This conformance makes it trivial to manipulate the files using Prolog. All information that is not for use by ATP systems is formatted as comments. Comments extend from a % character to the end of the line, or may be block comments within /* ... */ bracketing. The TPTP file format, for problem files and axiom files, has three main sections. The first section is a header section that contains information for the user. This information is not for use by ATP systems. The second section contains include directives for axiom files. The last section contains the formulae that are specific to the problem or axiomatization. TPTP generator files have three sections, different from the problem and axiom files. The header section of generator files is similar to that of problem and axiom files, but with place-holders for information that is filled in based on the size of problem and the formulae generated. Following that comes Prolog source code to generate the formulae, and finally data describing the permissible sizes and the chosen TPTP size for the problem. More details are provided here.

A full BNF specification of the problem and axiom file formats is provided in the Documents directory of the TPTP. The TPTP2X and TPTP4X utilities can be used to convert TPTP files to other known ATP system formats. A description of the information contained in TPTP files is given below.

The problem files SYN000* are contrived to use most features of the TPTP language, and are thus suitable for testing parsers, etc. They are not suitable for evaluating ATP systems' reasoning capabilities.

The Header Section

This section contains information about the problem, for the user. It is divided into four parts separated by blank lines. The first part identifies and describes the problem. The second part provides information about occurrences of the problem in the literature and elsewhere. The third part gives the problem's ATP status and a table of syntactic measurements made on the problem. The last part contains general information about the problem. An example of a TPTP header, from the problem file GRP194+1.p, is shown in Figure Example header.

%--------------------------------------------------------------------------
% File     : GRP194+1 : TPTP v8.0.0. Released v2.0.0.
% Domain   : Group Theory (Semigroups)
% Problem  : In semigroups, a surjective homomorphism maps the zero
% Version  : [Gol93] axioms.
% English  : If (F,*) and (H,+) are two semigroups, phi is a surjective
%            homomorphism from F to H, and id is a left zero for F,
%            then phi(id) is a left zero for H.

% Refs     : [Gol93] Goller (1993), Anwendung des Theorembeweisers SETHEO a
% Source   : [Gol93]
% Names    :

% Status   : Theorem
% Rating   : 0.22 v7.5.0, 0.25 v7.4.0, 0.10 v7.3.0, 0.14 v7.2.0, 0.10 v7.1.0, 0.13 v7.0.0, 0.20 v6.4.0, 0.23 v6.3.0, 0.21 v6.2.0, 0.36 v6.1.0, 0.47 v6.0.0, 0.35 v5.5.0, 0.37 v5.4.0, 0.50 v5.3.0, 0.48 v5.2.0, 0.40 v5.1.0, 0.33 v4.1.0, 0.35 v4.0.1, 0.30 v4.0.0, 0.29 v3.7.0, 0.15 v3.5.0, 0.16 v3.4.0, 0.21 v3.3.0, 0.14 v3.2.0, 0.18 v3.1.0, 0.11 v2.7.0, 0.17 v2.6.0, 0.14 v2.5.0, 0.12 v2.4.0, 0.25 v2.3.0, 0.33 v2.2.1, 0.00 v2.1.0
% Syntax   : Number of formulae    :    8 (   2 unt;   0 def)
%            Number of atoms       :   21 (   4 equ)
%            Maximal formula atoms :    4 (   2 avg)
%            Number of connectives :   13 (   0   ~;   0   |;   6   &)
%                                         (   1 <=>;   6  =>;   0  <=;   0 <~>)
%            Maximal formula depth :    8 (   5 avg)
%            Maximal term depth    :    3 (   1 avg)
%            Number of predicates  :    3 (   2 usr;   0 prp; 2-2 aty)
%            Number of functors    :    5 (   5 usr;   3 con; 0-3 aty)
%            Number of variables   :   15 (  14   !;   1   ?)
% SPC      : FOF_THM_RFO_SEQ

% Comments :
%--------------------------------------------------------------------------

Example of a problem file header (GRP194+1.p).

The % File field.
This field contains three items of information. The first item is the problem's name. The current TPTP release number is given next, followed by the TPTP release in which the problem was released or last bugfixed (i.e., the formulae were changed).

The % Domain field.
The domain field identifies the domain, and possibly a subdomain, from which the problem is drawn. The domain corresponds to the first three letters of the problem name.

The % Problem field.
This field provides a one-line, high-level description of the abstract problem. In axiom files, this field is called % Axioms, and provides a one-line, high-level description of the axiomatization.

The % Version field.
This field gives information that differentiates this version of the problem from other versions of the problem. The first possible differentiation is the axiomatization that is used. If a specific axiomatization is used, a citation is provided. If the axiomatization is a pure equality axiomatization (uses only the equal/2 predicate) then this is noted too.

The second possible differentiation is the status of the axiomatization. There are 12 possiblities:

empty. Indicates that the axiomatization is standard, i.e., it is complete and has had no lemmas added (it might be redundant).
Especial. Indicates that the axiomatization does not capture any established theory.
Biased. Indicates that the axiomatization is designed specifically to be suited or ill-suited to some ATP system, calculus, or control strategy.
Incomplete. Indicates that the axiomatization is incomplete.
Reduced > Complete. Indicates that a existing standard axiomatization has had axioms removed, and the result is complete. The existing axiomatization is necessarily redundant.
Reduced > Incomplete. Indicates that a existing standard axiomatization has had axioms removed, and the result is non-standard due to incompleteness.
Augmented. Indicates that a existing standard axiomatization has had lemmas added, and the result is non-standard due to redundancy.
Reduced & Augmented > Complete. Indicates that a existing standard axiomatization has had axioms removed and lemmas added, and the result is complete. (Non-standard) after Complete. >
Reduced & Augmented > Incomplete. Indicates that a existing standard axiomatization has had axioms removed and lemmas added, and the result is non-standard due to incompleteness.
Incomplete > Reduced > Incomplete. Indicates that an existing incomplete axiomatization has had axioms removed, and the result is non-standard due to incompleteness.
Incomplete > Augmented > Complete. Indicates that an existing incomplete axiomatization has had axioms added, and the result is complete.
Incomplete > Augmented > Incomplete. Indicates that an existing incomplete axiomatization has had lemmas added, and the result is non-standard due to incompleteness.
Incomplete > Reduced & Augmented > Complete. Indicates that an existing incomplete axiomatization has had axioms removed and lemmas added, and the result is complete. (Non-standard) after Complete. >
Incomplete > Reduced & Augmented > Incomplete. Indicates that an existing incomplete axiomatization has had axioms removed and lemmas added, and the result is non-standard due to incompleteness.
Especial > Reduced > Especial. Indicates that an especial axiomatization has had axioms removed, and the result remains especial.
Especial > Augmented > Especial. Indicates that an especial axiomatization has had axioms added, and the result remains especial.

The third possible differentiation between problem versions is in the theorem formulation. Variations in the theorem formulation are noted in a Theorem formulation entry, e.g.,

% Version  : [McCharen, et al., 1976] (equality) axioms.
%            Theorem formulation : Explicit formulation of the commutator.

The % English field.
This field provides a full description of the problem, if the one-line description in the % Problem field is too terse.

The % Refs field.
This field provides a list of abbreviated references for material in which the problem has been presented. Other relevant references are also listed. The reference keys identify BibTeX entries in the Bibliography.bib file supplied with the TPTP.

The % Source field.
This field acknowledges the source of the problem, usually as a citation. If the problem was sourced from an existing problem collection then the collection name is given in [ ] brackets. The problem collections used thus far are:

ANL - the Argonne National Laboratory library [ANL]
OTTER - the collection distributed with the Otter ATP system [McC98]
Quaife - Art Quaife's collection of set theory (based) problems [Qua92]
ROO - the problems used to test the ROO ATP system [LM92]
SPRFN - the collection distributed with the SPRFN ATP system [SPRFN]
TPTP - the problem first ever appeared in the TPTP
SETHEO - the collection used to test the SETHEO ATP system [LS+92]

The % Names field.
Problems that have appeared in other problem collections or the literature, often have names which are known in the ATP community. This field lists such names for the problem, along with the source of the name. If the source is an existing problem collection then the collection name is cited, as in the % Source field. If the source of a name is a paper then a citation is given. If a problem is not given a name in a paper then ``-'' is used as the known name and a citation is given. Problems that first appeared in the TPTP have source TPTP, and no other name. In generator files all known names for instances of the abstract problem are listed, with the instance size given before the source. A reverse index, from known names to TPTP names, is distributed with the TPTP.

The % Status field.
This field gives the ATP status of the problem, according to the SZS ontology. For THF, TFF, and FOF problems it is one of the following values. For THF problems the default semantics is Henkin with extensionality and choice. If the status is different for other semantics, this is noted on lines after the % Status field.

Theorem - Every model of the axioms (and other non-conjecture formulae, e.g., hypotheses and lemmas), and there are some such models, is a model of all the conjectures.
ContradictoryAxioms - There are not models of the axioms (and other non-conjecture formulae, e.g., hypotheses and lemmas).
CounterSatisfiable - Some models of the axioms (and there are some) are models of the negation of at least one of the conjectures.
Unsatisfiable - There are no conjectures, and no interpretations are models of the axioms.
Satisfiable - There are no conjectures, and some interpretations are models of the axioms.
Unknown - The problem has never been solved by an ATP system.
Open - The abstract problem has never been solved.

For CNF problems it is one of:

Unsatisfiable - No interpretations are models of the clauses.
Satisfiable - Some interpretations are models of the clauses.
Unknown - The problem has never been solved by an ATP system.
Open - The abstract problem has never been solved.

The % Rating field.
This field gives the difficulty of the problem, measured relative to state-of-the-art ATP systems. It is a comma separated list of ratings, each with a TPTP release number. The rating is a real number in the range 0.0 to 1.0, where 0.0 means that all state-of-the-art ATP systems can solve the problem (i.e., the problem is easy), and 1.0 means no state-of-the-art ATP system can solve the problem (i.e., the problem is hard). The rating is followed by a TPTP release number, indicating in which release the rating was assigned. If no rating has been assigned, a ? is given.

The % Syntax field.
This field lists various syntactic measures of the problem's formulae. The measures for THF problems are:

the numbers of formulae, unit formulae, type formulae, definition formulae,
the numbers of atoms, equality atoms, connective atoms,
the maximal and average number of atoms in a formula,
the numbers of logical connectives, each logical connective,
the maximal and average formula depth, the number of nested formulae,
the numbers of tuples, if-then-else formulae, and let-in formulae,
the numbers of arithmetic formulae, terms, numbers, and variables,
the numbers of types, user types,
the numbers of type declarations, polymorphic type declarations, dependent type declarations
the numbers of type connectives, each type connective,
the numbers of distinct (non-variable) symbols, user symbols, constant symbols, and the arity range of the symbols,
the numbers of uses of the Π, Σ, ε, ι, and typed equality constants,
the numbers of distinct variables, lambda quantifications, universal quantifications, existential quantifications, typed variables, defined variables, Π, Σ, ε, and ι binders.

The measures for TFF problems are:

the numbers of formulae, unit formulae, type formulae, definition formulae,
the numbers of atoms, equality atoms,
the maximal and average number of atoms in a formula,
the numbers of logical connectives, each logical connective,
the maximal and average formula depth,
the maximal and average term depth,
the numbers of extended terms, nested formulae, boolean variables,
the numbers of tuples, if-then-else formulae, and let-in formulae,
the numbers of arithmetic formulae, terms, numbers, and variables,
the numbers of types, user types,
the numbers of type connectives, each type connective,
the numbers of distinct predicates, user predicates, propositions, and the arity range of the predicates.
the numbers of distinct function, user functions, constants, and the arity range of the functions,
the numbers of distinct variables, universal quantifications, existential quantifications, typed variables, defined variables, Π and Σ binders.

The measures for FOF problems are:

the numbers of formulae, unit formulae, definition formulae,
the numbers of atoms, equality atoms
the maximal and average number of atoms in a formula,
the numbers of logical connectives, each logical connective,
the maximal and average formula depth,
the maximal and average term depth,
the numbers of distinct predicates, user predicates, propositions, and the arity range of the predicates.
the numbers of distinct function, user functions, constants, and the arity range of the functions,
the numbers of distinct variables, universal quantifications, existential quantifications.

The measures for CNF problems are:

the numbers of clauses, unit clauses, non-Horn clauses, range restricted clauses,
the numbers of literals, equality literals, negated literals,
the maximal and average clause size,
the maximal and average term depth,
the numbers of distinct predicates, user predicates, propositions, and the arity range of the predicates.
the numbers of distinct function, user functions, constants, and the arity range of the functions,
the numbers of distinct variables, singletons.

See A HREF="#InsideTheTPTP">the summary of this information over the entire TPTP.

The % SPC field.
This field indicates to which Specialist Problem Class (SPC) the problem belongs. SPCs are sets of problems that have similar syntactic characteristics, so that that the problems are reasonably homogeneous with respect ATP systems' performances on the problems. SPCs are used for caclulating the difficulty rating [SS01], and for recommending systems to use on problems [SS99]. The characteristics used to build the SPCs are:

The form of the problem: TH1, TH0, TX1, TX0, TF1, TF0, FOF, CNF
The known or expected logical (SZS) status: THM (theorem), CAX (contradictory axioms), CSA (countersatisfiable), UNS (unsatisfiable), SAT (satisfiable), UNK (unknown), OPN (open)
The effective order of the logic (for FOF and CNF problems): PRP (propositional), EPR (effectively propositional), RFO (really first-order),
The amount of equality in the problem: EQU (with equality), SEQ (some, but not only, equality), PEQ (pure equality), UEQ (unit equality) NUE (non-unit pure equality), NEQ (no equality),
The amount of arithmetic (for TFF problems): ARI (with arithmetic) NAR (no arithmetic),
Horness (for CNF problems): HRN (Horn), NHN (non-Horn)

The % Comments field.
This field contains free format comments about the problem, e.g., the significance of the problem, or the reason for creating the problem. If the problem was created using the TPTP2X utility then the TPTP2X parameters are given.

The % Bugfixes field.
This field describes any bugfixes that have been done to the formulae of the problem. Each entry gives the release number in which the bugfix was done, and attempts to pinpoint the bugfix accurately.

The Include Section

The include section contains include directives for TPTP axiom files. An example extracted from the problem file GEO146+1.p is shown below.

%--------------------------------------------------------------------------
%----Include simple curve axioms
include('Axioms/GEO004+0.ax').
%----Include axioms of betweenness for simple curves
include('Axioms/GEO004+1.ax').
%----Include oriented curve axioms
include('Axioms/GEO004+2.ax').
%----Include trajectory axioms
include('Axioms/GEO004+3.ax',[connect_defn,symmetry_of_at_the_same_time]).
%--------------------------------------------------------------------------

Example of a problem file include section (adapted from GEO146+1.p).

Each of the include directives indicates that the formulae in the named file should be included at that point. Include files with relative path names are expected to be found either under the directory of the current file, or if not found there then under the directory specified in the TPTP environment variable. If the include directive has a []ed list of formulae names as a second argument, the named file must be filtered to retain only those formulae (this filter also applies to formulae that have been recursively included into the named file). If any member of the list cannot be found, or there are multiple formulae with a given name, that is an error condition. If there is no second argument, or the second argument is all, then all the formulae in the file are included. In the example above only connect_defn and symmetry_of_at_the_same_time are included from Axioms/GEO004+3.ax.

Full versions of TPTP problems (without include directives) can be created by using the TPTP2X and TPTP4X utilities.

A side effect of separating out the axioms into axiom files is that the formula order in the TPTP presentation of problems is typically different from that of any original presentation. This reordering is acceptable because the performance of a decent ATP system should not be very dependent on a particular formula ordering.

Annotated Formulae

The TPTP contains problems in first-order (FOF) and clause normal form (CNF), monomorphic and polymorphic typed first-order form (TFF, which includes the extended first-order form (TXF)), monomorphic and polymorphic typed higher-order form (THF), dependently typed monomorphic and polymorphic typed higher-order form (DHF), and non-classical typed first-order (NXF) and higher-order (NHF) forms. Interpreted arithmetic types and symbols are supported in all the typed logics.

TPTP problems and TSTP solutions are built from annotated formulae of the form

language(name,role,formula,source,useful_info).

The languages supported are cnf, fof, tff, and thf.

The role gives the user semantics of the formula, one of axiom, hypothesis, definition, assumption, lemma, theorem, corollary, conjecture, negated_conjecture, plain, type, interpretation, logic, and unknown.

The axiom-like formulae are those with the roles axiom, hypothesis, definition, assumption, lemma, theorem, and corollary. They are accepted, without proof, as a basis for proving conjectures in THF, TFF, and FOF problems. In CNF problems the axiom-like formulae are accepted as part of the set whose satisfiability has to be established. There is no guarantee that the axiom-like formulae of a problem are consistent.

hypothesiss are assumed to be true for a particular problem.

definitions are used to define symbols.

assumptions must be discharged before a derivation is complete.

lemmas and theorems have been proven from the other axiom-like formulae, and are thus redundant wrt those axiom-like formulae. theorem is used also as the role of proven conjectures, in output. A problem containing a lemma or theorem that is not redundant wrt the other axiom-like formulae is ill-formed. theorems are more important than lemmas from the user perspective.

corollarys have been proven from the axioms and a theorem, and are thus redundant wrt the other axiom-like and theorem formulae. A problem containing a corollary that is not redundant wrt the other axiom-like formulae and theorem formulae is ill-formed.

conjectures occur in DHF, THF, TFF, and FOF problems, and are to all be proven from the axiom-like formulae. A problem is solved only when all conjectures are proven. (TPTP problems never contain more than one conjecture, and the creation of problems with more than one conjecture is currently deprecated, because contemporary ATP systems commonly treat them as a disjunction, and will prove only one of them.)

negated_conjectures are formed from negation of a conjecture, typically in FOF to CNF conversion.

plain formulae have no special user semantics, and are typically used in derivation output.

type formulae define the types of symbols globally.

interpretation formulae are used for writing interpretations, typically models of satisfiable sets of formulae, in particular the axioms and negated_conjecture of a non-theorem.

logic formulae are used for defining the logic in non-classical logics.
unknowns have unknown role, and this is an error situation.

The useful_info field of an annotated formula is optional, and if it is not used then the source field becomes optional. The source field is used to record where the annotated formula came from, and is most commonly a file record or an inference record. A file record stores the name of the file from which the annotated formula was read, and optionally the name of the annotated formula as it occurs in that file - this might be different from the name of the annotated formula itself, e.g., if an ATP systems reads an annotated formula, renames it, and then prints it out. An inference record stores information about an inferred formula. The useful_info field of an annotated formula is a list of arbitrary useful information formatted as Prolog terms, as required for user applications.

Logical Formulae
In a formula, terms and atoms follow Prolog conventions - functions and predicates start with a lowercase letter or are 'single quoted', and variables start with an uppercase letter. Symbols may be overloaded with different arity signatures, and are treated as different symbols. The language also supports interpreted symbols that are either defined symbols that start with a $, or are composed of non-alphabetic characters. Defined symbols come in two varieties: TPTP defined predicates and functors, whose interpretation is specified by the TPTP language, and system defined predicates and functors, whose interpretation is ATP system specific. The defined predicates and functors recognized so far are listed below.

The logical connectives in the classical TPTP languages are !>, ?*, @+, @-, ^, !, ?, @, ~, |, &, =>, <=, <=>, and <~>, for Pi, Sigma, choice (indefinite description), definite description, lambda abstraction, universal quantification, existential quantification, function application, negation, disjunction, conjunction, implication, reverse implication, equivalence, and non-equivalence (XOR). Equality and inequality are expressed as the infix predicates = and !=. Quantified formulae are written in the form
Quantifier [Variables] : Formula
In all except the CNF language, every variable in a Formula must be bound by a preceding quantification with adequate scope. Negation has higher precedence than quantification, which in turn has higher precedence than the binary connectives. No precedence is specified between the binary connectives; brackets are used to ensure the correct association. See the section on non-classical logics for information about the non-classical connectives.

Types
The typed languages (TFF, TXF, THF, DHF, NXF, NHF) support types and type declarations (all the details are in the section on types and type declarations). Symbols are declared before their use, with type signatures that specify the types of their arguments and result. Two TPTP defined types are available, $i for individuals, and $o for booleans. User defined types can be introduced as being of the "type" $tType. In the first-order languages, a signature of the form (t₁ * ... * t_n) > r is the type of an n-ary symbol, where the i-th argument is of type t_i and the result type is r. If r is $o then the symbol is a predicate. Variables are given their type by a :type suffix. Equality is ad hoc polymorphic over all types. A useful feature of TFF is default typing for symbols that are not explicitly declared: predicates default to ($i,...,$i) > $o, and functions default to ($i,...,$i) > $i. This allows TFF to effectively degenerate to untyped FOF. The typed extended first-order form (TXF) augments TFF with FOOL constructs: formulae of type $o as terms; variables of type $o as formulae; tuples; conditional (if-then-else) expressions; and let (let-defn-in) expressions. The typed higher-order form (THF) includes type declarations in curried form, lambda-terms with a binder symbol ^ for lambda, explicit application with @, and quantification over variables of any type. A curried type signature has the form t₁ > ... > t_n > r. THF does not admit default typing - all symbol types must be declared before use. The polymorphic languages use a !> to quantify over types, producing types signatures of the form !> [T:$tType] : (t₁ * ... * t_n) > r and !> [T:$tType] : t₁ > ... > t_n > r. where any t_i or r can be the type T. When using polymorphic symbols the concrete type(s) must be provided as the first arguments in the use.

Defined Symbols
The defined symbols are:

The truth constants $true and $false.
Integer/rational/real numbers such as 27, 43/92, -99.66.
$distinct, whose arguments are hence known to be unequal from each other (but not necessarily unequal to any other constants). $distinct maybe overloaded with different arities. In the typed languages $distinct is ad hoc polymorphic with all arguments having the same type.
"distinct object"s, written in double quotes. All "distinct object"s are unequal to all "different distinct object"s (but not necessarily unequal to any other constants), e.g., "Apple" != "Microsoft". One way of implementing this is to interpret "distinct object" as the domain element in the double quotes. In typed contexts "distinct object"s are of type $i (and as a result, are unequal to all numbers because their interpretation domains are disjoint in the TPTP type systems).
Numbers (numeric constants). These are used in only the typed languages (numbers require a type system in order to make sense). Numbers are interpreted as themselves (as domain elements). All different numbers are unequal, e.g., 1 != 2 (but not necessarily unequal to any other constants of the same type). All numbers are also unequal to terms of type $i (because their interpretation domains are disjoint in the TPTP type systems).

Conditional and Let Expressions
The THF and TXF languages have conditional and let expressions.

Conditional expressions have $ite as the functor. The expressions are parametric polymorphic, taking a boolean expression as the first argument, then two expressions of any one type as the second and third arguments, as the true and false return values respectively, i.e., the return type is the same as that of the second and third arguments.
Let expressions have $let as the functor. The expressions provide the types of defined symbols, definitions for the symbols, and a formula/term in which the definitions are applied. Each type declaration is the same as a type declaration in an annotated formula with the type role, and multiple type declarations are given in []ed tuples of declarations. Each definition defines the expansion of one of the declared symbols, and multiple definitions are given in []ed tuples of definitions. If variables are used in the lefthand side of a definition, their values are supplied in the defined symbol's use. Such variables do not need to be declared (they are implicitly declared to be of the type defined by the symbol declaration), but must be top-level arguments of the defined symbol and be pairwise distinct. In the polymorphic case, type declarations may include type variables introduced in an enclosing scope, such as
```
%----The constant function using A from 'outside' and B from 'inside'. B 
%----is required in the use of const, but not A.
tff(let_polymorphic_both_types, axiom,
    !>[A: $tType] : 
      $let(
        const : !>[B: $tType] : (A * B) > A,
        const(B, X, Y) := X),
        ... ).  
```
No extra type arguments are needed when using such $let-defined symbols.

Examples
An example CNF problem is PUZ003-1. An example FOF problem is PUZ001+1. An example TFF problem is PUZ031_1. An example TXF problem is PUZ081_8. An example THF problem is PUZ081^1. An example DHF problem is DAT346^1. An example NXF problem is PUZ001_10. An example NHF problem is PHI003^8.

The Type System

A high-level overview is in the section on annotated formulae.

Subtypes
None of the type systems offer subtypes, despite the subtype symbol << being part of the THF and TFF languages. I tried to convince the ATP system developers it would be useful, but they cried crocodile tears. Here's an example of what was planned:

    tff(food_type,type, food: $tType).

    tff(fruit_type,type, fruit: $tType).

    tff(vegetable_type,type, vegetable: $tType).

    tff(fruit_is_food,type, 
        fruit << food ).

    tff(vegetable_is_food,type, 
        vegetable << food ). 

    tff(apple_decl,type, apple: fruit).

    tff(potatoe_decl,type, potatoe: vegetable).

All subtypes of an atomic type are disjoint, i.e., it is implicit that apple != potatoe.

The Typed First-order Monomorphic Type System (TF0 and TX0)

Refer to [SS+12] (TF0) and [SK18] (TX0) for more details on type checking and semantics, and much more.

Defined atomic types
The atomic types $i (individuals), $o (Booleans), and $tType are defined. ($tType is the "type" of atomic types - it is not really a type, and in other places this is known as the "kind" of atomic types. See below for motivation for its existence and usage.) Other defined atomic types are associated with specific theories. In particular, $int, $rat, and $real are types for arithmetic. The type $i of individuals is predefined but has no peculiar semantics, whereas the arithmetic types $int, $rat, and $real are modeled by ℤ, ℚ, and ℝ, respectively.

User atomic type declarations:
User atomic types are declared in advance of use, to be of "type" $tType. Example:

    tff(food_type,type, 
        food: $tType ).

    tff(fruit_type,type, 
        fruit: $tType ).

    tff(list_type,type, 
        list: $tType ).

Defined functions and predicates
Defined functions and predicates have preassigned types.

$true is of type $o
$false is of type $o
Distinct objects in "double quotes" are of type $i
= and != are ad hoc polymorphic over atomic types other than $o, with result type $o.
$distinct is ad hoc polymorphic over all atomic types other than $o, with result type $o.
The (atomic) types of numbers are defined in the arithmetic section.
The types of the arithmetic predicates and functions are defined in the arithmetic section.

Function and predicate type declarations
Every function and predicate symbol has maximally one declared type. Example:

    tff(cons_type,type, 
        cons: (fruit * list) > list   ).

    tff(is_empty_type,type, 
        isEmpty: list > $o   ).

The argument types must be atomic, and cannot be $o in TFF, but can be $o in TXF. The range type of a function must be atomic, and cannot be $o. The range type of a predicate is $o. Currying is not possible in TFF or TXF, i.e., the first example above cannot be written cons: fruit > list > list (currying is used in THF and DHF). If a symbol's type is declared more than once, and the types are not the same, that's an error.

Every variable is given an atomic type at quantification time. Example:

    tff(list_not_empty,axiom, 
        ! [X: fruit,Xs: list] : ~isEmpty(cons(X,Xs))   ).

Implicit typing for functions and predicates
If a symbol is used and its type has not been declared, then default types are assumed.

All untyped predicates get type ($i * ... * $i) > $o.
All untyped functions get type ($i * ... * $i) > $i.
All untyped variables get type $i.

If a symbol's type is declared later to be different from the assumed type, that's an error.

The Typed First-order Polymorphic Type System (TF1 and TX1)

Refer to [BP13] for more details on type checking and semantics, and much more.

The TF1 polymorphic type system extends the monomorphic type system with rank-1 polymorphism. Syntactically, the types, terms, and formulas of TF1 are analogous to those of TF0, except that function and predicate symbols can be declared to be polymorphic, types can contain type variables, and n-ary type constructors are allowed. Type variables in type signatures and formulas are explicitly bound. Instances of polymorphic symbols are specified by explicit type arguments, rather than inferred.

Types and type signatures
The types of TF1 are built from type variables and type constructors of fixed arities. A type is polymorphic if it contains type variables (otherwise, it is monomorphic). Polymorphic type signatures are of the form: !>[α1 : $tType, ..., αn : $tType]: ς for n > 0, where α1, ..., αn are distinct type variables and ς is a monomorphic type. The binder !> denotes universal quantification.

Type declarations
Type constructors can be declared, e.g., the following declarations introduce a type constant bird, a unary type constructor list, and a binary type constructor map:

    tff(bird_t, type, bird: $tType).
    tff(list_t, type, list: $tType > $tType).
    tff(map_t, type, map: ($tType * $tType) > $tType).

Every type variable must be bound by a !>-binder. The following declarations introduce a polymorphic predicate is_empty, and a pair of polymorphic functions cons and lookup:

    tff(is_empty_t, type, is_empty : !>[A : $tType]: (list(A) > $o)).
    tff(cons_t, type, cons : !>[A : $tType]: ((A * list(A)) > list(A))).
    tff(lookup_t, type, lookup : !>[A : $tType, B : $tType]: ((map(A, B) * A) > B)).

Using symbols with a polymorphic type
Every use of a polymorphic symbol must explicitly specify the type instance. A function or predicate symbol with a type signature !>[α₁$: $tType, ..., α_m : $tType]: ((τ₁ * ... * τ_n) > υ) must be applied to m type arguments and n term arguments. Given the above signatures, the term lookup($int, list(A), M, 2) and the atom is_empty($i, cons($i, X, nil($i))) are well-formed and contain free occurrences of the type variable A and the term variable X, respectively. As TF1 is restricted to rank-1 polymorphism, type variables can be instantiated with only concrete types. In particular, $o, $tType, and !>-binders cannot occur in type arguments of polymorphic symbols.

Every variable in a TF1 formula must be bound. In particular, every type variable in a TF1 formula must be bound with the pseudotype $tType:

    tff(bird_list_not_empty, axiom,
        ![B : bird, Bs : list(bird)]: ~ is_empty(bird, cons(bird, B, Bs))).

    tff(lookup_update_same, axiom,
        ! [A : $tType, B : $tType, M : map(A, B), K : A, V : B]:
          lookup(A, B, update(A, B, M, K, V), K) = V).

Universal and existential quantifiers over type variables may not occur in the scope of a quantifier over a term variable (as is possible in dependent types).

Example
An example TF1 problem is PUZ139_1.

The Typed Higher-order Monomorphic Type System (TH0)

Refer to [SB10] for more details on type checking and semantics, and much more.

The TH0 monomorphic type system lifts the TF0 type system to simply typed λ-calculus, with Henkin semantics (i.e., including Boolean and functional extensionality) and choice. Type signatures are curried, and variables can have function types. Example:

    thf(mix_type,type,
        mix: beverage > syrup > beverage ).
    
    thf(mixed_coffee,conjecture,
        ? [Mixture: beverage > syrup > beverage] :
        ! [S: syrup] :
          ( ( Mixture @ coffee @ S ) = coffee ) ).

An example TH0 problem is PUZ140^1.

The Typed Higher-order Polymorphic Type System (TH1)

Refer to [KSR16] for more details on type checking and semantics, and much more.

The TH1 polymorphic type system combines the polymorphic features of TF1/TX1 with the higher-order features of TH0.
Example:

    thf(list,type, list: $tType > $tType ).
    
    thf(nil,type, nil: !>[A: $tType] : ( list @ A ) ).
    
    thf(append,type, append: !>[A: $tType] : ( ( list @ A ) > ( list @ A ) > ( list @ A ) ) ).
    
    thf(append_nil,axiom,
        ! [A: $tType,L: list @ A] :
          ( ( append @ A @ ( nil @ A ) @ L ) = L ) ).

TH1 has five additional polymorphic constants: !! for Π (universal quantification), ?? for Σ (existential quantification), @@+ for ε (indefinite description, aka choice), @@- for ι (definite description), and @= (equality). The type of the first four is !>[A: $tType] : (A > $o) > $o, and the type of @= is !>[A: $tType]: (A > A) > $o. When used they must be instantiated by applying them to a type argument. Example:

    thf(eps,type, eps: ( $i > $o ) > $i ).
    
    thf(choiceax,axiom,
        ! [P: $i > $o] :
          ( ? [X: $i] : ( P @ X )
         => ( P @ ( eps @ P ) ) ) ).
    
    thf(conj,conjecture,
        ( ( ^ [P: $i > $o] : ( P @ ( eps @ P ) ) )
        = ( ^ [P: $i > $o] : ( ?? @ $i @ P ) ) ) ).
    
    thf(bird_type,type, bird: $tType).

    thf(tweety_decl,type, tweety: bird).

    thf(some_tweety,axiom,
        ? [B: bird] : ( (@= @ bird) @ tweety @ B ) ).

An example TH1 problem is DAT434^1.

The Dependently Typed Higher-order Type System (DH0 and DH1)

Refer to [RK+25] for more details on type checking and semantics, and much more.

The DH0 and DH1 type systems add dependent types, i.e., types that take term arguments, to TH0 and TH1. As such they are classical and extensional type theories. The !> binder used for TF1 and TH1 polymorphism is reused to specify the term typess on which a type depends. Example:

    thf(elem_type,type, elem: $tType ).

    thf(nat_type,type, nat: $tType ).

    thf(zero_type,type, zero: nat ).

    thf(suc_type,type, suc: nat > nat ).

    thf(plus_type,type, plus: nat > nat > nat ).

    thf(list_type,type, list: nat > $tType ).

    thf(nil_type,type, nil: list @ zero ).

    thf(cons_type,type, cons:
        !>[N: nat] : (elem > (list @ N) > (list @ (suc @ N))) ).

    thf(app_type,type, app:
        !>[N: nat,M: nat] : ((list @ N) > (list @ M) > (list @ (plus @ N @ M))) ).

The application operator @ is used to instantiate the terms to the dependent type. With polymorphic types, the variable list is prepended with the type variables. Example:

    thf(ax1,axiom,
        ! [N: nat] : ((plus @ zero @ N) = N) ).

    thf(ax2,axiom,
        ! [N: nat,X: list @ N] : ((app @ zero @ N @ nil @ X) = X) ).

    thf(list_app_assoc_base,conjecture,
        ! [M2: nat,L2: list @ M2,M3: nat,L3: list @ M3] :
          ( ( app @ zero @ ( plus @ M2 @ M3 ) @ nil @ ( app @ M2 @ M3 @ L2 @ L3 ) )
          = ( app @ ( plus @ zero @ M2 ) @ M3 @ ( app @ zero @ M2 @ nil @ L2 ) @ L3 ) ) ).

An example DH0 problem is DAT346^1. An example DH1 problem is DAT342^1.

The Arithmetic System

Arithmetic must be done in the context of the THF or TFF types logics, which support the predefined atomic numeric types $int, $rat, and $real. Using THF or TFF enforces semantics that separate $int from $rat from $real from $i from $o. The numbers are unbounded, and the reals of infinite precision (rather than some specific implementation such as 32 bit 2's complement integer, or IEEE floating point). Systems that implement limited arithmetic must halt in an SZS error state if they hit overflow.

The following interpreted predicates and interpreted functions are defined. The symbols are overloaded (i.e., ad-hoc polymorphic) for the provided type signatures.

Symbol Types Operation Comments, examples                                        -
$int $tType The type of integers 123, -123

<integer> ::- (<signed_integer>|<unsigned_integer>) <signed_integer> ::- <sign><unsigned_integer> <unsigned_integer> ::- <decimal> <decimal> ::- (<zero_numeric>|<positive_decimal>) <positive_decimal> ::- <non_zero_numeric><numeric>* <sign> ::: [+-] <zero_numeric> ::: [0] <non_zero_numeric> ::: [1-9] <numeric> ::: [0-9]

$rat $tType The type of rationals 123/456, -123/456, +123/456
<rational> ::- (<signed_rational>|<unsigned_rational>) <signed_rational> ::- <sign><unsigned_rational> <unsigned_rational> ::- <decimal><slash><positive_decimal> <slash> ::: [/]

$real $tType The type of reals 123.456, -123.456,
123.456E789, 123.456e789, -123.456E789,
123.456E-789, -123.456E-789
<real> ::- (<signed_real>|<unsigned_real>) <signed_real> ::- <sign><unsigned_real> <unsigned_real> ::- (<decimal_fraction>|<decimal_exponent>) <decimal_exponent> ::- (<decimal>|<decimal_fraction>)<exponent><decimal> <decimal_fraction> ::- <decimal><dot_decimal> <dot_decimal> ::- <dot><numeric><numeric>* <dot> ::: [.] <exponent> ::: [Ee]

= (infix) ($int * $int) > $o
($rat * $rat) > $o
($real * $real) > $o
Comparison of two numbers. The numbers must be the same atomic type (see the type system).
$less/2 ($int * $int) > $o
($rat * $rat) > $o
($real * $real) > $o
Less-than comparison of two numbers. $less, $lesseq, $greater, and $greatereq are related by
  ! [X,Y] : ( $lesseq(X,Y) <=> ( $less(X,Y) | X = Y ) )
  ! [X,Y] : ( $greater(X,Y) <=> $less(Y,X) )
  ! [X,Y] : ( $greatereq(X,Y) <=> $lesseq(Y,X) )
i.e, only $less and equality need to be implemented to get all four relational operators.
$lesseq/2 ($int * $int) > $o
($rat * $rat) > $o
($real * $real) > $o
Less-than-or-equal-to comparison of two numbers.
$greater/2 ($int * $int) > $o
($rat * $rat) > $o
($real * $real) > $o
Greater-than comparison of two numbers.
$greatereq/2 ($int * $int) > $o
($rat * $rat) > $o
($real * $real) > $o
Greater-than-or-equal-to comparison of two numbers.
$uminus/1 $int > $int
$rat > $rat
$real > $real
Unary minus of a number. $uminus, $sum, and $difference are related by
  ! [X,Y] : $difference(X,Y) = $sum(X,$uminus(Y))
i.e, only $uminus and $sum need to be implemented to get all three additive operators.
$sum/2 ($int * $int) > $int
($rat * $rat) > $rat
($real * $real) > $real
Sum of two numbers.
$difference/2 ($int * $int) > $int
($rat * $rat) > $rat
($real * $real) > $real
Difference between two numbers.
$product/2 ($int * $int) > $int
($rat * $rat) > $rat
($real * $real) > $real
Product of two numbers.
$quotient/2 ($int * $int) > $rat
($rat * $rat) > $rat
($real * $real) > $real
Exact quotient of two numbers of the same type. For non-zero divisors, the result can be computed. For zero divisors the result is not specified. In practice, if an ATP system does not "know" that the divisor is non-zero, it should simply not evaluate the $quotient. Users should always guard their use of $quotient using inequality, e.g.,
  ! [X: $real] : ( X != 0.0 => p($quotient(5.0,X)) )
For $rat or $real operands the result is the same type. For $int operands the result is $rat; the result might be simplied.
$quotient_e/2, $quotient_t/2, $quotient_f/2 ($int * $int) > $int
($rat * $rat) > $rat
($real * $real) > $real
Integral quotient of two numbers. The three variants use different rounding to an integral result:

$quotient_e(N,D) - the Euclidean quotient, which has a non-negative remainder. If D is positive then $quotient_e(N,D) is the floor (in the type of N and D) of the real division N/D, and if D is negative then $quotient_e(N,D) is the ceiling of N/D.
$quotient_t(N,D) - the truncation of the real division N/D.
$quotient_f(N,D) - the floor of the real division N/D.
For zero divisors the result is not specified.
$remainder_e/2, $remainder_t/2, $remainder_f/2 ($int * $int) > $int
($rat * $rat) > $rat
($real * $real) > $real
Remainder after integral division of two numbers. For τ ∈ {$int,$rat, $real}, ρ ∈ {e, t,f}, $quotient_ρ and $remainder_ρ are related by
  ! [N:τ,D:τ] : $sum($product($quotient_ρ(N,D),D),$remainder_ρ(N,D)) = N
For zero divisors the result is not specified.
$floor/1 $int > $int
$rat > $rat
$real > $real
Floor of a number. The largest integral value (in the type of the argument) not greater than the argument (i.e., ! [X] : ($is_int($floor(X)))).
$ceiling/1 $int > $int
$rat > $rat
$real > $real
Ceiling of a number. The smallest integral value (in the type of the argument) not less than the argument (i.e., ! [X] : ($is_int($ceiling(X)))).
$truncate/1 $int > $int
$rat > $rat
$real > $real
Truncation of a number. The nearest integral value (in the type of the argument) with magnitude not greater than the absolute value of the argument (i.e., ! [X] : ($is_int($truncate(X)))).
$round/1 $int > $int
$rat > $rat
$real > $real
Rounding of a number. The nearest integral value (in the type of the argument) to the argument (i.e., ! [X] : ($is_int($rount(X)))). If the argument is halfway between two integral values, the nearest even integral value to the argument.
$abs/1 $int > $int
$rat > $rat
$real > $real
Absolute value of a number. $abs is related to other operators by
! [X] : ( ( $greatereq(X,0) => $abs(X) = X ) & ( $less(X,0) => $abs(X) = $uminus(X) ) )

$is_int/1 $int > $o
$rat > $o
$real > $o
Test for coincidence with an $int value.
$is_rat/1 $int > $o
$rat > $o
$real > $o
Test for coincidence with a $rat value.
$to_int/1 $int > $int
$rat > $int
$real > $int
Coercion of a number to $int. The largest $int not greater than the argument; the argument effectively gets truncated before conversion: ! [X] : ($to_int($truncate(X)) = $to_int(X)). If applied to an argument of type $int, this is the identity function.
$to_rat/1 $int > $rat
$rat > $rat
$real > $rat
Coercion of a number to $rat. This function is not fully specified. If applied to an argument of type $int, the result is the argument over 1. If applied to an argument of type $rat, this is the identity function. If applied to an argument of type $real that is (known to be) rational, the result is the $rat value. For other reals the result is not specified. In practice, if an ATP system does not "know" that the argument is rational, it should simply not evaluate the $to_rat. Users should always guard their use of $to_rat using $is_rat, e.g.,
  ! [X: $real] : ( $is_rat(X) => p($to_rat(X)) )
$to_real/1 $int > $real
$rat > $real
$real > $real
Coercion of a number to $real. If applied to an argument of type $real, this is the identity function.

Symbol	Types	Operation	Comments, examples -
`$int`	`$tType`	The type of integers	`123`, `-123` <integer> ::- (<signed_integer>\|<unsigned_integer>) <signed_integer> ::- <sign><unsigned_integer> <unsigned_integer> ::- <decimal> <decimal> ::- (<zero_numeric>\|<positive_decimal>) <positive_decimal> ::- <non_zero_numeric><numeric>* <sign> ::: [+-] <zero_numeric> ::: [0] <non_zero_numeric> ::: [1-9] <numeric> ::: [0-9]
`$rat`	`$tType`	The type of rationals	`123/456`, `-123/456`, `+123/456` <rational> ::- (<signed_rational>\|<unsigned_rational>) <signed_rational> ::- <sign><unsigned_rational> <unsigned_rational> ::- <decimal><slash><positive_decimal> <slash> ::: [/]
`$real`	`$tType`	The type of reals	`123.456`, `-123.456`, `123.456E789`, `123.456e789`, `-123.456E789`, `123.456E-789`, `-123.456E-789` <real> ::- (<signed_real>\|<unsigned_real>) <signed_real> ::- <sign><unsigned_real> <unsigned_real> ::- (<decimal_fraction>\|<decimal_exponent>) <decimal_exponent> ::- (<decimal>\|<decimal_fraction>)<exponent><decimal> <decimal_fraction> ::- <decimal><dot_decimal> <dot_decimal> ::- <dot><numeric><numeric>* <dot> ::: [.] <exponent> ::: [Ee]
`=` (infix)	`($int * $int) > $o` `($rat * $rat) > $o` `($real * $real) > $o`	Comparison of two numbers.	The numbers must be the same atomic type (see the type system).
`$less`/2	`($int * $int) > $o` `($rat * $rat) > $o` `($real * $real) > $o`	Less-than comparison of two numbers.	`$less`, `$lesseq`, `$greater`, and `$greatereq` are related by `! [X,Y] : ( $lesseq(X,Y) <=> ( $less(X,Y) \| X = Y ) )` `! [X,Y] : ( $greater(X,Y) <=> $less(Y,X) )` `! [X,Y] : ( $greatereq(X,Y) <=> $lesseq(Y,X) )` i.e, only `$less` and equality need to be implemented to get all four relational operators.
`$lesseq`/2	`($int * $int) > $o` `($rat * $rat) > $o` `($real * $real) > $o`	Less-than-or-equal-to comparison of two numbers.
`$greater`/2	`($int * $int) > $o` `($rat * $rat) > $o` `($real * $real) > $o`	Greater-than comparison of two numbers.
`$greatereq`/2	`($int * $int) > $o` `($rat * $rat) > $o` `($real * $real) > $o`	Greater-than-or-equal-to comparison of two numbers.
`$uminus`/1	`$int > $int` `$rat > $rat` `$real > $real`	Unary minus of a number.	$uminus, `$sum`, and `$difference` are related by `! [X,Y] : $difference(X,Y) = $sum(X,$uminus(Y))` i.e, only `$uminus` and `$sum` need to be implemented to get all three additive operators.
`$sum`/2	`($int * $int) > $int` `($rat * $rat) > $rat` `($real * $real) > $real`	Sum of two numbers.
`$difference`/2	`($int * $int) > $int` `($rat * $rat) > $rat` `($real * $real) > $real`	Difference between two numbers.
`$product`/2	`($int * $int) > $int` `($rat * $rat) > $rat` `($real * $real) > $real`	Product of two numbers.
`$quotient`/2	`($int * $int) > $rat` `($rat * $rat) > $rat` `($real * $real) > $real`	Exact quotient of two numbers of the same type.	For non-zero divisors, the result can be computed. For zero divisors the result is not specified. In practice, if an ATP system does not "know" that the divisor is non-zero, it should simply not evaluate the `$quotient`. Users should always guard their use of `$quotient` using inequality, e.g., `! [X: $real] : ( X != 0.0 => p($quotient(5.0,X)) )` For `$rat` or `$real` operands the result is the same type. For `$int` operands the result is `$rat`; the result might be simplied.
`$quotient_e`/2, `$quotient_t`/2, `$quotient_f`/2	`($int * $int) > $int` `($rat * $rat) > $rat` `($real * $real) > $real`	Integral quotient of two numbers.	The three variants use different rounding to an integral result: `$quotient_e(N,D)` - the Euclidean quotient, which has a non-negative remainder. If `D` is positive then `$quotient_e(N,D)` is the floor (in the type of `N` and `D`) of the real division `N/D`, and if `D` is negative then `$quotient_e(N,D)` is the ceiling of `N/D`. `$quotient_t(N,D)` - the truncation of the real division `N/D`. `$quotient_f(N,D)` - the floor of the real division `N/D`. For zero divisors the result is not specified.
`$remainder_e`/2, `$remainder_t`/2, `$remainder_f`/2	`($int * $int) > $int` `($rat * $rat) > $rat` `($real * $real) > $real`	Remainder after integral division of two numbers.	For `τ` ∈ {`$int`,`$rat`, `$real`}, `ρ` ∈ {`e`, `t`,`f`}, `$quotient_ρ` and `$remainder_ρ` are related by `! [N:τ,D:τ] : $sum($product($quotient_ρ(N,D),D),$remainder_ρ(N,D)) = N` For zero divisors the result is not specified.
`$floor`/1	`$int > $int` `$rat > $rat` `$real > $real`	Floor of a number.	The largest integral value (in the type of the argument) not greater than the argument (i.e., `! [X] : ($is_int($floor(X)))`).
`$ceiling`/1	`$int > $int` `$rat > $rat` `$real > $real`	Ceiling of a number.	The smallest integral value (in the type of the argument) not less than the argument (i.e., `! [X] : ($is_int($ceiling(X)))`).
`$truncate`/1	`$int > $int` `$rat > $rat` `$real > $real`	Truncation of a number.	The nearest integral value (in the type of the argument) with magnitude not greater than the absolute value of the argument (i.e., `! [X] : ($is_int($truncate(X)))`).
`$round`/1	`$int > $int` `$rat > $rat` `$real > $real`	Rounding of a number.	The nearest integral value (in the type of the argument) to the argument (i.e., `! [X] : ($is_int($rount(X)))`). If the argument is halfway between two integral values, the nearest even integral value to the argument.
`$abs`/1	`$int > $int` `$rat > $rat` `$real > $real`	Absolute value of a number.	`$abs` is related to other operators by ! [X] : ( ( $greatereq(X,0) => $abs(X) = X ) & ( $less(X,0) => $abs(X) = $uminus(X) ) )
`$is_int`/1	`$int > $o` `$rat > $o` `$real > $o`	Test for coincidence with an `$int` value.
`$is_rat`/1	`$int > $o` `$rat > $o` `$real > $o`	Test for coincidence with a `$rat` value.
`$to_int`/1	`$int > $int` `$rat > $int` `$real > $int`	Coercion of a number to `$int`.	The largest `$int` not greater than the argument; the argument effectively gets truncated before conversion: `! [X] : ($to_int($truncate(X)) = $to_int(X))`. If applied to an argument of type `$int`, this is the identity function.
`$to_rat`/1	`$int > $rat` `$rat > $rat` `$real > $rat`	Coercion of a number to `$rat`.	This function is not fully specified. If applied to an argument of type `$int`, the result is the argument over `1`. If applied to an argument of type `$rat`, this is the identity function. If applied to an argument of type `$real` that is (known to be) rational, the result is the `$rat` value. For other reals the result is not specified. In practice, if an ATP system does not "know" that the argument is rational, it should simply not evaluate the `$to_rat`. Users should always guard their use of `$to_rat` using `$is_rat`, e.g., `! [X: $real] : ( $is_rat(X) => p($to_rat(X)) )`
`$to_real`/1	`$int > $real` `$rat > $real` `$real > $real`	Coercion of a number to `$real`.	If applied to an argument of type `$real`, this is the identity function.

The extent to which ATP systems are able to work with the arithmetic predicates and functions can vary, from a simple ability to evaluate ground terms, e.g., $sum(2,3) can be evaluated to 5, through an ability to instantiate variables in equations involving such functions, e.g., $product(2,$uminus(X)) = $uminus($sum(X,2)) can instantiate X to 2, to extensive algebraic manipulation capability. The syntax does not axiomatize arithmetic theory, but may be used to write axioms of the theory.

The Non-classical Logics

The TPTP recognizes several normal modal logics, and one temporal logic. Each logic has a define name:

Generic modal - $modal
Alethic modal - $alethic_modal
Deontic modal - $deontic_modal
Epistemic modal - $epistemic_modal
Doxastic modal - $doxastic_modal
Temporal (instant-based) - $temporal_instant

Connectives
The non-classical connectives of NTF have the form {$name}. For the logics recognized by the TPTP the connectives are:

$modal - {$box}, {$dia}
$alethic_modal - {$necessary}, {$possible}
$deontic_modal - {$obligatory}, {$permissible}, {$forbidden}
$epistemic_modal - {$knows}, {$canKnow}
$doxastic_modal - {$believes}, {$canBelieve}
$temporal_instant - {$future}, {$past}, {$henceforth}, {$hitherto}

(System names may also be used for user-defined connectives, e.g., {$$canadian_conditional}, thus allowing use of the TPTP syntax when experimenting with logics that have not yet been formalized in the TPTP.) A connective can be parameterized to reflect more complex non-classical connectives. The form is {$name(param₁,param₂,param₂)}. If the connective is indexed the index is given as the first parameter prefixed with a #, e.g., {$knows(#manuel)} @ (nothing)}, so that the connective is {$knows(#manuel)} (and not the connective {$knows} applied to the index #manuel). All other parameters are key-value assignments, e.g., to list the agents of common knowledge the form might be {$common($agents:=[alice,bob,claire])}.

In NXF the non-classical connectives are applied in a mixed higher-order-applied/first-order-functional style, with the connectives applied using @ to a ()ed list of arguments. In NHF the non-classical connectives are applied using @ in usual higher-order style with curried function applications. There are also short form unary connectives for unparameterised {$box} and {$dia}, applied directly like negation: [.] and <.>, e.g., {$box} @ (p) can be written [.] p. Short forms and long forms can be used together, e.g., it’s OK to use {$necessary} and [.] in the same problem or formula.

Full specification of the connectives and their use in formulae is in the BNF starting at <nxf_atom> and <thf_defined_atomic>.

Semantics Specification
An annotated formula with the role logic is used to specify the semantics of formulae. The semantic specification typically comes first in a file. A semantic specification consists of the defined name of the logic followed by == and a list of properties value assignments. Each specification is the property name, followed by == and either a value or a tuple of specification details. If the first element of a list of details is a value, that is the default value for all cases that are not specified in the rest of the list. Each detail after the optional default value is the name of a relevant part of the vocabulary of the problem, followed by == and a value for that named part. The BNF grammar is here. The grammar is not very restrictive on purpose, to enable working with other logics as well. It is possible to create a lot of nonsense specifications, and to say the same thing in different meaningful ways. A tool to check the sanity of a specification is available.

A semantic specification changes the meaning of things such as the boolean type $o, universal quantification with !, etc - their existing meaning in classical logic should not be confused with the meaning in the declared logic.

For plain $modal and all the *_modal logics the properties that may be specified are $domains, $designation, $terms, and $modalities.

$domains
- $constant - Constant domain semantics has all world's domains fixed the same.
- $varying - Varying domain semantics has (potentially) different domains for each world.
- $cumulative - Cumulative domain semantics and varying, and the domain of each world is a superset of the domains of the worlds from which it can be reached.
- $decreasing - Decreasing domain semantics and varying, and the domain of each world is a subset of the domains of the worlds from which it can be reached.
$designation
- $rigid - Rigid designation independent of worlds.
- $flexible - Flexible designation dependent on worlds.
$terms
- $local - Terms are interpreted in the local world
- $global - Terms are interpreted globally in all the worlds
$modalities
- A known system name in the form $modal_system_Sys
  Sys ∈ { K,KB,K4,K5,K45,KB5,D,DB,D4,D5,D45,M,B,S4,S5,S5U }
- A tuple of known axiom names in the form [ $modal_axiom_Ax₁, $modal_axiom_Ax₂, ... ]
  Ax_i ∈ { K,M,B,D,4,5,CD,BoxM,C4,C }

For $temporal_instant the properties are the $domains, $designation, and $terms of the modal logic, $modalities with different possible values, and another property $time.

$modalities
- A tuple of known axiom names in the form [ $modal_axiom_AxTm₁, $modal_axiom_AxTm₂, ... ]
  Ax_i ∈ { K,M,B,D,4,5 }, Tm_i ∈ { +,- },
$time
- A tuple of known time properties in the form [ P₁,P₂, ... ]
  P_i ∈ { $reflexivity, $irreflexivity, $transitivity, $asymmetry, $anti_symmetry, $linearity, $forward_linearity, $backward_linearity, $beginning, $end, $no_beginning, $no_end, $density, $forward_discreteness, $backward_discreteness }

The formulae of a problem can be either local (true in the current world) or global (true in all worlds). By default, formulae with the roles hypothesis and conjecture are local, and all others are global. These defaults can be overridden by adding a subrole, e.g., axiom-$local, conjecture-$global.

Physical Organization

The TPTP is physically organized into six subdirectories, as follows:

Problems - The problem files directory, with subdirectories for each domain. The domain name abbreviations are used as subdirectory names. The subdirectories contain the problem files.
Axioms - The axiom files directory.
Generators - The generator files directory.
TPTP2X - The directory containing the TPTP2X utility.
Scripts - A directory containing the TPTP4X and TPTP2T utilities (the directory is called "Scripts" because originally it was planned to put shell scripts there, but things changed).
Documents - A directory containing documents that relate to the TPTP:
- ReadMe - General information about the TPTP.
- History - A history of the changes made to the TPTP up to the current release.
- ProblemList - A list of all the abstract problems in the TPTP, giving their names, number of THF, TFF, FOF, and CNF versions, and one-line descriptions.
- AxiomList - A list of the axiomatizations in the TPTP, giving their names, number of THF, TFF, FOF, and CNF specializations, and one-line descriptions.
- GeneratorList - A list of the generic problems in the TPTP, giving their names, number of FOF and CNF versions, and one-line descriptions. For each version the constraints that determine the permissible sizes, and the TPTP sizes, are given.
- ReverseIndex - An index from existing known names for problems to their TPTP file names.
- Bibliography.bib - The BibTeX file containing references for all material cited in the TPTP.
- ProblemAndSolutionStatistics - Lists all the problems in the TPTP, giving statistics about the problems and information about ATP systems' performances on the problems. The fields are listed and explained below.
- THFSynopsis, TFFSynopsis, FOFSynopsis, and CNFSynopsis - Statistics on the THF, TFF, FOF, and CNF probhlems in the TPTP, and charts showing the structure of the TPTP problem domains.
- OverallSynopsis - Statistics on the overall TPTP, and a chart showing the structure of the TPTP problem domains.
- VerySimilarProblemsLists - A directory containing lists of very similar problems. Problems are considered to be be "very similar" if:
  - They have many formulae in common, e.g., when the same axioms are used for very many conjectures.
  - They have every similar characteristics, e.g., when the problems have been created using different parameters in some automated process.
- SyntaxBNF - BNF of the TPTP syntax
- Template - A template for submitting new TPTP problems.
The files in the Documents directory contain comprehensive online information about the TPTP. They summarize much of the information contained in this report, in specific files. Their format provides quick access to the data, using standard system tools (e.g., grep, awk).
The fields in the problem lines of the ProblemAndSolutionStatistics file are:
- Problem name
- Frm - one of THF, TFF, FOF, or CNF.
- SZS - The SZS status (using the SZS ontology)
- Rtng - The problem's difficulty rating.
- SPC - The problem's Specialist Problem Class
- FmlCls - The number of formulae/clauses
- UnitF - The number of unit formulae/clauses
- TypeF - The number of type declarations (THF and TFF)
- Atoms - The number of atoms
- EquAts - The number of equality atoms
- Conns - The number of logical connectives
- FOOLs - The number of FOOL logic atoms/terms (TXF)
- Ariths - The number of arithmetic atoms/terms (THF and TFF)
- Types - The number of Types (THF and TFF)
- Symbls - The number of distinct non-variable symbols
- Preds - The number of disinct predicate symbols
- Arity - The arity range of predicate symbols
- Funcs - The number of distinct function symbols
- Arity - The arity range of function symbols
- Vars - The number of distinct variables
- ^ - The number of lambda quantified variables (THF)
- ! - The number of universally quantified variables
- ? - The number of existentially quantified variables
- !> - The number of lambda-pi quantified (polymorphic) variables (TH1 and TF1)
- {.} - The number of mono-modal non-classical connectives
- {#} - The number of multi-modal non-classical connectives
The fields in the solution lines of the ProblemAndSolutionStatistics file are:
- System name
- Res - The result established (using the SZS ontology)
- RsT - The CPU time taken to get the result
- Out - The output claimed (using the SZS ontology)
- SZS - The output detected during analysis (it might be difference from what was claimed, e.g., if the output was not in TPTP format)
- DDpDSz - If the output is some form of derivation then the derivation depth. If the output is a finite domain model then the domain size.
- Leaves - The number of leaves if the output is a derivation.
- All other fields as for a problem.