# The TPTP Language

### Annotated Formulae

TPTP problems and TSTP solutions are built from annotated formulae of the form language(name,role,formula,source,[useful_info]).. The languages currently supported are thf - formulae in typed higher-order form, tff - formulae in typed first-order form (including extended form), fof - formulae in first order form, and cnf - formulae in clause normal form. The name identifies the formula within the problem. The role gives the user semantics of the formula, one of axiom, hypothesis, definition, assumption, lemma, theorem, corollary, conjecture, negated_conjecture, plain, type, interpretation, logic, and unknown.

The syntax for atoms is that of Prolog: variables start with upper case letters, atoms and terms are written in prefix notation, uninterpreted predicates and functors either start with lower case and contain alphanumerics and underscore, or are in 'single quotes'. The language also supports interpreted predicates and functors. These come in two varieties: defined predicates and functors, whose interpretation is specified by the TPTP language, and system predicates and functors, whose interpretation is ATP system specific. Interpreted predicates and functors are syntactically distinct from uninterpreted ones - they are = and !=, or start with a \$, a ", or a digit. Non-variable symbols can be given a type globally, in the formula with role type. The defined types are \$o - the Boolean type, \$i - the type of individuals, \$real - the type of reals, \$rat - the type of rational, and \$int - the type of integers. New types are introduced in formulae with the type role, based on \$tType - the type of all types. Full details of the THF and TFF type systems are provided in [SS+12,BP13,SB10,KSR16], with the last providing an overview of the first three.

The defined predicates recognized so far are

• \$true and \$false, with the obvious interpretations. In typed languages they are of type \$o.
• = and != for equality and inequality. In typed languages they are ad hoc polymorphic with both sides having the same type.
• \$distinct, whose arguments are hence known to be unequal from each other (but not necessarily unequal to any other constants). \$distinct maybe overloaded with different arities. \$distinct is used in only the typed languages, and is ad hoc polymorphic with all arguments having the same type.
• The arithmetic predicates described below. The arithmetic predicates are used in only the THF and TFF languages.
The defined functors recognized so far are
• "distinct object"s, written in double quotes. All "distinct object"s are unequal to all "different distinct object"s (but not necessarily unequal to any other constants), e.g., "Apple" != "Microsoft". One way of implementing this is to interpret "distinct object" as the domain element in the double quotes. In typed contexts "distinct object"s are of type \$i (and as a result, are unequal to all numbers because their interpretation domains are disjoint in the TPTP type systems).
• Numbers (numeric constants). These are used in the THF and TFF languages. Numbers are interpreted as themselves (as domain elements). All different numbers are unequal, e.g., 1 != 2 (but not necessarily unequal to any other constants). All numbers are also unequal to terms of type \$i (because their interpretation domains are disjoint in the TPTP type systems). Numbers are not allowed in FOF and CNF, but for many applications an adequate alternative is to write "numbers" as distinct objects, e.g., "27", so that different "numbers" are known to be unequal to each other, but possibly equal to other terms of type \$i.
• The arithmetic functions described below. The arithmetic functions are used in only the THF and TFF languages.

System predicates and functors are used for interpreted predicates and functors that are available in particular ATP tools. System predicates and functors start with \$\$. The names are not controlled by the TPTP language, so they must be used with caution.

The connectives used to build non-atomic formulae are written using intuitive notations. The universal quantifier is !, the existential quantifier is ?, and the lambda binder is ^. Quantified formulae are written in the form Quantifier [Variables] :  Formula. In the THF, TFF, and FOF languages, every variable in a Formula must be bound by a preceding quantification with adequate scope. Typed Variables are given their type by a :type suffix. The binary connectives are infix | for disjunction, infix & for conjunction, infix <=> for equivalence, infix => for implication, infix <= for reverse implication, infix <~> for non-equivalence (XOR), infix ~| for negated disjunction (NOR), infix ~& for negated conjunction (NAND), infix @ for application. The only unary connective is prefix ~ for negation. Negation has higher precedence than quantification, which in turn has higher precedence than the binary connectives. No precedence is specified between the binary connectives; brackets are used to ensure the correct association. The binary connectives are left associative.

The THF and TFF languages have conditional and let expressions.

• Conditional expressions have \$ite as the functor. The expressions are parametric polymorphic, taking a boolean expression as the first argument, then two expressions of any one type as the second and third arguments, as the true and false return values respectively, i.e., the return type is the same as that of the second and third arguments. Let expressions have \$let as the functor.
• The expressions provide the types of defined symbols, definitions for the symbols, and a formula/term in which the definitions are applied. Each type declaration is the same as a type declaration in an annotated formula with the type role, and multiple type declarations are given in []ed tuples of declarations. Each definition defines the expansion of one of the declared symbols, and multiple definitions are given in []ed tuples of definitions. If variables are used in the lefthand side of a definition, their values are supplied in the defined symbol's use. Such variables do not need to be declared (they are implicitly declared to be of the type defined by the symbol declaration), but must be top-level arguments of the defined symbol and be pairwise distinct.

The useful_info field of an annotated formula is optional, and if it is not used then the source field becomes optional. The source field is used to record where the annotated formula came from, and is most commonly a file record or an inference record. A file record stores the name of the file from which the annotated formula was read, and optionally the name of the annotated formula as it occurs in that file - this might be different from the name of the annotated formula itself, e.g., if an ATP systems reads an annotated formula, renames it, and then prints it out. An inference record stores information about an inferred formula. The useful_info field of an annotated formula is a list of arbitrary useful information formatted as Prolog terms, as required for user applications.

An example of a THF formula section, extracted from the problem file LCL633^1.p, is shown below. An example of a TFF formula section, extracted from the problem file DAT013=1.p, is shown below that. Example THF formulae. An example of a FOF formula section, extracted from the problem file GRP194+1.p, is shown below that. An example of a clause section, extracted from the problem file GRP039-7.p, is shown below that.

```%------------------------------------------------------------------------------
%----Signature
thf(a,type,(
a: \$tType )).

thf(p,type,(
p: ( a > \$i > \$o ) > \$i > \$o )).

thf(g,type,(
g: a > \$i > \$o )).

thf(e,type,(
e: ( a > \$i > \$o ) > a > \$i > \$o )).

thf(r,type,(
r: \$i > \$i > \$o )).

%----Axioms
thf(positiveness,axiom,(
! [X: a > \$i > \$o] :
( mvalid
@ ( mimpl @ ( mnot @ ( p @ X ) )
@ ( p
@ ^ [Z: a] :
( mnot @ ( X @ Z ) ) ) ) ) )).

thf(g,definition,
( g
= ( ^ [Z: a,W: \$i] :
! [X: a > \$i > \$o] :
( mimpl @ ( p @ X ) @ ( X @ Z ) @ W ) ) )).

thf(e,definition,
( e
= ( ^ [X: a > \$i > \$o,Z: a,P: \$i] :
! [Y: a > \$i > \$o] :
( mimpl @ ( Y @ Z )
@ ( mbox @ r
@ ^ [Q: \$i] :
! [W: a] :
( mimpl @ ( X @ W ) @ ( Y @ W ) @ Q ) )
@ P ) ) )).

%----Conjecture
thf(thm,conjecture,
( mvalid
@ ^ [W: \$i] :
! [Z: a] :
( mimpl @ ( g @ Z ) @ ( e @ g @ Z ) @ W ) )).
%------------------------------------------------------------------------------
```
Example of a THF problem file formulae section (LCL633^1.p).
```%------------------------------------------------------------------------------
tff(list_type,type,(
list: \$tType )).

tff(nil_type,type,(
nil: list )).

tff(mycons_type,type,(
mycons: ( \$int * list ) > list )).

tff(sorted_type,type,(
fib_sorted: list > \$o )).

tff(empty_fib_sorted,axiom,(
fib_sorted(nil) )).

tff(single_is_fib_sorted,axiom,(
! [X: \$int] : fib_sorted(mycons(X,nil)) )).

tff(double_is_fib_sorted_if_ordered,axiom,(
! [X: \$int,Y: \$int] :
( \$less(X,Y)
=> fib_sorted(mycons(X,mycons(Y,nil))) ) )).

tff(recursive_fib_sort,axiom,(
! [X: \$int,Y: \$int,Z: \$int,R: list] :
( ( \$less(X,Y)
& \$greatereq(Z,\$sum(X,Y))
& fib_sorted(mycons(Y,mycons(Z,R))) )
=> fib_sorted(mycons(X,mycons(Y,mycons(Z,R)))) ) )).

tff(check_list,conjecture,(
fib_sorted(mycons(1,mycons(2,mycons(4,mycons(7,mycons(100,nil)))))) )).
%------------------------------------------------------------------------------
```
Example of a TFF problem file formulae section (DAT002=1.p).
```%--------------------------------------------------------------------------
%----Definition of a homomorphism
fof(homomorphism1,axiom,
( ! [X] :
( group_member(X,f)
=> group_member(phi(X),h) ) )).

fof(homomorphism2,axiom,
( ! [X,Y] :
( ( group_member(X,f)
& group_member(Y,f) )
=> multiply(h,phi(X),phi(Y)) = phi(multiply(f,X,Y)) ) )).

fof(surjective,axiom,
( ! [X] :
( group_member(X,h)
=> ? [Y] :
( group_member(Y,f)
& phi(Y) = X ) ) )).

%----Definition of left zero
fof(left_zero,axiom,
( ! [G,X] :
( left_zero(G,X)
<=> ( group_member(X,G)
& ! [Y] :
( group_member(Y,G)
=> multiply(G,X,Y) = X ) ) ) )).

%----The conjecture
fof(left_zero_for_f,hypothesis,
( left_zero(f,f_left_zero) )).

fof(prove_left_zero_h,conjecture,
( left_zero(h,phi(f_left_zero)) )).
%--------------------------------------------------------------------------
```
Example of a FOF problem file formulae section (GRP194+1.p).

```%--------------------------------------------------------------------------
%----Redundant two axioms
cnf(right_identity,axiom,
( multiply(X,identity) = X )).

cnf(right_inverse,axiom,
( multiply(X,inverse(X)) = identity )).

... some clauses omitted here for brevity

cnf(property_of_O2,axiom,
( subgroup_member(X)
| subgroup_member(Y)
| multiply(X,element_in_O2(X,Y)) = Y )).

%----Denial of theorem
cnf(b_in_O2,negated_conjecture,
( subgroup_member(b) )).

cnf(b_times_a_inverse_is_c,negated_conjecture,
( multiply(b,inverse(a)) = c )).

cnf(a_times_c_is_d,negated_conjecture,
( multiply(a,c) = d )).

cnf(prove_d_in_O2,negated_conjecture,
( ~ subgroup_member(d) )).
%--------------------------------------------------------------------------
```
Example of a CNF problem file clause section (GRP039-7.p).

### The Arithmetic System

Arithmetic must be done in the context of the THF or TFF types logics, which support the predefined atomic numeric types \$int, \$rat, and \$real. Using THF or TFF enforces semantics that separate \$int from \$rat from \$real from \$i from \$o. The numbers are unbounded, and the reals of infinite precision (rather than some specific implementation such as 32 bit 2's complement integer, or IEEE floating point). Systems that implement limited arithmetic must halt in an
SZS error state if they hit overflow.

The following interpreted predicates and interpreted functions are defined. Each symbol is ad-hoc polymorphic over the numeric types exception - \$quotient is not defined for \$int). All arguments must have the same numeric type. All the functions, except for \$quotient of two \$ints, and the coercion functions \$to_int and \$to_rat, have the same range type as their arguments. For example, \$sum can be used with the type signatures (\$int * \$int) > \$int, (\$rat * \$rat) > \$rat, and (\$real * \$real) > \$real. The \$quotient of two \$ints is a \$rat. The coercion function \$to_int always has a \$int result, and the coercion function \$to_rat always has a \$rat result. All the predicates have a \$o result. For example, \$less can be used with the type signatures (\$int * \$int) > \$o, (\$rat * \$rat) > \$o, and (\$real * \$real) > \$o.

\$int The type of integers 123, -123
```<integer>            ::- (<signed_integer>|<unsigned_integer>)
<signed_integer>     ::- <sign><unsigned_integer>
<unsigned_integer>   ::- <decimal>
<decimal>            ::- (<zero_numeric>|<positive_decimal>)
<positive_decimal>   ::- <non_zero_numeric><numeric>*
<sign>               ::: [+-]
<zero_numeric>       ::: [0]
<non_zero_numeric>   ::: [1-9]
<numeric>            ::: [0-9]```
\$rat The type of rationals 123/456, -123/456, +123/456
```<rational>           ::- (<signed_rational>|<unsigned_rational>)
<signed_rational>    ::- <sign><unsigned_rational>
<unsigned_rational>  ::- <decimal><slash><positive_decimal>
<slash>              ::: [/]```
\$real The type of reals 123.456, -123.456,
123.456E789, 123.456e789, -123.456E789,
123.456E-789, -123.456E-789
```<real>               ::- (<signed_real>|<unsigned_real>)
<signed_real>        ::- <sign><unsigned_real>
<unsigned_real>      ::- (<decimal_fraction>|<decimal_exponent>)
<decimal_exponent>   ::- (<decimal>|<decimal_fraction>)<exponent><decimal>
<decimal_fraction>   ::- <decimal><dot_decimal>
<dot_decimal>        ::- <dot><numeric><numeric>*
<dot>                ::: [.]
<exponent>           ::: [Ee]```
= (infix) Comparison of two numbers. The numbers must be the same atomic type (see the type system).
\$less/2 Less-than comparison of two numbers. \$less, \$lesseq, \$greater, and \$greatereq are related by
! [X,Y] : ( \$lesseq(X,Y) <=> ( \$less(X,Y) | X = Y ) )
! [X,Y] : ( \$greater(X,Y) <=> \$less(Y,X) )
! [X,Y] : ( \$greatereq(X,Y) <=> \$lesseq(Y,X) )
i.e, only \$less and equality need to be implemented to get all four relational operators.
\$lesseq/2 Less-than-or-equal-to comparison of two numbers.
\$greater/2 Greater-than comparison of two numbers.
\$greatereq/2 Greater-than-or-equal-to comparison of two numbers.
\$uminus/1 Unary minus of a number. \$uminus, \$sum, and \$difference are related by
! [X,Y] : \$difference(X,Y) = \$sum(X,\$uminus(Y))
i.e, only \$uminus and \$sum need to be implemented to get all three additive operators.
\$sum/2 Sum of two numbers.
\$difference/2 Difference between two numbers.
\$product/2 Product of two numbers.
\$quotient/2 Exact quotient of two numbers of the same type. For non-zero divisors, the result can be computed. For zero divisors the result is not specified. In practice, if an ATP system does not "know" that the divisor is non-zero, it should simply not evaluate the \$quotient. Users should always guard their use of \$quotient using inequality, e.g.,
! [X: \$real] : ( X != 0.0 => p(\$quotient(5.0,X)) )
For \$rat or \$real operands the result is the same type. For \$int operands the result is \$rat; the result might be simplied.
\$quotient_e/2, \$quotient_t/2, \$quotient_f/2 Integral quotient of two numbers. The three variants use different rounding to an integral result:
• \$quotient_e(N,D) - the Euclidean quotient, which has a non-negative remainder. If D is positive then \$quotient_e(N,D) is the floor (in the type of N and D) of the real division N/D, and if D is negative then \$quotient_e(N,D) is the ceiling of N/D.
• \$quotient_t(N,D) - the truncation of the real division N/D.
• \$quotient_f(N,D) - the floor of the real division N/D.
For zero divisors the result is not specified.
\$remainder_e/2, \$remainder_t/2, \$remainder_f/2 Remainder after integral division of two numbers. For τ ∈ {\$int,\$rat, \$real}, ρ ∈ {e, t,f}, \$quotient_ρ and \$remainder_ρ are related by
! [N:τ,D:τ] : \$sum(\$product(\$quotient_ρ(N,D),D),\$remainder_ρ(N,D)) = N
For zero divisors the result is not specified.
\$floor/1 Floor of a number. The largest integral value (in the type of the argument) not greater than the argument.
\$ceiling/1 Ceiling of a number. The smallest integral value (in the type of the argument) not less than the argument.
\$truncate/1 Truncation of a number. The nearest integral value (in the type of the argument) with magnitude not greater than the absolute value of the argument.
\$round/1 Rounding of a number. The nearest integral value (in the type of the argument) to the argument. If the argument is halfway between two integral values, the nearest even integral value to the argument.
\$is_int/1 Test for coincidence with an \$int value.
\$is_rat/1 Test for coincidence with a \$rat value.
\$to_int/1 Coercion of a number to \$int. The largest \$int not greater than the argument. If applied to an argument of type \$int this is the identity function.
\$to_rat/1 Coercion of a number to \$rat. This function is not fully specified. If applied to a \$int the result is the argument over 1. If applied to a \$rat this is the identity function. If applied to a \$real that is (known to be) rational the result is the \$rat value. For other reals the result is not specified. In practice, if an ATP system does not "know" that the argument is rational, it should simply not evaluate the \$to_rat. Users should always guard their use of \$to_rat using \$is_rat, e.g.,
! [X: \$real] : ( \$is_rat(X) => p(\$to_rat(X)) )
\$to_real/1 Coercion of a number to \$real.

The extent to which ATP systems are able to work with the arithmetic predicates and functions can vary, from a simple ability to evaluate ground terms, e.g., \$sum(2,3) can be evaluated to 5, through an ability to instantiate variables in equations involving such functions, e.g., \$product(2,\$uminus(X)) = \$uminus(\$sum(X,2)) can instantiate X to 2, to extensive algebraic manipulation capability. The syntax does not axiomatize arithmetic theory, but may be used to write axioms of the theory.

### The Non-classical Logics

The TPTP recognizes several normal modal logics, and one temporal logic. Each logic has a define name:
• Generic modal - \$modal
• Alethic modal - \$alethic_modal
• Deontic modal - \$deontic_modal
• Epistemic modal - \$epistemic_modal
• Doxastic modal - \$doxastic_modal
• Temporal (instant-based) - \$temporal_instant
Connectives
The non-classical connectives of NTF have the form {\$name}. For the logics recognized by the TPTP the connectives are:
• \$modal - {\$box}, {\$dia}
• \$alethic_modal - {\$necessary}, {\$possible}
• \$deontic_modal - {\$obligatory}, {\$permissible}, {\$forbidden}
• \$epistemic_modal - {\$knows}, {\$canKnow}
• \$doxastic_modal - {\$believes}, {\$canBelieve}
• \$temporal_instant - {\$future}, {\$past}, {\$henceforth}, {\$hitherto}
(System names may also be used for user-defined connectives, e.g., {\$\$canadian_conditional}, thus allowing use of the TPTP syntax when experimenting with logics that have not yet been formalized in the TPTP.) A connective can be parameterized to reflect more complex non-classical connectives. The form is {\$name(param1,param2,param2)}. If the connective is indexed the index is given as the first parameter prefixed with a #, e.g., {\$knows(#manuel)} @ (nothing)}, so that the connective is {\$knows(#manuel)} (and not the connective {\$knows} applied to the index #manuel). All other parameters are key-value assignments, e.g., to list the agents of common knowledge the form might be {\$common(\$agents:=[alice,bob,claire])}.

In NXF the non-classical connectives are applied in a mixed higher-order-applied/first-order-functional style, with the connectives applied using @ to a ()ed list of arguments. In NHF the non-classical connectives are applied using @ in usual higher-order style with curried function applications. There are also short form unary connectives for unparameterised {\$box} and {\$dia}, applied directly like negation: [.] and <.>, e.g., {\$box} @ (p) can be written [.] p. Short forms and long forms can be used together, e.g., itâ€™s OK to use {\$necessary} and [.] in the same problem or formula.

Semantics Specification
An annotated formula with the role logic is used to specify the semantics of formulae. The semantic specification typically comes first in a file. A semantic specification consists of the defined name of the logic followed by == and a list of properties value assignments. Each specification is the property name, followed by == and either a value or a tuple of specification details. If the first element of a list of details is a value, that is the default value for all cases that are not specified in the rest of the list. Each detail after the optional default value is the name of a relevant part of the vocabulary of the problem, followed by == and a value for that named part. The BNF grammar is
here. The grammar is not very restrictive on purpose, to enable working with other logics as well. It is possible to create a lot of nonsense specifications, and to say the same thing in different meaningful ways. A tool to check the sanity of a specification is available.

A semantic specification changes the meaning of things such as the boolean type \$o, universal quantification with !, etc - their existing meaning in classical logic should not be confused with the meaning in the declared logic.

For plain \$modal and all the *_modal logics the properties that may be specified are \$domains, \$designation, \$terms, and \$modalities.

• \$domains
• \$constant - Constant domain semantics has all world's domains fixed the same.
• \$varying - Varying domain semantics has (potentially) different domains for each world.
• \$cumulative - Cumulative domain semantics and varying, and the domain of each world is a superset of the domains of the worlds from which it can be reached.
• \$decreasing - Decreasing domain semantics and varying, and the domain of each world is a subset of the domains of the worlds from which it can be reached.
• \$designation
• \$rigid - Rigid designation independent of worlds.
• \$flexible - Flexible designation dependent on worlds.
• \$terms
• \$local - Terms are interpreted in the local world
• \$global - Terms are interpreted globally in all the worlds
• \$modalities
• A known system name in the form \$modal_system_Sys
Sys ∈ { K,KB,K4,K5,K45,KB5,D,DB,D4,D5,D45,M,B,S4,S5,S5U }
• A tuple of known axiom names in the form [ \$modal_axiom_Ax1, \$modal_axiom_Ax2, ... ]
Axi ∈ { K,M,B,D,4,5,CD,BoxM,C4,C }

For \$temporal_instant the properties are the \$domains, \$designation, and \$terms of the modal logic, \$modalities with different possible values, and another property \$time.

• \$modalities
• A tuple of known axiom names in the form [ \$modal_axiom_AxTm1, \$modal_axiom_AxTm2, ... ]
Axi ∈ { K,M,B,D,4,5 }, Tmi ∈ { +,- },
• \$time
• A tuple of known time properties in the form [ P1,P2, ... ]
Pi ∈ { \$reflexivity, \$irreflexivity, \$transitivity, \$asymmetry, \$anti_symmetry, \$linearity, \$forward_linearity, \$backward_linearity, \$beginning, \$end, \$no_beginning, \$no_end, \$density, \$forward_discreteness, \$backward_discreteness }

The formulae of a problem can be either local (true in the current world) or global (true in all worlds). By default, formulae with the roles hypothesis and conjecture are local, and all others are global. These defaults can be overridden by adding a subrole, e.g., axiom-\$local, conjecture-\$global.