unknowns have unknown role, and this is an error situation.
The syntax for atoms is that of Prolog:
variables start with upper case letters,
atoms and terms are written in prefix notation,
uninterpreted predicates and functors either start with lower case
and contain alphanumerics and underscore, or are in 'single quotes'.
The language also supports interpreted predicates and functors.
These come in two varieties: defined predicates and functors, whose
interpretation is specified by the TPTP language, and system
predicates and functors, whose interpretation is ATP system specific.
Interpreted predicates and functors are syntactically distinct from
uninterpreted ones - they are = and !=, or start with
a $, a ", or a digit.
Non-variable symbols can be given a type globally, in the formula with
role type.
The defined types are
$o - the Boolean type,
$i - the type of individuals,
$real - the type of reals,
$rat - the type of rational,
and
$int - the type of integers.
New types are introduced in formulae with the type role, based on
$tType - the type of all types.
Full details of the THF and TFF type systems are provided in
[SS+12,BP13,SB10,KSR16],
with the last providing an overview of the first three.
The defined predicates recognized so far are
$true and $false, with the obvious interpretations.
In typed languages they are of type $o.
= and != for equality and inequality.
In typed languages they are ad hoc polymorphic with both sides having
the same type.
$distinct, whose arguments are hence known to be unequal from
each other (but not necessarily unequal to any other constants).
$distinct maybe overloaded with different arities.
$distinct is used in only the typed languages, and is ad hoc
polymorphic with all arguments having the same type.
The arithmetic predicates described below.
The arithmetic predicates are used in only the THF and TFF languages.
The defined functors recognized so far are
"distinct object"s, written in double quotes.
All "distinct object"s are unequal to all "different
distinct object"s (but not necessarily unequal to any other
constants), e.g., "Apple" != "Microsoft".
One way of implementing this is to interpret "distinct object"
as the domain element in the double quotes.
In typed contexts "distinct object"s are of type $i
(and as a result, are unequal to all numbers because their interpretation
domains are disjoint in the TPTP type systems).
Numbers (numeric constants).
These are used in the THF and TFF languages.
Numbers are interpreted as themselves (as domain elements).
All different numbers are unequal, e.g., 1 != 2
(but not necessarily unequal to any other constants).
All numbers are also unequal to terms of type $i (because
their interpretation domains are disjoint in the TPTP type systems).
Numbers are not allowed in FOF and CNF, but for many applications an
adequate alternative is to write "numbers" as distinct objects,
e.g., "27", so that different "numbers" are known to be
unequal to each other, but possibly equal to other terms of type
$i.
The arithmetic functions described below.
The arithmetic functions are used in only the THF and TFF languages.
System predicates and functors are used for interpreted predicates and
functors that are available in particular ATP tools.
System predicates and functors start with $$.
The names are not controlled by the TPTP language, so they must be used
with caution.
The connectives used to build non-atomic formulae are written using intuitive
notations.
The universal quantifier is !, the existential quantifier is
?, and the lambda binder is ^.
Quantified formulae are written in the form
Quantifier[Variables] :
Formula.
In the THF, TFF, and FOF languages, every variable in a Formula
must be bound by a preceding quantification with adequate scope.
Typed Variables are given their type by a :type
suffix.
The binary connectives are
infix | for disjunction,
infix & for conjunction,
infix <=> for equivalence,
infix => for implication,
infix <= for reverse implication,
infix <~> for non-equivalence (XOR),
infix ~| for negated disjunction (NOR),
infix ~& for negated conjunction (NAND),
infix @ for application.
The only unary connective is prefix ~ for negation.
Negation has higher precedence than quantification, which in turn has
higher precedence than the binary connectives.
No precedence is specified between the binary connectives; brackets are
used to ensure the correct association.
The binary connectives are left associative.
The THF and TFF languages have conditional and let expressions.
Conditional expressions have $ite as the functor.
The expressions are parametric polymorphic, taking a boolean expression as
the first argument, then two expressions of any one type as the second and
third arguments, as the true and false return values respectively, i.e.,
the return type is the same as that of the second and third arguments.
Let expressions have $let as the functor.
The expressions provide the types of defined symbols, definitions for the
symbols, and a formula/term in which the definitions are applied.
Each type declaration is the same as a type declaration in an annotated
formula with the type role, and multiple type declarations are given
in []ed tuples of declarations.
Each definition defines the expansion of one of the declared symbols, and
multiple definitions are given in []ed tuples of definitions.
If variables are used in the lefthand side of a definition, their values
are supplied in the defined symbol's use.
Such variables do not need to be declared (they are implicitly declared to
be of the type defined by the symbol declaration), but must be top-level
arguments of the defined symbol and be pairwise distinct.
The useful_info field of an annotated formula is optional, and if
it is not used then the source field becomes optional.
The source field is used to record where the annotated formula came from,
and is most commonly a file record or an inference record.
A file record stores the name of the file from which the annotated
formula was read, and optionally the name of the annotated formula as it
occurs in that file - this might be different from the name of the annotated
formula itself, e.g., if an ATP systems reads an annotated formula,
renames it, and then prints it out.
An inference record stores information about an inferred formula.
The useful_info field of an annotated formula is a list of arbitrary
useful information formatted as Prolog terms, as required for user
applications.
An example of a THF formula section, extracted from the problem file
LCL633^1.p, is shown below.
An example of a TFF formula section, extracted from the problem file
DAT013=1.p, is shown below that.
Example THF formulae.
An example of a FOF formula section, extracted from the problem file
GRP194+1.p, is shown below that.
An example of a clause section, extracted from the problem file
GRP039-7.p, is shown below that.
The following interpreted predicates and interpreted functions are defined.
Each symbol is ad-hoc polymorphic over the numeric types
exception - $quotient is not defined for $int).
All arguments must have the same numeric type.
All the functions, except for $quotient of two $ints, and the coercion functions
$to_int and $to_rat, have the same range type as their arguments.
For example,
$sum can be used with the type signatures
($int * $int) > $int,
($rat * $rat) > $rat,
and
($real * $real) > $real.
The $quotient of two $ints is a $rat.
The coercion function $to_int always has a $int result,
and the coercion function $to_rat always has a $rat result.
All the predicates have a $o result.
For example,
$less can be used with the type signatures
($int * $int) > $o,
($rat * $rat) > $o,
and
($real * $real) > $o.
The numbers must be the same atomic type (see the
type system).
$less/2
Less-than comparison of two numbers.
$less, $lesseq, $greater,
and $greatereq are related by ! [X,Y] : ( $lesseq(X,Y) <=> ( $less(X,Y) | X = Y ) ) ! [X,Y] : ( $greater(X,Y) <=> $less(Y,X) ) ! [X,Y] : ( $greatereq(X,Y) <=> $lesseq(Y,X) )
i.e, only $less and equality need to be implemented
to get all four relational operators.
$lesseq/2
Less-than-or-equal-to comparison of two numbers.
$greater/2
Greater-than comparison of two numbers.
$greatereq/2
Greater-than-or-equal-to comparison of two numbers.
$uminus/1
Unary minus of a number.
$uminus, $sum, and $difference are
related by ! [X,Y] : $difference(X,Y) = $sum(X,$uminus(Y))
i.e, only $uminus and $sum need to
be implemented to get all three additive operators.
$sum/2
Sum of two numbers.
$difference/2
Difference between two numbers.
$product/2
Product of two numbers.
$quotient/2
Exact quotient of two numbers of the same type.
For non-zero divisors, the result can be computed.
For zero divisors the result is not specified.
In practice, if an ATP system does not "know" that the divisor
is non-zero, it should simply not evaluate the $quotient.
Users should always guard their use of $quotient using
inequality, e.g., ! [X: $real] : ( X != 0.0 => p($quotient(5.0,X)) )
For $rat or $real operands the result is the same type.
For $int operands the result is $rat; the result might be simplied.
$quotient_e/2, $quotient_t/2, $quotient_f/2
Integral quotient of two numbers.
The three variants use different rounding to an integral result:
$quotient_e(N,D) - the Euclidean quotient, which
has a non-negative remainder.
If D is positive then $quotient_e(N,D) is
the floor (in the type of N and D) of the
real division N/D, and if D is negative
then $quotient_e(N,D) is the ceiling of N/D.
$quotient_t(N,D) - the truncation of the real
division N/D.
$quotient_f(N,D) - the floor of the real
division N/D.
For zero divisors the result is not specified.
$remainder_e/2, $remainder_t/2, $remainder_f/2
Remainder after integral division of two numbers.
For τ ∈ {$int,$rat,
$real}, ρ ∈ {e,
t,f}, $quotient_ρ and
$remainder_ρ are related by ! [N:τ,D:τ] :
$sum($product($quotient_ρ(N,D),D),$remainder_ρ(N,D)) = N
For zero divisors the result is not specified.
$floor/1
Floor of a number.
The largest integral value (in the type of the argument) not greater than
the argument.
$ceiling/1
Ceiling of a number.
The smallest integral value (in the type of the argument) not less than
the argument.
$truncate/1
Truncation of a number.
The nearest integral value (in the type of the argument) with magnitude
not greater than the absolute value of the argument.
$round/1
Rounding of a number.
The nearest integral value (in the type of the argument) to the argument.
If the argument is halfway between two integral values, the nearest even integral
value to the argument.
The largest $int not greater than the argument.
If applied to an argument of type $int this is the
identity function.
$to_rat/1
Coercion of a number to $rat.
This function is not fully specified.
If applied to a $int the result is the argument over
1.
If applied to a $rat this is the identity function.
If applied to a $real that is (known to be) rational the
result is the $rat value.
For other reals the result is not specified.
In practice, if an ATP system does not "know" that the argument
is rational, it should simply not evaluate the $to_rat.
Users should always guard their use of $to_rat using
$is_rat, e.g., ! [X: $real] : ( $is_rat(X) => p($to_rat(X)) )
$to_real/1
Coercion of a number to $real.
The extent to which ATP systems are able to work with the arithmetic
predicates and functions can vary, from a simple ability to evaluate ground
terms, e.g., $sum(2,3) can be evaluated to 5, through
an ability to instantiate variables in equations involving such functions,
e.g., $product(2,$uminus(X)) = $uminus($sum(X,2))
can instantiate X to 2, to extensive algebraic manipulation
capability.
The syntax does not axiomatize arithmetic theory, but may be used to write
axioms of the theory.
A semantic specification changes the meaning of things such as the boolean type $o,
universal quantification with !, etc - their existing meaning in classical logic should
not be confused with the meaning in the declared logic.
For plain $modal and all the *_modal logics the properties that may be
specified are $domains, $designation, $terms, and $modalities.
$domains
$constant - Constant domain semantics has all world's domains fixed the same.
$varying - Varying domain semantics has (potentially) different domains for
each world.
$cumulative - Cumulative domain semantics and varying, and the domain of each
world is a superset of the domains of the worlds from which it can be reached.
$decreasing - Decreasing domain semantics and varying, and the domain of each
world is a subset of the domains of the worlds from which it can be reached.
$designation
$rigid - Rigid designation independent of worlds.
$flexible - Flexible designation dependent on worlds.
$terms
$local - Terms are interpreted in the local world
$global - Terms are interpreted globally in all the worlds
$modalities
A known system name in the form $modal_system_Sys Sys ∈ { K,KB,K4,K5,K45,KB5,D,DB,D4,D5,D45,M,B,S4,S5,S5U }
A tuple of known axiom names in the form
[ $modal_axiom_Ax1,
$modal_axiom_Ax2, ... ] Axi ∈ { K,M,B,D,4,5,CD,BoxM,C4,C }
For $temporal_instant the properties are the $domains, $designation,
and $terms of the modal logic, $modalities with different possible values,
and another property $time.
$modalities
A tuple of known axiom names in the form
[ $modal_axiom_AxTm1,
$modal_axiom_AxTm2, ... ] Axi ∈ { K,M,B,D,4,5 },
Tmi ∈ { +,- },
$time
A tuple of known time properties in the form
[ P1,P2, ... ] Pi
∈ { $reflexivity, $irreflexivity, $transitivity, $asymmetry, $anti_symmetry,
$linearity, $forward_linearity, $backward_linearity, $beginning, $end, $no_beginning,
$no_end, $density, $forward_discreteness, $backward_discreteness }
The formulae of a problem can be either local (true in the current world) or global (true in all
worlds).
By default, formulae with the roles hypothesis and conjecture are local, and
all others are global.
These defaults can be overridden by adding a subrole, e.g., axiom-$local,
conjecture-$global.
The introduction of new non-variable symbols should be recorded in a
<new_symbol_record> in the <useful_info> field of the
<inference_record> of a derived formula, or in the
<optional_info> field of the
<internal_source> of an introduced formula.
The new form will have the format
new_symbols(Type,[Symbols]).
The Type indicates the type of new symbol, with possible values
skolem, definition, other.
Symbols is a comma separated list of new symbols of that type.
The <useful_info> field can have multiple new_symbols() terms.
For examples,
ATP systems should check for clashes whenever they generate a new symbol, and whenever
they read input after any new symbols have been generated.
If you are interested in the history and motivations behind this convention, read the source
of this web page.
The TPTP Language BNF
TPTP Syntax
%----v9.0.0.0 (TPTP version.internal development number)
%----v9.0.0.1 Added <tff_quantifier>. Added <hash> (for epsilon terms) to <thf_quantifier> and
%---- <tff_quantifier>. Changed <tff_quantified_formula> to use <tff_quantifier>.
%----v9.0.0.2 Removed axiom_of_choice as an <intro_type>
%---- Renamed <inference_parents> to <parents>
%---- <inference_record> :== inference(<inference_rule>,<useful_info>,<parents>)
%---- Aligned introduced and creator with inferred
%---- <internal_source> :== introduced(<intro_type>,<useful_info>,<parents>)
%---- <creator_source> :== creator(<creator_name>,<useful_info>,<parents>)
%----v9.0.0.3 Moved <hash> to <fof_quantifier> so it's available in all languages.
%----v9.0.0.4 Removed ()s around <thf_defined_infix> and <thf_infix_unary>
%---- Removed ()s around <tff_defined_infix> and <tff_infix_unary>
%----v9.0.0.5 Fixed typo in <ntf_domain_type>
%----v9.0.0.6 Converted <source> and it's descendents to grammar rules. That required adding
%---- a grammar rule for <intro_type>.
%---- Changed <file_name> to <atomic_word>, to allow non-quoted filenames.
%----v9.0.0.7 Linearised <thf_formula_list>, <tff_arguments>, <fof_formula_tuple_list>,
%---- <parent_list>, <info_items>, <general_terms>.
%----v9.0.0.8 Replaced <null> by <nothing> to avoid conflicts in Java parsers
%----v9.0.0.9 Allow only <unsigned_integer> in a <name>
%----v9.0.0.10 Reordered numerics to make lex/yacc happy, including some renaming and factoring
%---- Reverted <name> to allow <integer>, because lex/yacc stuff hurts.
%----v9.0.0.11 Moved <hash> to <tff_quantifier> and set <thf_quantifier> to use <tff_quantifier>.
% This means epsilon terms are quantified formulae (yak) available in only TFF (really
% TXF, which is cool).
%----9.0.0.12 Changed <parent_details> to :<general_term>.
%--------------------------------------------------------------------------------------------------
%----README ... this header provides important meta- and usage information
%----
%----Intended uses of the various parts of the TPTP syntax are explained in the TPTP technical
%----manual, linked from www.tptp.org.
%----
%----Four kinds of separators are used, to indicate different types of rules:
%---- ::= is used for regular grammar rules, for syntactic parsing.
%---- :== is used for semantic grammar rules. These define specific values that make semantic
%---- sense when more general syntactic rules apply.
%---- ::- is used for rules that produce tokens.
%---- ::: is used for rules that define character classes used in the construction of tokens.
%----
%----White space may occur between any two tokens. White space is not specified in the grammar, but
%----there are some restrictions to ensure that the grammar is compatible with standard Prolog: a
%----<TPTP_file> should be readable with read/1.
%----
%----The syntax of comments is defined by the <comment> rule. Comments may occur between any two
%----tokens, but do not act as white space. Comments will normally be discarded at the lexical
%----level, but may be processed by systems that understand them (e.g., if the system comment
%----convention is followed).
%----
%----Multiple languages are defined. Depending on your need, you can implement just the one(s) you
%----need. The common rules for atoms, terms, etc, come after the definitions of the languages, and
%----mostly all needed for all the languages.
%----Top of Page-----------------------------------------------------------------------------------
%----Files. Empty file is OK.
<TPTP_file> ::= <TPTP_input>*
<TPTP_input> ::= <annotated_formula> | <include>
%----Types for THF and TFF
<type_constant> ::= <type_functor>
<type_functor> ::= <atomic_word>
<defined_type> ::= <atomic_defined_word>
<defined_type> :== $oType | $o | $iType | $i | $tType | $real | $rat | $int
%----$oType/$o is the Boolean type, i.e., the type of $true and $false.
%----$iType/$i is non-empty type of individuals, which may be finite or
%----infinite. $tType is the type of all types. $real is the type of <real>s.
%----$rat is the type of <rational>s. $int is the type of <signed_integer>s
%----and <unsigned_integer>s.
<system_type> :== <atomic_system_word>
%----Useful info fields
<optional_info> ::= ,<useful_info> | <nothing>
<useful_info> ::= <general_list>
<useful_info> :== [] | [<info_items>]
<info_items> :== <info_item><comma_info_item>*
<comma_info_items> :== ,<info_item>
<info_item> :== <formula_item> | <inference_item> | <general_function>
%----Useful info for formula records
<formula_item> :== <description_item> | <iquote_item>
<description_item> :== description(<atomic_word>)
<iquote_item> :== iquote(<atomic_word>)
%----<iquote_item>s are used for recording exactly what the system output about
%----the inference step. In the future it is planned to encode this information
%----in standardized forms as <parent_details> in each <inference_record>.
%----Useful info for inference records
<inference_item> :== <inference_status> | <assumptions_record> | <new_symbol_record> |
<refutation>
<inference_status> :== status(<status_value>) | <inference_info>
%----These are the success status values from the SZS ontology. The most
%----commonly used values are:
%---- thm - Every model of the parent formulae is a model of the inferred formula. Regular logical
%---- consequences.
%---- cth - Every model of the parent formulae is a model of the negation of the inferred formula.
%---- Used for negation of conjectures in FOF to CNF conversion.
%---- esa - There exists a model of the parent formulae iff there exists a model of the inferred
%---- formula. Used for Skolemization steps.
%----For the full hierarchy see the SZSOntology file distributed with the TPTP.
<status_value> :== suc | unp | sap | esa | sat | fsa | thm | eqv | tac | wec | eth | tau |
wtc | wth | cax | sca | tca | wca | cup | csp | ecs | csa | cth | ceq |
unc | wcc | ect | fun | uns | wuc | wct | scc | uca | noc
%----<inference_info> is used to record standard information associated with an arbitrary inference
%----rule. The <inference_rule> is the same as the <inference_rule> of the <inference_record>. The
%----<atomic_word> indicates the information being recorded in the <general_list>. The
%----<atomic_word> are (loosely) set by TPTP conventions, and include esplit, sr_split, and
%----discharge.
<inference_info> :== <inference_rule>(<atomic_word>,<general_list>)
%----An <assumptions_record> lists the names of assumptions upon which this
%----inferred formula depends. These must be discharged in a completed proof.
<assumptions_record> :== assumptions([<name_list>])
%----A <refutation> record names a file in which the inference recorded here
%----is recorded as a proof by refutation.
<refutation> :== refutation(<file_source>)
%----A <new_symbol_record> provides information about a newly introduced symbol.
<new_symbol_record> :== new_symbols(<atomic_word>,[<new_symbol_list>])
<new_symbol_list> :== <principal_symbol> | <principal_symbol>,<new_symbol_list>
%----Principal symbols are predicates, functions, variables
<principal_symbol> :== <functor> | <variable>
%----General purpose
%----Integer names are expected to be unsigned, but lex stuff prevents this ..
%----<name> ::= <atomic_word> | <unsigned_integer>
<name> ::= <atomic_word> | <integer>
<atomic_word> ::= <lower_word> | <single_quoted>
%----<single_quoted>s are the enclosed <atomic_word> without the quotes. Therefore the <lower_word>
%----<atomic_word> cat and the <single_quoted> <atomic_word> 'cat' are the same, but <numbers>s and
%----<variable>s are not <lower_word>s, so 123' and 123, and 'X' and X, are different. Quotes can
%----be removed from a <single_quoted> <atomic_word>, and is recommended if doing so produces a
%----<lower_word> <atomic_word>.
<atomic_defined_word> ::= <dollar_word>
<atomic_system_word> ::= <dollar_dollar_word>
<number> ::= <integer> | <rational> | <real>
%----Numbers are always interpreted as themselves, and are thus implicitly
%----distinct if they have different values, e.g., 1 != 2 is an implicit axiom.
%----All numbers are base 10 at the moment.
<file_name> ::= <atomic_word>
<nothing> ::=
%----Top of Page-----------------------------------------------------------------------------------
%----Rules from here on down are for defining tokens (terminal symbols) of the grammar, assuming
%----they will be recognized by a lexical scanner.
%----A ::- rule defines a token, a ::: rule defines a macro that is not a token. Usual regexp
%----notation is used. Single characters are always placed in []s to disable any special meanings
%----(for uniformity this is done to all characters, not only those with special meanings).
%----These are tokens that appear in the syntax rules above. No rules defined here because they
%----appear explicitly in the syntax rules, except that <vline>, <star>, <plus>, <hash> denote
%----"|", "*", "+", "#", respectively.
%----Keywords: thf tff fof cnf include
%----Punctuation: ( ) , . [ ] :
%----Operators: ! ? ~ & | <=> => <= <~> ~| ~& * +
%----Predicates: = != $true $false $arithmetic_stuff
<single_quoted> ::- <single_quote><sq_char><sq_char>*<single_quote>
%----<single_quoted>s contain visible characters. \ is the escape character for ' and \, i.e.,
%----\' is not the end of the <single_quoted>. The token does not include the outer quotes, e.g.,
%----'cat' and cat are the same. See <atomic_word> for information about stripping the quotes.
<distinct_object> ::- <double_quote><do_char>*<double_quote>
%---Space and visible characters upto ~, except " and \ distinct_object>s contain visible
%----characters. \ is the escape character for " and \, i.e., \" is not the end of the
%----<distinct_object>. <distinct_object>s are different from (but may be equal to) other tokens,
%----e.g., "cat" is different from 'cat' and cat. Distinct objects are always interpreted as
%----themselves, so if they are different they are unequal, e.g., "Apple" != "Microsoft" is
%----implicit.
%----Tokens used in syntax, and cannot be character classes
<vline> ::: [|]
<star> ::: [*]
<plus> ::- [+]
<arrow> ::- [>]
<less_sign> ::- [<]
<hash> ::- [#]
%----For lex/yacc there cannot be spaces on either side of the | here
<comment> ::- <comment_line>|<comment_block>
<comment_line> ::- <percentage_sign><printable_char>*
<comment_block> ::: {slash_char}{star}<not_star_slash>{star}{star}*{slash_char}
<not_star_slash> ::: ([^*]*[*][*]*[^/*])*[^*]*
%----Defined comments are a convention used for annotations that are used as additional input for
%----systems. They look like comments, but start with %$ or /*$. A wily user of the syntax can
%----notice the $ and extract information from the "comment" and pass that on as input to the
%----system. They are analogous to pragmas in programming languages. To extract these separately
%----from regular comments, the rules are:
%---- <defined_comment> ::- <def_comment_line>|<def_comment_block>
%---- <def_comment_line> ::: [%]<dollar><printable_char>*
%---- <def_comment_block> ::: [/][*]<dollar><not_star_slash>[*][*]*[/]
%----A string that matches both <defined_comment> and <comment> should be recognized as
%----<defined_comment>, so put these before <comment>. Defined comments that are in use include:
%---- TO BE ANNOUNCED
%----System comments are a convention used for annotations that may used as additional input to a
%----specific system. They look like comments, but start with %$$ or /*$$. A wily user of the
%----syntax can notice the $$ and extract information from the "comment" and pass that on as input
%----to the system. The specific system for which the information is intended should be identified
%----after the $$, e.g., /*$$Otter 3.3: Demodulator */ To extract these separately from regular
%----comments, the rules are:
%---- <system_comment> ::- <sys_comment_line>|<sys_comment_block>
%---- <sys_comment_line> ::: [%]<dollar><dollar><printable_char>*
%---- <sys_comment_block> ::: [/][*]<dollar><dollar><not_star_slash>[*][*]*[/]
%----A string that matches both <system_comment> and <defined_comment> should
%----be recognized as <system_comment>, so put these before <defined_comment>.