zero
, or
succ(N)
, where N is itself a natural number.
succ
could be read as "successor", N + 1, so that 0 is defined as zero
,
1 as succ(zero)
, 2 as succ(succ(zero))
, and so on.
In a similar way, lists can be defined as either the nil
list or a cons
cell containing an element and the tail of a list.
This is a natural example of a polymorphic datatype, as the element sort is not provided.
ADTs are often (but not always) self-referential, in the sense that larger terms are built from smaller terms of the same sort and an appropriate constructor.
Consider the natural numbers as defined above.
tff(nat_type, type-datatype, nat : $tType). tff(zero_type, type-datatype_constructor, zero : nat). tff(succ_type, type-datatype_constructor, succ : nat > nat).TODO consider shorter role names:
type-adt
and type-adtc
?
Here the type of natural numbers and two constructors are defined.
Note the use of the sub-roles datatype
and datatype_constructor
of the
existing type
role.
This indicates to the theorem prover that the type declaration should be considered an ADT or
an ADT constructor, respectively.
ADT testers and destructors must be (for now at least ... read more below) defined in the language itself, e.g.:
%----check for zero tff(is_zero_type,type, is_zero: nat > $o ). tff(zero_is_zero,axiom, is_zero(zero) ). tff(succ_is_not_zero,axiom, ! [X: nat] : ~ is_zero(succ(X)) ). %----check for successor tff(is_succ_type,type, is_succ: nat > $o ). tff(zero_is_not_succ,axiom, ~ is_succ(zero) ). tff(succ_is_succ,axiom, ! [X: nat] : is_succ(succ(X)) ). %----get the predecessor of a number, if it exists tff(pred_type,type, pred: nat > nat ). tff(pred_succ,axiom, ! [X: nat] : ( pred(succ(X)) = X ) ).
Note that pred
is a type
, not a
type-datatype_constructor
, i.e., it is a uninterpreted function, not part of the ADT.
At present testers and destructors are provided by the user, but a future proposal might add
defined predicates to add these automatically.
With this, it is possible to give a conjecture that is provable only with ADT semantics, e.g.,
tff(all_nats_zero_or_succ, conjecture, ![X : nat]: is_zero(X) | is_succ(X)).
This relies on the theorem prover reasoning that all natural numbers are either zero or the successor of a natural number. Similarly:
tff(no_cyclic_nats, conjecture, ~?[X : nat]: X = succ(X)). tff(succ_injective, conjecture, ![X : nat, Y : nat]: X != Y => succ(X) != succ(Y)).
These are not theorems without ADT semantics.
Mutually-recursive ADTs are allowed, but they require a particular order of declaration in order to avoid undeclared types. Consider enforcing colour alternation in red-black trees:
/* Corresponds to Standard ML datatype 'a red = rLeaf of 'a | rBranch of 'a black * 'a black and 'a black = bLeaf of 'a | bBranch of 'a red * 'a red */ %----declare types first tff(red_type,type-datatype, red: $tType > $tType ). tff(black_type,type-datatype, black: $tType > $tType ). %----types have been declared now, any order works tff(rLeaf_type,type-datatype_constructor, rLeaf: !>[A: $tType] : ( A > red(A) ) ). tff(rBranch_type,type-datatype_constructor, rBranch: !>[A: $tType] : ( ( black(A) * black(A) ) > red(A) ) ). tff(bLeaf_type,type-datatype_constructor, bLeaf: !>[A: $tType] : ( A > black(A) ) ). tff(bBranch_type,type-datatype_constructor, bBranch: !>[A: $tType] : ( ( red(A) * red(A) ) > black(A) ) ).Some conjectures require the theorem prover to reason inductively, e.g.,
tff(add_type,type, plus: ( nat * nat ) > nat ). tff(add_zero,axiom, ! [X: nat] : ( plus(X,zero) = X ) ). tff(add_succ,axiom, ! [X: nat,Y: nat] : ( plus(X,succ(Y)) = succ(plus(X,Y)) ) ). tff(add_commutative,conjecture, ! [X: nat,Y: nat] : ( plus(X,Y) = plus(Y,X) ) ).
It is possible to combine ADTs with other parts of TPTP such as arithmetic, e.g.,
tff(nat2int_type,type, nat2int: nat > $int ). tff(nat2int_zero,type, nat2int(zero) = 0 ). tff(nat2int_succ,type, ! [X: nat] : ( nat2int(succ(X)) = $sum(nat2int(X),1) ) ).TODO more examples with e.g. HOL, FOOL, non-classical?
type-datatype
Rolea_type
is introduced in the same way as an uninterpreted type,
i.e. a type
declaration that declares a $tType
, and extends it with
the datatype
subrole to signal the ADT semantics.
Systems should process this in the same way as a type
declaration, but internally
flag the type as an ADT.
For example:
tff(int_list_type, type-datatype, int_list : $tType).ADTs can be polymorphic, and this is also supported in TPTP by declaring a type constructor:
tff(list_type, type-datatype, list : $tType > $tType).
type-datatype_constructor
Roletype
declaration that declares the signature of the constructor, and extends it with
the datatype_constructor
subrole to signal the ADT semantics.
Systems should process this in the same way as a type
declaration, but internally
flag the symbol as an ADT constructor.
For example:
tff(int_nil_type, type-datatype_constructor, int_nil : int_list). tff(int_cons_type, type-datatype_constructor, int_cons : ( $int * int_list ) > int_list).Polymorphism works naturally:
tff(nil_type, type-datatype_constructor, nil : !>[A : $tType]: A > list(A)). tff(cons_type, type-datatype_constructor, cons : !>[A : $tType]: A * list(A) > list(A)).
The associated ADT for a constructor can be determined uniquely by the type constructor it maps
into.
The ADT should have been declared with type-datatype
, and systems should report
SZS status SemanticError
if it is not.
In the polymorphic case, it is malformed for a type-datatype_constructor
to map into
a type variable, and systems should likewise report SemanticError
on inputs of the
form:
tff(who_do_I_belong_to, type-datatype_constructor, confused : !>[X : $tType]: X).Type variables should generally be repeated verbatim in the constructors of polymorphic ADTs:
/* Below corresponds to Rocq Inductive sum (A B:Type) : Type := | inl : A -> sum A B | inr : B -> sum A B. */ %----OK tff(sum_type,type-datatype, sum: ( $tType * $tType ) > $tType ). tff(inl_type,type-datatype_constructor, inl: !>[A: $tType,B: $tType] : ( A > sum(A,B) ) ). tff(inr_type,type-datatype_constructor, inr: !>[A: $tType,B: $tType] : ( B > sum(A,B) ) ). %----probably not intended but permissible tff(sum_type_001,type-datatype, sum: ( $tType * $tType ) > $tType ). tff(inl_type_002,type-datatype_constructor, inl: !>[A: $tType,B: $tType] : ( B > sum(B,A) ) ). tff(inr_type_003,type-datatype_constructor, inr: !>[A: $tType,B: $tType] : ( A > sum(B,A) ) ).Renaming or instantiating sort variables in the range of a constructor is not necessarily an error (see Generalized ADTs below) and systems that do not support such deviations should report
SZS status Inappropriate
.
A datatype and its constructors must be declared contiguously (so that an ATP system knows when it has read all the constructors (because it reads a non-constructor)). Mutually recursive ADTs must be declared contiguously, first the datatypes, then all the datatype_constructors.
SZS status SemanticError
and a suitable message.
For example, both types in:
tff(void_type, type-datatype, void : $tType). tff(list_type, type-datatype, list : $tType > $tType). %----whoops, forgot nil tff(cons_type, type-datatype_constructor, cons : !>[A : $tType]: (A * list(A)) > list(A)).might (or might not) trigger an error. Users who create problems with ADTs should ensure that their ADTs are inhabited. Problems in the TPTP library declare only inhabited ADTs.
Problems can be imported from existing libraries, including TIP [CJ+15] and Vampire's inductive benchmarks. ITP systems are likely to produce ADT inputs if the language supports it.
Systems are presently more sparse. Vampire supports much of this proposal already, but will need adapting for the new syntax. Some SMT solvers support a theory of data types, including Z3 and cvc5.
Existing theorem provers might treat the new type
subroles as normal
type
declarations.
This will reduce the logical strength of the input, which might result in a
previously-complete system claiming that theorem is a non-theorem.
Systems affected by this could:
SZS status Inappropriate
on encountering one of the new
type
subroles; or
/* Following corresponds to the Standard ML datatype 'a expr = eVar of 'a | eApp of 'a expr * 'a expr | eLam of 'a -> 'a expr */ thf(expr_type,type-datatype, expr: $tType > $tType ). thf(eVar_type,type-datatype_constructor, eVar: !>[A: $tType] : ( A > expr(A) ) ). thf(eApp_type,type-datatype_constructor, eApp: !>[A: $tType] : ( ( expr(A) * expr(A) ) > expr(A) ) ). thf(eLam_type,type-datatype_constructor, eLam: !>[A: $tType] : ( ( A > expr(A) ) > expr(A) ) ).However, not all such datatypes are meaningful [Pau00] and some may introduce inconsistencies or surprising results. Consider
/* Below corresponds loosely to (illegal) Rocq Inductive dodgy := bites : (dodgy -> Prop) -> dodgy. */ thf(dodgy_type, type-datatype, dodgy : $tType). thf(dont_trust_it_bites, type-datatype-constructor, bites : (dodgy > $o) > dodgy).With the standard semantics for higher-order logic, this declaration alone has no models, as there would then be a bijection between dodgy and dodgy > $o. In the same spirit as sort inhabitation, users should not supply such declarations, and systems may behave strangely if they do. Systems may choose to check for these, perhaps implementing a condition such as strict positivity. Problems in the TPTP library declare only well-behaved higher-order datatypes.
Thanks to Simon Guilloud, Sankalp Gambhir and others at EPFL for raising this concern and producing the example!
tff(vec_type,type-datatype, vec: ( nat * $tType ) > $tType ). tff(nil_type,type-datatype_constructor, nil: !>[A: $tType] : vec(zero,A) ). tff(cons_type,type-datatype_constructor, cons: !>[N: nat,A: $tType] : ( ( A * vec(N,A) ) > vec(succ(N),A) ) ).TODO check this with Cezary?