Answer Extraction for TPTP

by Geoff Sutcliffe, Mark Stickel, Stephan Schulz, Josef Urban


Answer extraction is useful (and meaningful only) when there are existential quantifiers at the outermost level of a conjecture. In this situation the conjecture can be considered to a question for which an answer is required, in the form of the values that instantiate the existentially quantified variables. This proposal describes how such instantiations can be returned in a standard and useful way.

The new question role is introduced to identify conjectures for which the bindings for the outermost existentially quantified variables are wanted. For returning answers, four SZS answer forms are proposed:

In all forms the answer output lines must be preceded by an SZS status line.

The following problem, from the problem file ANS001+1, is used as the running example ...

    fof(abc,axiom,p(a,b,c)).
    fof(def,axiom,p(d,e,f)).
    fof(ghi_or_jkl,axiom,( p(g,h,i) | p(j,k,l) )).
    fof(xyz,question, ? [X,Y,Z] : p(X,Y,Z)).

The Question Form

Questions are written like conjectures, but using the question role instead of the conjecture role. The outermost existentially quantified variables are the ones that the user wants values for, i.e., their sets of instatiations are the answer.

Trick: If the user wants values for only a subset of the outermost existentially quantified variables, a dummy universal quantification can be inserted, e.g., if the question is ...

    fof(xyz,question, ? [X,Y,Z] : p(X,Y,Z)).
... and the user wants values for only X and Z, it can be written as ...
    fof(xyz,question, ? [X,Z] : ! [D] : ? [Y] : p(X,Y,Z)).

Answers

Answers are bindings for the outermost existentially quantified variables.


The Tuple Answer Form

The tuple form is ...
    SZS answers Tuple list_of_definite_answer_tuples_goes_here
... e.g., ...
    % SZS status Success for ANS001+1
    % SZS answers Tuple [[a,b,c],[d,e,f]] for ANS001+1
Disjunctive answers can also be expressed, e.g., ... Should this be allowed?
    % SZS status Success for ANS001+1
    % SZS answers Tuple [[a,b,c],[d,e,f],([g,h,i]|[j,k,l])] for ANS001+1
To indicate that there are (or might be) more answers than those lists, the tuple can be terminated in a Prolog-like way with |_, e.g., ...
    % SZS answers Tuple [[a,b,c],[d,e,f]|_] for ANS001+1
Lines may have a : to indicate the start of system/task specific information. For example:
    % SZS answers Tuple [[a,b,c],[d,e,f]] for ANS001+1 : Used OASys

The Instantiated Answer Form

It is expected that in many applications the answers returned will be used as logical data in a subsequent reasoning process. In that case it's convenient to have the data in a logical form, rather than a tuple. The instantiated form provides an atomic logical form ...
    % SZS answers Instantiated list_of_instantiated_formulae_goes_here
... e.g., ...
    % SZS status Success for ANS001+1
    % SZS answers Instantiated [p(a,b,c),p(d,e,f)] for ANS001+1
Disjunctive answers can also be expressed, e.g., ... Should this be allowed?
    % SZS status Success for ANS001+1
    % SZS answers Instantiated [p(a,b,c),p(d,e,f),(p(g,h,i) | p(j,k,l)] for ANS001+1
To indicate that there are (or might be) more answers than those lists, the tuple can be terminated in a Prolog-like way with |_, e.g., ...
    % SZS answers tuple for [p(a,b,c),p(d,e,f)|_] for ANS001+1
Lines may have a : to indicate the start of system/task specific information. For example:
    % SZS answers Instantiated [p(a,b,c),p(d,e,f)] for ANS001+1 : Used OASys

The InstantiatedFormulae Answer Form

The instantiated formulae form provides answers in the TPTP FOF format, so that all TPTP tools can work with the answers, and the answers can be used directly, e.g., as axioms, in subsequent reasoning processes. The instantiated formulae form is ...
    % SZS answers start InstantiatedFormulae for ANS001+1
    fof_encoded_answers_go_here
    % SZS answers end InstantiatedFormulae for ANS001+1
There can be multiple fof encoded answers, each of which is an answer to the question. For example,
    % SZS status Success for ANS001+1
    % SZS answers start InstantiatedFormulae for ANS001+1
    fof(1,answer,
        p(a,b,c),
        answer_to(xyz,[])).
    fof(2,answer,
        p(d,e,f),
        answer_to(xyz,[])).
    % SZS answers end InstantiatedFormulae for ANS001+1
Disjunctive answers can also be expressed, e.g., ... Should this be allowed?
    % SZS status Success for ANS001+1
    % SZS answers start InstantiatedFormulae for ANS001+1
    fof(3,answer,
        ( p(g,h,i) 
        | p(j,k,l) ),
        answer_to(xyz,[])).
    % SZS answers end InstantiatedFormulae for ANS001+1


The Formulae Form

The formulae form is ...
    % SZS answers start Formulae for ANS001+1
    fof_encoded_answers_go_here
    % SZS answers end Formulae for ANS001+1
There can be multiple fof encoded answers, each of which is an answer to the question. The fullest form of a fof encoded answer is ...
    fof(some_name,answer,
        quantification_from_question
            disjunction_of_conjunctions_of_binding_equations & the_question,
        answer_to(name_of_question,[])).
... e.g., ...
    % SZS status Success for ANS001+1
    % SZS answers start Formulae for ANS001+1
    fof(1,answer,
        ? [X,Y,Z] :
          ( X = a & Y = b & Z = c & p(X,Y,Z) ),
        answer_to(xyz,[])).
    fof(2,answer,
        ? [X,Y,Z] :
          ( X = d & Y = e & Z = f & p(X,Y,Z) ),
        answer_to(xyz,[])).
    fof(3,answer,
        ? [X,Y,Z] :
          ( ( ( X = g & Y = h & Z = i )
            | ( X = j & Y = k & Z = l ) )
          & p(X,Y,Z) ),
        answer_to(xyz,[])).
    % SZS answers end Formulae for ANS001+1
The []s in the answer_to term are for useful_info, in the TPTP syntax tradition.

It is possible that this long form asks too much of some systems, so it is tolerable to omit either the_question or the answer_to term, or both. Additionally, the answer variables may be renamed (but not reordered). For example, a minimal long form would be ...

    % SZS status Success for ANS001+1
    % SZS answers start Formulae for ANS001+1
    fof(1,answer, ? [U,V,W] : ( U = a & V = b & W = c )).
    fof(2,answer, ? [K,J,L] : ( K = d & J = e & L = f )).
    fof(3,answer, ? [Z,Y,X] : ( ( Z = g & Y = h & X = i )
                              | ( Z = j & Y = k & X = l ) )).
    % SZS answers end Formulae for ANS001+1

Multiple Questions

A TPTP problem file can have multiple conjectures, and the problem is solved when all conjectures have been proved. Similarly a TPTP problem file can have multiple questions, and the problem is solved when all questions have been answered. The answer_to provides differentiation between answers to different questions in the same problem file. For example, for the following problem from the problem file ANS002+1 ...
    fof(a,axiom,p(a)).
    fof(b,axiom,q(b)).
    fof(p,question,? [X] : p(X)).
    fof(q,question,? [X] : q(X)).
... the answer could be ...
    % SZS status Success for ANS002+1
    % SZS answers formulae start for ANS001+1
    fof(1,answer,? [X] : X = a & p(X),answer_to(p,[])).
    fof(2,answer,? [X] : X = b & q(X),answer_to(q,[])).
    % SZS answers formulae end for ANS001+1

Emails with Petra

Hi Geoff, Petra,

> On 16 Jun 2023, at 19:46, Geoff Sutcliffe  wrote:
>
> Hi Stephan (and Petra),
>
> At LPAR Petra and I talked about her use of question answering (this paper,
> if you are interested, but not necessary reading for this email) ...
>    https://easychair.org/publications/paper/BlrQ
> The upshot is a need for some funky builtin predicates. You might recall
> my previous foray into this topic with Arpana, and the resultant Multiple
> AnSwer EXtraction system she wrote in her MS work. Do you recall if I
> convinced you to provide something like $syntactically_different_from in
> E? What we seem to need now is $unify/2 that tests if two expressions unify,
> and Petra would also like $contains/2 that tests if a certain symbol occurs
> in an expression. I think $unify would be easy, right? What about $contains?
> This is a gentle step of consideration, not a request for implementation :-)

In principle, both would be quite simple to implement (unify is harder,
because for equational literals, you would need to do it modulo symmetry).

But this may be an X/Y-problem. What behaviour do you actually want, and
why?

> Cheers,
>
> Geoff
>
>>> 1. In the TPTP World answers may contain only variables and symbols from the
>>>   input formulae. That excludes, e.g., Skolems.

E will happily give you a Skolem. Otherwise, you can’t answer
?[X]:p(X)
under axiom
![X]:p(X)
...which seems counterproductive. You can always filter unwanted
answers after the fact (e.g. with grep -v).

>>> 2. I would like ATP systems to provide a builtin predicate $unify/2.
>>>   Notice $unify allows instantiation both ways.Some examples ...

I wonder if you cannot simply add the unification constraint as
an equational literal X!=f(Y).
Aha...more smart people have answered the above question ;-).

>>> No. The negative case breaks. For example ...
>>> Axioms
>>>    p(a)
>>>    p(b)
>>>    a = b
>>> Question
>>>    ? [X] : ( p(X) & X != a )
>>> No answer. But ...
>>> Question
>>>    ? [X] : ( p(X) & ~$unify(X,a) )
>>> Answer X == b


Examples that illustrate my need for $unify ...


>   Question
>       ? [X] : (p(X) & ? [Y] : $unify(X,f(Y)))
>   Axiom                    Answer
>       ! [Z] : p(Z)             Z
>       p(f(a))                  f(a)
>
>   Question
>       ? [X] : p(f(X))
>   Axiom                    Answer
>       ! [Z] : p(Z)             Z
>       p(f(a))                  a
>
>   Question
>       ? [X] : ( p(X) & ~$unify(X,a))
>   Axiom                    Answer
>       p(a) & p(b)              b
>
>   Question
>       ? [X] : p(X)
>   Axiom                    Answer
>       ? [Z] : p(Z)             no answer (Skolems not allowed)
>   This one is interesting - as a conjecture it's provable, but as a question
>   it's not answerable. I like that!


I discussed this with Sophie today, when she explained her interest in
abduction as a way to explain why a countermodel invalidates some claim.
Her example was an array A, then you switch the contents of A[i] and
A[j], then claim the new array is not equal to the old array. This is not
true if i = j, or A[i] = A[j], and that's what need to be found as the
"explanation". I recast is as a question of the form ...

Axioms: Arrays, Swap, array A, indices i, j.
Question: ? [X:$o] : (X => (A[i] != A[j]))

... and the ATP must synthesize X as (i != j) & (A[i] != A[j]). $unify
and $contains might be used to provide hints (e.g., $contains(X,!=)) or
contraints, e.g., ? [L:$o,R:$o] : $unify(X,L & R). Yay, another nice use
case - thanks Sophie!


My proposal would do it using $contains, e.g., ....
    fof(pel55,question,
        ? [X] :
          ( killed(X,agatha)
          & ~ $contains(X,killed)
          & ~ $contains(X,hates) ) ).
... or more positively ...
    fof(pel55,question,
        ? [X] :
          ( killed(X,agatha)
          & ( $unifies(X,agatha)
            | $unifies(X,butler)
            | $unifies(X,charles) ) ) ).

Your work shows that it can be done. Next to get agreement on the syntax
(quite easy), then convince ATP system developers to adopt it (not so
easy).