Recording Variable Bindings in TSTP Proofs

There are several possible uses for explicitly recorded variable bindings in derivations, including derivation verification and answer extraction. The TPTP language defines a format for recording derivations, but it does not yet include a standard format for recording variable bindings. The inference record of an annotated formula is the place where details of the inference that produced the formula are record. The original concept for an inference record's arguments was that

the first argument records the name of the inference rule,
the second argument records information about the inference as a whole, and
the third argument records information about the parents.

As variable bindings are associated with a individual parents, that indicates that variable binding information should be stored in the third argument of inference records. The third argument is a list of parents and associated information. As multiple items of associated information are possible, the associated information is stored in a []ed list. One example of a way to store variable bindings in those lists is

cnf(3,plain,q(b) | r(a),
    inference(resolution,
              [status(thm)],
              [ one : [bind(X,$fot(b))] ,
                two : [bind(Y,$fot(a))] ] )  ).

recording that the variable X of the parent named one is bound to the first-order term b (the $fot functor is used to indicate that the informational data is logical, rather than an "arbitrary Prolog term"), and the variable Y of the parent named two is bound to the first-order term a.

While the recording standard (or similar) described above is already possible in the TPTP syntax, and various implemented parsers would accept such recording, it does add complexity to the parent list. The Metis system, which aims to be TPTP compliant, currently records variable bindings in the second arguments of its inference records, e.g.,

cnf(3,plain,q(b) | r(a),
    inference(resolution,
              [status(thm), bind(1,X,$fot(b)), bind(2,Y,$fot(a))],
              [one, two])  ).

This alternative simplifies the parsing of the parents list (there are probably some parsers who cannot parse more than simple names in the parents list), without adding complexity to the parsing of the information about the inference as a whole. However, duplicating the parents makes for unpleasant error-checking, as there are a lot of cases (missing bindings (but is that illegal?, can I have some but not all?), inconsistent (different parents in binding than in the inference), extra bindings not associated with any parent, etc.). While some of these questions have obvious answers (e.g., missing bindings are legal as not all variables have to get bound in an inference), different parents in binding than in the inference is a problem and you know only too late.