The CADE-17 ATP System Competition

Semantic Division

Motivations

Historically, the idea of semantic divisions arose from discussions about how to avoid skewing effects that were being observed in the syntactically defined divisions and categories. The skewing came about due to the presence of groups of very similar problems in the TPTP, which led to their dominance in certain divisions and categories when problems were randomly selected from those eligible to be used. Semantic divisions were proposed as a solution to this problem, because a semantic division would contain only very similar problems, so skewing would not be possible. The skewing has since been prevented by a mechanism that limits the number of very similar problems in any division or category, but other benefits of having semantic divisions were noted, and these have motivated the addition of the SEM division to CASC. These motivations and benefits are listed below.

Various ATP systems and techniques have been observed to be particularly well suited to problems with certain syntactic characteristics. Due to this specialization, empirical evaluation of ATP systems must be done in the context of problems that are reasonably homogeneous with respect to the systems. This motivates the syntactically defined divisions and categories of CASC. Evaluation of ATP systems within such divisions and categories makes it possible to say which systems work well for what types of problems. This identifies specialist capabilities of ATP systems, while general capabilities can be inferred from the separate division and category capabilities. Problems based on a specified encoding of a chosen semantic domain can reasonably be expected to be very homogeneous. The SEM division, which uses such problems, thus extends the specialist evaluation provided by the syntactically defined divisions and categories.
For CASC-16, some entrants developed automated techniques for tuning their ATP systems to specified problem sets. The SEM division in CASC provides an opportunity for researchers to demonstrate and evaluate how well their systems can be tuned to the specified encodings of the semantic domains. This has the potential to be very useful in applications, where a generic prover can be quickly adapted to suit the problems to be solved. Such tuning also leads to the development of specialist systems for the type of problem.
The SEM division will raise the profile of ATP in the communities that have direct interest in the semantic domain. This is good for ATP and CASC.

Design Considerations

Syntactic classification of problems makes it easy to specify divisions and categories that cover all problems. In contrast, it would be very difficult to specify semantic divisions to cover all problems, and too many (in the sense of CASC) semantic divisions would be needed to cover a broad range of problems. Evaluation of ATP systems in a particular semantic domain is, however, interesting with respect to that semantic domain. The SEM division will therefore focus on a single semantic domain in each CASC, and the semantic domain will be changed for each CASC.
Problems based on SET005-0 have been chosen for the CASC-17 SEM division because set theory is a non-controversial starting point; it is the basis of many things that people do. In the future, community input will be sought: if many people are working on specialist systems, or are able to tune their general purpose systems for a generally agreed upon encoding for a particular semantic domain, and that semantic domain is of sufficient interest, it will be considered for the SEM division.
The SEM division problems will not have their predicate and function symbols renamed; indeed, specific tuning to the formulae is expected. Similarly, the formulae will not be reordered. As is the case with all CASC divisions, tuning to individual problems is not allowed, i.e., systems can be tuned to only the (encoding of the) semantic domain.
Semantic divisions will typically (but not necessarily) be FOF, because that's typically the natural logic representation.