Verifying LLM Answers with ShZZaM and ATP

Abstract

This talk describes a process for verifying the answer produced by an LLM by (i) translating the question to logic using the ShZZaM tool, (ii) answering the logic question using ATP, and (iii) comparing the LLM's answer to the ATP answer.