Verifying LLM Answers with ShZZaM and ATP
Motivation
A
Chat with Bard
Logic
Saves the Day
LLM vs. ATP
ShZZaM
The ShZZaM
ZigZag Loops
ShZZaM in
Action
The
Implementation
Testing with various examples ... TPTP, SUMO, MATP, PWC, ...
Observations and Questions
LLMs know TPTP syntax!
Can
ShZZaM be trusted
?
Maybe not
!
Can
ATP be trusted
?
The
Cycle of Trust
The End ... Questions?