Higher-order Theorems | Satallax 3.2 | Leo‑III 1.1 | Satallax 3.0 | LEO‑II 1.7.0 | Zipperpin 1.1 | Isabelle 2016 |
---|---|---|---|---|---|---|
Solved/500 | 430/500 | 382/500 | 382/500 | 305/500 | 179/500 | 387/500 |
Av. CPU Time | 22.86 | 15.43 | 13.63 | 11.10 | 3.81 | 67.12 |
Av. WC Time | 22.84 | 6.71 | 13.63 | 11.14 | 3.83 | 53.27 |
Solutions | 430/500 | 382/500 | 375/500 | 301/500 | 179/500 | 0/500 |
μEfficiency | 444 | 229 | 526 | 493 | 315 | 34 |
μWCEfficiency | 466 | 102 | 526 | 493 | 315 | 18 |
SOTAC | 0.25 | 0.22 | 0.23 | 0.21 | 0.20 | 0.25 |
Core Usage | 1.05 | 2.60 | 0.99 | 0.71 | 0.93 | 1.59 |
New Solved | 1/3 | 2/3 | 1/3 | 0/3 | 2/3 | 3/3 |

Typed First-order Theorems +*-/ | Vampire 4.1 | Vampire 4.2 | CVC4 ARI‑1.5.2 | Princess 170717 | Zipperpin 1.1 |
---|---|---|---|---|---|
Solved/250 | 194/250 | 191/250 | 188/250 | 130/250 | 39/250 |
Av. CPU Time | 28.09 | 11.11 | 19.94 | 16.21 | 19.71 |
Av. WC Time | 27.99 | 11.09 | 20.61 | 7.32 | 19.72 |
Solutions | 194/250 | 191/250 | 188/250 | 115/250 | 39/250 |
μEfficiency | 481 | 512 | 461 | 278 | 95 |
μWCEfficiency | 481 | 513 | 461 | 157 | 95 |
SOTAC | 0.35 | 0.33 | 0.34 | 0.28 | 0.31 |
Core Usage | 0.77 | 0.74 | 1.17 | 2.48 | 1.10 |
New Solved | 4/6 | 6/6 | 6/6 | 5/6 | 0/6 |

First-order Theorems | Vampire 4.2 | Vampire 4.0 | E 2.1 | CVC4 NAR‑1.5.2 | iProver 2.6 | Leo‑III 1.1 | lean‑nanoCoP 1.0 | Zipperpin 1.1 | Prover9 1109a | iProverMo 2.5‑0.1 | Scavenger EP‑0.2 |
---|---|---|---|---|---|---|---|---|---|---|---|
Solved/500 | 452/500 | 444/500 | 381/500 | 327/500 | 283/500 | 211/500 | 186/500 | 154/500 | 140/500 | 99/500 | 71/500 |
Av. CPU Time | 12.65 | 15.79 | 20.44 | 27.70 | 25.43 | 38.71 | 44.74 | 23.38 | 25.99 | 26.68 | 80.98 |
Av. WC Time | 12.66 | 15.77 | 20.49 | 28.95 | 25.37 | 15.49 | 44.51 | 23.41 | 26.05 | 26.75 | 58.73 |
Solutions | 452/500 | 440/500 | 381/500 | 327/500 | 279/500 | 211/500 | 186/500 | 154/500 | 138/500 | 99/500 | 71/500 |
μEfficiency | 548 | 550 | 448 | 358 | 245 | 82 | 82 | 163 | 164 | 118 | 10 |
μWCEfficiency | 555 | 551 | 448 | 357 | 245 | 34 | 148 | 163 | 165 | 118 | 5 |
SOTAC | 0.22 | 0.21 | 0.19 | 0.17 | 0.16 | 0.14 | 0.13 | 0.12 | 0.15 | 0.11 | 0.10 |
Core Usage | 0.79 | 0.87 | 0.82 | 0.81 | 0.86 | 2.75 | 0.74 | 0.83 | 0.76 | 0.73 | 1.72 |
New Solved | 23/36 | 27/36 | 12/36 | 19/36 | 5/36 | 18/36 | 4/36 | 2/36 | 6/36 | 0/36 | 0/36 |

First-order Non-theorems | Vampire SAT‑4.1 | Vampire SAT‑4.2 | iProver SAT‑2.6 | CVC4 SNA‑1.5.2 | E FNT‑2.1 | Scavenger EP‑0.2 |
---|---|---|---|---|---|---|
Solved/250 | 219/250 | 217/250 | 175/250 | 136/250 | 85/250 | 12/250 |
Av. CPU Time | 23.89 | 12.14 | 17.73 | 19.53 | 7.57 | 19.96 |
Av. WC Time | 23.79 | 12.11 | 6.13 | 19.73 | 7.61 | 8.78 |
Solutions | 217/250 | 204/250 | 175/250 | 136/250 | 85/250 | 12/250 |
μEfficiency | 503 | 598 | 497 | 419 | 244 | 11 |
μWCEfficiency | 503 | 598 | 410 | 419 | 245 | 6 |
SOTAC | 0.29 | 0.28 | 0.24 | 0.23 | 0.24 | 0.17 |
Core Usage | 0.90 | 0.78 | 1.62 | 0.86 | 0.95 | 2.29 |
New Solved | 3/4 | 3/4 | 3/4 | 3/4 | 0/4 | 0/4 |

Effectively Propositional CNF | iProver 2.6 | iProver 2.5 | Vampire 4.2 | E 2.1 | Scavenger EP‑0.1 | Scavenger EP‑0.2 |
---|---|---|---|---|---|---|
Solved/200 | 174/200 | 171/200 | 168/200 | 53/200 | 5/200 | 4/200 |
Av. CPU Time | 34.58 | 32.72 | 21.61 | 29.75 | 64.49 | 23.48 |
Av. WC Time | 34.59 | 32.73 | 21.57 | 29.81 | 58.56 | 18.57 |
Solutions | 172/200 | 166/200 | 167/200 | 53/200 | 5/200 | 4/200 |
μEfficiency | 231 | 246 | 277 | 152 | 3 | 2 |
μWCEfficiency | 231 | 246 | 277 | 153 | 2 | 2 |
SOTAC | 0.33 | 0.33 | 0.35 | 0.25 | 0.18 | 0.17 |
Core Usage | 0.91 | 0.92 | 0.88 | 0.78 | 1.40 | 1.52 |
New Solved | 0/0 | 0/0 | 0/0 | 0/0 | 0/0 | 0/0 |

SLedgeHammer Theorems | Vampire SLH‑4.2 | CVC4 SLH‑1.5.2 | ET 2.0 | E SLH‑2.1 | Leo‑III SLH‑1.1 | iProver SLH‑2.6 | Zipperpin SLH‑1.1 | iProverMo 2.5‑0.1 |
---|---|---|---|---|---|---|---|---|
Solved/2000 | 1433/2000 | 1364/2000 | 1328/2000 | 1185/2000 | 652/2000 | 519/2000 | 472/2000 | 320/2000 |
Av. CPU Time | 13.32 | 3.52 | 0.33 | 15.57 | 31.88 | 23.39 | 7.19 | 11.31 |
Av. WC Time | 3.47 | 3.70 | 3.48 | 3.99 | 10.26 | 5.89 | 7.22 | 6.11 |
Solutions | 1425/2000 | 1364/2000 | 1325/2000 | 1183/2000 | 652/2000 | 519/2000 | 472/2000 | 320/2000 |
μEfficiency | 227 | 478 | 654 | 155 | 13 | 86 | 81 | 39 |
μWCEfficiency | 406 | 478 | 250 | 350 | 39 | 131 | 81 | 61 |
SOTAC | 0.28 | 0.32 | 0.25 | 0.27 | 0.20 | 0.20 | 0.19 | 0.17 |
Core Usage | 3.14 | 0.75 | 0.14 | 3.51 | 3.11 | 3.06 | 0.89 | 1.79 |
New Solved | 0/0 | 0/0 | 0/0 | 0/0 | 0/0 | 0/0 | 0/0 | 0/0 |

Large Theory Batch Problems | Vampire LTB‑4.0 | Vampire LTB‑4.2 | MaLARea 0.6 | iProver LTB‑2.6 | E LTB‑2.1 |
---|---|---|---|---|---|
Solved/1500 | 1156/1500 | 1144/1486 | 1131/1500 | 777/1499 | 683/1499 |
Av. CPU Time | 17.50 | 26.16 | 18.28 | 176.45 | 45.09 |
Av. WC Time | 11.51 | 20.73 | 5.20 | 44.33 | 18.14 |
Solutions | 1156/1500 | 1144/1486 | 1131/1500 | 777/1499 | 683/1499 |
μEfficiency | 73 | 58 | 45 | 4 | 16 |
μWCEfficiency | 99 | 65 | 150 | 17 | 37 |
SOTAC | 0.25 | 0.25 | 0.25 | 0.21 | 0.21 |
Core Usage | 1.48 | 1.22 | 3.51 | 3.98 | 2.43 |
New Solved | 0/0 | 0/0 | 0/0 | 0/0 | 0/0 |

THF without Equality | Satallax 3.2 | Satallax 3.0 | Leo‑III 1.1 | LEO‑II 1.7.0 | Zipperpin 1.1 | Isabelle 2016 |
---|---|---|---|---|---|---|
Solved/100 | 74/100 | 65/100 | 58/100 | 44/100 | 37/100 | 65/100 |
Av. CPU Time | 16.43 | 8.11 | 10.61 | 8.69 | 8.09 | 57.33 |
Av. WC Time | 16.47 | 8.16 | 3.58 | 8.73 | 8.10 | 43.41 |
Solutions | 74/100 | 65/100 | 58/100 | 44/100 | 37/100 | 0/100 |
μEfficiency | 285 | 406 | 213 | 368 | 319 | 35 |
μWCEfficiency | 310 | 406 | 96 | 368 | 319 | 18 |
SOTAC | 0.27 | 0.26 | 0.23 | 0.19 | 0.20 | 0.28 |
Core Usage | 1.00 | 0.87 | 2.55 | 0.72 | 1.05 | 1.73 |
New Solved | 0/0 | 0/0 | 0/0 | 0/0 | 0/0 | 0/0 |

THF with Equality | Satallax 3.2 | Leo‑III 1.1 | Satallax 3.0 | LEO‑II 1.7.0 | Zipperpin 1.1 | Isabelle 2016 |
---|---|---|---|---|---|---|
Solved/400 | 356/400 | 324/400 | 317/400 | 261/400 | 142/400 | 322/400 |
Av. CPU Time | 24.19 | 16.30 | 14.76 | 11.50 | 2.69 | 69.10 |
Av. WC Time | 24.17 | 7.27 | 14.76 | 11.55 | 2.71 | 55.26 |
Solutions | 356/400 | 324/400 | 310/400 | 257/400 | 142/400 | 0/400 |
μEfficiency | 483 | 233 | 555 | 524 | 314 | 33 |
μWCEfficiency | 505 | 103 | 556 | 524 | 314 | 19 |
SOTAC | 0.24 | 0.22 | 0.23 | 0.22 | 0.19 | 0.24 |
Core Usage | 1.06 | 2.61 | 1.02 | 0.70 | 0.90 | 1.56 |
New Solved | 1/3 | 2/3 | 1/3 | 0/3 | 2/3 | 3/3 |

TFA using Integers | Vampire 4.1 | Vampire 4.2 | CVC4 ARI‑1.5.2 | Princess 170717 | Zipperpin 1.1 |
---|---|---|---|---|---|
Solved/150 | 109/150 | 96/150 | 92/150 | 64/150 | 27/150 |
Av. CPU Time | 49.14 | 21.83 | 39.08 | 25.19 | 28.47 |
Av. WC Time | 48.95 | 21.78 | 40.29 | 11.81 | 28.49 |
Solutions | 109/150 | 96/150 | 92/150 | 50/150 | 27/150 |
μEfficiency | 260 | 242 | 184 | 226 | 78 |
μWCEfficiency | 260 | 243 | 184 | 107 | 78 |
SOTAC | 0.42 | 0.36 | 0.37 | 0.32 | 0.35 |
Core Usage | 0.89 | 0.88 | 0.98 | 2.45 | 1.04 |
New Solved | 0/0 | 0/0 | 0/0 | 0/0 | 0/0 |

TFA using Rationals | CVC4 ARI‑1.5.2 | Vampire 4.2 | Princess 170717 | Vampire 4.1 | Zipperpin 1.1 |
---|---|---|---|---|---|
Solved/15 | 15/15 | 15/15 | 15/15 | 14/15 | 12/15 |
Av. CPU Time | 0.02 | 0.07 | 6.61 | 0.03 | 0.01 |
Av. WC Time | 0.01 | 0.09 | 2.24 | 0.05 | 0.01 |
Solutions | 15/15 | 15/15 | 15/15 | 14/15 | 12/15 |
μEfficiency | 1000 | 1000 | 500 | 933 | 800 |
μWCEfficiency | 1000 | 1000 | 296 | 933 | 800 |
SOTAC | 0.21 | 0.21 | 0.21 | 0.21 | 0.20 |
Core Usage | 1.34 | 0.63 | 2.53 | 0.66 | 1.25 |
New Solved | 0/0 | 0/0 | 0/0 | 0/0 | 0/0 |

TFA using Reals | CVC4 ARI‑1.5.2 | Vampire 4.2 | Vampire 4.1 | Princess 170717 | Zipperpin 1.1 |
---|---|---|---|---|---|
Solved/85 | 81/85 | 80/85 | 71/85 | 51/85 | 0/85 |
Av. CPU Time | 1.90 | 0.30 | 1.30 | 7.77 | - |
Av. WC Time | 2.08 | 0.33 | 1.32 | 3.16 | - |
Solutions | 81/85 | 80/85 | 71/85 | 50/85 | 0/85 |
μEfficiency | 853 | 903 | 792 | 330 | - |
μWCEfficiency | 853 | 903 | 792 | 220 | - |
SOTAC | 0.33 | 0.32 | 0.28 | 0.26 | - |
Core Usage | 1.35 | 0.59 | 0.62 | 2.50 | - |
New Solved | 6/6 | 6/6 | 4/6 | 5/6 | 0/6 |

FOF Theorems without Equality | Vampire 4.2 | Vampire 4.0 | iProver 2.6 | E 2.1 | CVC4 NAR‑1.5.2 | lean‑nanoCoP 1.0 | Leo‑III 1.1 | Zipperpin 1.1 | iProverMo 2.5‑0.1 | Scavenger EP‑0.2 | Prover9 1109a |
---|---|---|---|---|---|---|---|---|---|---|---|
Solved/200 | 186/200 | 184/200 | 180/200 | 169/200 | 157/200 | 131/200 | 99/200 | 95/200 | 90/200 | 69/200 | 60/200 |
Av. CPU Time | 4.52 | 5.40 | 14.63 | 7.70 | 19.84 | 45.66 | 30.97 | 12.25 | 24.25 | 82.52 | 4.43 |
Av. WC Time | 4.57 | 5.40 | 14.65 | 7.77 | 20.26 | 45.43 | 12.89 | 12.28 | 24.31 | 60.07 | 4.52 |
Solutions | 186/200 | 182/200 | 178/200 | 169/200 | 157/200 | 131/200 | 99/200 | 95/200 | 90/200 | 69/200 | 58/200 |
μEfficiency | 773 | 712 | 484 | 652 | 400 | 159 | 100 | 273 | 283 | 24 | 241 |
μWCEfficiency | 773 | 712 | 484 | 652 | 400 | 290 | 43 | 273 | 284 | 13 | 241 |
SOTAC | 0.16 | 0.15 | 0.16 | 0.14 | 0.14 | 0.12 | 0.11 | 0.11 | 0.11 | 0.10 | 0.11 |
Core Usage | 0.64 | 0.92 | 0.81 | 0.71 | 0.87 | 0.71 | 2.68 | 0.83 | 0.70 | 1.71 | 0.63 |
New Solved | 2/2 | 2/2 | 2/2 | 2/2 | 2/2 | 2/2 | 2/2 | 0/2 | 0/2 | 0/2 | 0/2 |

FOF Theorems with Equality | Vampire 4.2 | Vampire 4.0 | E 2.1 | CVC4 NAR‑1.5.2 | Leo‑III 1.1 | iProver 2.6 | Prover9 1109a | Zipperpin 1.1 | lean‑nanoCoP 1.0 | iProverMo 2.5‑0.1 | Scavenger EP‑0.2 |
---|---|---|---|---|---|---|---|---|---|---|---|
Solved/300 | 266/300 | 260/300 | 212/300 | 170/300 | 112/300 | 103/300 | 80/300 | 59/300 | 55/300 | 9/300 | 2/300 |
Av. CPU Time | 18.34 | 23.14 | 30.61 | 34.96 | 45.56 | 44.29 | 42.16 | 41.30 | 42.57 | 51.02 | 28.02 |
Av. WC Time | 18.32 | 23.10 | 30.64 | 36.98 | 17.78 | 44.10 | 42.20 | 41.34 | 42.34 | 51.19 | 12.52 |
Solutions | 266/300 | 258/300 | 212/300 | 170/300 | 112/300 | 101/300 | 80/300 | 59/300 | 55/300 | 9/300 | 2/300 |
μEfficiency | 397 | 442 | 312 | 330 | 70 | 87 | 112 | 90 | 30 | 8 | 1 |
μWCEfficiency | 410 | 443 | 311 | 328 | 28 | 86 | 113 | 90 | 52 | 8 | 0 |
SOTAC | 0.26 | 0.26 | 0.23 | 0.20 | 0.17 | 0.16 | 0.19 | 0.14 | 0.16 | 0.12 | 0.12 |
Core Usage | 0.89 | 0.83 | 0.90 | 0.76 | 2.82 | 0.95 | 0.86 | 0.84 | 0.81 | 0.95 | 2.25 |
New Solved | 21/34 | 25/34 | 10/34 | 17/34 | 16/34 | 3/34 | 6/34 | 2/34 | 2/34 | 0/34 | 0/34 |

FOF Non-theorems without Equality | Vampire SAT‑4.1 | iProver SAT‑2.6 | Vampire SAT‑4.2 | E FNT‑2.1 | CVC4 SNA‑1.5.2 | Scavenger EP‑0.2 |
---|---|---|---|---|---|---|
Solved/100 | 98/100 | 87/100 | 96/100 | 65/100 | 63/100 | 12/100 |
Av. CPU Time | 24.43 | 16.71 | 21.50 | 3.11 | 30.67 | 19.96 |
Av. WC Time | 24.32 | 5.60 | 21.41 | 3.12 | 31.04 | 8.78 |
Solutions | 96/100 | 87/100 | 84/100 | 65/100 | 63/100 | 12/100 |
μEfficiency | 463 | 598 | 590 | 453 | 475 | 28 |
μWCEfficiency | 463 | 500 | 590 | 451 | 475 | 15 |
SOTAC | 0.25 | 0.23 | 0.24 | 0.22 | 0.21 | 0.17 |
Core Usage | 0.83 | 1.63 | 0.80 | 1.05 | 0.93 | 2.29 |
New Solved | 3/4 | 3/4 | 3/4 | 0/4 | 3/4 | 0/4 |

FOF Non-theorems with Equality | Vampire SAT‑4.1 | Vampire SAT‑4.2 | iProver SAT‑2.6 | CVC4 SNA‑1.5.2 | E FNT‑2.1 | Scavenger EP‑0.2 |
---|---|---|---|---|---|---|
Solved/150 | 121/150 | 121/150 | 88/150 | 73/150 | 20/150 | 0/150 |
Av. CPU Time | 23.44 | 4.72 | 18.73 | 9.91 | 22.09 | - |
Av. WC Time | 23.36 | 4.73 | 6.66 | 9.98 | 22.19 | - |
Solutions | 121/150 | 120/150 | 88/150 | 73/150 | 20/150 | 0/150 |
μEfficiency | 530 | 603 | 430 | 382 | 104 | - |
μWCEfficiency | 530 | 603 | 350 | 382 | 107 | - |
SOTAC | 0.32 | 0.32 | 0.25 | 0.24 | 0.28 | - |
Core Usage | 0.96 | 0.77 | 1.60 | 0.80 | 0.65 | - |
New Solved | 0/0 | 0/0 | 0/0 | 0/0 | 0/0 | 0/0 |

EPR Unsatisfiable CNF | Vampire 4.2 | iProver 2.6 | iProver 2.5 | E 2.1 | Scavenger EP‑0.1 | Scavenger EP‑0.2 |
---|---|---|---|---|---|---|
Solved/150 | 131/150 | 131/150 | 128/150 | 33/150 | 4/150 | 3/150 |
Av. CPU Time | 27.21 | 38.30 | 39.64 | 27.01 | 79.57 | 29.72 |
Av. WC Time | 27.14 | 38.30 | 39.64 | 27.07 | 72.68 | 24.00 |
Solutions | 131/150 | 129/150 | 123/150 | 33/150 | 4/150 | 3/150 |
μEfficiency | 177 | 118 | 140 | 126 | 1 | 1 |
μWCEfficiency | 180 | 118 | 140 | 126 | 1 | 1 |
SOTAC | 0.37 | 0.34 | 0.33 | 0.24 | 0.17 | 0.17 |
Core Usage | 0.95 | 0.99 | 1.00 | 0.76 | 1.24 | 1.33 |
New Solved | 0/0 | 0/0 | 0/0 | 0/0 | 0/0 | 0/0 |

EPR Satisfiable CNF | iProver 2.5 | iProver 2.6 | Vampire 4.2 | E 2.1 | Scavenger EP‑0.1 | Scavenger EP‑0.2 |
---|---|---|---|---|---|---|
Solved/50 | 43/50 | 43/50 | 37/50 | 20/50 | 1/50 | 1/50 |
Av. CPU Time | 12.13 | 23.21 | 1.80 | 34.27 | 4.16 | 4.76 |
Av. WC Time | 12.17 | 23.26 | 1.85 | 34.34 | 2.05 | 2.28 |
Solutions | 43/50 | 43/50 | 36/50 | 20/50 | 1/50 | 1/50 |
μEfficiency | 564 | 572 | 576 | 233 | 7 | 7 |
μWCEfficiency | 564 | 572 | 566 | 233 | 4 | 4 |
SOTAC | 0.31 | 0.31 | 0.28 | 0.25 | 0.20 | 0.20 |
Core Usage | 0.71 | 0.69 | 0.63 | 0.82 | 2.03 | 2.09 |
New Solved | 0/0 | 0/0 | 0/0 | 0/0 | 0/0 | 0/0 |

LTB CakeML Theorems | Vampire LTB‑4.0 | Vampire LTB‑4.2 | MaLARea 0.6 | iProver LTB‑2.6 | E LTB‑2.1 |
---|---|---|---|---|---|
Solved/1500 | 1156/1500 | 1144/1486 | 1131/1500 | 777/1499 | 683/1499 |
Av. CPU Time | 17.50 | 26.16 | 18.28 | 176.45 | 45.09 |
Av. WC Time | 11.51 | 20.73 | 5.20 | 44.33 | 18.14 |
Solutions | 1156/1500 | 1144/1486 | 1131/1500 | 777/1499 | 683/1499 |
μEfficiency | 73 | 58 | 45 | 4 | 16 |
μWCEfficiency | 99 | 65 | 150 | 17 | 37 |
SOTAC | 0.25 | 0.25 | 0.25 | 0.21 | 0.21 |
Core Usage | 1.48 | 1.22 | 3.51 | 3.98 | 2.43 |
New Solved | 0/0 | 0/0 | 0/0 | 0/0 | 0/0 |