Higher-order Theorems | Satallax 3.2 | Satallax 3.3 | Leo‑III 1.3 | LEO‑II 1.7.0 |
---|---|---|---|---|
Solved/500 | 406/500 | 401/500 | 355/500 | 213/500 |
Av. CPU Time | 32.81 | 32.20 | 38.54 | 17.92 |
Av. WC Time | 32.81 | 32.13 | 16.95 | 17.96 |
Solutions | 406 81% | 401 80% | 355 71% | 209 41% |
μEfficiency | 266 | 262 | 134 | 290 |
μWCEfficiency | 287 | 286 | 59 | 290 |
SOTAC | 0.33 | 0.33 | 0.35 | 0.29 |
Core Usage | 0.99 | 0.97 | 2.62 | 0.76 |
New Solved | 0/0 | 0/0 | 0/0 | 0/0 |

Typed First-order Theorems +*-/ | Vampire 4.3 | Vampire 4.1 | CVC4 1.6pre | Princess 170717 |
---|---|---|---|---|
Solved/200 | 163/200 | 162/200 | 157/200 | 105/200 |
Av. CPU Time | 14.22 | 23.12 | 14.99 | 13.39 |
Av. WC Time | 14.24 | 23.08 | 15.49 | 6.50 |
Solutions | 163 81% | 162 81% | 157 78% | 92 46% |
μEfficiency | 539 | 491 | 613 | 311 |
μWCEfficiency | 541 | 494 | 613 | 177 |
SOTAC | 0.33 | 0.35 | 0.36 | 0.30 |
Core Usage | 0.67 | 0.75 | 0.61 | 2.24 |
New Solved | 0/0 | 0/0 | 0/0 | 0/0 |

First-order Theorems | Vampire 4.3 | Vampire 4.2 | CSE_E 1.0 | E 2.2pre | CVC4 1.6pre | Leo‑III 1.3 | iProver 2.8 | leanCoP 2.2 | nanoCoP 1.1 | CSE 1.1 | CSE 1.0 | Prover9 1109a | Twee 2.2 | Geo‑III 2018C |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
Solved/500 | 461/500 | 454/500 | 363/500 | 350/500 | 298/500 | 256/500 | 248/500 | 143/500 | 133/500 | 126/500 | 123/500 | 122/500 | 74/500 | 50/500 |
Av. CPU Time | 16.37 | 15.06 | 26.94 | 25.64 | 43.50 | 31.23 | 29.44 | 46.71 | 48.59 | 54.93 | 50.20 | 29.68 | 62.24 | 40.09 |
Av. WC Time | 16.36 | 15.06 | 26.89 | 25.69 | 45.68 | 11.88 | 29.22 | 46.37 | 48.13 | 55.05 | 50.28 | 29.74 | 62.30 | 40.13 |
Solutions | 461 92% | 454 90% | 362 72% | 350 70% | 298 59% | 256 51% | 247 49% | 143 28% | 133 26% | 126 25% | 123 24% | 122 24% | 73 14% | 50 10% |
μEfficiency | 483 | 473 | 333 | 339 | 232 | 94 | 167 | 56 | 55 | 61 | 64 | 113 | 50 | 21 |
μWCEfficiency | 485 | 479 | 331 | 342 | 232 | 39 | 167 | 101 | 95 | 66 | 65 | 113 | 50 | 21 |
SOTAC | 0.20 | 0.19 | 0.15 | 0.15 | 0.14 | 0.13 | 0.14 | 0.11 | 0.10 | 0.10 | 0.10 | 0.13 | 0.17 | 0.18 |
Core Usage | 0.83 | 0.83 | 0.95 | 0.84 | 0.82 | 2.67 | 0.92 | 0.81 | 0.81 | 0.97 | 0.99 | 0.87 | 0.87 | 0.95 |
New Solved | 2/3 | 2/3 | 1/3 | 1/3 | 0/3 | 1/3 | 0/3 | 0/3 | 0/3 | 0/3 | 0/3 | 0/3 | 0/3 | 0/3 |

First-order Non-theorems | Vampire SAT‑4.3 | Vampire SAT‑4.1 | iProver SAT‑2.8 | CVC4 SAT‑1.6pre | E FNT‑2.2pre | Geo‑III 2018C |
---|---|---|---|---|---|---|
Solved/200 | 191/200 | 188/200 | 137/200 | 116/200 | 38/200 | 38/200 |
Av. CPU Time | 44.98 | 48.22 | 27.90 | 29.52 | 21.16 | 23.54 |
Av. WC Time | 44.80 | 47.98 | 9.01 | 29.86 | 21.20 | 23.59 |
Solutions | 191 95% | 186 93% | 137 68% | 116 58% | 38 19% | 38 19% |
μEfficiency | 176 | 362 | 326 | 208 | 105 | 113 |
μWCEfficiency | 176 | 362 | 228 | 206 | 105 | 113 |
SOTAC | 0.31 | 0.30 | 0.24 | 0.23 | 0.25 | 0.20 |
Core Usage | 0.96 | 0.95 | 2.91 | 0.83 | 0.94 | 0.86 |
New Solved | 1/2 | 1/2 | 1/2 | 1/2 | 0/2 | 1/2 |

Effectively Propositional CNF | iProver 2.8 | Vampire 4.3 | iProver 2.6 | E 2.2pre | Leo‑III 1.3 | Geo‑III 2018C |
---|---|---|---|---|---|---|
Solved/150 | 133/150 | 128/150 | 126/150 | 27/150 | 17/150 | 10/150 |
Av. CPU Time | 29.19 | 45.77 | 37.52 | 47.62 | 124.13 | 41.73 |
Av. WC Time | 29.21 | 45.30 | 37.53 | 47.71 | 64.45 | 41.83 |
Solutions | 132/150 | 127/150 | 124/150 | 27/150 | 17/150 | 10/150 |
μEfficiency | 169 | 171 | 163 | 64 | 4 | 44 |
μWCEfficiency | 176 | 171 | 163 | 71 | 2 | 45 |
SOTAC | 0.34 | 0.36 | 0.32 | 0.23 | 0.24 | 0.23 |
Core Usage | 0.97 | 0.95 | 0.96 | 0.87 | 2.57 | 0.76 |
New Solved | 0/0 | 0/0 | 0/0 | 0/0 | 0/0 | 0/0 |

Large Theory Batch Problems | MaLARea 0.6 | Vampire LTB‑4.0 | Vampire LTB‑4.3 | iProver LTB‑2.8 | E LTB‑2.2pre | Grackle 0.1 |
---|---|---|---|---|---|---|
Solved/5000 | 876/5000 | 594/3553 | 757/5000 | 613/4999 | 458/4999 | 379/4893 |
Av. CPU Time | 16.10 | 10.93 | 6.02 | 55.93 | 7.36 | 46.39 |
Av. WC Time | 4.72 | 2.90 | 1.80 | 14.14 | 16.06 | 11.68 |
Solutions | 876 17% | 594 16% | 757 15% | 613 12% | 458 9% | 379 7% |
μEfficiency | 17 | 88 | 97 | 2 | 13 | 3 |
μWCEfficiency | 48 | 129 | 134 | 9 | 6 | 11 |
SOTAC | 0.35 | 0.24 | 0.25 | 0.21 | 0.19 | 0.19 |
Core Usage | 3.24 | 2.75 | 2.07 | 3.96 | 0.46 | 3.97 |
New Solved | 0/0 | 0/0 | 0/0 | 0/0 | 0/0 | 0/0 |

THF without Equality | Satallax 3.2 | Satallax 3.3 | Leo‑III 1.3 | LEO‑II 1.7.0 |
---|---|---|---|---|
Solved/100 | 72/100 | 70/100 | 56/100 | 37/100 |
Av. CPU Time | 28.11 | 28.86 | 28.93 | 9.61 |
Av. WC Time | 28.14 | 28.89 | 12.58 | 9.67 |
Solutions | 72 72% | 70 70% | 56 56% | 37 37% |
μEfficiency | 193 | 190 | 139 | 284 |
μWCEfficiency | 208 | 204 | 65 | 284 |
SOTAC | 0.36 | 0.34 | 0.36 | 0.28 |
Core Usage | 1.01 | 0.96 | 2.62 | 0.70 |
New Solved | 0/0 | 0/0 | 0/0 | 0/0 |

THF with Equality | Satallax 3.2 | Satallax 3.3 | Leo‑III 1.3 | LEO‑II 1.7.0 |
---|---|---|---|---|
Solved/400 | 334/400 | 331/400 | 299/400 | 176/400 |
Av. CPU Time | 33.83 | 32.91 | 40.34 | 19.66 |
Av. WC Time | 33.82 | 32.82 | 17.77 | 19.71 |
Solutions | 334 83% | 331 82% | 299 74% | 172 43% |
μEfficiency | 284 | 280 | 133 | 291 |
μWCEfficiency | 307 | 307 | 57 | 291 |
SOTAC | 0.33 | 0.33 | 0.34 | 0.30 |
Core Usage | 0.99 | 0.97 | 2.62 | 0.77 |
New Solved | 0/0 | 0/0 | 0/0 | 0/0 |

TFA using Integers | Vampire 4.1 | Vampire 4.3 | CVC4 1.6pre | Princess 170717 |
---|---|---|---|---|
Solved/125 | 98/125 | 93/125 | 85/125 | 62/125 |
Av. CPU Time | 37.03 | 24.80 | 25.59 | 17.12 |
Av. WC Time | 36.92 | 24.78 | 26.27 | 8.53 |
Solutions | 98 78% | 93 74% | 85 68% | 50 40% |
μEfficiency | 316 | 326 | 449 | 307 |
μWCEfficiency | 321 | 326 | 450 | 166 |
SOTAC | 0.39 | 0.35 | 0.39 | 0.33 |
Core Usage | 0.86 | 0.80 | 0.70 | 2.15 |
New Solved | 0/0 | 0/0 | 0/0 | 0/0 |

TFA using Reals | CVC4 1.6pre | Vampire 4.3 | Vampire 4.1 | Princess 170717 |
---|---|---|---|---|
Solved/75 | 72/75 | 70/75 | 64/75 | 43/75 |
Av. CPU Time | 2.47 | 0.17 | 1.84 | 8.00 |
Av. WC Time | 2.77 | 0.24 | 1.87 | 3.59 |
Solutions | 72 96% | 70 93% | 64 85% | 42 56% |
μEfficiency | 885 | 893 | 783 | 317 |
μWCEfficiency | 885 | 900 | 783 | 195 |
SOTAC | 0.33 | 0.31 | 0.29 | 0.26 |
Core Usage | 0.51 | 0.51 | 0.57 | 2.38 |
New Solved | 0/0 | 0/0 | 0/0 | 0/0 |

FOF Theorems without Equality | Vampire 4.2 | Vampire 4.3 | iProver 2.8 | E 2.2pre | CSE_E 1.0 | CVC4 1.6pre | Leo‑III 1.3 | leanCoP 2.2 | nanoCoP 1.1 | CSE 1.1 | CSE 1.0 | Prover9 1109a | Twee 2.2 | Geo‑III 2018C |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
Solved/125 | 116/125 | 115/125 | 102/125 | 92/125 | 90/125 | 83/125 | 82/125 | 75/125 | 71/125 | 62/125 | 61/125 | 28/125 | 16/125 | 14/125 |
Av. CPU Time | 7.53 | 10.08 | 12.87 | 12.92 | 12.89 | 40.77 | 24.68 | 42.87 | 53.79 | 60.14 | 55.66 | 3.54 | 98.33 | 52.39 |
Av. WC Time | 7.59 | 10.11 | 12.89 | 12.99 | 12.85 | 43.47 | 9.94 | 42.59 | 53.17 | 60.26 | 55.71 | 3.64 | 98.39 | 52.42 |
Solutions | 116 92% | 115 92% | 101 80% | 92 73% | 89 71% | 83 66% | 82 65% | 75 60% | 71 56% | 62 49% | 61 48% | 28 22% | 16 12% | 14 11% |
μEfficiency | 706 | 622 | 357 | 493 | 484 | 190 | 104 | 151 | 130 | 138 | 144 | 155 | 17 | 13 |
μWCEfficiency | 706 | 622 | 357 | 498 | 483 | 190 | 44 | 274 | 224 | 150 | 148 | 155 | 17 | 13 |
SOTAC | 0.16 | 0.15 | 0.13 | 0.11 | 0.11 | 0.11 | 0.11 | 0.10 | 0.09 | 0.09 | 0.09 | 0.09 | 0.09 | 0.33 |
Core Usage | 0.65 | 0.75 | 0.86 | 0.74 | 0.92 | 0.88 | 2.54 | 0.76 | 0.82 | 0.95 | 0.99 | 0.78 | 0.95 | 1.00 |
New Solved | 2/3 | 2/3 | 0/3 | 1/3 | 1/3 | 0/3 | 1/3 | 0/3 | 0/3 | 0/3 | 0/3 | 0/3 | 0/3 | 0/3 |

FOF Theorems with Equality | Vampire 4.3 | Vampire 4.2 | CSE_E 1.0 | E 2.2pre | CVC4 1.6pre | Leo‑III 1.3 | iProver 2.8 | Prover9 1109a | leanCoP 2.2 | CSE 1.1 | nanoCoP 1.1 | CSE 1.0 | Twee 2.2 | Geo‑III 2018C |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
Solved/375 | 346/375 | 338/375 | 273/375 | 258/375 | 215/375 | 174/375 | 146/375 | 94/375 | 68/375 | 64/375 | 62/375 | 62/375 | 58/375 | 36/375 |
Av. CPU Time | 18.46 | 17.65 | 31.58 | 30.18 | 44.55 | 34.31 | 41.01 | 37.46 | 50.94 | 49.89 | 42.64 | 44.83 | 52.29 | 35.31 |
Av. WC Time | 18.44 | 17.62 | 31.52 | 30.22 | 46.54 | 12.79 | 40.63 | 37.51 | 50.54 | 50.01 | 42.37 | 44.94 | 52.34 | 35.34 |
Solutions | 346 92% | 338 90% | 273 72% | 258 68% | 215 57% | 174 46% | 146 38% | 94 25% | 68 18% | 64 17% | 62 16% | 62 16% | 57 15% | 36 9% |
μEfficiency | 436 | 395 | 283 | 288 | 246 | 91 | 104 | 100 | 24 | 35 | 30 | 37 | 61 | 24 |
μWCEfficiency | 440 | 403 | 281 | 289 | 246 | 38 | 104 | 100 | 43 | 37 | 52 | 37 | 61 | 24 |
SOTAC | 0.21 | 0.20 | 0.17 | 0.16 | 0.16 | 0.13 | 0.14 | 0.14 | 0.12 | 0.12 | 0.11 | 0.11 | 0.20 | 0.12 |
Core Usage | 0.86 | 0.89 | 0.96 | 0.88 | 0.80 | 2.74 | 0.96 | 0.90 | 0.85 | 0.99 | 0.79 | 0.98 | 0.85 | 0.93 |
New Solved | 0/0 | 0/0 | 0/0 | 0/0 | 0/0 | 0/0 | 0/0 | 0/0 | 0/0 | 0/0 | 0/0 | 0/0 | 0/0 | 0/0 |

FOF Non-theorems without Equality | Vampire SAT‑4.3 | Vampire SAT‑4.1 | iProver SAT‑2.8 | CVC4 SAT‑1.6pre | E FNT‑2.2pre | Geo‑III 2018C |
---|---|---|---|---|---|---|
Solved/100 | 96/100 | 94/100 | 80/100 | 69/100 | 28/100 | 14/100 |
Av. CPU Time | 74.53 | 70.92 | 40.00 | 34.19 | 13.20 | 13.95 |
Av. WC Time | 74.19 | 70.55 | 13.34 | 34.54 | 13.23 | 13.98 |
Solutions | 96 96% | 92 92% | 80 80% | 69 69% | 28 28% | 14 14% |
μEfficiency | 167 | 138 | 255 | 162 | 149 | 81 |
μWCEfficiency | 167 | 138 | 155 | 162 | 149 | 81 |
SOTAC | 0.28 | 0.27 | 0.24 | 0.23 | 0.24 | 0.19 |
Core Usage | 0.94 | 0.97 | 3.29 | 0.92 | 1.00 | 1.04 |
New Solved | 1/2 | 1/2 | 1/2 | 1/2 | 0/2 | 1/2 |

FOF Non-theorems with Equality | Vampire SAT‑4.3 | Vampire SAT‑4.1 | iProver SAT‑2.8 | CVC4 SAT‑1.6pre | Geo‑III 2018C | E FNT‑2.2pre |
---|---|---|---|---|---|---|
Solved/100 | 95/100 | 94/100 | 57/100 | 47/100 | 24/100 | 10/100 |
Av. CPU Time | 15.12 | 25.52 | 10.91 | 22.67 | 29.14 | 43.45 |
Av. WC Time | 15.10 | 25.42 | 2.94 | 23.00 | 29.20 | 43.50 |
Solutions | 95 95% | 94 94% | 57 57% | 47 47% | 24 24% | 10 10% |
μEfficiency | 185 | 586 | 397 | 254 | 145 | 61 |
μWCEfficiency | 185 | 586 | 302 | 250 | 145 | 61 |
SOTAC | 0.34 | 0.34 | 0.24 | 0.22 | 0.20 | 0.28 |
Core Usage | 0.98 | 0.93 | 2.37 | 0.70 | 0.76 | 0.78 |
New Solved | 0/0 | 0/0 | 0/0 | 0/0 | 0/0 | 0/0 |

EPR Unsatisfiable CNF | iProver 2.8 | Vampire 4.3 | iProver 2.6 | E 2.2pre | Leo‑III 1.3 | Geo‑III 2018C |
---|---|---|---|---|---|---|
Solved/125 | 108/125 | 104/125 | 101/125 | 18/125 | 17/125 | 0/125 |
Av. CPU Time | 31.46 | 49.06 | 40.77 | 70.81 | 124.13 | - |
Av. WC Time | 31.47 | 48.49 | 40.78 | 70.87 | 64.45 | - |
Solutions | 107/125 | 103/125 | 99/125 | 18/125 | 17/125 | 0/125 |
μEfficiency | 96 | 92 | 86 | 30 | 5 | - |
μWCEfficiency | 100 | 91 | 85 | 30 | 2 | - |
SOTAC | 0.35 | 0.38 | 0.33 | 0.24 | 0.24 | - |
Core Usage | 0.99 | 0.99 | 0.98 | 0.94 | 2.57 | - |
New Solved | 0/0 | 0/0 | 0/0 | 0/0 | 0/0 | 0/0 |

EPR Satisfiable CNF | iProver 2.8 | iProver 2.6 | Vampire 4.3 | Geo‑III 2018C | E 2.2pre | Leo‑III 1.3 |
---|---|---|---|---|---|---|
Solved/25 | 25/25 | 25/25 | 24/25 | 10/25 | 9/25 | 0/25 |
Av. CPU Time | 19.38 | 24.38 | 31.53 | 41.73 | 1.24 | - |
Av. WC Time | 19.43 | 24.42 | 31.45 | 41.83 | 1.39 | - |
Solutions | 25/25 | 25/25 | 24/25 | 10/25 | 9/25 | 0/25 |
μEfficiency | 534 | 550 | 566 | 264 | 233 | - |
μWCEfficiency | 555 | 557 | 566 | 267 | 273 | - |
SOTAC | 0.28 | 0.28 | 0.27 | 0.23 | 0.22 | - |
Core Usage | 0.86 | 0.86 | 0.76 | 0.76 | 0.72 | - |
New Solved | 0/0 | 0/0 | 0/0 | 0/0 | 0/0 | 0/0 |

LTB CakeML Theorems | MaLARea 0.6 | Vampire LTB‑4.0 | Vampire LTB‑4.3 | iProver LTB‑2.8 | E LTB‑2.2pre | Grackle 0.1 |
---|---|---|---|---|---|---|
Solved/5000 | 876/5000 | 594/3553 | 757/5000 | 613/4999 | 458/4999 | 379/4893 |
Av. CPU Time | 16.10 | 10.93 | 6.02 | 55.93 | 7.36 | 46.39 |
Av. WC Time | 4.72 | 2.90 | 1.80 | 14.14 | 16.06 | 11.68 |
Solutions | 876 17% | 594 16% | 757 15% | 613 12% | 458 9% | 379 7% |
μEfficiency | 17 | 88 | 97 | 2 | 13 | 3 |
μWCEfficiency | 48 | 129 | 134 | 9 | 6 | 11 |
SOTAC | 0.35 | 0.24 | 0.25 | 0.21 | 0.19 | 0.19 |
Core Usage | 3.24 | 2.75 | 2.07 | 3.96 | 0.46 | 3.97 |
New Solved | 0/0 | 0/0 | 0/0 | 0/0 | 0/0 | 0/0 |