Higher-order Theorems | Satallax 3.0 | Satallax 2.8 | LEO‑II 1.7.0 | Leo+III 1.0 | Leo‑III 1.0 | Isabelle 2015 |
---|---|---|---|---|---|---|
Solved/500 | 346/500 | 315/500 | 238/500 | 89/500 | 74/500 | 356/500 |
Av. CPU Time | 22.10 | 19.45 | 20.93 | 48.37 | 42.79 | 81.08 |
Solutions | 327/500 | 313/500 | 231/500 | 88/500 | 74/500 | 0/500 |
μEfficiency | 358 | 360 | 344 | 38 | 33 | 29 |
μWCEfficiency | 361 | 360 | 345 | 17 | 16 | 27 |
SOTAC | 0.32 | 0.32 | 0.33 | 0.21 | 0.21 | 0.39 |
Core Usage | 0.95 | 0.93 | 0.74 | 2.12 | 2.17 | 1.03 |
New Solved | 0/0 | 0/0 | 0/0 | 0/0 | 0/0 | 0/0 |

Typed First-order Theorems +*-/ | Vampire 4.1 | VampireZ3 1.0 | CVC4 TFF‑1.5.1 | Beagle 0.9.47 | Princess 160606 |
---|---|---|---|---|---|
Solved/500 | 419/500 | 380/500 | 343/500 | 300/500 | 342/500 |
Av. CPU Time | 13.39 | 9.15 | 5.72 | 18.76 | 17.59 |
Solutions | 419/500 | 380/500 | 343/500 | 300/500 | 271/500 |
μEfficiency | 585 | 512 | 586 | 391 | 397 |
μWCEfficiency | 623 | 515 | 587 | 282 | 294 |
SOTAC | 0.30 | 0.25 | 0.27 | 0.22 | 0.24 |
Core Usage | 0.78 | 0.80 | 1.15 | 1.86 | 2.41 |
New Solved | 3/6 | 3/6 | 0/6 | 5/6 | 5/6 |

Typed First-order Non-theorems +*-/ | CVC4 TFN‑1.5 | Beagle SAT‑0.9.47 | CVC4 TFN‑1.5.1 | Princess 160606 |
---|---|---|---|---|
Solved/50 | 15/50 | 10/50 | 9/50 | 8/50 |
Av. CPU Time | 32.27 | 3.11 | 0.02 | 1.44 |
Solutions | 0/50 | 0/50 | 9/50 | 0/50 |
μEfficiency | 221 | 140 | 180 | 150 |
μWCEfficiency | 221 | 58 | 180 | 77 |
SOTAC | 0.41 | 0.42 | 0.50 | 0.52 |
Core Usage | 1.40 | 2.83 | 0.75 | 1.88 |
New Solved | 1/7 | 0/7 | 2/7 | 0/7 |

First-order Theorems | Vampire 4.0 | Vampire 4.1 | E 2.0 | CVC4 FOF‑1.5.1 | iProver 2.5 | leanCoP 2.2 | Prover9 1109a | Geo‑III 2016C |
---|---|---|---|---|---|---|---|---|
Solved/500 | 457/500 | 447/500 | 392/500 | 329/500 | 278/500 | 168/500 | 101/500 | 54/500 |
Av. CPU Time | 15.39 | 14.14 | 30.87 | 35.04 | 30.82 | 77.94 | 29.99 | 41.73 |
Solutions | 453/500 | 447/500 | 392/500 | 328/500 | 274/500 | 168/500 | 98/500 | 54/500 |
μEfficiency | 537 | 547 | 460 | 324 | 163 | 54 | 78 | 28 |
μWCEfficiency | 538 | 548 | 460 | 324 | 164 | 93 | 78 | 27 |
SOTAC | 0.24 | 0.23 | 0.23 | 0.20 | 0.19 | 0.17 | 0.20 | 0.23 |
Core Usage | 0.89 | 0.87 | 0.78 | 0.86 | 0.93 | 0.87 | 0.87 | 0.96 |
New Solved | 30/33 | 30/33 | 27/33 | 30/33 | 12/33 | 7/33 | 2/33 | 0/33 |

First-order Non-theorems | Vampire SAT‑4.1 | Vampire SAT‑4.0 | iProver SAT‑2.5 | Nitpick 2015 | CVC4 FNT‑1.5.1 | Geo‑III 2016C | E FNT‑2.0 | Refute 2015 |
---|---|---|---|---|---|---|---|---|
Solved/300 | 250/300 | 240/300 | 200/300 | 139/300 | 96/300 | 76/300 | 70/300 | 58/300 |
Av. CPU Time | 40.11 | 36.45 | 30.28 | 37.86 | 22.43 | 13.69 | 16.31 | 69.09 |
Solutions | 248/300 | 238/300 | 200/300 | 139/300 | 96/300 | 76/300 | 70/300 | 0/300 |
μEfficiency | 215 | 221 | 349 | 28 | 211 | 138 | 127 | 7 |
μWCEfficiency | 215 | 221 | 260 | 26 | 211 | 138 | 125 | 6 |
SOTAC | 0.27 | 0.26 | 0.23 | 0.19 | 0.15 | 0.15 | 0.19 | 0.14 |
Core Usage | 0.93 | 0.96 | 2.29 | 1.11 | 0.69 | 0.92 | 0.83 | 1.02 |
New Solved | 1/8 | 1/8 | 1/8 | 0/8 | 0/8 | 0/8 | 0/8 | 0/8 |

Effectively Propositional CNF | iProver 2.5 | Vampire 4.0 | E 2.0 | Geo‑III 2016C | Vampire 4.1 |
---|---|---|---|---|---|
Solved/299 | 228/299 | 222/299 | 101/299 | 9/299 | 0/299 |
Av. CPU Time | 28.38 | 35.35 | 21.85 | 62.08 | - |
Solutions | 226/299 | 221/299 | 101/299 | 9/299 | 0/299 |
μEfficiency | 300 | 305 | 154 | 14 | - |
μWCEfficiency | 300 | 305 | 155 | 14 | - |
SOTAC | 0.47 | 0.46 | 0.34 | 0.32 | - |
Core Usage | 0.87 | 0.92 | 0.94 | 0.80 | - |
New Solved | 3/13 | 1/13 | 2/13 | 3/13 | 0/13 |

Large Theory Batch Problems | Vampire LTB‑4.0 | Vampire LLTB‑4.1 | Vampire LTB‑4.1 | E LTB‑2.0 | iProver LTB‑2.5 |
---|---|---|---|---|---|
Solved/600 | 403/600 | 398/600 | 396/600 | 305/600 | 298/600 |
Av. CPU Time | 31.13 | 22.41 | 16.70 | 33.91 | 132.23 |
Av. WC Time | 11.62 | 9.54 | 8.05 | 12.56 | 35.07 |
Solutions | 403/600 | 398/600 | 396/600 | 305/600 | 298/600 |
μEfficiency | 199 | 234 | 246 | 85 | 20 |
μWCEfficiency | 246 | 277 | 285 | 145 | 54 |
SOTAC | 0.25 | 0.24 | 0.24 | 0.22 | 0.21 |
Core Usage | 1.95 | 1.65 | 1.66 | 2.63 | 3.60 |
New Solved | 0/0 | 0/0 | 0/0 | 0/0 | 0/0 |

THF without Equality | Satallax 3.0 | Satallax 2.8 | LEO‑II 1.7.0 | Leo+III 1.0 | Leo‑III 1.0 | Isabelle 2015 |
---|---|---|---|---|---|---|
Solved/100 | 67/100 | 63/100 | 47/100 | 33/100 | 29/100 | 65/100 |
Av. CPU Time | 6.81 | 5.36 | 6.59 | 57.12 | 51.36 | 44.68 |
Solutions | 67/100 | 63/100 | 47/100 | 32/100 | 29/100 | 0/100 |
μEfficiency | 435 | 389 | 415 | 66 | 61 | 51 |
μWCEfficiency | 436 | 385 | 417 | 37 | 36 | 48 |
SOTAC | 0.32 | 0.29 | 0.24 | 0.20 | 0.20 | 0.36 |
Core Usage | 0.92 | 0.92 | 0.78 | 1.56 | 1.50 | 1.04 |
New Solved | 0/0 | 0/0 | 0/0 | 0/0 | 0/0 | 0/0 |

THF with Equality | Satallax 3.0 | Satallax 2.8 | LEO‑II 1.7.0 | Leo+III 1.0 | Leo‑III 1.0 | Isabelle 2015 |
---|---|---|---|---|---|---|
Solved/400 | 279/400 | 252/400 | 191/400 | 56/400 | 45/400 | 291/400 |
Av. CPU Time | 25.77 | 22.98 | 24.46 | 43.22 | 37.27 | 89.21 |
Solutions | 260/400 | 250/400 | 184/400 | 56/400 | 45/400 | 0/400 |
μEfficiency | 339 | 352 | 327 | 31 | 25 | 23 |
μWCEfficiency | 343 | 354 | 327 | 12 | 10 | 22 |
SOTAC | 0.33 | 0.32 | 0.35 | 0.22 | 0.21 | 0.40 |
Core Usage | 0.96 | 0.93 | 0.73 | 2.46 | 2.61 | 1.03 |
New Solved | 0/0 | 0/0 | 0/0 | 0/0 | 0/0 | 0/0 |

TFA using Integers | Vampire 4.1 | VampireZ3 1.0 | CVC4 TFF‑1.5.1 | Beagle 0.9.47 | Princess 160606 |
---|---|---|---|---|---|
Solved/300 | 230/300 | 202/300 | 164/300 | 135/300 | 185/300 |
Av. CPU Time | 23.54 | 16.50 | 11.94 | 40.12 | 28.78 |
Solutions | 230/300 | 202/300 | 164/300 | 135/300 | 115/300 |
μEfficiency | 374 | 287 | 381 | 128 | 265 |
μWCEfficiency | 438 | 291 | 382 | 65 | 132 |
SOTAC | 0.35 | 0.28 | 0.30 | 0.23 | 0.27 |
Core Usage | 0.85 | 0.91 | 0.90 | 2.22 | 2.69 |
New Solved | 3/6 | 3/6 | 0/6 | 5/6 | 5/6 |

TFA using Rationals | CVC4 TFF‑1.5.1 | Vampire 4.1 | Beagle 0.9.47 | VampireZ3 1.0 | Princess 160606 |
---|---|---|---|---|---|
Solved/75 | 73/75 | 73/75 | 73/75 | 72/75 | 71/75 |
Av. CPU Time | 0.02 | 0.07 | 1.10 | 0.14 | 3.33 |
Solutions | 73/75 | 73/75 | 73/75 | 72/75 | 71/75 |
μEfficiency | 973 | 964 | 940 | 940 | 748 |
μWCEfficiency | 973 | 964 | 753 | 939 | 684 |
SOTAC | 0.20 | 0.20 | 0.20 | 0.20 | 0.20 |
Core Usage | 1.61 | 0.69 | 1.53 | 0.71 | 2.03 |
New Solved | 0/0 | 0/0 | 0/0 | 0/0 | 0/0 |

TFA using Reals | Vampire 4.1 | CVC4 TFF‑1.5.1 | VampireZ3 1.0 | Beagle 0.9.47 | Princess 160606 |
---|---|---|---|---|---|
Solved/125 | 116/125 | 106/125 | 106/125 | 92/125 | 86/125 |
Av. CPU Time | 1.66 | 0.01 | 1.25 | 1.44 | 5.31 |
Solutions | 116/125 | 106/125 | 106/125 | 92/125 | 85/125 |
μEfficiency | 863 | 848 | 794 | 694 | 504 |
μWCEfficiency | 863 | 848 | 798 | 521 | 449 |
SOTAC | 0.28 | 0.25 | 0.24 | 0.23 | 0.20 |
Core Usage | 0.69 | 1.22 | 0.65 | 1.59 | 2.12 |
New Solved | 0/0 | 0/0 | 0/0 | 0/0 | 0/0 |

Typed First-order Non-theorems +*-/ | CVC4 TFN‑1.5 | Beagle SAT‑0.9.47 | CVC4 TFN‑1.5.1 | Princess 160606 |
---|---|---|---|---|
Solved/50 | 15/50 | 10/50 | 9/50 | 8/50 |
Av. CPU Time | 32.27 | 3.11 | 0.02 | 1.44 |
Solutions | 0/50 | 0/50 | 9/50 | 0/50 |
μEfficiency | 221 | 140 | 180 | 150 |
μWCEfficiency | 221 | 58 | 180 | 77 |
SOTAC | 0.41 | 0.42 | 0.50 | 0.52 |
Core Usage | 1.40 | 2.83 | 0.75 | 1.88 |
New Solved | 1/7 | 0/7 | 2/7 | 0/7 |

FOF Theorems without Equality | Vampire 4.0 | Vampire 4.1 | iProver 2.5 | E 2.0 | CVC4 FOF‑1.5.1 | leanCoP 2.2 | Geo‑III 2016C | Prover9 1109a |
---|---|---|---|---|---|---|---|---|
Solved/200 | 187/200 | 182/200 | 166/200 | 161/200 | 148/200 | 98/200 | 32/200 | 25/200 |
Av. CPU Time | 15.47 | 11.42 | 23.78 | 15.91 | 40.14 | 76.35 | 38.84 | 21.24 |
Solutions | 184/200 | 182/200 | 163/200 | 161/200 | 148/200 | 98/200 | 32/200 | 22/200 |
μEfficiency | 632 | 676 | 237 | 570 | 212 | 93 | 46 | 59 |
μWCEfficiency | 632 | 676 | 240 | 570 | 213 | 163 | 43 | 59 |
SOTAC | 0.21 | 0.20 | 0.19 | 0.19 | 0.19 | 0.16 | 0.28 | 0.15 |
Core Usage | 0.94 | 0.90 | 0.93 | 0.62 | 0.95 | 0.87 | 0.96 | 0.86 |
New Solved | 0/0 | 0/0 | 0/0 | 0/0 | 0/0 | 0/0 | 0/0 | 0/0 |

FOF Theorems with Equality | Vampire 4.0 | Vampire 4.1 | E 2.0 | CVC4 FOF‑1.5.1 | iProver 2.5 | Prover9 1109a | leanCoP 2.2 | Geo‑III 2016C |
---|---|---|---|---|---|---|---|---|
Solved/300 | 270/300 | 265/300 | 231/300 | 181/300 | 112/300 | 76/300 | 70/300 | 22/300 |
Av. CPU Time | 15.34 | 16.01 | 41.29 | 30.87 | 41.25 | 32.87 | 80.17 | 45.93 |
Solutions | 269/300 | 265/300 | 231/300 | 180/300 | 111/300 | 76/300 | 70/300 | 22/300 |
μEfficiency | 473 | 461 | 386 | 398 | 113 | 91 | 28 | 17 |
μWCEfficiency | 475 | 462 | 386 | 398 | 113 | 91 | 46 | 17 |
SOTAC | 0.25 | 0.25 | 0.26 | 0.21 | 0.20 | 0.21 | 0.17 | 0.16 |
Core Usage | 0.84 | 0.85 | 0.89 | 0.78 | 0.92 | 0.87 | 0.88 | 0.95 |
New Solved | 30/33 | 30/33 | 27/33 | 30/33 | 12/33 | 2/33 | 7/33 | 0/33 |

FOF Non-theorems without Equality | Vampire SAT‑4.1 | Vampire SAT‑4.0 | iProver SAT‑2.5 | Nitpick 2015 | CVC4 FNT‑1.5.1 | Geo‑III 2016C | E FNT‑2.0 | Refute 2015 |
---|---|---|---|---|---|---|---|---|
Solved/200 | 199/200 | 191/200 | 183/200 | 120/200 | 94/200 | 76/200 | 60/200 | 57/200 |
Av. CPU Time | 36.55 | 30.54 | 24.31 | 33.47 | 18.87 | 13.69 | 14.42 | 68.46 |
Solutions | 197/200 | 189/200 | 183/200 | 120/200 | 94/200 | 76/200 | 60/200 | 0/200 |
μEfficiency | 258 | 277 | 502 | 38 | 317 | 206 | 161 | 10 |
μWCEfficiency | 258 | 277 | 380 | 35 | 317 | 206 | 158 | 9 |
SOTAC | 0.24 | 0.23 | 0.23 | 0.18 | 0.15 | 0.15 | 0.17 | 0.14 |
Core Usage | 0.92 | 0.95 | 2.25 | 1.10 | 0.69 | 0.92 | 0.85 | 1.02 |
New Solved | 0/0 | 0/0 | 0/0 | 0/0 | 0/0 | 0/0 | 0/0 | 0/0 |

FOF Non-theorems with Equality | Vampire SAT‑4.1 | Vampire SAT‑4.0 | Nitpick 2015 | iProver SAT‑2.5 | E FNT‑2.0 | CVC4 FNT‑1.5.1 | Refute 2015 | Geo‑III 2016C |
---|---|---|---|---|---|---|---|---|
Solved/100 | 51/100 | 49/100 | 19/100 | 17/100 | 10/100 | 2/100 | 1/100 | 0/100 |
Av. CPU Time | 53.98 | 59.50 | 65.58 | 94.49 | 27.66 | 189.69 | 105.37 | - |
Solutions | 51/100 | 49/100 | 19/100 | 17/100 | 10/100 | 2/100 | 0/100 | 0/100 |
μEfficiency | 131 | 110 | 7 | 43 | 59 | 0 | 0 | - |
μWCEfficiency | 131 | 110 | 6 | 21 | 59 | 0 | 0 | - |
SOTAC | 0.39 | 0.39 | 0.31 | 0.28 | 0.28 | 0.21 | 0.17 | - |
Core Usage | 0.96 | 0.98 | 1.13 | 2.72 | 0.68 | 0.99 | 1.01 | - |
New Solved | 1/8 | 1/8 | 0/8 | 1/8 | 0/8 | 0/8 | 0/8 | 0/8 |

EPR Unsatisfiable CNF | iProver 2.5 | Vampire 4.0 | E 2.0 | Geo‑III 2016C | Vampire 4.1 |
---|---|---|---|---|---|
Solved/200 | 156/200 | 155/200 | 60/200 | 0/200 | 0/200 |
Av. CPU Time | 32.44 | 45.42 | 21.79 | - | - |
Solutions | 154/200 | 154/200 | 60/200 | 0/200 | 0/200 |
μEfficiency | 293 | 285 | 124 | - | - |
μWCEfficiency | 293 | 285 | 124 | - | - |
SOTAC | 0.49 | 0.48 | 0.36 | - | - |
Core Usage | 0.87 | 0.96 | 0.89 | - | - |
New Solved | 0/4 | 0/4 | 2/4 | 0/4 | 0/4 |

EPR Satisfiable CNF | iProver 2.5 | Vampire 4.0 | E 2.0 | Geo‑III 2016C | Vampire 4.1 |
---|---|---|---|---|---|
Solved/99 | 72/99 | 67/99 | 41/99 | 9/99 | 0/99 |
Av. CPU Time | 19.58 | 12.06 | 21.95 | 62.08 | - |
Solutions | 72/99 | 67/99 | 41/99 | 9/99 | 0/99 |
μEfficiency | 313 | 345 | 215 | 42 | - |
μWCEfficiency | 313 | 344 | 216 | 42 | - |
SOTAC | 0.43 | 0.40 | 0.32 | 0.32 | - |
Core Usage | 0.88 | 0.83 | 1.00 | 0.80 | - |
New Solved | 3/9 | 1/9 | 0/9 | 3/9 | 0/9 |

LTB AIM Theorems | Vampire LTB‑4.1 | Vampire LTB‑4.0 | Vampire LLTB‑4.1 | E LTB‑2.0 | iProver LTB‑2.5 |
---|---|---|---|---|---|
Solved/200 | 48/200 | 47/200 | 44/200 | 17/200 | 14/200 |
Av. CPU Time | 40.36 | 87.83 | 21.36 | 17.81 | 193.01 |
Av. WC Time | 12.02 | 27.03 | 6.29 | 4.89 | 48.68 |
Solutions | 48/200 | 47/200 | 44/200 | 17/200 | 14/200 |
μEfficiency | 96 | 90 | 140 | 73 | 1 |
μWCEfficiency | 107 | 105 | 164 | 75 | 4 |
SOTAC | 0.32 | 0.31 | 0.29 | 0.21 | 0.20 |
Core Usage | 2.20 | 2.55 | 1.95 | 2.35 | 3.96 |
New Solved | 0/0 | 0/0 | 0/0 | 0/0 | 0/0 |

LTB CakeML Theorems | Vampire LTB‑4.0 | Vampire LLTB‑4.1 | Vampire LTB‑4.1 | E LTB‑2.0 | iProver LTB‑2.5 |
---|---|---|---|---|---|
Solved/200 | 199/200 | 198/200 | 197/200 | 193/200 | 192/200 |
Av. CPU Time | 15.73 | 13.56 | 12.99 | 29.01 | 138.28 |
Av. WC Time | 10.04 | 9.76 | 10.40 | 13.56 | 37.40 |
Solutions | 199/200 | 198/200 | 197/200 | 193/200 | 192/200 |
μEfficiency | 158 | 158 | 153 | 49 | 13 |
μWCEfficiency | 191 | 187 | 173 | 106 | 41 |
SOTAC | 0.21 | 0.20 | 0.20 | 0.20 | 0.20 |
Core Usage | 1.32 | 1.30 | 1.25 | 2.21 | 3.58 |
New Solved | 0/0 | 0/0 | 0/0 | 0/0 | 0/0 |

LTB HOL Light Theorems | Vampire LTB‑4.0 | Vampire LLTB‑4.1 | Vampire LTB‑4.1 | E LTB‑2.0 | iProver LTB‑2.5 |
---|---|---|---|---|---|
Solved/200 | 157/200 | 156/200 | 151/200 | 95/200 | 92/200 |
Av. CPU Time | 33.69 | 33.95 | 14.02 | 46.75 | 110.33 |
Av. WC Time | 9.00 | 10.17 | 3.72 | 11.89 | 28.14 |
Solutions | 157/200 | 156/200 | 151/200 | 95/200 | 92/200 |
μEfficiency | 350 | 404 | 491 | 132 | 46 |
μWCEfficiency | 442 | 480 | 576 | 254 | 117 |
SOTAC | 0.28 | 0.26 | 0.25 | 0.24 | 0.23 |
Core Usage | 2.57 | 2.02 | 2.01 | 3.54 | 3.58 |
New Solved | 0/0 | 0/0 | 0/0 | 0/0 | 0/0 |