All results described below were obtained with a Sigma KB consisting of only the constituent file Merge.kif, version 1.36, after the application of predicate variable instantiation, row variable expansion, and assertion caching. All tests were run on a host with an Athlon 3.2 Ghz CPU in 32-bit mode, with 3.4 GB of accessible RAM and 2.4 GB of swap space. Assuming use of type prefixes, predicate variable instantiation (no holds prefixing), and assertion caching: 30 tests expected to succeed: 1, 2, 3, 4, 5, 8, 9, 10, 11, 12, 13, 16, 18, 19, 20, 21, 22, 24, 26, 29, 30, 31, 32, 34, 44, 46, 47, 48, 49, 50. 20 tests expected to fail: 6, 7, 14, 15, 17, 23, 25, 27, 28, 33, 35, 36, 37, 38, 39, 40, 41, 42, 43, 45. TQG1.kif.tq: Class subsumption, skolemization. Succeeds for local Vampire in 17 steps. TQG2.kif.tq: Relation subsumption. Succeeds for local Vampire in under 180 seconds, with a proof of 41 steps. Succeeds for EP---0.999 in 226 seconds. TQG3.kif.tq: Case elimination reasoning. Succeeds for local Vampire in 24 steps. Requires the ability to use numbers as terms, since the proof depends on ground non-atomic terms formed with SignumFn. TQG4.kif.tq: Uses holdsDuring. Succeeds for local Vampire in 23 steps. TQG5.kif.tq: Class equality and subsumption reasoning. Succeeds for local Vampire in 45 steps. TQG6.kif.tq: Should succeed, but fails for local Vampire after one hour, even though a complete proof path exists. Merge.kif contains many clauses in which (part ...) is a positive literal (consequent), which greatly increases the search space and could be a factor. TQG7.kif.tq: Fails because there is no path to a proof, given the current content of Merge.kif. The biconditional relating sibling, mother, and father should be replaced with several conditionals, some that are not skolemizing. TQG8.kif.tq: Succeeds for local Vampire in 32 steps, but requires at least a few minutes. TQG9.kif.tq: Class identification. Succeeds for local Vampire in seven steps. TQG10.kif.tq: Case elimination with multiple rules. Answered by local Vampire in 39 steps, but requires at least five minutes. Not answered by EP---0.999 after 10 minutes. TQG11.kif.tq: Multiplication, equality reasoning. Succeeds for local Vampire in one step. Requires a multiplication function and the ability to accept numbers as terms. TQG12.kif.tq: One simple rule. Succeeds for local Vampire in nine steps. TQG13.kif.tq: Skolemization, multiple rules. Succeeds for local Vampire in 17 steps. TQG14.kif.tq: Just one rule. This test cannot succeed, because the rule below can never be satisfied as written. Atom is not a subclass of CorpuscularObject, and there is no way to prove that ?ATOM is both an Atom and a CorpuscularObject. Atom should be made a subclass of CorpuscularObject (this is a perspective issue), or the argument types of component should be broadened from CorpuscularObject to SelfConnectedObject, or part should be used instead of component in the rule below (which would change the sortal antecedent to the much broader Object). (=> (instance ?ATOM CorpuscularObject) (=> (instance ?ATOM Atom) (forall (?NUCLEUS1 ?NUCLEUS2) (=> (and (component ?NUCLEUS1 ?ATOM) (component ?NUCLEUS2 ?ATOM) (instance ?NUCLEUS1 AtomicNucleus) (instance ?NUCLEUS2 AtomicNucleus)) (equal ?NUCLEUS1 ?NUCLEUS2))))) TQG15.kif.tq: Fails in local Vampire. Apparently there is no path to a proof in Merge.kif. TO DO: Devise a path, adding the required statements to Merge.kif, if warranted. TQG16.kif.tq: Succeeds for local Vampire in 31 steps, but only after several minutes. TQG17.kif.tq: Fails after 300 seconds. TO DO: identify a proof from the axioms in Merge.kif. TQG18.kif.tq: Succeeds for local Vampire in 28 steps. Requires a few minutes. TQG19.kif.tq: Addition, equality reasoning. Succeeds for local Vampire in one step. Requires an addition function and the ability to accept numbers as terms. TQG20.kif.tq: Addition, equality reasoning, with a syntax slightly different from TQG19. Succeeds for local Vampire in eight steps. Requires an addition function and the ability to accept numbers as terms. TQG21.kif.tq: Addition, subtraction, multiplication, division, and equality reasoning. Succeeds for local Vampire in 11 steps. Requires addition, subtraction, multiplication, and division functions, and the ability to accept numbers as terms. TQG22.kif.tq: Succeeds for local Vampire in 11 steps. Requires > 60 seconds. TQG23.kif.tq: Fails for local Vampire after 1200 seconds. Has worked in the past without sortals, but required 75 steps. Probably requires many more now. The necessary axioms are in place in Merge.kif. The query should succeed, given enough time. TQG24.kif.tq: Class subsumption. Succeeds for local Vampire in 18 steps. TQG25.kif.tq: Case elimination. This test fails, perhaps because of insufficient support for reasoning with lists and list producing functions. TO DO: Describe an expected proof. TQG26.kif.tq: Case elimination. Succeeds for local Vampire in 29 steps. TQG27.kif.tq: Fails. Requires support for KappaFn. TQG28.kif.tq: Fails, probably because it requires support for lists and ListFn. TO DO: Specify a proof. TQG29.kif.tq: Equality reasoning. Succeeds for local Vampire in eight steps. TQG30.kif.tq: Reasoning about class equality. Succeeds for local Vampire in 13 steps. It's not clear if this test should succeed. We probably need to add more predicates to SUMO to capture the following different notions of equality: two terms are canonically identical in appearance; two non-identical terms denote the same entity (Morning Star, Evening Star); two sets/classes have the same members (extensional set equality); two sets/classes have the same defined criteria for membership (intensional set equality). TQG31.kif.tq: Circular subclass reasoning. Succeeds for local Vampire in 18 steps. TQG32.kif.tq: An "intensional" query requiring circular subclass reasoning. Succeeds for local Vampire in 17 steps. TQG33.kif.tq: Commonsense reasoning: every physical object has some non-zero positive mass. This test fails because some basic pieces of commensense knowledge are missing from Merge.kif. Since Merge.kif is intended to be only a fundamental upper ontology, it should not be a matter of concern that this test fails. The necessary knowledge could be expected to be supplied by a constituent file more specialized than Merge.kif. TQG34.kif.tq: A KB integrity test: Every term is an instance of Entity. Succeeds for local Vampire in five steps. TQG35.kif.tq: Temporal point and interval reasoning. Fails after several minutes, probably because of insufficient axiomatic support in Merge.kif for reasoning about TimePoints contained in TimeIntervals. TO DO: Devise an inference path that should work, and add the missing knowledge to Merge.kif. TQG36.kif.tq: Temporal interval reasoning. Fails after several minutes, probably because of insufficient axiomatic support for ;; reasoning about the relationships between TimeIntervals. TO DO: Devise an inference path that should work, and add the necessary statements to Merge.kif. TQG37.kif.tq: Temporal point and interval reasoning. Fails after several minutes, probably because of insufficient axiomatic support in Merge.kif for reasoning about TimePoints contained in TimeIntervals. TO DO: Devise an inference path that should work, and add the missing knowledge to Merge.kif. TQG38.kif.tq: Temporal point and interval reasoning. Fails after several minutes, probably because of insufficient axiomatic support in Merge.kif for reasoning about TimePoints contained in TimeIntervals. TO DO: Devise an inference path that should work, and add the missing knowledge to Merge.kif. TQG39.kif.tq: Predicate introduction. This test fails when pred var instantiation is in force (i.e., when "holdsPrefix" != "yes") because transitiveTestPred39-1 is being defined by entering statements via KB.tell() *after* the rule that defines TransitiveRelation has already been instantiated during preprocessing. We do not preprocess the entire KB again with each call to KB.tell(). The test would work if Sigma were restarted (forcing the user assertions file to be preprocessed with the other constituent files, hence including transitiveTestPred39-1 in the pred var instantiation of the transitivity rule), or if the KB were translated with holds prefixing rather than with pred var instantiation. Note that predicate introduction via KB.tell() is one case where the user is likely to see different (better) inference behavior after restarting Sigma, or by defining new predicates in a constituent file instead. TQG40.kif.tq: Predicate introduction, use of a predicate with the wrong number of arguments. This test fails because there are no axioms to prevent a BinaryPredicate from being used to form a statement with three arguments. (Such axioms cause problems of their own.) This sort of syntactic correctness enforcement probably should not be done in inference, but as part of a canonicalization/rejection step that prevents non-wff expressions from ever reaching the inference engine. TQG41.kif.tq: Predicate introduction, use of a predicate with the wrong number of arguments. This test fails because there are no axioms to prevent a TernaryPredicate from being used to form a statement with four arguments. (Such axioms cause problems of their own.) This sort of syntactic correctness enforcement probably should not be done in inference, but as part of a canonicalization/rejection step that prevents non-wff expressions from ever reaching the inference engine. TQG42.kif.tq: Predicate introduction, use of a predicate with the wrong number of arguments. This test fails because there are no axioms to prevent a QuaternaryPredicate from being used to form a statement with five arguments. (Such axioms cause problems of their own.) This sort of syntactic correctness enforcement probably should not be done in inference, but as part of a canonicalization/rejection step that prevents non-wff expressions from ever reaching the inference engine. TQG43.kif.tq: Predicate introduction, use of a predicate with the wrong number of arguments. This test fails because there are no axioms to prevent a QuintaryPredicate from being used to form a statement with six arguments. (Such axioms cause problems of their own.) This sort of syntactic correctness enforcement probably should not be done in inference, but as part of a canonicalization/rejection step that prevents non-wff expressions from ever reaching the inference engine. TQG44.kif.tq: Predicate introduction: defines and uses a predicate that takes 10 arguments. Succeeds for local Vampire in 40 steps. The test should succeed. However, a warning should be displayed to the user, explaining that the number of arguments exceeds the arbitrary maximum predicate arity of seven. The warning does appear for formulas loaded from constituent files, but apparently is bypassed when assertions are entered via KB.tell(). This should be fixed, eventually. TQG45.kif.tq: Division by 0: The underlying code that implements the division function should be defensive enough to prevent anything bad from happening. This test currently fails, but exists for its possible side-effects, not for the query answer. It's not clear what the answer should be. Requires division and addition functions, and the ability to handle the use of numbers as terms. TQG46.kif.tq: Row variable expansion: Tests to determine if row variable expansion is properly restricted in the case of a row variable appearing in the arg 1 position of a binary predicates. Succeeds for local Vampire in seven steps. TQG47.kif.tq: Row variable expansion: Tests to determine if row variable expansion is properly restricted in the case of a row variable appearing in the arg 2 position of a binary predicates. Succeeds for local Vampire in seven steps. TQG48.kif.tq: Row variable expansion: Tests to determine if row variable expansion is properly handled in the case of a row variable spanning args 3 and 4 of a quaternary predicate. Succeeds for local Vampire in 11 steps. TQG49.kif.tq: Row variable expansion: Tests to determine if row variable expansion is properly handled in the case of a row variable spanning args 1 and 2 of a quaternary predicate. Succeeds for local Vampire in 11 steps. TQG50.kif.tq: Skolemization of a 10-level deep class hierarchy, with subsumption reasoning. Succeeds for local Vampire in 53 steps. No answer from EP---0.999 after five minutes.