Michael Personick
30
* mroycsi wrote: Based on SPARQL bottom-up evaluation, the subquery will
* return s1, s2, s3 as the solutions for ?s. Joined with ?s :p ?o, you
* should only get the statements where ?s is s1, s2, or s3.
* I haven't debugged bigdata, so I don't know exactly what it is doing, but
* it seems that with the current bigdata evaluation the subquery is run for
* each solution produced from ?s :p ?o, and the ?s binding in the subquery
* is constrained by the ?s from the inbound solution. As a result, the
* subquery does not always return s1, s2, s3; its results depend on the
* inbound solution.
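The difference described above can be reproduced with a toy simulation. This is only a sketch: the thread does not show the actual query or data, so the five triples, the `LIMIT 3`, and the function names `subquery`, `bottom_up`, and `as_bound` are all assumptions made for illustration, and none of this is bigdata code.

```python
# Toy data: (subject, predicate, object) triples. All values are hypothetical.
TRIPLES = [
    ("s1", "p", "o1"),
    ("s2", "p", "o2"),
    ("s3", "p", "o3"),
    ("s4", "p", "o4"),
    ("s5", "p", "o5"),
]


def subquery(bound_s=None, limit=3):
    """Assumed subquery: SELECT ?s WHERE { ?s :p ?o } ORDER BY ?s LIMIT 3.

    If bound_s is given, ?s is already bound by an inbound solution.
    """
    subjects = sorted(
        {s for s, p, _ in TRIPLES
         if p == "p" and (bound_s is None or s == bound_s)}
    )
    return subjects[:limit]


def bottom_up():
    """Run the subquery once, then join its results with ?s :p ?o."""
    top = set(subquery())
    return [(s, o) for s, p, o in TRIPLES if p == "p" and s in top]


def as_bound():
    """Run the subquery once per outer solution, with ?s already bound."""
    results = []
    for s, p, o in TRIPLES:
        if p != "p":
            continue
        # With ?s bound, the subquery sees at most one candidate, so the
        # LIMIT never excludes it and every outer solution survives the join.
        if s in subquery(bound_s=s):
            results.append((s, o))
    return results
```

Under these assumptions, `bottom_up()` keeps only the statements for s1, s2, and s3, while `as_bound()` returns all five statements, which matches the behaviour reported above.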
* thompsonbry wrote: Normally bottom-up evaluation only differs when you
* are missing a shared variable such that the bindings for variables having
* the same name are actually not correlated.
* This is a bit of an odd case: the interaction between ORDER BY/LIMIT and
* as-bound evaluation leads to the "wrong" result. We probably do not want
* to always do bottom-up evaluation for a subquery (e.g., by lifting it
* into a named subquery). Are you suggesting that there is a special case
* which needs to be recognized, where the subquery MUST be evaluated first
* because the ORDER BY/LIMIT combination means that the results of the
* outer query joined with the inner query could be different?
* mroycsi wrote: This is a pattern that is well known and commonly used
* with SPARQL 1.1 subqueries. It is definitely a case where the subquery
* needs to be evaluated first due to the LIMIT clause. The ORDER BY clause
* probably doesn't matter if there isn't a LIMIT, since all the results are
* just joined, so order doesn't matter until the solutions reach the
* ORDER BY operation.
* thompsonbry wrote: Ok. ORDER BY by itself does not matter and neither
* does LIMIT by itself. But if you have both it matters and we need to run
* the subquery first.
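The rule the thread settles on could be expressed as a simple predicate. This is a minimal sketch of that conclusion only; `SubqueryInfo` and `must_run_first` are hypothetical names and do not correspond to any bigdata class or optimizer rule.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class SubqueryInfo:
    """Hypothetical summary of a subquery's solution modifiers."""
    has_order_by: bool
    limit: Optional[int] = None  # None means no LIMIT clause


def must_run_first(sq: SubqueryInfo) -> bool:
    """Per the thread's conclusion: only ORDER BY together with LIMIT
    forces the subquery to be evaluated before the outer join."""
    return sq.has_order_by and sq.limit is not None
```

For example, `must_run_first(SubqueryInfo(has_order_by=True, limit=3))` is true, while a subquery with only one of the two modifiers would be left to as-bound evaluation under this rule.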