Sumedh from Citus Data here. > Does Citus keeps it's performance over tables wit...

Tharkun · on April 20, 2016

Not every query is parallelizable. Maintaining performance is a lie. An easy-to-grasp example is computing a median. And I mean an exact median, not an approximation.
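To make the median point concrete, here is a minimal Python sketch. The shard contents and function names are made up for illustration and have nothing to do with how Citus works internally. It shows that combining per-shard medians does not give the exact median of the full data set, which is why the computation cannot simply be pushed down to each node and merged.

```python
from statistics import median

# Hypothetical data split unevenly across two shards.
shards = [[1, 2, 3], [4, 5, 6, 7, 100, 200]]

def median_of_shard_medians(shards):
    # Each shard computes its own median in parallel, then the
    # coordinator takes the median of those partial results.
    return median(median(s) for s in shards)

def exact_median(shards):
    # The exact answer needs all values gathered and ordered globally.
    return median(x for s in shards for x in s)

print(median_of_shard_medians(shards))  # 4.25
print(exact_median(shards))             # 5
```

The two results differ because a shard's median carries no information about how many values lie on either side of it in the other shards.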

spathak · on April 20, 2016

@Tharkun: You are right that not every query is immediately parallelizable. Distinct counts are another example. In some cases the data can be repartitioned so that we can calculate exact values and push the computation down to run in parallel. This may still perform better than a single large table, so there are benefits to it. Ultimately there are tradeoffs in moving to an entirely distributed environment, but depending on the use case the value may offset them.
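As a rough illustration of the repartitioning idea for distinct counts, here is a small Python sketch. It is an assumption-laden toy, not Citus's actual implementation; the function names and data are hypothetical. Summing per-shard distinct counts overcounts values that appear on more than one shard, but once the rows are repartitioned by a hash of the value, each value lives in exactly one partition and the per-partition counts can be summed exactly.

```python
# Hypothetical user IDs spread across two shards; 2 and 3 appear on both.
shards = [[1, 2, 3], [2, 3, 4]]

def naive_distinct(shards):
    # Summing per-shard distinct counts double-counts shared values.
    return sum(len(set(s)) for s in shards)

def repartition(shards, n):
    # Route every value to a partition chosen by its hash, so all
    # copies of the same value end up in the same partition.
    parts = [[] for _ in range(n)]
    for shard in shards:
        for value in shard:
            parts[hash(value) % n].append(value)
    return parts

def repartitioned_distinct(shards, n):
    # Each partition can now count distinct values independently,
    # and the partial counts add up to the exact answer.
    return sum(len(set(p)) for p in repartition(shards, n))

print(naive_distinct(shards))             # 6 (overcounts)
print(repartitioned_distinct(shards, 2))  # 4 (exact)
```

The cost is the shuffle itself: every row may have to move over the network before the parallel step can start, which is one of the tradeoffs mentioned above.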

mtanski · on April 20, 2016

I'm not sure why folks are downvoting you, because most database systems that provide the full array of relational operations (joins, group by, group by cube, etc.) do not scale linearly, at least not past a handful of nodes. Mixing OLTP and OLAP using current technologies is hard.