Can I use Tableau join culling instead of referential keys on the database?

Ok, I have been meaning to run a few tests with actual data to see if Tableau join culling feature

 

  • Whatever you can do in DB, do it in DB including performance optimizations. Tableau is great being able to overcome some of the DBA laziness (with features like Join culling) but it still won’t overcome everything
  • You must know Join Culling doesn’t always work such as in scenarios like outer joins, etc. (if you need a list of those, there are other blogs that cover that)
    • Now, also keep in mind that depending on how you build the Tableau model, you might get confused why it doesn’t always work even if it is all inner joins and even we have checked the “Assume Referential Integrity” dialog box
    • You should also understand that the SQL generation is also affected by the the order in which you bring the tables into the Tableau model

So, regarding the test, here are specifics

Here are some of the tests that I just ran

  • There are 4 scenarios using two large tables with identical data
  • Both fact tables have 2 million records, exactly same data
  • One fact table has DB keys, the other doesn’t
  • There are 4 datasources in Tableau, both of these fact tables joined in Tableau; once with Assume RI checked and other without checked
  • This makes a total of 4 scenarios, the dashboard below has 2 queries to each of the 4 scenarios

 

I ran the following dashboard that covers these scenarios across various tables:-

tableau-join-culling-test-scenarios
Various scenarios

 

 

Here are results of performance recording. If you see and compare, you will see it’s better to have the keys in the DB.

tableau-join-culling-test-results
Join Culling – results

 

  • Compare the Orange with Red, this is where we are telling Tableau that there are no Keys in DB so Assume RI. The performance of these queries is worse than the case (Red) where DB keys are present.
  • Also, adding Assume RI flag may add some overhead (comparing Blue to Red). In this case, both sources have DB Keys. Adding this option only adds overhead because of our logic.
  • Now, if you compare Orange with Teal, this is where we are telling Tableau to overcome lack of Keys, you see that we improve the experience but not by a whole lot.

 

But, don’t think that Assume Referential Integrity check-box will overcome the “lack of optimizations” you should do on the database. In other words, DBA laziness

I will also write more follow on blogs on this topic of performance

One thought on “Can I use Tableau join culling instead of referential keys on the database?

  1. Don’t forget – if you lie to Tableau, Tableau might lie to you. In particular, if you state your database has referential integrity when it does not, Tableau may generate queries and produce results from the query cache that are incorrect for your database. Also, future versions of Tableau will take further advantage of this setting, so even if it doesn’t affect your results now, it might affect them in the future. Never set this unless it is true!

Comments are closed.