Hello all. Thanks for coming back. I’m actually enjoying writing this series of posts. Some crazy hit rates for the first two parts, that’s for sure. Most of them are probably from my mother but I suppose they all count. Thanks in particular to the people from Tableau who got in touch regarding this series, as well as associated Zen Masters across the globe.
Anyway here’s part 3 – Our Tableau Server Configuration
This part focuses on how we set up our Tableau Server application to run on the infrastructure described in Part 2. There may be some overlap and there may be some things I miss – I’m happy to discuss individually. Note there was some evolution of this environment from the initial design and build to what we have today. I’m going to describe the end state rather than the baby steps to get there. Note our service is still fairly immature and a little way off what I would consider a full-spec enterprise deployment but you’ll get the general idea. Tableau Server configuration is a relatively simple affair in comparison to most other enterprise applications I’ve used.
- Main Server – 4 core CPU, 8GB RAM, 250GB disk space
- Worker Server – 4 core CPU, 16GB RAM, 250GB disk space
Both servers would share the processing load, in particular the VizQL and dataserver processing. They would both be allocated a significant amount of RAM for snappy rendering and have a large amount of disk space for extract storage. We’ve found this performance to be adequate rather than fast, although that may be due to the fact that the users are hitting the server from a number of geographical locations. Tableau make a number of recommendations to improve server and desktop performance. They also have a really detailed whitepaper on scalability. This sort of professional looking doc is loved by management, even though they’ll never read it.
I must say that this kind of document is a real bonus for anyone trying to get a service signed off. Scalability is the sort of subject that senior bosses will grill you on, and I’ve been in situations where vendors haven’t provided any guidelines, meaning I’ve had to conduct my own scalability tests to satisfy that checkpoint. Having an official whitepaper from Tableau shuts down that question immediately. Great stuff. In my experience the server specification is less of a factor in performance than poor dashboard design and inefficient calculated field queries. We’ve also seen some dramatic improvements in speed from basic database maintenance such as adding an index to key tables. Tableau have some decent docs on this subject. Obviously this is easier if you’ve got control of the data source, but sometimes you’ll be connecting to an ancient DB thousands of miles away, used by hundreds of users with vague ownership at best, so messing with its configuration is a tricky and risky process.
I think I covered this in part 2 of this series so I won’t go into it here, but integrating authentication with your directory is pretty key to being able to manage your service as it scales. Don’t even think about local accounts; integrate with Active Directory from the off.
The level of resilience depends on a few factors, in particular the budget you have available and the service tier you class your service as. Obviously a Tier 1 service would have a resilient backup server, with the application able to fail over seamlessly and quickly, ideally with no interruption to the service for the end user. In the case of our initial Tableau Server deployment we did not configure a failover server as the service was classed as Tier 3. That meant that if we had an outage, management had accepted (and signed off) that Tableau would be out of action until it was resolved. We made this clear to users as they were onboarded (more about that in part 4) in order to head off any complaints. As our service evolved and grew, the usage of Tableau became global and critical to the operations of some business units. Because of this we upped the service to Tier 2 and implemented a failover server to allow the service to continue if we had an issue. Unlike many newly commissioned services, we didn’t sign up to any SLAs, so there was no pressure to provide a four or five nines level of availability, something that is often required by the business. Tableau provides decent docs on configuring your environment for failover and we found it a pretty easy setup process.
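To put the "four or five nines" figure in context, here is a quick back-of-the-envelope sketch of how much downtime per year each target allows. It is plain arithmetic rather than anything from Tableau, and the function name is just one I made up for illustration.

```python
# Illustrative arithmetic only: yearly downtime budget for an availability target.
MINUTES_PER_YEAR = 365 * 24 * 60  # 525,600 minutes; leap years ignored


def allowed_downtime_minutes(availability_percent: float) -> float:
    """Return the downtime allowed per year, in minutes, for a given availability."""
    if not 0 <= availability_percent <= 100:
        raise ValueError("availability must be a percentage between 0 and 100")
    return MINUTES_PER_YEAR * (1 - availability_percent / 100)


if __name__ == "__main__":
    targets = [("three nines", 99.9), ("four nines", 99.99), ("five nines", 99.999)]
    for label, target in targets:
        minutes = allowed_downtime_minutes(target)
        print(f"{label} ({target}%): {minutes:.1f} minutes per year")
```

Four nines works out at under an hour of downtime a year and five nines at a little over five minutes, which is why I was glad not to be held to either without a properly resilient setup.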
We’ve got big local disks in our Tableau server machines, easily expandable because we are in a virtual machine environment. The key takeaway from this is that you can’t really predict the disk space requirements of Tableau. You might be fine for ages and then have one user upload a monster extract that fills the entire disk. We tried to hammer this home to the users in terms of best practice and education during the onboarding stage. More about onboarding in part 4 of this series.
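Since a single monster extract can fill the disk without warning, a simple free-space check is worth having alongside the user education. Below is a minimal sketch using Python's standard library. The 20% warning threshold, the function names and the path being checked are all my own assumptions for illustration, not Tableau settings.

```python
import shutil


def disk_status(total_bytes: int, free_bytes: int, warn_free_pct: float = 20.0) -> str:
    """Return "WARN" if free space is below the threshold percentage, else "OK"."""
    if total_bytes <= 0:
        raise ValueError("total_bytes must be positive")
    free_pct = 100 * free_bytes / total_bytes
    return "WARN" if free_pct < warn_free_pct else "OK"


def check_path(path: str, warn_free_pct: float = 20.0) -> str:
    """Check the disk that holds `path` (for example the extract storage drive)."""
    usage = shutil.disk_usage(path)  # named tuple with total, used and free bytes
    return disk_status(usage.total, usage.free, warn_free_pct)


if __name__ == "__main__":
    # "." is a placeholder; point this at whichever drive holds your extracts.
    print(check_path("."))
```

Run on a schedule, something like this would at least give the support team a warning before the disk is completely full.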
We ran both servers in the same locale. This was down to ease of setup and a necessity to get the project completed (and the service live) as quickly as possible. On reflection we should have considered the location of our potential user base and the fact that adoption of the application was likely to be fast and viral. We certainly underestimated the impact and popularity that Tableau would have. Configuring a system globally has a number of specific challenges.
- Time – It’s usually easy to get something delivered in your immediate locale. You know the people, you know the process and you can easily wander over to someone’s desk and encourage them to work on your request ahead of other work. If there are any problems then it’s easy to escalate and if someone doesn’t understand your requirements it’s easy to have that face to face discussion to clarify. That’s not always the case when you’re trying to get something installed in a datacenter thousands of miles away from where you are. Processes invariably take longer, and it’s harder to expedite with the personal touch when it’s often someone you’ve never spoken to before that picks up your ticket.
- Consistency of process – One of my pet hates this. In my organisation I have deployed several global services over the last 4 years. Recently I deployed an enterprise monitoring tool across multiple business areas. We had infrastructure and application masters installed in New York, London, Toronto, Hong Kong, Sydney & Tokyo. Typical tasks involved in the setup were commissioning of the servers, setup of user accounts, filesystem configuration and then installation of the application. Each of those tasks involves a different request process to be carried out by different teams. That’s fine, just needs a bit of organisation on my part. However, the ludicrous fact that each of those processes also varied per region added even more red tape to the process. Unbelievably irritating to the end user to see such a failure to globalize common tasks.
IT management often speak about wanting to “appear like McDonald’s” to the end user. You know, you order a Big Mac (Server) in Tokyo and it’s the same process and product as in London or anywhere else. That’s what they always talk about, and in reality it’s the absolute polar opposite experience for the user. We completed the monitoring project in 8 months. I think we could have halved that if the setup process had been globally consistent; it really was that much of a handicap.
- Consistency of technology – This is kind of related to the point above. In my case the standard Windows or Linux VM offerings that you can choose from vary significantly in each region. Same for the available databases (Oracle in New York, Sybase in London), and same for hardware model and numerous other components of the technology stack. Again, adds to delay, inconsistency and user frustration.
- Support – Again similar to the two points above. Some support teams are merged (e.g. Server teams supporting both Windows and Linux with the same process) or they could be completely separate in another region. Some regions might offer 24×7 support, some might not. The monitoring of the infrastructure might also be different (varying disk space thresholds and partitions) and the alerts generated might be handled in contrasting ways. It’s a real throw of the dice in terms of the user experience.
As you can tell the subject of globalisation is one that is a key frustration of mine. In terms of technology and process it’s not that hard to do – but tends to always fall down due to politics, agendas and red tape. It’s disappointing that the ones that suffer are the users and the support teams.
Run As User
This is important. By default Tableau installs and runs as “NT AUTHORITY\NetworkService”. The application will run fine as this user, but we found that the server was not able to access some data sources. In particular we were unable to access UNC paths to Windows SharePoint environments to monitor Excel spreadsheets. This is a commonly used data source in IT and it was critical that Tableau was able to point to them. To allow that we modified the application configuration to run as a domain user, in our case DOMAIN\tableau. That meant Tableau could see the data sources via UNC paths, provided the SharePoint site was permissioned to give DOMAIN\tableau read access. Tableau make this recommendation in their documentation.
Alerts and Subscriptions
We set up Tableau 8 to send email alerts to our support team (me initially) in the event of a server health issue or extract refresh failure. This setting is useful but needs some refining, as it doesn’t seem possible to send emails to different audiences depending on the workbook. I’m not keen on getting woken up by a failed refresh of a poorly configured extract in one of the user workbooks. Maybe that will come in a future release. For now we just absorbed that headache.
Tableau Server offers a neat feature to control the level of server caching that occurs. I can imagine this being an important tuning point in large deployments. In my case we simply selected “Balanced” and cache for no longer than 1 minute. We’d keep an eye on performance and revisit that setting if required.
See Part 2 for recommendations on this.
Our server was V7 so I didn’t get a chance to try this out initially but will get around to it in the next few weeks. I don’t really need to say too much as the brilliant Russell Christopher has covered it all perfectly. Thanks loads @russch. If you’re not following his blog then you’re missing out. I’m envisioning using history tables and other queries into the Tableau postgres database to produce a live dashboard of the health of the system. User logons and usage patterns, extract refreshes, hot queries and other metrics all up there for managers and support teams to see. Perfect. Looking forward to having a blast at it.
Some things that we either didn’t consider or chose not to implement.
- Load balancer – Our initial service offering didn’t really need this but it’s something I’d certainly consider in the future as the service grows. Some details from the vendor here.
A few things that I’d love to see in future versions of Tableau.
- Version control – If I’m building an enterprise service I’ll always look for version control. The recent monitoring project I refer to used Subversion to automatically check in configuration files after each modification. It adds that extra level of safety and trackability to changes involving the user xml files. I’d like to see some form of in-built version control in Tableau for those (luckily very rare) occasions where users delete their workbooks not only from the server but also from their local disk.
- Simpler permissioning of workbooks & projects – Maybe you’re all totally happy with this aspect of Tableau Server? I certainly don’t hear many complaints. Maybe it’s just me but I find permissioning awkward and difficult to keep a proper track of. I’ve ended up just leaving it to the users to permission as they see fit when they publish.
- Alerting on extract refreshes – 8.1 is a lot better in this respect with the emailing feature, but it would be nice to see some form of integration with formal alert protocols such as SNMP. I’d also like the ability to configure variable alert destinations based on workbook ownership and other context. It might be a good idea for Tableau to not spend much time on trying to be an alerting tool and instead just dump everything to a log file (with guidance on what patterns to monitor) for integration with a more enterprise-class monitoring application like Geneos.
- Dedicated Sybase connector – I know you can connect to Sybase using the generic ODBC connector in V7, but it seemed Sybase was being ignored a little when it came to the portfolio of connectors. This is no longer an issue now that support has been added in V8.
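To make the "dump everything to a log file and monitor for patterns" idea from the wish list a bit more concrete, here is a rough sketch of the kind of scan a monitoring tool would perform. Everything in it is hypothetical: the log line wording, the pattern names and the regular expressions are invented for illustration. They are not real Tableau log formats, which is exactly the guidance I would want the vendor to publish.

```python
import re
from typing import Iterable, List, Tuple

# Hypothetical patterns: real log wording would need to come from vendor guidance.
ALERT_PATTERNS = [
    ("extract_refresh_failed", re.compile(r"extract refresh.*failed", re.IGNORECASE)),
    ("disk_space_low", re.compile(r"disk space.*low", re.IGNORECASE)),
]


def scan_lines(lines: Iterable[str]) -> List[Tuple[int, str, str]]:
    """Return (line_number, alert_name, line) for every line matching a pattern."""
    hits = []
    for number, line in enumerate(lines, start=1):
        for name, pattern in ALERT_PATTERNS:
            if pattern.search(line):
                hits.append((number, name, line.rstrip("\n")))
                break  # report each line once, under the first pattern it matches
    return hits


if __name__ == "__main__":
    # Made-up sample lines, purely to exercise the scan.
    sample = [
        "INFO Extract refresh started for workbook 'Sales'",
        "ERROR Extract refresh for workbook 'Sales' failed: timeout",
        "WARN Disk space on extract volume is low",
    ]
    for number, name, line in scan_lines(sample):
        print(f"line {number}: {name}: {line}")
```

Forwarding the hits to SNMP or a tool like Geneos is deliberately left out. The point is only that once the patterns are documented, the matching itself is trivial and can live in whatever monitoring application you already run.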
So that’s how we configured our Tableau Server. There’s no one right way to do this and part of the fun is responding to feedback on the service and evolving the offering. You’re not gonna get it right first time. As discussed in part 1 of this series I brought Tableau into the organisation as a small-scale side project and it expanded with popularity. If you’re lucky to have a signed cheque and a go from your managers to implement a service from the start then the approach may be different to mine but ultimately the same considerations apply. This is only part of the journey though, there’s a lot to consider in terms of the actual service you provide to the user base and that will be the subject of the next part in this series.
Next up is Part 4 – Key Operational Considerations