Hi all and welcome to Part 4 of the seemingly never-ending story called Tableau as an IT Service. Lot of good chat and engagement on this from the community, thanks in particular to Paul Chapman (@cheeky_chappie), Matt Francis (@matt_francis), Kelly Martin (@vizcandykelly), Robin Kennedy (@odbin) and Francois Ajenstat (@ajenstat) for their interest and support.
Must admit I didn’t realize I’d be writing this much on the subject when I first thought of the idea but a number of people have asked for my opinions on a few extra areas so I’m trying to cover them wherever possible. It’s certainly a big subject and one that if done incorrectly can lead to even great tools like Tableau being seen negatively by your user community. So it’s important to get it right.
So far in this series I’ve covered the following aspects:
- Project Initiation
- Infrastructure Setup
- Tableau Server Configuration
Those are all well and good but a robust IT service will also have its operational ducks in a row, and that’s the subject of…
Tableau as an IT Service – Part 4 – Operational Considerations
What I mean by “operational” is how your service is performing when you’ve given the green light to your users and people are now actively using it. There are a number of subjects to consider in this space, some of which can make or break an IT service, irrespective of the application involved.
I didn’t really know what to call this category. What I mean is that every new service will need to be registered in a number of other IT systems for it to be fully integrated into your organisation. For example:
- Payment & purchasing systems – So that accounts and procurement teams can find the vendor details in their directory and ensure payments get through to the right department.
- Application directories – So that you can accurately track change to your environment and your application is correctly reported on to the business. It will probably be given an “application code” or something cryptic.
- Business Continuity systems – Required by BCP and control teams to ensure you tick the box for having appropriate control and disaster recovery functions and that your systems are involved in any testing events.
- Information Security systems – Similar to the above but with regards to InfoSec checks and records.
- Architecture & application strategy – Make sure that the appropriate architects and strategy departments are bought into Tableau as a strategic tool for the organisation. If they don’t know about it you might find them recommending another tool instead and then you’ve got a fight on.
- Appropriate contact lists – Make sure your service is listed in any emergency procedures and contact directories. If something fails and users are impacted then they need to know who to call.
This all ensures that when you enter “Tableau” into one or more of these systems to perform task X, you get the appropriate details rather than “Unknown application”. Some of these systems may be global or have regional variations.
Now that our user count is growing and the server usage also increasing, we are looking to implement a recharge model. Chargeback is commonly used by IT departments as a way of charging business units for their utilization of IT systems.
There are a number of advantages for IT departments.
- Cost visibility – With a solid chargeback model a service owner can easily report the usage of a system or service back to senior management, with an associated cost value. This can be used to better understand the overall cost picture of IT in an organisation.
- User control – If users have to pay for their level of usage of a system then they are much more likely to use the system responsibly and only use what they actually need. That way service resources are more efficiently utilised. This can also be problematic if the chargeback model is flawed (see disadvantages below).
- Service perception – A service with a clear chargeback model is more likely to be adopted as a longer term IT strategy by senior management. Being able to demonstrate financial transparency and clear understanding of how much a service costs to run, and how that cost will change with service expansion is manna for senior management and budget controllers.
And some disadvantages
- Reporting & admin – Every model will force users or service owners to regularly and accurately report their usage to management. This can take a while and can rapidly become onerous.
- Abuse – Chargeback undoubtedly forces users to change their behaviour. But get the model wrong and you can suffer. I’ve seen cases where job scheduling tools are charged per job, rather than per server or user and that has led to user teams consolidating huge batches of jobs down to a small number of commands, executed by one job group that forks off hundreds of other commands. Saves money, yes – but extremely bad IT practice and ultimately bad for the system health. But if their goal is to save money you can understand how that can happen.
- Shared components – How to charge back on components that are shared between multiple teams or applications? Can get tricky. Usually a balance is implemented to reduce admin overhead but this can mean one team paying more or less than they should.
- Adoption – Users are more likely to give your service a try if it is free. Simple. And if your chargeback model is flawed then that could mean your service doesn’t fly at all.
- Power users – With any service you’ll get your power users. The ones that push you for the upgrades and test the functionality to the limit. If they’re making you spend more on the service for functionality that other teams don’t require then there’s sometimes a case for applying a tiered model to charge them more. You want it? Then you have to pay for it. It’s usually a good idea to keep power users happy as they’ll sell your service for you and help to realise the business benefit.
As stated, we don’t actually have a chargeback model for Tableau yet. It’s currently under discussion. My initial thoughts are to charge according to either disk space used on the system or by user accounts. I’d be interested to hear how the community is doing this. Let me know what’s been working (or not) for you.
For other services that we look after we tend to charge back on the number of user accounts on the system associated with a particular business unit. Or a flat charge per server using the service. This was done with ease of admin in mind and I think it could do with a review as some teams are paying more than they should and others are getting a bit of a free ride.
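To make the per-user option concrete, here is a minimal sketch of splitting a service cost across business units in proportion to their user accounts. The function name, the business unit names and the cost figure are all made up for illustration; this is not a model we actually run.

```python
from decimal import Decimal, ROUND_HALF_UP


def allocate_by_users(total_cost, users_by_unit):
    """Split total_cost across business units in proportion to user accounts.

    total_cost    -- cost of the service for the period (hypothetical figure)
    users_by_unit -- dict of business unit name -> number of user accounts
    """
    total_users = sum(users_by_unit.values())
    if total_users == 0:
        raise ValueError("no user accounts to charge against")

    total = Decimal(str(total_cost))
    charges = {}
    for unit, count in users_by_unit.items():
        share = total * count / total_users
        # Round each charge to two decimal places for billing.
        charges[unit] = share.quantize(Decimal("0.01"), rounding=ROUND_HALF_UP)
    return charges


# Hypothetical example: 12,000 per month shared by three business units.
charges = allocate_by_users(12000, {"Equities": 30, "Risk": 15, "Ops": 5})
for unit, amount in charges.items():
    print(unit, amount)
```

Because each charge is rounded separately, the total can be off by a penny or so with awkward numbers. Swapping the user counts for disk space used per unit gives the other option mentioned above with the same arithmetic.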
So how do you get smoothly and seamlessly from “Hi Paul – I really want some of that Tableau action” to “Cheers Paul – I love it, thank you for bringing Tableau into my life”?
Here’s what you need to consider in your onboarding process.
- Speed – It needs to be fast fast fast. I work in an area characterized by agility and short time to market for pretty much everything and my users don’t react well to delays. Anything over a couple of weeks and you’ll risk frustrating your users.
- Clarity – You’ll need proper documentation and ideally a flow chart of the process. Any costs, especially recurring maintenance costs need to be fully understood. If your systems can provide a link or report for the user of where the request process is at any one time then even better.
- Ease – Ideally it will be one request via ticket or email to the initiating team. Users hate having to revisit a process and often forget to do so, meaning that the whole request process gets held up. There’s nothing worse than users having to fill out some obscure or unnecessary web form with loads of mandatory fields.
Our process goes like this
- Initial request – Our users start with an initial email to a helpdesk team who then raise a ticket to track the request. Our documentation states that the email must include approval for the desktop licence cost from their manager. This can be a problem in most onboarding processes as chasing a manager to approve a spend can be a real pain. That’s why we leave that up to the user so that the helpdesk team don’t waste their time tracking down the approval.
- Ordering – Once the helpdesk get the approved request then they place an order in our ordering system. We use a tool provided by Ariba. That goes off to Tableau and payment gets made using the transit or billing code provided in the request, and the helpdesk get a licence key which is passed on to the user.
- Install – Helpdesk then send a request to the desktop installation team, who deploy our packaged version of Tableau Desktop to the user workstation.
- Server – If the user requires a server licence then the helpdesk pass the ticket onto my team, and we add their account to server.
That whole process usually takes about a week, which is generally fine for all but the most demanding of users. Crucially it is only one email that the user has to submit to helpdesk so it’s admin-lite from the client perspective. They love that.
So why is onboarding so important? One of our other enterprise services hasn’t been set up so well and has a particularly poor onboarding process which can take several months. This means that by the time the user gets access to the system they are already pissed off with the whole thing and this means they have a very low tolerance for any issues or problems with the service. Because we get users onto Tableau quickly and smoothly they are happy and keen to get stuck in from the start – and if they do get any issues it means they’ll cut you some slack rather than escalating to your boss.
Most enterprises will have a central mechanism for deploying software. Typically there will be one for the UNIX server environment (Opsware), one for the Windows server environment (SCCM) and one for the Desktop environment, although SCCM covers both servers and desktop in our case. There are a few steps involved in getting a version of Tableau Desktop (or other application) to the point where the deployment team can hit the button.
- Packaging – Firstly I’ll download the new version from the Tableau website and then send it to the Packaging team. They’ll form it into an msi package and place it on a repository for me to access for testing.
- Testing – I’ll then have to test the package. Typically I’ll do this on my own desktop or one of our shared machines. Always a good idea to document this process as you’ll be officially signing off that it works so you may need to evidence that.
- Staging – Once the testing has been signed off as successful the packaging team will stage the package on one of their “deployment points”. That could be anywhere, the location isn’t important although if your organisation has issues with technology consistency (see part 2 of this series) then you might find a platform team in one region doesn’t have access to deployment points in another region.
- Deployment – The deployment team should be able to use their admin console to easily and automatically deploy the software to the machine(s) of your choice. The requester will get a report back as to the success / failure of each attempt.
Unfortunately even with a packaging process Tableau is a little harder than most applications to distribute due to the frequency of updates. See the “Upgrades” section below for more detail.
Tableau actually make this quite difficult for an enterprise scale environment. As you know they have an active release cycle of approx 1 month per minor release. Now that’s ace for the average user, who can go and grab the latest desktop whenever they want. But not so good for the service owner. As stated earlier we like to package up any application and make it available through the official enterprise distribution channels so that admin teams can control distribution and upgrades. Unfortunately such an aggressive release cycle means that we just don’t have the time to package each release and make it available for deployment as the process can take weeks. So we tend to re-package our desktop version once a quarter, or if the update has a critical bug fix etc. It’s probably ok if your organisation has an agile packaging method / team but in my place we just didn’t have the resource to keep up with Tableau’s release cycle.
Unfortunately there is also nothing stopping users from upgrading their own desktop application without the knowledge of the service owner. I’ve been contacted by users asking why they are suddenly unable to publish to server, and the answer is because they have upgraded their desktop to server version +1. Luckily Tableau desktop installations can co-exist happily so long as you’re aware what version you’re saving a workbook with. The same challenges exist with Tableau Server upgrades, although these can be even more time-consuming to implement.
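A helpdesk could catch the "desktop is ahead of server" situation with a simple version comparison before the user tries to publish. The sketch below assumes only the major and minor numbers matter for compatibility, which you should confirm for your own versions. The function names and version strings are hypothetical.

```python
def parse_version(text):
    """Turn a version string such as '8.1.4' into a (major, minor) tuple."""
    parts = text.strip().split(".")
    major = int(parts[0])
    minor = int(parts[1]) if len(parts) > 1 else 0
    return (major, minor)


def desktop_ahead_of_server(desktop, server):
    """Return True if the desktop release is newer than the server release.

    Assumption: only major.minor is compared; maintenance releases are ignored.
    """
    return parse_version(desktop) > parse_version(server)


# Hypothetical versions: the user has upgraded desktop past the server.
if desktop_ahead_of_server("8.1.2", "8.0.5"):
    print("Desktop is newer than server - publishing is likely to fail")
```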
All this means we tend to hold back on upgrades unless there’s an important bug fix or significant functionality jump.
Support of the Service
Obviously your service will be supported. But that support model can vary in a number of ways, not least dependent on the amount of resources that your organisation is able to throw at it. The chances are that whatever team supports your Tableau service will also have to support other applications as well. There are a number of key considerations in this area.
- Service Tier – First off, what tier of service are you offering? If it’s a lower tier, say 3, then support will most likely be during business hours only, with no on-call service to fix issues in the middle of the night or at weekends. Anything higher and you’ll have that in place. If the business is demanding the service be a higher tier then they’ll have to provide you the resources to deliver that service. There are occasions where that doesn’t happen and it’s a recipe for disaster.
- Team setup – In your support team you’ll obviously need at least one Tableau subject matter expert (SME). If you have more people then one of the first things to do is make sure that knowledge gets transferred. The others don’t need to be experts but they do need to be able to address issues and keep the system going when the SME is absent. This all sounds obvious but I’ve seen many a team that is left floundering with a huge skills gap when someone leaves or is off sick.
- Documentation – There’s no excuse for not having this in place. Key admin tasks and a list of the common issues and remediation steps should be easily accessible on a SharePoint or Confluence page. You’ll be amazed how even an expert’s mind can go blank when faced with an incident. Your docs should also detail the environment and the appropriate governance. If you’ve done the right things in terms of project initiation (part 1 of this series) then you’ll be ok here.
- Alert Processing – When your monitoring system generates an alert how does that alert reach a support person? It could ping an email to a support Blackberry carried by the on-call technician, or it could go to a central Network Operations Center (NOC) who would view the alert on a console such as Netcool Omnibus or HP Openview. They would then call out the appropriate technical resource or conduct some initial remediation actions, thus potentially avoiding the need to wake you up. If you are using a NOC then you’d better ensure your alerts are correctly raised, free of junk alerts, as that team will be calling you up if they get one, regardless of how minor it may seem.
- Support Flow – Just like the onboarding flow, you’ll get some kudos from your users if you can give them a flow chart that documents exactly what happens from the moment an issue is detected to the resolution.
- Budget – Make sure you’re aware of any ongoing costs to your service and think well ahead. If you think you’re going to need some more resources either in terms of tech or people then GET IT IN YOUR BUDGET PLAN as soon as you can. Most budget cycles operate over a year in advance and managers hate any unplanned costs appearing when all the numbers have been signed off. Even if you might not need the spend then it’s best to plan for it as it can always be removed.
Luck plays a part in the success of your service support. Some applications are more prone to issues than others, some take more maintenance and some are just plain flaky. Luckily Tableau doesn’t seem to suffer from any of those problems, giving your team a chance to concentrate on helping the users make the best of their experience.
Tableau make a number of recommendations in their documentation on how to improve performance if you have an issue. That’s fine, but you want to be addressing performance before you have an issue. That’s the job of your application monitoring tool (more details in part 2 of this series).
In my experience, poor dashboard design and inefficient queries are more of a problem than actual issues on the server side.
While you’ll get an idea of CPU and RAM usage trends I don’t see any way that Tableau can proactively alert if extracts are taking longer than expected or if queries are running slower than usual. Ping me if you know of a way. I’d like to be able to specify the expected time of a query or extract refresh or ideally have the system work that out based on historical information and alert me if a query or extract has breached the usual threshold.
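The "work it out from history" idea can be sketched independently of Tableau: given past durations for a query or extract refresh, flag the latest run if it sits well above the usual range. How you collect those durations is left open here. The function name, the three-standard-deviation rule and the sample numbers are my own assumptions, not anything Tableau provides.

```python
from statistics import mean, stdev


def breaches_baseline(history, latest, sigmas=3.0, min_samples=5):
    """Return True if `latest` is unusually slow compared with `history`.

    history     -- past durations in seconds for the same refresh or query
    latest      -- duration of the most recent run, in seconds
    sigmas      -- how many standard deviations above the mean counts as a breach
    min_samples -- below this many data points we don't judge at all
    """
    if len(history) < min_samples:
        return False
    threshold = mean(history) + sigmas * stdev(history)
    return latest > threshold


# Hypothetical refresh durations (seconds) for one extract.
past_runs = [300, 310, 295, 305, 290]
print(breaches_baseline(past_runs, 600))  # twice the usual time
print(breaches_baseline(past_runs, 315))  # within the normal range
```

A fixed "expected time" per extract is simpler still: replace the computed threshold with a number you specify.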
This is another tricky subject and one that we haven’t mastered yet in my service. From what I can see there’s little to stop users uploading massive extracts to your system and filling up the disk unless you have a checkpoint in your service model where publication of user workbooks has to go through a central team for approval and analysis in terms of best practices. My team doesn’t have the resource for that kind of setup, you may be more fortunate.
Any monitoring system deployed on the Tableau server (see part 2) would be able to detect when the disk space thresholds are breached on a machine. That may give support teams enough time to prevent the disk filling up.
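If you don’t have a monitoring agent to hand, the basic disk check is easy to sketch with the Python standard library. The 80% and 90% thresholds and the function names below are illustrative assumptions. A real monitoring tool would do this for you and raise the alert through the proper channel.

```python
import shutil


def percent_used(path):
    """Return the percentage of disk space used on the volume holding `path`."""
    usage = shutil.disk_usage(path)
    return 100.0 * usage.used / usage.total


def disk_status(percent, warn=80.0, critical=90.0):
    """Classify a usage percentage against warning and critical thresholds."""
    if percent >= critical:
        return "critical"
    if percent >= warn:
        return "warning"
    return "ok"


# Check the volume that holds the current directory.
used = percent_used(".")
print(f"{used:.1f}% used: {disk_status(used)}")
```

Point `percent_used` at whichever volume holds your Tableau Server data. A warning level set well below the critical one is what buys the support team time to react.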
Interested to hear the opinion of the community on how best to manage resource capacity for Tableau server. I think the best way is probably to make it a procedural checkpoint as described above. That obviously takes resources.
In terms of the infrastructure, your platform and storage teams should have capacity management tools and strategies for ensuring the availability of their infrastructure as the environment grows. It’s not easy to proactively keep on top of capacity as there are so many factors involved, but if your infrastructure teams have solid Key Performance Indicators (KPIs) then they may even be able to use Tableau to visualise the trends and head off any issues well before they happen. That’s what we do at my place.
So what happens when it all goes wrong? You’re gonna get incidents and outages so expect it. And the chances are your users will also understand that systems do break occasionally. What the service owner needs to demonstrate is that they have a good grasp of the impact of the issue and that the resolution steps are clear and pursued as efficiently as possible. You’ll also need to make sure that any follow on actions are performed to avoid a repeat of the same issue. Repeat problems don’t go down well.
Most enterprises will have a dedicated team to handle incident management although depending on the severity of the problem they may not be involved in your issue. So there’s a good chance that you’ll end up managing the incident yourself. Here are some things you’ll need to consider.
- Logging – Make sure the first thing you do is log a ticket in your incident management system and ensure your manager is aware. This ticket should clearly indicate the issue, the impact and the steps taken so far. The ticket system *should* send a message to your users but you may need to ping them a note anyway.
- Incident manager – Ensure someone is in charge of the incident and managing it.
- Fix the issue – Get the appropriate technical teams on a call and get to the root of the problem. The first priority is to get the system up and running again. If you need to log a ticket with the vendor then do that immediately, sometimes it can take an hour or so for them to get back to you depending on your level of support cover.
- Close out – Once you’re back in the game get the communications out to your users and management.
- Follow on – If it’s a serious incident then you may have a “Post Incident Review” with key managers in the following days. This will typically detail the timeline of the incident, root cause and any follow-up actions that are required such as patching etc.
You’re going to get issues so don’t kid yourself that you won’t. Even though outages aren’t great, a service owner can often emerge with a lot of kudos for efficiently managing an incident through to resolution.
You have to get this right. Most IT incidents occur as a result of poor change management processes or failure to adhere to such controls. Most enterprises will class failing to follow change process as a disciplinary offence, and I’ve seen it cost people their jobs so whatever you do don’t take any shortcuts, it’s just not worth it.
Your incident management team will probably look after change management across the firm or your business unit so it will be a case of adhering to their guidelines. Again the approach will depend on the tier of your service, but you’ll probably be scheduling major changes for weekends to minimise the risk if something goes wrong.
One of the key considerations is communication. It’s a good idea to place your power users on the approval chain of the change so they need to sign off that they’re ok with it. If you’ve got a change coming up then make your users aware as early as possible, then send a reminder a week or so from the change date and the day before. Also send them the results of the change (even if it failed) so they’re totally up to speed. This will also inspire confidence in your service.
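The notification timing described above (announce early, remind a week before, remind the day before) is simple enough to work out by script. This is just a sketch; the function name and dates are hypothetical, and the results mail after the change isn’t covered.

```python
from datetime import date, timedelta


def notification_dates(change_date, announced_on):
    """Return the dates on which users should hear about an upcoming change.

    Includes the initial announcement, a reminder one week before and a
    reminder the day before. Reminders that would fall before the
    announcement are dropped.
    """
    candidates = [
        announced_on,
        change_date - timedelta(days=7),
        change_date - timedelta(days=1),
    ]
    # Use a set to remove duplicates, keep only dates in the valid window.
    return sorted({d for d in candidates if announced_on <= d < change_date})


# Hypothetical change on 15 June, announced on 20 May.
for d in notification_dates(date(2024, 6, 15), date(2024, 5, 20)):
    print(d.isoformat())
```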
Change management can be a real pain due to the process and paperwork involved. But it’s a critical aspect of maintaining a solid IT service, perhaps the single most critical item.
Keep in touch with your account manager. It’s actually their responsibility to keep the relationship going and a good account manager will be reaching out to you regularly. Don’t be scared to put the pressure on if there are any issues or you feel that you’re not getting the right level of service. It’s their job to keep you sweet. I’ve had good experience with Sarah Bedwell (nee Henselman) in the London office who has been great. Good luck for your move back to the USA Sarah!
Other services in my organisation have really suffered from poor vendor relationships so it really is a critical part of maintaining your robust service.
So you love your service and you want to get as many people using it as you can. That’s obvious. But it’s not as simple as speaking to people and doing demos. You need to manage the perception of your service and crucially see your engagements through to completion.
Let me give you an example. When I was in full Tableau evangelist mode I was trying to onboard loads of people. So I set a demo up with a particular team and it went really well. They loved it and before I knew it I had requests for individual sessions and help to get started. Unfortunately I did all this at a time when I was mega busy with other work and as a result I wasn’t able to give them as much time as I would have normally. This resulted in the initial enthusiasm dying off and I subsequently found out their manager was giving negative feedback upwards about the whole Tableau service as they felt they had been neglected after being promised so much. That was my fault for choosing the wrong time to approach them. I should have waited until I had the bandwidth to see it through for them. Obviously when I got wind of that perception I managed to circle back to the team and finish the job I started but the damage was done.
Make sure your team is up to date on vendor training, ideally get some of the official certifications that Tableau offer. Not only does this help in terms of technical expertise but crucially it gives confidence to your users that they can rely on your team for true support and engineering (unless the service has a separate engineering team). I’ve seen some cases where a support team doesn’t keep their skills up and before long the users know more about the application than the so called experts. That’s pretty embarrassing.
I’m referring to your internal community of Tableau users here. Here are some tips to help spark that sense of community in your organisation:
- Power users – Identify your key users quickly and build a personal relationship.
- Lunch & Learn – Have regular get togethers, share how you’re using the tool and any issues.
- External events – Attend as many roadshows as you can. Bring your power users along.
- Demos – Tableau are more than happy to come in and do demos. That’s a really good way to boost engagement and make the users feel like they are dealing with a vendor that actually cares.
- Social page – If you’ve got an internal Facebook-style site then spin up a community page. Post your issues, changes, events and news on it regularly. Encourage everyone to contribute.
- Gamification – Do something cool and fun like Andy Kriebel’s “Viz Cup” at Facebook, a fantastic off-the-wall idea that I’m sure will be repeated. The subject of gamification is one that’s getting a lot of chat at the moment.
- Merch – Get some freebies from the vendor. I dish out Tableau badges to my users for a laugh. They seem to like it.
- Evangelise – If you love Tableau, talk about Tableau. It’s not like Fight Club. I don’t shut up about it. Coffee points, meetings, lunch. I’ve had people give it a go just to get me to be quiet. Then they are hooked. Mwuhhahaha! Make sure you’re able to see your promises through though.
Ok that’s it. If you’ve considered everything in this series of posts then I can pretty much guarantee you’ve got a solid foundation for your enterprise service. There are many different variations of these processes depending on your organisation and there’s really no one right way of doing it. For (a lot) more detail on how to run an IT service check out the Information Technology Infrastructure Library (ITIL) methodology. This framework underpins almost all of what I’ve spoken about in this series.
Happy to expand on any of this. Grab me on @paulbanoub anytime. Thanks also to the fabulous Tableau community for the encouragement I’ve had while compiling this guide.