2 minutes with… Ryan Sleeper of Evolytics

Greetings Viz fans!

Now this is an exciting one for me. I’ve been trying to get Ryan on board for ages. Anyway, after weeks of ever-increasing bribes he has finally cracked and gets the 2 Minutes With treatment. Enjoy.

VN: So who are you then and what do you do?

RS: Hi Paul – it was so great to meet you in person in Seattle! Thank you for having me on. I’m Ryan Sleeper, Director of Data Visualization & Analysis at Evolytics.

VN: Tell me about your org

RS: Evolytics is a full-service digital analytics consultancy in Kansas City, Missouri, USA. Our team does anything and everything related to digital analytics including measurement planning, web analytics implementation, testing, and optimization. I come in at the end and help the team and clients understand the data, primarily by using Tableau to help illustrate the stories in the data.

VN: How do you personally use Tableau?

RS: At work, a typical project begins with me using Tableau to do ‘discovery’ analytics. This is the phase where I don’t necessarily know what I am looking for and I am just digging in the data looking for insights related to a client’s business question. Most of this will not be seen by anybody but me. Once I have found the insights / indicators that can be used to measure the success of a client’s objective, the project moves to more of a ‘descriptive’ analytics task, where I create dashboards that help monitor our progress to goals. Occasionally, I also get the opportunity to create self-serve reporting for clients. These are essentially apps that end users can interact with to find their own stories in the data. This is more in-depth than a dashboard, but does not require the client to build anything themselves. I enjoy the challenge of designing these interactive reports with user experience in mind.

In my personal life, I enjoy trying to answer sports questions that I am curious about and sharing the results using Tableau Public.

VN: What has the impact been on your business?

RS: As a full-service analytics company, we’ve always offered reporting and analysis services, but before Tableau, they were more of an included, ‘value-add’ service. Tableau adds so much value to our reporting and analysis that we can now have engagements specifically for Tableau, whether that be training or reporting via Tableau Server.

Tableau has also helped us provide better insights for our clients by reducing the time it takes us to find them.

VN: You have been an outspoken proponent of Tableau Public – what do you like most about it?

RS: I simply would not be as far along as I am without Tableau Public. I would have far fewer community connections, no Iron Viz, no guest posts at Tableau, probably no Kansas City Tableau User Group, and the list goes on. Tableau Public is my sandbox for developing new Tableau skills that I may not necessarily have the time to risk trying at work. Tableau Public also has a built-in community that is there to provide feedback, help answer questions, and encourage you to keep working at it.

VN: What does the Tableau community mean to you and who do you learn from?

RS: I am constantly amazed by the Tableau community’s willingness to help each other. The Tableau community has played a huge role in my personal Tableau development, and not only have they taught me a great deal, but they’ve inspired me to pay it forward whenever I have the chance.

There are too many in the community to name, but I am inspired every single day by the effort, art, and selflessness that the community puts out. I look at nearly every single Viz of the Day and keep up with several blogs, including this one. Chances are if you’ve had a Viz of the Day or been on 2 Minutes, I have learned something from you.

If I had to pick one viz ‘mentor’, it would be Ben Jones of Tableau and dataremixed.com. Ben really pushed me to share my content and keep innovating when I was just getting started on Tableau Public. I also feel like I grew up in my Tableau life with Anya A’Hearn, Kelly Martin, and Ramon Martinez, all of whom I co-presented with during my first Tableau Conference presentation in 2013 and whose work I have studied for a long time.

VN: You’re a fellow TUG leader. Have you got any tips for running a successful TUG?

RS: The KC TUG is relatively new, just now closing in on our one-year anniversary, but I have learned a few things so far. My biggest tip is to keep the content non-intimidating for beginners. I have found that at least half of our attendees are just getting started with Tableau or even just evaluating whether or not they want to use Tableau. I recommend including at least one lesson at each of your meetings that your entire audience can feel like they can begin using on their own as soon as they get back to the office.

VN: Could you give me an interesting non-work fact about yourself?

RS: My wife and I really prioritize travel / experiencing different cultures in our lives, and while I am mainly an American football / basketball / baseball guy, I collect soccer (football) scarves from each country I visit. So this year’s Tableau Conference speaker gift, a #Data14 scarf in Seattle’s trademark navy and green, literally could not have been better for me. Some of my favorite scarves include FC Barcelona, Wellington Phoenix, and Morocco’s national team – who I saw at the 1994 World Cup as a boy. I’ll also be in your neck of the woods in May to pick up my first Premier League scarf.

Awesome stuff, thanks a lot Ryan. Don’t forget to give me a shout when you’re over in May – I’ll round up some London data folks and we can show you around.

Until next time… Cheers, Paul

How to Monitor Your Tableau Server – Part 2 – Tableau Server Application Monitoring

Hello there,

Following on from Part 1 of this series, here’s Part 2: how to monitor your Tableau Server application itself.

Now I don’t know Tableau Server in as much detail as some of the Jedi-level experts out there, so I’m totally open to different ways of doing things. My recommendations here are based as much on general IT service monitoring best practice as they are on Tableau specifics. If I’ve missed something then do point it out – I’m hoping the community can help me expand this article.

On that subject – I’m delighted to have been able to collaborate with Craig Bloodworth (@craigbloodworth), Mark Jackson (@ugamarkj) & Chris Schultz (@nalyticsatwork) on this. Thanks for your invaluable contributions guys.

Are we ok? That’s the ubiquitous question on an IT service manager’s mind. And it can be a real worry. But the fact is that there are a lot of tools and methods you can employ to cut down that worry and stress or even eliminate it.

 

Service Availability

Simply put, is your Tableau Server up or down? Tableau offer a “Server Status” view, but in my opinion that’s pretty useless as you’re never going to be staring at it for the whole day. I’m also not sure how quickly it updates or responds to the system activity. It never seems to change when I’m looking at it.

Figure: Tableau’s default Server ‘Monitor’

So it’s clear you’ll need something else to give you that early warning of any issues.

Figure: Tableau Server Monitor in XML

Btw you can also get this in xml output. Could be handy.
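
As a rough illustration of why the XML output is handy, here is a minimal Python sketch that parses a status document and lists any process that is not in a healthy state. The sample XML, its element and attribute names, and the set of "healthy" status values are all assumptions made for the example, so check them against what your own server actually returns.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample of the status XML. The element names, attributes and
# status values are assumptions for illustration only.
SAMPLE_STATUS_XML = """
<systeminfo>
  <machines>
    <machine name="localhost">
      <repository worker="localhost:8060" status="Active"/>
      <dataengine worker="localhost:27042" status="Active"/>
      <backgrounder worker="localhost:8250" status="Down"/>
    </machine>
  </machines>
  <service status="Active"/>
</systeminfo>
"""

# Assumed set of status values that do not need attention.
OK_STATUSES = {"Active", "Passive", "Busy"}


def find_unhealthy(status_xml, ok_statuses=OK_STATUSES):
    """Return (machine, process, status) for every process not in ok_statuses."""
    root = ET.fromstring(status_xml)
    problems = []
    for machine in root.iter("machine"):
        # Each child element of a machine is treated as one process entry.
        for process in machine:
            status = process.get("status")
            if status not in ok_statuses:
                problems.append((machine.get("name"), process.tag, status))
    return problems


for machine, process, status in find_unhealthy(SAMPLE_STATUS_XML):
    print(f"{machine}: {process} is {status}")
```

The parsing is kept separate from however you fetch the XML, so the same function can be fed from a scheduled download or a saved file.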

 

Process Monitoring (Enterprise Process)

Figure: Main Tableau Server Processes

These are the key processes (running programs) that are required for Tableau Server to function. If one of these has crashed then you’ll likely have a problem.

So referring back to Part 1, I talked about the enterprise monitoring tools used to monitor your Tableau infrastructure. Well, you should be able to use these tools to set up application monitoring too. That’s monitoring of your own application that you define (and ideally configure), producing alerts that come to you or your own support team via the enterprise process.

You should set up monitoring rules to alert on zero instances of each of these processes. The alerts need to be classed as a “Critical” severity so that they hit the alert list of the Level 1 team (non-critical alerts may not be visible). Make sure the monitoring rules apply 24 x 7.
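
Your enterprise tool will have its own rule syntax, but as a minimal sketch the logic of a "zero instances" rule boils down to something like the following. The process list and the shape of the alert record are hypothetical, chosen only to show the idea.

```python
# Hypothetical list of process names to watch. Use the ones shown for your
# own Tableau Server version.
WATCHED_PROCESSES = ["postgres", "backgrounder", "vizqlserver"]


def evaluate_rules(instance_counts, watched=WATCHED_PROCESSES):
    """Return one Critical alert per watched process with zero running instances.

    instance_counts maps a process name to the number of running instances.
    A process missing from the mapping is treated as having zero instances.
    """
    alerts = []
    for name in watched:
        if instance_counts.get(name, 0) == 0:
            alerts.append({
                "severity": "Critical",
                "process": name,
                "message": f"{name}: 0 instances running",
            })
    return alerts


# Example: backgrounder has crashed and vizqlserver was never reported.
for alert in evaluate_rules({"postgres": 1, "backgrounder": 0}):
    print(alert["severity"], alert["message"])
```

Treating a missing process as zero instances is deliberate: a process the agent cannot see at all is at least as worrying as one it sees as stopped.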

Important – Make sure that the Level 1 & 2 teams that will get these alerts know exactly what to do with them. These teams will probably have a document or Runbook that you’ll need to fill out which will give them instructions as to what the alert means and who they should call. This needs to be crystal clear as they’ll usually follow it to the letter.

Process Monitoring (Paranoid Android Process)

“I knew that alert would get lost. Don’t say I didn’t warn you.”

So even if you set up the above monitoring using the Enterprise Process, you may still have issues. That process can break, meaning that your alert may take up to 30 mins to get to you (or a lot longer!).

Therefore I always encourage being as paranoid as possible when it comes to monitoring.

Luckily there are a number of things you can do to add an extra level to your monitoring.

 

Use a Simple Script

Figure: Monitor Tableau Server without the GUI

Mike Roberts of Interworks has written a simple guide to scripting up a basic process check based on the default Tableau Server monitor XML output mentioned above. You can run that script using Windows Task Scheduler and get an email if any of the processes are detected to be down.

I don’t use that one, but I do have a very basic PowerShell script that I run using Task Scheduler every 5 mins. It does the same thing and is based on the following code.

powershell.exe -command "& {if (! (get-process -name postgres -erroraction SilentlyContinue)){Send-MailMessage -SmtpServer '' -from  -To  -Subject 'postgres.exe not running on PROD '}}"

All that does is execute in the background and if the process name (in this example postgres) is not detected by the get-process command then it sends an email to my team. Not foolproof but when combined with the enterprise process then it gives me a better level of protection.
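
If PowerShell isn’t your thing, the same idea can be sketched in Python. This is not the script described above, just an illustration under some assumptions: the process list is taken from the output of `tasklist /FO CSV /NH`, and the sample output, host name and email addresses are hypothetical placeholders. The email is built but not sent.

```python
import csv
import io
from email.message import EmailMessage

# Hypothetical sample of `tasklist /FO CSV /NH` output, used for illustration.
SAMPLE_TASKLIST = (
    '"postgres.exe","1234","Services","0","12,345 K"\n'
    '"tabsvc.exe","2345","Services","0","8,120 K"\n'
)


def running_images(tasklist_csv):
    """Parse tasklist CSV output into a set of lower-case image names."""
    rows = csv.reader(io.StringIO(tasklist_csv))
    return {row[0].lower() for row in rows if row}


def build_alert(image, host, sender, recipient):
    """Build (but do not send) the notification email for a missing process."""
    msg = EmailMessage()
    msg["From"] = sender
    msg["To"] = recipient
    msg["Subject"] = f"{image} not running on {host}"
    msg.set_content(f"No running instance of {image} was found on {host}.")
    return msg


def check(image, tasklist_csv, host, sender, recipient):
    """Return an alert email if image is absent from the process list, else None."""
    if image.lower() in running_images(tasklist_csv):
        return None
    return build_alert(image, host, sender, recipient)


# Placeholder addresses and host name; nothing is sent here.
alert = check("backgrounder.exe", SAMPLE_TASKLIST, "PROD",
              "monitor@example.com", "team@example.com")
if alert is not None:
    print(alert["Subject"])
```

To actually deliver the message you could pass it to `smtplib.SMTP(...).send_message(...)` with your own mail server, and feed `check` the real output of `tasklist` via `subprocess.run`.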

Query the processes via URL

Figure: Querying processes via URL

This is a new one on me. Apparently it is possible to query each process over HTTP and get a message back to indicate if the process is ok. That opens up a lot of options for more scripting of remote checks, or monitoring of the URLs via third-party applications. It all adds to the arsenal of monitoring available to the service owner. Many thanks to Craig Bloodworth (@craigbloodworth) of The Information Lab for this tip. You can find more details in this blog post.
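
A remote check along these lines could be sketched as below. The actual per-process URLs are in Craig’s post and are not reproduced here, so the endpoint mapping you pass in is entirely up to you, and treating "HTTP 200" as "ok" is an assumption made for this sketch.

```python
import urllib.error
import urllib.request


def http_fetch(url, timeout=5):
    """Return the HTTP status code for url, or None if the request fails."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as response:
            return response.status
    except urllib.error.HTTPError as err:
        # The server answered, but with an error status such as 500 or 503.
        return err.code
    except (urllib.error.URLError, OSError):
        # No answer at all: DNS failure, refused connection, timeout, etc.
        return None


def check_endpoints(endpoints, fetch=http_fetch):
    """Map each process name to True if its URL answered with HTTP 200.

    endpoints maps a process name to the URL to query. fetch can be swapped
    for a stub so the logic can be exercised without a live server.
    """
    return {name: fetch(url) == 200 for name, url in endpoints.items()}
```

Passing the fetch function in as a parameter keeps the check easy to test offline, and makes it simple to swap in a different HTTP client later.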

The Windows Event Log

By default Windows will log any messages or errors to the Windows Event Log. This can include system and application alerts and is a great source of data regarding system health.

Figure: Windows Event Log

Fire up the Event Viewer (usually somewhere in the Administrative Tools menu). You should see a number of categories of event on the left, from system stuff to specific application messages. Some will be informational, others downright confusing, but there will be some gold dust in there that you need to be mindful of.

For example – the image shows that Tableau Server has been restarting the backgrounder process due to a crash. That’s not critical to know about immediately but I’d sure be interested to understand if it is happening regularly.

There are ways you can export this data automatically and then create a Tableau datasource – we haven’t done that yet but are planning to.
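
As a small sketch of what you might do with such an export, the following counts restart events per day. It assumes the events have already been exported to CSV with `TimeCreated` and `Message` columns and an ISO-style timestamp; the column names, timestamp format and message text in the sample are assumptions, not the real Tableau wording.

```python
import csv
import io
from collections import Counter

# Hypothetical export; column names and message text are assumptions.
SAMPLE_EVENTS_CSV = """TimeCreated,Message
2014-10-01 09:15:00,Restarting dead component backgrounder
2014-10-01 13:40:00,Restarting dead component backgrounder
2014-10-02 08:00:00,Service started
"""


def restarts_per_day(csv_text, needle="restarting"):
    """Count events per calendar day whose message contains needle (case-insensitive)."""
    counts = Counter()
    for row in csv.DictReader(io.StringIO(csv_text)):
        if needle.lower() in row["Message"].lower():
            # The first 10 characters of an ISO-style timestamp are the date.
            counts[row["TimeCreated"][:10]] += 1
    return dict(counts)


print(restarts_per_day(SAMPLE_EVENTS_CSV))
```

A daily count like this is exactly the sort of thing that tells you whether a restart is a one-off or a regular occurrence.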

Windows Performance Monitor

Figure: Windows performance monitor data collector

You can also make use of the built-in Windows Performance Monitor to collect and export data regarding the performance of the Tableau processes on your server. We set up a collector and constructed a basic Server Health Dashboard.

Figure: Server Health Dashboard based on Windows perf mon stats

It’s a good idea to subscribe to these dashboards to get them dropped in your inbox at the start and end of your production day.

The details for setting this up are on this Tableau KB article.
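
If you want a quick sanity check on a collector’s output before building a dashboard, a sketch like the one below averages each counter in a CSV log. It assumes the collector writes CSV with the timestamp in the first column and one column per counter; the server name and counter paths in the sample are hypothetical.

```python
import csv
import io

# Hypothetical collector output: first column is the timestamp, the rest are
# counters. A blank cell means no sample was recorded for that interval.
SAMPLE_PERFMON_CSV = r'''"(PDH-CSV 4.0) (GMT Standard Time)(0)","\\PROD\Process(postgres)\% Processor Time","\\PROD\Process(backgrounder)\% Processor Time"
"10/01/2014 09:00:00.000","10.0","50.0"
"10/01/2014 09:00:15.000","30.0"," "
'''


def average_counters(perfmon_csv):
    """Return {counter name: mean value}, skipping blank or non-numeric samples."""
    reader = csv.reader(io.StringIO(perfmon_csv))
    header = next(reader)
    totals = {name: [0.0, 0] for name in header[1:]}  # running sum, sample count
    for row in reader:
        for name, cell in zip(header[1:], row[1:]):
            try:
                value = float(cell)
            except ValueError:
                continue  # blank or malformed sample
            totals[name][0] += value
            totals[name][1] += 1
    return {name: s / n for name, (s, n) in totals.items() if n}


for counter, mean in average_counters(SAMPLE_PERFMON_CSV).items():
    print(f"{counter}: {mean:.1f}")
```

Counters with no valid samples are left out of the result rather than reported as zero, so a broken collector does not look like an idle server.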

 

 

Tableau Log File Monitoring

To me the Tableau logs seem like a real mystery. There’s clearly a ton of information in there, but even the Tableau support folk don’t seem to know what’s important and what’s junk. There are also a lot of messages that seem like red herrings and some that are just plain confusing.

It’s a shame that there isn’t more clarity on which strings and messages we should pay attention to. At the moment I’m just guessing.

In terms of alerting, the enterprise monitoring tool you use will have log scraping functionality, just as it does for process monitoring. This will involve you telling the tool which text to alert on. Fairly simple. You can also write your own script in much the same way as the PowerShell process monitoring script mentioned earlier in this post.

I get really annoyed with the state of the Tableau Server logs. They’re a total mess. There are multiple locations, and there’s little consistency. I’ve not had time to analyse them properly but it seems like some entries contain one of DEBUG, INFO, ERROR or FATAL, which would give an indication of whether you should trigger an alert based on the occurrence. It doesn’t seem consistent though.

Ideally I’d like every log entry from every component to start with a timestamp, then one of these severity indicators. Would make it so easy.
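
Until then, a home-grown log scraper has to cope with the inconsistency. As a minimal sketch, the following looks for one of the four severity words anywhere in a line, counts them, and flags ERROR and FATAL lines for alerting. The sample log lines are made up, and which severities deserve an alert is an assumption you would tune.

```python
import re
from collections import Counter

SEVERITY = re.compile(r"\b(DEBUG|INFO|ERROR|FATAL)\b")
ALERT_ON = {"ERROR", "FATAL"}  # assumed choice of alert-worthy severities


def scan(lines):
    """Return (counts per severity, list of lines that should raise an alert)."""
    counts, alerts = Counter(), []
    for line in lines:
        match = SEVERITY.search(line)
        if not match:
            continue  # entries with no severity marker are ignored
        level = match.group(1)
        counts[level] += 1
        if level in ALERT_ON:
            alerts.append(line.rstrip("\n"))
    return counts, alerts


# Made-up log lines, purely for illustration.
SAMPLE_LOG = [
    "2014-10-01 09:00:01 INFO  request served\n",
    "2014-10-01 09:00:02 ERROR could not connect to repository\n",
    "some line with no severity at all\n",
    "2014-10-01 09:00:03 FATAL process terminated\n",
]

counts, alerts = scan(SAMPLE_LOG)
print(dict(counts))
for line in alerts:
    print("ALERT:", line)
```

Only the first severity word on a line is used, which keeps the count at one per entry even if a message happens to mention another level.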

 

Log analysis using Splunk

If you’ve not seen Splunk then you should take a look. It’s a great tool for aggregating and analysing masses of log file data and is in widespread use at many large enterprises. I don’t use it yet but it’s in the pipeline.

Another bit of collaboration – Chris Schultz has written a guide to using Splunk to analyse Tableau Server logs. It’s on his new blog here.

 

Monitoring Tableau Server Activity

Monthly Server Stats

A wealth of info is available from the Postgres DB

So you’ll probably know that Tableau has an internal Postgres database. You may not know that you can interrogate this database easily and pull out pure gold! It’s an absolute treasure trove of information about your server performance, usage and pretty much anything else.

I’m not going to elaborate on it here as my good friend Mark Jackson (@ugamarkj) has written a comprehensive guide on it here.

This is critical ammo for the Tableau service manager, and making these dashboards available to your user community will get you some serious brownie points, especially with senior management. Most applications don’t have the ability to provide this level of detail. Tableau does, and it’s a great feature.

Other Resources

As mentioned there are a ton of ways to do this and there are many more guides out there. Take a look at some of these links.

http://www.alansmitheepresents.org/2014/02/tableau-server-performance-monitoring.html
http://kb.tableausoftware.com/articles/knowledgebase/automation-checking-server-status
 

OK that’s it for this part. Hopefully that’s given you an idea of what is possible in terms of monitoring the Tableau Server application. If you’ve got any ideas or methods of your own, then do share!

Cheers, Paul

How to Monitor Your Tableau Server – Part 1 – Infrastructure Monitoring

Hello there,

I hope you are all well and recovered from #data14. What a great event that was.

I’m gonna get a bit serious on you now. It’s time to talk monitoring.

For a Tableau service manager (or any IT service for that matter), the worst situation that can possibly occur is getting a phone call from your users to tell you that your service is down. At best you’ll look stupid, at worst it will cost you credibility and is a sure-fire way to destroy user confidence in your service.

So how do you avoid this? You could try not having any outages – well, you can forget that, it ain’t gonna happen. You’ll get issues so get ready for them. What you can do is monitor your service big time. That way you’ll get the heads up and you can answer that phone call with a “yep we know, we’ve just raised an incident ticket and we are on it” – or better still, get to the incident and fix it before users even notice! Remember that effective incident management can actually gain you plus points from your user base, and senior management.

The problem with monitoring is that it’s BORING. I should know, I did it for 12 years! But it’s also essential! Get it right and you’ll be making your life a lot easier. It also traditionally doesn’t get a whole lot of investment thrown its way as there’s no immediate tangible business benefit.

Monitoring falls into these categories. This is likely to take me more than one post to explain and it’s a big subject so I’ll doubtless miss some bits out. As always, I’m happy to connect offline and explain.

  • Infrastructure monitoring
  • Application monitoring
  • Performance monitoring
  • Capacity monitoring
  • User experience monitoring

Infrastructure Monitoring

As the name suggests this is all about monitoring of your infrastructure. That’s your hardware and network, peripherals and components of the platform your Tableau Server application is running on.

Chances are the infrastructure will be owned by an IT team. You’ll need a great relationship with these folks, so if you haven’t got one then start buying them some doughnuts now. From what I can see Tableau is often brought into organisations by business users and that then antagonises IT, meaning this relationship isn’t always the best. That’s a separate conversation however.

 

How does infrastructure monitoring work?

Chances are your monitoring team will have decided on an enterprise monitoring tool for the whole organisation. It will probably take the form of a central server, receiving alerts from an agent that is deployed as standard on each server in the estate.

Commonly used monitoring tools include Nagios, among others. I’ve got a fondness for ITRS Geneos myself but am not going to go into the relative merits of each tool. You won’t have a choice what tool is used in your org anyway.

So what happens? Well the agent will have a set of monitoring “rules” that it adheres to. These will take the form of something like “check the disk space on partition X, every Y minutes and trigger an alert if greater than Z percentage full”. That’s all the agent does. It polls the server for process availability, disk space, memory usage etc. on a scheduled frequency and triggers an alert to the central server if the condition is breached. Those parameters should be fully configurable.
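
To make the disk space rule concrete, here is a minimal Python sketch of a single check. Real agents do this for you; the function names and the alert text are hypothetical, and the "every Y minutes" part is left to whatever scheduler runs the check.

```python
import shutil


def percent_full(path, usage=shutil.disk_usage):
    """Return how full the partition holding path is, as a percentage."""
    total, used, _free = usage(path)
    return 100.0 * used / total


def check_disk(path, threshold_pct, usage=shutil.disk_usage):
    """Return an alert string if the partition is fuller than threshold_pct, else None.

    usage defaults to shutil.disk_usage but can be replaced with a stub,
    which makes the rule testable without filling up a real disk.
    """
    pct = percent_full(path, usage)
    if pct > threshold_pct:
        return f"{path} is {pct:.1f}% full (threshold {threshold_pct}%)"
    return None


# Check the partition holding the current directory against a 90% threshold.
alert = check_disk(".", 90)
print(alert if alert else "disk space ok")
```

Here X is the `path`, Z is `threshold_pct`, and Y is simply how often you schedule the script.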

The central server will then display the alert on an event console. Alerts will be given a criticality such as minor, major or critical. The alert console will be viewed by a support team, usually an offshore Level 1 team that provides an initial triage of the alert. They may then pass it onto a Level 2 team for potential remediation, or they may also pass it on to Level 3 – the main support team. That’s the usual process in a big organisation.

So what’s the issue with that? Well there’s the time factor for one. It can sometimes take 20 – 30 mins for an alert to get to the person that matters. That’s obviously not great. Also there’s the sheer volume of alerts. A big organisation can be dealing with tens of thousands of active alerts a week, many of them junk. That increases the risk of your alert being missed. There are also a lot of break points in the process, and sometimes alerts just go missing due to lost packets, network issues etc. It happens. On the whole the process works though.

 

Who’s responsible and what for?

Your infra teams are 100% responsible for the monitoring of these components. This encompasses:

  • Server availability (ICMP ping)
  • CPU usage
  • Memory usage
  • Disk space (operating system partitions only)
  • Network throughput / availability

They’ll tell you not to worry about this. They’ll tell you that any alerts will go to their support teams and they’ll be on it should they detect an issue. My advice – don’t trust anyone. There have been many times where I’ve had an issue and lo and behold the monitoring hasn’t been configured properly, or hasn’t even been set up at all. Or there’s been a bad break in the process somewhere. That ain’t cool.

 

So what should I do?

Take these steps to keep your infra teams on their toes. They’re providing you a platform, so you are entitled to ask. They might not like it, but stick to your guns – you’ll be the one who gets it in the neck if your Tableau Server goes down.

  • Ask for a breakdown of the infra monitoring thresholds – What’s the polling cycle for alerts? What thresholds are being monitored? Who decided them and why?
  • Ask for a process flow – What happens when an alert is generated? Where does it go? How long does it take for someone to get on it? How is root cause followed up?
  • Ask to have visibility of the infra changes – If there are changes going on to the environment that might affect your server, make sure you get notified. Make sure you attend the appropriate change management meetings so you know what’s going on.
  • Ask for a regular report on server performance – There will probably be a tool on the server that logs time series data on server performance. That should be accessible to you as well as them. Chuck the data into Tableau and make it available to your users.
  • Understand the infra team SLA – It’s important to realise that you are a customer of the infra teams. Ask them for a Service Catalogue document for the service that they are providing. Understand the SLA that they’re operating to. Don’t be out-of-order, but if you find they’re not giving you good service then don’t be scared to wave the SLA.
  • Ask for a report of successful backups – Just as important as monitoring
  • Ask for the ICMP ping stats – How many packets get lost in communications with your Tableau server? How many times does it drop off the network?
  • Be nice – The infra teams in big orgs have a tough job. They’ll have no money and little resource. Cut them some slack and don’t be a prat if they let you down occasionally. It happens.
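
If the ping stats arrive as raw results rather than a summary, the two questions in that bullet can be answered with a few lines of Python. This is only a sketch: it assumes you have a list with one True or False per ping (True meaning a reply came back), and it counts a "drop" as each change from reachable to unreachable.

```python
def ping_summary(results):
    """Summarise a list of booleans (True = reply received) from repeated pings."""
    sent = len(results)
    lost = results.count(False)
    # A "drop" is a transition from reachable to unreachable.
    drops = sum(1 for prev, cur in zip(results, results[1:]) if prev and not cur)
    loss_pct = 100.0 * lost / sent if sent else 0.0
    return {"sent": sent, "lost": lost, "loss_pct": loss_pct, "drops": drops}


# Made-up sample: ten pings with two separate outages.
sample = [True, True, False, False, True, False, True, True, True, True]
print(ping_summary(sample))
```

Packet loss tells you how much was lost overall, while the drop count tells you how many separate incidents there were, which is often the more useful number to raise with the infra team.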

Start with that lot. Your users will also love it if you can make this information available to them. Again, it inspires confidence that you know what you’re doing.

OK that’s it for infrastructure monitoring. Next up I’ll dive into how you monitor your Tableau Server application.

Cheers, Paul