How to PROPERLY Back Up Your Tableau Server


Hello there,

Whadda ya mean you didn’t take a backup?

Time for another post about Tableau Server and how to get the best out of it in a large-scale, enterprise deployment situation.

Today we are focusing on how to PROPERLY back up your Tableau Server installation.

Like many aspects of enterprise services, this is a simple concept, but get it wrong and it can spell disaster. It always amazes me how many people / organisations don’t do this properly, or even at all.

You know how annoyed you get when your mum tells you she isn’t backing up all her family photos – well that’s what I get like when I see IT systems neglecting backups.

Note this post refers to a standalone Tableau installation with a manual failover to DR. We don’t yet have a clustered environment. I’ll update the post with considerations when we implement that.

 

What’s a backup?

Seems a simple question, and there are a number of different types of backups that you can take, each useful in different situations. Here’s what I’ve got in place:

 

 

Full System Backup

This is a complete dump of the server filesystems to disk (or tape – there’s still plenty of tape backup infra out there). Most likely it will be one of the big vendor products that look like the mothership from Close Encounters of the Third Kind.

Your full system backup should be set up by your server team when you get your machine. However, the principle of “trust no-one” applies here as always and it’s up to you to check the following:

  • Have the backups been set up at all?
  • Are they backing up all the filesystems? – Many times I’ve seen that only operating system partition backups have been set up, and I’ve had to request the application partitions be included.
  • Have the backups been succeeding? – Get your backup team to send you a report every month of backup completion. They don’t always succeed and you probably won’t be told that there has been a failure.
  • If you need to perform a restore, do you know the process and how long it takes?

If you get the okay on that then you’re good. But only as an insurance policy. Full system backups can take a long time to restore, and may only be weekly so you could end up losing data even if these are in place. It’s up to you to ensure you’re covered rather than rely on other teams doing things correctly.

 

 

Nightly Tableau Backup

There’s no excuse for not having this in place. It’s easy to set up and it is a case of when rather than if it saves your ass.

The tabadmin backup command gets Tableau Server to dump all content & configuration to a single .tsbak file. You don’t have to stop the server to do this and it doesn’t seem to impact performance too much while it is running so this should be the first backup you configure.

A simple script like this will do the job.

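@echo OFF
rem Nightly Tableau Server backup: dump, clean up, then move the file off the server
set "Binpath=D:\Program Files\Tableau\Tableau Server\9.0\bin"
set "Backuppath=D:\Program Files\Tableau\Backups\nightly"
echo %date% %time%: *** Housekeeping started ***

rem Run tabadmin from its bin folder in case it is not on the PATH
cd /d "%Binpath%"

rem -d appends the current date to the backup file name
tabadmin backup "%Backuppath%\ts_backup_" -d
timeout 5

rem Remove old logs and temp files
tabadmin cleanup

rem Move the new backup off the server to the share drive
move "%Backuppath%\*" \\\tableau_shr\backups\nightly\
echo %date% %time%: *** Housekeeping completed ***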

The tabadmin backup command does the actual work here, dumping everything to a file. Always a good idea to run tabadmin cleanup afterwards to remove logs etc.

We run this script at a quiet time for the server (not that there is one in my global environment). We use Windows Task Scheduler on the server, but I’d recommend a decent scheduler like Autosys, or whatever your enterprise standard is, as Windows Task Scheduler is pretty poor.

IMPORTANT: You may have noticed the move command at the end there. That takes our newly created backup file and moves it OFF THE SERVER to a share drive accessible by my backup server. Why? Well what happens if you lose the server and your backup file is on it? You may as well have no backup. So move it somewhere else.

Update – this tip actually saved my ass this week when we lost our entire VM cluster (er.. hardware team – *cough* – what’s going on??). We were able to fail over to the backup server successfully. Going forward, we will soon be implementing Tableau’s High Availability capability.
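If you want a bit more safety than a plain move, here is a minimal Python sketch of the same "get it off the server" step: copy the file to the share, compare checksums, and only then delete the local copy. It is not what my script above does; the function names (`sha256_of`, `move_verified`) are made up for illustration, and the paths you pass in are yours to choose.

```python
import hashlib
import shutil
from pathlib import Path


def sha256_of(path: Path, chunk_size: int = 1024 * 1024) -> str:
    """Hash a file in chunks so a multi-GB backup never sits in memory."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def move_verified(source: Path, dest_dir: Path) -> Path:
    """Copy source into dest_dir, verify the copy, then delete the original."""
    dest_dir.mkdir(parents=True, exist_ok=True)
    target = dest_dir / source.name
    shutil.copy2(source, target)
    if sha256_of(source) != sha256_of(target):
        # Keep the original and discard the bad copy
        target.unlink()
        raise IOError(f"Checksum mismatch copying {source.name}; original kept")
    source.unlink()
    return target
```

You would call `move_verified` once per .tsbak file in the local nightly folder, passing the share folder as `dest_dir`. Hashing a 16GB file twice adds time, so treat this as a trade-off between speed and peace of mind.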

Do make sure you rotate your backup files with a script that deletes the old files, or your share drive will fill up. I keep 4 days’ worth, just in case the current file is somehow corrupted – rare but can happen.
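A rotation script can be very small. This is a rough Python sketch rather than my actual script: it deletes .tsbak files older than a given number of days but always leaves the newest file alone, so a run of failed backups can't empty the folder. The name `rotate_backups` and the default of 4 days are just illustrative.

```python
import time
from pathlib import Path


def rotate_backups(backup_dir: Path, keep_days: int = 4, pattern: str = "*.tsbak") -> list:
    """Delete backups older than keep_days, but never the newest one."""
    # Oldest first, newest last
    files = sorted(backup_dir.glob(pattern), key=lambda p: p.stat().st_mtime)
    if not files:
        return []
    cutoff = time.time() - keep_days * 24 * 60 * 60
    deleted = []
    for path in files[:-1]:  # always keep the most recent file
        if path.stat().st_mtime < cutoff:
            path.unlink()
            deleted.append(path.name)
    return deleted
```

Run it against the share folder after the nightly move. It goes by file modification time, so it doesn't care what date the -d switch put in the file name.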

 

 

Weekly Restart

You may know I’m not a fan of running enterprise apps on Windows. I prefer Linux for a number of reasons that I’m not going to go into here. I know many users want Tableau Server on Linux, and the amazing Tamas Foldi has only gone and written it himself – so one day we may see it.

Anyway, with Windows apps I always build in a weekly application restart. In our case every Saturday morning. That involves a server reboot (to clean out any OS related temp stuff), application restart and a tabadmin cleanup. The tabadmin cleanup with the server stopped has the added bonus of clearing out the temp files (doesn’t happen when the server is running). These files can get pretty big so worth clearing out.

 

 

Virtual Machine Snapshots

If you’re running on a VM then you may be able to utilise the VM snapshot facility. Contact your VM admins for details. I’ve not needed to implement this but I know some that do. VM snapshots are super handy.

Do be aware that Tableau don’t seem to support this, though.


 

 

Config File Backup

Sometimes it’s handy to just back up your Tableau Server config. I’ve got a script that grabs all the .yml and template files in my Server directory, zips them up and moves them off the server. Pretty useful to refer back to old config settings if you need to. Make sure you include workgroup.yml.

If you’re being really good then you’ll be checking your config files into a revision control repo like SVN.
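For illustration, a config grab along those lines might look like the Python sketch below. It is an assumption-laden example, not my real script: the function name `zip_config` is invented, and I'm assuming the template files end in `.templ`, so adjust the patterns to whatever your install actually contains.

```python
import zipfile
from datetime import datetime
from pathlib import Path


def zip_config(server_dir: Path, out_dir: Path, patterns=("*.yml", "*.templ")) -> Path:
    """Zip every matching config file under server_dir, keeping relative paths."""
    out_dir.mkdir(parents=True, exist_ok=True)
    stamp = datetime.now().strftime("%Y-%m-%d_%H%M%S")
    archive = out_dir / f"tableau_config_{stamp}.zip"
    with zipfile.ZipFile(archive, "w", zipfile.ZIP_DEFLATED) as bundle:
        for pattern in patterns:
            # rglob walks subfolders, so workgroup.yml is picked up wherever it lives
            for path in sorted(server_dir.rglob(pattern)):
                if path.is_file():
                    bundle.write(path, str(path.relative_to(server_dir)))
    return archive
```

Point `out_dir` at a folder on the share drive so the zip lands off the server straight away. The timestamp in the file name makes it easy to find the config as it was on a given day.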

 

Site Specific Backups

Tableau Server allows you to back up per site. This doesn’t give me much extra, but I know that in orgs that have lots of sites, or a site per team / business unit, it can be very handy.

One thing that isn’t great about exporting a site is that the site is locked and inaccessible as the export is taking place. See Toby Erkson’s blog for more info on exporting a site.

 

 

Backup File Size & Duration

As your environment grows you’ll need to be mindful of the size of your backup file. Mine is around 16GB and takes well over an hour to write. Takes about 25 mins to restore. You’ll need to understand those numbers as your system matures.

Backup files can get pretty big

Another variable that can affect backup time is the specification of your primary server. If your primary is low spec then you’re gonna get a longer time to write a backup. I don’t have any stats on that but I know it is true. Contact Jeff Mills of Tableau if you want more info on Weak Primaries & backup times.

 

 

Backup Your Logs

This is less important, but it’s handy to zip up your logs on a weekly basis. We have a much better solution for logfile management using Splunk – you’ll see a blog about that in the future.

 

 

The Most Important Bit – TEST YOUR BACKUPS

OK so you’re backing up like a man / woman possessed? Fine. You’re only as good as your last restore. So TEST your backups periodically. Files get corrupted and you don’t want to be discovering that your only backup is broken when you need it.
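The first step of any periodic restore test is picking the file to test. As a small, hedged Python sketch (the name `latest_backup` is hypothetical), this finds the newest non-empty .tsbak across one or more backup folders. The restore itself, on a test or DR box, is deliberately left out.

```python
from pathlib import Path
from typing import Iterable, Optional


def latest_backup(folders: Iterable[Path], pattern: str = "*.tsbak") -> Optional[Path]:
    """Return the most recently modified, non-empty backup file across folders."""
    candidates = [
        path
        for folder in folders
        if folder.is_dir()  # silently skip folders that do not exist
        for path in folder.glob(pattern)
        if path.is_file() and path.stat().st_size > 0
    ]
    return max(candidates, key=lambda p: p.stat().st_mtime, default=None)
```

A `None` result means there is nothing to test, which is worth an alert in its own right.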

OK that’s it. Backups can save your life – don’t ignore them. Paranoia is king in IT!

Cheers, Paul


17 Responses to How to PROPERLY Back Up Your Tableau Server

  1. Another great post Paul – we have learnt recently that our backup process involved a full cleanup which removed the logs and hindered our investigation into a server problem

    • paulbanoub says:

      Cheers mate. Ideally you could pull all your logs off onto another server – we use Splunk to aggregate and index all the logs. I’ll be posting about that in the future.

  2. Graham Macleod says:

    Hi Paul. I’d be interested in knowing more on how you manage the Saturday restart. Specifically which parts are manual and which parts are automatic. Do you have full control over your server, and what would you do if the server failed to come back up correctly after a restart?

    On your section about backup times, we have quite a beefy primary server which our backups take place on and this takes between 30 and 40 minutes for a (currently) 15GB-16GB backup.

    • paulbanoub says:

      Hi Graham, thanks for commenting

      All the restart is automated using a script triggered by Windows Task Scheduler. Currently it’s just the application but I am planning to restart the whole server as part of this process. If the server failed to come back up then my hardware team should be alerted by their own monitoring. I would also have my monitoring to tell me and I’d make sure the hardware team was aware, and not assume their monitoring works properly.

      Regarding backup times. That’s a decent speed for a backup that size. Mine takes a fair bit longer. I’d like to see the ability to tweak backups more, e.g. exclude certain content or metadata etc. Anything to give more options. At least you don’t have to shut the server down now though.

      Happy to chat on the phone if you want to dive into any of this more.
      Cheers
      @paulbanoub

      • Graham Macleod says:

        I would love to have our physical server restarted but we don’t really have weekend infrastructure cover and I would hate to bring our server down and for it not to come back up again. I’ll maybe discuss with our server guys what they can do around automating this physical server restart. I think there is an option in Windows scheduler to execute a command when the server starts, so it would be possible to execute a tabadmin stop (because I think the Tableau services start upon a server reboot), tabadmin cleanup, then tabadmin start. That way it’s all, in theory, handled by the box being restarted.

      • paulbanoub says:

        It’s an issue not having weekend hardware support. If you can get that then life will be easier.

        I would perform any backup / cleanup before restarting the server. Every time you restart a host you run the (albeit small) risk that it won’t come back up again. So I’d script the following
        – tabadmin backup
        – cleanup
        – shut down server application
        – restart server (Tableau will come back up automatically if services are set to do so)

  3. Chris H says:

    Great post Paul. A similar script on the DR box to restore the backup is also helpful to keep the environments in sync as of the last backup.

    • paulbanoub says:

      Hi Chris, thanks for commenting. You make a good point, but we chose not to implement auto-restore on BCP just in case someone does something catastrophic to the primary server or there is corruption. If that happens (rare) then you may not want to automatically restore the data. Just personal preference, your way is probably better.

  4. Chris H says:

    Also, one drawback with VM snapshots is licensing. If any of the machine-specific identifiers change (e.g. moving to a different VM guest) then the server will come up unlicensed, because Tableau Server thinks that it is on a different machine. The licensing of the server will need to be cleaned out and reapplied, which can be a bit fiddly.

  5. Mark Wu says:

    This is a very nice, comprehensive backup guide for Tableau Server practitioners. Thanks Paul

  6. Hi Paul, thanks for your post. We have automated the backup script to run in the evening at 8PM EST, but we’re not sure why the date (-d switch) is in UTC, so the next day’s date is displayed. E.g. if we run the backup on 02/03/2016 at 8PM EST then the backup filename shows -2016-02-04. Have you faced any similar issues?

  7. Pingback: A Tableau Server Admin Toolkit | Paul Banoub's VizNinja Blog

  8. syed says:

    Thanks a lot Paul for this excellent post. We are a big organization with a content backup size of around 80GB now, which is growing fast… We currently have a weekly backup process with server log cleanup etc. However, a couple of years ago I used to have a warm backup process (no server restart). Luckily, I found that a warm backup doesn’t actually back up all content, i.e. any content that is in use (locked) by a user at the time of the backup is skipped. That is a big problem, as we don’t know what will be missing from the backup and won’t work when we need to restore.

    We found this issue when doing testing for next version upgrade in test environment. We then contacted tableau support on this and they also confirmed that this could be an issue and it’s better to do a Cold backup (with app restart).

    This pushes us to have 2-3 hrs of downtime.

    Any thoughts or experience on this from you or anyone else? How can we mitigate this downtime issue?

    Thanks again for great work!!!

  9. madhu says:

    Hi Paul,

    I have a requirement to verify the Tableau backup file using tabadmin commands. Every day a backup script runs and creates a backup file. After the backup script completes, my verification script should run and verify that file. The problem is identifying the latest backup file: the script needs to find the latest Tableau backup file among a set of backup folders and check it using “tabadmin verifydatabase” -f “backupfilename”.

  10. Jenksy says:

    This is a good guide but more robust commands should be used to move files around – especially when talking about network shares. I’d recommend replacing all of your MOVE commands with ROBOCOPY, which will allow for resume in the event of a network problem, and provide great logging which you could suck into Splunk or Tableau for analysis.
