Tableau Server – all about the… Backgrounder

Hello everyone.

You may know me as a Tableau Centre of Excellence manager. That can involve a lot of paperclip pushing skills, with the real work being done by my excellent team (thx @jakesviz & The Information Lab). But I do try and get down and dirty with my lovely Tableau Server environment to keep my skills fresh. Obviously I don’t mess with it – @jakesviz gets pretty protective about his Server.

This series of posts is my attempt to shed some light on the internals of Server. Note there are many more experts in this field than me (Craig Bloodworth, Mark Jackson, Jen Vaughan, Tamas Foldi, Mike Roberts, Angie Greenhaw – to name but a few) so please do comment if anything is incorrect here. Maybe you guys could help me evolve this post?

What is the backgrounder?

The backgrounder is a process that runs as part of the Tableau Server application. As the name suggests, it handles background tasks such as refreshing extracts, running subscriptions and also processes tasks initiated from tabcmd.

Here are the backgrounder processes. The .exe file and the .war file. The WAR file is a Web Application Archive, and contains all the necessary components and resources needed for a Web Application such as Backgrounder.

On a clustered environment you’ll find these files in D:\Program Files\Tableau\Tableau Server\worker.1\bin (may vary slightly with your installation).

28-01-2016 14-07-52

Other files related to the backgrounder.

Template (.templ) files – These files TBD

28-01-2016 14-13-09

There are also a few .rb files in D:\Program Files\Tableau\Tableau Server\worker.1\tabmigrate\db\migrate.

And we also have a .properties file which contains all the config entries relevant to the backgrounder. It also has almost all of the other stuff that you’d find in the main workgroup.yml file which is odd. I’d  have expected it to be just the backgrounder config.

backprops

Here is the location of the backgrounder log files

backlogs

Here are 2 instances of backgrounder.exe running on my server (from Task Manager).

28-01-2016 14-01-08

 

Can I mess with it?

Backgrounder can be configured. There are several settings present in both workgroup.yml and backgrounder.properties. Workgroup.yml is the master config file, and it populates the backgrounder.properties (and other .properties files) when a ‘tabadmin configure’ is run.

I don’t know what all of these do (yet) and the only one I’ve ever edited is ‘backgrounder.extra_timeout_in_seconds‘ which sets the max time in seconds that a backgrounder session can run for. Tableau kills off the session if this threshold is reached. Useful for forcing users to optimise their extract times!

I also pay attention to the ‘backgrounder.vmopts‘ parameter, as this defines the size of the java heap space for this component. All components have a vmopts setting and I’ve had to increase them on occasion due to out of memory problems.

You may also want to change the ‘backgrounder.log.level‘ if you need more debug info, although Tableau logs are chatty enough for me.

If there’s a golden parameter in this lot that you get value from then let me know in the comments.

backgrounder.deploy.dir: D:/Program Files/Tableau/Tableau Server/data/tabsvc/backgrounder
backgrounder.external_cache.concurrency_limit: 10
backgrounder.external_cache.enabled: true
backgrounder.external_cache.num_connections: 1
backgrounder.external_native_query_cache_disable: true
backgrounder.extra_timeout_in_seconds: 1800
backgrounder.failure_threshold_for_run_prevention: -1
backgrounder.jdbc.wg.connections: 8
backgrounder.jdbc.wg.idle_connections: 4
backgrounder.log.dir: D:/Program Files/Tableau/Tableau Server/data/tabsvc/logs/backgrounder
backgrounder.log.level: info
backgrounder.native_api.log.level: info
backgrounder.out_of_date_schedule_minutes: 240
backgrounder.ping_dataengine.millis_to_wait: 5000
backgrounder.ping_dataengine.num_retries: 24
backgrounder.ping_services.num_retries: 10000000
backgrounder.ping_services.time_to_wait: 5000
backgrounder.purge_directories.directories: ""
backgrounder.querylimit: 7200
backgrounder.restart_interval_in_minutes: 480
backgrounder.restrict_serial_collections_to_site_level: true
backgrounder.search_index_verification.enabled: true
backgrounder.sheet_image_api.max_age: 240
backgrounder.sleep_interval: 10
backgrounder.sort_jobs_by_run_time_history_observable_hours: -1
backgrounder.sort_jobs_by_type_schedule_boundary_heuristics_milliSeconds: -1
backgrounder.timeout_tasks: refresh_extracts, increment_extracts, subscription_notify, single_subscription_notify
backgrounder.tomcat.threads: 4
backgrounder.urlprefix: backgrounder
backgrounder.vmopts: -XX:+UseConcMarkSweepGC -Xmx512m

Note that Tableau Support don’t like you to edit config files manually, they recommend that you use the tabadmin set commands to change any parameters. They might have to change that recommendation when we see Tableau Server on Linux.

For more about .templ, Ruby & properties files check out this from Tamas Foldi.

 

What are the problems with the backgrounder?

Here are some of the things that can be problematic with the backgrounder.

  • Single Threaded – This means the backgrounder process can only run one thread at a time, a thread being a set of executable instructions that a process can perform. The upshot of this is that your backgrounder works through a queue of tasks one-by-one.
  • Latency – Due to the single threaded nature of backgrounder, you may see delays or ‘latency’. For example, if you have one backgrounder, and 2 tasks for it to perform at 2am, then task 2 will have to wait until task 1 has finished. If task 1 takes an hour then task 2 won’t start until 3am. This can be annoying for users that expect their data to be refreshed by a certain time.
  • Resource Intensive – The backgrounder can consume a significant amount of processing power (CPU) and input / output (I/O) on your server. This is dependent on the type of task it is performing. It’s not uncommon to see a backgrounder node consuming 100% CPU.
  • Other Stuff – The backgrounder process also does other stuff on Tableau Server that isn’t concerned with extract refreshes. For example – reaping extracts, checking disk space, synching Active Directory groups, rebuilding the search index etc. Bear that in mind when building your system. In reality these tasks don’t take up too much resource but they do take some and you should be aware.

 

Isolating the backgrounders

A common configuration in a clustered environment is to dedicate one of your worker nodes to the backgrounder processes. This means you can dial up the number of backgrounders and let them do their stuff without worrying about any impact on other processes. This is one of the most common performance recommendations from Tableau support.

You can also get a lot of info out of the Postgres DB relating to backgrounder usage and performance. Nelson Davis has posted a guide to getting started here.

 

Improvements I’d like to see

Ok so here are a bunch of improvements I’d like to see to the topic of backgrounders and extract management. I know some of this will drop in upcoming releases and some of these problems have been solved with custom solutions at some customer sites.

Alerting – Tableau Server doesn’t alert (email / IM / SMS) when a task fails. This means you’ll need to set up external monitoring to detect issues. I know Tableau are on this though so expect to see it in an upcoming version. Some people in the community have also coded their own solutions to this problem but it really should be native functionality.

Control per site – We segregate our dev / test / prod user environments using sites , all on the same server. We run 8 backgrounder processes on that server, which are shared across the tasks on all sites. As an administrator I’d really like to be able to bind backgrounder processes to specific sites. For example, 2 backgrounders on each of dev & test sites, then the other 4 dedicated to the production site. That would ensure production tasks always have enough resource to be able to execute on time.

Control per process – I’d like to be able to stop / pause / mess with individual backgrounder processes easily. It is possible – see this from Toby Erkson, but it would be good to have this as part of an administrator console or something.

Control per type / size / pattern of extracts – It would be good if I could dedicate specific backgrounder processes to particular extracts based on their characteristics. In particular I’d like to allocate one backgrounder to all the extracts that take less than 1 minute to complete. Or even use this to reward users that show diligence with their extract management by dynamically prioritizing incremental refreshes or extracts that have a low failure rate.

Better metrics – I’d like to see exactly how much CPU is taken up by a particular backgrounder process or task / schedule or per project. This would be useful for chargeback.

Dynamic reprioritisation – I love all my users. But in particular the ones that take good care of their extract refreshes. I’d like Tableau to be able to dynamically increase the priority of tasks that complete quickly, are incremental and that have a low failure rate. The message being, if you want your stuff to get the best slice of available resource then help us out with best practice.65

Disable run now – We’ve had some issues with the “run now” option that allows users to kick off an extract refresh on-demand using the UI. In particular we’ve seen some trigger-happy users bring our server down by hammering on the run now option multiple times. I’d like to disable that or maybe throttle it somehow.

Better guidelines from Tableau – The documentation from Tableau isn’t great in this area.

05-05-2016 13-09-11

I have an 8 core 128GB server and  run 8 backgrounders with no capacity issues. And I know of other organisations running way more than that. According to this doc I should be running between 2-4. That would be some serious under-utilization of the server. I’d like to see some clearer recommendations, maybe taking into account the variety of use cases that I’ve seen in other big enterprise deployments.

 

OK that’s all for now. This post could have been more detailed but I figure that I’ll get some valuable inputs from the community that will help me expand it. Actually in the time I was writing this, Mike Roberts was doing the same – check his post for some excellent info.

Cheers, Paul

 

Advertisements

How to initiate IM / Skype / Chat / LastFM session from a Tableau dashboard

Top Tips

Hello everyone. Before I start this I have to admit I nicked the idea completely from Dave Hart of Interworks, who came up with it when he was working with us. But he doesn’t blog or tweet so that leaves me to claim all the glory… Actually I’m just passing on his superior knowledge..

Many of you probably use mailto: actions in your Tableau dashboards, to kick off a mail to someone that might be highlighted in a viz etc. And you might also have some sort of internal IM system like MS Lync or something like that.

Well you can just as easily initiate a chat session to one or multiple recipients using Tableau.

This uses the Session Initiation Protocol (SIP) to open the session in whatever IM client is in use. You just need to open Tableau Server like this

tabadmin set vizqlserver.url_scheme_whitelist sip
tabadmin set vizqlserver.url_scheme_whitelist im
tabadmin restart

sip: allows Lync to lookup people using their email address
im: tells Lync to open it as a group chat

This is an example of an URI Scheme. There are dozens of others, such as callto: which tells mobile phones to call as well as facetime: and of course mailto:. 

Once you’ve opened the server up you can then create a Tableau action to fire off the command. Note that if you try and IM yourself it will usually default to email.

im:<sip:kelly.smith@company.com>
im:<sip:kelly.smith@company.com><sip:james.anderson@company.com>
im:<sip:kelly.smith@company.com><sip:james.anderson@company.com><sip:rob.jones@company.com>

When it comes to the action Tableau won’t let you use multiple items without specifying a delimiter so what you do is use “><” as the delimiter and then put [url encoded] “<” and “>” around the data field.

Anyway I posted this on the Tableau Forum and there’s a workbook attached to that post so you should be able to download it and see how you get on. I reckon there’s some potentially way cool stuff possible using URI schemes. There’s also lastfm:, spotify: and skype: amongst them – so much fun to be had here I think.

Forum post: http://community.tableausoftware.com/thread/155429

Regards, Paul

Talkin’ bout a Revolution…

Hello, I trust you’re all ok,

There’s something stressing me about Tableau. It might not be the most obvious but I’ll have a go at describing it anyway.

See I demo Tableau Server all the time. Like daily. To some pretty senior people that I really want to impress. These demos often involve clicking around the server views to show some of the user content. Trouble is some of my user content isn’t that great. It can be badly designed and slow to load – that’s a separate issue that my team is working on.

So I click on the view and then, there it is…..

30x30REV

Spin, spin, spin. Will it be 2 seconds or 20? Will it even load at all? The room falls quiet as all eyes settle on the spinning circle. The audience is almost hypnotised. The tension grows, until the view pops into life (or occasionally doesn’t). It’s the moment I dread as I know everyone is staring at the circle, waiting. Seems like it makes 5 seconds feel like 50.

So here are my issues with this.

  • Positioning – The spinner is bang in the middle of the screen, you can’t avoid it. It grabs your attention and also the attention of the room. Everyone starts watching it whether they want to or not.
  • Information – The spinner doesn’t give any indication of the progress of the operation. I’m not sure how it can, given the nature of the underlying queries but it’s still a problem.
  • Inconsistency – Often the spin rate slows slightly just prior to completion. But sometimes it speeds up again. So I think it’s about to complete, then it carries on. I know it’s just an animated gif but it still seems to occur occasionally.
  • Errors – I’ve had occasions when the connection is lost with the server for whatever reason, but the circle keeps going. That’s very misleading.

I think everyone accepts that applications which perform loading operations will generally have some sort of indicator. But I think the spinning circle can be improved.

facebook_standard_loading_animation

Users blamed the system

facebook_custom_loading_animation

Users blamed the app

The psychology of waiting is certainly an interesting subject. This post refers to a study conducted by Facebook that seems to suggest the type of indicator used can affect how the user perceives the problem. Admittedly the post doesn’t provide an accurate source for this but I thought it interesting nonetheless.

.

Screen Shot 2015-01-09 at 22.57.05

There are many recommendations to making the waiting game feel less of a drag

There’s plenty of chat on how to make that waiting game feel less of a drag to users. This post from UXMovement.com has a number of suggestions such as

  • Use backwards moving ribbings
  • Increase number of pulsations
  • Accelerate progress, avoid pauses

Also take a look at this article by Chris Harrison. The associated video shows the theory in action and there’s a detailed study of the theories available.

.

I wonder how much thought Tableau have given the format and positioning of the spinning circle as well as the science of perceived performance as opposed to actual performance. I think the consensus is that a progress bar is the best option but I know that won’t happen as Tableau can’t easily know the duration of a query. However, there are plenty of recommendations out there that may be worth considering. Why not make use of every trick in the book to make users feel they’re getting a faster experience?

There has been some chat on the Tableau forums about this (thanks @johncmunoz). It seems such an insignificant component but why not add as much polish to the tool as possible? Anyway, I’m no expert on this so maybe someone who knows the subject can supply more info.

Until then I’ll just think of appropriate 90’s synth-pop as the circle spins.

Regards, Paul

5 Top Tableau Time Saving Tips

Top Tips

Hello everyone. Time for a new series on VizNinja. I’m calling it Triple T – Tableau Top Tips.

This is where I list my favourite useful Tableau features. They may be well-known already, or maybe even not so – let me know if you find any of the tips helpful. I’m going to hopefully open this up to Alteryx also as I get more familiar with that tool.

I wasn’t going to post these as I thought that hey – everyone knows these as they’re so obvious and simple. And then the first person I spoke to about Revert to Saved had no idea it existed. So if you think these are all dead obvious then that’s cool, but someone out there might find them useful.

Tip 1 -“Revert to saved”

I’m a serial saver, so the Tableau & Alteryx not having autosave features isn’t a show stopper for me, although it would certainly be nice to have. I know it annoys many bloggers.

I am however, repeatedly going back to my saved workbook version. Using most applications you’d close the GUI and reload the file. With Tableau – just hit F12 and your workbook is back to the saved version.

revert

Simple, but saves me a load of mouse clicks every day.

Tip 2 – Use the repository

Another great example of the attention to detail that Tableau show is the “My Tableau Repository” that I’m sure you’re all familiar with.

My tip here is make sure you use it properly. However tempting it may be to save that workbook anywhere in the repository, ensure you stick to the naming convention. Workbooks in the workbook directory, datasources in the datasources directory. Won’t be an issue immediately if you don’t but as your usage of the tool grows you’ll thank yourself for keeping things neat and tidy.

Tip 3 – Swap Rows & Columns

As simple as it sounds. Hit CTRL-W to swap your shelves around. Hit it again to revert.

Tip 4 – Tooltip Persistence

It really annoys me how tooltips disappear after approx 8 seconds or so. I use them all the time to provide context in meetings.

If you want the tooltip to hang around, then position your mouse over either one of the command buttons or one of the actions / links on the tooltip (if present). That will stop it from clearing. When you want it to go then just move the mouse away.

Not sure if it has been resolved in later versions. I know there was an enhancement request in.

tooltipI

Tip 5 – Cell Size Hotkeys

Don’t bother fiddling around with grabbing the edges of cells with your mouse. Use these hotkeys under the “Format” menu of Tableau desktop.

cellsize

Ok that’s it for the first TTT. More to come later. Hit me with your TTT’s in the comments or @paulbanoub.

Cheers, Paul