Leverage Tableau Bridge
Planning to migrate to Tableau Cloud?
If your organisation is considering or already has Tableau Cloud, and you have on-premises data (MSSQL, Oracle, PostgreSQL, CSV files) or data in a private cloud that you want to use for building Tableau workbooks with regular refreshes, then Tableau Bridge is a must.
But what exactly is a Tableau Bridge and how does it work?
Tableau Bridge is software you install on a machine – physical or virtual – within your organization’s network. Since Tableau Cloud can’t connect directly to your on-premises or private cloud data, the Bridge serves as a secure intermediary, enabling communication between that data and Tableau Cloud through an outbound encrypted connection.
However, before deploying a Tableau Bridge, it’s critical to estimate the number of Bridges needed and ensure that your machines have adequate capacity to support refresh workflows aligned to your organization’s needs.
In this 3-part blog series we will cover Bridge planning, installation, and management to help you master each aspect.
Planning Bridge Deployment
Careful planning when setting up a new instance of Tableau Cloud, or when migrating from an existing on-premises Tableau Server, ensures smooth data connectivity for refreshes and queries. Having too few Bridges can cause job queues and delays; having too many risks increasing costs unnecessarily.
So, how do you determine the optimum number of Bridges? Here’s a simple way to estimate it.
Two major activities on Tableau Cloud use Bridge:
- Refreshing the data source connecting to a private network that Tableau Cloud cannot reach directly.
- Live connections with the private network data.
Before you jump into planning, it’s best to familiarize yourself with some of the terms used for the Bridge.
- Client – Each Bridge instance is called a client.
- Pool – A pool is a grouping of Bridge clients for load balancing purposes. Mapping pools to private network domains gives admins the ability to keep data fresh while maintaining security.
- Concurrent job – This term refers to the jobs running at a given instant in time.
Bridge Planning for Extract Connections
Now let’s look into extract connections. To start planning, begin by putting together the following data:
- Maximum number of concurrent extract refresh jobs in the existing Tableau environment.
- Number of concurrent jobs configured or planned to run on each Bridge client (the default value is 10, but it can be configured).
- Max runtime of the job.
- Buffer accounting for expected future growth in usage (optional).
You can fetch the details about existing refresh jobs from the Tableau Server PostgreSQL repository. Make sure to consider only the jobs that are supposed to use Bridge; this applies to both extract and live connections.
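Once the job start and end times have been exported, the peak number of concurrent jobs can be derived with a simple sweep over the timestamps. The snippet below is a minimal sketch, not an official Tableau utility: the function name `peak_concurrency` and the sample job list are hypothetical, and it assumes you have already filtered the export down to jobs that will run through Bridge.

```python
from datetime import datetime


def peak_concurrency(jobs):
    """Return the maximum number of jobs running at the same instant.

    `jobs` is an iterable of (start, end) datetime pairs.
    """
    events = []
    for start, end in jobs:
        events.append((start, 1))   # a job starts
        events.append((end, -1))    # a job finishes
    # Sort by timestamp; on ties, finishes (-1) are processed before starts (+1),
    # so a job ending exactly when another begins is not counted as overlapping.
    events.sort(key=lambda event: (event[0], event[1]))

    running = 0
    peak = 0
    for _, delta in events:
        running += delta
        peak = max(peak, running)
    return peak


# Hypothetical sample data: (start, end) of four extract refresh jobs.
sample_jobs = [
    (datetime(2024, 1, 1, 2, 0), datetime(2024, 1, 1, 2, 30)),
    (datetime(2024, 1, 1, 2, 10), datetime(2024, 1, 1, 2, 45)),
    (datetime(2024, 1, 1, 2, 20), datetime(2024, 1, 1, 3, 0)),
    (datetime(2024, 1, 1, 2, 45), datetime(2024, 1, 1, 3, 15)),
]

print(peak_concurrency(sample_jobs))  # 3
```

Running this over several weeks of job history, rather than a single day, should give a more reliable peak.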
Steps for Estimating Number of Bridge Instances
- Estimate Concurrent Connections: Determine the anticipated volume of simultaneous data source connections needed at peak periods.
- Evaluate Workload: Evaluate the workload by considering how often data refreshes occur and the level of concurrent user access.
- Review Tableau Bridge Capacity: Evaluate according to the maximum planned concurrent jobs and the maximum runtime for extract refreshes.
- Redundancy for High Availability: For high availability, consider deploying redundant instances of Tableau Bridge.
- Network Latency and Bandwidth: Evaluate the network latency and bandwidth connecting on-premises or private cloud with Tableau Cloud, as these aspects may influence the performance of Tableau Bridge.
Calculating Total Number of Bridge Instances
Let’s understand this using an example. Suppose 100 hours of extract refreshes need to run within a 2-hour time window.
- The 2-hour window can be adjusted based on your own refresh statistics.
- Add buffer hours of extract refresh as needed to account for anticipated future growth.
Total number of parallel jobs required = 100/2 = 50
If each Bridge client is planned to run 10 concurrent jobs (the value 10 is configurable in the Bridge settings):
Total number of Bridges required = 50/10 = 5
(Note: Since the number of Bridges must be an integer, always round the result up to the next whole number.)
The above calculation is based on the total time taken for refresh activity: it sizes the number of Bridges by the refresh runtime that has to fit into each time window.
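The extract calculation above can be sketched in a few lines of Python. This is an illustration of the arithmetic only; the function and parameter names are hypothetical, and the default of 10 jobs per client is the configurable default mentioned earlier.

```python
import math


def extract_bridges_needed(total_refresh_hours, window_hours,
                           jobs_per_client=10, buffer_hours=0.0):
    """Estimate the number of Bridge clients needed for extract refreshes.

    total_refresh_hours: combined runtime of all refreshes in the window
    window_hours:        length of the window the refreshes must fit into
    jobs_per_client:     concurrent jobs planned per Bridge client
    buffer_hours:        optional extra runtime for expected growth
    """
    if window_hours <= 0 or jobs_per_client <= 0:
        raise ValueError("window_hours and jobs_per_client must be positive")
    # Number of jobs that must run in parallel to finish within the window.
    parallel_jobs = math.ceil((total_refresh_hours + buffer_hours) / window_hours)
    # Round up: a fraction of a Bridge client cannot be deployed.
    return math.ceil(parallel_jobs / jobs_per_client)


print(extract_bridges_needed(100, 2))                   # 5
print(extract_bridges_needed(100, 2, buffer_hours=10))  # 6
```

The second call shows how even a modest growth buffer (10 extra hours here) can push the estimate up by one client.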
Bridge Planning for Live Connections
This covers data sources on Tableau Cloud that use live connections to data in a private network; those data sources need to establish their connection via the Bridge. A Bridge client can support up to 16 concurrent live queries, and this limit is not configurable.
Gather the following data to plan bridge deployment for live connections.
- Maximum number of concurrent views on content using a live connection to private network data. (This can be gathered from the built-in administrative views in Tableau Server, along with a few details from the Tableau Server PostgreSQL repository, and should be observed over a period of time to get a better approximation.)
- Buffer accounting for expected future growth in usage (optional).
So, let’s understand this using an example.
Let’s say the maximum concurrent views on content using live data connection = 40
(Add buffer views as needed as per individual anticipation for future growth)
Then, the No. of bridges planned = Ceiling (40/16) = 3
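The same arithmetic for live connections, as a minimal Python sketch. The names are hypothetical; the constant reflects the 16 concurrent live queries per Bridge stated above.

```python
import math

# Concurrent live queries a single Bridge client can serve (per this article).
LIVE_QUERIES_PER_CLIENT = 16


def live_bridges_needed(peak_concurrent_views, buffer_views=0):
    """Estimate the number of Bridge clients needed for live connections."""
    if peak_concurrent_views < 0 or buffer_views < 0:
        raise ValueError("view counts must not be negative")
    total_views = peak_concurrent_views + buffer_views
    # Round up to the next whole Bridge client.
    return math.ceil(total_views / LIVE_QUERIES_PER_CLIENT)


print(live_bridges_needed(40))                   # 3
print(live_bridges_needed(40, buffer_views=10))  # 4
```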
Total Bridges Planned for Deployment
Once you have the number of Bridges planned for both live and extract connections, take the higher of the two and proceed with the deployment.
Please note that this process only gives a starting recommendation for the Bridge deployment. The actual number of Bridges needed may vary slightly if the real number of extract or live connections differs from the figures used in the formulae above.
It is recommended to test the deployment in staging before moving to production to avoid any surprises.
Specification of Bridge Client Machines
Now that we’ve discussed determining the number of bridges required, the next step is to consider the specifications of the machine running those clients. Let’s look at the recommended configuration for installing a bridge client on a Windows machine. This configuration depends on the number of concurrent refreshes that are planned to run on the client.
For up to 10 concurrent refreshes, here is the recommended configuration:
Component | Recommended value |
CPU | 4 vCPU |
RAM | 8 GB |
Storage (NVMe SSD) | 300 GB |
This configuration should be adjusted depending on the number of concurrent jobs planned for the client.
Now you are ready to install Tableau Bridge!
Stay tuned for Parts 2 and 3 on Bridge Installation and Pooling best practices. With some planning up front, you can ensure Tableau Cloud and the Tableau Bridge work smoothly together!