Understanding Tableau Data Lineage
Leveraging GraphiQL for Data Lineage
Enhancing Data Transparency in Tableau
As good as Tableau is at data visualization and turning raw information into insightful dashboards and reports, it doesn’t provide a straightforward way to track data lineage, the path your data took before reaching those visualizations. Understanding that journey from source to visualization matters for data governance, data integrity, and impact analysis. In this blog, we will explore how GraphiQL can help uncover the story behind your data on Tableau Server.
Follow along as we guide you through enabling the Metadata API, connecting to GraphiQL, and crafting queries to extract lineage information. It’s easier than you think!
Before you begin, please ensure the following prerequisites for enabling the Metadata API in Tableau Server:
- Tableau Server Version:
The Metadata API is available in Tableau Server version 2019.3 and later.
- Administrator Permissions:
You must have server administrator permissions to enable the Metadata API in Tableau Server.
Once you’ve met the prerequisites above, follow the steps outlined below:
Step 1
Enabling the Tableau Metadata API
If the Metadata API is not yet enabled, a server administrator can enable it by running the TSM command tsm maintenance metadata-services enable. Then, to open GraphiQL on your Tableau Server and use it to query the Tableau Metadata API, follow these steps:
- Log in to Tableau Server: To access GraphiQL, first log in with your Tableau Server credentials.
- Navigate to the GraphiQL page: Once you have logged in, go to the GraphiQL URL, which has the format:
https://YOUR_SERVER/metadata/graphiql
- Replace YOUR_SERVER with the name or IP address of your Tableau Server.
- Connect to the Metadata API endpoint: GraphiQL sends its queries to the Tableau Metadata API endpoint. For Tableau Server, the endpoint URL is:
https://YOUR_SERVER/api/metadata/graphql
- Replace YOUR_SERVER with the name or IP address of your Tableau Server.
Step 2
Send a query
Once you have connected to the Metadata API endpoint, you can send a GraphQL query to extract the metadata you need. Write your query in the left pane of GraphiQL, which provides syntax highlighting, auto-completion, and error checking.
Here is an example query that retrieves the name and project of every workbook you have access to:
query workbooks {
  workbooks {
    name
    projectName
  }
}
Step 3
Execute the query
To execute the query, click the “Play” button in the upper left corner of GraphiQL. The results are displayed in the right pane of GraphiQL, along with any errors or warnings. For reference, the screenshot is attached below.
Step 4
Query the Metadata API for lineage data
Once you have connected to the Metadata API, you can query it for lineage data on specific workbooks or data sources. The API exposes several object types you can request, including data sources, workbooks, views, and fields.
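Outside the GraphiQL page, the same queries can be sent to the Metadata API as an HTTP POST. The following is a minimal Python sketch using only the standard library. It assumes you have already obtained a credentials token by signing in through the Tableau REST API, and that the API is served at /api/metadata/graphql on your server. The function names, server name, and token value are placeholders for illustration, not part of any Tableau library.

```python
import json
import urllib.request

WORKBOOKS_QUERY = """
query workbooks {
  workbooks {
    name
    projectName
  }
}
"""


def build_metadata_request(server, token, query):
    """Build (but do not send) a POST request for the Metadata API."""
    url = f"https://{server}/api/metadata/graphql"
    body = json.dumps({"query": query}).encode("utf-8")
    return urllib.request.Request(
        url,
        data=body,
        method="POST",
        headers={
            "Content-Type": "application/json",
            # Token from a prior REST API sign-in (assumed to exist already).
            "X-Tableau-Auth": token,
        },
    )


def run_query(server, token, query):
    """Send the query and return the decoded JSON response as a dict."""
    request = build_metadata_request(server, token, query)
    with urllib.request.urlopen(request) as response:
        return json.load(response)


# Placeholder values; replace with your own server and token before sending.
request = build_metadata_request("YOUR_SERVER", "YOUR_TOKEN", WORKBOOKS_QUERY)
print(request.full_url)
```

Building the request separately from sending it makes it easy to inspect the URL, headers, and body before anything goes over the network.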
Step 5
Parse the metadata response
The Metadata API returns metadata in the form of JSON objects. You can use a JSON parser to extract the relevant lineage data from the response.
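As an illustration of this step, the sketch below parses a response with Python's built-in json module. The response text is a made-up sample shaped like the result of the Step 2 query (a top-level data object containing a workbooks list); the workbook and project names are invented, and the function name is only illustrative.

```python
import json

# Hypothetical response text; real names will come from your server.
response_text = """
{
  "data": {
    "workbooks": [
      {"name": "Sales Overview", "projectName": "Finance"},
      {"name": "Web Traffic", "projectName": "Marketing"}
    ]
  }
}
"""


def workbooks_by_project(text):
    """Group workbook names by the project that contains them."""
    payload = json.loads(text)
    # GraphQL responses report failures in a top-level "errors" list.
    if payload.get("errors"):
        raise ValueError(f"Query returned errors: {payload['errors']}")
    grouped = {}
    for workbook in payload["data"]["workbooks"]:
        grouped.setdefault(workbook["projectName"], []).append(workbook["name"])
    return grouped


print(workbooks_by_project(response_text))
```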
Step 6
Query lineage data using GraphQL
With the basics in place, you can write more detailed lineage queries in GraphiQL.
GraphQL is a query language for APIs that lets you define the structure of the data you want to retrieve, and GraphiQL is the in-browser tool for writing and running GraphQL queries. You can use it to extract specific lineage data, such as the data sources, sheets, or dashboards that belong to a workbook. Here is an example GraphQL query that extracts lineage data for the workbooks on Tableau Server:
query workbooks {
  workbooks {
    workbook: name
    project_wb: projectName
    Datasource: embeddedDatasources {
      Datasource: name
      id
    }
    id_wb: id
    sheets: sheets {
      sheet: name
      id
    }
    dashboard: dashboards {
      dashboard: name
      id
    }
  }
}
For each workbook, this query retrieves the workbook's name and ID, the name of the project that contains it, and the name and ID of each embedded data source, sheet, and dashboard in the workbook. The labels before the colons (such as project_wb and id_wb) are aliases that rename the fields in the response.
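To work with this result outside GraphiQL, the nested response can be flattened into simple parent-child rows, one per link between a workbook and one of its objects. Below is a sketch in Python under a few assumptions: the sample response is invented, its keys mirror the aliases used in the query above, and each child object is assumed to carry an id plus exactly one aliased name field.

```python
import json

# Invented sample; keys follow the aliases in the lineage query.
sample = json.loads("""
{
  "data": {
    "workbooks": [
      {
        "workbook": "Sales Overview",
        "project_wb": "Finance",
        "id_wb": "wb-1",
        "Datasource": [{"Datasource": "Orders", "id": "ds-1"}],
        "sheets": [{"sheet": "Revenue by Region", "id": "sh-1"}],
        "dashboard": [{"dashboard": "Executive Summary", "id": "db-1"}]
      }
    ]
  }
}
""")


def child_name(item):
    """Return the value of the single non-id field, whatever its alias."""
    return next(value for key, value in item.items() if key != "id")


def lineage_rows(payload):
    """Yield (project, workbook, object type, object name) tuples."""
    for wb in payload["data"]["workbooks"]:
        for kind, key in (("datasource", "Datasource"),
                          ("sheet", "sheets"),
                          ("dashboard", "dashboard")):
            # A missing or empty list simply produces no rows.
            for item in wb.get(key) or []:
                yield (wb["project_wb"], wb["workbook"], kind, child_name(item))


for row in lineage_rows(sample):
    print(row)
```

Rows in this shape can be written to a CSV file or loaded back into Tableau as a data source for a lineage view.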
Step 7
Running the above query returns the workbooks along with their data sources, sheets, and dashboards. For reference, the screenshot is attached below.
By following the steps outlined in this blog, you’ll be able to extract lineage data from Tableau Server using GraphiQL and the Metadata API. As noted at the beginning of this blog, lineage information helps you ensure data integrity by tracing the flow of data and identifying potential inconsistencies, which in turn can prove vital in improving data quality and maintaining regulatory compliance.
And did we mention, extracting data lineage can also empower you, as a user, to make informed decisions based on a clear understanding of your data’s journey?
Thanks for reading!