A data extract is a subset of data copied from the main data source into Tableau’s local storage. When you create a Tableau data extract, a local copy of that portion of the data is saved on your machine. Working with data in the form of an extract is often easier and faster than managing the entire data set over a live connection.
An extract can also be refreshed from its data source. This can be done in either of two ways:
- A full refresh, which rebuilds the entire data subset, for example at a set time interval.
- An incremental refresh, where only newly added data is appended to the extract file.
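The difference between the two refresh modes can be sketched in a few lines of Python. This is only a conceptual illustration, not Tableau's implementation: the function names, the `order_id` key column, and the list-of-dicts representation are all assumptions made for the example. It assumes new rows can be recognized by a key that only ever increases.

```python
from typing import Dict, List

Row = Dict[str, int]


def full_refresh(source: List[Row]) -> List[Row]:
    """Rebuild the extract from scratch by copying every source row."""
    return [dict(row) for row in source]


def incremental_refresh(extract: List[Row], source: List[Row], key: str) -> List[Row]:
    """Append only the source rows whose key is above the highest key already extracted."""
    high_water = max((row[key] for row in extract), default=None)
    new_rows = [
        dict(row)
        for row in source
        if high_water is None or row[key] > high_water
    ]
    return extract + new_rows


# Hypothetical data: two orders at first, a third added later at the source.
source = [{"order_id": 1, "sales": 100}, {"order_id": 2, "sales": 250}]
extract = full_refresh(source)

source.append({"order_id": 3, "sales": 75})
extract = incremental_refresh(extract, source, key="order_id")
print(len(extract))  # 3
```

Note that an incremental refresh of this kind does not pick up changes to rows that were already extracted; only a full refresh does.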
Data extracts are saved as a .TDE (Tableau Data Extract) file. Tableau 10.5 and later versions create extracts in the newer .hyper format instead.
Advantages of Tableau Data Extracts:
1. The data extracts can support large data sets or subsets containing millions or billions of rows of data.
2. The data extracts optimize Tableau’s performance because they speed up working with data by providing a local copy of data from its source, saving a lot of time while working with large data sets.
3. Tableau data extracts also allow us to use advanced Tableau functionalities that we normally cannot use with data from live data connections or the original data source. We can easily manipulate or modify the data saved in an extract file by applying filters, calculations, conditions, or limits that give us a lot of analytical freedom and flexibility.
4. Tableau data extracts also allow us to use data offline. We do not need continuous access to the data source to work with it, because a local copy of the extracted data is saved on our machine.
How to Extract Data in Tableau?
Step 1: In the top-right corner of the Data Source page, there is a Connection section. It offers two options for connecting to the data source: Live and Extract.
Select the Extract option to start the process of extracting data from a data source. If you want to extract all the data, select the Extract option and move forward. However, if you want only a specific set of values from the data source, select the Edit option next to the Extract option.
Step 2: A window named Extract Data will open. In this window, you can set up the extraction as per your requirements: choose single or multiple tables, add filters, set aggregation on the selected data values, limit the number of rows, etc.
Step 3: You can choose a set of values that needs to be extracted by the use of filters. You can extract the filtered data from the data source by applying the filter conditions in the Extract Data window.
To apply a filter on the values of data source, click on Add.
Now, you will see a complete list of the fields available in the data source. Select the field you want to filter on and then click OK.
Step 4: Now, a list of values from that field will appear, where you can select the values that you want to include in or exclude from the extract. To exclude field values, select them from the list, and then check the Exclude box.
You can also apply filters using the Wildcard tab, where you enter a value and choose how it is matched: Contains, Starts with, Ends with, or Exactly matches.
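The four match modes on the Wildcard tab behave like ordinary string tests. The sketch below is a rough Python illustration, not Tableau code: the function name `wildcard_match`, the mode strings, and the sample category names are assumptions, and case handling is deliberately left out.

```python
def wildcard_match(value: str, pattern: str, mode: str) -> bool:
    """Return True if value satisfies the pattern under the given match mode."""
    if mode == "contains":
        return pattern in value
    if mode == "starts_with":
        return value.startswith(pattern)
    if mode == "ends_with":
        return value.endswith(pattern)
    if mode == "exactly_matches":
        return value == pattern
    raise ValueError(f"unknown match mode: {mode}")


# Hypothetical field values to filter.
categories = ["Office Supplies", "Furniture", "Technology"]

kept = [c for c in categories if wildcard_match(c, "Tech", "starts_with")]
print(kept)  # ['Technology']
```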
You can also enter a specific filter condition by field or by formula from the Condition tab of the filter window.
The Filter window also has a Top tab, where you can set a filter condition so that only the top values (for example, the top 100 or top 1000) are included in the data extract.
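A top-N filter of this kind amounts to ranking the field's values by some measure and keeping only the rows that belong to the highest-ranked ones. Here is a minimal Python sketch of that idea, assuming the ranking is done by the sum of a measure; the names `top_n_values`, `region`, and `sales` are hypothetical and chosen only for the example.

```python
from collections import defaultdict
from typing import Dict, List, Set, Union

Row = Dict[str, Union[str, float]]


def top_n_values(rows: List[Row], field: str, measure: str, n: int) -> Set[str]:
    """Return the n values of `field` with the largest summed `measure`."""
    totals: Dict[str, float] = defaultdict(float)
    for row in rows:
        totals[row[field]] += row[measure]
    ranked = sorted(totals, key=lambda value: totals[value], reverse=True)
    return set(ranked[:n])


rows = [
    {"region": "East", "sales": 100},
    {"region": "West", "sales": 300},
    {"region": "South", "sales": 50},
    {"region": "East", "sales": 120},
]

top_regions = top_n_values(rows, field="region", measure="sales", n=2)
filtered = [row for row in rows if row["region"] in top_regions]
print(sorted(top_regions))  # ['East', 'West']
print(len(filtered))        # 3
```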
Step 5: Once you have applied all the filters, you can see the filter summary on the General tab.
Step 6: In addition to applying filters, you can also apply aggregation to the selected values, so that the extract stores summarized rows instead of every detail row. An example of aggregating data values during data extraction is shown in the screenshot below.
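Conceptually, aggregating during extraction means grouping the detail rows by the chosen dimensions and storing one summarized row per group. The Python sketch below illustrates this with a sum; it is an assumption-laden illustration rather than Tableau's behavior, and the function name `aggregate` and the sample fields are hypothetical.

```python
from collections import defaultdict
from typing import Dict, List, Tuple


def aggregate(rows: List[dict], dimensions: List[str], measure: str) -> List[dict]:
    """Group rows by the given dimensions and sum the measure within each group."""
    totals: Dict[Tuple, float] = defaultdict(float)
    for row in rows:
        key = tuple(row[d] for d in dimensions)
        totals[key] += row[measure]
    # Rebuild one output row per group: the dimension values plus the summed measure.
    return [
        dict(zip(dimensions, key), **{measure: total})
        for key, total in totals.items()
    ]


detail_rows = [
    {"region": "East", "category": "Furniture", "sales": 100},
    {"region": "West", "category": "Technology", "sales": 300},
    {"region": "East", "category": "Technology", "sales": 120},
]

summary = aggregate(detail_rows, dimensions=["region"], measure="sales")
print(len(detail_rows), "->", len(summary))  # 3 -> 2
```

The aggregated extract is smaller, but the detail that was rolled up (here, the category) is no longer available for analysis.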
Another important option in the Extract Data window controls the number of rows that are fetched from the data source into your extract. You can extract all rows, with the option of an incremental refresh every time new data is added at the source. Alternatively, you can include only a given number of top rows, or extract a sample of a given number of rows.
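The three row options can be pictured as follows. This is a sketch under stated assumptions: the function name `limit_rows` and its mode strings are hypothetical, and the sample is drawn uniformly at random, which may differ from how Tableau or the underlying data source actually samples.

```python
import random
from typing import List, Optional


def limit_rows(rows: List[dict], mode: str, n: Optional[int] = None,
               seed: Optional[int] = None) -> List[dict]:
    """Return all rows, the first n rows, or a random sample of n rows."""
    if mode == "all":
        return list(rows)
    if mode == "top":
        return list(rows[:n])
    if mode == "sample":
        rng = random.Random(seed)  # a seed makes the sample repeatable
        return rng.sample(rows, min(n, len(rows)))
    raise ValueError(f"unknown mode: {mode}")


# Hypothetical source table with ten rows.
table = [{"row_id": i} for i in range(10)]

print(len(limit_rows(table, "all")))                # 10
print(limit_rows(table, "top", n=2))                # [{'row_id': 0}, {'row_id': 1}]
print(len(limit_rows(table, "sample", n=3, seed=0)))  # 3
```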
Step 7: Once all the conditions are set, Tableau creates the data extract from the data source and saves it locally. It is then ready for you to use in your data analysis process.
Right-click on the name of the data source available in the Data pane to manage the Tableau data extracts. A list of options related to that data source will appear. Here, you have three to four options related to your data extract relevant to that data source. You can also manage your existing extracts from here.
Step 8: You can initiate the process of creating data extracts in a few more ways than you just saw in the previous steps.
Open the Data menu, select your data source, and then select the Extract Data option from the list that opens.
Another way is to right-click on the active data source present in the Data pane and select the Extract Data option from the list.
Step 9: Select the Use Extract option to use the fields and values of the data extract in Tableau’s visualizations.
Remove the extract from the workbook:
You can remove an extract anytime by selecting the extract data source on the Data menu and selecting Extract > Remove. When you remove an extract, you can either choose to Remove the extract only from the workbook or Remove and delete the extract file. The second option will delete the data extract from your hard drive.
See extract history:
You can also see when the extract was last updated and many other details by choosing a data source from the Data menu and selecting Extract > History.
If you open a workbook saved with an extract and Tableau cannot locate the extract, then select any one of the following options in the Extract Not Found dialog box:
- Locate the extract: You can select this option if the extract exists but not in the location where Tableau initially saved it. Click on OK to open an Open File dialog box where you can specify the new location for the extracted file.
- Remove the extract: You can select this option if you have no further need for the extract. It is equivalent to closing the data source. All open worksheets that reference the data source are deleted.
- Deactivate the extract: Use the original data source from which the extract was created, instead of the extract.
- Regenerate the extract: Recreates the extract. All filters and other customizations you specified when you created the extract are automatically applied.