Connect to and Visualize Live Parquet Data in Tableau Prep



Use CData Tableau Connectors and Tableau Prep Builder to visualize live Parquet data.

Tableau is a visual analytics platform that transforms the way businesses use data to solve problems. Paired with the CData Tableau Connector for Parquet, it gives you access to live Parquet data within Tableau Prep. This article shows how to connect to Parquet in Tableau Prep, prepare the data, and export it for visualization in Tableau.

The CData Tableau Connectors enable high-speed access to live Parquet data in Tableau. Once you install the connector, you authenticate with Parquet and can immediately start building responsive, dynamic visualizations and dashboards. By surfacing Parquet data using native Tableau data types and handling complex filters, aggregations, and other operations automatically, the CData Tableau Connectors provide seamless access to Parquet data.

NOTE: The CData Tableau Connectors support Tableau Prep Builder 2020.4.1 or higher. If you are using an older version of Tableau Prep Builder, you will need to use the CData JDBC Driver for Parquet instead. To connect to Parquet data in Tableau Cloud, you will need to use CData Connect.

Install the CData Tableau Connector

When you install the CData Tableau Connector for Parquet, the installer should copy the TACO and JAR files to the appropriate directories. If your data source does not appear in the connection steps below, you will need to copy two files:

  1. Copy the TACO file (cdata.parquet.taco) found in the lib folder of the connector's installation location (C:\Program Files\CData\CData Tableau Connector for Parquet 20XX\lib on Windows) to the Tableau Prep Builder repository:

    • Windows: C:\Users\[Windows User]\Documents\My Tableau Prep Repository\Connectors
    • macOS: /Users//Documents/My Tableau Prep Repository/Connectors
  2. Copy the JAR file (cdata.tableau.parquet.jar) found in the same lib folder to the Tableau drivers directory, typically [Tableau installation location]\Drivers.
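If you prefer to script the manual copy, the two steps above can be sketched in Python. This is only an illustration, not part of the CData installer: the function name copy_connector_files is hypothetical, and all three directories must be supplied by you, because the exact install and repository paths depend on your version and user account.

```python
import shutil
from pathlib import Path

# File names as given in the steps above.
TACO_FILE = "cdata.parquet.taco"
JAR_FILE = "cdata.tableau.parquet.jar"


def copy_connector_files(lib_dir, prep_connectors_dir, tableau_drivers_dir):
    """Copy the TACO file to the Prep connectors folder and the JAR file
    to the Tableau drivers folder. Returns the destination paths."""
    lib_dir = Path(lib_dir)
    targets = {
        TACO_FILE: Path(prep_connectors_dir),
        JAR_FILE: Path(tableau_drivers_dir),
    }

    # Check both source files first so a missing file never leaves
    # a half-finished copy behind.
    for file_name in targets:
        source = lib_dir / file_name
        if not source.is_file():
            raise FileNotFoundError(f"Expected connector file not found: {source}")

    copied = []
    for file_name, dest_dir in targets.items():
        dest_dir.mkdir(parents=True, exist_ok=True)
        destination = dest_dir / file_name
        shutil.copy2(lib_dir / file_name, destination)
        copied.append(destination)
    return copied
```

Call it with the connector's lib folder, your Tableau Prep repository Connectors folder, and the Tableau Drivers folder. Writing to the Drivers folder may require elevated permissions.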

Connect to Parquet in Tableau Prep Builder

Open Tableau Prep Builder, click "Connect to Data," and search for "Parquet by CData." Configure the connection and click "Sign In."

Connect to your local Parquet file(s) by setting the URI connection property to the location of the Parquet file.
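Before pointing the URI property at a file, it can help to confirm that the file really is Parquet. The sketch below is an optional pre-check, not something the connector requires. It relies on the Parquet file format convention that a file with an unencrypted footer begins and ends with the 4-byte marker PAR1. The function name looks_like_parquet is hypothetical.

```python
import os
from pathlib import Path

PARQUET_MAGIC = b"PAR1"


def looks_like_parquet(path):
    """Return True if the file starts and ends with the Parquet magic bytes.

    This checks only the markers; it does not validate the metadata or data.
    """
    path = Path(path)
    # Smallest plausible file: header magic (4) + footer length (4) + footer magic (4).
    if not path.is_file() or path.stat().st_size < 12:
        return False
    with path.open("rb") as handle:
        head = handle.read(4)
        handle.seek(-4, os.SEEK_END)
        tail = handle.read(4)
    return head == PARQUET_MAGIC and tail == PARQUET_MAGIC
```

A file that fails this check is unlikely to load, so it is worth fixing the path or the file before configuring the connection.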

Discover and Prep Data

Drag the tables and views you wish to work with onto the canvas. You can include multiple tables.

Data Cleansing & Filtering

To further prepare the data, you can apply filters, remove duplicates, modify columns, and more.

  1. Start by clicking on the plus next to your table and selecting the Clean Step option.
  2. Select the field values to filter by. As you select values, you can see how your selections impact other fields.
  3. Choose "Keep Only" or "Exclude" to keep or drop the entries with your selected values. The data updates in response.
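To make the difference between the two options concrete, here is a minimal Python sketch of what "Keep Only" and "Exclude" do to a set of rows. It illustrates the filtering logic only and is not how Tableau Prep implements it. The function name filter_rows, the mode strings, and the sample field names are all hypothetical.

```python
def filter_rows(rows, field, selected_values, mode="keep_only"):
    """Filter a list of dict rows by the values selected for one field.

    mode="keep_only" keeps rows whose field value is selected;
    mode="exclude" drops those rows and keeps everything else.
    """
    selected = set(selected_values)
    if mode == "keep_only":
        return [row for row in rows if row.get(field) in selected]
    if mode == "exclude":
        return [row for row in rows if row.get(field) not in selected]
    raise ValueError(f"Unknown mode: {mode!r}")


# Hypothetical sample data, for illustration only.
sample_rows = [
    {"Region": "East", "Sales": 100},
    {"Region": "West", "Sales": 250},
    {"Region": "East", "Sales": 75},
]

kept = filter_rows(sample_rows, "Region", ["East"], mode="keep_only")
excluded = filter_rows(sample_rows, "Region", ["East"], mode="exclude")
```

The two results are complementary: every row lands in exactly one of them.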

Data Joins and Unions

Data joining involves combining data from two or more related tables based on a common field or key.

  1. To join multiple tables, drag a related table next to an existing table in the canvas and place it in the Join box.
  2. Select the foreign keys that exist in both tables.
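The following Python sketch shows what joining on a common key means. It assumes an inner join (only rows with a match in both tables survive) and a single key field. The function name inner_join and the sample tables are hypothetical, and the handling of clashing field names is a simplification rather than Tableau Prep's behavior.

```python
from collections import defaultdict


def inner_join(left_rows, right_rows, key):
    """Inner-join two lists of dict rows on a shared key field."""
    # Index the right table by key so each left row is matched in one lookup.
    index = defaultdict(list)
    for right in right_rows:
        index[right[key]].append(right)

    joined = []
    for left in left_rows:
        for right in index.get(left[key], []):
            # Simplification: if both rows share a field name,
            # the value from the left row wins.
            joined.append({**right, **left})
    return joined


# Hypothetical sample tables, for illustration only.
orders = [
    {"CustomerId": 1, "Total": 40},
    {"CustomerId": 2, "Total": 15},
    {"CustomerId": 3, "Total": 99},
]
customers = [
    {"CustomerId": 1, "Name": "Ada"},
    {"CustomerId": 2, "Name": "Grace"},
]

joined_rows = inner_join(orders, customers, "CustomerId")
```

The order for CustomerId 3 has no matching customer, so it does not appear in the joined result.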

Exporting Prepped Data

After you perform any cleansing, filtering, transformations, and joins, you can export the data for visualization in Tableau.

  1. Add any other transformations you need, then insert an Output node at the end of the flow.
  2. Configure the node to save to a file in the format of your choice.
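As a rough picture of what the output step produces, the sketch below writes prepared rows to a CSV file with Python's standard csv module. It assumes CSV is the format you chose. The function name write_rows_to_csv is hypothetical, and in practice Tableau Prep writes the file for you.

```python
import csv


def write_rows_to_csv(rows, path):
    """Write a list of dict rows to a CSV file with a header row.

    Column order follows the keys of the first row.
    Returns the number of data rows written.
    """
    if not rows:
        raise ValueError("No rows to write")
    field_names = list(rows[0].keys())
    # newline="" lets the csv module control line endings itself.
    with open(path, "w", newline="", encoding="utf-8") as handle:
        writer = csv.DictWriter(handle, fieldnames=field_names)
        writer.writeheader()
        writer.writerows(rows)
    return len(rows)
```

The resulting file has one header line followed by one line per row, which is the shape Tableau expects from a text file source.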

Once the output data is saved, you can work with it in Tableau, just like you would any other file source.

Using the CData Tableau Connector for Parquet with Tableau Prep Builder, you can easily join, cleanse, filter, and aggregate Parquet data for visualizations and reports in Tableau. Download a free, 30-day trial and get started today.
