How to Connect the Power BI Service to Amazon S3: Complete Guide



Connect to the CData Power BI Connectors from PowerBI.com to provide real-time datasets across the organization.

The CData Power BI Connector for Amazon S3 seamlessly integrates with the tools and wizards in Power BI, including the real-time data workflows on PowerBI.com. Follow the steps below to publish reports to PowerBI.com and use the Power BI Gateway to configure automatic refresh.

1. Create a DSN

Installing the Power BI Connector creates a DSN (data source name) called CData Power BI Amazon S3. This is the name of the DSN that Power BI uses to request a connection to the data source. Configure the DSN by filling in the required connection properties.

You can use the Microsoft ODBC Data Source Administrator to create a new DSN or configure (and rename) an existing DSN: From the Start menu, enter "ODBC Data Sources." Ensure that you run the version of the ODBC Administrator that corresponds to the bitness of your Power BI Desktop installation (32-bit or 64-bit).

To authorize Amazon S3 requests, provide the credentials for an administrator account or for an IAM user with custom permissions. Set AccessKey to the access key ID. Set SecretKey to the secret access key.

Note: You can connect as the AWS account administrator, but it is recommended to use IAM user credentials to access AWS services.

For information on obtaining the credentials and other authentication methods, refer to the Getting Started section of the Help documentation.
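As a rough illustration of the properties described above, the following Python sketch assembles them into an ODBC-style connection string, which can be handy for recording or reviewing a configuration. The property names AccessKey and SecretKey and the DSN name come from this article; the helper build_connection_string and the placeholder values are hypothetical, and whether a particular client accepts properties alongside a DSN is an assumption to verify against the Help documentation.

```python
def build_connection_string(properties):
    """Join ODBC-style key/value pairs into a 'Key=Value;' string."""
    parts = []
    for key, value in properties.items():
        text = str(value)
        # Values containing separators or braces are wrapped in braces;
        # a literal '}' inside a braced value is doubled.
        if any(ch in text for ch in ";{}"):
            text = "{" + text.replace("}", "}}") + "}"
        parts.append(f"{key}={text}")
    return ";".join(parts) + ";"


# Placeholder values only; never hard-code real credentials.
settings = {
    "DSN": "CData Power BI Amazon S3",
    "AccessKey": "YOUR_ACCESS_KEY_ID",
    "SecretKey": "YOUR_SECRET_ACCESS_KEY",
}
connection_string = build_connection_string(settings)
print(connection_string)
```

In practice the DSN itself stores these values, so Power BI only needs the DSN name.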

2. Get Amazon S3 Data

With the data source configured, follow the steps below to load data from Amazon S3 tables into a dataset.

Select Views to Load

  1. Open Power BI Desktop and click Get Data -> CData Amazon S3.
  2. Select CData Power BI Amazon S3 in the Data Source Name menu and select the Import data connectivity mode.
  3. Expand the CData Power BI Amazon S3 folder, expand an associated schema folder, and select tables.

Shaping Data

Use the Query Editor if you need more control over the query and query results before you load the data. Power BI detects the column behavior from the Amazon S3 metadata retrieved by the CData connector. In the Query Editor, you can perform actions like filtering, summarizing, and sorting on columns.

To open the Query Editor, click Edit in the Navigator window. Right-click a row to filter the rows. Right-click a column header to perform actions like the following:

  • Change column data types
  • Remove a column
  • Group by columns

Power BI records your query modifications in the Applied Steps section and adjusts the underlying data retrieval query that is executed against the remote Amazon S3 data.
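To make the idea of recorded steps concrete, here is a minimal Python sketch that replays an ordered list of shaping steps over a few in-memory rows. It is only an analogy for the Applied Steps list, not how Power BI or the connector is implemented. The sample rows, the Region and ObjectCount columns, and all helper names are hypothetical; Name and OwnerId are the field names used elsewhere in this article.

```python
from collections import defaultdict

# Hypothetical rows standing in for a table loaded from Amazon S3.
rows = [
    {"Name": "logs", "OwnerId": "owner-1", "Region": "us-east-1", "ObjectCount": "12"},
    {"Name": "backups", "OwnerId": "owner-1", "Region": "us-west-2", "ObjectCount": "0"},
    {"Name": "exports", "OwnerId": "owner-2", "Region": "us-east-1", "ObjectCount": "5"},
    {"Name": "media", "OwnerId": "owner-1", "Region": "eu-west-1", "ObjectCount": "3"},
]


def change_type(table, column, cast):
    """Return a copy of the table with one column converted by `cast`."""
    return [{**row, column: cast(row[column])} for row in table]


def filter_rows(table, predicate):
    """Keep only the rows for which `predicate` is true."""
    return [row for row in table if predicate(row)]


def remove_column(table, column):
    """Return a copy of the table without the named column."""
    return [{k: v for k, v in row.items() if k != column} for row in table]


def group_sum(table, key, value):
    """Group rows by `key` and sum the `value` column within each group."""
    totals = defaultdict(int)
    for row in table:
        totals[row[key]] += row[value]
    return dict(totals)


# Each entry mirrors one recorded step: a label plus the transformation.
applied_steps = [
    ("Changed Type", lambda t: change_type(t, "ObjectCount", int)),
    ("Filtered Rows", lambda t: filter_rows(t, lambda row: row["ObjectCount"] > 0)),
    ("Removed Columns", lambda t: remove_column(t, "Region")),
]

shaped = rows
for step_name, step in applied_steps:
    shaped = step(shaped)

totals = group_sum(shaped, "OwnerId", "ObjectCount")
print(totals)
```

Because each step takes the output of the previous one, removing or reordering an entry changes the final result, which mirrors how edits to the Applied Steps list behave.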

Load Data

When you click Load, the connector executes the underlying query against Amazon S3.

3. Create Data Visualizations

After loading Amazon S3 data into Power BI, you can create data visualizations in the Report view by dragging fields from the Fields pane onto the canvas. Follow the steps below to create a pie chart:

  1. Select the pie chart icon in the Visualizations pane.
  2. Select a dimension in the Fields pane: for example, Name.
  3. Select a measure in the Fields pane: for example, OwnerId.

You can change sort options by clicking the ellipsis (...) button for the chart. Options to select the sort column and change the sort order are displayed.

You can use both highlighting and filtering to focus on data. Filtering removes unfocused data from visualizations; highlighting dims unfocused data. You can highlight fields by clicking them.

You can apply filters at the page level, at the report level, or to a single visualization by dragging fields onto the Filters pane. To filter on the field value, select one of the values that are displayed in the Filters pane.
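The difference between filtering and highlighting can be sketched in a few lines of Python. This is an illustration of the concept only, using hypothetical sample rows and helper names; it does not reflect how Power BI renders visuals.

```python
# Hypothetical rows; Name and OwnerId follow the field names used in the chart example.
rows = [
    {"Name": "logs", "OwnerId": "owner-1"},
    {"Name": "exports", "OwnerId": "owner-2"},
    {"Name": "media", "OwnerId": "owner-1"},
]


def filter_rows(table, predicate):
    """Filtering: rows that do not match are removed from the result."""
    return [row for row in table if predicate(row)]


def highlight_rows(table, predicate):
    """Highlighting: every row stays, but non-matching rows are marked as dimmed."""
    return [{**row, "dimmed": not predicate(row)} for row in table]


def is_owner_1(row):
    return row["OwnerId"] == "owner-1"


filtered = filter_rows(rows, is_owner_1)
highlighted = highlight_rows(rows, is_owner_1)
print(len(filtered), len(highlighted))
```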

Click Refresh to synchronize your report with any changes to the data.

4. Configure Data Refresh on PowerBI.com

Follow the steps below to configure automatic data refresh through the Power BI Gateway. The gateway enables the Power BI cloud service to connect to the DSN on your machine.

Selecting a Gateway Mode

You need to select a gateway mode when you install the gateway:

  • Gateway (personal mode): Use the gateway in personal mode if you only need to publish to PowerBI.com and refresh reports. The gateway runs under your Windows user account.
  • (Recommended) Gateway (standard mode, formerly Enterprise): Use the gateway in standard mode if you are using other Azure services that require a gateway. You also need standard mode if multiple users need to access the gateway.
    You need a system DSN to connect through the gateway in standard mode. (System DSNs can be accessed system-wide, while user DSNs can only be used by a specific user account.) You can use the CData Power BI Amazon S3 system DSN configured as the last step of the connector installation.
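The guidance above can be summarized as a small decision helper. The Python sketch below is one reading of it under stated assumptions: the function names and the two yes/no inputs are hypothetical, and since this article recommends the non-personal mode in general, treat a "personal" result as the minimum that works rather than as advice.

```python
def choose_gateway_mode(multiple_users, uses_other_azure_services):
    """Return 'standard' or 'personal' following the guidance above."""
    if multiple_users or uses_other_azure_services:
        return "standard"
    # Publishing and refreshing for a single user is enough for personal mode.
    return "personal"


def requires_system_dsn(mode):
    """Standard mode connects through a system DSN; personal mode runs as your user."""
    if mode not in ("standard", "personal"):
        raise ValueError(f"unknown gateway mode: {mode!r}")
    return mode == "standard"


mode = choose_gateway_mode(multiple_users=True, uses_other_azure_services=False)
print(mode, requires_system_dsn(mode))
```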

Configuring the Gateway (Personal Mode)

Publishing through the gateway in personal mode simply requires an installed gateway with access to custom connectors.

  1. Run the CData Power BI Connector installer. If you have not already done so, download the Power BI Gateway.
  2. Select the on-premises data gateway (personal mode) option.
  3. Sign into the gateway.
  4. Name the gateway and specify a recovery key.
  5. In the Connectors section of the gateway settings, enable the custom data connectors option. You can also specify an alternate path to the custom data connector .pqx files here.
    Note: The CData Power BI Connectors install the .pqx files to the default folder, Your User Home\Documents\Power BI Desktop\Custom Connectors.

Configuring the Gateway (Standard Mode)

Publishing through the gateway in standard mode requires an installed gateway with access to custom connectors and a configured connection to the DSN for Amazon S3 from PowerBI.com.

1. Set Up the Gateway

Follow the steps below to configure the gateway on your machine:

  1. Run the CData Power BI Connector installer. If you have not already done so, download the Power BI Gateway.
  2. Select the on-premises data gateway (recommended) option.
  3. Sign into the gateway.
  4. Name the gateway and specify a recovery key.
  5. In the Connectors step, choose a folder where the gateway will look for the CData Power BI Connector. This article uses C:\Users\PBIEgwService\Documents\Power BI Desktop\Custom Connectors\. Copy the .pqx files for the CData Connector (found in C:\Users\USERNAME\Documents\Power BI Desktop\Custom Connectors\) to the folder you configured.

    NOTE: The account configured for the service (NT SERVICE\PBIEgwService) needs to be able to access the folder chosen for the gateway. If needed, you can change the service account in the Service Settings section of the gateway installer.

  6. Confirm that the entry CData.PowerBI.AmazonS3 is displayed in the list in the Connectors section.
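The copy in step 5 can also be scripted. The following Python sketch copies every .pqx file from one folder to another using only the standard library. The function name copy_connectors is hypothetical, and the folders you pass in are your own; the paths in the commented usage line are the ones this article uses, with USERNAME left as a placeholder.

```python
import shutil
from pathlib import Path


def copy_connectors(source_dir, gateway_dir, pattern="*.pqx"):
    """Copy connector files matching `pattern` into the gateway folder.

    Returns the sorted list of file names that were copied.
    """
    source = Path(source_dir)
    target = Path(gateway_dir)
    # Create the gateway folder if it does not exist yet.
    target.mkdir(parents=True, exist_ok=True)
    copied = []
    for connector in sorted(source.glob(pattern)):
        shutil.copy2(connector, target / connector.name)
        copied.append(connector.name)
    return copied


# Example call (not executed here; replace USERNAME with your account name):
# copy_connectors(
#     r"C:\Users\USERNAME\Documents\Power BI Desktop\Custom Connectors",
#     r"C:\Users\PBIEgwService\Documents\Power BI Desktop\Custom Connectors",
# )
```

The script does not change folder permissions, so the gateway service account still needs access to the target folder as described in the note above.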

2. Connect to Amazon S3 Data from PowerBI.com

  1. Add a data source to the gateway: Log into PowerBI.com and from the Settings menu, select Manage Gateways and select your gateway.
  2. Select the option to "Allow user's custom data connectors to refresh through this gateway cluster."
  3. Click Apply to save your changes.
  4. Click the option to add a data source to the gateway.
  5. In the Data Source Settings section, enter a name for the data source and in the Data Source Type menu select CData Power BI Connector for Amazon S3.
  6. In the Data Source Name box that is displayed, enter the system DSN: CData Power BI Amazon S3.

5. Publish to PowerBI.com

You can now publish refreshable reports and their underlying datasets. Follow the steps below to publish and complete the data refresh configuration for a dataset.

  1. In Power BI Desktop, click Publish on the Home ribbon to publish the report.
  2. On PowerBI.com, select the workspace where you uploaded the report.
  3. In the Datasets section, click the options menu for the Amazon S3 dataset you created, then click Settings.
  4. In the Gateway Connection section, enable the option to use a gateway and select your gateway. You may need to manually add the data source to the gateway:
    1. Expand the gateway under the Actions column.
    2. Click the link to "Manually add to gateway."
    3. In the "New connection" form, set the Connection name, set Data Source Name to the same data source name as above (e.g. "CData Power BI Amazon S3"), and set Authentication Method to "Anonymous."
    4. Set Privacy Level as needed and click "Create."
  5. If you are using the gateway in personal mode, expand the Data Source Credentials node and click Edit Credentials -> Sign In. (This step is not necessary if you are using the gateway in standard mode.)

6. Refresh a Dataset

Refresh the dataset to provide the current data to your reports.

  • To refresh manually, open the dataset options menu from your workspace -> Datasets and click Refresh Now.
  • To schedule refreshes, open the dataset options menu from your workspace -> Datasets and click Schedule Refresh. Enable the option to keep your data up to date. Specify the refresh frequency in the menus.
  • In Report view, click Refresh to sync the report with the dataset as you work.
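A manual refresh can also be requested programmatically through the Power BI REST API, which is not covered by this article. The Python sketch below only builds the request for the "refresh dataset in group" endpoint and does not send it. The workspace ID, dataset ID, and access token are placeholders you would supply, the helper name build_refresh_request is hypothetical, and obtaining a valid Azure AD token is assumed to be handled elsewhere.

```python
import json
import urllib.request

API_ROOT = "https://api.powerbi.com/v1.0/myorg"


def build_refresh_request(group_id, dataset_id, access_token,
                          notify_option="MailOnFailure"):
    """Build (but do not send) a POST request that asks for a dataset refresh."""
    url = f"{API_ROOT}/groups/{group_id}/datasets/{dataset_id}/refreshes"
    body = json.dumps({"notifyOption": notify_option}).encode("utf-8")
    return urllib.request.Request(
        url,
        data=body,
        method="POST",
        headers={
            "Authorization": f"Bearer {access_token}",
            "Content-Type": "application/json",
        },
    )


# Placeholder identifiers; substitute the values for your workspace and dataset.
request = build_refresh_request("WORKSPACE_ID", "DATASET_ID", "ACCESS_TOKEN")
print(request.get_method(), request.full_url)
```

To send the request you would pass it to urllib.request.urlopen. The refresh still runs through the gateway configured above, so the gateway must be online.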
