Access Live Databricks Data in Excel (Desktop)
Use Connect Spreadsheets by CData to gain access to live Databricks data from your Excel spreadsheets.
Looking for Connect Cloud instructions?
Your Connect Cloud account includes Connect Spreadsheets, so you can use the instructions below. You can expect minor differences when referencing the Connect Spreadsheets platform, but the principles still apply!
Microsoft Excel is a widely used spreadsheet software application, primarily used for tasks related to data management, analysis, and visualization. When combined with Connect Spreadsheets by CData, you gain immediate access to Databricks data directly within Excel, facilitating data analysis, collaboration, calculations, and more. This article shows how to connect to Databricks in Connect Spreadsheets and access and update live Databricks data in Excel spreadsheets.
Connect Spreadsheets is the easiest way to get all your live data into Microsoft Excel and Google Sheets, with no more downloading, wrangling, and re-uploading files. Just connect to your data, select the dataset you'd like to see, and import it into your spreadsheet.
This setup requires a Connect Spreadsheets account and the Connect Spreadsheets Add-In for Excel. To get started, sign up for a free trial of Connect Spreadsheets and install the free Connect Spreadsheets Excel Add-In.
About Databricks Data Integration
Accessing and integrating live data from Databricks has never been easier with CData. Customers rely on CData connectivity to:
- Access all versions of Databricks, from Runtime versions 9.1 through 13.X to both the Pro and Classic Databricks SQL versions.
- Leave Databricks in their preferred environment thanks to compatibility with any hosting solution.
- Securely authenticate in a variety of ways, including personal access token, Azure Service Principal, and Azure AD.
- Upload data to Databricks using Databricks File System, Azure Blob Storage, and AWS S3 Storage.
While many customers use CData's solutions to migrate data from different systems into their Databricks data lakehouse, several customers use our live connectivity solutions to federate connectivity between their databases and Databricks. These customers use SQL Server Linked Servers or PolyBase to get live access to Databricks from within their existing RDBMS.
Read more about common Databricks use-cases and how CData's solutions help solve data problems in our blog: What is Databricks Used For? 6 Use Cases.
Getting Started
Configure Databricks Connectivity for Excel
Connectivity to Databricks from Excel is made possible through Connect Spreadsheets. To work with Databricks data from Excel, we start by creating and configuring a Databricks connection.
- Log in to Connect Spreadsheets, click Connections, and click Add Connection
- Select "Databricks" from the Add Connection panel
- Enter the necessary authentication properties to connect to Databricks.
To connect to a Databricks cluster, set the properties as described below.
Note: You can find the required values in your Databricks instance by navigating to Clusters, selecting the desired cluster, and opening the JDBC/ODBC tab under Advanced Options.
- Server: Set to the Server Hostname of your Databricks cluster.
- HTTPPath: Set to the HTTP Path of your Databricks cluster.
- Token: Set to your personal access token (this value can be obtained by navigating to the User Settings page of your Databricks instance and selecting the Access Tokens tab).
- Click Create & Test
- Navigate to the Permissions tab in the Add Databricks Connection page and update the User-based permissions.
With the connection configured, you are ready to connect to Databricks data from Excel.
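Before pasting the three values into the connection form, it can help to sanity-check them. The following is a minimal sketch in Python, not part of Connect Spreadsheets: the function name `check_databricks_properties` and the specific checks are assumptions made for illustration, and only the property names (Server, HTTPPath, Token) come from the steps above.

```python
from urllib.parse import urlparse

# Property names as listed in the connection steps; everything else
# in this sketch is hypothetical and not a Connect Spreadsheets API.
REQUIRED = ("Server", "HTTPPath", "Token")


def check_databricks_properties(props):
    """Return a list of problems; an empty list means the values look plausible."""
    problems = []
    for name in REQUIRED:
        if not props.get(name, "").strip():
            problems.append(f"{name} is missing or empty")

    server = props.get("Server", "").strip()
    # The Server Hostname is a bare host name, so a scheme or a slash
    # suggests that a full URL was pasted instead.
    if server and ("://" in server or "/" in server):
        host = urlparse(server if "://" in server else "//" + server).hostname
        problems.append(f"Server should be a bare hostname (did you mean {host!r}?)")

    http_path = props.get("HTTPPath", "")
    # Assumption: whitespace inside the path is a copy-and-paste accident.
    if any(ch.isspace() for ch in http_path.strip()):
        problems.append("HTTPPath contains whitespace")
    return problems
```

Calling the function with a dictionary of the three values returns an empty list when nothing looks wrong, and a list of readable messages otherwise. It does not contact Databricks, so Create & Test remains the real check.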
Access Live Databricks Data in Excel
The steps below outline connecting to Connect Spreadsheets from Excel to access live Databricks data.
- Open Excel, create a new sheet (or open an existing one).
- Click Insert and click Get Add-ins (if you have already installed the Add-In, skip to step 4).
- Search for Connect Spreadsheets and install the Add-in.
- Click Data and open the CData Connect Spreadsheets Add-In.
- In the Add-In panel, click "Log in" to authenticate with your Connect Spreadsheets account
- In the Connect Spreadsheets panel in Excel, click Import
- Choose a Connection (e.g. Databricks1), Table (e.g. Customers), and Columns to import
- Optionally add Filters, Sorting, and a Limit
- Click Execute to import the data and opt to overwrite the existing sheet or create a new one.
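Conceptually, the Table, Columns, Filters, Sorting, and Limit choices in the import panel correspond to the parts of a SQL SELECT statement. The sketch below illustrates that mapping in Python. It is an assumption for illustration only: this article does not describe the query the Add-In actually issues, and `build_import_query` is a hypothetical helper. Backtick quoting follows Databricks SQL conventions for identifiers.

```python
ALLOWED_OPERATORS = {"=", "<>", "<", "<=", ">", ">="}


def build_import_query(table, columns, filters=None, sort=None, limit=None):
    """Compose a parameterized SELECT from import-panel style choices."""

    def quote(identifier):
        # Backtick-quote an identifier; embedded backticks are doubled.
        return "`" + identifier.replace("`", "``") + "`"

    sql = f"SELECT {', '.join(quote(c) for c in columns)} FROM {quote(table)}"
    params = []

    if filters:
        # filters: list of (column, operator, value) tuples, combined with AND.
        clauses = []
        for column, operator, value in filters:
            if operator not in ALLOWED_OPERATORS:
                raise ValueError(f"unsupported operator: {operator}")
            clauses.append(f"{quote(column)} {operator} ?")
            params.append(value)  # values stay out of the SQL text
        sql += " WHERE " + " AND ".join(clauses)

    if sort:
        column, direction = sort
        if direction.upper() not in ("ASC", "DESC"):
            raise ValueError(f"unsupported sort direction: {direction}")
        sql += f" ORDER BY {quote(column)} {direction.upper()}"

    if limit is not None:
        sql += f" LIMIT {int(limit)}"

    return sql, params
```

For example, `build_import_query("Customers", ["Id", "Name"], filters=[("Country", "=", "US")], sort=("Name", "ASC"), limit=100)` returns the statement ``SELECT `Id`, `Name` FROM `Customers` WHERE `Country` = ? ORDER BY `Name` ASC LIMIT 100`` together with the parameter list `["US"]`.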
Update Databricks Data from Excel
In addition to viewing Databricks data in Excel, Connect Spreadsheets also lets you update and delete Databricks data. Begin by importing data (as described above).
- Update any cell or cells with changes you want to push to Databricks (your changes will be in red)
- In the Connect Spreadsheets Add-In panel, select Update
- Optionally highlight the cell(s) you wish to update and select an update option ("Update All" or "Update Selected")
- Click Execute to push the updates to Databricks
A notification will appear when the update is complete
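The difference between "Update All" and "Update Selected" can be pictured as a diff between the imported rows and the edited sheet, optionally restricted to a selection. The following Python sketch illustrates that idea under stated assumptions: rows are dictionaries, each row has a key column (hypothetically named `Id`), and selection is by row key rather than by individual cell. It is not how the Add-In is implemented.

```python
def changed_cells(original, edited, key="Id", selected_keys=None):
    """Return {key_value: {column: new_value}} for rows whose cells differ."""
    before = {row[key]: row for row in original}
    changes = {}
    for row in edited:
        row_key = row[key]
        if selected_keys is not None and row_key not in selected_keys:
            continue  # "Update Selected": ignore rows outside the selection
        old = before.get(row_key)
        if old is None:
            continue  # newly added rows are out of scope for this sketch
        diff = {
            column: value
            for column, value in row.items()
            if column != key and old.get(column) != value
        }
        if diff:
            changes[row_key] = diff
    return changes
```

With `selected_keys=None` the function reports every changed row, which mirrors "Update All"; passing a set of keys narrows the result in the spirit of "Update Selected".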
Live Access to Databricks Data from Spreadsheet Apps
Now, you have a direct, cloud-to-cloud connection to live Databricks data from your Excel workbook. You can add more data to your workbook for calculations, aggregations, collaboration, and more.
Try Connect Spreadsheets and get real-time data access to 100+ SaaS, Big Data, and NoSQL sources directly from Microsoft Excel.