Connect to Impala Data in Google Apps Script



Use CData Connect Cloud to access Impala data in Google Apps Script.

Google Apps Script empowers users to build custom functionality within their Google documents, including Google Sheets and Google Docs. Apps Script natively supports SQL Server connectivity via JDBC, providing a powerful extensibility tool for connecting Google cloud applications to external data. Paired with the SQL connectivity offered by CData Connect Cloud, users can easily access live Impala data directly from within their Google documents.

This article shows how to connect to Impala in Connect Cloud and provides sample scripting for processing Impala data in a Google Spreadsheet.

Our script only reads data from a specified table, but you can easily extend the script to incorporate update functionality.

Configure Impala Connectivity for Google Apps Script

Connectivity to Impala from Google Apps Script is made possible through CData Connect Cloud. To work with Impala data from Google Apps Script, we start by creating and configuring an Impala connection.

CData Connect Cloud uses a straightforward, point-and-click interface to connect to data sources.

  1. Log in to Connect Cloud, click Connections, and click Add Connection.
  2. Select "Impala" from the Add Connection panel.
  3. Enter the necessary authentication properties to connect to Impala.

    In order to connect to Apache Impala, set the Server, Port, and ProtocolVersion. You may optionally specify a default Database. To connect using alternative methods, such as NOSASL, LDAP, or Kerberos, refer to the online Help documentation.

  4. Click Create & Test.
  5. Navigate to the Permissions tab on the Add Impala Connection page and update the User-based permissions.

Add a Personal Access Token

If you are connecting from a service, application, platform, or framework that does not support OAuth authentication, you can create a Personal Access Token (PAT) to use for authentication. As a best practice, create a separate PAT for each service to keep access granular.

  1. Click on your username at the top right of the Connect Cloud app and click User Profile.
  2. On the User Profile page, scroll down to the Personal Access Tokens section and click Create PAT.
  3. Give your PAT a name and click Create.
  4. The personal access token is only visible at creation, so be sure to copy it and store it securely for future use.

With the connection configured, you are ready to connect to Impala data from Google Apps Script.

Connect to Impala Data from Apps Script

At this point, you should have configured a connection to Impala in Connect Cloud. All that is left now is to use Google Apps Script to access Connect Cloud and work with your Impala data in Google Sheets.

In this section, you will create a script (with a menu option to call the script) to populate a spreadsheet with Impala data. We have created a sample script and explained the different parts. You can view the raw script at the end of the article.

1. Create an Empty Script

To create a script for your Google Sheet, click Tools > Script editor from the Google Sheets menu (in current versions of Google Sheets, this option is Extensions > Apps Script).

2. Declare Global Variables

Create a handful of global variables that are available to every function in the script.

//replace the variables in this block with real values as needed
var address = 'tds.cdata.com:14333';
var user = 'CONNECT_USER'; // your Connect Cloud username (an email address)
var userPwd = 'CONNECT_USER_PAT';
var db = 'ApacheImpala1';

var dbUrl = 'jdbc:sqlserver://' + address + ';databaseName=' + db;
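
The connection URL above is assembled by plain string concatenation, so an empty address or database name only surfaces later as a connection error. As a minimal sketch, the same URL could be built by a small function that fails early. The name buildConnectCloudUrl is hypothetical and is not part of Apps Script or Connect Cloud.

```javascript
// Hypothetical helper (not part of Apps Script or Connect Cloud):
// builds the same JDBC URL as the dbUrl variable above.
function buildConnectCloudUrl(address, db) {
  if (!address || !db) {
    throw new Error('Both address and db are required');
  }
  return 'jdbc:sqlserver://' + address + ';databaseName=' + db;
}

// Same values as the variables declared above.
var exampleUrl = buildConnectCloudUrl('tds.cdata.com:14333', 'ApacheImpala1');
// exampleUrl is 'jdbc:sqlserver://tds.cdata.com:14333;databaseName=ApacheImpala1'
```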

3. Add a Menu Option

This function adds a menu option to your Google Sheet, allowing you to use the UI to call your function.

function onOpen() {
  var spreadsheet = SpreadsheetApp.getActive();
  var menuItems = [
    {name: 'Write data to a sheet', functionName: 'connectToApacheImpalaData'}
  ];
  spreadsheet.addMenu('Impala Data', menuItems);
} 

4. Write a Helper Function

This function is used to find the first empty row in a spreadsheet.

/*
 * Finds the first empty row in a sheet by scanning a single column.
 * @param spreadSheet The sheet to scan.
 * @param column The letter of the column to scan, for example "A".
 * @return The row number of the first empty row.
 */
function getFirstEmptyRowByColumnArray(spreadSheet, column) {
  var range = spreadSheet.getRange(column + ":" + column);
  var values = range.getValues(); // get all data in one call
  var ct = 0;
  // strict comparison, so a cell containing 0 is not treated as empty
  while ( values[ct] && values[ct][0] !== "" ) {
    ct++;
  }
  return (ct+1);
}
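
To make the scan easier to follow, here is a rough sketch of the same logic applied to a plain two-dimensional array shaped like the one getValues() returns for a single column. The function name firstEmptyRowInValues and the sample data are hypothetical. The sketch uses a strict comparison because in JavaScript the loose test 0 != "" is false, which would make a cell holding the number 0 look empty.

```javascript
// Hypothetical stand-alone version of the scan used by the helper above.
// 'values' has one inner array per row, as returned for a single column.
function firstEmptyRowInValues(values) {
  var ct = 0;
  // stop at the first row whose cell is an empty string, or at the end
  while (values[ct] && values[ct][0] !== "") {
    ct++;
  }
  return ct + 1; // sheet rows are 1-based
}

// Illustrative data: a header, two data rows (one holding 0), then blanks.
var sampleColumn = [['id'], [0], ['2'], [''], ['']];
var nextRow = firstEmptyRowInValues(sampleColumn); // 4
```

For a completely empty column the function returns 1, which is the case the main function uses to decide whether to write a header row.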

5. Write a Function to Write Impala Data to a Spreadsheet

The function below writes the Impala data, using the Google Apps Script JDBC functionality to connect to Connect Cloud, SELECT data, and populate a spreadsheet. When the script is run, two input boxes will appear:

The first one asks the user to input the name of a sheet to hold the data (if the sheet does not exist, the function creates it).

The second asks the user to input the name of an Impala table to read. If an invalid table is chosen, an error message appears and the function exits.
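
The table check in the function below boils down to a case-insensitive name match against the names reported by the connection metadata. As a rough sketch of that matching step on its own (resolveTableName and the sample table names are hypothetical, used only for illustration):

```javascript
// Hypothetical helper: returns the table name as the server spells it,
// or null when no name matches case-insensitively.
function resolveTableName(requested, available) {
  for (var i = 0; i < available.length; i++) {
    if (requested.toUpperCase() === available[i].toUpperCase()) {
      return available[i];
    }
  }
  return null;
}

// Illustrative table names only.
var match = resolveTableName('customers', ['Orders', 'Customers']); // 'Customers'
var missing = resolveTableName('invoices', ['Orders', 'Customers']); // null
```

Returning the server's spelling matters because the validated name, not the user's input, is what ends up in the SELECT statement.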

Note that while the function is designed for use as a menu option, you can extend it for use as a spreadsheet formula.

/*
 * Reads data from a specified Impala 'table' and writes it to the specified sheet.
 *    (If the specified sheet does not exist, it is created.)
 */
function connectToApacheImpalaData() {
  var thisWorkbook = SpreadsheetApp.getActive();

  //select a sheet and create it if it does not exist
  var selectedSheet = Browser.inputBox('Which sheet would you like the data to post to?',Browser.Buttons.OK_CANCEL);
  if (selectedSheet == 'cancel')
    return;

  if (thisWorkbook.getSheetByName(selectedSheet) == null)
    thisWorkbook.insertSheet(selectedSheet);
  var resultSheet = thisWorkbook.getSheetByName(selectedSheet);
  var rowNum = 2;

  //select an Impala 'table'
  var table = Browser.inputBox('Which table would you like to pull data from?',Browser.Buttons.OK_CANCEL);
  if (table == 'cancel')
    return;

  var name = Jdbc.getConnection(dbUrl, {
    user: user,
    password: userPwd
  });

  //confirm that var table is a valid table/view
  var dbMetaData = name.getMetaData();
  var tableSet = dbMetaData.getTables(null, null, table, null);
  var validTable = false;
  while (tableSet.next()) {
    var tempTable = tableSet.getString(3);
    if (table.toUpperCase() == tempTable.toUpperCase()){
      table = tempTable;
      validTable = true;
      break;
    }
  } 
  tableSet.close();
  if (!validTable) {
    name.close(); //release the connection before exiting
    Browser.msgBox("Invalid table name: " + table, Browser.Buttons.OK);
    return;
  }

  var stmt = name.createStatement();

  var results = stmt.executeQuery('SELECT * FROM ' + table);
  var rsmd = results.getMetaData();
  var numCols = rsmd.getColumnCount();

  //if the sheet is empty, populate the first row with the headers
  var firstEmptyRow = getFirstEmptyRowByColumnArray(resultSheet, "A");
  if (firstEmptyRow == 1) {
    //collect column names
    var headers = new Array(new Array(numCols));
    for (var col = 0; col < numCols; col++){
      headers[0][col] = rsmd.getColumnName(col+1);
    }
    resultSheet.getRange(1, 1, headers.length, headers[0].length).setValues(headers);
  } else {
    rowNum = firstEmptyRow;
  }

  //write rows of Impala data to the sheet
  var values = new Array(new Array(numCols));
  while (results.next()) {
    for (var col = 0; col < numCols; col++) {
      values[0][col] = results.getString(col + 1);
    }
    resultSheet.getRange(rowNum, 1, 1, numCols).setValues(values);
    rowNum++;
  }

  //release the JDBC resources, including the connection
  results.close();
  stmt.close();
  name.close();
}
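
The function above calls setValues() once per row, which can be slow for large tables because every call is a separate round trip to the spreadsheet. One possible alternative, sketched here under the assumption that the whole result fits in memory, is to collect the rows first and write them in a single call. The helper name collectRows is hypothetical.

```javascript
// Hypothetical helper: drains a JDBC result set into a 2D array so the
// sheet can be updated with one setValues() call instead of one per row.
function collectRows(results, numCols) {
  var rows = [];
  while (results.next()) {
    var row = [];
    for (var col = 0; col < numCols; col++) {
      row.push(results.getString(col + 1)); // JDBC columns are 1-based
    }
    rows.push(row);
  }
  return rows;
}

// Inside connectToApacheImpalaData(), the row-by-row loop could become:
//   var rows = collectRows(results, numCols);
//   if (rows.length > 0) {
//     resultSheet.getRange(rowNum, 1, rows.length, numCols).setValues(rows);
//   }
```

The length check is there because a range must contain at least one row, so an empty result set should skip the write entirely.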
  

When the function is completed, you have a spreadsheet populated with your Impala data, and you can now leverage all of the calculating, graphing, and charting functionality of Google Sheets anywhere you have access to the Internet.


Complete Google Apps Script

//replace the variables in this block with real values as needed
var address = 'tds.cdata.com:14333';
var user = 'CONNECT_USER'; // your Connect Cloud username (an email address)
var userPwd = 'CONNECT_USER_PAT';
var db = 'ApacheImpala1';

var dbUrl = 'jdbc:sqlserver://' + address + ';databaseName=' + db;

function onOpen() {
  var spreadsheet = SpreadsheetApp.getActive();
  var menuItems = [
    {name: 'Write data to a sheet', functionName: 'connectToApacheImpalaData'}
  ];
  spreadsheet.addMenu('Impala Data', menuItems);
}

/*
 * Finds the first empty row in a sheet by scanning a single column.
 * @param spreadSheet The sheet to scan.
 * @param column The letter of the column to scan, for example "A".
 * @return The row number of the first empty row.
 */
function getFirstEmptyRowByColumnArray(spreadSheet, column) {
  var range = spreadSheet.getRange(column + ":" + column);
  var values = range.getValues(); // get all data in one call
  var ct = 0;
  // strict comparison, so a cell containing 0 is not treated as empty
  while ( values[ct] && values[ct][0] !== "" ) {
    ct++;
  }
  return (ct+1);
}

/*
 * Reads data from a specified 'table' and writes it to the specified sheet.
 *    (If the specified sheet does not exist, it is created.)
 */
function connectToApacheImpalaData() {
  var thisWorkbook = SpreadsheetApp.getActive();

  //select a sheet and create it if it does not exist
  var selectedSheet = Browser.inputBox('Which sheet would you like the data to post to?',Browser.Buttons.OK_CANCEL);
  if (selectedSheet == 'cancel')
    return;

  if (thisWorkbook.getSheetByName(selectedSheet) == null)
    thisWorkbook.insertSheet(selectedSheet);
  var resultSheet = thisWorkbook.getSheetByName(selectedSheet);
  var rowNum = 2;

  //select an Impala 'table'
  var table = Browser.inputBox('Which table would you like to pull data from?',Browser.Buttons.OK_CANCEL);
  if (table == 'cancel')
    return;

  var name = Jdbc.getConnection(dbUrl, {
    user: user,
    password: userPwd
  });

  //confirm that var table is a valid table/view
  var dbMetaData = name.getMetaData();
  var tableSet = dbMetaData.getTables(null, null, table, null);
  var validTable = false;
  while (tableSet.next()) {
    var tempTable = tableSet.getString(3);
    if (table.toUpperCase() == tempTable.toUpperCase()){
      table = tempTable;
      validTable = true;
      break;
    }
  } 
  tableSet.close();
  if (!validTable) {
    name.close(); //release the connection before exiting
    Browser.msgBox("Invalid table name: " + table, Browser.Buttons.OK);
    return;
  }

  var stmt = name.createStatement();

  var results = stmt.executeQuery('SELECT * FROM ' + table);
  var rsmd = results.getMetaData();
  var numCols = rsmd.getColumnCount();

  //if the sheet is empty, populate the first row with the headers
  var firstEmptyRow = getFirstEmptyRowByColumnArray(resultSheet, "A");
  if (firstEmptyRow == 1) {
    //collect column names
    var headers = new Array(new Array(numCols));
    for (var col = 0; col < numCols; col++){
      headers[0][col] = rsmd.getColumnName(col+1);
    }
    resultSheet.getRange(1, 1, headers.length, headers[0].length).setValues(headers);
  } else {
    rowNum = firstEmptyRow;
  }

  //write rows of Impala data to the sheet
  var values = new Array(new Array(numCols));
  while (results.next()) {
    for (var col = 0; col < numCols; col++) {
      values[0][col] = results.getString(col + 1);
    }
    resultSheet.getRange(rowNum, 1, 1, numCols).setValues(values);
    rowNum++;
  }

  //release the JDBC resources, including the connection
  results.close();
  stmt.close();
  name.close();
}

Ready to get started?

Learn more about CData Connect Cloud or sign up for a free trial.