PowerShell Scripting to Replicate Redshift Data to MySQL



Write a simple PowerShell script to replicate Redshift data to a MySQL database.

The CData Cmdlets for Redshift offer live access to Redshift data from within PowerShell. Using PowerShell scripts, you can easily automate regular tasks like data replication. This article will walk through using the CData Cmdlets for Redshift and the CData Cmdlets for MySQL in PowerShell to replicate Redshift data to a MySQL database.

After obtaining the needed connection properties, accessing Redshift data in PowerShell and preparing for replication consists of four basic steps.

To connect to Redshift, set the following:

  • Server: Set this to the host name or IP address of the cluster hosting the Database you want to connect to.
  • Port: Set this to the port of the cluster.
  • Database: Set this to the name of the database. Or, leave this blank to use the default database of the authenticated user.
  • User: Set this to the username you want to use to authenticate to the Server.
  • Password: Set this to the password you want to use to authenticate to the Server.

You can obtain the Server and Port values in the AWS Management Console:

  1. Open the Amazon Redshift console (http://console.aws.amazon.com/redshift).
  2. On the Clusters page, click the name of the cluster.
  3. On the Configuration tab for the cluster, copy the cluster URL from the connection strings displayed.
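
The connection string shown in the console bundles the host, port, and database name together. As a minimal sketch, assuming the copied value looks like the hypothetical string below (the cluster name and region are made up), it could be split into the Server, Port, and Database values used in the steps that follow:

```powershell
# Hypothetical connection string copied from the console; substitute your own.
$clusterUrl = 'jdbc:redshift://examplecluster.abc123xyz789.us-west-2.redshift.amazonaws.com:5439/dev'

# Optional JDBC prefix, then host, port, and an optional database name.
$pattern = '^(?:jdbc:redshift://)?(?<server>[^:/]+):(?<port>\d+)(?:/(?<database>[^?;]+))?'

if ($clusterUrl -match $pattern) {
    $Server   = $Matches['server']
    $Port     = [int]$Matches['port']
    $Database = $Matches['database']   # $null when the URL has no database part
} else {
    throw "Unrecognized cluster URL: $clusterUrl"
}

"Server=$Server Port=$Port Database=$Database"
```

The pattern also accepts a bare `host:port/database` endpoint, since the JDBC prefix is optional.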

Collecting Redshift Data

  1. Install the module:

    Install-Module RedshiftCmdlets
  2. Connect to Redshift:

    $redshift = Connect-Redshift -User $User -Password $Password -Database $Database -Server $Server -Port $Port
  3. Retrieve the data from a specific resource:

    $data = Select-Redshift -Connection $redshift -Table "Orders"

    You can also use the Invoke-Redshift cmdlet to execute pure SQL-92 statements:

    $data = Invoke-Redshift -Connection $redshift -Query 'SELECT * FROM Orders WHERE ShipCountry = @ShipCountry' -Params @{'@ShipCountry'='USA'}
  4. Save a list of the column names from the returned data.

    $columns = ($data | Get-Member -MemberType NoteProperty | Select-Object -Property Name).Name
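
To see what the column-name step produces without a live connection, here is a small sketch that uses hand-built sample objects in place of the Select-Redshift output (the rows and property names are invented for illustration). Get-Member reports each property once and lists the names in sorted order rather than table order; reading PSObject.Properties from the first row is one alternative when the original order matters.

```powershell
# Invented sample rows standing in for data returned by Select-Redshift.
$data = @(
    [pscustomobject]@{ OrderID = 10248; ShipCountry = 'France'; Freight = 32.38 }
    [pscustomobject]@{ OrderID = 10249; ShipCountry = 'Germany'; Freight = 11.61 }
)

# Same expression as in step 4: one name per NoteProperty.
$columns = ($data | Get-Member -MemberType NoteProperty | Select-Object -Property Name).Name

# Alternative that keeps the order in which the properties were defined.
$orderedColumns = $data[0].PSObject.Properties.Name

$columns -join ', '
$orderedColumns -join ', '
```

Either list works for the replication loop, because the values for each row are read in the same order as the column names.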

Inserting Redshift Data into the MySQL Database

With the data and column names collected, you are ready to replicate the data into a MySQL database.

  1. Install the module:

    Install-Module MySQLCmdlets
  2. Connect to MySQL, using the server address and port of the MySQL server, valid user credentials, and the database that contains the table to which the data will be replicated. The variables below must hold the MySQL values, not the Redshift values used earlier:

    $mysql = Connect-MySQL -User $User -Password $Password -Database $Database -Server $Server -Port $Port
  3. Loop through the Redshift data, store the values, and use the Add-MySQL cmdlet to insert the data into the MySQL database, one row at a time. In this example, the target table must already exist in the MySQL database and have the same name as the Redshift resource (Orders).

    $data | % {
      $row = $_
      $values = @()
      $columns | % {
        $col = $_
        $values += $row.$($col)
      }
      Add-MySQL -Connection $mysql -Table "Orders" -Columns $columns -Values $values
    }
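
Before writing to MySQL, it can help to dry-run the loop and check that each values array lines up with the column list. The sketch below uses invented sample rows and a hypothetical stand-in function, Write-RowPreview, in place of Add-MySQL; the function is not part of either CData module.

```powershell
# Invented sample rows and column list; in the article these come from Redshift.
$data = @(
    [pscustomobject]@{ OrderID = 10248; ShipCountry = 'France' }
    [pscustomobject]@{ OrderID = 10249; ShipCountry = 'Germany' }
)
$columns = @('OrderID', 'ShipCountry')

# Hypothetical stand-in for Add-MySQL: formats the row instead of inserting it.
function Write-RowPreview {
    param([string]$Table, [string[]]$Columns, [object[]]$Values)
    "INSERT INTO $Table ($($Columns -join ', ')) VALUES ($($Values -join ', '))"
}

$preview = $data | ForEach-Object {
    $row = $_
    # Collect the values in the same order as $columns.
    $values = foreach ($col in $columns) { $row.$col }
    Write-RowPreview -Table 'Orders' -Columns $columns -Values $values
}

$preview
```

Once the preview looks right, the stand-in call can be swapped back for the Add-MySQL call shown in step 3.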

You have now replicated your Redshift data to a MySQL database. This gives you freedom to work with Redshift data in the same way that you work with other MySQL tables, whether that is performing analytics, building reports, or other business functions.

Notes

  • Once you have connected to Redshift and MySQL in PowerShell and saved the column names in $columns, you can pipe command results to perform the replication in a single line. On one line, the statements inside each script block must be separated by semicolons:

    Select-Redshift -Connection $redshift -Table "Orders" | % { $row = $_; $values = @(); $columns | % { $col = $_; $values += $row.$($col) }; Add-MySQL -Connection $mysql -Table "Orders" -Columns $columns -Values $values }
  • If you wish to replicate the Redshift data to another database using another PowerShell module, you will want to exclude the Columns, Connection, and Table columns from the data returned by the Select-Redshift cmdlet since those columns are used to help pipe data from one CData cmdlet to another:

    $columns = ($data | Get-Member -MemberType NoteProperty | Select-Object -Property Name).Name | ? {$_ -NotIn @('Columns','Connection','Table')}
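
The effect of that filter can be checked on a hand-built object. In this sketch the sample row and the values of its helper properties are invented to mimic what the note describes; the objects actually returned by Select-Redshift may hold different values in those properties.

```powershell
# Invented row that mimics a cmdlet result carrying the three helper properties.
$data = @(
    [pscustomobject]@{
        OrderID     = 10248
        ShipCountry = 'France'
        Columns     = @('OrderID', 'ShipCountry')
        Connection  = 'connection-placeholder'
        Table       = 'Orders'
    }
)

$helperNames = @('Columns', 'Connection', 'Table')

# Keep only the names that are real data columns.
$columns = ($data | Get-Member -MemberType NoteProperty).Name |
    Where-Object { $_ -notin $helperNames }

$columns -join ', '
```

Note that a source table with a real column named Columns, Connection, or Table would lose that column under this filter, so the exclusion list may need adjusting in that case.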