LINQ to HDFS Data



LINQ provides general-purpose query capabilities in the .NET Framework (v3.5+) and offers a straightforward way to access data programmatically through CData ADO.NET Data Providers. In this article, we demonstrate the use of LINQ to retrieve data through the HDFS Data Provider.

This article illustrates using LINQ to access tables in HDFS via the CData ADO.NET Data Provider for HDFS. To do this, we use LINQ to Entity Framework, which generates the connection for you and works with any CData ADO.NET Data Provider.

See the help documentation for a guide to setting up an EF 6 project to use the provider.

  1. In a new project in Visual Studio, right-click on the project and choose to add a new item. Add an ADO.NET Entity Data Model.
  2. Choose EF Designer from Database and click Next.
  3. Add a new Data Connection, and change your data source type to "CData HDFS Data Source".
  4. Enter your data source connection information.

    In order to authenticate, set the following connection properties:

    • Host: Set this value to the host of your HDFS installation.
    • Port: Set this value to the port of your HDFS installation. Default port: 50070

    Below is a typical connection string:

    Host=sandbox-hdp.hortonworks.com;Port=50070;Path=/user/root;User=root;
  5. If saving your entity connection to App.Config, set an entity name. In this example we are setting HDFSEntities as our entity connection in App.Config.
  6. Enter a model name and select any tables or views you would like to include in the model.
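
The connection string entered in step 4 is a semicolon-delimited list of key/value pairs. As a minimal sketch, the standard `System.Data.Common.DbConnectionStringBuilder` class can assemble such a string in code rather than by hand. The class name `HdfsConnectionStringSketch` and the `Build` helper are hypothetical names used only for illustration; the property names and sample values are the ones shown in step 4.

```csharp
using System;
using System.Data.Common;

// Hypothetical helper for illustration; it is not part of the CData provider.
public static class HdfsConnectionStringSketch
{
    // Assembles a semicolon-delimited connection string from the
    // properties listed in step 4 (Host, Port, Path, User).
    public static string Build(string host, int port, string path, string user)
    {
        var builder = new DbConnectionStringBuilder();
        builder["Host"] = host;
        builder["Port"] = port;   // stored as its string representation
        builder["Path"] = path;
        builder["User"] = user;
        return builder.ConnectionString;
    }

    public static void Main()
    {
        // Values taken from the typical connection string in step 4.
        Console.WriteLine(Build("sandbox-hdp.hortonworks.com", 50070, "/user/root", "root"));
    }
}
```

Using the builder avoids quoting mistakes if a value ever contains a semicolon or quotation mark. The resulting string can be pasted into the Data Connection dialog or stored in configuration.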

Using the entity you created, you can now perform select commands. For example:

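    // Dispose the context when finished so the underlying connection is released.
    using (var context = new HDFSEntities())
    {
        // Select every row from the Files table exposed by the model.
        var filesQuery = from files in context.Files
                         select files;

        foreach (var result in filesQuery)
        {
            Console.WriteLine("{0} {1} ", result.Id, result.FileId);
        }
    }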
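
The select command above can be narrowed with the usual LINQ clauses. The following is a sketch of the query shape only: it runs against an in-memory list so that it is self-contained, and `FileRow` is a hypothetical stand-in for the generated Files entity. Only the `Id` and `FileId` properties come from the example above; their string types and the sample values are assumptions. Whether the provider translates a particular operator is covered in the help documentation.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical stand-in for the generated Files entity. Only Id and FileId
// appear in the article; treating them as strings is an assumption.
public class FileRow
{
    public string Id { get; set; }
    public string FileId { get; set; }
}

public static class LinqFilterSketch
{
    // Same query-syntax shape you would write against context.Files:
    // filter on FileId, then sort the matches by Id.
    public static List<FileRow> WithFileId(IQueryable<FileRow> files, string fileId)
    {
        var query = from f in files
                    where f.FileId == fileId
                    orderby f.Id
                    select f;
        return query.ToList();
    }

    public static void Main()
    {
        // Made-up sample rows, used only to make the sketch runnable.
        var sample = new List<FileRow>
        {
            new FileRow { Id = "2", FileId = "sample-a" },
            new FileRow { Id = "1", FileId = "sample-a" },
            new FileRow { Id = "3", FileId = "sample-b" },
        }.AsQueryable();

        foreach (var row in WithFileId(sample, "sample-a"))
        {
            Console.WriteLine("{0} {1}", row.Id, row.FileId);
        }
    }
}
```

With the entity model in place, the same `from ... where ... orderby ... select` expression would be written against `context.Files` instead of the in-memory list.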

See the "LINQ and Entity Framework" chapter in the help documentation for example queries that use the supported LINQ features.