Using LINQ to manipulate data in DataSet/DataTable

Standard

Have you tried a micro ORM for your data access? Click to read more


LINQ (Language Integrated Query) provides a very easy and fast way to manipulate data that is cached in a DataSet. In .Net applications, DataSet represents a disconnected cache of data. Your application typically needs to search, filter thru this data in order to display this data according to the need. DataView provides a very limited options
when it comes to creating a filtered view of the data. LINQ extends these capabilities to a greater extend.

A LINQ query opertaion consists of 3 actions (Ref:MSDN): obtain the data source, create the query and execute the query.

Any datasource that implements the IEnumerable(T) generic interface can be queried thru LINQ. So DataTable objects are good candidates to do any LINQ query opertions, we will see using the following examples, how some common tasks can be done using LINQ queries.

For our example, we will consider that our DataSet has one(1) table with the following schema,

dtCustomer (CustomerID,CustomerName,Address,City,PostalCode,State,Country,PhoneNumer)

A simple select:

IEnumerable query =
    from customer in dtCustomer.AsEnumerable()
    select customer;

Till this point, we have the LINQ query ready, but not executed it yet, query is executed when we actually use it.

foreach (DataRow dr in query)
{
    Console.WriteLine(dr.Field("CustomerName"));
}

At this point, our query is executed and it prints the names of the customer.

We can also, get the resulset as a DataView by simply doing,

DataView view = query.AsDataView();

Most times, when we are dealing with DataSet/DataTable, data we will be creating a DataView as result of our LINQ query. ToList(), ToArray() methods are also very useful when you want to get your resultset
as a generic list or Array (Think AJAX/JSON!).

Lambda Expressions can be used as parameters to LINQ queries.

IEnumerable customers =
    query.Where(c => c.Field("CustomerName").Containes("John"));

//All names that contain "John"
foreach (DataRow cust in customers)
{
    Console.WriteLine(cust.Field("CustomerName"));
}

Simple Where Clause:

EnumerableRowCollection query
              = from customer in dtCustomer.AsEnumerable()
                where customer.Field("State") == "NJ"
                select customer;
            DataView njview = query.AsDataView();

Pretty simple, njview represents all customers that live in NJ.
You can extend the example to add more criteria to your where clause…

EnumerableRowCollection query
              = from customer in dtCustomer.AsEnumerable()
                where customer.Field("State") == "NJ" && customer.Field("PostalCode") == "08807"
                select customer;
            DataView njview = query.AsDataView();

It is useful to note that when you write your where clause, you leverage the power of your C# (or VB.Net) language features to search and filter your resultset using LINQ.

So, A SQL where clause is

where customer.Field("State") == "NJ"
     where customer.Field("State") != "NJ"

A SQL Like clause is

where customer.Field("CustomerName").Containes("John")

Skip and Take allows to get the skip the first n rows or get the top n rows as a result of the query.

EnumerableRowCollection query
              = (from customer in dtCustomer.AsEnumerable()
                where customer.Field("State") == "NJ"
                select customer).Skip(3);

EnumerableRowCollection query
              = (from customer in dtCusomter.AsEnumerable()
                where customer.Field("State") == "NJ"
                select customer).Take(3);

Simple ORDER BY clause:

EnumerableRowCollection query
              = from customer in dtCustomer.AsEnumerable()
                orderby customer.Field("CustomerName")
                select customer;

Above query, gets the result order by customer name (ascending is default). And if you want it by descending order,

EnumerableRowCollection query
              = from customer in dtCusomter.AsEnumerable()
                orderby customer.Field("CustomerName")  descending
                select customer;

Reverse ORDER:

EnumerableRowCollection query
              = (from customer in dtCustomer.AsEnumerable()
                orderby customer.Field("CustomerName")
                select customer.Reverse();

Simple JOIN Operator:

var customers = ds.Tables["Customer"].AsEnumerable();
var orders = ds.Tables["Order"].AsEnumerable();
var query =
    		from customer in customers
    		join order in orders
		on order.Field("CustomerID")
    		equals customer.Field("Customer")
	into custOrders
	select custOrders;

All the examples given above is just the tip of iceberg on what you can do with LINQ and Lambda expressions. There is tons of samples and articles available on Internet on this, if you are looking for a way to simplyfy and speed up your business processing logic in your middle tier, LINQ is something you need to adopt!

Cheers!

  • erik

    Your examples for Take() and Skip() do not seem to work as shown, first is the seemingly out of place parenthesis.

    should the example read like this?
    # EnumerableRowCollection query
    # = (from customer in dtCusomter.AsEnumerable()
    # where customer.Field(“State”) == “NJ”
    # select customer).Take(3);

    Secondly when I try to do something very similar in my own code I get the error message:

    Cannot implicitly convert type ‘System.Collections.Generic.IEnumerable’ to ‘System.Data.EnumerableRowCollection’. An explicit conversion exists (are you missing a cast?)

    which I am guessing is because the Take() function is returning an IEnumerable that is not a EnumberableRowCollection

    I have found that you are able to do something like this to return a DataTable:

    # DataTable dt
    # = (from customer in dtCusomter.AsEnumerable()
    # where customer.Field(“State”) == “NJ”
    # select customer).Take(3).CopyToDataTable();

    but unfortunately that a copy of the datatable is not enough for my needs.

    • NhatNguyen

      int x = 30;
      int y = 12;
      int j = 0;
      var q = from a in data.Tables
      select a;
      count = q.Count();
      for (int i = 0; i <= count; i++)
      {
      j = i + 1;
      Button bt = new Button();
      bt.Size = new Size(200, 74);
      bt.Top = x;
      bt.Left = y;

      bt.Name = q.First().ID.ToString();<- wrong
      bt.Text = q.First().Name.ToString();<-wong
      x += 173;
      if (j % 6 == 0)
      {
      x = 30;
      y += 84;
      }
      pnButton.Controls.Add(bt);
      }
      Can you fix it for me?

      • binu

        First thing I see wrong is that you are looping over a collection to get value from collection to create buttons and you are always getting the First() in the collection???
        May be you need to do this instead??

        var count = query.Count();

        for (var i = 0; i <= count; i++)
        {
        var bt = new Button();

        bt.Name = query.ElementAtOrDefault(i).Name;
        bt.Text = query.ElementAtOrDefault(i).Text;

        }

        *** And as a future note, never expect anyone to fix your code, you will need to read, research, learn to fix it…people can give pointers in right direction, it is up to your to figure out and fix it by yourself.

        Hope this helps!

  • dotnetarchitect

    Thanks for pointing out misplaced parenthesis. I will test the code to see why you were getting an error..Can you tell me what you were trying to do? May be I can help…

    • erik

      My code is very similar to the take() example you gave.

      ideally it would look something like this:
      EnumerableRowCollection query
      = (from row in table.AsEnumerable()
      orderby row.field(“Time”) descending
      select row).Take(5);
      myListView.ItemsSource = query.AsDataView();

      the problem I am running into is that the result from Take(5) is an IEnumerable but not an EnumerableRowCollection

      • dotnetarchitect

        Here is some code where i was able to get this working …there is some variation in approach, but works

        //my dummy data setup
        DataSet ds = new DataSet();
        ds.ReadXmlSchema(HttpContext.Current.Server.MapPath(@"App_Dataproducts.xsd"));
        ds.ReadXml(HttpContext.Current.Server.MapPath(@"App_Dataproducts.xml"));
        DataTable dt = ds.Tables[0];

        EnumerableRowCollection query
        = (from product in dt.AsEnumerable()
        select product);
        //Here you get all data depending on your query
        DataView dv = query.AsDataView();

        //you can take top 3
        var q = (from p in query.AsEnumerable()
        select p).Take(3);

        //get it as a generic list of datarow
        List rowlist = q.ToList();

        Response.Write("Data Row Count = " + rowlist.Count.ToString());

        //we can nest the above 2 statements too
        var q2 = (
        from p in
        (from product in dt.AsEnumerable()
        select product).AsEnumerable()
        select p
        ).Take(3);

        List rowlist2 = q.ToList();
        Response.Write("Data Row Count = " + rowlist2.Count.ToString());

  • Pingback: Linq(2) to DataTable « Steveluo’s Blog

  • Nab

    Hi,

    i want to left join two data table on 2 columns.. i am not getting way to do it.. can you help me?

    my SQL query look like this

    select * from table1 left join table2 on table1.column1=t2.column1 and table1.column2=t2.column2

    Thanks
    Nab

  • surya

    Hi,
    I want to select only the string which is not null and not empty from a dataTable, but my query does not give the desired results. It is geeting all the rows from the dataTable regardless of my query condition.

    my query looks like this:

    EnumerableRowCollection query = from n in dt.AsEnumerable()
    where !String.IsNullOrEmpty(n.Field(“First Name”)) || !String.IsNullOrEmpty(n.Field(“Last Name”)) && !String.IsNullOrEmpty(n.Field(“Business Phone”)) || !String.IsNullOrEmpty(n.Field(“Company Main Phone”)) || !String.IsNullOrEmpty(n.Field(“Home Phone”)) || !String.IsNullOrEmpty(n.Field(“Mobile Phone”))
    select n;

    Can you please help!!!

    Thanks
    Surya

  • Pingback: 2010 in review « dotnet etc.

  • NhatNguyen

    private void LoadButton()
    {
    int x = 30;
    int y = 12;
    int j = 0;
    var q = from a in data.Tables
    select a;
    var count = q.Count();
    for (var i = 0; i <= count; i++)
    {
    j = i + 1;
    Button bt = new Button();
    bt.Size = new Size(200, 74);
    bt.Top = x;
    bt.Left = y;

    bt.Name = q.ElementAtOrDefault(i).ID.ToString();<- it's wrong
    bt.Text = q.ElementAtOrDefault(i).Name.ToString();<-it's wrong
    x += 173;
    if (j % 6 == 0)
    {
    x = 30;
    y += 84;
    }
    pnButton.Controls.Add(bt);
    }

    }
    I fix my code like me, but system have erorr "The query operator 'ElementAtOrDefault' is not supported." I'm a student, I try fix this code but it don't suscessfull. Thanks somuch

  • Ana

    I need to use Linq + datatable with “Count” and “Group by”, Can you give me a example?

  • Pingback: Using LINQ to manipulate data in DataSet/DataTable | Smartech Force Blog