Speeding up OpenNETCF.ORM ?

Mar 8, 2011 at 8:52 PM
Edited Mar 8, 2011 at 8:53 PM
Hi,
I'm using the OpenNETCF.ORM with SQLCE to save enviromental sensor data on a winCE 5 device. Saving work's well, but if there are a few ~100.000 entries the query for a date range it get's very slow. Is it possible to speed up the query? Maybe changing the underlying  database engine?

I'v added some sample source to demonstrate what I mean.

Chris

using System;
using System.Collections.Generic;
using System.Linq;
using System.Text;
using OpenNETCF.ORM;

namespace ConsoleApplication {
    [Entity(KeyScheme.None)]
    public class Datapoint {
        public Datapoint() { }
        public Datapoint(int datastreamID, DateTime timeStamp, double val) {
            DatastreamID = datastreamID;
            TimeStamp = timeStamp;
            LongTimeStamp = timeStamp.Ticks;
            Value = val;
        }

        [Field]
        public int DatastreamID { get; set; }

        [Field(DataType = System.Data.DbType.DateTime)]
        public DateTime TimeStamp { get; set; }

        [Field]
        public long LongTimeStamp { get; set; }

        [Field(DataType = System.Data.DbType.Double)]
        public double Value { get; set; }
    }
}

using System;
using System.Collections.Generic;
using System.Linq;
using System.Text;
using OpenNETCF.ORM;

namespace ConsoleApplication {
    [Entity(KeyScheme.Identity)]
    public class Datastream {
        public Datastream() { }
        public Datastream(string unit, string tag) {
            Unit = unit;
            Tag = tag;
        }

        [Field(IsPrimaryKey = true)]
        public int DatastreamID { get; set; }

        //[Reference(typeof(Datapoint), "DatastreamID", Autofill = false)]
        //public Datapoint[] Stream { get; set; }

        [Field(Length = 10)]
        public string Unit { get; set; }

        [Field]
        public string Tag { get; set; }
    }
}


using System;
using System.Collections.Generic;
using System.Text;
using OpenNETCF.ORM;
using System.Reflection;

namespace ConsoleApplication {
    class Program {

        static SqlCeDataStore Store = new SqlCeDataStore(@"\Hard Disk2\data1.sdf");

        static void Main(string[] args) {
            Store.DeleteStore();
            Store.ConnectionBehavior = ConnectionBehavior.Persistent;
            Store.DiscoverTypes(Assembly.GetExecutingAssembly());
            if (!Store.StoreExists) {
                Store.CreateStore();
            }

            List<FilterCondition> fcl1 = Filter1();

            Datapoint[] foo = new Datapoint[1];

            DateTime sw1 = DateTime.Now;

            for (int i = 1; i < 11; i++) {
                int j = FillTable(10000, i);
                sw1 = DateTime.Now;
                var boo = Store.Select<Datapoint>(fcl1);

                Console.WriteLine("Inserted {0}, Query: {1}",
                    i * 100000,
                    (DateTime.Now - sw1).TotalMilliseconds.ToString());
            }
          
            // Verify
            foreach (var item in foo) {
                if (item.TimeStamp < DateTime.Today 
                    || item.TimeStamp > DateTime.Today.AddMinutes(100)
                    || item.DatastreamID != 1) {
                    Console.WriteLine("Fail");
                }
            }
            Console.ReadLine();
        }

        private static List<FilterCondition> Filter1() {
            List<FilterCondition> fcl = new List<FilterCondition>();
            fcl.Add(new FilterCondition
            {
                FieldName = "Timestamp",
                Operator = FilterCondition.FilterOperator.GreaterThan,
                Value = DateTime.Today
            });
            fcl.Add(new FilterCondition
            {
                FieldName = "Timestamp",
                Operator = FilterCondition.FilterOperator.LessThan,
                Value = DateTime.Today.AddMinutes(100)
            });
            fcl.Add(new FilterCondition
            {
                FieldName = "DatastreamID",
                Operator = FilterCondition.FilterOperator.Equals,
                Value = 1
            });
            return fcl;
        }

        private static int FillTable(int noOfDp, int instanceID) {
            Datastream ds = new Datastream("°C", "Test_" + instanceID);
            Store.Insert(ds);
            var obj = Store.Select<Datastream>("Tag", "Test_" + instanceID)[0];
            Random rnd = new Random();
            for (int i = 0; i < noOfDp; i++) {
                Datapoint dp = new Datapoint(obj.DatastreamID, DateTime.Today.AddMinutes(i), rnd.NextDouble() * 10);
                Store.Insert(dp);
            }
            return obj.DatastreamID;
        }
    }
}

Coordinator
Mar 8, 2011 at 8:56 PM

Try setting the FieldSearchOrder  on the FieldAttribute of the TimeStamp Field (or whatever field you’re searching on).  It will add an index on that column, which should grealy improve Select speed.

Let me know if that improves things (and by how much if you're capturing that data now).  If not, I'll dive into the code and see what I can find.

Mar 10, 2011 at 1:41 PM

I have added a FieldSearchOrder, the queries are a lot faster now, about ~800ms.

But there is now another problem. My testing DB is ~78MB with 1112358 rows. It's located on a high-speed CF-Card.

Until the db-file is opened it took about 5minutes. Maybe it's the reading speed of the CF-Card, but I cannot test this because the internal drive space isn't so much.

Coordinator
Mar 10, 2011 at 1:59 PM

The first time you open it with the updated schema (adding the index) it's going to take a while because it has to actually build and populate the index from the table.  The more rows that are in the table, the longer that process will take; meaning that if the table is empty, building the index is very quick.  Once the index is built (i.e. any time after the first) then it should be just as fast to start up as before (well slightly slower since it has to ensure the index exists, but that's it).

Mar 10, 2011 at 3:39 PM

I've created the testing database with index on my development machine and copied the test db to my device.

You're right, only the first startup lasts ~90sec. All further startups are fast as the queries =)

Now I just need a way to deploy a persistent install of the SQLCE on my device :/

Coordinator
Mar 10, 2011 at 3:43 PM

Excellent.  So adding the FieldSearchOrder improved query perf  for you.  How long was the query taking before? I see it's down to ~800ms on a 1.1M row table, but I don't see how long it was taking originally.

Mar 10, 2011 at 4:57 PM

A lot longer, up to 10minutes. But I didn't filled the table with 1.1M rows on first tests, just about 400.000. 

I also have benched my CF-Card directly on the device. Read/Write ~100kbyte/s

Mar 14, 2011 at 11:04 AM

I think there is another speed issue. Aver every INSERT in the database the index seems to get rebuild. The first SELECT after the INSERT took ~10seconds, all subsequent SELECT's took about 800ms. This can be repeatedly observed.

Is it possible to update the Index right after the INSERT?

Coordinator
Mar 14, 2011 at 2:44 PM

That definitely should not be the case.  When you do a table insert, a row gets inserted by the DB engine into the index at the same time.  Can you provide repro steps for this behavior?

Mar 14, 2011 at 8:18 PM

I tried to reproduce this on a fast computer, it happens there also. But not that often as on a Windows CE5 device.

I uploaded a sample to my dropbox. Source + binaries for full .net.

http://dl.dropbox.com/u/15578628/ConsoleApplication.zip

Source is .netcf compatible.

Coordinator
Aug 25, 2011 at 8:16 PM

Chris,

Are you still seeing this?

Aug 26, 2011 at 11:53 AM

Yes, it's still there after updating to the latest build.

I think it's a SqlCE related issue.

Coordinator
Aug 26, 2011 at 5:38 PM

I've tossed the question over to the SQL Compact team to see if we can get any insight, but I'd agree that it's strange behavior.  I'll let you know if we come up with anything.

Coordinator
Aug 29, 2011 at 4:11 PM

So after discussing it with the SQL Compact team and doing some testing it turns out that the intermittent slow Select call is caused by stale index statistics.  I've added the ability for the ORM engine to cache commands for parameterized queries (it's on by default in the latest code check-in but can be turned off for the old behavior).  By re-using an already compiled command, the statistics are not recomputed and I no longer see the intermittent slow down.