Excel C# Interop

imageI made a mistake.

I had a project in mind – to pull information from basecamp classic (todo items, with metdata embedded), massage it in Excel, and then upload it back to basecamp.   

I have always used Excel (or some equivalent spreadsheet) for project tracking – I never know which columns I need to add, when I need to highlight things in different colors, what kinds of charts I need to pull.  So the decision to bring the information down to Excel was an easy one.

That cost me 1.5 days of pain.

In retrospect, it would have been much easier to use EF Code First against .\SQLEXPRESS and use SSMS to do the bulk of my editing.   But, not having the time to re-tool, I pushed forward, and I did get it to work well enough to help me next week as I start work with that project…

Anyway, this is what I learned along the way:

Workbook.Save();
Workbook.Close(false);
Application.Quit();

The recipe for closing down the instance of Excel that had been open.   Without getting asked if I want to save.

Marshal.ReleaseComObject(sheets);

You have to do this for pretty much ever object you dink with, else the COM Excel wrapper stays open in the background. Reference: http://stackoverflow.com/questions/158706/how-to-properly-clean-up-excel-interop-objects). I haven’t fixed this yet.

Sheet.Cells[1,1].Value
Sheet.UsedRange.Cells[1,1].Value

Cells start at index 1, not 0.   UsedRange is faster than Cells.

var connectionString = "Provider=Microsoft.ACE.OLEDB.12.0;Data Source=" + fullName + ";Extended Properties=\"Excel 12.0 Xml;HDR=YES\"";

You can use ACE to query spreadsheet data, however, type-coercion is sketchy.  As in, if you have no data, that long field will come back as string.    I ended up ditching that and reading/writing my lists of stuff directly from the worksheet so I could have direct control over the types:

Worksheet sheet1 = b.Worksheets["Tasks"];
t2 = sheet1.ReadListFromSheet<MyTask>();
sheet1.UpdateListToSheet(t2, "TodoItemId");
        public static List<T> ReadListFromSheet<T>(this Worksheet sheet) where T:new()
        {
            var nameToCol = GetNameToColumnDictionary(sheet);
            var colToSetter = GetColToSetterDictionary<T>(nameToCol);

            var result = new List<T>();
            var rowCount = sheet.UsedRange.Rows.Count; 

            for (int row = 2; row<=rowCount; row++)
            {
                var item = new T();
                foreach (var e in colToSetter)
                {
                    var col = e.Key;
                    var setter = e.Value;
                    var o = sheet.Cells[row, col].Value;
                    if (o == null) continue;
                    Type source = o.GetType();
                    var target = setter.GetParameters()[0].ParameterType;
                    try
                    {
                        object o2 = null;
                        if (target == typeof(bool?))
                        {
                            o2 = Convert.ChangeType(o, typeof(bool));  // bypass the null stuff
                        } else if (target == typeof(int?))
                        {
                            o2 = Convert.ChangeType(o, typeof (int)); 
                        } else if (target == typeof(DateTime?))
                        {
                            o2 = Convert.ChangeType(o, typeof (DateTime));
                        }
                        else
                        {
                            // this generic conversion seems to catch most everything
                            o2 = Convert.ChangeType(o, target);
                        }
                        setter.Invoke(item, new object[] {o2});

                    }
                    catch (Exception ex)
                    {
                        Console.WriteLine("Need to handle from type: {0}, value:{1} to setter type: {2}", source.FullName, o.ToString(), target.FullName);
                    }
                }
                result.Add(item);
            }
            return result; 
        }

        public static void UpdateListToSheet<T>(this Worksheet sheet, IEnumerable<T> lists, string pkColumnName)
        {
            pkColumnName = pkColumnName.Trim().ToLower();

            // Grab the list of columns
            var nameToCol = GetNameToColumnDictionary(sheet);

            // get a mapping from list member to column number / vice versa
            var colToGetter = GetColToGetterDictionary<T>(nameToCol);


            // grab the mapping of pk to row number
            if (!nameToCol.ContainsKey(pkColumnName)) throw new Exception("Could not locate primary key " + pkColumnName + " on sheet " + sheet.Name);
            var pkColumnNumber = nameToCol[pkColumnName];
            var pkToRow = new Dictionary<string, int>();
            var emptyRows = new List<int>();
            for (var r = 2; r <= sheet.UsedRange.Rows.Count; r++)
            {
                var pkValue = sheet.Cells[r, nameToCol[pkColumnName]].Value;

                if (pkValue == null)
                {
                    emptyRows.Add(r);
                }

                var test = pkValue.ToString();
                if (test == String.Empty) continue;
                pkToRow[test] = r;
            }

            // get a direct line to the getter of the pk
            if (!colToGetter.ContainsKey(pkColumnNumber)) throw new Exception("Could not locate primary column in DTO????");
            MethodInfo pkGetter = colToGetter[pkColumnNumber];

            // write the list contents into matching column names in the spreadsheet, updating as we go
            foreach (var item in lists)
            {
                var pkVal = pkGetter.Invoke(item, null).ToString();
                if (String.IsNullOrEmpty(pkVal)) continue; // inbound item has no PK

                // figure out which row to write to
                int rowToUpdate = -1;
                if (pkToRow.ContainsKey(pkVal))
                {
                    // row already exists .. update it
                    rowToUpdate = pkToRow[pkVal];
                }
                else
                {
                    // row does not exist .. append it
                    if (emptyRows.Count > 0)
                    {
                        rowToUpdate = emptyRows[0];
                        emptyRows.RemoveAt(0);
                    }
                    else
                    {
                        rowToUpdate = sheet.UsedRange.Rows.Count + 1;
                    }
                }

                // write it out
                foreach (var e in colToGetter)
                {
                    int col = e.Key;
                    MethodInfo getter = e.Value;
                    var o = getter.Invoke(item, null);
                    // any translations from .Net to Excel would happen here - none so far
                    sheet.Cells[rowToUpdate, col] = o;
                }
            }
        }

        #region Private

        private static Dictionary<string, int> GetNameToColumnDictionary(Worksheet sheet)
        {
            var nameToCol = new Dictionary<string, int>();
            for (var c = 1; c <= sheet.UsedRange.Columns.Count; c++)
            {
                var columnName = sheet.Cells[1, c].Value;
                if (!(columnName is string)) continue;
                if (String.IsNullOrEmpty(columnName)) continue;
                columnName = columnName.Trim().ToLower();
                nameToCol[columnName] = c;
            }
            return nameToCol;
        }

        private static Dictionary<int, MethodInfo> GetColToGetterDictionary<T>(Dictionary<string, int> nameToCol)
        {
            var colToGetter = new Dictionary<int, MethodInfo>();
            var properties = typeof (T).GetProperties(BindingFlags.Public | BindingFlags.Instance);
            foreach (var p in properties)
            {
                if (!p.CanWrite || !p.CanRead) continue;
                MethodInfo mget = p.GetGetMethod(false);
                if (mget == null) continue;
                var lowerPropName = p.Name.ToLower();
                if (!nameToCol.ContainsKey(lowerPropName)) continue; // not present in target xls
                colToGetter[nameToCol[lowerPropName]] = mget;
            }
            return colToGetter;
        }

        private static Dictionary<int, MethodInfo> GetColToSetterDictionary<T>(Dictionary<string, int> nameToCol)
        {
            var colToGetter = new Dictionary<int, MethodInfo>();
            var properties = typeof(T).GetProperties(BindingFlags.Public | BindingFlags.Instance);
            foreach (var p in properties)
            {
                if (!p.CanWrite || !p.CanRead) continue;
                MethodInfo mset = p.GetSetMethod(false);
                if (mset == null) continue;
                var lowerPropName = p.Name.ToLower();
                if (!nameToCol.ContainsKey(lowerPropName)) continue; // not present in target xls
                colToGetter[nameToCol[lowerPropName]] = mset;
            }
            return colToGetter;
        }
  • Not yet memory-leak proof
  • Can only use a long as an id
  • Does not handle nullable columns gracefully

I’ll post more later on what I was trying to do.   I did finally get it to work.   Sketchy as it all is.  I love that word Sketchy… thank you Rutledge Wood!

Now I have to go pick up my race bib for a 10 mile race tomorrow. 

Tagged with: , , , ,
Posted in Code

Leave a Reply

Fill in your details below or click an icon to log in:

WordPress.com Logo

You are commenting using your WordPress.com account. Log Out / Change )

Twitter picture

You are commenting using your Twitter account. Log Out / Change )

Facebook photo

You are commenting using your Facebook account. Log Out / Change )

Google+ photo

You are commenting using your Google+ account. Log Out / Change )

Connecting to %s

Categories
Tags
.net 3d 3d-printing 4k abc15 algorithms ames android anonymous types asp.net audio editing aws backup basecamp beatunes biorhythm bitcoin bittorrent blender blog boston marathon bpm c# caffeine campfire candycrush car carmax charity chiropractor cities-skylines clog clone codelouisville codepalousa coding coffee collaboration color run ComputerElbow ComputerVision configuration consulting cooking crash course crashplan crestwood cycling dabda dan dapper DataSet ddl diabetes dictation dotnetcore dotnetmud downtown e-cycling elite excel exercise expiration facebook feature-branching firefall flipflops Flow FL Studio focus food forecastle fortresscraft franklinplanner gadgets game-design games git github google docs google maps gopro gps grandpa greenshot hack half marathon headless health heart rate hiren ignew integration testing interop inventory ios ipad itunes javascript jobs karma kdf keyboards keys kittens lamont laptop lavalamp lego life lifehack linq linqtotwitter linux los angeles louisville mandelbulber massage therapy mastery-teaching maths merge metformin Minecraft miniature modeling monitor mud muhammad ali institute music mvc mycartracks netfabb nexus10 node nostalgia nutrition nwipe oldham county grand slam opal openjscad openscad owin pacedj paper mockup pepakura performance photoscan politics pomodoro postgresql powershell premiere prius process product-management project-management qa ReFS resharper review rmi roman road 5k RSI rubiks running samsung 700t sandals schedule scooter scribblelive selenium service shapeways sleep slic3r sneakersync snot software software-engineering solidoodle soylent spacegame speaking sql sqlite SSDT SSIS standing state-machine stayfocusd stonehearth sunset tablet teaching team teamcity teamtreehouse terraform tesla testing tfs time timelapse torque touch tracks trs80 Tuple tutor twitter ubuntu unit testing utilities video video editing visual studio vscode vsvim warp stabilizer windows 8 windows home server wordpress wpf xml
Archives
%d bloggers like this: