Having Fun with SSIS

My new project involves getting data from 500+ remote sites into a target database; in parallel, quickly.  Thanks to some proof of concept work done by my coworker Mike “The Devastator” Hurd (I made up the nickname), I got to learn SSIS, and I’m cursing that I had not picked it up earlier. 

What Is SSIS?

  • Imagine you wanted to write a generic program to get data from point A to point B, but modify it along the way.
  • You wanted pluggable components that could talk to each other.
  • You wanted anything that could be done in parallel to be done in parallel.  Ie, read from two tables at the same time if you have bandwidth.
  • You want it to work buffered rather than row by row.  But dynamically figure out buffer sizes so that they fit well to the page size on the computer (to minimize swapping)
  • You want a GUI that you can configure these components with.
  • You want every component to have a configurable “if things fail” state, to redirect failed rows to other components, etc.
  • You want it to be solid – no leaking memory, etc
  • You want to parameterize the stuff getting into the components
  • You want it to be runnable from a command line, with ability to override parameters and connection strings.
  • You want the whole configuration to be saveable as an XML file, or hosted in a SQL Server
  • You want to be able to schedule it to run from SQLAGent (SQL Server’s “Cron” thing)
  • You want to be able to run it directly from C# via managed code.

Then you would write SSIS.

What Does It Look Like?

image

image

image

image

Initial Gotchas

SSIS requires some level of licensing, so this does NOT work:

  • Install SQLEXPRESS
  • install Visual Studio 2012 Premium
  • Install SSIS add-on
  • Install SSDT (“database tooling for VS2012”)

It successfully loads, and you can debug packages within Visual Studio, but you cannot run them outside via dtexec and dtexecui (the command line thingies for SSIS).  Instead, you need to add:

  • Install SQL Server Developer Edition (which includes SSIS)  (or standard edition)
    • Select Integration Services & Data Tools
    • Do NOT have to install the engine.
    • Might have to install an updated SSDT to get VS2012 support

Then everything runs fine.

The other thing I quickly learned is it is very Schema-dependent.  The reason for this is it allocates row counts into buffer sizes based on the size of the rows – so a Varchar(MAX) field is very different from a Varchar(250) field – and the schema will fail to validate if the underlying data source it is reading from does not match perfectly.

The solution I had to apply was to do a select cast(x as varchar(250)) as x to extract the data to guarantee the schema coming in to SSIS.  

Conclusion

Very glad to have this tool in my arsenal.   Thanks Mike!

Day at the Office: Using vi’s Macros

I had a method which was too long.  It was a switch/case statement, with a bunch of processing at each case label.    There were a lot of local variables that were being used:

var foo=something; 
var bar=yetsomethingelse; 
var scooby=another thing; 
var fred=yetanotherthing; 

switch(thingy) { 
  case A: 
    stuff with foo and bar; 
    break; 
    
  case B: 
    stuff with foo and scooby; 
    break; 
    
  case C:
    stuff with foo, bar, and scooby; 
    break;
    
  case D:
    stuff with foo, bar, and fred; 
    break;
    
  default:
    throw new exception(); 
    break;
}

In this mockup I only have 4 variables; the real one had 12.   And each case label was probably 20-30 lines of code.

Apply Refactoring: Replace Local Variables with a Class

And the name of the class shall be context. 

So I had a lot of errors, where this:

if (something(fred)) anotherthing(barney)

had to become this:

if (something(c.Fred) anotherthing(c.Barney);

The Vi Macro

The exact keystrokes were:

  Begin with cursor on fred
qn Begin Recording a Macro to register n.  I use n because its on the right hand, and the @ symbol to apply a macro is on the left hand.
~ Change case (now we have Fred)
h Go left one (because ~ moves you forward)
i Enter insert mode
c. <Esc> type in “c.” and exit insert mode
l move the cursor right one.
q Finish recording

 

Then, I hit rebuild (F6), looked at the list of errors, and then primed myself:

/fred @n  / enter  (find fred, fix fred macro, find next fred)

@n / enter   (fix fred, find next fred)

Apply Refactoring: Extract Methods

Now that I had a reasonable list of variables (one mega-variable, the “context”, and some other minor ones), I could take the meat of each case block and extract out a method for just that case block.

The result: cleaner code… and in one case, I could pass my context variable one layer deeper and extract out some other meaningful stuff from it (which I had not pulled out into a variable before).

Save the Wrists! Ordered a new keyboard / Code Dictation?

My wrists have been numbing more lately.  At my wife’s suggestion, I’ll bring it up with my GP at my appointment next Wednesday.  Meanwhile:

  • Switch to using the trackball with my left hand.
  • Switch keyboards.

This is the keyboard I’ve been using for the last 2 years or so – Microsoft Wireless Comfort Keyboard 5000.   It has some level of curve in it, however, I still have Ulnar Deviation and Pronation when using it.  The keyboard action is very light, and its been good to me for a while.   It is wireless .. less cords to deal with.

image

This is my older, wired, USB keyboard.  Microsoft Ergonomic Keyboard 4000 v1.0.  I switched back to it, but what I find is that the keys take too much pressure to press.. and my hands are tired.   I couldn’t do it.    The keyboard is also larger, which means I need to stretch my fingers further to press the keys.  However, Ulnar deviation is a bit better.

image

Just for fun, I tried using both at the same time:

image

This worked surprisingly well.  It was fluid, except when I needed to switch to the mouse.   I’m used to my keyboard being a single keyboard, and something about having two keyboards confused me; I didn’t know which hand to move.  I guess I know my “home” position based on the distance between my hands, and if I move one hand, I cannot find home easily without looking. 

I did some research and decided to order this keyboard, it should arrive next week sometime:

 

In the mean time some of the things I can do are:

  • type at a deliberate rate. (not look at keyboard)
  • type with one hand only.   (requires looking at keyboard)
  • I can also just use one finger from each hand.  By doing this the entire hand moves, and ulnar deviation is removed.   It is also quite fast.  (requires looking at keyboard)
[youtube=http://www.youtube.com/watch?v=xLR0Nwh1l6o&w=448&h=252&hd=1]
What did I type?

This is me typing with one hand.

In light of Hanselman’s new app which allows dictation from an iPhone, I wonder if there are any programmer-specific dictation tools?  I have looked at vimspeak, however that mostly addresses the control aspects of using vi.   Think of it:

public int Add(int a, int b) { return a+b; } 

“public int shift A d d paren open int a comma int b paren close curly open return a plus b semi close curly” ?

Other folks have talked about it over here: http://productivity.stackexchange.com/questions/3605/how-can-we-use-dragon-naturallyspeaking-to-code-more-efficiently

The problem is one of lost context.  In programming we are trying to be specific about contexts – usually using parens and curly’s as visual queues – for example, that “int a” and “int b” are parameters of Add.   The solution seems like it would be codifying contexts into spoken word, by adding extra keywords to tell the parser what we are trying to do.  For example:

“function <name>” => starts adding function, puts you in function mode; creates defaults; additional things make sense in this mode:

  • “parameter x”
    • [of type] y
  • “returns [type] ”
  • [visibility] “public” | “private” | “internal”   // keywords can infer that we meant visibility
  • “body”  // go to editing body mode

So theoretically, “function Shift a d d visibility public returns int parameter a of type int parameter b of type int body return a plus b”, or more specifically:

What is said What the code looks like (using underscore to denote cursor)
function object a01() { }
Shift A object A() { }
d d object Add() {  }
visibility object Add()  {  }
public public object Add() {  }
returns public object Add() { }
int public int Add() {  }
parameter public int Add() {  }
a public int Add(object a) { }
of type public int Add(object a) {  }
int public int Add(int a) { }

etc.    Ie, you map the spoke word into meanings that then get applied to the editor.

You could even navigate “into” and “out of” contexts by using clicks and clucks – sounds that are easy to make yet are definitely not letters.   “down” and “up” would apply at the context level – ie, statements, or functions.

It is an interesting problem.  It could be a fun thing to solve.  Any takers?

Day at the Office: Visual Studio Express 2013

Customer has requested sufficient documentation that after we leave the project, they could bring on another developer and they could get up to speed efficiently.

imageSo, here’s my table of contents that I’m filling out (sure to change as it progresses).    I’m currently working on the “Build From Scratch”, which, to avoid any future awkwardness, means “Create a blank windows 7 virtual machine, Install visual studio and Compile.

Just to see if it was possible, I tried installing the free Visual Studio Express 2013 for Desktop instead of the paid version.

It works.

One of my resentments against Microsoft – from back when I was a Unixite and a Micro$oft Hat3r – was that the average joe can’t afford the tools.   (Whereas, Perl and Python and gnu C++ were free).   This is no longer true

You can download and run VS2013 Express for Web, or Desktop.   Limitation:  no code analysis stuff; possibly harder to integrate 3rd party tools.   Includes: NuGet package manager, Intellisense, Multiple projects per solution, etc.

You can download and run SQLExpress.  Its pretty much the full blown SQL except it limits itself to 1 core and 1G of memory.  

I think that pretty much covers most of the software development I’ve done in the last 14 years, and gotten paid for.

Good Job, Microsoft.  I think I can be proud of you now.

Super Secret Project

I hav2014-05-30 12_42_47-Michael Thornberry (WaywardMage) on Twittere a coworker who loves to code.   He has a list of technologies that he’s been just hoping to find a problem for.   Angular, Xamarin, WebAPI, all kinds of stuff.

I do have such a list; for me, they  include things like Angular and Erlang.    I also have a separate list of projects that I would like to code (or have coded someday), problems that need to be solved.    In the past, I would start to work on one of these projects.. get 2-3 hours in, realize that the full solution is more like 50-60 hours away.. promise myself to work on it the next weekend..  and then life moves on, something else comes up, the need is just not that great, and it doesn’t get done.

// TODO: separate blog post on obsolescence

Synergy is Born

At the end of a BrainNom Monday (where we sit around and watch instructional videos on the web or delve into new subjects as a group of software engineers), I mentioned some of these side projects I had never gotten around to.

He got pretty excited.    I have given him a reasonable target to try to hit with this array of tools he’s been sharpening.      So we started a collaboration.

The Experiment

I’m taking on the role of product manager – dealing with what the app should do.. how it handles multiple people.. the user experience … sharing .. stuff like that.  And he’s pursuing the coding.

He gives me feedback on the requirements; I give him feedback on the technology.

And we’re starting to build this app!

He has laid out the project structure, which includes a bunch of words I had not heard of before.. Ionic, Cordova, Ripple.    He showed them to me today.     I have a bunch of requirements and data structures and use cases and build orders of features figured out – keeping an eye out for what’s the shortest path to get to a minimum viable product, but to have enough foundational UI that the future features have a place to sit without a UI (and UX) rewrite.

The App

Will be revealed in due time.  Either it will go live, and then we’ll have a whole series of blog posts of how we got there (with full source disclosed), or it won’t, in which case we’ll blog about the demise.   I could give you a technically truthful teaser and say it has to do with temperature and burning, but I’d be trying to throw you off-course, of course.

I think we should come up with a cool code-name for this app, so I can talk about the app without actually giving it away.   Hmm.

Day at the Office: Just another Day, Nothing Special

I have several 3d printing posts queued up on a every-other-day schedule.  I feel I really should write some code, or something, to re-present the part of the that does code for a living.

But today was .. just an uneventful day.

  • I did the weekly Status Report.    I’m doing some fun stuff in there, trying to represent the flow of work in and out of the process.. when things go to test, when they go to deployment, when bugs are introduced vs discovered, etc.   In color.  In Excel.  image
      • I wonder how much I would get out of marking “when bugs are introduced”.   Kinda see the full cost of deploying a feature – also seeing all the bugs that might have been introduced at the same time.
  • I worked on two bugs.  
    • The first one was a GUI thing – some stuff on the screen was flickering.  Turns out, some XAML had gotten changed to {Binding … UpdateOnPropertyChange}, and the underlying property was being += ‘ed in a loop.  So every time the loop ran through, it updated the UI.   
      • Fix #1: accumulate in a temporary variable and then assign at the end.   
      • Fix #2: in this minor control, turn off the automatic NotifyOfPropertyChange on every set, and provide a NotifyOfPropertyChange() method that a parent could control.. and then modify all parents to call NotifyOfPropertyChange outside of the loops that updated stuff.
      • Of course I profiled it before and after.   Went from long drawn out CPU hog-ness to itty bitty spikey. 
    • The second was a business rule implementation bug.   The client has several settings for something – lets call it a light saber setting – and the settings were something like “only cut through meat”, “cut everything”, “don’t cut through living tissue”, and “don’t cut green things”.  Turns out, green things needed to be excluded from one of the meat settings as well (or something like that).
      • Most of the time of the fix was me asking the client, “so what you mean is this”. 
      • I got it wrong at first.   I thought they meant the light saber dispenser setting, “only dispense to Jedi’s but also to green things”.   When I fully explained what I was going to do (with screenshots!), the client realized the miscommunication and corrected me.
      • Actual size of fix:   less than 20 characters.
  • Then I left early, and logged on at night, and did a production release in 1.5 hours.
    • Backing up the Prod database, of course.
    • Running database deployment scripts (I use a can-rerun-indefinitely approach with some batch scripts)
    • Using Teamcity to deploy to production.  Over VPN, it takes about 20 minutes.   But I had to do it twice, due to a failed VPN.
    • Testing the app after deployment – that the 6-7 fixes we deployed were working.  I didn’t check all of them, just the ones involving database changes and settings migrations, etc.
    • And the communication that the deployment happened, what to expect, for the brave soul who herds the client’s users.

Just another day.    Nothing exciting, just work.    A Good day.

Saving Wrists

image

My wrists have been hurting lately.. especially the right one.  Wife thinks I have carpel tunnel syndrome, she might be right.   I already have the ergonomic keyboard, and a trackball; I can use a mouse in either hand, with either button configuration.    However, when working with code, there’s definitely a “switch hand to arrow keys” and “switch back” repetitive thing that happens.  (at least for me).  So, this journey to save on keystrokes and wrist movements.

Step 1:  Try Not to Use the Mouse

I started by putting the mouse very far away from me.   This forced me to try to find keyboard shortcuts for most of the things I was trying to do – especially switching windows.  Here are some of the ones I use now;  most of these are not the default keyboard combinations, but rather the secondary keyboard combinations, which I was left with after vsvim got installed.

Shft-Alt-L or Ctrl-Alt-L Solution Explorer
F5 Build + Debug
F6 or Ctrl-Shift-B Build
Ctrl-Alt-O Ouptut Window
Ctrl-R Ctrl-R Resharper Refactor
Alt-~ Navigate to related symbol
Ctrl-K C Show Pending Changes
Ctrl-T, Ctrl-Shift-T Resharper Navigate to Class / File
Alt-\ Go to Member
Ctrl-Alt-F Show File Structure
Alt-E O L Turn off auto-collapsed stuff

I also assumed a layout where I have a bunch of code windows, and all other windows are either shoved over on the right or detached and on another monitor.   No more messing with split windows all over the place.  By using a keyboard shortcut, wherever the window is, it becomes visible.  I don’t hunt around in tabs anymore.

image

Step 2: VsVim

History

I first learned vi in 1983, on a Vt100 Terminal emulator connected via a 150 baud modem to the unix server provided by Iowa State University’s Computer Science department.   (I was still in high school, I was visiting my brother who was a graduate student at the time).  There was some kind of vi-tutor program that I went through.    It was also much better than edlin and ed, which were my other options at the time.

Anti-Religious-Statement: I used it religiously till 1990, when learning LISP, I also learned to love emacs.   Yes, I stayed in emacs most of the time, starting shell windows as needed.   

I maintained a proficiency in both vi and emacs till 2001, when I got assimilated by .Net and left my Unix roots behind.

And Now

Having had a history with it, I decided to try vsvim and see how quickly things came back to me.

The first thing I noticed is that every other program I used, whenever I mis typ hhhhcwtyped something, I’d start throwing out gibberish involving hhhjjjjxdw vi movement commands.  And pressing ESC a lot.   I (am still having to) to try to train my eyes to only use vi commands when I saw the flashing block yellow cursor that I configured it to be.

imageimage

I also had to un-bind a few things – for example, vi’s Ctrl-R is a lot less useful to me than Resharper’s Ctrl-R Refactorhhhhhhhhhhhhhhh   I did it again.   vi’s Ctrl-R “redo” I could just do :redo instead.

And where am I now?  I still need to think about it a bit..  but, for example, recently I changed some code from being a static Foo.DoSomething() to a IFoo.DoSomething(), and I had to inject the class in a bunch (10+?) of constructors.   The key sequences went something like this. (R# in Red, vsvim command in blue)

ALT-\ ctor ENTER Jump to the constructor in the file (R#)
/) ENTER Search forward for “)” and put cursor there (/ and ? go across lines, fF are in current line only)
i, IFoo foo ESC Insert “, IFoo foo”
F, Go back to the comma
v lllllllllll “ay Enter visual select mode, highlight going right, to buffer A, yank (copy); cursor back at ,
/foo ENTER jump forward to foo
Alt-ENTER ENTER Use R# “insert field for this thingy in the constructor” thingy
Ctrl-T yafvm ENTER Use R# Go to Class, looking for YetAnotherFooViewModel  (most of the common things I work with have a very fast acronym.  For example “BasePortfolioEditorViewModel” is “bpevm”.  I can also use regexp stuff)
Alt-\ ctor ENTER Jump to constructor
/) ENTER Go to closing brace
aP paste from buffer A before cursor

If this sounds like complete gibberish …  yes it is.  But here’s the thing:

  • I am talking aweZUM s3krit c0dez with my computer!
  • My fingers are not leaving the home position on the keyboard.  Not even for arrow keys.
  • By storing snippets of text into paste buffers (a-z, 0-9, etc), I can avoid typing those things again, which is very useful.
  • If I plan ahead a bit I can save a lot of keystrokes trying to get somewhere in a file.
  • Once I enter insert mode, Its just like normal – can still use arrow keys to move around, shift-arrow to select, etc.

It is geeky, nerdy, experimental, and it might be helping my wrists a bit.   1 week so far, still going okay.

another trick I use:  variable names like “a”, “b”, “c” .. and then I Ctrl-R Ctrl-R rename them to something better later.

I would not recommend trying to learn vi without a vi-tutor type program

Day at the Office: LINQ vs Stored Procedure

I love using LINQ (to EF to SQL).   It takes care the plumbing of getting data out of the database.  But there was one case where I had to swap it out for using a stored procedure.

The Setup

There were several screens that showed similar data, but constrained in different ways.  They all shared common code along these lines:

Base:

iQueryable Q  // defines the base query that all screens need

Overrides 1..N:

IQueryable Filter(IQueryable Q) { 
    return Q.Where(… additional clauses…); 
    // specific to the screen 
}

and then calling back to some Shared code:

IQueryable Filter(IQueryable Q, FilterSettings fs) {
  // apply standard filters that user could set at the app global level
}

The app used to then take this IQueryable and send it over to a ProjectAintoB() function that did the actual query (.ToList()) and then mapped that stuff out to a viewmodel / list item thing to shove into a grid:

return new GridItem { 
    Description = Q.Description, 
    ClientName = Q.Client.Name, 
    ManagerName = Q.Client.Manager.FullName, 
    Exceptions = Q.Exceptions, 
    NotesCount = Q.Notes.Count()
};

Note: this is simplified.  The original GridItem class had somewhere around 35 members, across 8-9 tables.

When we first started using the app, this worked well enough.   It returned quickly enough to not be noticeable.   However, as the app started to grow, things started to get noticeably slower, so we went onsite and did some profiling, and this is what we found:

image
Actual screenshot/image from original analysis document around the problem

  • At the client site, it was a lot longer, with CPU pegged at 100% for 2-3 seconds.
  • Turns out, most of that CPU was LINQ attempting to build the query, not actually executing the query.
    • I didn’t notice this on my Dev machine because of superior hardware and multiple (8) processors. At the client, most of their processors were 2 cores, and the other core was busy doing client-ish stuff.
    • This was with EF 5.0 with .Net 4.0, before they had “cached compilation of queries” (.Net 4.5).  I tried upgrading to .Net 4.5; but still had problems.
  • After the query was executed, there was a network spike as the data returned from the server
    • Turns out, by having 9 tables returning, and the way EF work(ed|s), there was a LOT of repetition – the same Manager FullName across 5,000 rows, etc.
    • This lead to a large network payload of returned data
    • This was a killer over VPN.  (Which only affected me, the developer, but I am important!)

The Solution

We did it in two stages:

  • We kept the original concept of an IQueryable that gets modified by various layers to get down to the data that needs to be pulled back.
  • We executed it, but only pulled back a single item:   select g.GridItemId.   On some screens, this was 50 items; on other screens, 10000 items.
  • We called a stored procedure which took in a TVP (table valued parameter) of id’s, and returned multiple result sets:
    • one result set for the grid items + external id’s to other tables
    • one result set for all the Clients referenced by these grid items.
    • one result set for all the managers referenced by these grid items.
    • one result set for all the Notes referenced by these grid items.
  • We read these in, and directly constructed our grid item from these results.

And it worked wonderfully.

  • Average compilation time at the client site was down to 50 ms (most of the complexity was joining in to all the other tables to populate a grid item)
  • The second call to the sproc finished within 20-40ms or so
  • The payload across the wire was much smaller.

Here’s a screenshot of a paragraph I wrote in my excitement at getting it to run so much faster:

image

I also recorded a video of a before screen, and the after screen.  I can’t show that to you, but it was a hit.

The Result

This, and other performance enhancements over the last 6 months – according to the client last week:

Client: Yes!  Actually, 2.0 is faster than 1.0 now!  We dislike going back to 1.0.
Me:

Yess!