Software Estimation Review #1

One of my recent projects had this burndown:

At the start of the project, the “desired” hours were coming down at the speed of the “high” estimate.  If I had used the first 5 data points to predict the end of the project, it would have been near T=7, at a cost of 6N or so, and the estimate was only for 5N.   Not good.

Luckily, things started to sort themselves out as the project progressed.   However, I wanted to understand what was going on.

At work, I track every slice of time at a project and “top level task” level.   I also note down what exactly I’m doing, and usually its outcome – success, interrupt, break, etc.  This is especially useful if there is fragmentation: you can see how interrupted you were – a metric to show pain.

I exported this detail into Excel, and created an item by item grid of time taken for each line item  in the original estimate.

In the process I found several items that I worked on that didn’t have a home in the estimate, and I got “actual hours worked” for each of the line items.   It was shocking.  Here is an abridged version:

  • I saw tasks much larger than the original estimate.
  • I saw some tasks much smaller than the original estimate.
  • I missed several features.
  • I left out entire categories such as refactoring (as the project grows), documentation,  styling and usability.

Possible Interpretations

I did well.

The project was done within the time allotted.  This was a success.  Perhaps the individual line items were not, but the overall project was – I was lucky that I overestimated several items.  If I had tried to squeeze every line item, I would have been in trouble.

Could more Up Front Design yield a better estimate?

The Waterfall thought would be that I didn’t do enough up-front design.  However, I spent longer putting this estimate together than I should have – because in this job, putting together an estimate is not billable time!  If I had spent more time on it, I would have had more line items – yielding a higher estimate – yielding a greater risk that the client would not approve the budget – and I would have still gotten it done on time, perhaps in even less time.

I have done this before = takes less time

I had previously done a similar project.  As a result,  I estimated several items fairly low. Examples:

  • 5 to 8 (actual: 23)
  • 5 to 7 (actual: 12)
  • 5 to 7 (actual: 14).

This is the equivalent of a handyman stating: I’ve built a house before, so hanging drywall won’t take as long.   Incorrect!   It takes time to build software, even if you know exactly what you are doing.

 “Sure that Technology will be easy to integrate”

My coworker (on loan to me for two weeks) spent a lot of time trying to get a grid to work correctly.   Actually, the grid worked fine, the helper classes to interface it with MVC did not.     Trying to get them to work took the majority of the hours on one of the pages.

In the end, he scrapped the helper classes he had downloaded – they worked only in specific circumstances – and rolled his own – and coded the rest of the page in record time.

I’m not sure which way to go on this.  Perhaps having a bucket of “research” hours?    Timebox the items that we research, to save research budget in case it’s needed later?   It seems like on every project I’ve been on, there have been one or two things that took some time to get the approach figured out.

I left out parts of the project from the estimate

There were several items that I had not estimated up front that are a part of any project.  See below for a running checklist.

I skipped parts of the estimate

Some functionality turned out to be trivial, or unnecessary. For example, we went straight from a grid of Endor to editing Endor without a stop at a detail page for Endor.   Or, one of the setup pages – thought to be three separate pages – collapsed into a single grid.

These “mistakes” are what saved the project.

Overall Analysis

I think my mistake was trying to specify everything in the estimate at a line by line level.  As DPR207 at TechEd 2010 pointed out on slide 16, there are different levels of uncertainty as a project progresses.  I was estimating at too granular a level too early – guaranteeing I would miss items, as well as specify some items that were unnecessary.

Doing it Differently

In this particular project, how else could I have estimated it?

Agile, Story Points, etc.

Using Story Points and approaching a project in an agile way (Scrum-like) are my favorite ways of handling project estimation and management.  However, in my role as a consultant, with clients pre-approving blocks of hours (based on a high-end estimate), I don’t get to do that.   So I’ll pass on that solution.   I HAVE to have a number of hours up front, or I need to find a very understanding client.

Estimate at a higher level

Rather than estimating at a very detailed level, I could estimate in bigger chunks like this:

Level 1: Systems – e.g. “A Website”, “A Batch App” – estimated at 50, 100, 150, 200, 300, or 500 hours.

Level 2: Functions / Features / Pages – e.g. “Grid of Foo”, “Editing Foo”, “Selecting Bars for a Foo”, “Reading in an xxx file”, “Validating xxx information” – estimated at 10-20, 20-30, or 30-50 hours.

Fibonacci Numbers

I will definitely continue using Fibonacci numbers for estimation.  It saves me brain power – it’s either a 5 or an 8; I don’t have to argue with myself whether it’s a 7 or an 8 or a 9.

I use a slightly modified sequence: 0.5, 1, 2, 3, 5, 8, 15 … and somewhere in there, I jump into adding zeros: 5, 10, 20, 30, 50, etc.
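
As an illustration only (this little helper is mine, not part of any estimating tool I actually use), snapping a gut-feel number up to the next value in that modified sequence might look like:

// Hypothetical helper: round a raw gut-feel estimate (in hours) up to the
// next value in the modified Fibonacci-ish sequence described above.
using System;

static class EstimateRounder
{
    static readonly double[] Sequence = { 0.5, 1, 2, 3, 5, 8, 15, 20, 30, 50, 100, 150, 200, 300, 500 };

    public static double Snap(double rawHours)
    {
        foreach (var step in Sequence)
            if (step >= rawHours) return step;
        return Sequence[Sequence.Length - 1];
    }
}

// EstimateRounder.Snap(7)  -> 8
// EstimateRounder.Snap(12) -> 15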

Going Forward

Jim Benson (@ourfounder) gave a presentation at CodePaLousa: “We are Not Engineers“.   (I tried to find a link to a powerpoint, and then I remembered: it was truly a PowerPoint-free presentation. I did find similar materials here.)  The gist of the presentation was:  “Humans are horribly designed for doing estimation”.   If I estimate based on gut feel, I’m going to be wrong.   Thus, for my job, I either need to stop estimating, or start to build a toolkit.

This is the toolkit so far:

Keep track of all time and match it back up against the estimate at a line by line level.

It takes about 15-30 minutes to export from Harvest and match things up in Excel.   It’s a form of instant feedback.

Estimate in larger chunks.

Rather than counting the number of fields on a page (which is what I did previously), I need to find the average time to write a page, and then double it if it’s a big page.    Try not to estimate in anything smaller than, say, 10 hours, for a chunk – a page, a file processing feature, etc.

Keep a checklist

My checklist at the moment:

  • Initial project setup
    • Machine setup
    • Project setup – layers (DAL, etc) – into SVN
    • Stub version of project that can be deployed
  • Test data strategy (in support of integration testing)
    • Test data creation and deletion
    • Setting up automated integration tests to run automatically
  • Individual Items (Project specific)
    • Larger chunks
    • Include testing time for each item – writing unit or integration tests, and testing by hand.
  • Growing pains
    • Refactoring is inevitable (and not specific to any particular item)
    • Configuration and Deployment (as the project enters different environments)
    • Merging between branches if branches are called for (any time there is a phased release, or a separation between UAT, QA and DEV)
    • Necessary Documentation (after complicated things get done)
  • Project Management
    • Dependent on each client. 10%  of a 35 hour week = 3.5 hours of email, status updates, demo prep, and demoing – which is about right for a good client.  Other clients, the number has been closer to 20%.
  • Research bucket
    • For the unfortunate item(s) that will pop up that just don’t work
    • Also can be used to take extra care simplifying something so it stays more maintainable.
  • Usability, Look and Feel
    • Cross browser testing
    • “Making it Pretty” – fonts, colors, grids
    • “Making it more functional” – renaming things, page organization, number of clicks to do something
  • I18N and L10N
    • Dependent on client and project.

The Request

How do you estimate your projects?  How close do you get?   I truly want to know.  Shoot me a comment or write it up and add a link.

Where did my time go?

This week I’ve felt hurried – it felt like life kept getting in the way, preventing me from working the hours I should be working.

I can’t argue with a feeling, so I took an inventory, and mapped it out. With some help from my wife, and a history of text messages, I could mostly put together what was going on. (Harvest, the time keeping app for work, helps)

Inventory Result

The Blue is work.  Most of it was Working-At-the-Office (WAO), though some was Working-At-Home.. (WAH).. either in the Basement, or Outside. (Outside = it was a beautiful day)

The Yellow is hanging with my spouselet, which also happens to be very good for me – I enjoy that time immensely.   We’ve been watching Season 6 of Doctor Who.

The Orange is hanging out with people – my spouse’s family on Monday, and with a buddy for breakfast on Wednesday.  Note: Monday and Tuesday lunches should have been in Orange as well, I’m trying to make it a point to hang with the guys at work.

The Other Warm Color is “stuff I had to do”.  Some of it is household-related stuff – some of it is being of service to other people.

And lastly – there’s Purple, the stuff I do for me.  Which includes writing this blog post, and my little jog around the block, and puttering around on whatever I want to do.

Analysis

Monday was bleah.  I had a family thing to go to, didn’t get my hours in.

Tuesday, I tried to catch up.  It was fairly draining.  My wife and I recovered from it well, though.

Wednesday was shot – My son needed some assistance with homework, so there was some time lost to transit.  The good news there is, I could work from home – I multitasked – I had the dogs out, and the cat out, as I worked from my laptop.  It feels good to crittersit the critters and spend a little time with them.

What’s the problem? What’s missing? What needs to change?

I find myself lagging behind in my work hours, and spending Friday – the day when I get 4 hours to work on anything I want (including this blog) – on catching up the hours I didn’t get done earlier in the week.

What causes this lag?    It’s usually “household” things.  Sometimes it’s “people” things. Both of those are important, and they’re not going to budge.

The good news is, the days that I don’t have “life” going on, I can easily stay at work to get caught up.    Or even work from home to get caught up.  (I find that I can only work from home when I’m alone there – otherwise, my family drags me into their lives).    If I stay extra at work, I can usually put a workout in there as well.

Other Optimizations

I might try sleeping in till 7am, and not tossing and turning for an hour.

As mentioned earlier, I would like to add more exercise.   I’m not sure where – the current idea is, on days that “life” isn’t going on, work late, and add in a jog there.

Random thoughts: The YMCA is on my way to/from work, but the time needed after a workout to take a shower is a detriment; yet of my chances this week to do something on the way home… there was only Tuesday.  I did jog on Tuesday.   So… mission successful? Operating as designed?   Maybe the Late Night Jog is the best way to go – I used to jog at about 10pm or so, when I first started jogging.   After dinner, even, to help with absorbing all those calories (diabetes related).

The Big Picture

I’m not seeing a lot of wasted time in there.. hours spent in front of a TV, flipping channels.

I am operating at 90% capacity, easily – with the remaining 10% necessary for buffer, sanity.   The occasional breaks – in white, with question marks – are signs of sanity, not places to cram more stuff in.

As a very wise person once told me – as my life progresses, I am going to find more and more good stuff that doesn’t fit.

What If I Were Retired

If I were retired, how would the ingredients change?

I suspect there would be a lot more puttering, and helping, and hanging with people going on.  And probably a lot less “creating” things (my current work).

Calories In Minus Calories Out


There was a conversation at work. Simplified:

VP (Visiting Person): “What Kind of Cookies do you have here?”
Me: “I don’t know, I haven’t checked. I’m not eating Wheat right now, and that shelf is all wheat.”
VP: “How’s that going? I’ve heard about that.”
Me: “Pretty good. I’ve lost a few pounds even without exercising.”
VP: “Yep, calories in and calories out.”

My brain took off on that conversation. Hence, this blog post.

Normal

The assumption is that people simply control the calories out. “Just work out harder”.

# Powershell
"Normal Situation: "
"Calories In : {0}" -f ($caloriesIn = 2000)
"Calories Out: {0}" -f ($caloriesOut = 2000)
"Net Calories: {0}" -f ($netCalories = $caloriesIn - $caloriesOut)
"Gain/Loose {0} lbs per week" -f ($netlbsperweek = $netCalories * 7/3500)

Normal Situation: 
Calories In : 2000
Calories Out: 2000
Net Calories: 0
Gain/Lose 0 lbs per week

True, but there’s a twist that I found out, due to my type two.

Conjecture for DM2

This is only my understanding of it. It is not proven or fact.

The hidden factor for me was insulin release, and insulin resistance.

Let’s say that for the calories I was taking in, my blood sugars bumped up to the 140+ range. Let’s assume that I’m not working out actively at the moment. (When engaged in physical activity, some other form of transport happens, and sensitivity to insulin goes up.)

My body is desperately trying to shove this energy into my cells, and is pumping out insulin (all that it has). The insulin has the effect of transmuting some of these calories to stored fat – almost immediately – before I have a chance to use it.

# Powershell
"Actual Situation: (not on drugs)"
"Calories In:  {0}" -f ($caloriesIn = 2000)
"Converted to fat due to high insulin levels: {0}" -f ($insulinToFat = $caloriesIn * 0.15 )   # total guess
"Fat gained per week: {0} lbs" -f ($fatperweek = $insulinToFat * 7 / 3500)
"Body must make do with {0}" -f ($bodyAvailable = $caloriesIn - $insulinToFat)
"Feeling as though only consuming {0} calories - ie, starving" -f $bodyAvailable

Actual Situation: (not on drugs)
Calories In:  2000
Converted to fat due to high insulin levels: 300
Fat gained per week: 0.6 lbs
Body must make do with 1700
Feeling as though only consuming 1700 calories - ie, starving

My evidence for myself:

There were 2-3 months that I went off my medication, but was watching and logging what I ate.
During that time, I ate 1800-2100 calories a day. Yet, I gained 5 pounds in about 4 weeks.

My deduction:
By storing those 5 pounds => that would mean 17500 calories over 28 days = 625 calories a day.
Which meant that I lived on around 1200-1400 calories a day.
And yes, I felt starved the whole time. (not really starved, but you know, the feeling? I have never really starved, except perhaps once)

So what the heck does my medication (Metformin) do?

Everything says “it limits the production of hepatic (liver) sugar”.
What does that have to do with anything?

My understanding (only my understanding) is:
The body MUST MUST MUST not get into a hypoglycemic situation – because the brain dies. Therefore, it monitors it very seriously.

As blood sugar gets too low, it tells the liver to go make some more. It’s not all-or-nothing – it’s a gradual release type of thing. However, it’s tuned based on “relative” levels of blood sugar, not absolute levels.

Being diabetic, and having repeated elevated blood sugar levels “reset” what my body thinks normal is. So my body is churning out sugar even when I’m at a comfortable spot, like 110. It thinks 110 is low.

By jumping in and cutting that link (or reducing it, anyway), Metformin allows my “average” sugar levels to come back down to what they are now.. a fasting number of 80 or so.

And once I get down to 80… and if I watch what I eat, such that after any meal, 2 hours later, I’m back under 140 (these are numbers I’ve chosen for myself), my body leaves the second equation, and goes back to the first equation. Or maybe the 0.15 goes down and becomes a 0.05. I don’t know exactly.

What I do know is: If I stay off Metformin, my weight goes up, and my fasting blood sugar levels go up. If I stay on it, and I eat wisely, they come back down to normal levels.

In Conclusion

All of the above is my explanation to myself.
It’s probably wrong. The reality probably has something to do with aliens, monkeys, ninjas, and a turtle.
If you have a better explanation, grand unifying theory of blood sugar, please do post it and point me at it.

Snot Funny

Insert standard “Oh, I don’t post enough” post here.  Seriously, I believe I have 2 readers other than myself.  (Hi Doug!  Hi Molly!)  (I just set myself up for disappointment – maybe it’s less than two.)

I thought I would fall back to my dastardly plan of re-posting old stuff from the “geeky” tag from my livejournal, but what I found there was:  a whole bunch of started projects with nothing finished.  Not worthy of posting. A few cool pictures..

In other news, my wife thinks I might be somewhat ADD-ish.  Her diagnosis was on the basis of opened cabinets I left behind after working in the kitchen for even a small amount of time. I disagreed with her at the time. However, given the previous paragraph, that might be true.

Given all three previous paragraphs, I should refrain from starting yet another thread that I might or might not finish, so I won’t talk about what my plans for this blog are.

So all I can leave behind, then, is something like this:

a. Save Points

When I play a computer game that has Load and Save points.. the first 10-15 minutes back in reality.. I feel strangely out of touch.. imagining that life had load and save points and being utterly surprised that it does not.

b. Snot Funny

Sinus Infections have re-upped my anti-snottery artillery.

  • Throat Coat to remove the itchy ickyness so I can swallow again. Add honey as needed.
  • Because water doesn’t go down well, heat it up, add honey, then it goes down fine.
  • Neti-pot, for washing the gunk out of sinus cavities. Makes the cat look at me funny as I kneel over the bath tub. There are other tricks to this trade, but they’re kinda gross.
  • Salt water gargle. Mouthwash gargle. I have not tried Vodka gargle.
  • Stuff gunked up? Cayenne pepper + lemon tea! Heavy on cayenne.
  • Vicks Vapor Rub! I’m never too sure if it’s a de-gunker or a runny-snot-dryer. I think I use it as both? And a throat-soother.
  • DayQuil and/or NyQuil. Take with food to avoid upset stomach.

Duct Tape and Lego’s

While discussing the finer points of avoiding RSIs (repetitive strain injuries – carpal tunnel is one, but there are several others) at work, JS mentioned “if he could only tilt his trackball up by a few degrees”…

Welp, I’ve been struggling the last 2-3 weeks with all kinds of arm pain.. leading me to try all kinds of things. I liked the “tilt” idea. So I made one from Legos:

It’s not the only thing I’ve made. I tried making a Droid Charging Station, but that didn’t work so well; however, my Lego Monitor Tilt works pretty well:

And then.. Duct Tape. This is the original version of the “I need a handle to pull the air filter out of the furnace” thingy:

Life is good.

Old Fogey

In the department of how-old-a-fogey-am-I: an email I wrote at work might be a good blog post here. It shows my early geekness.  It has been altered to be more of a blog post.

A Coworker wrote:

Looks like he’s got some dev cred. Plus I think that’ll make Sunny and me the old guys no longer…

My Response:

OMG I used to lust after the contents of that book!

I wax nostalgic:

I grew up in Liberia, West Africa; my parents taught at Cuttington University College.

We had power 2-3 hours in the evening (not enough gasoline for the campus generator) – just enough to get the refrigerators cool. You took what you needed out at 4pm when power came on, and then shut it (with paper tape – no duct tape available there) so nobody would accidentally open it, and then it would get cool enough to stay cold till the next day.

The first computer I met was a TRS-80 Model I (I think; it might have been a II – image on left). A Fulbright professor’s kid (Lars F) showed me a little game called “Adventure”. Then he showed me BASIC. And I was hooked. (PEEK 14400!)  (Side note: Lars also introduced me to Dungeons and Dragons, and the Rubik’s Cube)

Later on, the campus got a “computer lab” – of TI-99/4As. My parents, being math professors, taught the courses, so they brought one home with them. With that book. Of course, we didn’t have any of those cartridges… all I could do was look through that book. (and this one also: image on right)

But yeah, I’d start planning my programs during school, on paper… power would come on, and I’d type them in furiously, getting them to work… play… and then power would go out, and it was gone. Repeat daily.

(after power went out, light 4-5 candles and read novels for the rest of the night till mom pestered me to go to bed)

When I came to the US for the first time, in 1983, I was amazed at, in order:
a) 24 hour power
b) hot water
c) vending machines
d) hamburgers
e) 300 baud modems hooking up to Iowa State University’s CS computer… which introduced me to…
f) UNIX!

*gratitude for the little things*

Test Data Creators (Integration Tests)

Everybody seems to love Unit Tests

I agree, they are wonderful. I have lots of logic that is unit tested… and it’s easy to set up (especially with tools like moq)…

But it’s not what I rely on.  I have found it to be too limited to give me the confidence I’m looking for as I write a system.  I want as much tested as I can get – including the data access layers – and how everything fits together – and that my dependency injectors are working correctly.

Another view: in my current project, I’m using nHibernate as the data provider.  The general consensus on mocking nHibernate is: don’t do it. Instead, use an in-memory database (didn’t work – had to maintain different mapping files), or write an IRepository around it.

When I do that, what I find is most of the logic that needs testing is in the subtleties around my queries (LINQ and otherwise) – the rest is plumbing data from one representation to another.  While unit testing that is valid, it does not cover the places where I find most of my failures.  Stated in GWT syntax, my tests would be “GIVEN perfect data, WHEN logic is executed, THEN the app does the right thing” – with “perfect data” being the elusive part.

I have tried providing a List<T>.AsQueryable() as a substitute data source in unit tests – and that works well, as long as my queries do not get complicated (involving NHibernate .Fetch and so on).   When the queries grew beyond my ability to mock them with .AsQueryable(), my “test” situation (LINQ against a list) started to differ significantly from the “real” situation (LINQ against a database), and I started to spend too much time getting the test just right, and no time on real code.
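
For reference, that substitution looks something like this – a minimal sketch, where IRepository, its Query<T>() method, and the Person entity are hypothetical stand-ins for whatever abstraction wraps NHibernate:

// Sketch: substituting a List<T>.AsQueryable() for the real data source.
// IRepository/Query<T>() and Person are made-up names; Moq provides Mock<T>.
var people = new List<Person>
{
    new Person { LastName = "Open",   IsActive = true  },
    new Person { LastName = "Closed", IsActive = false },
};

var repository = new Mock<IRepository>();
repository.Setup(r => r.Query<Person>()).Returns(people.AsQueryable());

// Pass repository.Object to the code under test; its LINQ now runs against the
// in-memory list -- which works until the query needs NHibernate-specific
// operators like .Fetch().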

My Solution – Test Data Creators

My solution for the past 5 years over multiple projects has been “Integration Tests”, which engage the application from some layer (controller, presenter, etc) all the way down to the database.

“Integration”,”Unit”, and “Functional” tests — there seem to be a lot of meanings out there. For example, one boss’s idea of a “Unit” test was, whatever “Unit” a developer was working on, got tested. In that case, it happened to be the “Unit” of batch-importing data from a system using 5 command line executables. Thus, for this article only, I define:

  • Unit Test – A test called via the nUnit framework (or similar) that runs code in one target class, using mocks for everything else called from that class, and does not touch a database or filesystem
  • Integration Test – A test called via the nUnit framework (or similar) that runs code in one target class, AND all of the components that it calls, including queries against a database or filesystem
  • Functional Test – Something I haven’t done yet that isn’t one of the above two
  • Turing Test – OutOfscopeException

Having built these several times for different projects, there are definite patterns that I have found that work well for me. This article is a summary of those patterns.

Pattern 1: Test Data Roots

For any set of data, there is a root record.
Sometimes, there are several.
In my current project, there is only one, and it is a “company”; in a previous project, it was a combination of “feed” and “company”.

The Pattern:

  • Decide on a naming convention – usually, “TEST_”+HostName+”_”+TestName
  • Verify that I’m connecting to a location where I can delete things with impunity — before I delete something horribly important (example: if connection.ConnectionString.Contains(“dev”))
  • If my calculated test root element exists, delete it, along with all its children.
  • Create the root and return it.
  • Use IDisposable so that it looks good in a using statement, and any sessions/transactions can get closed appropriately.

Why:

  • The HostName allows me to run integration tests on a build server at the same time as a local machine, both pointed at a shared database.
  • I delete at the start (rather than at the end) so that test data is left behind after the test is run. Then I can query it manually to see what happened. It also leaves behind excellent demo material for demoing functionality to the client, and for doing ad-hoc manual testing.
  • The TestName allows me to differentiate between tests. Once I get up to 20-30 tests, I end up with a nice mix of data in the database, which is helpful when creating new systems – there is sample data to view.

Example:

using (var root = new ClientTestRoot(connection,"MyTest")) { 
    // root is created in here, and left behind. 
    // stuff that uses root is in here.  looks good. 
}
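
To make the pattern concrete, here is a minimal sketch of what such a root class could look like – the class shape and helper names are my illustration, not the real project’s code:

// Sketch of a test data root: naming convention, safety check, delete-then-create,
// IDisposable so it reads well in a using block. Names are illustrative only.
using System;
using System.Data;

public class ClientTestRoot : IDisposable
{
    public int ClientId { get; private set; }
    private readonly IDbConnection _connection;

    public ClientTestRoot(IDbConnection connection, string testName)
    {
        _connection = connection;

        // Refuse to run against anything that doesn't look like a dev database.
        if (!connection.ConnectionString.Contains("dev"))
            throw new InvalidOperationException("Not a dev database - refusing to delete test data.");

        // Naming convention: "TEST_" + HostName + "_" + TestName
        var rootName = "TEST_" + Environment.MachineName + "_" + testName;

        DeleteRootAndChildren(rootName);   // delete up front, so data is left behind afterwards
        ClientId = CreateRoot(rootName);   // create a fresh root and remember its id
    }

    private void DeleteRootAndChildren(string rootName) { /* DELETE statements, child tables first */ }
    private int CreateRoot(string rootName) { /* INSERT the root record, return its id */ return 0; }

    public void Dispose()
    {
        // Close any open sessions/transactions; the data itself is intentionally left behind.
    }
}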

Pattern 2: Useful Contexts

Code Example:

using (var client = new ClientTestRoot(connection,"MyTest")) { 
    using (var personcontext = new PersonContext(connection, client)) { 
       // personcontext.Client
       // personcontext.Person
       // personcontext.Account
       // personcontext.PersonSettings
    }
}

I create a person context, which has several entities within it, with default versions of what I need.

I also sometimes provide a lambda along the lines of:

new PersonContext(connection, client, p => { p.LastName = "foo"; p.Married = true; })

to allow better customization of the underlying data.

I might chain these things together. For example, a Client test root gives a Person context, which gives a SimpleAccount context … or separately, a MultipleAccount context.
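
A sketch of what one of these context classes might look like – Person and Account here are stand-in entities, not the real project’s:

// Sketch of a context object: creates a default person (plus a related account)
// under a test root, with an optional lambda to customize the defaults.
using System;
using System.Data;

public class Person  { public string LastName; public bool Married; public int ClientId; }
public class Account { public string Name; public bool IsClosed; }

public class PersonContext : IDisposable
{
    public Person Person { get; private set; }
    public Account Account { get; private set; }

    public PersonContext(IDbConnection connection, ClientTestRoot client, Action<Person> customize = null)
    {
        Person = new Person { LastName = "TestPerson", ClientId = client.ClientId };
        customize?.Invoke(Person);           // caller can override LastName, Married, etc.

        Save(connection, Person);            // insert via the test data path, not the app's DAL
        Account = CreateDefaultAccount(connection, Person);
    }

    private void Save(IDbConnection connection, Person person) { /* INSERT person row */ }
    private Account CreateDefaultAccount(IDbConnection connection, Person person) { /* INSERT account row */ return new Account(); }

    public void Dispose() { }
}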

Pattern 3: Method for Creating Test Data can be Different from What Application Uses

By historical example:

  • Project 1 (2006) – App data path: DAL generated by Codesmith.  Test data strategy: OracleConnection, OracleCommand (by hand).
  • Project 2 (2007) – App data path: DAL generated by Codesmith.  Test data strategy: generic Ado.Net using metadata from a SELECT statement + naming conventions to derive INSERT + UPDATE from DataTable’s.
  • Project 3 (2008) – App data path: DAL generated by Codesmith.  Test data strategy: DAL generated by Codesmith — we had been using it for so long, we trusted it, so we used it in both places.
  • Project 4 (2010) – App data path: existing DAL + Business Objects.  Test data strategy: Entity Framework 1.
  • Project 5 (2011) – App data path: WCF + SqlConnection + SqlCommand + stored procedures.  Test data strategy: no test data created! (see pattern 7 below)
  • Project 6 (2012) – App data path: NHibernate with fancy mappings (References, HasMany, cleaned-up column names).  Test data strategy: NHibernate with simple mappings – raw column names, no references, no HasMany, etc.

The test data creator will only be used by tests — not by the application itself. It maintains its own network connection. However you do it, get it up and running as quickly as you can – grow it as needed. Refactor it later. It does NOT need to be clean – any problems will come to light as you write tests with it.

Pattern 4: Deleting Test Data is Tricky Fun

The easiest way everybody seems to agree on is: drop the database and reload. I’ve had the blessing of being able to do this exactly once; it’s not the norm for me – usually I deal with shared development databases, or complicated scenarios where I don’t even have access to the development database schema.

Thus, I have to delete data one table at a time, in order.

I have used various strategies to get this done:

  • Writing SQL DELETE statements by hand — this is where I start.
  • Putting ON DELETE CASCADE in as many places as it makes sense. For example, probably don’t want to delete all Employees when deleting a Company (how often do we delete a company! Are you sure?) but could certainly delete all User Preferences when deleting a User. Use common sense.
  • Create a structure that represents how tables are related to other tables, and use that to generate the delete statements.

This is the hardest part of creating test data. It is the first place that breaks — somebody adds a new table, and now deleting fails because foreign keys are violated. (long term view: that’s a good thing!)

I got pretty good at writing statements like:

delete from c2
where c2.id in ( 
    select c2.id from c2
    join c1 on ...
    join root on ....
    where root.id = :id )

After writing 4-5 of them, you find the pattern: the delete query for a child of C2 looks very similar to the delete query for C2, except with a little bit more added. All you need is some knowledge of where you delete first, and where you can go after that.

How Tables Relate

I no longer have access to the codebase, but as I remember, I wrote something like this:

var tables = new List<TableDef>(); 
var table1 = new TableDef("TABLE1","T1"); 
{ 
     tables.Add(table1); 
     var table2 = table1.SubTable("TABLE2","T2","T1.id=T2.parentid"); 
     { 
         tables.Add(table2); 
         // etc etc
     }
     // etc etc
}
tables.Reverse();   // so that child tables come before parent tables

I could then construct the DELETE statements using the TableDef’s above – the join strategy being the third parameter to the .SubTable() call.

Slow Deletes

I ran into a VERY slow delete once, on Oracle. The reason was, the optimizer had decided that it was faster to do a rowscan of 500,000 elements than it was to do this 7-table-deep delete. I ended up rewriting it:

select x.ROWID(), ...; foreach ... { delete  where rowid==... }
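
Roughly, that rewrite looks like this – a sketch only, in plain ADO.NET, with made-up table names in the spirit of the c1/c2 example above, and assuming an open connection and the root’s id are already in scope:

// Sketch of the "select the ROWIDs first, then delete row by row" workaround.
var rowIds = new List<string>();
using (var cmd = connection.CreateCommand())
{
    cmd.CommandText = @"
        select c2.ROWID
        from c2
        join c1   on c1.id = c2.parent_id
        join root on root.id = c1.root_id
        where root.id = :id";
    var p = cmd.CreateParameter(); p.ParameterName = "id"; p.Value = rootId;
    cmd.Parameters.Add(p);
    using (var reader = cmd.ExecuteReader())
        while (reader.Read())
            rowIds.Add(reader.GetValue(0).ToString());
}

foreach (var rowId in rowIds)
{
    using (var cmd = connection.CreateCommand())
    {
        cmd.CommandText = "delete from c2 where ROWID = :rid";
        var p = cmd.CreateParameter(); p.ParameterName = "rid"; p.Value = rowId;
        cmd.Parameters.Add(p);
        cmd.ExecuteNonQuery();   // one small delete per row, instead of one giant scan
    }
}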

Moral(e): you will run into weird deletion problems. That’s okay, it goes with the territory.

Circular Dependencies

Given:

  • Clients have People
  • Feeds have Files For Multiple Clients
  • Files Load People
  • A loaded person has a link back to the File it came from

This led to a situation where if you tried to delete the client, the FK from Feed to Client prevented it. If you tried to delete the feed, the FK from People back to File prevented it.

The solution was to NULL out one of the dependencies while deleting the root, to break the circular dependency. In this case, when deleting a Feed, I nulled the link from person to any file under the feed to be deleted. I also had to do the deletes in order: Feed first, then Client.

Example:
Here’s some real code from my current project, with table names changed to protect my client:

var exists =
	(from c in session.Query<OwnerCompany>() where c.name == companyNameToLookFor select c).
		FirstOrDefault();
if (exists != null)
{
	using (var tran = session.BeginTransaction())
	{
		// rule #1: only those things which are roots need delete cascade
		// rule #2: don't try to NH it, directly delete through session.Connection

		// ownercompany -> DELETE CASCADE -> sites
		// sites -> manual -> client
		// client -> RESTRICT -> feed
		// client -> RESTRICT -> pendingfiles
		// client -> RESTRICT -> queue
		// queue -> RESTRICT -> logicalfile
		// logicalfile -> CASCADE -> physicalfile
		// logicalfile -> CASCADE -> logicalrecord
		// logicalrecord -> CASCADE -> updaterecord

		var c = GetConnection(session);

		c.ExecuteNonQuery(@" 
			delete from queues.logicalfile 
			where queue_id in ( 
			   select Q.queue_id 
			   from queues.queue Q
			   join files.client CM ON Q.clientid = CM.clientid
			   join meta.sites LCO on CM.clientid = LCO.bldid
			   where LCO.companyid=:p0
			)
			", new NpgsqlParameter("p0", exists.id)); 

		c.ExecuteNonQuery(@" 
			delete from queues.queue 
			where clientid in ( 
				select bldid
				from meta.sites
				where companyid=:p0
			)
			", new NpgsqlParameter("p0",exists.id)); 

		c.ExecuteNonQuery(@"
			delete from files.pendingfiles 
			where of_clientnumber in (
				select bldid
				from meta.sites
				where companyid=:p0
			) ",
			new NpgsqlParameter(":p0", exists.id));
		c.ExecuteNonQuery(@"
			delete from files.feed 
			where fm_clientid in (
				select bldid
				from meta.sites
				where companyid=:p0
			) ", 
			new NpgsqlParameter(":p0",exists.id)); 
		c.ExecuteNonQuery(@"
			delete from files.client 
			where clientid in (
				select bldid
				from meta.sites
				where companyid=:p0
			) ",
			new NpgsqlParameter(":p0", exists.id)); 

		session.Delete(exists);
		tran.Commit();
	}
}

In this case, ownercompany is the root. And almost everything else (a lot more than what’s in the comments) CASCADE DELETE’s from the tables I delete above.

I did not write this all at once! This came about slowly, as I kept writing additional tests that worked against additional things. Start small!

Pattern 5: Writing Integration Tests Is Fun!

Using a library like this, writing integration tests becomes a joy. For example, a test that verifies only accounts which are open are seen:

Given("user with two accounts, one open and one closed"); 
{
   var user = new UserContext(testClientRoot); 
   var account1 = new AccountContext(user, a => { a.IsClosed = true;  a.Name = "Account1"; }); 
   var account2 = new AccountContext(user, a => { a.IsClosed = false; a.Name = "Account2"; }); 
}
When("We visit the page"); 
{ 
    var model = controller.Index(_dataService); 
}
Then("Only the active account is seen"); 
{
    Assert.AreEqual(1,model.Accounts.Count); 
    ... (etc)
    Detail("account found: {0}", model.Accounts[0]); 
}

The GWT stuff above is for a different post; it’s an experiment around a way of generating documentation as to what should be happening.

When I run this test, the controller is running against a real data service.. which could go as far as calling stored procedures or a service or whatever.
When this test passes, the green is a VERY STRONG green. There was a lot that had to go right for the test to succeed.

Pattern 6: Integration Tests Take Time To Iterate

Running unit tests is fast – I can easily run 300-500 in a few seconds, so developers run ALL tests fairly often. Integration tests, not so much.

Solution: Use a CI server, like TeamCity, and run two builds:

  • Continuous Integration Build – does the compile, and then runs unit tests on **/bin/*UnitTest.dll
  • Integration Test Build – if the previous build is successful, then triggers – compiles – and runs tests on **/bin/*Test.dll

I.e., the Integration Test build runs a superset of tests – integration tests AND unit tests both.
This also relies on a naming convention for test dll’s – *UnitTest.dll being more restrictive than *Test.dll.
There’s another approach I have used, where integration tests are marked with a category and Explicit() – so that local runs don’t run them, but the integration server includes them by category name. However, over time, I have migrated to keeping them in separate assemblies – so that the unit tests project does not have any references to any database libraries, keeping it “pure”.
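
In NUnit, that alternative approach looks roughly like this:

// Marked Explicit so a plain local run skips it; the CI server includes
// the "Integration" category explicitly.
[Test]
[Explicit]
[Category("Integration")]
public void CompletelyProcessExampleFile1()
{
    // ... drives the real controller/service against the real database ...
}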

When working on code, I usually run one integration test at a time, taking 3-4 seconds to run. When I’m done with that code, I’ll run all tests around that component.. maybe 30 seconds? Then, I check it in, and 4-5 minutes later, I know everything is green or not, thanks to the CI server. (AND, it worked on at least two computers – mine, and the CI server).

Pattern 7: Cannot Create; Search Instead

This was my previous project. Their databases had a lot of replication going on – no way to run that locally – and user and client creation was locked down. There was no “test root creation”, it got too complicated, and I didn’t have the privileges to do so even if I wanted to tackle the complexity.

No fear! I could still do integration testing – like this:

// Find myself some test stuff
var xxx = from .... where ... .FirstOrDefault(); 
if (xxx == null) Assert.Ignore("Cannot run -- need blah blah blah in DB"); 
// proceed with test
// undo what you did, possibly with fancy transactions
// or if its a read-only operation, that's even better.

The Assert.Ignore() paints the test yellow – with a little phrase, stating what needs to happen, before the test can become active.

I could also do a test like this:

[Test] 
public void EveryKindOfDoritoIsHandled() { 
    var everyKindOfDorito = // query to get every combination
    foreach (var kindOfDorito in everyKindOfDorito) {
        var exampleDorito = ...... .FirstOrDefault(); 
        // verify that complicated code for this specific Dorito works
    }
}

Doritos being a replacement word for a business component that they had many different varieties of, with new ones being added all the time. As the other teams created new Doritos, if we didn’t have them covered (think select…case… default: throw NotSupportedException()) our test would break, and we would know we had to add some code to our side of the fence. (To complete the picture: our code had to do with drawing pretty pictures of the “Dorito”. And yes, I was hungry when I wrote this paragraph the first time.)

Interestingly, when we changed database environments (they routinely wiped out Dev integration after a release), all tests would go to Yellow/Ignore, then slowly start coming back as the variety of data got added to the system, as QA ran through its regression test suite.

Pattern 8: My Test Has been Green Forever.. Why did it Break Now?

Unit tests only break when code changes. Not so with Integration tests. They break when:

  • The database is down
  • Somebody updates the schema but not the tests
  • Somebody modifies a stored procedure
  • No apparent reason at all (hint: concurrency)
  • Intermittent bug in the database server (hint: open support case)
  • Somebody deleted an index (and the test hangs)

These are good things. Given something like TeamCity, which can be scheduled to run whenever code is checked in and also every morning at 7am, I get a history of “when did it change” — because at some point it was working, then it wasn’t.

If I enable the integration tests to dump what they are doing to console, I can go back through TeamCity’s build logs and see what happened when it was last green, and what it looked like when it failed, and deduce what the change was.

The fun part is, if all the integration tests are passing, the system is probably clear to demo. This reduces my stress significantly, come demo day.

Pattern 9: Testing File Systems

As I do a lot of batch processing work, I create temporary file systems as well. I use %TEMP% + “TEST” + testname, and delete it thoroughly before recreating it, just like with the databases.
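
A sketch of that, with illustrative names (the real helper is project-specific):

// Sketch of a file system "test root": a per-test temp folder, wiped up front
// and recreated, mirroring the database test root pattern.
using System;
using System.IO;

public static class FileSystemTestRoot
{
    public static string Create(string testName)
    {
        var path = Path.Combine(Path.GetTempPath(), "TEST_" + testName);

        if (Directory.Exists(path))
            Directory.Delete(path, recursive: true);   // wipe whatever a previous run left behind

        Directory.CreateDirectory(path);
        return path;   // left behind after the test, handy for manual inspection
    }
}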

In Conclusion

Perhaps I should rename this to “My Conclusion”. What I have found:

I love writing unit tests where it makes sense – a component, which has complicated circuitry, which can use a test around that circuitry.
I love even more writing integration tests over the entire system – one simple test like “CompletelyProcessExampleFile1” tells me at a glance that everything that needs to be in place for the REAL WORLD Example File 1 to be processed, is working.
It takes time.
It’s definitely worth it (to me).
It’s infinitely more worth it if you do a second project against the same database.

May this information be useful to you.

Diabetes Type II

I am a diabetic, type II.   I talk more about that on my livejournal.

I was reading a book, Wheat Belly by William Davis, MD.  It made a lot of sense, and fit directly into the knowledge that I already had — it just gave me a new term, “AGEs”.   This confluence inspired me to put together a visio^H^H^H^H^H creately diagram of the concepts that I knew of so far, about my diabetes.

Here it is (click for full size):

This will probably get updated and reposted over time.  If you have any questions, ask, and I’ll tell you what I understand (but remember: I am NOT a doctor.  Just a geek.  With Diabetes Mellitus Type II.)

Duplicating sections of a PostgreSQL database using Powershell

The Problem

  • The customer has a large PostgreSQL database; it is too large to transfer over VPN.
  • I need to develop against a local copy of the database, where I can make schema modifications at will.

My Solution

  • Pull the schema
  • Pull the sequence information separately (it did not come over with the schema)
  • Pull full dumps for small tables (in order)
  • Pull subsets for large tables (in order)
  • Load everything locally
  • Do this in a script

Here is the code for the solution, with some commentary as to why certain things are the way that they are:

GetData.ps1

$PGDUMP = get-command pg_dump.exe 
$PSQL = get-command psql.exe

get-command verifies that it can find the executable in your current path, or dies if it cannot.
I try to do this for every executable I invoke in a powershell script.

$Env:PGCLIENTENCODING="SQL_ASCII"
$H="111.22.33.44"
$U="sgulati"
$P="5432"
$DB="deathstardb"

PGCLIENTENCODING was necessary because some of the rows in their database had UTF-8-like characters that confused the loader. I arrived at it by trial and error.

. .\tableconfig.ps1

Because I use the same configuration for getting data as for loading data, I pushed that into its own file.

tableconfig.ps1

$FULLTABLES = @( 
   "ds_employees.employees", 
   "ds_contacts.contact_types",
   "ds_contacts.companies",
   "ds_contacts.systems", 
   "ds_inbound.clients",
   "ds_inbound.feeds",
   "ds_inbound.pendingfiles"
); 
$PARTIALTABLES = @( 
   @(   "ds_inbound.processedfiles", 
        "select * from inbound.processedfiles where clientid='555' "
   ), 
   @(   "ds_inbound.missingfiles",
        "select * from inbound.missingfiles where clientid='555' "
    )
);

$FULLTABLES are tables I’m going to grab all data for.
$PARTIALTABLES are tables which I cannot grab all data for (they are too large), so I’m just going to grab the subset that I need

# PG_DUMP
# http://www.postgresql.org/docs/8.1/static/app-pgdump.html
# -s = schema only
# -a = data only
# -F = format: p = plain, c = custom
# -O = --no-owner
# -f = output file
# -C / --create = include commands to create the database
# -d --inserts
# -X --disable-triggers
# -E = encoding = SQL_ASCII

When there are confusing command line options called from a script, I put a comment in a script explaining
what many of the command line options are, along with a link to online documentation.
This helps with future maintenance of the script.

$exportfile = "${DB}.schema.sql"
if (! (test-path $exportfile)) { 
   "Schema: $exportfile"
   & $PGDUMP -h $H -p $P -U $U --create -F p -O -s -f $exportfile ${DB}
} else { 
   "skip schema: $exportfile"
}

I use a convention that if something has been pulled, do not pull it again.
This enables me to selectively refresh pieces by deleting the local cache of those files.

Note that the pg_dump command creates a schema file, but does NOT pull current sequence values.

$exportfile = "${DB}.sequence.sql"
if (! (test-path $exportfile)) { 
    $sql = @"
select N.nspname || '.' || C.relname as sequence_name
from pg_class C
join pg_namespace N on C.relnamespace=N.oid
where relkind='S'
and N.nspname like 'ds_%'
"@
    $listOfSequences = ($sql | & $PSQL -h $H -p $P -U $U -d $DB -t)
    $sql = @()
    foreach ($sequence in $listofsequences) { 
       $trim = $sequence.trim(); 
       if ($trim) { 
           "Interrogating $sequence"
           $lastval = ( "select last_value from $trim" | & $PSQL -h $H -p $P -U $U -d $DB -t ) 
           $sql += "select setval('${trim}', $lastval);" 
       }
    }
    $sql | set-content $exportfile
} else { 
    "skip sequence: $exportfile"
}

This gets complicated:

  • I am running a query to get every sequence in the system.. then for each of those sequences, I’m getting the last value.
  • I am doing this by executing PSQL and capturing its output as text; I could have done it with Npgsql called directly from powershell, but I didn’t go down that route at the time this was written.
  • I am saving the information in the form of a SQL statement that sets the value correctly. This eliminates the hassle of understanding the data format.
  • I am relying on the customer’s convention of prefixing their schema names with “ds_” to filter out the system sequences. You may need a different approach.

Update: My customer read through this post, and pointed out something I had missed: There’s a view called

pg_statio_user_sequences

which provides a list of sequences. Still need to loop to get the current values… nevertheless, nice to know!

foreach ($fulltable in $FULLTABLES) { 
  $exportfile = "${DB}.${fulltable}.data.sql";
  if (! (test-path $exportfile)) { 
     "Full: $exportfile"
     & $PGDUMP -h $H -p $P -U $U --inserts --disable-triggers -F p -E SQL_ASCII -O -a -t $fulltable -f $exportfile ${DB}

	 # we need to patch the set searchpath in certain situations
	 if ($exportfile -eq "deathstardb.ds_inbound.feeds.data.sql") { 
		 $content = get-content $exportfile
		 for($i=0; $i -lt $content.length; $i++) { 
			 if ($content[$i] -eq "SET search_path = ds_inbound, pg_catalog;") { 
				$content[$i]="SET search_path = ds_inbound, ds_contacts, pg_catalog;"; 
			 }
		 }
		 $content | set-content $exportfile
	 }

  } else { 
     "Skip full: $exportfile"
  }
}

This executes PG_DUMP on the tables where we want full data, and dumps them into “rerunnable sql” files.
However, some of the triggers (that are pulled with the schema) were badly written; they made assumptions on the runtime searchpath (a postgres thing) and thus failed.
I fixed that by adding some search and replace code to convert bad sql into good sql for the specific instances that were dying.

foreach ($partialtabletuple in $PARTIALTABLES) { 
  $partialtable = $partialtabletuple[0];
  $query = $partialtabletuple[1]; 
  $exportfile = "${DB}.${partialtable}.partial.sql"; 
  if (! (test-path $exportfile)) { 
      "Partial: $exportfile"
	  & $PSQL -h $H -p $P -U $U -c "copy ( $query ) to STDOUT " ${DB} > $exportfile
  } else { 
	 "skip partial: $exportfile"
  }
}

This runs PSQL in “copy (query) to STDOUT” mode to capture the data from a query to a file. The result is a tab-separated file.

LoadData.ps1

Things get much simpler here:

$PSQL = get-command psql.exe
$Env:PGCLIENTENCODING="SQL_ASCII"
$H="localhost"
$U="postgres"
$P="5432"
$DB="deathstardb"

. .\tableconfig.ps1

# PSQL
# -c = run single command and exit

$exportfile = "${DB}.schema.sql"
& $PSQL -h $H -p $P -U $U -c "drop database if exists ${DB};"
& $PSQL -h $H -p $P -U $U -f "${DB}.schema.sql"
& $PSQL -h $H -p $P -U $U -d ${DB} -f "${DB}.sequence.sql"

I’m going with the model that I’m doing a full wipe – I don’t trust anything locally, I am far too creative a developer for that — hence I drop the database and start fresh.
I create the schema from scratch (there are a few errors, but that hasn’t bitten me yet),
and then I set all the sequence values.

foreach ($fulltable in $FULLTABLES) { 
  $exportfile = "${DB}.${fulltable}.data.sql"
  & $PSQL -h $H -p $P -U $U -d ${DB} -f $exportfile
}

Important: The data is loaded IN ORDER (as defined in $FULLTABLES), so as to satisfy FK dependencies.
To figure out dependencies, I used pgadmin‘s “dependencies” tab on an object, and drew it out on paper.
It seemed daunting at first, but upon persevering, it was only 6-7 tables deep. (A job I had in 2006 had 30+ tables total, maybe 7 deep, for comparison.)

foreach ($partialtabletuple in $PARTIALTABLES) { 
  $partialtable = $partialtabletuple[0];
  $query = $partialtabletuple[1]; 
  $exportfile = "${DB}.${partialtable}.partial.sql"; 
  get-content $exportfile | & $PSQL -h $H -p $P -U $U -d ${DB} -c "copy $partialtable FROM STDIN "
}

Source Control

I check everything into source control (subversion for me):

GetData.ps1
LoadData.ps1
Data\tableconfig.ps1
Data\deathstardb.schema.sql
Data\deathstardb.sequence.sql
Data\deathstardb.ds_employees.employees.data.sql
Data\deathstardb.ds_contacts.contact_types.data.sql
Data\deathstardb.ds_inbound.processedfiles.partial.sql
(etc)

Important bits here:

  • My client did not have a copy of their schema in source control. Now they do.
  • The naming convention makes it easy to know what each file is.
  • I’m keeping the data in a separate folder from the scripts that make it happen.

Additional Scripting

There are some additional scripts that I wrote, which I am not delving into here:

  • the script that, when applied to a copy of the production database, creates what I am developing with.
    • Luckily, what I’m doing is all new stuff, so I can rerun this as much as I want – it drops a whole schema and recreates it with impunity
  • the script to apply the above (dev) changes to my local database
  • the script to apply the above (dev) changes to my development integration database

Whenever I’m working with a database, I go one of two routes:

  • I use the above “make a copy of prod” approach as my “start over”, and only have a script of forward-changes
  • I make my script do an “if exists” for everything before it adds anything, so it is rerunnable.

With either approach its very important that when a production rollout occurs, I start a new changes script, and grab a new copy of the schema.

There is a newer third route – which is to use some kind of software that states with authority, “this is what it should be”, and allows a comparison and update to be made against an existing data source. Visual Studio Database Solutions are one such example, ERStudio is another. Hopefully, it does its job right! Alas, this client does not have that luxury.

In conclusion

Getting my development environment repeatable is a key to reducing stress. I believe The Joel Test calls it #2: “Can you make a build in one step?”.

I used a ton of tricks to get it to work.. it felt like I was never going to get there.. but I did. If you do something 3-4 times, you might want to automate it.

May your journey be similarly successful.

Backing up and Restoring

I recently helped my wife set up her new work computer. I could not do everything; the IT guy had to come in and add it to the domain, and she installed various essentials like Minesweeper (j/k, I think it was Photoshop).
Being a good geek, I intend to have a good image of that computer now that it’s set up.

So, I practiced on my laptop tonight.

Step 1: Back up the machine.
Hook up external Hard Drive
Boot off Hiren’s Boot Disk
Basically following http://sir-sherwin.blogspot.com/2011/04/disk-imaging-using-acronis-true-image.html
(except that I used Seagate Disk Wizard Something Something with Acronis support)
2.5 hours later, I have several .tib files (partitioned into 4.7g chunks)
For reference, the laptop had 55G of used HD space.

Step 2: Play with Backup
Attached external Hard Drive to my big computer
Downloaded http://www.vmware.com/products/converter/
Started to convert the .tib file into a vmware image
There were a lot of options… I ended up hydrating to an 80G virtual drive, and got to choose the partitioning scheme.

1.5 hours later, I have a vmware image i can run.

Step 3: See Laptop living in VM on big computer
Left = Original; Right = VM

There’s a few problems with drivers.. to do it perfectly, I would sysprep the machine…
It definitely validates the backup, though.

Just ’cause I could.
Yep, life is good.