SqlServer on a RAM Drive: Fast or Not?

TL;DR: Don’t bother; it’s not.

I’m at it again: writing what I call integration tests, which are effectively database-friendly-data-setup-and-teardown tests. As you can imagine, it’s definitely much slower than unit testing; however, I love doing it, and there are a lot of sprocs and other stuff that it’s really nice to get some tests around (most of the bugs that led to this investment in time were in the sprocs).

Since I had the RAM available, I decided to try having SqlServer (Developer Edition) run against databases stored in RAM. How does that compare?

The database is about 4 GB in size.

SqlServer Developer, 10 GB RAM drive, in a VM; tests run by R#

[screenshot: R# test runner timings]

SqlServer Developer, using MDF files against C: in a VM; VM is on SSD; R#

[screenshot: R# test runner timings]

SqlServer Enterprise on VM in Azure (network lag); R#

[screenshot: R# test runner timings]

One test failure was because this server’s copy of the invoice database was not complete, and the test was set to be read-only against this server (local SqlExpress = much happier about dropping and re-inserting records).

I could not ping all the way through to the database server, but Client Statistics would suggest a round trip of roughly 150 ms.

[screenshot: SSMS Client Statistics]

SqlServer Enterprise at client location via VPN; R#

[screenshot: R# test runner timings]

This is definitely faster than our Azure-hosted SQL. Looking at ping, it’s about a 50 ms round trip:

Pinging taacasql01.triaa.local [10.120.0.10] with 32 bytes of data:
Reply from 10.120.0.10: bytes=32 time=50ms TTL=127
Reply from 10.120.0.10: bytes=32 time=44ms TTL=127
Reply from 10.120.0.10: bytes=32 time=45ms TTL=127
Reply from 10.120.0.10: bytes=32 time=45ms TTL=127

SqlServer Enterprise, local network (client site); TeamCity

[screenshot: TeamCity test timings]

This is almost on par with local SqlServer. However, I don’t know whether those machines are faster.

Summary

RAM-disk SqlServer didn’t help. Or maybe it’s that, even on a regular hard drive, SqlServer was able to load everything into RAM and got quite fast.

Local SqlServer vs LAN SqlServer were close enough that I’d use one as a substitute for “how would it perform” on the other. It helps that the customer is (probably) running both VMs on the same hypervisor, so network communication between the two machines is superfast.

WAN SqlServer was definitely in the dumps; however, that’s good for imposing artificial limits on ourselves, to make sure we’re not doing too many round trips, etc. Nevertheless, our main cloud SQL Server being slower than a VPN into our client doesn’t seem right. Or our client is that awesome. It could be the latter.

Not shown above, but if you drill into some of the tests, you can see the cost of setting up an Entity Framework context the first time. It seemed to take about 8 seconds against my local server; once set up, subsequent tests were less than a second each. However, the setup would happen again for every other test fixture – apparently whatever it caches gets dropped between fixtures. Possible optimization: hang onto it in a static? *food for thought*

Methodology Notes:

I ran each full suite twice, sometimes three times, until I got what seemed to be a stable number.

I used SoftPerfect for the RAM disk. It’s set to sync the RAM disk to disk every hour or so. After seeing that it didn’t really improve things, I deleted it.

[screenshot: SoftPerfect RAM disk settings]

I drank 3 glasses of Tea, 1 Spicy Hot V8, and ate 3 pieces of chocolate, and 1 cream cheese snack, while running all these tests. 

Candy Crush Helper

I was listening to Tim Ferriss interview Jane McGonigal on the neuro-programming that occurs around playing games. One of the games mentioned was Candy Crush Saga. Curious, I decided to start playing it – and now I find it relaxing, a good place to put my mind when my mind is obsessing about other things.

But of course, being me, I started to wonder, “can I program a solution for this?”    Well, of course not – the boards are actually random – but I could write a helper to detect available moves, and possibly future moves.

Turns out a guy already wrote one. See his video explaining how it works here. However, I couldn’t get it to run, and it required running Candy Crush on my laptop. And it didn’t look ahead; it only found the currently available moves. Still, an impressive piece of work.

So I started coding one for fun. I started with this screen:

[photo: Candy Crush board, 2015-09-12]

Question: how many possible moves do you see? (A move, for the uninitiated, is swapping two adjacent candies so they form at least 3 in a row or 3 in a column.) I counted 11 on my initial pass.

I wrote some code to load this into a class, and wrote some methods to load candies. It was a very zen-kata-exercise thing to figure out how to save the board – turns out I didn’t know 2D arrays in C# very well at all (and I still don’t; I chose a different method).
I wrote some “possible candy swap” code, wrote some stuff to find solutions, and wrapped that in a test.
It says there are 14 moves. What? I don’t have a clean way of outputting the moves yet, so I will have to highlight them by hand (the test ran in 89 ms).

Here’s what it found: 

[screenshot: board with the detected “moves” highlighted]

Yeah. Bug in my code. I thought I was doing a copy from one board to another, but it was a shallow copy, so all the “possible” moves were being remembered one after the other, rather than one at a time. PANDA-MONIUM!

My attempted fix of the code didn’t work. And it’s getting late.

I believe the test needs to Assert.AreEqual(13, allPossibleMoves.Count):

[screenshot: the test and its assertion]

Or am I missing something?   I will resume this.. some day. 

In the mean time, it raises interesting things like:

  • How to represent a very complicated system in a way that makes sense while developing?
  • How to load test data into complicated systems?

Happy crushes.  I’m at level 35, FWIW.

Oh, and I’m using the NUnit test adapter to run the tests with Microsoft’s test-running tools.

Detailed Testing of Very Large Forms

I was given the task of “testing everything” regarding an integration for my client.  

[screenshot]

This is a screenshot of a document of screenshots; appropriately zoomed out so that no detail is visible.

  • The source system is in yellow; those are portrait screenshots of a multi-page web-form.  
  • The target system is in gray; it’s several screens of a Windows app.
  • It used to be that gray was loaded from PDFs, and yellow generated PDFs. Human error, lots of time, etc.
  • Now there is an intermediate system, let’s call it blue, that is used to load the gray system. It has a code-generated API document, 467 pages long, that mostly tells you that field X is stored at location Y in the database that gray displays fields from.

You might think that this would be straightforward, but it’s not:

  • Some fields in yellow are old, and do not belong in gray
  • Some fields in gray are manual entry and cannot be loaded via blue.
  • Some fields in gray will in the future be loaded by blue, but not yet
  • Some fields in yellow actually go to another system, purple.
  • Furthermore, yellow is customizable.  Not everything is displayed for everybody; some of the fields had never been used before to my knowledge, and might just be old stale stuff.

No problem!  Just have the business persons spec out which fields go where!

  • There’s a business person who knows yellow.
  • There’s another person who knows blue.
  • And another person who knows how blue loads gray, and knows more about gray.
  • Everybody knows about the existence of purple. But, I didn’t find a person who knew purple.  And blue doesn’t talk to purple, and we’re only talking to blue.
  • These people are hard to pin down and have like 15 minutes to spare, tops.
  • Even if I could get them all together, how would we document it?
  • The solution would be to find the person who wanted yellow to go to gray and who knew the fields on both sides. Unfortunately, there was no such person available. This is all speculative work that someday, in the future, would be good to have.

So, sensing somebody will have to feel the pain… I volunteered (billable) to Brute Force Test it.   Where do I start?

  • I could get an alphabetical list of all the fields in yellow
  • I mostly knew how yellow was put together – approximately which page had which fields.
  • I knew almost nothing about gray.

My Solution

  • Put on my QA Hat.
  • Turn on everything possible in yellow
  • Create a full application (every possible field filled out), send it through blue to gray.
    • Take screenshots of every screen so there is no doubt what I put in or what I got.
    • Generate an Excel with the list of all fields from yellow, and the values I filled them out with.
  • Because I don’t know gray, I’ll iterate through it one field at a time.  
    • The other systems I know well enough to find things much more quickly, so there’s less seek time.
    • One page, one field at a time, find what I think is the match in yellow
      • I did this against the excel spreadsheet that I had generated from yellow.
    • Using pen and paper, cross stuff off from my screenshots of gray.
    • Using Excel, update the field-to-field spreadsheet with success or failure
  • Use Excel filters to group the fields by which person needs to be asked about them.
  • Ask the questions, get answers, in the 15-minute window during which I could get everybody together.
  • Address individual fields with detected problems one at a time, with screenshots of yellow, blue, and gray as needed.

This has been my life for the last week and a half. I’m almost through my second pass of fixing fields, thanks to feedback from the person who knows gray. And we’re a lot smarter about what in yellow doesn’t map to gray.

Mr. Pink was not harmed during this process.  He can definitely say that he wasn’t because he knows when he was and when he wasn’t.  But he cannot definitely say that about anybody else, because he doesn’t definitely know.

Day at the Office: A Tale of Testing

TL;DR: It’s really awesome when you understand the customer’s business language and can do thorough testing for them and document it.

I’ve been working on a bug for the last few days, which involves some math, and positive vs. negative signs, and such stuff. Basically:

  • Customer sent me a screenshot where the math was obviously not working.
  • We figured out what was going on (boundary case not handled), and handled it.  

Normally, we’d be done at that point. Except this customer does not have a dedicated QA department; we are the first-line QA department. So.. we decided to be thorough. Hence, this Test Document (screenshot intentionally small so that details are illegible):

We posted the testing results of 7 types of things + 5 states of things, and highlighted any suspicious behavior (in yellow) for confirmation from the client —

And we have a brilliant customer, who gave us prompt feedback: (in blue)

Which we then fixed, in purple.

Time Creep

This 2-3 hour bug has now become a 2-3 day bug. I feel a bit guilty about that, yet if these situations had been discovered in the future, that would have been more angst, spread amongst more people; I’m glad we could figure this out now rather than later.

Also, by doing this testing and documenting it, I’m generating a snapshot-in-time of how the app is supposed to work. 

Unit Tests?

The only thing that would make this even more awesome would be to do these as unit tests. But that would have been 3-4 days instead, and we’re trying to wrap this project up. Besides, the customer doesn’t speak “unit tests”, but the customer definitely speaks “screenshots of the app”, so we’re choosing to converse in the latter language.

But how did it get Borged up to begin with?

I had shared the content of this post with a friend, and he asked the obvious question – why wasn’t this caught sooner?   The app has been used in production for over a year now. 

Reason Part 1:  this code (recalculating estimated cash balance in an account) was recently rewritten for performance optimization.  The old code (before my time on the project) was huge, complex, not documented, and slow – it was actually used for generating charts forecasting cash balance out into the future, and was designed to handle one portfolio at a time; the new code could do things in batches of portfolios, maximizing lookups of shared stocks and options, etc.    When we did the optimized calculations, we tested everything that we (and the client) knew of at the time.   No regrets.

Reason Part 2:  My client is also small enough that they don’t have a dedicated business guy who documents every rule around every calculation.   There’s a wealth of “common shared knowledge” amongst several people.  While we, as developers, try to understand that as much as possible, there’s a reason they do what they do very well – that stuff is complex.   Instead, our goal is to get 80%-90% close, “usable”, and then if there’s gaps remaining, those will get revealed, and patched later.  Reference concept Minimum Viable Product.   (Only doable with in-house applications; externally facing applications in a marketplace need more attention on polish and completeness.)

Selenium

I started using Selenium with my current client .. on the side, mostly because logging into the website involved: “wait click text text click click text click wait click wait click”. Additionally, any time I compile, my logged-in session is wiped, so I get to repeat the dance.

Last night at my son’s soccer game, I was chatting with another technical parent.. he was starting to use Selenium also!  In his case, Selenium IDE and converting that into code for tests.   Having played a bit with that side of things, and ending up with Selenium WebDriver + some ideas of how to do it nicely, I decided to compose this post to share my experiences so far. 

  • Code on GitHub (learning GitHub for Windows);
  • VS2012, Nuget package restore enabled;
  • Total time taken to write the code (from scratch) = less than the time taken to write this post talking about the code.

Targeting NerdDinner

My client would have a cow (and a chicken, and maybe several other barnyard animals) if I put any of their code outside their network. And I’d get fired. So I recreated the pattern I am using against www.nerddinner.com. A certain dude whom I respect created it as a demo website. It had about the right mix of complexity – it was not just searching on Google, but it wasn’t navigating Gmail either. Thank you, Scott.

First Attempt at Code

See the GitHub branch here; excerpt from here:

        [Test]
        public void MainPage_CanSearchForDinners_ButNeverFindsAnyInKY()
        {
            using (IWebDriver driver = new FirefoxDriver())
            {
                driver.Navigate().GoToUrl("http://www.nerddinner.com");
                var input = driver.FindElement(By.Id("Location"));
                input.SendKeys("40056");
                var search = driver.FindElement(By.Id("search"));
                search.Click();
                var results = driver.FindElements(By.ClassName("dinnerItem"));
                // at this point, i don't know what to do, as there's never any search results.   
                Assert.AreEqual(0, results.Count,
                                "No dinners should be found.. omg, if this works, then its worth it to change the test");
            }
        }
  • This code is using NUnit (out of scope for this post)
  • This is how you navigate
  • This is how you find things
  • This is how you send text
  • This is how you click things
  • Every time you run, you’re starting off with no cookies, so you’ll have to log in every time.

Whee. Good start. Okay, now let’s get serious[er].

Second Pass At Code

Per the recommendations of Jim Holmes (@aJimHolmes, http://frazzleddad.blogspot.com/), whom I met in person at http://www.codepalousa.com/ and for whom I have a great deal of respect, I know what smells funny:

  • The first pass is very brittle. If somebody changes an ID or a classname, you have a LOT of tests to go change. Solution: the Page Pattern.
  • What does one do when one does not have a second bullet, but there seems like there should be one?

What I accomplish with all this Jazz:

[screenshot: passing tests in the test runner]

The overall branch is here; the rewritten test here:

        [Test]
        public void MainPage_CanSearchForDinners_ButNeverFindsAnyInKY()
        {
            MainPage.LocationToSearch.SendKeys("40056");
            MainPage.SearchButton.Click();
            var results = MainPage.PopularDinnerSearchResults;

            Assert.AreEqual(0, results.Count,
                            "No dinners should be found.. omg, if this works, then its worth it to change the test");

        }

Much simpler.  The Page Helper Class is here, and partially looks like:

    public class MainPage : PageBase
    {
        public MainPage(IWebDriver driver)
            : base(driver)
        {
            Wait.Until(d => d.Title == "Nerd Dinner");
        }

        public static MainPage NavigateDirectly(IWebDriver driver)
        {
            driver.Navigate().GoToUrl("http://www.nerddinner.com");
            return new MainPage(driver);
        }

        public IWebElement LocationToSearch
        {
            get { return Driver.FindElement(By.Id("Location")); }
        }

        public IWebElement SearchButton
        {
            get { return Driver.FindElement(By.Id("search")); }
        }

  • The constructor takes and stores a reference to the Driver. The calling program is responsible for creating and disposing the driver (as it’s an expensive resource).
  • It uses a PageBase class which creates an additional WebDriverWait (the Wait variable); a sketch of it follows this list.
  • The constructor waits until we know we’re on the right page.  This allows us to click something somewhere else, and new up this object, and then wait till the page actually loads. 
  • Because this is the root of the website, I include a NavigateDirectly() routine which says “I don’t care where you were, now go to this page”.  I only do this on overview or login pages, the kind not derived from a click.
  • It exposes IWebElements to the callers (tests) so they don’t need to know where on the page various things are located.
  • In WebForms, I have to implement extension methods which search IWebDriver and IWebElement for IDs ending in text, because ID=”btnSearch” ends up being “ctl00.BoogerBooger_04.HumDeDum_03.YabbaDabba.WickedBackbtnSearch” (see the sketch below).

Another excerpt (here):

        public List<IWebElement> UpcomingDinners
        {
            get
            {
                var upcomingDinnersUl = (from e in Driver.FindElements(By.ClassName("upcomingdinners"))
                                         where e.TagName == "ul"
                                         select e).FirstOrDefault();
                if (upcomingDinnersUl == null) throw new NotFoundException("could not find ul.upcomingdinners");
                return upcomingDinnersUl.FindElements(By.TagName("a")).ToList();
            }
        }

  • When doing complicated finds, I usually search By.ClassName or By.Id or whatever first, and check e.TagName second, because IWebElement does NOT provide an e.Id or e.ClassName property. (You can do e.GetAttribute(“id”), but I’m not sure whether that returns null or empty, and nulls can be a bummer.)
  • new ByChained(x,y,z) does not mean an element which matches x,y and z, but instead an element x which contains an element y which contains an element z.   

Third Pass – the Developer Helper

If I run the Console App which is the library that has all the page helpers, I get something like this:

[screenshot: console app menu output]

  • The purpose of this app is to let me quickly get to sections of my website that I need to play around in.
  • The keywords yield a more detailed list; the more detailed list is used in navigation; using the old trick of:

    delimited = "|" + delimited + "|"; if (delimited.Contains("|xxx|")) ...

    which is what we did before arrays #GetOffMyLawn #WheresMyTeeth

  • Once the site is up, it stays there (until I choose somewhere else to go)

Additionally, in tests, I redid the creation/deletion of the driver as follows:

        [TestFixtureSetUp]
        public void FixtureSetup()
        {
            
        }

        [TestFixtureTearDown]
        public void FixtureTeardown()
        {
            AbandonDriver();
        }

        [SetUp]
        public void TestSetup()
        {
            if (Driver == null) Driver = new FirefoxDriver();
            MainPage = MainPage.NavigateDirectly(Driver);
        }

        [TearDown]
        public void TestTeardown()
        {
            
        }

        private void AbandonDriver()
        {
            if (Driver != null) Driver.Dispose();
            Driver = null;
        }

  • I am not creating a driver for every test; however if a test is going to mess up the relative state of a driver, it can be abandoned and a new one started.
  • This saves several seconds per test, which helps once you break the 5-6 test mark.  Especially if what is saved is not as much “start the driver”, but more “start the driver, go to the web site, log in, and get to the page of interest”.

Conclusion

Once you’ve done it once or twice, it’s remarkably easy to write some pretty neat tests. This may not be the perfect way to do it, but it’s something that works for me; hope it works for you!

Test Data Creators (Integration Tests)

Everybody seems to love Unit Tests

I agree, they are wonderful. I have lots of logic that is unit tested … and it’s easy to set up (especially with tools like Moq) …

But it’s not what I rely on. I have found it to be too limited to give me the confidence I’m looking for as I write a system. I want as much tested as I can get – including the data access layers – and how everything fits together – and that my dependency injectors are working correctly.

Another view: in my current project, I’m using NHibernate as the data provider. The general consensus on mocking NHibernate is: don’t do it. Instead, use an in-memory database (didn’t work – I had to maintain different mapping files), or write an IRepository around it.

When I do that, what I find is that most of the logic that needs testing is in the subtleties around my queries (LINQ and otherwise) – the rest is plumbing data from one representation to another. While unit testing that is valid, it does not cover the places where I find most of my failures. Stated in GWT syntax, my tests would be “GIVEN perfect data, WHEN logic is executed, THEN the app does the right thing” – “perfect data” being the elusive part.

I have tried providing a List<T>.AsQueryable() as a substitute data source in unit tests – and that works well, as long as my queries do not get complicated (involving NHibernate .Fetch and so on). When the queries grew beyond my ability to mock them with .AsQueryable(), my “test” situation (LINQ against a list) started to differ significantly from the “real” situation (LINQ against a database), and I started to spend too much time getting the test just right, and no time on real code.

My Solution – Test Data Creators

My solution for the past 5 years over multiple projects has been “Integration Tests”, which engage the application from some layer (controller, presenter, etc) all the way down to the database.

“Integration”,”Unit”, and “Functional” tests — there seem to be a lot of meanings out there. For example, one boss’s idea of a “Unit” test was, whatever “Unit” a developer was working on, got tested. In that case, it happened to be the “Unit” of batch-importing data from a system using 5 command line executables. Thus, for this article only, I define:

  • Unit Test – A test called via the nUnit framework (or similar) that runs code in one target class, using mocks for everything else called from that class, and does not touch a database or filesystem
  • Integration Test – A test called via the nUnit framework (or similar) that runs code in one target class, AND all of the components that it calls, including queries against a database or filesystem
  • Functional Test – Something I haven’t done yet that isn’t one of the above two
  • Turing Test – OutOfscopeException

Having built these several times for different projects, there are definite patterns that I have found that work well for me. This article is a summary of those patterns.

Pattern 1: Test Data Roots

For any set of data, there is a root record.
Sometimes, there are several.
In my current project, there is only one, and it is a “company”; in a previous project, it was a combination of “feed” and “company”.

The Pattern:

  • Decide on a naming convention – usually, “TEST_”+HostName+”_”+TestName
  • Verify that I’m connecting to a location where I can delete things with impunity – before I delete something horribly important (example: if (connection.ConnectionString.Contains("dev")))
  • If my calculated test root element exists, delete it, along with all its children.
  • Create the root and return it.
  • Use IDisposable so that it looks good in a using statement, and any sessions/transactions can get closed appropriately.

Why:

  • The HostName allows me to run integration tests on a build server at the same time as on a local machine, both pointed at a shared database.
  • I delete at the start, which leaves test data behind after the test is run. Then I can query it manually to see what happened. It also leaves behind excellent demo material for demoing functionality to the client and doing ad-hoc manual testing.
  • The TestName allows me to differentiate between tests. Once I get up to 20-30 tests, I end up with a nice mix of data in the database, which is helpful when creating new systems – there is sample data to view.

Example:

using (var root = new ClientTestRoot(connection,"MyTest")) { 
    // root is created in here, and left behind. 
    // stuff that uses root is in here.  looks good. 
}

Pattern 2: Useful Contexts

Code Example:

using (var client = new ClientTestRoot(connection,"MyTest")) { 
    using (var personcontext = new PersonContext(connection, client)) { 
       // personcontext.Client
       // personcontext.Person
       // personcontext.Account
       // personcontext.PersonSettings
    }
}

I create a person context, which has several entities within it, with default versions of what I need.

I also sometimes provide a lambda along the lines of:

new PersonContext(connection, client, p => { p.LastName = "foo"; p.Married = true; })

to allow better customization of the underlying data.

I might chain these things together. For example, a Client test root gives a Person context gives a SimpleAccount context … or, separately, a MultipleAccount context.

Pattern 3: Method for Creating Test Data can be Different from What Application Uses

By historical example:

  • Project 1 (2006): normal app data path = DAL generated by CodeSmith; test data generation = OracleConnection + OracleCommand, by hand
  • Project 2 (2007): normal app data path = DAL generated by CodeSmith; test data generation = generic ADO.NET, using metadata from a SELECT statement + naming conventions to derive INSERTs and UPDATEs from DataTables
  • Project 3 (2008): normal app data path = DAL generated by CodeSmith; test data generation = the same CodeSmith-generated DAL – we had been using it for so long, we trusted it, so we used it in both places
  • Project 4 (2010): normal app data path = existing DAL + business objects; test data generation = Entity Framework 1
  • Project 5 (2011): normal app data path = WCF + SqlConnection + SqlCommand + stored procedures; test data generation = none! (see Pattern 7 below)
  • Project 6 (2012): normal app data path = NHibernate with fancy mappings (References, HasMany, cleaned-up column names); test data generation = NHibernate with simple mappings – raw column names, no References, no HasMany, etc.

The test data creator will only be used by tests — not by the application itself. It maintains its own network connection. However you do it, get it up and running as quickly as you can – grow it as needed. Refactor it later. It does NOT need to be clean – any problems will come to light as you write tests with it.

Pattern 4: Deleting Test Data is Tricky Fun

The easiest way, everybody seems to agree, is: drop the database and reload. I’ve had the blessing to be able to do this exactly once; it’s not the norm for me – usually I deal with shared development databases, or complicated scenarios where I don’t even have access to the development database schema.

Thus, I have to delete data one table at a time, in order.

I have used various strategies to get this done:

  • Writing SQL DELETE statements by hand — this is where I start.
  • Putting ON DELETE CASCADE in as many places as it makes sense. For example, probably don’t want to delete all Employees when deleting a Company (how often do we delete a company! Are you sure?) but could certainly delete all User Preferences when deleting a User. Use common sense.
  • Create a structure that represents how tables are related to other tables, and use that to generate the delete statements.

This is the hardest part of creating test data. It is the first place that breaks — somebody adds a new table, and now deleting fails because foreign keys are violated. (long term view: that’s a good thing!)

I got pretty good at writing statements like:

delete from c2
where c2.id in ( 
    select c2.id from c2
    join c1 on ...
    join root on ....
    where root.id = :id )

After writing 4-5 of them, you find the pattern: the delete query for a child of C2 looks very similar to the delete query for C2, just with a little bit more added. All you need is some knowledge of which tables you can delete from first, and where you can go after that.

How Tables Relate

I no longer have access to the codebase, but as I remember, I wrote something like this:

var tables = new List<TableDef>();
var table1 = new TableDef("TABLE1","T1");
{
     tables.Add(table1);
     var table2 = table1.SubTable("TABLE2","T2","T1.id=T2.parentid");
     {
         tables.Add(table2);
         // etc etc
     }
     // etc etc
}
tables.Reverse();   // so that child tables come before parent tables

I could then construct the DELETE statements using the TableDefs above – the join strategy being the third parameter to the .SubTable() call.

Slow Deletes

I ran into a VERY slow delete once, on Oracle. The reason was, the optimizer had decided that it was faster to do a row scan of 500,000 elements than it was to do this 7-table-deep delete. I ended up rewriting it:

select x.ROWID(), ...; foreach ... { delete  where rowid==... }

Moral(e): you will run into weird deletion problems. That’s okay, it goes with the territory.

Circular Dependencies

Given:

  • Clients have People
  • Feeds have Files For Multiple Clients
  • Files Load People
  • A loaded person has a link back to the File it came from

This led to a situation where if you tried to delete the client, the FK from Feed to Client prevented it. If you tried to delete the feed, the FK from People back to File prevented it.

The solution was to NULL out one of the dependencies while deleting the root, to break the circular dependency. In this case, when deleting a Feed, I nulled the link from each person to any file under the feed being deleted. I also had to do the deletes in order: Feed first, then Client.

Example:
Here’s some real code from my current project, with table names changed to protect my client:

var exists =
	(from c in session.Query<OwnerCompany>() where c.name == companyNameToLookFor select c).
		FirstOrDefault();
if (exists != null)
{
	using (var tran = session.BeginTransaction())
	{
		// rule #1: only those things which are roots need delete cascade
		// rule #2: don't try to NH it, directly delete through session.Connection

		// ownercompany -> DELETE CASCADE -> sites
		// sites -> manual -> client
		// client -> RESTRICT -> feed
		// client -> RESTRICT -> pendingfiles
		// client -> RESTRICT -> queue
		// queue -> RESTRICT -> logicalfile
		// logicalfile -> CASCADE -> physicalfile
		// logicalfile -> CASCADE -> logicalrecord
		// logicalrecord -> CASCADE -> updaterecord

		var c = GetConnection(session);

		c.ExecuteNonQuery(@" 
			delete from queues.logicalfile 
			where queue_id in ( 
			   select Q.queue_id 
			   from queues.queue Q
			   join files.client CM ON Q.clientid = CM.clientid
			   join meta.sites LCO on CM.clientid = LCO.bldid
			   where LCO.companyid=:p0
			)
			", new NpgsqlParameter("p0", exists.id)); 

		c.ExecuteNonQuery(@" 
			delete from queues.queue 
			where clientid in ( 
				select bldid
				from meta.sites
				where companyid=:p0
			)
			", new NpgsqlParameter("p0",exists.id)); 

		c.ExecuteNonQuery(@"
			delete from files.pendingfiles 
			where of_clientnumber in (
				select bldid
				from meta.sites
				where companyid=:p0
			) ",
			new NpgsqlParameter(":p0", exists.id));
		c.ExecuteNonQuery(@"
			delete from files.feed 
			where fm_clientid in (
				select bldid
				from meta.sites
				where companyid=:p0
			) ", 
			new NpgsqlParameter(":p0",exists.id)); 
		c.ExecuteNonQuery(@"
			delete from files.client 
			where clientid in (
				select bldid
				from meta.sites
				where companyid=:p0
			) ",
			new NpgsqlParameter(":p0", exists.id)); 

		session.Delete(exists);
		tran.Commit();
	}
}

In this case, ownercompany is the root, and almost everything else (a lot more than what’s in the comments) CASCADE DELETEs from the tables I delete above.

I did not write this all at once! This came about slowly, as I kept writing additional tests that worked against additional things. Start small!

Pattern 5: Writing Integration Tests Is Fun!

Using a library like this, writing integration tests becomes a joy. For example, a test that verifies that only open accounts are seen:

Given("user with two accounts, one open and one closed"); 
{
   var user = new UserContext(testClientRoot); 
   var account1 = new AccountContext(user,a=>{a.IsClosed=true, a.Name="Account1" }); 
   var account2 = new AccountContext(user,a=>{a.IsClosed=false,a.Name="Account2" }); 
}
When("We visit the page"); 
{ 
    var model = controller.Index(_dataService); 
}
Then("Only the active account is seen"); 
{
    Assert.AreEqual(1,model.Accounts.Count); 
    ... (etc)
    Detail("account found: {0}", model.Accounts[0]); 
}

The GWT stuff above is for a different post; it’s an experiment around a way of generating documentation as to what should be happening.

When I run this test, the controller is running against a real data service, which could go as far as calling stored procedures or a service or whatever.
When this test passes, the green is a VERY STRONG green. There was a lot that had to go right for the test to succeed.

Pattern 6: Integration Tests Take Time To Iterate

Running unit tests, one can easily run 300-500 in a few seconds, so developers run ALL tests fairly often. Integration tests, not so much.

Solution: Use a CI server, like TeamCity, and run two builds:

  • Continuous Integration Build – does the compile, and then runs unit tests on **/bin/*UnitTests.dll
  • Integration Test Build – triggers if the previous build is successful; compiles, and runs all tests on **/bin/*Tests.dll

I.e., the Integration Test build runs a superset of tests – integration tests AND unit tests both.
This also relies on a naming convention for the test DLLs – *UnitTests.dll being more restrictive than *Tests.dll.
There’s another approach I have used, where integration tests are marked with a category and Explicit() – so that local runs don’t run them, but the integration server includes them by category name. However, over time, I have migrated to keeping them in separate assemblies – so that the unit test project does not have any references to any database libraries, keeping it “pure”.

When working on code, I usually run one integration test at a time, taking 3-4 seconds to run. When I’m done with that code, I’ll run all tests around that component.. maybe 30 seconds? Then, I check it in, and 4-5 minutes later, I know everything is green or not, thanks to the CI server. (AND, it worked on at least two computers – mine, and the CI server).

Pattern 7: Cannot Create; Search Instead

This was my previous project. Their databases had a lot of replication going on – no way to run that locally – and user and client creation was locked down. There was no “test root creation”; it got too complicated, and I didn’t have the privileges to do it even if I wanted to tackle the complexity.

No fear! I could still do integration testing – like this:

// Find myself some test stuff
var xxx = from .... where ... .FirstOrDefault(); 
if (xxx == null) Assert.Ignore("Cannot run -- need blah blah blah in DB"); 
// proceed with test
// undo what you did, possibly with fancy transactions
// or if its a read-only operation, that's even better.

The Assert.Ignore() paints the test yellow, with a little phrase stating what needs to happen before the test can become active.

I could also do a test like this:

[Test] 
public void EveryKindOfDoritoIsHandled() { 
    var everyKindOfDorito = // query to get every combination
    foreach (var kindOfDorito in everyKindOfDorito) {
        var exampleDorito = ...... .FirstOrDefault(); 
        // verify that complicated code for this specific Dorito works
    }
}

“Doritos” being a replacement word for a business component that they had many different varieties of, with new ones being added all the time. As the other teams created new Doritos, if we didn’t have them covered (think switch … case … default: throw new NotSupportedException()), our test would break, and we would know we had to add some code to our side of the fence. (To complete the picture: our code had to do with drawing pretty pictures of the “Dorito”. And yes, I was hungry when I wrote this paragraph the first time.)

Interestingly, when we changed database environments (they routinely wiped out Dev integration after a release), all tests would go to Yellow/Ignore, then slowly start coming back as the variety of data got added to the system, as QA ran through its regression test suite.

Pattern 8: My Test Has been Green Forever.. Why did it Break Now?

Unit tests only break when code changes. Not so with Integration tests. They break when:

  • The database is down
  • Somebody updates the schema but not the tests
  • Somebody modifies a stored procedure
  • No apparent reason at all (hint: concurrency)
  • Intermittent bug in the database server (hint: open support case)
  • Somebody deleted an index (and the test hangs)

These are good things. Given something like TeamCity, which can be scheduled to run whenever code is checked in and also every morning at 7am, I get a history of “when did it change” – because at some point it was working, then it wasn’t.

If I enable the integration tests to dump what they are doing to console, I can go back through TeamCity’s build logs and see what happened when it was last green, and what it looked like when it failed, and deduce what the change was.

The fun part is, if all the integration tests are running, the system is probably clear to demo. This reduces my stress significantly, come demo day.

Pattern 9: Testing File Systems

As I do a lot of batch-processing work, I create temporary file systems as well. I use %TEMP% + “TEST” + testname, deleting it thoroughly before recreating it, just like with the databases.

In Conclusion

Perhaps I should rename this to “My Conclusion”. What I have found:

  • I love writing unit tests where it makes sense – a component which has complicated circuitry can use a test around that circuitry.
  • I love even more writing integration tests over the entire system. One simple test like “CompletelyProcessExampleFile1” tells me at a glance that everything that needs to be in place for the REAL WORLD Example File 1 to be processed, is working.
  • It takes time.
  • It’s definitely worth it (to me).
  • It’s infinitely more worth it if you do a second project against the same database.

May this information be useful to you.