Short Update: Monitoring For Data Problems in Complex Systems

create view problem_view_darth_vader as (select ..... problem data)

In our case, the problem was that a particular item had two UPC codes from two suppliers. Add a couple of extra columns like department='production' and fix='how to fix it'.

Next, write a generic checker which runs select table_name from information_schema.views where table_name like 'problem_view_%' (the column in information_schema.views is table_name, not view_name) and loops through the results. If any view has rows, display the rows. I used a DataTable and DataAdapter.Fill() because then you get the columns even if there are no rows, unlike Dapper’s <dynamic>. Show this on a web page, along with a div id=SuccessDiv that only appears when every view comes back empty.
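Here is a rough sketch of that loop in Python, not the original C# code. SQLite stands in for SQL Server, so the catalog query goes against sqlite_master instead of information_schema.views, and cursor.description plays the role of the DataTable columns (it is populated even when a view returns no rows). The item_upc table, its data, and the find_problems / render names are made up for the illustration.

```python
import sqlite3

def find_problems(conn):
    """Return {view_name: (columns, rows)} for every problem_view_* view."""
    names = [r[0] for r in conn.execute(
        "SELECT name FROM sqlite_master "
        "WHERE type = 'view' AND name LIKE 'problem\\_view\\_%' ESCAPE '\\' "
        "ORDER BY name")]
    report = {}
    for name in names:
        # Names come from the catalog, not from user input; quote them anyway.
        cur = conn.execute('SELECT * FROM "{}"'.format(name.replace('"', '""')))
        columns = [d[0] for d in cur.description]  # present even with zero rows
        report[name] = (columns, cur.fetchall())
    return report

def render(report):
    """Show offending rows, or the success marker when nothing is wrong."""
    lines = []
    for name, (columns, rows) in report.items():
        if rows:
            lines.append(name + ": " + ", ".join(columns))
            lines.extend("  " + repr(row) for row in rows)
    if not lines:
        lines.append('<div id="SuccessDiv">No problems found</div>')
    return "\n".join(lines)

# Hypothetical data: one item with two UPC codes from two suppliers.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE item_upc (item TEXT, supplier TEXT, upc TEXT);
INSERT INTO item_upc VALUES
  ('widget', 'acme', '111'), ('widget', 'globex', '222'), ('gadget', 'acme', '333');
CREATE VIEW problem_view_darth_vader AS
  SELECT item, COUNT(DISTINCT upc) AS upc_count,
         'production' AS department, 'pick one UPC per item' AS fix
  FROM item_upc GROUP BY item HAVING COUNT(DISTINCT upc) > 1;
""")
broken = find_problems(conn)

# Fix the data; the same view now returns no rows but still reports its columns.
conn.execute("DELETE FROM item_upc WHERE supplier = 'globex'")
clean = find_problems(conn)
```

The underscore is a single-character wildcard in LIKE, so the sketch escapes it; the unescaped pattern in the text also works, it just matches a little more than intended.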

Then, using your favorite “how’s my website doing” tool (mine is DataDog Synthetic Tests), do a test looking for the success div. Set up a monitor, track failures. Datadog takes screenshots, so the errored data is saved with the failed monitor.

Result: Here’s the monitor zoomed in to when I introduced a test view to create a problem:

And here is the screen that was captured as part of the failed test:

The only side problem is that you don’t have control over how frequently each check runs. Different pages could probably handle this by using different view names, like problem_view_darth_vader_60 to check every hour, for example. I may improve it in the future.

I think the next things to put monitors up for are probably these particular things that get stuck without a thing. (Sorry, redacted, but you get the idea).

AWS RDS SqlServer Native Backup and Restore

Had to learn this yesterday to clone a production environment down to a lower environment. Figured it qualified for a blog post.

exec msdb.dbo.rds_backup_database 
         @source_db_name='xxxProd', 
         @s3_arn_to_backup_to='arn:aws:s3:::xxx-sql-native-backup/xxxProd.bak',
         @overwrite_S3_backup_file=1,
         @type='full';

exec msdb.dbo.rds_task_status;   -- till lifecycle=SUCCESS

ALTER DATABASE xxxUAT SET SINGLE_USER WITH ROLLBACK IMMEDIATE;

drop database xxxUAT;
exec msdb.dbo.rds_restore_database 
         @restore_db_name='xxxUAT', 
         @s3_arn_to_restore_from='arn:aws:s3:::xxx-sql-native-backup/xxxProd.bak';

exec msdb.dbo.rds_task_status;   -- till lifecycle=SUCCESS

delete from xxxUAT.dbo.SensitiveTableStuff;

The gotchas were:

Had to set up an option group that added SqlServer Native Backup and Restore to the RDS instance. It took a few minutes to apply, but the RDS instance did not reboot or go offline during this process.
Could not restore over an existing database.
Learned the hard way that while you can detach, you can’t re-attach a database using SSMS. Reattaching uses a custom stored procedure. And detaching and attaching had nothing to do with deleting.
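The “run rds_task_status till lifecycle=SUCCESS” step above is the part I would script. A minimal sketch in Python, under some assumptions: fetch_lifecycle is a hypothetical callable you supply that runs exec msdb.dbo.rds_task_status with whatever SQL Server driver you use and returns the lifecycle column of the newest task, and the failure states ("ERROR", "CANCELLED") are the ones I remember from the AWS docs, so check them before relying on this.

```python
import time

def wait_for_task(fetch_lifecycle, poll_seconds=15.0, timeout_seconds=3600.0,
                  sleep=time.sleep):
    """Poll until the task reports SUCCESS; raise on failure or timeout.

    fetch_lifecycle: hypothetical callable returning the current lifecycle
    string (for example by querying msdb.dbo.rds_task_status).
    """
    waited = 0.0
    while True:
        lifecycle = fetch_lifecycle()
        if lifecycle == "SUCCESS":
            return waited
        if lifecycle in ("ERROR", "CANCELLED"):  # assumed failure states
            raise RuntimeError("RDS task ended with lifecycle=" + lifecycle)
        if waited >= timeout_seconds:
            raise TimeoutError("RDS task still " + lifecycle + " after timeout")
        sleep(poll_seconds)
        waited += poll_seconds

# Dry run with canned states instead of a real database connection.
states = iter(["CREATED", "IN_PROGRESS", "SUCCESS"])
naps = []
total_wait = wait_for_task(lambda: next(states), poll_seconds=5, sleep=naps.append)
```

Run it once after the backup call and once after the restore call, before touching the restored database.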

Primer on SSDT / Checking Schema into Source Control

I’ve been learning a lot of new things with this client – SSIS, SSRS, and now we’ve also implemented a workflow using SSDT. In short: This is a well supported way to check your database schema and seed data into source control. So, looking back, what are the resources that I wish I had known going in?

Installation

Near as I can tell, SSDT is installed if/when you install SQL Server Express. There’s a basic version also installed if you install VS2012 Premium and up, supposedly.

It can also be installed directly via http://msdn.microsoft.com/en-us/jj650015

Once installed, you’ll be able to do Schema Compares, Data Compares (useless to me?), and database projects (*.sqlproj). As a lot of people don’t seem to have it installed, I usually put it in a different solution than the main project solution.

Capturing an Existing Database into a Database Project

Create database project; If you choose to go blank at first then..
Right click solution, compare schema, SWITCH DIRECTIONS, Update Target
Hint: Don’t save any user objects, not till you are very comfortable with it.

As of the current version, I no longer have to have one database project, and a different server project (as I did when I looked at it in 2010 or so).

Making Changes

Can either make them in the database and then suck the changes in, or can edit them directly in visual studio – either graphically or in SQL.

A build action will verify every reference – for example, a sproc referencing columns, etc. The build also complains when the casing of an identifier doesn’t match its definition, which can be confusing; you can turn that check off.

When you check the changes into source control, they are saved as the T-SQL version. Very self-documenting, diffable, etc.

Applying Changes to Local Database Interactively

Use schema compare, select/deselect what you want to update, and hit Update.

This does NOT run pre- and post- deployment scripts.

Applying Database Changes Automated / Without Visual Studio

Compile the project, get a .DACPAC file. Then use SQLPACKAGE.EXE, located usually at C:\Program Files\Microsoft SQL Server\110\dac\bin\sqlpackage.exe. Newlines added for readability.

“If I publish it, what will it do?”

/action:DeployReport
/sourceFile:Flintstone.Database.dacpac
/TargetServerName:BEDROCKCITY
/TargetDatabasename:FlintStone
/OutputPath:differences.xml
/p:DropObjectsNotInSource=false

“What is absolutely everything that is different between the two?”

/action:DeployReport
/sourceFile:Flintstone.Database.dacpac
/TargetServerName:BEDROCKCITY
/TargetDatabasename:FlintStone
/OutputPath:differences.xml
/p:DropObjectsNotInSource=true

“Do it. (but don’t delete those users!)”

/action:Publish
/sourceFile:Flintstone.Database.dacpac
/TargetServerName:BEDROCKCITY
/TargetDatabasename:FlintStone
/p:DropObjectsNotInSource=false
/p:BlockOnPossibleDataLoss=false

It runs it with pretty colors in a console app.
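If you end up calling it from a build script, a small helper keeps the three variations above straight. This is only a sketch: it builds the argument list using exactly the switches shown above and nothing else, and sqlpackage_args is a name I made up.

```python
def sqlpackage_args(action, dacpac, server, database,
                    output_path=None, properties=None):
    """Build the sqlpackage.exe argument list for the switches shown above."""
    args = [
        "/action:" + action,
        "/sourceFile:" + dacpac,
        "/TargetServerName:" + server,
        "/TargetDatabasename:" + database,
    ]
    if output_path is not None:
        args.append("/OutputPath:" + output_path)
    for name, value in (properties or {}).items():
        args.append("/p:{}={}".format(name, "true" if value else "false"))
    return args

# Usual location, per the text above; adjust for your install.
SQLPACKAGE = r"C:\Program Files\Microsoft SQL Server\110\dac\bin\sqlpackage.exe"

# "If I publish it, what will it do?"
report_args = sqlpackage_args(
    "DeployReport", "Flintstone.Database.dacpac", "BEDROCKCITY", "FlintStone",
    output_path="differences.xml",
    properties={"DropObjectsNotInSource": False})

# "Do it. (but don't delete those users!)"
publish_args = sqlpackage_args(
    "Publish", "Flintstone.Database.dacpac", "BEDROCKCITY", "FlintStone",
    properties={"DropObjectsNotInSource": False, "BlockOnPossibleDataLoss": False})

# To actually run it:
#   import subprocess
#   subprocess.run([SQLPACKAGE] + publish_args, check=True)
```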

With Sql Server Management Studio, caveats

In SSMS, under database, tasks, there is “Upgrade Data-tier Application”. This does effectively the same thing, except you don’t get to choose the options above, so it usually does too much. However, you can save off a change script and then walk the change script by hand.

Handling Data Changes

This guy wrote an excellent blog article about it: http://schottsql.blogspot.com/2012/11/ssdt-pre-and-post-deploy-scripts.html. I extended it a little bit:

Created a ScriptHistory table with a script name (PK) and when it was run, to avoid rerunning
This is for the one-time refactors that move data from an old location to a new location (or go from inferred settings to explicit settings). They must not be run again, and there is no way to tell from the data itself whether a run is a rerun, so the script’s name in ScriptHistory is the only record.

Use MERGE statements like this article suggests: http://blogs.msdn.com/b/ssdt/archive/2012/02/02/including-data-in-an-sql-server-database-project.aspx
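The ScriptHistory idea is small enough to sketch. This is not the post-deploy script I actually used; it is a Python illustration with SQLite standing in for SQL Server, and the Account table, the script name, and run_once are invented for the example. The point is that the one-time statements and the history row are committed together, and a second call with the same name does nothing.

```python
import sqlite3

def run_once(conn, script_name, statements):
    """Run one-time DML statements unless script_name is already recorded."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS ScriptHistory ("
        "ScriptName TEXT PRIMARY KEY, RunOn TEXT NOT NULL)")
    seen = conn.execute(
        "SELECT 1 FROM ScriptHistory WHERE ScriptName = ?",
        (script_name,)).fetchone()
    if seen:
        return False
    with conn:  # commit the script and its history row together, or neither
        for statement in statements:
            conn.execute(statement)
        conn.execute(
            "INSERT INTO ScriptHistory (ScriptName, RunOn) "
            "VALUES (?, datetime('now'))",
            (script_name,))
    return True

# Hypothetical refactor that would corrupt data if it ran twice.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Account (Id INTEGER PRIMARY KEY, Balance INTEGER)")
conn.execute("INSERT INTO Account VALUES (1, 100)")
conn.commit()

refactor = ["UPDATE Account SET Balance = Balance * 2"]
first = run_once(conn, "2013-convert-balances", refactor)
second = run_once(conn, "2013-convert-balances", refactor)
balance = conn.execute("SELECT Balance FROM Account WHERE Id = 1").fetchone()[0]
history_rows = conn.execute("SELECT COUNT(*) FROM ScriptHistory").fetchone()[0]
```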

Best Practice for Security / Users / etc

I have not had a chance to do this yet, but the general idea is to create roles for permissions, have SSDT deploy the roles, but the user membership is unique per server and is handled manually. (We’re talking about situations like I am developing in COMPANY1’s domain, but the production server is in COMPANY2’s domain and Windows Auth is being used.)

In Conclusion

I watched my dba-idol Richard Svatos struggle with this in 2003.

I’ve tried to solve it in a variety of ways.

I’m glad there is now a usable, well supported solution in the ecosystem in which I am flourishing.

Day at the Office: LINQ vs Stored Procedure

I love using LINQ (to EF to SQL). It takes care of the plumbing of getting data out of the database. But there was one case where I had to swap it out for a stored procedure.

The Setup

There were several screens that showed similar data, but constrained in different ways. They all shared common code along these lines:

Base:

IQueryable Q;  // defines the base query that all screens need

Overrides 1..N:

IQueryable Filter(IQueryable Q) { 
    return Q.Where(… additional clauses…); 
    // specific to the screen 
}

and then calling back to some Shared code:

IQueryable Filter(IQueryable Q, FilterSettings fs) {
  // apply standard filters that user could set at the app global level
}

The app used to then take this IQueryable and send it over to a ProjectAintoB() function that did the actual query (.ToList()) and then mapped that stuff out to a viewmodel / list item thing to shove into a grid:

return new GridItem { 
    Description = Q.Description, 
    ClientName = Q.Client.Name, 
    ManagerName = Q.Client.Manager.FullName, 
    Exceptions = Q.Exceptions, 
    NotesCount = Q.Notes.Count()
};

Note: this is simplified. The original GridItem class had somewhere around 35 members, across 8-9 tables.

When we first started using the app, this worked well enough. It returned quickly enough to not be noticeable. However, as the app started to grow, things started to get noticeably slower, so we went onsite and did some profiling, and this is what we found:

Actual screenshot/image from original analysis document around the problem

At the client site, it was a lot longer, with CPU pegged at 100% for 2-3 seconds.
Turns out, most of that CPU was LINQ attempting to build the query, not actually executing the query.
- I didn’t notice this on my dev machine because of superior hardware and multiple (8) cores. At the client, most machines had only 2 cores, and one of those was busy doing client-ish stuff.
- This was with EF 5.0 with .Net 4.0, before they had “cached compilation of queries” (.Net 4.5). I tried upgrading to .Net 4.5; but still had problems.
After the query was executed, there was a network spike as the data returned from the server
- Turns out, by having 9 tables returning, and the way EF work(ed|s), there was a LOT of repetition – the same Manager FullName across 5,000 rows, etc.
- This led to a large network payload of returned data
- This was a killer over VPN. (Which only affected me, the developer, but I am important!)

The Solution

We did it in two stages:

We kept the original concept of an IQueryable that gets modified by various layers to get down to the data that needs to be pulled back.
We executed it, but only pulled back a single item: select g.GridItemId. On some screens, this was 50 items; on other screens, 10000 items.
We called a stored procedure which took in a TVP (table valued parameter) of id’s, and returned multiple result sets:
- one result set for the grid items + external id’s to other tables
- one result set for all the Clients referenced by these grid items.
- one result set for all the managers referenced by these grid items.
- one result set for all the Notes referenced by these grid items.
We read these in, and directly constructed our grid item from these results.
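The last step, stitching the result sets back into grid items, looks roughly like this. It is a Python sketch rather than the original C#, the column names (client_id, manager_id and so on) are assumptions, and only four of the 35-odd members are shown. Each Client and Manager row comes over the wire once and gets looked up by id, instead of being repeated on every grid row.

```python
from collections import Counter
from dataclasses import dataclass

@dataclass
class GridItem:
    grid_item_id: int
    description: str
    client_name: str
    manager_name: str
    notes_count: int

def build_grid_items(item_rows, client_rows, manager_rows, note_rows):
    """Join the four result sets in memory, keyed by their ids."""
    clients = {c["client_id"]: c for c in client_rows}
    managers = {m["manager_id"]: m for m in manager_rows}
    notes_per_item = Counter(n["grid_item_id"] for n in note_rows)
    items = []
    for row in item_rows:
        client = clients[row["client_id"]]
        manager = managers[client["manager_id"]]
        items.append(GridItem(
            grid_item_id=row["grid_item_id"],
            description=row["description"],
            client_name=client["name"],
            manager_name=manager["full_name"],
            notes_count=notes_per_item[row["grid_item_id"]],  # 0 when no notes
        ))
    return items

# Hypothetical result sets, shaped like what the sproc would return.
grid = build_grid_items(
    item_rows=[
        {"grid_item_id": 1, "description": "first", "client_id": 10},
        {"grid_item_id": 2, "description": "second", "client_id": 10},
    ],
    client_rows=[{"client_id": 10, "name": "Slate Rock", "manager_id": 7}],
    manager_rows=[{"manager_id": 7, "full_name": "Mr. Slate"}],
    note_rows=[{"grid_item_id": 1}, {"grid_item_id": 1}],
)
```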

And it worked wonderfully.

Average compilation time at the client site was down to 50 ms (most of the complexity was joining in to all the other tables to populate a grid item)
The second call to the sproc finished within 20-40ms or so
The payload across the wire was much smaller.

Here’s a screenshot of a paragraph I wrote in my excitement at getting it to run so much faster:

I also recorded a video of a before screen, and the after screen. I can’t show that to you, but it was a hit.

The Result

This, and other performance enhancements over the last 6 months – according to the client last week:

Client: Yes! Actually, 2.0 is faster than 1.0 now! We dislike going back to 1.0.
Me:

Backup, Balance, and Cross Apply

My goodness, the weeks have flown. I haven’t had the time to properly devote to any one geeky project.. but there’s been a bunch of smaller things going on. I’ll try to sort them from Geeky-est to Meta.

I used Cross Apply for the first time

It’s hard to do better than Stack Overflow, so here: http://stackoverflow.com/questions/1139160/when-should-i-use-cross-apply-over-inner-join

My use case was pretty simple:

I had a sproc which would return inventory of stocks held.
For anybody who had more than 100 stocks, I was trying to create an option from those stocks (100 stocks = 1 option), for scale testing purposes.
However, to create an option, I needed a valid combination of symbol, expiration date, and strike price (so that web service lookups against the market would return valid data)

The lookup of what is valid is something like SELECT TOP 1 EXPIRATION, STRIKEPRICE from <stuff we already know about> where SYMBOL=@symbol

The cross apply becomes something like this:

INSERT INTO (new thing) 
SELECT (bunch of old things from S, LOOKUP.EXPIRATION, LOOKUP.STRIKEPRICE) 
FROM inventory of stock S
CROSS APPLY (  select top 1 ... ) AS LOOKUP
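If the SQL reads oddly, the semantics are easier to see outside the database. The Python sketch below is not how SQL Server executes it, just what CROSS APPLY means: run the right-hand query once per left-hand row, and drop the left row if nothing comes back (OUTER APPLY would keep it). The inventory rows and known options are made up.

```python
def cross_apply(left_rows, apply_fn):
    """Pair each left row with every row apply_fn returns for it.

    Left rows for which apply_fn returns nothing are dropped, which is
    the difference between CROSS APPLY and OUTER APPLY.
    """
    for left in left_rows:
        for right in apply_fn(left):
            yield {**left, **right}

# Hypothetical stand-ins for the stock inventory and the known-valid options.
inventory = [
    {"holder": "alice", "symbol": "AAA", "quantity": 250},
    {"holder": "bob", "symbol": "ZZZ", "quantity": 400},
]
known_options = [
    {"symbol": "AAA", "expiration": "2013-09-21", "strike_price": 50},
    {"symbol": "AAA", "expiration": "2013-12-21", "strike_price": 55},
]

def top_1_valid_option(stock):
    """The 'select top 1 ... where SYMBOL=@symbol' lookup for one stock row."""
    matches = [o for o in known_options if o["symbol"] == stock["symbol"]]
    return [{"expiration": m["expiration"], "strike_price": m["strike_price"]}
            for m in matches[:1]]

new_options = list(cross_apply(inventory, top_1_valid_option))
```

Bob’s ZZZ holding has no known-valid option, so it produces no row, which is what I wanted for test data that has to survive a market lookup.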

I had to reformat my W7 box at home

Something went haywire, and the computer would freeze every time a video played on a web page. Even ads. I tried all kinds of stuff around uninstalling and reinstalling drivers… SAFE mode worked ok, but coming out of safe mode = kaboom.

So I decided to restore my machine to a previous known good version, thanks to Windows Home Server.

But I forgot that my machine had a RealTek integrated network adaptor, which the WHS restore CD doesn’t know about.. so no connection for restore! I tried looking for the drivers CD, couldn’t find it in 5 minutes.

So I reinstalled Windows 7 from scratch. And still no Network driver. I had to go get the drivers on a different machine, and bring ’em over by sneakernet. At this point, I was committed, so I continued the reformat.

I currently have a robocopy going from my WHS backup to my new working directory. My incredibly awesome nomenclature is: “c:\2013”. That’s where I put my stuff. If a project spans years, it gets copied from one year to the other. One folder per project.

Life Balance

Family Reunion Road Trip
2nd swim time with Father in Law.. I’m getting better at this treading water stuff.
Not much running.. 5 miles here, 0.6 there, 2 more squeezed in
Fourth of July = dogs went crazy while we were not home = broke through drywall to get out of basement and tear through the house
Dryer motor getting repaired, thanks to a shoe. And a dryer vent, thanks to a dog (it was next to the drywall above)
Crazy dogs led to walking the dogs in the morning. I am not a morning person.
Today is door replacement day, thank you Lowes
Back deck is done! Thank you Steve Ader. He does good work.
3 birthday parties … ages: 30, 33, 1.. and an upcoming 86.

Looking forward

I’m hoping to get my Android + OBDII set back up again. I want to see how much gasoline I’m spending over a 2D map.
I could write about tracking down some performance problems in EF .. mostly around context.Attach() .. and how I replaced it with a single update statement.
I could write about my current project, the “feel” of it, as it went from “we don’t trust the system” to “hey, this is working pretty good”.
I need a different home backup solution. If I upgrade one more hard drive, my WHS will be at reasonable-capacity, and honestly I just use it as a NAS. I have a 1 year old off-site backup, which is not automated enough for me. I have only used WHS’s restore the way it was meant ONCE – when my laptop got unstable, and I restored it to itself – and that was a beautiful thing – but once in 6 years? Compared to 7 or 8 reformats (changing OSes usually, or new owner for a machine)? … contemplation is needed.

peace out.

Database Change Scripts

At CodePaLousa Day #1, I attended a session, Tools and Adoption Patterns for Automating Database Deployment, about doing automated change in databases. That’s more than I need to do right now, however, it gave me a +1 on how I’m currently managing change, which is my default method if the customer doesn’t provide one.

What I do:

For every change, I start or append to a script file.
When a change gets to prod or “moves on” to branch, I start a different script file.
Every script file must be re-runnable.
Every script file must have a precondition check – something that is true of the database before it can run against it.
For SqlServer, if the precondition check fails, I turn noexec on to prevent the rest of the script from running.
For SqlServer, DDL statements (and, as in the sample below, updates that touch columns the script adds or drops) are wrapped in exec() so that I don’t get compilation errors.

Here’s a sample, with names changed to protect the client:

SET ANSI_NULLS ON
GO
SET QUOTED_IDENTIFIER ON
GO
-- This is something from the most recent previous round of changes, which should already exist. 
PRINT 'Precondition Check'
if (not exists (select 1 from information_Schema.COLUMNS where TABLE_NAME='Quadrant' and COLUMN_NAME='ImperialCode')) 
begin
    PRINT '  Failed -- skipping rest of script'
    set noexec on
end else begin
    PRINT '  Passed'
end
GO 

-- In support of TA26 (revision 1683): 

PRINT 'Table NoRebelsAllowedRule'
if (not exists (select 1 from INFORMATION_SCHEMA.TABLES where TABLE_NAME='NoRebelsAllowedRule')) 
begin

    CREATE TABLE [dbo].[NoRebelsAllowedRule](
        [NoRebelsAllowedRuleID] [int] IDENTITY(1,1) NOT NULL,
        [CatalogID] [int] NOT NULL,
        [DeletedOn] [datetime] NULL,
        [DeletedByUserID] [int] NULL,
        [CreatedOn] [datetime] NOT NULL,
        [CreatedByUserID] [int] NOT NULL,
        [UpdatedOn] [datetime] NULL,
        [UpdatedByUserID] [int] NULL,
     CONSTRAINT [PK_NoRebelsAllowedRule] PRIMARY KEY CLUSTERED 
    (
        [NoRebelsAllowedRuleID] ASC
    )WITH (PAD_INDEX = OFF, STATISTICS_NORECOMPUTE = OFF, IGNORE_DUP_KEY = OFF, ALLOW_ROW_LOCKS = ON, ALLOW_PAGE_LOCKS = ON) ON [PRIMARY]
    ) ON [PRIMARY]
    PRINT '  Created'
end else begin
    PRINT '  Already Exists'
end
GO

PRINT 'Foreign Key FK_NoRebelsAllowedRule_Catalog'
if (not exists ( select 1 from information_schema.REFERENTIAL_CONSTRAINTS where constraint_name='FK_NoRebelsAllowedRule_Catalog' ) ) 
begin 
    ALTER TABLE [dbo].[NoRebelsAllowedRule]  WITH CHECK ADD  CONSTRAINT [FK_NoRebelsAllowedRule_Catalog] FOREIGN KEY([CatalogID])
    REFERENCES [dbo].[Catalog] ([CatalogID])
    PRINT '  Created'
end else begin
    PRINT '  Already Exists'
end
GO

PRINT '04/23/13 Replacing Catalog.CannotFireBlasters with BlasterBehaviorID'
if (not exists ( select 1 from information_schema.COLUMNS where table_name='Catalog' and column_name='BlasterBehaviorID' ) ) 
begin
    alter table [dbo].[Catalog] add BlasterBehaviorID INT NOT NULL DEFAULT 0; 
    PRINT '  Added BlasterBehaviorID'
end else begin
    PRINT '  BlasterBehaviorID Already Exists'
end
if (exists (select 1 from information_Schema.COLUMNS where table_name='Catalog' and column_name='CannotFireBlasters' ) ) 
begin
    exec (N'update [dbo].[Catalog] set BlasterBehaviorID=0 where CannotFireBlasters=0');
    exec (N'update [dbo].[Catalog] set BlasterBehaviorID=3 where CannotFireBlasters=1');
    PRINT '  Migrated Data from CannotFireBlasters'
    alter table [dbo].[Catalog] drop column CannotFireBlasters
    PRINT '  Dropped old column CannotFireBlasters'
end else begin
    PRINT '  Old column CannotFireBlasters does not exist'
end
GO

set noexec off
GO

Since the above sample, I’ve started adding dates and work item numbers into the script as well, to make things a bit more traceable. I figured as long as I had a sequence going, I’d append to the same script, though in an environment with multiple developers, you may want to partition more.

Why I didn’t..

RedGate Sql Compare tools: I don’t own them.
Sql Server Project in Visual Studio: When you actually get around to using them against real environments, there are a ton of problems around “Server” projects versus “Schema” projects (or some such) which are worse in 2012 than they used to be.
EF Migrations: Haven’t spent the time to learn them, desire to be productive, sticking with what I already know and can do quickly.