Tuesday, July 9, 2013

T-SQL Tuesday #44 How to Take Down Prod in 30 Seconds

Hello Dear Reader!  Welcome to my blog on T-SQL Tuesday #44 Second Chances.  I’m hosting this month, and we are writing all about second chances.  My second chance comes from the not too distant past.

One day the DBA team was given a toy frog as part of some swag from a vendor.  We did what any group of grown men would do.  We put a dunce hat on it.  We decided that whoever screwed up next would have it sitting on their cubicle wall, and we would pass it around as the next offender appeared.  A fun little way to pass the time and rib one another.

No sooner had I participated in developing this badge of shame than I earned it.  The title says it all: how to take down prod in 30 seconds.  But I should clarify.  Not some, not half, but allllllll your clustered servers in just 30 seconds.


I have to give a special thank you to my buddy Dan Taylor (@DBABulldog | Blog).  You see, I remembered I had the frog, but I had forgotten what I had done to earn it.  It was sitting on the edge of my mind, but no matter how hard I tried I could not remember it.  It was sitting in a fog just out of reach.  An itch that I couldn’t scratch.  A few words out of his mouth and it all came flooding back.  We're good friends and have swapped many stories over the years; without his memory (which is better than mine) I would have had to go with a less interesting tale of woe.

“So Balls”, you say, “How did you screw up?”

Well Dear Reader, I had an unfortunate convergence of unexpected anomalies that peaked in a spectacular crescendo of a mistyped password.  Yes, a mistyped password.  My second chance would be typing it in correctly.  The next best thing is explaining it so you hopefully never have to feel the same pain.

I SOLEMNLY SWEAR I AM UP TO NO GOOD


I had a new production SQL 2008 R2 instance to install.  Things were going pretty smoothly.  I got up to the screen where you punch in the password for the service account, and that’s when it all went wrong.

I mistyped the password.  GASP, SHOCK, AWE, OTHER SUCH EXPRESSIONS!!!!!

Normally I would agree no big deal, but the next time I punched in the password I didn't get a password error, I got an error informing me that the account was locked.  Enter the series of unfortunate events.

Imagine you live in a world where all of the Prod servers are using the same service account.  Imagine that you've suggested this be changed but it ended up on the “That’s a good idea we’ll tackle that another day” pile.  Imagine that you are not using Microsoft Clustering for your Clustered servers, and that the inventive Server Engineers rolled their own “health check”.  Imagine that your current password policy locks out when you mistype the password somewhere between 3-8 times.

“But Balls”, you say, “You only typed your password once?  Not 3-8 times!”

Exactly.  There’s a bug in the installer for SQL Server 2008 R2.  When you click the next button after filling out the service account information, you authenticate at least twice for every account you type in.  Not so in SQL 2005 or SQL 2008 (not R2).  But in SQL 2008 R2 one mistyped password counts a whole lot more.  Depending on the services being installed, enough to lock out an account.
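To put rough numbers on that, here is a minimal sketch of the arithmetic.  The function names and the "three services, two checks each" figures are my own illustration, not anything the installer exposes; the only facts taken from the story are "at least twice per account" and a lockout threshold somewhere between 3 and 8.

```python
def failed_attempts(services: int, checks_per_service: int, typos: int = 1) -> int:
    """Failed authentications produced when every service shares one bad password."""
    return services * checks_per_service * typos


def locks_account(failed: int, lockout_threshold: int) -> bool:
    """True when the failed attempts reach the (assumed) lockout threshold."""
    return failed >= lockout_threshold


# Hypothetical install: Engine, Agent, and SSIS on one account, two checks each.
attempts = failed_attempts(services=3, checks_per_service=2)

for threshold in (3, 5, 8):
    print(f"threshold {threshold}: locked out = {locks_account(attempts, threshold)}")
```

One typo, six failed attempts.  With a threshold at the low end of the range the shared account is locked before you get a second try.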

Then you are left to watch the manual health checks fail because the account is locked out and attempt a cluster failover, only for the failover to fail on the other side because the same SQL service account is locked out there too.

You catch your error quickly.  Run to the Team Lead, report what has happened, hoping this can get fixed before the inevitable outages begin.  Then you race back to your desk.  You have an uncomfortable phone call to place to the help desk.

Imagine that while this unfolds you are waiting on hold for the help desk to open a ticket (you have to follow protocol), that will get assigned to an engineer, who will pass it on to AD Services.  Cue the uncomfortable elevator music.

Co-workers are scrambling in the background, like the bullpen of a busy newspaper.  Someone is keeping an active whiteboard of what servers are now down.  Every minute someone in your cube starts to say “Have you….” only to be cut off by your response: “Still on hold”.  Cue the music.

Other co-workers are fielding calls from App Teams reporting that their applications are offline.  Others are trying to reach managers who can bypass a well-orchestrated bureaucratic separation of duties that results in elevator music while you are still on hold.  Did I mention being on hold?  While on hold, forty-five minutes can feel like weeks.

The saving grace (for my job): the bug I found was easy to duplicate.  It was easy to see that this behavior was not in previous versions.  As an added bonus those service accounts started becoming unique real quick.

DEMO: THE BUG I LEARNED ALL ABOUT

We’ll skip ahead a bit.  Say you are installing SQL Server 2008 R2.  We’ve gotten up to the Server Configuration where we are punching in our passwords.  First let’s open up our Event Viewer, click on our Security Tab and clear it out. 


*If this were anything other than my personal VM I would back up the log so we could restore it.  Do not clear out a security log on a prod server without proper guidance.


Now the only event in our log is the event denoting that our log has been cleared.  Back to SQL Server. 
 

We will click on the Use the same account for all SQL Server services button and type in our .\s-sqlsrv service account.  Definitely not following best practices here.  SQL Engine, SQL Agent, and SSIS all getting the same service account. 



Let’s type the password in wrong and see what happens.  Click OK.  Click Next.



SQL reacted just like we thought.  Theoretically we should have 1 bad login check right?  The same user name was in use, we don’t need to validate it 3 more times.  One should do.  Perhaps at most we’ve got three validation checks right?


Let’s head over to our trusty Security log and see.

We’ve gone from 1 to 13 entries in the click of a button.  How many failed logins do we have?  Not 1, 2, 3, 4, 5, 6, 7, but 8 failed logins from one attempt.  You’ll get this if you use the button or if you do not use the button.
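If you would rather count than eyeball the log, the tally can be sketched in a few lines.  This assumes you have already exported the Security log into a list of (event ID, account) pairs by some means of your own.  Event ID 4625 is the Windows "An account failed to log on" event and 1102 is "The audit log was cleared".  The sample data is made up to match the counts above, and the 4776 entries are placeholders, since I am only guessing at what the other four entries were.

```python
from collections import Counter

FAILED_LOGON = 4625  # "An account failed to log on"
LOG_CLEARED = 1102   # "The audit log was cleared"


def failed_logons_by_account(events):
    """Count failed-logon events per account from (event_id, account) pairs."""
    return Counter(account for event_id, account in events if event_id == FAILED_LOGON)


# Made-up export shaped like the demo: 1 cleared-log entry, 8 failures, 4 others.
sample = (
    [(LOG_CLEARED, "admin")]
    + [(FAILED_LOGON, "s-sqlsrv")] * 8
    + [(4776, "s-sqlsrv")] * 4  # placeholder ID for the remaining entries
)

counts = failed_logons_by_account(sample)
print(f"{len(sample)} entries, {counts['s-sqlsrv']} failed logons for s-sqlsrv")
```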

You may be asking did this get fixed in SQL 2012?



One look at the installer and you can see the button is gone.  Let’s punch in the same service account name and an incorrect password.



And now on to our Security log.



Wow!  Six entries.  Now we are looking at 3 entries per account.  Nope, didn’t get any better.

WRAP IT UP

Long story short, make sure those passwords are correct.  Personally I like to use a utility like KeePass to generate, store, and copy my passwords from.  Anything that keeps me from typing.  Or as the case may be, mistyping :).
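For the curious, the "generate" part of that idea can be sketched with Python's standard secrets module.  This is not how KeePass does it, just a minimal illustration; the generate_password name, the 24 character default, and the character set are my own choices.

```python
import secrets
import string

# Assumed character set: letters, digits, and ASCII punctuation.
ALPHABET = string.ascii_letters + string.digits + string.punctuation


def generate_password(length: int = 24) -> str:
    """Build a random password using a cryptographically secure source."""
    return "".join(secrets.choice(ALPHABET) for _ in range(length))


password = generate_password()
print(len(password))
```

Generate it once, store it in your password manager, and paste it.  No typing, no mistyping.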

As always Dear Reader, Thanks for stopping by!

Thanks,

Brad






Friday, July 5, 2013

SQL Saturday Orlando: LAST CALL for Speakers


Hello Dear Reader!  I just wanted to write to Thank All of you for the submissions to SQL Saturday 232 Orlando.  This has been a crazy event and we are still several months away.  

Everything started out as it normally does, with Karla Landrum (@karlakay22 | Blog) leading the way, pulling a motley crew of Shawn McGehee (@SQLShawn | Blog), SQL MVP Kendal Van Dyke (@SQLDBA | Blog), SQL MVP Andy Warren (@SQLAndy | Blog), Ben Cork, and myself behind her.

We hit a hiccup early on.  Our venue wasn’t going to be available on the date we had originally announced.  Unexpectedly we had to shift our date.  Some speakers couldn’t make it, and it brought us into conflict with other SQL Saturdays that some speakers had committed to speak at.  At that time we put out a very public call for speakers.

There's Still time to get a seat at our table!
The response was overwhelming!  So overwhelming that we are closing the call a bit early.  The call for speakers will end on July 10th and we hope to have the schedule out within a week or two after that.  Speaking with Rodney Landrum, my speaker committee co-captain, our goal is still the same.  No speaker will get turned away.



An essential part of SQL Saturday is to provide free training to the community.  Equally important is to help grow the next generation of SQL Server professionals who will be our speakers.  Look no further than myself to see proof of this.

So Dear Reader, get those abstracts in, because we’ll expand the number of rooms to fit you in!  Get ready to be part of the biggest SQL Saturday Orlando Ever!  Besides You know you want one of these!

As always Thanks for stopping by.

Thanks,


Brad

Tuesday, July 2, 2013

T-SQL Tuesday #44 The Second Chance

Hello Dear Reader!  This is the first Tuesday of the month and you know what that means.  It’s time to announce the T-SQL Tuesday Topic of the month!  This is your opportunity to participate in the largest SQL Blog party on the intrawebs. 

T-SQL Tuesday is an event started by Adam Machanic (@AdamMachanic | Blog) back in 2009.  The basic idea: one blogger hosts the party and others participate.  We announce the topic the first Tuesday of the month, July 2nd 2013 for today, and everyone will post their blogs on the second Tuesday of the month, July 9th 2013 for the actual posts.  This month the host is none other than ME!

I love T-SQL Tuesday, there is always so much to write about.  Our world of technology changes so fast.  Each of us has the daily constraints of a life and a job as well.  Sometimes it is great to have a topic to write on so you can express your point of view or get the opportunity to dive a little deeper into an area of SQL that may have piqued your interest.  Equally wonderful is reading all the other blogs that people have put together on the subject.  Variety is the spice of life, and we will get it in spades.


“So Balls”, you say, “That’s great, but what’s the topic?”

Thanks for keeping me on task Dear Reader!  Without further ado the topic of T-SQL Tuesday #44, Second Chances.

SECOND CHANCES



As a DBA or a Presenter/Speaker we have all had at least one moment we would like back.  The demo didn't work, or you were green and got asked a question you now know in your sleep.  You had a presentation in front of a client, and it all went sideways.  Maybe you logged onto the prod server thinking it was dev and dropped something you shouldn't have.  These moments serve not just as painful reminders, but also as powerful instruments for learning.  Would you like another shot at getting it right?  WELL NOW'S YOUR CHANCE!   Or I guess actually your…. Second…. Chance.  Your mission, should you choose to accept it: tell me one of the moments you had, and most importantly what you learned from it!

First and foremost the rules. 

Rule 1: Don’t get yourself fired.  If you almost dropped the prod DB last week, truncated an important table, or took down a prod server during critical business hours, and nobody knows it was you & the people you work for read your blog, you should probably avoid writing about it here.  You want to write about events we can look back on and reflect over, not events HR would *love* to know about.

Rule 2: Sometime next Tuesday, using GMT (here’s a link to a GMT time converter), publish your blog post.  For example, in the US Eastern time zone that would cover 8 pm Monday to 8 pm Tuesday.

Rule 3: Make sure that you include the Image at the top of the page helping to identify your post as a T-SQL Tuesday blog.  Then come back here and post a link in the comments so I can find them.  Before the end of the week I'll do a round up of all the blogs. 
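If the GMT window in Rule 2 has you counting on your fingers, here is a minimal sketch of the conversion.  The posting_window helper is my own invention, and the fixed UTC-4 offset is an assumption that stands in for US Eastern Daylight Time in July; swap in your own offset.

```python
from datetime import datetime, timedelta, timezone

GMT = timezone.utc
# Assumption: US Eastern Daylight Time, a fixed four hours behind GMT in July.
EDT = timezone(timedelta(hours=-4), "EDT")


def posting_window(day_gmt: datetime, local_tz: timezone):
    """Return the local start and end of the 24-hour GMT day containing day_gmt."""
    start = day_gmt.replace(hour=0, minute=0, second=0, microsecond=0, tzinfo=GMT)
    end = start + timedelta(days=1)
    return start.astimezone(local_tz), end.astimezone(local_tz)


# T-SQL Tuesday #44 falls on Tuesday, July 9th 2013 (GMT).
start, end = posting_window(datetime(2013, 7, 9), EDT)
print(start.isoformat(), "to", end.isoformat())
```

Under that assumption the window opens at 20:00 on Monday the 8th and closes at 20:00 on Tuesday the 9th, which lines up with the 8 pm to 8 pm example in Rule 2.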

Extra Credit!

Tweet your blog with the hash tag #tsql2sday & go read someone else’s blog on the subject!
As Always Dear Reader, Thanks for stopping by and I’ll see you next Tuesday!

Thanks,

Brad