{"id":2517,"date":"2018-06-14T23:57:12","date_gmt":"2018-06-14T23:57:12","guid":{"rendered":"http:\/\/microsoftgeek.com\/?p=2517"},"modified":"2018-09-06T23:17:57","modified_gmt":"2018-09-06T23:17:57","slug":"an-introduction-to-sql-server-clusters","status":"publish","type":"post","link":"https:\/\/microsoftgeek.com\/?p=2517","title":{"rendered":"An Introduction to SQL Server Clusters"},"content":{"rendered":"<h3 id=\"what_type_of_clustering_are_we_talking_about\">What Type of SQL Clustering are we Talking About?<\/h3>\n<p>There are lots of types of clusters out there.\u00a0When we cluster SQL Server, we install one or more SQL Server instances into a\u00a0Windows Failover Cluster. In this post I\u2019m talking specifically about clustering SQL Server 2005 or later using Windows Server 2008 or later.<\/p>\n<p><strong>Key Concept<\/strong>: A Windows Failover Cluster uses shared storage\u2013 typically, this shared storage is on a\u00a0SAN.\u00a0When a SQL Server instance is installed on the cluster, system and user databases are required to be on the shared storage. That allows the cluster to move the SQL instance to any server (or \u201cnode\u201d) in the cluster whenever you request, or if one of the nodes is having a problem. \u00a0There is only one copy of the data, but the network name and SQL Server service for the instance can be made active from any cluster node.<\/p>\n<p><strong>Translation<\/strong>: A failover cluster basically gives you the ability to have all the data for a SQL Server instance installed in something like a share that can be accessed from different servers. It will always have the same instance name, SQL Agent jobs, Linked Servers and Logins wherever you bring it up. You can even make it always use the same IPAddress and port\u2013 so no users of the SQL Server have to know where it is at any given time.<\/p>\n<p>Here is a diagram of a SQL Server cluster. The cluster is named SQLCLUSTER01. It has two nodes (servers), which are named SQLCLU01NODE01 and\u00a0SQLCLU01NODE02. People connect to the SQL Server instance at SQLCLU01A\\SQL. The instance has been configured on port 1433.<\/p>\n<p><strong><img loading=\"lazy\" decoding=\"async\" class=\"wp-image-13535 alignnone\" title=\"SQL Server Cluster Diagram with Shared Storage\" src=\"https:\/\/www.brentozar.com\/wp-content\/uploads\/2012\/02\/SQlFailoverCluster_01.png\" sizes=\"auto, (max-width: 638px) 100vw, 638px\" srcset=\"https:\/\/www.brentozar.com\/wp-content\/uploads\/2012\/02\/SQlFailoverCluster_01.png 1107w, https:\/\/www.brentozar.com\/wp-content\/uploads\/2012\/02\/SQlFailoverCluster_01-600x300.png 600w, https:\/\/www.brentozar.com\/wp-content\/uploads\/2012\/02\/SQlFailoverCluster_01-300x150.png 300w, https:\/\/www.brentozar.com\/wp-content\/uploads\/2012\/02\/SQlFailoverCluster_01-500x250.png 500w\" alt=\"\" width=\"638\" height=\"319\" \/><\/strong><\/p>\n<p>Oh no! There\u2019s been a failure in our environment!<\/p>\n<p><img loading=\"lazy\" decoding=\"async\" class=\"alignnone wp-image-13562\" title=\"SQlFailoverCluster_AfterCrash\" src=\"https:\/\/www.brentozar.com\/wp-content\/uploads\/2012\/02\/SQlFailoverCluster02.png\" sizes=\"auto, (max-width: 638px) 100vw, 638px\" srcset=\"https:\/\/www.brentozar.com\/wp-content\/uploads\/2012\/02\/SQlFailoverCluster02.png 1107w, https:\/\/www.brentozar.com\/wp-content\/uploads\/2012\/02\/SQlFailoverCluster02-600x300.png 600w, https:\/\/www.brentozar.com\/wp-content\/uploads\/2012\/02\/SQlFailoverCluster02-300x150.png 300w, https:\/\/www.brentozar.com\/wp-content\/uploads\/2012\/02\/SQlFailoverCluster02-500x250.png 500w\" alt=\"\" width=\"638\" height=\"319\" \/><\/p>\n<p>Here\u2019s what happened.<\/p>\n<p>The\u00a0SQLCLU01NODE01 server crashed unexpectedly. When this happened, the Windows Failover Cluster service saw that it went offline. It brought up the SQL Server services on SQLCLU01NODE02. The\u00a0SQLCLU01A\\SQL instance started up and connected to all the same databases on the shared storage\u2013 there\u2019s one copy of the data, and it doesn\u2019t \u00a0move. As part of the SQL Server startup, any transactions that were in flight and had not committed at the time of the crash were rolled back.<\/p>\n<p>While this automatic failover was occurring, users could not connect to the\u00a0SQLCLU01A\\SQL instance. However, after it came back up they were able to resume operations as normal, and had no idea that a server was still offline.<\/p>\n<h3 id=\"why_you_care_about_clustering\">Why You Care About SQL SERVER Clustering<\/h3>\n<p>If you\u2019re a business owner, manager, or DBA, you care about clustering because it helps keep your applications online more of the time\u2014 when done properly, it makes your database highly available.<\/p>\n<p>Here are some ways that clustering makes your life easier:<\/p>\n<ul>\n<li>Hardware failures are a nightmare on standalone servers. If a server starts having problems in a failover cluster, you can easily run your SQL Server instance from another node while you resolve the issue.<\/li>\n<li>Applying security patches on a standalone server can be very tedious and annoying to the business: the SQL Server is offline while you wait for the server to reboot. By using failover clustering, you can apply patches with only brief downtimes for your application as you move your SQL Server instance to a different node.<\/li>\n<li>Failover clusters can also give you an additional tool in your troubleshooting toolkit. Example: if you start seeing high latency when using storage and you\u2019ve ruled out all the immediate candidates, you can fail to another node to try to rule out if it\u2019s a problem with a per-node component like an\u00a0HBA.<\/li>\n<li>Clustering is transparent to the calling application. Lots of things with SQL Server \u201cjust work\u201d with clustering, whereas they\u2019re a little harder with other alternatives. With clustering,\u00a0<em>all<\/em>\u00a0of my databases, logins, agent jobs, and everything else that\u2019s in my SQL Server instance fail over and come up together as a single unit\u2014 I don\u2019t have to script or configure any of that. I can also cluster my\u00a0distributed transaction coordinator\u00a0and fail it over with my instance as well.<\/li>\n<\/ul>\n<h3 id=\"gotchas_and_notes_for_planning_a_cluster\">Gotchas and Notes for Planning a SQL Cluster<\/h3>\n<h4 id=\"know_what_clustering_doesn8217t_do\">Know What Clustering SQL Server Doesn\u2019t Do<\/h4>\n<p>The first gotcha is to be aware of what a failover cluster\u00a0<em>won\u2019t<\/em>\u00a0help you with.<\/p>\n<p>Clustering won\u2019t improve your performance, unless you\u2019re moving to more powerful servers or faster storage at the same time you implement clustering. If you\u2019ve been on local storage, don\u2019t assume moving to a SAN means a nirvana of performance. Also, clustering doesn\u2019t guarantee that everything involved in your SAN is redundant! If your storage goes offline, your database goes too.<\/p>\n<p>Clustering doesn\u2019t save you space or effort for backups or maintenance. You still need to do all of your maintenance as normal.<\/p>\n<p>Clustering also won\u2019t help you scale out your reads. While a SQL Server instance can run on any node in the cluster, the instance is only started on one node at a time. That storage can\u2019t be read by anyone else on the cluster.<\/p>\n<p>Finally, clusters won\u2019t give you 100% uptime. There are periods of downtime when your SQL Server instance is \u201cfailing over\u201d, or moving between nodes.<\/p>\n<h4 id=\"avoid_having_too_many_nodes\">Invest Time Determining the Right Naming Convention<\/h4>\n<p>You have a lot of names involved in a cluster: a name for the cluster itself, names for each of the servers in the cluster, and names for each SQL instance in the cluster. This can get confusing because you can use\u00a0<em>any<\/em>\u00a0of these names later on when connecting with Remote Desktop\u2013 so if you\u2019re not careful, there may be times when you\u2019re not entirely sure what server you\u2019re logged onto!\u00a0I have two general rules for naming:<\/p>\n<p>First, make sure it\u2019s obvious from the name what type of component it is\u2013 whether it\u2019s a cluster, physical server, a SQL Server instance, or a Distributed Transaction Coordinator. I also recommend installing\u00a0BGINFO\u00a0to display the server name on the desktop for every server in the cluster.<\/p>\n<p>Second, name everything so that if you later add further nodes or install another SQL Server instance onto the cluster, the names will be consistent.<\/p>\n<h4>Avoid Putting Too Many Nodes in One SQL Cluster<\/h4>\n<p>I prefer to have only two or three nodes in a cluster. For example, if I need to cluster five SQL Server instances, I would put them in two failover clusters.<\/p>\n<p>This requires a few extra names and IP Addresses overall, but I prefer this for management reasons. \u00a0When you apply patches or upgrades, you must make sure that each service on your cluster runs on each node successfully after you\u2019ve applied the change. Having a smaller cluster means you don\u2019t need to fail your instance over as many times after a change.<\/p>\n<h4 id=\"don8217t_assume_your_applications_will_reconnect_properly_after_failover\">Don\u2019t Assume Your Applications Will Reconnect Properly After Failover<\/h4>\n<p>Even though your SQL Server instance will come up with the same network name and IPAddress (if not using DHCP), many applications aren\u2019t written to continue gracefully if the database server goes offline briefly.<\/p>\n<p>Include application testing with your migration to a failover cluster. Even though the application doesn\u2019t know it\u2019s talking to a cluster (it\u2019s a connection string like any other), it may not reconnect after a failover. I worked with one application where everything worked fine after a failover, except web servers stopped writing their log data to a database because they weren\u2019t designed to retry after a connection failure. The data was written asynchronously and didn\u2019t cause any failures that impacted users, but the issue wasn\u2019t noticed immediately and caused the loss of some trending data.<\/p>\n<h4 id=\"8220active_active8221_can_be_useful\">\u201cActive Active\u201d Can Be Useful<\/h4>\n<p>My ideal cluster layout to work with is a two node cluster with identical hardware and two SQL Server instances on it. This is commonly called \u201cActive Active\u201d clustering, but that term is technically a no-no. Officially this is called a \u201cMulti-Instance Failover Cluster.\u201d Not quite as catchy, is it?<\/p>\n<p>Many people think the ideal situation is to put their most important SQL Server instance on a two node cluster and leave the second node ready, waiting, and idle. So, why do I want a second SQL Server instance?<\/p>\n<p>I like to put my critical, heavy hitter database on one of those instances in the cluster. I then want to take a couple of less critical, less busy databases and put them on the second instance. The perfect examples are logging databases. There are two requirements for these databases: first, they can\u2019t require a large amount of memory or processor use to run well, because I absolutely have to know that these two instances can run successfully at peak load on a single node if required. Second, the databases on the \u201cquiet\u201d instance shouldn\u2019t cause the whole application to go offline if they aren\u2019t available.<\/p>\n<p>Why do I like having a \u201cquiet\u201d instance? Well, whenever I need to apply updates to Windows or SQL Server, this is the canary I send into the coal mine first. You can perform\u00a0rolling upgrades\u00a0with failover clusters, which is great. But it\u2019s even better to know that the first instance you fail over onto an upgraded node won\u2019t take absolutely everything down if it has a problem.<\/p>\n<p><strong>Notes<\/strong>: Because of licensing costs, this option won\u2019t always be realistic. If you go this route you have to make sure everything can stay within SLA if it has to run on a single node at your busiest times\u2013 don\u2019t overload that \u201cquiet\u201d instance!<\/p>\n<h4 id=\"8220active_active8221_can_be_useful\">Re-Evaluate your SQL Server Configuration Settings<\/h4>\n<p>Revisit your configuration settings as part of your planning. For example, on a multi-instance cluster, you use the\u00a0<em>minimum<\/em>memory setting for SQL Server to configure how your instances will balance their memory usage if they are on the same node.<\/p>\n<h3 id=\"do_i_have_to_use_clustering_to_use_availability_groups_in_sql_server_2012\">Do I Have to Use Clustering to Use Availability Groups in SQL Server 2012?<\/h3>\n<p>This is an interesting question\u2013 don\u2019t let it confuse you. We have a very cool new feature called\u00a0Availability Groups\u00a0coming in SQL Server 2012, which\u00a0<strong>does<\/strong>\u00a0offer \u00a0awesome scale-out read functionality. You\u2019ll read in many places that it \u201crequires Failover Clustering.\u201d<\/p>\n<p>This is true. In order to use the Availability Group feature in SQL Server 2012, the Failover Clustering feature must be enabled in Windows. If you\u2019re using Windows Server 2008 or prior, this feature is only available in Datacenter and Enterprise edition of Windows Server, so that feature isn\u2019t free. \u00a0This feature is now included in Windows Server 2012 for all editions.<\/p>\n<p>But wait, there\u2019s a catch! Even though you\u2019re enabling the Failover Cluster feature, you are\u00a0<em>NOT<\/em>\u00a0required to have shared storage to use Availability Groups. You have the\u00a0option to use a Failover Cluster in an Availability Group, but you can also run your Availability Groups with entirely independent storage subsystems if you desire. The feature is required because no matter what, Availability Groups will use parts of the Failover Clustering feature to manage a virtual network name and IP Address.<\/p>\n","protected":false},"excerpt":{"rendered":"<p>What Type of SQL Clustering are we Talking About? There are lots of types of clusters out there.\u00a0When we cluster SQL Server, we install one or more SQL Server instances into a\u00a0Windows Failover Cluster. In this post I\u2019m talking specifically about clustering SQL Server 2005 or later using Windows Server 2008 or later. Key Concept: [&hellip;]<\/p>\n","protected":false},"author":1,"featured_media":0,"comment_status":"open","ping_status":"open","sticky":false,"template":"","format":"standard","meta":{"footnotes":""},"categories":[71,48],"tags":[],"class_list":["post-2517","post","type-post","status-publish","format-standard","hentry","category-sql-server","category-microsoft-windows-server-2012"],"_links":{"self":[{"href":"https:\/\/microsoftgeek.com\/index.php?rest_route=\/wp\/v2\/posts\/2517","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/microsoftgeek.com\/index.php?rest_route=\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/microsoftgeek.com\/index.php?rest_route=\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/microsoftgeek.com\/index.php?rest_route=\/wp\/v2\/users\/1"}],"replies":[{"embeddable":true,"href":"https:\/\/microsoftgeek.com\/index.php?rest_route=%2Fwp%2Fv2%2Fcomments&post=2517"}],"version-history":[{"count":1,"href":"https:\/\/microsoftgeek.com\/index.php?rest_route=\/wp\/v2\/posts\/2517\/revisions"}],"predecessor-version":[{"id":2518,"href":"https:\/\/microsoftgeek.com\/index.php?rest_route=\/wp\/v2\/posts\/2517\/revisions\/2518"}],"wp:attachment":[{"href":"https:\/\/microsoftgeek.com\/index.php?rest_route=%2Fwp%2Fv2%2Fmedia&parent=2517"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/microsoftgeek.com\/index.php?rest_route=%2Fwp%2Fv2%2Fcategories&post=2517"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/microsoftgeek.com\/index.php?rest_route=%2Fwp%2Fv2%2Ftags&post=2517"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}