Active Directory Troubleshooting Basics
In this article we will cover ways to monitor and troubleshoot common problems with Active Directory. Although listing ways to troubleshoot Active Directory could easily span into a 3 volume book set, we will cover the most common issues and solutions here within these articles. Whether you are already a pro, or just a beginner – these tips should serve you well. In this article we will cover replication traffic and how to monitor and troubleshoot it with tips and tools.
Monitoring and Troubleshooting Active Directory Replication
Replication may be defined as a duplicate copy of similar data on the same or a different platform or system. When using a directory service such as Active Directory, the directory database is carried by all domain controllers so that when you want to contact a domain controller for use, there is always a local copy local for use so that requests do not have to be sent over the wide area network (WAN). Replication for Active Directory operates within the directory service component of the security subsystem. This component is called Ntdsa.dll and is accessed through the Lightweight Directory Access Protocol (LDAP). Ntdsa.dll runs as a part of the local security authority (LSA), which runs as Lsass.exe. Updates are transported over Internet Protocol (IP) by the remote procedure call (RPC) protocol. The Simple Mail Transfer Protocol (SMTP) is also available for use as well, although it’s more common to see RPC over IP used.
When considering Active Directory, replication takes place and a copy of the Active Directory database is stored and updated on all other participating domain controllers on your network and in a perfect world, each copy of the database is the same and all domain controllers are synchronized. If this happens, then all your domain controllers are synchronized with an exact duplicate copy of the Active Directory database. When you install Active Directory, for the most part even if all the default settings are chosen, the replication process from domain controller to domain controller is automatic and practically transparent. For the most part, domain controllers handle the replication processes without advanced configuration and most times, without a problem.
In figure 1, you can see a common network (2 sites connected via a WAN link) with a domain controller in each location. Again, the benefit of having a domain controller local to your PC’s at each network segment is to have requests made of the domain controller kept local to the PC’s in need of its services to speed up requests (by keeping them local) or in case of disaster recovery, which could happen if the WAN link drops, the local PCs can still find a local domain controller to use. Keeping traffic off the wide area network (WAN) and containing it to the local area network (LAN) is the best design practice you can implement.
Figure 1: A Common Wide Area Network (WAN)
As a systems administrator, you should still consider that Active Directory performance still needs to be monitored and analyzed. The health and maximized performance of Active Directory depends on a smooth replication process. If you are having problems with replication, you will know not only from blatant logging in your Event Viewer, but from poor performance as well. Many times, you cannot stop every problem from occurring, but hopefully after reading this article, you will be better equipped to handle issues and keep your network as optimized as possible to handle the traffic traversing it.
Consider a common problem such as a failed network link. In figure 2, you see that the main wide area network link has been broken.
Figure 2: A Failed Network Link
ISP’s and telecom service providers occasionally have problems and service can be interrupted. This of course stops the communication between domain controllers, therefore also severing the replication process. This can prevent the synchronization of information between domain controllers and possibly cause corruption and/or other problems.
A good way to make sure that this doesn’t happen is to set up a backup link (such as ISDN as seen in figure 2). ISDN (Integrated Services Digital Networks) is a digital WAN technology used to facilitate connections between sites. More commonly used today for disaster recovery, ISDN still has a place in today’s marketplace. Although still used, you don’t have to limit yourself to any technology when it comes to backup links, you can use a fractional or full T1, a DSL line, or any other technology that allows you to have redundancy in your links. The goal is to have redundant links to keep your domain controllers in constant communication with each other so that the Active Directory database stays synchronized and healthy. A common symptom of replication problems is that information is not updated on some or all domain controllers. For example, a systems administrator creates a user account on one domain controller, but the changes are not propagated to other domain controllers. In most environments, this is a potentially serious problem because it affects network security and can prevent authorized users from accessing the resources they require. You can take several steps to troubleshoot Active Directory replication; each of these is discussed in the following sections.
Verifying Network Connectivity
In order for replication to work properly in distributed environments, you must have network connectivity. Although ideally all domain controllers would be connected by high-speed and redundant LAN or WAN links, this is rarely the case for larger deployments and for most companies that utilize slow WAN links that aren’t recoverable from a disaster. Always make sure your network topology is documented and tested to ensure that it’s connected. There are many tools you can use to verify connectivity such as Ping and Tracert which come with just about every operating system ever created that runs TCP/IP.
In real world deployments, analog/dial-up connections and slow connections are common. If you have verified that your replication topology is set up properly, you should confirm that your servers are able to communicate over the network. Problems such as a failed dial-up connection attempt can prevent important Active Directory information from being replicated. Learn how to use ping and other ICMP based protocol troubleshooting tools in the links section at the end of this article.
Verifying Router and Firewall Configurations
When building a secure network, most times controls are placed on network devices to filter the traffic going from place to place. The most commonly used tool to control traffic is a Firewall. A router or any other device that utilizes a firewall feature set, or some other form of Access Control that stops access to and from other hosts connected can also be used. A firewall is usually dedicated to only protecting the perimeter so its been designed to do that, do not assume that the use of a firewall stops any risk of you being attacked, it only minimizes that risk.
Firewalls are used to restrict the types of traffic that can be transferred between networks. Their main use is to increase security by preventing unauthorized users from transferring information. In some cases, company firewalls may block the types of network access that must be available in order for Active Directory replication to occur. For example, if a specific router or firewall prevents data from being transferred using SMTP, replication that uses this protocol will fail.
Network Ports Used by Active Directory Replication
RPC replication uses dynamic port mapping as per the default setting. When you need to connect to an RPC endpoint during Active Directory replication, RPC uses TCP port 135. RPC on the client contacts the RPC endpoint mapper on the server at a well-known port and RPC randomly allocates high TCP ports from port 1024 to 65536. Because of this configuration, a client will never need to know what port to use for Active Directory replication; it will just take place seamlessly. There are also other ports assigned for Active Directory replication. There are as follows:
Protocol Port LDAP udp 389
tcp 389LDAP (SSL) udp 636
tcp 636Kerberos udp 88
tcp 88DNS udp 53
tcp 53SMB over IP udp 445
tcp 445Global Catalog Server tcp 3269
tcp 3268
Examining the Event Logs:
Errors, if they occur, will show up in the Event Viewer logs. At the end of this article, I have placed a link to the Microsoft Website so that you can learn how to use the Event Viewer. The Event Viewer can be very helpful when trying to locate and resolve a replication problem. Many errors are reported to the Event Viewer for your review.
Whenever an error in the replication configuration occurs, the computer writes events to the Directory Service and File Replication Service (FRS) event logs. By using the Event Viewer administrative tool, you can quickly and easily view the details associated with any problems in replication. For example, if one domain controller is not able to communicate with another to transfer changes, a log entry is created.
You may receive events such as:
- Event ID 1311 in the directory service log
- Event ID 1265 with error “DNS Lookup Failure” or “RPC server is unavailable” in the directory service log. Or, received “DNS Lookup Failure” or “Target account name is incorrect” from the repadmin command
- Event ID 1265 “Access denied,” in directory service log. Or, received “Access denied” from the repadmin command
Note:
The link at the end of the article covers the explanation of these specific errors and more.
Verifying Site Links
Before domain controllers in different sites can communicate with each other, the sites must be connected by site links. If replication between sites is not occurring properly, verify that the proper site links are in place. Verify your site links by using the Replication diagnostics utility (Repadmin.exe). Use this tool to verify correct site links and to display inbound and outbound connections. You can also use it to display the replication queue. You can get the tool by using the link at the end of this article.
Verifying That Information Is Synchronized
It’s often easy to forget to perform manual checks regarding the replication of Active Directory information. One of the reasons for this is that Active Directory domain controllers have their own read/write copies of the Active Directory database. Therefore, if connectivity does not exist, you will not encounter failures while creating new objects.
It is important to periodically verify that objects have been synchronized between domain controllers. This process might be as simple as logging on to a different domain controller and looking at the objects within a specific OU. This manual check, although it might be tedious, can prevent inconsistencies in the information stored on domain controllers, which, over time, can become an administration and security nightmare.
Verifying Authentication Scenarios
A common replication configuration issue occurs when clients are forced to authenticate across slow network connections. The primary symptom of the problem is that users complain about the amount of time it takes them to log on to the Active Directory (especially during times of high volume of authentications, such as at the beginning of the workday). Usually, you can alleviate this problem by using additional domain controllers or reconfiguring the site topology. A good way to test this is to consider the possible scenarios for the various clients that you support. Often, walking through a configuration, such as “A client in Domain A is trying to authenticate using a domain controller in Domain B, which is located across a very slow WAN connection,” can be helpful in pinpointing potential problem areas.
Verifying the Replication Topology
The Active Directory Sites and Services tool allows you to verify that a replication topology is logically consistent. You can quickly and easily perform this task by right-clicking the NTDS Settings within a Server object and choosing All Tasks => Check Replication Topology. If any errors are present, a dialog box alerts you to the problem.
You can verify the Active Directory topology using the Active Directory Sites and Services tool.
Besides for ensuring that replication always continues, you can also learn how to monitor it as well. There are several ways in which you can monitor the behavior of Active Directory replication and troubleshoot the process if problems occur. In our next article we will look at the replication monitor and part III of this article will cover the system monitor.
Summary
In this article we covered the basics of replication, how it works, how to verify and troubleshoot it and what you can do to ensure that you Active Directory topology is healthy.