Overview
Communication between domain controllers (DCs) can be confusing, and the technical information from Microsoft doesn’t always help. This article attempts to give a simple description of how two domain controllers communicate and why machine accounts and their passwords really matter.
Domain controllers authenticate to each other using a shared secret: a machine account and its password hash. Every DC has a machine account in Active Directory (a machine account represents the entire machine, not just one person), and it keeps the password hash for its own account in its local registry. Each DC also holds, in its local copy of Active Directory, the machine accounts of the other DCs in the domain. Whenever it needs to pull domain account changes from another DC, it uses that DC’s machine account name and the stored password hash to establish the connection.
Each DC saves its current machine account password hash in the CurrVal entry and the previous one in the OldVal entry. By default, a new password hash is renegotiated every 30 days (see MaximumPasswordAge in Group Policy), which copies CurrVal to OldVal and replaces CurrVal with the new password hash. Because only two generations are kept, a DC that has not communicated with the other domain controllers for more than 60 days may no longer share a valid secret with them, and you might need to reset the shared secret. The symptoms of a lapsed shared secret are replication errors with the error message “Access Denied” or “Failed to Authenticate”.
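The rotation described above can be pictured with a small Python sketch. It is a toy model only: the MachineSecret class and the gen0/gen1/gen2 labels are made up for illustration and say nothing about the real registry format. It shows why a value that was current before two rotations is no longer in either slot.

```python
from dataclasses import dataclass
from typing import Optional

ROTATION_DAYS = 30  # default renegotiation interval stated in the text


@dataclass
class MachineSecret:
    """Illustrative stand-in for the CurrVal/OldVal pair (not the real format)."""
    curr_val: str
    old_val: Optional[str] = None

    def rotate(self, new_value: str) -> None:
        # CurrVal is copied to OldVal, then CurrVal is replaced.
        self.old_val = self.curr_val
        self.curr_val = new_value

    def knows(self, value: str) -> bool:
        # True if the value is still held as CurrVal or OldVal.
        return value in (self.curr_val, self.old_val)


secret = MachineSecret(curr_val="gen0")
for day in range(ROTATION_DAYS, 61, ROTATION_DAYS):  # rotations on day 30 and day 60
    secret.rotate(f"gen{day // ROTATION_DAYS}")

print(secret.curr_val, secret.old_val)  # gen2 gen1
print(secret.knows("gen0"))             # False
```

After the second rotation the original value has dropped out of both entries, which is where the 60-day figure comes from.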
Example Scenario
Let’s assume you have two domain controllers, one named dc01.mycompany.local and the other named dc02.mycompany.local, and each is working correctly. Each DC has a local machine account, like dc01$ and dc02$. Each will also have a corresponding account in Active Directory named dc01$ and dc02$. The trailing dollar sign marks these as machine accounts, which are hidden from ordinary user listings.
Let’s assume dc01.mycompany.local wants to pull changes from dc02.mycompany.local (all requests between DCs are pull requests). It consults its own local copy of AD to fetch the account name and password hash for dc02$. It then connects to dc02.mycompany.local with the account dc02$ and the locally stored password hash. dc02 compares the supplied password hash with its own master copy in its private registry, at HKEY_LOCAL_MACHINE\SECURITY\Policy\Secrets\$MACHINE.ACC. The registry value contains both the current password hash (CurrVal) and the prior password hash (OldVal). If either one matches, the connection is allowed. If not, access is denied, the error is reported in the Directory Services Event Log on dc01.mycompany.local, and a security audit failure event is generated in the Event Log on dc02.mycompany.local.
Remember that a new password hash is renegotiated every 30 days (MaximumPasswordAge in Group Policy), which copies CurrVal to OldVal and replaces CurrVal with the new password hash. The password attribute in the AD object CN=dc02$ is changed to match CurrVal, and the change is replicated to all domain controllers in the domain.
If you restore dc02.mycompany.local using an old backup snapshot that was created more than 60 days ago, the registry for dc02.mycompany.local gets restored along with its private copy of Active Directory. The restored password hash in $MACHINE.ACC on dc02.mycompany.local is now three generations old. When dc01.mycompany.local attempts to connect to dc02.mycompany.local, dc02.mycompany.local will reject the password hash in CN=dc02$ (on dc01.mycompany.local) because it matches neither CurrVal nor OldVal, and access is denied. This prevents dc01.mycompany.local from pulling replication data from dc02.mycompany.local.
Similarly, when dc02.mycompany.local attempts to pull replication data from dc01.mycompany.local, the password hash in the restored AD object CN=dc01$ (in dc02.mycompany.local’s out-of-date copy of AD) will match neither CurrVal nor OldVal on dc01.mycompany.local, so dc02.mycompany.local cannot pull the replication data from dc01.mycompany.local. Pull replication fails in both directions and the Event Log error messages start populating.
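The restore scenario can be walked through with a short Python sketch. It models only the simplified description given in this article (a stored hash compared against CurrVal and OldVal), not the actual authentication protocol, and the DomainController class, the can_pull helper and the "gen" values are all hypothetical names chosen for illustration.

```python
import copy
from dataclasses import dataclass, field
from typing import Dict, List, Optional


@dataclass
class DomainController:
    name: str
    curr_val: str                    # stands in for $MACHINE.ACC CurrVal
    old_val: Optional[str] = None    # stands in for $MACHINE.ACC OldVal
    directory: Dict[str, str] = field(default_factory=dict)  # local AD copy

    def rotate(self, new_value: str, peers: List["DomainController"]) -> None:
        # Move CurrVal to OldVal, store the new value, then update the
        # AD object everywhere (replication is treated as instantaneous).
        self.old_val, self.curr_val = self.curr_val, new_value
        for dc in (self, *peers):
            dc.directory[self.name + "$"] = new_value

    def accepts(self, presented: Optional[str]) -> bool:
        return presented is not None and presented in (self.curr_val, self.old_val)


def can_pull(puller: DomainController, source: DomainController) -> bool:
    # The puller presents the hash it has stored for the source's account.
    return source.accepts(puller.directory.get(source.name + "$"))


start = {"dc01$": "dc01-gen0", "dc02$": "dc02-gen0"}
dc01 = DomainController("dc01", "dc01-gen0", directory=dict(start))
dc02 = DomainController("dc02", "dc02-gen0", directory=dict(start))

snapshot = copy.deepcopy(dc02)       # backup of dc02 taken now

for gen in (1, 2, 3):                # more than 60 days of rotations
    dc01.rotate(f"dc01-gen{gen}", [dc02])
    dc02.rotate(f"dc02-gen{gen}", [dc01])

healthy = (can_pull(dc01, dc02), can_pull(dc02, dc01))
dc02 = copy.deepcopy(snapshot)       # restore dc02 from the old snapshot
broken = (can_pull(dc01, dc02), can_pull(dc02, dc01))

print(healthy, broken)  # (True, True) (False, False)
```

In this model, both directions work before the restore and both fail after it, matching the behaviour described above.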
Troubleshooting
Before you assume an out-of-date password is the problem, you must verify that the two servers can communicate with each other. From the newly restored DC, attempt to ping the other DC. Open an administrative console and type:
ping dc01.mycompany.local
where dc01.mycompany.local is the fully qualified domain name (FQDN) of the Primary Domain Controller in the domain. If this fails, you have a basic network connectivity problem, and you need to fix that first.
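If ICMP is blocked between the servers, a ping can fail even though the DCs can talk to each other. As a complementary check, the following Python sketch tests whether a TCP connection can be opened instead. This is a different technique from ping, and both the can_reach helper and the choice of port 389 (LDAP) are assumptions made for illustration.

```python
import socket


def can_reach(host: str, port: int = 389, timeout: float = 3.0) -> bool:
    """Return True if a TCP connection to host:port opens within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers DNS failures, refused connections and timeouts.
        return False


# Example call, using the host name from this article:
#   can_reach("dc01.mycompany.local")
```

A False result points to name resolution, routing or firewall problems rather than to a lapsed shared secret.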
Solution
If they are able to communicate and are generating the errors listed above, then you should reset the DC Shared Secret.
On the recently restored DC, run the Netdom console utility to reset its machine account password:
netdom resetpwd /server:dc01.mycompany.local /UserD:mycompany\Administrator /PasswordD:*
where “Administrator” is the name of a user account on dc01 that has administrator privileges to change the passwords of other users, and “mycompany” is the name of the domain. The command Netdom resetpwd will do the following:
- Write the new random password hash to $MACHINE.ACC in the registry of the local computer (dc02) as CurrVal. The prior password hash is moved to OldVal.
- Update the AD object CN=dc02$ on dc01 with the new password hash (using the supplied logon credentials).
- Update the AD object CN=dc02$ on the local computer (dc02) with the same new password hash (for local loopback connections).
- This will allow dc01 to connect to dc02 to pull replication data from dc02.
- This will also allow dc01 to replicate the new password to all of the other DCs in the domain, allowing them all to also connect to dc02.
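The effect of the steps listed above can be sketched with a toy Python model. It follows this article’s simplified description rather than what Netdom does internally, and the DomainController class, the reset_machine_password helper and the sample values are hypothetical.

```python
import secrets
from dataclasses import dataclass, field
from typing import Dict, Optional


@dataclass
class DomainController:
    name: str
    curr_val: str                    # stands in for $MACHINE.ACC CurrVal
    old_val: Optional[str] = None    # stands in for $MACHINE.ACC OldVal
    directory: Dict[str, str] = field(default_factory=dict)  # local AD copy

    def accepts(self, presented: Optional[str]) -> bool:
        return presented is not None and presented in (self.curr_val, self.old_val)


def reset_machine_password(local: DomainController, peer: DomainController) -> None:
    """Illustrative version of the three write steps in the list above."""
    new_value = secrets.token_hex(16)                           # new random secret
    local.old_val, local.curr_val = local.curr_val, new_value   # local registry
    peer.directory[local.name + "$"] = new_value                # AD object on peer
    local.directory[local.name + "$"] = new_value               # AD object locally


# State right after dc02 was restored from an old snapshot.
dc01 = DomainController("dc01", "dc01-new", "dc01-prev",
                        {"dc01$": "dc01-new", "dc02$": "dc02-new"})
dc02 = DomainController("dc02", "dc02-stale", None,
                        {"dc01$": "dc01-stale", "dc02$": "dc02-stale"})

before = dc02.accepts(dc01.directory["dc02$"])   # dc01 pulling from dc02
reset_machine_password(local=dc02, peer=dc01)
after = dc02.accepts(dc01.directory["dc02$"])
reverse = dc01.accepts(dc02.directory["dc01$"])  # dc02 pulling from dc01

print(before, after, reverse)  # False True False
```

In this model the reset repairs only the direction in which dc01 pulls from dc02; dc02 still holds a stale hash for dc01$, so the reverse direction has to be repaired separately.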
Note: In rare cases you might also need to stop and restart the Kerberos Distribution Center (KDC) service. See the Microsoft article “How to use Netdom.exe to reset machine account passwords of a Windows Server domain controller”.
You also need to reset dc01’s shared secret so that dc02 can pull replication data in the reverse direction. On the other DC (dc01), run the Netdom console utility to reset its machine account password:
netdom resetpwd /server:dc02.mycompany.local /UserD:mycompany\Administrator /PasswordD:*
where “Administrator” is the name of a user account on dc02 that has administrator privileges to change the passwords of other users, and “mycompany” is the name of the domain. The command Netdom resetpwd will do the following:
- Write the new random password hash to $MACHINE.ACC in the registry of the PDC (dc01) as CurrVal. The prior password hash is moved to OldVal.
- Update the object CN=dc01$ on dc02 with the new password hash (using the supplied logon credentials).
- Update the object CN=dc01$ on the local computer (dc01) with the same new password hash (for local loopback connections).
- This will allow dc02 to connect to dc01 to pull replication data from it. Other DCs can still connect to dc01 using the old password hash (in $MACHINE.ACC OldVal). They will receive dc01’s new password hash through normal automated replication.
Eventually the new password hash in CN=dc01$ will replicate to any other DCs in the domain.