isthewebsitedown if you are asking, probably not. if I am asking, probably so

1 Dec 2009

Download HKCRScan.exe tool for troubleshooting MS Article ID 823159

Users were getting an "HTTP/1.1 503 Service Unavailable" error on both https://<servername>/Exchange and https://<servername>/Public. On https://<servername>/microsoft-server-activesync, they got a login prompt and then an "HTTP 501/HTTP 505" error.

The tool below should be run from the command prompt. It should identify and remove registry keys over the 259 character limit, and it will report any errors it hits. If you have null keys (keys that are faulty but unremovable), you can use RootkitRevealer from Sysinternals to get rid of them. I understand that RegDelNull can do something similar, but in this case it was a corrupt key, not a key with null characters.

In my case, the affected key was related to the driver for the Intel storage controller (VEN_8086&DEV_24D3&SUBSYS_458015D9&REV_02). Not cool. I could not delete or rename the key and could not set/view permissions on it. I ran RootkitRevealer, which caused a stop error/reboot (crap) but successfully removed the key. IN OTHER WORDS, DO NOT DO THIS IF YOU DO NOT HAVE A TESTED BACKUP.

"To help troubleshoot this issue, run the HKCRScan tool (HKCRScan.exe). The HKCRScan tool enumerates the HKEY_CLASSES_ROOT registry hive to locate subkeys that contain more than 259 characters. Additionally, HKCRScan helps determine if there is an invalid discretionary access control list by returning error code 0x5. This error code means "Access denied" when it enumerates a registry key. The HKCRScan tool is an internal tool developed by Microsoft."

Download: HKCRScan
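
To make the "more than 259 characters" check in the quoted description concrete, here is a minimal Python sketch of that kind of scan. It is not HKCRScan and makes no claim about how Microsoft's tool works internally. It walks a registry tree modelled as nested dicts (a stand-in so the logic can be tried anywhere), and it assumes the limit applies to the full key path; `walk_keys` and `find_long_keys` are names made up for this example.

```python
MAX_KEY_CHARS = 259  # limit quoted in the KB text; applying it to the full path is an assumption


def walk_keys(tree, prefix=""):
    """Yield the backslash-separated path of every key in a nested-dict tree."""
    for name, children in tree.items():
        path = f"{prefix}\\{name}" if prefix else name
        yield path
        yield from walk_keys(children, path)


def find_long_keys(tree, limit=MAX_KEY_CHARS):
    """Return (path, length) for every key path longer than the limit."""
    return [(p, len(p)) for p in walk_keys(tree) if len(p) > limit]


# Hypothetical sample data: one normal key and one absurdly long one.
sample = {"CLSID": {"{short-key}": {}, "x" * 300: {}}}
for path, length in find_long_keys(sample):
    print(f"{length} chars: {path[:40]}...")
```

On a real Windows box the nested dict could be filled from the standard library's `winreg` module (`winreg.OpenKey` and `winreg.EnumKey`), but that part is left out here because it only runs on Windows.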

31 Oct 2009

When to panic…

I am working on a 225 mailbox migration this weekend. The environment is basically the following:

Old Server: Windows 2003 Std/Exchange 2003 Std, all patches (pretty basic)

New environment: two Windows 2008 Enterprise servers running the Exchange 2007 Enterprise mailbox role in CCR, plus a Windows 2008 Standard machine running the CAS/HT roles and serving as the File Share Witness host. Each of the mailbox servers has three volumes (CCR likes both machines to be as nearly identical as possible): a 40GB C:, a 20GB D: for log files (on a RAID10) and a 300GB E: (on a RAID6). These volumes were set up by a co-worker a few weeks ago and he did a great job with it. The servers are fast and they have great I/O on disk writes. All three machines are hosted in an ESX/blade server environment with a SAN backend connected via Fibre Channel. This is becoming a pretty popular arrangement. The RAID10 logfile volume is considered best practice for performance reasons. The mailbox store lives on the big RAID6 volume for fault tolerance.

Anyway, all machines were updated and I had tested failing over the CCR cluster nodes successfully, so at about midnight last night, I started moving mailboxes. At around 2am, the old mail server went offline. It responded to ping, but I could not RDP to it or get to any SMB shares. Couldn't get to the services either. It was, for my purposes, dead. The big issue here is that the mailbox move process was still trying to work, for all 225 mailboxes. With the old server gone, the failing moves hammered the log files. And since log shipping is pretty much how CCR works, both servers started choking. In two hours, we generated 19.8GB of log files, which filled the 20GB log volume and knocked the mailstore offline. I could not remount it, since there was no room for more logfiles.
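
For a sense of scale, here is a quick back-of-the-envelope calculation in Python using the numbers above. It assumes the Exchange 2007 transaction log size of 1MB per file; the function names are made up for this sketch.

```python
LOG_FILE_MB = 1  # assumed: Exchange 2007 transaction logs are 1MB each


def log_burn_rate(total_gb, hours):
    """Convert a total log volume and duration into file count and MB per minute."""
    total_mb = total_gb * 1024
    return {
        "files": total_mb / LOG_FILE_MB,
        "mb_per_min": total_mb / (hours * 60),
    }


def hours_until_full(volume_gb, gb_per_hour):
    """How long a volume of the given size lasts at a steady log generation rate."""
    return volume_gb / gb_per_hour


rate = log_burn_rate(19.8, 2)
print(f"{rate['files']:.0f} log files, {rate['mb_per_min']:.0f} MB per minute")
print(f"A 20GB log volume lasts about {hours_until_full(20, 19.8 / 2):.1f} hours at that rate")
```

In other words, at this rate a 20GB log volume that would normally last a long time between backups is gone in roughly the time it takes to notice something is wrong.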

Panic mode.

I temporarily stopped the replication, created new log file folders on both of the cluster nodes, moved the location of the log files in AD, moved the files themselves over to the big data volume, and restarted replication. These steps were originally from EXPTA.com, but it appears that the site is down, so I am linking to the Google cache. These should all be done in the Exchange Management Shell (launched as administrator), and only performed after the new log directories have been created on both cluster nodes in the exact same location. Obviously, you will also need to change the paths to match your environment.

Step 1:  Suspend-StorageGroupCopy -Identity "First Storage Group" -SuspendComment "Moving transaction logs" -Confirm:$False

Step 2:  Move-StorageGroupPath -Identity 'First Storage Group' -LogFolderPath 'E:\ExchangeLogs' -SystemFolderPath 'E:\ExchangeLogs' -ConfigurationOnly

Step 3: move [oldpath]\*.* [newpath]

Step 4: Resume-StorageGroupCopy -Identity "exchange1\First Storage Group"
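
Before Step 3, it is worth confirming that the new folder exists and that the target volume can actually hold the logs. The following Python sketch is one way to do that check; it is not part of the original procedure, the function names are made up, and it would need to be run on each cluster node. The 10% headroom is an arbitrary assumption.

```python
import os
import shutil


def dir_size_bytes(path):
    """Total size of the files directly inside path (no recursion)."""
    return sum(e.stat().st_size for e in os.scandir(path) if e.is_file())


def can_move_logs(old_path, new_path, headroom=0.10):
    """Return (ok, message). headroom is the extra free space required, as a fraction."""
    if not os.path.isdir(new_path):
        return False, f"{new_path} does not exist; create it on both nodes first"
    needed = int(dir_size_bytes(old_path) * (1 + headroom))
    free = shutil.disk_usage(new_path).free
    if free < needed:
        return False, f"need {needed} bytes, only {free} free on target"
    return True, f"ok: {needed} bytes needed, {free} free"


# Example call. D:\ExchangeLogs is a hypothetical old path; E:\ExchangeLogs is from Step 2.
# ok, msg = can_move_logs(r"D:\ExchangeLogs", r"E:\ExchangeLogs")
```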

After this was completed (step 3 took a while, since I had 20GB of logfiles), I was able to remount the store and test via OWA. Then it was time to figure out why the Ex2003 box went down. After the moves are complete, I will run a backup to commit those log files to the DB and then move them back to the correct drive, as 20GB should be enough in any normal case.

26 Oct 2009

Exchange troubleshooting

While there is no substitute for a full working lab, there are several tools that can make troubleshooting various elements of your Exchange environment easier.

MX ToolBox - Great for all-in-one checking of reverse pointers, blacklists, open relays and general diagnostics. If I was stuck on a desert island, this would be the troubleshooting website I would take.

TestExchangeConnectivity.com - Runs a test connection to your Exchange server the same way your Windows Mobile phone or iPhone would. Great for testing an environment when you are not really sure the phone should work (unsupported OS or patch level).

Hexillion.com - Good for looking up public records for DNS and such. They have a lot of options for how much data you want to see.

Telnet client - The fact that this has to be manually installed on Vista is a crime

Steps to send email via telnet - If you need to interact on the most basic level, without fear of spamfiltering or email clients muddying the water, this is a good place to start.
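
For reference, the conversation you type into a telnet session on port 25 looks roughly like what the Python sketch below produces. It only builds the lines the client sends; the server's numeric replies in between are not shown. The function name and the addresses in the example are placeholders.

```python
def smtp_dialogue(helo_host, sender, recipient, subject, body):
    """Build the client side of a minimal SMTP session, one command per line."""
    lines = [
        f"HELO {helo_host}",
        f"MAIL FROM:<{sender}>",
        f"RCPT TO:<{recipient}>",
        "DATA",
        f"From: {sender}",
        f"To: {recipient}",
        f"Subject: {subject}",
        "",  # blank line separates headers from the message body
    ]
    for line in body.splitlines():
        # A body line starting with "." gets an extra "." so it is not read as end-of-data.
        lines.append("." + line if line.startswith(".") else line)
    lines += [".", "QUIT"]  # a lone "." ends the message
    return lines


# Placeholder values; substitute your own hosts and addresses.
for line in smtp_dialogue("test.example.com", "me@example.com",
                          "user@example.org", "Telnet test", "Hello from telnet."):
    print(line)
```

Typing these one at a time and reading the reply code after each command is usually what tells you where the mail flow breaks.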

14 Oct 2009

Find hidden mailboxes in Exchange 2003

Before you can remove an Exchange server from your org, you need to get all the mailboxes off of it. I was working on an Exchange 2007 CCR migration and found that there was still a long-dead Exchange 2000 server in the org. The admin had tried to delete it, but it reported that there was still a mailbox on the server. He had checked every AD account for it to no avail. If you want to find out which AD accounts still have resources on a specific server:

  1. Start ADUC on the Exchange server, assuming it is running Windows 2003.
  2. Right click on your domain at the top and choose "Find".
  3. Click on the "Advanced" tab.
  4. Under "Field", select User, then "Exchange Home Server".
  5. Change the "Condition" from "Starts With" to "Ends With".
  6. In the "Value" field, type in the old Exchange server name and then click add to set the value.
  7. Click find to start a search.

You can then open that user account and clear out the Exchange settings that are holding you back.
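
The same search can be expressed as an LDAP filter, which helps if you would rather run it from a script or a query tool. The Python sketch below builds that filter. It assumes the "Exchange Home Server" field in the Find dialog corresponds to the `msExchHomeServerName` attribute, so verify that in your own directory before relying on it; the function names and the server name are made up for the example.

```python
def escape_ldap_value(value):
    """Escape characters that have special meaning inside an LDAP filter value."""
    replacements = {"\\": r"\5c", "*": r"\2a", "(": r"\28", ")": r"\29", "\0": r"\00"}
    return "".join(replacements.get(ch, ch) for ch in value)


def home_server_filter(server_name):
    """Filter for user accounts whose Exchange home server value ends with server_name."""
    safe = escape_ldap_value(server_name)
    # The leading * is the LDAP equivalent of the "Ends With" condition.
    # msExchHomeServerName is an assumed attribute mapping, see the note above.
    return ("(&(objectCategory=person)(objectClass=user)"
            f"(msExchHomeServerName=*{safe}))")


# OLDEXCH is a placeholder server name.
print(home_server_filter("OLDEXCH"))
```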
