Microsoft releases load simulation tools for desktops


Microsoft has released its Remote Desktop Load Simulation Tools, which have nothing to do with Remote Desktop in the RDP sense.  Instead, the tools are designed for 32-bit and 64-bit server capacity planning and performance/scalability analysis.  According to Microsoft:

In a server-based computing environment, all application execution and data processing occur on the server. Therefore it is extremely interesting to test the scalability and capacity of servers to determine how many client sessions a server can typically support under a variety of different scenarios. One of the most reliable ways to find out the number of users a server can support for a particular scenario is to log on a large number of users on the server simultaneously. The Remote Desktop Load Simulation tools provide the functionality which makes it possible to generate the required user load on the server.

Supported operating systems are:

  • Windows Server 2008
  • Windows Server 2008 Datacenter
  • Windows Server 2008 Datacenter without Hyper-V
  • Windows Server 2008 Enterprise
  • Windows Server 2008 Enterprise without Hyper-V
  • Windows Server 2008 for Itanium-based Systems
  • Windows Server 2008 R2
  • Windows Server 2008 R2 for Itanium-based Systems
  • Windows Server 2008 Service Pack 2
  • Windows Server 2008 Standard
  • Windows Server 2008 Standard without Hyper-V

(Notice the lack of Windows 2003 support?)

A minimal test environment requires:

  1. Target Remote Desktop Server
  2. Client Workstations
  3. Test Controller Host

Using Winsat.exe in Windows Server 2008 as a performance benchmarking tool


Microsoft has the Windows System Assessment Tool (Winsat) available for download that can assess a computer’s ability to run Windows Vista.  This tool provides a wealth of information on your hardware’s horsepower, plus it’s scriptable. It’s designed to run under Windows Vista, but can be run under Windows Server 2008 as well.  Here’s how to do it.

1. Download the Windows Vista Upgrade Advisor utility
 
2. Use Universal Extractor’s (uniextract) MSI method to extract the files from the .msi package
 
3. Copy winsat.exe to the c:\windows\system32 directory on the Windows 2008 server
 
4. Open an elevated command prompt and change to the c:\windows\system32 directory.  There are many different hardware components you can benchmark, but the following example benchmarks sequential reads on drive C:
 
winsat disk -seq -read -drive c
 
See the TechNet command reference for Winsat for details on all the tests winsat can perform, such as:


Assessment        Description
winsat dwm        Assesses the ability of a system to display the Aero desktop effects.
winsat d3d        Assesses the ability of a system to run Direct 3D applications, such as games.
winsat mem        Assesses system memory bandwidth by simulating large memory to memory buffer copies.
winsat disk       Assesses the performance of disk drives.
winsat cpu        Assesses the performance of the CPU(s).
winsat media      Assesses the performance of video encoding and decoding (playback) using the Direct Show framework.
winsat mfmedia    Assesses the performance of video decoding (playback) using the Media Foundation framework.
winsat features   Enumerates relevant system information.
winsat formal     Runs a set of pre-defined assessments and saves the data in an XML file in %systemroot%\performance\winsat\datastore.
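
Since winsat is scriptable, the assessments above can be driven from a script. What follows is only a minimal Python sketch, assuming winsat.exe is on the PATH and the script is started from an elevated prompt. The helper names build_winsat_command and run_winsat are hypothetical and are not part of winsat itself.

```python
import subprocess


def build_winsat_command(assessment, *options):
    """Return the argument list for one winsat assessment."""
    return ["winsat", assessment, *options]


def run_winsat(assessment, *options):
    """Run one assessment and return its console output (Windows only)."""
    result = subprocess.run(
        build_winsat_command(assessment, *options),
        capture_output=True,
        text=True,
        check=True,  # raise CalledProcessError if winsat reports a failure
    )
    return result.stdout
```

Calling run_winsat("disk", "-seq", "-read", "-drive", "c") would run the same sequential read test shown earlier and hand back whatever winsat prints, ready to be logged or compared between servers.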

Dell PowerEdge 1950 NIC teaming test results


I’ve completed testing of the NIC teaming on our new Dell PowerEdge 1950 servers.  I’m more than a little bit surprised by the results, which I’ll get to in a moment.  My initial assumptions were that the network adapters would perform in the following order, from best to worst performing: 

1)  Teamed Intel NICs
2)  Teamed Broadcom NICs
3)  Single Intel NIC
4)  Single Broadcom NIC
 
I tested each configuration by copying a large file or directory of files from server PO1 to server PO2.  Both servers booted from SAN and ran Windows 2003 with the latest Windows patches and updates from our Patchlink server. PO2 was cloned from PO1 after being sysprep’d.  The servers were configured identically, each plugged into the same module on the same HP ProCurve 5304xl switch.  The switch was configured with 802.3ad link aggregation.
 
The NICs that were tested were:
 
1 quad port Intel VT 1000 gigabit NIC PCI-X
2 integrated Broadcom NetXtreme II BCM5708C gigabit NICs
 
The files I used to test were:
  • OM_5.4.1_SUU_A00.iso, a 1.85 GB ISO image file
  • gw700.iso, a 689 MB ISO image file
  • A 2.23 GB directory of 509 text files, each averaging 5 MB in size

The methodology I used was:
  • Installed the NIC drivers and configured the team’s static IP, subnet mask, default gateway, and DNS on each server.  Default team settings were used, including TCP Offload Engine (TOE), Large Send Offload (LSO), and Checksum Offload (CO)
  • Disabled all unused NICs
  • Restarted both servers
  • Copied the first test file from PO1 to PO2 using the following syntax:
  copy filename \\po2\c$\temp\test\
  • Timed how many seconds it took to copy the file from PO1 to PO2
  • Deleted the copied file from PO2
  • Copied the test file again from PO1 to PO2 until 5 passes were completed
  • Repeated the process for the next test file(s)
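
The copy, time, delete loop above could be automated along these lines. This is only a sketch and not the procedure I actually ran: it uses Python's shutil.copy rather than the copy command, handles single files only (the directory test would need something like shutil.copytree), and the function name time_copy_passes is made up for illustration.

```python
import os
import shutil
import time


def time_copy_passes(source, dest_dir, passes=5):
    """Copy one file into dest_dir repeatedly; return the seconds per pass."""
    timings = []
    target = os.path.join(dest_dir, os.path.basename(source))
    for _ in range(passes):
        start = time.perf_counter()
        shutil.copy(source, target)
        timings.append(time.perf_counter() - start)
        os.remove(target)  # delete the copied file before the next pass
    return timings
```

For a test like the one described, dest_dir would be the UNC path on the second server (for example \\po2\c$\temp\test), and the returned list holds one timing per pass.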
The following configurations were tested:
 
1) Single Intel NIC to Single Intel NIC using driver dated 6/13/08
 
2) Single Broadcom NIC to Single Broadcom NIC using driver dated 2/21/08
 
3) Single Intel NIC using driver dated 6/13/08 to Single Broadcom NIC using driver dated 2/21/08
 
4) Teamed Intel NIC to Teamed Intel NIC using driver dated 6/13/08
 
5) Teamed Intel NIC using driver dated 8/23/07 to Teamed Intel NIC using driver dated 6/13/08
 
6) Teamed Intel NIC to Teamed Intel NIC using driver dated 8/23/07
 
7) Teamed Broadcom NIC to Teamed Broadcom NIC using driver dated 2/21/08
 
You can see the test results in the attached document, but to summarize:
 
1)  The teamed Intel NICs performed the worst – even worse than using single Intel NICs
2)  The single Broadcom NIC outperformed the single Intel NIC
3)  The teamed Broadcom NICs were the highest performing
 
I have no clue why the results are what they are.  In the past, I’ve experienced horrendous performance with the Broadcoms, and great performance from the Intels.  Does anyone have any idea as to why the teamed Intel NICs would perform so poorly?
 
The only real difference I could see was that when copying files, Windows Task Manager showed Network Utilization at ~51-56% for the Broadcom tests, and ~16-17% for the Intel tests.  Why, I’m not sure.
 
The data in the spreadsheet shows two figures for each test: the actual average, which is the mean number of seconds it took to copy a file over five passes, and an adjusted average.  The adjusted average is something I learned long ago in a stats class: as a best practice, disregard the lowest and highest values in your sample before averaging.  Either way you look at it, the findings are the same:  The Intel performance is horrible while the Broadcoms perform great.
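
To make the two figures concrete, here is a small Python sketch of both calculations. The sample timings are made up for illustration and are not taken from the spreadsheet, and the function names are my own.

```python
def actual_average(times):
    """Plain mean of all passes."""
    return sum(times) / len(times)


def adjusted_average(times):
    """Mean after discarding the single lowest and single highest value."""
    if len(times) < 3:
        raise ValueError("need at least three samples to drop two")
    trimmed = sorted(times)[1:-1]
    return sum(trimmed) / len(trimmed)


# Hypothetical copy times in seconds for five passes (not real results).
sample = [60.0, 62.0, 61.0, 90.0, 59.0]
print(actual_average(sample))    # 66.4
print(adjusted_average(sample))  # 61.0
```

One slow pass (90 seconds here) pulls the actual average up noticeably, while the adjusted average ignores it. That is the reason for reporting both.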
 
Based upon these tests I’m going to recommend going with the teamed Broadcom NICs in the new server deployment.

Hacking ntbackup.exe and bkprunner.exe for better performance in Windows Server 2003 SP1


I was perusing Susan’s blog today and came across her link to Chris’s great description of how to modify the bkprunner.exe process to improve backup performance in Windows Server 2003 SP1.

Chris details how to use a hex editor and modify the registry to increase ntbackup performance. He also mentions that you can now use the /FU switch with SP1’s ntbackup, which according to KB 814583 enables a “file unbuffered” setting to bypass the cache manager. This change provides a number of benefits during the disk-to-disk backup process:

  • Sustainable throughput over time
  • Reduction in processor utilization: on average, peak utilization is reduced to 30 percent
  • Elimination of impacts to the system process during the backup job

If for some reason you are running a pre-Service Pack 1 version of Windows Server 2003, KB 839272 describes a hotfix version of ntbackup.exe you can download that supports the /FU switch.

Firefox 2.x and excessive memory consumption


I usually have a lot of tabs open in Firefox while I work. I’ve noticed (as have many others) excessive memory consumption by the browser at times. Right now firefox.exe is using 121,484K with only 9 tabs open. I rebooted first thing this morning, and I’ve had Firefox running for only about four hours.

Most of the information I’ve found says the problem has to do with misbehaving Add-ons, extensions or themes. I’m only running four Add-ons, and decided to uninstall all but my del.icio.us Buttons and Google Browser Sync. Unfortunately, the problem persisted even after a reboot.

I did some more searching and found this thread that suggests loading this image to see if your browser memory consumption goes through the roof. Mine did. I read further down the thread and found a problematic add-on is indeed Google Browser Sync.

Check out this list of problematic extensions to see if any of your favorite add-ons are listed. If none of your extensions are listed, try the suggestions found on the Standard Diagnostic for Firefox and the causes for Firefox Hangs.

You can also try the Leak Monitor extension to help determine what is the cause of your Firefox memory leak. PCtipsbox.com has four tips on handling Firefox memory usage as well.

Identifying processes running as svchost.exe


I have previously written about experiences with systems becoming unresponsive and reporting svchost.exe utilizing 99% of the CPU. Since so many different DLLs run under this generic host process, identifying exactly which program is the cause of the high CPU usage is often difficult.

According to KB314056, the Svchost.exe file is located in the %SystemRoot%\System32 folder. At startup, Svchost.exe checks the services part of the registry to construct a list of services that it must load. Multiple instances of Svchost.exe can run at the same time. Each Svchost.exe session can contain a grouping of services. Therefore, separate services can run, depending on how and where Svchost.exe is started. This grouping of services permits better control and easier debugging.

Svchost.exe service groups are listed in the following registry key:

HKEY_LOCAL_MACHINE\Software\Microsoft\Windows NT\CurrentVersion\Svchost

Windows XP Pro has a built-in command-line utility called tasklist.exe that provides information useful in tracking down the offending programs.

Running tasklist.exe with no switches lists the running processes with their PIDs, session names and memory usage. Notice that svchost.exe appears under PIDs 1448, 1508, and 1856.

[Screenshot: tasklist.jpg]

To determine which PID is running which service, run

tasklist.exe /SVC

[Screenshot: tasklistsvc.jpg]

Notice the additional column showing which services run inside each process instance.
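
If you would rather process this mapping in a script, tasklist can also emit CSV via its /FO CSV switch. The Python sketch below parses that output. It assumes the English column headers "Image Name", "PID" and "Services", which may differ on localized systems. The sample text reuses two PIDs from above, but the service names in it are made up, and the function name services_by_pid is hypothetical.

```python
import csv
import io

# Hypothetical sample of `tasklist /SVC /FO CSV` output (not real data).
SAMPLE_OUTPUT = '''"Image Name","PID","Services"
"svchost.exe","1448","DcomLaunch,TermService"
"svchost.exe","1508","RpcSs"
"explorer.exe","2020","N/A"
'''


def services_by_pid(csv_text, image_name="svchost.exe"):
    """Map each PID of the given image name to its list of service names."""
    mapping = {}
    for row in csv.DictReader(io.StringIO(csv_text)):
        if row["Image Name"].lower() == image_name.lower():
            names = [name.strip() for name in row["Services"].split(",")]
            mapping[int(row["PID"])] = names
    return mapping


print(services_by_pid(SAMPLE_OUTPUT))
```

On a live system the text would come from capturing the output of tasklist /SVC /FO CSV instead of the sample string.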

You can list services and applications on a remote system by running

tasklist.exe /s remoteIPaddress

or

tasklist.exe /s remoteComputerName

[Screenshot: tasklists.jpg]

If you want even more detail about the running processes and applications, type:

tasklist /M

This will show which DLLs are in use by each process.

If you want to isolate a service that shares an svchost.exe instance with other services, My Green Paste has a nice post on doing this via the registry.

Once you’ve isolated the process that is causing the excessive resource utilization, use taskkill.exe to kill it. You may need to specify the /F switch to force the process to terminate.

[Screenshot: taskkill.jpg]
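
If you end up scripting the cleanup, the taskkill call can be assembled like this. It is a minimal Python sketch that assumes you already know the PID to terminate, and build_taskkill_command is a hypothetical helper name rather than anything shipped with Windows.

```python
import subprocess


def build_taskkill_command(pid, force=False):
    """Return the taskkill argument list for one PID."""
    command = ["taskkill", "/PID", str(pid)]
    if force:
        command.append("/F")  # force termination, as described above
    return command


def kill_process(pid, force=False):
    """Run taskkill for the given PID (Windows only)."""
    subprocess.run(build_taskkill_command(pid, force), check=True)
```

For example, build_taskkill_command(1448, force=True) produces the arguments for taskkill /PID 1448 /F.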

Obviously, killing the wrong process can crash your machine, and editing the registry can make it unbootable, so make sure you have a recent backup before making changes.

[edited 01-24-2008]

Ask the Performance Team has published a new post on svchost.exe with some really detailed information. I think the methods of creating isolated processes and isolated service groups would be most helpful in troubleshooting performance and bottleneck issues.