Fixing Windows Crashes
with Debugger
I stumbled upon a programmer's tool that anyone can use to help
troubleshoot Windows "Blue Screen of Death" errors. It's not that
hard to set up, and I believe it's worth the time and effort to
learn how to use it.
When Windows crashes, you can have it capture what was in memory
and write it to a dump file; the smallest kind is called a minidump.
It's this file that we can analyze with the debugger tool to see
what was going on when Windows crashed.
First you'll need to set Windows to capture crashes and write the
memory content to a minidump file.
To Set Windows to Capture a Minidump (aka Memory Dump)
Source: http://support.microsoft.com/kb/314103
The default type of memory dump is the small memory dump. To change
or view the settings for the type of memory dump, follow these
steps:
1. Click Start, and then click Control Panel.
2. In Control Panel, double-click System, and then click the
Advanced tab.
3. Click Settings in the Startup and Recovery area.
4. View or change the type of memory dump under Write debugging
information
By default, saving STOP message information to a file is enabled in
Windows XP. Three types of memory dumps are available:
1. A small memory dump (64 kilobytes), which is written to the %SystemRoot%\Minidump
folder. The paging file on the boot volume must be at least 2
megabytes (MB) in size.
2. A kernel memory dump, which is written to the %SystemRoot%\Memory.dmp
file. The paging file on the boot volume must be at least 50 to
800 MB in size, depending on the amount of RAM.
3. A complete memory dump, which is written to the %SystemRoot%\Memory.dmp
file. The paging file on the boot volume must be large enough to
hold all of the physical RAM plus 1 MB.
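The paging-file figures in this list can be turned into a quick check. The following is only a rough sketch: the function name and the dump-type labels are made up for illustration, and for the kernel dump it simply returns the 50 to 800 MB range quoted in the list, because the exact figure depends on the amount of RAM.

```python
def min_pagefile_mb(dump_type, ram_mb):
    """Return (low, high) minimum paging-file size in MB for a dump type.

    The figures come from the list above; the names are illustrative only.
    """
    if dump_type == "small":
        return (2, 2)
    if dump_type == "kernel":
        # Depends on installed RAM; only a range is given.
        return (50, 800)
    if dump_type == "complete":
        # All of the physical RAM plus 1 MB.
        return (ram_mb + 1, ram_mb + 1)
    raise ValueError("unknown dump type: %r" % dump_type)


if __name__ == "__main__":
    for kind in ("small", "kernel", "complete"):
        print(kind, min_pagefile_mb(kind, ram_mb=512))
```

For a PC with 512 MB of RAM, a complete dump would need a paging file of at least 513 MB on the boot volume.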
Get the debugger
The debugger is free and available from Microsoft's Web site (http://www.microsoft.com/whdc/devtools/debugging/default.mspx).
At the site, scroll down until you see the heading "Installing
Debugging Tools for Windows." Select the link "Install 32-bit
version…" and then select the most recent non-beta version and
install it. The most recent versions are downloads of about 12 MB.
You can do the installation on a PC without restarting it.
The debugger also needs what's called a "symbols" file.
http://www.microsoft.com/whdc/DevTools/Debugging/symbolpkg.mspx
Download the appropriate package for your OS and install.
I also found another, simpler debugger
program here:
http://windowsbbs.com/debugwiz.zip
Using this debugger
Start, Run, CMD
dumpwiz
then Browse to where your dump file is located.
Click 'Generate Log'
Exit
Your debugging log can be found and opened in Notepad at the
location C:\debuglog.txt
Loading the dump file
To open the dump file that you want to analyze in WinDbg (the
Microsoft debugger installed earlier), select File | Open
Crash Dump. You'll be asked if you want to save workspace
information. Click Yes if you want it to remember where the dump
file is. WinDbg then looks for the Windows symbol files: it
references the symbol file path, accesses microsoft.com, and
displays the results. Close the Disassembly window so you are
working in the Command window.
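If there are several dump files and you are not sure which one to open, a small script can list the newest ones. This is a minimal sketch that assumes the small dumps are in the %SystemRoot%\Minidump folder mentioned earlier; the function name is made up, and C:\Windows is only a fallback guess for when the SystemRoot variable is not set.

```python
import os
from pathlib import Path


def newest_dumps(folder, limit=5):
    """Return up to `limit` .dmp files in `folder`, newest first."""
    dumps = [p for p in Path(folder).glob("*.dmp") if p.is_file()]
    dumps.sort(key=lambda p: p.stat().st_mtime, reverse=True)
    return dumps[:limit]


if __name__ == "__main__":
    # C:\Windows is an assumed fallback, not something Windows guarantees.
    root = os.environ.get("SystemRoot", r"C:\Windows")
    for dump in newest_dumps(Path(root) / "Minidump"):
        print(dump)
```

The first file printed is the most recent crash, which is usually the one you want to load.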
Debugger commands
With the dump file loaded into WinDbg, it's time to ask for some
diagnostic information. While there are loads of commands to use,
two commands are all you need:
!analyze -v and lmv.
!analyze -v displays information describing the state of a system
when it crashed, the fault encountered, and who is the primary
suspect.
lmv displays a list of drivers and their path, version and vendor
information. It often includes a product description.
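Both commands can also be run without the WinDbg window. Debugging Tools for Windows includes console debuggers such as kd.exe, which accept -z for a dump file, -y for a symbol path, and -c for commands to run at startup. The sketch below only builds such a command line. It assumes kd.exe is on the PATH and uses Microsoft's public symbol server; the C:\symbols cache folder and the dump path are placeholders, and the function name is made up.

```python
# Assumed symbol path: local cache folder plus Microsoft's public symbol server.
SYMBOL_PATH = r"srv*C:\symbols*https://msdl.microsoft.com/download/symbols"


def build_kd_command(dump_path, debugger="kd.exe", symbol_path=SYMBOL_PATH):
    """Build the argument list that runs !analyze -v and lmv on a dump."""
    return [
        debugger,
        "-z", dump_path,              # open this crash dump
        "-y", symbol_path,            # where to look for symbol files
        "-c", "!analyze -v; lmv; q",  # run both commands, then quit
    ]


if __name__ == "__main__":
    # Placeholder dump path; substitute a real file from your own PC.
    cmd = build_kd_command(r"C:\Windows\Minidump\example.dmp")
    print(" ".join(cmd))
```

On a PC with the tools installed, the resulting list could be passed to subprocess.run and the output saved to a text file for later reading.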
You pronounce the first command: "bang analyze dash vee."
Analysis with !analyze -v
Type !analyze -v on the command line at the bottom of the Command
window. The explanation it gives is a combination of English and
programming language, but it is nonetheless a great start. In fact,
in many cases you may not need to go any further. If you recognize
the cause of the crash, you're probably done.
Here's an example. After typing !analyze -v, we receive the
following output:
kd> !analyze -v
KERNEL_MODE_EXCEPTION_NOT_HANDLED (8e)
(This is a very common bugcheck. Usually the exception address
pinpoints the driver/function that caused the problem. Always note
this address as well as the link date of the driver/image that
contains this address.)
Arguments:
Arg1: c0000005, The exception code that was not handled
Arg2: bf9bc4bd, The address that the exception occurred at
Arg3: f69f02bc, Trap Frame
Arg4: 00000000
Debugging Details:
------------------
EXCEPTION_CODE: c0000005
FAULTING_IP:
vdriver+44bd
bf9bc4bd 8b4014 mov eax,[eax+0x14]
TRAP_FRAME: f69f02bc -- (.trap fffffffff69f02bc)
ErrCode = 00000000
eax=00000000 ebx=01740000 ecx=010886a0 edx=f69f069c esi=fa07d400 edi=e161f7f8
eip=bf9bc4bd esp=f69f0330 ebp=f69f0344 iopl=0 nv up ei pl nz na pe nc
cs=0008 ss=0010 ds=0023 es=0023 fs=0030 gs=0000 efl=00010202
vdriver+44bd:
bf9bc4bd 8b4014 mov eax,[eax+0x14] ds:0023:00000014=????????
DEFAULT_BUCKET_ID:
DRIVER_FAULT
BUGCHECK_STR: 0x8E
LAST_CONTROL_TRANSFER: from bf9ba5cf to bf9bc4bd
STACK_TEXT:
f69f0344 bf9ba5cf e161f7f8 e17f8e30 e21e4530 vdriver+0x44bd
f69f06b0 f69f06e0 e2638678 f69f06e4 f69f0890 vdriver+0x25cf
e1bd6b90 1f0507b6 00000000 e1622008 00000010 0xf69f06e0
00000000 00000000 00000000 00000000 00000000 0x1f0507b6
f69f0bf0 805766ef f69f0c78 f69f0c7c f69f0c8c nt!KiCallUserMode+0x4
f69f0c4c bf8733cd 00000002 f69f0c9c 00000018 nt!KeUserModeCallback+0x87
f69f0ccc bf8722a5 bc667998 0000000f 00000000 win32k!SfnDWORD+0xa0
f69f0d0c bf873b38 7196bc2d f69f0d64 00affed0 win32k!xxxDispatchMessage+0x1c0
f69f0d58 805283c1 00afff2c 804d2d30 ffffffff win32k!NtUserDispatchMessage+0x39
f69f0d58 7ffe0304 00afff2c 804d2d30 ffffffff nt!KiSystemService+0xc4
00afff08 00000000 00000000 00000000 00000000 SharedUserData!SystemCallStub+0x4
FOLLOWUP_IP:
vdriver+44bd
bf9bc4bd 8b4014 mov eax,[eax+0x14]
FOLLOWUP_NAME: MachineOwner
SYMBOL_NAME: vdriver+44bd
MODULE_NAME: vdriver
IMAGE_NAME: vdriver.dll
BUCKET_ID: 0x8E_vdriver+44bd
Look for a section labeled "Debugging Details." Then, scan down
until you find DEFAULT_BUCKET_ID:. This provides the general
category of the failure. It shows DRIVER_FAULT, indicating that a
driver is the likely culprit. Scanning further down to IMAGE_NAME,
we see vdriver.dll. We have a suspect!
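If you save the !analyze -v output to a text file, the same scan can be done by a script. The following is a minimal sketch rather than a real parser: the function name and the choice of fields are my own, and it simply takes the first word that follows each label, whether it sits on the same line or on the next one, as DEFAULT_BUCKET_ID does in the example.

```python
import re

FIELDS = ("DEFAULT_BUCKET_ID", "BUGCHECK_STR", "MODULE_NAME", "IMAGE_NAME")


def summarize_analysis(text):
    """Pick a few labelled fields out of saved !analyze -v output."""
    summary = {}
    for name in FIELDS:
        # \s* also matches a line break, so a value on the next line is found.
        match = re.search(r"^%s:\s*(\S+)" % name, text, re.MULTILINE)
        if match:
            summary[name] = match.group(1)
    return summary


# A few lines copied from the example output above.
SAMPLE = """\
DEFAULT_BUCKET_ID:
DRIVER_FAULT
BUGCHECK_STR: 0x8E
MODULE_NAME: vdriver
IMAGE_NAME: vdriver.dll
"""

if __name__ == "__main__":
    for key, value in summarize_analysis(SAMPLE).items():
        print(key, "=", value)
```

Because it takes whatever word comes next, a label with no value at all would pick up the following label instead, so treat the result as a hint and check it against the full output.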
Analysis with lmv
The next step is to confirm the suspect's existence and find any
details about it. Typing lm in the command line displays the loaded
modules; v instructs the debugger to output in verbose (detail)
mode, showing all known details for the modules. This is a lot of
information. Locating the driver of interest can take a while, so
simplify the process by selecting Edit | Find.
Here's an example of output generated by the lmv command:
kd> lmv
bf9b8000 bfa0dc00 VDriver (no symbolic information)
Loaded symbol image file: VDriver.dll
Image path: \SystemRoot\System32\VDriver.dll
Checksum: 00058BD5 Timestamp: Fri Sep 28 10:12:47 2001 (3BB4855F)
File version: 5.20.10.1066
Product version: 5.20.10.1066
File flags: 8 (Mask 3F) Private
File OS: 40004 NT Win32
File type: 3.4 Driver
File date: 00000000.00000000
CompanyName: Video Technologies Inc.
ProductName: VDisplay Driver for Windows XP
InternalName: VDriver.dll
OriginalFilename: VDriver.dll
ProductVersion: 5.20.10.1066
FileVersion: 5.20.10.1066
FileDescription: Video Display Driver
LegalCopyright: Copyright© Video Technologies Inc. 2000-2004
Support: (800) 555-1212
Use Edit | Find to locate the suspect driver. If the vendor was
thorough, complete driver/vendor detail is revealed.
The amount of information you see depends upon the driver vendor.
Some vendors put little information in their files; others, such as
Veritas, put in everything from the company name to a support
telephone number! If a vendor is thorough, the results from the
command will be similar to those shown here.
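When the lmv output is saved as text, the vendor details for a module can be pulled out with a few lines of code. This is just a sketch under the assumption that each detail appears as "Name: value" on its own line, as in the example above; the function name and the default list of fields are made up for illustration.

```python
WANTED = ("CompanyName", "ProductName", "FileVersion", "FileDescription")


def module_details(lmv_text, wanted=WANTED):
    """Collect selected "Name: value" lines from one module's lmv output."""
    details = {}
    for line in lmv_text.splitlines():
        key, sep, value = line.strip().partition(":")
        if sep and key in wanted:
            details[key] = value.strip()
    return details


# A few lines copied from the example lmv output above.
SAMPLE = """\
File version: 5.20.10.1066
CompanyName: Video Technologies Inc.
ProductName: VDisplay Driver for Windows XP
FileVersion: 5.20.10.1066
FileDescription: Video Display Driver
"""

if __name__ == "__main__":
    for key, value in module_details(SAMPLE).items():
        print(key, "=", value)
```

Fields that the vendor left out are simply missing from the result, which is itself a clue about how thorough the vendor was.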
After you find the vendor's name, go to its Web site and check for
updates, knowledge base articles, and other supporting information.
Inconsistent answers
If you have recurring crashes but no clear or consistent reason, it
may be a memory problem. Download the free test tool, Memtest86.
This simple diagnostic tool is quick and works great. Many people
discount the possibility of a memory problem because memory faults
account for such a small percentage of system crashes. However, they
are often the cause that keeps you guessing the longest.
The operating system is the culprit
Not likely! As surprising as it may seem, the operating system is
rarely at fault. If ntoskrnl.exe (the Windows core) or win32k.sys (the
driver most responsible for the "GUI" layer on Windows) is
named as the culprit, as they often are, don't be too quick to
accept it. It is far more likely that some errant third-party device
driver called upon a Windows component to perform an operation and
passed a bad instruction, such as telling it to write to
non-existent memory. So, while the operating system certainly can
err, exhaust all other possibilities before you call Microsoft! The
same goes for debugging Unix, Linux, and NetWare.
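One way to look past the operating system is to read the STACK_TEXT section and note which non-Windows modules appear in it. The sketch below is a rough heuristic of my own, not a feature of the debugger: it treats the last word on each stack line as the symbol, and its list of operating system modules contains only the names seen in the example trace (nt is how the debugger labels ntoskrnl.exe).

```python
# Assumed list: only the OS module names that appear in the example trace.
OS_MODULES = {"nt", "win32k", "shareduserdata"}


def third_party_modules(stack_text):
    """List non-OS modules named in STACK_TEXT, in order of appearance."""
    found = []
    for line in stack_text.splitlines():
        tokens = line.split()
        if not tokens:
            continue
        symbol = tokens[-1]
        if "!" in symbol:
            module = symbol.split("!", 1)[0]
        elif "+" in symbol:
            module = symbol.split("+", 1)[0]
        else:
            continue  # a raw address with no module name
        if module.lower() not in OS_MODULES and module not in found:
            found.append(module)
    return found


# A few stack lines copied from the example output earlier in the article.
SAMPLE = """\
f69f0344 bf9ba5cf e161f7f8 e17f8e30 e21e4530 vdriver+0x44bd
e1bd6b90 1f0507b6 00000000 e1622008 00000010 0xf69f06e0
f69f0bf0 805766ef f69f0c78 f69f0c7c f69f0c8c nt!KiCallUserMode+0x4
f69f0ccc bf8722a5 bc667998 0000000f 00000000 win32k!SfnDWORD+0xa0
"""

if __name__ == "__main__":
    print(third_party_modules(SAMPLE))
```

Any module this turns up is only a lead to investigate with lmv, not proof of guilt.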