Hello everyone, I'm Ray and I'm here to talk a bit about a dump I recently looked at and a little-referenced memory counter called ResAvail Pages (resident available pages).
The problem statement was: The server hangs after a while.
Not terribly informative, but that's where we start with many cases. First some good housekeeping:
0: kd> vertarget
Windows 7 Kernel Version 7601 (Service Pack 1) MP (2 procs) Free x64
Product: Server, suite: TerminalServer SingleUserTS
Built by: 7601.18113.amd64fre.win7sp1_gdr.130318-1533
Machine Name: "ASDFASDF1234"
Kernel base = 0xfffff800`01665000 PsLoadedModuleList = 0xfffff800`018a8670
Debug session time: Thu Aug 8 09:39:26.992 2013 (UTC - 4:00)
System Uptime: 9 days 1:08:39.307
Of course Windows 7 Server == Server 2008 R2.
One of the basic things I check at the beginning of these hang dumps with vague problem statements is the memory information.
0: kd> !vm 21
*** Virtual Memory Usage ***
Physical Memory: 2097038 ( 8388152 Kb)
Page File: \??\C:\pagefile.sys
Current: 12582912 Kb Free Space: 12539700 Kb
Minimum: 12582912 Kb Maximum: 12582912 Kb
Available Pages: 286693 ( 1146772 Kb)
ResAvail Pages: 135 ( 540 Kb)
********** Running out of physical memory **********
Locked IO Pages: 0 ( 0 Kb)
Free System PTEs: 33526408 ( 134105632 Kb)
******* 12 system cache map requests have failed ******
Modified Pages: 4017 ( 16068 Kb)
Modified PF Pages: 4017 ( 16068 Kb)
NonPagedPool Usage: 113241 ( 452964 Kb)
NonPagedPool Max: 1561592 ( 6246368 Kb)
PagedPool 0 Usage: 35325 ( 141300 Kb)
PagedPool 1 Usage: 28162 ( 112648 Kb)
PagedPool 2 Usage: 24351 ( 97404 Kb)
PagedPool 3 Usage: 24350 ( 97400 Kb)
PagedPool 4 Usage: 24516 ( 98064 Kb)
PagedPool Usage: 136704 ( 546816 Kb)
PagedPool Maximum: 33554432 ( 134217728 Kb)
********** 222 pool allocations have failed **********
Session Commit: 6013 ( 24052 Kb)
Shared Commit: 6150 ( 24600 Kb)
Special Pool: 0 ( 0 Kb)
Shared Process: 1214088 ( 4856352 Kb)
Pages For MDLs: 67 ( 268 Kb)
PagedPool Commit: 136768 ( 547072 Kb)
Driver Commit: 15548 ( 62192 Kb)
Committed pages: 1648790 ( 6595160 Kb)
Commit limit: 5242301 ( 20969204 Kb)
So we're failing to allocate pool, but we aren't out of virtual memory for paged pool or nonpaged pool. Let's look at the breakdown:
0: kd> dd nt!MmPoolFailures l?9
fffff800`01892160  000001be 00000000 00000000 00000002
fffff800`01892170  00000000 00000000 00000000 00000000
fffff800`01892180  00000000
Where:
DWORDs 1-3 = Nonpaged high/medium/low priority failures
DWORDs 4-6 = Paged high/medium/low priority failures
DWORDs 7-9 = Session paged high/medium/low priority failures
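These high/medium/low buckets correspond to the EX_POOL_PRIORITY a driver specifies when it allocates pool. As a rough, hypothetical illustration (not code from this dump), a low-priority nonpaged allocation looks like the following, and a failed attempt bumps the matching failure counter:

#include <ntddk.h>

// Hypothetical driver code for illustration only.
// A low-priority request tells the memory manager this allocation may
// be failed first under memory pressure; when it does fail, the
// corresponding nonpaged low-priority failure counter is incremented.
PVOID AllocateScratchBuffer(SIZE_T Bytes)
{
    PVOID buffer = ExAllocatePoolWithTagPriority(NonPagedPool,
                                                 Bytes,
                                                 'crcS',          // example pool tag
                                                 LowPoolPriority);
    if (buffer == NULL) {
        return NULL;    // caller must be able to handle allocation failure
    }

    RtlZeroMemory(buffer, Bytes);
    return buffer;
}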
So we actually failed both nonpaged AND paged pool allocations in this case. Why? Because we're "Running out of physical memory", obviously. So where does that message come from? In the output above it is triggered by the ResAvail Pages counter.
ResAvail Pages is the amount of physical memory there would be if every working set were trimmed to its minimum size and only what must be resident in RAM was present (e.g. the PFN database, system PTEs, driver images, kernel thread stacks, nonpaged pool, etc.).
Where did this memory go, then? We have plenty of Available Pages (Free + Zero + Standby) for use. So something is claiming memory it isn't actually using. In this type of situation, one of the things I immediately suspect is process working set minimums. A process's working set is essentially the physical memory it is currently using, i.e. its pages that are resident in RAM.
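As a side note, these are the same numbers a process can read about itself from user mode. A minimal sketch (illustration only, not part of the debugging session), assuming you link against psapi.lib:

#include <windows.h>
#include <psapi.h>
#include <stdio.h>

int main(void)
{
    PROCESS_MEMORY_COUNTERS pmc = { sizeof(pmc) };
    SIZE_T minWs = 0, maxWs = 0;

    // Current working set (the "now" value that !process reports).
    if (GetProcessMemoryInfo(GetCurrentProcess(), &pmc, sizeof(pmc)))
        printf("Working set now:     %Iu KB\n", pmc.WorkingSetSize / 1024);

    // The minimum/maximum working set sizes the memory manager tracks.
    if (GetProcessWorkingSetSize(GetCurrentProcess(), &minWs, &maxWs))
        printf("Working set min/max: %Iu KB / %Iu KB\n", minWs / 1024, maxWs / 1024);

    return 0;
}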
So let's check.
0: kd> !process 0 1
<a lot of processes in this output>.
PROCESS fffffa8008f76060
SessionId: 0 Cid: 0adc Peb: 7fffffda000 ParentCid: 0678
DirBase: 204ac9000 ObjectTable: 00000000 HandleCount: 0.
Image: cscript.exe
VadRoot 0000000000000000 Vads 0 Clone 0 Private 1. Modified 3. Locked 0.
DeviceMap fffff8a000008a70
Token fffff8a0046f9c50
ElapsedTime 9 Days 01:08:00.134
UserTime 00:00:00.000
KernelTime 00:00:00.015
QuotaPoolUsage[PagedPool] 0
QuotaPoolUsage[NonPagedPool] 0
Working Set Sizes (now,min,max) (5, 50, 345) (20KB, 200KB, 1380KB)
PeakWorkingSetSize 1454
VirtualSize 65 Mb
PeakVirtualSize 84 Mb
PageFaultCount 1628
MemoryPriority BACKGROUND
BasePriority 8
CommitCharge 0
I have only shown one example process above for brevity's sake, but there were thousands returned: 241,423 of them, to be precise. None had an abnormally high working set minimum, but with that many processes even normal minimums add up.
The "now" process working set is lower than the minimum working set. How is that possible? The minimum and maximum are not hard limits, but suggested limits. For example, the minimum working set is honored unless there is memory pressure, in which case a process can be trimmed below it. There is a way to make the minimum and/or maximum a hard limit for a specific process by passing the QUOTA_LIMITS_HARDWS_MIN_ENABLE and/or QUOTA_LIMITS_HARDWS_MAX_ENABLE flags to SetProcessWorkingSetSizeEx.
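A minimal user-mode sketch of that call (the sizes here are made up for illustration and are not taken from this case):

#include <windows.h>
#include <stdio.h>

int main(void)
{
    SIZE_T minWs = 4  * 1024 * 1024;    // 4 MB  (example value)
    SIZE_T maxWs = 32 * 1024 * 1024;    // 32 MB (example value)

    // QUOTA_LIMITS_HARDWS_MIN_ENABLE makes the minimum a hard limit:
    // the memory manager will not trim this process below minWs even
    // under memory pressure. The maximum stays a soft limit here.
    if (!SetProcessWorkingSetSizeEx(GetCurrentProcess(),
                                    minWs, maxWs,
                                    QUOTA_LIMITS_HARDWS_MIN_ENABLE |
                                    QUOTA_LIMITS_HARDWS_MAX_DISABLE))
    {
        printf("SetProcessWorkingSetSizeEx failed: %lu\n", GetLastError());
        return 1;
    }

    printf("Hard working set minimum applied.\n");
    return 0;
}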
You can check whether hard minimum and maximum working set limits are configured by looking at the _EPROCESS->Vm->Flags structure. Note that the numbers below are from another system, as this structure had already been torn down for the processes we were looking at.
0: kd> dt _EPROCESS fffffa8008f76060 Vm
nt!_EPROCESS
+0x398 Vm : _MMSUPPORT
0: kd> dt _MMSUPPORT fffffa8008f76060+0x398
nt!_MMSUPPORT
+0x000 WorkingSetMutex : _EX_PUSH_LOCK
+0x008 ExitGate : 0xfffff880`00961000 _KGATE
+0x010 AccessLog : (null)
+0x018 WorkingSetExpansionLinks : _LIST_ENTRY [ 0x00000000`00000000 - 0xfffffa80`08f3c410 ]
+0x028 AgeDistribution : [7] 0
+0x044 MinimumWorkingSetSize : 0x32
+0x048 WorkingSetSize : 5
+0x04c WorkingSetPrivateSize : 5
+0x050 MaximumWorkingSetSize : 0x159
+0x054 ChargedWslePages : 0
+0x058 ActualWslePages : 0
+0x05c WorkingSetSizeOverhead : 0
+0x060 PeakWorkingSetSize : 0x5ae
+0x064 HardFaultCount : 0x41
+0x068 VmWorkingSetList : 0xfffff700`01080000 _MMWSL
+0x070 NextPageColor : 0x2dac
+0x072 LastTrimStamp : 0
+0x074 PageFaultCount : 0x65c
+0x078 RepurposeCount : 0x1e1
+0x07c Spare : [2] 0
+0x084 Flags : _MMSUPPORT_FLAGS
0: kd> dt _MMSUPPORT_FLAGS fffffa8008f76060+0x398+0x84
nt!_MMSUPPORT_FLAGS
+0x000 WorkingSetType : 0y000
+0x000 ModwriterAttached : 0y0
+0x000 TrimHard : 0y0
+0x000 MaximumWorkingSetHard : 0y0
+0x000 ForceTrim : 0y0
+0x000 MinimumWorkingSetHard : 0y0
+0x001 SessionMaster : 0y0
+0x001 TrimmerState : 0y00
+0x001 Reserved : 0y0
+0x001 PageStealers : 0y0000
+0x002 MemoryPriority : 0y00000000 (0)
+0x003 WsleDeleted : 0y1
+0x003 VmExiting : 0y1
+0x003 ExpansionFailed : 0y0
+0x003 Available : 0y00000 (0)
How about some more detail?
0: kd> !process fffffa8008f76060
PROCESS fffffa8008f76060
SessionId: 0 Cid: 0adc Peb: 7fffffda000 ParentCid: 0678
DirBase: 204ac9000 ObjectTable: 00000000 HandleCount: 0.
Image: cscript.exe
VadRoot 0000000000000000 Vads 0 Clone 0 Private 1. Modified 3. Locked 0.
DeviceMap fffff8a000008a70
Token fffff8a0046f9c50
ElapsedTime 9 Days 01:08:00.134
UserTime 00:00:00.000
KernelTime 00:00:00.015
QuotaPoolUsage[PagedPool] 0
QuotaPoolUsage[NonPagedPool] 0
Working Set Sizes (now,min,max) (5, 50, 345) (20KB, 200KB, 1380KB)
PeakWorkingSetSize 1454
VirtualSize 65 Mb
PeakVirtualSize 84 Mb
PageFaultCount 1628
MemoryPriority BACKGROUND
BasePriority 8
CommitCharge 0
No active threads
0: kd> !object fffffa8008f76060
Object: fffffa8008f76060 Type: (fffffa8006cccc90) Process
ObjectHeader: fffffa8008f76030 (new version)
HandleCount: 0 PointerCount: 1
This output shows that the process has no active threads left, but the process object itself (and its 20KB of working set) was still hanging around because a kernel driver had taken a reference on the object that it never released. Sampling other entries shows the server had been leaking process objects like this ever since it booted.
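To picture the kind of bug involved, here is a hypothetical kernel-mode sketch (not the actual offending driver): PsLookupProcessByProcessId returns a referenced EPROCESS pointer, and skipping the matching ObDereferenceObject leaves the process object pinned exactly as we saw above.

#include <ntddk.h>

// Hypothetical driver code for illustration only.
NTSTATUS InspectProcess(HANDLE ProcessId)
{
    PEPROCESS process;
    NTSTATUS status = PsLookupProcessByProcessId(ProcessId, &process);
    if (!NT_SUCCESS(status)) {
        return status;
    }

    // ... work with the process object ...

    // Required cleanup: PsLookupProcessByProcessId took a reference on
    // the object. Forgetting this call is the pointer leak described
    // above; each missed dereference keeps one process object (and its
    // working set charge) alive forever.
    ObDereferenceObject(process);
    return STATUS_SUCCESS;
}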
Unfortunately trying to directly track down pointer leaks on process objects is difficult and requires an instrumented kernel, so we tried to check the easy stuff first before going that route. We know it has to be a kernel driver doing this (since it is a pointer and not a handle leak) so we looked at the list of 3rd party drivers installed. Note: The driver names have been redacted.
0: kd> lm
start end module name
<snip>
fffff880`04112000 fffff880`04121e00 driver1 (no symbols) <-- no symbols usually means 3rd party
fffff880`04158000 fffff880`041a4c00 driver2 (no symbols)
<snip>
0: kd> lmvm driver1
Browse full module list
start end module name
fffff880`04112000 fffff880`04121e00 driver1 (no symbols)
Loaded symbol image file: driver1.sys
Image path: \SystemRoot\system32\DRIVERS\driver1.sys
Image name: driver1.sys
Browse all global symbols functions data
Timestamp: Wed Dec 13 12:09:32 2006 (458033CC)
CheckSum: 0001669E
ImageSize: 0000FE00
Translations: 0000.04b0 0000.04e4 0409.04b0 0409.04e4
0: kd> lmvm driver2
Browse full module list
start end module name
fffff880`04158000 fffff880`041a4c00 driver2 (no symbols)
Loaded symbol image file: driver2.sys
Image path: \??\C:\Windows\system32\drivers\driver2.sys
Image name: driver2.sys
Browse all global symbols functions data
Timestamp: Thu Nov 30 12:12:07 2006 (456F10E7)
CheckSum: 0004FE8E
ImageSize: 0004CC00
Translations: 0000.04b0 0000.04e4 0409.04b0 0409.04e4
Fortunately for both the customer and us, this turned up a pair of drivers that predated Windows Vista (meaning they were designed for XP/2003), which raised an eyebrow. Of course we needed a more solid evidence link than just "it's an old driver", so I did a quick search of our internal KB. It turned up several other customers who had these same drivers installed, hit the same problem, removed the drivers, and saw the problem go away. That's a pretty good evidence link. We implemented the same plan for this customer, successfully.