[0x01]....Lecture Notes for Winternals
———— -[ Windowz Internals Course ]-
[Winternals Notes]:: {#1 kernelmode\ usermode, processes, threads, virtual memory, objects & handles} Objects are kernel structures (created by calls such as CreateFileW) that need a handle to use. Processes are containers for threads that execute code. Sys Arch P1 (7:57):: win32k.sys ntdll.dll csrss.exe [userland] System Processes, Services, Environment Subsystem, User Application, *Subsystem DLLs, NTDLL.DLL [kmode] Executive, Device Drivers, Kernel, Graphics (win32k), Hardware Abstraction Layer (HAL.) Service Control Manager (SCM), SMSS. [usermode] # Call fread (application) > Call ReadFile, return to caller (MSVCRT.DLL) > Call NtReadFile, return to caller (Kernel32.DLL) > Sysenter\ Syscall, return to caller (NtDll.DLL.) System Service Dispatcher: the EAX CPU register is loaded with a number (the system service number.) [kernel mode] Call NtReadFile (NtOskrnl.EXE, part of the Executive) I\O Manager > NtReadFile calls the driver, return to caller (NTOSKRNL.EXE) > initiate I\O, return to caller (driver.sys) I\O Request Packet (IRP.) Notepad: kernel32!CreateFileW CProcessActivator ApartmentActivator ole32 mov eax,42h --42h is the system service number for NtCreateFile here (service numbers differ between Windowz versions.) # NTOSKRNL.exe (Executive & kernel; the only kernel image on 64bit systems) NTKrnlPa.exe (Executive & kernel with PAE on 32bit systems (PA = Physical Address Extension kernel)) Hal (a layer that insulates drivers & the kernel from the actual hardware) win32k.sys (Kernel component of the Windowz Subsystem, which handles windowing & GDI graphics; CreateWindow doesn't go through NTDLL) NTDLL.dll System support routines & native API dispatcher to executive services (lowest layer of usermode; 2 jobs: its stubs load the EAX register with the system service number & then execute an instruction such as sysenter or syscall, which transitions the processor into kernel mode & into the system service dispatcher, which eventually calls the appropriate kernel service.)
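The user-to-kernel transition described above can be modelled in a few lines. This is a toy sketch in Python, not real Windows code: the table contents, the handler functions and the `syscall` helper are all hypothetical. Only the idea (a service number loaded into EAX selects an entry in a kernel-side table) and the value 42h quoted for NtCreateFile come from the notes.

```python
# Toy model of the system service dispatcher (hypothetical names throughout).

def nt_create_file(path):
    # Stand-in for the kernel-mode NtCreateFile implementation.
    return f"opened {path}"

def nt_read_file(handle, length):
    # Stand-in for the kernel-mode NtReadFile implementation.
    return f"read {length} bytes from {handle}"

# Kernel-side table indexed by system service number.
# 0x42 is the number the notes quote for NtCreateFile; 0x03 is made up.
SERVICE_TABLE = {
    0x42: nt_create_file,
    0x03: nt_read_file,
}

def syscall(eax, *args):
    """Model of sysenter/syscall: EAX selects the service, args are passed on."""
    try:
        service = SERVICE_TABLE[eax]
    except KeyError:
        raise ValueError(f"invalid system service number {eax:#x}") from None
    return service(*args)

def ntdll_nt_create_file(path):
    """Model of the NTDLL stub: load the service number, then trap."""
    return syscall(0x42, path)

print(ntdll_nt_create_file(r"C:\temp\notes.txt"))
```

The point of the indirection is that usermode code never holds a kernel function pointer; it only names a service by number.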
Kernel32.dll user32.dll gdi32.dll advapi32.dll (subsystem DLLs) CSRSS.exe (Client Server Runtime SubSystem) is the process that manages the Windowz subsystem & is always running; killing it will Blue Screen the system. SMP (Symmetric Multiprocessing): All CPUs are the same, share main memory & have equal access to peripheral devices (no master\ slave or controlling roles); Basic architecture supports up to 32\64 CPUs; Windowz 7 64bit & 2008 R2 support up to 256 cores; they introduce a concept called "Processor Group", each of which can hold 64 processors. {Subsystem is a special view of the OS; Exposes services via subsystem DLLs.} Original NT shipped with Win32, OS\2 & POSIX; Windows XP dropped support for OS\2; Some API functions use Advanced Local Procedure Call (ALPC) to notify CSRSS of relevant events; SMSS (Session Manager); Subsystem info stored in Registry: HKLM\System\CCS\Control\Session Manager\Subsystems. Every image belongs to exactly one subsystem; the value is stored in the image PE header. NTDLL: Used by subsystem DLLs & 'native' images; lowest layer of usermode code; Various support functions; Dispatcher to kernel services (most of them accessible using Windowz API 'Wrappers') # System Processes: *Idle Process --PID 0, no executable; one thread per CPU (core), *System Process --fixed PID 4; represents the kernel address space & resources; hosts system threads, which are created by the kernel & device drivers, execute code in system space only, are created using the PsCreateSystemThread kernel API (documented in the WDK) & allocate memory from the system pools, *Session Manager (SMSS.exe), *Windowz Subsystem (CSRSS.exe), *Logon Process (WinLogon.exe), *Services Control Manager (SCM) (Services.exe), *Local Security Authentication Server (LSASS.exe), *Local Session Manager (LSM.exe) local helper to SMSS. 
i8042prt.sys # Session Manager: {Running the image %SYSTEMROOT%\System32\smss.exe}; first usermode process created by the system, {Main tasks: Creates system environment variables, Launches the subsystem processes (normally just csrss.exe), Launches itself in other sessions: that instance loads WINLOGON & CSRSS in that session, then terminates.} WinLogon: %SYSTEMROOT%\System32\WinLogon.exe Handles interactive logons & logoffs; If terminated, logs off the user session; Notified of a user request by the Secure Attention Sequence, typically Ctrl+Alt+Del; Authenticates the user by presenting a username\ password dialog through LogonUI.exe; sends the captured username & password to LSASS; if successfully authenticated, initiates the user's session using the token that identifies the user. LSASS %SYSTEMROOT%\System32\Lsass.exe Calls the appropriate authentication package (Domain controller or local); upon successful authentication, creates a token representing the user's security profile. Returns information to WinLogon. Service Control Manager (SCM): %SYSTEMROOT%\System32\Services.exe communicates with Windowz services through a Named Pipe, which is an IPC mechanism supported by Windowz (services are similar to UNIX daemon processes.) -pause- SysProc2 Services can run under "special" accounts: Local System (most powerful), Network Service, Local Service. Local Session Manager: Introduced in Windowz Vista as %SYSTEMROOT%\System32\lsm.exe; changed to lsm.dll in Win 8, hosted in a standard svchost.exe; Manages the Terminal sessions on the local machine; Communicates requests to SMSS. # StartService API which communicates with the Service Control Manager # Wow64: Allows execution of 32bit Windowz binaries on 64-bit Windowz; Wow64 intercepts system calls from the 32bit application --converts 32bit data structures into 64bit data structures; issues the native 64bit system call; returns any data from the 64bit system call. 
IsWow64Process function can tell whether a process is running under Wow64; Address space is 2GB or 4GB (if image is linked with the LARGEADDRESSAWARE flag); device drivers must be native 64bit; File System: windows\system32 (64bit) windows\syswow64 (32bit); 32bit exe & dlls > 32bit ntdll.dll > wow64cpu.dll > wow64.dll > 64bit ntdll.dll > ntoskrnl.exe, wow64win.dll > win32k.sys; a 64bit process cannot load a 32bit dll & vice versa; Some APIs are not supported by Wow64 processes eg. ReadFileScatter, WriteFileGather, AWE functions --these functions assume a particular format of memory & particular alignment that simply can't be mended by the transition dlls; also the Address Windowing Extensions functions that allow 32bit processes to access more than 4GB of physical memory cannot be used in a Wow64 scenario. File System Redirection: System directory names have not changed in 64bit Windowz eg. windows\System32 contains 64bit native images; 32bit access to it is redirected to windows\SysWOW64. Registry Redirection: components trying to register as 32bit & 64bit will clash; Component Object Model (COM) mechanism that exists in Windowz since 93 or smth; 32bit view stored under Wow6432Node: HKLM\Software, HKEY_CLASSES_ROOT, HKEY_CURRENT_USER\Software\Classes; New flags for Registry APIs allow access to the 64bit or 32bit nodes: KEY_WOW64_64KEY KEY_WOW64_32KEY # Process: Management & containment object: Owns Private virtual address space (2GB\3GB on 32bit, 8TB on 64bit), Working Set (physical memory owned by process), Private handle table for kernel objects, Access Token which determines the security context in which code by default executes in that process, has a Priority Class (win32), basic creation functions: CreateProcess (10 args), CreateProcessAsUser (1 more, a handle to a token object so I can create a process with another user's credentials); opening a DOCX: CreateProcess can't run a document, so ShellExecute[Ex] finds the associated application & calls CreateProcess with WinWord as the executable & the document as one of its parameters; ExitProcess, TerminateProcess # Process Creation: Open the image file --check startup stuff, image is opened & mapped with the appropriate 
headers; Create the kernel Executive Process object --several structures manage that process, one of them is the KPROCESS structure which is the lowest lvl structure & is wrapped by a higher-lvl structure known as EPROCESS (Executive Process (undocumented)); Create the initial thread --KTHREAD & ETHREAD structures; Create kernel Executive Thread object; Notify CSRSS of the new Process & Thread by sending a Local Procedure Call message; Complete process & thread initialization --load required DLLs & initialize (Visual Studio lib files) --not just mapping DLLs into the address space, the DllMain function of each DLL is called with the DLL_PROCESS_ATTACH reason; Start execution of main\ WinMain {Priorities, Context Switch, PEB (Process Environment Block)} #Threads Own Context (registers, etc.), 2 Stacks (usermode & kernelmode); optionally Message Queue & Windows; optional security token (by default a thread runs under the token of its process) --impersonation: executing some code on behalf of, or with the security context of, a different user & then reverting to self; Scheduling state: Priority (0-31), State (Ready, Wait, Running), current access mode (user or kernel); CreateThread (win32) ExitThread TerminateThread #Thread Stacks every usermode thread has 2: In kernel space (12k x86), (24k 64bit) --resides in physical memory most of the time so that kernel stack accesses can't crash the OS; In userspace (may be large) 1MB reserved, 64k committed; A guard page is placed just below the last committed page, so that the stack can grow --a page is 4k of address space; can change the initial size: Using linker settings as new defaults; On a thread by thread basis with CreateThread \ CreateRemoteThread(Ex). 
#Thread Priorities: 1-31 (31 highest), 0 reserved for the zero page thread; The Windowz API mandates that thread priority be based on a process priority class (base priority); A thread's priority can be changed around the base priority: SetPriorityClass, SetThreadPriority --priority is an offset from the process's base priority; Kernel: KeSetPriorityThread sets an absolute value; Base priority per class (lowest to highest): Idle priority class: 4, Below normal priority class: 6, Normal priority class: 8, Above Normal priority class: 10, High priority class: 13, Realtime priority class: 24. {#Thread quantum: length of time a thread is allowed to run before the scheduler looks again} HKLM\SYSTEM\CCS\Control\PriorityControl Win32PrioritySeparation; Priority boosts: KeSetEvent (Event, Increment, Wait) IoCompleteRequest (IRP, PriorityBoost) #:SetThreadIdealProcessor SetThreadAffinityMask SetProcessAffinityMask NUMA (Non-Uniform Memory Access) #Kernel Dispatcher Objects: Threads sometimes need to coordinate work (eg. accessing linked lists concurrently from multiple threads); Synchro is based upon waiting for some condition to occur; The kernel provides a set of synchro (dispatcher) primitives on which threads can wait efficiently; Each maintains a state (signaled or non-signaled); WaitForSingleObject WaitForMultipleObjects KeWaitForSingleObject KeWaitForMultipleObjects Dispatcher Object Types: Process, thread, event, mutex, semaphore, timer, file, I\O completion port; Higher lvl wrappers exist: MFC: CSyncObject (abstract base of CMutex, CSemaphore et. al.) .NET: WaitHandle (abstract base of Mutex, Semaphore et. al.) 
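The relation between a priority class and a thread's relative priority can be sketched as a small lookup. The class base values are the ones listed in the notes; the two saturation values (time critical and idle) are not in the notes and are taken from the documented Windows scheduling priority table. The string names and the function name are hypothetical labels chosen for this example.

```python
# Base priority of each Win32 priority class (values from the notes).
CLASS_BASE = {
    "idle": 4, "below_normal": 6, "normal": 8,
    "above_normal": 10, "high": 13, "realtime": 24,
}

# Relative thread priorities expressed as offsets from the class base.
RELATIVE_OFFSET = {
    "lowest": -2, "below_normal": -1, "normal": 0,
    "above_normal": 1, "highest": 2,
}

def base_thread_priority(priority_class, relative="normal"):
    """Return the 1-31 base priority for a thread."""
    realtime = priority_class == "realtime"
    # The two saturation values ignore the class base entirely.
    if relative == "time_critical":
        return 31 if realtime else 15
    if relative == "idle":
        return 16 if realtime else 1
    return CLASS_BASE[priority_class] + RELATIVE_OFFSET[relative]

print(base_thread_priority("normal", "highest"))
print(base_thread_priority("high", "time_critical"))
```

Note that 0 never comes out of this function, which matches the notes: priority 0 is reserved for the zero page thread.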
#Signaled Meaning: Process has terminated, Thread has terminated, Mutex is free, The Event flag is raised (True), The Semaphore count is greater than zero, File\ I\O Completion Port operation completed, Timer (CreateWaitableTimer) interval has expired #Synchro primitives: Mutex [Mutual Exclusion] called Mutant in kernel terminology; Allows a single thread to enter a critical region; The thread that enters the critical region (its wait has succeeded) is the owner of the mutex; ReleaseMutex --Releasing the mutex allows one (single) thread to acquire it & enter the critical region; Recursive acquisition is O.K. (increments a counter); If the owning thread does not release the mutex before it terminates, the kernel releases it & the next wait succeeds with a special code (abandoned mutex.) Semaphore: CreateSemaphore ReleaseSemaphore Maintains a counter (set at creation time); Allows x callers to "go through" a gate; When a thread succeeds a wait, the semaphore counter decreases --When the counter reaches 0 subsequent waits do not succeed (state is non-signaled) --Releasing the semaphore increments its counter, releasing a thread that is waiting; eg. a Queue that needs to limit itself to say no more than 100 entries; Is a Semaphore with a maximum count of one equivalent to a Mutex ? No: a Semaphore does not maintain any ownership; If one thread acquires the Semaphore & tries to acquire it a second time it actually becomes deadlocked; A mutex can only be released by the thread that acquired it, that's not the case for a semaphore --One thread can acquire a semaphore & another thread can release it, that's actually an advantage a semaphore has; A mutex is for protecting data. 
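The difference between a mutex and a semaphore with a maximum count of one can be checked with portable code. This sketch uses Python's `threading` module as a stand-in, which is an assumption of the example: `RLock` plays the owned, recursive mutex and `Semaphore(1)` plays the ownerless semaphore. These are not the Win32 kernel objects, only objects with the same two properties.

```python
import threading

# RLock plays the role of a mutex here: it has an owner and allows
# recursive acquisition by that owner.
mutex = threading.RLock()
mutex.acquire()
recursive_ok = mutex.acquire(blocking=False)   # same thread: succeeds
mutex.release()
mutex.release()

# A semaphore with a maximum count of one has no owner.
sem = threading.Semaphore(1)
sem.acquire()
# A second blocking acquire by the same thread would hang forever, so a
# non-blocking attempt is used to show that it cannot succeed.
second_ok = sem.acquire(blocking=False)

# Any thread may release a semaphore, not only the one that acquired it.
releaser = threading.Thread(target=sem.release)
releaser.start()
releaser.join()
after_release_ok = sem.acquire(blocking=False)

print(recursive_ok, second_ok, after_release_ok)
```

The cross-thread release is what makes a semaphore usable for signalling between a producer and a consumer, which a mutex cannot do.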
Event: Maintains a Boolean flag (can be signaled\ non-signaled); Event types: Manual reset (notification in kernel terminology), Auto reset (synchronization); When set (signaled) threads waiting for it succeed the wait: A manual reset event releases any number of threads; An auto reset event releases just one thread --& the event goes automatically to the non-signaled state; useful when no other object fits the bill --provides flow synchronization as opposed to data synchronization. Critical Section: Usermode replacement for a mutex; Can be used to synchronize threads within a single process only --operates on a structure of type CRITICAL_SECTION with EnterCriticalSection & LeaveCriticalSection; Cheaper than a mutex when no contention exists --no transition to kernel mode in this case; no way to specify a timeout other than infinite & zero --Zero is accomplished by TryEnterCriticalSection; .NET: the lock C# keyword uses Monitor.Enter\ Exit in a Try\ Finally block #More Threading: Thread pools: simplify thread management; potentially boost performance since threads don't need to be created\ destroyed explicitly; C++ & .NET 4+ provide helpers for fork\ join scenarios --parallel_for (C++, Visual C++ PPL), Parallel.For (.NET). Simplify operations where order is unimportant; Other higher lvl threading helpers exist in C++11 & .NET 4+ --manual thread management considered "low lvl"; {Understanding threads can help make the right choices & solve problems} {Lock free programming (synchronizes using the capabilities of the processor), no Mutex here: count++ becomes Interlocked.Increment on multiprocessor systems} #Jobs: Kernel object that allows managing one (or more) processes as a unit; System enforces Job quotas & security: Total & per process CPU time, working sets, CPU affinity & priority class, quantum length (for long, fixed quantums only), Security limits, UI limits; CreateJobObject OpenJobObject AssignProcessToJobObject TerminateJobObject SetInformationJobObject
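The manual reset versus auto reset behaviour of events, described earlier in this block of notes, can be shown with a short sketch. Python's `threading.Event` behaves like a manual reset event. The `AutoResetEvent` class below is a hypothetical helper written for this example (Python's standard library has no such type); it is built on a condition variable and is not the Win32 object.

```python
import threading

class AutoResetEvent:
    """Minimal auto-reset event: a successful wait consumes the signal."""

    def __init__(self):
        self._cond = threading.Condition()
        self._signaled = False

    def set(self):
        with self._cond:
            self._signaled = True
            self._cond.notify()        # wake at most one waiter

    def wait(self, timeout=None):
        with self._cond:
            if not self._cond.wait_for(lambda: self._signaled, timeout):
                return False           # timed out, still non-signaled
            self._signaled = False     # reset automatically
            return True

manual = threading.Event()             # behaves like a manual-reset event
manual.set()
manual_results = (manual.wait(0), manual.wait(0))

auto = AutoResetEvent()
auto.set()
auto_results = (auto.wait(0), auto.wait(0))

print(manual_results, auto_results)
```

With the manual reset event both waits succeed because the flag stays raised; with the auto reset event only the first wait succeeds.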
———— -[ Windowz Internals 2 Course ]-
#The Object Manager: Part of the Executive; manages creating, deleting & otherwise tracking objects; maintains objects in a tree-like structure --can be partially viewed with the WinObj SysInternals tool; User mode clients can obtain Handles to Objects --cannot touch the actual memory structure; kernel mode clients can do either. {Symbolic Links} #Object Structure: Structure in memory; the header half is owned by the Object Manager > Object Name, Object Directory, Security Descriptor, Open Handle Count, Open Handle List, Pointer Count, Object Type (Type Object: Type Name, Synchronizable ?, Pageable ? {Paged pool}, Object Methods); the body half is owned by the Executive component > Kernel object. File Mapping (Section) --CreateFileMapping, OpenFileMapping, which allows us to share memory across processes & to access files through memory APIs; Token (LogonUser, OpenProcessToken); I\O Completion Port (CreateIoCompletionPort); WindowStation (CreateWindowStation, OpenWindowStation); Desktop (CreateDesktop, OpenDesktop) #Objects & Handles: When a process creates or opens an object, it receives a handle to the object --used as an opaque, indirect pointer to the underlying object --allows sharing objects across processes; In .NET, handles are used internally by types such as FileStream, Mutex, Semaphore, AutoResetEvent, etc; Each process has a private handle table; Viewing process handles: Process Explorer, handle.exe (Sysinternals (Console)), Resource Monitor, !handle debugger command (WinDbg); Closing a handle that belongs to another process means injecting code into that process & closing the handle there; works only on NT & relies on some native & undocumented APIs. 
1:39 handle usage Usermode processes retrieve a handle by calling an appropriate Create* function or an Open* function --a handle is returned upon success --Call CloseHandle when done with the handle; If the object is named, Create* with the same name returns a handle to the existing object --GetLastError() returns ERROR_ALREADY_EXISTS; Kernel code can obtain handles that reside in system space & are visible in any process context; Alternatively kernel code can obtain a direct pointer to the underlying object given a handle --by calling ObReferenceObjectByHandle --must release the reference with ObDereferenceObject [...] #Sharing Objects: Process handle inheritance, Opening an object by name, Duplicating a handle … sharing by inheritance #Handle entry layout *=todo=* Pointer to Object Header, AIL: Audit on Close, Inheritance Flag, Is Locked ? The Access Mask says what this handle can actually do, so no two handles are necessarily the same. That means one handle can allow some operation A & not an operation B, while some other handle that points to the same kernel object may allow both operations A & B.
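The per-handle access mask can be illustrated with a toy handle table. The class and its layout are hypothetical and much simpler than the real handle entry; the two access-right values are the real constants from the Windows SDK headers. The sketch shows two handles to the same object that allow different operations.

```python
# Real access-right values from the Windows SDK headers.
SYNCHRONIZE = 0x00100000
EVENT_MODIFY_STATE = 0x0002

class HandleTable:
    """Toy per-process handle table (hypothetical layout)."""

    def __init__(self):
        self._entries = {}
        self._next = 4                      # handle values step by 4

    def open(self, obj, granted_access):
        """Store the object with the access granted at open time."""
        handle = self._next
        self._next += 4
        self._entries[handle] = (obj, granted_access)
        return handle

    def check(self, handle, desired_access):
        """True if every requested bit was granted to this handle."""
        _, granted = self._entries[handle]
        return (granted & desired_access) == desired_access

    def close(self, handle):
        del self._entries[handle]

table = HandleTable()
event = object()                            # stands in for a kernel event
waiter = table.open(event, SYNCHRONIZE)
setter = table.open(event, SYNCHRONIZE | EVENT_MODIFY_STATE)

print(table.check(waiter, EVENT_MODIFY_STATE))   # may only wait
print(table.check(setter, EVENT_MODIFY_STATE))   # may wait and set
```

Because the check happens against the mask stored in the entry, the security decision is made once at open time and every later use of the handle is a cheap bit test.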
————
EXPUNGED ***{An access mask is actually a set of bit flags that indicate the various types of operations possible on that particular type of object {In fact when we try to obtain a handle to an existing object we must ask for a specific access mask & that access mask may be granted; if that's not possible because of security concerns, meaning the security descriptor on the object says for eg. that the caller can't perform that particular operation, then the returned handle will be null, the operation fails & GetLastError returns access denied} ...~it whether it can do operation a & b or a or b for instance.}***
————
{cont.} #Object Names & Sessions: Each session should have its own objects; The Object Manager creates a Sessions directory with a session ID subdirectory {BaseNamedObjects view in WinObj} Windowz 8 Store Apps have their own folder so two apps cannot communicate directly {contracts & extensions}; Processes can access the global session objects by prefixing object names with "Global\"; Can create private namespaces for tightened security --CreatePrivateNamespace #User & GDI Objects: CreateWindow CreatePen LineTo MoveToEx User Objects: Windows (HWND) --CreateWindow[Ex], Menus (HMENU) --CreateMenu & hooks (HHOOK) --SetWindowsHookEx (intercept windows messages & do some special stuff with them, perhaps change messages, perhaps record them) User Object Handles: No reference\ handle counting --DestroyWindow; Private to a Window Station; GDI Objects: Device Context (HDC), pen (HPEN), brush (HBRUSH), bitmap (HBITMAP) et. al. (GetDC CreatePen CreateSolidBrush CreateBitmap (GDI doesn't support Hardware Acceleration or 3D)) … Memory is managed in chunks called Pages; Page size is determined by CPU type; Two page sizes are supported; Allocations\ de-allocations & other memory block attributes are always per page; Small page size: (x86 4KB) (x64 4KB) (IA-64 8KB) Large page size: (x86 2MB (PAE), 4MB (non-PAE)) (x64 2MB) (IA-64 16MB) {PAE = Physical Address Extension VAD = Virtual Address Descriptor} #Sharing Pages: Code pages are shared between processes --2 or more processes based on the same image; DLL code --However DLLs must be loaded at the same address; Data pages (read\ write) are shared at first {the copy-on-write mechanism duplicates the data into a private copy & maps it for that particular process, so the process can change the data but no other process will see the change.} --But with special protection called Copy-On-Write --If one process changes the data an exception is caught by the Memory Manager which creates a private copy of the accessed page for that 
process -- --Removing the Copy-On-Write protection; Data Pages can be created without Copy-On-Write #Demo-Sharing DLL Code: Image Base is the preferred address. Base is the actual address that the DLL loads into. {ASLR randomization of base address Vista & up} # x86 Address Space Layout: (00000000-7FFFFFFF 2GB Application Code, Global Variables, Per thread stacks, DLL code) (80000000-BFFFFFFF Kernel & Executive, HAL, drivers) (C0000000-C07FFFFF Process page tables, hyperspace) (C0800000-FFFFFFFF System cache, Paged pool, Non-paged pool) With the 3GB option: (00000000-BFFFFFFF 3GB user address space) (C0000000-FFFFFFFF 1GB system space); 3GB User Address Space per process is enabled using the Boot Configuration Database (BCD) on Vista & up. # x64bit Address Layout: (00000000'00000000 Per process space 8TB) (000007FF'FFFFFFFF End of user space, unmapped above) (FFFF0800'00000000 Start of system space) (FFFFF680'00000000 4 Lvl Page Table map (512GB)) (FFFFF700'00000000 Hyperspace Process Page tables (512GB)) (FFFFF780'00000000 System working set (512GB)) (FFFFF800'00000000 Kernel\ HAL\ Drivers) (FFFFF900'00000000 Session space (512GB)) (FFFFF980'00000000 System cache (1TB)) (FFFFFA80'00000000 Start of paged pool area (128GB) > System Mapped views (max 1TB)) (FFFFFAA0'00000000 System PTE pool (128GB)) (FFFFFAC0'00000000 Non-paged pool (128GB)) (FFFFFFFF'FFFFFFFF Reserved for HAL (2GB).) Current CPU architectures only support 48bits addressing. Current kernel implementation can work with 16TB at most --For efficiency reasons that have to do with singly linked list entry structures (SLIST_ENTRY) Result: user address space is 8TB & system address space is 8TB as well, starting from the top. #Virtual Address Translation: Virtual Address -> virtual page number + byte within page > Page directory ((*1) KPROCESS, CR3 register) > Page Tables > Translation Lookaside Buffer > Address Translation (CPU) > Physical page number + byte within page. If the page is not valid the CPU throws an exception known as a Page Fault. 
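The first step of x86 address translation, splitting a virtual address into a virtual page number and a byte within the page, is plain bit arithmetic. The sketch below assumes the classic non-PAE layout: a 10-bit page directory index, a 10-bit page table index and a 12-bit offset. The function names are hypothetical.

```python
PAGE_SHIFT = 12          # 4 KB pages
INDEX_BITS = 10          # 1024 entries per page directory / page table

def split_x86_va(va):
    """Split a 32-bit virtual address into (PDE index, PTE index, offset)."""
    if not 0 <= va <= 0xFFFFFFFF:
        raise ValueError("not a 32-bit address")
    offset = va & 0xFFF
    pte_index = (va >> PAGE_SHIFT) & 0x3FF
    pde_index = (va >> (PAGE_SHIFT + INDEX_BITS)) & 0x3FF
    return pde_index, pte_index, offset

def is_user_address(va, three_gb=False):
    """User space ends at 0x7FFFFFFF, or 0xBFFFFFFF with the 3GB option."""
    limit = 0xBFFFFFFF if three_gb else 0x7FFFFFFF
    return va <= limit

print(split_x86_va(0x00401234))
print(is_user_address(0x80000000), is_user_address(0x80000000, three_gb=True))
```

The same arithmetic explains the sizes in the layout: 1024 directory entries times 1024 table entries times 4 KB pages gives exactly 4GB.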
The 8086 works in real mode, where only physical addresses are used; starting with the 386, CPUs are configured to use virtual addresses. (*1) The upper 10 bits select a PDE (Page Directory Entry); That PDE points to the physical memory of smth known as a Page Table; The middle 10 bits select another entry, a PTE (Page Table Entry); That PTE points to the start of a page in physical memory; The lower 12 bits are the offset to get to a particular byte inside that page; 1 bit says whether that PTE is really valid: bit 0 --if the valid bit is 0 the CPU raises a page fault, since that page is probably not in physical memory; Page Directory: 1 per process --mapped to virtual address 0xC0300000 (0xC0600000 on PAE systems); Physical address of the page directory is stored in the KPROCESS structure; While a thread is executing, the CR3 register holds that address --when a thread context switch occurs between threads of different processes CR3 is reloaded from the appropriate KPROCESS instance. #PDE & PTE: Each entry is 32bits (64bits on PAE); Upper 20 bits are the Page Frame Number (PFN) (24 bits on PAE); Bit 0 is the Valid bit *(0 V Valid, 1 W Writable, 2 O Owner, 3 WT WriteThrough, 4 CD Cache Disabled, 5 A Accessed, 6 D Dirty, 7 L Reserved (Large page if PDE), 8 GL Global, 9 CW Reserved, 10 P Reserved, 11 U Reserved)* Dirty = Page has been written to, Large Page = this entry maps a large page (2MB), Accessed = Page has been read, Owner = Usermode or Kernelmode accessible #Physical Address Extension (PAE): Intel Pentium Pro & later processors support a new PAE mode; Virtual address translation contains an extra lvl of indirection (the Page Directory Pointer index); Each PTE\ PDE is 64bits of which 24 are the PFN, not just 20; The kernel must support PAE --default 32bit kernel is the PAE kernel; Address Windowing Extensions --API that allows access to more than 4GB of physical memory; Naturally requires the PAE kernel --memory above 4GB cannot be used "automatically", although the File system cache uses it if available; AllocateUserPhysicalPages, VirtualAlloc 
with specific flags (MEM_PHYSICAL & MEM_RESERVE), MapUserPhysicalPages. #x64 Virtual Address Translation: 47-39 Page Map lvl 4 index > 9bits > CR3 | 38-30 Page Directory Pointer index > 9bits | 29-21 Page Directory index > 9bits | 20-12 Page Table index > 9bits | 11-0 Byte within page > 12bits > RAM, Byte, Page. #Page Faults: Invalid PTE's: The CPU throws a page fault exception when the Valid bit (bit 0) in a PTE is clear; Windows uses the other PTE bits to indicate where the required page can be found; Example: a page that resides in a page file (x86 w/o PAE) (0 Valid) (1-4 Page file index) (5-9 Protection) (10 P Prototype) (11 U Transition) (12-31 Page File Offset) #Page Files: Backup storage for writeable, non-shareable committed memory --Up to 16 page files are supported --On different partitions --Initial size & maximum size can be set --Named PageFile.sys on disk (root partition); Created contiguous on boot --initial value should be the maximum of normal usage; Page file information in the Registry --HKLM\System\CurrentControlSet\Control\Session Manager\Memory Management\PagingFiles; Page file maximum size --4GB (x86 original kernel), 16TB (x86+PAE & x64); Default sizes: 1xRAM (minimum) 3xRAM (maximum) #Commit Charge: Commit charge represents the committed memory that must be backed by RAM & existing page file(s); Contributors to the commit charge --Private committed memory (VirtualAlloc with MEM_COMMIT flag) --no RAM or page file is used until the memory is actually touched --Until then considered demand-zero pages; Page file backed memory mapped file mapped with MapViewOfFile; Copy-on-write backed memory; Kernel non-paged & paged pools; Kernelmode stacks; Page tables & yet to be created page tables; Allocations with Address Windowing Extensions (AWE) functions; The commit limit is basically the amount of RAM plus the maximum size of all page files. 
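The x64 field layout listed above (four 9-bit table indexes plus a 12-bit byte offset, 48 bits in total) can be checked with a few shifts and masks. This is a minimal sketch; the function name and the dictionary keys are hypothetical labels for the four translation levels.

```python
def split_x64_va(va):
    """Split an x64 virtual address into four table indexes and an offset.

    Only the low 48 bits take part in translation, so the sign-extended
    upper bits of a system-space address are simply masked away.
    """
    return {
        "pml4": (va >> 39) & 0x1FF,    # bits 47-39
        "pdpt": (va >> 30) & 0x1FF,    # bits 38-30
        "pd": (va >> 21) & 0x1FF,      # bits 29-21
        "pt": (va >> 12) & 0x1FF,      # bits 20-12
        "offset": va & 0xFFF,          # bits 11-0
    }

# Build an address from known indexes and take it apart again.
sample = (1 << 39) | (2 << 30) | (3 << 21) | (4 << 12) | 5
print(split_x64_va(sample))
```

Each index is 9 bits wide because a 4 KB table holds 512 entries of 8 bytes each.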
{ProcExp View > System Info} #Working Sets: Process Working Set --The subset of the process' committed memory that resides in physical memory; System Working Set --The subset of system memory residing in physical memory; Systems with Terminal Services --Some kernel memory is on a per session basis & as such has a working set as well; Demand Paging --When a page is needed from disk more than one is read at a time to reduce I\O. Page Frame Number Database: PFN Database describes the state of all physical pages; Valid PTEs point to entries in the PFN Database; A PFN entry points back to the PTE; The structure layout of a PFN entry depends on the state of the page; kernel debugger: !memusage, !pfn. #PFN Database: Bad Page List, Zero Page list (Zero Page Thread > from Free Page List, > to Process Working Set), Free Page List (Pages read from disk or kernel allocations > to Process Working Set, > to Standby Page List, Private Bytes at Process Exit > from Process Working Set), Modified Page List (Working set replacement > from Process Working Set, Soft page faults > to Process Working Set & from Standby Page List, > to Modified Page Writer), Standby Page List (> to Process Working Set, Working Set replacement > from Process Working Set, > to Free Page List.) #Memory APIs in Usermode: Low-lvl: Virtual API (VirtualAlloc, VirtualFree, VirtualProtect) --lowest lvl API; works on page granularity only (ask for 10 bytes ? get 1 page !); allows reserving and\ or committing of memory. Heap API (HeapCreate HeapAlloc HeapFree) --uses the Virtual API internally; manages small allocations w/o wasting pages; uses linked lists to find free chunks. C\ C++ Runtime API (malloc realloc free operator new) --uses the Heap API (usually, compiler dependent). Local\ Global API (LocalAlloc GlobalAlloc GlobalLock LocalFree) --functions from the 16bit days, mostly for compatibility with Win16. 
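The "10 bytes ? 1 page !" remark about the Virtual API is simple rounding. The sketch below assumes the 4 KB small page size given earlier in the notes; the function names are hypothetical and nothing here calls VirtualAlloc.

```python
PAGE_SIZE = 4096   # small page size on x86 and x64

def pages_needed(size_bytes):
    """Number of whole pages needed to hold size_bytes."""
    if size_bytes <= 0:
        raise ValueError("size must be positive")
    return (size_bytes + PAGE_SIZE - 1) // PAGE_SIZE

def wasted_bytes(size_bytes):
    """Bytes committed but not asked for when allocating by page."""
    return pages_needed(size_bytes) * PAGE_SIZE - size_bytes

for request in (10, 4096, 4097):
    print(request, pages_needed(request), wasted_bytes(request))
```

The waste on small requests is the reason the Heap API exists on top of the Virtual API.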
#The Heap Manager: Allocating in page granularity is sometimes too much --need fine grained control; The Heap Manager manages smaller allocations (8 bytes minimum); The HeapXxx Windowz API functions are a thin wrapper over the native NtDll.Dll functions. Heap Types: One heap is always created with a process, called the Default Process Heap --can be accessed using GetProcessHeap; Additional heaps can be created using the HeapCreate function; A heap can be fixed in size or growable; Low Fragmentation Heap (LFH) {.NET less fragmented} #System Memory Usage: Kernel & Driver memory usage --Non pageable code --code that may execute at IRQL >= 2; Pageable code; File System Cache; Non-Paged Pool --Memory that may be accessed at IRQL >= 2; Paged Pool #System Memory Pools: The kernel provides 2 general memory pools for use by the kernel itself & device drivers --Non-Paged Pool -- --Memory always resides in RAM (never paged out) -- --Can be accessed at any IRQL --Paged Pool -- --Memory can be paged out to disk -- --Should be accessed only at IRQL below DISPATCH_LEVEL (2); Pool sizes depend on the amount of RAM & the OS type --Can be altered (up to some maxima) in the registry -- --HKLM\System\CurrentControlSet\Control\Session Manager\Memory Management; Task Manager displays current sizes. #System Memory Pools & APIs: ExAllocatePool --allocate memory from the paged or non-paged pool; ExAllocatePoolWithTag --allocate memory & tag it with a 4 byte value --can be used to track memory leaks; ExFreePool --Frees memory previously allocated (from whichever pool); Pool usage & tags can be viewed with PoolMon.exe --part of the Windowz Driver Kit. #Memory Mapped Files: Internally called Section objects; Allow the creation of "views" into a file --return a memory pointer for data manipulation; Implies shared memory capabilities --this is the usual case with mapping EXEs & DLLs; Also can create "pure" shared memory --backed up by page files --when the Memory Mapped file object is destroyed the memory is recycled. 
#Memory Mapped Files APIs: Win32: CreateFileMapping --create a file mapping object based on a specific file (previously opened with CreateFile) or based on the system paging file; OpenFileMapping --open an existing MMF based on its name (NOT the filename); MapViewOfFile(Ex) --create a "view" into the MMF; .NET: System.IO.MemoryMappedFiles namespace; Classes: MemoryMappedFile, MemoryMappedViewStream, MemoryMappedViewAccessor; Kernel: ZwCreateSection --creates a Section object (if based on a file, call ZwCreateFile first); ZwMapViewOfSection --map a "view" into system space. #Large Pages: Large pages (2MB) allow mapping using a PDE only (no need for PTEs) --advantage is better use of the translation lookaside buffers; Windowz maps by default large pages for NtOSKrnl.Exe & HAL.dll as well as core system data (initial non paged pool & PFN database); Potential disadvantages --Single protection for the entire page --May be more difficult to find contiguous physical memory the size of a large page for mapping; Programmatically using Large Pages --specifying the MEM_LARGE_PAGES flag in calls to the VirtualAlloc function --Size & alignment must be a multiple of the large page size -- --Can be determined by calling the GetLargePageMinimum function. {Rightclick process & select Go To Details; in Task Manager} #More on Memory Management: Why do 32bit systems with 4GB of RAM show much less than that ? Memory mapped I\O by drivers for certain devices "steals" (overlaps) part of the RAM address range; Virtual Address Descriptors (VADs) --Describe address space ranges; Locking memory --VirtualLock (Win32), MmProbeAndLockPages (kernel) --Use SetProcessWorkingSetSize(Ex) to increase quota --may fail if insufficient RAM; More information can be found in the book "Windowz Internals" by Russinovich, Solomon & Ionescu. 
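The two kinds of mapping described in the Memory Mapped Files notes (a view into a file, and "pure" memory with no file behind it) can be tried with Python's portable `mmap` module. On Windows that module is built on the file mapping API named above, but this sketch runs on any platform. The file name `demo.bin` is a hypothetical example.

```python
import mmap
import os
import tempfile

# File-backed mapping: changes made through the view land in the file.
with tempfile.TemporaryDirectory() as folder:
    path = os.path.join(folder, "demo.bin")      # hypothetical file name
    with open(path, "wb") as f:
        f.write(b"\x00" * mmap.PAGESIZE)

    with open(path, "r+b") as f:
        with mmap.mmap(f.fileno(), 0) as view:   # 0 = map the whole file
            view[0:5] = b"hello"                 # plain memory write
            view.flush()

    with open(path, "rb") as f:
        file_prefix = f.read(5)

# Anonymous mapping: "pure" memory with no named file behind it.
with mmap.mmap(-1, mmap.PAGESIZE) as anon:
    anon[0:4] = b"data"
    anon_prefix = bytes(anon[0:4])

print(file_prefix, anon_prefix)
```

The write to `view` uses slice assignment rather than a file API call, which is the whole attraction of a mapped view.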
#Trap Dispatching: Traps --Interrupts or Exceptions --divert code execution outside the normal flow; Trap Dispatching --Kernel mechanisms for capturing an executing thread's state when an interrupt or exception occurs & transferring control to a handling routine; Interrupt --Asynchronous event, unrelated to the currently executing code; Exception --Synchronous result of executing certain instructions --Reproducible under the same conditions. #Hardware Interrupts: Device1, Device2, DeviceN > IRQ0, IRQ1 (Interrupt Controller) > Interrupt request line on the processor. {acknowledging there's some interrupt} > The Interrupt Controller sends the IRQ number through the Data Bus to the processor; the processor uses that number (with some manipulation) as an index into an Interrupt Dispatch Table, which has pointers to functions that are Interrupt Service Routines. {It's usually a round robin but hardware layout is more complex.} #Interrupt Dispatching: Kernel or usermode code > Interrupt ! > Record CPU State (Trap Frame) > Mask equal or lower IRQL interrupts > Call appropriate ISR (Interrupt Service Routine) > Restore CPU state. #Interrupt Request Level (IRQL): {Higher lvl interrupts can preempt lower lvl interrupts} Each interrupt has an associated Interrupt Request Level (IRQL) --can be considered its priority --For hardware interrupts, mapped by the HAL; Each processor's context includes its current IRQL --A CPU always runs the code with the highest pending IRQL; Servicing an interrupt raises the processor's IRQL to the lvl of the interrupt's IRQL --This masks all interrupts at that IRQL & lower; Dismissing an interrupt restores the processor's IRQL to that prior to the interrupt. {APC = Asynchronous Procedure Call DPC = Deferred Procedure Call} #IRQ Levels: x86: PASSIVE_LEVEL (0), APC_LEVEL (1), DISPATCH_LEVEL (2), Device IRQL (DIRQL) (3-26), Profile\ Synch (27), Clock (28), Inter CPU Interrupt (29), Power fail (30), HIGH_IRQL (31.) 
x64: PASSIVE_LEVEL (0), APC_LEVEL (1), DISPATCH_LEVEL (2), Device IRQL (DIRQL) (3-11), Synch (12), Clock (13), Inter CPU\ Power (14), HIGH_LEVEL (15.) IRQL changes can only be done in kernelmode. {Interrupt Dispatch Table = IDT} #IRQLs vs. Thread Priorities: IRQLs are changed (in kernelmode only) with KeRaiseIrql & KeLowerIrql. #The Spin Lock: Synchronization on MP systems uses IRQLs within each CPU & spin locks to coordinate among the CPUs; A Spin Lock is just a data cell in memory --it is accessed with a test & modify operation that is atomic across all processors --similar in concept to a mutex; Not exposed to (& not needed by) usermode applications; Acquiring a Spin Lock --the IRQL is implicit in the choice of routine --KeAcquireSpinLock raises to IRQL = DISPATCH_LEVEL; KeSynchronizeExecution --is used to synchronize with an ISR; ExInterlockedXxx routines use HIGH_LEVEL; Spin Locks should not be requested if already owned --causes a deadlock ! Raise to the associated IRQL > Test & set the spin lock bit > Was it previously clear ? No > back to test & set; Yes > this CPU now owns the spin lock. #Exception Dispatching: Synchronous event resulting from certain code --divide by zero, access violation, stack overflow, invalid instruction; Structured Exception Handling (SEH) --a mechanism used to handle & possibly resolve exceptions; Exceptions are connected to entries in the IDT. #Exception Handling: Some exceptions are handled transparently --eg. a breakpoint is an exception which is transferred to the appropriate debugger; Some exceptions are filtered back to usermode for possible handling --eg. accessing a usermode address that is not mapped; Frame based exception handlers are searched (32bit systems) --if the current frame has no handlers the previous frame is searched & so on --64bit systems don't use frames but the search mechanism is the same; Unhandled exceptions from kernelmode generate a "bug check" a.k.a. "Blue Screen Of Death."
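{Sketch, not from the course} The spin lock loop from the notes above (test & set, retry until the bit was previously clear) can be imitated in Python. Assumptions: a non-blocking threading.Lock.acquire stands in for the atomic test & set, threads stand in for CPUs, and the IRQL raise is left out entirely. SpinLock and demo are made-up names.

```python
import threading


class SpinLock:
    """Toy spin lock: busy-waits instead of blocking. Do not acquire it twice (deadlock)."""

    def __init__(self):
        # Stand-in for the lock bit; acquire(blocking=False) is an atomic test & set.
        self._bit = threading.Lock()

    def acquire(self):
        # "Was it previously clear?" No -> go around again; Yes -> we own the lock.
        while not self._bit.acquire(blocking=False):
            pass

    def release(self):
        self._bit.release()


def demo(threads: int = 4, rounds: int = 500) -> int:
    lock = SpinLock()
    counter = [0]

    def worker():
        for _ in range(rounds):
            lock.acquire()
            counter[0] += 1      # critical section protected by the spin lock
            lock.release()

    pool = [threading.Thread(target=worker) for _ in range(threads)]
    for t in pool:
        t.start()
    for t in pool:
        t.join()
    return counter[0]


if __name__ == "__main__":
    print(demo())
```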
#Resolving Exceptions: Exception occurred > Switch to kernelmode (if it was in usermode) > Create a Trap Frame > Exception occurred in kernelmode ? Yes > Look for a handler > Found one ? Yes > Execute handler > Done; No > Crash the system (bug check). Exception occurred in usermode > Debugger attached ? Yes > Send message to Debug Port > "First chance" handled by debugger ? Yes > Done; No (or no debugger) > Search for frame based handlers > Found one ? Yes > Execute handler > Done; No frame based handler found > Debugger attached ? Yes > Send message to Debug Port > "Second chance" handled by debugger ? Yes > Done; No (or no debugger attached) > Send message to Exception Port > Subsystem handles the exception ? Yes > Done; No > Send message to Error Port > Terminate process > Done. #Structured Exception Handling (SEH): Exposed to developers through extended C keywords --__try wraps a block of code that may raise exceptions --__except block for possibly handling exceptions raised in the preceding __try block --__finally code that executes whether an exception occurred or not --__leave jumps to the end of the __try block (the __finally clause, if any, still runs) --allowed blocks are __try\ __finally & __try\ __except -- --however they can be nested to any lvl; Works in kernelmode & usermode; Custom exceptions can be raised with RaiseException (Win32.) [...abrupt ending of course due to [sham] missing 3rd course...] You need Enterprise Visual Studio & money for courses & why do you want to learn to steal\ hijack other apps.
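{Sketch, not from the course} The "Resolving Exceptions" flow above can be condensed into a small Python model. It is a simplification: the exception port and error port steps are folded into the final "process terminated" result, and the dispatch function, its arguments and its return strings are all invented here.

```python
def dispatch(exception, frames, kernel_mode, debugger=None):
    """Toy model of the search order.

    frames   -- innermost frame first; each entry is a handler callable or None
    debugger -- optional callable(exception, chance) returning True if it handled it
    """
    # Usermode only: an attached debugger gets the first chance.
    if not kernel_mode and debugger and debugger(exception, "first chance"):
        return "handled by debugger (first chance)"

    # Frame based search: current frame, then the previous one, and so on.
    for depth, handler in enumerate(frames):
        if handler is not None and handler(exception):
            return f"handled by frame {depth}"

    if kernel_mode:
        return "bug check"           # unhandled kernelmode exception

    # Usermode only: the debugger gets a second chance before the process dies.
    if debugger and debugger(exception, "second chance"):
        return "handled by debugger (second chance)"
    return "process terminated"


if __name__ == "__main__":
    handles_av = lambda e: e == "access violation"
    print(dispatch("access violation", [None, handles_av], kernel_mode=False))
    print(dispatch("divide by zero", [None, handles_av], kernel_mode=True))
```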
———— -[ Windows Internals 3 Course ]-
#Introduction To Services:: A Service Application is a program that provides some functionality without being tied to the logged on user —a service may run without any user logging on; Service examples — —IIS Service (listening to HTTP port 80 etc.) — —SQL Server (listening for database requests via some mechanism); A service needs to register with the Service Control Manager (SCM); A service can be controlled by a Service Control Program (SCP) —fancy name for a bunch of service related APIs. #Service Characteristics:: A service is built just like any Win32 application —may host a single service or multiple services; Must be registered with the SCM —Using the CreateService API; Communicates with the SCM using a named pipe; May run automatically when Windowz boots —loaded by SCM (services.exe); Usually run under a special user account —Local System, Network Service, or Local Service; Service Control Programs can manipulate a service —eg. StartService, ControlService APIs. #Service Configuration:: A service application must be installed —by calling the CreateService API or an equivalent tool; Inserts a new key into the registry under HKLM\System\CCS\Services —Technically the entries in the services key corresponds to services & device drivers; The services MMC snap-in can be used to view Services only; To start a service a Service Control Program calls the StartService API —or uses an equivalent tool. 
#Service Key Parameters:: Start -when should the service start --SERVICE_BOOT_START(0) -drivers only --SERVICE_SYSTEM_START(1) -drivers only --SERVICE_AUTO_START(2) -start service when the system starts --SERVICE_DEMAND_START(3) -start service on demand (StartService) --SERVICE_DISABLED(4) -do not start the service; DelayedAutostart --relevant to auto-start services only --if true (1) the service is started some time after the SCM is started; Type -type of service or driver --SERVICE_WIN32_OWN_PROCESS(16) --service runs in a process that hosts just one service --SERVICE_WIN32_SHARE_PROCESS(32) --service runs in a process that hosts multiple services; ImagePath --the path to the service executable; DisplayName --the service name visible in the services applet --if not specified the service key name becomes the display name; Description --textual description of the service; ObjectName --the account under which the service process should execute.
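{Sketch, not from the course} The Start and Type values listed above can be decoded with two lookup tables. The describe function and the sample values are hypothetical; only the numbers given in these notes are mapped, anything else is reported as not listed.

```python
START = {
    0: "SERVICE_BOOT_START",
    1: "SERVICE_SYSTEM_START",
    2: "SERVICE_AUTO_START",
    3: "SERVICE_DEMAND_START",
    4: "SERVICE_DISABLED",
}

TYPE = {
    16: "SERVICE_WIN32_OWN_PROCESS",
    32: "SERVICE_WIN32_SHARE_PROCESS",
}


def describe(values: dict) -> str:
    """Turn raw Start/Type/DelayedAutostart numbers into a readable summary."""
    start = START.get(values.get("Start"), "start value not listed in the notes")
    kind = TYPE.get(values.get("Type"), "type value not listed in the notes")
    # DelayedAutostart only matters for auto-start services.
    delayed = values.get("Start") == 2 and values.get("DelayedAutostart") == 1
    suffix = " (delayed)" if delayed else ""
    return f"{start}{suffix}, {kind}"


if __name__ == "__main__":
    # Made-up values of the kind found under a service key.
    print(describe({"Start": 2, "Type": 32, "DelayedAutostart": 1}))
```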
#Service Architecture:: SCM > Service Process > (Main Thread) Call StartServiceCtrlDispatcher > creates (Service Thread) > (Service Thread) Call RegisterServiceCtrlHandler > (Main Thread) runs the Service Control Handler > (Service Thread) Wait for client requests > Handle client requests.
#Controlling Services:: Service Control Programs such as the Services MMC use the Windowz API to control services; OpenSCManager --opens a connection to the SCM; OpenService, CreateService --opens a connection to an existing service or installs a new service; StartService --starts a service; ControlService(Ex) --sends other commands to the service (stop, pause, etc.); QueryServiceStatus --returns current service status; DeleteService --uninstalls a service.
##Services P2:: #Service Accounts:: LocalSystem --the most powerful account on the local computer --use with care; NetworkService --allows a service to authenticate outside the local computer; LocalService --similar to NetworkService but can only access network elements accepting anonymous access; Some specific user --uncommon, use only in special cases.
#Shared Service Processes:: Some services run in their own process; Some services share a single process --less system overhead from extra processes --if one service crashes it brings down all other services in that process --all services running in a shared process run with the same account; Micro$oft uses the SvcHost.exe generic host to host multiple services within the same process.
#Trigger Start Services:: Introduced in Windowz 7; A service can start in response to a certain "trigger"; Cannot be configured using the Services MMC --must call the ChangeServiceConfig2 API function (or a comparable tool); Possible triggers --computer joins a domain --device interface arrival --firewall port open --group or user policy change --IP address availability --network protocol (Windowz 8 & later) --custom, based on an Event Tracing for Windowz (ETW) event.
##The I\O System P1:: Introduction:: The I\O system abstracts logical & physical devices; Most I\O system parts are within the Executive & Kernel; The I\O system provides —Uniform naming mechanism across devices & files —Uniform security model —Asynchronous packet-based I\O —Support for Plug & Play —Dynamic loading & unloading of device drivers —Support for power management —Support for multiple file systems.
#I\O System Components:: (User mode) Applications\ Services > User mode PnP Manager > (Kernel mode) PnP Manager (User mode) > Setup components > .inf, .cat, registry. (Kernel mode) I\O System: I\O Manager, Power Manager, PnP Manager, WMI routines > (User mode) WMI Service. (Kernel mode) I\O System > Drivers: driver, driver, driver > HAL.
#Device Drivers:: Device Drivers are Loadable Kernel Modules —technically the only officially supported way to get 3rd party code into the kernel; Classic device drivers provide the "glue" between hardware devices & the operating system; Several ways to segregate device driver into categories; User mode Device Drivers —Printer drivers —Drivers based on the User Mode Driver Framework (UMDF); Kernel mode Drivers —File system drivers —Plug & Play drivers —Software drivers (non-Plug & Play drivers.)
msinfo32 in Run | Process Explorer: select the System process > DLLs view to see loaded drivers
#Invoking a Driver:: [User mode] Call fread (application) > Call ReadFile (Msvcrt.dll) > Call NtReadFile (Kernel32.dll) > sysenter\ syscall (NtDll.dll) > [Kernel mode] Call NtReadFile (NtOsKrnl.exe) > NtReadFile: call driver (NtOsKrnl.exe) > initiate I\O (driver.sys.)
#Plug & Play:: Automatic & dynamic recognition of installed hardware —hardware detected at initial system installation —recognition of PnP hardware changes between boots —run-time response to PnP hardware changes; Dynamic loading & unloading of drivers in response to hardware insertion or removal; Hardware resource allocation & reallocation —PnP manager may reconfigure resources at run-time in response to new hardware requesting resources that are already in use.
#Device Enumeration:: Upon boot the P&P Manager performs enumeration of buses & devices --starts from an imaginary "Root" device --scans the system recursively to walk the device tree; The bus driver creates a PDO (Physical Device Object) for each physical device; P&P Manager loads drivers --loads lower filter drivers (if any exist) --they create their FiDOs (Filter Device Objects) --loads "the driver" (function driver) --it should create the FDO (Functional Device Object) --loads upper filter drivers (if any exist) --they create their FiDOs. [Devnode] PDO > FiDOs > FDO > FiDOs.
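{Sketch, not from the course} The load order above fixes the layering of a devnode, which a few lines of Python can spell out. build_devnode and the device\ filter names are invented for illustration; the list is ordered bottom (PDO) to top.

```python
def build_devnode(device, lower_filters=(), upper_filters=()):
    """Return the device stack bottom-to-top, in the order the objects get created."""
    stack = [f"PDO({device})"]                         # created by the bus driver
    stack += [f"FiDO({f})" for f in lower_filters]     # lower filters load first
    stack.append(f"FDO({device})")                     # then the function driver
    stack += [f"FiDO({f})" for f in upper_filters]     # upper filters load last
    return stack


if __name__ == "__main__":
    # Hypothetical device with one lower and one upper filter.
    print(" > ".join(build_devnode("disk0", ["lowflt"], ["upflt"])))
```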
Device manager > View by connection (enumerate bus)
##The I\O System P2:: #Important Registry Keys:: "Hardware" (Instance) keys --HKLM\System\CCS\Enum -- --CCS = Current Control Set --information about a single device; "Class" keys --HKLM\System\CCS\Control\Class --information about all devices of the same type; "Software" (Service) keys --HKLM\System\CCS\Services\drivername --information about a specific driver.
#Device Node "DevNode":: Represents a stack of devices; PDO: Physical Device Object —created by the bus driver; FiDO: Filter Device Object —optional lower\ upper device objects; FDO: Functional Device Objects —the "actual" driver created device object; Stack of devices not drivers.
#I\O Request Packet (IRP):: A structure representing some request --represented by the IRP structure --contains all details needed to handle the request (codes, buffers, sizes, etc.); Always allocated from non-paged pool; Accompanied by a set of structures of type IO_STACK_LOCATION --the number of structures is the number of devices in this DevNode --complements the data in the IRP; IRPs are typically created by the I\O Manager, P&P Manager or the Power Manager --can be explicitly created by drivers as well.
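{Sketch, not from the course} A toy Python model of the IRP and its IO_STACK_LOCATION array: one slot per device in the devnode, filled in as the request travels from the top of the stack down. The classes and field names only loosely mirror the real structures and are simplified for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class StackLocation:
    """Per-device part of the request (stand-in for IO_STACK_LOCATION)."""
    device: str = ""
    major_function: str = ""


@dataclass
class Irp:
    """Request-wide part (stand-in for the IRP structure)."""
    request: str
    stack: list = field(default_factory=list)


def send(request: str, devnode: list) -> Irp:
    """devnode is ordered bottom-to-top; the request enters at the top device."""
    irp = Irp(request, [StackLocation() for _ in devnode])   # one slot per device
    for slot, device in zip(irp.stack, reversed(devnode)):   # top device first
        slot.device = device
        slot.major_function = request
    return irp


if __name__ == "__main__":
    # Hypothetical three-device stack.
    irp = send("READ", ["PDO", "FDO", "upper FiDO"])
    print([s.device for s in irp.stack])
```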
#{IRP Flow missing!!}
#Accessing Devices:: A client that wants to communicate with a device must open a handle to the device —CreateFile or CreateFile2 from usermode — —the System.IO.FileStream class in .NET —ZwCreateFile from kernelmode; CreateFile accepts a "filename" which is actually a device symbolic link —"file" being just one specific case —for devices the name should have the format \\.\name —cannot access non-local device —must use double backslashes "\\\\.\\name" in C\C++.
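{Sketch, not from the course} The double backslash rule above is only string-literal escaping, and Python uses the same escape, so it can be checked quickly. device_path and the name "MyDevice" are hypothetical.

```python
def device_path(name: str) -> str:
    """Build the \\\\.\\name form; each literal backslash is written twice in source code."""
    return "\\\\.\\" + name


if __name__ == "__main__":
    path = device_path("MyDevice")
    print(path)          # the actual characters handed to the open call
    print(len("\\\\.\\"))  # the escaped prefix is 4 characters long
```

A raw string (r"\\.\MyDevice" in Python) avoids the doubling and yields the same characters.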
#Asynchronous I\O:: The I\O Manager supports an asynchronous model --the client initiates a request, does not have to block & gets a notification later; Device drivers must be written with asynchrony in mind --should start an operation, mark the IRP as pending & return immediately; The I\O Manager supports several ways of receiving a notification when the operation completes; To use I\O asynchronously CreateFile must be called with the FILE_FLAG_OVERLAPPED flag; Other I\O functions must provide a non-null OVERLAPPED structure pointer.
##Device Drivers P1:: {first skipped} #Kernel Drivers: Always execute in Kernel Mode --use the kernel mode stack of a thread --image is part of system space --unhandled exceptions will crash the system; Typically have a SYS file extension; Usually invoked by client code --eg. ReadFile, WriteFile, DeviceIoControl; Export entry points for various functions --called by system code when appropriate; System handles all device independent aspects of I\O --no need for hardware specific code or assembly.
#Plug & Play Drivers:: Communicate with the Plug & Play Manager & the Power Manager --via IRPs; Driver types --Function driver -- --manages the hardware device -- --the driver that knows the device intimately --Bus driver -- --manages a bus (PCI, USB, IEEE1394, etc.) -- --typically written by Micro$oft --Filter drivers -- --sit on top of a function driver (upper filter) or on top of a bus driver (below the function driver -lower filter) -- --allow intercepting requests.
#Anatomy of a Driver:: A driver exports functionality callable by the I\O system. I\O System > Initialization Routines, AddDevice Routine, Dispatch Routines, StartI\O routine, ISR routine, DPC routine.
#Driver & Device Objects:: Drivers are represented in memory using a DRIVER_OBJECT structure --created by the I\O system --provided to the driver in the DriverEntry function --holds all exported functions; Device objects are created by the driver on a per-device basis --represented by the DEVICE_OBJECT structure --typically created in the driver's AddDevice routine --several can be associated with a single driver object. Driver_Object > Device Object > Device Object
#Typical IRP Processing:: [User mode] App calls (eg.) ReadFile > [Kernel mode] Validate request > [Requesting thread context] Dispatch routine > [Arbitrary thread context] Start I\O routine > [Device interrupt] ISR routine > [DPC (Software) interrupt] DPC routine.
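{Sketch, not from the course} The asynchronous pattern above (start the operation, report "pending", notify on completion) modelled in Python. A threading.Event plays the role of the completion notification; the Overlapped class, start_read and the payload are stand-ins invented here, not the Win32 structure or API.

```python
import threading


class Overlapped:
    """Stand-in for the per-request bookkeeping the caller passes in."""

    def __init__(self):
        self.event = threading.Event()   # signalled when the operation completes
        self.result = None


def start_read(data: bytes, overlapped: Overlapped) -> str:
    """Kick off the 'I/O' on another thread and return immediately."""

    def complete():
        overlapped.result = data         # the operation finishes later...
        overlapped.event.set()           # ...and the caller is notified

    threading.Thread(target=complete).start()
    return "pending"                     # the request was accepted, not yet completed


def demo():
    ov = Overlapped()
    status = start_read(b"payload", ov)
    # The caller is free to do other work here instead of blocking.
    ov.event.wait(timeout=5)             # pick up the completion notification
    return status, ov.result


if __name__ == "__main__":
    print(demo())
```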
#Referencing User Buffers:: Buffers provided in user space are not generally accessible from an arbitrary thread context and\or high IRQL (>=2); The I\O system provides ways to mitigate that; Buffered I\O —transfer is to & from an intermediate buffer in system address space —I\O Manager does all of the setup work; Direct I\O —transfer is to or from user's physical pages —I\O Manager does most of the setup work; Neither I\O —no help from the I\O manager.
#Buffered I\O:: Userspace: p, n bytes > Kernelspace: q, n bytes (copied) > q=Irp->AssociatedIrp.SystemBuffer > RAM > (copy q to p.)
#Direct I\O:: [User space] p, n bytes > (MDL) Irp->MdlAddress, q=MmGetSystemAddressForMdlSafe(Irp->MdlAddress,…) > locked RAM pages > user process p & [Kernel space] q (n bytes) map the same pages.
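{Sketch, not from the course} The difference between the two schemes above, in a Python model: buffered I\O goes through an intermediate copy (q), direct I\O gives the "driver" a second view (q) onto the very same memory as the user buffer (p). bytearray and memoryview are only analogies for system buffers and MDL mappings; the function names are hypothetical.

```python
def buffered_read(device_data: bytes, user_buffer: bytearray) -> int:
    """Driver fills an intermediate buffer q; q is then copied to the user buffer p."""
    system_buffer = bytearray(len(user_buffer))     # q: separate memory
    n = min(len(device_data), len(system_buffer))
    system_buffer[:n] = device_data[:n]             # driver writes into q
    user_buffer[:n] = system_buffer[:n]             # copy q to p on completion
    return n


def direct_read(device_data: bytes, user_buffer: bytearray) -> int:
    """Driver writes through q, a second view of the same memory as p: no copy step."""
    q = memoryview(user_buffer)                     # q aliases p's memory
    n = min(len(device_data), len(q))
    q[:n] = device_data[:n]                         # lands directly in the user buffer
    return n


if __name__ == "__main__":
    p1, p2 = bytearray(8), bytearray(8)
    print(buffered_read(b"abcd", p1), bytes(p1))
    print(direct_read(b"abcd", p2), bytes(p2))
```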
##Device Drivers P2:: #The Windows Driver Model:: Drivers for Win95 & NT were completely separate; WDM is a model for writing device drivers --mostly source compatible between Win98\ ME & Win2000\ XP --supports a wide range of buses (PCI, USB, IEEE1394 & more) --extensible to support future buses --supports a wide range of device classes (HID, Scanners, Cameras etc.) --Can still be used today; Not included in WDM --file system drivers --video drivers; WDM is showing its age.
#The Windows Driver Foundation:: A new driver model, introduced in [2006] with Windows Vista --can be installed on WinXP as well & even Win2000 (KMDF); WDF has two distinct parts --KMDF -kernelmode driver framework --UMDF -usermode driver framework; KMDF is a replacement for WDM --consistent object-based model -- --properties, events & methods --boilerplate Plug & Play & Power code implemented by the framework -- --driver just needs to register for "interesting" events --object lifetime management --versioning /w side-by-side support.
#UMDF:: Allows building drivers in usermode --easier development & debugging; Works for certain driver categories (eg. USB); UMDF 1.x is based around the Component Object Model (COM); UMDF drivers are hosted in a system supplied host (WUDFHost.exe); Object model similar in concept to KMDF; UMDF 2.0 introduced in Win 8.1 --near identical object model compared to KMDF --some form of translation is possible both ways.
#Driver Installation:: Drivers for hardware devices must be installed /w an INF file; INF File --text file format similar to the classic INI file --sections in square brackets & instructions as key=value pairs --INF looked up by hardware ID & compatible IDs -- --precise matches are preferred -- --digitally signed files are preferred -- --newer files are preferred; Installed INF files are stored in %SYSTEMROOT%\INF; The usermode P&P service requests an INF file if no match is found on the system.
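{Sketch, not from the course} Since the INF format is INI-like (sections in square brackets, key=value pairs), Python's configparser can read a simple fragment. The fragment below is made up and not installable, and real INF files contain syntax this parser does not handle, so treat it as an illustration of the layout only.

```python
import configparser

SAMPLE_INF = """\
; hypothetical fragment, not a complete or installable INF
[Version]
Signature="$WINDOWS NT$"
Class=Sample

[Manufacturer]
%Mfg%=Standard,NTamd64

[Strings]
Mfg="Hypothetical Vendor"
"""


def parse_inf(text: str) -> dict:
    """Return {section: {key: value}} for a simple INI-style INF fragment."""
    # interpolation=None: '%' is ordinary text here, not configparser syntax.
    parser = configparser.ConfigParser(interpolation=None, strict=False)
    parser.optionxform = str          # keep key case as written
    parser.read_string(text)
    return {section: dict(parser[section]) for section in parser.sections()}


if __name__ == "__main__":
    for section, entries in parse_inf(SAMPLE_INF).items():
        print(section, entries)
```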
#Driver Verifier:: A tool that allows monitoring device drivers activities & operations; Has a GUI but can be operated from the commandline as well --can even change some settings w/o a restart; Does not require any special code or awareness from the driver writer; Can monitor any driver; M$ uses it for its own drivers & for drivers submitted for WHQL certification.
##Device Drivers P2:: #Introduction: A "software" driver does not manage any hardware; typically used as a method to get code to run in kernel mode; Examples: Process Explorer, Process Monitor; This means --no AddDevice routine needed --driver exports a well-known name for the only device --installation does not have to use an INF file; we'll create a software driver that can set a high thread priority value (16-31) on any thread regardless of its parent process priority --barring security restrictions.
#The DriverEntry Function:: The "main" function called when the driver first loads; Should fill in the exported functions supported by the driver --unload routine --AddDevice for hardware based drivers --Dispatch routines; For a software driver --creates the one & only device object --creates a symbolic link so the device can be accessed from usermode.