<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN" "http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd">
<html xmlns="http://www.w3.org/1999/xhtml">
<head>
<meta http-equiv="Content-Type" content="text/xhtml;charset=UTF-8"/>
<title>WinPcap: NPF driver internals manual</title>
<link href="tabs.css" rel="stylesheet" type="text/css"/>
<link href="style.css" rel="stylesheet" type="text/css"/>
</head>
<body>
<!-- Generated by Doxygen 1.6.1 -->
<div class="navigation" id="top">
<div class="tabs">
<ul>
<li><a href="main.html"><span>Main Page</span></a></li>
<li><a href="pages.html"><span>Related Pages</span></a></li>
<li><a href="modules.html"><span>Modules</span></a></li>
<li><a href="annotated.html"><span>Data Structures</span></a></li>
<li><a href="files.html"><span>Files</span></a></li>
</ul>
</div>
</div>
<div class="contents">
<h1>NPF driver internals manual<br/>
<small>
[<a class="el" href="group__internals.html">WinPcap internals</a>]</small>
</h1>
<table border="0" cellpadding="0" cellspacing="0">
<tr><td colspan="2"><h2>Modules</h2></td></tr>
<tr><td class="memItemLeft" align="right" valign="top">&nbsp;</td><td class="memItemRight" valign="bottom"><a class="el" href="group__NPF__ioctl.html">NPF I/O control codes</a></td></tr>
<tr><td class="memItemLeft" align="right" valign="top">&nbsp;</td><td class="memItemRight" valign="bottom"><a class="el" href="group__NPF__include.html">NPF structures and definitions</a></td></tr>
<tr><td class="memItemLeft" align="right" valign="top">&nbsp;</td><td class="memItemRight" valign="bottom"><a class="el" href="group__NPF__code.html">NPF functions</a></td></tr>
<tr><td class="memItemLeft" align="right" valign="top">&nbsp;</td><td class="memItemRight" valign="bottom"><a class="el" href="group__NPF__jitter.html">NPF Just-in-time compiler definitions</a></td></tr>
</table>
<hr/><a name="_details"></a><h2>Detailed Description</h2>
<p>This section documents the internals of the Netgroup Packet Filter (NPF), the kernel
portion of WinPcap. Normal users are probably interested in how to use WinPcap
rather than in its internal structure, so the information in this module is
intended mainly for WinPcap developers and maintainers, or for people interested
in how the driver works. A good knowledge of operating systems, networking,
Win32 kernel programming and device driver development is required to read this
section profitably.</p>
<p>NPF is the WinPcap component that does the hard work: it processes the packets
that transit on the network and exports capture, injection and analysis
capabilities to user level.</p>
<p>The following paragraphs describe the interaction of NPF with the
OS and its basic structure.</p>
<h2>NPF and NDIS</h2>
<p>NDIS (Network Driver Interface Specification) is a standard that defines the
communication between a network adapter (or, better, the driver that manages it)
and the protocol drivers (which implement, for example, TCP/IP). The main purpose
of NDIS is to act as a wrapper that allows protocol drivers to send and receive
packets on a network (LAN or WAN) without caring about either the particular
adapter or the particular Win32 operating system.</p>
<p>NDIS supports three types of network drivers:</p>
<ol>
<li><strong>Network interface card or NIC drivers</strong>. NIC drivers
directly manage network interface cards, referred to as NICs. NIC drivers
interface directly with the hardware at their lower edge and at their
upper edge present an interface that allows upper layers to send packets on the
network, to handle interrupts, to reset the NIC, to halt the NIC and to
query and set the operational characteristics of the driver. NIC drivers can
be either miniports or legacy full NIC drivers.
<ul>
<li>Miniport drivers implement only the hardware-specific operations
necessary to manage a NIC, including sending and receiving data on the
NIC. Operations common to all lowest-level NIC drivers, such as
synchronization, are provided by NDIS. Miniports do not call operating
system routines directly; their interface to the operating system is
NDIS.<br>
A miniport does not keep track of bindings. It merely passes packets up
to NDIS, and NDIS makes sure that these packets are passed to the correct
protocols.</li>
<li>Full NIC drivers have been written to perform both the hardware-specific
operations and all the synchronization and queuing operations usually
done by NDIS. Full NIC drivers, for instance, maintain their own binding
information for indicating received data.</li>
</ul>
</li>
<li><strong>Intermediate drivers</strong>. Intermediate drivers interface
between an upper-level driver such as a protocol driver and a miniport. To
the upper-level driver, an intermediate driver looks like a miniport. To a
miniport, the intermediate driver looks like a protocol driver. An
intermediate driver can layer on top of another intermediate driver,
although such layering could have a negative effect on system performance. A
typical reason for developing an intermediate driver is to perform media
translation between an existing legacy protocol driver and a miniport that
manages a NIC for a new media type unknown to the protocol driver. For
instance, an intermediate driver could translate from a LAN protocol to the ATM
protocol. An intermediate driver cannot communicate with user-mode
applications, but only with other NDIS drivers.</li>
<li><b>Transport drivers or protocol drivers</b>. A protocol driver implements
a network protocol stack such as IPX/SPX or TCP/IP, offering its services
over one or more network interface cards. A protocol driver serves
application-layer clients at its upper edge and connects to one or more NIC
driver(s) or intermediate NDIS driver(s) at its lower edge.</li>
</ol>
<p>NPF is implemented as a protocol driver. This is not the best possible choice
from the performance point of view, but it allows reasonable independence from the
MAC layer as well as complete access to the raw traffic.</p>
<p>Notice that the various Win32 operating systems have different versions of
NDIS: NPF is NDIS 5 compliant under Windows 2000 and its derivations (like
Windows XP), and NDIS 3 compliant on the other Win32 platforms.</p>
<p>The next figure shows the position of NPF inside the NDIS stack:</p>
<p align="center"><img border="0" src="npf-ndis.gif"></p>
<p align="center"><b>Figure 1: NPF inside NDIS.</b></p>
<p>The interaction with the OS is normally asynchronous. This means that the
driver provides a set of callback functions that are invoked by the system when
some operation is requested of NPF. NPF exports callback functions for all the I/O operations
performed by applications: open, close, read, write, ioctl, etc.</p>
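<p>In WDM terms this means filling the driver's dispatch table at load time. The sketch
below is only illustrative of that idea: the NPF_* handler names are placeholders rather
than a copy of the real NPF sources, and the handler bodies are assumed to be defined
elsewhere.</p>
<pre>
/*
 * Illustrative only: how a WDM driver such as NPF wires up its I/O callbacks.
 * The NPF_* names are placeholders; the handler bodies are defined elsewhere.
 */
#include &lt;ntddk.h&gt;

DRIVER_DISPATCH NPF_Open, NPF_Close, NPF_Read, NPF_Write, NPF_IoControl;

NTSTATUS DriverEntry(PDRIVER_OBJECT DriverObject, PUNICODE_STRING RegistryPath)
{
    UNREFERENCED_PARAMETER(RegistryPath);

    /* The system invokes these entry points when an application calls
       CreateFile/CloseHandle/ReadFile/WriteFile/DeviceIoControl on the
       NPF device file. */
    DriverObject-&gt;MajorFunction[IRP_MJ_CREATE]         = NPF_Open;
    DriverObject-&gt;MajorFunction[IRP_MJ_CLOSE]          = NPF_Close;
    DriverObject-&gt;MajorFunction[IRP_MJ_READ]           = NPF_Read;
    DriverObject-&gt;MajorFunction[IRP_MJ_WRITE]          = NPF_Write;
    DriverObject-&gt;MajorFunction[IRP_MJ_DEVICE_CONTROL] = NPF_IoControl;

    /* Device object creation and NDIS protocol registration omitted. */
    return STATUS_SUCCESS;
}
</pre>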
<p>The interaction with NDIS is asynchronous as well: events
like the arrival of a new packet are notified to NPF through a callback
function (Packet_tap() in this case). Furthermore, the interaction with NDIS and
the NIC driver always takes place by means of non-blocking functions: when NPF invokes an
NDIS function, the call returns immediately; when the processing ends, NDIS invokes
a specific NPF callback to inform it that the function has finished. The
driver exports a callback for every low-level operation, like sending packets,
setting or requesting parameters on the NIC, etc.</p>
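<p>In NDIS 5 these callbacks are declared in an NDIS_PROTOCOL_CHARACTERISTICS structure
passed to NdisRegisterProtocol(). The fragment below is a simplified illustration, not
the actual NPF registration code: only a few handlers are shown, and all the Packet_*
names other than Packet_tap are placeholders.</p>
<pre>
/* Simplified sketch of NDIS 5 protocol registration. */
NDIS_PROTOCOL_CHARACTERISTICS ProtoChar;
NDIS_HANDLE                   ProtocolHandle;
NDIS_STATUS                   Status;
NDIS_STRING                   ProtoName;

NdisZeroMemory(&amp;ProtoChar, sizeof(ProtoChar));
NdisInitUnicodeString(&amp;ProtoName, L"PacketDriver");

ProtoChar.MajorNdisVersion            = 5;
ProtoChar.MinorNdisVersion            = 0;
ProtoChar.Name                        = ProtoName;
ProtoChar.ReceiveHandler              = Packet_tap;             /* a packet has arrived   */
ProtoChar.SendCompleteHandler         = Packet_SendComplete;    /* an async send finished */
ProtoChar.RequestCompleteHandler      = Packet_RequestComplete; /* query/set completed    */
ProtoChar.OpenAdapterCompleteHandler  = Packet_OpenComplete;
ProtoChar.CloseAdapterCompleteHandler = Packet_CloseComplete;

NdisRegisterProtocol(&amp;Status, &amp;ProtocolHandle, &amp;ProtoChar, sizeof(ProtoChar));
</pre>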
<h2>NPF structure basics</h2>
<p>The next figure shows the structure of WinPcap, with particular reference to the
NPF driver.</p>
<p align="center"><img border="0" src="npf-npf.gif" width="500" height="412"></p>
<p align="center"><b>Figure 2: NPF device driver.</b></p>
<p>NPF is able to perform a number of different operations: capture, monitoring, dump to disk,
packet injection. The following paragraphs briefly describe each of these
operations.</p>
<h4>Packet Capture</h4>
<p>The most important operation of NPF is packet capture.
During a capture, the driver sniffs the packets using a network interface and delivers them intact to the
user-level applications.</p>
<p>The capture process relies on two main components:</p>
<ul>
<li>
<p>A packet filter that decides if an
incoming packet has to be accepted and copied to the listening application.
Most applications using NPF reject far more packets than they accept,
therefore a versatile and efficient packet filter is critical for good
overall performance. A packet filter is a function with boolean output
that is applied to a packet. If the value of the function is true the
capture driver copies the packet to the application; if it is false the packet is discarded. The NPF
packet filter is a bit more complex, because it determines not only whether the
packet should be kept, but also the number of bytes to keep. The filtering
system adopted by NPF derives from the <b>BSD Packet Filter</b> (BPF), a
virtual processor able to execute filtering programs expressed in a
pseudo-assembler and created at user level. The application takes a user-defined filter (e.g. &quot;pick up all UDP packets&quot;)
and, using wpcap.dll, compiles it into a BPF program (e.g. &quot;if the
packet is IP and the <i>protocol type</i> field is equal to 17, then return
true&quot;). Then, the application uses the <i>BIOCSETF</i>
IOCTL to inject the filter into the kernel. At this point, the program
is executed for every incoming packet, and only the conformant packets are
accepted. Unlike traditional solutions, NPF does not <i>interpret</i>
the filters, but it <i>executes</i> them. For performance reasons, before using the
filter NPF feeds it to a JIT compiler that translates it into a native 80x86
function. When a packet is captured, NPF calls this native function instead
of invoking the filter interpreter, and this makes the process very fast.
The concept behind this optimization is very similar to the one of Java
JIT compilers. A user-level sketch of this filter setup is shown after this list.</p></li>
<li>
<p>A circular buffer to store the
packets and avoid loss. A packet is stored in the buffer with a header that
maintains information like the timestamp and the size of the packet.
Moreover, alignment padding is inserted between the packets in order to
speed up the applications' access to their data. Groups of packets can be copied
with a single operation from the NPF buffer to the applications. This
improves performance because it minimizes the number of reads. If the
buffer is full when a new packet arrives, the packet is discarded and
hence lost. Both the kernel and the user buffer can be
changed at runtime for maximum versatility: packet.dll and wpcap.dll provide functions for this purpose.</p></li>
</ul>
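<p>From the application's point of view the whole chain (filter string, BPF program, BIOCSETF
injection) is hidden behind two wpcap.dll calls. The minimal capture program below shows it
for the &quot;pick up all UDP packets&quot; example; the adapter name and netmask are placeholders.</p>
<pre>
#include &lt;pcap.h&gt;
#include &lt;stdio.h&gt;

int main(void)
{
    char errbuf[PCAP_ERRBUF_SIZE];
    struct bpf_program fcode;
    struct pcap_pkthdr *header;
    const u_char *data;

    /* Open an adapter; the device name is a placeholder. */
    pcap_t *fp = pcap_open_live("\\Device\\NPF_{...}", 65536, 1, 1000, errbuf);
    if (fp == NULL) {
        fprintf(stderr, "pcap_open_live: %s\n", errbuf);
        return 1;
    }

    /* wpcap.dll compiles the high-level filter into a BPF program... */
    if (pcap_compile(fp, &amp;fcode, "udp", 1, 0xffffff00) &lt; 0 ||
        /* ...and pcap_setfilter() injects it into NPF (BIOCSETF IOCTL). */
        pcap_setfilter(fp, &amp;fcode) &lt; 0) {
        fprintf(stderr, "filter error: %s\n", pcap_geterr(fp));
        return 1;
    }

    /* From now on only packets matching the filter reach this process. */
    while (pcap_next_ex(fp, &amp;header, &amp;data) &gt;= 0)
        printf("captured %u bytes\n", header-&gt;caplen);

    pcap_close(fp);
    return 0;
}
</pre>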
<p>The size of the user buffer is very
important because it determines the <i>maximum</i> amount of data that can be
copied from kernel space to user space within a single system call. On the other
hand, the <i>minimum</i> amount of data that can be copied
in a single call is extremely important as well. With a large value for this
parameter, the kernel waits for the arrival of several packets before copying the
data to the user. This guarantees a low number of system calls, i.e. low
processor usage, which is a good setting for applications like sniffers. With a
small value, instead, the kernel copies the packets as soon as
the application is ready to receive them. This is excellent for real-time
applications (like, for example, ARP redirectors or bridges) that need the best
responsiveness from the kernel.
From this point of view, NPF has a configurable behavior that allows users to choose between
best efficiency and best responsiveness (or any intermediate situation).</p>
<p>The wpcap library includes a couple of calls that can be used to set both the timeout after
which a read expires and the minimum amount of data that can be transferred to
the application. By default, the read timeout is 1 second, and the minimum
amount of data copied between the kernel and the application is 16K.</p>
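<p>With wpcap.dll these knobs map roughly to the read timeout passed to pcap_open_live() and
to the WinPcap-specific extensions pcap_setbuff() and pcap_setmintocopy(); the values below
are illustrative, not recommendations.</p>
<pre>
/* Illustrative tuning of the read timeout, the kernel buffer and the
   minimum-to-copy threshold through the WinPcap-specific extensions. */
char errbuf[PCAP_ERRBUF_SIZE];

/* to_ms = 500: a pending read returns after at most 500 ms. */
pcap_t *fp = pcap_open_live("\\Device\\NPF_{...}", 65536, 1, 500, errbuf);

pcap_setbuff(fp, 2 * 1024 * 1024);  /* 2 MB kernel (NPF) buffer                */
pcap_setmintocopy(fp, 0);           /* copy packets as soon as they arrive:
                                       best responsiveness, more system calls  */
/* pcap_setmintocopy(fp, 64 * 1024) would favor efficiency instead. */
</pre>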
<h4>Packet injection</h4>
<p>NPF allows applications to write raw packets to the network. To send data, a
user-level application performs a WriteFile() system call on the NPF device file. The data is sent to the network as is, without being encapsulated in
any protocol, therefore the application has to build the various headers
for each packet. The application usually does not need to generate the FCS
because it is calculated by the network adapter hardware and appended
automatically at the end of the packet before it is sent to the network.</p>
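<p>Through wpcap.dll the same write is exposed as pcap_sendpacket(). The fragment below sends
a hand-built (and deliberately meaningless) Ethernet frame on the handle opened earlier;
addresses and payload are placeholders.</p>
<pre>
/* Illustrative raw send: the application supplies the complete frame,
   starting from the Ethernet header; NPF hands it to the network as is. */
u_char frame[64] = {
    0xff, 0xff, 0xff, 0xff, 0xff, 0xff,   /* destination MAC (broadcast)  */
    0x02, 0x00, 0x00, 0x00, 0x00, 0x01,   /* source MAC (placeholder)     */
    0x08, 0x00                            /* EtherType: IPv4              */
    /* remaining bytes (IP header, payload...) left zeroed in this sketch */
};

if (pcap_sendpacket(fp, frame, sizeof(frame)) != 0)
    fprintf(stderr, "send error: %s\n", pcap_geterr(fp));
</pre>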
<p>In normal situations, the rate at which packets are sent to the network is not
very high because a system call is needed for each packet. For this reason,
the possibility to send a single packet more than once with a single write
system call has been added. The user-level application can set, with an IOCTL
call (code pBIOCSWRITEREP), the number of times a single packet will be
repeated: for example, if this value is set to 1000, every raw packet written by
the application on the driver's device file will be sent 1000 times. This
feature can be used to generate high-speed traffic for testing purposes: the
overhead of context switches is no longer present, so performance is remarkably
better.</p>
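<p>This repetition feature is reachable from user level through packet.dll. The sketch below
assumes the classic packet.dll entry points (PacketOpenAdapter, PacketSetNumWrites,
PacketSendPacket) and is not taken from a real program.</p>
<pre>
/* Sketch: send the same raw frame 1000 times with a single kernel write.
   PacketSetNumWrites() is assumed to translate to the pBIOCSWRITEREP IOCTL. */
LPADAPTER adapter = PacketOpenAdapter("\\Device\\NPF_{...}");
LPPACKET  packet  = PacketAllocatePacket();

PacketInitPacket(packet, frame, sizeof(frame));
PacketSetNumWrites(adapter, 1000);
PacketSendPacket(adapter, packet, TRUE);

PacketFreePacket(packet);
PacketCloseAdapter(adapter);
</pre>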
<h4>Network monitoring</h4>
<p>WinPcap offers a kernel-level programmable monitoring
module, able to calculate simple statistics on the network traffic. The
idea behind this module is shown in Figure
2: the statistics can be gathered without copying the packets to
the application, which simply receives and displays the results obtained from the
monitoring engine. This avoids a great part of the capture overhead in
terms of memory and CPU clocks.</p>
<p>The monitoring engine is
made of a <i>classifier</i> followed by a <i>counter</i>. The packets are
classified using the filtering engine of NPF, which provides a configurable way
to select a subset of the traffic. The data that passes the filter goes to the
counter, which keeps variables like the number of packets and
the amount of bytes accepted by the filter and updates them with the data of the
incoming packets. These variables are passed to the user-level application at
regular intervals whose period can be configured by the user. No buffers are
allocated at kernel or user level.</p>
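<p>From wpcap.dll this engine is reached by switching the adapter to statistics mode. The
snippet below follows the pattern used in the WinPcap examples (pcap_setmode() with MODE_STAT,
counters delivered in the packet data as two 64-bit integers); treat the details as indicative
rather than normative.</p>
<pre>
/* Indicative use of statistics mode: count UDP packets and bytes without
   copying any packet to user space. The delivery interval is the read
   timeout passed to pcap_open_live(). */
struct bpf_program fcode;
struct pcap_pkthdr *header;
const u_char *data;

pcap_compile(fp, &amp;fcode, "udp", 1, 0xffffff00);
pcap_setfilter(fp, &amp;fcode);    /* the classifier                */
pcap_setmode(fp, MODE_STAT);       /* switch NPF to monitoring mode */

while (pcap_next_ex(fp, &amp;header, &amp;data) &gt;= 0) {
    /* In statistics mode the "packet" carries two 64-bit counters:
       packets and bytes accepted by the filter in the last interval. */
    LONGLONG packets = ((const LONGLONG *)data)[0];
    LONGLONG bytes   = ((const LONGLONG *)data)[1];
    printf("%I64d packets, %I64d bytes\n", packets, bytes);
}
</pre>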
<h4>Dump to disk</h4>
<p>The dump-to-disk
capability can be used to save the network data to disk directly from kernel
mode.</p>
<p align="center"><img border="0" src="npf-dump.gif" width="400" height="187"></p>
<p align="center"><b>Figure 3: packet capture versus kernel-level dump.</b></p>
<p>In traditional systems, the path covered by the packets that are saved to disk is
the one followed by the black arrows in Figure
3: every packet is copied several times, and normally 4 buffers are
allocated: the one of the capture driver, the one in the application that keeps
the captured data, the one of the stdio functions (or similar) that the
application uses to write to the file, and finally the one of the file system.</p>
<p>When the
kernel-level traffic logging feature of NPF is enabled, the capture driver
addresses the file system directly, hence the path covered by the packets is the
one of the red dotted arrow: only two buffers and a single copy are necessary,
the number of system calls is drastically reduced, and therefore the performance is
considerably better.</p>
<p>The current
implementation dumps packets to disk in the widely used libpcap format. It also gives
the possibility to filter the traffic before the dump process, in order to
select the packets that will go to the disk.</p>
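<p>If memory serves, wpcap.dll exposes this feature through the WinPcap-specific
pcap_live_dump() call; the sketch below is written under that assumption and should be
checked against the wpcap.dll reference before use.</p>
<pre>
/* Assumed WinPcap extension: ask NPF to dump the filtered traffic to disk
   from kernel mode; maxsize = 0 and maxpacks = 0 are taken to mean "no limit". */
pcap_compile(fp, &amp;fcode, "udp", 1, 0xffffff00);
pcap_setfilter(fp, &amp;fcode);                /* optional pre-dump filter    */
pcap_live_dump(fp, "traffic.pcap", 0, 0);      /* start the kernel-level dump */

/* ... let it run, then block until the dump has actually ended ... */
pcap_live_dump_ended(fp, 1);
</pre>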
<h2>Further reading</h2>
<p>The structure of NPF and its filtering engine derive directly from the
BSD Packet Filter (BPF), so if you are interested in the subject you can read
the following papers:</p>
<p>- S. McCanne and V. Jacobson, <a href="ftp://ftp.ee.lbl.gov/papers/bpf-usenix93.ps.Z">The
BSD Packet Filter: A New Architecture for User-level Packet Capture</a>.
Proceedings of the 1993 Winter USENIX Technical Conference (San Diego, CA, Jan.
1993), USENIX.</p>
<p>- A. Begel, S. McCanne, S. L. Graham, <a href="http://www.acm.org/pubs/articles/proceedings/comm/316188/p123-begel/p123-begel.pdf">BPF+: Exploiting
Global Data-flow Optimization in a Generalized Packet Filter Architecture</a>,
Proceedings of ACM SIGCOMM '99, pages 123-134, Conference on Applications,
technologies, architectures, and protocols for computer communications, August
30 - September 3, 1999, Cambridge, USA.</p>
<h2>Note</h2>
<p>The code documented in this manual is the one of the Windows NTx version of
NPF. The Windows 9x code is very similar, but it is less efficient and
lacks advanced features like kernel-mode dump.</p>
</div>
<hr>
<p align="right"><img border="0" src="winpcap_small.gif" align="absbottom" width="91" height="27">
documentation. Copyright (c) 2002-2005 Politecnico di Torino. Copyright (c) 2005-2009
CACE Technologies. All rights reserved.</p>
</body>
</html>
|