Configuring the NetFlow FlowCollector

This chapter tells you how to configure and enable NetFlow switching and data export on a Cisco 7000 series router. It also tells you how to customize aggregation schemes for the FlowCollector. The FlowCollector accepts NetFlow data export Versions 1 and 5 (a superset of Version 1) from the router. Before configuring the FlowCollector, ensure that all files have been set up correctly, as described in the chapter entitled "Installing and Setting Up the NetFlow FlowCollector."

Enabling NetFlow Switching and Data Export on the Router

For the FlowCollector to collect exported NetFlow data, you must ensure that the Cisco router is configured for NetFlow switching and data export.

The FlowCollector accepts Version 1 and Version 5 of NetFlow export from the router. Version 5 is a superset of Version 1. The additional fields present in Version 5 flow records are:

Sequence numbers
Autonomous system numbers
Source and destination prefix masks

The recommended IOS release for NetFlow features is 11.1(11) CA or later. For more information about the latest IOS release, refer to the Release Notes for Cisco 7000 Family for Cisco IOS Release 11.2 P and the Cisco IOS Configuration Guides and Cisco IOS Command Reference Guides.

To enable a router for NetFlow switching, you must perform the following tasks, beginning in global configuration mode:

Step 1 Specify the interface and enter interface configuration mode.

<router #> interface <specifier> slot/port-adapter/port

Where <specifier> represents the port used. Other interfaces such as FDDI and Token Ring can also be used.

Step 2 Enable flow records for IP packets on the interface.

<router #> ip route-cache flow

Step 3 Enable NetFlow data export.

For Version 1:

<router #> ip flow-export ip-address UDP port

For Version 5:

<router #> ip flow-export ip-addr port version 5 [origin-as | peer-as]

Where [origin-as | peer-as] optionally specifies whether origin autonomous system or peer autonomous system numbers are exported.

To disable NetFlow export, enter no ip flow-export.

The UDP port number on which the workstation listens must be the same as the UDP port number specified in the ip flow-export command on the router. Define the UDP port number as part of a thread definition in the nfconfig.file on the workstation. For more information, refer to the section entitled "Defining Aggregation Schemes and Filters."

Step 4 Verify that NetFlow export is on.

<router># write terminal

The output also shows the destination to which flow data is exported.

Step 5 Exit interface configuration mode and return to global configuration mode.

<router># exit

Step 6 Exit global configuration mode and save the configuration changes.

<router># exit

<router># copy running-config startup-config
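
The steps above can be collected into a short script. The following Python sketch is illustrative only and is not part of the FlowCollector or Cisco IOS: the function name netflow_router_commands and the sample interface, address, and port values are hypothetical. It only assembles the command strings shown in Steps 1 through 6, in the same order, and it assumes that the autonomous system keyword on a Version 5 export may be omitted.

```python
def netflow_router_commands(interface, collector_ip, udp_port,
                            version=1, as_mode=None):
    """Return the command strings from Steps 1-6 as a list (hypothetical helper)."""
    if version not in (1, 5):
        raise ValueError("the FlowCollector accepts only export Versions 1 and 5")
    if as_mode not in (None, "origin-as", "peer-as"):
        raise ValueError("as_mode must be 'origin-as', 'peer-as', or None")
    if as_mode is not None and version != 5:
        raise ValueError("origin-as and peer-as apply only to Version 5 export")

    # Step 3: the Version 1 form has no version keyword.
    export = f"ip flow-export {collector_ip} {udp_port}"
    if version == 5:
        export += " version 5"
        if as_mode:  # assumption: the keyword is optional
            export += f" {as_mode}"

    return [
        f"interface {interface}",              # Step 1
        "ip route-cache flow",                 # Step 2
        export,                                # Step 3
        "write terminal",                      # Step 4
        "exit",                                # Step 5
        "exit",                                # Step 6
        "copy running-config startup-config",  # Step 6 (save)
    ]


# Hypothetical values: an FDDI interface and a documentation-range address.
for line in netflow_router_commands("fddi 1/0/0", "192.0.2.10", 9991,
                                    version=5, as_mode="peer-as"):
    print(line)
```

The UDP port passed here (9991 in the sample) must match the Port value of the thread that is to receive the data.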

Displaying NetFlow Switching Statistics

You can display NetFlow switching statistics (including IP subprotocols, well-known ports, total flows, average number of packets per flow, and average flow lifetime) on the router. To do so, use the command shown here:

<router #> show ip cache flow

Defining Aggregation Schemes and Filters

The FlowCollector collects data that can be summarized into files based on user-defined criteria. You define these criteria in the nfconfig.file.

The nfconfig.file consists of two types of user-defined criteria that you use to customize the data aggregated by the FlowCollector application:

Thread--Defines aggregation schemes and other configuration parameters
Filter--Defines the information that is accepted or rejected by the aggregation scheme

Note In this document, the term "thread" means an aggregation task that lets you specify how you want to collect, summarize, and store data.

A thread consists of the following fields:

Thread name
Optional filters
Aggregation scheme
Duration of the aggregation
Path to where the data is stored
UDP port number
Version (V1/V5)
State (Active/Inactive)
Total disk usage limit for this thread
Number of files to retain

Note You can create up to 10 active threads. For readability, put each field on a line by itself. In a thread definition, the port should be defined before the version.

A filter consists of multiple fields, the first of which is the filter name, a unique ID. Filters can be shared among threads. For more information about filters, refer to the section entitled "Filter Syntax."

Figure 3-1 shows an example of how the FlowCollector uses threads and filters. In this example, aggregation scheme AS1 uses filter F1, aggregation scheme AS2 uses filters F1 and F2, aggregation scheme AS3 does not use any filters, and aggregation schemes AS4 and AS5 use filter F3.

Figure 3-1: NetFlow FlowCollector Aggregating Data Example

Creating a Thread

You use a thread to tell the FlowCollector how to aggregate the traffic flows stored on the workstation. You can create multiple threads to meet your needs. Below is the command syntax for a thread. The keywords are listed on separate lines for readability.

Thread thread-name
[Filter filter-name]
Aggregation scheme 
Period minutes
DataSetpath directory-path
Port value
Version V1|V5
State Active|Inactive
DiskSpaceLimit Megabytes
FileRetain number

Note The FileRetain attribute must be the last attribute of a thread.

Table 3-1 lists thread attributes and variables and their definitions.

Table 3-1: Attributes and Variables for Creating a Thread
Attribute	Variable	Definition
Thread	thread-name	Unique name of the thread. Can be up to 18 alphanumeric characters.
Filter	filter-name	(Optional.) Unique name of the filter. Can be up to 18 alphanumeric characters. You can specify one or more filters, and filters can be shared among threads. When more than one filter is specified, the result is the logical AND of all of them. For more information on filters, refer to the "Filter Syntax" section.
Aggregation	scheme	A way to summarize data collected by the FlowCollector application.
Period	minutes	Duration of the thread. (That is, how often the FlowCollector application writes aggregated data into a file. Data received in each period is written into a separate file.) For example, setting period 30 generates two files every hour.
DataSetpath	directory-path	Directory path used for storing the aggregated data. The output filename is <router-name>.hhmm or, when the LONG_OUTPUTFILE_SUFFIX flag is enabled in the nf.resources file, <router-name>_YYYY_MM_DD.hhmm. For more information on the output files, refer to the section entitled "Understanding the Output File Form" in the chapter entitled "Using the NetFlow FlowCollector."
DiskSpaceLimit	MegaBytes	Defines a limit of the total disk usage for the disk partition where DataSetPath resides, beyond which this thread will no longer write data to the disk. This parameter essentially allows you to reserve disk space for other threads by limiting the amount of disk space consumed by this thread, and can help prevent disk space exhaustion as well. Default for DiskSpaceLimit is 1000 MB.
FileRetain	number	Invokes the program to clean up files. The program removes the oldest files and leaves the number of files to retain on a per DataSetPath per day per router ID per aggregation scheme basis. To disable this, set FileRetain to 0. For example, setting FileRetain 10 with a total of 12 files causes the program to delete the two oldest files.
Port	value	The UDP port used by the router to report the traffic.
Version		NetFlow export version expected on the port configured with this thread. Valid values are V1 or V5.
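
The attributes in Table 3-1 can be modeled as a small data structure that writes a thread definition in the required order. The following Python sketch is illustrative and is not part of the FlowCollector: the class name ThreadDef and its field names are hypothetical, and it handles only a single optional filter. It applies the rules stated in this section (thread name of up to 18 alphanumeric characters, Port before Version, FileRetain last).

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class ThreadDef:
    """Hypothetical in-memory form of one thread definition."""
    name: str
    aggregation: str
    period_minutes: int
    data_set_path: str
    port: int
    version: str = "V1"
    state: str = "Active"
    disk_space_limit_mb: int = 1000   # default given in Table 3-1
    file_retain: int = 0              # 0 disables file cleanup
    filter_name: Optional[str] = None

    def validate(self):
        if not (self.name.isalnum() and len(self.name) <= 18):
            raise ValueError("thread name: up to 18 alphanumeric characters")
        if self.version not in ("V1", "V5"):
            raise ValueError("Version must be V1 or V5")
        if self.state not in ("Active", "Inactive"):
            raise ValueError("State must be Active or Inactive")
        if self.period_minutes <= 0:
            raise ValueError("Period must be a positive number of minutes")

    def render(self):
        """Return the thread as nfconfig.file text, one attribute per line."""
        self.validate()
        lines = [f"Thread {self.name}"]
        if self.filter_name:
            lines.append(f"Filter {self.filter_name}")
        lines += [
            f"Aggregation {self.aggregation}",
            f"Period {self.period_minutes}",
            f"Port {self.port}",                 # Port precedes Version
            f"Version {self.version}",
            f"State {self.state}",
            f"DataSetpath {self.data_set_path}",
            f"DiskSpaceLimit {self.disk_space_limit_mb}",
            f"FileRetain {self.file_retain}",    # must be the last attribute
        ]
        return "\n".join(lines)


foo = ThreadDef(name="foo", aggregation="SourceNode", period_minutes=30,
                data_set_path="/opt/CSCOnfc/Data", port=9991, file_retain=24)
print(foo.render())
```

With a Period of 30, this thread writes 60 / 30 = 2 files every hour.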

Disk Space Management

Depending on the volume of flow data being exported from your router(s), as well as the FlowCollector configuration parameters you use, the FlowCollector can consume large amounts of disk space in a short period of time. The FlowCollector provides several configuration parameters and features which will help you manage your disk space usage:

Filters
Aggregation schemes
FileRetain configuration parameter
DiskSpaceLimit configuration parameter

As described earlier, a filter can help you discard any flow data which is not of interest to you. By using filters to ensure you are storing only that data which is of interest, you can potentially reduce the amount of disk space which will be used by the FlowCollector.

Aggregation schemes are used to define how you want the FlowCollector to summarize the flow data being exported from your router(s). By using only those aggregation schemes which are required for your application, and, when possible, by selecting the aggregation scheme(s) which generate the least amount of data on disk, you can reduce the amount of disk space which will be used by the FlowCollector. For example, using the HostMatrix aggregation scheme will result in less disk space usage than would the DetailHostMatrix scheme. Of course, which aggregation schemes you use will be determined primarily by the data you are interested in and how you want to summarize that data. It's important to realize, however, that the different aggregation schemes can greatly affect the amount of disk space used by the FlowCollector.

The FileRetain configuration parameter allows you to limit the number of files the FlowCollector will retain on disk per day for the given thread. By setting the FileRetain parameter to a lower value, you will reduce the amount of disk space used by the FlowCollector by reducing the number of files the application will retain. Note that the FileRetain parameter applies only to files created within the same day; therefore, FileRetain does not necessarily help limit disk space consumption over time.
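
The effect of FileRetain on one group of files can be sketched as follows. This Python fragment is illustrative only and does not reflect how the FlowCollector is implemented: the function name files_to_delete and the file names are hypothetical, and the input is assumed to be the files of a single DataSetPath, day, router ID, and aggregation scheme.

```python
def files_to_delete(files, file_retain):
    """Return the names of the oldest files beyond the FileRetain count.

    files: list of (name, creation_time) tuples for ONE group
           (same DataSetPath, day, router ID, and aggregation scheme).
    file_retain: number of files to keep; 0 disables cleanup.
    """
    if file_retain <= 0:
        return []
    oldest_first = sorted(files, key=lambda item: item[1])
    excess = max(len(oldest_first) - file_retain, 0)
    return [name for name, _ in oldest_first[:excess]]


# Twelve hypothetical files, FileRetain 10: the two oldest are removed.
group = [(f"router1.{i:04d}", i) for i in range(12)]
print(files_to_delete(group, 10))   # ['router1.0000', 'router1.0001']
```

Because the count is applied per day, files from earlier days are not touched by this logic, which is why FileRetain alone does not bound disk usage over time.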

The DiskSpaceLimit configuration parameter allows you to limit the total amount of disk space used in the partition containing DataSetPath, beyond which the FlowCollector will no longer write data to the disk for the given thread. For example, if the total usage of the DataSetPath partition is currently at 400 MB, and the DiskSpaceLimit parameter is set to 405 MB, data buffered in the FlowCollector for the given thread will be written to disk only if the resulting files will be less than 5 MB in size. If the resulting files would be 5 MB or larger, they will not be written to disk, the buffered data will be discarded, and an error message will be written to the log. This condition will persist until additional disk space is made available.

Assuming you have other techniques to prevent disk space exhaustion, setting the DiskSpaceLimit parameter to 0 produces optimal performance, because the FlowCollector can then bypass expensive disk space checks.
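
The DiskSpaceLimit decision described above amounts to a single comparison. The following Python sketch restates the 400 MB and 405 MB example; the function name may_write is hypothetical, and the code illustrates the rule as documented here rather than the FlowCollector's actual implementation.

```python
def may_write(partition_usage_mb, pending_file_mb, disk_space_limit_mb):
    """Return True if buffered data for a thread would be written to disk."""
    if disk_space_limit_mb == 0:
        # A limit of 0 bypasses the disk space check entirely.
        return True
    # Data is written only if usage plus the new files stays below the limit.
    return partition_usage_mb + pending_file_mb < disk_space_limit_mb


print(may_write(400, 4, 405))   # True: resulting files are smaller than 5 MB
print(may_write(400, 5, 405))   # False: 5 MB or larger, data is discarded
print(may_write(400, 50, 0))    # True: no check is performed
```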

Finally, the DiskSpaceLimit parameter does not apply to the RawFlows aggregation scheme, because this scheme cannot tolerate the performance overhead associated with disk space checks.

The features and parameters mentioned above will help you to manage disk space usage by the FlowCollector. You may need to employ your own file archival and deletion techniques to save older data files and prevent disk space exhaustion.

Aggregation Schemes

Table 3-2 lists the predefined aggregation schemes available to a NetFlow user. The aggregation scheme determines the type of information that is aggregated and stored in the output files. You can specify only one aggregation scheme per thread.

Table 3-2: Aggregation Schemes and Descriptions
Scheme	Description
RawFlows	The output from this aggregation scheme is stored in binary data files, each holding n minutes' worth of data, where n is set by the Period keyword.
SourceNode	The output from this aggregation scheme creates one row for every source address present in the received flow export data. The fields retained in the output are source address, sum of all packets sent, total number of bytes sent (octets) from the source, and number of flows aggregated into this row.
DestNode	The output from this aggregation scheme creates one row for every destination address present in the input data. The fields retained in the output are destination address, sum of all packets sent, total number of bytes sent (octets) to the destination, and number of flows aggregated into this row.
DetailDestNode	The output from this aggregation scheme creates one row for every destination address present in the input data. The fields retained in the output are destination address, source port, destination port, protocol name, sum of all packets sent, total number of bytes sent (octets) to this destination address, and number of flows summarized in the row.
HostMatrix	The output from this aggregation scheme creates one row for every matching source and destination pair present in the input data. The fields retained in the output are source address, destination address, sum of all packets exchanged between the pair, sum of bytes exchanged, and number of flows that were aggregated into this row.
DetailHostMatrix	The output from this aggregation scheme creates one row for every source and destination pair present in the input data. The fields retained in the output are srcaddr, dstaddr, srcport, dstport, and protocol fields. For each unique record identified by the key, sum of packets, total number of bytes, firstFlowStamp, lastFlowStamp, and number of flows summarized into this record are written.
SourcePort	The output from this aggregation scheme creates one row for every source port present after the information in the nfknown.srcports file is applied. The fields retained in the output are source port, sum of all packets sent, the total number of bytes sent (octets) from this port, and number of flows summarized in this record.
DestPort	The output from this aggregation scheme creates one row for every destination port present after the information in the nfknown.dstports file is applied. The fields retained in the output are destination port, sum of all packets sent, total number of bytes sent (octets) to this port, and number of flows summarized in this record.
Protocol	The output from this aggregation scheme creates one row for every protocol present in the input data. The fields retained in the output are protocol name, sum of all packets, and total number of bytes (octets) exchanged based on the protocol. Protocols not known by the NetFlow FlowCollector are listed as "Others." Known protocols are defined in the nfknown.protocols file.
DetailInterface	The fields retained in the output are source address, destination address, input interface index, output interface index, next hop, sum of all packets sent and received on the interface, total number of bytes sent (octets) to and from the interface, and number of flows summarized in the row.
CallRecord	The NetFlow usage record generation scheme for billing/accounting applications. Refer to the section entitled "CallRecord Aggregation Scheme Output File Example" in the chapter "Using the NetFlow FlowCollector."
ASMatrix	The output from this aggregation scheme creates one row for every matching source and destination AS number present in the input data. The fields are source AS number, destination AS number, sum of packets exchanged, total number of bytes, and the total number of flows summarized into this record.
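
Most of the schemes in Table 3-2 differ only in which fields form the row key. The following Python sketch illustrates that idea for SourceNode and HostMatrix; it is not the FlowCollector's implementation, and the function name aggregate, the dictionary keys, and the sample addresses are hypothetical.

```python
from collections import defaultdict


def aggregate(flows, key_fields):
    """Sum packets, octets, and flow counts for each distinct key."""
    rows = defaultdict(lambda: {"packets": 0, "octets": 0, "flows": 0})
    for flow in flows:
        key = tuple(flow[name] for name in key_fields)
        row = rows[key]
        row["packets"] += flow["packets"]
        row["octets"] += flow["octets"]
        row["flows"] += 1
    return dict(rows)


# Hypothetical flow records.
flows = [
    {"srcaddr": "10.0.0.1", "dstaddr": "10.0.0.2", "packets": 20, "octets": 2000},
    {"srcaddr": "10.0.0.1", "dstaddr": "10.0.0.3", "packets": 5, "octets": 400},
    {"srcaddr": "10.0.0.2", "dstaddr": "10.0.0.1", "packets": 30, "octets": 1000},
]

source_node = aggregate(flows, ("srcaddr",))            # one row per source
host_matrix = aggregate(flows, ("srcaddr", "dstaddr"))  # one row per pair

print(len(source_node), len(host_matrix))   # 2 3
```

A longer key produces more rows for the same input, which is one reason the detailed schemes consume more disk space than their summary counterparts.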

In the following example, thread foo uses the SourceNode aggregation scheme. The FlowCollector creates an output file in the directory /opt/CSCOnfc/Data every 30 minutes and keeps the last 24 files:

Thread 	foo
Aggregation	SourceNode 
Period 	30	
Port	9991
Version	V1
State 	Active	
DataSetpath	/opt/CSCOnfc/Data
DiskSpaceLimit	1000
FileRetain	24

Note that you should not define two active threads that use the same aggregation scheme and DataSetPath, because this would cause the FlowCollector to produce an unusable data file.
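
A configuration can be checked for this condition before it is deployed. The following Python sketch is a hypothetical pre-flight check, not a FlowCollector feature; the function name conflicting_threads and the dictionary keys are assumptions made for illustration.

```python
from collections import defaultdict


def conflicting_threads(threads):
    """Return groups of active thread names that share scheme and DataSetPath.

    threads: list of dicts with the keys name, state, aggregation,
             and data_set_path (hypothetical representation).
    """
    groups = defaultdict(list)
    for thread in threads:
        if thread["state"].lower() == "active":
            key = (thread["aggregation"], thread["data_set_path"])
            groups[key].append(thread["name"])
    return [names for names in groups.values() if len(names) > 1]


threads = [
    {"name": "A", "state": "Active", "aggregation": "SourceNode",
     "data_set_path": "/opt/CSCOnfc/Data"},
    {"name": "B", "state": "Active", "aggregation": "DestPort",
     "data_set_path": "/opt/CSCOnfc/Data"},
    {"name": "C", "state": "Active", "aggregation": "SourceNode",
     "data_set_path": "/opt/CSCOnfc/Data"},
]
print(conflicting_threads(threads))   # [['A', 'C']]
```

Threads A and B may share a directory because their aggregation schemes differ; A and C conflict.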

Creating a Filter

A filter defines what data is included or excluded by the thread. The filter can contain multiple permit and deny keywords. The command syntax for a filter is

filter filter-name
permit type value mask
deny type value mask

You can filter a transport layer protocol. The protocol in the nfknown.protocols file can contain multiple source and destination ports.

Note The protocol must also be defined in the nfknown.protocols file. For more information, refer to the section entitled "Setting Up Protocols and Ports."

In the following example, the filter backbone permits input data based on the next hop matching the IP address 171.69.4.0 with a mask of 0.0.0.255 and IP address 171.69.5.0 with a mask of 0.0.255.255.

filter backbone
permit nexthop 171.69.4.0 0.0.0.255
permit nexthop 171.69.5.0 0.0.255.255
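
The masks in this example behave like the wildcard masks of a router access list: a 1 bit in the mask means that the corresponding address bit is ignored. The following Python sketch shows that comparison under this assumption; the function name wildcard_match is hypothetical and the code is not taken from the FlowCollector.

```python
import ipaddress


def wildcard_match(address, base, wildcard):
    """Return True if address equals base in every bit the mask does not ignore."""
    addr_bits = int(ipaddress.IPv4Address(address))
    base_bits = int(ipaddress.IPv4Address(base))
    # Invert the wildcard mask to obtain the bits that must match.
    care = ~int(ipaddress.IPv4Address(wildcard)) & 0xFFFFFFFF
    return (addr_bits & care) == (base_bits & care)


# First condition of the backbone filter: any next hop in 171.69.4.x matches.
print(wildcard_match("171.69.4.17", "171.69.4.0", "0.0.0.255"))    # True
print(wildcard_match("171.69.6.1", "171.69.4.0", "0.0.0.255"))     # False
# Second condition: the last two octets are ignored.
print(wildcard_match("171.69.200.9", "171.69.5.0", "0.0.255.255")) # True
```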

Table 3-3 contains descriptions of the filter keywords and variables.

Table 3-3: Filter Keywords, Variables, and Descriptions

Keyword	Variable	Description
filter	name	Unique name of the filter. Can be up to 20 alphanumeric characters.
permit or deny	type	You can define one or more permit and deny actions as required. The permit keyword keeps the data that matches the specified filter type and value. The deny keyword rejects the data that matches the specified filter type and value.
	value	All filter types require a value. Refer to Table 3-4 for a description of filter types and values.
	mask	Subnet mask for the filter type. Not all filter types require a mask.

Filter criteria are similar to those of a router's access list. When defining a filter, keep in mind the following:

You must use an explicit permit statement to permit a flow; otherwise, the flow is denied.

The default condition for a filter is to deny the flow. For example:

filter kill-www
	deny Dstport 80

In this example, flows going to port 80 are denied explicitly, and all other flows are denied by default. If you want to deny only flows to port 80, you need an explicit wildcard condition that permits all other flows. For example:

filter kill-www
	deny	Dstport	80	
	permit	Dstaddr	0.0.0.0	255.255.255.255

When multiple filter conditions exist, the FlowCollector application attempts to apply the conditions sequentially, in the order you specify, until a match is found.

Filter Types

Table 3-4 describes the filter types available, the type of input required for the value, and whether the value requires a mask.

Table 3-4: Filter Types, Values, and Their Descriptions

Keyword	Value	Mask Required	Description
srcaddr	Source IP address	Yes	Filter the input data based on the source IP address
dstaddr	Destination IP address	Yes	Filter the input data based on the destination IP address
Protocol	Protocol ID	No	Takes a protocol label from the nfknown.protocols file
Prot	"prot" field in the input flow data	No	As in the /etc/protocols file of your workstation
srcport	Source port number	No	Filter the input data based on the source port number
dstport	Destination port number	No	Filter the input data based on the destination port number
srcinterface	Source interface number	No	Filter the input data based on the source interface number
dstinterface	Destination interface number	No	Filter the input data based on the destination interface number
nexthop	Next hop IP address	Yes	Filter the input data based on the next hop IP address

Example

Each line starting with permit or deny is a filter condition; therefore, filterA in the following example contains four filter conditions. The first condition states that all flows from network 171.69.1.0 are permitted. Because conditions are applied in the order in which they appear, filterA allows a flow with Srcaddr = 171.69.1.2 and Srcport = 53: the flow matches the first condition before the deny condition for port 53 is reached. The filterA example:

Filter	filterA
permit	Srcaddr	171.69.1.24 		0.0.0.255
deny	Srcaddr	204.233.0.0 	0.0.255.255
deny 	Srcport 	53
permit  	Dstaddr 	0.0.0.0 	255.255.255.255

If you want to permit traffic from 171.69.1.0 but deny traffic coming from port 53, you should rearrange the above filter conditions this way:

Filter		filterA
deny		Srcaddr		204.233.0.0		0.0.255.255
deny		Srcport		53
permit		Srcaddr		171.69.1.24		0.0.0.255
permit		Srcaddr		0.0.0.0		255.255.255.255

The last filter condition overrides the default behavior, which calls for denying all flows that do not match any of the first three filter conditions.
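
The effect of condition order can be checked with a small first-match evaluator. The following Python sketch is illustrative only: the function names addr_matches and permits, the tuple layout, and the destination address in the sample flow are hypothetical, and masks are assumed to be access-list style wildcard masks. It evaluates both orderings of filterA against the flow discussed above, and the kill-www filter against a flow to another port.

```python
import ipaddress


def addr_matches(address, base, wildcard):
    """Compare two IPv4 addresses, ignoring bits set to 1 in the wildcard mask."""
    care = ~int(ipaddress.IPv4Address(wildcard)) & 0xFFFFFFFF
    return (int(ipaddress.IPv4Address(address)) & care) == \
           (int(ipaddress.IPv4Address(base)) & care)


def permits(conditions, flow):
    """Apply conditions in order; the first match decides, otherwise deny."""
    for action, field_type, value, mask in conditions:
        if mask is None:
            matched = flow[field_type] == value
        else:
            matched = addr_matches(flow[field_type], value, mask)
        if matched:
            return action == "permit"
    return False  # default condition: deny the flow


original = [
    ("permit", "srcaddr", "171.69.1.24", "0.0.0.255"),
    ("deny", "srcaddr", "204.233.0.0", "0.0.255.255"),
    ("deny", "srcport", 53, None),
    ("permit", "dstaddr", "0.0.0.0", "255.255.255.255"),
]
rearranged = [
    ("deny", "srcaddr", "204.233.0.0", "0.0.255.255"),
    ("deny", "srcport", 53, None),
    ("permit", "srcaddr", "171.69.1.24", "0.0.0.255"),
    ("permit", "srcaddr", "0.0.0.0", "255.255.255.255"),
]
kill_www = [("deny", "dstport", 80, None)]

flow = {"srcaddr": "171.69.1.2", "srcport": 53,
        "dstaddr": "192.0.2.7", "dstport": 25}

print(permits(original, flow))     # True: first condition matches
print(permits(rearranged, flow))   # False: the port 53 deny matches first
print(permits(kill_www, flow))     # False: no match, so the default applies
```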

nfconfig.file Example

In the following example, NetFlow export traffic (Version 1) is arriving on UDP port 9991. The application requires data to be aggregated by the following aggregation schemes:

SourceNode (Source IP address)
DestPort (Destination port number)

Further, the first aggregation scheme should include only the traffic that passed through the router 171.97.23.65, whereas the latter should not include traffic that came from network 171.69.0.0. Two threads (A and B) are created to accomplish this. Each thread writes the aggregated data every 10 minutes to the /opt/CSCOnfc/Data directory. Two filters (applA and applB) are created to accomplish the desired filtering:

filter applA
permit nexthop 	171.97.23.65	0.0.0.0
filter applB
deny srcaddr 	171.69.0.0 		0.0.255.255
permit srcaddr 	0.0.0.0 		255.255.255.255
Thread A
Filter applA
Aggregation SourceNode
Period 10
Port 9991
version V1
State Active
DataSetpath /opt/CSCOnfc/Data
DiskSpaceLimit	1000
FileRetain 0
Thread B
filter applB
Aggregation DestPort
Period 10
Port 9991
Version V1
State Active
DataSetpath /opt/CSCOnfc/Data
DiskSpaceLimit 	1000
FileRetain 0
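
The layout of this file is regular enough to be read with a few lines of code. The following Python sketch is a hypothetical reader, not the FlowCollector's own parser: the function name parse_nfconfig and the returned structures are assumptions. It also assumes that keywords are not case-sensitive (the examples in this chapter mix cases) and relies on the rule that FileRetain is the last attribute of a thread to tell where a thread ends.

```python
def parse_nfconfig(text):
    """Split nfconfig.file text into filter and thread definitions."""
    filters, threads = {}, {}
    current_filter = None
    current_thread = None
    for raw in text.splitlines():
        tokens = raw.split()
        if not tokens:
            continue
        keyword, args = tokens[0].lower(), tokens[1:]
        if keyword == "thread":
            current_thread = {}
            threads[args[0]] = current_thread
            current_filter = None
        elif current_thread is not None:
            # Inside a thread, every line (including Filter) is an attribute.
            current_thread[keyword] = args[0]
            if keyword == "fileretain":      # last attribute closes the thread
                current_thread = None
        elif keyword == "filter":
            current_filter = []
            filters[args[0]] = current_filter
        elif keyword in ("permit", "deny"):
            if current_filter is None:
                raise ValueError("condition outside of a filter definition")
            current_filter.append((keyword, *args))
        else:
            raise ValueError(f"unexpected keyword: {tokens[0]}")
    return filters, threads


sample = """\
filter applA
permit nexthop 171.97.23.65 0.0.0.0
Thread A
Filter applA
Aggregation SourceNode
Period 10
Port 9991
Version V1
State Active
DataSetpath /opt/CSCOnfc/Data
DiskSpaceLimit 1000
FileRetain 0
"""
filters, threads = parse_nfconfig(sample)
print(filters["applA"])
print(threads["A"]["aggregation"], threads["A"]["port"])
```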

Setting Up Protocols and Ports

Use the information in this section to specify the protocols and ports for which you would like to collect data. The FlowCollector recognizes these protocols and ports (specified in the nfknown.protocols, nfknown.srcports, and nfknown.dstports files) when aggregating data. The FlowCollector uses the following files, which are listed in the nf.resources file, to determine which protocols and ports are recognized.

nfknown.protocols--This file contains definitions of protocols that you want the FlowCollector application to recognize.
nfknown.srcports--This file contains the source port numbers that you want the FlowCollector to recognize.
nfknown.dstports--This file contains the destination port numbers that you want the FlowCollector to recognize.

When a protocol or port is not known (that is, when it is not in the previously listed files), the information for its associated traffic flows is summarized under "Others."

Figure 3-2 shows an example of a typical communication session between host A and host B. In this example, the FlowCollector aggregates data for the Telnet protocol from source port 23 and destination port 9001. Whether this information is stored in data files for later retrieval depends on how the FlowCollector is customized.

Figure 3-2: Data Collection Example

Table 3-5 lists the raw data received in the data collection example.

Table 3-5: Data Received Example

Source Address	Destination Address	Source Port	Destination Port	Protocol	Packets	Bytes
A	B	23	9001	6	20	2000
B	A	9001	23	6	30	1000
A	B	20	9002	6	20	200
B	A	9002	20	6	50	300

When you add the Telnet protocol definition to the nfknown.protocols file, the output data file of the Protocol aggregation scheme contains a row for Telnet with 50 packets (20 plus 30) and 3000 bytes (2000 plus 1000). It also contains a row for "Others" with 70 packets (20 plus 50) and 500 bytes (200 plus 300).
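
The arithmetic above can be reproduced from the rows of Table 3-5. The following Python sketch is illustrative and is not the FlowCollector's implementation: the function names and dictionary keys are hypothetical, and the Telnet definition is assumed to match source port 23 OR destination port 23 with protocol type 6.

```python
# Assumed Telnet definition: Srcport 23 OR Dstport 23, Prot 6.
TELNET = {"name": "telnet", "ports": [("srcport", 23), ("dstport", 23)], "prot": 6}


def classify(flow, protocols):
    """Return the name of the first known protocol that matches, else 'Others'."""
    for proto in protocols:
        port_hit = any(flow[side] == port for side, port in proto["ports"])
        if flow["prot"] == proto["prot"] and port_hit:
            return proto["name"]
    return "Others"


def aggregate_by_protocol(flows, protocols):
    """Sum packets and bytes per protocol label."""
    totals = {}
    for flow in flows:
        row = totals.setdefault(classify(flow, protocols),
                                {"packets": 0, "bytes": 0})
        row["packets"] += flow["packets"]
        row["bytes"] += flow["bytes"]
    return totals


# Rows of Table 3-5.
flows = [
    {"src": "A", "dst": "B", "srcport": 23, "dstport": 9001, "prot": 6, "packets": 20, "bytes": 2000},
    {"src": "B", "dst": "A", "srcport": 9001, "dstport": 23, "prot": 6, "packets": 30, "bytes": 1000},
    {"src": "A", "dst": "B", "srcport": 20, "dstport": 9002, "prot": 6, "packets": 20, "bytes": 200},
    {"src": "B", "dst": "A", "srcport": 9002, "dstport": 20, "prot": 6, "packets": 50, "bytes": 300},
]

print(aggregate_by_protocol(flows, [TELNET]))
# {'telnet': {'packets': 50, 'bytes': 3000}, 'Others': {'packets': 70, 'bytes': 500}}
```

With an empty protocol list, all four rows fall under "Others" (120 packets, 3500 bytes).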

Configuring Protocols

When you add a protocol to the nfknown.protocols file, the FlowCollector application aggregates the traffic statistics associated with the protocol. If you remove the protocol from the nfknown.protocols file, information for that protocol is no longer recognized and is aggregated under the label "Others."

The protocols listed in the nfknown.protocols file are used by the aggregation schemes and protocol filters you define in the nfconfig.file. To configure the protocols that the FlowCollector recognizes, you must edit the nfknown.protocols file to include the following information:

Protocol name
Source or destination port
Protocol type

Table 3-6 shows the nfknown.protocols file options and their descriptions.

Table 3-6: nfknown.protocols File Options

Option	Value/Number	Description
protocol	name	Filter specification based on the transport layer protocol name.
srcport or dstport	Number	Source port number or destination port number.
OR		Required when you have more than one srcport/dstport to provide a Boolean OR functionality.
prot	Value	Protocol type, as listed in the /etc/protocols file of your workstation.

The known protocols (such as www, telnet, ftp) should match those in the /etc/services file of your workstation. For information about protocols and protocol types supported, refer to the protocols file in the /etc directory on your workstation.

The command syntax for a protocol is

protocol <name> 
	[<port> <Number> [OR <port> <Number>]]
prot <value>

Where <port> is srcport or dstport.

The following example shows the contents of the nfknown.protocols file provided with the FlowCollector application. In this example, the application recognizes the following protocols: telnet, ftp, tftp, and www-tcp originating or terminating on the specified ports. You can specify multiple source and destination ports by using the OR option. In the first protocol example, the FlowCollector recognizes traffic flows for all Telnet sessions terminating on port 23.

Protocol telnet
	Dstport 23
	Prot 6
Protocol ftp
	Srcport 20 OR Dstport 20 OR Srcport 21 OR Dstport 21
	Prot 6
Protocol tftp
	Srcport 69 OR Dstport 69
	Prot 17
Protocol www-tcp
	Srcport 80 OR Dstport 80 
	Prot 6

Modifying the Source and Destination Port Files

The FlowCollector uses the contents of the nfknown.srcports and nfknown.dstports files in source port and destination port aggregation schemes. When you add a port to either of these two files, traffic to or from the port is counted separately in the output file. All other traffic is counted as "Others."

The following example shows the contents of the nfknown.srcports file provided with the FlowCollector. In this example, the application recognizes ports 1 through 24 and 6000. You can specify a range of ports by using a comma to separate the numbers.

1, 24
6000
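
The range notation can be expanded as follows. This Python sketch is a hypothetical reader for the format shown above, not the FlowCollector's own parser; the function name parse_port_file is an assumption, and a line is taken to hold either a single port or a low and high port separated by a comma.

```python
def parse_port_file(text):
    """Return the set of ports named in nfknown.srcports style text."""
    ports = set()
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        numbers = [int(part) for part in line.split(",")]
        if len(numbers) == 1:
            ports.add(numbers[0])                 # single port
        elif len(numbers) == 2:
            low, high = numbers
            ports.update(range(low, high + 1))    # inclusive range
        else:
            raise ValueError(f"unexpected line: {line!r}")
    return ports


known = parse_port_file("1, 24\n6000\n")
print(len(known), 23 in known, 25 in known, 6000 in known)   # 25 True False True
```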
