flows module

The flows module is part of the nmeta suite.

It provides an abstraction for conversations (flows), using a MongoDB database for storage and data retention maintenance.

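Flows are identified via an indexed, bi-directionally unique hash value derived from the IP-value-ordered 5-tuple (source and destination IP addresses, IP protocol, and transport-layer source and destination port numbers). Because the tuple is ordered by IP value before hashing, packets travelling in either direction of a conversation produce the same hash.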
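The ordering step can be illustrated with a minimal sketch. The function name, the use of SHA-256, and the string encoding of the tuple are assumptions made for illustration; the hash function nmeta actually uses is not stated here.

```python
import hashlib
import ipaddress


def flow_hash(ip_src, ip_dst, proto, tp_src, tp_dst):
    """Return a direction-independent hash of a 5-tuple (illustrative only)."""
    # Order the two endpoints by numeric IP value so that both directions
    # of a conversation produce the same tuple before hashing.
    if int(ipaddress.ip_address(ip_src)) > int(ipaddress.ip_address(ip_dst)):
        ordered = (ip_dst, ip_src, proto, tp_dst, tp_src)
    else:
        ordered = (ip_src, ip_dst, proto, tp_src, tp_dst)
    # SHA-256 is an assumption; any stable hash would show the same idea.
    return hashlib.sha256(repr(ordered).encode("utf-8")).hexdigest()


# The same conversation seen from each side.
forward = flow_hash("10.1.0.1", "10.1.0.2", 6, 43297, 80)
reverse = flow_hash("10.1.0.2", "10.1.0.1", 6, 80, 43297)
```

In this sketch `forward` and `reverse` are equal because the endpoints are swapped into a canonical order before hashing. Ordering on IP value alone does not canonicalise a flow whose source and destination addresses are identical.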

Ingesting a packet puts the flow object into the context of that packet and the flow it belongs to, and updates the database object for that flow with information from the packet.

There are various methods (see class docstring) that provide views into the state of the flow.

class flows.Flow(config)

Bases: baseclass.BaseClass

An object that represents a flow that we are classifying

Intended to provide an abstraction of a flow that classifiers can use to make determinations without having to understand implementation details such as database lookups.

Be aware that this module is not very mature yet. It does not cover some basic corner cases such as packet retransmissions and out-of-order or missing packets.

Read a packet_in event into flows (assumes class instantiated as an object called ‘flow’):

flow.ingest_packet(dpid, in_port, pkt, timestamp)

Variables available for Classifiers (assumes class instantiated as an object called ‘flow’):

Variables for the current packet:

flow.packet.flow_hash
The hash of the 5-tuple of the current packet
flow.packet.packet_hash
The hash of the current packet, used for deduplication. It is an indexed uni-directional packet identifier, derived from ip_src, ip_dst, proto, tp_src, tp_dst, tp_seq_src, tp_seq_dst
flow.packet.dpid
The DPID that the current packet was received from via a Packet-In message
flow.packet.in_port
The switch port that the current packet was received on before being sent to the controller
flow.packet.timestamp
The time in datetime format that the current packet was received at the controller
flow.packet.length
Length in bytes of the current packet on wire
flow.packet.eth_src
Ethernet source MAC address of current packet
flow.packet.eth_dst
Ethernet destination MAC address of current packet
flow.packet.eth_type
Ethertype of current packet in decimal
flow.packet.ip_src
IP source address of current packet
flow.packet.ip_dst
IP destination address of current packet
flow.packet.proto
IP protocol number of current packet
flow.packet.tp_src
Source transport-layer port number of current packet
flow.packet.tp_dst
Destination transport-layer port number of current packet
flow.packet.tp_flags
Transport-layer flags of the current packet
flow.packet.tp_seq_src
Source transport-layer sequence number (where existing) of current packet
flow.packet.tp_seq_dst
Destination transport-layer sequence number (where existing) of current packet
flow.packet.payload
Payload data of current packet
flow.packet.tcp_fin()
True if TCP FIN flag is set in the current packet
flow.packet.tcp_syn()
True if TCP SYN flag is set in the current packet
flow.packet.tcp_rst()
True if TCP RST flag is set in the current packet
flow.packet.tcp_psh()
True if TCP PSH flag is set in the current packet
flow.packet.tcp_ack()
True if TCP ACK flag is set in the current packet
flow.packet.tcp_urg()
True if TCP URG flag is set in the current packet
flow.packet.tcp_ece()
True if TCP ECE flag is set in the current packet
flow.packet.tcp_cwr()
True if TCP CWR flag is set in the current packet
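As a rough illustration of how flow.packet.packet_hash differs from flow.packet.flow_hash, the sketch below hashes the seven listed fields without reordering them, so the identifier is specific to one direction. The function names, the SHA-256 choice, and the tuple encoding are assumptions, not the module's actual implementation.

```python
import hashlib


def packet_hash(ip_src, ip_dst, proto, tp_src, tp_dst, tp_seq_src, tp_seq_dst):
    """Return a direction-specific packet identifier (illustrative only)."""
    # Fields are hashed in the order given, with no endpoint reordering,
    # so the reverse direction yields a different value.
    fields = (ip_src, ip_dst, proto, tp_src, tp_dst, tp_seq_src, tp_seq_dst)
    return hashlib.sha256(repr(fields).encode("utf-8")).hexdigest()


def deduplicate(packets):
    """Keep the first occurrence of each packet, keyed on its hash."""
    seen = set()
    unique = []
    for pkt in packets:
        digest = packet_hash(*pkt)
        if digest not in seen:
            seen.add(digest)
            unique.append(pkt)
    return unique


pkt_a = ("10.1.0.1", "10.1.0.2", 6, 43297, 80, 1000, 0)
pkt_b = ("10.1.0.2", "10.1.0.1", 6, 80, 43297, 5000, 1001)
# pkt_a appears twice, as it might if two switches reported the same packet.
unique = deduplicate([pkt_a, pkt_a, pkt_b])
```

Here `unique` holds two packets: the repeated `pkt_a` collapses to one entry, while `pkt_b` in the opposite direction is kept.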

Variables for the whole flow:

flow.packet_count()
Unique packets registered for the flow
flow.client()
The IP that is the originator of the flow (if known, otherwise 0)
flow.server()
The IP that is the destination of the flow (if known, otherwise 0)
flow.packet_direction()
Direction of the current packet in the flow: c2s (client to server) or s2c (server to client). The source of the first packet observed in the flow is assumed to be the client
flow.max_packet_size()
Size of largest packet in the flow
flow.max_interpacket_interval()
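Largest inter-packet time interval in the flow, assessed per direction, in seconds (float)
flow.min_interpacket_interval()
Smallest inter-packet time interval in the flow, assessed per direction, in seconds (float)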
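To show how a classifier might combine the whole-flow accessors listed above, here is a hedged sketch. The function `looks_like_small_packet_flow`, its thresholds, and the stub object are hypothetical; only `packet_count()` and `max_packet_size()` come from this documentation, and the stub stands in for a real Flow instance so the example runs without a database.

```python
from types import SimpleNamespace


def looks_like_small_packet_flow(flow, min_packets=5, size_limit=200):
    """Hypothetical check built only on the documented flow accessors."""
    # Defer a decision until enough packets have been observed.
    if flow.packet_count() < min_packets:
        return None
    # True when no packet in the flow exceeds the size limit.
    return flow.max_packet_size() <= size_limit


# Stand-in for a Flow object; a real one needs nmeta config and MongoDB.
stub_flow = SimpleNamespace(packet_count=lambda: 8, max_packet_size=lambda: 120)
result = looks_like_small_packet_flow(stub_flow)
```

Returning `None` until a minimum packet count is reached is a design choice of this sketch, so that callers can distinguish "not yet known" from a negative result.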

Variables for the whole flow relating to classification:

classification.TBD

Challenges (not handled - yet):
  • duplicate packets due to retransmissions or multiple switches in path
  • IP fragments
  • Flow reuse - TCP source port reused

class Classification(flow_hash, clsfn, time_limit)

Bases: object

An object that represents an individual traffic classification

commit()

Record current state of flow classification into MongoDB classifications collection.

dbdict()

Return a dictionary object of traffic classification parameters for storing in the database

class Flow.Packet

Bases: object

An object that represents the current packet

dbdict()

Return a dictionary object of metadata parameters of current packet (excludes payload), for storing in database

tcp_ack()

Does the current packet have the TCP ACK flag set?

tcp_cwr()

Does the current packet have the TCP CWR flag set?

tcp_ece()

Does the current packet have the TCP ECE flag set?

tcp_fin()

Does the current packet have the TCP FIN flag set?

tcp_psh()

Does the current packet have the TCP PSH flag set?

tcp_rst()

Does the current packet have the TCP RST flag set?

tcp_syn()

Does the current packet have the TCP SYN flag set?

tcp_urg()

Does the current packet have the TCP URG flag set?
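The tcp_*() methods above each test one TCP flag. A minimal sketch of such a test follows, assuming the flags are held as an integer bitmask (as in the TCP header); whether tp_flags is stored this way in nmeta is an assumption. The bit values are the standard TCP header flag positions, and the helper name is hypothetical.

```python
# Standard TCP header flag bits.
TCP_FIN = 0x01
TCP_SYN = 0x02
TCP_RST = 0x04
TCP_PSH = 0x08
TCP_ACK = 0x10
TCP_URG = 0x20
TCP_ECE = 0x40
TCP_CWR = 0x80


def tcp_flag_set(tp_flags, flag):
    """Return True if the given flag bit is set in the flags bitmask."""
    return bool(tp_flags & flag)


# A SYN+ACK segment has both the SYN and ACK bits set.
syn_ack = TCP_SYN | TCP_ACK
is_syn = tcp_flag_set(syn_ack, TCP_SYN)
is_fin = tcp_flag_set(syn_ack, TCP_FIN)
```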

Flow.client()

The IP that is the originator of the flow (if known, otherwise 0)

Finds first packet seen for the flow_hash within the time limit and returns the source IP
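The lookup described above can be sketched against an in-memory list instead of MongoDB. The record layout (dictionaries with flow_hash, timestamp and ip_src keys), the function name `find_client`, and the way the time limit is passed are assumptions for illustration.

```python
import datetime


def find_client(packets, flow_hash, now, time_limit):
    """Return the source IP of the earliest matching packet, or 0 if none."""
    cutoff = now - time_limit
    # Keep only packets of this flow that fall inside the time window.
    candidates = [
        p for p in packets
        if p["flow_hash"] == flow_hash and p["timestamp"] >= cutoff
    ]
    if not candidates:
        return 0  # client not known
    first = min(candidates, key=lambda p: p["timestamp"])
    return first["ip_src"]


t0 = datetime.datetime(2016, 1, 1, 12, 0, 0)
packets = [
    {"flow_hash": "abc", "timestamp": t0 + datetime.timedelta(seconds=1),
     "ip_src": "10.1.0.2"},
    {"flow_hash": "abc", "timestamp": t0, "ip_src": "10.1.0.1"},
    {"flow_hash": "xyz", "timestamp": t0, "ip_src": "10.9.9.9"},
]
now = t0 + datetime.timedelta(seconds=2)
client = find_client(packets, "abc", now, datetime.timedelta(seconds=30))
```

In this sketch `client` is "10.1.0.1", the source of the earliest packet recorded for flow "abc" within the window.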

Flow.ingest_packet(dpid, in_port, pkt, timestamp)

Ingest a packet into the packet_ins collection and put the flow object into the context of the packet. Note that timestamp MUST be a datetime object.

Flow.max_interpacket_interval()

Return the size of the largest inter-packet time interval in the flow (assessed per direction in flow) as seconds (type float)

Note: c2s = client to server direction; s2c = server to client direction

Note: results are slightly inaccurate due to floating point rounding.

Flow.max_packet_size()

Return the size of the largest packet in the flow (in either direction)

Flow.min_interpacket_interval()

Return the size of the smallest inter-packet time interval in the flow (assessed per direction in flow) as seconds (type float)

Note: c2s = client to server direction; s2c = server to client direction

Note: results are slightly inaccurate due to floating point rounding.
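One possible reading of "assessed per direction" for max_interpacket_interval() and min_interpacket_interval() is sketched below: gaps are measured between consecutive packets travelling in the same direction, and the largest or smallest gap across both directions is reported. That interpretation, the record layout, the function names, and the 0.0 result when no gap exists are all assumptions.

```python
import datetime


def interpacket_gaps(packets, client_ip):
    """Return gaps in seconds between consecutive same-direction packets."""
    stamps = {"c2s": [], "s2c": []}
    for p in sorted(packets, key=lambda p: p["timestamp"]):
        direction = "c2s" if p["ip_src"] == client_ip else "s2c"
        stamps[direction].append(p["timestamp"])
    gaps = []
    for times in stamps.values():
        for earlier, later in zip(times, times[1:]):
            gaps.append((later - earlier).total_seconds())
    return gaps


def max_interval(packets, client_ip):
    gaps = interpacket_gaps(packets, client_ip)
    return max(gaps) if gaps else 0.0  # 0.0 when no gap exists (assumption)


def min_interval(packets, client_ip):
    gaps = interpacket_gaps(packets, client_ip)
    return min(gaps) if gaps else 0.0  # 0.0 when no gap exists (assumption)


t0 = datetime.datetime(2016, 1, 1, 12, 0, 0)
ms = lambda n: datetime.timedelta(milliseconds=n)
packets = [
    {"ip_src": "10.1.0.1", "timestamp": t0},             # c2s
    {"ip_src": "10.1.0.2", "timestamp": t0 + ms(200)},   # s2c
    {"ip_src": "10.1.0.1", "timestamp": t0 + ms(1000)},  # c2s
    {"ip_src": "10.1.0.1", "timestamp": t0 + ms(1500)},  # c2s
    {"ip_src": "10.1.0.2", "timestamp": t0 + ms(2200)},  # s2c
]
largest = max_interval(packets, "10.1.0.1")
smallest = min_interval(packets, "10.1.0.1")
```

With this data the c2s gaps are 1.0 and 0.5 seconds and the single s2c gap is 2.0 seconds, so `largest` is 2.0 and `smallest` is 0.5.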

Flow.packet_count()

Return the number of packets in the flow (counting packets in both directions). This method should deduplicate packets that are received from multiple switches, but that deduplication is not yet implemented (TBD).

Works by retrieving packets from packet_ins database with current packet flow_hash and within flow reuse time limit.
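The counting step can be sketched over an in-memory list in place of the packet_ins collection. The record layout and the function name are assumptions, and the optional `deduplicate` flag shows only one way the outstanding deduplication might be approached (counting distinct packet_hash values); it is not a description of the module's behaviour.

```python
import datetime


def count_packets(packets, flow_hash, now, time_limit, deduplicate=False):
    """Count packets of a flow that fall inside the flow reuse time limit."""
    cutoff = now - time_limit
    matching = [
        p for p in packets
        if p["flow_hash"] == flow_hash and p["timestamp"] >= cutoff
    ]
    if deduplicate:
        # Count each distinct packet_hash once (hypothetical approach).
        return len({p["packet_hash"] for p in matching})
    return len(matching)


t0 = datetime.datetime(2016, 1, 1, 12, 0, 0)
sec = lambda n: datetime.timedelta(seconds=n)
packets = [
    {"flow_hash": "abc", "packet_hash": "p0", "timestamp": t0 - sec(600)},
    {"flow_hash": "abc", "packet_hash": "p1", "timestamp": t0},
    {"flow_hash": "abc", "packet_hash": "p1", "timestamp": t0 + sec(1)},
    {"flow_hash": "abc", "packet_hash": "p2", "timestamp": t0 + sec(2)},
    {"flow_hash": "xyz", "packet_hash": "p3", "timestamp": t0 + sec(2)},
]
now = t0 + sec(3)
raw_count = count_packets(packets, "abc", now, sec(30))
unique_count = count_packets(packets, "abc", now, sec(30), deduplicate=True)
```

The packet recorded 600 seconds before `t0` falls outside the 30 second window, so `raw_count` is 3, and `unique_count` is 2 because two of those records share a packet_hash.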

Flow.packet_direction()

Return the direction of the current packet in the flow where c2s is client to server and s2c is server to client.

Flow.server()

The IP that is the destination of the flow (if known, otherwise 0)

Finds first packet seen for the hash within the time limit and returns the destination IP

Flow.suppress_flow()

Set the suppressed attribute in the flow database object to the current packet count, so that future suppressions of the same flow can be backed off to avoid overwhelming the controller.