I'm using softflowd + nfdump to create NetFlow data, and I store this data in a 2D array of strings:

flows = new string *[flows_len];
for (int i=0;i<flows_len;i++)
{
    flows[i] = new string[47];
}

I'm writing in C++. Each "row" in the array represents a flow record, and 47 is the number of different fields of NetFlow data as displayed by nfdump.

I would like to compute some statistics on a per-IP basis (for example, how many connections/flows there are per IP), but I can't figure out how to get the rows (flows) that share the same IP. The value of srcip is stored in flows[j][4], and I'm new to C++.

Thanks in advance!

billz

2 Answers

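This is a very, very, very simple example. It stores each flow as a vector of strings and prints every flow whose field 4 matches a given IP: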

#include <vector>
#include <string>
#include <iostream>
#include <stdio.h>
#include <stdlib.h>
#include <algorithm>
#include <iterator>

using namespace std;

typedef vector< string > StatInfo; // 47 entries

void print_stat_by_ip( const vector< StatInfo > & infos, const string & ip ) {
    for ( int i = 0, count = infos.size(); i < count; i++ ) {
        const StatInfo & info = infos[ i ];
        if ( info[ 4 ] == ip ) {
            copy( info.begin(), info.end(), ostream_iterator< string >( cout, ", " ) );
            cout << endl;
        }
    }
}

int main()
{
    vector< StatInfo > infos;

    for ( int i = 0; i < 10; i++ ) {
        StatInfo info;
        for ( int j = 0; j < 47; j++ ) { // just filling them "0", "1", "2", ... , "46"
            char c_str[ 42 ];
            sprintf( c_str, "%d", j ); 
            info.push_back( c_str );
        }
        char c_str[ 42 ];
        sprintf( c_str, "%d", rand() % 10 );
        info[ 4 ] = c_str;          // this will be an IP-address
        infos.push_back( info );

        copy( info.begin(), info.end(), ostream_iterator< string >( cout, ", " ) );
        cout << endl;
    }

    string ip_to_find = "5";
    cout << "----------------------------------------" << endl;
    cout << "stat for " << ip_to_find << endl;
    cout << "----------------------------------------" << endl;
    print_stat_by_ip( infos, ip_to_find );
}

You can find it here http://liveworkspace.org/code/3AAye8
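
If you would rather keep the raw 2D array from the question, counting flows per IP does not require knowing the addresses up front. The following is only a minimal sketch under that assumption: `count_flows_per_ip` and `kSrcIpColumn` are made-up names for illustration, and the column index 4 is taken from the question.

```cpp
#include <cstddef>
#include <map>
#include <string>

// Column that holds the source IP in each flow row (index taken from the question).
const std::size_t kSrcIpColumn = 4;

// Hypothetical helper: count how many rows (flows) carry each source IP.
// 'flows' is the string** table from the question, 'flows_len' its row count.
std::map<std::string, unsigned int>
count_flows_per_ip(std::string **flows, std::size_t flows_len)
{
    std::map<std::string, unsigned int> counts;
    for (std::size_t i = 0; i < flows_len; ++i)
        ++counts[flows[i][kSrcIpColumn]]; // operator[] creates unseen keys with value 0
    return counts;
}
```

Iterating over the returned map then visits every distinct IP exactly once, in sorted key order, with its flow count in `second`.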

borisbn

Honestly I would consider a rethink on your containers. The following uses a standard lib array, vector, and multimap to accomplish what I think you're looking for. The sample code just populates the table rows with the strings "A", "B", or "C" along with one of three IP addresses. The part you should pay special note to is the usage of the multimap to index your table based on IP address (though it could easily be retrofitted to do the same for any arbitrary column).

Note: there are plenty of people out there more proficient with the std lib algorithms, functions, and container usage than I. This is just to give you an idea of how a multimap may help in your possible solution.

EDIT: The OP wanted to see counts of the IP addresses in the table; the code for this has been appended to the tail of the main() function. Also updated to not use C++11 features (FlowInfo is now a std::vector rather than a std::array). Hopefully this is closer to something the OP can work with.

#include <iostream>
#include <iterator>
#include <algorithm>
#include <functional>
#include <map>
#include <vector>
#include <string>
using namespace std;

// some simple decls for our info, table, and IP mapping.
typedef std::vector<std::string> FlowInfo;
typedef std::vector<FlowInfo> FlowTable;

// a multi-map will likely work for what you want.
typedef std::multimap<std::string, const FlowInfo* > MapIPToTableIndex;

// a map of IP string-to-unsigned int for counting occurrences.
typedef std::map<std::string, unsigned int> MapStringToCount;

int main(int argc, char *argv[])
{
    // populate your flow table using whatever method you choose.
    //  I'm just going to push 10 rows of three ip addresses each.
    FlowTable ft;
    for (size_t i=0;i<10;++i)
    {
        FlowInfo fi(47); // note: always fixed at 47.

        for (size_t j=0;j<fi.size();++j)
            fi[j] = "A";
        fi[0][0]+=i;
        fi[4] = "192.168.1.1";
        ft.push_back(fi);

        for (size_t j=0;j<fi.size();++j)
            fi[j] = "B";
        fi[0][0]+=i;
        fi[4] = "192.168.1.2";
        ft.push_back(fi);

        for (size_t j=0;j<fi.size();++j)
            fi[j] = "C";
        fi[0][0]+=i;
        fi[4] = "192.168.1.3";
        ft.push_back(fi);
    }

    // map by IP address into something useful.
    MapIPToTableIndex infomap;
    for (FlowTable::const_iterator it = ft.begin(); it != ft.end(); ++it)
        infomap.insert(MapIPToTableIndex::value_type((*it)[4], &*it));


    // prove the map is set up properly. ask for all items in the map
    //  that have the 192.168.1.2 address.
    for (MapIPToTableIndex::const_iterator it = infomap.lower_bound("192.168.1.2");
         it != infomap.upper_bound("192.168.1.2"); ++it)
    {
        std::copy(it->second->begin(), it->second->end(),
                  ostream_iterator<std::string>(cout, " "));
        cout << endl;
    }

    // mine the IP occurrence rate from the table:
    MapStringToCount ip_counts;
    for (FlowTable::const_iterator it= ft.begin(); it!=ft.end(); ++it)
        ++ip_counts[ (*it)[4] ];

    // dump IPs by occurrence counts.
    for (MapStringToCount::const_iterator it = ip_counts.begin();
         it != ip_counts.end(); ++it)
    {
        cout << it->first << " : " << it->second << endl;
    }

    return 0;
}

Output

B B B B 192.168.1.2 B B B B B B B B B B B B B B B B B B B B B B B B B B B B B B B B B B B B B B B B B B 
C B B B 192.168.1.2 B B B B B B B B B B B B B B B B B B B B B B B B B B B B B B B B B B B B B B B B B B 
D B B B 192.168.1.2 B B B B B B B B B B B B B B B B B B B B B B B B B B B B B B B B B B B B B B B B B B 
E B B B 192.168.1.2 B B B B B B B B B B B B B B B B B B B B B B B B B B B B B B B B B B B B B B B B B B 
F B B B 192.168.1.2 B B B B B B B B B B B B B B B B B B B B B B B B B B B B B B B B B B B B B B B B B B 
G B B B 192.168.1.2 B B B B B B B B B B B B B B B B B B B B B B B B B B B B B B B B B B B B B B B B B B 
H B B B 192.168.1.2 B B B B B B B B B B B B B B B B B B B B B B B B B B B B B B B B B B B B B B B B B B 
I B B B 192.168.1.2 B B B B B B B B B B B B B B B B B B B B B B B B B B B B B B B B B B B B B B B B B B 
J B B B 192.168.1.2 B B B B B B B B B B B B B B B B B B B B B B B B B B B B B B B B B B B B B B B B B B 
K B B B 192.168.1.2 B B B B B B B B B B B B B B B B B B B B B B B B B B B B B B B B B B B B B B B B B B 
192.168.1.1 : 10
192.168.1.2 : 10
192.168.1.3 : 10
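
The lookup above asks for one known address, but the same multimap can also be walked one IP at a time without knowing any address in advance. This is just a sketch of that idea, reusing the typedef names from the code above; `flows_per_ip` is a hypothetical helper, not part of the original sample.

```cpp
#include <cstddef>
#include <map>
#include <string>
#include <utility>
#include <vector>

typedef std::vector<std::string> FlowInfo;
typedef std::multimap<std::string, const FlowInfo*> MapIPToTableIndex;

// Hypothetical helper: visit every distinct IP in the index and return
// (IP, number of flows) pairs, in the multimap's sorted key order.
std::vector<std::pair<std::string, std::size_t> >
flows_per_ip(const MapIPToTableIndex &index)
{
    std::vector<std::pair<std::string, std::size_t> > result;
    MapIPToTableIndex::const_iterator it = index.begin();
    while (it != index.end())
    {
        // equal_range yields [first, second) covering all rows with this key.
        std::pair<MapIPToTableIndex::const_iterator,
                  MapIPToTableIndex::const_iterator> range = index.equal_range(it->first);

        std::size_t n = 0;
        for (MapIPToTableIndex::const_iterator row = range.first; row != range.second; ++row)
            ++n; // row->second points at the FlowInfo; other per-IP statistics could go here

        result.push_back(std::make_pair(it->first, n));
        it = range.second; // jump to the first row of the next IP
    }
    return result;
}
```

Because each group is visited through `equal_range`, the inner loop is the natural place to accumulate anything else you need per address.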
WhozCraig
  • This seems really nice! But what if I don't know the IP? If I'm monitoring an interface and I want to count how many connections there are for each separate unknown IP? Also, because it's easier for me with the 2D array, can I do this without the vectors? Is the vector needed for the map? – drazenmozart Dec 29 '12 at 11:44
  • @drazenmozart The vector is there for the express purpose of allowing you to add as many rows as you need. If you want to use a fixed array, it's entirely up to you. Since I'm not familiar with what your definition of an "unknown" IP address is, I can only speculate on the precise output you're looking for. You want a list of unique IP addresses and their occurrence count in the table? You'll be surprised how easy that is to mine out of your table. – WhozCraig Dec 29 '12 at 11:55
  • @drazenmozart updated to mine the IP occurrence rate from the `FlowTable` instance. Hope that was what you were looking for. – WhozCraig Dec 29 '12 at 12:03
  • You are right, sorry for being unclear and, again, sorry for my poor C++! If my nfdump file has, for instance, 500 flows from 40 different IPs, I want to create new arrays with all flows for each IP, so I can compute statistics for each separate IP address (number of flows, bytes transferred, etc.). And I need this to run in real time, so I don't know the IP addresses and cannot request flows for particular IP addresses. I hope this is clear :) – drazenmozart Dec 29 '12 at 12:11
  • I'm getting an error for the #include <array>: In file included from /usr/include/c++/4.6/array:35:0, from main.cpp:8: I'm using NetBeans on Ubuntu 12. – drazenmozart Jan 04 '13 at 17:58
  • @drazenmozart Do you have a conforming C++11 compiler that implements `std::array<>` ? It does not appear so. if that is the case a little massaging to use `std::vector<>` instead would work. I can update the answer if that is the case, but note this answer also uses lambda expressions that are also C++11-featured. What is your toolchain? – WhozCraig Jan 04 '13 at 18:06
  • If i understand what u asked :p, in my netbeans under tool collection i have GNU,and compiler=g++,assembler=as,make command=make,debug=gdb – drazenmozart Jan 04 '13 at 18:20
  • @drazenmozart I can update the answer to use only C++03 and prior features, hopefully that will get you closer to what you need. Again, it is just a sample. – WhozCraig Jan 04 '13 at 18:31
  • thanks a lot! i tried to insert this to my source code, but at "for (MapIPToTableIndex::const_iterator it = infomap.lower_bound("xxx.xxx.xxx.xxx"); it != infomap.upper_bound("xxx.xxx.xxx.xxx"); ++it)" i get this: terminate called after throwing an instance of 'std::bad_alloc' what(): std::bad_alloc. I am creating a 'FlowTable' of 1600 'FlowsInfo',not so big i think, but i get this memory error!! any idea? – drazenmozart Jan 05 '13 at 21:26
  • 1600 FlowInfos isn't very big at all, especially since they're dynamically allocated. You're addressing all your indexes correctly, right? They're all 0-based, so 1600 FlowInfos is `i=0;i<1600`, and more importantly, each FlowInfo has at most 0..46 as a possible index into its string list. If you exceed that, you're bound to corrupt the heap. Run your program under valgrind; it will likely show you where the error is. – WhozCraig Jan 06 '13 at 01:46
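
One way to catch the kind of out-of-range index described in the last comment, assuming the vector-based `FlowTable` from the answer rather than the raw array, is to go through `vector::at()`, which throws `std::out_of_range` where `operator[]` would be undefined behaviour. A small sketch (`field` is a hypothetical helper name):

```cpp
#include <cstddef>
#include <stdexcept>
#include <string>
#include <vector>

typedef std::vector<std::string> FlowInfo;
typedef std::vector<FlowInfo> FlowTable;

// Hypothetical helper: fetch one field with range checking on both indexes.
// vector::at() throws std::out_of_range for a bad index, whereas operator[]
// performs no check and can silently corrupt memory.
const std::string &field(const FlowTable &table, std::size_t row, std::size_t column)
{
    return table.at(row).at(column);
}
```

With 47 fields per row, `field(ft, 0, 47)` throws instead of reading past the end, which usually points at the faulty index much sooner than a later `std::bad_alloc` does.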