Background: I am using the KDD99 data set with the Weka library to predict IDS attacks. Training works fine; attack prediction is based on around 42 features. In a real-time environment, however, when I use a sniffer to capture packets, I may not be able to extract all 42 features from a packet, nor would that be required. I would get around 10 features.

I am new to data mining and the Weka library. The problem is that I would have used all 42 features of the training data set to train the network, but the test data has only 10 features.

Do I need to train the network on only the 10 features that will actually be captured, or is there a way to train it on all 42 features and, at classification time, ask it to consider only those 10? In other words, is there a way to perform attribute selection during classification?

Can anyone share a Java code snippet if there is a solution?

The warning about KDD99 being outdated is useful, and many thanks for it, but I am still wondering: if the test data has fewer features than the training data, how should this be addressed? What would be the ideal way to solve it in Weka?
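To make the first option concrete (training on only the features that can be captured), here is a minimal plain-Java sketch, not a Weka solution. It assumes each instance is a `double[]` row and that the capturable features are known by column index. The class name `FeatureProjection`, the method names, and the indices in `main` are hypothetical and chosen only for illustration. The idea is that the same projection is applied to the training rows and to every captured row, so both end up with an identical attribute layout.

```java
import java.util.Arrays;

public class FeatureProjection {

    /** Returns a new row holding only the columns listed in keep, in that order. */
    public static double[] project(double[] row, int[] keep) {
        double[] out = new double[keep.length];
        for (int i = 0; i < keep.length; i++) {
            out[i] = row[keep[i]];
        }
        return out;
    }

    /** Applies the same projection to every row of a data set. */
    public static double[][] projectAll(double[][] rows, int[] keep) {
        double[][] out = new double[rows.length][];
        for (int r = 0; r < rows.length; r++) {
            out[r] = project(rows[r], keep);
        }
        return out;
    }

    public static void main(String[] args) {
        // Hypothetical indices of the features the sniffer can provide.
        int[] capturable = {0, 2, 5};

        // Toy stand-in for the full-width training data (6 columns instead of 42).
        double[][] training = {
            {0.1, 1.0, 2.0, 3.0, 4.0, 5.0},
            {0.2, 1.5, 2.5, 3.5, 4.5, 5.5}
        };

        // Reduce the training data to the capturable columns before training.
        double[][] reduced = projectAll(training, capturable);
        System.out.println(Arrays.deepToString(reduced));

        // A captured row that still has the full layout goes through the same
        // projection, so it matches the reduced training data column for column.
        double[] captured = {0.3, 0.0, 2.2, 0.0, 0.0, 5.1};
        System.out.println(Arrays.toString(project(captured, capturable)));
    }
}
```

In Weka itself, this kind of column projection is normally done with an attribute filter such as `weka.filters.unsupervised.attribute.Remove` applied to the training `Instances` before building the classifier; that part is not shown here.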

Thanks in advance.

  • **STOP USING THIS DATA SET.** It is 100% useless for intrusion detection. It is **not real, and not up-to-date**. There is absolutely no point in trying to extract these features from real data, and you will detect 0 actual attacks this way. – Has QUIT--Anony-Mousse Apr 19 '15 at 11:50
  • possible duplicate of [How to derive KDD99 Features from DARPA pcap file?](http://stackoverflow.com/questions/14090121/how-to-derive-kdd99-features-from-darpa-pcap-file) – Has QUIT--Anony-Mousse Apr 19 '15 at 11:53
  • Thanks for the quick warning. I was working on a PoC for my master's thesis, and this helped me out. I searched for the UNB ISCX Intrusion Detection Evaluation DataSet but couldn't get access to it; the link appears to be broken. Can you suggest any other data set that is up to date and suitable? – user3836311 Apr 19 '15 at 13:18
  • I googled whether KDD99 can still be considered a baseline for IDS research and found these links, which say it can still be used for research, although it can't detect some of the newer attacks: http://adsabs.harvard.edu/abs/2008SPIE.6973E..16T http://eprints.iisc.ernet.in/26885/1/darpa.pdf – user3836311 Apr 19 '15 at 14:48
  • Never trust Indian papers, sorry. Indian and Chinese publishers accept everything for a little money. This paper is badly incorrect on so many levels (for example, the $ character is not part of the command on page 8). And essentially, they showed that real IDS systems do not *care* about the "attacks" present in the DARPA set (because vulnerable systems virtually no longer exist - do you still run unpatched Win NT and Win 3.11?!?) – Has QUIT--Anony-Mousse Apr 19 '15 at 16:42
  • Thank you once again. Which data set would you recommend? Can you give me a link to download it? – user3836311 Apr 19 '15 at 16:59
  • I'm not doing much intrusion detection. Today's attacks are mostly SQL injection; these *cannot* be detected by TCP features like those in the KDD Cup data set. You need deep packet inspection for that. I have *not* looked deeper into this, but this data set may be much more meaningful: http://users.aber.ac.uk/pds7/csic_dataset/csic2010http.html – Has QUIT--Anony-Mousse Apr 19 '15 at 17:41
  • Note that this data set is also simulated web traffic, and given that it is from 2010, it may also be outdated again... although some things like XSS still exist, of course. – Has QUIT--Anony-Mousse Apr 19 '15 at 17:50
  • Thank you so much for the pointers to the data set; I have looked into it. Sorry to trouble you, but I have two questions. 1. How would I detect intrusions of type DoS, probe, or worm? 2. The data set covers different attacks such as "SQL injection, buffer overflow, information gathering, files disclosure, CRLF injection, XSS etc", but the attacks are not labeled by type. How would I drill down to classify them into different categories? – user3836311 Apr 20 '15 at 14:18
