1

I have searched a lot about intrusion detection systems, but I am now confused about where to start. I don't know whether any reusable open source code exists, but I want to build an intrusion detection and prevention system with a neural network.

From a developer's point of view, my question is: where should I begin? Kindly guide me on this topic.

I am also currently working with and analysing the KDD CUP 1999 dataset, and I am looking for more such datasets.

Kindly tell me which algorithms are best suited to building an intrusion detection system.

Thanks to whoever reads or replies. Kindly guide me on this. Thanks in advance.

Mat
Hemang Rami
  • What kind of intrusion detection are you talking about? Intrusion into a network, server, etc? – WaelJ Sep 28 '11 at 21:27
  • It is a network-based IDPS (Intrusion Detection & Prevention System). I want to build the IDS with a neural network. It will be installed on both the server and the hosts. – Hemang Rami Oct 01 '11 at 11:30

2 Answers

3

I study the same subject: intrusion detection and machine learning. It is a rather broad subject. I will answer mostly from the point of view of data preprocessing and feature construction. The neural network part is a different story altogether.

First of all, this area is heavily commercialized, so there are almost no open source code examples. A lot of things are done commercially in a closed ecosystem.

From an academic perspective, there is a big dataset problem. DK99C (the DARPA / KDD99 dataset) exists, but it is very old. The KDD99 dataset was constructed from the DARPA tcpdump captures; they used the Bro IDS and the tcpdump API to construct the features. In my experience, it is a lot harder to create features from raw tcpdump output than to run machine learning algorithms (neural networks) on ready-made features.
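
To give a feel for what "constructing features" means here, below is a minimal sketch of two time-window features in the spirit of the KDD99 `count` and `srv_count` attributes (connections to the same host, and to the same service, within the last two seconds). The `Connection` record, its field names, and the choice to include the current connection in the counts are assumptions made for this illustration; it is not the code used to build the dataset, and turning raw packets into such connection records is the hard part this sketch skips.

```python
from collections import deque
from dataclasses import dataclass


@dataclass
class Connection:
    """One summarised connection (hypothetical record layout for this sketch)."""
    timestamp: float  # seconds since the start of the capture
    dst_host: str
    service: str


def window_features(connections, window=2.0):
    """Return a (count, srv_count) pair for every connection.

    count     = connections to the same destination host within `window` seconds
    srv_count = connections to the same service within `window` seconds
    The current connection is included in both counts (an assumption).
    `connections` must already be sorted by timestamp.
    """
    recent = deque()
    features = []
    for conn in connections:
        recent.append(conn)
        # Drop connections that have fallen out of the time window.
        while conn.timestamp - recent[0].timestamp > window:
            recent.popleft()
        count = sum(1 for c in recent if c.dst_host == conn.dst_host)
        srv_count = sum(1 for c in recent if c.service == conn.service)
        features.append((count, srv_count))
    return features


if __name__ == "__main__":
    trace = [
        Connection(0.0, "10.0.0.5", "http"),
        Connection(0.5, "10.0.0.5", "http"),
        Connection(1.0, "10.0.0.9", "http"),
        Connection(3.5, "10.0.0.5", "ftp"),
    ]
    for conn, feats in zip(trace, window_features(trace)):
        print(conn, feats)
```

Even this toy version needs connection start times, destination hosts and service names, none of which come for free from a raw packet capture.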

Read this article to learn more about how KDD99 was constructed:

Article (Lee2000framework) Lee, W. & Stolfo, S. J.
A framework for constructing features and models for intrusion detection systems
ACM Trans. Inf. Syst. Secur., ACM, 2000, 3, 227-261

Read this article and its presentation to learn why this subject is a hard problem to study.

Inproceedings (Sommer2010Outside) Sommer, R. & Paxson, V.
Outside the Closed World: On Using Machine Learning for Network Intrusion Detection
Proceedings of the 2010 IEEE Symposium on Security and Privacy, IEEE Computer Society, 2010, 305-316

Read this article to see how most academics work on this subject. A bit disappointing, really.

Article (Tavallaee2010Toward) Tavallaee, M.; Stakhanova, N. & Ghorbani, A.
Toward Credible Evaluation of Anomaly-Based Intrusion-Detection Methods
Systems, Man, and Cybernetics, Part C: Applications and Reviews, IEEE Transactions on, 2010, 40, 516-524

Read this to see why DK99C is considered harmful. It is harmful, but no other credible dataset exists.

Article (Brugger2007KDD) Brugger, S. 
KDD Cup’99 dataset (Network Intrusion) considered harmful 
KDnuggets newsletter, 2007, 7, 15

Read this for a taxonomy of IDS data preprocessing:

Article (Davis2011Data) Davis, J. J. & Clark, A. J.
Data preprocessing for anomaly based network intrusion detection: A review
Computers & Security, 2011, 30, 353-375
Atilla Ozgur
  • Thank you for your answer, Atilla Ozgur. I will comment again after reading all those articles. It would also be helpful if you could give me your email address for further questions. Thanks again. – Hemang Rami Nov 26 '11 at 15:53
  • @HemangRami I am subscribed to IDS and machine learning questions on the Stack Exchange network. If you are in doubt, ask a question here. Some questions may be better suited to other sites in the network. That way it will be a discussion for all of us. If necessary, we can talk in a chat. – Atilla Ozgur Nov 27 '11 at 21:27
  • Yeah, sure. I would like to chat with you too. – Hemang Rami Dec 03 '11 at 16:40
  • Can you tell me how to extract those 41 attributes from a live system? Is there any ready-made code available? – Hemang Rami Dec 13 '11 at 18:09
  • @HemangRami No ready-made code is available. Read the Lee2000framework article to learn more about the 41 attributes. – Atilla Ozgur Dec 15 '11 at 14:34
1

Most intrusion detection systems that use neural networks rely on supervised training; in this setting, that means the system prompts you for an opinion when certain changes are requested on its host. I suggest you start by finding out how to hook change requests. On Windows, that could involve using a system hook to filter certain actions requested by applications. This gives your app the option of prompting you for a response, and over time those responses are fed into the neural net. The resulting dataset can then be used to optimize the recognition of certain patterns and of your responses to those patterns. There are obviously more things to consider when building a system like this, but this should give you a good start.
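
As a rough illustration of feeding responses into the net over time, here is a minimal sketch that uses a single logistic neuron (the smallest possible "neural network", standing in for whatever architecture you actually choose) and updates it once per user answer. The class name `FeedbackNeuron`, the two example features, and the learning rate are all hypothetical; the hooking of change requests itself is not shown.

```python
import math


class FeedbackNeuron:
    """A single logistic neuron trained online from allow/block answers."""

    def __init__(self, n_features, learning_rate=0.5):
        self.weights = [0.0] * n_features
        self.bias = 0.0
        self.learning_rate = learning_rate

    def predict(self, features):
        """Estimated probability that the event is malicious."""
        z = self.bias + sum(w * x for w, x in zip(self.weights, features))
        return 1.0 / (1.0 + math.exp(-z))

    def learn(self, features, user_says_malicious):
        """One stochastic-gradient step on the log loss for one answered prompt."""
        target = 1.0 if user_says_malicious else 0.0
        error = self.predict(features) - target
        self.weights = [w - self.learning_rate * error * x
                        for w, x in zip(self.weights, features)]
        self.bias -= self.learning_rate * error


if __name__ == "__main__":
    # Hypothetical features: [writes_to_system_dir, is_signed_binary]
    neuron = FeedbackNeuron(n_features=2)
    for _ in range(200):
        neuron.learn([1.0, 0.0], user_says_malicious=True)
        neuron.learn([0.0, 1.0], user_says_malicious=False)
    print(neuron.predict([1.0, 0.0]))  # should end up above 0.5
    print(neuron.predict([0.0, 1.0]))  # should end up below 0.5
```

Once the predicted probability is confidently high or low, the app could act on its own and only prompt for the uncertain cases in between.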

Romaine Carter
  • Thank you for your answer, it helped me. I have one more question: the KDD CUP dataset has 41 attributes as input and the 42nd as output, and I am currently training on it. How can I capture those 41 attributes from a live system? Is there an API available for this, or do we have to look at something else? Please explain with an example; it would be really helpful. – Hemang Rami Oct 22 '11 at 07:12
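
For reference, the 41-input / 1-output layout mentioned in the comment above can be sketched as follows. This only covers reading an already-built record, not capturing one from a live system. The sample line uses illustrative values in the KDD CUP 1999 CSV layout, and the function name `split_record` is made up for this sketch. It assumes that columns 2 to 4 (protocol type, service, flag) are the symbolic attributes and that the label carries a trailing dot, as in the original data files.

```python
# Zero-based positions of the symbolic attributes: protocol_type, service, flag.
SYMBOLIC_COLUMNS = {1, 2, 3}

# Illustrative record in the KDD CUP 1999 CSV layout (41 attributes + label).
SAMPLE_LINE = (
    "0,tcp,http,SF,181,5450,"
    "0,0,0,0,0,1,0,0,0,0,0,0,0,0,0,0,"
    "8,8,0.00,0.00,0.00,0.00,1.00,0.00,0.00,"
    "9,9,1.00,0.00,0.11,0.00,0.00,0.00,0.00,0.00,"
    "normal."
)


def split_record(line):
    """Split one CSV line into (features, label).

    Fields 1-41 are the inputs and field 42 is the class label.
    Symbolic attributes stay as strings; everything else becomes a float.
    """
    fields = line.strip().split(",")
    if len(fields) != 42:
        raise ValueError(f"expected 42 fields, got {len(fields)}")
    features = [
        value if index in SYMBOLIC_COLUMNS else float(value)
        for index, value in enumerate(fields[:41])
    ]
    label = fields[41].rstrip(".")  # drop the trailing dot on the label
    return features, label


if __name__ == "__main__":
    features, label = split_record(SAMPLE_LINE)
    print(len(features), label)
    print(features[:6])
```

The symbolic columns still need an encoding (one-hot, for example) before they can be fed to a neural network.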