|
Abstract : |
Recently there has been much interest in applying data mining to computer network intrusion detection. For the past two years, MITRE has been exploring how to make data mining useful in this context. This paper provides lessons learned in this task. Based upon our experiences in getting started on this type of project, we suggest data mining techniques to consider and types of expertise and infrastructure needed. This paper has two intended audiences: network security professionals with little background in data mining, and data mining experts with little background in network intrusion detection.

Key words: data mining, intrusion detection, computer network security

1. Network Intrusion Detection: What is it?

Intrusion detection starts with instrumentation of a computer network for data collection. Pattern-based software "sensors" monitor the network traffic and raise "alarms" when the traffic matches a saved pattern. Security analysts decide whether these alarms indicate an event serious enough to warrant a response. A response might be to shut down a part of the network, to phone the internet service provider associated with suspicious traffic, or to simply make note of unusual traffic for future reference.

If the network is small and signatures are kept up to date, the human analyst solution to intrusion detection works well. But when organizations have a large, complex network the human analysts quickly become overwhelmed by the number of alarms they need to review. The sensors on the MITRE network, for example, currently generate over one million alarms per day. And that number is increasing. This situation arises from ever-increasing attacks on the network, as well as a tendency for sensor patterns to be insufficiently selective (i.e., raise too many false alarms). Commercial tools typically do not provide an enterprise-level view of alarms generated by multiple sensor vendors.
Commercial intrusion detection software packages tend to be signature-oriented with little or no state information maintained. These limitations led us to investigate the application of data mining to this problem. |