Questions tagged [anomaly-detection]

In data mining, anomaly detection (also outlier detection) is the identification of items, events or observations which do not conform to an expected pattern or other items in a dataset.

442 questions
23 votes
2 answers

Dataflow anomaly analysis warnings from PMD

I am using Eclipse with the PMD Plug-in (4.0.0.v20130510-1000) and get a lot of these violations: Found 'DD'-anomaly for variable 'freq' (lines '187'-'189'). Found 'DU'-anomaly for variable 'freq' (lines '189'-'333'). In this SO answer, it says that…
brimborium
9 votes
1 answer

Working Example Of Luminol Anomaly Detection And Correlation Library By LinkedIn

GitHub link to the Luminol library: https://github.com/linkedin/luminol Can anyone explain, with sample code, how to use this module to find anomalies in a data set? I want to use this module to find the anomalies in my time series data. P.S.:…
Ashish
8 votes
1 answer

One Class SVM algorithm taking too long

The data below shows part of my dataset, which is used to detect anomalies:

   describe_file  data_numbers  index
0  gkivdotqvj     7309.0        0
1  hpwgzodlky     2731.0        1
2  dgaecubawx     0.0           2
3  NaN            …
E199504
7 votes
2 answers

Isolation Forest vs Robust Random Cut Forest in outlier detection

I am examining different methods for outlier detection. I came across sklearn's implementation of Isolation Forest and Amazon SageMaker's implementation of RRCF (Robust Random Cut Forest). Both are ensemble methods based on decision trees, aiming to…
7 votes
4 answers

What is the difference between Real-time Anomaly Detection and Anomaly Detection?

Hence the following question: What is a clear definition of Real-time Anomaly Detection? I am investigating the field of Anomaly Detection and in many papers the approach is described as Real-time, while in many others it is simply called Anomaly…
dadadima
7 votes
1 answer

How to detect anomalies in time series data (specifically) with trend and seasonality present?

I want to detect the outliers in "time series data" which contains trend and seasonality components. I want to leave out the peaks which are seasonal, consider only the other peaks, and label them as outliers. As I am new to time series…
Raja Sahe S
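
One possible way to approach the question above, offered as a minimal sketch rather than a definitive answer: remove a trend estimate and a per-phase seasonal profile, then flag residuals with a large modified z-score (based on the median absolute deviation). The function name `seasonal_outliers`, the assumption of a roughly linear trend, the known `period`, and the 3.5 threshold are all illustrative assumptions, not something stated in the question.

```python
from statistics import median


def seasonal_outliers(values, period, threshold=3.5):
    """Return indices of points that stand out after removing trend and seasonality.

    Assumes (hypothetically) a roughly linear trend and a known seasonal period.
    """
    n = len(values)
    if n < 2 * period:
        raise ValueError("need at least two full seasonal cycles")

    # Robust trend estimate: the median change between points one period
    # apart, divided by the period, gives a typical per-step slope.
    diffs = [values[i] - values[i - period] for i in range(period, n)]
    slope = median(diffs) / period
    detrended = [v - slope * i for i, v in enumerate(values)]

    # Seasonal profile: median detrended value at each position in the cycle.
    profile = [median(detrended[phase::period]) for phase in range(period)]
    residuals = [d - profile[i % period] for i, d in enumerate(detrended)]

    # Modified z-score: 0.6745 * |r - median| / MAD.
    centre = median(residuals)
    mad = median(abs(r - centre) for r in residuals)
    if mad == 0:
        return []  # degenerate case: at least half of the residuals are identical
    return [i for i, r in enumerate(residuals)
            if 0.6745 * abs(r - centre) / mad > threshold]
```

Because every step uses medians, a seasonal peak that repeats each cycle is absorbed into the profile, while a one-off peak remains in the residuals. A non-linear trend or a changing seasonal shape would need a proper decomposition method instead.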
7 votes
1 answer

How to monitor message rate in Kafka topics?

How can I get alerted when the message rate in some topic is higher or lower than usual?
marosbfm
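
A minimal sketch of the alerting logic this question asks about, assuming per-interval message counts are already being collected somewhere (for example by differencing a topic's end offsets at fixed intervals). The class `RateMonitor` and its parameters are hypothetical; nothing here talks to Kafka itself.

```python
from collections import deque
from statistics import mean, stdev


class RateMonitor:
    """Flags intervals whose message count deviates from the recent baseline."""

    def __init__(self, window=30, threshold=3.0, min_samples=5):
        self.history = deque(maxlen=window)  # recent "normal" counts
        self.threshold = threshold           # z-score needed to alert
        self.min_samples = min_samples       # warm-up before alerting

    def observe(self, count):
        """Record one interval's count; return 'high', 'low' or None."""
        verdict = None
        if len(self.history) >= self.min_samples:
            baseline = mean(self.history)
            spread = stdev(self.history)
            if spread > 0:
                z = (count - baseline) / spread
                if z > self.threshold:
                    verdict = "high"
                elif z < -self.threshold:
                    verdict = "low"
        # Only normal-looking intervals update the baseline, so a burst
        # does not immediately become the new normal.
        if verdict is None:
            self.history.append(count)
        return verdict
```

The trade-off of skipping flagged intervals is that a permanent change in traffic level keeps alerting until the monitor is reset; appending every count instead would adapt faster but lets a burst inflate the baseline.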
7 votes
1 answer

What is the range of Scikit-Learn's IsolationForest decision_function scores?

Scikit-Learn's IsolationForest class has a method decision_function that returns the anomaly scores of the input samples. However, the documentation does not state what the possible range of these scores is, and only states that "the lower [the…
DataMan
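
As background for the range question above, here is a sketch of the score from the original Isolation Forest paper, s(x, n) = 2^(-E[h(x)] / c(n)), which lies in (0, 1]. The mapping in `decision_value` is an assumption based on my reading of the scikit-learn documentation (`score_samples` returning the negated paper score, and `decision_function` subtracting `offset_`, which is -0.5 when `contamination="auto"`); the function names are illustrative and this is not scikit-learn's own code.

```python
import math

EULER_GAMMA = 0.5772156649


def average_path_length(n):
    """c(n): average path length of an unsuccessful binary-search-tree lookup."""
    if n <= 1:
        return 0.0
    if n == 2:
        return 1.0
    harmonic = math.log(n - 1) + EULER_GAMMA  # approximates H(n - 1)
    return 2.0 * harmonic - 2.0 * (n - 1) / n


def paper_score(mean_depth, n):
    """Anomaly score from the paper; requires n >= 2. Near 1 means anomalous."""
    return 2.0 ** (-mean_depth / average_path_length(n))


def decision_value(mean_depth, n, offset=-0.5):
    """Assumed mapping: negate the paper score, then subtract the offset."""
    return -paper_score(mean_depth, n) - offset
```

Under these assumptions the decision value stays roughly between -0.5 (most anomalous) and 0.5 (most normal), with 0 corresponding to a point whose mean depth equals c(n). With a numeric `contamination` the offset is fitted from the data, so the range shifts accordingly.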
6 votes
3 answers

Conversion of IsolationForest decision score to probability algorithm

I am looking to create a generic function to convert the output decision_scores of sklearn's IsolationForest into true probabilities [0.0, 1.0]. I am aware of, and have read, the original paper and I understand mathematically that the output of that…
artemis
6 votes
1 answer

Isolation Forest in Python

I am currently working on detecting outliers in my dataset using Isolation Forest in Python, and I did not completely understand the example and explanation given in the scikit-learn documentation. Is it possible to use Isolation Forest to detect outliers…
Nnn
5 votes
1 answer

Is there a way to calculate feature importance at observation level in isolation forest?

I am using Isolation Forest in R to perform Anomaly Detection on multivariate data. I tried calculating the anomaly scores along with the contribution of each individual metric to that score. I am able to get the anomaly score but am facing a problem…
5 votes
1 answer

Isolation Forest: Categorical data

I am trying to detect anomalies in a breast cancer dataset using Isolation Forest in sklearn. I am trying to apply Isolation Forest to a mixed data set and it gives me value errors when I fit the model. This is my dataset:…
5 votes
1 answer

How to train an IsolationForest model so as to give the minimum number of false positives?

When using Isolation Forest for anomaly detection, should we train the model with only normal data or with a mix of both normal and outlier data? Also, what is the best algorithm for anomaly detection on multivariate data? I want minimum…
5 votes
1 answer

Implementation of Excess-Mass or Mass-Volume curves

I am looking for an implementation of Excess-Mass or Mass-Volume curves which are used for the evaluation of unsupervised anomaly detection algorithms. I'd prefer an implementation in Python but I could re-write it from any other language. Thank…
5 votes
0 answers

Why is my LSTM model repeating the previous values?

I built a simple LSTM model in Keras as below: model = Sequential() model.add(keras.layers.LSTM(hidden_nodes, input_dim=num_features, input_length=window, consume_less="mem")) model.add(keras.layers.Dense(num_features,…
Alessandro