In data mining, anomaly detection (also outlier detection) is the identification of items, events or observations which do not conform to an expected pattern or other items in a dataset.
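As a rough illustration of that definition, the sketch below flags values that do not conform to the rest of a small sample by using the modified z-score, which is based on the median and the median absolute deviation. The function name `mad_outliers`, the sample `readings`, and the cut-off of 3.5 are assumptions chosen for illustration; this is one simple rule among many, not the method any of the questions below rely on.

```python
from statistics import median


def mad_outliers(values, threshold=3.5):
    """Return indices of values whose modified z-score exceeds threshold."""
    med = median(values)
    # Median absolute deviation: a spread estimate that outliers barely affect.
    mad = median(abs(v - med) for v in values)
    if mad == 0:
        return []  # no measurable spread, so this rule cannot rank anything
    # 0.6745 rescales the MAD so the score is comparable to a standard z-score
    # for roughly normal data.
    return [i for i, v in enumerate(values)
            if 0.6745 * abs(v - med) / mad > threshold]


readings = [10.1, 9.8, 10.0, 10.3, 9.9, 42.0, 10.2]  # hypothetical sample
print(mad_outliers(readings))  # prints [5]
```

The median-based rule is used here instead of mean and standard deviation because a single extreme value can inflate the standard deviation enough to hide itself.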
Questions tagged [anomaly-detection]
442 questions
23 votes, 2 answers
Dataflow anomaly analysis warnings from PMD
I am using Eclipse with the PMD Plug-in (4.0.0.v20130510-1000) and get a lot of those violations:
Found 'DD'-anomaly for variable 'freq' (lines '187'-'189').
Found 'DU'-anomaly for variable 'freq' (lines '189'-'333').
In this SO answer, it says that…

brimborium
9 votes, 1 answer
Working Example Of Luminol Anomaly Detection And Correlation Library By LinkedIn
GitHub link to the Luminol library: https://github.com/linkedin/luminol
Can anyone explain, with sample code, how to use this module to find anomalies in a data set?
I want to use this module for finding the anomalies in my time series data.
P.S.:…

Ashish
8 votes, 1 answer
One Class SVM algorithm taking too long
The data below shows part of my dataset, which is used to detect anomalies:
describe_file data_numbers index
0 gkivdotqvj 7309.0 0
1 hpwgzodlky 2731.0 1
2 dgaecubawx 0.0 2
3 NaN …

E199504
7 votes, 2 answers
Isolation Forest vs Robust Random Cut Forest in outlier detection
I am examining different methods for outlier detection. I came across sklearn's implementation of Isolation Forest and Amazon SageMaker's implementation of RRCF (Robust Random Cut Forest). Both are ensemble methods based on decision trees, aiming to…

Houssam Metni
7 votes, 4 answers
What is the difference between Real-time Anomaly Detection and Anomaly Detection?
Hence, the following question arises: What is a clear definition of Real-time Anomaly Detection?
I am investigating the field of Anomaly Detection, and in many papers the approach is described as Real-time, while in many others it is simply called Anomaly…

dadadima
7 votes, 1 answer
How to detect anomalies in time series data (specifically) with trend and seasonality present?
I want to detect the outliers in "time series data" which contains trend and seasonality components. I want to leave out the peaks which are seasonal, consider only the other peaks, and label them as outliers. As I am new to time series…

Raja Sahe S
7 votes, 1 answer
How to monitor message rate in Kafka topics?
How can I get alerted when the message rate in some topic is higher or lower than usual?

marosbfm
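One possible way to approach this kind of alerting, sketched under assumptions: keep an exponentially weighted running mean and variance of the per-interval message count and flag any interval that falls more than `k` standard deviations from the running mean. The class `RateMonitor`, its parameters `alpha`, `k` and `warmup`, and the sample `counts` are all hypothetical. The code does not call any Kafka API; it assumes the per-interval counts are already being collected by some other means.

```python
class RateMonitor:
    """Flag per-interval counts that stray from an exponentially weighted baseline."""

    def __init__(self, alpha=0.2, k=3.0, warmup=5):
        self.alpha = alpha    # weight given to the newest observation
        self.k = k            # width of the tolerated band, in standard deviations
        self.warmup = warmup  # number of initial observations that never alert
        self.mean = None
        self.var = 0.0
        self.seen = 0

    def update(self, count):
        """Return True if count is unusually high or low for this stream."""
        self.seen += 1
        if self.mean is None:
            self.mean = float(count)
            return False
        deviation = count - self.mean
        std = self.var ** 0.5
        alert = self.seen > self.warmup and abs(deviation) > self.k * std
        if not alert:
            # Only normal-looking intervals update the baseline, so a spike
            # does not widen the band and mask the next one.
            self.mean += self.alpha * deviation
            self.var = (1 - self.alpha) * (self.var + self.alpha * deviation ** 2)
        return alert


# Hypothetical messages-per-minute counts for one topic.
counts = [100, 102, 98, 101, 99, 100, 101, 99, 100, 400, 101, 5]
monitor = RateMonitor()
alerts = [i for i, c in enumerate(counts) if monitor.update(c)]
print(alerts)  # prints [9, 11]
```

Because alerting intervals are excluded from the baseline, a permanent change in the normal rate would keep alerting until the monitor is reset; whether that is desirable depends on the use case.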
7 votes, 1 answer
What is the range of Scikit-Learn's IsolationForest decision_function scores?
Scikit-Learn's IsolationForest class has a method decision_function that returns the anomaly scores of the input samples. However, the documentation does not state what the possible range of these scores is, and only states that "the lower [the…

DataMan
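For background on where such scores come from, the sketch below reproduces the anomaly score from the original Isolation Forest paper, s(x, n) = 2^(-E[h(x)] / c(n)), in plain Python. The helper names `average_path_length`, `paper_score` and `decision_value` are hypothetical. The last function assumes the convention described in the scikit-learn documentation, where `score_samples` is the negated paper score and `decision_function` subtracts `offset_` (stated there to be -0.5 when `contamination="auto"`); treat that mapping as an assumption to verify against the installed version.

```python
import math

EULER_GAMMA = 0.5772156649  # used to approximate harmonic numbers


def average_path_length(n):
    """c(n): expected path length of an unsuccessful binary search tree lookup."""
    if n <= 1:
        return 0.0
    if n == 2:
        return 1.0
    harmonic = math.log(n - 1) + EULER_GAMMA  # H(n - 1), approximately
    return 2.0 * harmonic - 2.0 * (n - 1) / n


def paper_score(mean_depth, n):
    """Score in (0, 1]; values near 1 mean the point was isolated very quickly."""
    return 2.0 ** (-mean_depth / average_path_length(n))


def decision_value(mean_depth, n, offset=-0.5):
    """Assumed scikit-learn style value: negated paper score minus the offset."""
    return -paper_score(mean_depth, n) - offset


n = 256  # hypothetical sub-sample size per tree
print(paper_score(average_path_length(n), n))  # average depth gives 0.5
print(decision_value(0.0, n))                  # depth 0 gives -0.5
```

Under these assumptions the paper score lies in (0, 1], so the shifted value would lie in [-0.5, 0.5), with negative values marking likely outliers. With a numeric `contamination` the offset is derived from the training scores instead, so the range shifts accordingly.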
6 votes, 3 answers
Conversion of IsolationForest decision score to probability algorithm
I am looking to create a generic function to convert the output decision_scores of sklearn's IsolationForest into true probabilities [0.0, 1.0].
I am aware of, and have read, the original paper and I understand mathematically that the output of that…

artemis
6 votes, 1 answer
Isolation Forest in Python
I am currently working on detecting outliers in my dataset using Isolation Forest in Python, and I did not completely understand the example and explanation given in the scikit-learn documentation.
Is it possible to use Isolation Forest to detect outliers…

Nnn
5 votes, 1 answer
Is there a way to calculate feature importance at observation level in isolation forest?
I am using Isolation Forest in R to perform Anomaly Detection on multivariate data.
I tried calculating the anomaly scores along with the contribution of each individual metric to that score. I am able to get the anomaly score but facing problem…

Sidharth Agarwal
5 votes, 1 answer
Isolation Forest: Categorical data
I am trying to detect anomalies in a breast cancer dataset using Isolation Forest in sklearn. I am trying to apply Isolation Forest to a mixed data set and it gives me value errors when I fit the model.
This is my dataset:…

Nnn
5 votes, 1 answer
How to train an IsolationForest model so as to give the minimum number of false positives?
When using Isolation Forest for anomaly detection, should we train the model with only normal data or with a mix of normal and outlier data? Also, what is the best algorithm for anomaly detection on multivariate data? I want minimum…

Nir_AI
5 votes, 1 answer
Implementation of Excess-Mass or Mass-Volume curves
I am looking for an implementation of Excess-Mass or Mass-Volume curves which are used for the evaluation of unsupervised anomaly detection algorithms.
I'd prefer an implementation in Python but I could re-write it from any other language.
Thank…

Stergios
5 votes, 0 answers
Why is my LSTM model repeating the previous values?
I built a simple LSTM model in Keras as below:
model = Sequential()
model.add(keras.layers.LSTM(hidden_nodes, input_dim=num_features, input_length=window, consume_less="mem"))
model.add(keras.layers.Dense(num_features,…

Alessandro