Questions tagged [rapidminer]

RapidMiner is an environment for machine learning, data mining, text mining, predictive analytics, and business analytics. RapidMiner is written in Java and, from version 6.5, provides an open source version as well as an enterprise version with additional features. There is also an API for writing your own extensions.

511 questions
9 votes, 4 answers

Which data mining tool to use?

Can somebody explain to me the main pros and cons of the best-known open-source data mining tools? Everywhere I read that RapidMiner, Weka, Orange, and KNIME are the best ones. Look at this blog post. Can somebody do a quick technical comparison in a small…
user2670818
8 votes, 2 answers

RapidMiner: how to add a 'label' attribute to a dataset?

I want to apply a decision tree learning algorithm to a dataset I have imported from a CSV. The problem is that the "tra" input of the Decision Tree block is still red, stating "Input example set must have special attribute 'label'.". How do I add…
fstab
5 votes, 3 answers

In RapidMiner, once I import a data set, how do I change the type of a column?

I've imported a dataset into RapidMiner 5 and one of the columns that was supposed to be nominal or polynominal was set as numeric. My data set has over 500 attributes so I don't really want to have to reimport my data every time I realize I've made…
Paul Mendoza
5 votes, 1 answer

How to detect and delete noise in RapidMiner?

I am new to RapidMiner 5 and just want to know how to find noise in my data, show it in a chart, and delete it.
H.Ghassami
5 votes, 3 answers

Integration of RapidMiner in Java application

I have a text classification process in RapidMiner. It reads the test data from a specified Excel sheet and does the classification. I also have a small Java application which just runs this process. Now I want to make the file input part in my…
ArmMiner
5 votes, 2 answers

Different results from LOF implementation in ELKI and RapidMiner

I have written my own implementation of LOF and I'm trying to compare results with the implementations in ELKI and RapidMiner, but all 3 give different results! I'm trying to work out why. My reference dataset is one-dimensional, 102 real values…
Michael D.
5 votes, 2 answers

Clustering algorithm appropriate for very small clusters

I am trying to find duplicates in a list of about 5000 records. Each record is a person's name and address, but all typed inconsistently into one field, so I'm trying a fuzzy matching approach. My methodology (using RapidMiner) is to do some…
aquavitae
4 votes, 1 answer

Data mining for significant variables (numerical): Where to start?

I have a trading strategy on the foreign exchange market that I am attempting to improve upon. I have a huge table (100k+ rows) that represents every possible trade in the market, the type of trade (buy or sell), the profit/loss after that trade…
Mike Furlender
3 votes, 1 answer

Is there a process for munging data from many different formats in RapidMiner?

I'm trying to help my team streamline a data ingestion process that is taking up a substantial amount of time. We receive data in multiple formats and with attributes arranged differently. Is there a way using RapidMiner to create a process…
Robert1er
3 votes, 2 answers

Predicting a numeric attribute through high dimensional nominal attributes

I'm having difficulties mining a big (100K entries) dataset of mine concerning logistics transportation. I have around 10 nominal String attributes (i.e. city/region/country names, customers/vessel identification codes, etc.). Along with those, I…
hildebro
3 votes, 2 answers

Extracting text with XPath with different nodes

I'm currently trying to extract some text from a website with XPath and RapidMiner. I want to extract the "270€" from the following code:
+ 270 €
I tried the following…
Marius
3 votes, 0 answers

How to pass a batch file with date parameters to 'Execute Program' operator in RapidMiner

I have two context-level macros defined in my RapidMiner process: from_date = //some date// and to_date = //some date//. In the RapidMiner process, I'm using the Execute Program operator. This operator should accept the two macros (defined…
user2014
3 votes, 1 answer

RapidMiner: Explaining decision tree parameters

I am very new to RapidMiner and data mining in general, but I have attempted a cursory search for what all of the parameters in RapidMiner's decision tree mean and came up lacking. I know what a leaf is and a node and am at the…
HammockKing
3 votes, 1 answer

W-Apriori in RapidMiner

I need to create association rules using the Apriori algorithm in RapidMiner, but I can't seem to make it work. I'm using the 5.3.1 Weka extension. I've already created the association rules using the built-in FP-Growth and Create Associations operators,…
Dth
3 votes, 3 answers

Feature Selection in dataset containing both string and numerical values?

Hi, I have a big dataset which has both string and numerical values, e.g. user name (str), handset (str), number of requests (int), number of downloads (int), ... I have around 200 such columns. Is there a way/algorithm which can handle both strings…
cryp