5

I have a text classification process in RapidMiner. It reads the test data from a specified Excel sheet and does the classification. I also have a small Java application which just runs this process. Now I want to handle the file input in my application, so that every time I can specify the Excel file from my application (not from RapidMiner). Any hints?

This is the code:

import com.rapidminer.RapidMiner;
import com.rapidminer.Process;
import com.rapidminer.example.Attribute;
import com.rapidminer.example.Example;
import com.rapidminer.example.ExampleSet;
import com.rapidminer.operator.IOContainer;
import com.rapidminer.operator.Operator;
import com.rapidminer.operator.OperatorException;
import com.rapidminer.operator.io.ExcelExampleSource;
import com.rapidminer.tools.XMLException;

import java.io.File;
import java.io.IOException;
import java.util.Iterator;

public class Classification {

    public static void main(String[] args) throws Exception {
        ExampleSet resultSet1 = null;
        IOContainer ioInput = null;
        IOContainer ioResult;
        try {
            RapidMiner.setExecutionMode(RapidMiner.ExecutionMode.COMMAND_LINE);
            RapidMiner.init();
            Process pr = new Process(new File("C:\\Users\\MP-TEST\\Desktop\\Rapid_Test\\Wieder_Model.rmp"));
            Operator op = pr.getOperator("Read Excel");
            op.setParameter(ExcelExampleSource.PARAMETER_EXCEL_FILE, "C:\\Users\\MP-TEST\\Desktop\\Rapid_Test\\HaendlerRatings_neu.xls");
            ioResult = pr.run(ioInput);
            if (ioResult.getElementAt(0) instanceof ExampleSet) {
                resultSet1 = (ExampleSet) ioResult.getElementAt(0);

                for (Example example : resultSet1) {
                    Iterator<Attribute> allAtts = example.getAttributes().allAttributes();
                    while (allAtts.hasNext()) {
                        Attribute a = allAtts.next();
                        if (a.isNumerical()) {
                            double value = example.getValue(a);
                            System.out.println(value);
                        } else {
                            String value = example.getValueAsString(a);
                            System.out.println(value);
                        }
                    }
                }
            }
        } catch (IOException | XMLException | OperatorException e) {
            // TODO Auto-generated catch block
            e.printStackTrace();
        }
    }
}

This is the error:

Apr 09, 2013 9:06:05 AM com.rapidminer.Process run
INFO: Process C:\Users\MP-TEST\Desktop\Rapid_Test\Wieder_Model.rmp starts
com.rapidminer.operator.UserError: A value for the parameter 'excel_file' must be specified! 
    at com.rapidminer.operator.nio.model.ExcelResultSetConfiguration.makeDataResultSet(ExcelResultSetConfiguration.java:316)
    at com.rapidminer.operator.nio.model.AbstractDataResultSetReader.createExampleSet(AbstractDataResultSetReader.java:127)
    at com.rapidminer.operator.io.AbstractExampleSource.read(AbstractExampleSource.java:52)
    at com.rapidminer.operator.io.AbstractExampleSource.read(AbstractExampleSource.java:1)
    at com.rapidminer.operator.io.AbstractReader.doWork(AbstractReader.java:126)
    at com.rapidminer.operator.Operator.execute(Operator.java:855)
    at com.rapidminer.operator.execution.SimpleUnitExecutor.execute(SimpleUnitExecutor.java:51)
    at com.rapidminer.operator.ExecutionUnit.execute(ExecutionUnit.java:711)
    at com.rapidminer.operator.OperatorChain.doWork(OperatorChain.java:379)
    at com.rapidminer.operator.Operator.execute(Operator.java:855)
    at com.rapidminer.Process.run(Process.java:949)
    at com.rapidminer.Process.run(Process.java:873)
    at com.rapidminer.Process.run(Process.java:832)
    at com.rapidminer.Process.run(Process.java:827)
    at Classification.main(Classification.java:29)

Best regards

Armen

ArmMiner

3 Answers

5

Works fine for me:

  • Download RapidMiner (and unzip the file)
  • From the "lib" directory, you need:
    1. rapidminer.jar
    2. launcher.jar
    3. All JARs in the "/lib/freehep" directory.
  • Add libraries 1, 2 and 3 to the classpath of your Java project
  • Copy this code and run it:


    import com.rapidminer.Process;
    import com.rapidminer.RapidMiner;
    import com.rapidminer.operator.Operator;
    import com.rapidminer.operator.OperatorException;
    import com.rapidminer.operator.io.ExcelExampleSource;
    import com.rapidminer.tools.XMLException;
    import java.io.File;
    import java.io.IOException;

    public class ReadRapidminerProcess {
      public static void main(String[] args) {
        try {
          RapidMiner.setExecutionMode(RapidMiner.ExecutionMode.COMMAND_LINE);
          RapidMiner.init();

          Process process = new Process(new File("/your_path/your_file.rmp"));
          process.run();

        } catch (IOException | XMLException | OperatorException ex) {
          ex.printStackTrace();
        }
      }
    }

I hope this helps; I searched a lot before finding the answer.

Toto
1

I see two ways to do that.

The first one would be to programmatically change the XML definition of your process. RapidMiner processes are specified by an XML file with the .rmp extension. In the file you will find the definition of the operator you wish to change. This is an excerpt from a simple process specifying the Read Excel operator:

<operator activated="true" class="read_excel" compatibility="5.3.005" expanded="true" height="60" name="Read Excel" width="90" x="313" y="75">
    <parameter key="excel_file" value="D:\file.xls"/>    <!-- HERE IS THE FILE PATH -->
    <parameter key="sheet_number" value="1"/>
    <parameter key="imported_cell_range" value="A1"/>
    <parameter key="encoding" value="SYSTEM"/>
    <parameter key="first_row_as_names" value="true"/>
    <list key="annotations"/>
    <parameter key="date_format" value=""/>
    <parameter key="time_zone" value="SYSTEM"/>
    <parameter key="locale" value="English (United States)"/>
    <list key="data_set_meta_data_information"/>
    <parameter key="read_not_matching_values_as_missings" value="true"/>
    <parameter key="datamanagement" value="double_array"/>
</operator>

I marked the line that holds the path to the Excel file with a comment. You can overwrite that value from your application. Just be careful not to break the XML file.
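A minimal sketch of this first approach, using only the JDK's DOM parser so the file stays well-formed. The class name `RmpExcelPathPatcher` and the method `setExcelFile` are made up for illustration, and the code assumes the .rmp layout matches the excerpt above (an `operator` element with a `name` attribute and direct `parameter` children). If the `excel_file` parameter is missing, it is created.

```java
import java.io.StringReader;
import java.io.StringWriter;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.dom.DOMSource;
import javax.xml.transform.stream.StreamResult;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.Node;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

// Hypothetical helper, not part of the RapidMiner API.
class RmpExcelPathPatcher {

    /** Returns the process XML with excel_file of the named operator set to newPath. */
    static String setExcelFile(String processXml, String operatorName, String newPath) {
        try {
            Document doc = DocumentBuilderFactory.newInstance().newDocumentBuilder()
                    .parse(new InputSource(new StringReader(processXml)));
            NodeList operators = doc.getElementsByTagName("operator");
            boolean found = false;
            for (int i = 0; i < operators.getLength(); i++) {
                Element op = (Element) operators.item(i);
                if (!operatorName.equals(op.getAttribute("name"))) {
                    continue;
                }
                found = true;
                // Only look at direct children: nested operators have their own parameters.
                Element param = null;
                for (Node c = op.getFirstChild(); c != null; c = c.getNextSibling()) {
                    if (c instanceof Element
                            && "parameter".equals(c.getNodeName())
                            && "excel_file".equals(((Element) c).getAttribute("key"))) {
                        param = (Element) c;
                        break;
                    }
                }
                if (param == null) {
                    // Assumption: a missing parameter can simply be added as the first child.
                    param = doc.createElement("parameter");
                    param.setAttribute("key", "excel_file");
                    op.insertBefore(param, op.getFirstChild());
                }
                param.setAttribute("value", newPath);
            }
            if (!found) {
                throw new IllegalArgumentException("No operator named '" + operatorName + "'");
            }
            StringWriter out = new StringWriter();
            TransformerFactory.newInstance().newTransformer()
                    .transform(new DOMSource(doc), new StreamResult(out));
            return out.toString();
        } catch (RuntimeException e) {
            throw e;
        } catch (Exception e) {
            throw new IllegalStateException("Could not rewrite process XML", e);
        }
    }
}
```

You would read the .rmp file into a string, call `setExcelFile(xml, "Read Excel", path)`, write the result to a copy of the file, and load that copy with `new Process(new File(...))`. Failing loudly when the operator name does not match is deliberate, since a silent no-op would be hard to notice.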


The other way is to modify the operator after you load the process in your Java application. You can get a reference to your operator with Process#getOperator(String name) or Process#getAllOperators(). I guess it should be an instance of one of these classes:

com.rapidminer.operator.io.ExcelExampleSource
com.rapidminer.operator.nio.ExcelExampleSource

When you find the correct operator, you modify the path with Operator#setParameter(String key, String value).

This code works for me with RapidMiner 5.3: (the process is just a Read Excel operator and a Write CSV operator)

package sorapid;

import com.rapidminer.Process;
import com.rapidminer.RapidMiner;
import com.rapidminer.operator.Operator;
import com.rapidminer.operator.OperatorException;
import com.rapidminer.operator.io.ExcelExampleSource;
import com.rapidminer.tools.XMLException;
import java.io.File;
import java.io.IOException;

public class SOrapid {

  public static void main(String[] args) {
    try {
      RapidMiner.setExecutionMode(RapidMiner.ExecutionMode.COMMAND_LINE);
      RapidMiner.init();

      Process process = new Process(new File("c:\\Users\\Matlab\\.RapidMiner5\\repositories\\Local Repository\\processes\\test.rmp"));
      Operator op = process.getOperator("Read Excel");
      op.setParameter(ExcelExampleSource.PARAMETER_EXCEL_FILE, "d:\\excel.xls");
      process.run();

    } catch (IOException | XMLException | OperatorException ex) {
      ex.printStackTrace();
    }
  }
}

Josef Borkovec
  • I'm trying the second option by specifying the key and the value, but it shows that the operator object is null. – ArmMiner Apr 05 '13 at 16:24
  • I'm setting the parameters for the ReadExcel operator, but I'm still getting an error: A value for the parameter 'excel_file' must be specified! code: operator.setParameter(ExcelExampleSource.PARAMETER_EXCEL_FILE, "C:\\Users\\MP-TEST\\Desktop\\Rapid_Test\\HaendlerRatings_neu.xls"); operator.setParameter(ExcelExampleSource.PARAMETER_IMPORTED_CELL_RANGE, "A1:A2000"); – ArmMiner Apr 08 '13 at 08:08
  • I added a code example which works for me. I'm not sure what could be the problem for you. Could you set a breakpoint after you set the parameters and inspect the content of the operator's parameters.keyToValueMap variable? There should be an entry for excel_file like "excel_file => d:\excel.xls" – Josef Borkovec Apr 08 '13 at 10:54
  • One question: Is it right if I leave the parameter excel_file empty (in rapidMiner)? – ArmMiner Apr 08 '13 at 14:52
  • It works if you run the process from your java application and set the parameter. But you will not be able to run the same process from rapidminer without choosing the file path first. – Josef Borkovec Apr 08 '13 at 15:11
  • That's the problem. If I leave the Excel_File field empty, I get the error, which indicates that the program can't set the parameter for this operator. – ArmMiner Apr 08 '13 at 15:20
  • I just edited the question by adding the code. Can you have a look? Thanks! – ArmMiner Apr 08 '13 at 15:25
  • Your code works fine for me with a process consisting only of read excel operator and connection to output (excel file parameter is empty). Maybe a different version of RapidMiner? – Josef Borkovec Apr 08 '13 at 15:38
  • The RapidMiner version is 5.3.007. In RapidMiner I'm leaving the ReadExcel operator active with the required options. Only the Excel_File field is empty. Your process is set up the same way, right? – ArmMiner Apr 08 '13 at 15:44
  • I'm using the same version. When exactly do you see the error? Only when you open the process in rapidminer or also when you run the java application? You can't run the process in rapidminer if you don't specify the excel file parameter, but you should be able to run it from the java app if you set the parameter first. If you get the error when running the java app, please explain when the error occurs and a stack trace. – Josef Borkovec Apr 08 '13 at 16:02
  • Yeah, I'm getting the error when I run the Java application. The error points to the line pr.run() and the description is: specify the parameter value 'excel_file'. It's strange, because as you can see I'm specifying the path to the Excel file in the code. – ArmMiner Apr 08 '13 at 16:28
  • Solved! Sorry, but it was my fault. I had another Read_Excel operator which was deactivated, so my active operator's name was 'Read_Excel (2)'. :) – ArmMiner Apr 09 '13 at 07:24
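For anyone hitting the same name mismatch: a rough way to check which operator names the .rmp file actually contains, and which of them are activated, before calling getOperator. This is only a sketch with the JDK's XML parser. The class name `RmpOperatorLister` is made up, and it assumes each operator is stored as an `operator` element with `name`, `class` and `activated` attributes, as in the excerpt in the answer above.

```java
import java.io.StringReader;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Paths;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

// Hypothetical helper, not part of the RapidMiner API.
class RmpOperatorLister {

    /** Returns one entry per operator element: "name | class | activated=...". */
    static List<String> listOperators(String processXml) {
        try {
            Document doc = DocumentBuilderFactory.newInstance().newDocumentBuilder()
                    .parse(new InputSource(new StringReader(processXml)));
            NodeList operators = doc.getElementsByTagName("operator");
            List<String> lines = new ArrayList<>();
            for (int i = 0; i < operators.getLength(); i++) {
                Element op = (Element) operators.item(i);
                lines.add(op.getAttribute("name") + " | " + op.getAttribute("class")
                        + " | activated=" + op.getAttribute("activated"));
            }
            return lines;
        } catch (Exception e) {
            throw new IllegalStateException("Could not parse process XML", e);
        }
    }

    public static void main(String[] args) throws Exception {
        // args[0] is the path to the .rmp file to inspect.
        String xml = new String(Files.readAllBytes(Paths.get(args[0])), StandardCharsets.UTF_8);
        listOperators(xml).forEach(System.out::println);
    }
}
```

The name printed for the activated read_excel operator is the string to pass to getOperator.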
0

Try this:

private SimpleExampleSet ReadExcel( File processXMLFile_, File excelFile_ ) throws IOException, XMLException, OperatorException
{
    IOContainer outParameters   = null;
    Process     readExcel       = new Process( processXMLFile_ );
    IOObject    inObject        = new SimpleFileObject( excelFile_ );
    IOContainer inParameters    = new IOContainer( inObject );

    outParameters   = readExcel.run( inParameters );

    SimpleExampleSet    result  = (SimpleExampleSet) outParameters.getElementAt( 0 );

    return result;

}

Sorry, I cannot post an image of the RapidMiner process. If you need it, I can send it by email.

Maxim