
I have the following three classes. I tried turning classes 1 and 2 into Talend routines and using a tJava component to call the main class and the methods from 1 and 2, but I cannot access those methods.

1)

package page_scraper;

import com.gargoylesoftware.htmlunit.Page;
import com.gargoylesoftware.htmlunit.WebClient;
import com.gargoylesoftware.htmlunit.WebClientOptions;
import com.gargoylesoftware.htmlunit.html.FrameWindow;
import com.gargoylesoftware.htmlunit.html.HtmlButtonInput;
import com.gargoylesoftware.htmlunit.html.HtmlElement;
import com.gargoylesoftware.htmlunit.html.HtmlOption;
import com.gargoylesoftware.htmlunit.html.HtmlPage;
import com.gargoylesoftware.htmlunit.html.HtmlSelect;
import com.gargoylesoftware.htmlunit.html.HtmlTextInput;
import java.io.FileWriter;
import java.io.IOException;
import java.io.PrintStream;
import java.io.Writer;
import java.text.SimpleDateFormat;
import java.util.ArrayList;
import java.util.Calendar;
import java.util.Date;
import java.util.List;
import page_scraper.UnitArray;

public class PageScraper {
    public void Scrape() throws IOException {
        try {
            UnitArray object = new UnitArray();
            ArrayList<String> unitList = object.getUnitArray();
            WebClient webClient = new WebClient();
            webClient.getOptions().setThrowExceptionOnScriptError(false);
            webClient.getOptions().setThrowExceptionOnFailingStatusCode(false);
            HtmlPage page = (HtmlPage)webClient.getPage("http://www.bmreports.com/servlet/com.logica.neta.bwp_PanBMUData");
            List frames = page.getFrames();
            HtmlPage page1 = (HtmlPage)((FrameWindow)frames.get(0)).getEnclosedPage();
            HtmlTextInput settlementDay = (HtmlTextInput)page1.getHtmlElementById("param5");
            HtmlSelect period = (HtmlSelect)page1.getHtmlElementById("param6");
            HtmlOption periodOption = period.getOption(1);
            HtmlTextInput unitId = (HtmlTextInput)page1.getHtmlElementById("param1");
            HtmlButtonInput button = (HtmlButtonInput)page1.getHtmlElementById("go_button");
            String outputLocation = String.valueOf(System.getProperty("user.home")) + "/Documents/output.csv";
            FileWriter fileWriter = new FileWriter(outputLocation);
            String errorLocation = String.valueOf(System.getProperty("user.home")) + "/Documents/error.csv";
            FileWriter errorWriter = new FileWriter(errorLocation);
            int i = 0;
            while (i < unitList.size()) {
                int x = 0;
                while (x < 365) {
                    String errorData;
                    SimpleDateFormat dateFormat = new SimpleDateFormat("yyyy-MM-dd");
                    Calendar cal = Calendar.getInstance();
                    cal.add(Calendar.DATE, -x);
                    String dateValue = dateFormat.format(cal.getTime());
                    System.out.println(dateValue);
                    settlementDay.setValueAttribute(dateValue);
                    period.setSelectedAttribute(periodOption, true);
                    unitId.setValueAttribute(unitList.get(i));
                    System.out.println(unitList.get(i));
                    try {
                        button.click();
                        HtmlPage page2 = (HtmlPage)((FrameWindow)frames.get(1)).getEnclosedPage();
                        String pageSource = page2.asXml();
                        int firstIndex = pageSource.indexOf("csv=") + 38;
                        int secondIndex = pageSource.indexOf("n\"") + 1;
                        String csvData = pageSource.substring(firstIndex, secondIndex);
                        fileWriter.append(csvData);
                    }
                    catch (ClassCastException | StringIndexOutOfBoundsException e) {
                        // Log the failed date/unit pair, then fall through to ++x.
                        // A `continue` here would skip the increment and retry the same day forever.
                        errorData = dateValue + " " + unitList.get(i) + System.getProperty("line.separator");
                        System.out.println(errorData);
                        errorWriter.append(errorData);
                    }
                    ++x;
                }
                ++i;
            }
            webClient.close();
            fileWriter.close();
            errorWriter.close();
        }
        catch (IOException e) {
            e.printStackTrace();
        }
    }
}

2)

package page_scraper;

import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collection;

public class UnitArray {
    public ArrayList<String> getUnitArray() {
        String csvList = "abc,xyz";
        ArrayList<String> list = new ArrayList<String>(Arrays.asList(csvList.split(",")));
        return list;
    }
}

3)

package page_scraper;

import page_scraper.PageScraper;

public class main {
    public static void main(String[] args) throws Exception {
        PageScraper test = new PageScraper();
        test.Scrape();
    }
}

I created routines for classes 1) and 2) above in Talend and then used tJava to call the method, but it does not work. I also tried putting everything in tJava components and linking them with OnSubjobOk triggers. How can I load these classes in Talend and call the method?

user1538020
  • I tried loading the routines, but then I get this error: **Exception in thread "main" java.lang.NoSuchFieldError: INSTANCE** – user1538020 Dec 16 '15 at 10:05

3 Answers


First, routine classes in Talend must be in the routines package:

package routines;

public class PageScraper {

    public void Scrape() {
        System.out.println("PageScraper.Scrape");

    }
}

Second, to use the routine in a job, drag and drop it onto the open job's design area.

Then you can use your class in the job, for example from a tJava component.
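As a rough sketch of what the call can look like, the example below nests a stand-in routine class inside a single file so that it compiles on its own. In Talend Studio the routine would instead be a separate file beginning with package routines;. The names RoutineCallSketch and tJavaBody are hypothetical, and Scrape returns a String here (the routine above prints instead) only so that the result is easy to check.

```java
// Sketch only: a stand-in for a Talend routine plus the code a tJava component might hold.
// Assumption: in Talend Studio, PageScraper would be a top-level public class in
// package routines, not a nested class as it is here.
class RoutineCallSketch {

    /** Stand-in for the routine class from this answer. */
    public static class PageScraper {
        // Returns a marker string instead of printing, purely for illustration.
        public String Scrape() {
            return "PageScraper.Scrape";
        }
    }

    /** The two statements in this method are what would go into the tJava component. */
    static String tJavaBody() {
        PageScraper scraper = new PageScraper();
        return scraper.Scrape();
    }

    public static void main(String[] args) {
        System.out.println(tJavaBody());
    }
}
```

Inside a real job, referring to the class by its fully qualified name (routines.PageScraper) avoids relying on whichever imports the generated job code happens to contain.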

xto
  • I tried doing that, but then I get this error: **Exception in thread "main" java.lang.NoSuchFieldError: INSTANCE** – user1538020 Dec 16 '15 at 10:04
  • Have you changed import page_scraper.UnitArray to import routines.UnitArray in the PageScraper class? – xto Dec 16 '15 at 11:07
  • Also try this: close your routine, then under Routines right-click it and choose Edit Routine Libraries. Add the *.jar file that contains com.gargoylesoftware.htmlunit, because it is not among Talend's default libraries. – xto Dec 16 '15 at 11:10

You can package the three classes into a jar file and load it with tLibraryLoad, or add the jar to your routine's libraries if you want it to be reusable across jobs.

54l3d
  • I tried doing that, but then I get this error: **Exception in thread "main" java.lang.NoSuchFieldError: INSTANCE** – user1538020 Dec 16 '15 at 10:05
  • @user1538020 Please check this [question](http://stackoverflow.com/q/22330848/2037229) and its answers – 54l3d Dec 16 '15 at 10:12
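A NoSuchFieldError generally means that a class was compiled against one version of a library but is running against another version that lacks the field, so it can help to check which jar a class is actually loaded from. The following is a minimal diagnostic sketch and not part of the original answers. JarLocator and whereLoadedFrom are hypothetical names, and the class to inspect is an assumption you would need to adapt (for example, a class named in the stack trace).

```java
// Diagnostic sketch: report where the JVM loaded a given class from.
class JarLocator {

    /** Returns the code source location of a class, or a marker for JDK/bootstrap classes. */
    static String whereLoadedFrom(Class<?> type) {
        java.security.CodeSource source = type.getProtectionDomain().getCodeSource();
        if (source == null || source.getLocation() == null) {
            // Classes loaded by the bootstrap loader (such as java.lang.String) have no code source.
            return "(bootstrap or unknown)";
        }
        return source.getLocation().toString();
    }

    public static void main(String[] args) throws ClassNotFoundException {
        // Pass a fully qualified class name as the first argument; defaults to a JDK class.
        String name = args.length > 0 ? args[0] : "java.lang.String";
        System.out.println(name + " -> " + whereLoadedFrom(Class.forName(name)));
    }
}
```

If the reported location points at a jar other than the one you added through Edit Routine Libraries or tLibraryLoad, an older copy of that library is probably being picked up first.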
  • As suggested in the other answers, you need to define the classes under the routines package.
  • If you are using Talend 7.3 or above, right-click your routine and add it as a dependent package.
  • Export the routines as a jar; if you use them in Big Data jobs, you may need tLibraryLoad to package the jar together with the other dependencies.
Eric Aya
VimalK