
I am trying to map a Dataset<Row> into a Dataset<MyRow> in Apache Spark in Java. Does anyone know where the problem could be?

Dataset<Row> flightsDF = spark.read().format("csv").option("header", "true").load( ... );
Dataset<Row> DSOfRows = flightsDF
  .filter( ... );
DSOfRows.show(5);

The code shows the first 5 rows correctly. Then I try to map each Row into a MyRow.

Dataset<MyRow> DSOfMyRows = DSOfRows.map(
  (MapFunction<Row, MyRow>) row -> new MyRow(row.getAs("Carrier"),
      row.getAs("ArrDelay")), Encoders.bean(MyRow.class));
DSOfMyRows.show(5);

And here is the problem: it prints only empty rows.

++
||
++
||
||
||
||
||
++
only showing top 5 rows

The MyRow class looks like this:

public static class MyRow implements Serializable {
  public String carrier;
  public Double arrDelay;

  MyRow(String carrier, Double delay) {
      this.carrier = carrier;
      this.arrDelay = delay;
  }

  String getCarrier() {
      return carrier;
  }

  public void setCarrier(String c) {
      this.carrier = c;
  }

  Double getArrDelay() {
      return arrDelay;
  }

  public void setArrDelay(Double delay) {
      this.arrDelay = delay;
  }
}
  • I found this question https://stackoverflow.com/questions/41118998/how-can-i-convert-a-custom-java-class-to-a-spark-dataset, but it did not help. – Ondrej Kucera Nov 09 '17 at 18:05
  • Your class `MyRow` is not a [JavaBean](https://stackoverflow.com/questions/3295496/what-is-a-javabean-exactly). Have you tried adding a public no-argument constructor? – Alexandre Dupriez Nov 09 '17 at 23:36
  • You are right! I added the public no-argument constructor and private properties. Thank you a lot :) – Ondrej Kucera Nov 10 '17 at 00:03
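For reference, here is a minimal sketch of MyRow rewritten along the lines the comments suggest: private fields, a public no-argument constructor, and public getters and setters, which are the JavaBean conventions that Encoders.bean relies on. This is only an illustration of the fix described above, not the exact code the asker ended up with.

// MyRow as a proper JavaBean so Encoders.bean(MyRow.class) can
// discover and populate its properties.
public static class MyRow implements Serializable {
  private String carrier;
  private Double arrDelay;

  // Public no-argument constructor required by Encoders.bean
  public MyRow() {
  }

  public MyRow(String carrier, Double arrDelay) {
      this.carrier = carrier;
      this.arrDelay = arrDelay;
  }

  public String getCarrier() {
      return carrier;
  }

  public void setCarrier(String carrier) {
      this.carrier = carrier;
  }

  public Double getArrDelay() {
      return arrDelay;
  }

  public void setArrDelay(Double arrDelay) {
      this.arrDelay = arrDelay;
  }
}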
