4

I have written a program that accesses HBase using Spark 1.6 with spark-hbase-connector (sbt dependency: "it.nerdammer.bigdata" % "spark-hbase-connector_2.10" % "1.0.3"). But it doesn't work with Spark 2.*. I've searched around this question and reached some conclusions:

  1. There are several connectors for accessing HBase from Spark:

    • hbase-spark: hbase-spark is provided by the official HBase project. But I found it is developed against Scala 2.10 and Spark 1.6. The properties in the project's pom.xml are as below:

      <properties>
        <spark.version>1.6.0</spark.version>
        <scala.version>2.10.4</scala.version>
        <scala.binary.version>2.10</scala.binary.version>
        <top.dir>${project.basedir}/..</top.dir>
        <avro.version>1.7.6</avro.version>
        <avro.mapred.classifier></avro.mapred.classifier>
      </properties>
      
    • spark-hbase-connector: I visited their website and there is no information about Spark 2.0. The jar's name is spark-hbase-connector_2.10, which tells us it is compiled against Scala 2.10, as used by Spark 1.*. But when I change the artifact name to spark-hbase-connector_2.11 (compiled against Scala 2.11, the same as Spark 2.*), my IDE (IntelliJ IDEA) tells me there is no jar named spark-hbase-connector_2.11. So there is no support for Spark 2.*.

    • hortonworks-spark: I visited their website. Some remarks said that it does not support Spark 2.*.

Do you know of any third-party jar with full documentation that solves this problem? What packages should I use to connect to HBase from Spark 2.*? I appreciate any suggestions. Thanks!

Doone
  • 81
  • 6
  • 1
    Possible duplicate of [Which HBase connector for Spark 2.0 should I use?](http://stackoverflow.com/questions/40908891/which-hbase-connector-for-spark-2-0-should-i-use) – James Fry Feb 14 '17 at 21:27
  • See these links: [How to access HBase from spark-scala](https://stackoverflow.com/questions/51566176/how-to-access-hbase-from-spark-scala-is-there-clear-defined-scala-api) and [Which HBase connector for Spark 2.0 should I use?](https://stackoverflow.com/questions/40908891/which-hbase-connector-for-spark-2-0-should-i-use?rq=1) – DataNoob Jul 28 '18 at 10:03

1 Answer

2

I chose to use newAPIHadoopRDD to access HBase from Spark.
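A minimal sketch of the newAPIHadoopRDD approach, assuming the standard HBase client and hbase-mapreduce artifacts are on the classpath; the ZooKeeper quorum address, table name, and column family/qualifier below are placeholders for your own cluster:

```scala
import org.apache.hadoop.hbase.HBaseConfiguration
import org.apache.hadoop.hbase.client.Result
import org.apache.hadoop.hbase.io.ImmutableBytesWritable
import org.apache.hadoop.hbase.mapreduce.TableInputFormat
import org.apache.hadoop.hbase.util.Bytes
import org.apache.spark.sql.SparkSession

object HBaseReadSketch {
  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder().appName("hbase-read").getOrCreate()

    // Standard HBase client configuration; host and table are placeholders.
    val conf = HBaseConfiguration.create()
    conf.set("hbase.zookeeper.quorum", "zk-host:2181")
    conf.set(TableInputFormat.INPUT_TABLE, "my_table")

    // newAPIHadoopRDD scans the table via TableInputFormat and yields
    // (rowkey, Result) pairs.
    val hbaseRdd = spark.sparkContext.newAPIHadoopRDD(
      conf,
      classOf[TableInputFormat],
      classOf[ImmutableBytesWritable],
      classOf[Result])

    // Example: pull one column value per row ("cf:col" is a placeholder).
    val values = hbaseRdd.map { case (key, result) =>
      val rowkey = Bytes.toString(key.get())
      val col = Bytes.toString(
        result.getValue(Bytes.toBytes("cf"), Bytes.toBytes("col")))
      (rowkey, col)
    }
    values.take(10).foreach(println)

    spark.stop()
  }
}
```

This works with Spark 2.* because it only depends on the plain HBase MapReduce input format, not on a Spark-version-specific connector; the trade-off, as noted in the comments, is that it performs a full table scan.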

bp2010
  • 2,342
  • 17
  • 34
Doone
  • 81
  • 6
  • 1
    Using this will perform a full Scan operation though. Is there any way to perform a get by rowkey using Spark 2? – bp2010 Jun 18 '18 at 13:30