2

I'm trying to insert data with HiveContext like this:

/* table filedata
CREATE TABLE `filedata`(
  `host_id` string,
  `reportbatch` string,
  `url` string,
  `datatype` string,
  `data` string,
  `created_at` string,
  `if_del` boolean)
*/
hiveContext.sql("insert into filedata (host_id, data) values (\"a1e1\", \"welcome\")")

Error and try to use "select":

hiveContext.sql("select \"a1e1\" as host_id, \"welcome\"as data").write.mode("append").saveAsTable("filedata")
/*
stack trace 
java.lang.ArrayIndexOutOfBoundsException: 2
*/

It needs to all columns like this:

hc.sql("select \"a1e1\" as host_id,
          \"xx\" as reportbatch,
          \"xx\" as url,
          \"xx\" as datatype,
          \"welcome\" as data,
          \"2017\" as created_at, 
          1 as if_del").write.mode("append").saveAsTable("filedata")

Is there a way to insert specified columns? For example, only insert columns "host_id" and "data".

SCouto
  • 7,808
  • 5
  • 32
  • 49
hyjal
  • 63
  • 8

2 Answers2

1

As far as i Know , Hive does not support the insertion of values into only some columns

From the documentation

Each row listed in the VALUES clause is inserted into table tablename.

Values must be provided for every column in the table. The standard SQL syntax that allows the user to insert values into only some columns is not yet supported. To mimic the standard SQL, nulls can be provided for columns the user does not wish to assign a value to.

So you should try this:

  val data = sqlc.sql("select 'a1e1', null, null, null, 'welcome', null, null, null")
  data.write.mode("append").insertInto("filedata")

Reference here

Community
  • 1
  • 1
SCouto
  • 7,808
  • 5
  • 32
  • 49
  • From the documentation:As of Hive 1.2.0 each INSERT INTO T can take a column list like INSERT INTO T (z, x, c1). See Description of HIVE-9481 for examples. "insert into filedata (host_id, data) values (\"a1e1\", \"welcome\")" works in hive but doesn't work with HiveContext. – hyjal Dec 20 '17 at 01:20
  • Unsupported language features in query:insert into table filedata values ("a1e1", null, null, null, "welcome", null, null) – hyjal Dec 20 '17 at 02:38
  • You are right, you should do this with the saveAsTable method, i'll update my answer. I've just tested it and it work properly – SCouto Dec 20 '17 at 07:41
0

You can do it if you are using row columnar file format such as ORC. Please see the working example below. This example is in Hive but will work very well with HiveContext.

hive> use default;
OK
Time taken: 1.735 seconds
hive> create table test_insert (a string, b string, c string, d int) stored as orc;
OK
Time taken: 0.132 seconds
hive> insert into test_insert (a,c) values('x','y');
Query ID = user_20171219190337_b293c372-5225-4084-94a1-dec1df9e930d
Total jobs = 1
Launching Job 1 out of 1


Status: Running (Executing on YARN cluster with App id application_1507021764560_1375895)

--------------------------------------------------------------------------------
        VERTICES      STATUS  TOTAL  COMPLETED  RUNNING  PENDING  FAILED  KILLED
--------------------------------------------------------------------------------
Map 1 ..........   SUCCEEDED      1          1        0        0       0       0
--------------------------------------------------------------------------------
VERTICES: 01/01  [==========================>>] 100%  ELAPSED TIME: 4.06 s
--------------------------------------------------------------------------------
Loading data to table default.test_insert
Table default.test_insert stats: [numFiles=1, numRows=1, totalSize=417, rawDataSize=254]
OK
Time taken: 6.828 seconds
hive> select * from test_insert;
OK
x       NULL    y       NULL
Time taken: 0.142 seconds, Fetched: 1 row(s)
hive>
Sandeep Singh
  • 7,790
  • 4
  • 43
  • 68