2

I am trying to get a quoted csv output from beeline. My query looks like:

beeline -u 'jdbc:hive2://localhost:10000/' --outputformat=csv2 -e 'set system:disable.quoting.for.sv=false; select 1 as a, 2 as b'

I expect as output

"a","b"
"1","2"

But only get the non-quoted version.

According to the documentation:

The quoting can be disabled by setting the disable.quoting.for.sv system variable to true.

I suppose I am doing something wrong here. How can I set this variable to false? Having it in my query does not seem to do the trick, having it on hive-site.xml neither.

Note: hive 1.2.1 on hdp 2.5.

Guillaume
  • 2,325
  • 2
  • 22
  • 40

3 Answers3

2

You can try:

beeline --disableQuotingForSV=true --verbose=true -u 'jdbc:hive2://localhost:10000/' --outputformat=csv2  -e 'select 1 as a, 2 as b'
jedijs
  • 563
  • 5
  • 14
  • No change. I am curious to know where you found this? – Guillaume Mar 07 '17 at 12:13
  • 3
    Indeed but does not seem to work for me. The other options (HADOOP_CLIENT_OPTS="-Ddisable.quoting.for.sv=false") does the trick when the explanation from the accepted answer is understood. – Guillaume Mar 08 '17 at 05:56
  • @Guillaume Thanks for the pointer, I am stuck at the same dilemma can you please help me look at my https://stackoverflow.com/questions/69079148/how-to-output-hive-table-as-a-standard-csv-file-with-quotes – axnet Sep 07 '21 at 01:58
2

Quoting fields in csv is only necessary/useful when the field contains a separator or a newline.

So if you try to write 1 2,3, the last value should be quoted: 1,"2,3".

Quoting a field does not mean it should be treated as a string. The "2,3" is still a number (if the , is the decimal separator in this case).

Some csv writers add quotes around every field, which is totally unnecessary and only makes the file larger without any benefit.

The documentation seems to be clear about this too:

If quoting is not disabled, double quotes are added around a value if it contains special characters (such as the delimiter or double quote character) or spans multiple lines.

Embedded double quotes are escaped with a preceding double quote.

Danny_ds
  • 11,201
  • 1
  • 24
  • 46
0

this will work: env HADOOP_CLIENT_OPTS="-Ddisable.quoting.for.sv=false" beeline -u 'jdbc:hive2://localhost:10000/' --outputformat=csv2 -e 'select 1 as a, 2 as b'

Feri
  • 1
  • 2