4

I ran a simple query using Athena dashboard on data of format csv.The result was a csv with column headers. When storing the results,Athena stores with the column headers in s3.How can i skip storing header column names,as i have to make new table from the results and it is repetitive

Shiva Rama Krishna
  • 365
  • 2
  • 4
  • 14

2 Answers2

2

Try "skip.header.line.count"="1", This feature has been available on AWS Athena since 2018-01-19, here's a sample:

CREATE EXTERNAL TABLE IF NOT EXISTS tableName (
  `field1` string,
  `field2` string,
  `field3` string 
)
 ROW FORMAT SERDE 'org.apache.hadoop.hive.serde2.OpenCSVSerde'
 WITH SERDEPROPERTIES (
   'separatorChar' = ',',
   'quoteChar' = '\"',
   'escapeChar' = '\\'
   )
LOCATION 's3://fileLocation/'
TBLPROPERTIES ('skip.header.line.count'='1')

You can refer to this question: Aws Athena - Create external table skipping first row

Hilda Chang
  • 193
  • 1
  • 11
1

From an Eric Hammond post on AWS Forums:

...
  WHERE
    date NOT LIKE '#%'
...

I found this works! The steps I took:

However, subsequent queries store even more data in that S3 directory, so it confuses any subsequent executions.

John Rotenstein
  • 241,921
  • 22
  • 380
  • 470
  • I refered [Eric Hammond post on AWS Forum](https://forums.aws.amazon.com/thread.jspa?threadID=245579).In that case, the file starts with '#%',so above query applies.In my case, i think i have to say where NOT LIKE ' – Shiva Rama Krishna Jul 14 '17 at 05:09
  • My file did NOT start with `#%` either, but it worked for me from an Athena output file. I suspect it is because a Date field isn't loaded from a header text string, making it NULL and the `NOT LIKE` filters based upon that. Try it, it will actually work for you! – John Rotenstein Jul 14 '17 at 05:41
  • you are right,it worked!,changed it to `WHERE NOT LIKE ''` as i do not have datetime field in my results,so kept the relevant datatype(string) in my case. – Shiva Rama Krishna Jul 14 '17 at 06:10
  • but sadly my result has a column header again,so it again becomes the same thing, did u see the column headers in,after u ran query on new table with `WHERE NOT LIKE '#%'`,and may u try creating a table with that,because for me the column header still persists in the output no matter what – Shiva Rama Krishna Jul 14 '17 at 06:35
  • Mmm. Strangely, I am unable to repeat my prior success. Now I can't query the result table at all! Sorry, can't help any further. – John Rotenstein Jul 14 '17 at 08:46