AWS Athena Import CSV file

Question

I have this data stored in S3 as .csv (but it can be any other file format which is the best suitable for my requirement):

"41.9100687,-87.8805614","41.9802511,-87.8803253","41.9806802,-87.8792417","41.9810128,-87.8785121","41.9200687,-87.8805614","41.9802511,-87.8803253","41.9806802,-87.8792417",
"41.9100687,-87.8805614","41.9802511,-87.8803253","41.9806802,-87.8792417","41.9810128,-87.8785121","41.9200687,-87.8805614","41.9802511,-87.8803253","41.9806802,-87.8792417",
"41.9100687,-87.8805614","41.9802511,-87.8803253","41.9806802,-87.8792417","41.9810128,-87.8785121","41.9200687,-87.8805614","41.9802511,-87.8803253","41.9806802,-87.8792417",
"41.9100687,-87.8805614","41.9802511,-87.8803253","41.9806802,-87.8792417","41.9810128,-87.8785121","41.9200687,-87.8805614","41.9802511,-87.8803253","41.9806802,-87.8792417",

and I would like to have one coordinate per column:

Like this:

Coordinates:

1.  41.9100687,-87.8805614

2.  41.9802511,-87.8803253

3.  41.9806802,-87.8792417

After importing S3 I choose CSV as data type... And then I add string column.

But instead I get some weird table output. Beside this I tried to import this as plain txt file with comma delimiter.. I get same weird output.

What am I doing wrong here?

EDIT

This test column screenshot is query from another but identical example. There should be gps_coordinates

Is the file in a folder on its own (with no other files)? Did your LOCATION statement point to the folder or the file? — John Rotenstein, Oct 21 '19 at 03:48

score 14 · Accepted Answer · answered Oct 21 '19 at 03:47

To reproduce your situation, I did the following:

Created a text file using your sample data (gps.txt)
Uploaded it to an Amazon S3 bucket in its own folder (with no other files in that folder)
Created a table in Amazon Athena
- Specified the location as the folder name (s3://my-bucket/gps/)
- Specified 7 columns (since there are 7 string values in your sample file)

However, since the data has commas within each pair of numbers, I changed the SerDe to OpenCSVSerDe for Processing CSV - Amazon Athena:

CREATE EXTERNAL TABLE IF NOT EXISTS default.gps (
  `c1` string,
  `c2` string,
  `c3` string,
  `c4` string,
  `c5` string,
  `c6` string,
  `c7` string 
)
ROW FORMAT SERDE 'org.apache.hadoop.hive.serde2.OpenCSVSerde' 
WITH SERDEPROPERTIES ("separatorChar" = ",", "escapeChar" = "\\") 

LOCATION 's3://my-bucket/gps/'
TBLPROPERTIES ('has_encrypted_data'='false');

I was then able to successfully query the table. A sample column value is: 41.9100687,-87.8805614

Is there a way to ignore the first row which is header column — kumar, Aug 30 '22 at 14:13
@kumar [Aws Athena - Create external table skipping first row](https://stackoverflow.com/a/47280738/174777) — John Rotenstein, Aug 30 '22 at 22:06

AWS Athena Import CSV file

1 Answers1