Extract specific column from csv file, and convert to string in Lambda using Python

Question

I'm trying to get all email addresses in a comma separated format from a specific column. This is coming from a csv temp file in Lambda. My goal is to save that file in s3 with only one column containing the email addresses.

This is what the source data looks like:

Here is my code:

#open file and extract email address
with open('/tmp/maillist.csv', 'w') as mail_file:
    wm = csv.writer(mail_file)
    mail_list = csv.reader(open('/tmp/filtered.csv', "r"))
    for rows in mail_list:
        ','.join(rows)
        wm.writerow(rows[3])
bucket.upload_file('/tmp/maillist.csv', key)

I was hoping to get a result like this:

But instead, I'm getting a result like this:

I also tried this code:

#open file and extract email address
mail_list = csv.reader(open('/tmp/filtered.csv', "r"))
with open('/tmp/maillist.csv', 'w') as mail_file:
    wm = csv.writer(mail_file)
    wm.writerow(mail_list[3])
bucket.upload_file('/tmp/maillist.csv', key)

But I get this error instead:

Response:
{
  "errorMessage": "'_csv.reader' object is not subscriptable",
  "errorType": "TypeError",
  "stackTrace": [
    "  File \"/var/task/lambda_function.py\", line 68, in lambda_handler\n     wm.writerow(mail_list[3])\n"

Any help is appreciated.

post the a few rows of your `filtered.csv` file to understand how the input data looks like. — abhilb, Dec 24 '19 at 16:01

Josh Sharkey · Accepted Answer · 2020-01-02T17:49:14.883

1

In this code snippet:

#open file and extract email address
with open('/tmp/maillist.csv', 'w') as mail_file:
    wm = csv.writer(mail_file)
    mail_list = csv.reader(open('/tmp/filtered.csv', "r"))
    for rows in mail_list:
        ','.join(rows)
        wm.writerow(rows[3])
bucket.upload_file('/tmp/maillist.csv', key)

You are taking each row and converting it into a single string using .join. The single string is then interpreted as an iterable, which you pass into wm.writerow. When you use wm.writerow to write to a single column you should provide an iterable which has one single index --- that means the list of emails from the 3rd column should be put into a final list after you combine them into a string. Additionaly if you want to put a comma separated list into a single cell for a CSV file, you must use quotes (") to escape the list:

 for row in mail_list: 
    single_string =  '"' + ','.join(rows[3]) + '"'
    wm.writerow([single_string])

This will produce a csv file where each row of the file has a single column of: all of the emails in the 3rd column of the same row from the first combined to a single cell escaped with double quotes.

edited Jan 02 '20 at 17:49

answered Jan 02 '20 at 17:04

Josh Sharkey

1,008
9
34

sorry i accidentally submitted to early, see the updated answer – Josh Sharkey Jan 02 '20 at 17:08
1

If my assumption is wrong about the 3rd column, then you can omit the `join` call and simply use: `wm.writerow([row[3]])`. Again, making sure the argument you pass to `writerow` is a `list` – Josh Sharkey Jan 02 '20 at 17:16
Omitting 'join' and adding the extra brackets solved the extra separation. However, the output is still a list, not a string. – Pucho Jan 02 '20 at 17:25
Can you explain what the desired output is? That is, what you would like the `maillist.csv` to look like? – Josh Sharkey Jan 02 '20 at 17:29
I would like an output like this: 'username1@domain.com, username2@domain.com, username3@domain.com' all in a single string as mentioned above. – Pucho Jan 02 '20 at 17:35
Ok so in this scenario, the commas need to be escaped. You need to have an escape character like `"` for your list. See https://stackoverflow.com/questions/4617935/is-there-a-way-to-include-commas-in-csv-columns-without-breaking-the-formatting for more info. I will update my answer shortly. – Josh Sharkey Jan 02 '20 at 17:45
1

I replaced ' single_string = '"' + ','.join(rows[3]) + '"' ' with ' single_string =''.join(rows[3]) + ';' ' and I've got the output desired. – Pucho Jan 02 '20 at 18:17
1

it says I can award you the bounty in 22 hours. – Pucho Jan 02 '20 at 18:18

score -1 · Answer 2 · answered Dec 24 '19 at 16:00

-1

Issue 1:

','.join(rows) is not being assigned to any variable

Issue 2:

mail_list = csv.reader(open('/tmp/filtered.csv', "r")) above line means mail_list is a csvreader object. You to read the lines using this object.

answered Dec 24 '19 at 16:00

abhilb

5,639
2
20
26

Extract specific column from csv file, and convert to string in Lambda using Python

2 Answers2