0

I have two lists - list1 when printed looks like this:

[['KR', 'Alabama', 111], ['KR', 'Alabama', 909], ['KR', 'Alabama', 90], ['KR', 'Alabama', 10], ['KR', 'Arizona', 12], ['KR', 'Arizona', 10], ['KR', 'Arizona', 93], ['KR', 'Arizona', 98],....]

And list2 when printed looks like this:

[11, 110, 108,....]

Now I want to join these two lists and write the result into a csv file so that output looks like this:

KR,Alabama,111,11
KR,Alabama,909,110
KR,Alabama,90,108
KR,Alabama,10,34
KR,Arizona,12,45

So basically the values of list2 become the 4th column in the csv file. I wrote this code in ipython, but it produces output in the wrong format and also does not write all the records to the file (the last 26 records are missing):

final_list = zip(list1,list2)
print final_list

cdc_part1 = open("file1.csv", 'wb')
wr = csv.writer(cdc_part1, dialect='excel')

wr.writerows(final_list)

The output in file looks like:

"['KR', 'Alabama', 111]",11
"['KR', 'Alabama', 909]",110
"['KR', 'Alabama', 90]",108
"['KR', 'Alabama', 10]",34
"['KR', 'Arizona', 12]",45

As you can see, there are " and [] around each list1 item, and the strings in list1 have ' around them. How can I get the correct output format, and why are the last 26 records not getting written to the file?

NOTE: list1, list2, and the final_list that I am forming are all the same size (300), yet I see only 274 records in the file.

user2966197

3 Answers

1

Since list1 is a list of lists, performing zip(list1, list2) will end up with something like this:

[(['KR', 'Alabama', 111], 11),
 (['KR', 'Alabama', 909], 110),
 (['KR', 'Alabama', 90], 108)]

So you need an extra step that appends each list2 value to the end of the corresponding inner list:

final_list = [ a + [b] for a, b in zip(list1, list2) ]

This gives you:

[['KR', 'Alabama', 111, 11],
 ['KR', 'Alabama', 909, 110],
 ['KR', 'Alabama', 90, 108]]

And that should output the CSV properly.
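
Putting the pieces together, a minimal end-to-end sketch might look like the following. It is written for Python 3 (text mode with newline=''); on Python 2 the file would be opened with 'wb' as in the question. The sample data and the out_path variable are hypothetical stand-ins, not the asker's real values. The with block is an assumption about the missing rows rather than something confirmed in this thread: a file that is never closed in an interactive session can leave its last buffered rows unwritten, and leaving the with block closes and flushes it.

```python
import csv
import os
import tempfile

# Hypothetical sample data standing in for the asker's list1 and list2.
list1 = [['KR', 'Alabama', 111], ['KR', 'Alabama', 909], ['KR', 'Alabama', 90]]
list2 = [11, 110, 108]

# Build a new 4-column row from each list1 row plus the matching list2 value.
final_list = [row + [extra] for row, extra in zip(list1, list2)]

# Hypothetical output location; the question writes to "file1.csv" directly.
out_path = os.path.join(tempfile.gettempdir(), "file1.csv")

# Python 3 form. On Python 2, use open(out_path, "wb") instead.
with open(out_path, "w", newline="") as cdc_part1:
    wr = csv.writer(cdc_part1, dialect="excel")
    wr.writerows(final_list)
# Leaving the with block closes the file, which flushes buffered rows to disk.
```

If the code is kept without a with block, calling cdc_part1.close() after writerows has the same effect.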

Aldehir
  • how do I remove `'` around strings? – user2966197 Sep 18 '15 at 18:19
  • @user2966197 The output I have is python's string representation of the list. The CSV library you're using should properly format the CSV file. You just have to change your line `final_list = ...` to the line above. – Aldehir Sep 18 '15 at 18:21
  • also I am using `ipython` for this and so my code which writes to a csv file, is there anything wrong to it that may result in last 26 records not getting written to file? Is there a better way to write to csv file in ipython? – user2966197 Sep 18 '15 at 18:22
  • @user2966197 I'm unsure of why you would be missing 26 records. I would investigate those records and see if there's some encoding problem that causes the csv module to ignore them. – Aldehir Sep 18 '15 at 18:24
0

You are using zip the wrong way; for more, see "zip lists in python". The problem is that zip pairs up the elements of its input lists. Here each element of the first list is itself a list, so the whole inner list gets paired with the corresponding element of the second list. The correct way to do it is:

# Append each list2 value to the matching row of list1 (modifies list1 in place).
for i in xrange(len(list1)):
    list1[i].append(list2[i])
print list1
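
One thing to keep in mind with this approach is that it changes list1 itself instead of building a new list. A small sketch, using hypothetical sample data and Python 3 syntax, shows the effect:

```python
# Hypothetical sample data standing in for the asker's lists.
list1 = [['KR', 'Alabama', 111], ['KR', 'Alabama', 909]]
list2 = [11, 110]

# Each inner list is extended in place; no new list is created.
for row, extra in zip(list1, list2):
    row.append(extra)

# list1 now holds 4-column rows. Running the loop a second time
# (easy to do by accident in ipython) would add a fifth column.
print(list1)
```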
kaushik94
0

It's all in how zip works. From the docs:

Returns an iterator of tuples, where the i-th tuple contains the i-th element from each of the argument sequences or iterables.

So zip expects each argument to be an iterable. It then creates an iterable (in Python 2, a list) of tuples, taking the i-th element of each argument for the i-th tuple. So if you pass in a list of lists as the first argument and a list of numbers as the second, each resulting item is a tuple whose first element is an inner list from list1 and whose second element is the value at the same index in list2.

Instead, you want something like:

final_list = [list1[i] + [list2[i]] for i in \
     range(min(len(list1),len(list2)))]

As to why the last 26 records are not in the file, from the zip docs:

The iterator stops when the shortest input iterable is exhausted

So this is what you would see if list2 had 26 fewer elements than list1: rather than guess at what to add to the last 26 items from list1, zip just doesn't include them in the result.

Note: Using min() in the new final_list formulation as above will result in the same short-circuiting behavior
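
To see the truncation concretely, here is a small sketch with hypothetical lists of unequal length. It assumes Python 3, where the padding variant is itertools.zip_longest (Python 2 calls it izip_longest), and it assumes list1 is the longer input, since a padded row of '' could not be concatenated with a list.

```python
from itertools import zip_longest

# Hypothetical lists of unequal length: list2 is one element short.
list1 = [['KR', 'Alabama', 111], ['KR', 'Alabama', 909], ['KR', 'Alabama', 90]]
list2 = [11, 110]

# zip stops at the shorter input, so the third row is silently dropped.
truncated = [row + [extra] for row, extra in zip(list1, list2)]

# zip_longest keeps every list1 row and pads the missing value with fillvalue.
padded = [row + [extra] for row, extra in zip_longest(list1, list2, fillvalue='')]

print(len(truncated))  # 2
print(len(padded))     # 3
```

Comparing len(list1) and len(list2) before zipping is a quick way to rule this cause in or out.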

lemonhead