
This code appends each /name path to the base URL, opens the resulting page, and writes the matched string to test1.csv:

import urllib2
import re
import csv

url = ("http://www.example.com")
bios = [u'/name1', u'/name2', u'/name3']
csvwriter = csv.writer(open("/test1.csv", "a"))

for l in bios:
    OpenThisLink = url + l
    response = urllib2.urlopen(OpenThisLink)
    html = response.read()
    item = re.search('(JD)(.*?)(\d+)', html)
    if item:
        JD = item.group()
        csvwriter.writerow(JD)
    else:
        NoJD = "NoJD"
        csvwriter.writerow(NoJD)

But I get this result:

J,D,",", ,C,o,l,u,m,b,i,a, ,L,a,w, ,S,c,h,o,o,l,....

If I change the string to ("JD", "Columbia Law School" ....) then I get

JD, Columbia Law School...)

I couldn't find in the documentation how to specify the delimiter.

If I try to pass delimeter as a keyword argument, I get this error:

TypeError: 'delimeter' is an invalid keyword argument for this function
mkrieger1
Zeynel

4 Answers


writerow expects a sequence (e.g., a list or tuple) of strings. You're giving it a single string. A string happens to be a sequence of strings too, but it's a sequence of one-character strings, which isn't what you want.

If you just want one string per row you could do something like this:

csvwriter.writerow([JD])

This wraps JD (a string) in a list, so the whole string is written as a single field.
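As a rough illustration of the difference, the sketch below writes the same value both ways into an in-memory buffer. It uses Python 3; the sample value of JD and the helper write_row are assumptions made up for the example, not part of the original code.

```python
import csv
import io

# Hypothetical sample value standing in for the regex match.
JD = "JD, Columbia Law School, 2005"


def write_row(row):
    """Write one row to an in-memory buffer and return the raw CSV text."""
    buffer = io.StringIO()
    csv.writer(buffer).writerow(row)
    return buffer.getvalue()


# Passing the bare string: every character becomes its own field.
as_string = write_row(JD)

# Passing a one-element list: the whole string is a single field.
as_list = write_row([JD])

print(as_string)
print(as_list)
```

In the second case the field is wrapped in quotation marks, because the sample value itself contains the delimiter (a comma).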

Laurence Gonsalves
  • Thanks! This fixed it. I'll try the other answers too. I also created an empty list JDList=[] and appended JD to that; that also works, but this is simpler. – Zeynel Nov 29 '09 at 22:25
  • Now it also writes the quotation marks of the string. Is there a way around that? – stefanbschneider Nov 05 '16 at 19:07
  • @CGFoX Can you post example code that demonstrates this? – Laurence Gonsalves Nov 05 '16 at 19:22
  • `writer.writerow([datetime.now().strftime("%Y-%m-%d %H:%M:%S")])` writes the datetime as `"2016-11-05 20:30:19"` – stefanbschneider Nov 05 '16 at 19:38
  • @CGFoX I cannot reproduce that behavior. I get `2016-11-05 13:21:11` without quotes. What version of Python are you using? – Laurence Gonsalves Nov 05 '16 at 20:26
  • I'm using Python 3.5.2. Maybe the problem is in how I opened the file or set up my writer: `with open(file, "w", newline="") as csvfile:` and then `writer = csv.writer(csvfile, delimiter=" ")`. Directly afterwards, the `writerow` line is executed. – stefanbschneider Nov 06 '16 at 08:24
  • I think the problem was having a space in the string and using a space as the delimiter for the writer. When I tried writing `[datetime.now().strftime("%Y-%m-%d_%H:%M:%S")]`, i.e., without a space, it worked fine and without the quotation marks. So it had nothing to do with your solution - it works fine! – stefanbschneider Nov 06 '16 at 09:04
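The quoting behavior discussed in these comments can be reproduced with a small sketch. It assumes Python 3 and a fixed timestamp string instead of datetime.now(); the helper write_row is hypothetical. Note that the keyword is spelled `delimiter`; a misspelled keyword such as `delimeter` is rejected with a TypeError like the one in the question.

```python
import csv
import io


def write_row(row, **writer_options):
    """Write one row with the given csv.writer options; return the CSV text."""
    buffer = io.StringIO()
    csv.writer(buffer, **writer_options).writerow(row)
    return buffer.getvalue()


# The field contains the delimiter (a space), so the writer quotes it.
with_space = write_row(["2016-11-05 20:30:19"], delimiter=" ")

# No space in the field, so no quoting is needed.
without_space = write_row(["2016-11-05_20:30:19"], delimiter=" ")

# With the default comma delimiter, the space is harmless.
default_delimiter = write_row(["2016-11-05 20:30:19"])

print(with_space, without_space, default_delimiter, sep="")
```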

The writer object returned by csv.writer takes an iterable as its argument to writerow. Because strings in Python are iterable character by character, a string is an acceptable argument to writerow, but each character is written as a separate field, which gives the output above.

To correct this, you could split the value on whitespace (I'm assuming that's what you want):

csvwriter.writerow(JD.split())
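A minimal sketch of what splitting on whitespace produces, assuming Python 3 and a made-up sample value for JD (write_row is a hypothetical helper):

```python
import csv
import io


def write_row(row):
    """Write one row to an in-memory buffer and return the raw CSV text."""
    buffer = io.StringIO()
    csv.writer(buffer).writerow(row)
    return buffer.getvalue()


# Hypothetical sample value; the real one comes from the regex match.
JD = "JD Columbia Law School 2005"

fields = JD.split()      # one list element per whitespace-separated word
line = write_row(fields)

print(fields)
print(line)
```

Each word becomes its own column, so a multi-word name such as "Columbia Law School" is spread across three fields. Whether that is desirable depends on the layout you want in the CSV file.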
Gabriel Reid

This happens because, when the group() method of a MatchObject instance returns only a single value, it returns it as a string. When there are multiple values (for example, when group() is called with several group numbers), they are returned as a tuple of strings.

When you write a row, csv.writer presumably iterates over the object you pass to it. If you pass a single string (which is an iterable), it iterates over its characters, producing the result you are observing. If you pass a tuple of strings, it gets a whole string, not a single character, on every iteration.
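The distinction can be sketched as follows, assuming Python 3 and a made-up html snippet; the pattern is the one from the question.

```python
import csv
import io
import re

# Hypothetical page fragment standing in for the downloaded HTML.
html = "<p>JD, Columbia Law School, 2005</p>"

item = re.search(r'(JD)(.*?)(\d+)', html)

whole = item.group()         # no arguments: the entire match, as one string
parts = item.groups()        # all captured groups, as a tuple of strings
selected = item.group(1, 3)  # several group numbers: a tuple of strings

buffer = io.StringIO()
writer = csv.writer(buffer)
writer.writerow(parts)       # one field per captured group
writer.writerow(selected)    # one field per requested group
output = buffer.getvalue()

print(repr(whole))
print(parts)
print(output)
```

Passing a tuple such as parts or selected gives one column per element, whereas passing whole directly would give one column per character.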

shylent

To put it another way: if you add square brackets around the whole output, it will be treated as one item, so commas won't be inserted between its characters. For example, instead of:

spamwriter.writerow(matrix[row]['id'],matrix[row]['value'])

use:

spamwriter.writerow([matrix[row]['id'] + ',' + matrix[row]['value']])
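One caveat worth checking: when the joined string contains the delimiter, the writer quotes the field. The sketch below compares the joined form with passing the two values as separate list elements. It assumes Python 3; the contents of matrix and the helper write_row are invented for illustration.

```python
import csv
import io


def write_row(row):
    """Write one row to an in-memory buffer and return the raw CSV text."""
    buffer = io.StringIO()
    csv.writer(buffer).writerow(row)
    return buffer.getvalue()


# Hypothetical data shaped like the matrix used in this answer.
matrix = {0: {'id': 'a1', 'value': '42'}}
row = 0

# One list element containing a comma: a single, quoted field.
joined = write_row([matrix[row]['id'] + ',' + matrix[row]['value']])

# Two list elements: two fields, with the writer adding the comma.
separate = write_row([matrix[row]['id'], matrix[row]['value']])

print(joined)
print(separate)
```

If the goal is two columns, the second form is probably the one you want, since the writer inserts the delimiter itself.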
Dyonn