0

I want to write a function to process some data I bring from Excel. The data is essentially in an Excel column (transaction IDs). For reasons of my own convenience, I thought I'd use raw_input with copy-pasting the column from Excel, store it and run the function on that.

However, whatever I do I get errors (I actually got stuck in this very first stage of bringing in the data), and I'm pretty sure the reason is because each item is in a new line (when I use Excel's option to transpose the column to a row, I get no errors).

So, for instance, if I wanted to try and set a sample string to work with, e.g.:

some_string = "014300071432Gre
014300054037Col
014300065692ASC"

(this is the formatting you get when pasting from a column in Excel), and just call some_string, I'd get:

  File "<stdin>", line 1
    al = "014300071432Gre
                        ^
SyntaxError: EOL while scanning string literal

I tried removing the line-breaks with .split() but that didn't work

I also tried the triple quotes I saw suggested in several threads, but that didn't work either. It only got me more confused because I thought triple quotes are used when you don't want python to evaluate something.

I've placed some Sample Data in a Google doc.

Would really appreciate any help. Thanks!

Optimesh
  • 2,667
  • 6
  • 22
  • 22

3 Answers3

1

You're right that the difficulty in using raw_input with a copied column of Excel data is the newlines. The issue is that raw_input specifically reads one line. From the official docs:

raw_input([prompt])
If the prompt argument is present, it is written to standard output without a trailing newline. The function then reads a line from input, converts it to a string (stripping a trailing newline), and returns that.

By definition, a newline character marks the end of a line. So there really isn't a simple way to paste a column of Excel data into raw_input.


In most cases, the best way to read Excel data from Python is simply to read the Excel file directly. The best package for this is xlrd. Assuming your workbook is named myData.xls and you want to read A2:A5 from the first sheet, you would do something like

import xlrd

wb = xlrd.open_workbook('myData.xls')
ws = wb.sheet_by_index(0)
result = ws.col_values(0, 1, 5)

At that point, result would be a 4-element list of cell values (A2, A3, A4, and A5).


If you really need the user interface to be "copy a range of cells in Excel; paste into my app" then you probably have to look into building a GUI which has a multiline text input box. Here you have lots of choices, from Python's included Tkinter, to third-party libraries for Python, to non-Python GUIs (as long as they can read the input and then pass it to your Python program).


Edit: You can read the clipboard directly (so don't do the paste step at all). See these questions for more information. The simplest solution taken from those questions relies on Tkinter:

from Tkinter import Tk

r = Tk()
result = r.selection_get(selection='CLIPBOARD')
r.destroy()

The above assumes that the clipboard is already populated. In other words, the flow would be something like

  1. Your program prompts the user to copy a selection in Excel
  2. The user copies a selection in Excel
  3. The user responds to your program's prompt (to let your program know the clipboard is ready)
  4. Your program issues the above snippet to grab the clipboard contents into result
  5. Your program processes result as desired

No doubt there are more sophisticated ways, but that should be enough to get you going.

Community
  • 1
  • 1
John Y
  • 14,123
  • 2
  • 48
  • 72
0
some_string = '''014300071432Gre
014300054037Col
014300065692ASC'''
x3al
  • 586
  • 1
  • 8
  • 24
  • that's strange, as it didn't work for me earlier. Could you (1) explain how to make it work with raw_input ? (2) explain why it works with triple quotes? I thought they use for commenting blocks of text. Thanks! – Optimesh Mar 26 '14 at 10:32
  • Triple quotes are for strings that can contain quotes themselves. You can "comment" code with them but it's just a side-effect (you can use any literal as statement in python and interpreter will not say anything). Triple-quoted string are often used as docstrings for functions too. – x3al Mar 26 '14 at 10:36
  • Don't sure what's wrong with raw_input: it has nothing to do with quotes or whatever. – x3al Mar 26 '14 at 10:37
  • Thanks. if I do something like: some_var = raw_input("paste col here") print some_var after pasting the data it would print out only the first item - which is in the first row. – Optimesh Mar 26 '14 at 10:41
  • `some_var = sys.stdin.read()` will read everything till Ctrl+D. You can write your logic with raw_input or something to read till an empty line (or two empty lines) – x3al Mar 26 '14 at 11:53
  • but how would the user (me) paste the data? I'd like him to input it as if it was prompted by raw_input(). Thanks :) – Optimesh Mar 26 '14 at 12:46
0
some_string = """014300071432Gre
014300054037Col
014300065692ASC"""

triple quotes are a multiline string,
you could write this as:

some_string = "014300071432Gre\n014300054037Col\n014300065692ASC"