0

I have a file that has 3 values on each line. It is a fairly random file, and any of these values can be str or int.

George, 34s, Nikon

42, absent, Alan

apple, 111, 41

marked, 15, never

...

So, I read in the line, and using split I get the first value:

theFile = r"C:\... "
tDC = open(theFile, "r")
for theLine in tDC:
    a, b, c = theLine.split(',')

So far so good.

Where I'm stuck is when I try to deal with variable a. I need to handle it differently depending on whether it is a str or an int. I tried setting a = int(a), but if it is a string (e.g., 'George') then I get an error. I tried if type(a) = int and if isinstance(a,int), but neither works because all the values come in as strings!

So, how do I evaluate the value NOT looking at its assigned 'type'? Specifically, I want to read all the a's and find the maximum value of all the numbers (they'll be integers, but could be large -- six digits, perhaps).

Is there a way to read in the line so that numbers come in as numbers and strings come in as strings, or perhaps there is a way to evaluate the value itself without looking at the type?

fredtantini
Darlene
  • 2
    You could use a `try`/`except` block, using the `ValueError` in the `except`. – Cory Kramer Oct 20 '14 at 22:10
  • How do you know whether some column `42` represents the int `42` or the string `"42"`? After all, data that can have string values like `"34s"` can probably also have string values like `"42"`. – abarnert Oct 20 '14 at 22:12
  • 1
    As a side note, you may want to consider using the `csv` library instead of manually calling `split`. I have no idea where your data come from or what they mean, but I wouldn't be too surprised if you ran into a column like `"Smith, John"`, which your code would treat as two columns instead of one. That would be hard to fix, while with `csv` it will either just work or be a trivial matter of setting a dialect parameter. – abarnert Oct 20 '14 at 22:14
  • http://stackoverflow.com/questions/5626815/how-can-i-avoid-type-checking-a-python-object-if-its-attributes-arent-used – Ryne Everett Oct 20 '14 at 22:14
  • `if type(a) = int` does not test for equality. To test for equality you use `==` (double equals). A single `=` is for assignment. – Totem Oct 20 '14 at 22:15
  • @Totem: At any rate, `isinstance(a, int)` is almost always a better test than `type(a) == int`, but neither one is going to help for exactly the reason already explained in the question: `type(a)` is guaranteed to be `str` anyway… – abarnert Oct 20 '14 at 22:16
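To illustrate the `csv` suggestion in the comments above, here is a minimal sketch. The sample rows are made up for illustration (the second one is a hypothetical line with a quoted comma), and `skipinitialspace=True` is an assumption about the file having a space after each comma.

```python
import csv
import io

# Hypothetical sample standing in for the real file.
sample = io.StringIO('George, 34s, Nikon\n"Smith, John", 42, x\n')

# skipinitialspace=True drops the blank that follows each comma,
# which also lets a quoted field after ", " be recognised as quoted.
reader = csv.reader(sample, skipinitialspace=True)
rows = [row for row in reader]

for row in rows:
    print(row)
```

Every field still arrives as a `str`, so the question of how to tell numbers from text remains; `csv` only takes care of the splitting.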

4 Answers

5

The first point is that you need some rule that tells you which values are integers and which ones aren't. In a data set that includes things like 34s, I'm not sure it makes sense to just treat anything that could be an integer as if it were.

But, for simplicity, let's assume that is the rule you want: anything that could be an integer is. So, int(a) is already pretty close; the only issue is that it can fail. What do you do with that?

Python is designed around EAFP: it's Easier to Ask Forgiveness than Permission. Try something, and then deal with the fact that it might fail. As Cyber suggests, with a try statement:

try:
    intvalue = int(a)
except ValueError:
    # Oops, it wasn't an int, and that's fine
    pass
else:
    # It was an int, and now we have the int value
    maxvalue = max(maxvalue, intvalue)
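The snippet above assumes `maxvalue` already exists. A fuller sketch of the same idea, applied to the first column of each line, might look like the following. The function name `max_first_column` and the sample lines are hypothetical, and starting from `None` is just one way to handle a file with no integers at all.

```python
import io

# Hypothetical sample standing in for the real file.
sample = io.StringIO(
    "George, 34s, Nikon\n"
    "42, absent, Alan\n"
    "apple, 111, 41\n"
    "123456, 15, never\n"
)

def max_first_column(lines):
    """Return the largest integer found in the first column, or None."""
    maxvalue = None
    for line in lines:
        if not line.strip():
            continue  # skip blank lines
        a = line.split(",", 1)[0]
        try:
            intvalue = int(a)  # int() tolerates surrounding whitespace
        except ValueError:
            continue  # not an integer, so it cannot be the maximum
        if maxvalue is None or intvalue > maxvalue:
            maxvalue = intvalue
    return maxvalue

print(max_first_column(sample))  # 123456
```

Any iterable of lines works here, so an open file object can be passed in place of `sample`.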
abarnert
  • @PadraicCunningham: If "any of these values can be str or int", `float(a)` seems like a bad idea. Of course if the desired rule is "anything that can be interpreted as a float should be parsed as a float and then truncated" (or rounded or whatever) that's different, but that seems a lot less likely than "anything that can be interpreted as an int should be parsed as an int". – abarnert Oct 20 '14 at 22:42
  • @PadraicCunningham: Yes, `"2.0"` will raise a `ValueError` and be skipped. So will `"34s"`, which we know is in his input. And so will `"2 + 3*4"`, and `"0xcab"`, and `"two"` and `"0'"` and `"4/1"`, and all kinds of other things that could be considered representations of integers. And if you don't understand how making the ints floats could affect the sum, try adding `12345678901234567890 + 1 + -12345678901234567890` after converting everything to floats. – abarnert Oct 20 '14 at 23:08
  • @PadraicCunningham: Who cares how large the cutoff is before it becomes an issue, when there's no reason to create the issue in the first place? And if the rule is "everything is `int` or `str`", as the OP said, then `"2.0"` is just as not-an-`int` as `"two"`. – abarnert Oct 20 '14 at 23:25
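The precision point made in the comments above can be checked directly. This is only a small demonstration using the three values quoted there, with hypothetical variable names.

```python
values = ["12345678901234567890", "1", "-12345678901234567890"]

# Python ints have arbitrary precision, so the sum is exact.
int_total = sum(int(v) for v in values)

# A float cannot represent the difference between 12345678901234567890
# and 12345678901234567891, so the 1 is lost along the way.
float_total = sum(float(v) for v in values)

print(int_total)    # 1
print(float_total)  # 0.0
```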
1

isalpha() returns True if the string is non-empty and every character is alphabetic.

isnumeric() returns True if the string is non-empty and every character is numeric.

So:

data = "Hello World"
print(data.isnumeric())    # False
print(data.isalpha())      # False: the space is not a letter
print("George".isalpha())  # True
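If you want to apply this kind of check to the question's data before calling `int()`, one possible sketch follows. The helper name `looks_like_int` is hypothetical, and it uses `str.isdecimal()` rather than `isnumeric()` as a deliberate swap: `isnumeric()` also accepts characters such as `"½"`, which `int()` rejects. The optional sign and the surrounding whitespace have to be handled by hand.

```python
def looks_like_int(text):
    """Rough pre-check: True if text is an optionally signed run of decimal digits."""
    text = text.strip()
    if text.startswith(("+", "-")):
        text = text[1:]  # drop a single leading sign
    return text.isdecimal()

for value in ["George", " 34s", "42", "-17", "½", ""]:
    print(repr(value), looks_like_int(value))
```

This is stricter than `int()` in some corner cases (for example, it rejects `"1_000"`, which `int()` accepts), so the `try`/`except` approach remains the more reliable test.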

Sorry for my soulless answer. I came here with the same issue, found a different way, and wanted to share it with you.

0
values = theLine.split(',')
for value in values:
    try:
        number = int(value)
        # process as number
    except ValueError:
        # process value as string
        pass  # placeholder so the except block has a body
Jonathan Eunice
0

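Something like this:

from typing import Optional

def ret_var(my_var: str) -> Optional[int]:
    try:
        # Return the converted value, not the original string
        return int(my_var)
    except ValueError:
        print("my_var not int!")
        return None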
Salio