Sorting numerical data from a notepad file with string in it

Question

So I have this code:

meme = int
meme = 1
import sys
data = int

if meme == 1:
    lines = open('C:\Users\maksn\Desktop\A452\scores class 1').readlines()
new_data = []
for line in lines:
    new_data.append(int(line.strip()))
print (new_data)

I want it to read the string data but keep only the numerical values, so that I can later convert them to integers and sort them. However, I get this error: (unicode error) 'unicodeescape' codec can't decode bytes in position 2-3: truncated \UXXXXXXXX escape

The text file at the moment has this in it:

kek got 4

kek got 2

kek got 10

Any help would be appreciated.

Edit:

Not sure if this is useful but the bracket after open is highlighted in red.

That error sounds like your file is encoded in some unexpected way, although your example content doesn't show anything I'd expect to cause a problem. Could the real file have null characters or non-ASCII characters (names with accents, for example)? — SpoonMeiser, Apr 14 '16 at 16:41
Also, what's that `meme` nonsense? Why is it initially set to `int`? As written, your code will throw an exception if `meme` is anything other than `1`. Similarly, what's up with `data = int`? — SpoonMeiser, Apr 14 '16 at 16:43
I'm kind of inexperienced with Python and programming in general, so the meme was used to make sure the code runs, whilst the data part was just a test to see if it matters. — TreeMen OneChanell, Apr 14 '16 at 16:47

score 0 · Answer 1 · edited May 23 '17 at 10:28

First off, let me second what SpoonMeiser said: there is no reason for setting meme or data = int. If you really want to do what you are saying, then do the following: loop through each line, then read this post (Checking whether a variable is an integer or not) and add an if statement that checks whether each character you read in is an int, and append it to the list if so. You will also need to add some logic to handle multi-digit integers. Basically, just add all the digits you read from any given line to a string and then use int() on that string to convert it to an int.
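The character-by-character approach described above could be sketched roughly as follows. This is only an illustration, not the asker's code: `extract_int` is a made-up helper name, the sample lines are copied from the question, and it assumes each line holds at most one number (all digits on a line are joined into a single integer).

```python
def extract_int(line):
    """Join the digit characters of `line` into one int; return None if there are none."""
    digits = ""
    for char in line:
        try:
            int(char)          # raises ValueError for letters, spaces and newlines
        except ValueError:
            continue
        digits += char         # building a string handles multi-digit numbers
    return int(digits) if digits else None


# Sample lines copied from the question; blank lines give None and are skipped.
lines = ["kek got 4\n", "\n", "kek got 2\n", "\n", "kek got 10\n"]
scores = [n for n in map(extract_int, lines) if n is not None]
print(sorted(scores))
```

Collecting the digits into a string first and converting once per line is what keeps "10" from being read as 1 and 0.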

If what you really want to do is just print out any numbers, then ignore the part about converting to int. If the file will always have the form "kek got x", then just remove the first part from each line using the following function:

def filter_char(string, char_set):
    # Remove from string every character that appears in char_set.
    for char in string:
        if char in char_set:
            string = string.replace(char, "")
    return string

Call filter_char(line, "kek got ") on each line and then append the result to the list you are going to print.
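As a rough usage sketch, assuming the file really does contain only lines of the form "kek got x" separated by blank lines (as in the question), the filtered text can be converted and sorted like this. The list of sample lines stands in for the result of readlines(), and `remainder` is just an illustrative name.

```python
def filter_char(string, char_set):
    # Same idea as the function above: drop every character found in char_set.
    for char in char_set:
        string = string.replace(char, "")
    return string


# Sample lines from the question, including the blank lines between entries.
lines = ["kek got 4\n", "\n", "kek got 2\n", "\n", "kek got 10\n"]

new_data = []
for line in lines:
    remainder = filter_char(line, "kek got ").strip()
    if remainder:                        # skip blank lines
        new_data.append(int(remainder))  # convert so the sort is numeric

new_data.sort()
print(new_data)
```

Converting to int before sorting matters here: sorted as strings, "10" would come before "2".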

Finally, make sure you are using a plain text file; this may be why you are getting the unicode error. Some text editors (e.g. TextEdit, the default on Mac) save as .rtf (rich text format) by default. Either change the file type directly or copy and paste the contents into a plain-text editor.
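Another likely cause is the path itself. The quoted 'unicodeescape' message is what Python 3 reports when an ordinary string literal contains `\U`, as in 'C:\Users\...', because `\U` starts an eight-digit Unicode escape; positions 2-3 of that literal are exactly the `\U`. That would also explain the red highlighting after open. A minimal sketch of three common ways to write the same path (the path is the one from the question; `read_lines` is an illustrative helper, not part of the original code):

```python
# Raw string: backslashes are kept literally, so \U is not treated as an escape.
raw_path = r"C:\Users\maksn\Desktop\A452\scores class 1"

# Doubled backslashes: each \\ stands for one literal backslash.
escaped_path = "C:\\Users\\maksn\\Desktop\\A452\\scores class 1"

# Forward slashes: also accepted by open() on Windows.
slash_path = "C:/Users/maksn/Desktop/A452/scores class 1"

print(raw_path == escaped_path)


def read_lines(path):
    # Open the file and return its lines; the with-block closes it afterwards.
    with open(path) as handle:
        return handle.readlines()
```

Any of the three spellings can then be passed to `read_lines` in place of the original literal.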

Above, where I linked to a post about checking numbers, you can also use the built-in str.isnumeric(), or conversely remove the letters using str.isalpha(). Again, you would want to run these one character at a time (if run on a whole line, they will return True only if the whole line is numeric or alphabetic). A downside of using isalpha() is that it would not filter out other unwanted, non-letter characters (e.g. @#$). — Albert Rothman, Apr 15 '16 at 17:53
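To illustrate the difference between running these checks on a whole line and on single characters, here is a small sketch. The second sample string with `@` and `#` is invented for the example and does not come from the question's file.

```python
line = "kek got 10"

# On the whole line the check fails, because not every character is numeric.
print(line.isnumeric())                                    # False

# Per character: keep only the numeric ones, then join and convert.
digits = "".join(char for char in line if char.isnumeric())
print(int(digits))                                         # 10

# Per character with isalpha(): letters are removed, but spaces and symbols stay.
leftover = "".join(char for char in "kek got @10#" if not char.isalpha())
print(repr(leftover))                                      # '  @10#'
```

One caveat: isnumeric() is also True for characters such as "½", which int() cannot convert, so str.isdecimal() may be the safer per-character test if the file could contain such characters.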
