Python-read a file from a defined variable

Question

I want a user to input a file name to be read (for example: text.txt) in Python, but it is read as a string and not as a file type.

r=(input("insert the name of the file"))
  File= open(r,'r')
  data=File.read()
  data.split()
  print(data)

Wait, what? I don't understand what you are asking at all. Of course it reads as a string, why would it do anything else? — anon582847382, Mar 28 '14 at 17:39
Your code does exactly what you want it to right now. What's the issue you're having? — Adam Smith, Mar 28 '14 at 17:40
_and not file type_: What do you _expect_ it to do based on the file type? — Two-Bit Alchemist, Mar 28 '14 at 17:41
Indentation is a bit off and there are a pair of extra () that's not needed. Except from that, the code is correct. — Fredrik Pihl, Mar 28 '14 at 17:45
The only thing that I can see wrong with this code is that `data.split()` does nothing. I think you mean `data = data.split()`. — anon582847382, Mar 28 '14 at 17:47

score 3 · Accepted Answer · edited Jun 20 '20 at 09:12

New hotness

EDIT: as per the comments on my answer, OP is looking to build a dict containing {word:wordcount} for all words in a file (whitespace separated).

There's one REALLY GREAT way to do this, but it doesn't really teach you anything, so I'll show you the slow way to do it first, then include the optimal solution afterwards.

wordcountdict = dict()

r = input("filename: ")
with open(r, 'r') as infile:
    for line in infile:
        for word in line.split():  # split the current line on whitespace
            try:
                wordcountdict[word.lower()] += 1
                # try adding one to the word in the counter
            except KeyError:
                wordcountdict[word.lower()] = 1
                # If the word isn't in the dict already, set it to 1

Now you may want to filter out some common words ("at", "I", "then" etc.), in which case you can build a blacklist of them (something like blacklist = ['at', 'i', 'then']) and put if word.lower() in blacklist: continue inside the inner for word loop, before the try/except block. That tests whether the word is in the blacklist and, if it is, skips the rest of that iteration.
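A minimal sketch of that blacklist check, assuming Python 3. The function name count_words and the sample text are made up for illustration, and io.StringIO stands in for the opened file so the example runs without a file on disk.

```python
import io

def count_words(lines, blacklist=()):
    """Count lowercased, whitespace-separated words, skipping blacklisted ones."""
    skip = {w.lower() for w in blacklist}  # a set keeps membership tests cheap
    wordcountdict = {}
    for line in lines:
        for word in line.split():
            word = word.lower()
            if word in skip:
                continue  # ignore this word and move on to the next one
            try:
                wordcountdict[word] += 1
            except KeyError:
                wordcountdict[word] = 1
    return wordcountdict

# Hypothetical sample input; a real file object from open() iterates the same way
sample = io.StringIO("I went home then\nat home I slept\n")
counts = count_words(sample, blacklist=['at', 'i', 'then'])
print(counts)  # {'went': 1, 'home': 2, 'slept': 1}
```

The blacklist entries are lowercased once up front, so the comparison matches the lowercased words regardless of how the blacklist was typed.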

Now I promised you a GREAT way to do this, and that's with collections.Counter. It's a dictionary subclass specifically created to count the elements of an iterable. There are faster ways to count words, but nothing cleaner in Python (imo). You can check out timings on this question.

from collections import Counter

wordcountdict = Counter()
r = input("filename: ")
with open(r, 'r') as infile:
    for line in infile:
        wordcountdict += Counter(map(str.lower, line.split()))

If you've never used imports from collections, or the map function, this is going to look very arcane, which is why I didn't put it first! :)

Basically: collections.Counter takes an iterable as an argument and counts all the elements in it (so Counter([1,1,2,3,4,4,4]) == {1:2, 2:1, 3:1, 4:3}). You can add two Counters together: the result gets new keys where they're unique and summed values where they're not.
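To make the counting and the addition concrete, here is a small sketch, assuming Python 3. The variable names first, second and total are only for illustration.

```python
from collections import Counter

first = Counter([1, 1, 2, 3, 4, 4, 4])
print(dict(first) == {1: 2, 2: 1, 3: 1, 4: 3})  # True

second = Counter([4, 5])
total = first + second  # shared keys are summed, new keys are added
print(total[4], total[5])  # 4 1

# Unlike a plain dict, a Counter returns 0 for a key it has never seen
print(total[99])  # 0
```

That last behaviour is why the Counter version needs no try/except KeyError.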

map(callable, iterable) calls callable on each element of the iterable and returns a map object (it's a list in Python 2) that is itself iterable (so map(str.lower, ["ThIS", "Has", "UppEr", "aNd", "LOWERcase"]) gives you a map object that you can iterate through to get ["this","has","upper","and","lowercase"], since str.lower was called on each element).
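A short sketch of that behaviour, assuming Python 3 (where map returns a lazy map object). The names words and lowered are illustrative.

```python
words = ["ThIS", "Has", "UppEr", "aNd", "LOWERcase"]

lowered = map(str.lower, words)  # nothing is computed yet
print(list(lowered))  # ['this', 'has', 'upper', 'and', 'lowercase']

# A map object can only be consumed once; a second pass yields nothing
print(list(lowered))  # []
```

Because the map object is single-use, it is handed straight to Counter rather than stored and reused.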

When we combine the two, we're feeding collections.Counter a map object that's lowercased each individual word in line.split(), then adding it to an initially-empty Counter that's being used as an accumulator. Capisce?

Old and busted

It's very unclear what your problem is with the code, so I'll just throw out some knowledge for you and hope something sticks.

r = input("insert the name of the file")
# this will be a string from the user, containing the file name, e.g.
# r == "text.txt"
# this is normal, because you pass `open` a filename, not a file object

File = open(r, "r")
# this makes File a file object that's pointed at the file name given from
# the user, opened for reading.

data = File.read()
# this sets data equal to the string containing the entire text in File
# This is usually NOT what you want to do, but without further explanation,
# I'll leave it be

data.split()
# this isn't an in-place operation, so you built a list out of the string
# data, split on whitespace, then threw it away since you didn't assign it to
# anything.

print(data)
# prints your original data variable, because remember data.split() is not
# in-place, you'd have to do data = data.split(), but that's the wrong way
# to do that anyway....

Here's what I THINK you want to do...

filename = input("insert the name of the file: ")
with open(filename, "r") as infile:
    data = infile.readlines()

This uses a context manager (with) instead of File = open(filename), because that's a better practice. It basically frees you from having to type File.close() after you're done with it, and also accounts for the fact that Things Can Go Wrong while you're working with files, so if for whatever reason your code throws an exception and doesn't GET to your File.close(), it still closes the file object once it leaves the with block.
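As a rough illustration of the "Things Can Go Wrong" case, the sketch below raises an exception inside the with block and then checks that the file was closed anyway. It assumes Python 3; the temporary file and the simulated ValueError are stand-ins invented for the example.

```python
import os
import tempfile

# Create a throwaway file so the example does not depend on a real text.txt
with tempfile.NamedTemporaryFile("w", suffix=".txt", delete=False) as tmp:
    tmp.write("hello world\n")
    filename = tmp.name

try:
    with open(filename, "r") as infile:
        first_line = infile.readline()
        raise ValueError("simulated failure while working with the file")
except ValueError:
    pass  # the with block has already closed infile by the time we get here

print(infile.closed)  # True
os.remove(filename)  # clean up the throwaway file
```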

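It also uses .readlines() instead of .read().split(). The two are not quite the same thing: .readlines() returns one string per line with the trailing newline kept, while .read().split() splits on any whitespace and returns individual words. This is still probably NOT what you're trying to do (in most cases you want to just iterate through a file rather than dump all its data into memory) but without more context, I can't help you further.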
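It may help to see what each call returns on a small input. This is a sketch assuming Python 3, with io.StringIO standing in for an opened file and a made-up two-line text.

```python
import io

text = "first line here\nsecond line\n"

# One string per line, trailing newline kept
print(io.StringIO(text).readlines())  # ['first line here\n', 'second line\n']

# Split on any whitespace: individual words, line boundaries lost
print(io.StringIO(text).read().split())  # ['first', 'line', 'here', 'second', 'line']

# One string per line, trailing newline removed
print(io.StringIO(text).read().splitlines())  # ['first line here', 'second line']
```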

It also follows PEP 8's naming convention, where CapitalizedNames are classes. File is not a class, it's a file object, so I named it infile instead. I usually use in_ and out for filenames, but YMMV.

If you comment in what you're trying to do with the file, I can write some specific code for you.

It would be better to do `read().splitlines()` as `readlines()` leaves the annoying newline character `'\n'` that Python gets whilst scanning the file. — anon582847382, Mar 28 '14 at 17:54
@AlexThornton `list(map(strip,map(splitlines,''.join([char for char in infile.read()]))))` for more obfuscation :) — Adam Smith, Mar 28 '14 at 17:55
Very good explanation! But what I want to do is to receive a file and count the number of words in that file, and put them in a dictionary with {word,numberoftimes}. I'm new at Python so I'm still learning — Pnelson, Mar 29 '14 at 16:14
@user3448350 I'll edit, but you should put that in the question so we can reopen. — Adam Smith, Mar 29 '14 at 17:57

score 1 · Answer 2 · answered Mar 28 '14 at 17:57

I don't mind taking a shot in the dark, but it would help if you posted one of the files to be read. You also need to account for the person not knowing which file they should type, and for what happens when they don't type what you think they will type.

r = input('type the name of the file: ')  # Python 3; on Python 2 use raw_input()
with open(r,'r') as myfile:
    for data in myfile:
        print(data.split())
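One possible way to handle a name that doesn't match any file is sketched below, assuming Python 3 (FileNotFoundError does not exist in Python 2). The helper name read_words and its return convention are made up for illustration.

```python
def read_words(filename):
    """Return one list of words per line, or None if the file does not exist."""
    try:
        with open(filename, 'r') as myfile:
            return [data.split() for data in myfile]
    except FileNotFoundError:
        print("No such file: " + filename)
        return None

# A name that is very unlikely to exist, to exercise the error path
result = read_words("definitely_missing_file_12345.txt")
print(result)  # None
```

In an interactive script, the caller could loop and ask for the name again whenever None comes back.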

Back2Basics

A concern here, `data` in OP's code doesn't mean `data` in your code. Your `data` is just one line of the file. Not bad by any means, just may be confusing for OP – Adam Smith Mar 28 '14 at 18:07
Good point. It was more instructional on how to use/format data. – Back2Basics Mar 29 '14 at 07:30