I want a user to input a file name to be read (for example: text.txt) in Python, but it is read as a string and not as a file type.
r=(input("insert the name of the file"))
File= open(r,'r')
data=File.read()
data.split()
print(data)
EDIT: as per the comments on my answer, OP is looking to build a dict containing {word:wordcount} for all words in a file (whitespace separated).
There's one REALLY GREAT way to do this, but it doesn't really teach you anything, so I'll show you the slow way to do it first, then include the optimal solution afterwards.
wordcountdict = dict()
r = input("filename: ")
with open(r, 'r') as infile:
    for line in infile:
        for word in line.split():  # split on whitespace
            try:
                # try adding one to the word in the counter
                wordcountdict[word.lower()] += 1
            except KeyError:
                # if the word isn't in the dict already, set it to 1
                wordcountdict[word.lower()] = 1
Now you may want to filter out some common words ("at", "I", "then", etc.), in which case you can build a blacklist of them (something like blacklist = ['at', 'i', 'then']) and do if word.lower() in blacklist: continue inside the for word in line.split() loop, before the try/except block. That tests whether the word is in the blacklist and skips the rest of that iteration if it is.
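As a rough sketch of the blacklist idea, here is the loop wrapped in a helper. The count_words name, the sample text, and the temporary file are my own assumptions, there only so the snippet runs without prompting for input:

```python
import os
import tempfile


def count_words(path, blacklist=()):
    """Count whitespace-separated words in a file, skipping blacklisted ones."""
    skip = {w.lower() for w in blacklist}  # a set keeps membership tests cheap
    counts = {}
    with open(path, "r") as infile:
        for line in infile:
            for word in line.split():
                word = word.lower()
                if word in skip:
                    continue  # skip the rest of this iteration
                try:
                    counts[word] += 1
                except KeyError:
                    counts[word] = 1
    return counts


# Demo on a throwaway file (hypothetical sample text).
with tempfile.NamedTemporaryFile("w", suffix=".txt", delete=False) as tmp:
    tmp.write("I went home then I slept\nThen I woke at noon\n")
    sample_path = tmp.name

result = count_words(sample_path, blacklist=["at", "i", "then"])
os.remove(sample_path)
print(result)
```

Lowercasing the blacklist up front means it doesn't matter whether you wrote "I" or "i" in it.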
Now I promised you a GREAT way to do this, and that's with collections.Counter. It's a dictionary subclass specifically created to count the elements of an iterable, such as a list. There are faster ways to count words, but nothing cleaner in Python (imo). You can check out timings on this question.
from collections import Counter
wordcountdict = Counter()
r = input("filename: ")
with open(r, 'r') as infile:
    for line in infile:
        wordcountdict += Counter(map(str.lower, line.split()))
If you've never used imports from collections, or the map function, this is going to look very arcane, which is why I didn't put it first! :)
Basically: collections.Counter takes an iterable as an argument and counts all elements in the iterable (so Counter([1,1,2,3,4,4,4]) == {1:2, 2:1, 3:1, 4:3}). You can add two Counters together: the result gets new keys where a key appears in only one of them, and summed values where it appears in both.
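A tiny illustration of that adding behaviour (the letters here are made-up sample data):

```python
from collections import Counter

first = Counter(["a", "b", "a"])   # a: 2, b: 1
second = Counter(["b", "c"])       # b: 1, c: 1

total = first + second             # counts are summed key by key
print(total["a"], total["b"], total["c"])

# Unlike a plain dict, a key that was never counted reads as 0
# instead of raising KeyError.
print(total["missing"])
```

That last property is why the Counter version needs no try/except.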
map(callable, iterable) runs callable with each element of the iterable as its argument, and returns a map object (it's a list in Python 2) that is itself iterable (so map(str.lower, ["ThIS", "Has", "UppEr", "aNd", "LOWERcase"]) gives you a map object that you can iterate through to get ["this", "has", "upper", "and", "lowercase"], since str.lower was called on every element).
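If you want to poke at that yourself, something like this should do, assuming Python 3 (where map is lazy):

```python
words = ["ThIS", "Has", "UppEr", "aNd", "LOWERcase"]

lowered = map(str.lower, words)  # nothing is computed yet in Python 3
as_list = list(lowered)          # iterating runs str.lower on each element
print(as_list)

# A map object is a one-shot iterator: once consumed, it is empty.
leftover = list(lowered)
print(leftover)
```

The one-shot behaviour doesn't matter in the Counter snippet, since each map object is consumed exactly once.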
When we combine the two, we're feeding collections.Counter a map object that lowercases each individual word in line.split(), then adding the result to an initially-empty Counter that's being used as an accumulator. Capisce?
It's very unclear what your problem is with the code, so I'll just throw out some knowledge for you and hope something sticks.
r = input("insert the name of the file")
# this will be a string from the user, containing the file name, e.g.
# r == "text.txt"
# this is normal, because you pass `open` a filename, not a file object
File = open(r, "r")
# this makes File a file object that's pointed at the file name given from
# the user, opened for reading.
data = File.read()
# this sets data equal to the string containing the entire text in File
# This is usually NOT what you want to do, but without further explanation,
# I'll leave it be
data.split()
# this isn't an in-place operation, so you built a list out of the string
# data, split on whitespace, then threw it away since you didn't assign it
# to anything.
print(data)
# prints your original data variable, because remember data.split() is not
# in-place, you'd have to do data = data.split(), but that's the wrong way
# to do that anyway....
Here's what I THINK you want to do...
filename = input("insert the name of the file: ")
with open(filename, "r") as infile:
    data = infile.readlines()
This uses a context manager (with) instead of File = open(filename), because that's better practice. It frees you from having to type File.close() after you're done with the file, and it also accounts for the fact that Things Can Go Wrong while you're working with files: if for whatever reason your code throws an exception and never GETS to your File.close(), the file object is still closed once execution leaves the with block.
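Here's a small sketch of that closing behaviour. The temporary file and the deliberately raised ValueError are assumptions of mine, there only to simulate something going wrong mid-read:

```python
import os
import tempfile

# Throwaway file so the example doesn't depend on anything on disk.
with tempfile.NamedTemporaryFile("w", suffix=".txt", delete=False) as tmp:
    tmp.write("some text\n")
    sample_path = tmp.name

try:
    with open(sample_path, "r") as infile:
        first_line = infile.readline()
        raise ValueError("something went wrong mid-read")  # simulated failure
except ValueError:
    pass

# The with block closed the file on the way out, despite the exception.
closed_after_error = infile.closed
print(closed_after_error)

os.remove(sample_path)
```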
It also uses .readlines() instead of .read().split(). The two are similar but not identical: .readlines() gives you one string per line, with the trailing newline kept, while .read().split() splits on every run of whitespace. This is still probably NOT what you're trying to do (in most cases you want to just iterate through a file rather than dump all its data into memory), but without more context, I can't help you further.
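To see what each of those calls actually hands back, here's a quick comparison on a made-up two-line file (the file contents and variable names are assumptions for the demo):

```python
import os
import tempfile

with tempfile.NamedTemporaryFile("w", suffix=".txt", delete=False) as tmp:
    tmp.write("one two\nthree\n")
    sample_path = tmp.name

with open(sample_path, "r") as infile:
    lines = infile.readlines()      # one string per line, newline kept

with open(sample_path, "r") as infile:
    words = infile.read().split()   # split on any run of whitespace

with open(sample_path, "r") as infile:
    # Iterating the file object reads it line by line instead of
    # loading everything up front.
    stripped = [line.rstrip("\n") for line in infile]

os.remove(sample_path)
print(lines, words, stripped)
```

Which of the three you want depends on whether you care about lines or about words.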
It also follows PEP 8's naming convention, where CapitalizedNames are classes. File is not a class, it's a file object, so I named it infile instead. I usually use in_ and out for filenames, but YMMV.
If you comment in what you're trying to do with the file, I can write some specific code for you.
I don't mind taking a shot in the dark, but it would help if you posted one of the files to be read. You also need to account for the person not knowing which file they should type, and for what happens when they don't type what you think they will type.
# input() on Python 3; on Python 2 use raw_input() instead
r = input('type the name of the file: ')
with open(r, 'r') as myfile:
    for data in myfile:
        print(data.split())
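One possible way to account for a name that doesn't open, sketched as a helper. The split_lines name and the bogus file name are hypothetical, and returning None on failure is just one choice:

```python
def split_lines(filename):
    """Return one list of words per line, or None if the file can't be opened."""
    try:
        with open(filename, "r") as myfile:
            return [line.split() for line in myfile]
    except OSError as err:
        # OSError covers a missing file, a directory, or a permissions problem.
        print("could not open {!r}: {}".format(filename, err))
        return None


# A name that (presumably) doesn't exist, to exercise the failure path.
missing = split_lines("no_such_file_hopefully_12345.txt")
print(missing)
```

In an interactive script you could call this in a loop and keep asking for a name until it returns something other than None.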