Use data from different files with same names but different extensions to get the line numbers

Question

I use the following code:

from collections import defaultdict
import sys
import os

for doc in os.listdir('path1'):
    doc1 = "path1" + doc
    doc2 = "path2" + doc
    doc3 = "path3" + doc

    with open(doc1, "r") as words:
        sent = words.read().split()
        print sent
    linenos = {}

    with open(doc2, "r") as f1:
        for i, line in enumerate(f1):
            for word in sent:
                if word in line:
                    if word in linenos:
                        linenos[word].append(i + 1)
                    else:
                        linenos[word] = [i + 1]

    matched2 = []
    for word in sent:
        if word in linenos:
            matched2.append('%s %r' % (word, linenos[word][0]))
        else:
            matched2.append('%s <does not exist>' % word)

    with open(doc3, "w") as f1:
        f1.write(', '.join(matched2))

So, my path1 contains files like file1.title, file2.title, and so on up to file240.title.

Similarly, path2 contains files like file1.txt, file2.txt, and so on up to file240.txt.

For example:

file1.title will have data like:

military  troop deployment number need

file1.txt will have:

foreign 1242
military 23020
firing  03848
troop 2939
number 0032
dog 1234
cat 12030
need w1212

OUTPUT:

path3/file1.txt

military 2, troop 4, deployment <does not exist>, number 5, need 8

Basically, the code finds the line numbers in file1.txt of the words read from file1.title. It works fine when I process a single file at a time, but I need it to run over a folder full of documents.

That is, it should read the words from file1.title and get their line numbers from file1.txt, then read the words from file2.title and get their line numbers from file2.txt, and so on.

The problem is that this code cannot open the matching file with a different extension. How should I modify it to get the appropriate output?

Possible duplicate of [changing file extension in python](http://stackoverflow.com/questions/2900035/changing-file-extension-in-python) – R Nar Nov 17 '15 at 17:18
No. I don't want to rename anything, but to use two files with different extensions to get the line numbers – Ana_Sam Nov 17 '15 at 17:19
When asking a question on SO, try boiling it down to a short, self-contained example. Most of the code and explanation has nothing to do with your actual problem. – Falko Nov 17 '15 at 17:25
Sorry, I am still learning how to use Stack Overflow. I will do that from now on. – Ana_Sam Nov 17 '15 at 17:28

score 2 · Accepted Answer · answered Nov 17 '15 at 17:24

I guess you're asking how to replace the extension in a filename string, as follows:

doc2 = "path2" + doc[:-6] + ".txt"

This strips the 6 characters ".title" from doc and adds the extension ".txt".
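
A slice such as doc[:-6] only works while the extension is exactly six characters long. As a hedged alternative, os.path.splitext splits off whatever extension is present, and os.path.join adds the path separator that plain string concatenation leaves out. The helper name paired_path and the directory names below are illustrative assumptions, not part of the original code.

```python
import os


def paired_path(filename, target_dir, new_ext):
    """Return target_dir/<base name of filename><new_ext>.

    Hypothetical helper: the name and signature are illustrative only.
    """
    # splitext("file1.title") gives ("file1", ".title")
    base, _old_ext = os.path.splitext(filename)
    return os.path.join(target_dir, base + new_ext)


# Example: file1.title from path1 maps to file1.txt in path2.
doc = "file1.title"
doc2 = paired_path(doc, "path2", ".txt")
print(doc2)
```

Inside the loop from the question, doc2 could then be built with paired_path(doc, "path2", ".txt") instead of "path2" + doc.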

answered Nov 17 '15 at 17:24

Falko

score 1 · Answer 2 · answered Nov 17 '15 at 17:40

Are you looking to do something like this?

import os

# Collect the base names of all .txt and .title files in the current directory
names = set(os.path.splitext(fname)[0] for fname in os.listdir('.')
            if os.path.splitext(fname)[1] in ('.txt', '.title'))
for name in names:
    with open(name + '.txt') as f:
        f1 = f.read()
    with open(name + '.title') as f:
        f2 = f.read()
    # Do whatever with the file contents
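
The loop above fails with an error from open() when a base name has only one of the two files. A minimal sketch of one way to guard against that, assuming both kinds of file live in the same directory, is to keep only the base names that appear with both extensions. The function name find_pairs and its parameters are hypothetical.

```python
import os


def find_pairs(directory, ext_a=".title", ext_b=".txt"):
    """Return sorted base names that have both an ext_a and an ext_b file.

    Hypothetical helper for illustration; not part of the answer above.
    """
    seen = {ext_a: set(), ext_b: set()}
    for fname in os.listdir(directory):
        base, ext = os.path.splitext(fname)
        if ext in seen:
            seen[ext].add(base)
    # Only names present under both extensions form a complete pair.
    return sorted(seen[ext_a] & seen[ext_b])
```

Each name returned by find_pairs('.') can then be opened as name + '.title' and name + '.txt' without hitting a missing file.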

answered Nov 17 '15 at 17:40

Adam Acosta

I wanted to strip the extension and perform the necessary functions. The previous answer was what I wanted. Thanks for your time – Ana_Sam Nov 17 '15 at 17:42

score 0 · Answer 3 · answered Nov 17 '15 at 17:18

I think you just need to pass the full file name to open(docx, 'w'). For example, replace doc1 with 'file1.title' and doc2 with 'file1.txt'. I don't know if that's what you're already doing, but the extension matters when you open a file.

answered Nov 17 '15 at 17:18

Seraf

I want this process to run on a folder full of files, not on a single file at a time. It works for single files – Ana_Sam Nov 17 '15 at 17:20
