
Take this invoice.txt, for example:

Invoice Number

INV-3337

Order Number

12345

Invoice Date

January 25, 2016

Due Date

January 31, 2016

And this is what dict.txt looks like:

Invoice Date

Invoice Number

Due Date

Order Number

I am trying to find the keywords from 'dict.txt' in 'invoice.txt' and then add each keyword, together with the text that comes after it (but before the next keyword), to a two-column data table.

So it would look like:

col1 ----- col2

Invoice number ------ INV-3337

order number ---- 12345

Here is what I have done so far:

with open('C:\invoice.txt') as f:
    invoices = list(f)

with open('C:\dict.txt') as f:
    for line in f:
        dict = line.strip()
        for invoice in invoices:
            if dict in invoice:
                print invoice

This works, but the ordering is wrong (it follows dict.txt rather than invoice.txt).

That is, the output is

Invoice Date

Invoice Number

Due Date

Order Number

instead of the order in invoice.txt, which is

invoice number

order number

invoice date

due date

Can you help me with how I should proceed?

Thank you.

marc_s
jokol

1 Answer


This should work. You can load your invoice data into a list, and your dict data into a set for easy lookup.

with open(r'C:\invoice.txt') as f:
    # keep only non-blank lines, stripped of surrounding whitespace
    invoice_data = [line.strip() for line in f if line.strip()]

with open(r'C:\dict.txt') as f:
    # a set gives fast membership tests
    dict_data = {line.strip() for line in f if line.strip()}

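Now iterate over the invoice lines two at a time and print each keyword/value pair whose keyword is in the set. This assumes that, once blank lines are dropped, keyword lines and value lines strictly alternate.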

for i in range(0, len(invoice_data), 2):
    if invoice_data[i] in dict_data:
        print(invoice_data[i: i + 2])
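
If a value can span more than one line, or a keyword has no value at all, stepping two lines at a time no longer lines up. A minimal sketch of an alternative, assuming keywords match whole lines exactly (case-sensitive): walk the invoice in its own order and attach every non-keyword line to the most recent keyword. The function name `pair_keywords` and the inline sample data are made up for illustration.

```python
def pair_keywords(invoice_lines, keywords):
    """Return [keyword, value] rows in the order they appear in the invoice."""
    rows = []
    for raw in invoice_lines:
        line = raw.strip()
        if not line:
            continue  # skip blank lines
        if line in keywords:
            rows.append([line, ""])  # a keyword starts a new row
        elif rows:
            # any other line belongs to the most recent keyword
            rows[-1][1] = (rows[-1][1] + " " + line).strip()
    return rows


# Sample data mirroring invoice.txt and dict.txt from the question.
invoice_lines = [
    "Invoice Number\n", "\n", "INV-3337\n", "\n",
    "Order Number\n", "\n", "12345\n", "\n",
    "Invoice Date\n", "\n", "January 25, 2016\n", "\n",
    "Due Date\n", "\n", "January 31, 2016\n",
]
keywords = {"Invoice Date", "Invoice Number", "Due Date", "Order Number"}

rows = pair_keywords(invoice_lines, keywords)
for keyword, value in rows:
    print(keyword, "------", value)
```

Because the loop follows the invoice rather than the keyword file, the rows come out in invoice order, which is the ordering asked for in the question.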
cs95
  • the output comes out weird like this : " ['Invoice Date\n', '\n'] " and it did not recognize "Order Number" – jokol Jul 26 '17 at 12:32
  • @jokol It seems you have extra newlines. Check my edit. – cs95 Jul 26 '17 at 12:41
  • hey , I was wondering , how can i put this into an excel file or csv file ? each of the columns in 2 rows. – jokol Jul 27 '17 at 07:15
  • @jokol Info on the csv module: https://docs.python.org/3/library/csv.html – cs95 Jul 27 '17 at 08:11
  • no matter which way I use , I am getting " IndexError : Index out of range " I will post my code as an answer so you may see... – jokol Jul 27 '17 at 12:33
  • @jokol No please don't. Open a new question. – cs95 Jul 27 '17 at 12:34
  • ok. https://stackoverflow.com/questions/45351141/python-get-data-from-a-text-file-and-put-it-in-an-csv-file-list-index-out-of – jokol Jul 27 '17 at 12:42
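
For the CSV follow-up raised in the comments, here is a rough sketch using the standard `csv` module. It assumes the keyword/value pairs have already been collected into a list of two-item rows; the helper name `write_rows` and the header labels `col1`/`col2` (borrowed from the question's mock-up) are illustrative only.

```python
import csv
import io


def write_rows(rows, fileobj):
    """Write a header plus one keyword/value pair per CSV row."""
    writer = csv.writer(fileobj)
    writer.writerow(["col1", "col2"])
    writer.writerows(rows)


rows = [
    ["Invoice Number", "INV-3337"],
    ["Order Number", "12345"],
    ["Invoice Date", "January 25, 2016"],  # the comma is quoted automatically
]

# An in-memory buffer keeps the example self-contained.
buffer = io.StringIO()
write_rows(rows, buffer)
csv_text = buffer.getvalue()
print(csv_text)
```

To write to disk instead, pass a file opened with `open(path, "w", newline="")`, as the csv documentation recommends, so that no extra blank lines appear on Windows.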