
I am not sure why execution never enters the `for` loop.

import csv
import pandas as pd
print('hi')
df = pd.DataFrame
df = pd.DataFrame(columns=['ProductID', 'SKU', 'ISBN', 'UPC','MasterProductID'])

with open(rb'E:\ETL\Python\Client\WORKCAT\shq_client.txt', encoding="utf8") as pf:
    number_of_lines = len(pf.readlines())
    print(number_of_lines)
    csv_file = csv.DictReader(pf, delimiter = '|', skipinitialspace=True, lineterminator='\r\n')
    print(csv_file)
    line_limit = 3
    line_cnt = 0
    print(line_limit)
    for row in csv_file:
        print('before if')
        if line_cnt < number_of_lines:
            print('inside')
            print(row['item_id'], row['short_description'])
            #df.index = line_cnt
            df['ProductID'] = row['item_id']
            print(df)
        else:
            pf.close    
        line_cnt += 1

The output is shown below:

Windows PowerShell
Copyright (C) Microsoft Corporation. All rights reserved.

Try the new cross-platform PowerShell https://aka.ms/pscore6

PS C:\Users\marunachalam> & C:/Users/marunachalam/AppData/Local/Programs/Python/Python37-32/python.exe c:/Users/marunachalam/Downloads/Extract-ProductFeed-Sams.py
hi
339
<csv.DictReader object at 0x0C9FB070>
3
PS C:\Users\marunachalam> 

I don't understand why execution never enters the `for` loop, which is supposed to read the CSV records. The file has 339 lines, as the output shows.

Any help is appreciated. I am a complete beginner in Python.

Thanks

Kingsley
Mani A
  • `number_of_lines = len(pf.readlines())` reads the entire file, so you are now at the end with nothing left to iterate. You could do `pf.seek(0)`, but is there any reason for the count? `line_cnt` will always be less than `number_of_lines`. Are you trying to skip the last line? – tdelaney Feb 24 '20 at 03:32
  • I think you'll find other problems once that one is solved. `pandas` has CSV processors like [pandas.read_csv](https://pandas.pydata.org/pandas-docs/stable/reference/api/pandas.read_csv.html#pandas-read-csv) that read the file without resorting to the Python `csv` module. The goal is to keep processing inside pandas, which doesn't have to make a Python object for every basic C object. – tdelaney Feb 24 '20 at 03:38
  • Thank you! I understand now. I am trying to read the file and load it into data frames. But my real file is huge (250k lines), and when I tried `read_csv` I kept getting `MemoryError`. I posted about that here and was redirected to related posts, but none of them worked (even the `chunksize` option with `chunksize=1`), and I am not sure where exactly the issue was. I ended up using the `DictReader` approach, but I still have to do a lot of aggregation on this data after it is loaded. – Mani A Feb 24 '20 at 19:31
  • Does this answer your question? [Import CSV file as a pandas DataFrame](https://stackoverflow.com/questions/14365542/import-csv-file-as-a-pandas-dataframe) – Trenton McKinney Sep 25 '20 at 03:30
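
The behaviour described in the first comment can be reproduced without the original file. The sketch below uses an in-memory `io.StringIO` as a hypothetical stand-in for `shq_client.txt` (the sample rows are invented; only the `item_id` and `short_description` column names come from the question). It shows that a `DictReader` built after `readlines()` yields nothing, and that rewinding with `seek(0)` restores the rows. The `islice` cap is one possible way to honour `line_limit`, not something the question itself uses.

```python
import csv
import io
from itertools import islice

# Hypothetical stand-in for shq_client.txt: pipe-delimited with a header row.
sample = io.StringIO(
    "item_id|short_description\n"
    "1001|Widget\n"
    "1002|Gadget\n"
    "1003|Gizmo\n"
)

# readlines() consumes the whole stream and leaves the position at the end.
number_of_lines = len(sample.readlines())

# A reader created now has nothing left to read, so the loop body never runs.
exhausted_rows = list(csv.DictReader(sample, delimiter="|"))

# Rewind so the reader can see the header and the data rows again.
sample.seek(0)
reader = csv.DictReader(sample, delimiter="|", skipinitialspace=True)

line_limit = 3  # same cap as in the question
first_rows = [
    (row["item_id"], row["short_description"])
    for row in islice(reader, line_limit)
]

print(number_of_lines)  # counts the header line too
print(exhausted_rows)
print(first_rows)
```

If the count is not actually needed, dropping the `readlines()` call avoids the problem without any `seek(0)`.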

0 Answers
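
For reference, the `pandas.read_csv` route suggested in the comments might look like the sketch below. It is only an illustration under stated assumptions: the in-memory sample data is invented, the chunk size of 2 is arbitrary, and mapping `item_id` to `ProductID` is a guess based on the assignment in the question. It does not claim to resolve the `MemoryError` reported for the real 250k-line file, although reading only the needed columns with `usecols` and iterating in chunks generally keeps less data in memory at once.

```python
import io

import pandas as pd

# Hypothetical stand-in for the real pipe-delimited file.
sample = io.StringIO(
    "item_id|short_description\n"
    "1001|Widget\n"
    "1002|Gadget\n"
    "1003|Gizmo\n"
)

total_rows = 0
parts = []

# chunksize makes read_csv return an iterator of DataFrames instead of one
# large frame; usecols limits parsing to the column that is actually used.
for chunk in pd.read_csv(
    sample,
    sep="|",
    skipinitialspace=True,
    usecols=["item_id"],
    dtype={"item_id": str},
    chunksize=2,  # arbitrary small value for the demonstration
):
    total_rows += len(chunk)  # per-chunk aggregation can happen here
    parts.append(chunk.rename(columns={"item_id": "ProductID"}))

# Build the final frame once, rather than assigning row by row.
df = pd.concat(parts, ignore_index=True)

print(total_rows)
print(df)
```

Collecting the pieces and concatenating once also sidesteps the row-by-row `df['ProductID'] = ...` assignment in the question, which overwrites the whole column on every iteration instead of appending a row.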