
I am comparing two Excel files: for each value in one column of the first file, I search for it in the second file, and if it is not present there, I write the whole row to a text file.

My Excel files are very large; they contain about 290,000 rows.

Here is what I have tried:

    import sys
    import pandas as pd

    orig_stdout = sys.stdout
    f = open('out.txt', 'w')
    sys.stdout = f

    df0 = pd.ExcelFile('1.xlsx').parse('Sheet1')
    df1 = pd.ExcelFile('v2.xlsx').parse('Sheet1')

    print(df0[~df0['initial_id'].isin(df1['initial_id'])])

    sys.stdout = orig_stdout
    f.close()

    print('Done.')

The goal is to compare each value in the `initial_id` column and, if it is not present in the second Excel file, print the whole row from the first file to the output file.

Actual Result

21  EXCLAMATION MARK            A1  INVERTED EXCLAMATION MARK
22  QUOTATION MARK              A2  CENT SIGN
23  NUMBER SIGN                 A3  POUND SIGN
24  DOLLAR SIGN                 A4  CURRENCY SIGN
25  PERCENT SIGN                A5  YEN SIGN
26  AMPERSAND                   A6  BROKEN BAR
27  APOSTROPHE                  A7  SECTION SIGN
...     ...                    ...       ...
3159  DIGIT NINE                  B9  SUPERSCRIPT ONE
3160  COLON                       BA  MASCULINE ORDINAL INDICATOR
3161  SEMICOLON                   BB  RIGHT-POINTING DOUBLE ANGLE QUOTATION MARK
3162  LESS-THAN SIGN              BC  VULGAR FRACTION ONE QUARTER
3163  EQUALS SIGN                 BD  VULGAR FRACTION ONE HALF

Expected Result

The lines missing after row 27 should also be written to the file. If holding everything in RAM is a problem, splitting the output into several part files would also work.

Akbar
  • This may be a duplicate of [How to remove ellipsis from a row in a Python Pandas series or data frame, shown when long lines/wide columns are truncated?](https://stackoverflow.com/questions/21028819/how-to-remove-ellipsis-from-a-row-in-a-python-pandas-series-or-data-frame-shown). – John Anderson Jan 29 '19 at 03:55
  • Umm...why wouldn't you just use `to_csv` or something? – alkasm Jan 29 '19 at 05:42
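
A minimal sketch of the `to_csv` idea from the comment above, not a tested solution for the real workbooks. It assumes the `...` row in the output comes from pandas abbreviating a long frame when it is printed, and that a tab-separated text file is acceptable. The helper names `rows_missing_from` and `write_rows` and the small in-memory frames are hypothetical stand-ins; with the actual files the frames would be loaded from `1.xlsx` and `v2.xlsx` as in the question.

```python
import pandas as pd


def rows_missing_from(left: pd.DataFrame, right: pd.DataFrame,
                      key: str = "initial_id") -> pd.DataFrame:
    """Return the rows of `left` whose `key` value does not occur in `right`."""
    return left[~left[key].isin(right[key])]


def write_rows(df: pd.DataFrame, path: str) -> None:
    """Write every row to a tab-separated text file.

    Unlike print(df), to_csv writes the data itself rather than the
    abbreviated display representation.
    """
    df.to_csv(path, sep="\t", index=False)


# Hypothetical stand-ins for the two workbooks. With the real files these
# would be loaded as in the question, e.g.
# pd.ExcelFile('1.xlsx').parse('Sheet1') and pd.ExcelFile('v2.xlsx').parse('Sheet1').
df0 = pd.DataFrame({
    "initial_id": list(range(100)),
    "name": [f"row {i}" for i in range(100)],
})
df1 = pd.DataFrame({"initial_id": list(range(0, 100, 2))})

missing = rows_missing_from(df0, df1)
write_rows(missing, "out.txt")
```

Writing through `to_csv` also removes the need to redirect `sys.stdout`. If memory during writing is a concern, `to_csv` accepts a `chunksize` argument that sets how many rows are written at a time.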

0 Answers