2

I'm having 10 CSV files each of the CSV file is having same number of column from which I'm reading data one by one in the form of pandas data frame. I want those data to be displayed in a window or like in some table form. And it should be like if every time the data gets into new row. Any suggestions on this ?

Below is my sample CSV file : enter image description here

Like this, there are 10 or more CSV file and I will be reading data from those file one by one and want to display in GUI.

Brief Introduction to my Application

I have a machine that is generating CSV files after a certain interval of time into a folder. I am using Watchdog library to put a watch on the folder where the CSV files are being generated. When I receive a CSV file I Read it into a pandas data frame. Sample CSV file is given above.

As far as the machine is running it will keep generating the CSV files. So if I want to see the data I need to open each and every CSV files, Instead, I want a View in which the Data gets updated when there is a new CSV file generated.

So Technically One CSV file is getting read gets converted into a data frame and then inserted into some sort of Table View. And this process happens again when a new CSV file is generated, But now the data should be kept in the next row of the same Table View.

Here is my main file :

import time
from watchdog.observers import Observer
from watchdog.events import PatternMatchingEventHandler
import pandas as pd
from Append_Function import append_df_to_excel
import os.path
import sys

class Watcher:
    def __init__(self, args):
        self.watch_dir = os.getcwd()
        print(args[0])
        self.directory_to_watch = os.path.join(self.watch_dir, args[1])
        self.observer = Observer()
        self.event_handler = Handler(patterns=["*.CSV"], ignore_patterns=["*.tmp"], ignore_directories=True)

    def run(self):
        self.observer.schedule(self.event_handler, self.directory_to_watch, recursive=False)
        self.observer.start()
        try:
            while True:
                time.sleep(1)
        except:
            self.observer.stop()
            print("Error")

        self.observer.join()


class Handler(PatternMatchingEventHandler):
    @staticmethod
    def on_any_event(event):
        if event.is_directory:
            return None
        elif event.event_type == 'created':
            # Take any action here when a file is first created.
            print("Received created event - %s." % event.src_path)
            df = pd.read_csv(event.src_path, header=1, index_col=0)
            append_df_to_excel(os.path.join(os.getcwd(), "myfile.xlsx"), df)
        elif event.event_type == 'modified':
            # Taken any actionc here when a file is modified.
            df = pd.read_csv(event.src_path, header=0, index_col=0)
            append_df_to_excel(os.path.join(os.getcwd(), "myfile.xlsx"), df)
            print("Received modified event - %s." % event.src_path)


if __name__ == '__main__':
    print(sys.argv)
    w = Watcher(sys.argv)
    w.run()

Here is my Append Function:

import pandas as pd
import openpyxl as ox


def append_df_to_excel(filename, df, sheet_name='Sheet1', startrow=None,
                       truncate_sheet=False,
                       **to_excel_kwargs):
    # ignore [engine] parameter if it was passed

    if 'engine' in to_excel_kwargs:
        to_excel_kwargs.pop('engine')

    writer = pd.ExcelWriter(filename, engine='openpyxl')

    # Python 2.x: define [FileNotFoundError] exception if it doesn't exist
    try:
        FileNotFoundError
    except NameError:
        FileNotFoundError = IOError

    try:
        # try to open an existing workbook
        writer.book = ox.load_workbook(filename,keep_vba=True)

        # get the last row in the existing Excel sheet
        # if it was not specified explicitly
        if startrow is None and sheet_name in writer.book.sheetnames:
            startrow = writer.book[sheet_name].max_row

        # truncate sheet
        if truncate_sheet and sheet_name in writer.book.sheetnames:
            # index of [sheet_name] sheet
            idx = writer.book.sheetnames.index(sheet_name)
            # remove [sheet_name]
            writer.book.remove(writer.book.worksheets[idx])
            # create an empty sheet [sheet_name] using old index
            writer.book.create_sheet(sheet_name, idx)

        # copy existing sheets
        writer.sheets = {ws.title: ws for ws in writer.book.worksheets}
    except FileNotFoundError:
        # file does not exist yet, we will create it
        pass

    if startrow is None:
        startrow = 0

    # write out the new sheet
    df.to_excel(writer, sheet_name, startrow=startrow, **to_excel_kwargs, header=True)

    # save the workbook
    writer.save()
Viral Parmar
  • 468
  • 2
  • 9
  • 24

2 Answers2

3

You have to add the dataframe through a loop:

import pandas as pd
from PyQt5 import QtCore, QtWidgets

class DataFrameTableWidget(QtWidgets.QTableWidget):
    def append_dataframe(self, df):
        df = df.copy()
        if df.columns.size > self.columnCount():
            self.setColumnCount(df.columns.size)
        r = self.rowCount()
        self.insertRow(r)
        for c, column in enumerate(df):
            it = QtWidgets.QTableWidgetItem(column)
            self.setItem(r, c, it)
        i = self.rowCount()
        for r, row in df.iterrows():
            self.insertRow(self.rowCount())
            for c, (column, value) in enumerate(row.iteritems()):
                it = QtWidgets.QTableWidgetItem(str(value))
                self.setItem(i+r , c, it)

if __name__ == '__main__':
    import sys
    app = QtWidgets.QApplication(sys.argv)
    import numpy as np
    w = DataFrameTableWidget()
    df = pd.DataFrame(np.random.randint(0, 100,size=(4, 4)), columns=list('ABCD'))
    w.append_dataframe(df)

    def after_show():
        df = pd.DataFrame(np.random.randint(0, 100,size=(4, 4)), columns=list('ABCD'))
        w.append_dataframe(df)
    QtCore.QTimer.singleShot(2*1000, after_show)
    w.resize(640, 480)
    w.show()
    sys.exit(app.exec_())

Update:

The observer runs on another thread so it can not update the GUI from that thread, so a signal must be used to transmit the information:

import os
import time
import pandas as pd
from watchdog.observers import Observer
from watchdog.events import PatternMatchingEventHandler
from PyQt5 import QtCore, QtWidgets

from Append_Function import append_df_to_excel

class Emitter(QtCore.QObject):
    newDataFrameSignal = QtCore.pyqtSignal(pd.DataFrame)

class Watcher:
    def __init__(self, filename):
        self.watch_dir = os.getcwd()
        self.directory_to_watch = os.path.join(self.watch_dir, filename)
        self.emitter = Emitter()
        self.observer = Observer()
        self.event_handler = Handler(
            emitter=self.emitter,
            patterns=["*.CSV"], 
            ignore_patterns=["*.tmp"], 
            ignore_directories=True
        )

    def run(self):
        self.observer.schedule(self.event_handler, self.directory_to_watch, recursive=False)
        self.observer.start()


class Handler(PatternMatchingEventHandler):
    def __init__(self, *args, emitter=None, **kwargs):
        super(Handler, self).__init__(*args, **kwargs)
        self._emitter = emitter
    def on_any_event(self, event):
        if event.is_directory:
            return None
        elif event.event_type == 'created':
            # Take any action here when a file is first created.
            print("Received created event - %s." % event.src_path)
            df = pd.read_csv(event.src_path, header=1)
            self._emitter.newDataFrameSignal.emit(df.copy())
            df.set_index(df.columns.values.tolist()[0], inplace=True)
            append_df_to_excel(os.path.join(os.getcwd(), "myfile.xlsx"), df)
        elif event.event_type == 'modified':
            # Taken any actionc here when a file is modified.
            df = pd.read_csv(event.src_path, header=1)
            self._emitter.newDataFrameSignal.emit(df.copy())
            df.set_index(df.columns.values.tolist()[0], inplace=True)
            append_df_to_excel(os.path.join(os.getcwd(), "myfile.xlsx"), df)
            print("Received modified event - %s." % event.src_path)

class DataFrameTableWidget(QtWidgets.QTableWidget):
    @QtCore.pyqtSlot(pd.DataFrame)
    def append_dataframe(self, df):
        df = df.copy()
        if df.columns.size > self.columnCount():
            self.setColumnCount(df.columns.size)
        r = self.rowCount()
        self.insertRow(r)
        for c, column in enumerate(df):
            it = QtWidgets.QTableWidgetItem(column)
            self.setItem(r, c, it)
        i = self.rowCount()
        for r, row in df.iterrows():
            self.insertRow(self.rowCount())
            for c, (column, value) in enumerate(row.iteritems()):
                it = QtWidgets.QTableWidgetItem(str(value))
                self.setItem(i+r , c, it)

if __name__ == '__main__':
    import sys
    app = QtWidgets.QApplication(sys.argv)
    w = DataFrameTableWidget()
    w.resize(640, 480)
    w.show()
    watcher = Watcher(sys.argv[1])
    watcher.run()
    watcher.emitter.newDataFrameSignal.connect(w.append_dataframe)
    sys.exit(app.exec_())
eyllanesc
  • 235,170
  • 19
  • 170
  • 241
  • Wow! you implemented it so soon! That's Awesome! Let me Try it and I'll Let you know! – Viral Parmar Apr 03 '19 at 18:49
  • Do I still need to use PandasModel Class? – Viral Parmar Apr 03 '19 at 18:52
  • @ViralParmar It is not necessary. – eyllanesc Apr 03 '19 at 19:05
  • ok now I have a question here , i want the Pyqt table view widget to be updated after adding one data frame to the view and so on . What you are doing here is appending all the data and then displaying the table View. – Viral Parmar Apr 03 '19 at 19:39
  • @ViralParmar I have modified the example so that you can see that dataframes can be added before or after the widget is displayed. – eyllanesc Apr 03 '19 at 19:47
  • Let me give you a brief intro of what I am doing, I am having a machine that generates a CSV file after every 1 min. I have used watchdog Library to keep watch on those CSV files. When I receive a CSV file I read it into a pandas data frame. And now I need those data frame to be displayed after every minutes in a next row of table widget. – Viral Parmar Apr 03 '19 at 19:47
  • @ViralParmar Your watchdog must run on another thread and send the data to the main thread by means of a signal, I am working on an example with the library that you point out. – eyllanesc Apr 03 '19 at 19:50
  • Let us [continue this discussion in chat](https://chat.stackoverflow.com/rooms/191184/discussion-between-viral-parmar-and-eyllanesc). – Viral Parmar Apr 03 '19 at 19:51
  • @ViralParmar Can you share some csv files through drive, gist, github or any other similar medium? – eyllanesc Apr 03 '19 at 21:09
  • Let me share you in chat – Viral Parmar Apr 03 '19 at 21:10
  • Let me Check it on my side! – Viral Parmar Apr 03 '19 at 21:54
  • One thing I would like to add here is , I am able to Edit the Data in PyQt Table Widget. – Viral Parmar Apr 03 '19 at 22:09
  • Hi buddy ! Well the code is running fine but I faced a problem while opening the excel file after I close the GUI ..it shows the message that file is corrupted. – Viral Parmar Apr 04 '19 at 18:29
1

You might be looking for:

  • Jupyter notebooks, which are able to display pandas dataframes as HTML formatted tables.
  • Jupyter lab, which includes a GUI CSV viewer.
  • The qgrid extension for jupyter notebooks, which will allow you to interactively filter and edit data.

If your CSV files have the same headers, you might want to concatenate the data to create one single table for review.

FabienP
  • 3,018
  • 1
  • 20
  • 25