6

I am working on a custom file path class, which should always execute a function after the corresponding system file has been written to and its file object closed. The function will upload the contents of file path to a remote location. I want the upload functionality to happen entirely behind the scenes from a user perspective, i.e. the user can use the class just like any other os.PathLike class and automatically get the upload functionality. Psuedo code below for refernce.

import os

class CustomPath(os.PathLike):
    
    def __init__(self, remote_path: str):
        self._local_path = "/some/local/path"
        self._remote_path = remote_path

    def __fspath__(self) -> str:
        return self._local_path
    
    def upload(self):
        # Upload local path to remote path.

I can of course handle automatically calling the upload function for when the user calls any of the methods directly.

However, it unclear to me how to automatically call the upload function if someone writes to the file with the builtin open as follows.

custom_path = CustomPath("some remote location")

with open(custom_path, "w") as handle:
    handle.write("Here is some text.")

or

custom_path = CustomPath("some remote location")

handle = open(custom_path, "w")
handle.write("Here is some text.")
handle.close()

I desire compatibility with invocations of the open function, so that the upload behavior will work with all third party file writers. Is this kind of behavior possible in Python?

scruffaluff
  • 345
  • 4
  • 20
  • 1
    I strongly suggest you learn about the [contextlib.contextmanager](https://docs.python.org/3/library/contextlib.html#contextlib.contextmanager). (if you don't understand). A simple example: https://stackoverflow.com/questions/60367476/context-manager-that-handles-exceptions – Carson Aug 27 '20 at 01:24
  • Can you give an example of "third party file writer" ? What wouldn't fit your requirement if you wrote a separate `custom_open` function ? – LeGEC Aug 27 '20 at 20:10
  • One of the ones I was thinking of when I posted was `h5py.File`. However now I realize my question was ill conceived, since third party file writers could be using compiled code to write the files instead of going through the Python interpreter. I guess my question would be better formed as “Can Python register a callback function for a syscall close event and associate it with a file system path?”. – scruffaluff Aug 27 '20 at 22:24
  • Need to reformulate the question for clarity in the previous comment to “Can Python register a callback function for ANY syscall close event for ANY file object created that corresponds to a SPECIFIC file system path?” – scruffaluff Aug 27 '20 at 22:36

2 Answers2

4

Yes, it is possible with Python by making use of Python's function overriding, custom context manager and __ getattr __ facilities. Here's the basic logic:

  • override the builtins.open() function with custom open() class.
  • make it compatible with context manager using __ enter __ and __ exit__ methods.
  • make it compatible with normal read/write operations using __ getattr __ method.
  • call builtins method from the class whenever necessary.
  • invoke automatically callback function when close() method is called.

Here's the sample code:

import builtins
import os

to_be_monitered = ['from_file1.txt', 'from_file2.txt']   

# callback function (called when file closes)
def upload(content_file):
    # check for required file
    if content_file in to_be_monitered:
        # copy the contents
        with builtins.open(content_file, 'r') as ff:
            with builtins.open(remote_file, 'a') as tf:
                # some logic for writing only new contents can be used here 
                tf.write('\n'+ff.read())
    
                

class open(object):
    def __init__(self, path, mode):
        self.path = path
        self.mode = mode

    # called when context manager invokes
    def __enter__(self):
        self.file = builtins.open(self.path, self.mode)
        return  self.file

    # called when context manager returns
    def __exit__(self, *args):
        self.file.close()
        # after closing calling upload()
        upload(self.path)
        return True
    
    # called when normal non context manager invokes the object
    def __getattr__(self, item):
        self.file = builtins.open(self.path, self.mode)
        # if close call upload()
        if item == 'close':
            upload(self.path)
        return getattr(self.file, item)


if __name__ == '__main__':
    
    remote_file  = 'to_file.txt'
    local_file1  = 'from_file1.txt' 
    local_file2  = 'from_file2.txt' 

    # just checks and creates remote file no related to actual problem
    if not os.path.isfile(remote_file):
        f = builtins.open(remote_file, 'w')
        f.close()

    # DRIVER CODE
    # writing with context manger
    with open(local_file1, 'w') as f:
        f.write('some text written with context manager to file1')

    # writing without context manger
    f = open(local_file2, 'w')
    f.write('some text written without using context manager to file2')
    f.close()

    # reading file
    with open(remote_file, 'r') as f:
        print('remote file contains:\n', f.read())

What does it do:

Writes "some text written with context manager to file1" to local_file1.txt and "some text written without context manager to file2" to local_file2.txt meanwhile copies these text to remote_file.txt automatically without copying explicitly.

How does it do:(context manager case)

with open(local_file1, 'w') as f: cretes an object of custom class open and initializes it's path and mode variables. And calls __ enter __ function(because of context manager(with as block)) which opens the file using builtins.open() method and returns the _io.TextIOWrapper (a opened text file object) object. It is a normal file object we can use it normally for read/write operations. After that context manger calls __ exit __ function at the end which(__ exit__) closess the file and calls required callback(here upload) function automatically and passes the file path just closed. In this callback function we can perform any operations like copying.

Non-context manger case also works similarly but the difference is __ getattr __ function is the one making magic.

Here's the contents of file's after the execution of code:

from_file1.txt

some text written with context manager to file1

from_file2.txt

some text written without using context manager to file2

to_file.txt

some text written with context manager to file1
some text written without using context manager to file2
Girish Hegde
  • 1,420
  • 6
  • 16
  • I feel that I was unclear in my question. I originally accepted the question because it seemed that you answered the question correctly based on your interpretation. I am not asking how to provide the user an `open` function that they can use on the class. I am asking how to override the builtin `open` function for *third* party file writers not written by the user. – scruffaluff Aug 27 '20 at 06:40
2

Based on your comment to Girish Dattatray Hegde, it seems that what you would like to do is something like the following to override the default __exit__ handler for open:

import io


old_exit = io.FileIO.__exit__ # builtin __exit__ method


def upload(self):
    print(self.read()) # just print out contents


def new_exit(self):
    try:
        upload(self)
    finally:
        old_exit(self) # invoke the builtin __exit__ method


io.FileIO.__exit__ = new_exit # establish our __exit__ method


with open('test.html') as f:
    print(f.closed) # False
print(f.closed) # True

Unfortunately, the above code results in the following error:

test.py", line 18, in <module>
    io.FileIO.__exit__ = new_exit # establish our __exit__ method
TypeError: can't set attributes of built-in/extension type '_io.FileIO'

So, I don't believe it is possible to do what you want to do. Ultimately you can create your own subclasses and override methods, but you cannot replace methods of the exiting builtin open class.

Booboo
  • 38,656
  • 3
  • 37
  • 60
  • Is there any possible way to somehow execute custom code when a file object corresponding to the system path is closed by a third party file writer without a manual watcher set up by the user? – scruffaluff Aug 27 '20 at 15:26
  • If you mean a "file object in the system path directories"" I personally do not know how except by monitoring the directories. – Booboo Aug 27 '20 at 17:51
  • Same here. Thanks for your help though. – scruffaluff Aug 27 '20 at 22:16