Here is the relevant code:

    for file in files:
        with readfile(file) as openfile:
            molecules.append(process_file_fn(openfile))

and I am getting this error from the code above:

    src/datamodules/components/edm/process.py", line 92, in process_xyz_files
        with readfile(file) as openfile:
    AttributeError: __enter__

Here is the definition of `readfile`:

    if tarfile.is_tarfile(data):
        tardata = tarfile.open(data, "r")
        files = tardata.getmembers()

        def readfile(data_pt): return tardata.extractfile(data_pt)

My data is 1234.xyz.tar.bz2

Any insights or suggestions are appreciated. Thank you in advance.

I tried specifying read mode in both the function and the loop, but I get the same error.

matszwecja
  • 3
    This might be useful: [Explaining Python's `__enter__` and `__exit__`](https://stackoverflow.com/questions/1984325/). – Jorge Luis May 09 '23 at 07:44
  • This [question](https://stackoverflow.com/questions/51427729/python-error-attributeerror-enter) too – Laassairi Abdellah May 09 '23 at 07:45
  • 1
    Your code looks weird. You are trying to use a context manager, but you are also using an explicit open, which is what we're trying to avoid with context managers. – matszwecja May 09 '23 at 07:49
  • If you want to write your own [context manager](https://docs.python.org/3.11/library/stdtypes.html#context-manager-types), I would recommend using the decorator [`@contextlib.contextmanager`](https://docs.python.org/3/library/contextlib.html#contextlib.contextmanager) – Jorge Luis May 09 '23 at 07:54
  • Thank you all for the suggestions. However, my `def readfile` is inside another `def` called `process_file`. How do I define a class in that case? – Rose Vanilla May 09 '23 at 08:44
  • 1
    Why do you want to use `readfile` as a context manager instead of a direct call `openfile = readfile(file)`? – Jorge Luis May 09 '23 at 08:51
  • @DarkKnight readfile is a function. Here is the code: def readfile(data_pt): return tardata.extractfile(data_pt) – Rose Vanilla May 09 '23 at 09:21

3 Answers

1

I would say you don't need a context manager for your use case. I think you can make a direct call instead: `openfile = readfile(file)`.

If you really need a context manager, you can define your function like so:

import contextlib

@contextlib.contextmanager
def readfile(data_pt): yield tardata.extractfile(data_pt)

The `contextlib.contextmanager` decorator wraps the generator function so that calling it returns an object with `__enter__` and `__exit__` methods; you don't have to write them yourself.
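As a rough sketch of how the decorated generator behaves inside a `with` block, the version below adds a `try`/`finally` so the extracted file is closed on exit. The in-memory archive, the `make_demo_archive` helper, and the member name `demo.xyz` are made up for illustration; only `readfile` and `tardata` come from the question.

```python
import contextlib
import io
import tarfile


def make_demo_archive() -> bytes:
    """Build a tiny in-memory tar archive (hypothetical demo data)."""
    buffer = io.BytesIO()
    with tarfile.open(fileobj=buffer, mode="w") as archive:
        payload = b"2\ncomment\nH 0 0 0\nH 0 0 1\n"
        info = tarfile.TarInfo(name="demo.xyz")
        info.size = len(payload)
        archive.addfile(info, io.BytesIO(payload))
    return buffer.getvalue()


# Stand-in for the `tardata` object from the question
tardata = tarfile.open(fileobj=io.BytesIO(make_demo_archive()), mode="r")


@contextlib.contextmanager
def readfile(data_pt):
    openfile = tardata.extractfile(data_pt)
    try:
        # The yielded value is what `as openfile` receives in the with statement
        yield openfile
    finally:
        # extractfile() can return None, so only close a real file object
        if openfile is not None:
            openfile.close()


with readfile("demo.xyz") as openfile:
    first_line = openfile.readline()
```

Everything before the `yield` runs on entering the block and the `finally` part runs on leaving it, including when the body raises.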

Jorge Luis
  • Thanks a lot. I tried your suggestion of writing a direct call. However, how do I write `return tardata.extractfile(data_pt)` or `return openfile()` with that? – Rose Vanilla May 09 '23 at 09:19
  • @Ruby Sorry, I don't understand your question. – Jorge Luis May 09 '23 at 09:33
  • Oh, sorry for the confusion. I think your code works, but I get this error: "xyz_lines = [line.decode("UTF-8") for line in datafile.readlines()] AttributeError: 'NoneType' object has no attribute 'readlines'". I think this error means it's not returning the extracted file? I implemented the function with the context manager. – Rose Vanilla May 09 '23 at 23:54
  • 1
    To answer, I will need to see your code. Please edit your question to provide enough code for us to reproduce the error you get. [How to create a Minimal, Reproducible Example](https://stackoverflow.com/help/minimal-reproducible-example) – Jorge Luis May 10 '23 at 06:08
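One possible cause of the `'NoneType' object has no attribute 'readlines'` error quoted above (an assumption, since the archive's contents are not shown) is that `getmembers()` also lists directories, and `TarFile.extractfile()` returns `None` for members that are neither regular files nor links. On Python versions before 3.11, `with None:` also fails with `AttributeError: __enter__`, so the original error may have the same origin. A minimal sketch that skips such members follows; the helper names and the archive layout are hypothetical.

```python
import io
import tarfile


def build_archive_with_directory() -> bytes:
    """Create an in-memory archive with one directory and one file (demo data)."""
    buffer = io.BytesIO()
    with tarfile.open(fileobj=buffer, mode="w") as archive:
        folder = tarfile.TarInfo(name="data")
        folder.type = tarfile.DIRTYPE
        archive.addfile(folder)
        payload = b"1\ncomment\nHe 0 0 0\n"
        info = tarfile.TarInfo(name="data/1234.xyz")
        info.size = len(payload)
        archive.addfile(info, io.BytesIO(payload))
    return buffer.getvalue()


def read_regular_members(raw: bytes) -> dict:
    """Return {member name: list of decoded lines} for regular files only."""
    contents = {}
    with tarfile.open(fileobj=io.BytesIO(raw), mode="r") as tardata:
        for member in tardata.getmembers():
            # Skip directories and other non-file members up front
            if not member.isfile():
                continue
            datafile = tardata.extractfile(member)
            contents[member.name] = [
                line.decode("UTF-8") for line in datafile.readlines()
            ]
    return contents


raw = build_archive_with_directory()
with tarfile.open(fileobj=io.BytesIO(raw), mode="r") as archive:
    # Extracting a directory member gives None rather than a file object
    directory_result = archive.extractfile("data")

contents = read_regular_members(raw)
```

Checking `member.isfile()` before calling `extractfile()` keeps the loop from ever handing `None` to the processing function.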
0

To use a `with` statement like that, the object returned by `readfile(file)` must implement the `__enter__` and `__exit__` methods, for example by writing a class that defines them.

Have a look at the responses here: Implementing use of 'with object() as f' in custom class in python

If you don't want to implement the context manager you could try changing your for loop to:

for file in files:
    openfile = readfile(file)
    molecules.append(process_file_fn(openfile))
s_pike
  • Thanks for the suggestion. My `def readfile` is inside another `def` called `process_file`. How do I define a class in that case? – Rose Vanilla May 09 '23 at 08:40
  • You'd probably want to refactor your code to do that, though you could just write the class where your function is. But from reading through the other answers/comments, Jorge Luis's suggestion might be best. If you're writing this as a one-off for a piece of analysis, you can probably get away with a simpler solution. – s_pike May 09 '23 at 10:11
  • Thank you for clarifying @s_pike. I used Jorge Luis' contextlib code, but I get this error: "xyz_lines = [line.decode("UTF-8") for line in datafile.readlines()] AttributeError: 'NoneType' object has no attribute 'readlines'". I think this error means it's not returning the extracted file? – Rose Vanilla May 10 '23 at 00:12
  • Yes, that's right. Looks like a different error now. Follow Jorge's suggestion of creating a minimal, reproducible example - it not only helps others solve the issue, it will also help you understand your code better, and maybe even fix it yourself. – s_pike May 10 '23 at 08:12
0

Not sure if this helps but here goes anyway...

You could write your own context manager class that would handle a single tar file. For the sake of simplicity we'll assume that all we want to do is extract a known member of the archive. Thus our class could look like this:

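import tarfile

class TarfileHandler:
    """Context manager that opens a tar archive lazily and closes it on exit."""

    def __init__(self, filename):
        self._filename = filename
        self._fd = None

    @property
    def fd(self):
        # Open the archive on first access only
        if self._fd is None:
            self._fd = tarfile.open(self._filename)
        return self._fd

    def extract(self, member):
        # Return a file-like object for the member, or None on any failure
        try:
            return self.fd.extractfile(member)
        except Exception:
            return None

    def __enter__(self):
        return self

    def __exit__(self, *_):
        # Close the archive if it was ever opened
        if self._fd:
            self._fd.close()
            self._fd = None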

Now let's contrive a use-case. We know where the tar file is. We know that it contains 'foo.txt'. We want to extract 'foo.txt' and copy it somewhere.

TARFILE = 'mytarfile.tar'
MEMBER = 'foo.txt'
TARGET = 'foo.txt'

with TarfileHandler(TARFILE) as tfh:
    if data := tfh.extract(MEMBER):
        with open(TARGET, 'wb') as out:
            out.write(data.read())

Hopefully this shows you how to implement a context manager class and how you might adapt it to your needs.
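For simple cases it may be worth noting that `tarfile.TarFile` objects already support the `with` statement, so a custom class is optional. The sketch below is one possible adaptation, not the approach from this answer: the helper `extract_member_bytes` and the in-memory archive are hypothetical, and it catches only `KeyError` (raised for an unknown member name) instead of every exception.

```python
import io
import tarfile


def extract_member_bytes(archive_bytes: bytes, member: str):
    """Return the bytes of `member`, or None if it is missing or not a file."""
    # TarFile is itself a context manager, so the archive is closed on exit
    with tarfile.open(fileobj=io.BytesIO(archive_bytes)) as archive:
        try:
            data = archive.extractfile(member)
        except KeyError:
            # The member name is not present in the archive
            return None
        # extractfile() gives None for directories and similar members
        return data.read() if data is not None else None


# Build a small demo archive in memory (made-up contents)
buffer = io.BytesIO()
with tarfile.open(fileobj=buffer, mode="w") as archive:
    payload = b"hello\n"
    info = tarfile.TarInfo(name="foo.txt")
    info.size = len(payload)
    archive.addfile(info, io.BytesIO(payload))

found = extract_member_bytes(buffer.getvalue(), "foo.txt")
missing = extract_member_bytes(buffer.getvalue(), "bar.txt")
```

Catching a narrow exception type keeps unrelated problems, such as a corrupt archive, visible instead of silently turning them into `None`.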

DarkKnight