
Originally, this question was "deepcopy a string in python", and I am keeping the essential parts of that discussion here for reference. Although I solved the issue, I understand neither its origin nor how the solution differs from the non-working implementation. I do want to understand this, so I took the time to trim the actual code into a minimal working example.

But first, part of the original content: a deep copy of a string returns the exact same string object:

import copy
_1 = "str"
_2 = copy.deepcopy(_1)
_1 is _2  # True
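As a minimal sketch of that behaviour (assuming the standard library `copy` module on CPython; the variable names are illustrative only), `deepcopy` hands back immutable strings unchanged while it rebuilds mutable containers:

```python
import copy

original = "str"
copied = copy.deepcopy(original)
# Strings are immutable, so deepcopy returns the very same object.
same_string = copied is original

items = ["str", [1, 2]]
items_copy = copy.deepcopy(items)
distinct_list = items_copy is not items          # the outer list is a new object
distinct_inner = items_copy[1] is not items[1]   # the nested list is copied as well
shared_string = items_copy[0] is items[0]        # the string inside is still shared
```

Because a string cannot be mutated, sharing it is harmless, so a different `id` would not change how code using the string behaves.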

New content: the questions are inline with the code. Basically, I am decorating TestClass with a decorator that generates standardized 'appender' methods on the class, one for every attribute defined in TestClass.channels. Please note, I am not posting this question to discuss software architecture or implementation, but to understand the internals of Python relevant to the issue: why do all generated 'appenders' perform exactly the same work as the one prepared for the last attribute defined in TestClass.channels?

import pandas as pd


def standardize_appenders(kls):
    # Checks `kls` for `channels`, whether an appropriate pseudo implementation of 'appenders' is
    # available and actually implements the `appenders`: BESIDES AUTOMATION, I AM FORCING A STANDARD
    channels = getattr(kls, 'channels')
    chs = [o for o in dir(channels) if not o.startswith('_')]
    chs_type = [getattr(channels, o) for o in chs]

    for _ch, _ch_type in zip(chs, chs_type):  # ISSUE SOMEWHERE HERE
        ch = _ch
        ch_type = _ch_type

        # THE `def appender(self, value, ch=ch, ch_type=ch_type)` SIGNATURE SOLVES THE PROB BUT I DON'T UNDERSTAND WHY
        def appender(self, value):
            # nonlocal ch, ch_type
            if not isinstance(value, ch_type):
                raise TypeError(f'`{ch}` only accepts values of `{ch_type}` type, but found {type(value)}')
            data: dict = getattr(self, 'data')  # let's not discuss the need for this knowledge please
            data_list = data.setdefault(ch, [])
            data_list.append(value)
            data.setdefault(ch, data_list)

        appender.__doc__ = f'<some {ch}/{ch_type}> specific docstring'

        setattr(kls, 'append_' + ch, appender)  # ALL APPENDERS OF `kls` WILL DO ESSENTIALLY THE WORK DEFINED LAST

    return kls


@standardize_appenders
class TestClass:
    class channels:
        dataframe = pd.DataFrame
        text = str

    def __init__(self):
        self.data = {}


if __name__ == '__main__':
    test_inst = TestClass()
    # test_inst.append_dataframe(pd.DataFrame({"col1": [1, 2, 3], "col2": list("abc")}))
    # TypeError: `text` only accepts values of `<class 'str'>` type, but found <class 'pandas.core.frame.DataFrame'>
    test_inst.append_dataframe("A minimal trimmed example")  # NOTE THE WRONG 'APPENDER'
    test_inst.append_text("of the original implementation")
    print(test_inst.data)
    # Out {'text': ['A minimal trimmed example', 'of the original implementation']}
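The behaviour can be reproduced without pandas or a class decorator. The sketch below is an illustration, not the original code: `make_appenders`, `late` and `early` are hypothetical names. It defines two functions per loop iteration, one that reads the loop variable when called and one that captures it through a default argument, which is the signature change mentioned in the code comment above.

```python
def make_appenders(names):
    appenders = {}
    for name in names:
        def late(value):
            # `name` is looked up when `late` is called, i.e. after the loop has finished.
            return (name, value)

        def early(value, name=name):
            # The default is evaluated when `def` runs, once per iteration.
            return (name, value)

        appenders[name] = (late, early)
    return appenders


appenders = make_appenders(["dataframe", "text"])
late, early = appenders["dataframe"]

late_result = late(1)    # ('text', 1): sees the last value of `name`
early_result = early(1)  # ('dataframe', 1): sees the value bound at definition time

# Both `late` functions share one closure cell for `name`.
shared_cell = late.__closure__[0] is appenders["text"][0].__closure__[0]
```

Every `late` function refers to the same variable in the enclosing scope rather than to a snapshot of its value, so all of them observe whatever the loop assigned last.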

Feel free to propose a better title. I am out of good ideas for a short but informative title for this case.

(Windows 10, Python 3.8 from an Anaconda env)

deponovo
  • Since strings are immutable, why would you need to copy it? See: https://stackoverflow.com/a/24804471/9343156 – whme Sep 22 '21 at 08:28
  • Thanks. That's right, and I was aware of that. I just bumped into a peculiar situation while doing some metaprogramming with dynamic method generation on a class, where the new methods are generated in a loop over dictionary items (these are strings). Unfortunately, the complexity of the problem does not allow me to produce a minimal working example. I will fight with it a little more... but I guess I'll end up deleting the question because of the lack of info. – deponovo Sep 22 '21 at 08:39
  • "Problem: a deep copy of a string still returns the reference to the original object" This isn't a problem. If your code depends on the *identity* of strings, then your code is broken. The Python interpreter is free to optimize these sorts of things – juanpa.arrivillaga Sep 22 '21 at 08:49
  • @juanpa.arrivillaga I am almost sure that's the problem and that's why I wanted to generate objects at a different `id`. Kind of clumsy, but was worth a try. I just fixed the issue in a somewhat different manner, which to be honest I am not sure why it works, but yet again I could not find an answer on the internet. Anyway, the problem would now be a different one and therefore I'm deleting this question which is now meaningless. – deponovo Sep 22 '21 at 08:59
  • @juanpa.arrivillaga Just pinging you that I brought this question back to life. I decided to take my time and create a minimal working example and find out what I am missing. – deponovo Sep 26 '21 at 14:23
  • @deponovo ah yes. So the issue here is that you are creating functions with free variables. Python has lexically scoped closures. Your functions are all closed over the same variable, which will end up pointing to the same object: the last value in the loop. – juanpa.arrivillaga Sep 26 '21 at 17:30
  • Edited the title to a better version, I hope, just in case somebody bumps into a similar situation in a similar context. – deponovo Sep 26 '21 at 18:23
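Following the closure explanation in the comments, one common way to give each generated method its own variables is a factory function, since every call to the factory creates a fresh scope. This is only a sketch under stated assumptions: `make_appender` and `Recorder` are hypothetical names, and `int` stands in for `pd.DataFrame` so the example runs without pandas.

```python
def make_appender(ch, ch_type):
    # Each call creates a new scope, so every appender closes over its own `ch` and `ch_type`.
    def appender(self, value):
        if not isinstance(value, ch_type):
            raise TypeError(f'`{ch}` only accepts values of `{ch_type}` type, but found {type(value)}')
        self.data.setdefault(ch, []).append(value)

    appender.__doc__ = f'Append a `{ch_type.__name__}` value to the `{ch}` channel.'
    return appender


def standardize_appenders(kls):
    # Generate one `append_<channel>` method per public attribute of `kls.channels`.
    for ch, ch_type in vars(kls.channels).items():
        if ch.startswith('_'):
            continue
        setattr(kls, 'append_' + ch, make_appender(ch, ch_type))
    return kls


@standardize_appenders
class Recorder:
    class channels:
        number = int  # stand-in for pd.DataFrame
        text = str

    def __init__(self):
        self.data = {}


rec = Recorder()
rec.append_number(42)
rec.append_text("hello")
print(rec.data)
```

The default-argument signature achieves the same binding, but a factory keeps the extra names out of the method's public signature, so a caller cannot override `ch` or `ch_type` by accident.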

0 Answers