
I have the following class

@dataclass_json
@dataclass
class Input:
    sources: List[Source] = None
    Transformations: List[str] = None

As well as:

@dataclass_json
@dataclass
class Source:
    type: str =None
    label: str =None
    path: str = None

and the two subclasses:

@dataclass_json
@dataclass
class Csv(Source):
    csv_path: str=None
    delimiter: str=';'

and

@dataclass_json
@dataclass
class Parquet(Source):
    parquet_path: str=None

Now, given the following dictionaries:

parquet = {'type': 'Parquet', 'label': 'events', 'path': '/.../test.parquet', 'parquet_path': '../../result.parquet'}
csv = {'type': 'Csv', 'label': 'events', 'path': '/.../test.csv', 'csv_path': '../../result.csv', 'delimiter': ','}
input = {'sources': [csv, parquet]}

Now I would like to do something like

Input.from_dict(input)

with the output:

Input(sources=[Csv(...), Parquet(...)])

This runs, but the result only contains the three fields of the base dataclass Source (type, label and path). The subclass-specific fields of Csv and Parquet (csv_path, delimiter and parquet_path) are left at their default values. This may be because dataclass_json takes the schema from the annotated class (Source) before any instance is created. I would still like to keep using dataclass_json and just write a wrapper around it, since it has good case and error handling.

I want this behaviour because each source type is defined by different arguments (for example, Csv has a delimiter but Parquet does not).

I struggled to get hold of the class Source and its subclasses from inside the dataclass_json library. While trying, I came across

cls.__args__[0]

which is of type 'GenericMeta'. From there, however, I could not get access to its subclasses.

Is there any workaround?

I am using Python 3.6, by the way.

Thanks in advance for your help.

Patricio
  • How is this different from your original question here?: https://stackoverflow.com/questions/61339788/dict-attribute-type-to-select-subclass-of-dataclass/61509283?noredirect=1#comment108860565_61509283 I'm no SO expert, but I think they'd rather you edit your original question if you need to clarify or add new information, rather than asking what is basically the same question twice. – ibonyun May 01 '20 at 22:04
  • I edited the question a little to make it clearer. The main problem is that not all values are picked up when I call from_dict, the method provided by the dataclass_json library. Since I apply it to Input, from_dict is implicitly applied to Source as well, so I cannot control it. – Patricio May 02 '20 at 08:08
  • Are you trying to instantiate an `Input` object by giving it the instructions for how to instantiate `Source` subclasses (either `Csv` or `Parquet`) and you're disappointed when it doesn't do that? If so, then you have 2 options. 1) Don't do that. Instantiate your `Csv` or `Parquet` objects first and then pass them to `Input`. 2) Bite the bullet and actually write an `__init__` method which does this -- or `__post_init__` if you're really married to `dataclass`. You can't expect it to magically know what you want to do. – ibonyun May 04 '20 at 16:58
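
Option 2 from the last comment could be sketched roughly as follows. This is a minimal illustration using only the standard-library dataclasses, not dataclass_json, so it assumes the object is built with Input(**raw) rather than Input.from_dict(raw). The helper build_source and the lookup via Source.__subclasses__() are hypothetical choices made for this sketch; it also assumes the 'type' value matches the subclass name exactly.

```python
from dataclasses import dataclass, field
from typing import Any, Dict, List, Optional


@dataclass
class Source:
    type: Optional[str] = None
    label: Optional[str] = None
    path: Optional[str] = None


@dataclass
class Csv(Source):
    csv_path: Optional[str] = None
    delimiter: str = ';'


@dataclass
class Parquet(Source):
    parquet_path: Optional[str] = None


def build_source(data: Dict[str, Any]) -> Source:
    """Hypothetical helper: pick the Source subclass named by data['type']."""
    # Map class names to classes, using only the direct subclasses of Source.
    registry = {cls.__name__: cls for cls in Source.__subclasses__()}
    # Fall back to the base class when 'type' is missing or unknown.
    cls = registry.get(data.get('type'), Source)
    return cls(**data)


@dataclass
class Input:
    sources: List[Source] = field(default_factory=list)
    Transformations: List[str] = field(default_factory=list)

    def __post_init__(self) -> None:
        # Turn raw dicts into the matching subclass; keep existing instances.
        self.sources = [
            build_source(s) if isinstance(s, dict) else s
            for s in self.sources
        ]


csv = {'type': 'Csv', 'label': 'events', 'path': '/.../test.csv',
       'csv_path': '../../result.csv', 'delimiter': ','}
parquet = {'type': 'Parquet', 'label': 'events', 'path': '/.../test.parquet',
           'parquet_path': '../../result.parquet'}
raw = {'sources': [csv, parquet]}

result = Input(**raw)
print(result.sources)
```

Because the dispatch happens in __post_init__, passing ready-made Csv or Parquet objects (option 1 in the comment) keeps working as well. Unknown keys in a source dict raise a TypeError from the dataclass constructor, which may or may not be the error handling you want.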

1 Answer


I know this thread is fairly old, but for anyone else looking for a solution: have a look at dacite.

Here is the link to its GitHub repository: https://github.com/konradhalas/dacite#quick-start

And here is the quick-start code, for the sake of completeness:

from dataclasses import dataclass
from dacite import from_dict


@dataclass
class User:
    name: str
    age: int
    is_active: bool


data = {
    'name': 'John',
    'age': 30,
    'is_active': True,
}

user = from_dict(data_class=User, data=data)

assert user == User(name='John', age=30, is_active=True)
tafaust