0

I'm making some function that parse plain table rows as hierarchical data structure.

There are two data class - ProductInfo : Single product like T-shirts - OptionInfo : Option of single product like 'yellow-XXL', 'blue-XL'

These are written in single xl table. Like ProductInfo head and it's tailing rows as ProdctOption. So I decided to parse this like some kind of state machine.

  1. If row is Single Product, create new Instance of ProductInfo and init ProductInfo.options with empty List.

  2. If already met Product(head) and current row is Option, push into list which instantiated ProductInfo has.

  3. If met Product again, append ProductInfo int reulstList

But final result is really strange. ALl ProductInfo share it's options, if there are 100 options for 10 Products, each Products contains 100 options despite of most of them is not it's option.

It looks like below,

  1. Product T-Shirts -> [('white', 'XL'), ('red', 'XL'), ('rainbow', '270'), ('yellow', '260')]
  2. Product Sneakers -> [('white', 'XL'), ('red', 'XL'), ('rainbow', '270'), ('yellow', '260')] ( As you can see, first 2 options are for T-Shirt, and last 2 is for sneakers )

Data Class looks like this

class OptionInfo:
    def __init__(self, ProductId: int = None, ProductCode: str = None, Barcode: str = None, Color: str = None, Size: str = None, OptionDescription:str = None):
        self.ProductId = ProductId
        self.ProductCode = ProductCode
        self.Barcode = Barcode
        self.Color = Color
        self.Size = Size
        self.OptionDescription = OptionDescription

class ProductInfo:
    def __init__(self, ProductId = None, ProductBaseCode: str = None, ProductBaseBarcode: str = None, HasOption: bool = False, MajorCategory: str = None, MiddleCategory: str = None, MinorCategory: str = None, Gender: str = None, Supplier: str = None, OriginCountry: str = None, ProductName: str = None, TemporaryProductionPrice: float = None, SalesPrice: float = None, OptionList: List['OptionInfo']  = [], PredefinedCategory: str = None):
        self.ProductId = ProductId
        self.ProductBaseCode = ProductBaseCode
        self.ProductBaseBarcode = ProductBaseBarcode
        self.HasOption = HasOption
        self.MajorCategory = MajorCategory
        self.MiddleCategory = MiddleCategory
        self.MinorCategory = MinorCategory
        self.Gender = Gender
        self.Supplier = Supplier
        self.OriginCountry = OriginCountry
        self.ProductName = ProductName
        self.TemporaryProductionPrice = TemporaryProductionPrice
        self.SalesPrice = SalesPrice
        self.OptionList = OptionList
        self.PredefinedCategory = PredefinedCategory

Logic codes are below

def parseSheet(xlPath: str) -> List['ProductInfo']:
    log('PRODUCT >> Parsing xl for {}'.format(xlPath))
    wb = openpyxl.load_workbook(xlPath)
    ws = wb.active
    resultSet: List['ProductInfo'] = []
    prInfo: ProductInfo = None
    allowOptions: bool = False
    for rownum in range(2, ws.max_row+1):
        if ws.cell(row=rownum, column=CL_ProductName).value == '' or ws.cell(row=rownum, column=CL_ProductName).value is None: break
        isRepr = ws.cell(row=rownum, column=CL_IsRepr).value
        if isRepr is 1 or isRepr == '1':
            if prInfo is not None: 
                resultSet.append(prInfo)
                prInfo = None
                allowOptions = False
            allowOptions = True
            prInfo = ProductInfo(
                ProductBaseCode         =ws.cell(row=rownum, column=CL_ProductCode).value if ws.cell(row=rownum, column=CL_ProductCode).value != '' else None,
                HasOption               =True,
                ProductName             =ws.cell(row=rownum, column=CL_ProductName).value if ws.cell(row=rownum, column=CL_ProductName).value != '' else None,
                TemporaryProductionPrice=float(ws.cell(row=rownum, column=CL_ProdPrice).value) if ws.cell(row=rownum, column=CL_ProdPrice).value != '' and ws.cell(row=rownum, column=CL_ProdPrice).value is not None else 0,
                SalesPrice              =float(ws.cell(row=rownum, column=CL_SalesPrice).value) if ws.cell(row=rownum, column=CL_SalesPrice).value != '' and ws.cell(row=rownum, column=CL_SalesPrice).value is not None else 0,
                PredefinedCategory      =ws.cell(row=rownum, column=CL_PredCategory).value if ws.cell(row=rownum, column=CL_PredCategory).value != '' else None
            )

        elif isRepr == 'isOption':
            if allowOptions is False or prInfo is None:
                raise Exception('Sheet row order seems strange, check please near {}'.format(rownum))
            prInfo.OptionList.append(
                OptionInfo(
                    ProductCode         =ws.cell(row=rownum, column=CL_ProductCode).value if ws.cell(row=rownum, column=CL_ProductCode).value != '' else None,
                    Barcode             =ws.cell(row=rownum, column=CL_Barcode).value if ws.cell(row=rownum, column=CL_Barcode).value != '' else None,
                    OptionDescription   =ws.cell(row=rownum, column=CL_OptionDesc).value if ws.cell(row=rownum, column=CL_OptionDesc).value != '' else None
                )
            )

        elif isRepr is 0 or isRepr == '0':
            if prInfo is not None: 
                resultSet.append(prInfo)
                prInfo = None
                allowOptions = False
            prInfo = ProductInfo(
                ProductBaseCode         =ws.cell(row=rownum, column=CL_ProductCode).value if ws.cell(row=rownum, column=CL_ProductCode).value != '' else None,
                HasOption               =False,
                ProductName             =ws.cell(row=rownum, column=CL_ProductName).value if ws.cell(row=rownum, column=CL_ProductName).value != '' else None,
                TemporaryProductionPrice=float(ws.cell(row=rownum, column=CL_ProdPrice).value) if ws.cell(row=rownum, column=CL_ProdPrice).value != '' and ws.cell(row=rownum, column=CL_ProdPrice).value is not None else 0,
                SalesPrice              =float(ws.cell(row=rownum, column=CL_SalesPrice).value) if ws.cell(row=rownum, column=CL_SalesPrice).value != '' and ws.cell(row=rownum, column=CL_SalesPrice).value is not None else 0,
                PredefinedCategory      =ws.cell(row=rownum, column=CL_PredCategory).value if ws.cell(row=rownum, column=CL_PredCategory).value != '' else None,
                OptionList=[OptionInfo(
                    ProductCode         =ws.cell(row=rownum, column=CL_ProductCode).value if ws.cell(row=rownum, column=CL_ProductCode).value != '' else None,
                    Barcode             =ws.cell(row=rownum, column=CL_Barcode).value if ws.cell(row=rownum, column=CL_Barcode).value != '' else None,
                    OptionDescription   =ws.cell(row=rownum, column=CL_OptionDesc).value if ws.cell(row=rownum, column=CL_OptionDesc).value != '' else None
                )]
            )

        else:
            raise Exception('Invalid repr in sheet {} at {}'.format(isRepr, rownum))
    return resultSet

Value isRepr decide if it is Product or Option, 1 means it is Product, 'isOption' means it is option for upper Product(which isRepr is 1), regarding 0, it means it is Product with no option.

I think, prInfo is suspected as cause. But I don't understand it's variable scope. How can I distinctively distinguish between declaration of variable and assign to variable? Is it possible? How can I solve this. I suspect cause, but not sure why does it happened.

Gipyo.Choi
  • 155
  • 2
  • 14
  • The problem isn't scope. Rather, default arguments are evaluated *once at function definition time*. You keep re-using that same list. – juanpa.arrivillaga Feb 03 '20 at 06:38
  • @juanpa.arrivillaga then, is it only for list? what about others like default value is defined as None or ''? I mean, am I only reuse list? not others? – Gipyo.Choi Feb 03 '20 at 07:09
  • 1
    No, it works the same for *all objects*. There is only a single `None` object in any given python process, anyway. In any case, you won't notice anything if you don't modify/mutate those objects, and the only object you mutate is the list – juanpa.arrivillaga Feb 03 '20 at 07:16
  • Also note, `x is 0` is incorrect. You are testing for *equality* not *identity*, so you should use `==` instead of `is`. Note, this *happens* to work for small integers, but this is a CPython implementation detail that shouldn't be relied on. `x is None` is fine since `None` is a language guaranteed singleton – juanpa.arrivillaga Feb 03 '20 at 07:18
  • You are right. It is the defaul parametrer problem. THANKS. Regarding x is 0, you are right. I use 'is' with intention. It is small number so keyword is works correctly. And ther are no ther options rather than 0 – Gipyo.Choi Feb 04 '20 at 08:18
  • No, using `is` is **absolutely incorrect** here. It is an **implementation detail** that should never be relied on. Python is free to change this at any time without warning or notice that it has changed. – juanpa.arrivillaga Feb 04 '20 at 08:30
  • In fact, even **in CPython currently it isn't always the case**, for example, `pow(5, 30, 5**30) == 0` but `pow(5, 30, 5**30) is 0` will be false – juanpa.arrivillaga Feb 04 '20 at 08:33

0 Answers0