0

Background:

I'm attempting to capture valid dates that are in a custom Persian (Non-Gregorian) date format (yyyy/mm/dd)

Formula:

Luckily I got help from the below thread with a decent formula that does the requirement. I can XlsxWriter has parsed it, but it does not work in the generated Excel worksheet. The formula and link are discussed in the below SO thread:

Ensure Excel (Persian) date is valid date with specific format

Root cause:

I'm not sure about the reason, but I think it is because of an error prompt message that pops out. If I clicked ok, the formula works. Would this be the reason? Would this block XlsxWriter?

Issue reproduction

Creating a dummy data frame:

df = pd.DataFrame(data = {'date' : ['1369/05/18',
                                    '1386/05/1',
                                    '1369/5/18',
                                    '1369/05/',
                                    '1369//15',
                                    '/05/08',
                                    '1369/051/18',
                                    '1369/0/518',
                                    '136/5/18',
                                    '1369/13/18',
                                    '1369/05/31',
                                    '1369-05-31',
                                    '1369//05//18',
                                    None
                                   ],
                          'case' : ['valid format - correct',
                                    '1 digit day - correct',
                                    '1 digit month - correct',
                                    'No Day ',
                                    'No Month',
                                    'No Year',
                                    '3 digit month',
                                    '3 digit day',
                                    '3 digit year',
                                    'Invalid month',
                                    'Invalid day',
                                    '31st may - correct',
                                    'Invalid format',
                                    'Invalid format'
                                   ],
                          'gender' : '',
                          'date_input':''
                         }
                 )
     
  • Writing that dataframe to an Excel file.
  • Adding data validation for that file.
    • I'll add an irrelevant male/female list to show that validation works.
    • I'll add my formula to the excel.
writer = pd.ExcelWriter('dates_validation.xlsx', engine='xlsxwriter') # xlsx writer
workbook = writer.book # Workbook object
df.to_excel(writer, sheet_name = 'sheet_1' ,index=False) # Writing to file
worksheet = writer.sheets['sheet_1'] # Worksheet object
worksheet.data_validation(
    'C2:C1048576', {'validate': 'list',
                    'source': ['Male','Female'],
                    'dropdown': True
                   })
worksheet.data_validation(
    'D2:D1048576', {'validate': 'custom',
                    'value': '=LET(s,TEXTSPLIT(D2,"/"),y,AND(--INDEX(s,1)>=1278,--INDEX(s,1)<=9378),m,AND(--INDEX(s,2)>=1, --INDEX(s,2)<=12),d, AND(--INDEX(s,3)>=1, --INDEX(s,3)<=(30+(--INDEX(s,2)<=6))), AND(y,m,d))',
                    'ignore_blank': True
                   }
)
writer.close()

You can see that the formula is in-place but it doesn't work in cell D2. If you clicked ok in the next cell, it will work.

enter image description here

  • 1
    That isn't an xlsxwriter issue. The formula doesn't work in **Excel**. At least not in a data validation. As far as I know Excel doesn't support dynamic formulas like `LET()` in data validation formulas. If you can get it working in Excel then you should be able to transfer the formula to Pandas/XlsxWriter. – jmcnamara Feb 03 '23 at 17:02
  • Thank you for the reply! .. The thing is the formula works in excel already and XlsxWriter transfer it successfully, but you'd have to open excel and re-submit it by clicking on okay for it to work. – Mohamed Mostafa El-Sayyad Feb 05 '23 at 09:26
  • Ok. Got it. In that case it is an XlsxWriter issue. I'll post a solution. – jmcnamara Feb 05 '23 at 10:58

1 Answers1

1

The LET() and TEXTSPLIT are new Future Functions in Excel and have some additional prefixes like _xlfn.LET(_xlpm.s,_xlfn.TEXTSPLIT(D2,"/") ....

The actual formula you need to use is:

_xlfn.LET(_xlpm.s,_xlfn.TEXTSPLIT(D2,"/"),_xlpm.y,AND(--INDEX(_xlpm.s,1)>=1278,--INDEX(_xlpm.s,1)<=9378),_xlpm.m,AND(--INDEX(_xlpm.s,2)>=1, --INDEX(_xlpm.s,2)<=12),_xlpm.d, AND(--INDEX(_xlpm.s,3)>=1, --INDEX(_xlpm.s,3)<=(30+(--INDEX(_xlpm.s,2)<=6))), AND(_xlpm.y,_xlpm.m,_xlpm.d))

Here is a working example based on your code:

import pandas as pd


df = pd.DataFrame(data = {'date': ['1369/05/18',
                                    '1386/05/1',
                                    '1369/5/18',
                                    '1369/05/',
                                    '1369//15',
                                    '/05/08',
                                    '1369/051/18',
                                    '1369/0/518',
                                    '136/5/18',
                                    '1369/13/18',
                                    '1369/05/31',
                                    '1369-05-31',
                                    '1369//05//18',
                                    None
                                   ],
                          'case' : ['valid format - correct',
                                    '1 digit day - correct',
                                    '1 digit month - correct',
                                    'No Day ',
                                    'No Month',
                                    'No Year',
                                    '3 digit month',
                                    '3 digit day',
                                    '3 digit year',
                                    'Invalid month',
                                    'Invalid day',
                                    '31st may - correct',
                                    'Invalid format',
                                    'Invalid format'
                                   ],
                          'gender' : '',
                          'date_input':''
                         }
                 )


writer = pd.ExcelWriter('dates_validation.xlsx', engine='xlsxwriter') # xlsx writer
workbook = writer.book # Workbook object
df.to_excel(writer, sheet_name = 'sheet_1' ,index=False) # Writing to file
worksheet = writer.sheets['sheet_1'] # Worksheet object
worksheet.data_validation(
    'C2:C1048576', {'validate': 'list',
                    'source': ['Male','Female'],
                    'dropdown': True
                   })
worksheet.data_validation(
    'D2:D1048576', {'validate': 'custom',
                    'value': '_xlfn.LET(_xlpm.s,_xlfn.TEXTSPLIT(D2,"/"),_xlpm.y,AND(--INDEX(_xlpm.s,1)>=1278,--INDEX(_xlpm.s,1)<=9378),_xlpm.m,AND(--INDEX(_xlpm.s,2)>=1, --INDEX(_xlpm.s,2)<=12),_xlpm.d, AND(--INDEX(_xlpm.s,3)>=1, --INDEX(_xlpm.s,3)<=(30+(--INDEX(_xlpm.s,2)<=6))), AND(_xlpm.y,_xlpm.m,_xlpm.d))',
                    'ignore_blank': True
                   }
)

worksheet.autofit()

writer.close()

Output:

enter image description here

jmcnamara
  • 38,196
  • 6
  • 90
  • 108