1

I have a string that is the header of a CSV file. I need to extract the month from it and convert it to a string with a single '0' prepended, so that I can compare it with another value.

Header:

HGLOABCD8PSGL_ZXFH J20190603NXT_APAC

From this I need to extract the month from 20190603, which is 06, and then create a list like ['006', '005']. The second element of the list is the month before the one given in the header.

The header can also contain a different month, for example:

HGLOABCD8PSGL_ZXFH J20191003NXT_APAC

I have written something like this for the first element, but I am not sure how to subtract one month and then prepend '0' to the result.

acc_period = []
acc_period.append('0'+str(header)[26:28])

acc_period.append(int('0') + int(str(header)[26:28])-1)
print (acc_period)
mradul

2 Answers

2

Try regex:

import re

output = list()
header = 'HGLOABCD8PSGL_ZXFH J20190103NXT_APAC'
# The regex pattern r'\d+' finds every run of digits in the input string
find_all_numbers = re.findall(r'\d+', header)

# Keep only the 8-digit runs, which have the YYYYMMDD shape
numbers = [num for num in find_all_numbers if len(num) == 8]

# Take the first 8-digit run, which is the date in these headers
date = numbers[0]

# Unpack the date using string slicing
year, month, day = date[:4], date[4:6], date[6:]

# Format both months as integers left-padded with zeros to width 3
current_month, previous_month = '{0:03d}'.format(int(month)), '{0:03d}'.format(int(month)-1)
# January has no earlier month in the same year, so wrap around to December
if previous_month == '000':
    previous_month = '012'
output.extend((current_month, previous_month))
print(output)
Ahmed Hawary
  • Thanks for your reply. How should I read this piece of code? I am new to Python and want to learn these sorts of conversions for future use :) – mradul Nov 18 '19 at 11:31
  • You're welcome mradul. Do you mean you want an explanation for the script lines? – Ahmed Hawary Nov 18 '19 at 11:41
  • Yes, if possible. Also, I am getting TypeError: expected string or bytes-like object, so I need to convert the variable to str. – mradul Nov 18 '19 at 12:00
  • There seems to be a bug in this code: when I change the date so that the month is 01, the second element comes out as '000' when it should be '012'. – mradul Nov 18 '19 at 12:14
  • One more thing, regarding the comment "#Getting the largest number which is most of the time going to be the date in your case": I need the month from the first date string in the header, not from the largest number, so we cannot make this assumption here. – mradul Nov 18 '19 at 12:39
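The points raised in these comments (a header that is not a str, the January wrap-around, and using the first date in the header) can be handled together. The following is only a sketch, not the answerer's code: the function name accounting_periods is made up for illustration, and it assumes the first run of 8 consecutive digits in the header is always the YYYYMMDD date.

```python
import re

def accounting_periods(header):
    """Return [current_month, previous_month], each zero-padded to width 3.

    Hypothetical helper; assumes the first 8 consecutive digits are YYYYMMDD.
    """
    # str() avoids "TypeError: expected string or bytes-like object"
    # when the header arrives as some other type
    match = re.search(r'\d{8}', str(header))
    if match is None:
        raise ValueError('no 8-digit date found in header')
    month = int(match.group(0)[4:6])
    # Modular arithmetic wraps January (1) around to December (12)
    previous = (month - 2) % 12 + 1
    return ['{:03d}'.format(month), '{:03d}'.format(previous)]

print(accounting_periods('HGLOABCD8PSGL_ZXFH J20190603NXT_APAC'))  # ['006', '005']
print(accounting_periods('HGLOABCD8PSGL_ZXFH J20190103NXT_APAC'))  # ['001', '012']
```

re.search returns the first match, so the single digit 8 earlier in the header is skipped because it is not followed by seven more digits.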
2

Using Regex.

Ex:

import re
from datetime import datetime, timedelta
data = ['HGLOABCD8PSGL_ZXFH J20190603NXT_APAC', 'HGLOABCD8PSGL_ZXFH J20191003NXT_APAC', 'HGLOABCD8PSGL_ZXFH J20190103NXT_APAC']

def a_day_in_previous_month(dt):   #https://stackoverflow.com/a/7153449/532312
    return (dt.replace(day=1) - timedelta(days=1)).month

for i in data:
    m = re.search(r"(\d{8,})", i)
    if m:
        date = datetime.strptime(m.group(0), "%Y%m%d")
        print("{}".format(date.month).zfill(3), "{}".format(a_day_in_previous_month(date)).zfill(3))

Output:

006 005
010 009
001 012
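To see why a_day_in_previous_month works: moving to day 1 of the month and stepping back one day always lands on the last day of the previous month, including across a year boundary. The sketch below spells out those steps and collects the result into a list, as the question asks. The name period_list is hypothetical and not part of the answer.

```python
from datetime import datetime, timedelta

def period_list(date):
    # Hypothetical helper: returns [current_month, previous_month] as 3-character strings
    first_of_month = date.replace(day=1)
    # One day before the 1st is the last day of the previous month
    last_of_previous = first_of_month - timedelta(days=1)
    return [str(date.month).zfill(3), str(last_of_previous.month).zfill(3)]

january = datetime.strptime('20190103', '%Y%m%d')
print(january.replace(day=1) - timedelta(days=1))  # 2018-12-31 00:00:00
print(period_list(january))                        # ['001', '012']
```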
Rakesh
  • Thanks for writing the answer; it also solves my problem. I will see which one is easier to use and maintain in the future, and which one I understand better :) – mradul Nov 18 '19 at 12:36