Mapping column values to a combination of another csv file's information

Question

I have a dataset that indicates date & time in 5-digit format: ddd + hm

ddd part starts from 2009 Jan 1. Since the data was collected only from then to 2-years period, its [min, max] would be [1, 365 x 2 = 730].

Data is observed in 30-min interval, making 24 hrs per day period to lengthen to 48 at max. So [min, max] for hm at [1, 48].

Following is the excerpt of daycode.csv file that contains ddd part of the daycode, matching date & hm part of the daycode, matching time.

And I think I agreed to not showing the dataset which is from ISSDA. So..I will just describe that the daycode in the File1.txt file reads like '63317'.

This link gave me a glimpse of how to approach this problem, and I was in the middle of putting up this code together..which of course won't work at this point.

consume = pd.read_csv("data/File1.txt", sep= ' ', encoding = "utf-8", names =['meter', 'daycode', 'val'])
df1= pd.read_csv("data/daycode.csv", encoding = "cp1252", names =['code', 'print'])
test = consume[consume['meter']==1048]

test['daycode'] = test['daycode'].map(df1.set_index('code')['print'])

plt.plot(test['daycode'], test['val'], '.')
plt.title('test of meter 1048')
plt.xlabel('daycode')
plt.ylabel('energy consumption [kWh]')
plt.show()

Not all units(thousands) have been observed at full length but 730 x 48 is a large combination to lay out on excel by hand. Tbh, not an elegant solution but I tried by dragging - it doesn't quite get it.

If I could read the first 3 digits of the column values and match with another file's column, 2 last digits with another column, then combine.. is there a way?

Welcome to SO! Please provide a **[mcve]** – jpp Feb 23 '18 at 00:20 — jpp, Feb 23 '18 at 00:20

score 1 · Accepted Answer · answered Feb 23 '18 at 04:54

For the last 2 lines you can just do something like this

df['first_3_digits'] = df['col1'].map(lambda x: str(x)[:3])
df['last_2_digits'] = df['col1'].map(lambda x: str(x)[-2:])

for joining 2 dataframes

df3 = df.merge(df2,left_on=['first_3_digits','last_2_digits'],right_on=['col1_df2','col2_df2'],how='left')

Mapping column values to a combination of another csv file's information

1 Answers1