I have a column in a dataframe that contains several pieces of information (the weight and count of items) that I am trying to pull out and use in some calculations.
When I pull out the numbers I want, everything looks fine when I print the variable that holds the resulting series.
Below is the code I use to parse the numbers out of the initial column. I chained a few string methods and used a regex to extract the digits.
Hopefully it is fairly easy to read. After some cleaning, my target weight numbers are always in the third-to-last position after the split(), and my target count numbers are always in the second-to-last position.
weight = df['Item'].str.replace('1.0gal','128oz').str.replace('YYY','').str.split().str[-3].str.extract(r'(\d+)', expand=False).astype(np.float64)
count = df['Item'].str.replace('NN','').str.split().str[-2].replace('XX','1ct').str.extract(r'(\d+)', expand=False).astype(np.float64)
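To make the token positions concrete, here is a minimal sketch on made-up rows. The item strings are hypothetical stand-ins, not my real data; they only assume that the weight and count tokens sit third-to-last and second-to-last after the split.

```python
import pandas as pd

# Hypothetical sample rows; the real 'Item' strings are only assumed to look like this.
df = pd.DataFrame({"Item": ["Olive Oil 32oz 6ct case", "Saffron 0.44oz 12ct case"]})

# Split each string on whitespace, then index from the end of the token list.
tokens = df["Item"].str.split()
weight_token = tokens.str[-3]  # third-to-last token holds the weight
count_token = tokens.str[-2]   # second-to-last token holds the count

print(weight_token.tolist())
print(count_token.tolist())
```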
The variable 'weight' returns a series like [32, 32, 0.44, 5.3, 64], which is what I want to see.
However, when I assign these values to a new column in the dataframe, everything to the right of the decimal point is dropped; for example, my new column shows up as [32, 32, 0, 5, 64].
This is throwing off my calculated columns as well.
However, if I do the math in a separate variable and print it, the result is correct (decimals and all). Something about assigning it to the dataframe drops the decimals from my weights and breaks every calculation that follows.
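In case it helps with diagnosis, here is a small self-contained check I would expect to narrow things down. It is only a sketch: the token values are made up to mirror the numbers quoted in this post, and the 'Weight' column name is hypothetical. It compares an integer-only pattern with a decimal-aware one, then looks at the dtype after a plain column assignment. On this sample the integer-only pattern already loses the fraction before anything is assigned, which may or may not be what is happening with my real data.

```python
import numpy as np
import pandas as pd

# Hypothetical weight tokens, standing in for the output of .str.split().str[-3].
tokens = pd.Series(["32oz", "0.44oz", "5.3oz", "64oz"])

# Integer-only pattern: matching stops at the first non-digit, so "0.44oz" yields "0".
int_only = tokens.str.extract(r"(\d+)", expand=False).astype(np.float64)

# Decimal-aware pattern: optionally captures a fractional part after the digits.
with_decimals = tokens.str.extract(r"(\d+(?:\.\d+)?)", expand=False).astype(np.float64)

# Plain column assignment; 'Weight' is a made-up column name for this sketch.
df = pd.DataFrame({"Item": tokens})
df["Weight"] = with_decimals

print(int_only.tolist())
print(with_decimals.tolist())
print(df["Weight"].dtype)
```

Comparing the two printed lists against the new column should show whether the decimals disappear at the extraction step or at the assignment step.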
Any and all help is greatly appreciated!