
I'm currently trying to normalize a DataFrame (~600k rows) with prices (pricevalue) in different currencies (pricecurrency) so that every row has its price in EUR.

I'd like to convert them using the daily rate taken from a date column.

My current "solution" (using the CurrencyConverter package found on PyPI) looks like this:

from currency_converter import CurrencyConverter

c = CurrencyConverter(fallback_on_missing_rate=True,
                      fallback_on_missing_rate_method="last_known")

def convert_currency(row):
    return c.convert(row["pricevalue"], row["pricecurrency"], row["date"])

df["converted_eur"] = df.apply(lambda x: convert_currency(x), axis=1)

However, this solution is taking forever to run.

Is there a faster way to accomplish that? Any help is appreciated :)

PStein
  • Does this help: https://stackoverflow.com/a/51669468/6660373? – Pygirl May 06 '20 at 13:36
  • Do you have the YOURCOIN>EUR daily price in another sheet, or is this CurrencyConverter package getting it for you? If so, it might explain the long runtime. – mrbTT May 06 '20 at 13:38
  • Are these current prices or historic prices? What is the range of dates that a price could be in? – Dave May 06 '20 at 14:06
  • @mrbTT The CurrencyConverter package is getting the prices for me – PStein May 06 '20 at 14:14
  • @Pygirl Will check out the post, thank you. – PStein May 06 '20 at 14:14
  • @Dave The prices are historic, ranging from ~2008-2018, so I need some form of lookup of the exchange rate, which is provided by the CurrencyConverter package mentioned. – PStein May 06 '20 at 14:15
  • @Dave, roughly 600,000 rows as mentioned in the post. – PStein May 06 '20 at 14:20
  • The issue might be that CurrencyConverter needs to get the values online, and its API has a limited number of queries or is slow to fetch results. – mrbTT May 06 '20 at 18:44

1 Answer


It sounds strange to say this, but unfortunately you're not doing anything wrong!

The currency conversion code is doing what you need it to do, and not much else, so there's nothing you can quickly fix to gain performance. You do have a redundant lambda where a plain function reference would do, but that won't make much of a difference:

i.e.

df["converted_eur"] = df.apply(lambda x: convert_currency(x), axis=1)

should be

df["converted_eur"] = df.apply(convert_currency, axis=1)

The first thing to do is to understand how long this processing will actually take by adding a progress bar:

from tqdm import tqdm

tqdm.pandas()  # registers progress_apply on pandas objects

df["converted_eur"] = df.progress_apply(convert_currency, axis=1)

Once you know how long the job will actually take, try out these, in order:

  1. Live with it.
  2. Single instance parallelization, with something like pandarallel
  3. Multi instance parallelization, with something like Dask
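Before parallelizing, one further lever may be worth a try: if many rows share the same (pricecurrency, date) pair, the rate lookup can be deduplicated so the converter is queried once per unique pair rather than once per row, and the multiplication is then fully vectorized. A sketch, using a hypothetical fixed-rate table in place of the real CurrencyConverter call:

```python
import pandas as pd

# Hypothetical rate table standing in for CurrencyConverter; in the real
# code each rate would come from c.convert(1, currency, date).
RATES = {
    ("USD", "2016-05-06"): 0.88,
    ("GBP", "2016-05-06"): 1.27,
}

def rate_to_eur(currency, date):
    # Stand-in for the per-pair converter lookup.
    return RATES[(currency, date)]

df = pd.DataFrame({
    "pricevalue": [100.0, 200.0, 50.0],
    "pricecurrency": ["USD", "GBP", "USD"],
    "date": ["2016-05-06", "2016-05-06", "2016-05-06"],
})

# Look up each unique (currency, date) pair once instead of once per row...
pairs = df[["pricecurrency", "date"]].drop_duplicates()
pairs["rate"] = [rate_to_eur(c, d)
                 for c, d in zip(pairs["pricecurrency"], pairs["date"])]

# ...then broadcast the rates back with a merge and multiply vectorized.
df = df.merge(pairs, on=["pricecurrency", "date"], how="left")
df["converted_eur"] = df["pricevalue"] * df["rate"]
```

With 600k rows spread over ~10 years of daily dates and a handful of currencies, the number of unique pairs should be far smaller than the number of rows, so this can cut the number of converter calls substantially.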
Dave
  • Thanks for the detailed answer and for pointing out further reading! Will definitely take a look at options 2 and 3. tqdm is, by the way, a neat little tool I'll add to my toolbox! :) – PStein May 06 '20 at 21:06
  • Happy to help! Let us know what you go with in the end. – Dave May 06 '20 at 21:08