I can't get my head around how to use logging in conjunction with auto-sklearn.
The example from the doc about logging with auto-sklearn is here. What I'm trying to achieve is:
- a main script with a main logger,
- functions running auto-sklearn models, each with its own separate log.
I've made multiple attempts. One solution was to configure the root logger first (using basicConfig), then run an auto-sklearn model with the (root) logger configuration, and finally update the root logger (using basicConfig(force=True)). This doesn't seem very Pythonic to me, but it works.
The Pythonic way would have been to use two named loggers (I think). To my knowledge, however, auto-sklearn can't configure logging with anything but a config dictionary. As you can't pass an existing logger as an argument, you have to rely on some internal mechanism triggered by specific logger names (the names are present in the default YAML file but undocumented, AFAIK).
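To make the workaround concrete, here is a minimal sketch of the idea using only the standard library. It is reconstructed from the description above rather than copied from my script: the logger name `placeholder.model` is a hypothetical stand-in for whatever loggers auto-sklearn uses internally, and the temporary directory is only there to keep the example self-contained.

```python
import logging
import os
import tempfile

# Temporary directory so the sketch does not touch real log files.
tmp_dir = tempfile.mkdtemp()
model_log = os.path.join(tmp_dir, "dummy_autosklearn.log")
main_log = os.path.join(tmp_dir, "main.log")

# Step 1: point the root logger at the model log while the model runs.
logging.basicConfig(filename=model_log, level=logging.INFO, force=True)

# Hypothetical stand-in for the messages auto-sklearn would emit during fit().
logging.getLogger("placeholder.model").info("verbose model output")

# Step 2: re-point the root logger at the main log.
# force=True (Python 3.8+) removes and closes the handlers set in step 1.
logging.basicConfig(filename=main_log, level=logging.INFO, force=True)
logging.getLogger(__name__).info("Finished !")
```

It works because every message propagates up to the root logger, so swapping the root handlers decides which file receives it. That global swap is also what makes it feel hacky.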
My current code is the following:
import logging
import pandas as pd
import numpy as np
from autosklearn.regression import AutoSklearnRegressor
# Basic logging config: a named logger writing to main.log and the console
file_handler = logging.FileHandler("main.log", mode="a", encoding="utf8")
console_handler = logging.StreamHandler()
logger = logging.getLogger(__name__)
logger.setLevel(logging.INFO)
logger.addHandler(file_handler)
logger.addHandler(console_handler)
# Construct a dummy dataframe for a short regression
df = pd.DataFrame(
    dict(x1=range(100), x2=range(50, 150), noise=np.random.normal(size=100))
)
df["y"] = np.square(df.x1 + df.noise) + df.x2
# Message is stored in the main log and shown in the console
logger.info("Starting modelisation")
# Logging configuration passed to the model
logging_config = {
    "version": 1,
    "disable_existing_loggers": False,
    "handlers": {
        "spec_logger": {
            "level": "INFO",
            "class": "logging.FileHandler",
            "filename": "dummy_autosklearn.log",
        },
    },
    "loggers": {
        "": {"handlers": ["spec_logger"]},  # <- I'd say this is what is wrong here
    },
}
model = AutoSklearnRegressor(
    memory_limit=None,
    time_left_for_this_task=30,
    logging_config=logging_config,
)
model.fit(df[["x1", "x2"]], df["y"])
# Message is stored in both logs as well as in the console
logger.info("Finished !")
Running it, you will get a main.log with two statements, which will also be displayed in the console.
But as auto-sklearn is running with a root logger config, the "Finished" statement will also be present in dummy_autosklearn.log.
How could I configure auto-sklearn in an easy way? (I mean, I'm only hoping to redirect the verbose content displayed by auto-sklearn, in case I need it in the future...)
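The leak can be reproduced without auto-sklearn at all. The sketch below is a minimal illustration, assuming auto-sklearn ends up applying the dictionary through `logging.config.dictConfig` (I have not confirmed that here). The logger name `demo_main` and the temporary paths are hypothetical and only serve the example.

```python
import logging
import logging.config
import os
import tempfile

tmp_dir = tempfile.mkdtemp()
main_path = os.path.join(tmp_dir, "main.log")
root_path = os.path.join(tmp_dir, "dummy_autosklearn.log")

# Named logger with its own file handler, like the main script's logger.
main_logger = logging.getLogger("demo_main")  # hypothetical name
main_logger.setLevel(logging.INFO)
main_logger.addHandler(logging.FileHandler(main_path, encoding="utf8"))

# Same shape as logging_config: the handler is attached to the root logger "".
logging.config.dictConfig({
    "version": 1,
    "disable_existing_loggers": False,
    "handlers": {
        "spec_logger": {
            "level": "INFO",
            "class": "logging.FileHandler",
            "filename": root_path,
        },
    },
    "loggers": {"": {"handlers": ["spec_logger"]}},
})

# Logged on the named logger only, yet it propagates to the root handlers.
main_logger.info("Finished !")


def contains(path, text):
    """Return True if the file at `path` contains `text`."""
    with open(path, encoding="utf8") as f:
        return text in f.read()


in_main = contains(main_path, "Finished !")
in_root = contains(root_path, "Finished !")
print(in_main, in_root)
```

Both flags come out true: the record is handled by the named logger's own handler and then passed up to every handler on the root logger, which is exactly where the config dictionary put the auto-sklearn file handler.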