I'm having a hard time figuring out what's going on here (as, it seems, are most people trying to figure out TF 2.1). Below is my problem, along with the fixes I've already tried and the code for each.
I'm trying to use AdaNet to start a TensorFlow Estimator training session, by creating a tf.data.Dataset from an imported .csv file. I'm running:
Python 3.6
Windows 10
tensorflow==2.1.0
pandas==0.25.1
numpy==1.16.5
This error...:
ValueError: Received a feature column from TensorFlow v1, but this is a TensorFlow v2 Estimator. Please either use v2 feature columns (accessible via tf.feature_column.* in TF 2.x) with this Estimator, or switch to a v1 Estimator for use with v1 feature columns (accessible via tf.compat.v1.estimator.* and tf.compat.v1.feature_column.*, respectively.
...is produced by the code below, which I'm posting in full (with comments) because I really don't know which part is causing the error. And yes, rebuilding the list of column names at each step is annoying, but that's how I'm keeping it for now:
import numpy as np
import pandas as pd
import os
os.environ['TF_CPP_MIN_LOG_LEVEL'] = '3'
import warnings
warnings.filterwarnings("once")
import adanet
import tensorflow as tf
from tensorflow.estimator import BinaryClassHead, MultiClassHead
# This will be a binary classification problem
head = BinaryClassHead()
# Import the dataset we're going to train with, just to get a list of the column names
# we want our estimator to reference
df = pd.read_csv('./datasets/call_restored_df_' + str(4) + '_' + 'SPY' + '.csv')
df = df.set_index(['Date'])
df['class'] = df['class'].astype('int32')
# Create a list of all the column names
feature_columns = list(df.columns)
# Remove the columns we aren't going to use during training
feature_columns.remove('Ticker')
feature_columns.remove('DailyChange')
feature_columns.remove('DailyHighChange')
feature_columns.remove('DailyLowChange')
# Adanet estimator
# Learn to ensemble linear and DNN models.
estimator = adanet.AutoEnsembleEstimator(
    head=head,
    candidate_pool=lambda config: {
        "linear":
            tf.estimator.LinearEstimator(
                head=head,
                feature_columns=feature_columns,
                config=config,
                optimizer='Adagrad'),
        "dnn":
            tf.estimator.DNNEstimator(
                head=head,
                feature_columns=feature_columns,
                config=config,
                optimizer='Adagrad',
                hidden_units=[1000, 500, 100])},
    max_iteration_steps=50)
# Input builders
# Define our train function called by the estimator during training to return
# a tf.data.Dataset (x, y) tuple
def input_fn_train():
    # Do the same thing to collect a list of usable column names
    df = pd.read_csv('./datasets/call_restored_df_' + str(4) + '_' + 'SPY' + '.csv')
    df = df.set_index(['Date'])
    df['class'] = df['class'].astype('int32')
    feature_columns_list = list(df.columns)
    feature_columns_list.remove('Ticker')
    feature_columns_list.remove('DailyChange')
    feature_columns_list.remove('DailyHighChange')
    feature_columns_list.remove('DailyLowChange')
    # Make our tf.data.Dataset from the same .csv file as before
    df = tf.data.experimental.make_csv_dataset(
        './datasets/call_restored_df_' + str(4) + '_' + 'SPY' + '.csv',
        batch_size=32,
        label_name="class",
        select_columns=feature_columns_list)
    df_batches = (
        df.cache().repeat().shuffle(500)
        .prefetch(tf.data.experimental.AUTOTUNE))
    return df_batches
# Get the estimator to train ...
estimator.train(input_fn=input_fn_train, steps=100)
So given that error, I replaced every instance of tf. with tf.compat.v1. in the above code, and got this error:
ValueError: Items of feature_columns must be a _FeatureColumn. Given (type <class 'str'>): Close_Resistance.
Doing some more searching, I discovered that each column apparently has to be declared as a numeric feature column rather than passed as a plain string. So, after reverting from tf.compat.v1. back to tf., I added these two loops to convert my two lists of column names:
...
feature_columns.remove('DailyLowChange')
# Make all the feature columns numeric type for TF 2.1 for some reason
new_feature_list = []
for i in feature_columns:
    new_feature_list.append(tf.feature_column.numeric_column(i))
# Adanet estimator
# Learn to ensemble linear and DNN models.
estimator = adanet.AutoEnsembleEstimator(
...
and
...
feature_columns_list.remove('DailyLowChange')
# Make all the feature columns numeric type for TF 2.1 for some reason
new_feature_columns_list = []
for i in feature_columns_list:
    new_feature_columns_list.append(tf.feature_column.numeric_column(i))
# Make our tf.data.Dataset from the same .csv file as before
df = tf.data.experimental.make_csv_dataset(
...
...and now get this error:
TypeError: not all arguments converted during string formatting
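One possible source of this message, offered as a guess: the elided code does not show where new_feature_columns_list is used, but if feature column objects were passed somewhere that expects plain column-name strings (select_columns, for example), any old-style % formatting applied to them could fail this way. As far as I can tell, the TF 2.x feature column classes are namedtuple-based, and Python unpacks a tuple on the right-hand side of %. The sketch below reproduces the effect with the standard library only. FakeNumericColumn and describe are hypothetical stand-ins, not TensorFlow code.

```python
from collections import namedtuple

# Hypothetical stand-in for a feature column object. The field names are
# illustrative only; this is NOT the TensorFlow class.
FakeNumericColumn = namedtuple("FakeNumericColumn", ["key", "shape", "dtype"])


def describe(column_name):
    # Old-style % formatting with a single placeholder, the kind of call
    # a library might make when it expects a plain string.
    return "Selecting column %s" % column_name


def try_describe(value):
    # Return the formatted text, or the TypeError message if formatting fails.
    try:
        return describe(value)
    except TypeError as exc:
        return "TypeError: " + str(exc)


# A plain string fills the single %s placeholder.
plain = try_describe("Close_Resistance")

# A namedtuple is a tuple, so % treats its three fields as three separate
# arguments for one placeholder, and the formatting fails.
wrapped = try_describe(FakeNumericColumn("Close_Resistance", (1,), "float32"))

print(plain)
print(wrapped)
```

If that is what is happening, the traceback should point at a % formatting line inside the library rather than at the estimator itself.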
So I'm at a loss as to what to do. I want to get this working on TF 2.1, but I keep running into failures. I see there was a solution in this post, but my .csv file has too many columns to go through one at a time and define each as a numeric type, so I need something dynamic that works no matter how many columns are loaded. Someone help! Thanks.
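To make the "dynamic" requirement concrete, here is a minimal sketch of the column bookkeeping on its own, using only the standard library. It assumes that the label column ("class") has to stay in the list given to select_columns but should not be among the estimator's feature columns. split_columns is a hypothetical helper, and the sample header is an assumed layout built from the column names mentioned above.

```python
import csv
import io


def split_columns(header, unused, label):
    """Return (select_columns, feature_names) from a CSV header row.

    select_columns keeps the label, since the dataset loader needs it.
    feature_names drops the label, since it is not a model input.
    """
    select_columns = [name for name in header if name not in unused]
    feature_names = [name for name in select_columns if name != label]
    return select_columns, feature_names


# Tiny in-memory CSV standing in for the real file (assumed layout).
sample = io.StringIO(
    "Date,Ticker,Close_Resistance,DailyChange,DailyHighChange,DailyLowChange,class\n"
    "2020-01-02,SPY,0.5,0.1,0.2,0.3,1\n"
)
header = next(csv.reader(sample))

# Columns excluded from training, regardless of how many others exist.
unused = {"Date", "Ticker", "DailyChange", "DailyHighChange", "DailyLowChange"}

select_columns, feature_names = split_columns(header, unused, label="class")
print(select_columns)
print(feature_names)
```

Under these assumptions, the strings in select_columns would go to tf.data.experimental.make_csv_dataset unchanged, while each name in feature_names would be wrapped once with tf.feature_column.numeric_column before being handed to the estimators. I have not verified this against AdaNet itself.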