As a school assignment, we have to be able to predict the best trip for a traveler_type. I am following a TensorFlow tutorial. After hours of working my way through bugs, I have finally hit a dead end. I have looked at the following for answers: Python TensorFlow Cast string to float is not.... python-Converting from pandas to Tensorflow..
I have rewritten the code in several ways due to other bugs that came up. This is the latest iteration.
I initially open the CSV file to clean it up and format it. There are some columns we don't need. If there is a way to model and train on the data without having to manually remove the unwanted columns, I am open to suggestions. After I remove what I don't need, I save the dataframe without headers and index to a different file and reopen it in input_fn. My print statements are for debugging. They let me see what is going into or coming out of a function, so I know whether the types are correct.
This is what's getting me hung up. According to the prints, all my columns are the correct data types. They match up with my feature columns. Then again, I have been at this for most of the day, so I am probably going blind.
According to the output I get from attempting to run this, and one of the SO posts I listed, I am passing one of the input columns in the wrong type. But I can't figure out which one. This is our first attempt at TensorFlow, so I am at a loss trying to figure this out on my own.
Following is the code.
from __future__ import absolute_import
from __future__ import division
from __future__ import print_function
import pandas as pd
import numpy as np
import tensorflow as tf
_CSV_COLUMNS = ['Score', 'Period_of_stay',
                'Traveler_type', 'Pool',
                'Gym',
                'Tennis_court', 'Spa',
                'Casino',
                'Free_internet',
                'Hotel_name',
                'Hotel_stars',
                'Review_month']
_TRAIN_FILE = './cleanedUPTrain.csv'
_TEST_FILE = './cleanedUPTest.csv'
train = pd.read_csv('./vegas2.csv', index_col=False,
                    usecols=_CSV_COLUMNS,
                    dtype={'Score': np.float64, 'Period_of_stay': np.str,
                           'Traveler_type': np.str, 'Pool': np.str,
                           'Gym': np.str,
                           'Tennis_court': np.str, 'Spa': np.str,
                           'Casino': np.str,
                           'Free_internet': np.str,
                           'Hotel_name': np.str,
                           'Review_month': np.str, 'Hotel_stars': np.float64})
train.to_csv(_TRAIN_FILE, header=False, index=False)
print(train.dtypes)
print("$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$")
test = pd.read_csv('./vegas.csv', index_col=False,
                   usecols=_CSV_COLUMNS,
                   dtype={'Score': np.float64, 'Period_of_stay': np.str,
                          'Traveler_type': np.str, 'Pool': np.str,
                          'Gym': np.str,
                          'Tennis_court': np.str, 'Spa': np.str,
                          'Casino': np.str,
                          'Free_internet': np.str,
                          'Hotel_name': np.str,
                          'Review_month': np.str, 'Hotel_stars': np.float64})
test.to_csv(_TEST_FILE, header=False, index=False)
print(test.dtypes)
print("$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$")
score = tf.feature_column.numeric_column(
    'Score', [1, 2, 3, 4, 5]
)
stay_period = tf.feature_column.categorical_column_with_vocabulary_list(
    'Period_of_stay', ['Dec-Feb', 'Mar-May', 'Jun-Aug', 'Sep-Nov']
)
traveler_type = tf.feature_column.categorical_column_with_vocabulary_list(
    'Traveler_type', ['Friends', 'Business', 'Families', 'Couples', 'Solo'])
pool = tf.feature_column.categorical_column_with_vocabulary_list(
    'Pool', ['YES', 'NO'])
gym = tf.feature_column.categorical_column_with_vocabulary_list(
    'Gym', ['YES', 'NO'])
tennis_court = tf.feature_column.categorical_column_with_vocabulary_list(
    'Tennis_court', ['YES', 'NO'])
spa = tf.feature_column.categorical_column_with_vocabulary_list(
    'Spa', ['YES', 'NO'])
casino = tf.feature_column.categorical_column_with_vocabulary_list(
    'Casino', ['YES', 'NO'])
free_internet = tf.feature_column.categorical_column_with_vocabulary_list(
    'Free_internet', ['YES', 'NO'])
hotel_name = tf.feature_column.categorical_column_with_vocabulary_list(
    'Hotel_name',
    ['Circus Circus Hotel & Casino Las Vegas', 'Excalibur Hotel & Casino', 'Tuscany Las Vegas Suites & Casino', 'Hilton Grand Vacations at the Flamingo', 'Monte Carlo Resort&Casino', 'Treasure Island- TI Hotel & Casino', 'Tropicana Las Vegas - A Double Tree by Hilton Hotel', 'Paris Las Vegas', 'The Westin las Vegas Hotel Casino & Spa', 'Caesars Palace', 'The Cosmopolitan Las Vegas', 'The Palazzo Resort Hotel Casino', 'Wynn Las Vegas', 'Trump International Hotel Las Vegas', 'Encore at wynn Las Vegas', 'The Venetian Las Vegas Hotel', 'Bellagio Las Vegas',"Marriott's Grand Chateau", 'Wyndham Grand Desert', 'The Cromwell']
)
hotel_stars = tf.feature_column.numeric_column(
    'Hotel_stars', [1, 2, 3, 4, 5])
review_month = tf.feature_column.categorical_column_with_vocabulary_list(
    'Review_month',
    ['January', 'February', 'March', 'April', 'May', 'June', 'July', 'August', 'September', 'October', 'November', 'December']
)
base_columns = [score, stay_period, pool,
                gym, tennis_court, spa, casino,
                free_internet, hotel_name, hotel_stars, review_month]
def input_fn(data_file, num_of_epochs, shuffle, batch_size):
    assert tf.gfile.Exists(data_file), ('file not found!!!')

    def parse_csv(value):
        print('Parsing file')
        record_defaults = [[1], ['Dec-Feb'], ['Solo'], ['YES'], ['YES'], ['YES'], ['YES'],
                           ['YES'], ['YES'], ['Circus Circus Hotel & Casino Las Vegas'], [1], ['January']]
        columns = tf.decode_csv(value, record_defaults=record_defaults)
        features = dict(zip(_CSV_COLUMNS, columns))
        labels = features.pop('Traveler_type')
        print(labels)
        print("###################################")
        print(columns)
        print("###################################")
        print(features)
        print("###################################")
        return features, labels

    dataset = tf.data.TextLineDataset(data_file)
    if shuffle:
        dataset = dataset.shuffle(shuffle)
    dataset = dataset.map(parse_csv, num_parallel_calls=5)
    dataset = dataset.repeat(num_of_epochs)
    dataset = dataset.batch(batch_size)
    return dataset
# Create the model
classifier = tf.estimator.LinearClassifier(
    model_dir='~/lab3AI/',
    feature_columns=base_columns
)
# Train the model
classifier.train(input_fn=lambda:input_fn(_TRAIN_FILE,50, 50, 20), steps=10)
results = classifier.evaluate(input_fn=lambda: input_fn(_TEST_FILE, 1, False, 20))
inputs = tf.feature_column.input_layer(temp, base_columns)
var_init = tf.global_variables_initializer()
table_init = tf.tables_initializer()
sess = tf.Session()
sess.run((var_init,table_init))
print(sess.run(inputs))
Error log.
Score float64
Period_of_stay object
Traveler_type object
Pool object
Gym object
Tennis_court object
Spa object
Casino object
Free_internet object
Hotel_name object
Hotel_stars float64
Review_month object
dtype: object
$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$
Score float64
Period_of_stay object
Traveler_type object
Pool object
Gym object
Tennis_court object
Spa object
Casino object
Free_internet object
Hotel_name object
Hotel_stars float64
Review_month object
dtype: object
$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$
Parsing file
Tensor("DecodeCSV:2", shape=(), dtype=string) #labels
###################################columns
[<tf.Tensor 'DecodeCSV:0' shape=() dtype=int32>, <tf.Tensor 'DecodeCSV:1' shape=() dtype=string>, <tf.Tensor 'DecodeCSV:2' shape=() dtype=string>, <tf.Tensor 'DecodeCSV:3' shape=() dtype=string>, <tf.Tensor 'DecodeCSV:4' shape=() dtype=string>, <tf.Tensor 'DecodeCSV:5' shape=() dtype=string>, <tf.Tensor 'DecodeCSV:6' shape=() dtype=string>, <tf.Tensor 'DecodeCSV:7' shape=() dtype=string>, <tf.Tensor 'DecodeCSV:8' shape=() dtype=string>, <tf.Tensor 'DecodeCSV:9' shape=() dtype=string>, <tf.Tensor 'DecodeCSV:10' shape=() dtype=int32>, <tf.Tensor 'DecodeCSV:11' shape=() dtype=string>]
###################################features
{'Hotel_name': <tf.Tensor 'DecodeCSV:9' shape=() dtype=string>, 'Free_internet': <tf.Tensor 'DecodeCSV:8' shape=() dtype=string>, 'Casino': <tf.Tensor 'DecodeCSV:7' shape=() dtype=string>, 'Hotel_stars': <tf.Tensor 'DecodeCSV:10' shape=() dtype=int32>, 'Tennis_court': <tf.Tensor 'DecodeCSV:5' shape=() dtype=string>, 'Score': <tf.Tensor 'DecodeCSV:0' shape=() dtype=int32>, 'Period_of_stay': <tf.Tensor 'DecodeCSV:1' shape=() dtype=string>, 'Spa': <tf.Tensor 'DecodeCSV:6' shape=() dtype=string>, 'Pool': <tf.Tensor 'DecodeCSV:3' shape=() dtype=string>, 'Review_month': <tf.Tensor 'DecodeCSV:11' shape=() dtype=string>, 'Gym': <tf.Tensor 'DecodeCSV:4' shape=() dtype=string>}
###################################
2018-03-13 00:50:53.694401: W tensorflow/core/framework/op_kernel.cc:1179] OP_REQUIRES failed at cast_op.cc:77 : Unimplemented: Cast string to float is not supported
2018-03-13 00:50:53.694466: E tensorflow/core/common_runtime/executor.cc:645] Executor failed to create kernel. Unimplemented: Cast string to float is not supported
[[Node: linear/head/ToFloat = Cast[DstT=DT_FLOAT, SrcT=DT_STRING, _device="/job:localhost/replica:0/task:0/device:CPU:0"](linear/head/labels)]]
Traceback (most recent call last):
File "/home/guak/tensorFlow/lib/python3.5/site-packages/tensorflow/python/client/session.py", line 1361, in _do_call
return fn(*args)
File "/home/guak/tensorFlow/lib/python3.5/site-packages/tensorflow/python/client/session.py", line 1340, in _run_fn
target_list, status, run_metadata)
File "/home/guak/tensorFlow/lib/python3.5/site-packages/tensorflow/python/framework/errors_impl.py", line 516, in __exit__
c_api.TF_GetCode(self.status.status))
tensorflow.python.framework.errors_impl.UnimplementedError: Cast string to float is not supported
[[Node: linear/head/ToFloat = Cast[DstT=DT_FLOAT, SrcT=DT_STRING, _device="/job:localhost/replica:0/task:0/device:CPU:0"](linear/head/labels)]]
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "testingColumns.py", line 149, in <module>
classifier.train(input_fn=lambda:input_fn(_TRAIN_FILE,50, 50, 20), steps=10)
File "/home/guak/tensorFlow/lib/python3.5/site-packages/tensorflow/python/estimator/estimator.py", line 352, in train
loss = self._train_model(input_fn, hooks, saving_listeners)
File "/home/guak/tensorFlow/lib/python3.5/site-packages/tensorflow/python/estimator/estimator.py", line 891, in _train_model
_, loss = mon_sess.run([estimator_spec.train_op, estimator_spec.loss])
File "/home/guak/tensorFlow/lib/python3.5/site-packages/tensorflow/python/training/monitored_session.py", line 546, in run
run_metadata=run_metadata)
File "/home/guak/tensorFlow/lib/python3.5/site-packages/tensorflow/python/training/monitored_session.py", line 1022, in run
run_metadata=run_metadata)
File "/home/guak/tensorFlow/lib/python3.5/site-packages/tensorflow/python/training/monitored_session.py", line 1113, in run
raise six.reraise(*original_exc_info)
File "/home/guak/tensorFlow/lib/python3.5/site-packages/six.py", line 693, in reraise
raise value
File "/home/guak/tensorFlow/lib/python3.5/site-packages/tensorflow/python/training/monitored_session.py", line 1098, in run
return self._sess.run(*args, **kwargs)
File "/home/guak/tensorFlow/lib/python3.5/site-packages/tensorflow/python/training/monitored_session.py", line 1170, in run
run_metadata=run_metadata)
File "/home/guak/tensorFlow/lib/python3.5/site-packages/tensorflow/python/training/monitored_session.py", line 950, in run
return self._sess.run(*args, **kwargs)
File "/home/guak/tensorFlow/lib/python3.5/site-packages/tensorflow/python/client/session.py", line 905, in run
run_metadata_ptr)
File "/home/guak/tensorFlow/lib/python3.5/site-packages/tensorflow/python/client/session.py", line 1137, in _run
feed_dict_tensor, options, run_metadata)
File "/home/guak/tensorFlow/lib/python3.5/site-packages/tensorflow/python/client/session.py", line 1355, in _do_run
options, run_metadata)
File "/home/guak/tensorFlow/lib/python3.5/site-packages/tensorflow/python/client/session.py", line 1374, in _do_call
raise type(e)(node_def, op, message)
tensorflow.python.framework.errors_impl.UnimplementedError: Cast string to float is not supported
[[Node: linear/head/ToFloat = Cast[DstT=DT_FLOAT, SrcT=DT_STRING, _device="/job:localhost/replica:0/task:0/device:CPU:0"](linear/head/labels)]]
Caused by op 'linear/head/ToFloat', defined at:
File "testingColumns.py", line 149, in <module>
classifier.train(input_fn=lambda:input_fn(_TRAIN_FILE,50, 50, 20), steps=10)
File "/home/guak/tensorFlow/lib/python3.5/site-packages/tensorflow/python/estimator/estimator.py", line 352, in train
loss = self._train_model(input_fn, hooks, saving_listeners)
File "/home/guak/tensorFlow/lib/python3.5/site-packages/tensorflow/python/estimator/estimator.py", line 812, in _train_model
features, labels, model_fn_lib.ModeKeys.TRAIN, self.config)
File "/home/guak/tensorFlow/lib/python3.5/site-packages/tensorflow/python/estimator/estimator.py", line 793, in _call_model_fn
model_fn_results = self._model_fn(features=features, **kwargs)
File "/home/guak/tensorFlow/lib/python3.5/site-packages/tensorflow/python/estimator/canned/linear.py", line 316, in _model_fn
config=config)
File "/home/guak/tensorFlow/lib/python3.5/site-packages/tensorflow/python/estimator/canned/linear.py", line 170, in _linear_model_fn
logits=logits)
File "/home/guak/tensorFlow/lib/python3.5/site-packages/tensorflow/python/estimator/canned/head.py", line 1100, in create_estimator_spec
features=features, mode=mode, logits=logits, labels=labels))
File "/home/guak/tensorFlow/lib/python3.5/site-packages/tensorflow/python/estimator/canned/head.py", line 1010, in create_loss
labels = math_ops.to_float(labels)
File "/home/guak/tensorFlow/lib/python3.5/site-packages/tensorflow/python/ops/math_ops.py", line 830, in to_float
return cast(x, dtypes.float32, name=name)
File "/home/guak/tensorFlow/lib/python3.5/site-packages/tensorflow/python/ops/math_ops.py", line 779, in cast
return gen_math_ops.cast(x, base_type, name=name)
File "/home/guak/tensorFlow/lib/python3.5/site-packages/tensorflow/python/ops/gen_math_ops.py", line 911, in cast
"Cast", x=x, DstT=DstT, name=name)
File "/home/guak/tensorFlow/lib/python3.5/site-packages/tensorflow/python/framework/op_def_library.py", line 787, in _apply_op_helper
op_def=op_def)
File "/home/guak/tensorFlow/lib/python3.5/site-packages/tensorflow/python/framework/ops.py", line 3271, in create_op
op_def=op_def)
File "/home/guak/tensorFlow/lib/python3.5/site-packages/tensorflow/python/framework/ops.py", line 1650, in __init__
self._traceback = self._graph._extract_stack() # pylint: disable=protected-access
UnimplementedError (see above for traceback): Cast string to float is not supported
[[Node: linear/head/ToFloat = Cast[DstT=DT_FLOAT, SrcT=DT_STRING, _device="/job:localhost/replica:0/task:0/device:CPU:0"](linear/head/labels)]]
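Looking at the log again, the node named in the error is `linear/head/labels`, so the failing cast appears to be on the label column (`Traveler_type`, a string) rather than on any feature column. Below is a minimal sketch, in plain Python and outside TensorFlow, of the string-to-integer mapping that a classifier head would presumably need. `TRAVELER_TYPES` and `label_to_id` are names made up for this illustration and are not part of my script.

```python
# Hypothetical helper, not part of the original script: map string labels to
# integer class ids, the kind of value that can be cast to float.
TRAVELER_TYPES = ['Friends', 'Business', 'Families', 'Couples', 'Solo']


def label_to_id(label, vocabulary=TRAVELER_TYPES):
    """Return the index of `label` in `vocabulary`; raise ValueError if unknown."""
    return vocabulary.index(label)


# Labels taken from the first five rows of the sample data.
sample_labels = ['Friends', 'Business', 'Families', 'Couples', 'Couples']
label_ids = [label_to_id(label) for label in sample_labels]
print(label_ids)  # [0, 1, 2, 3, 3]
```

If I read the API correctly, `tf.estimator.LinearClassifier` has `n_classes` and `label_vocabulary` arguments intended for this case. That is an assumption to verify against the installed TensorFlow version.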
I forgot to add sample data:
5.0,Dec-Feb,Friends,NO,YES,NO,NO,YES,YES,Circus Circus Hotel & Casino Las Vegas,3.0,January
3.0,Dec-Feb,Business,NO,YES,NO,NO,YES,YES,Circus Circus Hotel & Casino Las Vegas,3.0,January
4.0,Dec-Feb,Families,NO,YES,NO,NO,YES,YES,Circus Circus Hotel & Casino Las Vegas,3.0,December
2.0,Dec-Feb,Couples,NO,YES,NO,NO,YES,YES,Circus Circus Hotel & Casino Las Vegas,3.0,December
4.0,Dec-Feb,Couples,YES,YES,NO,YES,YES,YES,Excalibur Hotel & Casino,3.0,January
4.0,Dec-Feb,Business,YES,YES,NO,YES,YES,YES,Excalibur Hotel & Casino,3.0,January
5.0,Dec-Feb,Couples,YES,YES,NO,YES,YES,YES,Excalibur Hotel & Casino,3.0,February
3.0,Dec-Feb,Business,YES,YES,NO,YES,YES,YES,Excalibur Hotel & Casino,3.0,February
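As a sanity check that does not depend on TensorFlow, here is a small sketch that reads two of the sample rows with the standard `csv` module and reports how each raw field would be typed. `SAMPLE`, `COLUMNS`, and `infer_kind` are illustrative names, not part of my script. The point is that `Score` and `Hotel_stars` are written as `5.0` and `3.0`, which may not match the integer defaults (`[1]`) I pass in `record_defaults`.

```python
import csv
import io

# Two rows copied from the sample data above.
SAMPLE = (
    "5.0,Dec-Feb,Friends,NO,YES,NO,NO,YES,YES,"
    "Circus Circus Hotel & Casino Las Vegas,3.0,January\n"
    "4.0,Dec-Feb,Couples,YES,YES,NO,YES,YES,YES,"
    "Excalibur Hotel & Casino,3.0,January\n"
)

COLUMNS = ['Score', 'Period_of_stay', 'Traveler_type', 'Pool', 'Gym',
           'Tennis_court', 'Spa', 'Casino', 'Free_internet', 'Hotel_name',
           'Hotel_stars', 'Review_month']


def infer_kind(text):
    """Classify a raw CSV field as 'int', 'float', or 'string'."""
    try:
        int(text)
        return 'int'
    except ValueError:
        pass
    try:
        float(text)
        return 'float'
    except ValueError:
        return 'string'


rows = list(csv.reader(io.StringIO(SAMPLE)))
kinds = {name: infer_kind(value) for name, value in zip(COLUMNS, rows[0])}
for name in COLUMNS:
    print(name, kinds[name])
```

On these rows, `Score` and `Hotel_stars` come out as `float` and every other column as `string`, so a float default such as `[1.0]` for those two columns might be the closer match. I have not confirmed that this is related to the error above.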