
For a school assignment we have to predict the best trip for a given traveler_type. I am following a TensorFlow tutorial. After hours of working my way through bugs, I have finally hit a dead end. I have already looked at the following posts for answers: Python TensorFlow Cast string to float is not.... python-COnverting from pandas to Tensorflow..

I have rewritten the code in several ways due to other bugs that came up. This is the latest iteration.

I initially open the CSV file to clean it up and format it, since there are some columns we don't need. If there is a way to model and train on the data without manually removing the unwanted columns, I am open to suggestions. After I remove what I don't need, I save the dataframe without headers and index to a different file and reopen it in input_fn. My print statements are for debugging. They let me see what goes into and comes out of a function, so I can tell whether the types are correct.
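One possible way to skip the intermediate cleaned-up file is to pick the wanted columns by header name while reading. The following is only a sketch in plain Python, not TensorFlow: the `select_columns` helper and the `Unwanted` column are hypothetical, and the two data rows are made up for illustration.

```python
import csv
import io

# Columns to keep, in the order the model expects (a subset of _CSV_COLUMNS).
WANTED = ['Score', 'Traveler_type', 'Pool']


def select_columns(lines, wanted):
    """Yield one dict per CSV row, keeping only the wanted columns."""
    reader = csv.DictReader(lines)  # uses the first line as the header
    for row in reader:
        yield {name: row[name] for name in wanted}


# Made-up stand-in for the raw file; 'Unwanted' is a hypothetical extra column.
raw = io.StringIO(
    "Unwanted,Score,Traveler_type,Pool\n"
    "x,5,Friends,NO\n"
    "y,3,Business,YES\n"
)

rows = list(select_columns(raw, WANTED))
print(rows)
```

This relies on the raw file having a header row, which the cleaned-up files in this post deliberately drop.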

This is what's getting me hung up. According to the prints, all my columns have the correct data types and match my feature columns. Then again, I have been at this for most of the day, so I am probably going blind.

According to the output I get when I try to run this, and to one of the SO posts I listed, one of my input columns has the wrong type, but I can't figure out which one. This is our first attempt at TensorFlow, so I am at a loss trying to figure this out on my own.

Here is the code.

from __future__ import absolute_import
from __future__ import division
from __future__ import print_function

import pandas as pd
import numpy as np
import tensorflow as tf

_CSV_COLUMNS = ['Score', 'Period_of_stay',
            'Traveler_type', 'Pool',
            'Gym',
            'Tennis_court', 'Spa',
            'Casino',
            'Free_internet',
            'Hotel_name',
            'Hotel_stars',
            'Review_month']

_TRAIN_FILE = './cleanedUPTrain.csv'

_TEST_FILE = './cleanedUPTest.csv'


train = pd.read_csv('./vegas2.csv', index_col=False,
        usecols=_CSV_COLUMNS,
        dtype={'Score': np.float64, 'Period_of_stay':np.str,
            'Traveler_type':np.str, 'Pool':np.str,
            'Gym':np.str,
            'Tennis_court':np.str, 'Spa':np.str,
            'Casino':np.str,
            'Free_internet':np.str,
            'Hotel_name':np.str,
            'Review_month':np.str,'Hotel_stars': np.float64})

train.to_csv(_TRAIN_FILE, header=False, index=False)
print(train.dtypes)
print("$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$")

test = pd.read_csv('./vegas.csv', index_col=False,
        usecols=_CSV_COLUMNS,
        dtype={'Score': np.float64, 'Period_of_stay':np.str,
            'Traveler_type':np.str, 'Pool':np.str,
            'Gym':np.str,
            'Tennis_court':np.str, 'Spa':np.str,
            'Casino':np.str,
            'Free_internet':np.str,
            'Hotel_name':np.str,
            'Review_month':np.str,'Hotel_stars': np.float64})

test.to_csv(_TEST_FILE, header=False, index=False)
print(test.dtypes)
print("$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$")


score = tf.feature_column.numeric_column(
        'Score', [1, 2, 3, 4, 5]
        )

stay_period = tf.feature_column.categorical_column_with_vocabulary_list(
        'Period_of_stay', ['Dec-Feb', 'Mar-May', 'Jun-Aug', 'Sep-Nov']
        )
traveler_type = tf.feature_column.categorical_column_with_vocabulary_list(
        'Traveler_type', ['Friends', 'Business', 'Families', 'Couples', 'Solo'])

pool = tf.feature_column.categorical_column_with_vocabulary_list(
        'Pool', ['YES', 'NO'])

gym = tf.feature_column.categorical_column_with_vocabulary_list(
        'Gym', ['YES', 'NO'])

tennis_court = tf.feature_column.categorical_column_with_vocabulary_list(
        'Tennis_court', ['YES', 'NO'])

spa = tf.feature_column.categorical_column_with_vocabulary_list(
        'Spa', ['YES', 'NO'])

casino = tf.feature_column.categorical_column_with_vocabulary_list(
        'Casino', ['YES', 'NO'])

free_internet = tf.feature_column.categorical_column_with_vocabulary_list(
        'Free_internet', ['YES', 'NO'])

hotel_name = tf.feature_column.categorical_column_with_vocabulary_list(
        'Hotel_name',
        ['Circus Circus Hotel & Casino Las Vegas', 'Excalibur Hotel & Casino', 'Tuscany Las Vegas Suites & Casino', 'Hilton Grand Vacations at the Flamingo', 'Monte Carlo Resort&Casino', 'Treasure Island- TI Hotel & Casino', 'Tropicana Las Vegas - A Double Tree by Hilton Hotel', 'Paris Las Vegas', 'The Westin las Vegas Hotel Casino & Spa', 'Caesars Palace', 'The Cosmopolitan Las Vegas', 'The Palazzo Resort Hotel Casino', 'Wynn Las Vegas', 'Trump International Hotel Las Vegas', 'Encore at wynn Las Vegas', 'The Venetian Las Vegas Hotel', 'Bellagio Las Vegas',"Marriott's Grand Chateau", 'Wyndham Grand Desert', 'The Cromwell']
        )
hotel_stars = tf.feature_column.numeric_column(
        'Hotel_stars', [1, 2, 3, 4, 5])

review_month = tf.feature_column.categorical_column_with_vocabulary_list(
        'Review_month',
        ['January', 'February', 'March', 'April', 'May', 'June', 'July', 'August', 'September', 'October', 'November', 'December']
        )


base_columns = [score, stay_period, pool,
        gym, tennis_court, spa, casino,
        free_internet, hotel_name, hotel_stars, review_month]



def input_fn(data_file, num_of_epochs, shuffle, batch_size):
    assert tf.gfile.Exists(data_file), ('file not found!!!')

    def parse_csv(value):
        print('Parsing file')
        record_defaults = [[1], ['Dec-Feb'], ['Solo'], ['YES'], ['YES'], ['YES'], ['YES'],
                ['YES'], ['YES'], ['Circus Circus Hotel & Casino Las Vegas'], [1], ['January']]
        columns = tf.decode_csv(value, record_defaults=record_defaults)
        features = dict(zip(_CSV_COLUMNS, columns))
        labels = features.pop('Traveler_type')
        print(labels)
        print("###################################")
        print(columns)
        print("###################################")
        print(features)
        print("###################################")
        return features, labels

    dataset = tf.data.TextLineDataset(data_file)

    if shuffle:
        dataset = dataset.shuffle(shuffle)

    dataset = dataset.map(parse_csv, num_parallel_calls=5)

    dataset = dataset.repeat(num_of_epochs)
    dataset = dataset.batch(batch_size)
    return dataset




#CREATE THE MODEL
classifier = tf.estimator.LinearClassifier(
        model_dir = '~/lab3AI/',
        feature_columns=base_columns
        )

#train the model
classifier.train(input_fn=lambda:input_fn(_TRAIN_FILE,50, 50, 20), steps=10)

results = classifier.evaluate(input_fn=lambda: input_fn(_TEST_FILE, 1, False, 20))



inputs = tf.feature_column.input_layer(temp, base_columns)

var_init = tf.global_variables_initializer()

table_init = tf.tables_initializer()
sess = tf.Session()
sess.run((var_init,table_init))
print(sess.run(inputs))

Error log.

Score             float64
Period_of_stay     object
Traveler_type      object
Pool               object
Gym                object
Tennis_court       object
Spa                object
Casino             object
Free_internet      object
Hotel_name         object
Hotel_stars       float64
Review_month       object
dtype: object
$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$
Score             float64
Period_of_stay     object
Traveler_type      object
Pool               object
Gym                object
Tennis_court       object
Spa                object
Casino             object
Free_internet      object
Hotel_name         object
Hotel_stars       float64
Review_month       object
dtype: object
$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$
Parsing file
Tensor("DecodeCSV:2", shape=(), dtype=string) #labels
###################################columns
[<tf.Tensor 'DecodeCSV:0' shape=() dtype=int32>, <tf.Tensor 'DecodeCSV:1' shape=() dtype=string>, <tf.Tensor 'DecodeCSV:2' shape=() dtype=string>, <tf.Tensor 'DecodeCSV:3' shape=() dtype=string>, <tf.Tensor 'DecodeCSV:4' shape=() dtype=string>, <tf.Tensor 'DecodeCSV:5' shape=() dtype=string>, <tf.Tensor 'DecodeCSV:6' shape=() dtype=string>, <tf.Tensor 'DecodeCSV:7' shape=() dtype=string>, <tf.Tensor 'DecodeCSV:8' shape=() dtype=string>, <tf.Tensor 'DecodeCSV:9' shape=() dtype=string>, <tf.Tensor 'DecodeCSV:10' shape=() dtype=int32>, <tf.Tensor 'DecodeCSV:11' shape=() dtype=string>]
###################################features
{'Hotel_name': <tf.Tensor 'DecodeCSV:9' shape=() dtype=string>, 'Free_internet': <tf.Tensor 'DecodeCSV:8' shape=() dtype=string>, 'Casino': <tf.Tensor 'DecodeCSV:7' shape=() dtype=string>, 'Hotel_stars': <tf.Tensor 'DecodeCSV:10' shape=() dtype=int32>, 'Tennis_court': <tf.Tensor 'DecodeCSV:5' shape=() dtype=string>, 'Score': <tf.Tensor 'DecodeCSV:0' shape=() dtype=int32>, 'Period_of_stay': <tf.Tensor 'DecodeCSV:1' shape=() dtype=string>, 'Spa': <tf.Tensor 'DecodeCSV:6' shape=() dtype=string>, 'Pool': <tf.Tensor 'DecodeCSV:3' shape=() dtype=string>, 'Review_month': <tf.Tensor 'DecodeCSV:11' shape=() dtype=string>, 'Gym': <tf.Tensor 'DecodeCSV:4' shape=() dtype=string>}
###################################
2018-03-13 00:50:53.694401: W tensorflow/core/framework/op_kernel.cc:1179] OP_REQUIRES failed at cast_op.cc:77 : Unimplemented: Cast string to float is not supported
2018-03-13 00:50:53.694466: E tensorflow/core/common_runtime/executor.cc:645] Executor failed to create kernel. Unimplemented: Cast string to float is not supported
     [[Node: linear/head/ToFloat = Cast[DstT=DT_FLOAT, SrcT=DT_STRING, _device="/job:localhost/replica:0/task:0/device:CPU:0"](linear/head/labels)]]
Traceback (most recent call last):
  File "/home/guak/tensorFlow/lib/python3.5/site-packages/tensorflow/python/client/session.py", line 1361, in _do_call
    return fn(*args)
  File "/home/guak/tensorFlow/lib/python3.5/site-packages/tensorflow/python/client/session.py", line 1340, in _run_fn
    target_list, status, run_metadata)
  File "/home/guak/tensorFlow/lib/python3.5/site-packages/tensorflow/python/framework/errors_impl.py", line 516, in __exit__
    c_api.TF_GetCode(self.status.status))
tensorflow.python.framework.errors_impl.UnimplementedError: Cast string to float is not supported
     [[Node: linear/head/ToFloat = Cast[DstT=DT_FLOAT, SrcT=DT_STRING, _device="/job:localhost/replica:0/task:0/device:CPU:0"](linear/head/labels)]]

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "testingColumns.py", line 149, in <module>
    classifier.train(input_fn=lambda:input_fn(_TRAIN_FILE,50, 50, 20), steps=10)
  File "/home/guak/tensorFlow/lib/python3.5/site-packages/tensorflow/python/estimator/estimator.py", line 352, in train
    loss = self._train_model(input_fn, hooks, saving_listeners)
  File "/home/guak/tensorFlow/lib/python3.5/site-packages/tensorflow/python/estimator/estimator.py", line 891, in _train_model
    _, loss = mon_sess.run([estimator_spec.train_op, estimator_spec.loss])
  File "/home/guak/tensorFlow/lib/python3.5/site-packages/tensorflow/python/training/monitored_session.py", line 546, in run
    run_metadata=run_metadata)
  File "/home/guak/tensorFlow/lib/python3.5/site-packages/tensorflow/python/training/monitored_session.py", line 1022, in run
    run_metadata=run_metadata)
  File "/home/guak/tensorFlow/lib/python3.5/site-packages/tensorflow/python/training/monitored_session.py", line 1113, in run
    raise six.reraise(*original_exc_info)
  File "/home/guak/tensorFlow/lib/python3.5/site-packages/six.py", line 693, in reraise
    raise value
  File "/home/guak/tensorFlow/lib/python3.5/site-packages/tensorflow/python/training/monitored_session.py", line 1098, in run
    return self._sess.run(*args, **kwargs)
  File "/home/guak/tensorFlow/lib/python3.5/site-packages/tensorflow/python/training/monitored_session.py", line 1170, in run
    run_metadata=run_metadata)
  File "/home/guak/tensorFlow/lib/python3.5/site-packages/tensorflow/python/training/monitored_session.py", line 950, in run
    return self._sess.run(*args, **kwargs)
  File "/home/guak/tensorFlow/lib/python3.5/site-packages/tensorflow/python/client/session.py", line 905, in run
    run_metadata_ptr)
  File "/home/guak/tensorFlow/lib/python3.5/site-packages/tensorflow/python/client/session.py", line 1137, in _run
    feed_dict_tensor, options, run_metadata)
  File "/home/guak/tensorFlow/lib/python3.5/site-packages/tensorflow/python/client/session.py", line 1355, in _do_run
    options, run_metadata)
  File "/home/guak/tensorFlow/lib/python3.5/site-packages/tensorflow/python/client/session.py", line 1374, in _do_call
    raise type(e)(node_def, op, message)
tensorflow.python.framework.errors_impl.UnimplementedError: Cast string to float is not supported
     [[Node: linear/head/ToFloat = Cast[DstT=DT_FLOAT, SrcT=DT_STRING, _device="/job:localhost/replica:0/task:0/device:CPU:0"](linear/head/labels)]]

Caused by op 'linear/head/ToFloat', defined at:
  File "testingColumns.py", line 149, in <module>
    classifier.train(input_fn=lambda:input_fn(_TRAIN_FILE,50, 50, 20), steps=10)
  File "/home/guak/tensorFlow/lib/python3.5/site-packages/tensorflow/python/estimator/estimator.py", line 352, in train
    loss = self._train_model(input_fn, hooks, saving_listeners)
  File "/home/guak/tensorFlow/lib/python3.5/site-packages/tensorflow/python/estimator/estimator.py", line 812, in _train_model
    features, labels, model_fn_lib.ModeKeys.TRAIN, self.config)
  File "/home/guak/tensorFlow/lib/python3.5/site-packages/tensorflow/python/estimator/estimator.py", line 793, in _call_model_fn
    model_fn_results = self._model_fn(features=features, **kwargs)
  File "/home/guak/tensorFlow/lib/python3.5/site-packages/tensorflow/python/estimator/canned/linear.py", line 316, in _model_fn
    config=config)
  File "/home/guak/tensorFlow/lib/python3.5/site-packages/tensorflow/python/estimator/canned/linear.py", line 170, in _linear_model_fn
    logits=logits)
  File "/home/guak/tensorFlow/lib/python3.5/site-packages/tensorflow/python/estimator/canned/head.py", line 1100, in create_estimator_spec
    features=features, mode=mode, logits=logits, labels=labels))
  File "/home/guak/tensorFlow/lib/python3.5/site-packages/tensorflow/python/estimator/canned/head.py", line 1010, in create_loss
    labels = math_ops.to_float(labels)
  File "/home/guak/tensorFlow/lib/python3.5/site-packages/tensorflow/python/ops/math_ops.py", line 830, in to_float
    return cast(x, dtypes.float32, name=name)
  File "/home/guak/tensorFlow/lib/python3.5/site-packages/tensorflow/python/ops/math_ops.py", line 779, in cast
    return gen_math_ops.cast(x, base_type, name=name)
  File "/home/guak/tensorFlow/lib/python3.5/site-packages/tensorflow/python/ops/gen_math_ops.py", line 911, in cast
    "Cast", x=x, DstT=DstT, name=name)
  File "/home/guak/tensorFlow/lib/python3.5/site-packages/tensorflow/python/framework/op_def_library.py", line 787, in _apply_op_helper
    op_def=op_def)
  File "/home/guak/tensorFlow/lib/python3.5/site-packages/tensorflow/python/framework/ops.py", line 3271, in create_op
    op_def=op_def)
  File "/home/guak/tensorFlow/lib/python3.5/site-packages/tensorflow/python/framework/ops.py", line 1650, in __init__
    self._traceback = self._graph._extract_stack()  # pylint: disable=protected-access

UnimplementedError (see above for traceback): Cast string to float is not supported
     [[Node: linear/head/ToFloat = Cast[DstT=DT_FLOAT, SrcT=DT_STRING, _device="/job:localhost/replica:0/task:0/device:CPU:0"](linear/head/labels)]]
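
The failing node in the trace is linear/head/labels, so the string being cast to float appears to be the label (Traveler_type) rather than any of the feature columns. A minimal sketch of one possible workaround, assuming the classifier wants integer class ids: map each label string to its index in a fixed vocabulary. This is plain Python, not TensorFlow. `LABEL_VOCAB`, `LABEL_TO_ID` and `encode_label` are hypothetical names, and the vocabulary is copied from the traveler_type column defined above.

```python
# Vocabulary copied from the Traveler_type feature column in the post.
LABEL_VOCAB = ['Friends', 'Business', 'Families', 'Couples', 'Solo']
LABEL_TO_ID = {name: i for i, name in enumerate(LABEL_VOCAB)}


def encode_label(label):
    """Return the integer class id for a label string; raise on unknown labels."""
    try:
        return LABEL_TO_ID[label]
    except KeyError:
        raise ValueError('unknown Traveler_type: %r' % label)


# Labels taken from the first four sample rows.
sample_labels = ['Friends', 'Business', 'Families', 'Couples']
encoded = [encode_label(s) for s in sample_labels]
print(encoded)
```

tf.estimator.LinearClassifier also accepts n_classes and label_vocabulary arguments, which may be a more direct route for string labels. That has not been tried against this script.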

I forgot to add sample data:

5.0,Dec-Feb,Friends,NO,YES,NO,NO,YES,YES,Circus Circus Hotel & Casino Las Vegas,3.0,January
3.0,Dec-Feb,Business,NO,YES,NO,NO,YES,YES,Circus Circus Hotel & Casino Las Vegas,3.0,January
4.0,Dec-Feb,Families,NO,YES,NO,NO,YES,YES,Circus Circus Hotel & Casino Las Vegas,3.0,December
2.0,Dec-Feb,Couples,NO,YES,NO,NO,YES,YES,Circus Circus Hotel & Casino Las Vegas,3.0,December
4.0,Dec-Feb,Couples,YES,YES,NO,YES,YES,YES,Excalibur Hotel & Casino,3.0,January
4.0,Dec-Feb,Business,YES,YES,NO,YES,YES,YES,Excalibur Hotel & Casino,3.0,January
5.0,Dec-Feb,Couples,YES,YES,NO,YES,YES,YES,Excalibur Hotel & Casino,3.0,February
3.0,Dec-Feb,Business,YES,YES,NO,YES,YES,YES,Excalibur Hotel & Casino,3.0,February
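
The sample rows store Score and Hotel_stars as 5.0 and 3.0, while record_defaults in parse_csv uses [1] for both, which the debug output reports as int32. A small plain-Python check of that mismatch is sketched below. `fits_default` is a hypothetical helper, and the assumption that tf.decode_csv is equally strict about "5.0" versus an integer default has not been confirmed here.

```python
import csv


def fits_default(field, default):
    """Return True if the CSV field can be parsed as the type of the default."""
    try:
        type(default)(field)
        return True
    except ValueError:
        return False


# First sample row from the post.
line = ('5.0,Dec-Feb,Friends,NO,YES,NO,NO,YES,YES,'
        'Circus Circus Hotel & Casino Las Vegas,3.0,January')
fields = next(csv.reader([line]))

int_ok = fits_default(fields[0], 1)      # default [1]   -> integer column
float_ok = fits_default(fields[0], 1.0)  # default [1.0] -> float column
print(int_ok, float_ok)
```

If the same strictness applies in TensorFlow, float defaults such as [1.0] for Score and Hotel_stars would be the matching choice for this data.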
