I am working on a project with TensorFlow Federated. I have managed to use the simulation libraries provided by TensorFlow Federated to load, train, and test some datasets.
For example, I load the EMNIST dataset:
emnist_train, emnist_test = tff.simulation.datasets.emnist.load_data()
The datasets returned by load_data() are instances of tff.simulation.ClientData. This interface lets me iterate over client IDs and select subsets of the data for simulations.
len(emnist_train.client_ids)
3383
emnist_train.element_type_structure
OrderedDict([('pixels', TensorSpec(shape=(28, 28), dtype=tf.float32, name=None)), ('label', TensorSpec(shape=(), dtype=tf.int32, name=None))])
example_dataset = emnist_train.create_tf_dataset_for_client(
    emnist_train.client_ids[0])
I am trying to load the fashion_mnist dataset with Keras to perform some federated operations:
fashion_train, fashion_test = tf.keras.datasets.fashion_mnist.load_data()
but I get this error:
AttributeError: 'tuple' object has no attribute 'element_spec'
because Keras returns tuples of NumPy arrays instead of a tff.simulation.ClientData as before. The failing code is:
def tff_model_fn() -> tff.learning.Model:
    return tff.learning.from_keras_model(
        keras_model=factory.retrieve_model(True),
        input_spec=fashion_test.element_spec,
        loss=loss_builder(),
        metrics=metrics_builder())

iterative_process = tff.learning.build_federated_averaging_process(
    tff_model_fn, Parameters.server_adam_optimizer_fn, Parameters.client_adam_optimizer_fn)
server_state = iterative_process.initialize()
To sum up:

Is there any way to create a (train, test) tuple of tff.simulation.ClientData objects from the tuples of NumPy arrays that Keras returns?

Another solution that comes to mind is to use tff.simulation.HDF5ClientData and manually load the appropriate files in HDF5 format (train.h5, test.h5) to get the tff.simulation.ClientData, but my problem is that I can't find the URL for fashion_mnist in HDF5 format. I mean something like this, for both train and test:

fileprefix = 'fed_emnist_digitsonly'
sha256 = '55333deb8546765427c385710ca5e7301e16f4ed8b60c1dc5ae224b42bd5b14b'
filename = fileprefix + '.tar.bz2'
path = tf.keras.utils.get_file(
    filename,
    origin='https://storage.googleapis.com/tff-datasets-public/' + filename,
    file_hash=sha256,
    hash_algorithm='sha256',
    extract=True,
    archive_format='tar',
    cache_dir=cache_dir)

dir_path = os.path.dirname(path)
train_client_data = hdf5_client_data.HDF5ClientData(
    os.path.join(dir_path, fileprefix + '_train.h5'))
test_client_data = hdf5_client_data.HDF5ClientData(
    os.path.join(dir_path, fileprefix + '_test.h5'))

return train_client_data, test_client_data
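For the first option, one step I believe would be needed either way is assigning examples to synthetic clients, because the Keras arrays carry no client IDs. Below is a minimal sketch of that step in plain Python. The function name `partition_indices`, the `client_{i}` ID format, the choice of 100 clients, and the shuffled (IID) split are all my own assumptions for illustration, not anything taken from TFF.

```python
import random
from collections import OrderedDict


def partition_indices(num_examples, num_clients, seed=0):
    """Shuffle example indices and deal them out to synthetic clients.

    Returns an OrderedDict mapping a hypothetical client ID string to a
    sorted list of example indices. Every index appears exactly once.
    """
    if num_clients <= 0:
        raise ValueError("num_clients must be positive")
    indices = list(range(num_examples))
    random.Random(seed).shuffle(indices)
    shards = OrderedDict()
    for i in range(num_clients):
        # Strided slicing keeps shard sizes within one example of each other.
        shards[f"client_{i}"] = sorted(indices[i::num_clients])
    return shards


# The Fashion-MNIST training split has 60,000 examples;
# 100 clients is an arbitrary choice for this sketch.
shards = partition_indices(num_examples=60000, num_clients=100)
print(len(shards), len(shards["client_0"]))  # 100 600
```

Each index list could then be used to slice the NumPy arrays (for example `x_train[idx]` and `y_train[idx]`) and the slices passed to `tf.data.Dataset.from_tensor_slices`, with `pixels` and `label` keys if the EMNIST element structure should be mirrored. What I am still missing is the supported way to wrap such per-client data as a tff.simulation.ClientData.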
My final goal is to make the fashion_mnist dataset work with TensorFlow Federated learning.