I am getting an error while running TensorFlow on a TPU:

UnimplementedError: File system scheme '[local]' not implemented (file: '1.png')

I know this question has been answered before, but my issue is different. I get this error when I run:

for i, j in train_dataset.take(3):
    print(i,j)

Calling train_dataset.take(3) on its own, without the loop, works.

Here are my functions:

def decode(img,image_size=(IMG_SIZE, IMG_SIZE)):
    bits = tf.io.read_file(img)
    image = tf.image.decode_jpeg(bits, channels=3)
    image = tf.cast(image, tf.float32) / 255.0
    image = tf.image.resize(image, image_size)
    
    image = tf.image.random_flip_left_right(image, seed=2020)
    image = tf.image.random_flip_up_down(image, seed=2020)
    image = tf.image.random_crop(image,size=[IMG_SIZE,IMG_SIZE,3],seed=2020 )
    image = tf.image.random_brightness(image,max_delta=0.5 )
    image = tf.image.rot90(image)
    return image

def decode_image(img,labels=None ):
    if labels is None:
        return decode(img)
    else:
        return decode(img),labels 

train_image=tf.data.Dataset.from_tensor_slices((train.iloc[:,0],train.iloc[:,1::] ))
train_dataset=train_image.map(decode_image, num_parallel_calls=AUTO).repeat().shuffle(512).batch(BATCH_SIZE).prefetch(AUTO)
test_image=tf.data.Dataset.from_tensor_slices((test.iloc[:,0]))
test_dataset=test_image.map(decode_image, num_parallel_calls=AUTO).batch(BATCH_SIZE)

How should I resolve it?

It might be an issue with the paths, so I am adding how I set them. This is how the directory looks:

weights
images
-train
--train
---train
----img1
----img2
---csv
-val
--val
---img1

When I run

GCS_DS_PATH = KaggleDatasets().get_gcs_path('images')
!gsutil ls $GCS_DS_PATH

I got the following:

gs://kds-aab923e1c9bc934f088881f1e537365b8f18fe192b3b3dc14e272a37/train/
gs://kds-aab923e1c9bc934f088881f1e537365b8f18fe192b3b3dc14e272a37/val/

This is how my paths are set

def train_format_path(st):
    return GCS_DS_PATH + '/train/train/train/' + st 

def test_format_path(st):
    return GCS_DS_PATH + '/val/val/' + st 

train_paths = train.ID.apply(train_format_path).values
test_paths = test.ID.apply(test_format_path).values

With train_paths[0] I got

'gs://kds-aab923e1c9bc934f088881f1e537365b8f18fe192b3b3dc14e272a37/train/train/train/1.png'
Talha Anwar
  • @TakhaAnswar - Your picture is different from the link you provided, (I do not know about if this is any wrong or if it does work) but - check, can re-try this at the same picture, but as another extension? maybe the extension in the link you provided: jpg instead of your "1.png" ("1.jpg" maybe could work?) – William Martens Feb 28 '21 at 22:28
  • Yes, I know, but Stack Overflow mods often point to similar posts, so I am saying in advance that my issue is different. And it's `.png`. – Talha Anwar Feb 28 '21 at 22:33
  • @TalhaAnswar Yeah; No worries what so ever! :) I just wondered if you have tried another extension? (does .jpg work - or is it the same problem there too?) sorry if I was unclear, will follow this post. – William Martens Feb 28 '21 at 22:37
  • So you are saying I should first convert the data to jpg? – Talha Anwar Feb 28 '21 at 22:39
  • Well, yeah - or at least begin (as number 1) to just rename the file "1.png" to "1.jpg" and try it again, if that does not work; You can try as you yourself suggested - convert the png to jpg, and then try it (This is what I would've done in this scenario) – William Martens Feb 28 '21 at 22:40
  • Yes, I have. It might be an issue with the path, so I added how I set the path. – Talha Anwar Feb 28 '21 at 22:50
  • Okay, (For me it's really late, so I'll go) but ill just say that; I will try this on my front tomorrow, since one of my friends (not so long ago) actually had a very similar problem, (was also on kaggle, surprisingly!). For now, //Best wishes from Sweden! – William Martens Feb 28 '21 at 22:53
  • I don't quite understand. You say your problem is with `train_dataset.take(3)`, but I don't see that in your code. And are you saying that this call works unless you put it in a `for` loop and then simply print the resulting pair of values from each iteration? If so, then there's only one explanation I can think of that makes sense. – CryptoFool Feb 28 '21 at 22:58
  • Yes, it works without the loop and does not work with the loop. And if I ignore it for now, I get the same error at the `model.fit()` call. – Talha Anwar Feb 28 '21 at 23:00
  • If it's really that simple, then the only explanation I can think of is "back pressure". That is, in the absence of the `for` loop asking for values, there is code that isn't running because it isn't being asked to provide values. I know this concept better in Java, but I can see that Python "generators" might have this same characteristic. That's the only explanation I can see making sense. The only other one, which is pretty wild, is that calling `print()` on the values somehow changes the iteration. That is possible if the `__repr__` method of some object being printed has a side effect. – CryptoFool Feb 28 '21 at 23:06
  • Nope, I don't think it has to do with print, because I forked someone's notebook and did the same thing, and it worked. – Talha Anwar Feb 28 '21 at 23:07
  • ah...ok. You might try assigning the result to a variable and then converting that to a list (it may be a list already, but you want to make sure). That should ensure that all values are generated. If that fails also, then it has to be a backpressure thing. – CryptoFool Feb 28 '21 at 23:09
  • I noticed that you have: `train_image=tf.data.Dataset.from_tensor_slices((train.iloc[:,0],train.iloc[:,1::] ))` but it looks like your changes to the path should work. Does it work if you replace `train.iloc...` with `train_paths.iloc...`? – Allen Wang Mar 08 '21 at 23:09
  • @AllenWang thanks. Looks like this is the issue. – Talha Anwar Mar 09 '21 at 13:37
  • @Talha Anwar, Is your issue resolved after changing `train.iloc` with `train_paths.iloc` ? –  Mar 12 '21 at 12:27
  • @TFer2 yes, I added the solution in the comment as suggested by AllenWang – Talha Anwar Mar 12 '21 at 12:46

1 Answer

As suggested by @Allen Wang, the solution is to pass train_paths (the full gs:// paths) instead of the first column of train as the image paths.
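
The likely reason this helps: on a TPU, the input pipeline cannot read from the notebook's local file system, so a bare file name such as 1.png is rejected while a gs:// path is accepted. A quick check before building the dataset can catch this early. The sketch below is plain Python; find_non_gcs_paths and the example-bucket name are hypothetical and not part of my original code.

```python
def find_non_gcs_paths(paths):
    """Return the entries that do not look like Google Cloud Storage URIs."""
    return [p for p in paths if not str(p).startswith("gs://")]


# Hypothetical stand-ins for train.iloc[:, 0] and train_paths from the question.
bare_names = ["1.png", "2.png"]
gcs_paths = ["gs://example-bucket/train/train/train/" + name for name in bare_names]

print(find_non_gcs_paths(bare_names))  # every bare file name is flagged
print(find_non_gcs_paths(gcs_paths))   # nothing is flagged
```

If the first call returns anything, those entries would probably trigger the same "File system scheme '[local]' not implemented" error once the dataset is iterated.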

This is what I changed to make it work:

    train_image=tf.data.Dataset.from_tensor_slices((train_paths,train.iloc[:,1::] ))
    train_dataset=train_image.map(decode_image, num_parallel_calls=AUTO).repeat().shuffle(512).batch(BATCH_SIZE).prefetch(AUTO)
    test_image=tf.data.Dataset.from_tensor_slices((test_paths))
    test_dataset=test_image.map(decode_image, num_parallel_calls=AUTO).batch(BATCH_SIZE)
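
This also seems to explain why train_dataset.take(3) on its own did not fail: tf.data pipelines are lazy, so no file is read until something iterates over the dataset (the for loop, or model.fit()). Below is a plain-Python analogy using a generator, not TensorFlow itself; read_file and lazy_map are hypothetical stand-ins used only to illustrate the behaviour.

```python
def read_file(path):
    # Stand-in for a file read on a TPU: only gs:// paths are readable here.
    if not path.startswith("gs://"):
        raise OSError(f"File system scheme '[local]' not implemented (file: '{path}')")
    return b"fake image bytes"


def lazy_map(fn, items):
    # Generator: fn is not called until the caller iterates.
    for item in items:
        yield fn(item)


pipeline = lazy_map(read_file, ["1.png"])  # no error yet, nothing has been read

try:
    for _ in pipeline:  # iterating triggers the read, and with it the error
        pass
    failed = False
except OSError as err:
    failed = True
    print(err)
```

Building the pipeline succeeds either way; the bad path only surfaces during iteration, which matches what I saw.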
Talha Anwar