
I am working with Google Drive in Python using fsspec to perform various operations like listing and downloading files and directories. However, I have encountered a challenge when dealing with items that share the same name. For example, there might be a file and a directory both named "example.txt" in different locations. When using fsspec, I find it difficult to differentiate between these items solely based on their names. I need a way to distinguish between same-named files and directories while working with Google Drive using fsspec.

I have attempted to list the contents of directories using fsspec and then loop through the results to handle the files and directories accordingly. However, since items with the same name appear identical in the listing, I couldn't reliably tell them apart. As a result, my attempts to download specific files or navigate to directories were not successful.

I am looking for guidance on how to address this issue and implement a solution that allows me to differentiate between same-named files and directories effectively using fsspec in Python.

muhammad ali e

1 Answer


Let me start by saying that gdrivefs is nowhere near as mature and complete as other fsspec backends. Furthermore, the Google Drive API itself is much less amenable to being viewed as a filesystem: it allows identically named files (some of which may be directories), each potentially with versions.

However, fsspec does attempt to distinguish directories from files, so if you do .ls(..., detail=True), each entry should have a "type" field you can rely on, and if a path name appears multiple times, each one will have different details.
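As a minimal sketch of that idea, the helpers below group a detailed listing by name and then by the "type" field, so a file and a directory that share a name end up in separate buckets. The function names are hypothetical, and the only assumption about the listing is that each entry is a dict with "name" and "type" keys, as in the output of `.ls(..., detail=True)`.

```python
from collections import defaultdict


def group_by_name_and_type(entries):
    """Group detailed listing entries by "name", then by "type"."""
    grouped = defaultdict(lambda: defaultdict(list))
    for entry in entries:
        grouped[entry["name"]][entry["type"]].append(entry)
    # Convert the nested defaultdicts to plain dicts for predictable lookups.
    return {name: dict(by_type) for name, by_type in grouped.items()}


def ambiguous_names(entries):
    """Return the sorted names that appear more than once in the listing."""
    counts = defaultdict(int)
    for entry in entries:
        counts[entry["name"]] += 1
    return sorted(name for name, count in counts.items() if count > 1)


# Usage, assuming `fs` is an already constructed fsspec filesystem instance:
#   entries = fs.ls("root/testFolder", detail=True)
#   for name in ambiguous_names(entries):
#       print(name, group_by_name_and_type(entries)[name].keys())
```

This separates a file from a same-named directory, but it cannot separate two entries that share both name and type; those remain ambiguous when addressed by path.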

I don't happen to have any file/directory conflicts in my personal space, so I'm not sure what functionality breaks for that case.

mdurant
  • In my GDrive, I have encountered a situation where there are two files with the same name (`test1`) and identical sizes. To obtain the list of files along with their details, I used `fsspec.ls` with `detail=True`, resulting in the following output:

    ```
    [{'type': 'file', 'name': 'root/testFolder/test1', 'size': 1024, 'checksum': None},
     {'type': 'file', 'name': 'root/testFolder/test1', 'size': 1024, 'checksum': None}]
    ```

    As you can see, both files have the same name and path. Now, I am uncertain about which file will be downloaded if I attempt to do so using the provided name. – muhammad ali e Jul 28 '23 at 04:19
  • Thanks @mdurant, I will check the gdrivefs code. – muhammad ali e Jul 28 '23 at 04:34
  • It opens the first item found. In GoogleDriveFile, you can see that it's only the "id" field that really matters, which is in the file listing, but there is no API for selecting the one you want. – mdurant Jul 28 '23 at 13:29
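
To illustrate the last comment, here is a rough sketch of telling same-named entries apart by their "id" values. It assumes each listing entry carries an "id" key, as that comment suggests; the listing quoted earlier in the thread does not show one, so the code uses `.get()` and treats a missing id as unknown. The function names and the sample ids are hypothetical. This only identifies entries; per the comment, gdrivefs offers no API for opening a specific one by id.

```python
from collections import defaultdict


def ids_by_name(entries):
    """Map each listed name to the "id" values of the entries sharing it."""
    ids = defaultdict(list)
    for entry in entries:
        # "id" is assumed to be present; None marks entries without one.
        ids[entry["name"]].append(entry.get("id"))
    return dict(ids)


def entry_for_id(entries, file_id):
    """Return the single entry whose "id" equals file_id, else raise."""
    matches = [entry for entry in entries if entry.get("id") == file_id]
    if len(matches) != 1:
        raise LookupError(f"expected one entry with id {file_id!r}, found {len(matches)}")
    return matches[0]


# Hypothetical listing: two files with the same path but different ids.
listing = [
    {"type": "file", "name": "root/testFolder/test1", "size": 1024, "id": "id-A"},
    {"type": "file", "name": "root/testFolder/test1", "size": 1024, "id": "id-B"},
]
duplicates = {name: ids for name, ids in ids_by_name(listing).items() if len(ids) > 1}
```

With the ids in hand, the distinct items can at least be recorded or compared, even though path-based calls will still resolve to the first match.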