I'm trying to extract multiple files from some .zip archives. My code is:

import os
import zipfile

os.chdir('/home/marlon/Shift One/Projeto Philips/Consolidação de Arquivos')

for f in os.listdir("/home/marlon/Shift One/Projeto Philips/Consolidação de Arquivos"):
    if f.endswith(".zip"):
        z = zipfile.ZipFile(f, 'r')
        z.extractall(path = '/home/marlon/Shift One/Projeto Philips/Consolidação de Arquivos/dados')
        z.close()

However, it only extracts the files inside the first archive. I'm using Python 3.6. What is wrong?

  • Possible duplicate of [Unzip all zipped files in a folder to that same folder using Python 2.7.5](https://stackoverflow.com/questions/31346790/unzip-all-zipped-files-in-a-folder-to-that-same-folder-using-python-2-7-5) – zamir May 18 '19 at 18:03
  • @aaz I'm using Python 3.6 – Marlon Henrique Teixeira May 18 '19 at 18:08
  • Do you get an error message? If so, add it to the question (not in a comment). First, use `print()` to check the values of `f`: maybe the files have the extension `.ZIP` instead of `.zip`. You can also check whether the code inside the `if` statement is executed, and whether the files inside the .zip archives share the same names, in which case one file may overwrite another. – furas May 18 '19 at 18:12
  • @furas no error message. I printed it and it output just one .zip archive. There are nine of them. – Marlon Henrique Teixeira May 18 '19 at 18:20
  • Do you call `print()` before `if f.endswith(".zip"):` or inside it? If you get one file before the `if`, then it seems there is only one file in the folder. If you get one file inside the `if`, then maybe the others don't have the extension `.zip` but `.ZIP`, or there is a space at the end, as in `.zip `. – furas May 18 '19 at 18:24
  • @furas when I print it just before `if f.endswith(".zip"):` I get everything inside the folder, including the 9 .zip archives. Printing after the whole gives me only one archive. Running the code with .ZIP doesn't do anything. – Marlon Henrique Teixeira May 18 '19 at 18:29
  • So it seems your files may have a space in the extension: `".zip "`. You will not see the space when you print the name, but for Python it is not `".zip"`. You can `print(f + "<")` to see if there is a space before `<`. Or you have some other non-printing character, e.g. a tab. – furas May 18 '19 at 18:44
  • Or maybe you have files with extensions like `.Zip`. For Linux and Python (and any other programming language or program) that is a different extension from `.zip` and `.ZIP`. You would have to check `f.lower().endswith(".zip")`. – furas May 18 '19 at 18:57
  • Thanks @furas. Appreciate it. I’ll check it out! – Marlon Henrique Teixeira May 18 '19 at 19:08
  • Do your archives contain files having same names? This is one situation that could lead to your scenario. – CristiFati May 18 '19 at 20:11
  • @CristiFati that is the case. Sorry, I read your comment too quickly and thought you meant “all the archives have the same names”. All the files inside the multiple .zip archives have the same name. – Marlon Henrique Teixeira May 20 '19 at 22:46
  • If all the archives had the same name, there couldn't be more than one of them :) – CristiFati May 20 '19 at 23:23
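
The filename checks suggested in the comments above can be sketched as follows. This is only an illustration: the names in `names` are made up, and `is_zip_name` is a hypothetical helper, not part of the original code. Printing with `repr()` exposes trailing spaces or tabs that a plain `print()` hides.

```python
def is_zip_name(name):
    """Return True if `name` ends with .zip, ignoring case and
    leading/trailing whitespace (hypothetical helper for illustration)."""
    return name.strip().lower().endswith(".zip")


# Hypothetical directory listing; in the real script this would come
# from os.listdir(...).
names = ["a.zip", "b.ZIP", "c.Zip", "d.zip ", "notes.txt"]

for name in names:
    # repr() shows the quotes, so a trailing space becomes visible.
    print(repr(name), name.endswith(".zip"), is_zip_name(name))

strict_matches = [n for n in names if n.endswith(".zip")]
matches = [n for n in names if is_zip_name(n)]
```

Only the check is normalized here; the original, unmodified name would still have to be passed to `zipfile.ZipFile`.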

1 Answer

I thought this scenario might be a serious candidate ...

What happens is that every .zip file is processed and all of its members are extracted, but because the members have the same names in every archive, each extraction overwrites the files written by the previous one. In the end, you are left with only the files from the last archive that was enumerated, which looks as if just one archive had been extracted.
To get past this issue, extract the members of each .zip file into a separate directory (named after the .zip file).
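
The overwrite can be reproduced in isolation with a minimal sketch. Everything here is made up for the demonstration: the archive names `a.zip` and `b.zip`, the member name `data.txt`, and the temporary working directory. It is not the asker's data, just the same situation in miniature.

```python
import os
import tempfile
import zipfile


def make_archive(path, member_name, content):
    """Create a .zip at `path` holding a single text member."""
    with zipfile.ZipFile(path, "w") as zf:
        zf.writestr(member_name, content)


with tempfile.TemporaryDirectory() as work_dir:
    out_dir = os.path.join(work_dir, "dados")

    # Two hypothetical archives whose only member has the same name.
    for arc_base, content in (("a.zip", "from a"), ("b.zip", "from b")):
        arc_path = os.path.join(work_dir, arc_base)
        make_archive(arc_path, "data.txt", content)
        with zipfile.ZipFile(arc_path) as zf:
            zf.extractall(path=out_dir)  # same target for both archives

    extracted = sorted(os.listdir(out_dir))
    with open(os.path.join(out_dir, "data.txt")) as fh:
        surviving = fh.read()

print(extracted)  # ['data.txt']
print(surviving)  # from b
```

Both archives are extracted without any error, yet only one `data.txt` remains, and it holds the content of the archive extracted last.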

Below is an example (I also simplified / cleaned your code a bit).

code00.py:

#!/usr/bin/env python3

import os
import glob
import zipfile


dir_name_base = r"/home/marlon/Shift One/Projeto Philips/Consolidação de Arquivos"

for arc_name in glob.iglob(os.path.join(dir_name_base, "*.zip")):
    arc_dir_name = os.path.splitext(os.path.basename(arc_name))[0]
    # The context manager closes the archive once extraction is completed
    with zipfile.ZipFile(arc_name) as zf:
        zf.extractall(path=os.path.join(dir_name_base, "dados", arc_dir_name))
CristiFati
  • I'd say you weren't that far off. After all, you managed to enumerate the archives and extract their contents... it just happened that the contents were the same :) – CristiFati May 20 '19 at 23:30
  • Let me ask you something. I’m a beginner, as you’ve probably seen. What’s the meaning of “[0]”? Thanks – Marlon Henrique Teixeira May 21 '19 at 15:04
  • It's the first element of the sequence (`os.path.splitext` returns a tuple). Try playing in the *Python* console: `os.path.splitext("file.txt")`, then `os.path.splitext("file.txt")[0]`, and you'll see what I mean (you have to `import os` first). – CristiFati May 21 '19 at 16:11