
I am extracting .tar.gz files that contain folders (with files of many different extensions). I want to move all the .txt files from those folders into another folder, but I don't know the folders' names.

.txt files location ---> my_path/extracted/?unknown_name_folder?/file.txt

What I want ---> my_path/extracted/file.txt

My code:

import os
import tarfile

os.mkdir('extracted')
t = tarfile.open('xxx.tar.gz', 'r')
for member in t.getmembers():
    if ".txt" in member.name:
        t.extract(member, 'extracted')
    ###
Marta
  • @Noxeus I want to move all the files that end with '.txt' into the extracted folder. – Marta Apr 06 '18 at 11:34
  • If you run your code, what happens? I'm looking at your code but can't figure out why that wouldn't work. – Noxeus Apr 06 '18 at 11:34
  • @Noxeus For example, inside 'xxx.tar.gz' there is a single folder ('folder'), and inside 'folder' there are files like image.jpg, text.txt, text2.txt... If I run the code, I get my_path/extracted/folder/, and inside it: text.txt, text2.txt – Marta Apr 06 '18 at 11:37
  • Have you tried `t.extract(member, os.path.join(os.getcwd(), 'extracted'))`? I'm just guessing here. – Noxeus Apr 06 '18 at 11:41

1 Answer


I would try extracting the tar file first (See here)

import tarfile
tar = tarfile.open("xxx.tar.gz")
tar.extractall()
tar.close()

and then use the os.walk() method (See here)

import os
txt_files = []
for root, dirs, files in os.walk('.\\xxx\\'):
    # keep the full path of every file whose name ends in .txt
    txt_files += [os.path.join(root, name) for name in files if name.endswith('.txt')]

OR use the glob package to gather the txt files as suggested by @alper in the comments below:

import glob
txt_files = glob.glob('./**/*.txt', recursive=True)

This is untested, but should get you pretty close

Then move the files once you have the list of text files:

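new_path = ".\\extracted\\"
os.makedirs(new_path, exist_ok=True)  # make sure the target folder exists
for path in txt_files:
    name = os.path.basename(path)  # file name without its folder
    os.rename(path, os.path.join(new_path, name))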
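
If you would rather skip the extract-then-move step, something like the sketch below might work: it reads each .txt member straight out of the archive and writes it into the target folder under its bare file name, so the unknown folder never gets created. The function name `extract_txt_flat` and its parameters are made up for illustration, and it assumes no two folders hold a .txt file with the same name (a later one would overwrite an earlier one).

```python
import os
import shutil
import tarfile


def extract_txt_flat(archive_path, dest_dir):
    """Copy every .txt member of the archive into dest_dir, dropping subfolders."""
    os.makedirs(dest_dir, exist_ok=True)
    written = []
    with tarfile.open(archive_path, "r:gz") as tar:
        for member in tar.getmembers():
            # Skip directories, links, and anything that is not a .txt file.
            if not (member.isfile() and member.name.endswith(".txt")):
                continue
            # Keep only the file name, discarding the unknown folder prefix.
            target = os.path.join(dest_dir, os.path.basename(member.name))
            with tar.extractfile(member) as src, open(target, "wb") as dst:
                shutil.copyfileobj(src, dst)
            written.append(target)
    return written
```

Calling `extract_txt_flat('xxx.tar.gz', 'extracted')` should leave the text files directly under extracted/ and return the list of paths it wrote.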
Josh Wilkins