This is the final version of the answer after several attempts. The previous one was not useful, so I removed it instead of appending to it. Read to the end, since not every step may be needed for the final solution.
On to the topic: I would use Python. For a one-time task it may be overkill, but in any other case it pays off: you can log every step for later investigation, use regular expressions, and orchestrate commands by feeding them input and capturing and processing their output each time. All of these are quite easy in Python, provided you have it installed.
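To make that concrete, here is a minimal sketch of such a workflow: run a command, optionally feed it input, log what happened, and pick a value out of the output with a regex. The helper name run_and_log and the sample command are made up for this example; substitute whatever tool you need to drive.

```python
import logging
import re
import subprocess
import sys

logging.basicConfig(level=logging.INFO, format="%(asctime)s %(message)s")
log = logging.getLogger("orchestrate")


def run_and_log(args, input_text=None):
    """Run a command, log each step, and return its stdout as text.

    run_and_log is a hypothetical helper name, not part of any library.
    """
    log.info("running: %s", args)
    result = subprocess.run(
        args,
        input=input_text,
        stdout=subprocess.PIPE,
        stderr=subprocess.PIPE,
        universal_newlines=True,  # text mode; works on Python 3.5+
    )
    log.info("exit code: %d", result.returncode)
    if result.stderr:
        log.warning("stderr: %s", result.stderr.strip())
    return result.stdout


if __name__ == "__main__":
    # The child process is just the current interpreter printing its version.
    out = run_and_log([sys.executable, "-c", "import sys; print(sys.version)"])
    match = re.match(r"(\d+)\.(\d+)", out)
    if match:
        print("major/minor:", match.group(1), match.group(2))
```

Passing the command as a list avoids shell quoting problems, and the log gives you a trace to look at when something goes wrong later.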
Next, I'll describe how I set up the environment. Not every step is mandatory, but I went through several of them while trying to get things installed, and a description of the process may be useful in itself.
I have MinGW, the 32-bit version. It is not required for extracting 7zip archives, though. Once it is installed, go to C:\MinGW\bin and run mingw-get.exe:
- In Basic Setup I have msys-base installed (right click, mark for installation, then Apply changes from the Installation menu). That way I have bash, sed, grep, and many more.
- In All Packages there is mingw32-libarchive with dll as its class. Since the Python libarchive package is just a wrapper, you need this DLL so that there is an actual binary to wrap.
The examples are for Python 3. I'm using the 32-bit version, which you can download from the Python home page. I installed it in the default directory, which is an awkward location, so my advice is to install it in the root of your disk, as with MinGW.
One other thing: ConEmu is much better than the default console.
Installing packages in Python: pip is used for that. From your console go to the Python home directory, where there is a Scripts subdirectory. For me it is c:\Users\<<username>>\AppData\Local\Programs\Python\Python36-32\Scripts. You can search with, for instance, pip search archive, and install with pip install libarchive-c:
> pip.exe install libarchive-c
Collecting libarchive-c
Downloading libarchive_c-2.7-py2.py3-none-any.whl
Installing collected packages: libarchive-c
Successfully installed libarchive-c-2.7
After cd .. start python, and the new library should be importable:
>>> import libarchive
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
File "c:\Users\<<username>>\AppData\Local\Programs\Python\Python36-32\lib\site-packages\libarchive\__init__.py", line 1, in <module>
from .entry import ArchiveEntry
File "c:\Users\<<username>>\AppData\Local\Programs\Python\Python36-32\lib\site-packages\libarchive\entry.py", line 6, in <module>
from . import ffi
File "c:\Users\<<username>>\AppData\Local\Programs\Python\Python36-32\lib\site-packages\libarchive\ffi.py", line 27, in <module>
libarchive = ctypes.cdll.LoadLibrary(libarchive_path)
File "c:\Users\<<username>>\AppData\Local\Programs\Python\Python36-32\lib\ctypes\__init__.py", line 426, in LoadLibrary
return self._dlltype(name)
File "c:\Users\<<username>>\AppData\Local\Programs\Python\Python36-32\lib\ctypes\__init__.py", line 348, in __init__
self._handle = _dlopen(self._name, mode)
TypeError: LoadLibrary() argument 1 must be str, not None
So it fails. I tried to fix that, but then it failed with this:
>>> import libarchive
read format "cab" is not supported
read format "7zip" is not supported
read format "rar" is not supported
read format "lha" is not supported
read filter "uu" is not supported
read filter "lzop" is not supported
read filter "grzip" is not supported
read filter "bzip2" is not supported
read filter "rpm" is not supported
read filter "xz" is not supported
read filter "none" is not supported
read filter "compress" is not supported
read filter "all" is not supported
read filter "lzma" is not supported
read filter "lzip" is not supported
read filter "lrzip" is not supported
read filter "gzip" is not supported
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
File "c:\Users\<<username>>\AppData\Local\Programs\Python\Python36-32\lib\site-packages\libarchive\__init__.py", line 1, in <module>
from .entry import ArchiveEntry
File "c:\Users\<<username>>\AppData\Local\Programs\Python\Python36-32\lib\site-packages\libarchive\entry.py", line 6, in <module>
from . import ffi
File "c:\Users\<<username>>\AppData\Local\Programs\Python\Python36-32\lib\site-packages\libarchive\ffi.py", line 167, in <module>
c_int, check_int)
File "c:\Users\<<username>>\AppData\Local\Programs\Python\Python36-32\lib\site-packages\libarchive\ffi.py", line 92, in ffi
f = getattr(libarchive, 'archive_'+name)
File "c:\Users\<<username>>\AppData\Local\Programs\Python\Python36-32\lib\ctypes\__init__.py", line 361, in __getattr__
func = self.__getitem__(name)
File "c:\Users\<<username>>\AppData\Local\Programs\Python\Python36-32\lib\ctypes\__init__.py", line 366, in __getitem__
func = self._FuncPtr((name_or_ordinal, self))
AttributeError: function 'archive_read_open_filename_w' not found
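Both tracebacks come down to which DLL ctypes ends up loading. In the first one libarchive_path is None, so nothing was found at all. The second suggests a DLL was found but does not export a function the wrapper expects, which may point to a version mismatch. Here is a small diagnostic sketch. It assumes that libarchive-c checks the LIBARCHIVE environment variable first and then falls back to ctypes.util.find_library('archive'); locate_libarchive is a made-up helper name.

```python
import ctypes
import ctypes.util
import os


def locate_libarchive(environ=os.environ):
    """Return the path libarchive-c would likely try to load, or None.

    Assumption: the wrapper reads the LIBARCHIVE environment variable first
    and otherwise asks ctypes.util.find_library('archive').
    """
    return environ.get("LIBARCHIVE") or ctypes.util.find_library("archive")


if __name__ == "__main__":
    path = locate_libarchive()
    if path is None:
        print("No libarchive found; set LIBARCHIVE to the full path of the DLL.")
    else:
        print("libarchive candidate:", path)
        lib = ctypes.cdll.LoadLibrary(path)
        # ctypes raises AttributeError for missing exports, so hasattr works here.
        print("has archive_read_open_filename_w:",
              hasattr(lib, "archive_read_open_filename_w"))
```

Keep in mind that a 32-bit Python can only load a 32-bit DLL.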
I tried using the set command to provide the information directly, but that failed too... So I moved on to pylzma, for which MinGW is not needed. The pip install failed:
> pip.exe install pylzma
Collecting pylzma
Downloading pylzma-0.4.9.tar.gz (115kB)
100% |--------------------------------| 122kB 1.3MB/s
Installing collected packages: pylzma
Running setup.py install for pylzma ... error
Complete output from command c:\users\texxas\appdata\local\programs\python\python36-32\python.exe -u -c "import setuptools, tokenize;__file__='C:\\Users\\texxas\\AppData\\Local\\Temp\\pip-build-99t_zgmz\\pylzma\\setup.py';f=getattr(tokenize, 'open', open)(__file__);code=f.read().replace('\r\n', '\n');f.close();exec(compile(code, __file__, 'exec'))" install --record C:\Users\texxas\AppData\Local\Temp\pip-ffe3nbwk-record\install-record.txt --single-version-externally-managed --compile:
running install
running build
running build_py
creating build
creating build\lib.win32-3.6
copying py7zlib.py -> build\lib.win32-3.6
running build_ext
adding support for multithreaded compression
building 'pylzma' extension
error: Microsoft Visual C++ 14.0 is required. Get it with "Microsoft Visual C++ Build Tools": http://landinghub.visualstudio.com/visual-cpp-build-tools
It failed again, but this one is easy: I installed Visual Studio Build Tools 2015, and then it worked. I have 7-Zip installed, so I created a sample archive. Finally I can start python and do:
from py7zlib import Archive7z
f = open(r"C:\Users\texxas\Desktop\try.7z", 'rb')
a = Archive7z(f)
a.filenames
And I got an empty list. Looking closer gave a better understanding: empty files are not considered by pylzma, just to make you aware of that. After putting one character into each of my sample files, the last line gives:
>>> a.filenames
['try/a/test.txt', 'try/a/test1.txt', 'try/a/test2.txt', 'try/a/test3.txt', 'try/a/test4.txt', 'try/a/test5.txt', 'try/a/test6.txt', 'try/a/test7.txt', 'try/b/test.txt', 'try/b/test1.txt', 'try/b/test2.txt', 'try/b/test3.txt', 'try/b/test4.txt', 'try/b/test5.txt', 'try/b/test6.txt', 'try/b/test7.txt', 'try/c/test.txt', 'try/c/test1.txt', 'try/c/test11.txt', 'try/c/test2.txt', 'try/c/test3.txt', 'try/c/test4.txt', 'try/c/test5.txt', 'try/c/test6.txt', 'try/c/test7.txt']
So... the rest is a piece of cake. And this is actually part of the original post:
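import os
import py7zlib

for folder, subfolders, files in os.walk('.'):
    for file in files:
        if not file.endswith('.7z'):
            continue
        # A 7z archive: extract the .py members from it.
        path = os.path.join(folder, file)
        try:
            with open(path, 'rb') as f:
                arch = py7zlib.Archive7z(f)
                for name in arch.getnames():
                    if not name.endswith('.py'):
                        continue
                    # py7zlib has no extract(); read the member and write it out.
                    target = os.path.join('dest', name)
                    os.makedirs(os.path.dirname(target), exist_ok=True)
                    with open(target, 'wb') as out:
                        out.write(arch.getmember(name).read())
        except py7zlib.FormatError as e:
            print('file ' + path)
            print(str(e))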
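One thing worth adding when you write archive members to disk yourself: entry names come from the archive, so a name like ../something could land outside the destination directory. Here is a minimal sketch of a guard, assuming member names use forward slashes as in the listing earlier; safe_target is a hypothetical helper name, not part of py7zlib.

```python
import os


def safe_target(dest_dir, member_name):
    """Map an archive member name to a path inside dest_dir.

    Raises ValueError if the name would escape dest_dir (for example '../x').
    safe_target is a hypothetical helper name for this sketch.
    """
    dest_root = os.path.abspath(dest_dir)
    # Split on '/' so the name is joined with the local path separator.
    target = os.path.abspath(os.path.join(dest_root, *member_name.split("/")))
    if os.path.commonpath([dest_root, target]) != dest_root:
        raise ValueError("unsafe member name: " + member_name)
    return target


if __name__ == "__main__":
    print(safe_target("dest", "try/a/test.py"))
    try:
        safe_target("dest", "../evil.py")
    except ValueError as e:
        print("rejected:", e)
```

You would call it in place of a plain os.path.join when computing where each member goes.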
As a side note, Anaconda is a great tool, but a full install takes 500+ MB, which is far too much for this.
Also, let me share a piece of my wmctrl.py tool from my GitHub:
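from subprocess import getoutput  # Python 3 location of getoutput

# active and stored come from the rest of the tool (window id and saved geometry).
cmd = 'wmctrl -ir ' + str(active.window) + \
    ' -e 0,' + str(stored.left) + ',' + str(stored.top) + ',' + str(stored.width) + ',' + str(stored.height)
print(cmd)
res = getoutput(cmd)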
That way you can orchestrate different commands; here it is wmctrl. The output can then be processed further in Python.
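The same call can also be written without string concatenation. This is only a sketch of an alternative, not how my tool does it: build_move_command and move_window are made-up names, and running move_window needs wmctrl installed.

```python
import subprocess


def build_move_command(window_id, left, top, width, height):
    """Build the wmctrl argument list that moves and resizes one window.

    Mirrors the string built above: -i treats the window argument as a
    numeric id, -r selects the window, -e takes gravity,x,y,width,height.
    """
    geometry = "0,{},{},{},{}".format(left, top, width, height)
    return ["wmctrl", "-i", "-r", str(window_id), "-e", geometry]


def move_window(window_id, left, top, width, height):
    """Run wmctrl and return (exit code, combined output)."""
    args = build_move_command(window_id, left, top, width, height)
    result = subprocess.run(args, stdout=subprocess.PIPE,
                            stderr=subprocess.STDOUT, universal_newlines=True)
    return result.returncode, result.stdout


if __name__ == "__main__":
    # Only prints the command; call move_window() to actually run it.
    print(build_move_command(12345, 10, 20, 800, 600))
```

Keeping the command building separate from the execution makes it easy to log or test the command before running it.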