13

I have tried to use Zlib to decompress the file, but it just said "Data error" and gave me an empty file.

This is the code I tried:

// Open a new temp file to write new file to
$tempFile = fopen("tempFile", "w");
// Make sure tempFile is empty
ftruncate($tempFile, 0);

// Write new decompressed file 
fwrite($tempFile, zlib_decode(file_get_contents($path))); // $path = absolute path to data.tar.Z

// close temp file
fclose($tempFile);

I have also tried to decompress it in stages, going from .tar.Z to .tar to the individual files. I tried using LZW functions to strip the .Z layer, but I was unable to make them work. Is there a way to do this?

EDIT: Here is some more code I have tried, this time checking that file_get_contents was working. I still get a "data error".

$tempFile = fopen("tempFile.tar", "w");
// Make sure tempFile is empty
ftruncate($tempFile, 0);

// Write new decompressed file 
$contents = file_get_contents($path);
if ($contents) {
    fwrite($tempFile, gzuncompress($contents));
}

// close temp file
fclose($tempFile);

EDIT2: I think the reason LZW was not working is that the contents of the .tar.Z file look like this:

��3dЀ��0p���a�
H�H��ŋ3j��@�6l�

The LZW functions I have tried both use ASCII characters in their dictionaries. What kind of characters are these?

UndoingTech
  • Do you want this to be done by pure php? Are you running in a linux server? Do you have the ability to run `exec` or `shell_exec`? – Kishor Feb 09 '16 at 19:51
  • I cannot use `exec` or `shell_exec`, and I prefer to use pure PHP. – UndoingTech Feb 09 '16 at 19:58
  • don't chain functions like that. check if `file_get_contents()` was actually able to read that file. if it can't, it'll return boolean `false`, which would then just be blindly passed on to zlib_decode(), which of course can't decode a `false` - boolean false in string context is an empty string. – Marc B Feb 09 '16 at 20:00
  • Thank you, Marc B, for the suggestion. The file was readable and neither `gzuncompress` nor `zlib_decode` worked. I don't know what to do. – UndoingTech Feb 09 '16 at 20:29
  • "What characters are these?" - they're no particular characters, it's binary data. If you open any data file in a text editor, it will try to look at each 8 bits as a character, but that's not really relevant. A compression format is going to squeeze the data into as few bits as possible, so it makes sense the result won't look like text. – IMSoP Feb 10 '16 at 13:47
  • can you upload a sample file? – hanshenrik Feb 10 '16 at 14:04
  • I am getting these files from the NOAA database: [here.](ftp://ftp.ncdc.noaa.gov/pub/data/hourly_precip-3240/05/) – UndoingTech Feb 10 '16 at 14:06
  • feel free to port the compress algorithm, or do what everyone else would do, call uncompress / tar with something like exec() / system() / proc_open() -- https://github.com/vapier/ncompress/blob/ncompress-4.2.4/compress42.c – hanshenrik Feb 10 '16 at 15:09
  • Are you sure that the archive is valid? Do you manage to unzip it in command line? – max Feb 12 '16 at 00:25
  • I've been using a [FTP](ftp://ftp.ncdc.noaa.gov/pub/data/hourly_precip-3240/05/) to get the file, and then I can use WinZip to open it and see the data. I tried using `gzinflate`, `gzuncompress`, and `zlib_decode`. All gave me a "data error". – UndoingTech Feb 12 '16 at 13:57
  • https://packagist.org/packages/wapmorgan/unified-archive – Bogdan Burym Feb 12 '16 at 17:46
  • Unified Archive is a nice library because it gives you a consistent API for all the archive types, but it does still rely on the relevant PHP extensions being installed in order for it to be able to actually work for any given file type. – Simba Feb 16 '16 at 16:02
  • @UndoingTech If the file has been compressed with *LZW* compression, `gzinflate` et al won't work as they use *gzip* compression. See my answer for a full explanation. – quickshiftin Feb 16 '16 at 17:04
  • I checked this morning `tar -xZf 3240_05_1948-1998.tar.Z` (one of the files from the link you provided) works. I've updated my answer once more. – quickshiftin Feb 17 '16 at 14:33
  • @UndoingTech can you explain why you "cannot use exec or shell_exec" ? PHP is a scripting language, and is designed to be used in conjunction with other scripting tools, like shell scripts. Can you use a bash script, for example, to perform your decompression, and then call the PHP script to perform a particular action on its contents? Can your PHP script execute a worker process to perform the decompression? – Todd Feb 17 '16 at 15:30
  • @UndoingTech I created a new php extension for you buddy. Take a look at my new answer. – quickshiftin Feb 18 '16 at 13:39
  • Have you seen this ? Link : [Click me :)](http://stackoverflow.com/questions/9416508/php-untar-gz-without-exec) – pegas Feb 18 '16 at 16:42
  • @pegas `PharData` does not support LZW compression – quickshiftin Feb 18 '16 at 17:43

4 Answers

4

So you want to decompress a taz file natively with PHP? Give my new extension a try!

lzw_decompress_file('3240_05_1948-1998.tar.Z', '3240_05_1948-1998.tar');
$archive = new PharData('3240_05_1948-1998.tar'); // same path the decompressed tar was written to
mkdir('unpacked');
$archive->extractTo('unpacked');

Also note that the zlib functions fail because they only decode DEFLATE-based data (zlib and gzip streams), whereas a .Z file is compressed with LZW.
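
One way to confirm which kind of compression a file uses, before picking a decoder, is to look at its first bytes. The sketch below is only an illustration: `describe_compressed_header()` is a hypothetical helper, not part of zlib or of the extension above. It relies on the usual signatures, `1F 9D` for files written by `compress` and `1F 8B` for gzip, and on the third byte of a `.Z` header carrying the maximum code width and the block-mode flag.

```php
<?php
// Hypothetical helper: identify a compressed stream from its leading bytes.
function describe_compressed_header(string $bytes): string
{
    if (strlen($bytes) < 2) {
        return 'too short to identify';
    }

    $magic = substr($bytes, 0, 2);

    if ($magic === "\x1f\x9d") {
        // compress (.Z) header: the third byte stores the maximum code
        // width in its low five bits and the block-mode flag in its top bit.
        if (strlen($bytes) < 3) {
            return 'LZW (.Z), truncated header';
        }
        $flags     = ord($bytes[2]);
        $maxBits   = $flags & 0x1f;
        $blockMode = ($flags & 0x80) !== 0;

        return sprintf(
            'LZW (.Z), max %d-bit codes, block mode %s',
            $maxBits,
            $blockMode ? 'on' : 'off'
        );
    }

    if ($magic === "\x1f\x8b") {
        return 'gzip';
    }

    return 'unknown';
}

// Example with literal header bytes instead of a real file.
echo describe_compressed_header("\x1f\x9d\x90"), "\n"; // LZW (.Z), max 16-bit codes, block mode on
echo describe_compressed_header("\x1f\x8b\x08"), "\n"; // gzip
```

For a real file, passing the first three bytes (for example from `fread()` on a handle opened in `'rb'` mode) is enough; there is no need to load the whole archive.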

quickshiftin
1

According to https://kb.iu.edu/d/acsy, you can try:

<?php

$file = '/tmp/archive.z';
// quote the path so spaces or shell metacharacters cannot break the command
shell_exec('uncompress ' . escapeshellarg($file));

If you don't have a Unix-like OS, check https://kb.iu.edu/d/abck for an appropriate program.

Cosmin Ordean
  • OP has stated he is not interested in an external source "I cannot use exec or shell_exec, and I prefer to use pure PHP", and besides, you may as well run this through `tar` while you're at it. – quickshiftin Feb 17 '16 at 21:27
0

The file is compressed with LZW compression. I tried a few approaches, but there seems to be no reliable method for decompressing it in pure PHP. Cosmin's answer contains the correct first step, but after using your system's uncompress utility to decompress the file, you still have to extract the TAR file. This can be done with PharData, one of PHP's built-in tools for handling its PHAR archives.

// the file we're getting
$url = "ftp://ftp.ncdc.noaa.gov/pub/data/hourly_precip-3240/05/3240_05_2011-2011.tar.Z";
// where to save it
$output_dir = ".";
// get a temporary file name
$tempfile = sys_get_temp_dir() . DIRECTORY_SEPARATOR . basename($url);
// get the file
$compressed_data = file_get_contents($url);
if (empty($compressed_data)) {
    echo "error getting $url";
    exit;
}
// save it to a local file
$result = file_put_contents($tempfile, $compressed_data);
if (!$result) {
    echo "error saving data to $tempfile";
    exit;
}
// run the system uncompress utility
exec("/usr/bin/env uncompress " . escapeshellarg($tempfile), $foo, $return);
if ($return == 0) {
    // uncompress strips the .Z off the filename
    $tempfile = preg_replace('/\.Z$/', "", $tempfile);
    // remove .tar from the filename for use as a directory
    $tempdir = preg_replace('/\.tar$/', "", basename($tempfile));
    try {
        // extract the tar file
        $tarchive = new PharData($tempfile);
        $tarchive->extractTo("$output_dir/$tempdir");
        // loop through the files
        $dir = new DirectoryIterator("$output_dir/$tempdir");
        foreach ($dir as $file) {
            if (!$file->isDot()) {
                echo $file->getFilename() . "\n";
            }
        }
    } catch (Exception $e) {
        echo "Caught exception untarring: " . $e->getMessage();
        exit;
    }
} else {
    echo "uncompress returned error code $return";
    exit;
}
miken32
  • If you've decided to use the external `compress` program why would you then use `PharData`? May as well just do it entirely on the command line at that point and save yourself some typing and performance... – quickshiftin Feb 16 '16 at 13:04
  • True, mostly just for catching errors in a more granular fashion. – miken32 Feb 16 '16 at 15:23
  • There is now [a reliable method](https://github.com/quickshiftin/lzw-ext/blob/master/README.md) for decompressing these in PHP :) – quickshiftin Feb 18 '16 at 15:49
-2

Please try this.

 <?php
    try {
        $phar = new PharData('myphar.tar');
        $phar->extractTo('/full/path'); // extract all files
        $phar->extractTo('/another/path', 'file.txt'); // extract only file.txt
        $phar->extractTo('/this/path',
            array('file1.txt', 'file2.txt')); // extract 2 files only
        $phar->extractTo('/third/path', null, true); // extract all files, and overwrite
    } catch (Exception $e) {
        // handle errors
    }
    ?>

Source: http://php.net/manual/en/phardata.extractto.php. I haven't tested it, but I hope it will work for you.

Manoj Kumar