
I have a string of roughly 200 characters, including letters and symbols. I would like to compress this string using any algorithm.

Please help me with any kind of program, code, or algorithm.

Thanks in advance.

Currently I am using the code below, but when the string contains symbols it throws an array index out of bounds exception.

**COMPRESSION**

    // toValue(char) is a helper that is not shown here.
    byte[] encode(String txt, int bit) {
        int length = txt.length();
        // The output is sized as tmpRet1 bytes for every tmpRet2 characters.
        float tmpRet1 = 0, tmpRet2 = 0;
        if (bit == 6) {
            tmpRet1 = 3.0f;
            tmpRet2 = 4.0f;
        } else if (bit == 5) {
            tmpRet1 = 5.0f;
            tmpRet2 = 8.0f;
        }
        byte encoded[] = new byte[(int) (tmpRet1 * Math.ceil(length / tmpRet2))];
        char str[] = new char[length];
        txt.getChars(0, length, str, 0);
        int chaVal = 0;
        String temp;
        String strBinary = new String("");
        for (int i = 0; i < length; i++) {
            temp = Integer.toBinaryString(toValue(str[i]));
            // Left-pad with zeros to a multiple of `bit` digits.
            while (temp.length() % bit != 0) {
                temp = "0" + temp;
            }
            strBinary = strBinary + temp;
        }
        // Right-pad with zeros to a whole number of bytes.
        while (strBinary.length() % 8 != 0) {
            strBinary = strBinary + "0";
        }
        Integer tempInt = new Integer(0);
        for (int i = 0; i < strBinary.length(); i = i + 8) {
            tempInt = tempInt.valueOf(strBinary.substring(i, i + 8), 2);
            encoded[i / 8] = tempInt.byteValue();
        }
        return encoded;
    }



**DECOMPRESSION**

    // toChar(int) is a helper that is not shown here.
    String decode(byte[] encoded, int bit) {
        String strTemp = new String("");
        String strBinary = new String("");
        String strText = new String("");
        Integer tempInt = new Integer(0);
        int intTemp = 0;
        for (int i = 0; i < encoded.length; i++) {
            // Java bytes are signed; map negative values back to 128..255.
            if (encoded[i] < 0) {
                intTemp = (int) encoded[i] + 256;
            } else {
                intTemp = (int) encoded[i];
            }
            strTemp = Integer.toBinaryString(intTemp);
            // Left-pad with zeros to 8 digits.
            while (strTemp.length() % 8 != 0) {
                strTemp = "0" + strTemp;
            }
            strBinary = strBinary + strTemp;
        }
        // Read the bit string back in groups of `bit` digits.
        for (int i = 0; i < strBinary.length(); i = i + bit) {
            tempInt = tempInt.valueOf(strBinary.substring(i, i + bit), 2);
            strText = strText + toChar(tempInt.intValue());
        }
        return strText;
    }

rolling.stones
  • Where exactly (at which line number) are you getting the exception? – Aman J Oct 03 '12 at 07:24
  • Note: Never use `new String("");`, it's unnecessarily complicated and inefficient. Just use `String strTemp = "";` instead. Likewise, don't use `new Integer(0);`, but just `0`. In fact, `tempInt` should have been an `int` instead of an `Integer`. – Jesper Oct 03 '12 at 07:55

2 Answers


Once, while I was studying, my teacher had me code a text compressor (cool homework). The basic idea was: if each character takes 8 bits, find the characters that appear most often and assign them shorter codes, while assigning longer codes to the characters that appear less often.

Example:

A = 01010101, B = 10101010

Uncompressed: AAAB - 01010101 01010101 01010101 10101010

Compressed:

A appears 3 times (should have a shorter representation). B appears 1 time (should have a longer representation).

A - 01

B - 10

Result: 01 01 01 10

So, you generate a sequence of bits for each letter such that no letter's code is a prefix of another letter's code. Then you store that generated scheme (the code table) in the compressed file. To decompress, read the scheme from the compressed file and then read the data bit by bit.

Look here for details: http://web.stonehill.edu/compsci//LC/TEXTCOMPRESSION.htm
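
The idea described above (frequent characters get shorter codes, and no code is a prefix of another) is what Huffman coding does. Here is a minimal Java sketch of it, not a definitive implementation. The class name `HuffmanSketch` and its methods are made up for illustration. It assumes a non-empty input, keeps the result as a `String` of `'0'`/`'1'` characters for readability, and leaves out both packing the bits into bytes and storing the code table next to the data.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.PriorityQueue;

// Hypothetical illustration class; the names are not from any library.
class HuffmanSketch {

    // Tree node: a leaf carries a symbol, an internal node carries two children.
    static final class Node implements Comparable<Node> {
        final char symbol;
        final int freq;
        final Node left, right;

        Node(char symbol, int freq, Node left, Node right) {
            this.symbol = symbol;
            this.freq = freq;
            this.left = left;
            this.right = right;
        }

        boolean isLeaf() { return left == null; }

        @Override
        public int compareTo(Node other) { return Integer.compare(freq, other.freq); }
    }

    // Count character frequencies, then repeatedly merge the two rarest nodes.
    // Assumes text is non-empty.
    static Node buildTree(String text) {
        Map<Character, Integer> freq = new HashMap<>();
        for (char c : text.toCharArray()) {
            freq.merge(c, 1, Integer::sum);
        }
        PriorityQueue<Node> queue = new PriorityQueue<>();
        for (Map.Entry<Character, Integer> e : freq.entrySet()) {
            queue.add(new Node(e.getKey(), e.getValue(), null, null));
        }
        while (queue.size() > 1) {
            Node a = queue.poll();
            Node b = queue.poll();
            queue.add(new Node('\0', a.freq + b.freq, a, b));
        }
        return queue.poll();
    }

    // Walk the tree: going left appends '0', going right appends '1'.
    static void collectCodes(Node node, String prefix, Map<Character, String> out) {
        if (node.isLeaf()) {
            // A text with a single distinct character still needs a 1-bit code.
            out.put(node.symbol, prefix.isEmpty() ? "0" : prefix);
            return;
        }
        collectCodes(node.left, prefix + "0", out);
        collectCodes(node.right, prefix + "1", out);
    }

    static String encode(String text, Map<Character, String> codes) {
        StringBuilder bits = new StringBuilder();
        for (char c : text.toCharArray()) {
            bits.append(codes.get(c));
        }
        return bits.toString();
    }

    // Read bit by bit; every time a leaf is reached, emit its symbol and restart.
    static String decode(String bits, Node root) {
        StringBuilder text = new StringBuilder();
        if (root.isLeaf()) {
            for (int i = 0; i < bits.length(); i++) {
                text.append(root.symbol);
            }
            return text.toString();
        }
        Node node = root;
        for (char bit : bits.toCharArray()) {
            node = (bit == '0') ? node.left : node.right;
            if (node.isLeaf()) {
                text.append(node.symbol);
                node = root;
            }
        }
        return text.toString();
    }

    public static void main(String[] args) {
        String text = "AAAB";
        Node root = buildTree(text);
        Map<Character, String> codes = new HashMap<>();
        collectCodes(root, "", codes);
        String bits = encode(text, codes);
        System.out.println(codes + " -> " + bits + " -> " + decode(bits, root));
    }
}
```

Because the decoder needs the same tree (or code table) as the encoder, a real format has to ship that table with the data. For a string of only about 200 characters, that overhead can eat a large part of the savings.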

alexandernst
  • See [Huffman coding](http://en.wikipedia.org/wiki/Huffman_coding) for a slightly more sophisticated version of this idea. – Jesper Oct 03 '12 at 07:53
  • Huffman coding is a good call for me! :) Any links to Java code for Huffman coding would be helpful. – rolling.stones Oct 03 '12 at 08:17
  • Look at http://rosettacode.org/wiki/Huffman_coding#Java and http://algs4.cs.princeton.edu/55compression/Huffman.java.html – alexandernst Oct 03 '12 at 08:58

You could use a GZIPOutputStream for compression and a GZIPInputStream for decompression.

If you want to do it in memory, just use a ByteArrayInputStream/ByteArrayOutputStream as a target for the two classes above.

See the link below:

http://docs.oracle.com/javase/1.5.0/docs/api/java/util/zip/GZIPOutputStream.html
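
As a rough illustration of the in-memory approach, here is a minimal sketch that wraps `GZIPOutputStream` and `GZIPInputStream` around byte-array streams. The class name `GzipStringSketch` and its two methods are made up for this example. It assumes the text is encoded as UTF-8, and the decompressing side has to use the same charset.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.util.zip.GZIPInputStream;
import java.util.zip.GZIPOutputStream;

// Hypothetical illustration class; only the java.io / java.util.zip calls are real API.
class GzipStringSketch {

    // Compress a string into gzip-format bytes held in memory.
    static byte[] compress(String text) throws IOException {
        ByteArrayOutputStream buffer = new ByteArrayOutputStream();
        // Closing the GZIPOutputStream finishes the stream and writes the gzip trailer.
        try (GZIPOutputStream gzip = new GZIPOutputStream(buffer)) {
            gzip.write(text.getBytes(StandardCharsets.UTF_8));
        }
        return buffer.toByteArray();
    }

    // Decompress gzip-format bytes back into the original string.
    static String decompress(byte[] compressed) throws IOException {
        ByteArrayOutputStream buffer = new ByteArrayOutputStream();
        try (GZIPInputStream gzip =
                 new GZIPInputStream(new ByteArrayInputStream(compressed))) {
            byte[] chunk = new byte[1024];
            int n;
            while ((n = gzip.read(chunk)) != -1) {
                buffer.write(chunk, 0, n);
            }
        }
        return new String(buffer.toByteArray(), StandardCharsets.UTF_8);
    }

    public static void main(String[] args) throws IOException {
        String text = "some text with symbols !@#$% some text with symbols !@#$%";
        byte[] packed = compress(text);
        System.out.println(text.length() + " chars -> " + packed.length + " bytes");
        System.out.println(decompress(packed).equals(text));
    }
}
```

The gzip format adds a header and trailer of at least 18 bytes, so for strings of around 200 characters the gain depends heavily on how repetitive the text is. It is worth measuring on real inputs before committing to it.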

Tudor Vintilescu
  • I appreciate the response, but using GZIPOutputStream is not applicable, because I want to use the same decompression method from another language or application, e.g. Delphi, and there I will not be able to find the same functionality. – rolling.stones Oct 03 '12 at 07:55
  • gzip is a standard compression format, and implementations are found on virtually any platform and language. A short search retrieved this: http://stackoverflow.com/questions/8598145/how-to-decode-gzip-data – Tudor Vintilescu Oct 03 '12 at 08:18
  • +10 for this :) Awesome search! That kind of solves my issue, if it is also available in Delphi. – rolling.stones Oct 03 '12 at 08:23
  • Please help me with a link to a working piece of Delphi code for GZIP. New and very important question: if I compress a file in Java using GZIP and read it in Delphi code, will it retrieve the same data? – rolling.stones Oct 03 '12 at 08:29
  • The GZIP compression/decompression algorithm doesn't have anything to do with the language it is implemented in (given a correct implementation :) ). I'll try to find a sample. – Tudor Vintilescu Oct 03 '12 at 08:35
  • Thanks :) If you find one, that would be great. – rolling.stones Oct 03 '12 at 10:01