I'm trying to parse a file like this:

textfile.txt

_=1406048396605
bh=1244
bw=1711
c=24
c19=DashboardScreen
c2=2014-07-22T10:00:00-0700
c4=64144090210294
c40=3#undefined#0#0#a=-2#512#-1#0
c41=14060470498427c3e4ed
c46=Green|firefox|Firefox|30|macx|Mac OS X
c5=NONFFA
c6=HGKhjgj
c7=OFF_SEASON|h:PARTIAL|
ch=YHtgsfT
g=https://google.hello.com
h5=77dbf90c-5794-4a40-b1ab-fe1c82440c68-1406048401346
k=true
p=Shockwave Flash;QuickTime Plug-in 7.7.3;Default Browser Helper;SharePoint Browser Plug-in;Java Applet Plug-in;Silverlight Plug-In
pageName=DashboardScreen - Loading...
pageType= 
pe=lnk_o
pev2=pageDetail
s=2432x1520
server=1.1 pqalmttws301.ie.google.net:81
t=22/06/2014 10:00:00 2 420
v12=3468337910
v4=0
v9=dat=279333:279364:375870:743798:744035:743802:744033:743805:783950:783797:783949:784088
vid=29E364C5051D2894-400001468000F0EE

into something like this:

_=1406048396605<CONTROL_CHARACTER_HERE>bh=1244<CONTROL_CHARACTER_HERE>bw=1711<CONTROL_CHARACTER_HERE>c=24<CONTROL_CHARACTER_HERE>c19=DashboardScreen<CONTROL_CHARACTER_HERE>c2=2014-07-22T10:00:00-0700.....etc

So I'm basically taking a multi-line file and turning it into a single-line file, delimiting each field with a CONTROL_CHARACTER.

This is what I currently have:

private String putIntoExpectedFormat() { 

    File f1 = new File("InputFile.txt");
    File f2 = new File("OutputFile.txt"); 

    InputStream in = new FileInputStream(f1);
    OutputStream out = new FileOutputStream(f2); 

    StringBuilder sb = new StringBuilder();

    byte[] buf = new byte[1024];
    int len;

    while( (len=in.read(buf)) > 0) {



        out.write(buf,0,len);
    }

    in.close();
    out.close();

}

I'm not even sure if I'm doing this right. Does anybody know how to do this?

ajb
mosawi

4 Answers

2

Since it's a text file, you should use the Reader classes, which are designed for reading character streams. For better performance, use BufferedReader:

Reads text from a character-input stream, buffering characters so as to provide for the efficient reading of characters, arrays, and lines.

You can use the Java 7 try-with-resources statement.

Sample code:

try (BufferedReader reader = new BufferedReader(new FileReader(
        new File("InputFile.txt")));
     BufferedWriter writer = new BufferedWriter(new FileWriter(
        new File("OutputFile.txt")))) {
    String line = null;
    while ((line = reader.readLine()) != null) {
        writer.write(line);
        // write your <CONTROL_CHARACTER_HERE> as well
    }
}
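
For completeness, here is one way the loop above could write the delimiter between lines. This is only a minimal sketch: the class name LineJoiner, the method joinLines, and the choice of Ctrl-A (U+0001) as the control character are assumptions for illustration, so substitute whatever character your format expects.

```java
import java.io.BufferedReader;
import java.io.BufferedWriter;
import java.io.FileReader;
import java.io.FileWriter;
import java.io.IOException;
import java.io.Reader;
import java.io.Writer;

public class LineJoiner {
    // Assumed delimiter: Ctrl-A (U+0001). Replace with the control character you need.
    static final char DELIMITER = '\u0001';

    // Copies every line from 'in' to 'out', writing DELIMITER between lines
    // (not after the last one), so the output is a single line.
    static void joinLines(Reader in, Writer out) throws IOException {
        BufferedReader reader = new BufferedReader(in);
        BufferedWriter writer = new BufferedWriter(out);
        String line;
        boolean first = true;
        while ((line = reader.readLine()) != null) {
            if (!first) {
                writer.write(DELIMITER);
            }
            writer.write(line);
            first = false;
        }
        // Push buffered characters through to the underlying writer.
        writer.flush();
    }

    public static void main(String[] args) throws IOException {
        // Usage: java LineJoiner InputFile.txt OutputFile.txt
        if (args.length != 2) {
            System.err.println("usage: java LineJoiner <input> <output>");
            return;
        }
        try (Reader in = new FileReader(args[0]);
             Writer out = new FileWriter(args[1])) {
            joinLines(in, out);
        }
    }
}
```

Taking a Reader and a Writer rather than file names keeps the joining logic easy to try out with in-memory strings before pointing it at real files.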
Braj
  • but I don't want to remove my thanks. Your comments are valid, and there is no need to remove them because, as I mentioned in my last comment, I have edited it. – Braj Jul 22 '14 at 17:55
1

The easiest way is to use Scanner and PrintWriter:

    Scanner in = null;
    PrintWriter out = null;
    try {
        // init input, output
        in = new Scanner(new File("InputFile.txt"));
        out = new PrintWriter(new File("OutputFile.txt"));
        // read input file line by line
        while (in.hasNextLine()) {
            out.print(in.nextLine());
            if (in.hasNextLine()) {
                out.print("<CONTROL_CHARACTER>");
            }
        }
    } finally {
        // close input, output
        if (in != null) {
            in.close();
        }
        if (out != null) {
            out.close();
        }
    }
Aleksei Shestakov
1

Here are three pieces of code which will read the file, replace all newlines with your <CONTROL_CHARACTER>, and then write the file.

Read the file:

public static String readFile(String filePath) {
    // StringBuilder avoids re-copying the whole text on every appended line
    StringBuilder entireFile = new StringBuilder();

    File file = new File(filePath);

    if (file.exists()) {
        // try-with-resources closes the reader even if reading fails
        try (BufferedReader br = new BufferedReader(new FileReader(file))) {
            String line;
            while ((line = br.readLine()) != null) {
                entireFile.append(line).append("\n");
            }
        } catch (IOException e) {
            // also covers FileNotFoundException, which is a subclass of IOException
            e.printStackTrace();
        }
    } else {
        System.err.println("File " + filePath + " does not exist!");
    }

    return entireFile.toString();
}

Change newlines to <Control-Character>:

String text = readFile("Path/To/file.txt");
text = text.replace("\n", <Control-Character-Here>);
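
As a concrete illustration of the replacement step, the sketch below assumes the control character is Ctrl-A (U+0001); that choice, the class name NewlineReplacer, and the helper toSingleLine are hypothetical. Because readFile appends "\n" after every line, including the last, the text ends with a newline, so the sketch strips it first to avoid a stray delimiter at the end of the result.

```java
public class NewlineReplacer {
    // Assumed delimiter: Ctrl-A (U+0001). Use whichever control character you need.
    static final String DELIMITER = "\u0001";

    // Replaces each "\n" with DELIMITER, dropping one trailing newline first
    // so the result does not end with a delimiter.
    static String toSingleLine(String text) {
        if (text.endsWith("\n")) {
            text = text.substring(0, text.length() - 1);
        }
        return text.replace("\n", DELIMITER);
    }

    public static void main(String[] args) {
        String joined = toSingleLine("bh=1244\nbw=1711\nc=24\n");
        // Ctrl-A is non-printing, so show it as a visible marker.
        // prints: bh=1244<CTRL-A>bw=1711<CTRL-A>c=24
        System.out.println(joined.replace(DELIMITER, "<CTRL-A>"));
    }
}
```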

Write the file:

writeToFile("Path/to/newfile.txt", text);

Here is the method writeToFile():

public static void writeToFile(String filePath, String toWrite) {
    File file  = new File(filePath);
    if (!file.exists()) {
        try {
            file.createNewFile();
        } catch (IOException e) {
            // TODO Auto-generated catch block
            System.err.println(filePath + " does not exist. Failed to create new file");
        }
    }

    try {
        PrintWriter out = new PrintWriter(new BufferedWriter(new FileWriter(filePath, true)));
        out.println(toWrite);
        out.close();
    } catch (IOException e) {
        System.err.println("Could not write to file: " + filePath);
    }
}
Zach
0
  • Uses Guava 17.0.
  • Suitable for small files; not tested on large or very large files. I assume, based on the question, that the expected input file is small.
  • We are not processing individual lines here, so reading line by line is not required.

One more way, using the Guava IO libraries:

    public static void main(String[] args) {
        try {
            String content = Files.toString(new File("/home/chandrayya/InputFile.txt"), Charsets.UTF_8);//Change charset accordingly
            content = content.replaceAll("\r\n"/*\r\n windows format, \n UNIX/OSX format \r old mac format*/, "<C>"/*C is control character.*/);
            Files.write(content, new File("/home/chandrayya/OutputFile.txt.txt"), Charsets.UTF_8 );
        } catch( IOException e ) {
            e.printStackTrace();
        }
    }
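
The replaceAll call above only matches Windows line endings. If the input may use any of the three conventions mentioned in its comment, a single alternation can cover them all. The following is a sketch using only the standard library, not Guava; the class name AnyLineEnding and the method joinLines are made up for illustration, and Ctrl-A (U+0001) is an assumed delimiter.

```java
public class AnyLineEnding {
    // Assumed delimiter: Ctrl-A (U+0001). Swap in the control character you need.
    static final String DELIMITER = "\u0001";

    // Replaces every line ending (Windows, UNIX/OS X, or old Mac) with DELIMITER.
    // Note: a line ending at the very end of the content also becomes a delimiter.
    static String joinLines(String content) {
        // The two-character Windows ending is listed first so it is replaced
        // by one delimiter rather than two.
        return content.replaceAll("\r\n|\r|\n", DELIMITER);
    }

    public static void main(String[] args) {
        String joined = joinLines("k=true\r\npe=lnk_o\nv4=0");
        // Ctrl-A is non-printing, so show it as a visible marker.
        System.out.println(joined.replace(DELIMITER, "<CTRL-A>"));
    }
}
```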
Chandrayya G K