Read a file line by line in reverse order

Question

I have a Java EE application in which a servlet prints a log file created with log4j. When reading log files you are usually looking for the most recent entries, so the servlet would be much more useful if it printed the log file in reverse order. My current code is:

    response.setContentType("text/plain");
    PrintWriter out = response.getWriter();
    try {
        FileReader logReader = new FileReader("logfile.log");
        try {
            BufferedReader buffer = new BufferedReader(logReader);
            for (String line = buffer.readLine(); line != null; line = buffer.readLine()) {
                out.println(line);
            }
        } finally {
            logReader.close();
        }
    } finally {
        out.close();
    }

The implementations I've found on the internet use a StringBuffer and load the whole file before printing. Isn't there a lightweight way to seek to the end of the file and read the content back toward the start of the file?

Try the skip() method with a max int. It'll tell you how far it actually skipped. Then subtract some amount from that value, and skip to there. Then read in the remaining amount, and parse that buffer. — Marvo, May 15 '11 at 21:56
possible duplicate of [Java: Quickly read the last line of a text file?](http://stackoverflow.com/questions/686231/java-quickly-read-the-last-line-of-a-text-file) In particular, look at @Jon Skeet's answer which links to a similar question for C#. — Stephen C, May 15 '11 at 22:31
See also: http://stackoverflow.com/questions/4121678/java-read-last-n-lines-of-a-huge-file — Stephen C, May 15 '11 at 22:34
@Marvo: `skip()` returns how far it skipped because it won't necessarily skip as far as you ask. It could actually try to skip as far as you asked, though, and fail. — ColinD, May 15 '11 at 23:10
Simply printing the logfile lines in reverse order isn't a good idea. What happens, for example, if you have a multi-line log entry ... such as an exception? — Anon, May 16 '11 at 12:29

Nathan Ryan · Accepted Answer · 2011-05-16T21:13:47.597

[EDIT]

By request, I am prepending this answer with the sentiment of a later comment: If you need this behavior frequently, a "more appropriate" solution is probably to move your logs from text files to database tables with DBAppender (part of log4j 2). Then you could simply query for latest entries.

[/EDIT]

I would probably approach this slightly differently than the answers listed.

(1) Create a subclass of Writer that writes the encoded bytes of each character in reverse order:

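import java.io.IOException;
import java.io.OutputStream;
import java.io.Writer;
import java.nio.ByteBuffer;
import java.nio.charset.Charset;

public class ReverseOutputStreamWriter extends Writer {
    private final OutputStream out;
    private final Charset encoding;
    public ReverseOutputStreamWriter(OutputStream out, Charset encoding) {
        this.out = out;
        this.encoding = encoding;
    }
    @Override
    public void write(int ch) throws IOException {
        // Cast to char first; String.valueOf(int) would produce the number's decimal digits.
        ByteBuffer bytes = this.encoding.encode(String.valueOf((char) ch));
        // Write the encoded bytes in reverse order to this.out.
        for (int i = bytes.limit() - 1; i >= 0; i--) {
            this.out.write(bytes.get(i));
        }
    }
    // other overridden methods: write(char[], int, int), flush(), close()
}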

(2) Create a subclass of log4j WriterAppender whose createWriter method would be overridden to create an instance of ReverseOutputStreamWriter.

(3) Create a subclass of log4j Layout whose format method returns the log string in reverse character order:

public class ReversePatternLayout extends PatternLayout {
    // constructors
    public String format(LoggingEvent event) {
        return new StringBuilder(super.format(event)).reverse().toString();
    }
}

(4) Modify my logging configuration file to send log messages to both the "normal" log file and a "reverse" log file. The "reverse" log file would contain the same log messages as the "normal" log file, but each message would be written backwards. (Note that the encoding of the "reverse" log file would not necessarily conform to UTF-8, or even any character encoding.)

(5) Create a subclass of InputStream that wraps an instance of RandomAccessFile in order to read the bytes of a file in reverse order:

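import java.io.File;
import java.io.IOException;
import java.io.InputStream;
import java.io.RandomAccessFile;

public class ReverseFileInputStream extends InputStream {
    private final RandomAccessFile in;
    private final byte[] buffer;
    // The index of the next byte to read.
    private int bufferIndex;
    // File offset where the bytes buffered so far begin; everything before it is unread.
    private long position;
    public ReverseFileInputStream(File file) throws IOException {
        this.in = new RandomAccessFile(file, "r");
        this.buffer = new byte[4096];
        this.bufferIndex = this.buffer.length;
        this.position = this.in.length();
    }
    private void populateBuffer() throws IOException {
        // record the old position
        long oldPosition = this.position;
        int count = (int) Math.min(this.buffer.length, oldPosition);
        if (count == 0) {
            return; // start of file reached; read() will report end of stream
        }
        // seek to a new, previous position
        this.position = oldPosition - count;
        this.in.seek(this.position);
        // read from the new position to the old position into the tail of the buffer
        int offset = this.buffer.length - count;
        this.in.readFully(this.buffer, offset, count);
        // reverse the bytes that were just read
        for (int i = offset, j = this.buffer.length - 1; i < j; i++, j--) {
            byte tmp = this.buffer[i];
            this.buffer[i] = this.buffer[j];
            this.buffer[j] = tmp;
        }
        this.bufferIndex = offset;
    }
    @Override
    public int read() throws IOException {
        if (this.bufferIndex == this.buffer.length) {
            populateBuffer();
            if (this.bufferIndex == this.buffer.length) {
                return -1;
            }
        }
        // InputStream.read() must return 0..255; a raw byte may be negative.
        return this.buffer[this.bufferIndex++] & 0xFF;
    }
    @Override
    public void close() throws IOException {
        this.in.close();
    }
}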

Now if I want to read the entries of the "normal" log file in reverse order, I just need to create an instance of ReverseFileInputStream, giving it the "reverse" log file.
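
To see why reading the "reverse" log back to front recovers the original messages newest first, here is a minimal in-memory sketch. It works on characters rather than encoded bytes, and the class and method names (DoubleReversalDemo, readBackwards) are made up for illustration; they are not part of log4j.

```java
import java.util.List;

final class DoubleReversalDemo {
    /** Builds the "reverse" log in memory, then reads it back to front. */
    static String readBackwards(List<String> messages) {
        StringBuilder reverseLog = new StringBuilder();
        for (String message : messages) {
            // Stand-in for ReversePatternLayout: each message is stored backwards.
            reverseLog.append(new StringBuilder(message).reverse());
        }
        // Stand-in for ReverseFileInputStream: the whole log is consumed from its end.
        return reverseLog.reverse().toString();
    }
}
```

Calling readBackwards with the messages "first\n", "second\n", "third\n" returns "third\nsecond\nfirst\n": each message is reversed twice and so reads normally, while the order of the messages is reversed only once.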

This definitely is an interesting answer but I think is too code heavy for my approach and I don't like the idea of have two log files. — eliocs, May 16 '11 at 06:44
@eliocs: Definitely understand about simple solutions and not duplicating the log data. If you need this behavior frequently, a "more appropriate" solution is probably to move your logs from text files to database tables with `DBAppender` (part of log4j 2). Then you could simply query for latest entries. — Nathan Ryan, May 16 '11 at 08:03
I will use a DBAppender this way I can easily add a procedure that purges log periodically and can take advantage on later stages when I need to search the logs. — eliocs, May 17 '11 at 07:02

score 11 · Answer 2 · edited Jun 29 '15 at 16:24

This is an old question. I also wanted to do the same thing, and after some searching I found that Apache Commons IO has a class for this:

org.apache.commons.io.input.ReversedLinesFileReader

answered Sep 26 '14 at 09:20 by Chathurika Sandarenu · edited Jun 29 '15 at 16:24 by Mindwin Remember Monica

Even if the link points to the class, I would add the name of the class in your answer as the link could break in the future. Great find by the way! – eliocs Sep 26 '14 at 09:36
Here's the Maven artifact for this: http://mvnrepository.com/artifact/commons-io/commons-io/2.4 – Renato Aug 16 '15 at 15:40

score 4 · Answer 3 · answered May 15 '11 at 21:37

I think a good choice for this would be the RandomAccessFile class. There is some sample code for reading backwards with this class on this page. Reading bytes this way is easy; reading strings, however, might be a bit more challenging.
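
As a rough illustration of the idea (not the sample code the answer links to), the sketch below fetches only the last line of a file: it scans backwards for a line feed byte by byte, then decodes the byte range in one go. It assumes a UTF-8 file, and the names LastLine, lastLine and byteAt are hypothetical. Seeking once per byte keeps the code short but is slow; a real implementation would read backwards in blocks.

```java
import java.io.File;
import java.io.IOException;
import java.io.RandomAccessFile;
import java.nio.charset.StandardCharsets;

final class LastLine {
    /** Returns the last line of a UTF-8 file without reading the whole file. */
    static String lastLine(File file) throws IOException {
        try (RandomAccessFile raf = new RandomAccessFile(file, "r")) {
            long end = raf.length();
            // Ignore a trailing "\n" and/or "\r" so the final terminator is not seen as an empty line.
            if (end > 0 && byteAt(raf, end - 1) == '\n') end--;
            if (end > 0 && byteAt(raf, end - 1) == '\r') end--;
            // Walk backwards to the byte just after the previous line feed.
            long start = end;
            while (start > 0 && byteAt(raf, start - 1) != '\n') {
                start--;
            }
            // Decode the complete byte range at once so multi-byte characters stay intact.
            byte[] bytes = new byte[(int) (end - start)];
            raf.seek(start);
            raf.readFully(bytes);
            return new String(bytes, StandardCharsets.UTF_8);
        }
    }

    private static int byteAt(RandomAccessFile raf, long pos) throws IOException {
        raf.seek(pos);
        return raf.read();
    }
}
```

Searching for the line feed at byte level is safe in UTF-8 because bytes below 0x80 never occur inside a multi-byte sequence.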

answered May 15 '11 at 21:37 by yms

The first link doesn't have any information on the page. – But I'm Not A Wrapper Class Feb 19 '15 at 19:45

score 3 · Answer 4 · answered May 16 '11 at 19:44

If you are in a hurry and want the simplest solution without worrying too much about performance, I would try handing the dirty work to an external process (assuming your app runs on a Un*x server, as any decent person's would XD):

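// exec() does not start a shell, so the pipe needs "sh -c"; GNU tac reverses line order (rev would reverse the characters within each line)
new BufferedReader(new InputStreamReader(Runtime.getRuntime().exec(new String[] {"sh", "-c", "tail -n 50 yourlogfile.txt | tac"}).getInputStream()))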

answered May 16 '11 at 19:44 by fortran

depending on an external command line process doesn't seem very portable ;) – eliocs May 16 '11 at 20:55

score 2 · Answer 5 · answered May 16 '11 at 12:52

A simpler alternative, because you say that you're creating a servlet to do this, is to use a LinkedList to hold the last N lines (where N might be a servlet parameter). When the list size exceeds N, you call removeFirst().

From a user experience perspective, this is probably the best solution. As you note, the most recent lines are the most important. Not being overwhelmed with information is also very important.
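
A minimal sketch of that sliding window, assuming the caller supplies a BufferedReader over the log file. The class and method names (LastLines, lastLinesReversed) are illustrative only; the list is walked backwards at the end so the newest line comes first.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.util.ArrayList;
import java.util.Iterator;
import java.util.LinkedList;
import java.util.List;

final class LastLines {
    /** Returns at most n trailing lines from the reader, newest first. */
    static List<String> lastLinesReversed(BufferedReader reader, int n) throws IOException {
        LinkedList<String> window = new LinkedList<>();
        for (String line = reader.readLine(); line != null; line = reader.readLine()) {
            window.addLast(line);
            if (window.size() > n) {
                window.removeFirst(); // drop the oldest line once the window exceeds n
            }
        }
        // Copy the window back to front so the most recent line is printed first.
        List<String> result = new ArrayList<>(window.size());
        for (Iterator<String> it = window.descendingIterator(); it.hasNext(); ) {
            result.add(it.next());
        }
        return result;
    }
}
```

In the servlet this could replace the read loop: pass the BufferedReader and the requested N, then print each returned line. The whole file is still read once, but memory use stays bounded by N lines.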

answered May 16 '11 at 12:52 by Anon

Not being overwhelmed by information indeed is a really nice point. – eliocs May 16 '11 at 20:52

score 1 · Answer 6 · answered Jul 19 '12 at 01:44

You can use RandomAccessFile to implement this, for example:

import java.io.File;
import java.io.IOException;
import java.io.RandomAccessFile;

import com.google.common.io.LineProcessor;

public class FileUtils {
    /**
     * Reads a text file (UTF-8) backwards; lines must be separated by \r\n.
     * Note: RandomAccessFile.readLine() maps each byte to one char, so
     * non-ASCII UTF-8 text is not decoded correctly.
     *
     * @param <T> result type of the line processor
     * @param file the file to read
     * @param step how many bytes to step back on each backward search
     * @param lineprocessor callback that receives each line
     * @throws IOException if reading fails
     */
    public static <T> T backWardsRead(File file, int step,
            LineProcessor<T> lineprocessor) throws IOException {
        try (RandomAccessFile rf = new RandomAccessFile(file, "r")) {
            long fileLen = rf.length();
            long pos = fileLen - step;
            // Find the first line counting from the end: scan for '\r'.
            while (true) {
                if (pos < 0) {
                    // Handle the first line of the file.
                    rf.seek(0);
                    lineprocessor.processLine(rf.readLine());
                    return lineprocessor.getResult();
                }
                rf.seek(pos);
                char c = (char) rf.readByte();
                while (c != '\r') {
                    c = (char) rf.readByte();
                }
                rf.readByte(); // consume '\n'
                pos = rf.getFilePointer();
                if (!lineprocessor.processLine(rf.readLine())) {
                    return lineprocessor.getResult();
                }
                pos -= step;
            }
        }
    }
}

Usage:

    FileUtils.backWardsRead(new File("H:/usersfavs.csv"), 40,
            new LineProcessor<Void>() {
                // TODO: implement processLine() and getResult()
                .......
            });

WhiteFang34 · Answer 7 · 2011-05-16T03:40:21.027

Good question. I'm not aware of any common implementations of this. It's not trivial to do properly either, so be careful what you choose. It should deal with character set encoding and detection of different line break methods. Here's the implementation I have so far that works with ASCII and UTF-8 encoded files, including a test case for UTF-8. It does not work with UTF-16LE or UTF-16BE encoded files.

import java.io.BufferedReader;
import java.io.ByteArrayOutputStream;
import java.io.File;
import java.io.FileInputStream;
import java.io.IOException;
import java.io.InputStreamReader;
import java.io.RandomAccessFile;
import java.io.Reader;
import java.io.UnsupportedEncodingException;
import java.nio.ByteBuffer;
import java.nio.channels.FileChannel;
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

import junit.framework.TestCase;

public class ReverseLineReader {
    private static final int BUFFER_SIZE = 8192;

    private final FileChannel channel;
    private final String encoding;
    private long filePos;
    private ByteBuffer buf;
    private int bufPos;
    private byte lastLineBreak = '\n';
    private ByteArrayOutputStream baos = new ByteArrayOutputStream();

    public ReverseLineReader(File file, String encoding) throws IOException {
        RandomAccessFile raf = new RandomAccessFile(file, "r");
        channel = raf.getChannel();
        filePos = raf.length();
        this.encoding = encoding;
    }

    public String readLine() throws IOException {
        while (true) {
            if (bufPos < 0) {
                if (filePos == 0) {
                    if (baos == null) {
                        return null;
                    }
                    String line = bufToString();
                    baos = null;
                    return line;
                }

                long start = Math.max(filePos - BUFFER_SIZE, 0);
                long end = filePos;
                long len = end - start;

                buf = channel.map(FileChannel.MapMode.READ_ONLY, start, len);
                bufPos = (int) len;
                filePos = start;
            }

            while (bufPos-- > 0) {
                byte c = buf.get(bufPos);
                if (c == '\r' || c == '\n') {
                    if (c != lastLineBreak) {
                        lastLineBreak = c;
                        continue;
                    }
                    lastLineBreak = c;
                    return bufToString();
                }
                baos.write(c);
            }
        }
    }

    private String bufToString() throws UnsupportedEncodingException {
        if (baos.size() == 0) {
            return "";
        }

        byte[] bytes = baos.toByteArray();
        for (int i = 0; i < bytes.length / 2; i++) {
            byte t = bytes[i];
            bytes[i] = bytes[bytes.length - i - 1];
            bytes[bytes.length - i - 1] = t;
        }

        baos.reset();

        return new String(bytes, encoding);
    }

    public static void main(String[] args) throws IOException {
        File file = new File("my.log");
        ReverseLineReader reader = new ReverseLineReader(file, "UTF-8");
        String line;
        while ((line = reader.readLine()) != null) {
            System.out.println(line);
        }
    }

    public static class ReverseLineReaderTest extends TestCase {
        public void test() throws IOException {
            File file = new File("utf8test.log");
            String encoding = "UTF-8";

            FileInputStream fileIn = new FileInputStream(file);
            Reader fileReader = new InputStreamReader(fileIn, encoding);
            BufferedReader bufReader = new BufferedReader(fileReader);
            List<String> lines = new ArrayList<String>();
            String line;
            while ((line = bufReader.readLine()) != null) {
                lines.add(line);
            }
            Collections.reverse(lines);

            ReverseLineReader reader = new ReverseLineReader(file, encoding);
            int pos = 0;
            while ((line = reader.readLine()) != null) {
                assertEquals(lines.get(pos++), line);
            }

            assertEquals(lines.size(), pos);
        }
    }
}

this doesn't correctly handle encoding correctly at all. this will quite happily mangle your data. reading from an arbitrary byte stream and correctly converting to chars (especially a variable, multi-byte encoding) is _exceedingly_ difficult to do correctly. not to mention, it converts bytes directly to chars when searching for '\r' and '\n'--also broken. — jtahlborn, May 16 '11 at 02:40
@jtahlborn: I'm not sure how you figure it won't handle encoding correctly "at all". I tested UTF-8 encoded files with all sorts of multi-byte characters in it and variations of new lines. I never said it was perfect and it likely has issues with malformed files. Yet I believe it works with most cases and I'd be interested in seeing an example of a properly encoded file that it fails with. As for detecting the new line characters as bytes, see http://stackoverflow.com/questions/686231/java-quickly-read-the-last-line-of-a-text-file as to why it should be safe. — WhiteFang34, May 16 '11 at 02:53
@jtalborn: It does indeed not work with UTF-16LE or UTF-16BE files. I've updated the answer to indicate that. I doubt the OP and most others use either of those encodings for log4j files though. No doubt it's difficult to do correctly. My answer started with that disclaimer and is inclusive of my solution :) — WhiteFang34, May 16 '11 at 03:47

vaquar khan · Answer 8 · 2016-03-02T09:21:31.757

import java.io.File;
import java.io.IOException;
import java.nio.charset.Charset;
import java.nio.file.Files;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.Comparator;
import java.util.HashSet;
import java.util.List;
import java.util.Set;
/**
 * Inside of C:\\temp\\vaquar.txt we have following content
 * vaquar khan is working into Citi He is good good programmer programmer trust me
 * @author vaquar.khan@gmail.com
 *
 */

public class ReadFileAndDisplayResultsinReverse {
    public static void main(String[] args) {
        try {
            // read data from file
            Object[] wordList = ReadFile();
            System.out.println("File data=" + wordList);
            //
            Set<String> uniquWordList = null;
            for (Object text : wordList) {
                System.out.println((String) text);
                List<String> tokens = Arrays.asList(text.toString().split("\\s+"));
                System.out.println("tokens" + tokens);
                uniquWordList = new HashSet<String>(tokens);
                // If multiple line then code into same loop
            }
            System.out.println("uniquWordList" + uniquWordList);

            Comparator<String> wordComp= new Comparator<String>() {

                @Override
                public int compare(String o1, String o2) {
                    if(o1==null && o2 ==null) return 0;
                    if(o1==null ) return o2.length()-0;
                    if(o2 ==null) return o1.length()-0;
                    //
                    return o2.length()-o1.length();
                }
            };
            List<String> fs=new ArrayList<String>(uniquWordList);
            Collections.sort(fs,wordComp);

            System.out.println("uniquWordList" + fs);

        } catch (IOException e) {
            // TODO Auto-generated catch block
            e.printStackTrace();
        }
    }

    static Object[] ReadFile() throws IOException {
        List<String> list = Files.readAllLines(new File("C:\\temp\\vaquar.txt").toPath(), Charset.defaultCharset());
        return list.toArray();
    }


}

Output:

[Vaquar khan is working into Citi He is good good programmer programmer trust me tokens[vaquar, khan, is, working, into, Citi, He, is, good, good, programmer, programmer, trust, me]

uniquWordList[trust, vaquar, programmer, is, good, into, khan, me, working, Citi, He]

uniquWordList[programmer, working, vaquar, trust, good, into, khan, Citi, is, me, He]

If you want to sort from A to Z, write one more comparator.

score 0 · Answer 9 · answered Aug 26 '16 at 23:17

Concise solution using Java 7 try-with-resources (AutoCloseable) and Java 8 Streams:

try (Stream<String> logStream = Files.lines(Paths.get("C:\\logfile.log"))) {
   logStream
      .sorted(Comparator.reverseOrder())
      .limit(10) // last 10 lines
      .forEach(System.out::println);
}

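Big drawback: this only works when the lines are strictly in natural (sorted) order, as in log files where every line is prefixed with a timestamp. It breaks on multi-line entries such as exception stack traces, and sorted() has to buffer the whole file before it emits anything.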

score 0 · Answer 10 · answered May 16 '11 at 12:38

The simplest solution is to read through the file in forward order, using an ArrayList<Long> to hold the byte offset of each log record. You'll need to use something like Jakarta Commons CountingInputStream to retrieve the position of each record, and will need to carefully organize your buffers to ensure that it returns the proper values:

FileInputStream fis = // .. logfile
BufferedInputStream bis = new BufferedInputStream(fis);
CountingInputStream cis = new CountingInputStream(bis);
InputStreamReader isr = new InputStreamReader(cis, "UTF-8");

And you probably won't be able to use a BufferedReader, because it will attempt to read-ahead and throw off the count (but reading a character at a time won't be a performance problem, because you're buffering lower in the stack).

To write the file, you iterate the list backwards and use a RandomAccessFile. There is a bit of a trick: to properly decode the bytes (assuming a multi-byte encoding), you will need to read the bytes corresponding to an entry, and then apply a decoding to it. The list, however, will give you the start and end position of the bytes.

One big benefit to this approach, versus simply printing the lines in reverse order, is that you won't damage multi-line log messages (such as exceptions).
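
The two passes might be sketched as follows. This is an illustration under stated assumptions, not the answer's exact code: it counts bytes by hand instead of using CountingInputStream, it assumes UTF-8, and it assumes a new log record always starts with a digit (for example a timestamp) so that stack-trace lines stay attached to their record. The names ReverseRecords, indexRecords and readReversed are hypothetical.

```java
import java.io.BufferedInputStream;
import java.io.File;
import java.io.FileInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.RandomAccessFile;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;

final class ReverseRecords {
    /** Forward pass: collects the byte offset of every line that starts a record. */
    static List<Long> indexRecords(File file) throws IOException {
        List<Long> offsets = new ArrayList<>();
        try (InputStream in = new BufferedInputStream(new FileInputStream(file))) {
            long pos = 0;
            boolean atLineStart = true;
            for (int b = in.read(); b != -1; b = in.read(), pos++) {
                // Assumption: a record begins with a digit, e.g. a timestamp prefix.
                if (atLineStart && b >= '0' && b <= '9') {
                    offsets.add(pos);
                }
                atLineStart = (b == '\n');
            }
        }
        return offsets;
    }

    /** Backward pass: decodes each record from its byte range, newest first. */
    static List<String> readReversed(File file) throws IOException {
        List<Long> offsets = indexRecords(file);
        List<String> records = new ArrayList<>(offsets.size());
        try (RandomAccessFile raf = new RandomAccessFile(file, "r")) {
            long end = raf.length();
            for (int i = offsets.size() - 1; i >= 0; i--) {
                long start = offsets.get(i);
                // Read the whole record, then decode, so multi-byte characters are never split.
                byte[] bytes = new byte[(int) (end - start)];
                raf.seek(start);
                raf.readFully(bytes);
                records.add(new String(bytes, StandardCharsets.UTF_8));
                end = start;
            }
        }
        return records;
    }
}
```

Each returned string keeps its own line terminators, so a record with a stack trace comes back as one multi-line block. Any bytes before the first recognised record start are skipped in this sketch.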
