17

I have no idea how to do this. Where should I start? I have searched for this, and not one result showed how to pull a random line from a text file.

The only thing I have found is https://github.com/chrisinajar/node-rand-line, but it doesn't work. How can I read a random line from a text file?

mike
  • How big is this file? One easy method is to read the whole file and pick a random line. However, this takes at least as much memory as the size of the file. – Brad Nov 15 '12 at 05:37
  • 2 MB? Just read it into memory. – Dmitry Nov 15 '12 at 07:04

5 Answers

13

You would probably want to look at the Node.js standard library function for reading files, fs.readFile, and end up with something along the lines of:

const fs = require("fs");

// Note: this is asynchronous; the result arrives through an error-first callback.
function getRandomLine(filename, callback) {
  fs.readFile(filename, "utf-8", function (err, data) {
    if (err) {
      // Throwing here could not be caught by the caller, so report the error instead.
      return callback(err);
    }

    // Passing an encoding makes `data` a string. Drop one trailing newline
    // so that an empty string is never picked as a "line".
    var lines = data.replace(/\r?\n$/, "").split(/\r?\n/);

    // Choose one of the lines...
    var line = lines[Math.floor(Math.random() * lines.length)];

    // ...and invoke the callback with our line.
    callback(null, line);
  });
}

If reading the whole thing and splitting isn't an option, then maybe have a look at this stack overflow for ideas.
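
For anyone who prefers getting the line back as a return value, here is a minimal promise-based sketch of the same read-everything-and-split idea. The names `pickRandomLine` and `getRandomLineAsync` are made up for illustration, the `random` parameter exists only so the helper can be tested, and the whole thing assumes the file fits comfortably in memory.

```javascript
const fs = require("fs");

// Hypothetical helper: pick one line from a string.
// `random` defaults to Math.random and is injectable for testing.
function pickRandomLine(text, random = Math.random) {
  // Drop a single trailing newline so "" is never returned for "foo\nbar\n".
  const lines = text.replace(/\r?\n$/, "").split(/\r?\n/);
  return lines[Math.floor(random() * lines.length)];
}

// Hypothetical promise-based variant: resolves with a random line of the file.
async function getRandomLineAsync(filename) {
  const text = await fs.promises.readFile(filename, "utf-8");
  return pickRandomLine(text);
}
```

It could be called as `getRandomLineAsync("lines.txt").then(console.log)`, where `lines.txt` is a placeholder file name.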

cfp
kieran
  • This didn't work right away for me, I got the error: `data.split is not a function`. Following the answer to [this question](http://stackoverflow.com/questions/10145946/what-is-causing-the-following-error-string-split-is-not-a-function-in-javascr), I added `data+=''` and it worked. – Teleporting Goat Dec 19 '16 at 14:10
  • Please note that if the file consists of `foo\nbar\n`, the function will return one of `'foo'`, `'bar'` or `''`. Fix e.g. by changing `data.split('\n')` to `data.replace(/\n$/, '').split('\n')`. – tuomassalo Mar 23 '20 at 10:35
  • You should try returning lines instead of doing something in the function – SP73 Jan 20 '21 at 04:13
5

I had the same need: picking a random line from a file of more than 100 MB.
I wanted to avoid storing the whole file content in memory.
I ended up iterating over all the lines twice: first to get the line count, then to get the content of the target line.
Here is what the code looks like:

const readline = require('readline');
const fs = require('fs');
const FILE_PATH = 'data.ndjson';

module.exports = async () =>
{
    const linesCount = await getLinesCount();
    const randomLineIndex = Math.floor(Math.random() * linesCount);
    const content = await getLineContent(randomLineIndex);
    return content;
};

//
// HELPERS
//

function getLineReader()
{
    return readline.createInterface({
        input: fs.createReadStream(FILE_PATH)
    });
}

async function getLinesCount()
{
    return new Promise(resolve =>
    {
        let counter = 0;
        getLineReader()
        .on('line', function (line)
        {
            counter++;
        })
        .on('close', () =>
        {
            resolve(counter);
        });
    });
}

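async function getLineContent(index)
{
    return new Promise(resolve =>
    {
        let counter = 0;
        const reader = getLineReader();
        reader.on('line', function (line)
        {
            if (counter === index)
            {
                resolve(line);
                // The target line was found: stop consuming the rest of the file
                reader.close();
            }
            counter++;
        });
        // Resolve with undefined if the index was never reached (e.g. empty file);
        // this is a no-op when the promise has already been resolved
        reader.on('close', () => resolve(undefined));
    });
}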
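
As an alternative to reading the file twice, the same result can be had in a single pass with reservoir sampling (a reservoir of size one). This is a different technique from the two-pass code above, sketched here under the assumption that uniform selection is wanted; the function name `pickRandomLineOnePass` and the injectable `random` parameter are made up for illustration.

```javascript
const fs = require("fs");
const readline = require("readline");

// Single-pass sketch: reservoir sampling with a reservoir of size 1.
// `random` defaults to Math.random and is injectable only for testing.
async function pickRandomLineOnePass(filePath, random = Math.random) {
  const reader = readline.createInterface({
    input: fs.createReadStream(filePath),
    crlfDelay: Infinity, // treat "\r\n" as a single line break
  });

  let chosen; // stays undefined for an empty file
  let seen = 0;
  for await (const line of reader) {
    seen++;
    // Replace the current pick with probability 1/seen. After n lines,
    // each line has had the same 1/n chance of being the one kept.
    if (random() * seen < 1) {
      chosen = line;
    }
  }
  return chosen;
}
```

Only one line is held in memory at a time, at the cost of always reading the file to the end.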
sasensi
3

I don't have Node handy to test code, so I can't give you exact code, but I would do something like this:

  1. Get the file size in bytes and pick a random byte offset
  2. Open the file as a stream
  3. Use this snippet to emit lines (or readline, but the last time I used it, it had a nasty bug where it essentially didn't work)
  4. Keep track of your position in the file as you read. As you pass your chosen offset, select that line and return it.

Note that this isn't uniformly random. Longer lines cover more byte offsets, so they will be weighted more heavily, but it is the only way to do it without reading the whole file to get a count of lines.

This method allows you to get a "random" line without keeping the whole file in memory.
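
The four steps above could be sketched roughly as follows. This is an untested-in-production illustration, not exact code: the function name `pickLineAtRandomOffset` and the injectable `random` parameter are made up, it uses readline to emit lines, and it assumes `\n` line endings when counting bytes.

```javascript
const fs = require("fs");
const readline = require("readline");

// Sketch of the byte-offset idea; longer lines are more likely to be picked.
// Assumes "\n" line endings. `random` is injectable only for testing.
async function pickLineAtRandomOffset(filePath, random = Math.random) {
  // Step 1: get the file size and pick a random byte offset.
  const { size } = await fs.promises.stat(filePath);
  if (size === 0) return undefined;
  const target = Math.floor(random() * size);

  // Steps 2 and 3: open the file as a stream and emit lines from it.
  const input = fs.createReadStream(filePath);
  const reader = readline.createInterface({ input, crlfDelay: Infinity });

  // Step 4: track the position and stop at the line covering the offset.
  let position = 0;
  let found;
  for await (const line of reader) {
    position += Buffer.byteLength(line) + 1; // +1 for the "\n" terminator
    if (position > target) {
      found = line;
      break;
    }
  }
  input.destroy(); // the rest of the file is not needed
  return found;
}
```

On average this reads about half of the file, and never holds more than one line plus the stream buffer in memory.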

Brad
0

I don't have any demo code, but here is a suggestion:

  1. Read the file line by line using a buffered reader
  2. Store every line in a string array
  3. Create a method int returnRandom(arraySize)
  4. Pass the array size to that method
  5. Calculate a random number from 0 (inclusive) to arraySize (exclusive)
  6. Return the random number
  7. Print the element at that index of your string array
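
In Node.js, those steps might look something like the sketch below. `returnRandom` is the method named in the list; `readAllLines` and `printRandomLine` are names made up for illustration, and readline stands in for the buffered reader. Since every line is stored, this assumes the file fits in memory.

```javascript
const fs = require("fs");
const readline = require("readline");

// Steps 3 to 6: return a random integer from 0 (inclusive) to arraySize (exclusive).
function returnRandom(arraySize) {
  return Math.floor(Math.random() * arraySize);
}

// Steps 1 and 2: read the file line by line and store every line in an array.
async function readAllLines(filePath) {
  const reader = readline.createInterface({
    input: fs.createReadStream(filePath),
    crlfDelay: Infinity, // treat "\r\n" as a single line break
  });
  const lines = [];
  for await (const line of reader) {
    lines.push(line);
  }
  return lines;
}

// Step 7: print the element at the random index (and return it as well).
async function printRandomLine(filePath) {
  const lines = await readAllLines(filePath);
  if (lines.length === 0) return undefined; // nothing to pick from
  const line = lines[returnRandom(lines.length)];
  console.log(line);
  return line;
}
```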
Preview
0

I did it like this:

const path = require('path')
const fs = require('fs/promises')

const FILE_NAME = path.resolve(__dirname, '../bigfile.txt')
const DELIMITER = '\n'

const READ_BUFFER_SIZE = 1000 // Must be greater than the record size

/*
 * Reading a random line from a very large (does not fit in RAM) file
 *
 * Note that you will never get the first or last line in the file,
 * but who cares when the file contains millions of lines.
 */
async function main() {
    const stats = await fs.stat(FILE_NAME)
    const handle = await fs.open(FILE_NAME, 'r')

    try {
        for (;;) {
            const randomPos = Math.floor(Math.random() * stats.size)

            const buffer = Buffer.alloc(READ_BUFFER_SIZE)
            const { bytesRead } = await handle.read(buffer, 0, READ_BUFFER_SIZE, randomPos)

            // Decode only the bytes actually read (fewer than the buffer size near EOF).
            // xs[0] is usually a partial line, so xs[1] is the first complete one,
            // and it is complete only if another delimiter follows it (xs[2] exists).
            const xs = buffer.toString('utf8', 0, bytesRead).split(DELIMITER)
            if (xs[2] !== undefined) {
                console.log('Random line:', xs[1])
                return // stop after the first hit instead of looping forever
            }
            // Otherwise retry with another random position
        }
    } finally {
        await handle.close()
    }
}
main().catch(console.log)
hi_artem