
I have a log file in which events span multiple lines. To turn each event into a single line, I first have to separate the lines that contain a date from the lines that do not. Now I am trying to write logic that checks each line and, if it has no date, merges it with prevLine.

How can I combine multiple lines into one using a regular expression or any module that helps with this task?

ctrl.js

var regex = /\[\d{4}-\d{2}-\d{2}T\d{2}:\d{2}:\d{2}\w+\]/;
var prevLine;
var readStream = fs.createReadStream(dir + '/' + logFile, 'utf8');
readStream.pipe(split()).on('data', function (line) {
    if (regex.test(line)) {
        console.log('Line with date:', line);
        parseLog(line, prevLine);
    } else {
        console.log('Line without date:', line);
        line = prevLine + line;
    }
    function parseLog(line, prev) {
        if (line.indexOf('|') === -1) line = prev + line;
    }
});

fileData

[2017-03-23T18:13:16Z]|verbose|bmid: n/a|infra.topicWorkers|topology changed, emitting topology event { newTopology: 
   [ '-0000001337',
     '-0000001338',
     '-0000001339',
     '-0000001340',
     '-0000001341',
     '-0000001342' ],
  oldTopology: 
   [ '-0000001337',
     '-0000001338',
     '-0000001339',
     '-0000001340',
     '-0000001341' ],
  workerId: 6,
  pid: 30488 }
[2017-03-23T18:13:16Z]|verbose|bmid: n/a|infra.topicWorkers|topology changed, emitting topology event { newTopology: 
   [ '-0000001337',
     '-0000001338',
     '-0000001339',
     '-0000001340',
     '-0000001341',
     '-0000001342' ],
  oldTopology: [],
  workerId: 4,
  pid: 30481 }
Talha Awan
hussain
  • `string.replace('\n', '')`? – Mike Cluck Mar 31 '17 at 19:57
  • Better: `line.replace(/\r?\n/, "")` (if the line breaks are in fact CRLF) – Psi Mar 31 '17 at 19:59
  • @MikeC. This is obviously **not** a duplicate. The OP is explicitly asking how to remove only specific newlines, **not all newlines**. Also, your suggestion `string.replace('\n', '')` is surely wrong because in any case it replaces only the first occurrence of the `\n` character. – user7771338 Mar 31 '17 at 20:01
  • @FREE_AND_OPEN_SOURCE I was giving a simple example to show how removing a newline works. This is still most definitely a duplicate since they merely need to apply the same method to whatever subset of string needs the new lines removed from it. – Mike Cluck Mar 31 '17 at 20:05
  • Yeah, I don't think it's a duplicate because I want to remove line breaks only if the lines don't have a date – hussain Mar 31 '17 at 20:05
  • @hussain So only apply the answer in the other question on lines which don't have a date. – Mike Cluck Mar 31 '17 at 20:06
  • @Psi there are no newlines in the stream generated by the split module used in the question. – rsp Mar 31 '17 at 20:17
  • @FREE_AND_OPEN_SOURCE It doesn't matter if the code replaces one or all of the newlines because there are no newlines in streams generated by the `split` module that is used in the code in question. – rsp Mar 31 '17 at 20:18
  • @MikeC There are no newlines present in the stream generated by the `split` module that is used in the code in question so no way of removing newlines will work as there are no newlines to remove anyway. – rsp Mar 31 '17 at 20:20
  • @rsp. Ok, you don't need to say the same thing three times. – user7771338 Mar 31 '17 at 20:20
  • @FREE_AND_OPEN_SOURCE Don't get me wrong. I think the OP was unfairly accused of posting a duplicate and wildly attacked by everyone who obviously didn't read the code that the question is about. I seriously couldn't find any other question about selectively joining stream elements together. I don't see anyone apologizing to the OP, which is unfortunate because we give the impression of a community that is against asking any new questions, even if they are interesting and not trivial like this one. – rsp Mar 31 '17 at 20:24
  • @rsp I suppose I don't understand the question then because I still don't see how it isn't a duplicate. An example of the expected output would be nice. "How can i combine multiples lines into one" tells me that multiple lines are currently separated (presumably by a line break) and OP wants to remove those separators. – Mike Cluck Mar 31 '17 at 20:32

1 Answer


You can do something like this:

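const fs = require('fs');
const path = require('path');
const split = require('split');

const regex = /\[\d{4}-\d{2}-\d{2}T\d{2}:\d{2}:\d{2}\w+\]/;
const items = [];
let item = [];
fs.createReadStream(path.join(dir, logFile), 'utf8')
  .pipe(split())
  .on('data', (line) => {
    // A line with a date starts a new event.
    if (regex.test(line)) {
      item = [];
      items.push(item);
    }
    item.push(line);
  })
  .on('end', () => {
    // The stream is read asynchronously, so join only after every line has arrived.
    const lines = items.map(item => item.join(' '));
    lines.forEach(line => console.log(line));
  });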

Push every line into the current inner array. Whenever a line with a date arrives, create a fresh inner array and add it to the outer array first, so that each inner array collects the lines of exactly one event. Joining the elements of each inner array then gives you an array of combined lines, which is the array called lines in the example.
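The grouping step described above can also be sketched as a plain function that does not depend on streams, which makes it easier to try out in isolation. This is only a minimal sketch: `mergeEvents` and `DATE_PATTERN` are hypothetical names, the sample lines are shortened from the log in the question, and trimming each line before joining is an assumption about the desired output rather than something the question specifies. Unlike the stream version, lines that appear before the first dated line are kept as an event of their own.

```javascript
// Same date pattern as in the question.
const DATE_PATTERN = /\[\d{4}-\d{2}-\d{2}T\d{2}:\d{2}:\d{2}\w+\]/;

// Hypothetical helper: a line that matches the date pattern starts a new
// event; any other line is appended to the event that is currently open.
function mergeEvents(lines) {
  const events = [];
  for (const line of lines) {
    if (DATE_PATTERN.test(line) || events.length === 0) {
      events.push([line.trim()]);
    } else {
      events[events.length - 1].push(line.trim());
    }
  }
  // Join the parts of each event into a single line.
  return events.map(parts => parts.join(' '));
}

// Sample input, shortened from the log in the question.
const sample = [
  '[2017-03-23T18:13:16Z]|verbose|topology changed { newTopology: ',
  "   [ '-0000001337' ],",
  '  pid: 30488 }',
  '[2017-03-23T18:13:16Z]|verbose|topology changed { newTopology: ',
  '  oldTopology: [],',
  '  pid: 30481 }',
];

const merged = mergeEvents(sample);
merged.forEach(line => console.log(line));
```

With a stream, you would collect the lines (or call the same logic incrementally) and only use the result once the stream has emitted its `end` event.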

rsp
  • I'm struggling a little bit with the syntax since I am not using ES6. – hussain Mar 31 '17 at 20:31
  • @hussain If you're struggling with ES6 syntax then you can always paste the code in https://babeljs.io/repl/ and it will translate it for you. Very useful service. – rsp Mar 31 '17 at 20:36
  • It's printing an empty array: `var lines = items.map(function (item) { return item.join(' '); }); console.log('Lines',lines);` I can see that item is printed inside the `if` statement – hussain Mar 31 '17 at 20:41
  • item is not printed inside the `map` function before the return – hussain Mar 31 '17 at 20:48