I'm trying to build a regex for my Node.js (12.8.0) project that extracts the plain-text content from the .eml
files of spam emails (I'm building a simple spam filter for fun).
I have written the following regex for this:
[-]{14}[0-9]*\s.+[\s]+.+(?:[\s]*)([\s\S]+)[\s]{3}[-]{14}[0-9]+[\r\n]
When I use this regex in Node.js, however, I get null instead of the content of the mail.
const regexp = new RegExp("[-]{14}[0-9]*\s.+[\s]+.+(?:[\s]*)([\s\S]+)[\s]{3}[-]{14}[0-9]+[\r\n]");
let matches = content.match(regexp);
console.log(matches);
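For reference, here is a minimal sketch of what I understand happens to backslashes when a pattern is passed to the RegExp constructor as a plain string. The variable names and the shortened pattern are made up for illustration and are not part of my project. It only concerns the string form above and may not be the whole story.

```javascript
// Hypothetical, shortened pattern: only meant to show how escapes survive (or not).

// In a normal string literal, "\s" is not a recognised escape, so it collapses to "s"
// before the RegExp constructor ever sees it.
const fromString = new RegExp("[\s\S]+\s");

// Doubling the backslashes keeps them in the string, so the regex engine receives \s and \S.
const fromEscaped = new RegExp("[\\s\\S]+\\s");

// A regex literal needs no extra escaping.
const fromLiteral = /[\s\S]+\s/;

console.log(fromString.source);  // [sS]+s
console.log(fromEscaped.source); // [\s\S]+\s
console.log(fromLiteral.source); // [\s\S]+\s
```

Printing the .source of a constructed regex like this is a quick way to check what pattern the engine actually received.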
I have added my regex on regex101.com and it mostly works. Interestingly, it tells me that it found Group 1 and shows the right content, but it doesn't show which lines were matched (like it does for the Full Match).
To make things more interesting, when I switch the flavor to PCRE, it works perfectly fine (and even shows the lines).
Please note that the regex101 demo contains an actual sample mail.
EDIT: As per @CertainPerformance's suggestion, I have updated the code to the following. Unfortunately, this returns false instead of true:
const regexp = /[-]{14}[0-9]*\s.+[\s]+.+(?:[\s]*)([\s\S]+)[\s]{3}[-]{14}[0-9]+[\r\n]/;
let matches = regexp.test(content);
console.log(matches); // false
as well as the following, which still returns null:
const regexp = /[-]{14}[0-9]*\s.+[\s]+.+(?:[\s]*)([\s\S]+)[\s]{3}[-]{14}[0-9]+[\r\n]/;
let matches = content.match(regexp);
console.log(matches); // null
EDIT 2: Tested the regex in PHP and it works perfectly fine... Seems like something must be derping out...
EDIT 3: Adding the entire snippet of code in the hope that someone can spot the issue...
const fs = require('fs');

const pattern = /[-]{14}[0-9]+[\s].+[\s]+.+(?:[\s]*)([\s\S]*)[\s]{3}[-]{14}[0-9]+[\r\n]/;
const spamFolder = './datasets/spam/';
fs.readdir(spamFolder, (err, files) => {
  if (err) return console.log('Unable to scan directory: ' + err);
  // Loop over each file
  files.forEach(file => {
    // Read the file as a UTF-8 string
    var contents = fs.readFileSync(spamFolder + file, 'utf8');
    // Try to capture the mail body in group 1
    var matches = contents.match(pattern);
    console.log(matches); // null
  });
});