This might look like a duplicate, but it is a different issue. I'll almost copy and paste another question, but I'm asking about a different problem. Since that question's author described it clearly, I will describe it the same way.

I have a plain text file in which each line holds data in the following format:

Username | Age | Street

What I want to do is search for the Username in the file and, when it is found, print the whole line. The question below does this perfectly, with one main problem:

PHP to search within txt file and echo the whole line

Issue: If the file contains the name "Tobias" and you search for "Tobi", it will find and display "Tobias", but I only want to match the search string as a whole word. If I search for "Tobi", it should only find "Tobi" and not "Tobias" or any other string containing "Tobi".

Whole-word matching works with this solution: https://stackoverflow.com/a/4366744/14071499

But that solution only prints the string I am searching for, not the whole line.

So how can I search for a word and print the whole line, without also matching other strings that merely contain the word?

The code I have so far:

<?php
$file = 'ids.txt';
$searchfor = $_POST['search'];

// the following line prevents the browser from parsing this as HTML.
header('Content-Type: text/plain');

// get the file contents, assuming the file to be readable (and exist)
$contents = file_get_contents($file);
// escape special characters in the query
$pattern = preg_quote($searchfor, '/');
// finalise the regular expression, matching the whole line
$pattern = "/\b{$pattern}.*\$/m";
// search, and store all matching occurrences in $matches
if(preg_match_all($pattern, $contents, $matches)){
   echo "Found matches:\n";
   echo implode("\n", $matches[0]);
}
else{
   echo "No matches found";
}

?>
  • ... `\$` is a literal dollar sign (not a regex metacharacter / end-of-line anchor) But you don't need the `$` either -- because `.*` goes to the end of the line. – mickmackusa Oct 28 '20 at 06:19
  • This question is incomplete because the [mcve] is not complete. We can only assume that columnar values in your file are comma separated -- but without knowing we cannot develop an appropriate pattern. It is also assumed that you only want to match the username "column" value -- is this true? Using `preg_match_all()` may actually be the best tool (instead of `preg_grep()` and then re-looping to print the results). Too much is unknown. – mickmackusa Oct 29 '20 at 02:41
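If only the username column should be matched, as the comment above wonders, one option is to anchor the pattern to the start of the line and to the first delimiter. The following is a minimal sketch that assumes the `Username | Age | Street` layout shown in the question; the sample lines and the `findByUsername()` helper are hypothetical and not part of the original post.

```php
<?php
// Hypothetical sample data in the "Username | Age | Street" layout from the question.
$lines = [
    'Tobias | 31 | Tobi Lane 4',
    'Tobi | 25 | Main Street 1',
    'Anna | 40 | Tobi Road 9',
];

/**
 * Return the lines whose first field equals $username exactly.
 * findByUsername() is an illustrative helper, not a PHP built-in.
 */
function findByUsername(array $lines, string $username): array
{
    // ^ anchors to the start of the line; \s*\| requires the field
    // to end at the first pipe, so "Tobias" is not matched by "Tobi".
    $pattern = '/^\s*' . preg_quote($username, '/') . '\s*\|/';
    return preg_grep($pattern, $lines);
}

$found = findByUsername($lines, 'Tobi');
print_r($found);
```

Because the pattern is anchored to the first field, a "Tobi" that appears in the Street column does not produce a match.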

2 Answers

2

This answer doesn't take the fields in your source data into account, since at the moment you're just bulk-matching the raw text and are interested in getting full lines. There is a much simpler way to accomplish this: use `file()`, which loads each line into an array element, and `preg_grep()`, which filters an array with a regular expression. Implemented as follows:

$lines = file('ids.txt', FILE_IGNORE_NEW_LINES|FILE_SKIP_EMPTY_LINES); // lines as array

$search = preg_quote($_POST['search'], '~');
$matches = preg_grep('~\b' . $search . '\b~', $lines);

foreach($matches as $line => $match) {
    echo "Line {$line}: {$match}\n";
}

On a related note, to match only complete words rather than substrings, you need word boundaries `\b` on both sides of the pattern. The loop above outputs both the matching line and its line number (0-indexed), since `preg_grep()` preserves the array keys.
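To see what the second `\b` changes, here is a small sketch that runs both patterns against in-memory lines. The sample data is made up for illustration and stands in for the result of `file('ids.txt', ...)`.

```php
<?php
// Hypothetical lines standing in for the contents of ids.txt.
$lines = [
    'Tobias | 31 | Elm Street',
    'Tobi | 25 | Main Street',
    'Tobi-Ann | 19 | Oak Street',
];

$search = preg_quote('Tobi', '~');

// Leading boundary only: also matches "Tobias".
$loose = preg_grep('~\b' . $search . '~', $lines);

// Boundary on both sides: "Tobias" is rejected, but "Tobi-Ann" still
// matches, because "-" is a non-word character and so counts as a boundary.
$whole = preg_grep('~\b' . $search . '\b~', $lines);

print_r(array_keys($loose));
print_r(array_keys($whole));
```

Note that `\b` treats any non-word character, including a hyphen, as the end of a word, so names such as "Tobi-Ann" may need a stricter pattern if they should not match.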

Markus AO
  • Glad to hear. I've updated `file` to use the flags `FILE_IGNORE_NEW_LINES|FILE_SKIP_EMPTY_LINES`, meaning 1. trailing newlines are not included, and 2. blank lines are automatically skipped. – Markus AO Oct 28 '20 at 01:03
  • Okay, thank you. Could you tell me where I have to add the "i" in order to make it case-insensitive? – Baseult Private Oct 28 '20 at 01:21
  • @Bas the `i` pattern modifier goes after the end-of-pattern delimiter (like you did with `m` in your question's snippet). – mickmackusa Oct 28 '20 at 06:21
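As a short illustration of the comment above, the `i` modifier is appended after the closing `~` delimiter. This sketch uses made-up sample lines rather than the real file.

```php
<?php
// Hypothetical lines; the real code would read them with file('ids.txt', ...).
$lines = [
    'TOBI | 25 | Main Street',
    'tobias | 31 | Elm Street',
];

$search = preg_quote('tobi', '~');

// The trailing "i" after the closing delimiter makes the match case-insensitive.
$matches = preg_grep('~\b' . $search . '\b~i', $lines);

print_r($matches);
```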
1
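<?php

$file = "ids.txt";
$search = $_POST["search"];

header("Content-Type: text/plain");

// Read the whole file and split it into lines.
$contents = file_get_contents($file);
$lines = explode("\n", $contents);

foreach ($lines as $line) {
    // \b on both sides matches $search only as a whole word.
    // Note: $search is used unescaped here; see the comments below.
    if (preg_match("/\b{$search}\b/", $line)) {
        echo $line, "\n";
    }
}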
  • This snippet is vulnerable to pattern breakage due to regex meta-characters existing in the posted input. This answer is missing its educational explanation. – mickmackusa Oct 28 '20 at 03:16
  • Your first point is correct - it is indeed vulnerable - but I do suspect that since the original question solves that issue, it is a non-issue in my answer. However, your second point has no merit based on this link (https://stackoverflow.com/help/how-to-answer) - or maybe I missed it in that link - or maybe there is another link that contains more detailed information? –  Oct 28 '20 at 05:44
  • "Has no merit"!? If you think Stack Overflow would be half of the researchers' paradise that it is today by being a collection of snippets, you have a very different perception than I have. – mickmackusa Oct 28 '20 at 05:47
  • I mean .. I said "your second point has no merit based on this link" - and then provided the link. I definitely think that I didn't say "your second point has no merit based on my opinions and they exist at all of these locations: ..." –  Oct 28 '20 at 05:50
  • If you are answering questions to "be helpful" -- it is more helpful to explain how/why your answer works because that will invariably help the OP and thousands of future researchers. If you are here to "farm rep points" -- it is more lucrative to be generous and explain how/why your answer works because then you have a better chance of your answer being helpful to the OP and thousands of future researchers and earning more unicorn points. I, personally, will NEVER upvote any code-only answers on Stack Overflow as a matter of principle. – mickmackusa Oct 28 '20 at 05:54
  • 1
    "This answer is missing its educational explanation" is a simple fact. It's a code snippet. While the link Mack posted doesn't explicitly spell it out (*maybe it should!*), SO has a culture of providing *learning*, not just *solutions* one can copy-paste and be none the wiser. Some explanation of your code is expected. You can read up to get the vibe on SO Meta, e.g. [here](https://meta.stackoverflow.com/questions/392712/explaining-entirely-code-based-answers), [here](https://meta.stackoverflow.com/questions/262695/new-answer-deletion-option-code-only-answer). – Markus AO Oct 28 '20 at 13:57
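The meta-character concern raised in the first comment on this answer can be illustrated with a small sketch. The sample lines and the search string `J.R` are hypothetical; the point is only to compare an unescaped pattern with one passed through `preg_quote()`.

```php
<?php
// Hypothetical data and user input containing a regex meta-character.
$lines = [
    'J.R | 50 | Oil Road',
    'JXR | 22 | Elm Street',
];
$search = 'J.R';

// Unescaped: "." matches any character, so "JXR" is found as well.
$raw = preg_grep("/\b{$search}\b/", $lines);

// Escaped: preg_quote() turns "." into "\.", so only the literal "J.R" matches.
$quoted = preg_grep('/\b' . preg_quote($search, '/') . '\b/', $lines);

print_r(array_keys($raw));
print_r(array_keys($quoted));
```

Input such as an unbalanced `(` would go further and make the unescaped pattern fail to compile, which is why escaping user input before building the pattern is generally the safer choice.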