
I have a text file filled with HTML code. I'm trying to create a PHP page that searches that code and extracts the "username" for me:

Here is a small sample of the page:

  <div class="search-result-details">
    <div class="employee-name">This is my name!</div>
    <ul class="employee-details">
      <li><span class="label">Login</span>username</li>
      <li><span class="label">Employee ID</span>####</li>
      <li><span class="label">Barcode ID</span>###</li>
      <li><span class="label">Status</span>Active</li>
    </ul>
    <ul class="org-details">
      <li><span class="label">Location</span>SAT1 (755)</li>
      <li><span class="label">Shift</span>AAAA</li>
      <li><span class="label">Department</span>1231</li>
      <li><span class="label">Area</span>26</li>
      <li><span class="label">Crew</span>0</li>
      <li><span class="label">Supervisor</span>manager name</li>
    </ul>
  </div>
</a></li>
                    </ol>
                </div>

and I need to grab the username from the following line:

<li><span class="label">Login</span>username</li>

I have this already that at least grabs the line I need:

<?php
$file = 'log.txt';
$searchfor = '<ul class="employee-details">
      <li><span class="label">Login</span>';

// the following line prevents the browser from parsing this as HTML.
header('Content-Type: text/plain');

// get the file contents, assuming the file to be readable (and exist)
$contents = file_get_contents($file);
// escape special characters in the query
$pattern = preg_quote($searchfor, '/');
// finalise the regular expression, matching the whole line
$pattern = "/^.*$pattern.*\$/m";
// search, and store all matching occurrences in $matches
if(preg_match_all($pattern, $contents, $matches)){
   echo "Found matches:\n";
   echo implode("\n", $matches[0]);
}
else{
   echo "No matches found";
}

?>

Current Output:

<ul class="employee-details">
  <li><span class="label">Login</span>username</li>

Any help is greatly appreciated. Thank you.

dkeeper09

2 Answers


Although a bit hacky, this is one way you could do it.

$file = 'log.txt'; // same file as in the question
$contents = file_get_contents($file);

// Capture group 2 holds the text between "Login</span>" and "</li>".
if (preg_match("/(Login<\/span>)([a-zA-Z0-9]*)(<\/li>)/", $contents, $matches)) {
   $username = trim($matches[2]);
}

Of course, that middle capture group would need to support whatever characters are possible in usernames.
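
As a rough sketch of one way to loosen that middle group, you could match everything up to the next `<` instead of listing the allowed characters. The sample string and the username `j.doe-42` below are made up for illustration, and this assumes usernames never contain a `<`.

```php
<?php
// Hypothetical sample input; in practice this would come from file_get_contents($file).
$contents = '<li><span class="label">Login</span>j.doe-42</li>';

// [^<]* matches any run of characters up to the next tag,
// so dots, dashes, and underscores in usernames are captured too.
$username = null;
if (preg_match('/Login<\/span>([^<]*)<\/li>/', $contents, $matches)) {
    $username = trim($matches[1]);
}

echo $username, "\n";
```

Only one capture group is used here, so the username ends up in `$matches[1]` rather than `$matches[2]`.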

Also be aware that this will break if the HTML structure ever changes.

And finally, if there can be more than one username in a file, you can use preg_match_all and then $matches[2] will be an array of usernames.
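
A minimal sketch of the preg_match_all variant, assuming the file contains several result blocks. The input string and the usernames `alice1` and `bob22` are invented for the example.

```php
<?php
// Hypothetical input with two "Login" entries; in practice use file_get_contents($file).
$contents = '<li><span class="label">Login</span>alice1</li>'
          . '<li><span class="label">Login</span>bob22</li>';

$usernames = [];
if (preg_match_all("/(Login<\/span>)([a-zA-Z0-9]*)(<\/li>)/", $contents, $matches)) {
    // With the default PREG_PATTERN_ORDER flag, $matches[2] lists
    // capture group 2 (the username) of every match.
    $usernames = $matches[2];
}

print_r($usernames);
```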

Jeremy Harris

Using DOMDocument:

$doc = new DOMDocument();
libxml_use_internal_errors(true);
$doc->loadHTML('<div class="search-result-details">
    <div class="employee-name">This is my name!</div>
    <ul class="employee-details">
      <li><span class="label">Login</span>username</li>
      <li><span class="label">Employee ID</span>####</li>
      <li><span class="label">Barcode ID</span>###</li>
      <li><span class="label">Status</span>Active</li>
    </ul>
    <ul class="org-details">
      <li><span class="label">Location</span>SAT1 (755)</li>
      <li><span class="label">Shift</span>AAAA</li>
      <li><span class="label">Department</span>1231</li>
      <li><span class="label">Area</span>26</li>
      <li><span class="label">Crew</span>0</li>
      <li><span class="label">Supervisor</span>manager name</li>
    </ul>
  </div>
</a></li>
                    </ol>
                </div>');
libxml_use_internal_errors(false);

$html = new DOMXPath($doc);
$result = '';
foreach ($html->query("//*[@class='label']") as $value) {
    if ($value->textContent == 'Login') {
        $result = $value->nextSibling->textContent;
        break;
    }
}

echo $result;

Output:

username

libxml_use_internal_errors(true) suppresses the warnings libxml would otherwise emit for invalid markup (such as the unmatched closing tags at the end of the sample), as outlined in this answer.
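
As a possible variation, the loop can be replaced by a single XPath expression that selects the text node right after the "Login" label. This is only a sketch: the `$htmlSource` fragment is a shortened, made-up stand-in for the contents of the file (for example `file_get_contents('log.txt')`), and the variable names are my own.

```php
<?php
// Hypothetical fragment standing in for file_get_contents('log.txt').
$htmlSource = '<ul class="employee-details">
  <li><span class="label">Login</span>username</li>
  <li><span class="label">Employee ID</span>1234</li>
</ul>';

$doc = new DOMDocument();
$previous = libxml_use_internal_errors(true); // collect parse warnings instead of printing them
$doc->loadHTML($htmlSource);
libxml_clear_errors();
libxml_use_internal_errors($previous);        // restore the earlier setting

$xpath = new DOMXPath($doc);
// First text node following a <span class="label"> whose text is "Login".
$nodes = $xpath->query(
    "//span[@class='label'][normalize-space()='Login']/following-sibling::text()[1]"
);

$login = $nodes->length > 0 ? trim($nodes->item(0)->nodeValue) : '';
echo $login, "\n";
```

Checking `$nodes->length` first avoids calling a method on null when the label is not present in the document.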

mister martin