I'm trying to improve my programming (PHP) skills by working on the following challenge. The question is not really about a code problem, nor am I asking for code; it is about the programming logic that should be applied.

    (9,'zxvvgf@housecapades.com',0,0,1,1,0,1,1),
    (10,'qwer@rogers.co.uk',1,0,0,1,0,0,1),
    (11,'lorenIpsum@hotmail.com',0,0,0,1,0,0,1),
    (12,'BarackObama@googlemail.co.uk',1,0,9,1,1,1,1),
    (13,'DonaldTrump@courtesysupportteam.net',0,0,9,1,1,1,1),
    (15,'Mcaine@mynet.com',1,0,9,1,1,1,1),
    (16,'davestra_@hotmail.com',0,0,0,1,0,0,1),
    (17,'lewisHamilton@carracing.co.uk',1,0,9,1,1,1,1)

Objective

Given the data dump above, I need to find a way to extract ONLY the email addresses which end in .co.uk and insert them into a db table. In this example there are only 2 email addresses ending in .co.uk, namely BarackObama@googlemail.co.uk and lewisHamilton@carracing.co.uk

The Problem

I'm having a tough time figuring out how to tackle this problem because:

  1. The email addresses do not all have the same number of characters
  2. There is irrelevant data, more specifically numbers, between the email addresses, which should be ignored

My Logic / Pseudo Code

  1. Find a common denominator in the rows (I noticed the first column holds integers that increase from row to row) and use it to assign values to the variables $min and $max (in this example $min=9 and $max=17)

  2. Use the variables assigned above to loop over the rows, advancing by one with each iteration

  3. Inside the loop, ignore all characters that are of type integer

  4. Use preg_match to find an email address that ends with .co.uk

  5. If found, add it to the array $couk_emails, else continue with the next row

  6. When the loop ends, upload the array $couk_emails to the DB table

That's the logic / pseudo code I came up with, however it feels flawed to me. I consider this to be quite a difficult challenge, so I would love to hear from experienced programmers how they would tackle this type of problem

Note all Email addresses in this post are fictional / made up to the best of my knowledge

Timothy Coetzee
  • If you can loop over it one row at a time, then for a single row like `9,'zxvvgf@housecapades.com',0,0,1,1,0,1,1` you can do one of the following: 1) use `strpos/stripos` to determine whether it contains `.co.uk`; 2) split the contents on `,` and run `preg_match` on `$split[1]` – Mubin Aug 25 '15 at 11:19
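
A minimal sketch of the two suggestions in the comment above, assuming each row is already available as its own string. The function name extractCoUkEmail is made up for illustration, and the address is assumed to be the second comma-separated field, as in the sample data.

```php
<?php
// Hypothetical helper: returns the .co.uk address found in one row, or null.
function extractCoUkEmail(string $row): ?string
{
    // Suggestion 1: cheap pre-check, skip rows that never mention .co.uk.
    if (stripos($row, '.co.uk') === false) {
        return null;
    }

    // Suggestion 2: strip the surrounding "(" and ")," and split on commas.
    $fields = explode(',', trim($row, " \t\r\n(),"));
    if (count($fields) < 2) {
        return null;
    }

    // Assumption: the address is the second field, wrapped in single quotes.
    $email = trim($fields[1], " '");

    // Confirm the address really ends in .co.uk (case-insensitive).
    return preg_match('/\.co\.uk$/i', $email) === 1 ? $email : null;
}

var_dump(extractCoUkEmail("(12,'BarackObama@googlemail.co.uk',1,0,9,1,1,1,1),"));
var_dump(extractCoUkEmail("(11,'lorenIpsum@hotmail.com',0,0,0,1,0,0,1),"));
```

The pre-check alone is not enough, because .co.uk could appear in the middle of an address; the final preg_match anchors the test to the end of the extracted field.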

3 Answers

3

Try using a regular expression for this.
Something like this:

'(.*?\.co\.uk)'

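Regex explanation: the two single quotes match the apostrophes around the address, the lazy .*? matches as few characters as possible, \.co\.uk matches a literal .co.uk, and the parentheses capture the address without the quotes.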

You can match strings against a regex in PHP using the preg_match function.
Testing this with a simple example:

>>> $regex = "/'(.*?\.co\.uk)'/"
>>> $str = "(12,'BarackObama@googlemail.co.uk',1,0,9,1,1,1,1),"
>>> preg_match($regex, $str, $match)
=> 1
>>> $match
=> [
       "'BarackObama@googlemail.co.uk'",
       "BarackObama@googlemail.co.uk"
   ]

EXPLANATION
In the above code, preg_match takes the $regex and the $str to match and returns 1 if the string matched and 0 if it did not (or false on error).

To extract the email part of the string and discard the rest (like the single quotes used in the regex), you need to put the corresponding part inside a capturing group, which is returned in the array of matches in the third parameter (the $match variable in the above example).

Finally, $match[0] contains the whole string matched by the regex and $match[1] contains only the email.

Kamehameha
2

The first three steps in your algorithm are useless.

I assume you already have the data split into lines. If not, you can use explode() to split the text into lines.

The algorithm:

  1. Create an empty list (array) to hold the results;
  2. Use foreach to loop the input list;
  3. Use preg_match() to detect whether the email address on the current line ends in .co.uk; preg_match() also extracts the email address into a variable;
  4. If step 3 matches, put the extracted email address into the output list (created in step 1);
  5. That's all. Do whatever you need with the list of emails; put them into the database, display them, ignore them, it doesn't matter. Any processing you do at this step is not part of this algorithm; it is either a new algorithm or, together with this one, just one step of a larger process.

The code:

$text = "(9,'zxvvgf@housecapades.com',0,0,1,1,0,1,1),
(10,'qwer@rogers.co.uk',1,0,0,1,0,0,1),
(11,'lorenIpsum@hotmail.com',0,0,0,1,0,0,1),
(12,'BarackObama@googlemail.co.uk',1,0,9,1,1,1,1),
(13,'DonaldTrump@courtesysupportteam.net',0,0,9,1,1,1,1),
(15,'Mcaine@mynet.com',1,0,9,1,1,1,1),
(16,'davestra_@hotmail.com',0,0,0,1,0,0,1),
(17,'lewisHamilton@carracing.co.uk',1,0,9,1,1,1,1)";


$input  = explode("\n", $text);    // 0. prepare the input data
$output = array();                 // 1. prepare the output
foreach ($input as $line) {        // 2. loop over the input
    $match = array();
    if (preg_match("/'([^']*\\.co\\.uk)'/", $line, $match)) {   // 3. check if matches
        $output[] = $match[1];     // 4. put the extracted email address aside
    }
}
print_r($output);                  // 5. print the results for visual validation

The output:

Array
(
    [0] => qwer@rogers.co.uk
    [1] => BarackObama@googlemail.co.uk
    [2] => lewisHamilton@carracing.co.uk
)

Surprise! There are three email addresses ending in .co.uk.


Update:

The question clearly states it's not about the code, it's about the logic behind the code. What follows is an addendum that doesn't answer the question; it shows the capabilities of PHP functions.

Inspired by the OP's comment about the input data not necessarily being a set of lines but one big text, I suggest the following code, which runs much faster than the code above but doesn't improve anybody's logic skills:

$match = array();
preg_match_all("/'([^']*\\.co\\.uk)'/", $text, $match);
print_r($match[1]);

It uses the same regular expression, this time with preg_match_all(). preg_match_all() puts the matched fragments (the emails surrounded by apostrophes) in $match[0] and the fragments that match the expression in parentheses in $match[1]. The latter is the expected output.

axiac
  • Thank you sir, you clearly know your stuff. Before I accept your answer I have one more question, if you don't mind. In your code you used `$input = explode("\n", $text); ` `//to prepare the input data.` Let's assume the rows are not each on a new line. Am I correct in saying I can do the following: `$input = explode("),", $text); ` to prepare the input data, since each row ends with `),`? – Timothy Coetzee Aug 25 '15 at 11:54
  • Indeed, if all the data is on the same line you must find a different separator to split it to lines. You are right, `),` could be used as separator too. – axiac Aug 25 '15 at 12:10
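
The exchange above can be sketched as follows, assuming the whole dump sits on a single line. The sample $text is a shortened copy of the data from the question, and the regular expression is the one used in the answer.

```php
<?php
// Shortened sample from the question, deliberately on one line.
$text = "(10,'qwer@rogers.co.uk',1,0,0,1,0,0,1),"
      . "(11,'lorenIpsum@hotmail.com',0,0,0,1,0,0,1),"
      . "(12,'BarackObama@googlemail.co.uk',1,0,9,1,1,1,1)";

// Every row except the last ends in "),", so it can act as the separator.
$rows   = explode('),', $text);
$output = array();
foreach ($rows as $row) {
    $match = array();
    if (preg_match("/'([^']*\\.co\\.uk)'/", $row, $match)) {
        $output[] = $match[1];   // keep only the captured address
    }
}
print_r($output);
```

explode() drops the separator from each piece, which does no harm here because the regular expression relies only on the apostrophes around the address.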
0

That's all:

select * from emailtable e where e.email LIKE '%.co.uk';

Or store the address in reverse order in a second field, so that MySQL can use an index on it:

update emailtable e set e.remail = reverse(e.email);

select * from emailtable e where e.remail LIKE 'ku.oc.%';
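
To see why the reversed column helps, here is a small PHP illustration of the same idea (not the SQL itself). Once an address is reversed, "ends in .co.uk" becomes "starts with the reversed suffix", and a prefix test is the kind of lookup an index can serve. The variable names are illustrative, and strrev works byte by byte, so the sketch assumes plain ASCII addresses.

```php
<?php
$emails = array(
    'qwer@rogers.co.uk',
    'lorenIpsum@hotmail.com',
    'lewisHamilton@carracing.co.uk',
);

$prefix  = strrev('.co.uk');   // the reversed suffix to look for
$matches = array();
foreach ($emails as $email) {
    $reversed = strrev($email);   // what the extra column would store
    // Prefix comparison on the reversed value == suffix comparison on the original.
    if (strncmp($reversed, $prefix, strlen($prefix)) === 0) {
        $matches[] = $email;
    }
}
print_r($matches);
```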
Bernd Buffen