2

I've been trying to validate over 1 million randomly generated values (strings) with PHP and a client side programming language on an online form, but there are a few challenges I'm facing:

PHP

Link to the (editable) PHP code: https://3v4l.org/AtTkO
The PHP code:

<?php
function generateRandomString($length = 10) {
    $characters = '0123456789abcdefghijklmnopqrstuvwxyz-_.';
    $charactersLength = strlen($characters);
    $randomString = '';
    for ($i = 0; $i < $length; $i++) {
        $randomString .= $characters[rand(0, $charactersLength - 1)];
    }
    return $randomString;
}

$unique = array();
for ($i = 0; $i < 9000000; $i++) {
    $u = $i + 1;
    $random = generateRandomString(5);
    if (!in_array($random, $unique)) {
        echo $u . ".m" . $random . "@[server]\n";
        $unique[] = $random;
        gc_collect_cycles();
    } else {
        echo "duplicate detected";
        $i--;
    }
}
echo memory_get_peak_usage();

What should happen:

  1. New 5 character value gets randomly generated
  2. Value gets checked if it already exists in the array
  3. Value gets added to array
  4. All randomly generated values are exported to a .txt file to be used for validating. (Not in the script yet)

What actually happens:

I hit either a memory usage limit or a server timeout for the execution time.

What I've tried

  • I've tried using sleep(3) during the for loop.
  • Setting Memory limit to -1 and timeout to 0. The unlimited memory doesn't make a difference and is too dangerous in a working environment.
  • Using gc_collect_cycles() during the for loop
  • Using echo memory_get_peak_usage(); -> I don't really understand how I could use this for debugging.

What I need help with:

  • Memory management in PHP
  • Having pauses in the script that will reset the PHP execution timer

Client Side Programming language

This is where I have absolutely no clue which way I should go or which programming language I should use for this.

What I want to achieve

  • Load a webpage that has a form
  • Load the .txt with all randomly generated strings
  • fill in the form with the first string
  • submit the form:
    • If positive response from form > save string in special .txt file or array, go to the next value
    • If negative response from form > delete string from file, go to the next value | or just go to the next value
  • All values with a positive response are filtered out and easily accessible at the end.

I don't know which programming language I should use for this part. I've been thinking about JavaScript and Python, but I'm not sure how I could combine either with PHP. A nudge in the right direction would be appreciated.
 
I might be completely wrong for trying to achieve this with PHP, if so, please let me know what would be the better and easier option. Thanks!

4 Answers

1

Interesting question. Whenever you think about a solution like this, one of the first things to consider is whether it can be async. If it can, your implementation will likely be simple; otherwise you will likely have to pay huge server costs or serve random cached results.

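NB: remove gc_collect_cycles(). It only collects unreachable reference cycles, and your array holds none, so calling it on every iteration frees nothing and just adds overhead. You hardly ever need to call it manually.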

That being said, the approach I would recommend in your case is as follows:

  1. Use a websocket that is opened only once in the client's browser, and forward results in real time from the server to the browser. Of course, this code can run entirely on the client side in JavaScript, so if it's not just a PoC, you can convert the PHP code to JavaScript.

  2. Change your code to yield items or forward results via websocket once a generated code has been confirmed as unique.
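A minimal sketch of the "yield" idea in step 2, under a few assumptions: randomCode() and uniqueCodes() are hypothetical helper names, the alphabet is copied from the question, and the websocket forwarding is left out. Codes are stored as array keys so the duplicate check is a hash lookup rather than an in_array() scan. Note that the loop never finishes if more codes are requested than the alphabet can produce for the given length.

```php
<?php
// Sketch only: randomCode() and uniqueCodes() are hypothetical helpers.
function randomCode(int $length, string $alphabet): string
{
    $max = strlen($alphabet) - 1;
    $code = '';
    for ($i = 0; $i < $length; $i++) {
        $code .= $alphabet[mt_rand(0, $max)];
    }
    return $code;
}

function uniqueCodes(int $count, int $length = 5): Generator
{
    $alphabet = '0123456789abcdefghijklmnopqrstuvwxyz-_.';
    $seen = []; // codes are stored as keys, so isset() is a hash lookup
    while (count($seen) < $count) {
        $code = randomCode($length, $alphabet);
        if (isset($seen[$code])) {
            continue; // duplicate: draw again
        }
        $seen[$code] = true;
        yield $code; // hand the code to the caller as soon as it is confirmed unique
    }
}

// Usage: each code is consumed as it is produced.
foreach (uniqueCodes(3) as $n => $code) {
    echo ($n + 1) . '.m' . $code . "@[server]\n";
}
```

The caller could just as well write each yielded code to a file or push it over a socket instead of echoing it.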

However, if you're really only doing what the PHP code shows, you can do it entirely in JavaScript and save your server resources. See this answer for example code to replace your generateRandomString function.

Chibueze Opata
  • I figured as much. So you'd suggest using Ajax to send all values to the server side to create a .txt file once it's done on the client side? – Elisee Kamanzi Dec 07 '19 at 22:51
  • @EliseeKamanzi Not AJAX; that is costly too, since you would have to poll. Use a [websocket](https://socketo.me) instead. – Chibueze Opata Dec 07 '19 at 22:53
  • Thank you! I'll switch it over to javascript, open up a socket to pass on the values to a PHP script on the server side. A small worry: what about the RAM usage on the client side if I'd have to compare 1 value against 1 million existing values? I know with PHP, memory usage is heavily affected when the array grows over 10 000 items. – Elisee Kamanzi Dec 07 '19 at 23:02
  • @EliseeKamanzi It's really not that much. A 5-character string is about 5 bytes of raw data, so 1 million of them is only about 5 MB. It's the processing and testing of each string that takes compute time. – Chibueze Opata Dec 07 '19 at 23:09
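To put a number on this in PHP, memory_get_usage() can be read before and after filling an array. The following is only a sketch: measureArrayMemory() is a hypothetical helper, and it uses deterministic 5-character keys instead of random ones so the run is repeatable. PHP keeps a hash-table slot and a string header for every entry, so the per-entry figure it prints will be noticeably higher than the 5 bytes of raw characters.

```php
<?php
// Hypothetical helper: fills an array with $n distinct 5-character keys
// and reports how much the script's memory usage grew.
function measureArrayMemory(int $n): array
{
    $before = memory_get_usage();
    $seen = [];
    for ($i = 0; $i < $n; $i++) {
        // base-36 counter padded with '_' gives distinct 5-character strings
        $key = str_pad(base_convert((string) $i, 10, 36), 5, '_', STR_PAD_LEFT);
        $seen[$key] = true;
    }
    $after = memory_get_usage();

    return ['entries' => count($seen), 'bytes' => $after - $before];
}

$result = measureArrayMemory(100000);
printf(
    "%d entries, about %.1f bytes per entry\n",
    $result['entries'],
    $result['bytes'] / $result['entries']
);
```

Multiplying the per-entry figure by the number of strings you plan to hold gives a rough idea of what memory limit the script needs.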
0

Assuming you have the ability to edit the php.ini:

Increase your memory limit as described here: PHP MEMORY LIMIT INCREASE

Community
  • Tried this, but the for loop is too long. Even with an unlimited limit (-1) the server stops the script from running. I've edited my question to reflect this. – Elisee Kamanzi Dec 07 '19 at 22:46
0

For the 'memory limit' see here

and for the execution timeout, add:

set_time_limit(0);

at the top of the PHP file.
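As a rough sketch of how that call could sit in a script that also exports the values to a .txt file (step 4 in the question): exportCodes() and the output path are assumptions for illustration, and the duplicate check uses array keys with isset() rather than in_array(). Each code is written to the file as soon as it is accepted, so nothing has to be echoed or buffered.

```php
<?php
set_time_limit(0); // 0 removes the execution time limit for this script

// Hypothetical helper: writes $count unique codes to $path, one per line.
function exportCodes(string $path, int $count, int $length = 5): int
{
    $alphabet = '0123456789abcdefghijklmnopqrstuvwxyz-_.';
    $max = strlen($alphabet) - 1;

    $handle = fopen($path, 'w');
    if ($handle === false) {
        throw new RuntimeException("Cannot open $path for writing");
    }

    $seen = []; // codes stored as keys for fast duplicate checks
    while (count($seen) < $count) {
        $code = '';
        for ($i = 0; $i < $length; $i++) {
            $code .= $alphabet[mt_rand(0, $max)];
        }
        if (!isset($seen[$code])) {
            $seen[$code] = true;
            fwrite($handle, $code . "\n");
        }
    }
    fclose($handle);

    return count($seen);
}

$path = sys_get_temp_dir() . '/codes.txt'; // assumed output location
echo exportCodes($path, 1000) . " codes written to $path\n";
```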

Zakari
  • Tried this for the memory limit, but the for loop is so big that even a limit of -1 is not enough. I'd have to separate the for loop into big chunks to free some memory. – Elisee Kamanzi Dec 07 '19 at 22:47
-1

Have you tried using sets? https://www.php.net/manual/en/class.ds-set.php

Sets are very efficient whenever you want to ensure a value isn't present twice.

Checking for the presence of a value in a set is far faster than looping over all entries in an array.

I'm not an expert with PHP, but it would look something like this in Ruby:

require 'set'

# Same alphabet as the PHP code in the question.
CHARS = '0123456789abcdefghijklmnopqrstuvwxyz-_.'.chars
unique = Set.new

def generate_random_string(length = 10)
  Array.new(length) { CHARS.sample }.join
end

# Set#add ignores values that are already present,
# so duplicates are dropped without an explicit check.
while unique.length < 1_000_000
  unique.add(generate_random_string(5))
end
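For PHP itself, Ds\Set from the linked page is provided by the ds extension; without it, plain array keys checked with isset() behave like a set. The sketch below compares the two lookup styles on the same data. compareLookups() is a hypothetical helper and the sizes are arbitrary; in_array() scans the list linearly, while isset() on a key is a hash lookup, so the gap widens as the collection grows.

```php
<?php
// Hypothetical benchmark helper: times the same lookups against a list and a key set.
function compareLookups(int $size, int $lookups): array
{
    $list = [];
    $set = [];
    for ($i = 0; $i < $size; $i++) {
        $list[] = 'v' . $i;     // plain list, searched with in_array()
        $set['v' . $i] = true;  // values stored as keys, checked with isset()
    }

    // Roughly half of the needles exist in the collection.
    $needles = [];
    for ($i = 0; $i < $lookups; $i++) {
        $needles[] = 'v' . mt_rand(0, 2 * $size - 1);
    }

    $start = microtime(true);
    $hitsList = 0;
    foreach ($needles as $needle) {
        if (in_array($needle, $list, true)) {
            $hitsList++;
        }
    }
    $listSeconds = microtime(true) - $start;

    $start = microtime(true);
    $hitsSet = 0;
    foreach ($needles as $needle) {
        if (isset($set[$needle])) {
            $hitsSet++;
        }
    }
    $setSeconds = microtime(true) - $start;

    return [
        'hits_list' => $hitsList,
        'hits_set' => $hitsSet,
        'in_array_seconds' => $listSeconds,
        'isset_seconds' => $setSeconds,
    ];
}

print_r(compareLookups(20000, 2000));
```

Both approaches find the same values; only the time spent per lookup differs.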

hope it helps

Jakikiller