
I am trying to find a better algorithm than this PHP function, because it makes the memory limit explode when running on a string that is 1000000000000 characters long:

substr_count($string, $needle, $offset, $length)

Any help will be appreciated.

  • If it is a performance issue, you could first check whether the needle is present at all (https://stackoverflow.com/a/5821716/5026957), and only count occurrences when it is found. Another alternative is to use `explode()` to split the target string on the needle and apply `count() - 1` to the resulting array, as sketched below these comments. – HarisH Sharma Oct 07 '19 at 08:09
  • *"is making memory limit explode"*: How much memory do you actually have available? – trincot Oct 07 '19 at 08:28
  • Please produce a minimal verifiable example. Also, where is the bottleneck? In file reading (as in taking the string as input) or in the substr_count() processing? – nice_dev Oct 07 '19 at 09:19
  • The problem, @harishsharma, is that I am counting the number of 'a' characters in that 1000000000000-character string, so there is no way to divide it; and if I did divide it, I would have to count in every chunk, which amounts to the same work. – Mostafa A. Hamid Oct 08 '19 at 01:26
  • @trincot I do not know the exact memory limit; I got this in a HackerRank problem-solving question, and the memory limit of the machine hosting the code is not disclosed. – Mostafa A. Hamid Oct 08 '19 at 01:29
  • @vivek_23 Honestly, I do not read from a file; the 1000000000000-character string is passed directly as a parameter to the implementation function. – Mostafa A. Hamid Oct 08 '19 at 01:30
  • You mean that on the HackerRank server they pass your code a string that takes 1 TB of memory!? Even in 2019 that still sounds extreme. Can you give the link? – trincot Oct 08 '19 at 05:00
  • @MostafaA.Hamid If there are major performance issues while the user is on the website, you can run your PHP algorithm in the background, or you can use a Shell/Bash script and run it via PHP. I have also used Shell/Bash scripts for long calculations and report processing. – HarisH Sharma Oct 09 '19 at 05:09
  • @harishsharma No, it is just an algorithm question on hackerrank.com – Mostafa A. Hamid Oct 16 '19 at 05:32
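
For reference, the `explode()` trick from the first comment can be sketched like this. It still holds the entire string in memory, so it addresses only the counting step, not the memory limit; the sample values are illustrative, not from the question:

```php
<?php
// Illustrative haystack and needle, not the actual HackerRank input.
$haystack = 'banana';
$needle   = 'a';

// explode() splits the string on the needle; N occurrences
// always produce N + 1 parts, hence count() - 1.
$occurrences = count(explode($needle, $haystack)) - 1;  // 3
```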

1 Answer


If you have such a huge file, you will indeed run into problems.

I think you should write your own code using a file handle, and read the data in chunks.

Have a look at fopen() and fread():

Open a file handle to your huge file with fopen, then read chunks of data with fread. You must provide the logic yourself to handle the case where the string you are looking for is broken between two chunks of data.

That way you will NOT have the whole file in memory, only the parts you read in with fread().

fopen()

fread()
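
A minimal sketch of that chunked approach, assuming the input is available as a file; the function name, chunk size, and overlap handling are my own illustration rather than part of the answer:

```php
<?php
// Count occurrences of $needle in a (potentially huge) file without
// loading the whole file into memory. The 1 MiB chunk size is an assumption.
function countNeedleInFile(string $path, string $needle, int $chunkSize = 1048576): int
{
    $handle = fopen($path, 'rb');        // read-only, binary-safe
    if ($handle === false) {
        throw new RuntimeException("Cannot open $path");
    }

    $count   = 0;
    $overlap = '';                       // tail carried over from the previous chunk
    $keep    = strlen($needle) - 1;      // bytes needed to catch a split needle

    while (!feof($handle)) {
        $data = fread($handle, $chunkSize);
        if ($data === false || $data === '') {
            break;                       // nothing more to read
        }
        // Prepend the carried-over tail so a needle broken between
        // two chunks is still seen by substr_count().
        $chunk  = $overlap . $data;
        $count += substr_count($chunk, $needle);
        // Keep the last strlen($needle) - 1 bytes for the next round.
        $overlap = $keep > 0 ? substr($chunk, -$keep) : '';
    }

    fclose($handle);
    return $count;
}
```

Carrying over exactly `strlen($needle) - 1` bytes is the smallest overlap that catches a needle split across two chunks, and because the overlap is shorter than the needle itself, it can never contain a complete match of its own, so nothing is counted twice.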

Erwin Moller