
Let's say that I stored dataIDs in a JSON file with 1,000,000 records.

My results.json = {"dataID":["1","2","3", ... "1000000"]}

I want to find ID "100000" in the array.

$file = file_get_contents('results.json');
$data = json_decode($file,true);
if(in_array('100000', $data['dataID']))
{
  echo "found";
} else {
  echo "not found"; 
}

It took about 0.6 sec. for the result.

Is there a faster way to search a JSON array like this?

Please give me an example!

Thank you in advance.

Update:

Although SQL would be much faster, consider 1,000,000 records in one table: the more records, the more space! At least a static file reduces server load and takes less space.

It depends on how your system is designed. Use it in the right place at the right time!

webmastx

5 Answers


Sure!

$stm = $pdo->prepare("SELECT 1 FROM data WHERE id = ?");
$stm->execute(array(100000));
if ($stm->fetchColumn()) {
    echo "found";
} else {
    echo "not found";
}

You will need to import your array into a database first.
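As a sketch of that one-time import step, here is one way to do it with PDO. SQLite (in memory) is used here only so the sketch runs without a server; the table name and the sample array are assumptions, and in practice `$data` would come from `json_decode` of results.json:

```php
<?php
// Sample array standing in for json_decode(file_get_contents('results.json'), true).
$data = ['dataID' => ['1', '2', '3', '100000', '1000000']];

$pdo = new PDO('sqlite::memory:');
$pdo->setAttribute(PDO::ATTR_ERRMODE, PDO::ERRMODE_EXCEPTION);
$pdo->exec('CREATE TABLE data (id INTEGER PRIMARY KEY)');

// One transaction for the whole import is far faster than
// committing each of the 1,000,000 rows individually.
$pdo->beginTransaction();
$ins = $pdo->prepare('INSERT INTO data (id) VALUES (?)');
foreach ($data['dataID'] as $id) {
    $ins->execute([$id]);
}
$pdo->commit();

// The lookup from the answer now hits the primary-key index.
$stm = $pdo->prepare('SELECT 1 FROM data WHERE id = ?');
$stm->execute([100000]);
echo $stm->fetchColumn() ? "found" : "not found";
```

The transaction matters: without it, SQLite (and most engines) would fsync once per inserted row.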

Your Common Sense
  • This certainly is _much_ slower, except when you reuse that database for multiple searches later on. – arkascha Aug 09 '13 at 11:23
  • yup, this is what databases are for. And JSON is not. – Your Common Sense Aug 09 '13 at 11:27
  • Yes, but do you know if that is what the OP wants? You posted this as an answer to his question... – arkascha Aug 09 '13 at 11:28
  • Note that although having a relational schema potentially allows for faster SELECT queries, it might take much effort to create and maintain such a schema if the tree has a lot of branches. The mongoDB alternative might not be so bad. – hek2mgl Aug 09 '13 at 11:31
  • `*sigh*`. whatever JSON arrays of 10000 elements aren't intended for search. But for data interchange only. While for search we use databases. Sincerely, Your Common Sense. – Your Common Sense Aug 09 '13 at 11:31
  • My reason for using json file is to avoid mysql database server overloading. I generated the data from mysql into json file so user can just read from the json file instead. – webmastx Aug 09 '13 at 11:38
  • @webmastx this is a perfect example of getting a contrary result out of good (but groundless) intentions. In fact you **have** got your server loaded. – Your Common Sense Aug 09 '13 at 11:40
  • Should also be noted that a db server should not overload if indexes are correctly set up. – Ben Aug 09 '13 at 11:43
  • I think you have to query with the proper search, then create the JSON to pass it over to your objects (which is the intention of JSON) – Royal Bg Aug 09 '13 at 11:43
  • @webmastx - although you had the best intentions of not overloading your database, who says that your solution was a proper one? It apparently creates problems. – N.B. Aug 09 '13 at 12:00
  • Thanks everyone for replying. Based on a shared web server, Which one is the correct way? search directly from mysql or from a static json file? If database is the best approach I may change to. Thank you once again. – webmastx Aug 09 '13 at 12:27
  • @webmastx apparently a database is the most robust solution. Databases are intended for search. And they have many means of speeding search up. – Your Common Sense Aug 09 '13 at 12:36
  • @YourCommonSense Having an API which will communicate just via JSON, and having a web 2.0 environment: meaning many updates with concurrent reads. I have tested it. I got better results with mongoDB. – hek2mgl Aug 10 '13 at 00:43
  • @YourCommonSense Seems that you can think and event :D LOL – hek2mgl Aug 10 '13 at 00:49
  • @YourCommonSense `SELECT 1 FROM data WHERE id = ?` Is this a power query? I ask because of the `?` – hek2mgl Aug 10 '13 at 11:17

Depending on the structure of the data in the results.json file, you may be able to do a simple string search. For example:

$file = file_get_contents('results.json');

if(strpos($file, '"100000"') !== false)
{
    echo 'found';
}
else
{
    echo 'not found';
}

After benchmarking your method I got around 0.78 seconds (on my slow local system); with this method I achieved around 0.03 seconds.

Like I say, it depends on your data structure, but if it does permit this method you'll see significant speed benefits.
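If the file ever contains more than one top-level key, a hedged refinement of this idea is to slice out just the dataID portion before searching. The sample string below (including the made-up userID key) stands in for the file contents:

```php
<?php
// Stand-in for file_get_contents('results.json'); the extra userID
// key is invented to show the false-positive risk of a plain strpos.
$file = '{"dataID":["1","2","3"],"userID":["100000"]}';

// Search only between the brackets of the dataID array, so a value
// under another key cannot produce a false "found".
$start = strpos($file, '"dataID":[');
$end   = strpos($file, ']', $start);
$slice = substr($file, $start, $end - $start);

// Keeping the quotes in the needle stops "100000" matching inside "1000000".
if (strpos($slice, '"100000"') !== false) {
    echo 'found';
} else {
    echo 'not found';
}
```

Here the whole-file search would report a false "found", while the sliced search correctly prints "not found".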

olliefinn
  • I created a file containing a json array in the same structure as that in the question with one million records in. Exactly like this `{"dataID":["1","2","3", ... "1000000"]}` – olliefinn Aug 09 '13 at 11:47
  • This doesn't count :D What if `{"dataID":["1","2","3",..."999999"],"userID":["1","2","3",...."1000000"]}` Then your script will return true, but it's not, his searching in "dataID", so you will need one more to use the json as an array to say in which key you want to search, which will slowdown the further strpos – Royal Bg Aug 09 '13 at 11:50
  • That's why I made it clear that this option was only an option in the case where the data in the file permitted it. You are right, in this case my suggestion wouldn't work. However using the data and structure in the question it does work. – olliefinn Aug 09 '13 at 11:53
  • Why the down vote? This is an option and it answers the question. – olliefinn Aug 09 '13 at 12:11
  • It's not me who downvoted; maybe they will comment with their arguments about it – Royal Bg Aug 09 '13 at 12:13
  • @user761389 Yes, strpos is faster than in_array; it took it from 0.6 to 0.3. I think because json_decode is never called! Thanks. – webmastx Aug 10 '13 at 04:05

Why don't you store the IDs as keys and then do:

if (isset($data['dataID']['100000'])) {
    // do something
}

Because checking if a key exists is a lot faster than looping through the array. You can check out this link for further information:

List of Big-O for PHP functions
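A minimal sketch of that idea, using array_flip to build the keyed array once; the sample array is an assumption standing in for the decoded results.json:

```php
<?php
// $data stands in for json_decode(file_get_contents('results.json'), true).
$data = ['dataID' => ['1', '2', '3', '100000', '1000000']];

// Flip once: values become keys, so each lookup is a hash probe
// instead of a linear scan. The flip itself is O(n), so this pays
// off when you run many lookups against the same data.
$index = array_flip($data['dataID']);

if (isset($index['100000'])) {
    echo 'found';
} else {
    echo 'not found';
}
```

Note that if you control how the file is generated, you can store the IDs as keys directly and skip the array_flip pass entirely.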

Muqito
  • You confuse array keys and values. – arkascha Aug 09 '13 at 11:21
  • I forgot to add "Why don't you store the id as keys and then do:" :) – Muqito Aug 09 '13 at 11:23
  • And why should that be faster? – arkascha Aug 09 '13 at 11:24
  • Searching for the key is only faster if the data structure is unordered! Otherwise it scales exactly the same, since the same algorithm is used. But you don't know if the OPs data structure is ordered or not. – arkascha Aug 09 '13 at 11:32
  • You sure? "isset/array_key_exists is much faster than in_array and array_search" "Lookups: isset( $array[$index] ) O(n) but really close to O(1) - it uses the same lookup as array_key_exists. Since it's language construct, will cache the lookup if the key is hardcoded, resulting in speed up in cases where the same key is used repeatedly. in_array O(n) - this is because it does a linear search though the array until it finds the value." Otherwise I learned something new then... – Muqito Aug 09 '13 at 11:34
  • I don't see how array_key_exists should come close to O(1), since no assumptions about the structure of the key set can be made in php... But I have not checked the algorithm, true. Caching certainly would help, but only for subsequent searches. – arkascha Aug 09 '13 at 11:38
  • Em, I was a bit fast with my last comment... Obviously searching in the key set will perform near O(log(n))... – arkascha Aug 09 '13 at 11:43
  • A lookup took me approximately 6.19*10^-6 s, but I created a list with assoc keys of "id-X" like "id-100000" :). I've always used this approach when searching through arrays in stuff. – Muqito Aug 09 '13 at 12:04
  • What does this show? You can get the same result when searching in the values, as said: it depends! There is no 'right' solution to this question because the OPs situation is not clear. – arkascha Aug 09 '13 at 12:09

A document-based database like mongoDB is what you should try instead of working with plain JSON files - if you are bound to the JSON format.

Note that mongoDB can keep the JSON object in memory, whereas plain PHP solutions have to parse the file again and again.

I see three performance boosts:

  • less disk IO
  • less parsing
  • index based searches
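A sketch of that lookup with the mongodb/mongodb library; the connection string, database, collection, and field names are all assumptions, one document per ID is assumed, and a running mongod is required:

```php
<?php
require 'vendor/autoload.php'; // composer require mongodb/mongodb

$client = new MongoDB\Client('mongodb://localhost:27017');
$ids = $client->mydb->dataID; // hypothetical database/collection names

// Documents shaped like {"value": "100000"}; the index makes the
// findOne below an index-based search rather than a collection scan.
$ids->createIndex(['value' => 1]);

if ($ids->findOne(['value' => '100000']) !== null) {
    echo 'found';
} else {
    echo 'not found';
}
```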
hek2mgl
  • This depends on the use case. Your approach _might_ be a good one, but how can you tell what the OPs situation is? – arkascha Aug 09 '13 at 11:22
  • The situation is that he wants to perform an index-based search in a JSON document (he told us that). Note that mongoDB can keep the JSON object in memory, whereas plain PHP solutions have to parse the file again and again – hek2mgl Aug 09 '13 at 11:23
  • Yep. But also that he _has_ that JSON structure. You want to ask him to convert it just for a single search? – arkascha Aug 09 '13 at 11:24

Maybe it's possible to predict the outcome; you could then use in_array to search for the value in a much smaller JSON array.

Otherwise you could try alternative search algorithms, though those can be complicated.
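For instance, since the IDs in the question appear in ascending numeric order, a binary search is one such algorithm. A sketch, assuming the array really is sorted that way (the sample array stands in for the decoded results.json):

```php
<?php
// Binary search over the decoded ID list. Assumes the IDs are in
// ascending numeric order, as in the question's example file.
function idExists(array $ids, int $needle): bool
{
    $lo = 0;
    $hi = count($ids) - 1;
    while ($lo <= $hi) {
        $mid = intdiv($lo + $hi, 2);
        $val = (int) $ids[$mid];
        if ($val === $needle) {
            return true;
        } elseif ($val < $needle) {
            $lo = $mid + 1;   // target is in the upper half
        } else {
            $hi = $mid - 1;   // target is in the lower half
        }
    }
    return false;
}

// Sample list standing in for the decoded results.json array.
$data = ['dataID' => ['1', '2', '3', '100000', '1000000']];
echo idExists($data['dataID'], 100000) ? 'found' : 'not found';
```

This does O(log n) comparisons (about 20 for 1,000,000 records) instead of in_array's linear scan, though json_decode itself remains the dominant cost if you re-parse the file on every request.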

Maxim