
I need to find each PMID in a string and extract its ID, but I am having trouble doing this in PHP.

I tried using a regex to identify the PMIDs, but couldn't get it to work:

    ob_start();
    include('getCallbyVkey.php');
    $output = ob_get_clean();
    $arr1 = explode('}', $output);
    foreach ($arr1 as $line_) {
        // if (strpos($line_, 'pmid')) {
        preg_match_all('/"pmid":#(\d+)/', $line_, $matches);
        print_r($matches);
    }

The data is as follows:

[{"author":"F\u00c3\u00bchrer S","pmid":"31401120","volume":"","issue":"","year":"2019","month":"Aug","journal":"Journal of molecular biology","journalabbrev":"J. Mol. Biol.","title":"Pathogenic Mutations Associated with Legius Syndrome Modify the Spred1 Surface and Are Involved in Direct Binding to the Ras Inactivator Neurofibromin.","order":"1","source":"PubMed"}
,{"author":"Yl\u00c3\u00a4-Outinen H","pmid":"31397088","volume":"","issue":"","year":"2019","month":"Aug","journal":"Molecular genetics & genomic medicine","journalabbrev":"Mol Genet Genomic Med","title":"Intestinal tumors in neurofibromatosis 1 with special reference to fatal gastrointestinal stromal tumors (GIST).","order":"2","source":"PubMed"}
,{"author":"Ahlawat S","pmid":"31396668","volume":"","issue":"","year":"2019","month":"Aug","journal":"Skeletal radiology","journalabbrev":"Skeletal Radiol.","title":"Current status and recommendations for imaging in neurofibromatosis type 1, neurofibromatosis type 2, and schwannomatosis.","order":"3","source":"PubMed"}
,{"author":"Ahlawat S","pmid":"31395668","volume":"","issue":"","year":"2019","month":"Aug","journal":"Neurology","journalabbrev":"Neurology","title":"Imaging biomarkers for malignant peripheral nerve sheath tumors in neurofibromatosis type 1.","order":"4","source":"PubMed"}
,{"pmid":"24033266","year":"2013","title":"A systematic approach to assessing the clinical significance of genetic variants.","author":"H Duzkale","clinacc":"RCV000218671","ClinicalSignificance":"benign","source":"ClinVar"}
,{"pmid":"25741868","year":"2015","title":"Standards and guidelines for the interpretation of sequence variants: a joint consensus recommendation of the American College of Medical Genetics and Genomics and the Association for Molecular Pathology.","author":"S Richards","clinacc":"RCV000218671","ClinicalSignificance":"benign","source":"Pubmed"}

The expected output is:

31401120
31397088
31395668
24033266
25741868
RRg

2 Answers

Decode the output with json_decode, as @Alex Howansky says.


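    // Decode the JSON array into a list of objects.
    $data = json_decode($output);
    foreach ($data as $row) {
        // Print each PMID on its own line, matching the expected output.
        echo $row->pmid, PHP_EOL;
    }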
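
A slightly more defensive variant of the same idea is sketched below. It assumes $output holds a complete, valid JSON array (including the closing bracket, which the excerpt in the question does not show). The function name extractPmids and the shortened sample string are made up for illustration and are not part of the original code.

```php
<?php
// Hypothetical helper: returns every "pmid" value found in a JSON array of records.
function extractPmids(string $json): array
{
    // Decode to associative arrays; json_decode returns null on invalid JSON.
    $rows = json_decode($json, true);
    if (!is_array($rows)) {
        return [];
    }
    // array_column skips rows that have no "pmid" key.
    return array_column($rows, 'pmid');
}

// Shortened sample in the same shape as the data in the question.
$output = '[{"author":"Ahlawat S","pmid":"31396668","source":"PubMed"},'
        . '{"pmid":"24033266","year":"2013","source":"ClinVar"}]';

foreach (extractPmids($output) as $pmid) {
    echo $pmid, PHP_EOL;
}
```

Decoding to arrays and using array_column avoids the explicit loop and does not fail if a record happens to lack a pmid field.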

Etin

Your pattern has a # where the data has a double quote. Try changing your regex to:

    preg_match_all('/"pmid":"(\d+)/', $line_, $matches);

This should do the trick, but as @Alex Howansky mentioned, you could also use json_decode.
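
If you stay with the regex, there is no need to explode the output first, because preg_match_all can scan the whole string in one call. The sketch below assumes the pmid values are always quoted digit strings, as in the sample data. The function name matchPmids and the shortened sample string are made up for illustration.

```php
<?php
// Hypothetical helper: pulls PMIDs out of raw text with a regex, without decoding JSON.
function matchPmids(string $text): array
{
    // \s* tolerates optional whitespace around the colon; (\d+) captures the digits.
    preg_match_all('/"pmid"\s*:\s*"(\d+)"/', $text, $matches);
    // $matches[1] holds the first capture group of every match.
    return $matches[1];
}

// Shortened sample in the same shape as the data in the question.
$output = '[{"author":"Ahlawat S","pmid":"31396668"},{"pmid":"24033266","year":"2013"}]';
echo implode(PHP_EOL, matchPmids($output)), PHP_EOL;
```

The captured IDs are in $matches[1]; $matches[0] contains the full matched text, including the "pmid": prefix.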

Jose Rojas