Using bash or php, how can I detect whether the last block of php in a file has a closing tag or not, regardless of trailing newlines or whitespace?

This is what I've got so far, but I can't figure out how to determine if what comes after the closing tag is more php or not.

#!/bin/bash
FILENAME="$1"
closed=false

# just checking the last 10 lines
# should be good enough for this example
for line in $(tail $FILENAME); do
  if [ "$line" == "?>" ]; then
    closed=true
  else
    closed=false
  fi
done

if $closed; then
  exit 1
else
  exit 0
fi

I wrote a few tests with a test runner script.

#!/bin/bash
for testfile in $(ls tests); do
  ./closed-php.bash tests/$testfile
  closed=$?
  if [ $closed -eq 1 -a "true"  == ${testfile##*.} ] || 
     [ $closed -eq 0 -a "false" == ${testfile##*.} ]; then
    echo "[X] $testfile"
  else
    echo "[ ] $testfile"
  fi
done

You can clone these files, but this is what I've got so far.

.
├── closed-php.bash
├── test.bash
└── tests
    ├── 1.false
    ├── 2.true
    ├── 3.true
    ├── 4.true
    ├── 5.false
    └── 6.false
  1. FALSE :

    <?php
    $var = 'value';
    
  2. TRUE :

    <?php
    $var = 'value';
    ?>
    
  3. TRUE :

    <?php
    $var = 'value';
    ?><!DOCTYPE>
    
  4. TRUE :

    <?php
    $var = 'value';
    ?>
    <!DOCTYPE>
    
  5. FALSE :

    <?php
    $var = 'value';
    ?>
    <!DOCTYPE>
    <?php
    $var = 'something';
    
  6. FALSE :

    <?php
    $var = 'value';
    ?><?php
    $var = 'something';
    

I'm failing 3 & 4 because I can't figure out if what comes after the closing tag is more php or not.

[X] 1.false
[X] 2.true
[ ] 3.true
[ ] 4.true
[X] 5.false
[X] 6.false
Jeff Puckett
  • Just curious - what do you need that for? Mixing PHP with markup in one file is a bad thing. And once that is corrected, you should have just a file with PHP, and it's recommended not to use the closing `?>` in such a case anyway – Marcin Orlowski Jul 14 '16 at 22:57
  • Unless it's interspersed PHP – Jonathan Jul 14 '16 at 22:59
  • What about `$str="?>";`? I can't see how this would ever be 100% possible –  Jul 14 '16 at 23:01
  • Remember that `<?` is also a valid PHP opening tag (if enabled) and XML uses the same opening tag. Can your files have XML besides PHP? Usually what you are trying to do is a lot more complex than you think and probably not worth the time. So, as @MarcinOrlowski asked, what do you need that for? – Alvaro Flaño Larrondo Jul 14 '16 at 23:01
  • @MarcinOrlowski I agree with you about the separation, but this is just a piece of a larger project where a bash script will be generating php code and appending files. I don't have any control over whether those files are mixed or not, so I need a check to determine whether I need to prefix my appendix with an opening tag or not. – Jeff Puckett Jul 14 '16 at 23:04
  • @AlvaroFlañoLarrondo luckily short tags can be disabled in `php.ini`, and this is what I always do. Using short tags is asking for trouble – Marcin Orlowski Jul 14 '16 at 23:05
  • @AlvaroFlañoLarrondo maybe I'm nearsighted for this task, but I have a feeling I don't need to check for opening tags - just closing. But I wouldn't be surprised to see a good solution that used them. – Jeff Puckett Jul 14 '16 at 23:06
  • @JeffPuckettII Why not make the code generator always write `?>`? To me it looks like you are fixing things in the wrong place. – Marcin Orlowski Jul 14 '16 at 23:08
  • @MarcinOrlowski because if the tag is already closed, then inserting an additional `?>` would be an unwanted artifact. – Jeff Puckett Jul 14 '16 at 23:09
  • @Dagon I don't follow your drift, what do you mean by `$str="?>";` – Jeff Puckett Jul 14 '16 at 23:10
  • I mean that in the file you could have a string containing that. –  Jul 14 '16 at 23:12
  • @JeffPuckettII But a tag does not close itself. How come the PHP code block you are taking for the merge is not closed with `?>` before you merge it? Fix it there and make it always use `?>` – Marcin Orlowski Jul 14 '16 at 23:13
  • @Dagon ah I see, yes, this will make the distinction tricky for test 3. – Jeff Puckett Jul 14 '16 at 23:13
  • According to your question, your concerns should be tests 5 and 6. Again, your question is "how can I detect whether a php tag has been closed in a file, regardless of trailing newlines or whitespace?" Well, tests 5 and 6 both have closing php tags, yet you do not seem to be concerned about those. I think you should reformulate your question, as it seems it is not what you really are hoping to achieve. – Radmation Jul 14 '16 at 23:14
  • @Dagon actually the example you gave in your former comment is not problematic, but it is enough to spread the string across lines (by adding LFs after the opening `"` and before the closing `"`) and things can start giving false positives – Marcin Orlowski Jul 14 '16 at 23:15
  • @Radmation notice that in tests 5 and 6 the tags are reopened, so I'm trying to detect whether the last block of php is closed or not. I will try to rephrase/clarify my question, thanks. – Jeff Puckett Jul 14 '16 at 23:18
  • Do you want test 3 and 4 to pass or fail? I am assuming that you just want to make sure that if a block of php is opened that it must be closed in the file? Is that accurate? – Radmation Jul 14 '16 at 23:21
  • @Radmation the pass/fail is just to determine whether my `closed-php.bash` script is successful in determining whether a given file ends with a closed block (exit status 1 if closed or 0 if open). The test files' extensions are named by convention `true` if closed `false` if left open, so the test runner can determine the pass/fail by comparing those values. – Jeff Puckett Jul 14 '16 at 23:24
  • Thanks for the response. Do you want to flag a file for being left open in case 3 or 4? Or is that considered closed? I am asking because your question doesn't address it, but you are having a problem with tests 3 and 4. I think your question is this: "How can I determine that the last characters of a file are '?>' with the exception of trailing whitespace?" – Radmation Jul 14 '16 at 23:30
  • @Radmation files 3 and 4 are both named `true` because they have closed tags. The script I have is currently incorrectly determining them as open, so that's why those two tests are failing. – Jeff Puckett Jul 14 '16 at 23:33
  • maybe interesting? http://php.net/manual/en/function.token-get-all.php Also: https://github.com/nikic/PHP-Parser. Also: http://stackoverflow.com/questions/5586358/any-decent-php-parser-written-in-php Why? They parse PHP and produce output that may be easier to identify what you want? – Ryan Vincent Jul 15 '16 at 00:31
  • @Ryan thank you for those fabulous links! They do look very helpful :) – Jeff Puckett Jul 15 '16 at 01:28
  • @RyanVincent thanks again!!! This only took me about 15 minutes after looking at your link for [`token_get_all`](http://php.net/manual/en/function.token-get-all.php). I posted my answer for others. – Jeff Puckett Jul 15 '16 at 22:58

1 Answer

Thanks to Ryan Vincent's comment, this was pretty easy using `token_get_all`: walk the tokens and remember whether the most recent tag seen was an opening or a closing one.

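<?php
// Tokenize the file named by the first command-line argument.
$tokens = token_get_all(file_get_contents($argv[1]));
// Exit status: 1 if the last PHP block is left open, 0 otherwise.
$return = 0;
foreach ($tokens as $token) {
  // Tags arrive as array tokens; single-character tokens are plain strings.
  if (is_array($token)) {
    if ($token[0] === T_CLOSE_TAG) {
      $return = 0;
    } elseif ($token[0] === T_OPEN_TAG) {
      $return = 1;
    }
  }
}
exit($return);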

I even added a few more tests, and you can see the full solution here.
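
For comparison, here is a rough bash-only sketch of the same idea, in case PHP isn't available where the generator runs. It is a plain text heuristic, not a parser: it assumes only full `<?php` opening tags are used (no `<?` or `<?=`), and it will be fooled by a `?>` or `<?php` inside a string or comment, which is exactly the case raised in the comments. The function names `last_block_closed` and `append_php` are made up for this example, and the function uses the usual shell convention (status 0 means closed), not the exit codes of the scripts above.

```bash
#!/bin/bash
# Heuristic sketch (assumption: only "<?php" opening tags, no "?>" in strings).

# Succeeds (status 0) if the last PHP block in file $1 is closed,
# or if the file contains no "<?php" at all.
last_block_closed() {
  local content rest
  content=$(<"$1")                           # whole file; trailing newlines are dropped
  [[ $content == *'<?php'* ]] || return 0    # no PHP block, so nothing is left open
  rest=${content##*'<?php'}                  # text after the last "<?php"
  [[ $rest == *'?>'* ]]                      # closed if a "?>" follows it
}

# Appends PHP code $2 to file $1, adding an opening tag only when needed.
append_php() {
  local file=$1 code=$2
  if last_block_closed "$file"; then
    printf '\n<?php\n%s\n' "$code" >> "$file"
  else
    printf '\n%s\n' "$code" >> "$file"
  fi
}
```

After sourcing this, something like `append_php tests/2.true '$var = 1;'` would append the code with a fresh `<?php` in front, while the same call on `tests/1.false` would append the code alone. For anything beyond simple generated files, the tokenizer approach is the safer one.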

Jeff Puckett