I have been looking at this problem for 3 days now, trying anything close or related to see if I can get it working.

I am pulling EXIF tags from JPG files with PHP and using them in a simple mouseover script, but that is where it breaks down.

Getting the EXIF Data

foreach($images as $img){

    $exif = exif_read_data($img, 0, true);

During my testing I simplified the variable names. I don't think this is necessary, but this is where I am now:

$Ititle =  $exif['IFD0']['Title'];
    $Isubject = $exif['IFD0']['Subject'];
    $Icomment = $exif['IFD0']['Comments'];
    $m = "<p>Title: ".$Ititle."<BR>Subject: ".$Isubject."<BR>Comments: ".$Icomment."</p>";
    echo $m;

This echo of $m works: the Title / Subject / Comments from the JPG image look as expected.

So I have a thumbnail image with a mouseover that changes the big image (named preview1, preview2, preview3, ...) to the image you mouse over, and changes a <p> to the correct title / subject / comments:

<img onmouseover="document.getElementById('exifdata<?echo $b;?>').innerHTML = '<?echo $m;?>'; preview<?echo $b;?>.src=img<?echo $p;?>.src" name="img<?echo $p;?>" src="<?echo $img;?>" style="float:left; margin-right:10px; Max-width: 100px; Max-height:100px; width:auto; height:auto;">

The image and text change works on rollover, but in the <p> the text shows up like this: Title: H�a�p�p�y� �C�o�u�p�l�e��� Subject: E�n�j�o�y�i�n�g� �I�N� �S�u�m�m�e�r�s��� Comments:

Here is the DIV where the text changes

<div style="width:680px; height:auto; overflow:hidden; background: rgba(66, 95, 149, 1);">
    <p id='exifdata<?echo $b;?>'>testing
    </p>
      </div>

What is adding these question marks after the text gets passed through innerHTML?

And here is the entire PHP page (sorry if it is messy, I have been trying a lot of things):

<html>
<head>
<link rel="stylesheet" type="text/css" href="style.css">
</head>
<body>
<?php
header('Content-Type:text/html; charset=UTF-8');
$b = 1;

$blogs = array_filter(glob('./Content/*'), 'is_dir');
foreach($blogs as $entries){
    /*
    print "<br>";
    print $entries;
    print "<br>";
    print "<p>Images</p>";
    */
    #get all the JPG s in the blog folder
    $images = array_filter(glob("$entries/*.JPG"));

    #get Textblock and title txt files for verbage....
    $textblock = file_get_contents("$entries/Textblock.txt");
    $title = file_get_contents("$entries/Title.txt");

    #get date for post
    $PostDatestr = substr($entries,-8);
    $PostDate = date("d M Y", strtotime($PostDatestr));

    #Create Entry regardless of type:
    ?>
<div id="Notice">
  <div id="Title"><h2><?echo $title;?></h2></div>
    <section class="Wrapper">
      <header class="Wrapper"><h1><?echo $PostDate;?></h1></header>
        <article>
    <?


    #print the Blog post....  if one or less photos in DIR
    if (count($images) <= 1){
    #Don't use img tag if there are 0 images.
        if (count($images) === 1){
            ?><img src="<?echo $images[0];?>" style="float:left; margin-right:10px; Max-width: 680px; Max-height:680px; width:auto; height:auto;">
            <?
        }
        echo $textblock;

    }
    #print the Blog post.... if there is more than 1 photo in DIR
    if (count($images) > 1){
        #get info for each photo

        ?>
        <div class="thumbnails" style="width;100%; height:auto; display:block; overflow:hidden;">
        <?
        foreach($images as $img){

        $exif = exif_read_data($img, 0, true);
        $Ititle =  $exif['IFD0']['Title'];
        $Isubject = $exif['IFD0']['Subject'];
        $Icomment = $exif['IFD0']['Comments'];
        $m = "<p>Title: ".$Ititle."<BR>Subject: ".$Isubject."<BR>Comments: ".$Icomment."</p>";
        echo $m;
        #echo $exif===false ? "No header data found.<br />\n" : "Image contains headers<br />\n";
        ?>

        <img onmouseover="document.getElementById('exifdata<?echo $b;?>').innerHTML = '<?echo $m;?>'; preview<?echo $b;?>.src=img<?echo $p;?>.src" name="img<?echo $p;?>" src="<?echo $img;?>" style="float:left; margin-right:10px; Max-width: 100px; Max-height:100px; width:auto; height:auto;">
                <?
        $p++;
        $lastimg = $img;
        }
        ?>
        </div>
        <br><br>

        <div class="preview<?echo $b;?>" align="center" Style="width:640px; margin:0 auto; overflow:hidden;">
            <img name="preview<?echo $b;?>" src="<?echo $lastimg;?>" style="float:left; margin-right:10px; Max-width: 680px; Max-height:680px; width:auto; height:auto;" alt=""/>
        </div>
        <div style="width:680px; height:auto; overflow:hidden; background: rgba(66, 95, 149, 1);">
        <p id='exifdata<?echo $b;?>'>testing
        </p>
          </div>


    <?
    }
    ?>
        </article>
    </Section>
 </div> 

<?
    $b++;
    }

?>  
</body>
</html>
Don Fouts
  • My immediate thought is that "happy couple" and "enjoying..." are 16 bit character strings being interpreted as 8 bit character strings. Given that "Title", "Subject" and "Comment" appear as they should, I would suggest looking into the character handling of `$m = "<p>Title: ".$Ititle."<BR>Subject: ".$Isubject."<BR>Comments: ".$Icomment."</p>";` – traktor Sep 30 '15 at 23:20
  • Thanks - is there a way to echo in 16 bit, or to convert the variables into 8 bit? – Don Fouts Oct 01 '15 at 03:34
  • Please confirm that the string length of `$Isubject = $exif['IFD0']['Subject'];` is twice what you expect. This would confirm a character encoding issue underlies the problem and would allow an appropriate answer with less guess work. – traktor Oct 01 '15 at 05:04
  • Well, strlen($Isubject) shows 26, and the string printed with the � is 26 characters long; without the �, what it should be is 12, almost half. – Don Fouts Oct 01 '15 at 15:25

2 Answers

I finally found a function that resolved the problem...

https://stackoverflow.com/a/20103241/1112764

Thanks! I tried all the different ways to encode and decode, and re-tagged the JPG with different programs. In the end I had to use this approach of stripping out the invalid characters.

Don Fouts

Background.

The problem could be the result of badly formatted tags within the image file, namely that tag text is written as 16-bit Unicode but is declared as ASCII in the tag header fields. Tag formats are described in Exif 2.2 from 2002 and Exif 2.3 (JEITA CP-3451) from 2012. The same result would arise if exif_read_data is treating everything as ASCII irrespective of format flags.

There are reports of image file processing software introducing bugs into Unicode image tags. For example, the KDE bug "EXIF UserComments with special characters get tagged as ASCII" could still be an issue, depending on image history.

The � character itself is the Unicode "Replacement Character" (U+FFFD, code point 65533 decimal), used to replace invalid characters in a string. For ASCII text stored as 16 bit values, the high order byte is zero (the ASCII NUL character) and is likely the character being replaced. Where the replacement is occurring is not proven: image tags might be flagged as Unicode and contain replacement characters (unlikely), exif_read_data() may be inserting them (likely but unproven), or the browser might be replacing NUL characters with Replacement Characters (unlikely or browser dependent).
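
The byte arithmetic behind this can be checked with a short JavaScript sketch. It assumes the tag text is stored as UTF-16LE followed by a two-byte NUL terminator, which is an assumption about these files rather than something confirmed here, and the helper name `utf16leBytes` is made up for illustration. Under that assumption a 12-character title occupies 26 bytes, which would be consistent with the strlen of 26 reported in the comments, and replacing each NUL with U+FFFD reproduces the pattern seen on the page.

```javascript
// Hypothetical helper: encode text as UTF-16LE bytes plus a 2-byte NUL terminator.
// (Assumed storage format for the tag; not confirmed for the asker's files.)
function utf16leBytes(text) {
  const bytes = [];
  for (let i = 0; i < text.length; i++) {
    const unit = text.charCodeAt(i);
    bytes.push(unit & 0xff, unit >> 8); // low byte first, then high byte
  }
  bytes.push(0, 0); // terminator
  return bytes;
}

const titleBytes = utf16leBytes("Happy Couple");

// Reading the bytes as an 8-bit string yields one character per byte,
// so every ASCII letter is followed by a NUL character.
const misread = String.fromCharCode(...titleBytes);

// Whatever component substitutes U+FFFD for NUL then produces the visible text.
const displayed = misread.replace(/\u0000/g, "\uFFFD");

console.log(titleBytes.length); // 26
console.log(displayed);         // H�a�p�p�y� �C�o�u�p�l�e���
```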

May I suggest checking the following:

  1. PHP setup needs to be correct for Unicode characters and module mbstring must be available. The documentation does not read well if you have to handle images from different systems requiring different setup.

  2. A quick solution with obvious limitations would be to convert tag strings to US-ASCII, or at least Latin1 8 bit octets, by removing NUL and Replacement Characters. A JavaScript function to do this is

    function annul(s)
    {   // no "|" inside the character class, or literal pipes would be stripped too
        return s.replace(/[\u0000\uFFFD]/g, "");
    }
    

This at least cleaned up the strings in your post but won't restore any Unicode characters present.
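
As a rough illustration of both the cleanup and its limits, here is a small self-contained JavaScript sketch. The helper name `stripNulAndReplacement` and the sample strings are made up for the example, and the second sample assumes the text was stored as UTF-16LE and read one byte per character.

```javascript
// Hypothetical helper: drop NUL (U+0000) and Replacement (U+FFFD) characters.
function stripNulAndReplacement(s) {
  return s.replace(/[\u0000\uFFFD]/g, "");
}

// ASCII text as it appeared on the page: each letter followed by U+FFFD,
// plus two more for the terminator.
const shown = "H\uFFFDa\uFFFDp\uFFFDp\uFFFDy\uFFFD\uFFFD\uFFFD";
console.log(stripNulAndReplacement(shown)); // Happy

// "€" (U+20AC) stored as UTF-16LE is the byte pair AC 20. Read one byte per
// character and followed by the terminator, it becomes "¬", a space, NUL, NUL.
const euroMisread = "\u00AC\u0020\u0000\u0000";
console.log(JSON.stringify(stripNulAndReplacement(euroMisread))); // "¬ "
```

The ASCII case comes back intact, while the euro sign turns into two unrelated characters, which is the limitation described above.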

  3. A harder approach would be to reconstruct Unicode 16 values from pairs of characters returned in tag strings. Having to do this would mean there are serious problems with PHP, or the image files are badly encoded, or both, and might not work in all cases.

  4. The ideal would be to have PHP exif.ini set up for Unicode and all images correctly tagged with an encoding declaration suited to the setup.
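
The pair reconstruction in the harder approach could look something like the following JavaScript sketch. It assumes little-endian byte order, a NUL terminator, and an input string in which each character holds one byte (value 0 to 255). The function name `decodeUtf16lePairs` is hypothetical. On the PHP side the same conversion is usually done with mbstring, for example `mb_convert_encoding($value, 'UTF-8', 'UCS-2LE')`, though whether that source encoding is right for these particular files is also an assumption.

```javascript
// Hypothetical helper: rebuild text from a string whose characters are really
// UTF-16LE bytes (one byte per character, low byte first).
function decodeUtf16lePairs(byteString) {
  let out = "";
  for (let i = 0; i + 1 < byteString.length; i += 2) {
    const low = byteString.charCodeAt(i) & 0xff;
    const high = byteString.charCodeAt(i + 1) & 0xff;
    const unit = low | (high << 8); // combine the pair into one 16-bit code unit
    if (unit === 0) break;          // stop at the NUL terminator
    out += String.fromCharCode(unit);
  }
  return out;
}

// "Hi€" as UTF-16LE bytes (48 00 69 00 AC 20) plus terminator, one character per byte.
const raw = "H\u0000i\u0000\u00AC\u0020\u0000\u0000";
console.log(decodeUtf16lePairs(raw)); // Hi€
```

Unlike stripping NUL characters, this keeps non-ASCII characters, but it gives wrong results if the bytes are big-endian or were already altered (for example if NULs were replaced before the string reached this code).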

Deciding what action to take will largely depend on whether your site supports Unicode and global languages.

traktor