This is caused by the BOM (Byte Order Mark) that Notepad adds so that the file's encoding can be detected:
Microsoft compilers and interpreters, and many pieces of software on Microsoft Windows such as Notepad treat the BOM as a required magic number rather than use heuristics. These tools add a BOM when saving text as UTF-8, and cannot interpret UTF-8 unless the BOM is present or the file contains only ASCII. Google Docs also adds a BOM when converting a document to a plain text file for download.
From this article you can also see that:
The UTF-8 representation of the BOM is the (hexadecimal) byte sequence 0xEF,0xBB,0xBF
We should therefore be able to write a PHP function to account for this:
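function is_utf8_file_empty($filename)
{
    // Open in binary mode and check the result instead of suppressing errors with @.
    $file = fopen($filename, "rb");
    if ($file === false) {
        return false;
    }
    // Read at most 4 bytes: enough to tell "BOM only" from "BOM plus content",
    // and safe for zero-byte files, where fread() with a length of 0 would fail.
    $head = fread($file, 4);
    fclose($file);
    // The file counts as empty if it contains nothing but the UTF-8 BOM.
    return $head === "\xEF\xBB\xBF";
}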
Be aware that this is specific to files created in the manner you described, and that it is only example code: you should test it against your own files (for example very large files, or files that are completely empty, with zero bytes and no BOM) and modify it if needed.
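As a rough sketch of one such modification, the variant below also treats a zero-byte file as empty and never reads more than four bytes, so file size does not matter. The function name `is_effectively_empty` and the constant `UTF8_BOM` are hypothetical names chosen for this example, and throwing an exception for unreadable files is an assumption about the error handling you want, not something the question requires.

```php
<?php

// Hypothetical constant for this example: the UTF-8 encoding of the BOM.
const UTF8_BOM = "\xEF\xBB\xBF";

// Hypothetical helper: returns true when the file has zero bytes
// or contains only the UTF-8 BOM.
function is_effectively_empty(string $filename): bool
{
    if (!is_file($filename) || !is_readable($filename)) {
        throw new RuntimeException("Cannot read file: $filename");
    }

    // Read at most 4 bytes from the start of the file, so large files
    // are never loaded in full. An empty file yields an empty string.
    $head = file_get_contents($filename, false, null, 0, 4);
    if ($head === false) {
        throw new RuntimeException("Failed to read file: $filename");
    }

    return $head === '' || $head === UTF8_BOM;
}
```

Called on a file that Notepad saved as UTF-8 without any text, or on a zero-byte file, this should return true; a file with a BOM followed by any content should return false.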