
I've been researching this and going through Stack Overflow for a few hours now. There are some solutions out there, but they don't work in all situations. Let me start by explaining my problem.

I am trying to upload a file, and the file could be any of the following types:

  1. pdf
  2. doc
  3. docx
  4. xls
  5. xlsx

People have mostly suggested the following approach:

 if (FileUpload.FileContent.Length == 0)

I tried this approach on an empty docx file (the newer Microsoft Word format). Surprisingly, it failed. When I debugged the code I saw that the file actually had content, and opening it in Notepad confirmed that. The same is true of the 97-2003 Word format (doc), the newer Excel format (xlsx), and the old 97-2003 Excel format (xls).

Clearly, checking the length of the content will not work. I have not tested pdf, but it is highly likely that the same applies, since a pdf may carry its own data.

Now the big question is: how do we check whether the file has actual content or not?

Note that a user may try to upload a file whose content is nothing more than white space: spaces, tabs, carriage returns, or new lines. Essentially, a file with only white space is still an empty/blank file, so I need to check for that as well.

ItsZeus
  • So in the case of the docx file, when you say 'empty', are you saying that it's just a blank document? – TrevorGoodchild Nov 02 '16 at 17:37
  • Yes, that's right @TrevorGoodchild, it's a blank document – ItsZeus Nov 02 '16 at 17:39
  • When I create a new DOCX from Windows Explorer, the Properties panel shows me it's 0 bytes, and when I load it in a text editor, it's clearly empty. – Todd Sprang Nov 02 '16 at 17:42
  • @ToddSprang I created the docx file using Microsoft Word, and by default it clearly has some bytes in it already. The same is true of an Excel file. – ItsZeus Nov 02 '16 at 18:14
  • I see it's the difference of creating a new file from the application versus creating a new file through the "New" context menu item in Windows. Technically, those files that have a bit of data but no content are not blank, but that's probably not what you want to hear. – Todd Sprang Nov 02 '16 at 18:15
  • Yes, that's true @ToddSprang, you are absolutely right – ItsZeus Nov 02 '16 at 18:18
  • You'll find that any of these types of "blank" file can be opened in a ZIP program and browsed like it was a normal .ZIP. From there you could probably "browse" into each file type and try to inspect the contents. e.g. DOCX into 7-ZIP lets me browse into /word/document.xml. It may be reasonable to check for content in there programmatically. – Todd Sprang Nov 02 '16 at 18:23
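
The idea in the last comment can be sketched in C# roughly as follows. This is only a minimal sketch, not a complete solution: `DocxInspector` and `IsBlankDocx` are hypothetical names, and the check assumes that "content" means non-whitespace text in `<w:t>` elements of `word/document.xml`. Images, headers, footers, and embedded objects are ignored, and the legacy binary doc format is not a ZIP container, so it needs a different parser.

```csharp
using System;
using System.IO;
using System.IO.Compression;
using System.Linq;
using System.Xml.Linq;

public static class DocxInspector
{
    // WordprocessingML main namespace; <w:t> elements carry the text runs.
    private static readonly XNamespace W =
        "http://schemas.openxmlformats.org/wordprocessingml/2006/main";

    // Hypothetical helper: returns true when word/document.xml contains
    // no non-whitespace text. Tabs and line breaks are stored as separate
    // elements (<w:tab/>, <w:br/>), so they never show up as text here.
    public static bool IsBlankDocx(Stream docx)
    {
        using (var zip = new ZipArchive(docx, ZipArchiveMode.Read, leaveOpen: true))
        {
            ZipArchiveEntry entry = zip.GetEntry("word/document.xml");
            if (entry == null)
            {
                throw new InvalidDataException("word/document.xml not found; not a .docx?");
            }

            using (Stream part = entry.Open())
            {
                XDocument xml = XDocument.Load(part);
                // All() is true for an empty sequence, so a document
                // without any <w:t> element also counts as blank.
                return xml.Descendants(W + "t")
                          .All(t => string.IsNullOrWhiteSpace(t.Value));
            }
        }
    }
}
```

With the upload control from the question, the call would look something like `DocxInspector.IsBlankDocx(FileUpload.FileContent)`.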

2 Answers


Not totally sure how to cross-reference links on SO to a specific post, but I found this:

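// Requires: using System.IO; using System.Linq;
string file = "file.csv";
var fi = new FileInfo(file);
// Empty if zero bytes, or small enough to scan and made up only of blank lines.
if (fi.Length == 0 ||
    (fi.Length < 100000
     && File.ReadLines(file).All(string.IsNullOrWhiteSpace)))
{
    // empty file
}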

here: "Is file empty check"

So I assume you could set a lower limit on the number of bytes you expect from an empty docx file; if the file being uploaded is larger than that, it's not empty.

TrevorGoodchild
  • That may not work. Let's say I have an empty file like that, but I have added tabs, new lines, and carriage returns. In that case the number of bytes may be higher, but the file will still be empty. Ideally we should be able to read the content of the file, remove all the white space (new lines, tabs, etc.), and then check whether the file has any content. – ItsZeus Nov 02 '16 at 18:08
  • Right, I didn't think that code was going to immediately snap in; I just thought it would be a guideline. How about this: https://bytes.com/topic/c-sharp/answers/875307-check-empty-text-file. Basically, read the file into a StreamReader and check whether each individual line has anything in it. – TrevorGoodchild Nov 02 '16 at 18:44
  • I already did that, and that's how I came to know there's some program-specific data in the file already: data which was not entered by the user creating the file – ItsZeus Nov 02 '16 at 18:50
  • I don't follow. If you just read it line by line shouldn't you be able to tell if there's actual data in the file? Is the program specific data the carriage returns and tabs? – TrevorGoodchild Nov 02 '16 at 18:52
  • Yeah, I tried that already; I get data back which was not entered by me – ItsZeus Nov 02 '16 at 18:59
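
One way to see why reading these files line by line returns data the user never typed: docx and xlsx are ZIP containers, doc and xls are OLE compound binary files, and a pdf starts with its own `%PDF` header. The following is a rough sketch that tells the containers apart by their leading bytes, so that each can be handed to a format-specific check instead of a text reader. `UploadKind` and `UploadSniffer` are hypothetical names introduced for illustration.

```csharp
using System;
using System.IO;
using System.Linq;

public enum UploadKind { Unknown, ZipContainer, OleCompound, Pdf }

public static class UploadSniffer
{
    // "PK\x03\x04": local file header of a ZIP archive (docx, xlsx).
    private static readonly byte[] ZipMagic = { 0x50, 0x4B, 0x03, 0x04 };
    // OLE compound file signature (legacy doc, xls).
    private static readonly byte[] OleMagic = { 0xD0, 0xCF, 0x11, 0xE0, 0xA1, 0xB1, 0x1A, 0xE1 };
    // "%PDF"
    private static readonly byte[] PdfMagic = { 0x25, 0x50, 0x44, 0x46 };

    public static UploadKind Sniff(Stream s)
    {
        var header = new byte[8];
        int read = 0, n;
        // Read up to 8 bytes; Stream.Read may return fewer than requested.
        while (read < header.Length &&
               (n = s.Read(header, read, header.Length - read)) > 0)
        {
            read += n;
        }
        // Rewind so a later parser sees the stream from where it started.
        if (s.CanSeek) s.Seek(-read, SeekOrigin.Current);

        if (StartsWith(header, read, ZipMagic)) return UploadKind.ZipContainer;
        if (StartsWith(header, read, OleMagic)) return UploadKind.OleCompound;
        if (StartsWith(header, read, PdfMagic)) return UploadKind.Pdf;
        return UploadKind.Unknown;
    }

    private static bool StartsWith(byte[] data, int length, byte[] magic)
    {
        return length >= magic.Length &&
               data.Take(magic.Length).SequenceEqual(magic);
    }
}
```

A zero-byte upload comes back as `UploadKind.Unknown`, so the plain length check from the question is still worth keeping as a first step.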

You could try using jQuery, as found here:

Asp.Net Check file size before upload

ASPX

<asp:CustomValidator ID="customValidatorUpload" runat="server" ErrorMessage="" ControlToValidate="fileUpload" ClientValidationFunction="setUploadButtonState();" />
<asp:Button ID="button_fileUpload" runat="server" Text="Upload File" OnClick="button_fileUpload_Click" Enabled="false" />
<asp:Label ID="lbl_uploadMessage" runat="server" Text="" />

jQuery

function setUploadButtonState() {
   var maxFileSize = 4194304; // 4 MB -> 4 * 1024 * 1024 bytes
   var fileUpload = $('#fileUpload');

   // No file selected yet.
   if (fileUpload.val() === '') {
      return false;
   }

   // Enable the upload button only when the file is under the size limit.
   if (fileUpload[0].files[0].size < maxFileSize) {
      $('#button_fileUpload').prop('disabled', false);
      return true;
   }

   $('#lbl_uploadMessage').text('File too big !');
   return false;
}
TrevorGoodchild
  • This approach is good, but it won't work. One of the problems is that a newly created Word or Excel document already has some program-specific data in it. Hence the file size will be greater than 0, which doesn't guarantee that the file has data. See my point? – ItsZeus Nov 02 '16 at 20:46
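
For the xlsx case raised in the comment above, a server-side check could look inside the worksheet parts instead of relying on file size. The following is only a sketch under stated assumptions: `XlsxInspector` and `IsBlankXlsx` are hypothetical names, and a cell counts as content as soon as it has a value, inline string, or formula. A string cell that holds only white space still counts as content here, because shared strings are not resolved; the legacy xls format is not covered.

```csharp
using System;
using System.IO;
using System.IO.Compression;
using System.Linq;
using System.Xml.Linq;

public static class XlsxInspector
{
    // SpreadsheetML main namespace; <c> is a cell, <v> its stored value,
    // <is> an inline string, <f> a formula.
    private static readonly XNamespace S =
        "http://schemas.openxmlformats.org/spreadsheetml/2006/main";

    // Hypothetical helper: true when no worksheet part has a populated cell.
    public static bool IsBlankXlsx(Stream xlsx)
    {
        using (var zip = new ZipArchive(xlsx, ZipArchiveMode.Read, leaveOpen: true))
        {
            // Worksheet parts live under xl/worksheets/ and end in .xml;
            // relationship files end in .rels and are skipped by the filter.
            var sheets = zip.Entries.Where(e =>
                e.FullName.StartsWith("xl/worksheets/", StringComparison.OrdinalIgnoreCase) &&
                e.FullName.EndsWith(".xml", StringComparison.OrdinalIgnoreCase));

            foreach (ZipArchiveEntry sheet in sheets)
            {
                using (Stream part = sheet.Open())
                {
                    XDocument xml = XDocument.Load(part);
                    bool hasValue = xml.Descendants(S + "c").Any(c =>
                        c.Element(S + "v") != null ||
                        c.Element(S + "is") != null ||
                        c.Element(S + "f") != null);
                    if (hasValue) return false;
                }
            }
            return true;
        }
    }
}
```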