0

I want to determine how large a file would be, based on some text input, without having to save it to disk.

From tests it appears a file with 4 characters in it will be 5 bytes.

Does this hold true in general: file size = character count + 1?

It's a bunch of JavaScript that I am looking to save.

Many thanks for any advice.

dibs

2 Answers

1

Well, it all breaks down when somebody adds a comment in their native language using non-ASCII characters, which take a varying number of bytes (so one character != one byte). Beyond that, the space used also depends on the file system the file is stored on: there is usually a smallest unit that can be allocated on a disk, and the space a file occupies will always be a multiple of that unit.

itchy355
  • Do you know if I could detect this and adjust my count? Or is it pretty much impossible to know until you save the file? – dibs Apr 24 '12 at 04:54
  • Yes, you can; but the question is how bulletproof this needs to be. If it's enough to roughly guess the file size, then I think it's OK to use what you've suggested. If anything depends on the result, I would not use this method. What do you need it for? – itchy355 Apr 24 '12 at 05:09
  • It's not for life-support so I guess it'll be ok. Thanks for the input! – dibs Apr 24 '12 at 05:21
  • I can just give the user "filesize will be > x". That'll work. – dibs Apr 24 '12 at 05:25
0

No.

An ASCII text file is exactly one byte per character. But a line break is also a character (or two, for CRLF line endings), and that is probably where your extra byte comes from.

For non-ASCII text, each character can take up more than one byte: in UTF-8, usually one to three, and up to four for characters outside the Basic Multilingual Plane (such as most emoji).
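If you want the byte count without saving anything, you can encode the string in memory and measure the result. This is only a sketch: it assumes the file would be written as UTF-8 with no byte order mark, and `utf8ByteLength` is a made-up helper name, not part of any API. `TextEncoder` is available in current browsers and in Node.js.

```javascript
// Hypothetical helper: number of bytes `text` would occupy if written as UTF-8
// (assumes no byte order mark and no line-ending conversion on save).
function utf8ByteLength(text) {
  return new TextEncoder().encode(text).length;
}

const ascii = "abcd";                    // 4 ASCII characters
const withNewline = "abcd\n";            // same text plus a trailing line feed
const nonAscii = "h\u00e9llo \u20ac";    // "héllo €": é takes 2 bytes, € takes 3

console.log(utf8ByteLength(ascii));       // 4
console.log(utf8ByteLength(withNewline)); // 5
console.log(nonAscii.length);             // 7 characters (UTF-16 code units)
console.log(utf8ByteLength(nonAscii));    // 10 bytes
```

The second example matches the "4 characters, 5 bytes" observation in the question, if the editor added a trailing line break when saving.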

In addition to that, the file may take up some extra space on disk, because depending on the file system being used, its size may be rounded up to a multiple of the block size, for example 8K.
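The rounding is simple arithmetic. A rough sketch, assuming the file system allocates whole blocks of a known size (the name `sizeOnDisk` and the 8192-byte block are only illustrations; the real block size depends on the file system, and some file systems store very small files differently):

```javascript
// Hypothetical helper: space taken on disk when storage is allocated in whole blocks.
// `blockSize` is an assumption you have to supply; it is not detected here.
function sizeOnDisk(byteLength, blockSize) {
  return Math.ceil(byteLength / blockSize) * blockSize;
}

console.log(sizeOnDisk(5, 8192));    // 8192: a 5-byte file still occupies one full block
console.log(sizeOnDisk(8192, 8192)); // 8192: exactly one block
console.log(sizeOnDisk(8193, 8192)); // 16384: one byte over spills into a second block
```

The file size reported to the user is still the byte count; this is only the space the file occupies.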

Thilo