If a .txt file contains numbers, are they really numbers, or are they Strings? By String I mean the data type: when we see them as numbers we may assume they are integers, but I suspect they are actually Strings. Could you please confirm this? I ask because I am trying to store the data in an array of integers.

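    // Requires: import java.io.File; import java.util.Scanner;
    Scanner s = new Scanner(new File("numbers.txt"));
    // The first integer in the file is the count of numbers that follow
    int[] num = new int[s.nextInt()];
    for (int i = 0; i < num.length; i++) {
        num[i] = s.nextInt();
    }

    // Print the stored values, one per line
    for (int i = 0; i < num.length; i++) {
        System.out.println(num[i]);
    }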

1 Answer

For this purpose, a file is just a sequence of bytes, and each byte may represent many different things. Files with a .txt extension are meant to be interpreted as one long sequence of characters (what we usually call "a string"). That is, each byte (or a small group of bytes) is interpreted as a character/entity, which may be a letter, a digit [0-9], a punctuation mark, etc. Some of these characters/entities have special meanings, like "end of line (of text)".

When you're reading or writing a file that's really a text file, and you know that some (sub)strings represent numbers, you can do the appropriate conversion (as with any number "contained" in a string). When your code calls Scanner.nextInt(), it reads the next token into a string (by default, a run of characters delimited by whitespace) and then converts that string to a "real" integer; if the token is not a valid integer, it throws an InputMismatchException. When your code calls System.out.println(), it does exactly the opposite: it converts the integer to a string (as if by String.valueOf()) and writes out its characters.
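
As a small illustration of that round trip, here is a minimal sketch. The class name `TextToInt` and the helper `sum` are made up for this example, and the input is an in-memory string rather than a file, so treat it as a demonstration of the idea and not as the only way to do it.

```java
import java.util.Scanner;

public class TextToInt {
    // Hypothetical helper: adds up the leading integer tokens in the text.
    // It stops at the first token that is not a valid integer.
    static int sum(String text) {
        int total = 0;
        try (Scanner s = new Scanner(text)) {
            while (s.hasNextInt()) {
                total += s.nextInt(); // token (text) -> int
            }
        }
        return total;
    }

    public static void main(String[] args) {
        String token = "42";                   // two characters: '4' and '2'
        int n = Integer.parseInt(token);       // text -> int
        String back = String.valueOf(n + 1);   // int -> text
        System.out.println(back);              // prints 43
        System.out.println(sum("3 10 20 30")); // prints 63
    }
}
```

The string "42" only becomes usable in arithmetic after the conversion; before that it is just two characters.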

Please be aware that in text files, as in strings, the code (i.e. the exact sequence of bits) used to represent a given character/entity is actually arbitrary and depends on the chosen "character set" (like ASCII, EBCDIC, ISO 8859-*, one of the Unicode encodings, etc.).
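
To make the character-set point concrete, this sketch prints the bytes that the same two digits produce under two encodings. The class name `DigitBytes` and the helper `encode` are made up for the example; the charsets are the standard constants from `java.nio.charset.StandardCharsets`.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

public class DigitBytes {
    // Hypothetical helper: the bytes a file would contain for this text
    // under the given character set.
    static byte[] encode(String text, Charset charset) {
        return text.getBytes(charset);
    }

    public static void main(String[] args) {
        byte[] ascii = encode("42", StandardCharsets.US_ASCII);
        byte[] utf16 = encode("42", StandardCharsets.UTF_16BE);

        System.out.println(Arrays.toString(ascii)); // prints [52, 50]
        System.out.println(Arrays.toString(utf16)); // prints [0, 52, 0, 50]

        // Decoding with the matching charset recovers the text,
        // which can then be converted to an int.
        String text = new String(utf16, StandardCharsets.UTF_16BE);
        System.out.println(Integer.parseInt(text)); // prints 42
    }
}
```

Neither byte sequence is the integer 42 itself; both are just encodings of the characters '4' and '2'.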

  • Maybe this way of reading the file is not the correct one, and it would be better to store the data in a String array? The file may contain a space or a comma between each number, and then Scanner.nextInt() fails? So, if I understand correctly, the data in the file is treated as characters? – Broliton syan Feb 14 '23 at 19:02
  • We can impose more formats on top of the existing data. For example, if you specifically expect the text to use digit symbols that **represent** a number, then you can read that text and then try to **convert** it to integer. In Java, `Scanner.nextInt` works exactly that way: it will use some logic to decide how much of the file should be used to represent "a number", and then try to interpret the digit symbols to compute an integer, and return that. When the data is wrong, an exception is raised instead. – Karl Knechtel Feb 14 '23 at 19:09
  • @Brolitonsyan There is no "right" answer to your question. Only you know if you can trust your data (which is probably not really _yours_ and probably hasn't been "sanitized"). If you're sure about the contents of the file (and you're willing to handle and accept an occasional exception when the format is not what you expected), your code is perfectly correct. Me? When in doubt, I prefer to read each line into a string (text files are usually divided into lines) and parse it "manually". – Diego Ferruchelli Feb 15 '23 at 16:13
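
The "read each line and parse it manually" approach from the last comment could look roughly like the sketch below. The class name `LineParser` and the helper `parseLine` are hypothetical, and the choices to accept both commas and whitespace as separators and to silently skip bad tokens are assumptions made for illustration; you may prefer to report or reject malformed input instead.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

public class LineParser {
    // Hypothetical helper: splits a line on commas and/or whitespace and
    // keeps only the tokens that are valid integers.
    static List<Integer> parseLine(String line) {
        List<Integer> numbers = new ArrayList<>();
        for (String token : line.trim().split("[,\\s]+")) {
            if (token.isEmpty()) {
                continue; // blank line or leading separator
            }
            try {
                numbers.add(Integer.parseInt(token));
            } catch (NumberFormatException e) {
                // Assumption: tokens that are not integers are skipped
            }
        }
        return numbers;
    }

    public static void main(String[] args) {
        // In-memory stand-in for the lines of a text file
        List<String> lines = Arrays.asList("1, 2,3  4", "7 abc 9", "");
        for (String line : lines) {
            System.out.println(parseLine(line));
        }
    }
}
```

With a real file, the lines could come from something like `Files.readAllLines(Paths.get("numbers.txt"))` in place of the in-memory list.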