Let's say I have the following format of string in a file:

"0001 Full NameOtherDataXXX"

Questions:

  • Is there a technical name for this sort of file that I'm unaware of? It's hard to search for without knowing what it's called.
  • Is there a library for Java 7/8 that would help me read and write these kinds of files/strings just by specifying the format?

Example of specification (it could also be through annotations in a model class):

| Name   | Type    | Size | Padding char | Padding type |
|--------|---------|------|--------------|--------------|
| Field1 | Integer | 4    | '0'          | Left         |
| Field2 | String  | 11   | ' '          | Left         |
| Field3 | String  | 12   | 'X'          | Right        |
Andy Turner
JeanK
  • 1) I'd call it a "fixed-width file format"; or at least "fixed-" something. – Andy Turner Jun 07 '16 at 20:03
  • 2) No, but it's rather easy: just read a line as a `String` with a `BufferedReader`, and use `substring` to chop it up. – Andy Turner Jun 07 '16 at 20:05
  • Fixed-width? What about space separated values (variation of CSV)? Isn't it more accurate here? – Mateusz Chrzaszcz Jun 07 '16 at 20:05
  • Yeah, this appears to just be CSV where the separator is a space. CSV is just a name, the format isn't standardized. As a result, any Java CSV parsing library you use is going to allow you to specify the separator. Google Java CSV etc. – DavidS Jun 07 '16 at 20:12
  • @DavidS except `Field2` contains `" Full NameO"`, which has two spaces in it, and there is no space before `Field3`, containing `"therData"`. – Andy Turner Jun 07 '16 at 20:20
  • I thought "Full" was a field and "NameOtherDataXXX" was a field (weird, but whatever). I guess you're right: this is fixed width. – DavidS Jun 07 '16 at 20:22
  • Possible duplicate of [What's the best way of parsing a fixed-width formatted file in Java?](http://stackoverflow.com/questions/1609807/whats-the-best-way-of-parsing-a-fixed-width-formatted-file-in-java) – DavidS Jun 07 '16 at 20:31

2 Answers

Take a look at BeanIO

I've used it to read fixed-width files with different data types, and it worked perfectly. You can define the expected types and formats in your BeanIO definitions.

Cristian Meneses

There is no official name, but it is generally referred to as “fixed-width” or “padded text”. As I recall, this format is most commonly seen in systems related to old mainframes. Reports printed to a terminal screen or green-bar paper were commonly produced in this style.

Each field is defined as a certain number of characters. With a monospaced font, the columns of data align visually. When a field's value has fewer characters than the field width, “padding” characters are added to the value: at the front for a right-aligned column, and at the end for a left-aligned column. Your example spec defines the padding character as a zero, space, or X, along with that left-right alignment.
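To make the padding rule concrete, here is a minimal sketch in Java 7-compatible code. The class name `Padding` and the method `pad` are hypothetical names made up for this illustration. The widths 4, 10, and 12 are an assumption chosen so that the output reproduces the 26-character sample line from the question (the question's table lists 11 for the second field, which does not add up to the sample's length).

```java
final class Padding {

    /**
     * Pads value to exactly width characters.
     * padLeft = true puts the padding in front (right-aligned value);
     * padLeft = false puts it at the end (left-aligned value).
     */
    static String pad(String value, int width, char padChar, boolean padLeft) {
        if (value.length() > width) {
            throw new IllegalArgumentException(
                    "Value '" + value + "' is longer than width " + width);
        }
        StringBuilder filler = new StringBuilder();
        for (int i = value.length(); i < width; i++) {
            filler.append(padChar);
        }
        return padLeft ? filler + value : value + filler;
    }

    public static void main(String[] args) {
        // Assumed widths: 4, 10, 12 (see the note above the code).
        String line = pad("1", 4, '0', true)
                + pad("Full Name", 10, ' ', true)
                + pad("OtherData", 12, 'X', false);
        System.out.println(line);
    }
}
```

Rejecting over-long values rather than truncating them is a design choice here; silently cutting data is the other common option and may suit some legacy formats better.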

I've never seen a Java library for reading or writing such padded text, though one might be a good idea. A formally defined, machine-readable spec for a particular file’s format is intriguing. Most folks write their own little library, as it is not difficult.
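As a rough idea of what such a home-grown library might look like, the sketch below reads one line against a list of field specs using `substring`. `FixedWidthParser` and `FieldSpec` are hypothetical names invented for this example, not part of any existing library, and the widths 4, 10, and 12 are again an assumption chosen to fit the question's sample line. Values are returned as strings; converting the first field to an `Integer` is left to the caller.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

final class FixedWidthParser {

    /** One column of the layout (hypothetical spec type for this sketch). */
    static final class FieldSpec {
        final String name;
        final int width;
        final char padChar;
        final boolean padLeft;

        FieldSpec(String name, int width, char padChar, boolean padLeft) {
            this.name = name;
            this.width = width;
            this.padChar = padChar;
            this.padLeft = padLeft;
        }
    }

    /** Cuts the line into fields by width and strips each field's padding. */
    static List<String> parse(String line, List<FieldSpec> specs) {
        List<String> values = new ArrayList<String>();
        int pos = 0;
        for (FieldSpec spec : specs) {
            if (pos + spec.width > line.length()) {
                throw new IllegalArgumentException(
                        "Line too short for field " + spec.name);
            }
            String raw = line.substring(pos, pos + spec.width);
            values.add(strip(raw, spec.padChar, spec.padLeft));
            pos += spec.width;
        }
        return values;
    }

    /** Removes padding characters from the padded side only. */
    static String strip(String raw, char padChar, boolean padLeft) {
        int start = 0;
        int end = raw.length();
        if (padLeft) {
            while (start < end && raw.charAt(start) == padChar) start++;
        } else {
            while (end > start && raw.charAt(end - 1) == padChar) end--;
        }
        return raw.substring(start, end);
    }

    public static void main(String[] args) {
        List<FieldSpec> specs = Arrays.asList(
                new FieldSpec("Field1", 4, '0', true),
                new FieldSpec("Field2", 10, ' ', true),
                new FieldSpec("Field3", 12, 'X', false));
        System.out.println(parse("0001 Full NameOtherDataXXX", specs));
    }
}
```

One caveat with stripping: a value that legitimately begins or ends with the padding character cannot be told apart from padding, so `"0000"` in a zero-padded field comes back as an empty string.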


In the PC era, delimited formats are more common, generally tab-delimited or comma-separated values (CSV). More sensible, to my mind, would be to use the four characters explicitly defined in ASCII and Unicode for the very purpose of delimiting data in text files (code points 28 to 31), but inexplicably I've never seen them used.
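For illustration, those four code points are the file separator (28), group separator (29), record separator (30), and unit separator (31). The sketch below uses only the unit separator to join and split the fields of one record. The class and method names are hypothetical, and it assumes the field values themselves never contain that control character.

```java
import java.util.Arrays;
import java.util.List;
import java.util.regex.Pattern;

final class AsciiSeparators {

    /** ASCII unit separator, code point 31: goes between fields. */
    static final char UNIT_SEPARATOR = 31;

    /** Joins the fields of one record with the unit separator. */
    static String joinFields(List<String> fields) {
        StringBuilder sb = new StringBuilder();
        for (int i = 0; i < fields.size(); i++) {
            if (i > 0) {
                sb.append(UNIT_SEPARATOR);
            }
            sb.append(fields.get(i));
        }
        return sb.toString();
    }

    /** Splits a record back into fields; the -1 limit keeps trailing empty fields. */
    static List<String> splitFields(String record) {
        String separator = Pattern.quote(String.valueOf(UNIT_SEPARATOR));
        return Arrays.asList(record.split(separator, -1));
    }

    public static void main(String[] args) {
        // Commas and spaces in the data need no escaping or quoting.
        List<String> fields = Arrays.asList("0001", "Full Name", "Other, Data");
        String record = joinFields(fields);
        System.out.println(splitFields(record).size());
    }
}
```

Because the separator is a control character that rarely appears in ordinary text, no quoting or escaping layer is needed, which is the appeal over tabs and commas.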

The chores of reading and writing both Tab and CSV formats are performed handily by the Apache Commons CSV library.

And in the internet age, we commonly write data in XML or JSON formats.

Basil Bourque