Suppose I have a text file in the following (non-standard) format:

xxx { a = v1; b = v2 }
yyy { a = v3; c = v4 } 

I cannot change it to any standard (INI/XML/YAML, etc.) format.

Now I would like to find the value of property a in section xxx (that is, v1). What is the simplest way to do this in Java/Groovy?

Michael
  • possible duplicate of [What is the easiest way to parse an INI file in Java?](http://stackoverflow.com/questions/190629/what-is-the-easiest-way-to-parse-an-ini-file-in-java) – Stephen C Jan 17 '12 at 15:48
  • I'd rather use YAML or JSON. There are ready-to-use parsers. Why invent another format? You could even create a valid Groovy script and run it from a file at runtime, getting real objects and properties. – Piotr Gwiazda Jan 17 '12 at 15:50
  • @StephenC it's not a duplicate, since the linked question asks about INI files. This question asks about an INI-_like_ format. – Chris Cashwell Jan 17 '12 at 15:51
  • You will need to define your format in more detail, e.g. `1.` Will there always be no semicolon before the closing `}`? `2.` Can there be new lines between `{` and `}`? You may have to write your own customized parser for this format. – anubhava Jan 17 '12 at 16:09

4 Answers

With Groovy, you could leverage the ConfigSlurper.

However, you would first need to hack together a binding that maps every bare word to itself, so that the slurper doesn't choke trying to work out what v1, v2, v3, etc. are.

This seems to work:

def input = '''xxx { a = v1; b = v2 }
              |yyy { a = v3; c = v4 }'''.stripMargin()

def slurper = new ConfigSlurper()

// Find all words 'w' and make a map of [ w1:'w1', w2:'w2', ... ]
slurper.binding = ( ( input =~ /\w+/ ) as List ).collectEntries { w -> [ (w):w ] }

def result = slurper.parse( input )
println result

That prints out:

[xxx:[a:v1, b:v2], yyy:[a:v3, c:v4]]

(Tested with Groovy 1.8.4)

tim_yates

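For a true INI-format file, see [What is the easiest way to parse an INI file in Java?](http://stackoverflow.com/questions/190629/what-is-the-easiest-way-to-parse-an-ini-file-in-java)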

What you're showing here looks more like JSON than INI format to me. Perhaps look at JSON parsing libraries. The truth here is that you're not using an established format, so you probably won't be using an established format parser. Your best bet is probably to refactor the file you're dealing with (if possible) into a well-known format to begin with. Don't try to reinvent the wheel unless you absolutely have to.

Chris Cashwell

There's unlikely to be an out-of-the-box solution if you're dealing with a non-standard format. Here are a few approaches you might want to look into:

  • if the format is simple, write a custom recursive descent parser
  • write a filter to transform your format into INI, JSON, etc. and use existing libraries
  • create a Groovy DSL that matches your format and execute your file as a Groovy script
  • use a parser generator tool like ANTLR or parboiled to create a parser from a language specification
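
As a rough sketch of the filter approach, the following Java rewrites each section into `section.key=value` lines and hands them to the standard `java.util.Properties` loader. It assumes one section per line, no nested braces, and values free of `;`, `}`, backslashes and other characters that `Properties` treats specially. The class and method names (`BraceToProperties`, `toPropertiesText`, `load`) are made up for illustration.

```java
import java.io.IOException;
import java.io.StringReader;
import java.io.UncheckedIOException;
import java.util.Properties;

class BraceToProperties {

    // Rewrites a line such as "xxx { a = v1; b = v2 }" into "xxx.a=v1" and "xxx.b=v2".
    // Assumption: one section per line and no nested braces.
    static String toPropertiesText(String input) {
        StringBuilder out = new StringBuilder();
        for (String line : input.split("\\R")) {
            int open = line.indexOf('{');
            int close = line.lastIndexOf('}');
            if (open < 0 || close < open) {
                continue; // not a section line; skipped in this sketch
            }
            String section = line.substring(0, open).trim();
            String body = line.substring(open + 1, close);
            for (String pair : body.split(";")) {
                String[] kv = pair.split("=", 2);
                if (kv.length == 2) {
                    out.append(section).append('.').append(kv[0].trim())
                       .append('=').append(kv[1].trim()).append('\n');
                }
            }
        }
        return out.toString();
    }

    // Feeds the rewritten text to the existing Properties parser.
    static Properties load(String input) {
        Properties props = new Properties();
        try {
            props.load(new StringReader(toPropertiesText(input)));
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return props;
    }

    public static void main(String[] args) {
        String input = "xxx { a = v1; b = v2 }\nyyy { a = v3; c = v4 }";
        System.out.println(load(input).getProperty("xxx.a")); // prints v1
    }
}
```

Flattening to dotted keys keeps the lookup to a single `getProperty` call, at the cost of losing the section grouping. If the real files contain multi-line sections or quoted values, the filter would need to grow accordingly.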
ataylor

Firstly, you've given an example, not specified a format. Before you go any further, you need to get hold of a complete specification for the format. Or if there isn't one, you need to see the code that generates it, and reverse engineer a specification.

(If you try to implement based on a small example, there's a good chance that your parser will encounter real life examples that don't fit the patterns that you have intuited.)

Having done that you can look for an off-the-shelf parser that can cope with your format. If you are lucky, it might be close enough to INI, or JSON or YAML or something else for the corresponding parser to (mostly) work.

But the chances are that it won't, and that you will need to write your own parser. There are various ways you could do this, for instance:

  • You could split the file into lines and "parse" each line with a regex.
  • You could parse the file using a Scanner with appropriate delimiters.
  • You could use a parser generator to implement a lexer and parser.
  • You could implement a simple lexer and parser by hand.
  • There are probably Groovy-specific solutions.
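
To illustrate the first option (split into lines, then a regex per line), here is a minimal Java sketch. It only covers the shape seen in the example: one `name { key = value; ... }` section per line, word-character names, no nested braces, and no `;` or `}` inside values. Those are assumptions, not a specification, and the class name `SectionRegexParser` is hypothetical.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class SectionRegexParser {

    // A whole line: section name, then "{ ... }" with no nested braces.
    private static final Pattern SECTION =
            Pattern.compile("^\\s*(\\w+)\\s*\\{([^}]*)\\}\\s*$");

    // One "key = value" pair; the value runs up to the next ';' or the end of the body.
    private static final Pattern PAIR =
            Pattern.compile("(\\w+)\\s*=\\s*([^;]*)");

    static Map<String, Map<String, String>> parse(String input) {
        Map<String, Map<String, String>> sections = new LinkedHashMap<>();
        for (String line : input.split("\\R")) {
            Matcher section = SECTION.matcher(line);
            if (!section.matches()) {
                continue; // lines that do not fit the assumed shape are ignored
            }
            Map<String, String> props = new LinkedHashMap<>();
            Matcher pair = PAIR.matcher(section.group(2));
            while (pair.find()) {
                props.put(pair.group(1), pair.group(2).trim());
            }
            sections.put(section.group(1), props);
        }
        return sections;
    }

    public static void main(String[] args) {
        String input = "xxx { a = v1; b = v2 }\nyyy { a = v3; c = v4 }";
        System.out.println(parse(input).get("xxx").get("a")); // prints v1
    }
}
```

Silently skipping lines that do not match is convenient for a demo, but against real files it would probably be safer to report them, since they are exactly the cases the small example did not reveal.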

In reality the correct choice(s) depend on how simple or complex the actual format is. We can't tell that from a single example.

Stephen C