
I'm getting back into C#, and I've run into a problem. I'm making an interpreter and I need to extract strings and numbers from a line of input. My first approach was to use Regex, but I don't know how to use it that well.

Let's say I have this:

print "string 1" "string 2" 10

I need to get an array of the args/parameters:

{'string 1', 'string 2', '10'}

Any help would be appreciated.

Владимир
  • Yes, but as I said, I want to split on spaces, except that a string argument like "Hello There!" should be kept whole, spaces included. – Владимир Mar 02 '19 at 11:15
  • You need a lexer. Google has rather a large number of hits. Arbitrarily, [this one](https://stackoverflow.com/questions/673113/poor-mans-lexer-for-c-sharp) has been vetted for a long time. – Hans Passant Mar 02 '19 at 11:21
  • An interpreter for what? Unless the "language" you want to interpret is very primitive and very limited, you would need to write a parser and/or lexer, which is not done easily or quickly. Yes, there are parser and lexer libraries out there, but if you still struggle with regular expressions, I believe (my apologies!) that trying to use parser/lexer libraries will be even harder for you... :-( –  Mar 02 '19 at 11:22
  • It's a very simple language, nothing special. – Владимир Mar 02 '19 at 11:23
  • "_nothing special_" Yeah, I heard that somewhere. Because it is "_not special_" it surely must be easy, or so I heard, too... –  Mar 02 '19 at 11:24
  • I've been working in Python for a year now, so it's all different now. – Владимир Mar 02 '19 at 11:24
  • Yeah, I made something like this in Python, it didn't need a lexer. I think I have an answer. I'll try to make it first. – Владимир Mar 02 '19 at 11:25

2 Answers


What you are trying to achieve is easily done with a lexer. There are many lexers for C# (a quick Google search will turn them up), but if you want to learn, you can build one yourself.

What you are asking about is the first step of a lexer's work, called tokenization: correctly splitting a string into smaller strings called tokens, taking into account things like single/double quotes, escape characters, variable expansions, and so on.

Tokenization is a simple task, and you will find plenty of ready-to-use libraries for it. The process looks like this:

  • scan the input string character by character
  • mark boundaries for each token
  • extract substrings (tokens) to an array
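
The steps above can be sketched roughly as follows. This is only a minimal illustration, not a full lexer: the class name SimpleTokenizer and the method Tokenize are made up for this example, and it assumes that only double quotes delimit strings, that there are no escape characters, and that an unterminated quote simply runs to the end of the line.

```csharp
using System;
using System.Collections.Generic;
using System.Text;

public static class SimpleTokenizer
{
    // Hypothetical helper: splits a line on whitespace, keeping double-quoted
    // text (spaces included) as a single token.
    // Assumptions: no escape sequences; an unterminated quote runs to end of line.
    public static List<string> Tokenize(string input)
    {
        var tokens = new List<string>();
        var current = new StringBuilder();
        bool inQuotes = false;  // are we between a pair of double quotes?
        bool hasToken = false;  // has a token been started (allows empty "" tokens)?

        // Scan the input character by character.
        foreach (char c in input)
        {
            if (c == '"')
            {
                // A quote opens or closes a string; the quote itself is not kept.
                inQuotes = !inQuotes;
                hasToken = true;
            }
            else if (char.IsWhiteSpace(c) && !inQuotes)
            {
                // Whitespace outside quotes marks a token boundary.
                if (hasToken)
                {
                    tokens.Add(current.ToString());
                    current.Clear();
                    hasToken = false;
                }
            }
            else
            {
                // Any other character belongs to the current token.
                current.Append(c);
                hasToken = true;
            }
        }

        // Flush the last token, if any.
        if (hasToken)
        {
            tokens.Add(current.ToString());
        }
        return tokens;
    }
}
```

Calling SimpleTokenizer.Tokenize on the line from the question yields the command name followed by the three arguments, so skipping the first element leaves the args array. Every token comes back as a string, so deciding that 10 is a number (for example with int.TryParse) would be a separate step.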
Mario Cianciolo

If you run your program from a console, its arguments are passed to Main(string[] args), where args contains them.
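
As a small sketch of what that looks like (the class name ArgsDemo and the helper FormatArgs are invented for this example), the program below just prints whatever it receives. Note that this only covers arguments typed on the command line; lines that the interpreter reads from a script file would still have to be split by your own code.

```csharp
using System;
using System.Linq;

public static class ArgsDemo
{
    // Hypothetical helper: formats the arguments the way the question lists them.
    public static string FormatArgs(string[] args)
    {
        return "{" + string.Join(", ", args.Select(a => "'" + a + "'")) + "}";
    }

    public static void Main(string[] args)
    {
        // The command line has already been split before Main runs;
        // double-quoted arguments arrive as single elements.
        Console.WriteLine(FormatArgs(args));
    }
}
```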

schgab