-2

I'm currently writing a C# application that receives a string from a SerialPort, and I need to parse this data so I can work with it.

The strings that are sent through the SerialPort are formatted like this:

*NTF,CTRL,SQL,OPEN,+,-66*NTF,CTRL,DBUSY,ON,+,-63*NTF,CTRL,DBUSY,OFF*NTF,CTRL,SQL,CLOSE*

I'm wondering how I can split this string into segments on the * symbol. I have made a couple of attempts at this myself but couldn't figure it out.

The attempts I made are:

String[] tmp = data.ToString().Split('*');
foreach(String word in tmp)
{
    if (word.Contains(",80") || word.Contains(",81"))
    {
        COM_PORT_INFO_BOX.Text += word + "\r\n";
    }
}

This gave me:

NTF,CTRL,SQL,OPEN,+,-66
NTF,CT RL,DBUSY
,ON,+,-6
3
NTF,CT
RL,DBUSY
,OFF NTF,CT
RL,SQL,C
LOSE

I have also tried:

var regex = new Regex('*'+".+"+'*');
var matches = regex.Matches(data);

But this gave me an error.

What I want to achieve:

The formatted string should look like this:

NTF,CTRL,SQL,OPEN,+,-66
NTF,CTRL,DBUSY,ON,+,-63
NTF,CTRL,DBUSY,OFF
NTF,CTRL,SQL,CLOSE

EDIT:

I have solved the problem with this piece of code:

// Inside the DataReceived handler: append whatever has arrived to the running buffer
SerialPort sp = (SerialPort)sender;
data += sp.ReadExisting();
// Each message is wrapped in STX (\u0002) and ETX (\u0003), so split on those
string[] tmp = data.Split(new char[] {'\u0002','\u0003'}, StringSplitOptions.RemoveEmptyEntries);
foreach (string line in tmp)
{
    if (line.Contains(",80") || line.Contains(",81") || line.Contains(",RXVCALL"))
    {
        // Substring(1) drops the leading '*' of the message
        COM_PORT_INFO_BOX.Text += line.Substring(1) + "\r\n";
        data = "";
    }
}
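
Because `ReadExisting()` can return a chunk that ends in the middle of a message, it may help to emit only complete STX...ETX frames and keep the unfinished tail for the next read. The following is a minimal sketch of that idea, not the code from the post: the `FrameBuffer` class and its `Append` method are hypothetical names, and it assumes every message is sent as STX, `*`, payload, ETX, as the Notepad++ output in the comments suggests.

```csharp
using System;
using System.Collections.Generic;
using System.Text;

// Hypothetical helper (not part of the original post): collects serial chunks
// and returns only complete STX...ETX frames.
public sealed class FrameBuffer
{
    private const char Stx = '\u0002'; // start-of-text control character
    private const char Etx = '\u0003'; // end-of-text control character
    private readonly StringBuilder _pending = new StringBuilder();

    // Appends a chunk and returns every frame that is now complete.
    public List<string> Append(string chunk)
    {
        _pending.Append(chunk);
        string text = _pending.ToString();
        var frames = new List<string>();
        int consumed = 0;

        while (true)
        {
            int start = text.IndexOf(Stx, consumed);
            if (start < 0)
            {
                consumed = text.Length; // no frame start left: discard the rest
                break;
            }

            int end = text.IndexOf(Etx, start + 1);
            if (end < 0)
            {
                consumed = start; // incomplete frame: keep it for the next call
                break;
            }

            // Assumption: the device puts a '*' right after STX; strip it.
            frames.Add(text.Substring(start + 1, end - start - 1).TrimStart('*'));
            consumed = end + 1;
        }

        // Keep only the unconsumed tail in the buffer.
        _pending.Clear();
        _pending.Append(text, consumed, text.Length - consumed);
        return frames;
    }
}
```

In the handler, this would be used roughly as `foreach (string line in buffer.Append(sp.ReadExisting())) { ... }`, with `buffer` being a field that lives as long as the port is open. The existing `Contains` filter can stay inside the loop unchanged.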
  • 3
    "prefferably with regex" Why? – D Stanley Nov 12 '14 at 14:00
  • 3
    Read about `string.Split` method. – Konrad Kokosa Nov 12 '14 at 14:02
  • Because I know that it's better with regex. I tried doing it with a normal split, but then it would put everything into a new array field at a seemingly random point – Giovanni Le Grand Nov 12 '14 at 14:02
  • 1
    You have three asterisks in the example, so would you want just the text between the first two (i.e. pairs of delimiters), or also the text between the second and third? – Guffa Nov 12 '14 at 14:02
  • Also between the second and third, and so on upwards – Giovanni Le Grand Nov 12 '14 at 14:03
  • 2
    What have you tried? What was the result? What is the expected output? Please put more effort into asking a question if you want others to make an effort in answering – Alex Nov 12 '14 at 14:03
  • Or of course, if that's easier, removing the first asterisk and the found text after saving it – Giovanni Le Grand Nov 12 '14 at 14:04
  • Does the text always start and end with an asterisk? – Guffa Nov 12 '14 at 14:06
  • What's inside `data`? – siride Nov 12 '14 at 14:07
  • `data` contains the data received from COM1 – Giovanni Le Grand Nov 12 '14 at 14:09
  • @GiovanniLeGrand You have some problem with your data. `YOUR_EXAMPLE_STRING.Split('*')` gives the exact result you wanted (but with two empty entries - see `StringSplitOptions.RemoveEmptyEntries` in D.Stanley's answer to resolve that). EDIT: see for yourself: http://csharppad.com/gist/8234ba34d9df2170d08c – decPL Nov 12 '14 at 14:26
  • 3
    `String.Split()` is not some new brittle technology. It's pretty straightforward, and it's not going to mess up unless you give it bad input. Double-check the string you are actually getting and make sure it doesn't have, e.g., embedded newlines or extra asterisks. – siride Nov 12 '14 at 14:26
  • The exact data I'm receiving is: *NTF,CTRL,SQL,OPEN,+,-64*NTF,CTRL,DBUSY,ON,+,-63*NTF,CTRL,DBUSY,OFF*NTF,CTRL,SQL,CLOSE*NTF,CTRL,SQL,OPEN,9,-68*NTF,CTRL,DBUSY,ON,+,-63*NTF,CTRL,DBUSY,OFF*NTF,CTRL,SQL,CLOSE*NTF,CTRL,SQL,OPEN,+,-63*NTF,CTRL,DBUSY,ON,+,-63*NTF,CTRL,DBUSY,OFF*NTF,CTRL,SQL,CLOSE (Stack Overflow removed the asterisk at the start of the line) – Giovanni Le Grand Nov 12 '14 at 14:28
  • You might need to pull it up in a text editor or hex editor and make sure there are no special characters. I am concerned that the text is switching between italics and non-italics, indicating that there's something special in your input (probably apostrophes, but who knows). – siride Nov 12 '14 at 14:30
  • 1
    @GiovanniLeGrand Works for that string as well - http://csharppad.com/gist/6754242cff544445aea8 – decPL Nov 12 '14 at 14:30
  • Hmm, let me try once more then, because it didn't work properly. I'll also check in Notepad++ to see if there are any unseen special characters – Giovanni Le Grand Nov 12 '14 at 14:32
  • If I paste the line into Notepad++ I see that it looks like this: STX*NTF,CTRL,SQL,OPEN,9,-70ETXSTX etc. What are the 'STX' and 'ETX'? – Giovanni Le Grand Nov 12 '14 at 14:33
  • @GiovanniLeGrand http://en.wikipedia.org/wiki/Control_character – decPL Nov 12 '14 at 14:39
  • 2
    Your new question is a duplicate of http://stackoverflow.com/questions/6799631/removing-control-characters-from-a-utf-8-string – decPL Nov 12 '14 at 15:02
  • Your question doesn't make sense. The output you show couldn't possibly come from the code you posted. The input string shown at the top of the question does not have the characters `,80` and `,81`. Please update your question so it properly demonstrates the problem you are having. – Chris Dunaway Nov 12 '14 at 15:54

3 Answers

5

I know you said "preferably with regex" but this is cleaner IMHO with String.Split:

string s = "*blablablab,blablabla,blablabla,blablabla*blablabla,blabalbal,blablabla*";
string[] results = s.Split(new [] {'*'}, StringSplitOptions.RemoveEmptyEntries);

results:

String[] (2 items)
----------------------------
blablablab,blablabla,blablabla,blablabla 
blablabla,blabalbal,blablabla 

One thing to remember with String.Split is that if your string begins or ends with a delimiter, you'll get empty entries at the beginning and end, respectively, of the resulting array. Adding the StringSplitOptions.RemoveEmptyEntries parameter removes those empty entries, so you are left with just the two strings between each pair of asterisks.
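
To make the difference concrete, here is a small sketch applied to the string from the question. The `SplitDemo` class and `SplitMessages` method are illustrative names only, not part of any library.

```csharp
using System;

// Illustrative helper; the names are hypothetical.
public static class SplitDemo
{
    // Splits on '*' and drops the empty entries that a leading or
    // trailing delimiter would otherwise produce.
    public static string[] SplitMessages(string input)
    {
        return input.Split(new[] { '*' }, StringSplitOptions.RemoveEmptyEntries);
    }
}
```

For the question's input, `SplitDemo.SplitMessages(...)` returns four entries, one per message. A plain `"*a*b*".Split('*')` returns four entries for only two values, because the leading and trailing asterisks each contribute an empty string.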

D Stanley
  • This would also include strings before the first asterisk and after the last, e.g. from `"asdf*1*2*asdf"` you would not only get `"1"` and `"2"` but also the `"asdf"` before and after. Not sure yet what the OP expects in that case, but the "between" in the title suggests that it might not be desired. – Guffa Nov 12 '14 at 14:09
  • I've tried this and tried to show it in a textbox using: `data = sp.ReadExisting().ToString(); string[] tmp = data.Split(new [] {'*'},StringSplitOptions.RemoveEmptyEntries); foreach(string word in tmp){ COM_PORT_INFO_BOX.Text += word + "\r\n"; }` but it still gives me cuts between words, like this: NTF,CTRL,SQL,OPEN,+,-66 NTF,CT RL,DBUSY ,ON,+,-6 3 NTF,CT RL,DBUSY ,OFF NTF,CT RL,SQL,C LOSE – Giovanni Le Grand Nov 12 '14 at 14:09
  • The final CLOSE is separated after the C – Giovanni Le Grand Nov 12 '14 at 14:12
  • 2
    You must have some embedded whitespace or control characters. That has nothing to do with `string.Split` and your result wouldn't be any different using regex. I would use the debugger to look more closely at those characters and see if you can filter them out. – D Stanley Nov 12 '14 at 14:16
  • 2
    @GiovanniLeGrand What you've written here does not match the question you've asked in the least. Could you update your question with the exact example? The answer provided is valid for your `blabla` string. – decPL Nov 12 '14 at 14:17
  • Sorry, I've edited the question and approved the suggested edit with a real example – Giovanni Le Grand Nov 12 '14 at 14:23
-1

This works for me on regexr.com

The problem with your regex was that the ending "*" needs to serve as both the end of the first entry and the beginning of the second. But since it is already consumed by the first match, it is ignored for the second one.

That's why I used the "\2" backreference:

\2(.+?)(\*)

\2 -> backreference to the second group (\*)
(.+?) -> every character until a "*" is found
(\*) -> the character that ends a single entry
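
For .NET specifically, another way to let one asterisk close an entry and open the next is to use zero-width lookarounds, which check for the `*` without consuming it. This is a sketch of a swapped-in technique, not the pattern above; the `BetweenAsterisks` class and `Extract` method are hypothetical names. Note that the `*` has to be escaped, since an unescaped `*` is a quantifier in a regex.

```csharp
using System;
using System.Collections.Generic;
using System.Text.RegularExpressions;

// Illustrative helper; the names are hypothetical.
public static class BetweenAsterisks
{
    // (?<=\*) requires a '*' just before the match and (?=\*) requires one
    // just after it. Neither consumes the '*', so the same asterisk can end
    // one entry and start the next.
    private static readonly Regex Pattern = new Regex(@"(?<=\*)[^*]+(?=\*)");

    public static List<string> Extract(string input)
    {
        var result = new List<string>();
        foreach (Match match in Pattern.Matches(input))
        {
            result.Add(match.Value);
        }
        return result;
    }
}
```

Unlike a plain split, this only returns text that has an asterisk on both sides, so for `"asdf*1*2*asdf"` it yields `"1"` and `"2"` and skips both `"asdf"` parts.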
democore
-2

Try this one:

([^\*](\.*)[^\*])*

It worked on http://regexstorm.net/tester

[^\*] = match any character which is not *
(\.*) = match any character

So the explanation of the regex is: first match any character that is not *, then match any characters that do not end with *.

I think it is logically correct; I tried it and it matched.

Hope it helps.

Monah
  • Have a look at what this regex actually does. Because I doubt it works as you expect. [Visual Regex Preview (Debuggex.com)](https://www.debuggex.com/r/rHo-qzkep_2Rl6Oy) – AeroX Nov 12 '14 at 14:24
  • I've tried this, but as soon as I use it I get the error: Unrecognized escape sequence – Giovanni Le Grand Nov 12 '14 at 14:26
  • 1
    @GiovanniLeGrand: you shouldn't be using this answer, but if you did go this route, you have to make sure you use C#'s special string literal: `@"([^\*](\.*)[^[\*])*"` (note the @ at the beginning). – siride Nov 12 '14 at 14:27
  • I gave you the solution using C#; please try it on the link I gave you – Monah Nov 12 '14 at 14:51