3

I have the following input string:

string input =
   "Ta005000000000000000000Tb001700000000000000000Sa005000000000000000000" +
   "Sb002500000000000000000F     00000000000000000I     00000000000000000N" +
   "     00000000000000000FS    00000000000000000IS    00000000000000000NS" +
   "    00000000000000000";

I need to separate this string into parts, but the content varies greatly.

I need to turn this string into a list like:

[0] "Ta005000000000000000000"
[1] "Tb001700000000000000000"
[2] "Sa005000000000000000000"
[3] "Sb002500000000000000000"
[4] "00000000000000000I"
[5] "00000000000000000N"
[6] "0000000000000000FS"
[7] "0000000000000000IS"
[8] "0000000000000000NS"
[9] "000000000000000000"

The only thing I know in this case is that the maximum length of each part is 23. So, in this example, I need to split at the 'T' or 'S' in the first part of the returned string. Where those characters do not occur, the parts are separated by spaces (this happens in the last part of the returned string). This is what I tried:

var linq = test.ToString().Split(new[] { 'T', 'S', ' ', '{', '}' }, StringSplitOptions.RemoveEmptyEntries).ToList();

My "test" is a StringBuilder containing the returned characters. This does split the string into a list, but I lose information that is very important in this case: the 'T' and 'S'.

I am not sure I have made this clear, but it seems like something simple and it is giving me a huge headache.

Note: another problem is that in a part such as "0000000000000000FS" I need to keep the "FS" together.

Thank you for your attention,

Felipe Volpatto
  • duplicate question: [C# split string but keep split chars / separators](http://stackoverflow.com/questions/521146/c-sharp-split-string-but-keep-split-chars-separators) – default Jan 18 '12 at 12:16
  • Are you sure your partial strings can only start with 'T' or 'S'? That last part of your input string looks different from the first part. – vgru Jan 18 '12 at 12:20
  • Yes, you're right. The only thing that I know is that the max length of the string is 23. I'm going to edit my question. – Felipe Volpatto Jan 18 '12 at 12:28
  • Shouldn't your 4th string start with an 'F'? – vgru Jan 18 '12 at 12:35
  • If they are 23 chars long, what about your examples 4-9? They are shorter. And if they can end with 'S', shouldn't this be the starting character of the next one? – Matten Jan 18 '12 at 12:35
  • @Matten: I agree, it doesn't make sense. The string has exactly 230 characters, and can be simply split into 10 strings of length 23. But then there is this missing 'F' character all of a sudden. – vgru Jan 18 '12 at 12:39
  • Because the "max" length is 23. The first part is OK, but in the last part I have the problem with the 'S' when the string is separated. – Felipe Volpatto Jan 18 '12 at 12:44
  • @FelipeVolpatto: I still don't understand the rule which allows you to remove the 'F' character completely. You should describe what the data represents, and then it may be easier to understand the actual **rules** of your transformation. If you have additional examples with different inputs, post them. – vgru Jan 18 '12 at 12:47
  • The rule sounds very complex. First you split at ' ', then each line (if longer than 23 characters) by 'S' and 'T' and then you truncate the lines if the second step does not apply. Is this right? – Matten Jan 18 '12 at 12:52
  • Yes, I know, it's difficult for me to explain. I get this string back from a serial port connected to a fiscal printer. Every time I execute a specific method, it returns something like this. In the first part of my string, "Ta005000000000000000000Tb001700000000000000000Sa005000000000000000000Sb002500000000000000000F", I need to count 23 characters and take the substring. In the last part, "00000000000000000I 00000000000000000N ...", I need the substring from the first 0 up to the last character before the space, like "00000000000000000I" and "0000000000000000FS", for example. – Felipe Volpatto Jan 18 '12 at 12:57
  • It's difficult to explain, I hope I have been a little clearer. – Felipe Volpatto Jan 18 '12 at 12:58
  • Have a look at my updated answer – Matten Jan 18 '12 at 13:02
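
The fixed-width reading raised in the comments above (cut the string into consecutive 23-character records) can be sketched as follows. This is only an illustration of that comment, not the approach the asker settled on; the class and method names are hypothetical.

```csharp
using System;
using System.Collections.Generic;

public static class FixedWidthSketch
{
    // Hypothetical helper: cuts `input` into consecutive pieces of `width`
    // characters. The final piece is shorter when the input length is not
    // a multiple of `width`.
    public static List<string> Chunk(string input, int width)
    {
        if (input == null) throw new ArgumentNullException(nameof(input));
        if (width <= 0) throw new ArgumentOutOfRangeException(nameof(width));

        var parts = new List<string>();
        for (int i = 0; i < input.Length; i += width)
        {
            // Math.Min guards the last piece against running past the end.
            parts.Add(input.Substring(i, Math.Min(width, input.Length - i)));
        }
        return parts;
    }
}
```

If the input really is 230 characters long, as the comment states, `FixedWidthSketch.Chunk(input, 23)` would return ten records, and the letter codes with their padding spaces (for example the 'F' record) would be kept rather than dropped. That differs from the list shown in the question, so it only fits if the padding is part of the data.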

3 Answers

6

Split removes the splitting character. Just insert a space before each of these characters with Replace, as shown below, and then split at the space character:

var linq = myRates.Replace("T"," T").Replace("S"," S").Split(new[] { ' ', '{', '}' }, StringSplitOptions.RemoveEmptyEntries).ToList();

EDIT

This rule is very complicated. Maybe this solves your problem.

string input =
   "Ta005000000000000000000Tb001700000000000000000Sa005000000000000000000" +
   "Sb002500000000000000000F     00000000000000000I     00000000000000000" +
   "N     00000000000000000FS    00000000000000000IS    00000000000000000" +
   "NS    00000000000000000";

First step: split at ' '

string[] spaceSplit = input.Split(new[] { ' ' }, StringSplitOptions.RemoveEmptyEntries);

Now spaceSplit looks like this:

[0] "Ta005000000000000000000Tb001700000000000000000Sa005000000000000000000Sb002500000000000000000F"
[1] "00000000000000000I"
[2] "00000000000000000N"
[3] "0000000000000000FS"
[4] "0000000000000000IS"
[5] "0000000000000000NS"
[6] "000000000000000000"

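Now split each line at 'T' and 'S' if it is longer than 23 characters:

List<string> temp = new List<string>();
foreach (string s in spaceSplit)
{
  if (s.Length > 23)
    // Insert a space before each 'T'/'S', then split; RemoveEmptyEntries
    // drops the empty entry caused by the leading space.
    temp.AddRange(s.Replace("T", " T").Replace("S", " S")
                   .Split(new[] { ' ' }, StringSplitOptions.RemoveEmptyEntries));
  else
    temp.Add(s);
}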

temp.ToArray() yields

[0] "Ta005000000000000000000"
[1] "Tb001700000000000000000"
[2] "Sa005000000000000000000"
[3] "Sb002500000000000000000F"
[4] "00000000000000000I"
[5] "00000000000000000N"
[6] "0000000000000000FS"
[7] "0000000000000000IS"
[8] "0000000000000000NS"
[9] "000000000000000000"

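Last step: cut each entry to at most 23 characters (entries that are already shorter are kept as they are, since Substring(0, 23) would throw on them):

var linq = (from s in temp select s.Length > 23 ? s.Substring(0, 23) : s).ToList();

et voilà, linq is the list you want. But for other input combinations this "algorithm" might break.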
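
For convenience, the three steps above can be gathered into one method. This is a minimal sketch under the same assumptions as the steps themselves (parts start with 'T' or 'S' or are separated by spaces, and no part is longer than 23 characters); the names `RecordSplitterSketch` and `SplitRecords` are made up for illustration.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class RecordSplitterSketch
{
    // Hypothetical wrapper around the steps described above.
    public static List<string> SplitRecords(string input, int maxLength = 23)
    {
        var result = new List<string>();

        // Step 1: split on spaces, dropping the empty entries that runs of
        // padding spaces would otherwise produce.
        string[] tokens = input.Split(new[] { ' ' }, StringSplitOptions.RemoveEmptyEntries);

        foreach (string token in tokens)
        {
            if (token.Length <= maxLength)
            {
                // Short tokens are kept whole, so a trailing "FS" stays together.
                result.Add(token);
                continue;
            }

            // Step 2: put a space in front of every 'T' and 'S', then split again.
            string[] pieces = token.Replace("T", " T").Replace("S", " S")
                                   .Split(new[] { ' ' }, StringSplitOptions.RemoveEmptyEntries);

            // Step 3: cut each piece to at most maxLength characters.
            result.AddRange(pieces.Select(
                p => p.Length > maxLength ? p.Substring(0, maxLength) : p));
        }
        return result;
    }
}
```

Note that step 3 silently drops whatever follows the first 23 characters of a long piece (the trailing 'F' in the example), which is what the list in the question asks for but may not be what other inputs need.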

Matten
  • Yes, I know, Regex is much better, but I'm unable to remember these things :) It's like a mystical silver bullet that's creeping me out... – Matten Jan 18 '12 at 12:30
  • Depends on whether the person who later maintains the code knows regex or not :) I would understand this code better than the regex. – default Jan 18 '12 at 12:36
  • Yes, it works, but only in part. The problem is that, for example, in "0000000000000000FS" I need to keep the "FS" together. Using this code, the 'S' that I need to keep is always separated. Do you understand? – Felipe Volpatto Jan 18 '12 at 12:40
  • Yes I do. But maybe you should rephrase your question because the splitting condition is unclear... – Matten Jan 18 '12 at 12:41
  • Yes, I know, but it's difficult to explain :(. Using it, I get [6] "00000000000000000F" and [7] "S". I need these two parts together. – Felipe Volpatto Jan 18 '12 at 12:48
  • Perfect, perfect! You understand exactly what I wanted! Thanks a lot, friend! – Felipe Volpatto Jan 18 '12 at 13:12
3

Something like this?

string[] separatedString = Regex.Split(s, @"(?=[TS ])");

Then you just have to remove the "empty" elements if you want.
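
A minimal sketch of that clean-up step, assuming "empty" elements means both empty strings and the whitespace-only pieces that the lookahead produces between padding spaces. The helper name is hypothetical.

```csharp
using System;
using System.Linq;
using System.Text.RegularExpressions;

public static class LookaheadSplitSketch
{
    // Splits in front of every 'T', 'S' or space (the delimiter stays at the
    // start of the following piece), then trims each piece and drops the
    // ones that end up empty.
    public static string[] SplitKeepingDelimiters(string s)
    {
        return Regex.Split(s, @"(?=[TS ])")
                    .Select(part => part.Trim())
                    .Where(part => part.Length > 0)
                    .ToArray();
    }
}
```

Because the pattern splits in front of every 'S', a piece ending in "FS" is still broken into "...F" and "S", which is the limitation pointed out in the comments on this answer.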

Ivan Crojach Karačić
  • Beat me to it in few seconds! That's indeed what the OP here is after as far as I can see. – Shadow The GPT Wizard Jan 18 '12 at 12:19
  • [duplicate answer](http://stackoverflow.com/a/521172/238902) to the [duplicate question](http://stackoverflow.com/questions/521146/c-sharp-split-string-but-keep-split-chars-separators) :P – default Jan 18 '12 at 12:34
  • Looks like it is... I am also getting myself familiarized with regular expressions on [this](http://www.codeproject.com/KB/dotnet/regextutorial.aspx) address and had this line of code already in my "test" solution :) – Ivan Crojach Karačić Jan 18 '12 at 12:38
  • Yes, it works, but only in part. The problem is that, for example, in "0000000000000000FS" I need to keep the "FS" together. Using this code, the 'S' that I need to keep is always separated. – Felipe Volpatto Jan 18 '12 at 12:38
  • @Default It isn't a duplicate, because the situation is a little bit different :( – Felipe Volpatto Jan 18 '12 at 12:46
  • @Default: It's a trivial answer; trivial answers tend to be similar to each other. –  Jan 20 '12 at 17:20
1

Use a Regular Expression to do this:

string[] parts = Regex.Split(test.ToString(), @"(?<=[TS\s\{\}])");

I'm not sure if the curly braces are formatted correctly.
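
On the braces: inside a character class, `{` and `}` are ordinary characters in .NET regular expressions, so they work with or without the backslash. What matters more is the direction of the lookaround. The sketch below (hypothetical helper names) contrasts a lookbehind, which splits after the delimiter, with a lookahead, which splits before it.

```csharp
using System;
using System.Text.RegularExpressions;

public static class LookaroundSketch
{
    // Lookbehind: the split point is AFTER each delimiter, so the delimiter
    // ends up at the end of the preceding piece.
    public static string[] SplitAfter(string s)
    {
        return Regex.Split(s, @"(?<=[TS\s{}])");
    }

    // Lookahead: the split point is BEFORE each delimiter, so the delimiter
    // starts the following piece. A match at position 0 yields a leading
    // empty string.
    public static string[] SplitBefore(string s)
    {
        return Regex.Split(s, @"(?=[TS\s{}])");
    }
}
```

For the input "TaTb", `SplitAfter` gives "T", "aT", "b", while `SplitBefore` gives "", "Ta", "Tb". For keeping 'T' or 'S' at the start of each part, the lookahead form is probably the one wanted.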

Gustavo F