3

I know a paragraph break is char 13 + char 10 (CR + LF), so I do:

StreamReader sr = new StreamReader(path); // path: placeholder for the input file
string s = sr.ReadToEnd();
string s1 = s.Replace((char)10, '*'); // LF; Replace takes char/char or string/string
string s2 = s1.Replace((char)13, '*'); // CR

Now each paragraph break has become two asterisks (**), but how do I split on two characters? Does anyone have an alternative way to split paragraphs? I am looking for either:

  1. a way of easily splitting paragraphs, OR
  2. a way of splitting by two chars
Oded
user1243565
  • possible duplicate of [string.split - by multiple character delimiter](http://stackoverflow.com/questions/1254577/string-split-by-multiple-character-delimiter) – Chris Haas Mar 12 '12 at 16:04

5 Answers

5
string doc = "line1\r\nline2\r\nline3";
var docLines = doc.Split(new string[] { "\r\n" }, System.StringSplitOptions.None);

Alternatively, you could use Environment.NewLine, which matches the line ending of the platform the code runs on.

var docLines = doc.Split(new string[] { Environment.NewLine }, System.StringSplitOptions.None);
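If the file might come from another platform, Environment.NewLine may not match what is actually in the text. A minimal sketch that accepts both "\r\n" and "\n" as separators follows; the class and method names (LineSplitter, SplitLines) are made up for illustration and are not part of the framework.

```csharp
using System;

public static class LineSplitter
{
    // Hypothetical helper: splits text into lines, accepting both
    // Windows ("\r\n") and Unix ("\n") line endings.
    public static string[] SplitLines(string text)
    {
        // Split scans the input left to right, so a "\r\n" pair is
        // consumed as a single separator rather than leaving a stray "\r".
        return text.Split(new[] { "\r\n", "\n" }, StringSplitOptions.None);
    }
}
```

Called as LineSplitter.SplitLines(doc), this should give the same three lines for the sample string above whether it uses "\r\n" or "\n".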
CrazyDart
3

Assuming you mean ASCII cr+lf (13+10), just use StreamReader.ReadLine().
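A rough sketch of what that loop could look like. It takes a TextReader so it works with a StreamReader over a file as well as a StringReader; the names LineReader and ReadAllLines are only illustrative.

```csharp
using System.Collections.Generic;
using System.IO;

public static class LineReader
{
    // Hypothetical helper: reads every line from the reader into a list.
    // ReadLine strips the line terminator and returns null at end of input.
    public static List<string> ReadAllLines(TextReader reader)
    {
        var lines = new List<string>();
        string line;
        while ((line = reader.ReadLine()) != null)
        {
            lines.Add(line);
        }
        return lines;
    }
}
```

This avoids holding the whole file in one string plus the split array, since each line is read as it is needed.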

tomfanning
2

Have you tried Regex? Windows uses \r (13) followed by \n (10) as a line separator, so splitting on that pair gives you lines. But if you want blocks of text separated by at least one empty line, you might try this:

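 string inputString = sr.ReadToEnd();

 // (?:...) is non-capturing; with a capturing group, Regex.Split would
 // also return the captured separators as elements of the array.
 string[] paragraphs = Regex.Split(inputString, "(?:\r\n){2,}");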
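If the input may contain bare \n line endings too, the same idea can be written with an optional \r. This is only a sketch under that assumption, with made-up names (ParagraphSplitter, SplitParagraphs):

```csharp
using System.Text.RegularExpressions;

public static class ParagraphSplitter
{
    // Hypothetical helper: splits text into paragraphs wherever two or more
    // consecutive line breaks occur. "\r?\n" accepts both "\r\n" and "\n".
    public static string[] SplitParagraphs(string text)
    {
        // Non-capturing group, so the separators are not returned in the result.
        return Regex.Split(text, @"(?:\r?\n){2,}");
    }
}
```

A single line break inside a paragraph is left alone; only runs of two or more line breaks act as paragraph separators.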
Mithrandir
1

See string.Split(string[], StringSplitOptions):

var result = s2.Split(new[] { "**" }, StringSplitOptions.RemoveEmptyEntries);

You can also do it with Environment.NewLine, without converting the line breaks to ** first:

var result = s.Split(new[] { Environment.NewLine }, StringSplitOptions.RemoveEmptyEntries);
Saeed Amiri
  • The problem with the ** is that he is copying it into 2 more strings to make that happen... if that file was large it would eat memory like crazy, thus I would avoid the ** approach completely. – CrazyDart Mar 12 '12 at 16:15
  • @CrazyDart, I wrote both ways before your comment and your answer. I also just edited it to add a link to the MSDN documentation (again, before your comment). – Saeed Amiri Mar 12 '12 at 16:18
  • I don't dispute that; all I am saying is that the OP should not use the ** method because of the heavy lifting. It's not optimal. I am guessing the only reason that was done was to reveal the \r\n. In the end, Mithrandir may have the better answer... the regex engine will likely work faster. – CrazyDart Mar 12 '12 at 16:26
  • @CrazyDart, I agree with you, but it is good for the OP to know it step by step. – Saeed Amiri Mar 12 '12 at 16:49
0

Use a regular expression if your splitting criterion is simple.

Ani