4

For school I had to complete an assignment, which I have already handed in, but the code I wrote is awful and I don't like what I ended up with. So I'm curious: what would be considered the best possible way to solve the following question in C#?

'//4 How many times does “queen” occur in the Alice in Wonderland book? Write some code to count them.'

link to the book (pastebin): book

my code (pastebin): my code (ugly)

When writing your answer, please ignore my code. Also, explain what your code does and why you think it's the best possible solution. The number of times the word "queen" occurs in the book should be 76.

FrankK
  • 6
    You should probably check out http://codereview.stackexchange.com/. It was created specifically to review code, whereas Stack Overflow is designed more to help troubleshoot broken code. – Derek Van Cuyk Nov 19 '15 at 15:27
  • Look into [String.IndexOf()](https://msdn.microsoft.com/de-de/library/7cct0x33%28v=vs.110%29.aspx) – Thomas Weller Nov 19 '15 at 15:28
  • Possible duplicate of [The fastest way to count string occurences in another string in c#](http://stackoverflow.com/questions/29038057/the-fastest-way-to-count-string-occurences-in-another-string-in-c-sharp) – Thomas Weller Nov 19 '15 at 15:31
  • 8
    You ask for the "best" way without saying what your metric for goodness is. The shortest code? The easiest to understand? The most flexible in the face of changes? The code that uses the least memory? The fastest for a single run? The fastest if precomputation is allowed? These and many many more things are all possible metrics that you have to consider when designing real world code. – Eric Lippert Nov 19 '15 at 15:34
  • @EricLippert The code that uses the least memory or the shortest code. Also, since I'm relatively new to C# (and coding in general), I'm asking what YOU would consider to be the best possible solution for this problem. – FrankK Nov 19 '15 at 15:38
  • 1
    @user3499284 The point is the best solution depends on the actual requirements. All of the things that Eric listed are basically equally valid, so you'll get a different answer for "best" for every person you ask, which ultimately isn't terribly helpful. – Kyle Nov 19 '15 at 15:47
  • I seriously doubt memory usage would be an issue for a task like this. Not impossible, just unlikely. Performance, on the other hand is a curiosity. I am inclined to think that `Regex` and `Split` are slower than `IndexOf`, but it's worth a benchmark to see if that's actually the case – Hambone Nov 19 '15 at 15:48
  • possible duplicate of http://stackoverflow.com/questions/541954/how-would-you-count-occurrences-of-a-string-within-a-string – Thomas Weller Nov 19 '15 at 15:53
  • Your belief that "queen" appears 76 times is incorrect. The word "queen" appears zero times. "Queen's" appears 7 times, "Queens" once, "QUEEN" once, and "Queen" 65 times. You implicitly assume that all of these are hits, but why should "Queens" be a hit for "queen"? Suppose "queenship" appeared -- which it does not -- would that count as a hit for "queen"? What about "unqueenlike"? Should it be a hit? – Eric Lippert Nov 19 '15 at 16:12
  • @EricLippert I should have added that Programmeren2Tests.Chapter12Test.TestExercise4(Exercise4); is the same thing as: Assert.AreEqual(76,Exercise4); – FrankK Nov 19 '15 at 17:26

5 Answers

4

I won't post the full code, as I think it is useful for you to try this as an exercise, but I would personally go for a solution with the IndexOf overload that takes a starting position.

So something like (note: intentionally incorrect):

int startingPosition = 0;
int numberOfOccurrences = 0;
do {
  startingPosition = fullText.IndexOf("queen", startingPosition);
  numberOfOccurrences++;
} while( matchFound );
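
One possible way to complete the loop above, offered as a sketch and not as the only correct answer. It makes three assumptions that the outline does not state: matching is case-insensitive (the book mostly has "Queen", not "queen"), occurrences do not overlap, and the loop stops as soon as IndexOf returns -1. The names OccurrenceCounter and Count are made up for this illustration.

```csharp
using System;

public static class OccurrenceCounter
{
    // Counts non-overlapping, case-insensitive occurrences of term in text.
    // Hypothetical helper; the class and method names are not from the answer.
    public static int Count(string text, string term)
    {
        if (string.IsNullOrEmpty(text) || string.IsNullOrEmpty(term))
        {
            return 0;
        }

        int count = 0;
        int position = 0;

        // IndexOf returns -1 when there is no further match, which ends the loop.
        while ((position = text.IndexOf(term, position, StringComparison.OrdinalIgnoreCase)) >= 0)
        {
            count++;
            // Continue searching after the match that was just found.
            position += term.Length;
        }

        return count;
    }
}
```

Called as OccurrenceCounter.Count(fullText, "queen"), this counts every place the letters appear, so "Queens" and "Queen's" are included as well.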
CompuChip
2

The shortest way to write it is to use Regex. It finds the matches for you, so you only need to read the count. Regex also has an ignore-case option, so you don't have to call ToLower on a big string. So after you read the file:

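// Requires: using System; using System.IO; using System.Text.RegularExpressions;
string aliceFile = Path.Combine(Environment.CurrentDirectory, "bestanden\\alice_in_wonderland.txt");
string text = File.ReadAllText(aliceFile);

// Case-insensitive match, so "Queen" and "QUEEN" are counted too.
Regex r = new Regex("queen", RegexOptions.IgnoreCase);
var count = r.Matches(text).Count;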

Also, because the input is very large but the pattern is simple, you can try RegexOptions.Compiled. It may make matching faster, at the cost of a slower Regex construction, so it is worth measuring.

Regex r = new Regex("queen", RegexOptions.IgnoreCase | RegexOptions.Compiled);
var count = r.Matches(text).Count;
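
Whether "Queens" or "Queen's" should count as a hit depends on the pattern, as the comments under the question point out. The sketch below contrasts a plain substring pattern with a whole-word pattern built from the \b word-boundary anchor. The class and method names are hypothetical, and which of the two counts the exercise expects is an assumption you would have to check against the test.

```csharp
using System.Text.RegularExpressions;

public static class QueenPatterns
{
    // Counts every case-insensitive occurrence, including inside longer words.
    public static int CountSubstring(string text, string term)
    {
        return Regex.Matches(text, Regex.Escape(term), RegexOptions.IgnoreCase).Count;
    }

    // Counts only whole-word occurrences; \b matches at a word boundary.
    public static int CountWholeWord(string text, string term)
    {
        string pattern = @"\b" + Regex.Escape(term) + @"\b";
        return Regex.Matches(text, pattern, RegexOptions.IgnoreCase).Count;
    }
}
```

For the sample string "The Queen's QUEEN met two Queens.", CountSubstring returns 3 and CountWholeWord returns 2: the apostrophe in "Queen's" is a word boundary, while the trailing "s" in "Queens" is not.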
M.kazem Akhgary
1

You could write a string extension method to split on more than one character....

// Extension methods must be declared in a static class.
public static class StringExtensions
{
    public static string[] Split(this string s, string separator)
    {
        return s.Split(new string[] { separator }, StringSplitOptions.None);
    }
}

....And just use the string you are searching for as the separator; the result is then the length of the array minus 1.

string s = "How now brown cow";
string searchS = "ow";
int count = s.Split(searchS).Length - 1;

The actual array returned by Split would be ....

["H", " n", " br", "n c", ""]

That is five elements, so the count is 4. And extension methods ALWAYS come in handy again in the future.

AntDC
1

Could also use a regular expression:

 string s = "Hello my baby, Hello my honey, Hello my ragtime gal";
 int count = Regex.Matches(s, "Hello").Count;
Hambone
0

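Or you could use some LINQ to do much the same thing. Note that this counts the space-separated tokens that contain the search string, so a token containing it twice is still counted once: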

string words = "Hi, Hi, Hello, Hi, Hello";  //"hello1 hello2 hello546 helloasdf";
var countList = words.Split(new[] { " " }, StringSplitOptions.None);
int count = countList.Count(s => s.Contains("Hi"));  // requires: using System.Linq;
MethodMan