We are trying to build a change log of the Microsoft KB updates we apply during our normal maintenance cycles. We want to parse the information below for particular lines. A sample is below:

Operation           : 1
ResultCode          : 2
HResult             : 0
Date                : 10/7/2014 10:27:50 AM
UpdateIdentity      : System.__ComObject
Title               : Update for Microsoft Silverlight (KB2977218)
Description         : This update to Silverlight improves security, reliability, accessibility support, startup performance, enhances line-of-business support and includes several fixes to better support rich internet applications. This update is backward compatible with web applications built using previous versions of Silverlight.
UnmappedResultCode  : 0
ClientApplicationID : AutomaticUpdates
ServerSelection     : 1
ServiceID           : 
UninstallationSteps : System.__ComObject
UninstallationNotes : 
SupportUrl          : http://go.microsoft.com/fwlink/?LinkID=105787
Categories          : System.__ComObject

Our desired output is:

Title               : Update for Microsoft Silverlight (KB2977218)
Date                : 10/7/2014 10:27:50 AM
Description         : This update to Silverlight improves security, reliability, accessibility support, startup performance, enhances line-of-business support and includes several fixes to better support rich internet applications. This update is backward compatible with web applications built using previous versions of Silverlight.

I am trying to write a simple C# application in which we paste the raw data into a rich text box, click a button, and get the desired output in another rich text box. The data follows a "Keyword : Data" pattern that might be useful.

I have created the form and the controls on it. I tried to find a method that searches for a keyword, but that did not give the result we want: we need the whole line for each keyword, and as you can see, the description could span multiple lines.

I currently don't have any sample code to post, as I don't know where to begin. Any sample code would be helpful.

Dominic Zukiewicz
Deadphoenix
  • Is it really multiple lines or are you just guessing that because that's how your text has been wrapped? Are there actually `\n` characters in the string following `Description :`? Notice how the latest edit indicates that `Description` is just one really long line. – Matt Burland Oct 29 '14 at 18:24
  • Many easy ways to do this. You can either write a regex that matches [anything]:[anything] and then read your matches and only take those with the first element matching "title" or "date" or "description" or alternatively, tokenize it by using split() on line ends (\r\n) and on colon (:). – kha Oct 29 '14 at 18:27
  • Have you tried this? http://blogs.technet.com/b/heyscriptingguy/archive/2009/03/09/how-can-i-list-all-updates-that-have-been-added-to-a-computer.aspx Also this: http://stackoverflow.com/questions/922132/use-c-sharp-to-interact-with-windows-update – dbc Oct 29 '14 at 18:28
  • Looks like you could get PowerShell to sort this out for you. If it's event log details, do you really want to extract a record, copy it into a textbox and then split it? Sounds like a lot of manual work. – Dominic Zukiewicz Oct 29 '14 at 18:30
  • @dbc and Dominic Zukiewicz - I just need a text parser for the task I am trying to address for another peer. But thank you for the suggestions. – Deadphoenix Oct 29 '14 at 19:36

2 Answers

You can try the following Regex pattern:

(?<=\b[KEYWORD]\b\s*:\s*).*

Simply replace [KEYWORD] with the actual keyword you're looking for. For example (?<=\bTitle\b\s*:\s*).* would return Update for Microsoft Silverlight (KB2977218). Here's how you'd use it in the code:

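// Requires: using System.Text.RegularExpressions;
private string GetDataFromKeyword(string source, string keyword)
{
    // Regex.Escape guards against keywords that contain regex metacharacters.
    string pattern = string.Format(@"(?<=\b{0}\b\s*:\s*).*", Regex.Escape(keyword));
    return Regex.Match(source, pattern).Value.Trim();
}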

And call it like:

string data = GetDataFromKeyword(textbox.Text, "Title");

Explanation of the pattern:

(?<=): Is the notation for positive look-behind.

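\b[KEYWORD]\b\s*:\s*: Matches the whole word [KEYWORD], followed by any amount of whitespace, followed by :, followed by any amount of whitespace.

.*: Matches everything after the look-behind up to the end of the line (. does not match \n by default), which is essentially your Data in the Keyword: Data pair. The Trim() call removes the leading space and any trailing \r.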
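To see the pattern in action outside of a form, here is a minimal console sketch. The class name `LookBehindDemo` and the two-line sample string are made up for illustration; the keyword is passed through `Regex.Escape`, which is an extra precaution and not part of the pattern described above.

```csharp
using System;
using System.Text.RegularExpressions;

public static class LookBehindDemo
{
    // Builds the look-behind pattern for one keyword and returns the first match.
    public static string GetDataFromKeyword(string source, string keyword)
    {
        string pattern = string.Format(@"(?<=\b{0}\b\s*:\s*).*", Regex.Escape(keyword));
        return Regex.Match(source, pattern).Value.Trim();
    }

    public static void Main()
    {
        // Hypothetical two-line excerpt of the pasted data.
        string sample =
            "Date                : 10/7/2014 10:27:50 AM\r\n" +
            "Title               : Update for Microsoft Silverlight (KB2977218)\r\n";

        // prints "Update for Microsoft Silverlight (KB2977218)"
        Console.WriteLine(GetDataFromKeyword(sample, "Title"));

        // prints "10/7/2014 10:27:50 AM" (the colons inside the time are part of the data)
        Console.WriteLine(GetDataFromKeyword(sample, "Date"));
    }
}
```

Note that the colons inside the time value do not confuse the match, because the look-behind only accepts a colon that directly follows the keyword.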

Edit

If you have multiple instances of a given keyword, you can use the Matches() method instead of Match():

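// Requires: using System.Collections.Generic; using System.Linq; using System.Text.RegularExpressions;
private IEnumerable<string> GetDataFromKeyword(string source, string keyword)
{
    // Regex.Escape guards against keywords that contain regex metacharacters.
    return Regex.Matches(source, string.Format(@"(?<=\b{0}\b\s*:\s*).*", Regex.Escape(keyword)))
            .Cast<Match>().Select(match => match.Value.Trim());
}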

Now var data = GetDataFromKeyword(textbox.Text, "Title"); returns a sequence of matches that you can enumerate:

var titles = GetDataFromKeyword(textbox.Text, "Title").ToArray();
var dates = GetDataFromKeyword(textbox.Text, "Date").ToArray();
var descriptions = GetDataFromKeyword(textbox.Text, "Description").ToArray();

// titles is an array here, so Length avoids an extra LINQ Count() call.
for (int i = 0; i < titles.Length; i++)
{
    string block = string.Format("Title: {0}, Date: {1}, Description: {2}", titles[i], dates[i], descriptions[i]);
    MessageBox.Show(string.Format("Block {0}: {1}", i + 1, block));
}

Note that this assumes you have the same number of title, date, and description entries; if one entry is missing a field, the arrays no longer line up. I'm not sure what your requirements are, so treat this as an example of iterating over the lists and change it to suit your needs.
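If the counts can differ, one way around the alignment problem is to split the pasted text into entries first and then look up the three keywords inside each entry. The sketch below assumes that entries are separated by at least one blank line, which I have not verified against your real data. `BlockParser`, `GetValue` and `FormatBlocks` are names invented for this example.

```csharp
using System;
using System.Collections.Generic;
using System.Text.RegularExpressions;

public static class BlockParser
{
    // Returns the value for one keyword inside a single block, or "" if it is absent.
    private static string GetValue(string block, string keyword)
    {
        // With RegexOptions.Multiline, ^ anchors the keyword to the start of a line,
        // so a keyword that merely appears inside a description is ignored.
        string pattern = string.Format(@"^{0}[ \t]*:[ \t]*(.*)$", Regex.Escape(keyword));
        Match m = Regex.Match(block, pattern, RegexOptions.Multiline);
        return m.Success ? m.Groups[1].Value.Trim() : string.Empty;
    }

    // Pads the keyword to 20 characters to mimic the layout of the pasted data.
    private static string Line(string keyword, string value)
    {
        return string.Format("{0,-20}: {1}", keyword, value);
    }

    public static List<string> FormatBlocks(string source)
    {
        var output = new List<string>();

        // Assumption: entries are separated by at least one blank line.
        foreach (string block in Regex.Split(source.Trim(), @"\r?\n\s*\r?\n"))
        {
            string title = GetValue(block, "Title");
            if (title.Length == 0) continue; // skip fragments without a Title line

            output.Add(string.Join("\n", new[]
            {
                Line("Title", title),
                Line("Date", GetValue(block, "Date")),
                Line("Description", GetValue(block, "Description"))
            }));
        }
        return output;
    }
}
```

In the button handler you could then write something like `outputBox.Text = string.Join("\n\n", BlockParser.FormatBlocks(inputBox.Text));`, where `inputBox` and `outputBox` stand in for whatever your two rich text boxes are called. A missing Date or Description simply comes out empty for that entry and does not shift the others.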

Arian Motamedi
  • What about situations where there will be multiple Entries pasted in the First Text box? This will only address 1 instance. So it appears I need a way of counting different "Block" of data then parse each with a loop? – Deadphoenix Oct 29 '14 at 19:19
  • @Deadphoenix If by multiple entries you mean multiple instances of a keyword, then you can still do it using regexes. See my edited answer. – Arian Motamedi Oct 29 '14 at 19:41
  • I am sorry, I am still learning C#. How would I enumerate through the list? My attempts have been unsuccessful. Additionally, I would like to return title, date and description in blocks when I do the output. – Deadphoenix Oct 29 '14 at 20:19
  • Just use a `for` or `foreach` loop. Result of `GetDataFromKeyword()` is just a list of strings (matches). – Arian Motamedi Oct 29 '14 at 20:27
  • Thank you I was on the right track but your version is much more elegant. – Deadphoenix Oct 29 '14 at 21:30

I'm usually not a fan of regex-based solutions -- there's almost always a more readable way to accomplish your goal.

Something like this should get you started. Lots of opportunity to refactor, too:

var keywords = new List<string>() { "Keyword1", "Keyword2", "Keyword3" };

var lines = File.ReadLines(@"c:\path\to\file.txt");

foreach (var line in lines)
{
    foreach (var keyword in keywords)
    {
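        // Ordinal comparison avoids culture-sensitive matching (needs using System;).
        if (line.StartsWith(keyword, StringComparison.Ordinal))
        {
            // found a match, do something.
            // Split on ":"? etc.
        }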
    }
}

As I said, very quick and dirty, but 1) it works, 2) it's readable, and 3) there's a lot of easy refactoring that you can do.
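To flesh that out for the pasted text from the question, here is a rough sketch of the same line-by-line idea. It is a variant, not the loop above: it compares the text before the first colon against the keyword list and does not use `StartsWith`, so a key that merely begins with a keyword is not picked up. It reads from a string (as you would get from a text box) and not from a file. `LineParser` and `FilterLines` are names made up for this example.

```csharp
using System;
using System.Collections.Generic;
using System.IO;

public static class LineParser
{
    private static readonly string[] Keywords = { "Title", "Date", "Description" };

    // Returns the lines whose key (the text before the first ':') is one of the keywords.
    public static List<string> FilterLines(string source)
    {
        var kept = new List<string>();
        using (var reader = new StringReader(source))
        {
            string line;
            while ((line = reader.ReadLine()) != null)
            {
                int colon = line.IndexOf(':');
                if (colon < 0) continue; // not a "Keyword : Data" line

                // Only the first colon separates key from data, so times and URLs are safe.
                string key = line.Substring(0, colon).Trim();
                if (Array.IndexOf(Keywords, key) >= 0)
                {
                    kept.Add(line.TrimEnd());
                }
            }
        }
        return kept;
    }
}
```

The lines come back in the order they appear in the input (Date before Title in the sample), so reorder them if you need Title first. This also assumes each value sits on a single line, as the comments under the question suggest; a description that really contained line breaks would need extra handling.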

Ian P