I am matching a string against a regex. The string is converted from HEX and, in some scenarios, contains a garbage value.

So the string looks like this:

$<MSG.Info.ServerLogin>
$DeviceName=Unnamed AVL
$Software=avl_2.12.0_rc14 (TestingString)
$Hardware=TST3 rev:03-NUCH
$IMEI=123456789789456
$PhoneNumber=0012345679
$LocalIP=123.456.789.96
$LastValidPosition=GPRMC,062423.000,A,2500.5000,N,05000.1000,E,5.34,142.12,210815,,
$Security=0
$CmdVersion=2
$SUCCESS
$<end>

And this is my Regex:

^(?<Header>\$\<MSG\.Info\.ServerLogin\>\s*)\$DeviceName=(?<DeviceName>[_0-9A-Za-z:\-\s]*)\$Software=(?<Software>[_0-9A-Za-z:\.\s\(\)=]*)\$Hardware=(?<Hardware>[-_0-9A-Za-z:\.\]\[\(\)\s]*)\$IMEI=(?<IMEI>[0-9]{15,}\s*)\$PhoneNumber=(?<PhoneNumber>[0-9A-Za-z\s]*)\$LocalIP=(?<LocalIP>[-_0-9A-Za-z\.\s]*)(?<GPRMCHeader>\$LastValidPosition=GPRMC),(?<UTCTime>[0-9]{6}\.{1}[0-9]{3}),(?<Status>[ALSV]{1}),(?<Latitude>[0-9]{4}\.{1}[0-9]{4},[NS]),(?<Longitude>[0-9]{5}\.{1}[0-9]{4}'?\"?,[EW]),(?<SpeedOverGround>[0-9\.]*),(?<TrackTrue>[0-9\.]*),(?<UTCDate>[0-9]+),(?<MagneticVariation>[0-9\.]*),(?<MagneticDirection>[EW\s]*)\$Security=(?<Security>[0-9\s]*)\$CmdVersion=(?<CmdVersion>[0-9\s]*)\$SUCCESS\s(?<Footer>[\$\<end\>\s])

When the PhoneNumber has a value, everything matches.

But when the PhoneNumber contains a garbage value instead of a number, it doesn't match.

Is there a way to handle this in a regex? I am using C#.

This is my code:

var regexLoginMessage = new Regex(msgPattern, RegexOptions.IgnoreCase);
var regexMatch = regexLoginMessage.Match(strMessage);
Dawood Awan
  • Are you sure you need regex? You can parse the file line by line, reach `PhoneNumber` field and see if there is a valid number or not. [Now you have two problems](http://blog.codinghorror.com/regular-expressions-now-you-have-two-problems/) – Sriram Sakthivel Aug 21 '15 at 06:46
  • @SriramSakthivel I am maintaining an existing system and I don't want to mess with the rest of the code. This is not a file read; it is a TCP service which receives the string in HEX. – Dawood Awan Aug 21 '15 at 06:48
  • Why would you create a mess? Just write unit tests to make sure your modification worked! Simple. (Btw, it doesn't matter whether it is a file or a plain string; my suggestion will work in either case.) – Sriram Sakthivel Aug 21 '15 at 06:50
  • @SriramSakthivel Checking this text line by line is not an option here: different types of input strings come through the system, and we have to determine the business logic from them. It is not a file read. – Dawood Awan Aug 21 '15 at 06:52
  • I would suggest not to use Regex either. It does not seem like the text is regular at all – Bauss Aug 21 '15 at 06:52
  • I would guess that editing this monster of a regex would be likelier to create a mess to your legacy application than rewriting it like @SriramSakthivel suggests. – steinar Aug 21 '15 at 06:53
  • It seems your message is "plain ASCII" with the exception of some occasional non-printing char. A Q&D solution could be to just remove non-ASCII before parsing, as described [here](http://stackoverflow.com/questions/123336/how-can-you-strip-non-ascii-characters-from-a-string-in-c). – Micke Aug 21 '15 at 06:53
  • @Micke Now that sounds like a good idea. Will give it a try and let you know. Cheers – Dawood Awan Aug 21 '15 at 06:54
  • `NAK` is actually ASCII value 21. – Bauss Aug 21 '15 at 06:55
  • Very true! I guess he'll just have to modify this then: `s = Regex.Replace(s, @"[^\u0000-\u007F]", string.Empty);` – Micke Aug 21 '15 at 06:56
  • @Bauss Thanks for pointing me in the right direction. I had to write this line to remove the NAK char: `s = s.Replace(((char)21).ToString(), "");` – Dawood Awan Aug 21 '15 at 07:30
  • @Micke It worked with this code: `s = s.Replace(((char)21).ToString(), "");` – Dawood Awan Aug 21 '15 at 07:33
  • @DawoodAwan: The problem is, if you come across another unprintable character that messes up your code, will you add more and more `Replace`s? I doubt it. Just strip the unprintables once and for all with `string output = new string(input.Where(c => !char.IsControl(c)).ToArray());`. – Wiktor Stribiżew Aug 21 '15 at 08:25
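
The cleanup suggested in the comments above can be sketched roughly as follows. This is only an illustration under stated assumptions: `MessageCleaner`, `StripControlChars`, and `Demo` are hypothetical names, and the pattern is a shortened stand-in for the full login regex, not the original. One caveat worth noting: `char.IsControl` is also true for CR, LF, and tab, and the pattern in the question relies on `\s` between fields (for example after `$SUCCESS`), so this sketch keeps whitespace and drops only the remaining control characters such as NAK.

```csharp
using System;
using System.Linq;
using System.Text.RegularExpressions;

public static class MessageCleaner
{
    // Hypothetical helper: removes control characters such as NAK (U+0015)
    // but keeps whitespace (CR, LF, tab), which the pattern matches with \s.
    public static string StripControlChars(string input)
    {
        return new string(input
            .Where(c => !char.IsControl(c) || char.IsWhiteSpace(c))
            .ToArray());
    }
}

public static class Demo
{
    public static void Main()
    {
        // Shortened stand-in for the full login pattern (an assumption for brevity).
        var pattern = @"\$PhoneNumber=(?<PhoneNumber>[0-9A-Za-z\s]*)\$LocalIP=(?<LocalIP>[-_0-9A-Za-z\.\s]*)";

        // Sample fragment with a NAK where the phone number should be.
        var raw = "$PhoneNumber=\u0015\r\n$LocalIP=123.456.789.96\r\n";

        var regex = new Regex(pattern, RegexOptions.IgnoreCase);

        // The NAK is not in the PhoneNumber character class, so the raw text fails.
        Console.WriteLine(regex.Match(raw).Success);

        // After cleaning, the field is empty apart from the line break and the match succeeds.
        Console.WriteLine(regex.Match(MessageCleaner.StripControlChars(raw)).Success);
    }
}
```

If the pattern itself may be edited, an alternative would be to admit control characters in the affected field by adding the Unicode category `\p{Cc}` to its character class, for example `[0-9A-Za-z\s\p{Cc}]*`. The captured value would then include the stray character, so it would still need trimming afterwards.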

0 Answers