0

I have a text file that looks something like this (but with hundreds of lines):

1147-1 SYRUP: DR.PEPPER 5GALLON/BOX
1653-1 SYRUP: DIET DR.PEPPER 5GAL/BOX
2011-2 WATER DISTILLED 6 / 1 GA
1217-2 ALL PURPOSE RASPBERRY FIL 40#
1273-1 STRAWBERRY PIE FILLING 38#
2893-1 BREAD: SOURDOUGH 12/1# OVAL
2287-1 BREAD SQUAW: 8/2.25#LF
1929-1 VINEGAR HERB CONT GRDN 12/12.7
1949-2 KETCHUP: 16/14OZ-PLASTIC BTLS
2408-1 CONE 3 NAB SAMPLER 28/45
2939-1 DULCE LECH FLVR PKT 3/12 EA CS
3017-1 GINGRBRD FLVR PKT 3/12 EA CS
3055-2 EGGNOG FLVR PKT 3/12 EA CS
3192-1 ORIGINAL MRS. DASH SEASONING

I've created the code to pull everything in from the text file line by line and strip out the numbers at the beginning and save the next portion (ie SYRUP: DR.PEPPER 5GALLON/BOX, ALL PURPOSE RASPBERRY FIL 40#) to Mid(nextLine, 10, 30). I want to take that portion and split it up by pulling the name (SYRUP: DR.PEPPER, ALL PURPOSE RASPBERRY FIL), the size (5GALLON, 40#) the number of that size (if it is 12/1# = 12x 1LB) and the unit (BOX, LB) out. As you can see almost every line is different but with many similarities. Not really sure what to do next. I have been trying to use:

re.Pattern = "GALLON|BOX|#|LF|GAL|EA|CS|BTLS"  

to pull out the unit portion but I don't know what else to do.

Here is the code I have so far for this portion:

 Function NumericOnly(s As String) As String
 Dim StrUnit As String
 Static re As RegExp
 If re Is Nothing Then Set re = New RegExp
 re.IgnoreCase = True
 re.Global = True
 re.Pattern = "GALLON|BOX|#|LF|GAL|EA|CS|BTLS"
 StrUnit = re.Replace(s, "")
 End Function
dbugger
  • 15,868
  • 9
  • 31
  • 33
jimf
  • 1
  • 3
  • Have you tried regex? – findwindow Dec 15 '15 at 21:13
  • Yes, but the only part I could figure out is: re.Pattern = "GALLON|BOX|#|LF|GAL|EA|CS|BTLS" – jimf Dec 15 '15 at 21:17
  • You should post the actual regex. Edit: or is that what you're using? Edit2: check [here](http://stackoverflow.com/questions/22542834/how-to-use-regular-expressions-regex-in-microsoft-excel-both-in-cell-and-loops/22542835#22542835). – findwindow Dec 15 '15 at 21:18
  • Function NumericOnly(s As String) As String Dim StrUnit As String Static re As RegExp If re Is Nothing Then Set re = New RegExp re.IgnoreCase = True re.Global = True re.Pattern = "GALLON|BOX|#|LF|GAL|EA|CS|BTLS" StrUnit = re.Replace(s, "") End Function – jimf Dec 15 '15 at 21:24
  • Sorry about that...fixed it :) – jimf Dec 15 '15 at 21:34
  • Did you check the link? Your regex won't work... – findwindow Dec 15 '15 at 21:35
  • I have actually never used regex before. Originally, I was trying to go through every line character by character using an IF statement: 'If (IsNumeric(Mid(PhraseArr(0), Current_Pos, 1))) = True Then' But there is too much variation in the text to get this to work well, and it is super slow which doesn't help. – jimf Dec 15 '15 at 21:38
  • Neither have I XD It's tedious. The link is pretty comprehensive. – findwindow Dec 15 '15 at 21:42
  • 1
    Does `Text to Columns` not do the trick? You could iterate and run it three times, or however many to split things. ...if you want to avoid RegEx I suppose. – BruceWayne Dec 15 '15 at 21:45
  • I don't have a problem using RegEx, just never used it before. Not sure how text to columns would help in this case. There isn't just one thing you could delimite upon to pull out the required information from the string. – jimf Dec 15 '15 at 21:52
  • You could start delimiting with a space, which would put everything into columns. From there it shouldn't be too hard to run through the rest unless I'm overlooking something. – BruceWayne Dec 15 '15 at 21:54
  • How would you then know the difference between 6/1GA and 6 / 5 GA? – jimf Dec 15 '15 at 21:58
  • I would consider creating a few Boolean variables to separate out what a line contains and so act differently based on the Boolean combinations. – PatricK Dec 16 '15 at 00:35
  • Not sure what you mean...can you give an example? – jimf Dec 16 '15 at 00:48

0 Answers0