0

I wanted your help, I'm currently working in extracting some data, now the thing is that I have to count an specific amount of Call IDs a call ID format is the following 9129572520020000711. The pattern is 19 characters that starts with 9 and ends in 1. and I want to count how many times this pattern appears in one cell

I.E. this is the value in one cell and I want to count how many times the pattern appears.

1912957252002000071129129545183410000711391295381628700007114912959791875000071159129597085000000711691295892838400007117912958908933000071189129452513730000711
Community
  • 1
  • 1
Jose T
  • 3
  • 2
  • 1
    You could use [regex](https://regex101.com/r/vD5VLb/1) but you'll probably need to show more effort to get a positive response here. [This will help you get started](https://stackoverflow.com/q/22542834/2521004) – Automate This Aug 15 '17 at 21:55
  • 2
    This works for your example `=Len(A1)/20` – Scott Craner Aug 15 '17 at 22:02
  • 2
    Or since the count appears to be in the string: `=--MID(A1,LEN(A1)-19,1)` – Scott Craner Aug 15 '17 at 22:08
  • Or if you really want to get complicated this array formula: `=MAX(--MID($K$1,(ROW(INDIRECT("1:" & (LEN(K1)/20)))-1)*20+1,1))` – Scott Craner Aug 15 '17 at 22:19
  • 1
    Formulas above ^^ assume no more than 9 call IDs per cell ;) – YowE3K Aug 15 '17 at 22:24
  • Will you *always* know the pattern you're looking for? Will it always be 19 characters long? Or can it be any length, but starting with `9` and ending in `1`...wait...Then why would you indluce the last `11` instead of ending at either the `1` (second character) or the `1` one character before the end? – BruceWayne Aug 15 '17 at 22:30
  • @YowE3K well since the OP seems to have disappeared, we can assume what we want. I kept waiting for the op to say, "That does not work for this instance" :) – Scott Craner Aug 15 '17 at 22:35
  • @YowE3K fine: `=INT(Len(A1)/20)` will do it until there are 100000000000000000000 call ID's – Scott Craner Aug 15 '17 at 22:39
  • @ScottCraner Yeah - it's always annoying when people post questions and then disappear. (I guess they figure they will come back tomorrow and see what the answer is, but it certainly makes it hard to get clarification.) – YowE3K Aug 15 '17 at 22:39
  • @ScottCraner No - `=INT(Len(A1)/20)` will be wrong once you get to 29 call IDs (9 * 20 + 20 * 21 = 600 characters, 600 / 20 = 30) – YowE3K Aug 15 '17 at 22:42
  • @YowE3K Well then.......I guess that is my queue to go have some dinner, and sign off for the day. – Scott Craner Aug 15 '17 at 22:44
  • 2
    @ScottCraner Should I pick on you some more and say "queue" should be "cue" :D – YowE3K Aug 15 '17 at 22:45

3 Answers3

1

To solve this with formulae you need to know:

  • The starting character
  • The ending character
  • The length of your Call ID

Finding all possible Call IDs

enter image description here

Let B1 be your number string and B2 be the call ID (or pattern) you are looking for. In B5 enter the formula =MID($B$2,1,1) to find the starting character you are looking for. In B6 enter =RIGHT($B$2,1) for the end character. In B7 enter =LEN($B$2) for the length of the call ID.

In Column A we'll enter the position of every starting character. The first formula will be a simple Find() formula in B10 as =FIND($B$5,$B$1,1). To find the other starting characters start the Find() at the location after the last starting character: =FIND($B$5,$B$1,$A10+1) in B11. Copy this down the column a few dozen times (or more).

In Column B we'll see if the next X characters (where X is the length of the Call ID) meets the criteria for a Call ID:

=IF(MID($B$1,$A10+($B$7-1),1)=$B$6,TRUE,FALSE)

The MID($B$1,$A10+($B$7-1),1)=$B$6 checks if the character at the end of the character at the end of this possible Call ID is the end character we're looking for. $A10+($B$7) calculates the position of the possible Call ID and $B$6 is the end character.

In Column C we can return the actual Call ID if there is a match. This isn't necessary to find the count, but will be useful later. Simply check if the value in Column B is True and, if yes, return the calculated string: =IF(B10,MID($B$1,$A10,$B$7),"").

To actually count the number of valid Call IDs, do a CountIf() of the Call ID column to check for the number of True values: =IF(B10,MID($B$1,$A10,$B$7),"").

If you don't want all the #Values! just wrap everything in IFERROR(,"") formulas.

Finding all consecutive Call IDs

enter image description here

However , some of these Call IDs overlap. Operating on the assumption that Call IDs cannot overlap, we simply have to start our search after the end character of a found ID, not the start. Insert an "Ending Position" column in Column B with the formulae: =$A10+($C$7-1), starting in B11. Alter A11 to =FIND($C$5,$C$1,$B10+1) and copy down. Don't change A10 as this finds the first starting position and is not depending on anything but the original text.

Which ones are valid?

I don't know, that depends on other criteria for your Call IDs. If you receive them consecutively, then the second method is best and the other possible ones found are by coincidence. If not, then you'll have to apply some other validation criteria to the first method, hence why we identified each ID.

Community
  • 1
  • 1
0

Do you mean something like what's following?

Sub CallID_noPatterns()

Dim CallID As String, CallIDLen As Integer
CallID = "9#################1"
CallIDLen = Len(CallID) 'the CallID's length

'Say that you want to get the value of "A1" cell and deal with its value
Dim CellVal As String, CellLen As Integer
CellVal = CStr(Range("A1").Text) 'get its value as a string
CellLen = Len(CellVal) 'get its length

'You Have 2 options:-
'1-The value is smaller than your CallID length. (Not Applicable)
'2-The value is longer than or equal to your CallID length
'So just run your code for the 2nd option

Dim i As Integer, num_checks, num_patterns
i = 0
num_patterns = 0

'imagine both of them as 2 arrays, every array consists of sequenced elements
'and your job is to take a sub-array from your value, of a length
' equals to CallID's length
'then compare your sub-array with CallID

num_checks = CellLen - CallIDLen + 1
If CellLen >= CallIDLen Then
    For i = 0 To num_checks - 1 Step 19
        For j = i To num_checks - 1
            If Mid(CellVal, (j + 1), CallIDLen) Like CallID Then
                num_patterns = num_patterns + 1
                Exit For
            End If
        Next j
    Next i
End If

'Display your result
MsgBox "Number of Patterns: " & Str(num_patterns)

End Sub
Scott Craner
  • 148,073
  • 10
  • 49
  • 81
  • This does not work, just tested it, also you are looking for a specific string where the string starts with 9 and ends with 1, The numbers between will change. – Scott Craner Aug 15 '17 at 22:58
  • Yes, you're right, The weird thing is that applying `Len()` to even a copy of `CallID`'s value in Cell "A1" gives a length of **0** !, Also excel shows **another** value than I've already given to the cell when I ask to see it from MsgBox! – Serag Hassouna Aug 15 '17 at 23:37
  • If you don't mind I got your code working. I can edit your code to work. – Scott Craner Aug 15 '17 at 23:38
  • Yes, No Problem, Go ahead. – Serag Hassouna Aug 15 '17 at 23:52
0

You can solve this simply with a UDF using a regular expression.

Option Explicit
Function callIDcount(S As String) As Long
    Dim RE As Object, MC As Object
    Const sPat As String = "9\d{17}1"

Set RE = CreateObject("vbscript.regexp")
With RE
    .Global = True
    .Pattern = sPat
    Set MC = .Execute(S)
    callIDcount = MC.Count
End With
End Function

Using your example, this returns a count of 8

The regular expression engine captures all of the matches that match the pattern, into the match collection. To see how many are there, we merely return the count of that collection.

Trivial modifications would allow one to return the actual ID's also, should that be necessary.

enter image description here

The regex:

9\d{17}1

9\d{17}1

Created with RegexBuddy

EDIT Reading through TheFizh's post, he considered that you might want the count to include overlapping CallID's. In other words, given:

9129572520020000711291

We see that includes:

9129572520020000711
9572520020000711291

where the second overlaps with the first, but both meet your requirements.

Should that be what you want, merely change the regex so it does not "consume" the match:

Const sPat As String = "9(?=\d{17}1)"

and you will return the result of 15 instead of 8, which would be non-overlapping pattern.

Ron Rosenfeld
  • 53,870
  • 7
  • 28
  • 60