-1

I'm new to Python, and I would like to find a substring in a string.

For example, if I have a substring of some constant letters such as:

substring = 'sdkj'

And a string of some letters such as:

string = 'sdjskjhdvsnea'

I want a counter that is incremented by 1 each time any of the letters s, d, k, or j appears in the string. For the example above, the counter would be 8.

How can I achieve this?

Amal Murali
Hakar
  • Please update your question so it's clear what you really want, because your question and the accepted answer don't match – Tim Jun 22 '14 at 15:09
  • @TimCastelijns the second part of the accepted answer works perfectly for the question; the first part is for finding a whole substring in a string – Hakar Jun 22 '14 at 17:28

3 Answers

2

Maybe this code can help you:

>>> string = 'sdjskjhdvsnea'
>>> substring = 'sdkj'
>>> counter = 0
>>> for x in string:
...     if x in substring:
...         counter += 1
...
>>> counter
8
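Regarding the complexity point raised in the comments below, here is a minimal sketch of the same loop wrapped in a function, with the letters converted to a set so that each membership test takes constant time on average. The function name `count_letters` is hypothetical and not part of the answer above.

```python
def count_letters(string, letters):
    """Count the characters of `string` that occur in `letters` (hypothetical helper)."""
    wanted = set(letters)  # set lookup avoids rescanning `letters` for every character
    return sum(1 for ch in string if ch in wanted)


counter = count_letters('sdjskjhdvsnea', 'sdkj')
print(counter)  # 8
```

Because the letters are stored in a set, repeating a letter in `letters` does not change the result.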
Tim
Tok Soegiharto
  • Just to clarify! The "if x in substring:" is inside the "for x in string:"-loop. Kind of hard to see. – Willy Jun 22 '14 at 11:31
  • Yes right, if x ... is inside for x in string: loop. – Tok Soegiharto Jun 22 '14 at 11:35
  • @hakar, just want to know if this is a right answer, if so feel free to mark it as a correct answer, otherwise, i can improve the answer. Thanks. – Tok Soegiharto Jun 22 '14 at 11:36
  • Oh, thank you a lot, it really worked. But what if we want to find the whole substring in the string? For example, if the string is string = 'sdkjhsgshfsdkj', so that the counter equals 2 in this case? – Hakar Jun 22 '14 at 11:36
  • 2
    @Hakar that is a totally different question, and (per my answer) what is usually meant by *"finding a substring"*. – jonrsharpe Jun 22 '14 at 11:38
  • @jonrsharpe so how to do that one, can you tell me please? i mean the second question – Hakar Jun 22 '14 at 11:42
  • @Hakar at the risk of self-promotion, why not read my answer? – jonrsharpe Jun 22 '14 at 11:43
  • @TokSoegiharto note that this approach is `O(len(string)*len(substring))`, so will not be efficient if those strings get larger. – jonrsharpe Jun 22 '14 at 11:48
  • @jonrsharpe excuse me? which comment bro? did you comment the code for finding an entire substring in a string? i haven't seen it – Hakar Jun 22 '14 at 11:49
  • @TokSoegiharto no problem - it's unlikely to matter in this trivial case, but something to be aware of. – jonrsharpe Jun 22 '14 at 11:51
  • @jonrsharpe, I just wanted to help. Thanks again. – Tok Soegiharto Jun 22 '14 at 11:53
1

Edit:

As you apparently do want the count of the appearances of the whole four-character substring, regex is probably the easiest method:

>>> import re
>>> string = 'sdkjhsgshfsdkj'
>>> substring = 'sdkj'
>>> len(re.findall(substring, string))
2

re.findall will give you a list of all (non-overlapping) appearances of substring in string:

>>> re.findall('sdkj', 'sdkjhsgshfsdkj')
['sdkj', 'sdkj']
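One caveat worth sketching: `re.findall` treats its first argument as a regular expression, so a substring containing characters such as `.` or `|` would not be matched literally. The example below assumes you want a literal match and shows two options: `str.count`, which also counts non-overlapping occurrences, and `re.escape`. The `dotted` string is made up for illustration.

```python
import re

string = 'sdkjhsgshfsdkj'
substring = 'sdkj'

# str.count counts non-overlapping occurrences of a literal substring.
plain_count = string.count(substring)

# re.escape makes any regex metacharacters in the substring match literally.
regex_count = len(re.findall(re.escape(substring), string))

# Illustration: without escaping, '.' matches any character.
dotted = 'a.b axb a.b'
unescaped = len(re.findall('a.b', dotted))
escaped = len(re.findall(re.escape('a.b'), dotted))

print(plain_count, regex_count, unescaped, escaped)  # 2 2 3 2
```

For a plain literal substring like `'sdkj'`, both approaches give the same count.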

Normally, "finding a sub-string 'sdkj'" would mean trying to locate the appearances of that complete four-character substring within the larger string. In this case, it appears that you simply want the sum of the counts of those four letters:

sum(string.count(c) for c in substring)

Or, more efficiently, use collections.Counter:

from collections import Counter

counts = Counter(string)
sum(counts.get(c, 0) for c in substring)

This only iterates over string once, rather than once for each c in substring, so is O(m+n) rather than O(m*n) (where m == len(string) and n == len(substring)).

In action:

>>> string = "sdjskjhdvsnea"
>>> substring = "sdkj"
>>> sum(string.count(c) for c in substring)
8
>>> from collections import Counter
>>> counts = Counter(string)
>>> sum(counts.get(c, 0) for c in substring)
8

Note that you may want set(substring) to avoid double-counting:

>>> sum(string.count(c) for c in "sdjks")
11
>>> sum(string.count(c) for c in set("sdjks"))
8
jonrsharpe
  • `import re`, `string = 'sdkjhsgshfsdkj'`, `substring = 'sdkj'`, `len(re.findall(substring, string))` gives `2`. This one is great, but how do I save the value in a variable "counter" in this case? – Hakar Jun 22 '14 at 11:57
  • @Hakar uh... `counter = len(...)`?! – jonrsharpe Jun 22 '14 at 11:58
  • Yes, I fixed that in another way, but there is a problem: what if the substring starts and ends with the same letter? Let me explain with an example: substring = 'sdks' string = 'sdksjhgsdksdks' – Hakar Jun 22 '14 at 12:02
  • @Hakar Per the documentation I have already linked to, `re.findall` is **non-overlapping**. If you have overlapping substrings, consider a [moving window approach](http://stackoverflow.com/q/6822725/3001761) or use [`re.match`](http://stackoverflow.com/q/5616822/3001761). – jonrsharpe Jun 22 '14 at 12:04
  • I think my case is not overlapping, as I looked at the links you gave. What I want is the case where the first letter of the substring is the same as its last letter, and the string contains a concatination of the substring with that letter in common. For example: substring = 'sdks' string = 'sdksjhgsdksdkshjhsdks', so the counter will be three in this case, as there are two sdks and an sdksdks, which will be treated as two, not one, because the S in the middle is the last letter of the first one and the first letter of the second one – Hakar Jun 22 '14 at 17:47
  • @Hakar you realised that what you just described is overlapping, right? Also, it's "concatenation". – jonrsharpe Jun 22 '14 at 17:50
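For the overlapping case discussed in these comments, one possible sketch is a loop over `str.find` that restarts the search one character after each match, so matches sharing a letter are all counted. This is an illustration of the moving-window idea, not the code from the linked questions, and the function name `count_overlapping` is hypothetical.

```python
def count_overlapping(string, substring):
    """Count occurrences of `substring` in `string`, allowing overlaps (hypothetical helper)."""
    if not substring:
        raise ValueError("substring must not be empty")
    count = 0
    start = string.find(substring)
    while start != -1:
        count += 1
        # Resume one character after the last match start, so overlaps are found.
        start = string.find(substring, start + 1)
    return count


counter = count_overlapping('sdksjhgsdksdks', 'sdks')
print(counter)  # 3, whereas 'sdksjhgsdksdks'.count('sdks') gives 2
```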
1

An alternative solution using re.findall():

>>> import re
>>> substring = 'sdkj'
>>> string = 'sdjskjhdvsnea'
>>> len(re.findall('|'.join(substring), string))
8
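A variation on the same idea, sketched under the assumption that the letters should be matched literally: build a character class and pass the letters through `re.escape`, so that characters such as `|` or `.` do not change the meaning of the pattern. The function name `count_chars_regex` is hypothetical.

```python
import re


def count_chars_regex(string, letters):
    """Count characters of `string` that occur in `letters` using a regex character class."""
    if not letters:
        return 0  # an empty class '[]' is not a valid pattern
    pattern = '[' + re.escape(letters) + ']'
    return len(re.findall(pattern, string))


counter = count_chars_regex('sdjskjhdvsnea', 'sdkj')
print(counter)  # 8
```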
Amal Murali