Count overlapping occurrences of a substring in a string

I would like to know how to count how many times a substring occurs within a string, including occurrences that overlap, e.g.:

   string = 'abcdcdc'
   sub_string = 'cdc'

The return value I want in this case is 2, because the code should count every occurrence of "cdc" inside the string, even when two occurrences share a character.

The way I tried it was like this:

string = input().strip()
sub_string = input().strip()

idx = 0
count_p = 0
count = 0
for i in list(string):
    if i in sub_string[idx]:
        count += 1
        idx += 1
        if idx == 2:
            idx = 0
        elif count == 3:
            count = 0
            count_p += 1

    else:
        count = 0

print(count_p)

I've tried everything without success and have no idea how to do this. Thanks in advance!

string python python-3.x

asked by anonymous 11.11.2017 / 16:34

1 answer

score 2 · Accepted Answer

Your logic, though a bit unusual, is almost correct. The only problem is that, because the occurrences of the substring overlap in the string, your code loses track of its counters. count_p is incremented only when count reaches 3, but the last c of the first occurrence is also the first c of the second occurrence. Since count is reset to 0 after the first match, it only reaches 2 for the second occurrence, never 3, and the result is 1 instead of the expected 2.

To solve this, instead of always resetting the counter to zero when it reaches 3, check the current character: if it equals the first character of the substring, set the counter to 1. That character is then counted twice, once for each occurrence, which is exactly what the overlap requires.

string = 'abcdcdc'
sub_string = 'cdc'

idx = 0
count_p = 0
count = 0
for i in list(string):
    if i in sub_string[idx]:
        count += 1
        idx += 1
        if idx == 2:
            idx = 0
        elif count == 3:
            count = 0 if i != 'c' else 1
            count_p += 1

    else:
        count = 0


print(count_p)


Remarks

You do not need to convert the string to a list to iterate over it with for. In Python, strings are iterable, so for i in string is enough;

Your code reads the string and the substring from the user, but the logic only fits the example you gave, with the string 'abcdcdc' and the substring 'cdc': the comparisons idx == 2 and count == 3 are tied to the length of that substring. With other values the program will probably not work.
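To make the second remark concrete, here is a minimal sketch that wraps the corrected loop above in a function and runs it on other inputs. The function name count_hardcoded and the extra test strings are assumptions for illustration, not part of the original code.

```python
def count_hardcoded(string, sub_string):
    """Corrected loop from this answer, wrapped in a function and
    iterating over the string directly instead of over list(string)."""
    idx = 0
    count_p = 0
    count = 0
    for i in string:
        if i in sub_string[idx]:
            count += 1
            idx += 1
            if idx == 2:          # hard-coded: len('cdc') - 1
                idx = 0
            elif count == 3:      # hard-coded: len('cdc')
                count = 0 if i != 'c' else 1  # hard-coded: first character of 'cdc'
                count_p += 1
        else:
            count = 0
    return count_p


print(count_hardcoded('abcdcdc', 'cdc'))  # 2, as expected
print(count_hardcoded('aaaa', 'aa'))      # 1, although 'aa' occurs 3 times
print(count_hardcoded('ababab', 'aba'))   # 1, although 'aba' occurs 2 times
```

The constants only fit the substring 'cdc', so the alternatives below are safer for arbitrary input.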

Alternatives

Method str.find

Another way to solve the problem is to use the string's find method. It returns the start position of the first occurrence, or -1 if there is none. It also accepts a start position for the search: moving that position to one character past each match keeps the same occurrence from being counted twice while still allowing the next occurrence to overlap it. An example would be:

string = 'abcdcdc'
sub_string = 'cdc'

start = count = 0
while True:
    index = string.find(sub_string, start)
    if index >= 0:
        start = index + 1
        count += 1
    else:
        break

print(count)

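The choice of start = index + 1 is what makes the count overlapping. As a rough sketch of that design choice, the hypothetical helper below (count_occurrences and its overlapping parameter are assumptions, not from the original answer) lets the step be switched, and compares the result with the built-in str.count, which only counts non-overlapping occurrences.

```python
def count_occurrences(string, sub_string, overlapping=True):
    """Count occurrences of sub_string in string using str.find."""
    if not sub_string:
        raise ValueError("sub_string must not be empty")
    # Step past one character to allow overlaps,
    # or past the whole match to forbid them.
    step = 1 if overlapping else len(sub_string)
    count = 0
    start = string.find(sub_string)
    while start != -1:
        count += 1
        start = string.find(sub_string, start + step)
    return count


print(count_occurrences('abcdcdc', 'cdc'))                     # 2
print(count_occurrences('abcdcdc', 'cdc', overlapping=False))  # 1
print('abcdcdc'.count('cdc'))                                  # 1 (str.count skips overlaps)
```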

Regular Expression

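You can also use a regular expression to find every occurrence of the substring and count them. The pattern is wrapped in a lookahead, (?=...), which matches without consuming characters, so overlapping occurrences are found as well. Here's an example: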

import re

string = 'abcdcdc'
sub_string = 'cdc'

# re.escape keeps characters such as '.' or '+' in sub_string literal
count = len(re.findall('(?=%s)' % re.escape(sub_string), string))

print(count)

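One caveat when interpolating the substring into a pattern: characters such as +, . or ( have a special meaning in regular expressions. A minimal sketch of how re.escape changes the result, using an illustrative input that is not from the question:

```python
import re

string = '1+1+1'
sub_string = '1+1'

# Unescaped: '+' is read as "one or more", so the pattern looks for '11', '111', ...
unescaped = len(re.findall('(?=%s)' % sub_string, string))

# Escaped: '+' is matched literally, so '1+1' is found at indexes 0 and 2.
escaped = len(re.findall('(?=%s)' % re.escape(sub_string), string))

print(unescaped)  # 0
print(escaped)    # 2
```

This matters whenever the substring comes from user input, as in the question.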

Go through the string and compare the substrings

You can also iterate over every position in the string and check whether the slice starting there, with the same length as the substring, equals the substring you are looking for.

string = 'abcdcdc'
sub_string = 'cdc'

count = 0
for i in range(len(string)):
    if string[i:i+len(sub_string)] == sub_string:
        count += 1

print(count)

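A possible variation on the same idea, sketched here under the assumption that a more compact form is acceptable: limit the loop to positions where a full window still fits, and let str.startswith do the comparison instead of building a slice. The variable last_start is introduced only for illustration.

```python
string = 'abcdcdc'
sub_string = 'cdc'

# A match can only start where a full window still fits in the string.
last_start = len(string) - len(sub_string)

# str.startswith(prefix, start) tests the window beginning at `start`;
# each True counts as 1 when summed.
count = sum(string.startswith(sub_string, i) for i in range(last_start + 1))

print(count)  # 2
```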