-1

Please help to realise the following function:

Given a list of strings and lists, which may also contain strings and lists etc. Your job is to collect these strings into a dict, where key would be the string and value the amount of occurrences of that string in these lists.

def count_strings(data: list, pos=None, result: dict = None) -> dict:
    """

    :param data: given list of lists
    :param pos: figure out how to use it
    :param result: figure out how to use it
    :return: dict of given symbols and their count
    """

My attempts:

if result is None and pos is None:
    result = {}
    pos = 0
if pos > len(data):
    return result
if isinstance(data[pos], str):
    return count_strings(data, pos + 1, result)
elif isinstance(data, list):
    return count_strings(data, 0, result)

The output should be something like this:

    print(count_strings([[], ["J", "*", "W", "f"], ["j", "g", "*"], ["j", "8", "5", "6", "*"], ["*", "*", "A", "8"]]))
    # {'J': 1, '*': 5, 'W': 1, 'f': 1, 'j': 2, 'g': 1, '8': 2, '5': 1, '6': 1, 'A': 1}
    print(count_strings([[], [], [], [], ["h", "h", "m"], [], ["m", "m", "M", "m"]]))  # {'h': 2, 'm': 4, 'M': 1}
    print(count_strings([]))  # {}
    print(count_strings([['a'], 'b', ['a', ['b']]]))  # {'a': 2, 'b': 2}
Valhenberg
  • 23
  • 2
  • 2
    What *specific* issue(s) are you having with "your attempt"? – Scott Hunter Oct 29 '21 at 12:09
  • 2
    Whoever gave this exercise and decided they needed `pos` and `result` as parameters should not be teaching Python. – user2390182 Oct 29 '21 at 12:10
  • *"You are given a list"*: not me, but apparently you. Please let us know what issue you have in solving this challenge. So far you just copied the challenge here, but forgot to ask a question that *you* have in solving it. Be specific what the issue is that you have, what you have tried to solve that specific issue, what the wrong output is you get, and what was expected instead, what you researched,... – trincot Oct 29 '21 at 12:12
  • @user2390182: Passing around `result` can simplify things, but I totally agree about `pos`. – Scott Hunter Oct 29 '21 at 12:14
  • Even if passing around `result` can simplify things, it is not best practice. – trincot Oct 29 '21 at 12:15
  • @ScottHunter Yeah, I can live with `result`, but `pos` makes no sense unless you really butcher some iterative approach into fake recursion. – user2390182 Oct 29 '21 at 12:15
  • @trincot: What particular practice would it violate? – Scott Hunter Oct 29 '21 at 12:21
  • 1
    @ScottHunter It feels dirty to return something that was passed in as an argument. Such a mutator should return None in Python IMHO. What is the return value for? That you can save one argument for the top-level call while all the recursive calls completely ignore it? – user2390182 Oct 29 '21 at 12:25
  • I don't really get the benefit of recursion for this use case. I would preprocess the data to flatten the hierarchy and then iterate over the new list counting all strings. – Raavgo Oct 29 '21 at 12:28
  • @Raavgo It is obviously an academic training exercise. Besides, the flattening most likely would involve recursion. – user2390182 Oct 29 '21 at 12:29
  • @Raavgo: Unless you're doing the flattening w/o recursion (which I'd love to see), this is doing the counting *during* the same recursion you would use for flattening. – Scott Hunter Oct 29 '21 at 12:31
  • For best practice, see [Correct Style for Python functions that mutate the argument](https://stackoverflow.com/a/26027714/5459839) – trincot Oct 29 '21 at 12:33
  • @user2390182 not necessarily might be a dirty approach but by using str() and string manipulation you can preprocess the data pretty quickly. – Raavgo Oct 29 '21 at 12:37
  • @Raavgo: And just hope that certain characters don't appear inside any of the strings... – Scott Hunter Oct 29 '21 at 12:39
  • 1
    @Raavgo That's defo dirty :D what if the strings themselves contain meaningful chars like `"["`. Also, then you are relying on `list.__repr__` to call itself (ha, recursion again!). – user2390182 Oct 29 '21 at 12:41
  • @ScottHunter not at all you know exactly what a string is and what the pattern of a list is. – Raavgo Oct 29 '21 at 12:41
  • 1
    @Raavgo: What if your string *contained* the string representation of a list of strings? – Scott Hunter Oct 29 '21 at 12:54

1 Answers1

2

The template you got does not promote best practice (see Correct Style for Python functions that mutate the argument), but ignoring that, your code has these issues:

  • if pos > len(data): this misses the case where these two are equal, in which case you should also enter that if block
  • Your code does not update the dictionary with actual count(s). In particular this should happen when the inspected value is a string
  • The test for the list data type is on the wrong value: you'll want to test that data[pos] is a list, not data.
  • When the value is a list, then you should make the recursive with that sublist, so with data[pos]
  • When the value is a list, you should still process the rest of the main list.

Here is a correction:

def count_strings(data: list, pos=None, result: dict = None) -> dict:
    if result is None and pos is None:
        result = {}
        pos = 0
    if pos < len(data):  # pos should not be equal to len when accessing data[pos]:
        if isinstance(data[pos], str):
            result[data[pos]] = result.get(data[pos], 0) + 1  # increment count
        elif isinstance(data[pos], list):
            count_strings(data[pos], 0, result)  # process nested list
        count_strings(data, pos + 1, result)  # process the remaining entries
    return result

Alternative function signature

It would have been better if the function didn't need a result argument that is going to be mutated, nor the pos argument. Although you are not looking for a solution that changes the template you got, I still prefer to include an alternative here.

There are actually two problems to be solved here:

  • iterate over the nested values in a recursive manner, and
  • collect string frequencies in a dictionary

The two can be solved in separate functions. Without using libraries, you can do that as follows:

def deepvalues(data):
    if isinstance(data, list):
        for item in data:
            yield from deepvalues(item)  # recursion
    else:
        yield data

def count_strings(data: list) -> dict:
    result = {}
    for value in deepvalues(data):
        result[value] = result.get(value, 0) + 1
    return result
trincot
  • 317,000
  • 35
  • 244
  • 286
  • 1
    The fact that `pos` was included in the signature provided by the teacher makes it look as if the teacher wasn't aware that `for`-loops were possible in a recursive function. – Stef Oct 29 '21 at 13:31