This question is can be viewed continuation/extension/generalization of a previous question of mine from here.
Some definitions: I have a set of integers S = {1,2,...,s}
, say s = 20
, and two matrices N
and M
whose rows are finite sequences of numbers from S
(i.e. permutations with possible repetitions), of order n
and m
respectively, where 1 <= n <= m
. Let us think of N
as a collection of candidate sub-sequences for the sequences from M
.
Example: [2 3 4 3]
is a sub-sequence of [1 2 2 3 5 4 1 3]
that occurs with multiplicity 2 (=in how many different ways one can find the sub-seq. in the main seq.), whereas [3 2 2 3]
is not a sub-sequence of it. In particular, a valid sub-sequence by definition must preserve the order of the indices.
Problem statement:
(P1) For each row of M
, obtain the number of sub-sequences of it, with multiplicity and without multiplicity, that occur in N
as rows (it can be zero if none are contained in N
);
(P2) For each row of N
, find out how many times, with multiplicity and without multiplicity, it is contained in M
as a sub-sequence (again, this number can be zero);
Example: Let N = [1 2 2; 2 3 4]
and M = [1 1 2 2 3; 1 2 2 3 4; 1 2 3 5 6]
. Then (P1) returns [2; 3; 0]
for 'with multiplicities' and [1; 2; 0]
for 'without multiplicities'. (P2) returns [3; 2]
for 'with multiplicities' and [2; 1]
without multiplicities.
Order of magnitude: M
could typically have up to 30-40 columns and a few thousand rows, although I currently have M
with only a few hundred rows and ~10 columns. N
could be approaching the size of
M
or could be also much smaller.
What I have so far: Not much, to be honest. I believe I might be able to slightly modify my not-very-well-vectorized solution from my previous question to tackle permutations with repetitions, but I am still thinking on that and will update as soon as I have something working. But given my (lack of) experience so far, it would be in all likelihood very suboptimal :(
Thanks!