0

I have a similar question like this one, but I'm not able to adapt the solutions to my problem. I only need the days where the particular column is True/1 and these days only every x-th week.

My dataset looks like this:

id | y1  | start      | end       | every x-th week | Mon | Tue | Wed | Thu | Fri | Sat | Sun
1  | 30  | 01-02-2021 | 10-03-2021|       2         | 1   | 0   | 1   | 0   | 0   | 0   | 0
2  | 50  | 01-02-2021 | 28-02-2021|       3         | 1   | 1   | 0   | 0   | 0   | 1   | 0
...
n  | 20  | 01-02-2021 | 28-02-2021|       1         | 1   | 0   | 0   | 1   | 0   | 0   | 0

And I need it in this form:

id | y1 | date      
1 | 30  | 01-02-2021 
1 | 30  | 03-02-2021
1 | 30  | 15-02-2021
1 | 30  | 17-02-2021
1 | 30  | 01-03-2021
1 | 30  | 08-03-2021

Does someone have a solution for this?

To Mate
  • 51
  • 6

1 Answers1

0

If I get this right, you want to take all dates into one single column and then only retain those rows that satisfy a condition on the number of weeks ellapsed (every x week). Suppose your dataframe looks like this (I used yours and replicated the rows and changed some):

 id  y1       start         end  every x weeks  Mon  Tue  Wed  Thu  Fri  \
0    1  30  01-02-2021  10-03-2021              2    1    0    1    0    0   
1    2  50  01-02-2021  28-02-2021              3    1    1    0    0    0   
2    3  20  01-02-2021  28-02-2021              1    1    0    0    1    0   
3    1  30  01-03-2021  10-03-2021              2    1    0    1    0    0   
4    2  50  01-02-2021  28-02-2021              3    1    1    0    0    0   
5    3  20  01-02-2021  28-02-2021              1    1    0    0    1    0   
6    1  30  02-02-2021  10-03-2021              2    1    0    1    0    0   
7    2  50  01-01-2021  28-02-2021              3    1    1    0    0    0   
8    3  20  01-02-2021  28-02-2021              1    1    0    0    1    0   
9    1  30  01-03-2021  10-03-2021              2    1    0    1    0    0   
10   2  50  04-01-2021  28-02-2021              3    1    1    0    0    0   
11   3  20  01-02-2021  28-02-2021              1    1    0    0    1    0   
12   1  30  01-02-2021  10-03-2021              2    1    0    1    0    0   
13   2  50  01-02-2021  28-02-2021              3    1    1    0    0    0   
14   3  20  01-02-2021  28-02-2021              1    1    0    0    1    0   

    Sat  Sun  
0     0    0  
1     1    0  
2     0    0  
3     0    0  
4     1    0  
5     0    0  
6     0    0  
7     1    0  
8     0    0  
9     0    0  
10    1    0  
11    0    0  
12    0    0  
13    1    0  
14    0    0  

Define which columns to stack (in your case start and end):

keys = ['start','end']
df2 = pd.melt(df, id_vars=['id','y1','every x weeks'], value_vars=keys, value_name='date')

This gives you

   id  y1  every x weeks variable        date
0    1  30              2    start  01-02-2021
1    2  50              3    start  01-02-2021
2    3  20              1    start  01-02-2021
3    1  30              2    start  01-03-2021
4    2  50              3    start  01-02-2021
5    3  20              1    start  01-02-2021
6    1  30              2    start  02-02-2021
7    2  50              3    start  01-01-2021
8    3  20              1    start  01-02-2021
9    1  30              2    start  01-03-2021
10   2  50              3    start  04-01-2021
11   3  20              1    start  01-02-2021
12   1  30              2    start  01-02-2021
13   2  50              3    start  01-02-2021
14   3  20              1    start  01-02-2021
15   1  30              2      end  10-03-2021
16   2  50              3      end  28-02-2021
17   3  20              1      end  28-02-2021
18   1  30              2      end  10-03-2021
19   2  50              3      end  28-02-2021
20   3  20              1      end  28-02-2021
21   1  30              2      end  10-03-2021
22   2  50              3      end  28-02-2021
23   3  20              1      end  28-02-2021
24   1  30              2      end  10-03-2021
25   2  50              3      end  28-02-2021
26   3  20              1      end  28-02-2021
27   1  30              2      end  10-03-2021
28   2  50              3      end  28-02-2021
29   3  20              1      end  28-02-2021

Now, you can choose whatever weeks you want

    id  y1  every x weeks variable        date
1    2  50              3    start  01-02-2021
4    2  50              3    start  01-02-2021
7    2  50              3    start  01-01-2021
10   2  50              3    start  04-01-2021
13   2  50              3    start  01-02-2021
16   2  50              3      end  28-02-2021
19   2  50              3      end  28-02-2021
22   2  50              3      end  28-02-2021
25   2  50              3      end  28-02-2021
28   2  50              3      end  28-02-2021

If you want all other columns, addd them in the id_vars.

  • Sorry, I just noticed that I described my problem wrong. I just edited it, so it should be clearer. Thanks for your help and sorry for my mistake. – To Mate Mar 18 '21 at 07:52