1

How do I change this:

Date URL Description Category
2022-06-17 14:24:52 /XYBkLO public A
2022-06-17 14:24:52 /XYBkLO public B
2022-06-17 14:24:52 /XYBkLO public C
2022-06-17 14:25:05 /ZWrTVu public A
2022-06-17 14:25:05 /ZWrTVu public B
2022-06-17 14:25:05 /ZWrTVu public C

To this:

Date URL Description Category Date URL Description Category Date URL Description Category
2022-06-17 14:24:52 /XYBkLO public A 2022-06-17 14:24:52 /XYBkLO public B 2022-06-17 14:24:52 /XYBkLO public C
2022-06-17 14:25:05 /ZWrTVu public A 2022-06-17 14:25:05 /ZWrTVu public B 2022-06-17 14:25:05 /ZWrTVu public C

I would like to keep everything with the same URL on one row, but I don't know how I could implement this using pandas. Is there perhaps another way or another library I should use? Could really use some help

Nahuel Mel
  • 51
  • 8

3 Answers3

1

You can try:

from itertools import cycle, count, islice
from collections import defaultdict


def fn(x):
    d = defaultdict(lambda: count(1))
    names = cycle(x.columns)
    vals = x.values.ravel()

    return pd.DataFrame(
        [vals],
        columns=[f"{n}.{next(d[n])}" for n in islice(names, len(vals))],
    )


x = df.groupby("URL").apply(fn).reset_index(drop=True)
print(x)

Prints:

                Date.1    URL.1 Description.1 Category.1               Date.2    URL.2 Description.2 Category.2               Date.3    URL.3 Description.3 Category.3
0  2022-06-17 14:24:52  /XYBkLO        public          A  2022-06-17 14:24:52  /XYBkLO        public          B  2022-06-17 14:24:52  /XYBkLO        public          C
1  2022-06-17 14:25:05  /ZWrTVu        public          A  2022-06-17 14:25:05  /ZWrTVu        public          B  2022-06-17 14:25:05  /ZWrTVu        public          C
Andrej Kesely
  • 168,389
  • 15
  • 48
  • 91
  • Your answer could be improved with additional supporting information. Please [edit] to add further details, such as citations or documentation, so that others can confirm that your answer is correct. You can find more information on how to write good answers in the [help center](https://stackoverflow.com/help/how-to-answer). – Ethan Jun 17 '22 at 21:40
  • 1
    This is perfect, thank you so much. Every time I see an implementation using itertools and collections, it makes me wonder why these aren't emphasized more in python courses. Also, need to dive in more into pandas. Thanks again! – Nahuel Mel Jun 17 '22 at 21:58
0

You can look into pandas transpose function. transpose

Not 100% sure it will fit your use case but it seems like a good place to start.

ggsmith
  • 31
  • 7
0

Here is another way:

(pd.concat(
    [df.set_index('URL',drop=False) for _,df in df.groupby('Category')]
    ,axis=1)
.reset_index(drop=True))

Output:

    Date             URL    Description Category    Date             URL    Description Category  Date               URL    Description Category
0   6/17/2022 14:24 /XYBkLO public       A          6/17/2022 14:24 /XYBkLO public       B        6/17/2022 14:24   /XYBkLO public      C
1   6/17/2022 14:25 /ZWrTVu public       A          6/17/2022 14:25 /ZWrTVu public       B        6/17/2022 14:25   /ZWrTVu public      C
rhug123
  • 7,893
  • 1
  • 9
  • 24