Whenever I run the following code, the output is a group of separate lists. How do I organize them into a single column?

# Import libraries
from bs4 import BeautifulSoup
import requests
import pandas as pd
import numpy as np

#Get URL and extract content
class Scraper():
    page = 1
    traits = []
    while page != 10:
        content = requests.get('https://raw.githubusercontent.com/recklesslabs/wickedcraniums/main/{}'.format(page))
        soup = BeautifulSoup(content.text, 'html.parser')
        page = page + 1

        dic_list = list(map(eval, soup))
        for dic in dic_list:
            traits = dic["attributes"]

        df = pd.DataFrame.from_dict(traits, orient='columns').to_numpy()
        df1 = list(np.concatenate(df[0:1]))

        print(df1)

When I use the code above I get this output:

['Background', 'Zeus']
['Background', 'BlueWhale']
['Background', 'ViolentViolet']
['Background', 'MardiGras']
['Background', 'WoodBark']
['Background', 'ViolentViolet']
['Background', 'MidnightExpress']
['Background', 'Maire']
['Background', 'Pohutukawa']

How do I make just one column that lists all backgrounds so it can look something like this:

    Background
0   Zeus
1   BlueWhale
2   ViolentViolet
3   MardiGras
4   WoodBark
5   ViolentViolet
6   MidnightExpress
7   Maire
8   Pohutukawa

In addition to the above, how would I also go about finding the count for each item so that it shows up as:

Background
Zeus - 1
BlueWhale - 1
ViolentViolet - 2
MardiGras - 1
WoodBark - 1
MidnightExpress - 1
Maire - 1
Pohutukawa - 1
  • I don't understand, isn't `df` already a dataframe with the column you want? Can you provide its content? – mozway Aug 22 '21 at 21:51
  • df shows something like this: '[['Background' 'Zeus'] ['Body' 'Pale'] ['Eyes' 'Watery'] ['Clothes' 'Kurta'] ['Head' 'HeadScarf'] ['Mouth' 'Bandana']] [['Background' 'BlueWhale'] ['Body' 'Silver'] ['Eyes' 'NoEyes'] ['Clothes' 'VarsityJacket'] ['Head' 'Beanie'] ['Mouth' 'NoAccessory']]' but I would like to rearrange the dataframe where all the Backgrounds are listed under one column Background, and the same for Body, Eyes, Clothes etc. – intermarketics Aug 22 '21 at 21:54
  • What is `df[1]`? – mozway Aug 22 '21 at 21:57
  • df[1] outputs this '['Body' 'Pale'] ['Body' 'Silver'] ['Body' 'Cracked'] ['Body' 'Tribal'] ['Body' 'AquaHaze'] ['Body' 'Silver'] ['Body' 'AquaHaze'] ['Body' 'TheOGO'] ['Body' 'Schizophrenic']' but I would like to have only one column, Body, that has Pale, Silver, Cracked, Tribal etc. listed under it – intermarketics Aug 22 '21 at 22:01
  • Is this a string or a list / list of lists? It is ambiguous from the formatting in comments – mozway Aug 22 '21 at 22:03
  • df = pd.DataFrame.from_dict(traits, orient='columns').to_numpy() is a list, but df = pd.DataFrame.from_dict(traits, orient='columns') is not a list – intermarketics Aug 22 '21 at 22:13

1 Answer

# Import libraries
from bs4 import BeautifulSoup
import requests
import pandas as pd
import numpy as np


backgrounds = []

#Get URL and extract content
class Scraper():
    page = 1
    traits = []
    

    while page != 10:
        content = requests.get('https://raw.githubusercontent.com/recklesslabs/wickedcraniums/main/{}'.format(page))
        soup = BeautifulSoup(content.text, 'html.parser')
        page = page + 1
        
        dic_list = list(map(eval, soup))
        for dic in dic_list:
            traits = dic["attributes"]

        # df = pd.DataFrame(traits)
        df = pd.DataFrame.from_dict(traits)
        df = df[df['trait_type']=='Background']

        backgrounds.append(df['value'].values[0])

df = pd.DataFrame({'Backgrounds': backgrounds})
l1 = df['Backgrounds']
print(l1)

value_counts = df['Backgrounds'].value_counts()
l2 = [f"{key} - {value_counts[key]}" for key in value_counts.keys()]
print(l2)
       
PreciXon
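
The comments on the question also ask for every trait (Background, Body, Eyes, and so on) to get its own column, not just Background. Below is a minimal, network-free sketch of one way that could look. It assumes each page's "attributes" is a list of dicts with "trait_type" and "value" keys, as the filtering in the code above implies. The `pages` list is hypothetical stand-in data built from values quoted in the question and comments; in the real script each inner list would come from one downloaded page.

```python
import pandas as pd

# Hypothetical stand-in for the downloaded data: one "attributes" list per page.
# The trait values are taken from the output quoted in the question and comments.
pages = [
    [{"trait_type": "Background", "value": "Zeus"},
     {"trait_type": "Body", "value": "Pale"}],
    [{"trait_type": "Background", "value": "BlueWhale"},
     {"trait_type": "Body", "value": "Silver"}],
    [{"trait_type": "Background", "value": "ViolentViolet"},
     {"trait_type": "Body", "value": "Cracked"}],
    [{"trait_type": "Background", "value": "MardiGras"},
     {"trait_type": "Body", "value": "Tribal"}],
    [{"trait_type": "Background", "value": "WoodBark"},
     {"trait_type": "Body", "value": "AquaHaze"}],
    [{"trait_type": "Background", "value": "ViolentViolet"},
     {"trait_type": "Body", "value": "Silver"}],
]

# Turn each page into one flat record: {"Background": ..., "Body": ...}.
rows = [{t["trait_type"]: t["value"] for t in traits} for traits in pages]

# One row per page, one column per trait type.
df = pd.DataFrame(rows)
print(df)

# Count how often each background occurs and format it as "name - count".
counts = df["Background"].value_counts()
lines = [f"{name} - {n}" for name, n in counts.items()]
print("\n".join(lines))
```

Building one dict per page keeps the loop simple: collect `rows` while downloading and create the DataFrame once at the end, instead of creating a new DataFrame on every iteration. The same `value_counts()` call works for any other column, for example `df["Body"]`.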