To be honest, I'm not sure how best to ask this question, since the error could be in several places, so I'll include all of the code (thanks for being patient with a beginner).
I am trying to use the Last.fm dataset: https://grouplens.org/datasets/hetrec-2011/
There is a Python script provided that helps read the data from this dataset.
First, I parse the lines of the data file with the given iter_lines function:
### first, open file into a file handle object
file = os.path.join(baseDir, 'artists.dat')
file_opener = open(file, "r")
lines = iter_lines(file_opener)
where the given iter_lines() function looks like this:
def iter_lines(open_file):
    reader = csv.reader(
        open_file,
        delimiter='\t',
    )
    next(reader)  # Skip the header
    return reader
Then I tried to use the given parse_artist_line() function to read artists.dat:
artists_df = pd.DataFrame(['key','value'])
for line in lines:
    ### so the parse_artist_line() will return a dictionary
    artist_dict = parse_artist_line(line)
    artist_list = artist_dict.items()
    ### try to put in a temporary dataframe
    temp = pd.DataFrame.from_dict(artist_dict, orient='index')
    ### finally append the temporary df to the artists_df
    artists_df.append(temp, ignore_index=True)
print(artists_df.head(5))
When I print artists_df with the last statement, I only get this output:
       0
0    key
1  value
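That output is consistent with two things, sketched below under some assumptions. A list passed positionally to pd.DataFrame is treated as data (one column, two rows) rather than as column names, and combining frames returns a new object instead of changing the original in place. DataFrame.append behaved that way too and was removed in pandas 2.0, so this sketch uses pd.concat. The "Example Artist" row is made up for illustration.

```python
import pandas as pd

# A list passed positionally is data: one column named 0, two rows.
seed = pd.DataFrame(['key', 'value'])
print(seed.shape)

# Hypothetical sample rows; "Example Artist" is invented for illustration.
first = pd.DataFrame([{'artist_id': 1, 'name': 'Example Artist'}])
second = pd.DataFrame([{'artist_id': 18743, 'name': 'Coptic Rain'}])

# concat (like the old append) returns a new frame; `first` is not modified.
pd.concat([first, second], ignore_index=True)
print(len(first))

# The result has to be assigned to be kept.
combined = pd.concat([first, second], ignore_index=True)
print(len(combined))
```

If this is what is happening in the loop, the result of each append call is simply discarded, which would leave artists_df exactly as it was created.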
The given parse_artist_line() looks like this:
def parse_artist_line(line):
    (artist_id, name, _, _) = line
    current_artist = deepcopy(ARTISTS)
    current_artist["artist_id"] = int(artist_id)
    current_artist["name"] = name
    return current_artist
By the way, if I print temp, it looks like this:
                     0
artist_id        18743
name       Coptic Rain
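With orient='index', each dictionary key becomes a row, which is why temp has one column and two rows. What I am aiming for is one row per artist, with artist_id and name as columns. A minimal sketch of that shape, assuming the file has four tab-separated columns after a header: the in-memory SAMPLE text, its header names, and parse_row() are hypothetical stand-ins for artists.dat and parse_artist_line(), since the ARTISTS template is not shown here.

```python
import csv
import io

import pandas as pd

# Hypothetical stand-in for artists.dat: a header plus two made-up rows.
SAMPLE = (
    "id\tname\turl\tpictureURL\n"
    "18743\tCoptic Rain\thttp://example.com/a\thttp://example.com/a.jpg\n"
    "18744\tExample Artist\thttp://example.com/b\thttp://example.com/b.jpg\n"
)


def parse_row(line):
    """Simplified stand-in for parse_artist_line(): one dict per row."""
    artist_id, name, _, _ = line
    return {"artist_id": int(artist_id), "name": name}


reader = csv.reader(io.StringIO(SAMPLE), delimiter="\t")
next(reader)  # skip the header, as iter_lines() does

# Collect one dict per line, then build the frame once at the end.
records = [parse_row(line) for line in reader]
artists_df = pd.DataFrame(records)
print(artists_df)
```

Building the frame once from a list of dicts also avoids growing a DataFrame row by row inside the loop.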
If I pass "columns" as the orient argument to from_dict() instead, I get an error:
ValueError: If using all scalar values, you must pass an index
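As far as I can tell, the error comes from the dictionary values being plain scalars, so pandas cannot infer how many rows to create. A small sketch that reproduces it and shows two ways around it; the artist_dict literal is just the one row printed above, typed in by hand.

```python
import pandas as pd

artist_dict = {"artist_id": 18743, "name": "Coptic Rain"}

message = ""
try:
    # All values are scalars, so there is no row index to infer.
    pd.DataFrame.from_dict(artist_dict, orient="columns")
except ValueError as err:
    message = str(err)
    print(message)

# Option 1: wrap the dict in a list, giving one row per dict.
one_row = pd.DataFrame([artist_dict])

# Option 2: keep the dict as-is and supply the index explicitly.
also_one_row = pd.DataFrame(artist_dict, index=[0])

print(one_row)
```

Either option gives a single row with artist_id and name as columns.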
I've followed the following posts/info pages:
- Convert Python dict into a dataframe
- https://pandas.pydata.org/pandas-docs/stable/generated/pandas.DataFrame.from_dict.html
I'm no longer sure what I'm doing wrong (possibly every step). Any help or guidance is appreciated!