
For example, news_dict contains:

{'articles' : [{'headline' : ..., 'url' : ..., 'body' : ...}, {'headline' : ..., 'url' : ..., 'body' : ...}, ... and so on, up to 200 data points]}
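import pandas as pd

df_news = pd.DataFrame()
for ix in news_dict['articles']:
    p = {'headline' : ix['headline'], 'url' : ix['url'], 'body' : ix['body']}
    # all values are scalars, so an index is required; [0] labels every row 0
    df = pd.DataFrame(data = p, index = [0])
    # DataFrame.append was removed in pandas 2.0; this line needs an older version
    df_news = df_news.append(df)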

The code above produces an appended DataFrame in which every row has index 0. Another way is to wrap each value in a list, e.g. 'headline' : [ix['headline']], but that still gives index 0 for every row.

One could pass an explicit list, index = [1, 2, 3, ..., 200], but that becomes cumbersome for data of up to 1000 rows.

How can the index be updated dynamically in such a case?

If I don't pass an index, it throws an error: ValueError: If using all scalar values, you must pass an index

I am not showing the full output, as it is quite long. Output:

    headline        url       body
0   headline_1      url_1     body_1
0   ....
0   

A sample input one can use:

sample_input = {'A': [{'a': 1, 'b': 2, 'c': 3}, {'a': 4, 'b': 5, 'c': 6}, {'a': 20, 'b': 50, 'c': 30}]}

Desired output:

    a   b   c
0   1   2   3
1   4   5   6
2   20  50  30

Here a, b, c are the column headers and 0, 1, 2 are the indices.

  • You could add the argument `ignore_index=True` when you append and it will re-create a new RangeIndex instead of repeating 0. But in general it's very bad to append to a DataFrame in a loop. https://stackoverflow.com/a/37009561/4333359 Much better to append to a list, or pre-allocate an array then construct the DataFrame once after the loop. This will also solve your index problem. – ALollz Feb 17 '20 at 06:13
  • thanks!! This is much better and easy to understand. – Shivam Rawat Feb 17 '20 at 06:25
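
The suggestion in the first comment could look roughly like the sketch below. It is only an illustration built on the question's sample input, not a tested solution for the real news_dict. The names rows, frames, df_once and df_concat are made up for this example. Since DataFrame.append is no longer available in pandas 2.0 and later, the second variant uses pd.concat with ignore_index=True in place of the append call mentioned in the comment.

```python
import pandas as pd

# Sample input from the question.
sample_input = {'A': [{'a': 1, 'b': 2, 'c': 3},
                      {'a': 4, 'b': 5, 'c': 6},
                      {'a': 20, 'b': 50, 'c': 30}]}

# Option 1: collect plain dicts in a list, then build the frame once
# after the loop. Without an explicit index, pandas assigns a default
# RangeIndex starting at 0.
rows = []  # hypothetical name
for ix in sample_input['A']:
    rows.append({'a': ix['a'], 'b': ix['b'], 'c': ix['c']})
df_once = pd.DataFrame(rows)

# Option 2: if the one-row frames already exist, concatenate them in a
# single call and let ignore_index=True renumber the rows.
frames = [pd.DataFrame(data=row, index=[0]) for row in sample_input['A']]
df_concat = pd.concat(frames, ignore_index=True)

print(df_once)
```

Both variants build the final DataFrame in one step, so the index never has to be maintained by hand and the cost does not grow with every appended row. For the question's data, the same pattern would presumably apply with news_dict['articles'] in place of sample_input['A'].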
