
Absolute newbie ;)

Current Input (note that some of the input below relies on other code which I ran before this, not included here):

data = []

vid_list = list(primary_variants['vid'].unique())
for vid in vid_list:
    report_info = get_reports_with_vid([vid]).rename(columns={"xe": "xe_id"})
    og_rqs = report_info["rq"].apply(lambda x: x.split("-")[0])
    report_info = report_info[og_rqs != 'RQ53']
    vid_info = get_vid_warn_info(report_info[["xe_id", "vid"]].to_dict("records"))
    vid_info = report_info.merge(vid_info, on=["xe_id", "vid"], how="left")
    conf_status = dict(vid_info["confirmation_status"].value_counts())
    data.append({'conf_status': conf_status})

df = pd.DataFrame(data)
print(df)

Current Output:

                                                       conf_status  \
0  {'confirmation_not_necessary': 16, 'might_need_confirmation': 2}     
1  {'confirmation_not_necessary': 1}                                    
2  {'confirmation_not_necessary': 3}                                    
3  {'confirmation_not_necessary': 6}                                    
4  {'confident_call': 2}                                                
5  {'confirmation_not_necessary': 1791, 'might_need_confirmation': 48}  

Question: Ideally, I'd like to rearrange the dataframe output like this (below), with the vid in the first column and one column per confirmation status, so I can copy the results directly from the dataframe output into a spreadsheet. How can I accomplish this?

        vid  conf_not_nec  might_need_conf  conf_call
0   3014790            16                2
1  12246762             1
2   7989296             3
3   2385739             6
4  14560093                                         2
5   1901209          1971               48
Cath Tyner

1 Answer


This can be confusing for a newbie!

You've actually got two questions in there.

Firstly:

You're getting the output you expected; it's just wrapping around the terminal/console. Note the backslash ('\') after the first column header. That's the giveaway.

You can widen the pandas display with:

pd.set_option('display.width', 200)

Note that this uses pd, not df: it's a global pandas setting. More info is in this answer: "How can I display full (non-truncated) dataframe information in HTML when converting from Pandas dataframe to HTML?"

You can have a play around with display.width, display.max_colwidth and a few other settings. Have a look through the options available in pandas: https://pandas.pydata.org/pandas-docs/stable/user_guide/options.html
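
As a rough sketch of those settings, assuming a reasonably recent pandas (1.0 or later, where display.max_colwidth accepts None): the small frame below is made up to resemble the question's output, and pd.option_context is used so the column-width change only applies inside the with-block.

```python
import pandas as pd

# Made-up frame with a long dict-valued column, similar to the question's output.
df = pd.DataFrame({
    "conf_status": [
        {"confirmation_not_necessary": 16, "might_need_confirmation": 2},
        {"confident_call": 2},
    ]
})

# Global setting: allow up to 200 characters per printed row before wrapping.
pd.set_option("display.width", 200)

# Temporary settings: active only inside the with-block, restored afterwards.
with pd.option_context("display.max_colwidth", None, "display.max_columns", None):
    wide_text = repr(df)  # the same text that print(df) would show
    print(wide_text)
```

With the default column width the long dictionary would be cut off with "..."; inside the block it is printed in full.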

For your second question:

Why is everything ending up in one column instead of a column per status? Here:

conf_status = dict(vid_info["confirmation_status"].value_counts())

You're assigning conf_status as a dictionary, which has keys and values. Then you're appending that whole dictionary under the 'conf_status' key right below it, so it lands in a single column:

data.append({'vid': vid, 'conf_status': conf_status})

You might be able to just simplify the assignment of the variable with:

conf_status = vid_info["confirmation_status"].value_counts()
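
If the goal is one column per status, another option is to keep the dictionary but unpack it into the row instead of nesting it under 'conf_status'. This is only a sketch under assumptions: get_reports_with_vid and get_vid_warn_info aren't shown in the question, so counts_by_vid below is a hypothetical stand-in for the per-vid counts, with numbers copied from the question's output and vids paired as in the desired table.

```python
import pandas as pd

# Hypothetical stand-in for the per-vid value_counts() results from the loop.
counts_by_vid = {
    3014790: {"confirmation_not_necessary": 16, "might_need_confirmation": 2},
    12246762: {"confirmation_not_necessary": 1},
    14560093: {"confident_call": 2},
}

data = []
for vid, conf_status in counts_by_vid.items():
    # ** spreads each status into its own key, so each one becomes its own column.
    data.append({"vid": vid, **conf_status})

# Statuses missing for a vid become NaN; the nullable Int64 dtype keeps counts as integers.
df = pd.DataFrame(data).set_index("vid").astype("Int64")

# Tab-separated text pastes cleanly into a spreadsheet; missing counts are left empty.
print(df.to_csv(sep="\t"))
```

In the original loop this would correspond to data.append({'vid': vid, **conf_status}), with conf_status still built via dict(...). The columns can be renamed afterwards with df.rename(columns=...) if shorter headers are wanted.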

Give those a try and let me know how you go!

chromebookdev
Just a note here for new players: if you throw in a print(type(variablename)), it will tell you what your variable's type is. When things are doing funny stuff, it can be useful to know. – chromebookdev Oct 04 '22 at 02:14
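
A tiny illustration of that tip, using made-up status values rather than the question's data: it shows the type before and after wrapping value_counts() in dict().

```python
import pandas as pd

# Made-up values, only to have something to count.
statuses = pd.Series(["confident_call", "confident_call"])

counts = statuses.value_counts()
print(type(counts))        # a pandas Series
print(type(dict(counts)))  # a plain Python dict
```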