I have created a choropleth map using Python that shows confirmed cases for each state based on latitude and longitude. However, I cannot get it to display the data I want from my dataset.

Here is the code I tried:

import plotly.graph_objects as go
import pandas as pd

df = pd.read_csv("COVID19-DATA-01-ONLYSTATES.csv")

fig = go.Figure(data=go.Choropleth(
    locations = df["AdminRegion1"],
    z = df["Confirmed"],
    locationmode = 'USA-states', # set of locations match entries in `locations`
    colorscale = 'Reds',

))

fig.update_layout(
    geo_scope='usa', 
)

fig.show()

Here is a picture of my dataset. [image]

  • First, group by state and sum the Confirmed column. Second, add a state code column to the dataframe and use it as locations instead of "AdminRegion1". These two changes would make your work much easier. – jgrt Oct 23 '20 at 17:08
  • Hey sir, thank you for the suggestion. Could you possibly explain it a bit more? I apologize I am very new to programming so it is hard to understand all of this. – beginner_coder1 Oct 24 '20 at 17:46
  • Sure. Could you please add some sample data in question. – jgrt Oct 24 '20 at 17:56
  • Hi, the only way I could get the data was to attach the link. Is there anyway else you would like it? – beginner_coder1 Oct 24 '20 at 18:19
  • Thanks. Link helped me but for next time please check [this](https://stackoverflow.com/questions/20109391/how-to-make-good-reproducible-pandas-examples) . – jgrt Oct 28 '20 at 10:29
  • Check the answer and add a comment if it is not helpful; otherwise, consider accepting it. – jgrt Oct 28 '20 at 10:30
  • Definitely accepted! Thank you! – beginner_coder1 Oct 28 '20 at 18:39

1 Answer

This code plots all countries, since the provided data covers them and you didn't specify otherwise. If you want a specific country, add a STATE_CODE column to the dataframe (right now, STATE_CODE is missing).

The raw data needs some preprocessing before it can be plotted on a map.

Data Preprocessing:

import pandas as pd
import plotly.graph_objs as go

df = pd.read_csv("Bing-COVID19-Data.csv")

selected_columns = ["ID", "Country_Region", "ISO3", "Updated", "Confirmed", "Deaths", "Recovered"] # select columns for plot
# keep rows that have an ISO3 code (e.g. the worldwide total has none); .copy() avoids SettingWithCopyWarning below
sdf = df.loc[df.ISO3.notnull(), selected_columns].copy()
sdf["Updated"] = pd.to_datetime(sdf.Updated) # Convert Updated column type from str to datetime

sdf = (sdf
       .loc[sdf.groupby('ISO3').Updated.idxmax()] # keep only the latest row for each country, since the counts are cumulative
       .reset_index(drop=True)
       .sort_values(["Country_Region"])
      )

Plot:

# build the hover text from string copies so the numeric columns stay numeric for `z`
sdf["hover_data"] = sdf['Country_Region'] + '<br>' + \
    'Updated: ' + sdf['Updated'].astype(str) + '<br>' + \
    'Confirmed: ' + sdf['Confirmed'].astype(str) + '<br>' + \
    'Deaths: ' + sdf['Deaths'].astype(str) + '<br>' + 'Recovered: ' + sdf['Recovered'].astype(str)

fig = go.Figure(data=go.Choropleth(
    locations = sdf['ISO3'],
    z = sdf['Confirmed'],
    text = sdf['hover_data'],
    colorscale = 'Reds',
    autocolorscale=False,
    marker_line_color='darkgray',
    marker_line_width=0.5,
    colorbar_title = 'Confirmed Cases',
))

fig.update_layout(
    title_text='COVID-19 Cases',
    geo=dict(
        showframe=False,
        showcoastlines=False,
    )
)

fig.show()


jgrt