
I am in the final stages of a project in which I have been comparing the appraisal price with the sold price of different properties. The complete code for data collection and tidying is below.

At this stage I am looking at different ways to visualize my data. However, I am quite new to this, so my question is whether anyone has any "new" or special ways of visualizing data that they find useful and intuitive. I have given a couple of examples of what I am able to visualize now using ggplot.

Additionally, my plots currently show all 1275 observations every time. I would also like to visualize the mean and median of the Percentage, SoldAmount and Tax variables, which are the ones I am most interested in; for example, the mean of the Percentage column for each year.
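
A minimal sketch of that kind of yearly summary is shown here. It uses a small made-up data frame (`toy`) in place of the real data and assumes that `Date` is a `dd.mm.yyyy` string, which may not match what the API actually returns; the `Year` column and the `yearly` name are introduced only for illustration.

```r
library(dplyr)

# Made-up stand-in for alldata_Percent; the values are illustrative only.
toy <- tibble(
  Date       = c("01.03.2020", "15.06.2020", "10.01.2021", "20.09.2021"),
  Tax        = c(1000000, 2000000, 1500000, 3000000),
  SoldAmount = c(1100000, 2600000, 1500000, 3600000)
) %>%
  mutate(Percentage = (SoldAmount - Tax) / Tax)

# Assumption: Date is a "dd.mm.yyyy" string; adjust `format` to the real data.
yearly <- toy %>%
  mutate(Year = as.integer(format(as.Date(Date, format = "%d.%m.%Y"), "%Y"))) %>%
  group_by(Year) %>%
  summarise(
    # Mean and median of each variable of interest, one row per year
    across(c(Percentage, SoldAmount, Tax),
           list(mean = mean, median = median),
           .names = "{.fn}_{.col}"),
    n = n(),
    .groups = "drop"
  )

yearly
```

The result has one row per year with columns such as `mean_Percentage` and `median_Tax`, so it can be passed to ggplot in place of the full set of observations.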

Appreciate any help!

Complete code:

#Step 1: Load needed library 
library(tidyverse) 
library(rvest) 
library(jsonlite)
library(stringi)
library(dplyr)
library(data.table)
library(ggplot2)

#Step 2: Access the URL of where the data is located
url <- "https://www.forsvarsbygg.no/ListApi/ListContent/78635/SoldEstates/0/10/" 

#Step 3: Direct JSON as format of data in URL 
data <- jsonlite::fromJSON(url, flatten = TRUE) 

#Step 4: Access all items in API 
totalItems <- data$TotalNumberOfItems 

#Step 5: Summarize all data from API 
allData <- paste0('https://www.forsvarsbygg.no/ListApi/ListContent/78635/SoldEstates/0/', totalItems,'/') %>% 
  jsonlite::fromJSON(., flatten = TRUE) %>% 
  .[1] %>% 
  as.data.frame() %>% 
  rename_with(~ str_replace(.x, "^ListItems\\.", ""), everything())

#Step 6: Remove columns that are not needed
allData <- allData[, -c(1,4,8,9,11,12,13,14,15)]

#Step 7: remove whitespace and change to numeric in columns SoldAmount and Tax
#https://stackoverflow.com/questions/71440696/r-warning-argument-is-not-an-atomic-vector-when-attempting-to-remove-whites/71440806#71440806
allData[c("Tax", "SoldAmount")] <- lapply(allData[c("Tax", "SoldAmount")], function(z) as.numeric(gsub(" ", "", z)))

#Step 8: Remove rows where value is NA 
#https://stackoverflow.com/questions/4862178/remove-rows-with-all-or-some-nas-missing-values-in-data-frame
alldata <- allData %>%
  # if_all() keeps a row only when every numeric column is non-missing
  filter(if_all(where(is.numeric), ~ !is.na(.x)))

#Step 9: Keep only rows where both SoldAmount and Tax are above 10000 NOK
alldata <- alldata %>%
  filter(SoldAmount > 10000, Tax > 10000)

#Step 10: Calculate percentage change between tax and sold amount and create new column with percent change
#df %>% mutate(Percentage = number/sum(number))
alldata_Percent <- alldata %>% mutate(Percentage = (SoldAmount-Tax)/Tax)

Visualization

# Plot Percentage difference based on County
ggplot(data=alldata_Percent,mapping = aes(x = Percentage, y = County)) +
  geom_point(size = 1.5)
# Plot Percentage difference over Date, coloured by County
theme_set(new = ggthemes::theme_economist())
p <- ggplot(data = alldata_Percent, 
            mapping = aes(x = Date, y = Percentage, colour = County)) +
  geom_line(na.rm = TRUE) +
  geom_point(na.rm = TRUE)
p
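
For the mean and median mentioned in the question, one option is to let ggplot compute them with `stat_summary()` and draw them on top of the raw points. The sketch below uses a made-up data frame (`toy`) with a precomputed `Year` column, which is an assumption; the real data would first need a year extracted from `Date`.

```r
library(ggplot2)

# Made-up stand-in for alldata_Percent; Year is assumed to be precomputed.
toy <- data.frame(
  Year       = c(2019, 2019, 2019, 2020, 2020, 2020),
  Percentage = c(0.10, 0.20, 0.60, -0.05, 0.05, 0.30)
)

p_summary <- ggplot(toy, aes(x = factor(Year), y = Percentage)) +
  # Raw observations, faded so the summaries stand out
  geom_jitter(width = 0.1, alpha = 0.3) +
  # One point per year for the mean and one for the median
  stat_summary(aes(colour = "Mean"), fun = mean, geom = "point", size = 3) +
  stat_summary(aes(colour = "Median"), fun = median, geom = "point",
               size = 3, shape = 17) +
  labs(x = "Year", y = "Percentage difference", colour = "Summary")

p_summary
```

Mapping `colour` to a constant string inside `aes()` is only a way to get a legend entry for each summary; faceting by County with `facet_wrap(~ County)` would be a natural extension on the real data.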
EinarO
    Hi. One way to see lots of good data visualisations for R is to follow #tidytuesday on Twitter. It is a weekly "challenge" where people are given a specific dataset and choose their own way to present it. The good thing is that they also share the code for their visualisations. Follow this link: `https://twitter.com/search?q=%23TidyTuesday&src=typeahead_click` – Bloxx Mar 24 '22 at 21:20
    Wow, that is a great aid! Thank you:) – EinarO Mar 25 '22 at 09:12
