I'm working in QGIS and R. I have place-level spatial files with most of the variables I need, but I want to add land use data that is at roughly parcel level. I want to create new variables in my place-level dataset that give the percentage of the land area taken up by each land use type, so that at the end I have variables describing how much land in each city (place) is taken up by each type of land use.
The code I've been working with (unsuccessfully) is roughly this:
library(sf)
library(dplyr)
library(ggplot2)
# Load the first and second datasets
first_dataset <- st_read("path/to/first/dataset")
second_dataset <- st_read("path/to/second/dataset")
# Calculate the area of each feature in the first dataset
first_dataset_area <- st_area(first_dataset)
# Extract the portion of the second dataset that overlaps with the first dataset
intersection_dataset <- st_intersection(first_dataset, second_dataset)
# Group the intersection dataset by the category of interest
grouped_dataset <- intersection_dataset %>%
  group_by(category) %>%
  summarize(area = sum(st_area(.)))
# Calculate the percentage of the area of the first dataset covered by each group
result_dataset <- grouped_dataset %>%
  mutate(percentage_covered = area / first_dataset_area * 100)
# Visualize the results
ggplot(result_dataset, aes(x = category, y = percentage_covered)) +
  geom_bar(stat = "identity") +
  labs(title = "Percentage of Area Covered by Different Categories",
       x = "Category",
       y = "Percentage Covered")
Any advice would be appreciated!
I'm getting stuck at the result_dataset part of the code. I get an error saying that the percentage_covered variable must be size 182 or 1, not 603.
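A small reproduction may help in reading that message. It is only a sketch with made-up numbers: `grouped` and `place_areas` are hypothetical stand-ins for `grouped_dataset` and `first_dataset_area`. If the same thing is happening in the real data, 182 would be the number of rows in grouped_dataset (one per category) and 603 the number of features in first_dataset.

```r
library(dplyr)

# Hypothetical stand-in for grouped_dataset: one row per category
grouped <- tibble(category = c("a", "b", "c"), area = c(10, 20, 30))

# Hypothetical stand-in for first_dataset_area: one value per place
place_areas <- c(100, 200, 300, 400, 500)

# Dividing recycles the shorter vector, so the result has one value
# per place rather than one per category row
ratio <- suppressWarnings(grouped$area / place_areas)
length(ratio)

# mutate() only accepts a column of length nrow(grouped) or 1,
# so this call fails with a size error of the same kind
res <- tryCatch(
  mutate(grouped, pct = area / place_areas * 100),
  error = function(e) e
)
inherits(res, "error")
```

The point of the sketch is that the two objects are indexed differently (categories versus places), so there is no row-by-row correspondence for the division to use.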
It is clear to me that something is going wrong earlier, probably where I use st_area(), but perhaps I am simply using the wrong process to achieve my goal given my particular dataset.
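For reference, here is a minimal sketch of one way the per-place percentages could be computed, written against toy data rather than the real files. The object names (`places`, `parcels`), the columns `place_id` and `category`, the `square()` helper, and the CRS are all assumptions made for illustration. The idea is to keep each place's area as a column, group the intersected pieces by both place and category, and then pivot to one column per land use type. Note also that in the code above, `st_area(.)` inside `summarize()` appears to refer to the whole piped dataset rather than to the current group.

```r
library(sf)
library(dplyr)
library(tidyr)

# Hypothetical helper: build a rectangular polygon from its corner coordinates
square <- function(xmin, ymin, xmax, ymax) {
  st_polygon(list(rbind(
    c(xmin, ymin), c(xmax, ymin), c(xmax, ymax), c(xmin, ymax), c(xmin, ymin)
  )))
}

# Toy stand-ins for the two datasets, in an arbitrary projected CRS (metres)
places <- st_sf(
  place_id = c("A", "B"),
  geometry = st_sfc(square(0, 0, 10, 10), square(10, 0, 20, 10), crs = 32617)
)
parcels <- st_sf(
  category = c("residential", "commercial", "residential"),
  geometry = st_sfc(
    square(0, 0, 5, 10),   # left half of place A
    square(5, 0, 15, 10),  # straddles places A and B
    square(15, 0, 20, 5),  # a quarter of place B
    crs = 32617
  )
)

# Keep each place's total area as a column so it travels with the rows
places$place_area <- as.numeric(st_area(places))

# Cut parcels along place boundaries; each piece keeps place_id and category
pieces <- suppressWarnings(st_intersection(places, parcels))
pieces$piece_area <- as.numeric(st_area(pieces))

# Sum piece areas per place AND category, as a share of that place's area
pct_long <- pieces %>%
  st_drop_geometry() %>%
  group_by(place_id, category) %>%
  summarise(pct = 100 * sum(piece_area) / first(place_area), .groups = "drop")

# One column per land use type, then attach to the place-level layer
pct_wide <- pct_long %>%
  pivot_wider(names_from = category, values_from = pct,
              names_prefix = "pct_", values_fill = 0)

places_with_landuse <- left_join(places, pct_wide, by = "place_id")
```

With real data, both layers would need to share the same projected CRS before intersecting (for example via `st_transform()`), and percentages within a place will sum to less than 100 wherever parcels do not cover the whole place.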