I am trying to rename some columns and remove others that are irrelevant for this use case.
Data Source:
data <- read.csv("data/building_permits.csv")
Data Inspection:
colnames(data)
The dataset column names:
[1] "Permit.Number"
[2] "Permit.Type"
[3] "Permit.Type.Definition"
[4] "Permit.Creation.Date"
[5] "Block"
[6] "Lot"
[7] "Street.Number"
[8] "Street.Number.Suffix"
[9] "Street.Name"
[10] "Street.Suffix"
[11] "Unit"
[12] "Unit.Suffix"
[13] "Description"
[14] "Current.Status"
[15] "Current.Status.Date"
[16] "Filed.Date"
[17] "Issued.Date"
[18] "Completed.Date"
[19] "First.Construction.Document.Date"
[20] "Structural.Notification"
[21] "Number.of.Existing.Stories"
[22] "Number.of.Proposed.Stories"
[23] "Voluntary.Soft.Story.Retrofit"
[24] "Fire.Only.Permit"
[25] "Permit.Expiration.Date"
[26] "Estimated.Cost"
[27] "Revised.Cost"
[28] "Existing.Use"
[29] "Existing.Units"
[30] "Proposed.Use"
[31] "Proposed.Units"
[32] "Plansets"
[33] "TIDF.Compliance"
[34] "Existing.Construction.Type"
[35] "Existing.Construction.Type.Description"
[36] "Proposed.Construction.Type"
[37] "Proposed.Construction.Type.Description"
[38] "Site.Permit"
[39] "Supervisor.District"
[40] "Neighborhoods...Analysis.Boundaries"
[41] "Zipcode"
[42] "Location"
[43] "Record.ID"
[44] "SF.Find.Neighborhoods"
[45] "Current.Police.Districts"
[46] "Current.Supervisor.Districts"
[47] "Analysis.Neighborhoods"
[48] "DELETE...Zip.Codes"
[49] "DELETE...Fire.Prevention.Districts"
[50] "DELETE...Supervisor.Districts"
[51] "DELETE...Current.Police.Districts"
[52] "DELETE...Supervisorial_Districts_Waterline_data_from_7pkg_wer3"
Number of columns in the data:
length(colnames(data))
[1] 52
Remove columns:
library(dplyr)  # provides %>%, select() and all_of()

# Columns I want to drop
colremove <- c("First Construction Document Date",
               "Structural Notification",
               "Number of Existing Stories",
               "Number of Proposed Stories",
               "Voluntary Soft Story Retrofit",
               "Fire Only Permit", "Existing Units",
               "Proposed Units", "Plansets",
               "TIDF Compliance", "Existing Construction Type",
               "Proposed Construction Type", "Site Permit",
               "Supervisor District", "Current Police Districts",
               "Current Supervisor Districts",
               "Current Status Date", "Permit Creation Date",
               "Analysis Neighborhoods", "Lot", "Location",
               "SF Find Neighborhoods", "Unit", "Block", "Permit Type",
               "Unit Suffix", "Street Number Suffix",
               "Existing Construction Type Description")

# Keep only the first 47 columns, then drop the ones listed above
data <- data[colnames(data)[1:47]] %>% select(-all_of(colremove))
Here the error shows up:
Error: Can't subset columns that don't exist.
x Columns `First Construction Document Date`, `Structural Notification`, `Number of Existing Stories`, `Number of Proposed Stories`, `Voluntary Soft Story Retrofit`, etc. don't exist.
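A likely cause, judging from the colnames(data) output: read.csv() by default (check.names = TRUE) sanitizes headers, so spaces become dots, while colremove still uses the names with spaces. Below is a minimal sketch of one way around it, assuming that is what happened here. The small data frame is a made-up stand-in for the permits file, and colremove_fixed and trimmed are names chosen only for illustration.

```r
library(dplyr)

# Hypothetical stand-in for the permits data; the dotted names mimic
# what read.csv() produces with its default check.names = TRUE
data <- data.frame(
  Permit.Number = c("A1", "A2"),
  Structural.Notification = c("", "Y"),
  Fire.Only.Permit = c("", "Y"),
  Description = c("reroof", "new deck")
)

# Names written with spaces, as in the original headers
colremove <- c("Structural Notification", "Fire Only Permit")

# Names in colremove that do not occur in the data (here: both of them)
missing_before <- setdiff(colremove, colnames(data))

# make.names() turns characters that are invalid in R names (such as spaces) into dots
colremove_fixed <- make.names(colremove)

# With matching names, all_of() no longer errors
trimmed <- data %>% select(-all_of(colremove_fixed))
colnames(trimmed)
```

Another option would be to read the file with read.csv("data/building_permits.csv", check.names = FALSE), which keeps the original headers with spaces so that colremove matches as written.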