I am trying to rename some columns and remove others that are irrelevant for this use case.

Data Source:

data <- read.csv("data/building_permits.csv")

Data Inspection

colnames(data)

The dataset column names

 [1] "Permit.Number"
 [2] "Permit.Type"                                                   
 [3] "Permit.Type.Definition"                                        
 [4] "Permit.Creation.Date"                                          
 [5] "Block"                                                         
 [6] "Lot"                                                           
 [7] "Street.Number"                                                 
 [8] "Street.Number.Suffix"                                          
 [9] "Street.Name"                                                   
[10] "Street.Suffix"                                                 
[11] "Unit"                                                          
[12] "Unit.Suffix"                                                   
[13] "Description"                                                   
[14] "Current.Status"                                                
[15] "Current.Status.Date"                                           
[16] "Filed.Date"                                                    
[17] "Issued.Date"                                                   
[18] "Completed.Date"                                                
[19] "First.Construction.Document.Date"                              
[20] "Structural.Notification"                                       
[21] "Number.of.Existing.Stories"                                    
[22] "Number.of.Proposed.Stories"                                    
[23] "Voluntary.Soft.Story.Retrofit"                                 
[24] "Fire.Only.Permit"                                              
[25] "Permit.Expiration.Date"                                        
[26] "Estimated.Cost"                                                
[27] "Revised.Cost"                                                  
[28] "Existing.Use"                                                  
[29] "Existing.Units"                                                
[30] "Proposed.Use"                                                  
[31] "Proposed.Units"                                                
[32] "Plansets"                                                      
[33] "TIDF.Compliance"                                               
[34] "Existing.Construction.Type"                                    
[35] "Existing.Construction.Type.Description"                        
[36] "Proposed.Construction.Type"                                    
[37] "Proposed.Construction.Type.Description"                        
[38] "Site.Permit"                                                   
[39] "Supervisor.District"                                           
[40] "Neighborhoods...Analysis.Boundaries"                           
[41] "Zipcode"                                                       
[42] "Location"                                                      
[43] "Record.ID"                                                     
[44] "SF.Find.Neighborhoods"                                         
[45] "Current.Police.Districts"                                      
[46] "Current.Supervisor.Districts"                                  
[47] "Analysis.Neighborhoods"                                        
[48] "DELETE...Zip.Codes"                                            
[49] "DELETE...Fire.Prevention.Districts"                            
[50] "DELETE...Supervisor.Districts"                                 
[51] "DELETE...Current.Police.Districts"                             
[52] "DELETE...Supervisorial_Districts_Waterline_data_from_7pkg_wer3"

Number of columns in the data:

length(colnames(data))

[1] 52

Remove columns

colremove = c("First Construction Document Date",
          "Structural Notification",
          "Number of Existing Stories",
          "Number of Proposed Stories",
          "Voluntary Soft Story Retrofit",
          "Fire Only Permit","Existing Units",
          "Proposed Units","Plansets",
          "TIDF Compliance","Existing Construction Type",
          "Proposed Construction Type","Site Permit",
          "Supervisor District","Current Police Districts",
          "Current Supervisor Districts",
          "Current Status Date", "Permit Creation Date",
          "Analysis Neighborhoods","Lot","Location",
          "SF Find Neighborhoods","Unit","Block", "Permit Type",
          "Unit Suffix","Street Number Suffix",
          "Existing Construction Type Description")

data <- data[colnames(data)[1:47]] %>% select(-all_of(colremove))

Here the error shows up:

Error: Can't subset columns that don't exist.
x Columns First Construction Document Date, Structural Notification, Number of Existing Stories, Number of Proposed Stories, Voluntary Soft Story Retrofit, etc. don't exist.

Daremitsu
  • Hi, why do you repeatedly ask [a question of yours](https://stackoverflow.com/q/64238572/6574038), that was closed for a specific reason that you should have fixed first? – jay.sf Oct 07 '20 at 10:20
  • Hey, I had mistakenly written ```length(colnames(data)) = 19``` when it should have been 52. When I came back, the question had been closed. That's why I am asking it again. – Daremitsu Oct 07 '20 at 10:37
  • Yeah, but we need a [minimal reproducible example](https://stackoverflow.com/a/5963610/6574038) to answer your question. – jay.sf Oct 07 '20 at 11:15
  • Okay. I am editing and adding it then. – Daremitsu Oct 07 '20 at 11:24

2 Answers

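If you want to keep using dplyr, the selection helper you are looking for is any_of(), not all_of(): all_of() raises an error when a name is missing, whereas any_of() ignores names that do not exist. Note, however, that the names in colremove contain spaces while read.csv() has converted them to dots (for example "Permit.Type"), so they must be written with dots for any_of() to actually drop them.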
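
A minimal sketch of the difference, assuming dplyr 1.0.0 or later. The data frame df and its values are hypothetical stand-ins for the permit data, and "No Such Column" is a made-up name that exists only to trigger the error:

```r
library(dplyr)

# Hypothetical stand-in for the permit data; as with read.csv(),
# the column names contain dots instead of spaces.
df <- data.frame(Permit.Number = 1:3,
                 Permit.Type   = c(8, 8, 3),
                 Lot           = c("a", "b", "c"))

colremove <- c("Permit Type", "Lot", "No Such Column")

# all_of() is strict: a name that is not a column raises an error.
strict <- tryCatch(select(df, -all_of(colremove)),
                   error = function(e) "error")

# any_of() is lenient, but "Permit Type" still does not match
# "Permit.Type", so only Lot is dropped here.
lenient <- select(df, -any_of(colremove))

# make.names() applies the same conversion read.csv() uses by default,
# so the converted names match and both columns are dropped.
fixed <- select(df, -any_of(make.names(colremove)))
```

In this sketch, strict ends up holding the string "error", lenient keeps Permit.Number and Permit.Type, and fixed keeps only Permit.Number.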

bcarlsen

I have solved the problem I was facing.

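data <- data[, 1:47]   # keep the first 47 columns (note the leading comma)
data <- data[, !(names(data) %in% gsub(" ", ".", colremove, fixed = TRUE))]   # drop the unwanted columns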

This removes the unwanted columns and assigns the result back to the original data frame.
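
As a hedged sketch of the same base R approach on a small hypothetical data frame (the columns and values are made up), it can help to first check which names in colremove actually occur in the data, because %in% silently ignores names that do not match:

```r
# Hypothetical stand-in for the permit data (names already contain dots).
df <- data.frame(Permit.Number = 1:3,
                 Permit.Type   = c(8, 8, 3),
                 Lot           = c("a", "b", "c"),
                 Zipcode       = c(94102, 94103, 94110))

colremove <- c("Permit Type", "Lot")

# Names in colremove that do not occur in the data frame as written.
unmatched <- setdiff(colremove, names(df))

# The comma position matters: df[1:3, ] selects rows, df[, 1:3] selects columns.
first_cols <- df[, 1:3]

# Convert spaces to dots the way read.csv() does, then drop the matches.
to_drop <- names(first_cols) %in% make.names(colremove)
result  <- first_cols[, !to_drop, drop = FALSE]
```

Here unmatched contains "Permit Type", which shows that the space-separated name would not have been removed without the conversion. The drop = FALSE argument keeps result a data frame even when a single column remains.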

Daremitsu