I'm working with the following data in Pandas. For the block column, I need to change each value so it only includes the street name (this way I can geocode for the lat long coordinates). To use the geocoder I'm working with, I also need to include "Washington, DC".
crimes = pd.read_csv("/content/SearchResults (2).txt", encoding='latin-1')
This is what I wan't the BLOCK columns to look like:
2ND STREET SE, WASHINGTON DC
TAYLOR STREET NE, WASHINGTON DC
How do I do this? If it's easier, I can add another column with this info instead of changing the block columns. Apparently you can't use string methods on a pd data frame and I'm clueless when it comes to regular expressions ... please help!
Edit:
this code does exactly what I want:
for i in crimes['BLOCK']:
i = i.split()
i = i[-3:]
i = " ".join([str(elem) for elem in i])
i = i + ", WASHINGTON DC "
print(i)
the output looks like this:
MINNESOTA AVENUE NE, WASHINGTON DC
MORSE STREET NE, WASHINGTON DC
How do I reassign the actual column values to the i variable above?
Edit 2:
Here is an example of the csv file:
REPORT_DAT,OFFENSE,METHOD,BLOCK,DISTRICT,WARD,NEIGHBORHOOD_CLUSTER,BLOCK_GROUP,XBLOCK,YBLOCK,START_DATE
6/30/2020 3:03:21 AM,THEFT F/AUTO,OTHERS,5700 - 5799 BLOCK OF 27TH STREET NW,2,4,Cluster 10,001500 1,395132,144513,6/29/2020 2:00:48 PM
6/30/2020 12:04:33 AM,MOTOR VEHICLE THEFT,OTHERS,4432 - 4499 BLOCK OF GREENWICH PARKWAY NW,2,3,Cluster 13,000802 2,392727,138206,6/29/2020 1:00:43 PM