Python 2.7
I have a DataFrame with two columns, coordinates and locs. coordinates contains 10 lat/long pairs and locs contains 10 strings.
The following code raises a ValueError saying the arrays were different lengths. It seems I'm not writing the condition correctly.
import pandas as pd
lst_10_cords = [['37.09024, -95.712891'], ['-37.605, 145.146'], ['43.0481962, -76.0488458'], ['29.7604267, -95.3698028'], ['47.6062095, -122.3320708'], ['34.0232431, -84.3615555'], ['31.9685988, -99.9018131'], ['37.226582, -95.70522299999999'], ['40.289918, -83.036372'], ['37.226582, -95.70522299999999']]
lst_10_locs = [['United States'], ['Doreen, Melbourne'], ['Upstate NY'], ['Houston, TX'], ['Seattle, WA'], ['Roswell, GA'], ['Texas'], ['null'], ['??, passing by...'], ['null']]
df = pd.DataFrame(columns=['coordinates', 'locs'])
df['coordinates'] = lst_10_cords
df['locs'] = lst_10_locs
print df
df = df[df['coordinates'] != ['37.226582', '-95.70522299999999']] #ValueError
The error message is:
  File "C:\Users...\Miniconda3\envs\py2.7\lib\site-packages\pandas\core\ops.py", line 1283, in wrapper
    res = na_op(values, other)
  File "C:\Users...\Miniconda3\envs\py2.7\lib\site-packages\pandas\core\ops.py", line 1143, in na_op
    result = _comp_method_OBJECT_ARRAY(op, x, y)
  File "C:...\biney\Miniconda3\envs\py2.7\lib\site-packages\pandas\core\ops.py", line 1120, in _comp_method_OBJECT_ARRAY
    result = libops.vec_compare(x, y, op)
  File "pandas/_libs/ops.pyx", line 128, in pandas._libs.ops.vec_compare
ValueError: Arrays were different lengths: 10 vs 2
My goal is to check for and remove all entries in the coordinates column that are equal to the list [37.226582, -95.70522299999999], so I want df['coordinates'] to print out
[['37.09024, -95.712891'], ['-37.605, 145.146'], ['43.0481962, -76.0488458'], ['29.7604267, -95.3698028'], ['47.6062095, -122.3320708'], ['34.0232431, -84.3615555'], ['31.9685988, -99.9018131'], ['37.226582, -95.70522299999999'], ['40.289918, -83.036372']]
I was hoping that this documentation would help, particularly the part that shows:
"You may select rows from a DataFrame using a boolean vector the same length as the DataFrame’s index (for example, something derived from one of the columns of the DataFrame):"
df[df['A'] > 0]
so it seems like I'm not quite getting the syntax right, but I'm stuck. How do I write a condition on the cell value of a certain column and return a DataFrame containing only the rows whose cells meet that condition?
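For reference, here is a minimal sketch of what I understand "a boolean vector the same length as the DataFrame's index" to mean when the cells hold lists. It is only a guess at the right approach, not a confirmed fix. It assumes the target should have the same shape as the cells (a one-element list holding a single 'lat, long' string), and the names target, mask and filtered are made up for illustration. The data is a shortened copy of the lists above.

```python
import pandas as pd

# Shortened copy of the data above: every cell is a one-element list.
coords = [['37.09024, -95.712891'], ['-37.605, 145.146'],
          ['37.226582, -95.70522299999999'], ['40.289918, -83.036372'],
          ['37.226582, -95.70522299999999']]
locs = [['United States'], ['Doreen, Melbourne'], ['null'],
        ['??, passing by...'], ['null']]
df = pd.DataFrame({'coordinates': coords, 'locs': locs})

# Assumption: the value to drop is written in the same shape as a cell.
target = ['37.226582, -95.70522299999999']

# Compare one cell at a time, so the result has one boolean per row
# instead of pandas trying to line the target up against the whole column.
mask = df['coordinates'].apply(lambda cell: cell != target)

# Boolean indexing keeps only the rows where the mask is True.
filtered = df[mask]
print(filtered)
```

With this sample, mask has five entries (one per row) and filtered keeps the three rows whose coordinates differ from target.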