-2

I have a CSV file and I want to sort the first column from lowest to greatest. The first column's name is "CRIM".

I can read the first column, but I can't sort it. The numbers are floats.

Also, I would like to find the median of the list.

This is what I did so far:

import csv

with open('data.csv', newline='') as csvfile:
    data = csv.DictReader(csvfile)
    for line in data:
        print(line['CRIM'])
Jim G.
Chadi N

2 Answers

0

I would advise using pandas, specifically DataFrame.median().

Example data:

    A   B   C   D
0  12   5  20  14
1   4   2  16   3
2   5  54   7  17
3  44   3   3   2
4   1   2   8   6
# importing pandas as pd 
import pandas as pd 

# for your csv
# df = pd.read_csv('data.csv')
  
# Creating the dataframe (example)
df = pd.DataFrame({"A":[12, 4, 5, 44, 1], 
                   "B":[5, 2, 54, 3, 2], 
                   "C":[20, 16, 7, 3, 8],  
                   "D":[14, 3, 17, 2, 6]}) 
  
# Find the median of each column. axis=0 is the default, so the
# median is taken over the index axis even if it is not specified.
df.median(axis=0)
A    5.0
B    3.0
C    8.0
D    6.0
dtype: float64
df['A'].median(axis = 0) 
5.0
Kuldeep Singh Sidhu
  • Thank you for your answer, but how am I supposed to create a dataframe for this: https://www.kaggle.com/arslanali4343/real-estate-dataset?select=data.csv I'm extracting the first column (CRIM). – Chadi N Nov 21 '20 at 02:54
  • Well, data frames are faster, have more features, and are easy to use and view; on Kaggle, pandas is pre-installed and everyone uses it. – Kuldeep Singh Sidhu Nov 21 '20 at 02:57
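
For a CSV file like the one linked in the comment, the DataFrame does not have to be built by hand: pd.read_csv creates it from the file. The following is only a rough sketch, assuming the file has a header row with a CRIM column. The inline sample text stands in for data.csv so the snippet runs on its own, and its values are made up for illustration.

```python
import io
import pandas as pd

# Stand-in for data.csv (made-up values), so the sketch is self-contained.
sample = io.StringIO("CRIM,ZN\n0.02731,0.0\n0.00632,18.0\n0.03237,0.0\n")

# With the real file this would be: pd.read_csv('data.csv', usecols=['CRIM'])
df = pd.read_csv(sample, usecols=["CRIM"])

crim_sorted = df["CRIM"].sort_values()   # ascending order by default
crim_median = df["CRIM"].median()        # missing values (NaN) are skipped by default

print(crim_sorted.tolist())
print(crim_median)
```

read_csv parses the column as floats, so no manual conversion is needed before sorting.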
0

https://www.programiz.com/python-programming/methods/built-in/sorted

Use sorted():

CRIM_sorted = sorted(line['CRIM'])

For the median, you can use a package or just build your own; see "Finding median of list in Python".

angrymantis
  • For some reason, when I use sorted(), it splits the number from this -> 0.004, to this [".", "0", "0", "4"]. – Chadi N Nov 21 '20 at 02:28