First things first, you must understand the inner workings of a CSV file.
CSV files are made up of rows and columns, like this:

| NAME  | AGE | ROOM |
| ----- | --- | ---- |
| Kaleb | 15  | 256  |
| John  | 15  | 257  |
| Anna  | 16  | 269  |

The vertical elements are columns and the horizontal elements are rows. Each row holds several kinds of data for one entry, like name/age/room. Each column holds only one kind of data, like name.
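To make that concrete, here is a minimal sketch of what the table looks like as raw text inside the file: one line per row, with commas separating the columns. The file name `test.csv` and the exact contents are assumptions chosen to match the table above.

```python
# Raw text of the sample table as it would be stored in a CSV file:
# each line is one row, and commas separate the columns.
csv_text = "NAME,AGE,ROOM\nKaleb,15,256\nJohn,15,257\nAnna,16,269\n"

# Save it so there is a file to experiment with (the file name is an assumption).
with open("test.csv", "w") as f:
    f.write(csv_text)

print(csv_text)
```

Running this once gives you a small file to try the rest of the examples on.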
Moving on, here is an example function to read the CSV.
Please carefully study the code.

    def read_csv(csv_file):
        data = []
        with open(csv_file, 'r') as f:
            # create a list of rows in the CSV file
            rows = f.readlines()
            # strip white-space and newlines
            rows = list(map(lambda x: x.strip(), rows))
            for row in rows:
                # further split each row into columns assuming delimiter is comma
                row = row.split(',')
                # append to data-frame our new row-object with columns
                data.append(row)
        return data

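Now why would I do that? Well, this function lets you access your CSV file by row and column, which makes it easier to index. Keep in mind that indexes start at 0, a header row (if the file has one) is row 0, and every value comes back as a string. Look at this example using the above function: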

    csvFile = 'test.csv'
    # invoke our function
    data = read_csv(csvFile)
    # get row 1, column 2 of file
    print(data[1][2])
    # get entirety of row 2
    print(data[2])
    # get row 0, columns 1 & 2
    print(data[0][1], data[0][2])

As you can see, we can easily access different parts of the file by using our read_csv() function, which returns a nested list. Finally, if you want to print the entire file, you simply use a for loop after creating the data object:

    data = read_csv(csvFile)
    for row in data:
        print(row)

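Beyond printing rows, the same nested list can be used to pull out a whole column. The sketch below is only an illustration: the `data` literal is an assumption standing in for what read_csv() would return for the sample table, with the header as row 0.

```python
# Stand-in for the result of read_csv() on the sample table (assumed contents).
data = [
    ["NAME", "AGE", "ROOM"],
    ["Kaleb", "15", "256"],
    ["John", "15", "257"],
    ["Anna", "16", "269"],
]

# Separate the header row from the actual records.
header, records = data[0], data[1:]

# Find the column's position by name, then collect that position from every row.
age_index = header.index("AGE")
# Values are read as strings, so convert them when numbers are needed.
ages = [int(row[age_index]) for row in records]

print(ages)        # [15, 15, 16]
print(data[1][2])  # 256
```

Looking up the index through the header keeps the code working even if the column order in the file changes.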
In conclusion, pandas is great for big-data science, but if you just want to read/access a simple CSV (one where no field contains a comma), this function is just fine. No need to install big packages for little tasks, unless of course you want to :)
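If your file does have quoted fields with commas inside them, Python's built-in `csv` module handles that without installing anything. This is a rough sketch rather than a replacement for the function above; the name `read_csv_stdlib` and the sample text are made up for illustration.

```python
import csv
import io

def read_csv_stdlib(f):
    """Return every row of an open text file (or file-like object) as a list of lists."""
    return list(csv.reader(f))

# A quoted field containing a comma, which a plain split(',') would break in two.
sample = io.StringIO('NAME,AGE,ROOM\n"Smith, John",15,257\n')
rows = read_csv_stdlib(sample)

print(rows[1])  # ['Smith, John', '15', '257']
```

With a real file you would pass in something like `open('test.csv', newline='')` instead of the `StringIO` object.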
Good luck!