I am trying to plot a graph from the training data available on the website linked below. The data has many rows and many columns, and I want to plot it column by column.

I managed to write working code that prints a single column, but I do not know how to plot that column. In the code below, the last two lines are my attempt to plot it, but they do not work. Can anyone help me plot the graph for that column?

https://archive.ics.uci.edu/ml/datasets/Parkinson+Speech+Dataset+with++Multiple+Types+of+Sound+Recordings

import csv
import matplotlib.pyplot as plt

with open("C://Users/RichardStone/Pycharm/Projects/train_data.csv", "r") as csv_file:
    csv_reader = csv.reader(csv_file, delimiter=',')
    for lines in csv_reader:
        print(lines[1])
        plt.plot(lines[1])
        plt.show()
Soulofknight

2 Answers

Why not read data into a pandas dataframe and then plot it using matplotlib?

Something like this should work:

import pandas as pd
import matplotlib.pyplot as plt 

file_path = r"path\to\file"  # raw string, so \t and \f are not read as escape sequences
df = pd.read_csv(file_path)

for column in df.columns:
    print(df[column])
    plt.figure()
    plt.title(column)
    plt.plot(df[column])
    plt.show()
Erik Hallin
  • Thanks, this works! But how can I name each column? In the windows that open, each graph's title is the first value of the column. Can I replace it with names of my choosing? Also, is it possible to leave out certain columns? The first column, for example, is not needed: it is just a running number from 1 to 40, so its graph is a straight line. – Soulofknight Feb 19 '21 at 08:33
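A title that shows the first value of a column suggests the file has no header row, so pandas is using the first data row as column names. Assuming that is the case, a minimal sketch of one way to handle it is to read the file with header=None, drop the unwanted columns by position, and rename the rest. The helper names load_columns and plot_each_column, and any column names passed in, are made up for illustration and do not come from the dataset.

```python
import pandas as pd
import matplotlib.pyplot as plt


def load_columns(source, rename=None, drop=()):
    """Read a CSV that has no header row (assumption), then drop and rename columns.

    `source` is a file path or file-like object.
    `drop` lists column positions to leave out, e.g. [0] for a running number.
    `rename` maps column positions to names of your choosing (hypothetical names).
    """
    # header=None keeps the first row as data; columns are labelled 0, 1, 2, ...
    df = pd.read_csv(source, header=None)
    df = df.drop(columns=list(drop))
    if rename:
        df = df.rename(columns=rename)
    return df


def plot_each_column(df):
    """Create one figure per column, titled with the column's label."""
    figures = []
    for column in df.columns:
        fig, ax = plt.subplots()
        ax.plot(df[column])
        ax.set_title(str(column))
        figures.append(fig)
    return figures
```

It could be used as df = load_columns(file_path, rename={1: "first_feature"}, drop=[0]), followed by plot_each_column(df) and a single plt.show(). Columns that are not renamed keep their integer position as the title.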

If you're using matplotlib, you've already got numpy, so you could do something like this:

import csv
import matplotlib.pyplot as plt
import numpy


with open('C://Users/RichardStone/Pycharm/Projects/train_data.csv', 'r') as csvfile:
    reader = csv.reader(csvfile)
    # convert strings to numbers, collect everything in a list of lists
    data_list = [[float(item) for item in row] for row in reader if row]

# convert to numpy array for convenient indexing
data = numpy.array(data_list)
column_index = 1
plt.plot(data[:, column_index])  
# or plt.plot(data) to show all columns
# or plt.scatter(data[:, 0], data[:, 1]) for a scatter plot
plt.show()
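As a possible shortcut, numpy can also parse the file itself, so the csv step is optional. The sketch below assumes every field is numeric and that there is no header row; load_column and plot_column are hypothetical helper names, not part of numpy or matplotlib.

```python
import numpy
import matplotlib.pyplot as plt


def load_column(source, column_index):
    """Load one column of a comma-separated numeric file as a 1-D array.

    Assumes no header row and only numeric fields; `source` is a path
    or file-like object.
    """
    # usecols picks the column; ndmin=1 keeps the result 1-D even for a single row
    return numpy.loadtxt(source, delimiter=",", usecols=column_index, ndmin=1)


def plot_column(values, title=None):
    """Plot the values against their row position and return the figure."""
    fig, ax = plt.subplots()
    ax.plot(values)
    if title is not None:
        ax.set_title(title)
    return fig
```

For example, plot_column(load_column(path, 1), title="column 1") followed by plt.show() would display the second column of the file at path.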
djvg
  • Thanks for your reply. I tried this code, but it raises an error on the line import matplotlib.pyplot as plt: ModuleNotFoundError: No module named 'matplotlib.pyplot'. Do you know how I can solve this? – Soulofknight Feb 19 '21 at 08:37
  • @Soulofknight: that line is from your own example. Did it work there? If not, you may need to [install the matplotlib package](https://matplotlib.org/stable/users/installing.html). Otherwise you might find some answers [here](https://stackoverflow.com/q/18176591) – djvg Feb 19 '21 at 08:45
  • If you're using PyCharm, perhaps check the run configuration, or restart PyCharm's Python console if you're using that. – djvg Feb 19 '21 at 08:56
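To narrow down an import error like this, it may help to check which interpreter is running the script and where, if anywhere, it finds matplotlib. The sketch below uses only the standard library; describe_module is a hypothetical helper name. If the reported location is a file inside your own project, such as a script named matplotlib.py, that file is probably shadowing the real package.

```python
import importlib.util
import sys


def describe_module(name):
    """Return the file Python would load `name` from, or None if it is not found."""
    spec = importlib.util.find_spec(name)
    return None if spec is None else spec.origin


# Run this with the same interpreter and run configuration as the failing script
print("Interpreter:", sys.executable)
print("matplotlib found at:", describe_module("matplotlib"))
```

If the second line prints None, the package is not installed for that interpreter, which would match the suggestion to install matplotlib or to check the run configuration.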