I am doing my school Arduino project at home, and my teacher has asked me to visualize my data for him. My x-axis has more than 20K time points to show, and I am trying to set a range for it.

The graph I am trying to achieve: desired graph

What I got up till now: current graph

It is clear that I am doing something wrong, and I have searched for a way to solve it.

The commented-out (#) lines in the plotting part of my code are what I found on the internet, but they do not work here.

Please help me, and if anything in the question is unclear, let me know!

import csv
import matplotlib.pyplot as plt
from datetime import datetime

header = []
data = []

path="DATALOG.csv"   #CHANGE filename to match file
file = open(path, newline='')
reader=csv.reader(file)

header=next(reader) #This reads the first row and stores the words in header.
data=[row for row in reader]  #This reads the rest and store the data in a list

file.close()  #closes the file when complete

# This function will print the headings with their index. Helpful when dealing with many columns
def print_header():
   for index, column_header in enumerate(header):
       print(index, column_header)

# I want the light level over time
# Key headings
# 0 date/time
# 1 temperature
# 2 humidity %
# 3 light level%

days = []
light = []

for row in range (len(data)):
    day = data[row][0]
    lights = data[row][3]
    current_date=datetime.strptime(data[row][0], "%A %H:%M:%S %d/%m/%Y")

    light.append(float(lights))        # store the light level in list
    days.append(str(day))              # store the raw timestamp string in list

fig = plt.figure(dpi=128, figsize = (50, 6))

#x = [dc[0] for dc in days]

#xticks = range(0,len(x),10)
#xlabels=[x[index] for index in xticks]
#xticks.append(len(x))
#xlabels.append(days[-1][0])
#ax.set_xticks(xtick)
#ax.set_xticklabels(xlabels,rotation=40)

plt.plot(days,light,c = 'red',label = 'light')

plt.title("LIGHT")
plt.xlabel('days',fontsize = 5)
#ax.plot(np.arange('Wednesday 12:00:00 08/01/2020',' Thursday 11:58:10 09/01/2020'), range(10))
fig.autofmt_xdate()
plt.ylabel('light (%)', fontsize = 12)
plt.tick_params(axis='both', which='major', labelsize = 10)

plt.legend()
plt.savefig('plot.png')  # save before show(); after show() the figure may already be cleared
plt.show()
JohanC
  • Are you looking for a solution using R? If so, can you provide a small reproducible example of your dataset? (See how to do it here: https://stackoverflow.com/questions/5963269/how-to-make-a-great-r-reproducible-example) – dc37 Jan 18 '20 at 21:24
  • Welcome to SO! I think you could improve your chances of getting an answer if you review the [How do I ask and answer homework questions?](https://meta.stackoverflow.com/questions/334822/how-do-i-ask-and-answer-homework-questions) and modify your question. – DCTID Jan 18 '20 at 21:25
  • Thanks for responding to me! It is really hard for me to explain my idea because I am a Python rookie and an English learner! – Conquest Conquer Jan 18 '20 at 23:43

1 Answer

Here is an example of how you can plot your data using R:

As you did not provide a reproducible example, I created a fake one based on my understanding of your code. Basically, it looks like you have a file with 4 columns: Days, Temperature, Humidity and Light_levels. Here, I only create two columns.

set.seed(123)
df <- data.frame(Day = as.character(seq.Date(from = as.Date("2019-01-01", format = "%Y-%m-%d"), to =as.Date("2019-02-01", format = "%Y-%m-%d"), by = "days" )),
                 Light_levels = sample(0:100,32, replace = TRUE))

The data frame df should look like:

> head(df)
         Day Light_levels
1 2019-01-01           30
2 2019-01-02           78
3 2019-01-03           50
4 2019-01-04           13
5 2019-01-05           66
6 2019-01-06           41

I think your issue comes from how the dates are handled. In R, it is pretty common that when you import a CSV file as a data frame, dates are read in as factor or character values rather than as dates.

You can check this by looking at the structure of the data frame with the str function:

> str(df)
'data.frame':   32 obs. of  2 variables:
 $ Day         : Factor w/ 32 levels "2019-01-01","2019-01-02",..: 1 2 3 4 5 6 7 8 9 10 ...
 $ Light_levels: int  30 78 50 13 66 41 49 42 100 13 ...
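The Python code in the question is in a similar situation: `csv.reader` returns every field as a string, including the timestamps. A minimal sketch to check this, assuming a made-up two-line sample shaped like the question's DATALOG.csv (the header names and values are hypothetical):

```python
import csv
import io

# Hypothetical sample shaped like the question's DATALOG.csv
sample = io.StringIO(
    "date/time,temperature,humidity,light\n"
    "Wednesday 12:00:00 08/01/2020,21.5,40,63\n"
)

reader = csv.reader(sample)
header = next(reader)   # first row: column names
row = next(reader)      # second row: one data record

# Every field is a str, so the timestamp is just text at this point
print([type(value).__name__ for value in row])
```

Plotting such strings directly makes matplotlib treat each one as its own category, which is probably why the x-axis ends up with one label per data point.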

So, to work with real dates, you need to convert this factor column to the Date class using:

df$Day = as.Date(df$Day, format = "%Y-%m-%d")

Now, if you check the structure of df, you will see:

str(df)
'data.frame':   32 obs. of  2 variables:
 $ Day         : Date, format: "2019-01-01" "2019-01-02" "2019-01-03" ...
 $ Light_levels: int  30 78 50 13 66 41 49 42 100 13 ...
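In Python, the equivalent conversion can be sketched with `datetime.strptime`, using the format string that already appears in the question. The two sample strings below are taken from the question's commented-out line and are only illustrative; parsing the weekday name with `%A` assumes an English locale.

```python
from datetime import datetime

# Sample timestamps in the same layout as the question's first CSV column
raw = ["Wednesday 12:00:00 08/01/2020", "Thursday 11:58:10 09/01/2020"]

fmt = "%A %H:%M:%S %d/%m/%Y"  # weekday name, time, then day/month/year
times = [datetime.strptime(text, fmt) for text in raw]

# The values are now datetime objects, so they can be compared and subtracted
print(type(times[0]).__name__)
print(times[1] - times[0])
```

In the question's loop, this would mean appending `current_date` to `days` instead of `str(day)`.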

Now, you can plot it with the package ggplot2 (you have to install ggplot2 first):

library(ggplot2)
ggplot(df, aes(x = Day, y = Light_levels))+
  geom_line()+
  scale_x_date(date_breaks = "days", date_labels = "%b %d")+
  theme(axis.text.x = element_text(angle = 45, hjust = 1))

and get the following plot: [image: ggplot2 line plot]

You can adjust how the dates are displayed by changing the arguments of scale_x_date (see more information here: https://ggplot2.tidyverse.org/reference/scale_date.html)
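For the matplotlib code in the question, a rough counterpart of `scale_x_date` is a date locator plus a date formatter from `matplotlib.dates`. The sketch below uses made-up data (one reading every 10 seconds for 24 hours) rather than the real CSV, and the 2-hour tick spacing is an arbitrary choice:

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend, so this runs without a display
import matplotlib.pyplot as plt
import matplotlib.dates as mdates
from datetime import datetime, timedelta

# Made-up data: one reading every 10 seconds over 24 hours (8640 points)
start = datetime(2020, 1, 8, 12, 0, 0)
times = [start + timedelta(seconds=10 * i) for i in range(8640)]
light = [(i % 360) / 3.6 for i in range(8640)]

fig, ax = plt.subplots(figsize=(12, 4))
ax.plot(times, light, c="red", label="light")

# One tick every 2 hours instead of one per data point
ax.xaxis.set_major_locator(mdates.HourLocator(interval=2))
ax.xaxis.set_major_formatter(mdates.DateFormatter("%a %H:%M"))

# Restrict the visible range of the x-axis
ax.set_xlim(times[0], times[-1])

fig.autofmt_xdate()  # rotate the tick labels so they do not overlap
ax.set_ylabel("light (%)")
ax.legend()
```

Because the x values are `datetime` objects rather than strings, matplotlib can space them on a real time axis, and `set_xlim` accepts two `datetime` values to show only part of the recording.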

Alternatively, without installing the ggplot2 package, you can do it using the base R plot function:

plot(x = df$Day, y = df$Light_levels, type = "l", xaxt = "n", xlab = "")
axis.Date(1,at=seq(min(df$Day), max(df$Day), by="days"), format="%b %d")

[image: base R line plot]

I hope this helps you figure out how to plot your data.

dc37
  • Hello dc37! Whoa, that is really complex `r` code! I am afraid I am not able to understand R at this point in time, but I very much appreciate your response! Thank you again for your explanation! – Conquest Conquer Jan 18 '20 at 23:49
  • You're welcome! But it is not really difficult code; once you get into R, you will find it quite easy actually. There are a lot of tutorials online that can help you get started. – dc37 Jan 19 '20 at 00:43