
I am trying to build a predictive model from a principal component analysis (PCA) that I ran on the Turbofan Engine Degradation Simulation Data Set, in this case called "General propulsion" (https://drive.google.com/drive/folders/1WiGafxzYb2Nv0yCNrXqyzbYBbzWHiVEa?usp=sharing). The dataset contains 20 engines, each with the number of cycles (and other variables) that the engine ran until it broke down. (I'm a student and don't have much experience with RStudio yet, so my code may be a little messy.)

vapply(Motor_gegevens, function(x) length(unique(x)) > 1, logical(1L))
Motor_gegevens <- Motor_gegevens[vapply(Motor_gegevens, function(x) length(unique(x)) > 1, logical(1L))]

To clean the dataset, I removed every variable that has the same value in all rows. I then split the dataset into a test set and a training set by binding Engines 1-5 into the test set and Engines 6-20 into the training set.

Motor_test <- rbind(Engine1,Engine2,Engine3,Engine4,Engine5)

Motor_train<- rbind(Engine6,Engine7,Engine8,Engine9,Engine10,Engine11,Engine12,Engine13,Engine14,Engine15,Engine16,Engine17,Engine18,Engine19,Engine20)
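
The same split can be written without listing every engine by hand. A minimal sketch, assuming the per-engine data frames are collected in a list; the `engines` list and its toy columns are hypothetical stand-ins for Engine1 to Engine20:

```r
# Hypothetical stand-in for Engine1 ... Engine20: 20 small data frames
# with identical columns (the real data has many more sensor columns).
set.seed(1)
engines <- lapply(1:20, function(i) {
  n <- 5
  data.frame(Cycle = seq_len(n), Sensor1 = rnorm(n), Sensor2 = rnorm(n))
})

# Engines 1-5 form the test set, engines 6-20 the training set.
Motor_test  <- do.call(rbind, engines[1:5])
Motor_train <- do.call(rbind, engines[6:20])

dim(Motor_test)
dim(Motor_train)
```

With the real objects, `mget(paste0("Engine", 1:20))` should build such a list from the existing variables.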

Afterwards I ran a PCA on the training set and plotted the variance of the components (98% of the variance is explained by 15 components).

PCA <- prcomp(Motor_train, scale. = TRUE)
PCA
plot(PCA, type= "l")
biplot(PCA, scale = 0)

std_dev <- PCA$sdev
pr_var <- std_dev^2
propvarex<- pr_var/sum(pr_var)
plot(propvarex, xlab = "PC", ylab = "prop of var", type = "b")
plot(cumsum(propvarex), xlab = "PC", ylab = "cum prop of var", type = "b")
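
The number of components can also be read off the cumulative proportions directly rather than from the plot. A small sketch, using the built-in `USArrests` data as a stand-in for Motor_train (the names `pca_demo`, `prop_var`, `cum_var` and `n_comp` are only illustrative); the 0.98 threshold matches the 98% mentioned above:

```r
# Stand-in data: USArrests ships with R; Motor_train would take its place.
pca_demo <- prcomp(USArrests, scale. = TRUE)

# Proportion of variance explained per component, and its running total.
prop_var <- pca_demo$sdev^2 / sum(pca_demo$sdev^2)
cum_var  <- cumsum(prop_var)

# First component index at which the cumulative proportion reaches 98%.
n_comp <- which(cum_var >= 0.98)[1]
n_comp
```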

I built an rpart model on the training data to predict Cycle from the principal components.

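# Cycle as the response, followed by the principal component scores
train.data <- data.frame(Cycle = Motor_train$Cycle, PCA$x)
# Keep Cycle plus the first 15 components
train.data <- train.data[, 1:16]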

library(rpart)
rpart.model <- rpart(Cycle ~ ., data = train.data, method = "anova")
rpart.model

Finally, I tried to predict the Cycle values of the test set using the rpart model, but the results are not what I hoped for.

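# Project the test set onto the components fitted on the training set
test.data <- predict(PCA, newdata = Motor_test)
test.data <- as.data.frame(test.data)
# Keep the first 15 components, matching the predictors in train.data
test.data <- test.data[, 1:15]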

rpart.prediction<- predict(rpart.model, test.data)
head(rpart.prediction)
       1        2        3        4        5        6 
31.74074 31.74074 31.74074 31.74074 31.74074 31.74074 

The method I used (or the script I wrote) did not give me the right results. The desired result is a model that tells me how many cycles an engine can still run until it breaks down, so I'm looking for a way to achieve this. I couldn't find a working method online or in any Stack Overflow question, so I'm trying my luck with all of you active data scientists here on Stack Overflow!
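
To make the desired output concrete: one possible way to phrase "cycles left until breakdown" as a regression target is to take, per engine, the last observed cycle minus the current cycle, and to fit the PCA on the sensor columns only. The sketch below runs on simulated data; the column names `Engine` and `Sensor1` to `Sensor3`, the helper `make_engine` and the `RUL` target are all assumptions for illustration, not part of the dataset or of the script above.

```r
library(rpart)

set.seed(42)

# Hypothetical helper: simulate one engine that runs until failure.
# Two sensor readings drift as the engine approaches its last cycle.
make_engine <- function(id) {
  n <- sample(80:120, 1)              # total cycles until breakdown
  wear <- seq_len(n) / n              # close to 0 = new, 1 = failed
  data.frame(
    Engine  = id,
    Cycle   = seq_len(n),
    Sensor1 = 10 + 5 * wear + rnorm(n, sd = 0.3),
    Sensor2 = 20 - 3 * wear^2 + rnorm(n, sd = 0.3),
    Sensor3 = rnorm(n)
  )
}
motor <- do.call(rbind, lapply(1:20, make_engine))

# Assumed target: remaining cycles = last cycle of that engine - current cycle.
motor$RUL <- ave(motor$Cycle, motor$Engine, FUN = max) - motor$Cycle

train <- motor[motor$Engine > 5, ]
test  <- motor[motor$Engine <= 5, ]

# PCA on the sensor columns only, so the target does not enter the components.
sensors <- c("Sensor1", "Sensor2", "Sensor3")
pca <- prcomp(train[, sensors], scale. = TRUE)

train_pc <- data.frame(RUL = train$RUL, pca$x)
test_pc  <- as.data.frame(predict(pca, newdata = test[, sensors]))

fit  <- rpart(RUL ~ ., data = train_pc, method = "anova")
pred <- predict(fit, newdata = test_pc)

# Compare predicted and actual remaining cycles on the held-out engines.
head(data.frame(actual = test$RUL, predicted = pred))
```

The output would then be one predicted number of remaining cycles per row of the test set, which could be written to a CSV file with `write.csv`.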

Can anyone help me out?

Thanks in advance

Ben G
Timo
  • What are your expected results and in what format? – Ben G Jan 06 '20 at 18:16
  • Hi there, sorry for the late response. I would like the result to be a model that tells me when an engine is at a critical Cycle level. All the engines in the dataset ran until they broke down, so I need a model that uses the other variables to tell when an engine requires maintenance. In this case I tried to predict this using PCA. The format could be something in Shiny, or just an Excel/CSV file. Did I answer your question, or do you need more information? Cheers! – Timo Jan 09 '20 at 16:51
  • Check this out, it'll help you ask great R questions: https://stackoverflow.com/questions/5963269/how-to-make-a-great-r-reproducible-example – Ben G Jan 10 '20 at 13:41

0 Answers