I have two data sets: one gives the school enrollment for 6 countries, the other gives the GDP of each country. I want to calculate the correlation coefficient between school enrollment and GDP for each country. I have looked at the question "How can I create a correlation matrix in R?",
but I have a problem with the dimensions of the two data sets: they have different numbers of rows and columns.
School enrollment dataset: https://drive.google.com/file/d/0B1NJGKqdrgRtTjcySzZOM2xKZU0/edit?usp=sharing
CountryName year_2000 year_2004 year_2008 year_2012
Comoros 201899884 362420484 4880000000 6800000000
Jordan 8457923945 11407566660 54082389393 58768800833
UAEmirates 104337375343 147824374543 21902892584 36044457920
Egypt 99838540997 78845185709 840000000 1240000000
Qatar 17759889598 31675273812 131611819294 210279947256
Syria 19325894913 25086930693 88882967742 95981572517
gdp dataset: https://drive.google.com/file/d/0B1NJGKqdrgRtRm9SWm9ObGpwbU0/edit?usp=sharing
Indicator com_2000 com_2004 com_2008 com_2012 Jor_2000 Jor_2004 Jor_2008 Jor_2012 ARE_2000 ARE_2004 ARE_2008 ARE_2012 Egy_2000 Egy_2004 Egy_2008 Egy_2012 Qat_2000 Qat_2004 Qat_2008 Qat_2012 Syr_2000 Syr_2004 Syr_2008 Syr_2012
preprimary (% gross) 2.39124 4.3563 23.68581 24.80515401 31.08014 32.71263 37.38376 33.81492 63.34796 81.92245 91.926025 71.14425 11.94312 15.1121 23.49822 27.3631 29.23454 32.69621 49.64917 73.42391 8.67231 10.00469 9.93459 10.6214
primary (% gross) 116.7763 121.0558 112.08 117.3767 102.3871 106.8326 102.04 98.87783 94.22761 102.304 107.5285 108.3284 101.3365 105.5968 109.9804 108.6207 104.7228 106.0118 104.0118 102.94 107.6219 121.8342 118.0423 122.2586
secondary (% gross) 31.8468 48.04706 60.04706 73.48619 85.90683 91.6662 93.89221 89.05884 45.0041 57.57103 68.905185 72.91143 85.83446 87.64275 89.48275 76.06258 86.4097 110.453 93.25074 12.14547 43.96275 66.56304 72.69195 74.42249
tertiary (% gross) 1.41838 3.00913 6.474124923 11.42145 28.28053 39.41155 44.30046 39.93893 0 0 0 0 31.62423 30.32905 31.64919 28.7532 22.565405 17.80551 11.3693 12.14547 12.00074 15.0151 24.20384 25.63541
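A minimal sketch of one way to bring the two tables into matching shape, shown for Comoros and Jordan only. It assumes the country-by-year table holds the GDP values (as the xtest subset further down does) and that the wide table holds the enrollment rates. The names gdp, enroll_wide, enroll, enroll_types and cor_by_country are made up for illustration. The idea is to fold the wide table into a type x year x country array, so that each country contributes four years of GDP against four years of each enrollment type.

```r
years        <- c("2000", "2004", "2008", "2012")
enroll_types <- c("preprimary", "primary", "secondary", "tertiary")
countries    <- c("Comoros", "Jordan")

# GDP: one row per country, one column per year (values copied from the question).
gdp <- matrix(
  c(201899884,  362420484,   4880000000,  6800000000,
    8457923945, 11407566660, 54082389393, 58768800833),
  nrow = 2, byrow = TRUE, dimnames = list(countries, years))

# Enrollment in the wide layout of the question:
# one row per type, columns com_2000..com_2012 followed by Jor_2000..Jor_2012.
enroll_wide <- matrix(
  c(2.39124,  4.3563,   23.68581,    24.80515401, 31.08014, 32.71263, 37.38376, 33.81492,
    116.7763, 121.0558, 112.08,      117.3767,    102.3871, 106.8326, 102.04,   98.87783,
    31.8468,  48.04706, 60.04706,    73.48619,    85.90683, 91.6662,  93.89221, 89.05884,
    1.41838,  3.00913,  6.474124923, 11.42145,    28.28053, 39.41155, 44.30046, 39.93893),
  nrow = 4, byrow = TRUE)

# Fold the wide table into a type x year x country array.
# R stores matrices column by column, so each block of four columns becomes one country.
enroll <- array(enroll_wide, dim = c(4, 4, 2),
                dimnames = list(enroll_types, years, countries))

# For each country, correlate its four GDP values with each enrollment type over the same years.
cor_by_country <- sapply(countries, function(k) {
  cor(gdp[k, ], t(enroll[, , k]))[1, ]
})

print(round(cor_by_country, 2))
```

The result has one row per enrollment type and one column per country. With only four years per country, each coefficient rests on four points, so it is a rough indication at best.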
The x-axis should show the years (2000, 2004, 2008, 2012) and the y-axis the enrollment type. I want a separate graph for each country (the graph link is in the comments).
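A minimal sketch of such a per-country plot with lattice, shown for Comoros and Jordan only. It reads the description as "years on the x-axis, enrollment rate on the y-axis, one line per enrollment type, one panel per country", which is an assumption about the intended graph. The names enroll_wide, enroll, long and p are made up for illustration.

```r
library(lattice)

# Enrollment in the wide layout of the question:
# one row per type, columns com_2000..com_2012 followed by Jor_2000..Jor_2012.
enroll_wide <- matrix(
  c(2.39124,  4.3563,   23.68581,    24.80515401, 31.08014, 32.71263, 37.38376, 33.81492,
    116.7763, 121.0558, 112.08,      117.3767,    102.3871, 106.8326, 102.04,   98.87783,
    31.8468,  48.04706, 60.04706,    73.48619,    85.90683, 91.6662,  93.89221, 89.05884,
    1.41838,  3.00913,  6.474124923, 11.42145,    28.28053, 39.41155, 44.30046, 39.93893),
  nrow = 4, byrow = TRUE)

# Fold into a type x year x country array with named dimensions.
enroll <- array(enroll_wide, dim = c(4, 4, 2),
                dimnames = list(type    = c("preprimary", "primary", "secondary", "tertiary"),
                                year    = c("2000", "2004", "2008", "2012"),
                                country = c("Comoros", "Jordan")))

# Long format: one row per (type, year, country) combination.
long <- as.data.frame(as.table(enroll), responseName = "enrollment")
long$year <- as.numeric(as.character(long$year))

# One panel per country, one line per enrollment type.
p <- xyplot(enrollment ~ year | country, data = long, groups = type,
            type = "b", auto.key = list(columns = 2),
            scales = list(x = list(at = c(2000, 2004, 2008, 2012))),
            xlab = "Year", ylab = "Enrollment (% gross)")
print(p)
```

Adding the remaining countries only means appending their four columns to enroll_wide and extending the country dimension accordingly.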
My code is not quite right yet, but this is my start:
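library(lattice)
# Read the two subsets; xtest has a header row, ytest does not.
xtest <- read.csv(file.choose(), header = TRUE, sep = ",")
ytest <- read.csv(file.choose(), header = FALSE, sep = ",")
xvalues <- as.matrix(xtest)
yvalues <- as.matrix(ytest)
# Correlate every column of xvalues with every column of yvalues.
corvalue <- cor(xvalues, yvalues)
# Draw the correlation matrix as an image and print the rounded values on top.
image(x = seq_len(ncol(xvalues)), y = seq_len(ncol(yvalues)), z = corvalue, xlab = "x column", ylab = "y column")
text(expand.grid(x = seq_len(ncol(xvalues)), y = seq_len(ncol(yvalues))), labels = round(c(corvalue), 2))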
As a test I took a subset of the original GDP dataset, xtest:
Comoros Comoros Comoros Comoros
201899884 201899884 201899884 201899884
362420484 362420484 362420484 362420484
4880000000 4880000000 4880000000 4880000000
6800000000 6800000000 6800000000 6800000000
And for school enrollment I took a subset of the data, ytest:
0 2.39124 4.3563 23.68581 24.80515401
99.78652 116.7763 121.0558 112.08 117.3767
0 31.8468 48.04706 60.04706 73.48619
0.82459 1.41838 3.00913 6.474124923 11.42145
Any suggestions for a better output? The resulting output is linked in the comments.