Linear regression does not give you a z statistic, as @Roland rightly commented; it gives you a t statistic. You can use the following code to save the coefficients, t statistics, and p-values to an Excel file:
library(tidyverse)
library(broom)
library(openxlsx)
# read ozone data (I converted to a text file first)
otm <- read.table("data.txt", header = TRUE)
head(otm, 2)
otm %>%
pivot_longer(-c("Year")) %>%
group_by(name) %>%
do(fitlm = tidy(lm(value ~ Year, data = ., na.action=na.omit))) %>%
unnest(fitlm) %>%
write.xlsx("Linear_trend_1.xlsx")
In the "Linear_trend_1.xlsx" file, the Year row for each location gives the slope (the estimate column), and the statistic column holds the t statistic.
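If only the slope rows are of interest, the tidy output can be filtered before it is written out. The following is a minimal sketch, not part of the original code: otm_small is a hypothetical two-column subset copied from the dput data at the end of this answer, and the names slopes, slope, and t_statistic are made up for illustration.

```r
library(dplyr)
library(tidyr)
library(broom)

# Hypothetical subset: Year plus two location columns copied from the dput() data
otm_small <- data.frame(
  Year  = 2008:2018,
  SATES = c(40.03077, 41.52302, 39.6327, 41.9098, 41.11191, 41.11191,
            41.11191, 41.11191, 41.11191, 41.57009, 41.52302),
  SAERA = c(14.38877, 14.40468, 14.22069, 14.20561, 14.35855, 13.8399,
            13.88027, 14.83054, 14.24554, 14.06201, 15.09615)
)

slopes <- otm_small %>%
  pivot_longer(-Year) %>%                      # long format: Year, name, value
  group_by(name) %>%
  do(tidy(lm(value ~ Year, data = .))) %>%     # one regression per location
  ungroup() %>%
  filter(term == "Year") %>%                   # drop the intercept rows
  select(name, slope = estimate, t_statistic = statistic, p.value)

slopes
```

The resulting data frame has one row per location and can be passed to write.xlsx in the same way as above.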
The values from your first method do not match those from the second because the two methods fit different models. The first method uses markers as the dependent variable and each column of otm_df, in turn, as the independent variable. The second method uses Year as the independent variable and each location's values as the dependent variable.
Also, you are using the sample function to create markers, which gives a different result on every run. Call set.seed first to make the result the same on every run, like this:
set.seed(123)
otm %>%
mutate(markers = sample(0:1, replace = TRUE, size = 11)) %>%
pivot_longer(-c("Year", "markers")) %>%
group_by(name) %>%
do(fitlm = tidy(lm(markers ~ value, data = ., na.action=na.omit))) %>%
unnest(fitlm) %>%
write.xlsx("Linear_trend_2.xlsx")
# A tibble: 48 x 6
name term estimate std.error statistic p.value
<chr> <chr> <dbl> <dbl> <dbl> <dbl>
1 AfgERA (Intercept) -8.58 7.79 -1.10 0.299
2 AfgERA value 0.577 0.498 1.16 0.276
3 AfgTES (Intercept) -1.62 2.99 -0.542 0.601
4 AfgTES value 0.0521 0.0750 0.695 0.505
5 AfgTOM (Intercept) -25.3 19.5 -1.30 0.225
6 AfgTOM value 1.94 1.46 1.33 0.218
7 BanERA (Intercept) -2.28 5.03 -0.453 0.662
8 BanERA value 0.201 0.370 0.543 0.600
9 BanTES (Intercept) -3.42 2.80 -1.22 0.253
10 BanTES value 0.0892 0.0644 1.39 0.199
# ... with 38 more rows
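To see why set.seed matters here, the following small sketch draws the markers twice from the same seed and compares the results. The variable names first_draw and second_draw are made up for illustration.

```r
# Same seed, same call: the two draws are identical
set.seed(123)
first_draw <- sample(0:1, size = 11, replace = TRUE)

set.seed(123)
second_draw <- sample(0:1, size = 11, replace = TRUE)

identical(first_draw, second_draw)
```

Without the set.seed calls, the two draws would generally differ, and so would every regression that uses markers.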
If I modify your first approach as follows, you can see that its output is essentially the same as the output of the code above:
# make a data frame version
otm_df <- data.frame(otm)
# fix the seed so that sample() gives the same output on every run
set.seed(123)
markers <- sample(0:1, replace = TRUE, size = 11)
# calculate OLS slope for all columns
# the -1 at end removes the intercepts
ols <- sapply(otm_df, function(x) coef(lm(markers ~ x))[-1])
Year.x BanTES.x SriTES.x AfgTES.x BhuTES.x IndTES.x
0.054545455 0.089176876 0.092629314 0.052132553 0.001602003 0.107446434
NepTES.x PakTES.x SATES.x BanERA.x SriERA.x AfgERA.x
0.065125607 -0.115108438 0.315928753 0.200712285 -0.519996739 0.577317012
BhuERA.x IndERA.x NepERA.x PakERA.x SAERA.x BanTOM.x
-0.140990921 0.110578204 -0.030546686 0.265909319 0.176797510 1.897234338
SriTOM.x AfgTOM.x BhuTOM.x IndTOm.x NepTOM.x PakTOM.x
5.380610477 1.935170281 1.761172711 2.248531891 2.107380452 2.011356580
SATOM.x
2.214848959
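The agreement between the two approaches can also be checked programmatically rather than by eye. This is a sketch under assumptions: otm_small is a hypothetical two-column subset copied from the dput data in this answer, and tidy_slopes is a made-up name.

```r
library(dplyr)
library(tidyr)
library(broom)

# Hypothetical subset: Year plus two location columns copied from the dput() data
otm_small <- data.frame(
  Year  = 2008:2018,
  SATES = c(40.03077, 41.52302, 39.6327, 41.9098, 41.11191, 41.11191,
            41.11191, 41.11191, 41.11191, 41.57009, 41.52302),
  SAERA = c(14.38877, 14.40468, 14.22069, 14.20561, 14.35855, 13.8399,
            13.88027, 14.83054, 14.24554, 14.06201, 15.09615)
)

set.seed(123)
markers <- sample(0:1, size = nrow(otm_small), replace = TRUE)

# sapply approach: regress markers on each location column, keep only the slope
ols <- sapply(otm_small[c("SATES", "SAERA")],
              function(x) coef(lm(markers ~ x))[-1])

# tidy approach: the same regressions, fitted per group in long format
otm_small$markers <- markers
tidy_slopes <- otm_small %>%
  pivot_longer(-c(Year, markers)) %>%
  group_by(name) %>%
  do(tidy(lm(markers ~ value, data = .))) %>%
  ungroup() %>%
  filter(term == "value")

# sapply names its results "<column>.x", so align the two by column name
all.equal(unname(ols[paste0(tidy_slopes$name, ".x")]), tidy_slopes$estimate)
```

Both approaches fit the same model for each column, so the comparison should report that the slopes agree.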
Here is the data in dput format:
dput(otm)
structure(list(Year = 2008:2018, BanTES = c(44.06247376, 43.81239107,
40.68010622, 46.97760506, 37.49591135, 43.81239107, 43.81239107,
43.81239107, 43.81239107, 45.27803189, 44.06247376), SriTES = c(32.07268265,
35.01918477, 29.91018035, 34.1291577, 28.5258431, 32.07268265,
32.07268265, 32.07268265, 32.07268265, 30.96753552, 32.07268265
), AfgTES = c(39.19328867, 42.06325898, 42.31015918, 40.54543762,
34.28696385, 40.54543762, 40.54543762, 40.54543762, 40.54543762,
37.38974643, 39.19328867), BhuTES = c(34.08241824, 36.95440954,
30.41561338, 30.37004394, 19.8861367, 30.41561338, 30.41561338,
30.41561338, 30.41561338, 32.09933763, 32.09933763), IndTES = c(40.05913352,
41.54741392, 38.88957844, 42.47544504, 43.24350644, 41.54741392,
41.54741392, 41.54741392, 41.54741392, 42.65820983, 42.47544504
), NepTES = c(38.12979871, 37.62785275, 34.40488247, 37.7995467,
39.64286364, 37.7995467, 37.7995467, 37.7995467, 37.7995467,
30.63632105, 37.7995467), PakTES = c(41.38388734, 41.99865359,
42.16236093, 42.51838941, 43.4444952, 42.16236093, 42.16236093,
42.16236093, 42.16236093, 44.96627251, 42.51838941), SATES = c(40.03077,
41.52302, 39.6327, 41.9098, 41.11191, 41.11191, 41.11191, 41.11191,
41.11191, 41.57009, 41.52302), BanERA = c(13.76686693, 13.904453,
13.40584856, 13.45199721, 13.47657436, 12.8992102, 13.3586098,
14.23223365, 13.4228729, 13.21487616, 14.50830571), SriERA = c(11.81852768,
11.79187354, 11.51484349, 11.50552588, 11.489789, 11.23384852,
10.61182708, 11.33951759, 11.6357584, 11.74685028, 12.14987906
), AfgERA = c(15.44115983, 15.425995, 15.6161623, 15.47751927,
15.81748069, 15.47498417, 15.41748855, 16.06462541, 15.61143062,
15.32810621, 16.39162424), BhuERA = c(14.34493453, 14.28085419,
14.24728543, 14.03202106, 14.04152053, 13.42977221, 13.22665229,
14.344052, 13.58792484, 13.28851619, 14.28029524), IndERA = c(14.08262362,
14.11485037, 13.80713493, 13.86114379, 13.92607879, 13.37996473,
13.45767152, 14.49365275, 13.88142768, 13.73986257, 14.77032404
), NepERA = c(14.93883379, 14.896056, 14.78607828, 14.50880606,
14.69793299, 13.96309811, 14.18825383, 15.32530354, 14.38700954,
13.98545482, 14.9828434), PakERA = c(14.89773191, 14.87337075,
14.89837223, 14.76236826, 15.13918051, 14.61385609, 14.54589641,
15.40150813, 14.8588883, 14.62185208, 15.6575491), SAERA = c(14.38877,
14.40468, 14.22069, 14.20561, 14.35855, 13.8399, 13.88027, 14.83054,
14.24554, 14.06201, 15.09615), BanTOM = c(9.317937851, 9.308046341,
9.327401161, 9.319338799, 9.311285019, 9.300317764, 9.292790413,
9.540946007, 9.04840374, 9.300317764, 9.317937851), SriTOM = c(5.437336445,
5.436554909, 5.435492039, 5.440690994, 5.436693192, 5.440601349,
5.427892685, 5.54946661, 5.427827358, 5.440601349, 5.437336445
), AfgTOM = c(13.31581736, 13.30339324, 13.30090284, 13.29781604,
13.33800817, 13.31919873, 13.30073023, 13.62503445, 13.16488469,
13.30073023, 13.31581736), BhuTOM = c(11.69911337, 11.67375898,
11.71142626, 11.69099903, 11.68556881, 11.68714046, 11.65387106,
11.97064924, 11.44872904, 11.67375898, 11.69099903), IndTOm = c(9.709311898,
9.704142364, 9.703938368, 9.72520479, 9.709638531, 9.708799697,
9.690851817, 9.952517961, 9.499369441, 9.704142364, 9.709638531
), NepTOM = c(12.45835066, 12.43677187, 12.48850822, 12.49002218,
12.46283817, 12.50376368, 12.44072294, 12.78685617, 12.27891684,
12.44072294, 12.46283817), PakTOM = c(12.37911913, 12.38028261,
12.37067625, 12.38315158, 12.38352468, 12.36856567, 12.37349086,
12.67422019, 12.18962786, 12.37349086, 12.38315158), SATOM = c(10.63543,
10.62967, 10.62981, 10.64489, 10.63941, 10.63525, 10.6195, 10.89613,
10.44028, 10.62967, 10.63941)), class = "data.frame", row.names = c(NA,
-11L))