Forgive me if this is obvious, but I am very new to R.
What I need to do is divide a dataset consisting of a series of 0s and 1s into five chunks and sum the 1s in each chunk.
So,
1, 1, 0, 0, 0, 0, 0, 0, 0, 1, 0, 0, 1, 0, 0, 0, 0, 0, 0, 0, 0, 1, 0, 1, 1
should result in:
2,1,1,0,3
What makes this slightly tricky is that the number of elements per vector varies, so instead of 25 ones and zeros as in the example, some vectors might have 21, some 26, some 23, etc.
Regardless of the varying length of the vectors, I would need the resulting sums in five bins.
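To make the requirement concrete, here is a minimal sketch of the kind of function I have in mind. The name `bin_sums` and the binning rule are my own assumptions: each position `i` in a vector of length `n` is assigned to bin `ceiling(i * 5 / n)`, so when the length is not a multiple of five the bins are only roughly equal in size.

```r
# Hypothetical helper: count the 1s in each of n_bins roughly equal chunks.
# Assumes x contains only 0s and 1s.
bin_sums <- function(x, n_bins = 5) {
  # Map every position to a bin number between 1 and n_bins.
  bins <- ceiling(seq_along(x) * n_bins / length(x))
  # Count how many 1s fall into each bin; bins without any 1s get 0.
  tabulate(bins[x == 1], nbins = n_bins)
}

x <- c(1, 1, 0, 0, 0, 0, 0, 0, 0, 1, 0, 0, 1, 0, 0,
       0, 0, 0, 0, 0, 0, 1, 0, 1, 1)
bin_sums(x)
# 2 1 1 0 3
```

With a length such as 21 or 23, this rule makes some bins one element longer than others, which may or may not be acceptable for the analysis.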
The reason for doing this is that I work in linguistics and digital humanities with medieval and early modern texts. I am testing whether abbreviations are more likely to occur towards the end of the line in manuscripts and early printed books. What I want to find out is whether the number in the fifth bin ends up being larger than the rest, and then run a chi-square test to determine whether the result is statistically significant.
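As a rough sketch of that last step, assuming the lines are stored as a list of 0/1 vectors, the per-line bin counts could be added up and passed to `chisq.test()`, which by default tests the totals against equal expected counts. The data and the helper `count_bins` below are made up purely for illustration.

```r
# Hypothetical data: each element is one line coded as 0/1 (1 = abbreviation).
lines <- list(
  c(1, 1, 0, 0, 0, 0, 0, 0, 0, 1, 0, 0, 1, 0, 0, 0, 0, 0, 0, 0, 0, 1, 0, 1, 1),
  c(0, 0, 0, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 0, 1),
  c(0, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 0, 1, 1)
)

# Hypothetical helper: count the 1s in each of n_bins roughly equal chunks.
count_bins <- function(x, n_bins = 5) {
  bins <- ceiling(seq_along(x) * n_bins / length(x))
  tabulate(bins[x == 1], nbins = n_bins)
}

# One row per line, one column per bin.
per_line <- t(sapply(lines, count_bins))

# Total number of abbreviations per bin across all lines.
totals <- colSums(per_line)

# Goodness-of-fit test against equal expected counts in all five bins.
# With counts this small R warns that the approximation may be incorrect.
result <- chisq.test(totals)
result
```

Because the bins are not exactly the same size when a line's length is not a multiple of five, equal expected counts are only an approximation here.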
Thank you very much in advance!
EDIT: Thanks for linking to the previous thread, Cath. My question differs from it because I need to sum up the bins (so, not by much, I guess...).