I am trying to use the rbga.bin genetic algorithm function in R.

I have a dataframe with 40 observations (rows) and 189 metrics (columns). In the evaluation function, I have to run a Principal Component Analysis on both the original dataset and the "chromosome dataset" (i.e., the dataframe restricted to the metric columns that have a 1 in the chromosome) in order to produce the fitness score.

For example, a possible solution (chromosome) is the following:

(1,1,1,0,0,...,0)

The solution dataset that I would want to run a PCA on would contain only the first 3 columns of the original dataset.

How can I refer to that "reduced" dataset inside the evaluation function?

vic

1 Answer

It seems that the argument rbga.bin (from the genalg package) passes to the evaluation function is the chromosome itself, i.e. the binary vector with one 0/1 entry per column. You can use it to build the reduced dataset as follows.

Assume chromosome is the binary vector, original is the starting dataframe, and reduced is the resulting dataframe containing only the columns whose chromosome entry is 1.

# Turn the 0/1 chromosome into a logical mask, then keep the matching columns.
keep <- chromosome == 1
reduced <- original[, keep, drop = FALSE]
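
To show how this fits inside the evaluation function, here is a minimal sketch. The data is random stand-in data with the same shape as in the question, and the fitness formula is only a placeholder (an assumption, since the question does not say how the two PCAs should be compared). The names evaluate, pca_full and score are hypothetical. As far as I know rbga.bin minimises the value returned by the evaluation function, so lower is treated as better here.

```r
# Stand-in for the real data: 40 observations x 189 metrics (random values).
set.seed(1)
original <- as.data.frame(matrix(rnorm(40 * 189), nrow = 40))

# The PCA on the full dataset never changes, so compute it once up front.
pca_full <- prcomp(original)

# Evaluation function: receives the chromosome (a 0/1 vector of length 189).
# `original` and `pca_full` are picked up from the enclosing environment.
evaluate <- function(chromosome) {
  keep <- chromosome == 1
  # An empty selection cannot be fed to prcomp, so give it the worst score.
  if (!any(keep)) return(Inf)
  reduced <- original[, keep, drop = FALSE]
  pca_reduced <- prcomp(reduced)
  # Placeholder fitness (assumption): how far the first principal component
  # of the reduced data is from that of the full data. Replace with your own.
  1 - abs(cor(pca_full$x[, 1], pca_reduced$x[, 1]))
}

# Try it by hand with the chromosome from the question: first 3 columns only.
chromosome <- c(1, 1, 1, rep(0, 186))
score <- evaluate(chromosome)

# With genalg installed, the function would be passed in roughly like this:
# result <- genalg::rbga.bin(size = ncol(original), evalFunc = evaluate)
```

Keeping drop = FALSE matters when the chromosome selects a single column, because otherwise the result collapses to a plain vector and prcomp no longer sees a dataframe.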
vic
  • Downvote for answering a question that should have been closed as unsuitable for SO. Maybe this sort of thing is ontopic for the SE BioInformatics forum. – IRTFM Jan 27 '23 at 02:08
  • SO is full of R questions, and this is one of them. It's 100% a code question. – vic Jan 29 '23 at 18:25
  • It didn't have any R code in it, no library calls to identify the non-base packages that you were or intended to use, and no test data. – IRTFM Jan 29 '23 at 22:10