3

I have a dataset with 70 foods and information about each food's nutritional value (protein/oz., fat/oz., cals/oz., etc.), as well as the food's cost/oz. I am trying to figure out--given a set budget in $--what the best combination of foods (and the amt. of each food) would be to maximize protein, minimize fat, minimize calories, etc. I aim to do this across a series of price points, and to plot each.

I found a whole bunch of different packages that could help with this here: http://cran.r-project.org/web/views/Optimization.html. However, I'm a beginner and am not sure what would be most helpful/where to start - would love some suggestions from anyone familiar with solving these kinds of optimization problems.

josliber
  • 43,891
  • 12
  • 98
  • 133
eleh915
  • 65
  • 1
  • 6
  • You will need an objective function describing how "good" it is to reach a certain goal. For example, if you want to control cost ($), calories (kCal), and protein (g), you could have a function such as f(cost,cals,prot) = cost^2 + (1000-protein)^2/3 + (14,000 - calories)^2/4 What this function does is try to get you as close to 1,000 grams of protein and 14,000 Calories, while penalizing cost of food. The goal is to minimize this function. – Max Candocia May 29 '15 at 20:59
  • As you are quite new to SO, you might want to read the info about [how to ask a good question](http://stackoverflow.com/help/how-to-ask) and how to produce a [minimal reproducible example](http://stackoverflow.com/questions/5963269/how-to-make-a-great-r-reproducible-example/5963610#5963610). This will make it much easier for others to help you. – Jaap May 29 '15 at 21:01

1 Answers1

5

This is called the diet problem and it is a popular introduction to linear programming (see, e.g., the first Google hit I found for the diet problem). Linear programming solvers through packages such as lpSolve can be used to solve many variants of the diet problem.

For instance, consider the version of the problem in the link above, where you have the following foods to choose from:

(food <- data.frame(Food=c("Corn", "2% Milk", "Wheat Bread"), CostPerServing=c(.18, .23, .05), VitaminA=c(107, 500, 0), Calories=c(72, 121, 65)))
#          Food CostPerServing VitaminA Calories
# 1        Corn           0.18      107       72
# 2     2% Milk           0.23      500      121
# 3 Wheat Bread           0.05        0       65

Assume you wanted to wanted to find the number of servings of each food that minimize total cost subject to the constraint that the total calories must be between 2000 and 2500 and the amount of Vitamin A must be between 5000 and 50000. If you defined variables X1, X2, and X2, then your objective is .18*X1 + .23*X2 + .05*X3, a linear function of the variables. Similarly each of your constraints in a linear function of the variables; for instance, the lower bound on the number of calories is a constraint of the form 72*X1 + 121*X2 + 65*X3 >= 2000.

The lp function from the lpSolve package takes as input a vector indicating the coefficients in the objective value, and information about constraints (a constraint matrix, the direction of each constraint, and the right-hand side of each constraint). For the stated problem, this would be:

library(lpSolve)
mod <- lp("min",  # min/max
          food$CostPerServing,  # Objective
          rbind(food$VitaminA, food$VitaminA, food$Calories, food$Calories),  # Constraint matrix
          c(">=", "<=", ">=", "<="),  # Constraint directions
          c(5000, 50000, 2000, 2500))

After the model has solved, you can look at the objective function and values:

mod$objval
# [1] 2.907692
mod$solution
# [1]  0.00000 10.00000 12.15385
sum(food$VitaminA * mod$solution)
# [1] 5000
sum(food$Calories * mod$solution)
# [1] 2000

The cheapest cost to meet the constraints would be $2.91, and you can achieve this by using 0 servings of corn, 10 servings of 2% milk, and 12.15 servings of wheat bread. This yields exactly 5000 units of Vitamin A and exactly 2000 calories.

josliber
  • 43,891
  • 12
  • 98
  • 133
  • 1
    Is there a way to specify a minimum number of variables for mod$solution? In your example, you need milk and wheat bread to meet the cheapest cost, but not corn. What if I wanted all 3 (or, scaling it up w/ a larger dataset, at least 10 foods, 20, 100, etc.)? – eleh915 Jun 16 '15 at 07:07