4

I have an Excel spreadsheet of the form:

A,B
X,1
X,5
Y,4
Y,11
X,7
Z,1

I would like to get the maximum value of column B for each distinct value of column A - how can I do it via pure Excel formula?

Desired result (order is irrelevant):

A,B
Y,11
X,7
Z,1

In other words, I would like Excel's version of an SQL query

SELECT   A,max(B)
FROM     myTable
GROUP BY A

Any version of Excel is acceptable, but I use 365 one.

Pivot tables are an acceptable approach (I currently did it with a pivot myself) but I would strongly prefer a real formula - the main goal of the question is to enhance my understanding of formula programming in Excel. No VBA

zx8754
  • 52,746
  • 12
  • 114
  • 209
DVK
  • 126,886
  • 32
  • 213
  • 327
  • I was hoping that this goes without saying, but clarification: formulas that rely on **hardcoding** individual values in column A into the formula aren't acceptable by any stretch of imagination. – DVK Jun 09 '15 at 13:21

4 Answers4

5

Gary's Student's answer is correct. I added a couple things.

enter image description here

Also, a link to the method: http://blog.contextures.com/archives/2011/07/27/finding-min-if-or-max-if-in-excel/

A distinct list of values can be generated in a couple ways, here's one: enter image description here

And here's links to a method or two for distinct lists. All with array formulas:

Ignore Duplicates and Create New List of Unique Values in Excel

Getting unique values in Excel by using formulas only

Community
  • 1
  • 1
Stephen Lloyd
  • 733
  • 8
  • 21
  • This is defnitely a major improvement on Garys' answer, but you didn't generate column E using formulas, so your answer seems to (at least withour showing how E is obtained) require hard-coding values as well. If that's incorrrect, please expand on E. – DVK Jun 09 '15 at 13:29
  • That array formula to generate distinct values is just returning the value of `A1` for all the cells it is entered into for me. http://i.imgur.com/q7vwXSQ.jpg – Dean MacGregor Jun 09 '15 at 14:13
  • That's interesting. I would double-check the formulas. If the exact same formula is in each cell, I can duplicate your issue. Did you paste into E2 and then drag down? Doing so should utilize the relative and absolute references correctly. – Stephen Lloyd Jun 09 '15 at 14:18
  • Ok I know what I did wrong. I entered the formula as a single array from e2 to e4 instead of entering it once and then copying it down. – Dean MacGregor Jun 09 '15 at 14:21
  • 1
    just to make things tidy I would wrap your formula in a `iserror(yourformula,"")` – Dean MacGregor Jun 09 '15 at 14:24
2

With data in columns A and B use the Array Formula:

=MAX(IF(A1:A6="x",B1:B6))

Same for "y" and "z"

Array formulas must be entered with Ctrl + Shift + Enter rather than just the Enter key.

enter image description here

Notice the braces in the Formula Bar

EDIT#1:

To generate the formulas automatically, first copy column A to column C.

Then use the Remove Duplicate feature in the Data tab:

enter image description here

Then enter the Array Formula in cell D1:

=MAX(IF(A$1:A$14=C1,B$1:B$14))

and copy down:

enter image description here

The formula is only entered once and then copied down!

Gary's Student
  • 95,722
  • 10
  • 59
  • 99
  • Am I correct in that, if I have 100 distinct values in A, I'd have to write 100 distinct formulas by hand using your solution? – DVK Jun 09 '15 at 13:15
  • @DVK Only if you want to use **formulas only.** A **Pivot Table** will generate the information with using formulas! – Gary's Student Jun 09 '15 at 13:17
  • @DVK There is a way to generate the 100 formulas automatically (*without typing each one*)! – Gary's Student Jun 09 '15 at 13:22
  • if you can show how to generate and place them into a spreadsheet using formulas, I'll not only change my vote but quite possibly accept such an answer :) – DVK Jun 09 '15 at 13:25
  • @DVK See my **EDIT#1** – Gary's Student Jun 09 '15 at 13:51
  • I think DVK was looking for a way to generate the list of distinct values using formulas. In other words wanted to avoid copying column A to column C and then using the remove duplicate tool. Stephen Lloyd found a way to do that in his answer. – Dean MacGregor Jun 09 '15 at 14:28
1

Pivot table is the correct answer.

  1. Select column A's title and drag it to the Row Labels section.
  2. Select column B's title and drag it to the Values section.
  3. Click the arrow on column B's title in the values section and choose Value Field Settings.
  4. Select Max
nwhaught
  • 1,562
  • 1
  • 15
  • 36
  • The question very explicitly notes that Pivot table isn't a "correct" answer. It's acceptable in that it does work, but the main reason for asking a question is to learn how to do it via a formula - I already know I can do it via a pivot. – DVK Jun 09 '15 at 13:19
  • The question said "Pivot tables are an acceptable approach". I assumed that meant that they were an acceptable avenue to accomplish the task. In this case, they are by far the best choice, unless you'd like to enter multiple array formulas as Gary's Student suggested.. – nwhaught Jun 09 '15 at 13:23
  • I'm not arguing against you using a pivot, but against your characterization of "correct". "Correct" is a subjective evaluation dependent on specific context - and in my question's context, it's far from "the correct". Merely "meh, OK" - I mostly added it as being OK because some people strongly hate questions with arbitrary "can't use method X" restriction by assuming they are homework questions. – DVK Jun 09 '15 at 13:26
  • Misunderstanding of how you worded the pivot table condition then. No worries. As the other answers have shown, there are some ways around them, but IMHO, they're all pretty kludgey, and won't scale well as the data set grows. – nwhaught Jun 09 '15 at 13:36
1

Try below. Note I moved your desired output table to Columns D and E. You will not need to hard-code any values into formulas.

Data

D   E
Y   11
X   7
Z   1

Formulas

E2 Formula: ={MAX(--($D1=A1:A6)*B1:B6)}
E3 Formula: ={MAX(--($D2=A2:A7)*B2:B7)}
E4 Formula: ={MAX(--($D3=A3:A8)*B3:B8)}
Blake
  • 207
  • 1
  • 3
  • 9
  • How do you generate D? – DVK Jun 09 '15 at 13:30
  • 1
    It was unclear to me that you did not already have values in a summary-like table. If you are trying to populate E without already having D you could use the following formula in cell `D2` and fill down `={INDEX($A$2:$A$7, MATCH(0, COUNTIF($D$1:D1, $A$2:$A$7), 0))}` – Blake Jun 09 '15 at 13:53
  • If you are using * then the -- is redundant, it would be sufficient to use this formula: ={MAX(($D1=A1:A6)*B1:B6)} - but generically it's better to use IF as suggested in other answers because if the maximum value is negative this formula won't give the correct answer – barry houdini Jun 09 '15 at 15:32