Excel SUMIFS, SUMPRODUCT and SUM magic

0

Man the old days of Excel are long gone, now you can generate incredibly powerful array actions so easily when doing the equivalent of database lookups. As ablebits.com explains this thing starts to look like APL:
So for instance to add all the cells in column C where column A has the text “apple” and B has “banana”, it’s a one liner. Note that the syntax C:C means the entire column C from top to bottom:
 


SUMIFS(C:C, A:A, "apple", B:B "banana")

You can also add arithmetic tests, so this is the sum of everything in C where column D is larger than 1000


SUMIFS(C:C, D:D, ">1000")

You can create really complex strings with the string concatenation, so for instance you can do date checks in column E and you can create an AND easily enough, by specifying the same column twice, so this means look for all dates in column E that are today or a week from now:


SUMIFS(C:C, E:E, ">="@TODAY(), E:E "<=@TODAY()+7)

Then there is highly complicated syntax for looking for blank cells or things that evaluate to blank, so the criteria “=” means if you have a true blank (nothing in it cell), whereas “” means look for cells that evaluate to visually blank cells like null length strings whereas “<>” mean look for non-empty cells including those with zero length strings.
As an aside, SUMIF is nice for an AND, but if you want an OR, then you can use the array syntax which is a string with braces, the first example sums column C if the strings hello or world are in column F. The next one sums if the values 1, 20 or 30 are in column G


SUMIFS($C:$C, F:F, { "hello", "world" }) SUMIFS($C:$C, G:G, {1, 20, 30}}

If you want the above to be variable then you have to use a different approach as this array notation doesn’t store correctly. Instead you have to use the SUMPRODUCT function. This thing, so assuming that the criteria are stored in say K100:K102, then this says sum all the values in column C where column G is larger than 1000 and column H has any of the strings in K100, k101 and k101


SUMPRODUCT($C:$C, --(G:G > 1000), --(ISNUMBER(MATCH(H:H, K100:K102,0)))

Why does this work, well, SUMPRODUCT basically a multiple of each list member and then adds then, so 


SUMPRODUCT( {1, 2, 3}, {4, 5, 6}) = 1*4 + 2*5 + 3*6

Or more generally SUMPRODUCT ( {l1, l2, l3} , {m1, m2, m3} ) but the real trick is what if the list is actually a boolean expression, so –(G:G > 1000) is actually a huge set of boolean values that test the entire column G looking for values > 1000, it looks conceptually like, { FALSE, TRUE, …. }. The funny operator — is basically a double negation and coerces FALSE to be a 0 and TRUE to be a 1. You can also coerce by adding a zero or multiplying by 1. Both look just a wierd to me as the — looks like a C decrement, although (G:G&gt;1000)*1 is hardly better.
The next thing is that MATCH function. This says go MATCH(list, criteria, type) so it goes through the first argument (in this case column H) and finds the first match for each the other arguments. type 0 means that the list is unsorted and it stops at the first occurance. So you will get a long string for all the items in column H that look like { 0, 2, 1, #N/A, 2, #N/A} so if it finds a match, it tells you which one in K100:K102 matches and if it doesn’t it returns a #N/A. 
So the ISNUMBER converts this all to a boolean where TRUE means that some match was found between column H and the search criteria and FALSE means it got an #N/A. 
Finally the product nows does boolean arithmetic. That is you multiple C by 1 if the G:G>1000 matches and by 1 onliy if a match was found, so this is the backwards way of getting an AND because you onlyi get a real C if all the criteria are true (eg 1s). Pretty roundabout and computationally inefficient but it is elegant to express.
Finally, SUMIFS is a relatively recent addition, the old way of doing this was by using boolean arrays and multiplications, so 


SUMIFS(C:C, A:A, "apple", B:B, "bananas") = SUM ( (C:C) * (A:A = "apple") * (B:B = "banana"))

 
 

Related Posts

© All Right Reserved