Friends, there might be situations where you need to calculate the median with conditions. You must find it a bit challenging, but here we are with the very useful tutorial to understand how to calculate the median with conditions. Before proceeding further let’s understand what is a Median.
Median is the middlemost value in a sorted dataset. One can also say that the median is the 50th Percentile of a dataset.
Unlike Average, Median is not impacted by outliers. In the above example, the outlier 30000 is making the average look very high but the median is only 3000. Such kinds of scenarios are common where few outliers can make the average look abnormally High or Low. Some examples are:
- The average Bank Balance can look very different from normal especially when expenses for the month are yet to be paid.
- The average Turnaround time for handling customer queries can be very high if some outlier’s cases took much more time than expected.
- Average Monthly Revenue can be very Low due to some months having productivity or headcount issues
Taking average in such cases where the impact of outlier is High will give us incorrect readings and may affect our analysis. Median will give us the middlemost reading regardless of how high or low the outliers are. Generally, median and average values are very close but if they are significantly different then one should check the outliers before doing any further analysis. In many cases, we can use median instead of average for doing any analysis.
Conditional Average (Averageifs)
Many a time we are required not to look at the average of the entire dataset but the average of some segments of the dataset. For Example
- Sales Efficiency: In Below Example while the overall average is 224 different salesperson has different sales efficiency (Average Sales). The cell highlighted in yellow is an outlier and such kind of efficiency may not be achieved by Sales Person 1 in the normal scenario but this does impact the average.
- Average Turnaround Time (TAT): In Below Example while the overall average is 14.8 different Types of Query have different TAT.
In the above Two Examples, we have used Averageifs Function to find out conditional average within the overall population. This kind of analysis is very useful for
- Planning (Projecting Future Sales Once Sales Person 1 gains some experience and stars delivering similar to Sales Person 3 who is Experienced)
- identifying areas of Improvement (In Example 2, We can see that Refund type of Queries are taking too much time, and improving it will have the highest impact)
- Bench-marking (In Example 2, We can identify the expected TAT for a Refund Query and set the right expectations for the customer that a refund will be processed within 36 hrs)
There can be more used cases for the above analysis. I have just listed down a few above.
Averageifs is a very powerful function that makes things easy to compute the conditional average. However, there is a medianifs function in the google sheet which makes conditional median slightly tricky. In the remaining section of the article, I will explain to you we can compute the median with conditions.
Before we jump to the solution, let’s first briefly understand Array Functions in Google Sheets.
Array Functions are generally applied over functions that return a single output, allowing them to return a range as an output.
In the above example, we see that if() Function which generally produces a single output is given a range as an input and a range as an output. Without Array Function this will give an error. Array Function enables this function to return a range as an output. The output in the range is 1 if the query Type matches else the output in the range is 0. This output is passed as an input in Sum Function which anyways takes a range as input and produces a single output. Hence we get 6 as the output which signifies the number of times the Query Type is appearing.
Now that we know some basics of Array Function, We can easily compute conditional Median as Follows :
For Example 1 :
For Example 2 :
The point here to note is that the output of the array function is passed as an input in the median function to get the conditional median.
In the first example, we have Sales Person 1 with a median Sales of ~ 100 but due to one outlier, the average sales are appearing to be ~168 which is higher and may not be the general case. In example 2, We see that the average TAT is 14.8 while the median is just 5.5 mainly because Refund queries that have a significantly higher TAT are bringing the Average TAT up. We can see in the above two examples how the median can be different from the average and that looking at the median can in some cases give a better idea of what a normal scenario looks like.
From the above examples, we can understand why the conditional median is important and how to calculate the median with conditions using ArrayFunction in Google Sheets. Hope you find this article helpful & informative. Do let us know in the comments in case you have any doubts.