In this tutorial, we will look at how to do variance analysis in Excel and what formulas to use to find the variance of a sample and of a population.
Variance is one of the most useful tools in probability theory and statistics. In science, it describes how far each number in the data set is from the mean. In practice, it often shows how much something changes. For example, temperature near the equator has less variance than in other climate zones. In this article, we will analyze different methods of calculating variance in Excel.
Variance is the measure of variability of a data set that indicates how far different values are spread. Mathematically, it is defined as the average of the squared differences from the mean.
To better understand what you are actually calculating with the variance, please consider this simple example.
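Suppose there are 5 tigers in your local zoo that are 14, 10, 8, 6 and 2 years old.
To find variance, follow these simple steps:
1. Calculate the mean (simple average) of the five numbers: (14 + 10 + 8 + 6 + 2) / 5 = 8.
2. Subtract the mean from each number to find the differences: 6, 2, 0, -2 and -6.
3. Square each difference: 36, 4, 0, 4 and 36.
4. Work out the average of the squared differences: (36 + 4 + 0 + 4 + 36) / 5 = 16.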
So, the variance is 16. But what does this number actually mean?
In truth, variance just gives you a very general idea of the dispersion of the data set. A value of 0 means there is no variability, i.e. all the numbers in the data set are the same. The bigger the number, the more spread out the data.
This example is for population variance (i.e. 5 tigers are the whole group you are interested in). If your data is a selection from a bigger population, then you need to calculate sample variance by using a slightly different formula.
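If you'd like to double-check the tiger numbers outside of Excel, here is a minimal Python sketch. It is only an illustration and not part of the Excel workflow: the variable names are our own, and the standard-library `statistics` module is used to confirm the manual arithmetic. It also shows what the sample approach would give for the same five numbers.

```python
import statistics

ages = [14, 10, 8, 6, 2]  # tiger ages from the example

mean = sum(ages) / len(ages)                      # 8.0
squared_diffs = [(x - mean) ** 2 for x in ages]   # [36.0, 4.0, 0.0, 4.0, 36.0]

# Population variance: average of the squared differences (divide by n).
population_variance = sum(squared_diffs) / len(ages)      # 16.0

# Sample variance: same sum, but divided by n - 1.
sample_variance = sum(squared_diffs) / (len(ages) - 1)    # 20.0

# The standard library agrees with the manual steps.
assert population_variance == statistics.pvariance(ages)
assert sample_variance == statistics.variance(ages)
```

Dividing by n - 1 instead of n is the only difference between the two calculations, which is why the sample figure (20) comes out larger than the population one (16).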
There are 6 built-in functions to do variance in Excel: VAR, VAR.S, VARP, VAR.P, VARA, and VARPA.
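Your choice of the variance formula is determined by the following factors:
- The version of Excel you are using.
- Whether you calculate sample or population variance.
- Whether you want to evaluate or ignore text and logical values.
The table below provides an overview of the variance functions available in Excel to help you choose the formula best suited for your needs.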
|Name||Excel version||Data type||Text and logicals|
|VAR||2000 - 2019||Sample||Ignored|
|VAR.S||2010 - 2019||Sample||Ignored|
|VARA||2000 - 2019||Sample||Evaluated|
|VARP||2000 - 2019||Population||Ignored|
|VAR.P||2010 - 2019||Population||Ignored|
|VARPA||2000 - 2019||Population||Evaluated|
VARA and VARPA differ from other variance functions only in the way they handle logical and text values in references. The following table provides a summary of how text representations of numbers and logical values are evaluated.
|Argument Type||VAR, VAR.S, VARP, VAR.P||VARA & VARPA|
|Logical values within arrays and references||Ignored||Evaluated (TRUE=1, FALSE=0)|
|Text representations of numbers within arrays and references||Ignored||Evaluated as zero|
|Logical values and text representations of numbers typed directly into arguments||Evaluated||Evaluated|
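To see what the "references" rows of this table mean in practice, here is a rough Python sketch that imitates the two behaviors. The helper names `numeric_only` and `with_text_and_logicals` and the cell values are made up for illustration; this is not how Excel is implemented, just the same rules applied to a plain list.

```python
import statistics

def numeric_only(cells):
    """Keep numbers only, the way VAR, VAR.S, VARP and VAR.P treat a reference."""
    return [c for c in cells
            if isinstance(c, (int, float)) and not isinstance(c, bool)]

def with_text_and_logicals(cells):
    """Map values the way the table describes for VARA and VARPA."""
    values = []
    for c in cells:
        if isinstance(c, bool):
            values.append(1 if c else 0)   # TRUE = 1, FALSE = 0
        elif isinstance(c, str):
            values.append(0)               # text counts as zero
        else:
            values.append(c)               # plain number, used as is
    return values

cells = [10, 20, "N/A", True, 30]  # hypothetical contents of a range

ignored = numeric_only(cells)               # [10, 20, 30]
evaluated = with_text_and_logicals(cells)   # [10, 20, 0, 1, 30]

var_s_like = statistics.variance(ignored)    # sample variance, text and logicals ignored
vara_like = statistics.variance(evaluated)   # sample variance, text and logicals evaluated
print(var_s_like, vara_like)
```

With these made-up values the first result is 100 and the second is about 164.2, so a single text cell and a single TRUE are enough to change the answer noticeably.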
A sample is a set of data extracted from the entire population. And the variance calculated from a sample is called sample variance.
For example, if you want to know how people's heights vary, it would be technically unfeasible for you to measure every person on the earth. The solution is to take a sample of the population, say 1,000 people, and estimate the heights of the whole population based on that sample.
Sample variance is calculated with this formula:
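In conventional notation (the symbols are our choice for illustration), with x̄ as the mean of the sample and n as the sample size, the sample variance is the sum of the squared differences from the mean divided by n - 1:

```latex
s^2 = \frac{\sum_{i=1}^{n} (x_i - \bar{x})^2}{n - 1}
```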
There are 3 functions to find sample variance in Excel: VAR, VAR.S and VARA.
VAR is the oldest Excel function to estimate variance based on a sample. It is available in all versions of Excel 2000 to 2019.
Note. In Excel 2010, the VAR function was replaced with VAR.S, which provides improved accuracy. Although VAR is still available for backward compatibility, it is recommended to use VAR.S in the current versions of Excel.
VAR.S is the modern counterpart of the Excel VAR function. Use it to find sample variance in Excel 2010 and later.
The Excel VARA function returns a sample variance based on a set of numbers, text, and logical values as shown in this table.
When working with a numeric set of data you can use any of the above functions to calculate sample variance in Excel.
As an example, let's find the variance of a sample consisting of 6 items (B2:B7). For this, you can use one of the formulas below:
=VAR(B2:B7)
=VAR.S(B2:B7)
=VARA(B2:B7)
Since the sample contains only numbers, all three formulas return the same result.
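To check the result, let's do the variance calculation manually:
1. Find the mean with the AVERAGE function: =AVERAGE(B2:B7)
The average goes to any empty cell, say B8.
2. Subtract the average from each number in the sample: =B2-$B$8
The differences go to column C, beginning in C2.
3. Square each difference: =C2^2
The squares can go to the next column, say column D, beginning in D2.
4. Add up the squared differences and divide the result by the number of items in the sample minus 1: =SUM(D2:D7)/(6-1)
The result of the manual calculation is exactly the same as the number returned by Excel's built-in functions.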
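The same manual check can be sketched in Python. The six scores below are made up to stand in for B2:B7, and the cell formulas in the comments are only our assumption about how each step might look in a sheet; `statistics.variance` plays the role of VAR.S here.

```python
import statistics

scores = [78, 85, 92, 64, 70, 91]   # made-up sample standing in for B2:B7

mean = sum(scores) / len(scores)              # like =AVERAGE(B2:B7) placed in B8
diffs = [s - mean for s in scores]            # like =B2-$B$8 filled down column C
squares = [d ** 2 for d in diffs]             # like =C2^2 filled down a helper column
manual = sum(squares) / (len(scores) - 1)     # sum of squares divided by n - 1

builtin = statistics.variance(scores)         # sample variance from the library

assert manual == builtin
print(manual)
```

For these sample numbers both approaches give 130, which mirrors the agreement between the manual steps and the built-in functions in the worksheet.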
If your data set contains Boolean and/or text values, the VARA function will return a different result. The reason is that VAR and VAR.S ignore any values other than numbers in references, while VARA evaluates text values as zeros, TRUE as 1, and FALSE as 0. So, please choose the variance function for your calculations carefully, depending on whether you want to process or ignore text and logicals.
Population is all members of a given group, i.e. all observations in the field of study. Population variance describes how data points in the entire population are spread out.
The population variance can be found with this formula:
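In conventional notation (again, the symbols are our choice for illustration), with μ as the population mean and N as the number of items in the population, the population variance is the sum of the squared differences from the mean divided by N:

```latex
\sigma^2 = \frac{\sum_{i=1}^{N} (x_i - \mu)^2}{N}
```

The only difference from the sample calculation is the denominator: N here, rather than the number of items minus 1.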
There are 3 functions to calculate population variance in Excel: VARP, VAR.P and VARPA.
The Excel VARP function returns the variance of a population based on the entire set of numbers. It is available in all versions of Excel 2000 to 2019.
Note. In Excel 2010, VARP was replaced with VAR.P but is still kept for backward compatibility. It is recommended to use VAR.P in the current versions of Excel because there is no guarantee that the VARP function will be available in future versions of Excel.
VAR.P is an improved version of the VARP function, available in Excel 2010 and later.
The VARPA function calculates the variance of a population based on the entire set of numbers, text, and logical values. It is available in all versions of Excel 2000 through 2019.
In the sample variance example, we found the variance of 6 exam scores, assuming those scores were a selection from a bigger group of students. If you collect data on all the students in the group, that data will represent the entire population, and you will calculate a population variance by using the above functions.
Let's say we have the exam scores of a group of 10 students (B2:B11). The scores constitute the entire population, so we will find the variance with these formulas:
=VARP(B2:B11)
=VAR.P(B2:B11)
=VARPA(B2:B11)
All the formulas return the identical result.
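To make sure Excel has done the variance right, you can check it with a manual calculation: follow the same steps as for the sample, but divide the sum of the squared differences by the number of items (10) instead of the number of items minus 1.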
If some of the students did not take the exam and have N/A instead of a score number, the VARPA function will return a different result. The reason is that VARPA evaluates text values as zeros while VARP and VAR.P ignore text and logical values in references. Please see VAR.P vs. VARPA for full details.
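Here is a small Python sketch of both points: the manual check of a population variance, and the effect of an N/A entry. The ten scores are made up for illustration, and the two list comprehensions only imitate the "ignore text" and "text as zero" rules described above; they are not Excel's actual implementation.

```python
import statistics

scores = [70, 80, 90, 60, 75, 85, 65, 95, 55, 75]  # made-up scores for 10 students

# Manual check: average of the squared differences, dividing by n (not n - 1).
mean = sum(scores) / len(scores)
manual = sum((s - mean) ** 2 for s in scores) / len(scores)
builtin = statistics.pvariance(scores)
assert manual == builtin   # both equal 150 for these numbers

# Now suppose the last student has "N/A" instead of a score.
with_text = scores[:-1] + ["N/A"]
ignored = [s for s in with_text if not isinstance(s, str)]       # VARP / VAR.P style
as_zero = [0 if isinstance(s, str) else s for s in with_text]    # VARPA style

var_ignored = statistics.pvariance(ignored)   # text dropped: 9 values remain
var_as_zero = statistics.pvariance(as_zero)   # text counted as 0: still 10 values
print(var_ignored, var_as_zero)
```

With these numbers, ignoring the N/A entry gives roughly 166.67, while counting it as zero gives 656.25, because the zero sits far below every real score and inflates the spread.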
To do variance analysis in Excel correctly, please follow these simple rules:
Variance is undoubtedly a useful concept in science, but it gives very little practical information. For instance, we found the ages of the population of tigers in a local zoo and calculated the variance, which equals 16. The question is - how can we actually use this number?
You can use variance to work out standard deviation, which is a much better measure of the amount of variation in a data set.
Standard deviation is calculated as the square root of the variance. So, we take the square root of 16 and get the standard deviation of 4.
In combination with the mean, the standard deviation can tell you how old most of the tigers are. For example, if the mean is 8 and the standard deviation is 4, the majority of the tigers in the zoo are between 4 years (8 - 4) and 12 years (8 + 4).
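As a quick sanity check of the tiger numbers, here is a minimal Python sketch. The variable names are ours, and the standard-library functions simply stand in for the arithmetic described above.

```python
import math
import statistics

ages = [14, 10, 8, 6, 2]  # tiger ages from the example

variance = statistics.pvariance(ages)   # population variance
std_dev = math.sqrt(variance)           # square root of the variance
mean = statistics.mean(ages)

# One standard deviation either side of the mean.
low, high = mean - std_dev, mean + std_dev
within = [a for a in ages if low <= a <= high]

print(std_dev, low, high, within)
```

Three of the five tigers (aged 10, 8 and 6) fall between 4 and 12 years, which is what "the majority" means in this small example.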
Microsoft Excel has special functions for working out standard deviation of a sample and population. The detailed explanation of all the functions can be found in this tutorial: How to calculate standard deviation in Excel.
That's how to do variance in Excel. To have a closer look at the formulas discussed in this tutorial, you are welcome to download our sample workbook at the end of this post. I thank you for reading and hope to see you on our blog next week!
Calculate Variance in Excel - examples (.xlsx file)