Average vs Median: Which One Is The Best Choice For Real Estate Statistics?


When you're working with real estate statistics - whether it's for a listing presentation, a market report, or just data to share on social media - it's important to understand the difference between the average and the median of the statistics you're quoting.

Now, commonly in real estate, most people use averages. 

They quote things like average price per square foot, average days on market, average sale price, etc.

But it's important to understand that averages and medians can display two very different pictures of the data that you're analyzing. 

In this post, we're going to break down the difference between the average and the median and talk to you about why you should be using the median instead of the average.

Now, first of all, it's important to understand that most real estate statistics that you have been quoting, or that you see quoted anywhere else, are most likely averages.

It's the industry standard. 

But the question then becomes, “why should we be using median instead of average if the standard has always been the average?” 

To answer that, we're going to break down the two different types of statistics and highlight their strengths and weaknesses.

Average vs Median

So first thing, let's jump in and talk about averages and medians, and how they compare.

An average is just the total of the data divided by the number of data points.

So if you have 10 data points - for example, 10 sale prices - you take all 10 sale prices, add them up, and divide by 10 to get the average sale price. 
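If you'd like to see that arithmetic spelled out, here is a minimal Python sketch. It borrows the ten sale prices from the skewed data example later in this post; the variable names are just ones I picked for illustration.

```python
# Ten sale prices, borrowed from the skewed data example later in this post.
sale_prices = [500_000, 502_000, 504_000, 506_000, 508_000,
               510_000, 512_000, 530_000, 532_000, 534_000]

# Average: add up every sale price, then divide by the number of sales.
average_price = sum(sale_prices) / len(sale_prices)

print(f"Average sale price: ${average_price:,.0f}")  # prints "Average sale price: $513,800"
```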

The idea of the average is to try and find the middle point of the data. And while the average might seem like a good way to find the middle point, it can be highly skewed depending on the data. 

We'll talk about that more in a minute…

A median, in comparison, is just the actual middle point of the data. 

So again, if you have 10 data points sorted from lowest to highest, the middle point is going to fall halfway between point five and point six. If there's an odd number of data points, it falls exactly on the middle one.
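Here is a rough sketch of that rule in Python, assuming the data is just a plain list of numbers. The function is my own illustration of the even and odd cases, and the sale prices are again borrowed from the skewed data example later in this post.

```python
def median(values):
    """Return the middle point of a list of numbers."""
    if not values:
        raise ValueError("median() needs at least one value")
    ordered = sorted(values)  # the middle point only makes sense on sorted data
    mid = len(ordered) // 2
    if len(ordered) % 2 == 1:
        return ordered[mid]  # odd count: the single middle value
    # Even count: halfway between the two middle values.
    return (ordered[mid - 1] + ordered[mid]) / 2


ten_sales = [500_000, 502_000, 504_000, 506_000, 508_000,
             510_000, 512_000, 530_000, 532_000, 534_000]

print(median(ten_sales))      # 509000.0 (halfway between sale #5 and sale #6)
print(median(ten_sales[:9]))  # 508000 (nine sales, so it lands exactly on sale #5)
```

Python's built-in `statistics.median` follows the same rule, so in practice you would not need to write this yourself.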

You can see that by using a median you are actually getting the middle point of the data, whereas the average is getting you a middle point that is more likely to be skewed. 

Now that we know what the difference is between them, let's talk about when an average could potentially be used well. 

The main case where this works is when your data is pretty symmetrical.

For example, let's say you have a set of comparable properties that have sales ranging from $500,000 to $600,000 and where there are sales pretty evenly distributed throughout that range. 

In this case, the data would not be skewed towards one end or another, and there are no significant outliers on either end of the range.

In this example, because the data is evenly distributed throughout the range, you're going to get a pretty good estimate of the middle point when you use the average.
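To make that concrete, here is a small sketch with a made-up set of comps. The prices are hypothetical, spaced evenly every $10,000 from $500,000 to $600,000, and are only there to show what symmetrical data looks like.

```python
from statistics import mean, median

# Hypothetical comps, evenly spaced every $10,000 from $500,000 to $600,000.
even_sales = list(range(500_000, 600_001, 10_000))

print(f"Average: ${mean(even_sales):,.0f}")
print(f"Median:  ${median(even_sales):,.0f}")
```

With data this evenly spread, both numbers land on the same $550,000 middle point, so it would not matter which one you quoted.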

Unfortunately, most times real estate data is not this symmetrical!

If you do a property search in a neighborhood, and you're looking up 2-story design homes that are 3,000 sqft to 4,000 sqft and have a two-car or three-car garage, you might get price ranges that vary hundreds of thousands of dollars. 

Or you might have a data set where most of the data falls on either the high end or the low end of the range, with only a few data points falling throughout the rest of the range.

In this case, you have one of two things:

1) You have a data set that is skewed in one direction or another.

For example, if more than 50% of your homes all fall in the lower $50,000 of a $200,000 range, the data would be skewed towards the bottom.

In this case, most of your data points fall into this segment, with the remaining data scattered throughout the rest of the range. 

Since this segment is on the low end, it’s possible the average could be above, and potentially significantly above, the top end of this segment, even though a majority of the data for the entire price range is represented by this segment. 

2) You have a data set with significant outliers.

Even if your data is pretty symmetrical, there is the possibility that an extreme outlier, or a few, could significantly impact the average. 

While you can avoid these issues with a very good data set, it's hard to avoid them completely. 

So, now let's talk about the median.

First off, as we'd said, the median is just the middle point of the data, right? 

So when should it be used? 

In our opinion, the median should be used at all times!

The reason is that even in an evenly distributed, symmetrical data set, the median gives you a middle point that is at least as good as the one you get from the average.

But when you get to the scenario where you have skewed data or data with outliers, the median gives a significantly more accurate picture of the typical data point.

Skewed Data Example

Let's look at an example of skewed data. 

In this example, we have ten homes that sold with the following sale prices:

  • Sale #1: $500,000
  • Sale #2: $502,000
  • Sale #3: $504,000
  • Sale #4: $506,000
  • Sale #5: $508,000
  • Sale #6: $510,000
  • Sale #7: $512,000
  • Sale #8: $530,000
  • Sale #9: $532,000
  • Sale #10: $534,000

The median would have a price of $509,000, which falls right between sale #5 and sale #6. But the average would have a price of $513,800.   

Now, as you can see, by using the average we actually have a higher sale price than if we used the median, with the average sale price actually falling above 70% of the data. 

We had seven data points that all topped out at $512,000, but our average sale price is coming in over $513,000. And it's because even though we have 10 data points, they are skewed towards the low end of the price range. 

By using the median and getting the $509,000 price, we're actually more accurately representing the typical data point that is in our data set than if we use the $513,800 average. 
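If you want to check these numbers yourself, here is a quick sketch using Python's built-in `statistics` module and the ten sale prices listed above. The variable names are my own.

```python
from statistics import mean, median

# The ten sale prices from the skewed data example.
skewed_sales = [500_000, 502_000, 504_000, 506_000, 508_000,
                510_000, 512_000, 530_000, 532_000, 534_000]

avg = mean(skewed_sales)
mid = median(skewed_sales)

# Count how many sales closed below each figure.
below_avg = sum(price < avg for price in skewed_sales)
below_mid = sum(price < mid for price in skewed_sales)

print(f"Average: ${avg:,.0f} ({below_avg} of {len(skewed_sales)} sales fall below it)")
print(f"Median:  ${mid:,.0f} ({below_mid} of {len(skewed_sales)} sales fall below it)")
```

The average sits above seven of the ten sales, while the median splits the sales evenly, five below and five above.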

Outlier Data Example

Let's look at another example. This time let's look at a data set with outliers. 

In this case, we again have 10 sales, but this time with the following sale prices:

  • Sale #1: $500,000
  • Sale #2: $502,000
  • Sale #3: $504,000
  • Sale #4: $506,000
  • Sale #5: $508,000
  • Sale #6: $510,000
  • Sale #7: $512,000
  • Sale #8: $514,000
  • Sale #9: $580,000
  • Sale #10: $590,000 

In this example, we still have a median price of $509,000. 

It still falls between points 5 and 6, and that $509,000 price is still very representative of the overall data set, in that 80% of our data falls at or below $514,000.

But if we look at the average price in this data set, we see that the price of $522,600 is significantly above 80% of the data points, and it's all because of those two outliers.
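One way to see how much those two sales are pulling on each number is to run the statistics with and without them. This is only a sketch to show the sensitivity, using the ten sale prices listed above; it is not a suggestion that you should always drop those sales.

```python
from statistics import mean, median

# The ten sale prices from the outlier example, sorted low to high.
outlier_sales = [500_000, 502_000, 504_000, 506_000, 508_000,
                 510_000, 512_000, 514_000, 580_000, 590_000]

# The same list with the two highest sales (#9 and #10) left out.
without_outliers = outlier_sales[:-2]

avg_shift = mean(outlier_sales) - mean(without_outliers)
mid_shift = median(outlier_sales) - median(without_outliers)

print(f"Average with outliers:    ${mean(outlier_sales):,.0f}")
print(f"Average without outliers: ${mean(without_outliers):,.0f}")
print(f"Median with outliers:     ${median(outlier_sales):,.0f}")
print(f"Median without outliers:  ${median(without_outliers):,.0f}")
print(f"The outliers move the average by ${avg_shift:,.0f} and the median by ${mid_shift:,.0f}")
```

Those two sales move the average by $15,600 (from $507,000 to $522,600) but move the median by only $2,000 (from $507,000 to $509,000).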

Now you might be saying, “Jeff, those outliers did potentially affect the market, right? They are sales in the market.”

While that might be true, it's likely that there is something significantly different about those two properties.

We all know that a home that sells for $510,000 and one that sells for $590,000 are most likely not similar. Maybe one is more updated, or it's larger, or it has a larger lot, or it's from a better quality builder, or there's something else about it that makes it not a great comparable.

Ideally, you would like to be able to exclude this type of data from your data set. 

And, while you can do that with excellent search criteria - by trying to narrow out anything that does not seem like it would be a good comp - at times you're just not going to be able to get rid of some of these outliers or some of this type of skewed data. 

So in this scenario, when you're looking at whether to use the median or the average for things like price per square foot, sale price, or days on market, it's important to realize that by using the median you're protecting yourself from the fact that your data set might not always be as clean or as good as you want it to be. 

If you want to use averages, you would need to make sure that your data set was only made up of extremely similar properties, was cleaned up to remove any outliers, and wasn't skewed in one direction or another.

In that case, you would still be able to get a good representation of value when using the average. 

But, by using the median, you can avoid all these problems while still factoring in some of the comparables that might be outliers.

The reason we sometimes don't want to exclude those properties is that they may still be good comparables once significant adjustments have been made for things like square footage, view, condition, quality, etc.

Like I said a little bit ago, this doesn't just apply to the sale prices. 

I know we use those as examples, but this can apply the same way to things like days on market, price per square foot, list-to-sale price ratio, or any of the other statistics that you're using in any of your marketing materials or in your conversations with buyers or sellers.
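As a rough illustration of the same idea for days on market, here is a sketch with made-up numbers. These values are hypothetical and are not from any real market; they just show what one slow-selling listing can do.

```python
from statistics import mean, median

# Hypothetical days-on-market figures for eight sales (illustrative only).
# Seven homes sold within three weeks; one sat on the market for 95 days.
days_on_market = [4, 6, 7, 9, 11, 14, 21, 95]

print(f"Average days on market: {mean(days_on_market):.1f}")
print(f"Median days on market:  {median(days_on_market):.1f}")
```

In this made-up set, the single 95-day listing pulls the average to almost 21 days, while the median stays at 10 days, which is much closer to how quickly the typical home sold.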

If you use the median you're going to be doing a better job of actually representing the data that you're analyzing and sharing with your clients.

I understand that the average is the typical thing. And most places where you're going to find statistics quoted, you're going to find averages. 

This is why we would encourage you to actually run your own numbers. 

Whether that's through Excel, or in a tool like PropertyBrain, it’s important to calculate your own median statistics for any property you are analyzing!

Ultimately, if you want to be the best agent possible and provide the best quality service to your clients, you need to break these numbers down with a method that shows you are going above and beyond, and that you are the knowledgeable professional who can analyze this type of data better.

Now you might not be a numbers nerd like me and I get that. But ultimately, your job as a real estate agent, real estate appraiser, or whatever you are in the real estate industry, is to be able to look at real estate data and determine what it means. 

What does that mean for a sale price? 

What does that mean for a listing price? 

What does that mean on an appraisal? 

By analyzing the data better, you're going to be able to do your job better, and you're going to be able to provide better service to your clients.

I believe that one of the best ways you can do that is to really grasp this concept of using median instead of average. 

If you keep using the averages, for the most part, it works out okay. But I know you're not reading this post because you just want to have your business be okay, right?

You want to build the best business possible.

You want to be the best realtor, the best appraiser, or the best investor.

You want to be the best!

That’s why you need to make sure you're taking the correct steps to use the correct data. 

And using the median is definitely better than using the average.

Conclusion

I hope that gives you a better idea of medians and averages, and why I would encourage you to use the median from this point forward. 

But, if you have any additional questions, I'd love to answer them!

Shoot me an email, send me a direct message, leave a message on the YouTube video, or do whatever you want to do.

I'd be happy to answer any questions you have to help you figure out if going the route of calculating your own statistics and coming up with the medians is the right thing for you.
