Python for Airbnb hosts exploration: Boston VS Seattle

K Benaggoune
Published in Nerd For Tech
5 min read · Jun 27, 2021

An exploration of Airbnb listings in Seattle and Boston, based on three business questions about pricing, property type, and neighborhood impact.

Context

Airbnb is an online marketplace that connects people who want to rent out their homes with people looking for accommodation in that area. The firm does not own the listings on the app; rather, it operates as a matchmaker and earns a commission on each booking. The company was founded in 2008 and is based in San Francisco, California, USA.

Airbnb's data is a good place to start data exploration and explanation. In 2016, Murray Cox, an independent digital storyteller fascinated by Airbnb data, launched an investigative site called Inside Airbnb, which collects and visualizes data scraped from Airbnb listings.

In this article, I will focus only on listings data from Seattle and Boston. I will conduct a comparison study on both datasets to answer three business questions.

  • Which city is more expensive?
  • Which property types are most hosted?
  • Which are the most expensive and cheapest neighborhoods in Seattle and Boston?

For more information, you can check the code in my GitHub: https://github.com/khaledbenag/Airbnb_price_prediction

Price: Seattle vs Boston

Cities differ in many ways, including rental prices. In this section, I compare the prices of Airbnb listings in Seattle and Boston. First, the price column is not in the right format and needs to be converted to a numeric type. Moreover, because the distribution of the data is right-skewed, prices above 500 dollars are treated as outliers and removed. After both datasets are cleaned, we have 3786 listings from Seattle and 3495 from Boston.
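The cleaning steps described above can be sketched as follows. This is a minimal sketch, not the code from the repository: it assumes the price column is named `price` and holds strings such as "$1,250.00", and the helper name `clean_price` and the toy data are purely illustrative. The 500-dollar cutoff comes from the text.

```python
import pandas as pd


def clean_price(listings: pd.DataFrame, max_price: float = 500.0) -> pd.DataFrame:
    """Convert a string price column to float and drop listings above max_price.

    Assumes prices look like "$1,250.00" (an assumption about the raw data).
    """
    cleaned = listings.copy()
    # Strip the currency symbol and thousands separators, then cast to float.
    cleaned["price"] = (
        cleaned["price"]
        .astype(str)
        .str.replace(r"[$,]", "", regex=True)
        .astype(float)
    )
    # Keep only listings at or below the cutoff; missing prices are dropped too.
    return cleaned[cleaned["price"] <= max_price].reset_index(drop=True)


# Toy data standing in for the real listings file.
toy = pd.DataFrame({"price": ["$85.00", "$1,250.00", "$150.00"]})
cleaned_toy = clean_price(toy)
```

A fixed cutoff is the simplest way to handle the long right tail; a percentile-based threshold would be an alternative if the two cities needed different limits.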

The plot_multiple_hist function plots the two distributions in one figure.
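The original function is not reproduced in this text. A rough sketch of what such a helper might look like is given here; the signature, the bin count, the axis labels, and the toy prices are all assumptions, not the author's implementation.

```python
import matplotlib.pyplot as plt


def plot_multiple_hist(series_by_label, bins=50, xlabel="Price ($)"):
    """Overlay one histogram per labelled series on a shared axis.

    Hypothetical signature: series_by_label maps a legend label to its values.
    """
    fig, ax = plt.subplots(figsize=(8, 4))
    for label, values in series_by_label.items():
        # alpha < 1 keeps overlapping distributions visible.
        ax.hist(values, bins=bins, alpha=0.5, label=label)
    ax.set_xlabel(xlabel)
    ax.set_ylabel("Number of listings")
    ax.legend()
    return ax


# Toy prices, not the real data.
ax = plot_multiple_hist(
    {"Seattle": [60, 95, 120, 200], "Boston": [90, 150, 180, 300]},
    bins=10,
)
```

Drawing both cities on one axis with partial transparency makes the shift between the two distributions easier to see than two separate figures would.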

Boston prices are somewhat higher and concentrated between 70 and 300 dollars, while most Seattle prices fall between 10 and 200 dollars.

To compare the prices more precisely, let's plot the mean, the median, and the third quartile.

Based on the median, half of the Boston listings cost more than 150 dollars, while the median in Seattle is 100 dollars.
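The three statistics can be computed with pandas before plotting them. The snippet below is a sketch under assumptions: the helper name `price_summary` and the toy prices are illustrative, and the real input would be the cleaned price column of each city.

```python
import pandas as pd


def price_summary(prices_by_city: dict) -> pd.DataFrame:
    """Return mean, median, and third quartile of the prices for each city."""
    rows = {
        city: {
            "mean": prices.mean(),
            "median": prices.median(),
            "q3": prices.quantile(0.75),  # third quartile
        }
        for city, prices in prices_by_city.items()
    }
    # One row per city, one column per statistic.
    return pd.DataFrame.from_dict(rows, orient="index")


# Toy prices, not the real data.
toy_prices = {
    "Seattle": pd.Series([50.0, 100.0, 150.0, 200.0]),
    "Boston": pd.Series([100.0, 150.0, 200.0, 250.0]),
}
summary = price_summary(toy_prices)
```

Calling `summary.plot.bar()` on the result would give a grouped bar chart with one group per city.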

What type of property is most hosted?

Airbnb is just a service that connects hosts with people looking for accommodation; therefore, we can find multiple types of properties. Seattle hosts offer 16 types of properties, while Boston hosts offer just 13. 'Entire Floor', 'Guesthouse', and 'Villa' can be found only in Boston, while 'Bungalow', 'Cabin', 'Chalet', 'Tent', 'Treehouse', and 'Yurt' can be found only in Seattle. The other property types are common to both cities.

Now, let's plot how often each type is hosted in both cities.

Hosting is unbalanced: in both cities, the majority of listings fall into six types: 'Apartment', 'House', 'Condominium', 'Townhouse', 'Bed & Breakfast', and 'Loft'. These six property types cover 97.64% of the listings in Seattle and 98.74% in Boston.
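The comparison of property types boils down to set differences. A minimal sketch, assuming each dataset has a property type column passed in as a pandas Series (the helper name `compare_property_types` and the toy values are illustrative):

```python
import pandas as pd


def compare_property_types(seattle_types: pd.Series, boston_types: pd.Series):
    """Return (only in Seattle, only in Boston, shared) property types as sets."""
    seattle_set = set(seattle_types.dropna().unique())
    boston_set = set(boston_types.dropna().unique())
    return seattle_set - boston_set, boston_set - seattle_set, seattle_set & boston_set


# Toy samples, not the real listings.
seattle_toy = pd.Series(["Apartment", "House", "Yurt", "Apartment"])
boston_toy = pd.Series(["Apartment", "Villa", "House"])
only_seattle, only_boston, shared = compare_property_types(seattle_toy, boston_toy)
```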
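A coverage figure like the ones above can be obtained from normalized value counts. The sketch below assumes a property type column passed in as a pandas Series; the helper name `top_k_share` and the toy data are illustrative, and it takes the k most frequent types rather than a fixed list of names.

```python
import pandas as pd


def top_k_share(property_types: pd.Series, k: int = 6) -> float:
    """Percentage of listings covered by the k most frequent property types."""
    # value_counts sorts by frequency, most common first.
    shares = property_types.value_counts(normalize=True)
    return float(shares.head(k).sum() * 100)


# Toy data: 5 apartments, 3 houses, 1 loft, 1 yurt.
toy_types = pd.Series(["Apartment"] * 5 + ["House"] * 3 + ["Loft", "Yurt"])
share_top2 = top_k_share(toy_types, k=2)
```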

Price by Neighborhood

If you plan to visit one of these cities, you may wonder which neighborhoods are most hosted and which are most expensive. Based on the cleaned listings, we have 81 neighborhoods in Seattle and 30 neighborhoods in Boston.

First, let's plot the 10 most hosted neighborhoods against their mean prices in both cities.

Allston-Brighton in Boston and Capitol Hill in Seattle are the most hosted neighborhoods by a wide margin, each with a mean price of around 120 dollars.

Now let's check how often the 10 most expensive neighborhoods are hosted.

In Seattle, the most expensive neighborhoods are rarely hosted. In Boston, however, the Back Bay neighborhood has many listings even though it is quite expensive. This may be explained by factors that distinguish this neighborhood from the others.

Finally, if you are looking for the cheapest neighborhoods, you can go for 'Georgetown', 'Rainier Beach', 'Dunlap', 'Olympic Hills', or 'Roxhill' in Seattle, or for 'Dorchester', 'Hyde Park', 'Somerville', 'Mattapan', or 'Chestnut Hill' in Boston.
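The numbers behind such a plot come from a group-by on the neighborhood column. The snippet below is a sketch under assumptions: the column name `neighbourhood_cleansed`, the helper name `most_hosted`, and the toy data are illustrative and may differ from the actual notebook.

```python
import pandas as pd


def most_hosted(listings: pd.DataFrame, n: int = 10,
                neighborhood_col: str = "neighbourhood_cleansed") -> pd.DataFrame:
    """Listing count and mean price for the n neighborhoods with the most listings."""
    stats = listings.groupby(neighborhood_col)["price"].agg(
        n_listings="count", mean_price="mean"
    )
    return stats.sort_values("n_listings", ascending=False).head(n)


# Toy data, not the real listings.
toy_listings = pd.DataFrame({
    "neighbourhood_cleansed": ["A", "A", "A", "B", "B", "C"],
    "price": [100.0, 120.0, 140.0, 200.0, 300.0, 50.0],
})
top2 = most_hosted(toy_listings, n=2)
```

Keeping the count and the mean in one table makes it easy to plot the counts as bars and overlay the mean price on a second axis.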
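Ranking neighborhoods by mean price, for either the cheapest or the most expensive end, can be sketched as follows. The column name `neighbourhood_cleansed`, the helper name `rank_by_mean_price`, and the toy data are assumptions for illustration only.

```python
import pandas as pd


def rank_by_mean_price(listings: pd.DataFrame, n: int = 5, cheapest: bool = True,
                       neighborhood_col: str = "neighbourhood_cleansed") -> pd.Series:
    """Mean price of the n cheapest (or most expensive) neighborhoods."""
    mean_price = listings.groupby(neighborhood_col)["price"].mean()
    return mean_price.nsmallest(n) if cheapest else mean_price.nlargest(n)


# Toy data, not the real listings.
toy_listings = pd.DataFrame({
    "neighbourhood_cleansed": ["A", "A", "A", "B", "B", "C"],
    "price": [100.0, 120.0, 140.0, 200.0, 300.0, 50.0],
})
cheapest2 = rank_by_mean_price(toy_listings, n=2)
priciest1 = rank_by_mean_price(toy_listings, n=1, cheapest=False)
```

Note that a mean computed from only a handful of listings is fragile, so it may be worth checking the listing count of each neighborhood before drawing conclusions from such a ranking.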

Conclusion

In this article, we explored Airbnb data from Seattle and Boston to understand three areas of interest: pricing, property type, and neighborhood impact. While we found useful information at each level, many questions remain, such as which characteristics make each neighborhood different from the others and how much impact each one has. In addition, further inspection of the listings by seasonality could yield more information for selecting the best features for the price prediction task.

For more insights and code, please check my repo: https://github.com/khaledbenag/Airbnb_price_prediction

K Benaggoune
I have a PhD in industrial computing. I am experienced with predictive models and medical imaging techniques. Likewise, I love chess and puzzles.