
Buffalo Reality - Data Engineering project

Published: at 03:57 PM

Buffalo Housing data - a data engineering project that uses web scraping, Databricks, Azure Blob Storage, Azure SQL Database and Tableau to analyze and visualize the housing market in Buffalo.

View dashboard

Github


Architecture


Source

HomeHarvest - https://github.com/Bunsly/HomeHarvest

HomeHarvest is a real estate scraping library that extracts and formats data in the style of MLS listings. It scrapes data from Zillow, Redfin and realtor.com.

Databricks

Databricks is used to load, process and transform the data before it is written to the Azure SQL database.

The JSON data is extracted and loaded into a dataframe, then cleaned, transformed and loaded into the warehouse.
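The cleaning step could look roughly like the sketch below. It is not the project's notebook code: it uses plain Python in place of a Spark dataframe so that it runs anywhere, and the field names (`property_url`, `list_price`, `sqft`, `city`), the sample records and the `clean_listings` helper are all assumptions made for illustration.

```python
import json

# Hypothetical sample of scraped listings; the real payload is not shown in the post.
RAW_JSON = """
[
  {"property_url": "https://example.com/1", "list_price": "250000", "sqft": "1250", "city": " buffalo "},
  {"property_url": "https://example.com/1", "list_price": "250000", "sqft": "1250", "city": " buffalo "},
  {"property_url": "https://example.com/2", "list_price": null, "sqft": "900", "city": "Buffalo"},
  {"property_url": "https://example.com/3", "list_price": "180000", "sqft": "0", "city": "BUFFALO"}
]
"""


def clean_listings(raw_json: str) -> list[dict]:
    """Parse scraped JSON, drop unusable rows, and derive price per square foot."""
    seen = set()
    cleaned = []
    for row in json.loads(raw_json):
        url = row.get("property_url")
        price = row.get("list_price")
        # Skip rows without a key or a price, and duplicates of an earlier row.
        if not url or price is None or url in seen:
            continue
        seen.add(url)
        price = float(price)
        sqft = float(row.get("sqft") or 0)
        cleaned.append({
            "property_url": url,
            "city": (row.get("city") or "").strip().title(),
            "list_price": price,
            "sqft": sqft,
            # Leave the ratio empty when the square footage is missing or zero.
            "price_per_sqft": round(price / sqft, 2) if sqft > 0 else None,
        })
    return cleaned


listings = clean_listings(RAW_JSON)
```

In a Databricks notebook the same steps would more likely be expressed as dataframe operations (deduplicate, drop nulls, add a derived column); the logic is the same either way.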

Azure SQL DB

Azure SQL Database is used as the data warehouse: it stores the raw data and is connected to Databricks for transformations. The updated data is merged into the final table, and Tableau reads the final table to update the dashboard.
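The merge into the final table can be illustrated with a small upsert sketch. The example uses an in-memory SQLite database as a stand-in so that it runs without an Azure connection; on Azure SQL the equivalent would typically be a T-SQL `MERGE` statement. The table name `listings_final`, its columns and the `merge_listings` helper are hypothetical.

```python
import sqlite3


def merge_listings(conn, rows):
    """Upsert staged rows into the final table, keyed on property_url."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS listings_final ("
        "property_url TEXT PRIMARY KEY, list_price REAL, status TEXT)"
    )
    # Insert new listings; update price and status for listings already present.
    conn.executemany(
        """
        INSERT INTO listings_final (property_url, list_price, status)
        VALUES (:property_url, :list_price, :status)
        ON CONFLICT(property_url) DO UPDATE SET
            list_price = excluded.list_price,
            status = excluded.status
        """,
        rows,
    )
    conn.commit()


conn = sqlite3.connect(":memory:")

# First load: two new listings.
merge_listings(conn, [
    {"property_url": "https://example.com/1", "list_price": 250000.0, "status": "FOR_SALE"},
    {"property_url": "https://example.com/2", "list_price": 180000.0, "status": "FOR_SALE"},
])

# Second load: one changed listing and one new listing.
merge_listings(conn, [
    {"property_url": "https://example.com/1", "list_price": 240000.0, "status": "PENDING"},
    {"property_url": "https://example.com/3", "list_price": 310000.0, "status": "FOR_SALE"},
])

final_rows = conn.execute(
    "SELECT property_url, list_price, status FROM listings_final ORDER BY property_url"
).fetchall()
```

Keying the merge on a stable identifier such as the listing URL keeps the final table free of duplicates when the same property is scraped on several runs.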

Tableau

Tableau dashboard with data visualizations and graphs - https://public.tableau.com/views/Book1_17097820994780/Story1?:language=en-US&:sid=&:display_count=n&:origin=viz_share_link

Screenshots:

Screenshot 2024-03-27 at 6 11 36 PM