Hockey-ETL API

Full-stack ETL pipeline and API for scraping, processing, and serving professional hockey statistics.

About this project


The Hockey-ETL API scrapes, cleans, and serves professional hockey data. A nightly Python ETL pipeline extracts raw data, cleans it with Pandas, and loads it into a central SQLite3 database. An ExpressJS API then exposes the data through endpoints for standings, matchups, players, goalies, and teams. The platform also includes an email subscription service that uses Resend to deliver the previous night's game scores to subscribers every morning.
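
As a rough illustration of the clean-and-load stage of such a pipeline, here is a minimal sketch. The row layout, the column names (`team`, `wins`, `losses`), the `standings` table name, and the `clean`/`load` helpers are all assumptions made for this example and are not taken from the project's code; the sample rows are placeholders, not real statistics.

```python
import sqlite3

import pandas as pd

# Placeholder rows standing in for whatever the scraper returns (hypothetical shape).
RAW_ROWS = [
    {"team": " Team A ", "wins": "47", "losses": "20"},  # stray whitespace
    {"team": "Team B", "wins": "46", "losses": "26"},
    {"team": "Team B", "wins": "46", "losses": "26"},    # duplicate row
    {"team": "Team C", "wins": None, "losses": "36"},    # missing value
]


def clean(raw_rows):
    """Normalize team names, coerce counts to integers, drop bad or duplicate rows."""
    df = pd.DataFrame(raw_rows)
    df["team"] = df["team"].str.strip()
    # Unparseable or missing counts become NaN so they can be dropped below.
    df["wins"] = pd.to_numeric(df["wins"], errors="coerce")
    df["losses"] = pd.to_numeric(df["losses"], errors="coerce")
    df = df.dropna(subset=["wins", "losses"]).drop_duplicates(subset=["team"])
    return df.astype({"wins": "int64", "losses": "int64"}).reset_index(drop=True)


def load(df, conn, table="standings"):
    """Replace the target table with the cleaned frame (assumed full nightly refresh)."""
    df.to_sql(table, conn, if_exists="replace", index=False)


if __name__ == "__main__":
    # An in-memory database keeps the sketch self-contained.
    conn = sqlite3.connect(":memory:")
    load(clean(RAW_ROWS), conn)
    for row in conn.execute("SELECT team, wins, losses FROM standings ORDER BY team"):
        print(row)
    conn.close()
```

Replacing the whole table on each run is the simplest choice for a nightly refresh; a real pipeline might instead append or upsert rows to keep history.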

Key Features

  • Nightly Python ETL jobs to update the database
  • RESTful API endpoints for Standings, Matchups, Players, Goalies, and Teams
  • SQLite3 database for persistent, clean data storage
  • Automated email subscription service for previous night's scores (Resend)
  • Data cleaning and processing via Pandas (Python)
  • Full production deployment on DigitalOcean with ExpressJS backend

Tech

ExpressJS
Node.js
Python
SQLite3
Pandas
Postman
Resend
DigitalOcean
JavaScript