Autobot - HR data analytics system

Project information

  • Category: Data Analysis - Data Engineering
  • Client: Stax Colombo

This project was initiated by the leadership team of Stax Colombo to streamline the company's HR analytics. The primary goal was to automate the analytics pipeline so that employee performance could be measured in real time.

The project consisted of three main components.

  • 1. Extract data from the HR database
  • 2. Transform and aggregate the data
  • 3. Display the data in an interactive dashboard

The project followed a medallion architecture, with data moving through Bronze, Silver, and Gold stores.

I was mainly responsible for the data extraction and data transformation components. I created a set of Python scripts that extracted data from the HR database and stored it in a SQL Server database in Azure. This database acted as the Bronze data store for the entire process.
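
The extract-and-load step can be sketched roughly as follows. This is a minimal illustration, not the project's actual scripts: in-memory SQLite connections stand in for both the HR source and the Azure SQL Server Bronze store, and the table names, column names, and sample rows are all hypothetical.

```python
import sqlite3

# Hypothetical source rows; the real HR schema is not described in this write-up.
SOURCE_ROWS = [
    (1, "Asha", "Engineering", "2023-01-15"),
    (2, "Ravi", "Finance", "2022-11-01"),
]


def extract_rows(source_conn):
    """Read every row from the (hypothetical) source employees table."""
    return source_conn.execute(
        "SELECT employee_id, name, department, hire_date FROM employees"
    ).fetchall()


def load_bronze(bronze_conn, rows):
    """Append raw rows to the Bronze table without transforming them."""
    bronze_conn.execute(
        "CREATE TABLE IF NOT EXISTS bronze_employees ("
        "employee_id INTEGER, name TEXT, department TEXT, hire_date TEXT, "
        "loaded_at TEXT DEFAULT CURRENT_TIMESTAMP)"
    )
    bronze_conn.executemany(
        "INSERT INTO bronze_employees "
        "(employee_id, name, department, hire_date) VALUES (?, ?, ?, ?)",
        rows,
    )
    bronze_conn.commit()
    return len(rows)


def run_demo():
    """Build a toy source, copy it into Bronze, and return both row counts."""
    source = sqlite3.connect(":memory:")
    source.execute(
        "CREATE TABLE employees "
        "(employee_id INTEGER, name TEXT, department TEXT, hire_date TEXT)"
    )
    source.executemany("INSERT INTO employees VALUES (?, ?, ?, ?)", SOURCE_ROWS)

    bronze = sqlite3.connect(":memory:")  # stand-in for the Azure SQL Server store
    loaded = load_bronze(bronze, extract_rows(source))
    stored = bronze.execute("SELECT COUNT(*) FROM bronze_employees").fetchone()[0]
    return loaded, stored
```

Keeping the Bronze load free of transformations means the raw data can be reprocessed later if the cleaning rules change.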

Next, the data was cleaned and transformed using pandas and NumPy. To improve the efficiency of the system, multiprocessing was used to parallelize the cleaning and processing tasks. The cleaned data was stored in another SQL Server database, which acted as the Silver database.
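
A rough sketch of how cleaning could be parallelized with multiprocessing is shown below. The cleaning rules, the column names (`department`, `score`), and the chunking strategy are assumptions made for illustration. The write-up does not specify how the work was actually split.

```python
from multiprocessing import Pool

import numpy as np
import pandas as pd


def clean_chunk(chunk):
    """Clean one slice of raw rows; the rules here are illustrative only."""
    out = chunk.copy()
    # Normalise free-text department names (hypothetical column).
    out["department"] = out["department"].str.strip().str.title()
    # Coerce scores to numbers; unparseable values become NaN and are dropped.
    out["score"] = pd.to_numeric(out["score"], errors="coerce")
    out = out.dropna(subset=["score"])
    # Clamp scores to an assumed valid range of 0 to 100.
    out["score"] = np.clip(out["score"].to_numpy(), 0, 100)
    return out


def clean_in_parallel(df, workers=2):
    """Split the frame into `workers` slices and clean them in separate processes."""
    chunks = [df.iloc[i::workers] for i in range(workers)]
    with Pool(processes=workers) as pool:
        cleaned = pool.map(clean_chunk, chunks)
    # Restore the original row order after the slices are recombined.
    return pd.concat(cleaned).sort_index()
```

When run as a script, `clean_in_parallel(raw_df)` should be called from under an `if __name__ == "__main__":` guard so that worker processes can import the module safely. Because each row is cleaned independently, the slices need no coordination between processes.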

Finally, the data was aggregated to match the requirements of top management and stored in a final SQL Server database, which acted as the Gold database. The system was later integrated into the Master Data Management (MDM) system of Stax Colombo.
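
As an illustration of the Gold step, the sketch below aggregates cleaned rows into one summary row per department using pandas. The metrics (headcount and average score) and the column names are hypothetical. The actual management requirements are not listed here.

```python
import pandas as pd


def build_gold(silver):
    """Aggregate cleaned (Silver) rows into one summary row per department."""
    return (
        silver.groupby("department", as_index=False)
        .agg(
            headcount=("employee_id", "nunique"),  # distinct employees
            avg_score=("score", "mean"),           # mean of the assumed score column
        )
        .sort_values("department")
        .reset_index(drop=True)
    )


# Hypothetical Silver rows used only to exercise the function.
silver = pd.DataFrame(
    {
        "employee_id": [1, 2, 3],
        "department": ["Engineering", "Engineering", "Finance"],
        "score": [80.0, 90.0, 70.0],
    }
)
gold = build_gold(silver)
```

A dashboard can then read the small Gold table directly instead of recomputing aggregates from row-level data.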