Simran Garg - Data Engineer & Content Creator

Simran Garg

Data Engineer | Cloud Enthusiast | Content Creator

Passionate Data Engineer at Central Co-op with 3+ years of experience building scalable data pipelines, automating ETL workflows, and deploying cloud solutions. Also sharing my UK student journey on YouTube!

About Me

A passionate data professional and content creator dedicated to building ethical, scalable solutions

Location

England, UK 🇬🇧

Education

MSc Data Science

University of Glasgow • Merit

B.Tech IT

Amity University • First Division

Content Creator

Evolving Simran

Sharing insights about data science and student life

Awards & Recognition

  • Distinguished Services Award - Rotaract
  • Outstanding Performance Award
  • Sportsmanship Award - Amity
  • Gold Medals: Football, Softball, Cricket
  • Secretary of Rotaract (500+ events)

Professional Summary

As a Data Engineer at Central Co-op and a passionate data science graduate from the University of Glasgow, I thrive at the intersection of cloud engineering, machine learning, and responsible AI. I specialize in building secure, scalable data pipelines using tools like Microsoft Fabric, Azure, and AWS.

Microsoft Fabric | Azure | AWS | Machine Learning | Data Pipelines | Responsible AI

With hands-on experience across diverse domains—from deep learning research and economic policy analysis to enterprise data architecture—I bring a versatile skillset and a mindset for continuous improvement. I'm always excited to collaborate on projects that are technically robust and purpose-driven.

Technical Skills

A comprehensive toolkit for building scalable data solutions and driving insights

Programming & Scripting

Python | R | Java | T-SQL | Bash | SQL

Cloud & DevOps

Azure Data Factory | AWS S3/EC2/Glue | Google Cloud Platform | Snowflake | Databricks | Docker | CI/CD Pipelines

Data Engineering

ETL Pipelines | Data Warehousing | API Integration | Apache Airflow | Microsoft Fabric | Data Orchestration

Machine Learning

Scikit-Learn | TensorFlow | Deep Learning | NLP | Computer Vision | Differential Privacy

Data Analysis

Pandas | NumPy | Matplotlib | Seaborn | Power BI | Advanced Excel | Statistical Analysis

Tools & Platforms

Git | GitHub | Tableau | Jupyter | VS Code | SAP S/4HANA | REST APIs

Professional Experience

3+ years of hands-on experience building robust data solutions across diverse industries

Data Engineer

Central England Co-operative
Lichfield, UK
March 2025 – Present
Full-time
  • Built automated data pipelines using Azure Data Factory and SQL, integrating APIs and file-based workflows
  • Resolved ETL failures caused by schema mismatches, access errors, and credential issues; added robust logging
  • Optimized ingestion via parallel processing and version-controlled deployment using Git
  • Automated ingestion using Azure Functions, monitored via Airflow; reduced latency by 30%
  • Collaborated with IT, BI, and finance teams to ensure secure, compliant data pipelines
Azure Data Factory | SQL | Python | Git | Apache Airflow | Microsoft Fabric

Research Data Assistant

University of Glasgow
Glasgow, UK
November 2024 – March 2025
Full-time
  • Conducted curriculum mapping using Excel and data models to identify interdisciplinary learning gaps
  • Analysed life expectancy factors under the SIPHER project using QCA methodology
  • Applied fuzzy-set QCA to model outcomes and derive health policy insights
  • Wrote R scripts for causal extraction; contributed to a forthcoming publication
R Programming | QCA Analysis | Data Modeling | Statistical Analysis | Research

Platform Engineer

Prospecta Software Company
Delhi, India
March 2023 – August 2023
Full-time
  • Integrated Python and SQL with SAP S/4HANA to improve data quality and real-time insights
  • Improved ETL efficiency by 25% and reduced inconsistencies by 20% via CI/CD restructuring
  • Built Tableau dashboards to validate data integrity with cross-functional teams
  • Managed GitHub repositories, reviewed PRs, and enforced engineering code standards
Python | SQL | SAP S/4HANA | Tableau | CI/CD | GitHub

Python Cloud Developer

BCS Infallible Technology
Delhi, India
June 2022 – November 2022
Contract
  • Built emotion recognition models using OpenCV and TensorFlow; improved performance by 40%
  • Developed Emojify, a Dockerized AI tool for streamlined deployment and scalability
  • Automated model evaluation in Python; deployed inference pipelines on AWS EC2
Python | TensorFlow | OpenCV | Docker | AWS EC2 | Computer Vision

Featured Projects

A showcase of data engineering, machine learning, and cloud computing projects

Full-Stack Application | Featured
Bike Rental Management System
2024

Developed a Python-based bike rental app with a map interface, multi-role access (User/Operator/Manager), secure authentication, a wallet system, and comprehensive reporting. Features real-time bike tracking and an admin dashboard.

Python | SQLite3 | Tkinter | PIL | GUI Development | Database Design

Data Visualization | Featured
Information Visualization System
2024

Built a multi-view visualization system for analyzing U.S. traffic accident data (2016-2023). Features interactive maps, temporal analysis, weather correlation studies, and geographic hotspot identification with advanced filtering.

HTML | JavaScript | D3.js | Data Visualization | Interactive Design

Data Science | Featured
Geo-Localization Analysis of Tweets
2023

Analyzed the spatial distribution of geo-tagged tweets in London using a grid of 1 km x 1 km cells. Implemented a newsworthiness scoring mechanism and performed statistical analysis of tweet distribution patterns across the city.
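The grid-binning idea can be illustrated with a minimal sketch. This is not the project's code: the bounding-box origin, the flat-earth metres-per-degree approximation, and the function names `grid_cell` and `count_per_cell` are all assumptions made for illustration.

```python
import math
from collections import Counter

# Hypothetical south-west corner of the study area (not taken from the project).
ORIGIN_LAT = 51.28
ORIGIN_LON = -0.51
CELL_SIZE_M = 1000.0            # 1 km x 1 km cells
METERS_PER_DEG_LAT = 111_320.0  # rough approximation, fine at city scale


def grid_cell(lat, lon):
    """Return the (row, col) index of the 1 km cell containing a point."""
    # A degree of longitude shrinks with latitude; approximate it at the origin.
    meters_per_deg_lon = METERS_PER_DEG_LAT * math.cos(math.radians(ORIGIN_LAT))
    north_m = (lat - ORIGIN_LAT) * METERS_PER_DEG_LAT
    east_m = (lon - ORIGIN_LON) * meters_per_deg_lon
    return (math.floor(north_m / CELL_SIZE_M), math.floor(east_m / CELL_SIZE_M))


def count_per_cell(points):
    """Count how many (lat, lon) points fall into each grid cell."""
    return Counter(grid_cell(lat, lon) for lat, lon in points)


# Made-up sample coordinates, for illustration only.
sample_points = [(51.2805, -0.509), (51.2810, -0.508), (51.30, -0.51)]
cell_counts = count_per_cell(sample_points)
```

A projected coordinate system (for example via GeoPandas) would give more accurate cells; the approximation above only keeps the sketch dependency-free.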

Python | GeoPandas | NLTK | Spatial Analysis | Data Mining | Visualization

AI/ML Project | Featured
AI Chatbot with Neural Networks
2024

Built an AI chatbot using neural networks and NLP that reaches 98% accuracy. Features multiple interfaces (terminal, GUI, testing), sub-second response times, and 11 conversation categories.

Python | TensorFlow | NLTK | Neural Networks | NLP | Tkinter

Data Engineering | Featured
Real-Time Retail Analytics Pipeline
2024

Production-grade streaming pipeline for real-time retail analytics with sub-second latency. Implements a Kappa architecture using Kafka, Spark Structured Streaming, and PostgreSQL, with windowed aggregations at a throughput of 180+ events/min.
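To show what a windowed aggregation computes, here is a plain-Python stand-in, not the Spark Structured Streaming job itself. The 60-second tumbling window, the `(timestamp, store, amount)` event shape, and the function name `tumbling_window_totals` are assumptions for illustration.

```python
from collections import defaultdict

WINDOW_SECONDS = 60  # assumed window length; the real pipeline's value is not stated


def tumbling_window_totals(events, window_seconds=WINDOW_SECONDS):
    """Sum amounts per (window_start, store) over fixed, non-overlapping windows.

    Each event is a (timestamp_seconds, store, amount) tuple.
    """
    totals = defaultdict(float)
    for ts, store, amount in events:
        # Floor the timestamp to the start of its window.
        window_start = (ts // window_seconds) * window_seconds
        totals[(window_start, store)] += amount
    return dict(totals)


# Made-up events, for illustration only.
sample_events = [(0, "A", 10.0), (59, "A", 5.0), (60, "A", 1.0), (61, "B", 2.0)]
window_totals = tumbling_window_totals(sample_events)
```

A streaming engine additionally handles late events and state cleanup; this batch version only demonstrates the grouping logic.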

Apache Kafka | Apache Spark | PostgreSQL | Streamlit | Docker | Python

Academic Research | Featured
MSc Dissertation: Differential Privacy with Deep Learning
2024

Experimented with differential privacy in deep learning for human activity recognition using the UCF101 dataset. Analyzed trade-offs between privacy protection and model performance, achieving balanced results with data augmentation techniques.

Python | TensorFlow | Deep Learning | Differential Privacy | Computer Vision | PyTorch

Get In Touch

I'm always excited to collaborate on projects that are technically robust and purpose-driven. Let's connect if you're building something meaningful with data!