Welcome to My Data Engineering Journey

Hello and welcome to my blog! I’m Salah Algamasy, a Data Engineer and Computer Engineering student fascinated by turning raw data into meaningful insights.

🎓 My Background

I’m currently pursuing my Bachelor’s in Computer Engineering (2020-2025), where I’ve built a strong foundation in:

  • Data Structures & Algorithms - The backbone of efficient data processing
  • Distributed Systems - Essential for handling big data at scale
  • Databases & Machine Learning - Core technologies for modern data solutions
  • Operating Systems & Networks - Understanding the infrastructure that powers data pipelines

💼 Professional Experience

My journey in data engineering has been shaped by hands-on experience across multiple organizations:

🔄 Current Role: Data Engineer Intern at DataLoops

May 2025 - Present

I’m currently working remotely with DataLoops, a startup where I’m diving deep into:

  • Data pipeline architecture and design
  • Advanced data analysis techniques
  • Modern data management practices

🍊 Big Data Training at Orange Digital Center Egypt

August 2024 - September 2024

This intensive program introduced me to enterprise-level big data technologies:

  • Docker for creating reproducible data environments
  • Hadoop for distributed large-scale data management
  • Apache Spark for fast in-memory processing
  • Apache Hive for efficient data warehousing
  • Apache NiFi for automated data flow orchestration
  • PostgreSQL & MongoDB for diverse data storage needs

🏛️ Government Training Programs

February 2024 - September 2024

Through programs with the Digital Egypt Pioneers Initiative and the National Telecommunication Institute (NTI), I gained hands-on experience in:

  • ETL processes using SQL Server Integration Services (SSIS)
  • Data warehousing architecture and implementation
  • Python & MS SQL for data manipulation and analysis
  • Apache Spark through hands-on laboratory work

🚀 What I Build

I’m passionate about creating end-to-end data solutions. Here are some highlights from my project portfolio:

Real-Time Retail Analytics Pipeline

Built a complete streaming data pipeline that processes retail transactions in real time using Kafka, Spark Streaming, and Cassandra, with orchestration via Apache Airflow and transformations in dbt.
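
To give a feel for the kind of computation a pipeline like this performs, here is a minimal sketch of a tumbling-window aggregation in plain Python. It is not the project's code: the event fields (store_id, ts, amount), the 60-second window, and the sample events are all assumptions made for illustration, and a production pipeline would typically read from Kafka and run the aggregation in Spark rather than in a local loop.

```python
from collections import defaultdict
from datetime import datetime, timezone

WINDOW_SECONDS = 60  # assumed window size, chosen only for illustration


def window_start(ts: datetime, size: int = WINDOW_SECONDS) -> datetime:
    """Round a timestamp down to the start of its tumbling window."""
    epoch = int(ts.timestamp())
    return datetime.fromtimestamp(epoch - epoch % size, tz=timezone.utc)


def revenue_per_window(events):
    """Sum transaction amounts per (store_id, window start) pair."""
    totals = defaultdict(float)
    for event in events:
        key = (event["store_id"], window_start(event["ts"]))
        totals[key] += event["amount"]
    return dict(totals)


# Hypothetical sample events standing in for messages read from a stream.
events = [
    {"store_id": "s1", "ts": datetime(2024, 1, 1, 12, 0, 5, tzinfo=timezone.utc), "amount": 20.0},
    {"store_id": "s1", "ts": datetime(2024, 1, 1, 12, 0, 50, tzinfo=timezone.utc), "amount": 5.5},
    {"store_id": "s1", "ts": datetime(2024, 1, 1, 12, 1, 10, tzinfo=timezone.utc), "amount": 7.0},
    {"store_id": "s2", "ts": datetime(2024, 1, 1, 12, 0, 30, tzinfo=timezone.utc), "amount": 3.0},
]

for (store, start), total in sorted(revenue_per_window(events).items()):
    print(store, start.isoformat(), total)
```

Each event lands in exactly one window, so the first two s1 transactions are summed together while the third falls into the next minute. A streaming engine applies the same idea continuously and adds handling for late or out-of-order events, which this sketch leaves out.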

Multi-Format Data Warehouse

Designed a versatile data warehouse supporting both OLAP and OLTP workloads, handling CSV, Parquet, and Avro formats using Spark, Hive, and Trino.

Movie Recommender System

Created a full data lifecycle project featuring automated ETL, machine learning with Sentence Transformers, experiment tracking with MLflow, and deployment using FastAPI.

🛠️ My Technology Stack

I work with a diverse range of technologies across the data engineering spectrum:

Big Data Processing: Apache Spark, Hadoop, Hive, Kafka
Databases: PostgreSQL, MongoDB, MySQL, SQLite, Cassandra
ETL/Pipeline Tools: Apache Airflow, Apache NiFi, SSIS, dbt
Programming: Python, SQL, Bash
DevOps & Version Control: Docker, Git
Visualization: Power BI

📝 What You’ll Find Here

This blog is my space to share:

  • Technical tutorials on data engineering tools and techniques
  • Project deep-dives showcasing real-world data solutions
  • Industry insights from my experience in the data field
  • Learning resources for aspiring data engineers
  • Best practices for building scalable data architectures

🎯 My Mission

Data is often called the new oil, but it only becomes valuable once it has been refined and processed. My mission is to:

  1. Build robust data pipelines that scale reliably as data volumes grow
  2. Share knowledge with the data engineering community
  3. Explore cutting-edge technologies in big data and analytics
  4. Bridge the gap between raw data and actionable insights

🤝 Let’s Connect

I’m always excited to connect with fellow data enthusiasts, engineers, and anyone passionate about the power of data. Whether you’re just starting your data journey or you’re a seasoned professional, I’d love to hear from you!

Feel free to reach out through my social channels or explore my projects on GitHub. Let’s build the future of data together!


Thank you for joining me on this journey. Stay tuned for deep technical content, project showcases, and insights from the ever-evolving world of data engineering!