Get to know me
Curious, dynamic, and results-oriented Data Engineer with an advanced degree in a quantitative field, a cited publication in a peer-reviewed journal on high-resolution peak analysis, and five years of hands-on experience across the US and Canada, at startups as well as large tech companies such as Yahoo. Proficient in advanced Python, SQL, Spark, Airflow, Looker, and Tableau, and certified in AWS and Snowflake, I specialize in developing robust data pipelines, optimizing database architectures, developing and tracking KPIs, and surfacing actionable insights from complex datasets.
Recognizing the transformative potential of emerging technologies across industries, I made a deliberate choice to take a career break dedicated to enhancing my technical skills and deepening my knowledge of rapidly evolving technologies. This period allowed me to focus on acquiring advanced expertise in cloud computing and new data engineering tools, and on contributing to the data engineering community through Medium articles on various topics.
My Background
Skills and Education
General: Object-Oriented Programming, ETL Development, Database Design, Data Visualization, Big Data Architecture, Cloud Solutions, Data & Statistical Modeling, Deep Learning
Programming/Scripting Languages: Advanced Python (Pandas, NumPy, Scikit-learn, XGBoost, PyTorch, TensorFlow, Matplotlib, and PySpark), R (tidyverse, dplyr, ggplot2, and broom), Go, and Scala
Database Management: SQL, Hive, Hue, Presto, Snowflake, DBVisualizer, Microsoft SQL Server, PostgreSQL
Orchestration and Containerization: Bash, Airflow, Crontab, Oozie, and Docker
AWS Services: Athena, Glue, Lambda, Step Functions, SageMaker, RDS, DynamoDB, and SQS
Tools and Platforms: Looker, Tableau, Jupyter Notebook, GitHub, Jira, Google Workspace, and MS Office
Certifications: AWS Certified Solutions Architect, Snowflake SnowPro Core
UT Dallas, Dallas, TX, US — MS in Computational Chemistry Aug 2016 - Dec 2016
Sharif University of Technology, Tehran, Iran — BS in Chemistry Aug 2011 - Dec 2015
Experience
Hubio Technology, Toronto, ON, CA — April 2023 - Jan 2024
Spearheaded the design and implementation of robust data pipelines within the AWS ecosystem, orchestrating seamless data flow from S3 to PostgreSQL and specializing in efficient, reliable batch processing of insurance transactions (languages: Python, SQL, Apache Spark; primary AWS services: Lambda, Glue, Step Functions, DynamoDB, SES, SNS, SQS).
Applied advanced Python, Apache Spark, and SQL skills to develop custom scripts, ensuring optimized, scalable, and cost-effective solutions for our customers.
Implemented automated data-quality reporting within the workflow, providing daily feedback to internal and external stakeholders (Lambda, SNS, SES).
Designed and implemented internal tools enabling non-technical team members to make on-demand changes to per-customer configurations (DynamoDB, Python, SQL).
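To illustrate the kind of batch validation behind the data-quality reporting described above, here is a minimal, self-contained sketch. The record fields and rules are hypothetical stand-ins; the real pipeline processed batches with Spark and Glue and published results via SNS/SES.

```python
# Hypothetical record shape; the production pipeline read batches from S3.
def validate_transaction(txn):
    """Return a list of data-quality issues found in one insurance transaction."""
    issues = []
    if not txn.get("policy_id"):
        issues.append("missing policy_id")
    if not isinstance(txn.get("premium"), (int, float)) or txn["premium"] < 0:
        issues.append("invalid premium")
    return issues

def quality_report(batch):
    """Summarize issues across a batch, as fed into a daily stakeholder report."""
    flagged = {
        txn.get("policy_id") or f"row{i}": validate_transaction(txn)
        for i, txn in enumerate(batch)
    }
    return {key: probs for key, probs in flagged.items() if probs}

batch = [
    {"policy_id": "P-100", "premium": 250.0},
    {"policy_id": "", "premium": 99.0},
    {"policy_id": "P-101", "premium": -5},
]
report = quality_report(batch)
```

Keeping validation as pure functions like this makes each rule unit-testable independently of the AWS services that schedule and deliver the report.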
Yahoo!, Toronto, ON, CA — Jan 2022 - Mar 2023
Modeled a multimillion-dollar third-party deal by projecting quarterly revenue using multiple linear regression (MLR); upon approval, the deal brought over 30 million new users into our platform's reach (language: Python; libraries: Pandas and scikit-learn).
Designed and implemented Hive tables and associated pipelines for daily updates with Oozie; these tables were then used as Looker explores (language: SQL; other tools: Looker).
Created multiple dashboards with drill-down capabilities for the sales and product teams, enabling straightforward KPI tracking for Native ads in NAR (Looker).
Developed Tableau dashboards for the finance team to track performance against budget as well as guaranteed revenue for third-party deals (Tableau).
Collaborated extensively with the product and sales teams, providing detailed analyses to model the impact of multiple initiatives (languages: Python and SQL; libraries: Pandas; other tools: Excel).
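The revenue-projection work above fit an MLR model on historical quarters. As a minimal sketch, the same idea can be expressed with NumPy's ordinary least squares (the original used Pandas and scikit-learn; the feature names and numbers below are illustrative, not the deal's actual data):

```python
import numpy as np

def project_quarterly_revenue(X_hist, y_hist, X_future):
    """Fit a multiple linear regression on historical quarters and project forward."""
    # Prepend an intercept column and solve ordinary least squares.
    A = np.column_stack([np.ones(len(X_hist)), X_hist])
    coef, *_ = np.linalg.lstsq(A, y_hist, rcond=None)
    A_future = np.column_stack([np.ones(len(X_future)), X_future])
    return A_future @ coef

# Toy features: monthly active users (millions) and impressions (billions).
X_hist = np.array([[10, 1.0], [12, 1.1], [15, 1.4], [18, 1.6]])
y_hist = np.array([2.0, 2.3, 2.9, 3.4])  # revenue, $M per quarter
projection = project_quarterly_revenue(X_hist, y_hist, np.array([[20, 1.8]]))
```

In practice the projected quarters, with confidence bands, are what a deal review would evaluate against the guaranteed-revenue terms.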
Yahoo!, Sunnyvale, CA, US — Jan 2021 - Jan 2022
Created cron jobs to populate aggregated Hive tables, enabling easier and faster data retrieval for the reporting team (languages: Bash and SQL).
Created dashboards to track the performance of our Native ads on Yahoo supply, focused mainly on US users, enabling the product and sales teams to decide on next steps for improving performance (Tableau).
Provided on-demand ad-hoc analysis for the product and sales teams (language: SQL; other tools: Excel).
Designed an A/B test to incorporate user categorization into our Native ads algorithm; the test showed a 1.5% improvement in CTR for Yahoo users in the US, estimated to increase annual revenue by $8M. This project led to a promotion as well as selection as employee of the quarter in the ad tech department (internal tools).
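Evaluating a CTR lift like the one above comes down to a two-proportion z-test on control vs. variant. Here is a standard-library sketch with toy numbers (not the original experiment's data) showing how the relative uplift and p-value are computed:

```python
from math import sqrt, erf

def ab_ctr_test(clicks_a, views_a, clicks_b, views_b):
    """Two-proportion z-test on the CTR difference between control (A) and variant (B)."""
    p_a, p_b = clicks_a / views_a, clicks_b / views_b
    # Pooled proportion under the null hypothesis of equal CTRs.
    p_pool = (clicks_a + clicks_b) / (views_a + views_b)
    se = sqrt(p_pool * (1 - p_pool) * (1 / views_a + 1 / views_b))
    z = (p_b - p_a) / se
    # Two-sided p-value from the standard normal CDF.
    p_value = 2 * (1 - 0.5 * (1 + erf(abs(z) / sqrt(2))))
    relative_uplift = (p_b - p_a) / p_a
    return relative_uplift, z, p_value

# Toy numbers chosen to produce a 1.5% relative CTR lift.
uplift, z, p = ab_ctr_test(20_000, 1_000_000, 20_300, 1_000_000)
```

The relative uplift is what translates into the revenue estimate; the p-value determines whether the experiment can be called at all, which is why sample-size planning precedes the launch.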
Praedicat, Inc., Los Angeles, CA, US — Feb 2019 - Jan 2021
Created the Chemical Object used by the prediction model, enabling the company to provide risk analysis for the vast majority of chemicals used in manufacturing in both NAR and Europe (language: Python).
Designed and implemented relational databases to store the properties of the Chemical Object (languages: Python and SQL; library used: pyodbc).
Designed and implemented ETL pipelines for public external data sources (FDA and EPA) to keep the databases updated with the latest reports (language: Python; libraries used: ElementTree, BeautifulSoup, Pandas, and re).
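The extract step of such a pipeline parses the agencies' XML feeds into flat rows for loading. A minimal ElementTree sketch follows; the payload and field names are hypothetical stand-ins for the real FDA/EPA report formats:

```python
import xml.etree.ElementTree as ET

# Hypothetical miniature payload standing in for an agency report feed.
SAMPLE_XML = """
<reports>
  <report id="r1"><chemical>Benzene</chemical><cas>71-43-2</cas></report>
  <report id="r2"><chemical>Toluene</chemical><cas>108-88-3</cas></report>
</reports>
"""

def extract_records(xml_text):
    """Parse the feed and flatten each <report> element into a row dict."""
    root = ET.fromstring(xml_text)
    rows = []
    for report in root.findall("report"):
        rows.append({
            "report_id": report.get("id"),
            "chemical": report.findtext("chemical"),
            "cas": report.findtext("cas"),
        })
    return rows

records = extract_records(SAMPLE_XML)
```

Flattened rows like these can then be deduplicated against the existing database keys (e.g. CAS number) before the load step upserts the latest reports.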