
Lead Data Engineer Resume


New York, NY

SUMMARY

  • 11 years of software development experience.
  • Object-oriented development experience in Python.
  • Hands-on experience in Python backend development.
  • Strong working knowledge of the Amazon Web Services (AWS) platform.
  • Hands-on experience with EMR, Hadoop, and Spark (PySpark).
  • Hands-on experience building ETL data pipelines using AWS and Python libraries.
  • Hands-on experience migrating on-premises solutions to AWS.
  • Hands-on experience with cloud automation and scripting.
  • Hands-on experience implementing Infrastructure as Code using AWS CDK and CloudFormation.
  • Hands-on experience building large-scale distributed systems using Python libraries and the ØMQ messaging library to handle millions of records and improve load performance.
  • Hands-on experience with Cloudera, Snowflake, Impala, and SQL.
  • Hands-on experience with unit testing, integration testing, Splunk, Jenkins, and test-driven software development.
  • Good understanding of source control management systems such as Git.
  • Strong analytical skills, problem-solving capabilities, and excellent communication and presentation skills.
  • Team player with the ability to learn and adapt quickly to emerging technologies.
  • Good understanding of Waterfall and Agile methodologies, business process definitions, risk analysis, and project management.

TECHNICAL SKILLS

Programming languages: Python, Unix Shell Scripting, SQL

Python technologies: Spark, AWS SDK (Boto3), CDK, Pandas, SQLAlchemy, JSON, RabbitMQ, ØMQ, MongoDB, Soaplib, Setuptools, Pip, Nose, Virtualenv

Technologies: AWS, Splunk, Talend, Jenkins (continuous integration), XML, WSDL, SOAP Web Services.

Tools: Git, Talend, Tableau, Aqua Data Studio, SQL Developer, ServiceNow, JIRA, SQL*Loader, SQL*Plus.

Scheduling Tools: Cisco Tidal Enterprise Scheduler.

Applications: Git, Oracle SQL Developer, WingIDE.

Databases: Snowflake, Impala, MongoDB, MS SQL Server, Oracle, MySQL.

Operating Systems: Linux, Windows

PROFESSIONAL EXPERIENCE

Confidential - New York, NY

Lead Data Engineer

Responsibilities:

  • Designed and implemented ETL Jobs using Hadoop, Spark, and Python.
  • Designed and implemented Spark Jobs using AWS EMR and Snowflake.
  • Created Python Framework for Spark to process SparkSQL Job based on SQL template.
  • Responsible for design, development, and implementation of AWS architectures and environments.
  • Designed and implemented ALZ, VPCs, subnets, routes, NAT, and PrivateLink endpoints.
  • Designed and implemented AWS security: compliance, cloud security architecture, KMS encryption, Inspector, IAM policies/roles/MFA, etc.
  • Designed and implemented monitoring and logging tools for applications and servers, integrated with AWS centralized logging and on-premises Splunk.
  • Worked on AWS cost management and optimization and an AWS billing job (Athena and Glue), integrated with on-premises Tableau using PrivateLink.
  • Ensured that applications and services were horizontally scalable and highly available.
  • Created a Python framework for AWS cloud automation and multiprocessing of end-of-day and intraday uploads and extracts using the AWS SDK and CDK.
  • Worked closely with multiple teams to integrate their projects into the AWS environment.
  • Developed unit and integration test suites as part of test-driven development to enhance code reusability, using Python and Unix shell scripting.
  • Participated in the daily and weekend operational ETL support in rotation.
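A SQL-template framework like the one described above can be sketched in plain Python. The template text, the `render_sql` helper, and the column names are illustrative assumptions, not the actual framework; in the real job the rendered string would be handed to `spark.sql()`.

```python
from string import Template

# Hypothetical SparkSQL job template; schema, table, and load date are
# filled in at run time by the framework.
JOB_TEMPLATE = Template(
    "SELECT trade_id, notional, book "
    "FROM $schema.$table "
    "WHERE load_date = '$load_date'"
)

def render_sql(schema: str, table: str, load_date: str) -> str:
    """Render a SparkSQL statement from the job template.

    Only the templating step is shown here; the rendered SQL would
    normally be executed via spark.sql(rendered).
    """
    return JOB_TEMPLATE.substitute(
        schema=schema, table=table, load_date=load_date
    )

print(render_sql("risk", "trades_eod", "2020-01-31"))
```

Keeping the SQL in templates lets one generic driver run many jobs by swapping parameters instead of writing a new Spark script per table.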

Confidential - Orlando, FL

Senior Associate

Responsibilities:

  • Designed and implemented ETL using Python, Teradata utilities, MongoDB, JSON, XML, flat CSV files, logging (file logging and an email logger using the Python logging module), and exception handling.
  • Developed unit and integration test suites as part of test-driven development to enhance code reusability.
  • Involved in performance tuning and optimization; improved long-running functions by optimizing SQL and redesigning code.
  • Coded modules for interaction between the ETL framework and the databases using Teradata utilities.
  • Support team responsibilities included bug reporting and testing, and subsequent code development and releases to clients.
  • Monitored data loads of extract files in client development environments and checked for data validation issues.

Confidential - Frisco, TX

Software Developer

Responsibilities:

  • Part of the core ETL team for development and deployment support of Confidential's JIVA, MAYA, and KRIYA platforms for various health care vendors.
  • Designed and implemented ETL using the ØMQ (ZeroMQ) asynchronous messaging library, MongoDB, flat CSV files, XML files, logging (file logging and an email logger using the Python logging module), and exception handling.
  • Designed and implemented SOAP web services using Apache Tomcat, Maven, RabbitMQ, JSON, and Python.
  • Developed unit and integration test suites as part of test-driven development to enhance code reusability.
  • Monitored data loads of extract files in client development environments and checked for data validation issues.
  • Contributed in a techno-functional role to new product-level requirements, communicating directly with clients and maintaining accurate logs of interactions and technical details.
  • Participated in product feature implementation.
  • Travelled to client sites to support production performance issues and identified critical functions consuming excessive CPU time and memory.
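The producer/consumer decoupling behind the message-driven ETL above can be illustrated with the standard library's `queue.Queue` standing in for the ØMQ socket pair, so the sketch runs without extra dependencies; the `claim_id` field and `STOP` sentinel are illustrative assumptions.

```python
import json
import queue
import threading

# queue.Queue stands in for a ZeroMQ PUSH/PULL socket pair: the
# producer serialises records onto the bus, a worker thread consumes
# and "loads" them independently of the producer's pace.
bus: "queue.Queue[str]" = queue.Queue()
results: list[dict] = []

def consumer() -> None:
    while True:
        msg = bus.get()
        if msg == "STOP":   # sentinel shuts the worker down cleanly
            break
        results.append(json.loads(msg))  # parse and "load" the record

t = threading.Thread(target=consumer)
t.start()
for rec in ({"claim_id": 1}, {"claim_id": 2}):
    bus.put(json.dumps(rec))  # producer serialises each record
bus.put("STOP")
t.join()
```

The point of the pattern is the same as with ØMQ: producers and consumers share only a message format, so either side can be scaled or replaced without touching the other.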

Confidential

Intern Programmer

Responsibilities:

  • Built an interconnect that read email data from webmail services such as Yahoo, Hotmail, Rediff, and Gmail and sent alerts to users' mobile devices.
  • Implemented test suites for various modules and validated code against functional requirement specifications.
