Lead Data Engineer Resume
New York, NY
SUMMARY
- 11 years of software development experience.
- Object-oriented development experience in Python.
- Hands-on experience in Python backend development.
- Strong working knowledge of the Amazon Web Services (AWS) platform.
- Hands-on experience with EMR, Hadoop, and Spark (PySpark).
- Hands-on experience building ETL data pipelines using AWS and Python libraries.
- Hands-on experience migrating on-premises solutions to AWS.
- Hands-on experience with cloud automation and scripting.
- Hands-on experience implementing Infrastructure as Code using AWS CDK and CloudFormation (a minimal sketch follows this list).
- Hands-on experience building large-scale distributed systems using modern Python libraries and the ØMQ (ZeroMQ) messaging library to handle millions of records and improve load performance.
- Hands-on experience with Cloudera, Snowflake, Impala, and SQL.
- Hands-on experience with unit testing, integration testing, Splunk, Jenkins, and test-driven software development.
- Good understanding of source control management systems such as Git.
- Strong analytical skills, problem-solving capabilities, and excellent communication and presentation skills.
- Team player with the ability to learn and adapt quickly to emerging technologies.
- Good understanding of Waterfall and Agile methodologies, business process definition, risk analysis, and project management.
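A minimal sketch of the Infrastructure-as-Code pattern referenced above, using the AWS CDK v2 Python bindings; the stack, VPC, key, and bucket names are hypothetical:

```python
# Minimal AWS CDK (v2) stack sketch: a VPC with isolated subnets and an
# S3 bucket encrypted with a customer-managed KMS key. Names are illustrative.
from aws_cdk import App, Stack, RemovalPolicy
from aws_cdk import aws_ec2 as ec2, aws_kms as kms, aws_s3 as s3
from constructs import Construct

class DataPlatformStack(Stack):  # hypothetical stack name
    def __init__(self, scope: Construct, construct_id: str, **kwargs) -> None:
        super().__init__(scope, construct_id, **kwargs)

        # VPC with private subnets, in the spirit of the ALZ/VPC work above
        ec2.Vpc(self, "EtlVpc", max_azs=2,
                subnet_configuration=[
                    ec2.SubnetConfiguration(
                        name="private",
                        subnet_type=ec2.SubnetType.PRIVATE_ISOLATED)])

        # Customer-managed KMS key for at-rest encryption
        key = kms.Key(self, "DataKey", enable_key_rotation=True)

        # Landing bucket encrypted with the key
        s3.Bucket(self, "LandingBucket",
                  encryption=s3.BucketEncryption.KMS,
                  encryption_key=key,
                  removal_policy=RemovalPolicy.RETAIN)

app = App()
DataPlatformStack(app, "DataPlatformStack")
app.synth()
```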
TECHNICAL SKILLS
Programming languages: Python, Unix shell scripting, SQL
Python technologies: Spark, AWS SDK (Boto3), AWS CDK, Pandas, SQLAlchemy, JSON, RabbitMQ, ØMQ (ZeroMQ), MongoDB, Soaplib, setuptools, pip, nose, virtualenv
Technologies: AWS, Splunk, Talend, Jenkins (continuous integration), XML, WSDL, SOAP web services
Tools: Git, Talend, Tableau, Aqua Data Studio, SQL Developer, ServiceNow, JIRA, SQL*Loader, SQL*Plus
Scheduling tools: Cisco Tidal Enterprise Scheduler
Applications: Git, Oracle SQL Developer, Wing IDE
Databases: Snowflake, Impala, MongoDB, MS SQL Server, Oracle, MySQL
Operating Systems: Linux, Windows
PROFESSIONAL EXPERIENCE
Confidential - New York, NY
Lead Data Engineer
Responsibilities:
- Designed and implemented ETL Jobs using Hadoop, Spark, and Python.
- Designed and implemented Spark Jobs using AWS EMR and Snowflake.
- Created a Python framework for Spark that runs SparkSQL jobs from SQL templates (see the first sketch after this list).
- Responsible for design, development, and implementation of AWS architectures and environments.
- Designed and implemented the AWS Landing Zone (ALZ), VPCs, subnets, route tables, NAT gateways, and PrivateLink endpoints.
- Designed and implemented AWS security: compliance, cloud security architecture, KMS encryption, Amazon Inspector, IAM policies/roles/MFA, etc.
- Designed and implemented monitoring and logging tools for our applications and servers, integrated with AWS centralized logging and on-premises Splunk.
- Worked on AWS cost management and optimization, and built an AWS billing job (Athena and Glue) integrated with on-premises Tableau via PrivateLink.
- Ensured that our applications and services are horizontally scalable and highly available.
- Created a Python framework for AWS cloud automation, with multiprocessing for end-of-day and intraday uploads and extracts using the AWS SDK (Boto3) and CDK (see the second sketch after this list).
- Worked closely with multiple teams to integrate their projects into the AWS environment.
- Developed unit and integration test suites as part of test-driven development to improve code reusability, using Python and Unix shell scripting.
- Participated in daily and weekend operational ETL support in rotation.
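A minimal sketch of the template-driven SparkSQL runner described above, assuming templates are plain SQL files with Python format-style placeholders; the paths and parameter names are illustrative:

```python
# Render a SQL template with parameters and execute it as a Spark job.
from pyspark.sql import SparkSession

def run_sql_template(spark: SparkSession, template_path: str, params: dict):
    """Read a SQL template, substitute parameters, and run it via SparkSQL."""
    with open(template_path) as f:
        sql = f.read().format(**params)  # e.g. {run_date}, {target_table}
    return spark.sql(sql)

if __name__ == "__main__":
    spark = SparkSession.builder.appName("sql-template-runner").getOrCreate()
    df = run_sql_template(spark, "templates/daily_load.sql",  # hypothetical path
                          {"run_date": "2020-01-01"})
    df.show()
```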
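And a minimal sketch of the multiprocessing upload side of that framework, using Boto3; the bucket name and file list are hypothetical:

```python
# Parallel end-of-day S3 uploads with boto3 and a process pool.
import os
from multiprocessing import Pool

import boto3

BUCKET = "eod-extracts"  # hypothetical bucket name

def upload_file(path: str) -> str:
    # Each worker process creates its own client (clients are not fork-safe)
    s3 = boto3.client("s3")
    s3.upload_file(path, BUCKET, os.path.basename(path))
    return path

if __name__ == "__main__":
    files = ["out/positions.csv", "out/trades.csv"]  # illustrative extracts
    with Pool(processes=4) as pool:
        for done in pool.imap_unordered(upload_file, files):
            print(f"uploaded {done}")
```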
Confidential - Orlando, FL
Senior Associate
Responsibilities:
- Designed and implemented ETL using Python, Teradata utilities, MongoDB, JSON, XML, and flat CSV files, with logging (file and email loggers built on Python's logging module; see the sketch after this list) and exception handling.
- Developed unit and integration test suites as part of test-driven development to improve code reusability.
- Performed performance tuning and optimization, improving long-running functions by optimizing SQL and redesigning code.
- Coded modules for interaction between the ETL framework and the databases using Teradata utilities.
- Support-team responsibilities included bug reporting and testing, plus subsequent code development and releases to clients.
- Monitored data loads of extract files in client development environments and checked for data validation issues.
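A minimal sketch of the file-plus-email logging setup described above, using only the standard-library logging module; the SMTP host and addresses are placeholders:

```python
import logging
from logging.handlers import SMTPHandler

logger = logging.getLogger("etl")
logger.setLevel(logging.INFO)

# File logger for the full run history
logger.addHandler(logging.FileHandler("etl.log"))

# Email logger that fires only on errors
mail = SMTPHandler(mailhost="smtp.example.com",      # placeholder host
                   fromaddr="etl@example.com",
                   toaddrs=["oncall@example.com"],
                   subject="ETL failure")
mail.setLevel(logging.ERROR)
logger.addHandler(mail)

try:
    raise ValueError("bad record")   # simulated load failure
except ValueError:
    logger.exception("ETL step failed")  # logged to file and emailed
```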
Confidential, Frisco, TX
Software Developer
Responsibilities:
- Served on the core ETL team developing and supporting deployments of Confidential's JIVA, MAYA, and KRIYA platforms for various healthcare vendors.
- Designed and implemented ETL using the ØMQ (ZeroMQ) asynchronous messaging library, MongoDB, flat CSV files, and XML files, with logging (file and email loggers using Python's logging module) and exception handling (see the sketch after this list).
- Designed and implemented SOAP web services using Apache Tomcat, Maven, RabbitMQ, JSON, and Python.
- Developed unit and integration test suites as part of test-driven development to improve code reusability.
- Monitored data loads of extract files in client development environments and checked for data validation issues.
- Contributed in a techno-functional role on new product-level requirements, communicating directly with clients and maintaining accurate logs of interactions and technical details.
- Participated in implementing product features.
- Traveled to client sites to support production performance issues, identifying the critical functions consuming the most CPU time and memory.
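A minimal sketch of a ØMQ (ZeroMQ) PUSH/PULL pipeline of the kind described above, using pyzmq; the endpoint and record shape are illustrative:

```python
import threading
import time

import zmq

def producer(records):
    ctx = zmq.Context.instance()
    push = ctx.socket(zmq.PUSH)
    push.bind("tcp://*:5557")
    for rec in records:
        push.send_json(rec)          # fan records out to connected workers

def worker():
    ctx = zmq.Context.instance()
    pull = ctx.socket(zmq.PULL)
    pull.connect("tcp://localhost:5557")
    while True:
        rec = pull.recv_json()       # blocks until a record arrives
        print("loading record", rec["id"])

if __name__ == "__main__":
    threading.Thread(target=worker, daemon=True).start()
    producer([{"id": i} for i in range(3)])
    time.sleep(1)                    # let the worker drain the queue
```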
Confidential
Intern Programmer
Responsibilities:
- Built an interconnect for Yahoo, Hotmail, Rediff, and Gmail that reads email data from those webmail services and sends alerts to users' mobile phones.
- Implemented test suites for various modules and validated code against functional requirement specifications.