Cloud Data Engineer Resume
SUMMARY
- IT professional with more than 7 years of experience across the Banking, Healthcare, Health & Welfare Benefits, and Life Sciences domains, with solid coding skills and data engineering and machine learning knowledge.
- Proficient in Python, AWS, and DevOps for handling complex use cases. Exposure to the SDLC and agile methodologies. Resourceful, detail-oriented, collaborative, and results-driven.
PROFESSIONAL EXPERIENCE
Cloud Data Engineer
Confidential
Responsibilities:
- Created Python scripts to automate downloading files from an S3 bucket, decrypting them, and feeding them into a third-party application (N-Tier); pushed the N-Tier output files to the FINRA regulatory service and pulled feedback files, error files, and reference data files from SFTP servers and URLs (see the first sketch after this list)
- Deployed scripts, applications, and Python packages using Ansible; managed source control with Bitbucket
- Automated archival and retrieval of files in S3 Glacier using the Boto3 SDK
- Leveraged CloudFormation templates to integrate AWS Parameter Store for storing variables and Amazon EventBridge for scheduling jobs that trigger scripts on EC2 instances
- Used AWS Secrets Manager to secure credentials and Lambda functions to trigger scripts on EC2 instances (see the Lambda sketch after this list)
- Queried tables defined in the AWS Glue Data Catalog using Athena, with S3 as the data store (see the Athena sketch after this list)
- Used AWS Cloud9 to create Python scripts, commit code changes to the CodeCommit repository, and trigger CodePipeline builds
- Monitored job status using CloudWatch metrics and AWS Run Command history in Systems Manager
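
A minimal sketch of the S3 file-handling automation above, assuming hypothetical bucket, key, and secret names; the actual decryption and FINRA delivery steps are omitted:

    import boto3

    s3 = boto3.client("s3")
    secrets = boto3.client("secretsmanager")

    def fetch_input_file(bucket="example-input-bucket", key="incoming/feed.csv.gpg"):
        # Download an encrypted input file from S3 to local disk for decryption
        local_path = "/tmp/" + key.split("/")[-1]
        s3.download_file(bucket, key, local_path)
        return local_path

    def get_sftp_credentials(secret_id="example/sftp-credentials"):
        # Pull SFTP credentials from Secrets Manager rather than config files
        return secrets.get_secret_value(SecretId=secret_id)["SecretString"]

    def restore_from_glacier(bucket, key, days=7):
        # Request a temporary restore of an object archived in the Glacier storage class
        s3.restore_object(
            Bucket=bucket,
            Key=key,
            RestoreRequest={"Days": days, "GlacierJobParameters": {"Tier": "Standard"}},
        )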
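
A hedged sketch of a Lambda handler that triggers a script on an EC2 instance through Systems Manager Run Command; the instance ID and script path are placeholders:

    import boto3

    ssm = boto3.client("ssm")

    def lambda_handler(event, context):
        # Run a shell command on a target EC2 instance via SSM Run Command
        response = ssm.send_command(
            InstanceIds=["i-0123456789abcdef0"],  # placeholder instance
            DocumentName="AWS-RunShellScript",
            Parameters={"commands": ["python3 /opt/app/run_feed.py"]},
        )
        # The command ID can later be matched against Run Command history
        # in Systems Manager when monitoring job status
        return response["Command"]["CommandId"]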
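
And a sketch of querying a Glue Data Catalog table through Athena; the database, table, and output location are illustrative only:

    import boto3

    athena = boto3.client("athena")

    query = athena.start_query_execution(
        QueryString="SELECT status, COUNT(*) FROM feedback_files GROUP BY status",
        QueryExecutionContext={"Database": "example_glue_db"},
        ResultConfiguration={"OutputLocation": "s3://example-athena-results/"},
    )
    # Athena runs asynchronously: poll get_query_execution() until the state
    # is SUCCEEDED, then fetch rows with get_query_results()
    print(query["QueryExecutionId"])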
Data Engineer
Fidelity Investments
Responsibilities:
- Migrated an on-premises Oracle database to Oracle RDS on AWS for a banking project
- Created Python scripts to automate the database migration using Concourse pipelines
- Prepared pipeline YAML files and CloudFormation JSON templates, and used Git to push them to Bitbucket
- Created Concourse pipelines and built Docker images from an existing base image
- Validated the database migration end to end using Unix shell scripts and SQL*Plus on Linux servers
- Performed Oracle database export and import using command-line clients, the Data Pump API, and the Metadata API (see the Data Pump sketch after this list)
- Worked on machine learning and deep learning models (spaCy, Bi-LSTM, Spark NLP) using Python and PySpark to create a digital encyclopedia of Indianapolis (see the spaCy sketch after this list)
- Built machine learning and deep learning models to extract causal relationships between diseases
- Created a Tableau dashboard for Informatics department project management to support key funding decisions
- Analyzed, validated, and cleaned data using Excel; performed data modeling in MySQL and data mining with complex SQL queries
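
A minimal sketch of driving a Data Pump export from Python, assuming the expdp command-line client is installed; the connect string, schema, and directory object are placeholders:

    import subprocess

    def export_schema(schema="EXAMPLE_SCHEMA", dumpfile="example.dmp"):
        # Invoke the expdp command-line client to export one schema
        cmd = [
            "expdp",
            "system/password@sourcedb",      # placeholder connect string
            f"schemas={schema}",
            "directory=DATA_PUMP_DIR",       # server-side directory object
            f"dumpfile={dumpfile}",
            f"logfile={schema.lower()}_export.log",
        ]
        subprocess.run(cmd, check=True)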
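
A small spaCy sketch in the spirit of the encyclopedia NLP work; the pretrained model and sample sentence are illustrative, not project artifacts:

    import spacy

    nlp = spacy.load("en_core_web_sm")  # small pretrained English pipeline
    doc = nlp("The Indianapolis Motor Speedway opened in 1909.")

    for ent in doc.ents:
        # Named entities (places, dates, organizations) feed encyclopedia articles
        print(ent.text, ent.label_)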
Data Analyst
Confidential
Responsibilities:
- Analyzed, transformed, and integrated large data sets using Excel and Oracle for Health Benefits Administration projects in the operations domain
- Presented monthly Tableau reports to executives emphasizing business process improvement, achieving a 20% reduction in customer service calls over two months
- Worked directly with clients and vendors to translate between data and business needs; collaborated with cross-functional teams, handled multiple concurrent projects, and managed a team of 4
- Created source-to-target mapping documents and the end-to-end ETL process from source systems through the staging area to the data warehouse for a pharmaceutical client's master data management
- Implemented complex business rules by creating reusable transformations, mappings, mapplets, an exception-handling process, and the Address Doctor package in Informatica Data Quality
- Loaded fact and dimension tables by creating jobs in Informatica PowerCenter, executed testing using Oracle, and achieved an 85% improvement in data quality
- Served as ETL tester on a project building an enterprise data warehouse for a healthcare payer client
- Prepared interface estimates, test cases, scripts, plans, and release notes for multiple interfaces: Provider, Member, Claims, and Groups
- Scripted complex SQL queries such as hierarchical ("walking the tree") and SCD validations, and validated healthcare data and BusinessObjects reports (see the sketch after this list)
- Analyzed defects, created periodic defect status reports in Application Lifecycle Management, performed root cause analysis, and reduced defects by 15%
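
A hedged sketch of a hierarchical ("walking the tree") validation query using Oracle's CONNECT BY syntax; the connection string, table, and column names are hypothetical:

    import cx_Oracle

    conn = cx_Oracle.connect("user/password@testdb")  # placeholder connection
    query = """
        SELECT LEVEL, group_id, parent_group_id, group_name
        FROM member_groups
        START WITH parent_group_id IS NULL
        CONNECT BY PRIOR group_id = parent_group_id
    """
    cursor = conn.cursor()
    cursor.execute(query)
    for level, group_id, parent_id, name in cursor:
        # LEVEL reports each row's depth in the group hierarchy
        print("  " * (level - 1) + name)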
TECHNICAL SKILLS
AWS: Glue, Athena, EMR, EC2, S3, Glacier, Parameter Store, Secrets Manager, EventBridge, Lambda, SNS, CloudWatch, CloudFormation, RDS, Cloud9, CodePipeline, CodeBuild, CodeCommit
Programming Languages: Python, PySpark, Unix shell scripting
CI/CD Tools: Git, Bitbucket, Concourse, Ansible
Data Visualization: Tableau, Matplotlib, Seaborn
Databases: Oracle, MS SQL Server, MySQL
ETL Tools: Informatica PowerCenter, Informatica Data Quality
Defect Management Tools: HP Quality Center, Application Lifecycle Management (ALM)
Machine Learning: Supervised and unsupervised learning
Deep Learning: CNN, RNN, LSTM, Bi-LSTM; NLP: NLTK, spaCy