Data Engineer/ Python Developer Resume

Green Bay, WI

TECHNICAL SKILLS

Technologies: AWS, EMR, Spark, Apache Airflow, Docker

Languages: Python, R

Web Technologies: HTML, CSS, XML, JavaScript

Databases: Postgres, Redshift, Athena, DynamoDB, Oracle, MS Access, SQL Server

Platforms: Windows and Linux

Miscellaneous Tools: GitLab, Tableau, Power BI, MS Visio, OneNote, JIRA, VersionOne

Domains: Agriculture, Financial, Insurance, Healthcare, Telecom

PROFESSIONAL EXPERIENCE

Confidential, Green Bay, WI

Data Engineer/ Python Developer

Responsibilities:

  • Gathered requirements for ongoing projects by working closely with the Data Science and business teams.
  • Used the Insomnia REST client for making HTTP requests (JSON input) and debugging APIs.
  • Wrote shell scripts to run the Flask application on Linux servers and OpenShift containers.
  • Modeled all data sources ingested into the web application in MS Visio.
  • Applied machine learning algorithms and neural networks to generate offers for customers.
  • Implemented exception handling in Python to add logging to the application (see the sketch after this list).
  • Ran git filter-branch on a repository to clean up and organize the Git history.
  • Used Python, Spark, and EMR clusters for data integrations.
  • Used Jira for project management and GitHub as the code repository.
  • Used Power BI and Tableau for data visualization and analytics.
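
A minimal sketch of the exception-handling and logging pattern mentioned above; the handler, field, and log-file names are illustrative, not the application's actual ones.

    import logging

    # Application-level logger; the log file name is a placeholder.
    logging.basicConfig(
        filename="app.log",
        level=logging.INFO,
        format="%(asctime)s %(levelname)s %(name)s: %(message)s",
    )
    logger = logging.getLogger(__name__)

    def score_request(payload):
        """Illustrative handler body that logs failures with a traceback."""
        try:
            customer_id = payload["customer_id"]  # may raise KeyError on bad input
            return {"customer_id": customer_id, "status": "offer generated"}
        except KeyError:
            logger.exception("Missing field in request payload")
            raise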

Technical Environment: Oracle, GitHub, Python, R, Linux, AWS, OpenShift

Confidential, Durham, NC

Data Engineer

Responsibilities:

  • Gathered requirements for ongoing projects by working closely with subject matter experts and product selection leads across EAME and North America.
  • Built Sqoop jobs to migrate data from sources such as Oracle and SQL Server to an Amazon S3 bucket.
  • Created web APIs to ingest data from the S3 bucket (AWS) into the web application.
  • Used AWS AppSync (GraphQL) for web API creation and data synchronization into Aurora PostgreSQL or DynamoDB.
  • Used AWS Glue to catalog the S3 bucket (data lake) and load it into Athena.
  • Used AWS Lambda to trigger an SQS queue that migrates data back and forth from the S3 bucket, and AWS SNS for pub-sub topic creation (see the first sketch below).
  • Wrote Python and Spark programs on EMR clusters for data integration.
  • Built DAG jobs in Apache Airflow for scheduling (see the second sketch below).
  • Created the solution design document and architecture for an enterprise project.
  • Performed data engineering tasks that involved applying statistical analysis to a high-volume input file (30 GB) and running it as a parallel process on a high-performance cluster.
  • Used Gurobi optimization in Python for predictive analysis.
  • Modeled all data sources ingested into the web application in MS Visio.
  • Used Jira for project management and GitHub as the code repository.
  • Used Power BI and Tableau for data visualization and analytics.
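
The first sketch referenced above: a minimal boto3 Lambda handler that forwards S3 object keys to an SQS queue and publishes to an SNS topic. The queue URL, topic ARN, and event handling are illustrative assumptions, not the project's actual resources.

    import json
    import boto3

    sqs = boto3.client("sqs")
    sns = boto3.client("sns")

    # Placeholder resource identifiers.
    QUEUE_URL = "https://sqs.us-east-1.amazonaws.com/123456789012/data-migration-queue"
    TOPIC_ARN = "arn:aws:sns:us-east-1:123456789012:data-migration-events"

    def lambda_handler(event, context):
        """Triggered by an S3 event: enqueue each object key for migration and
        notify subscribers on the pub-sub topic."""
        for record in event.get("Records", []):
            key = record["s3"]["object"]["key"]
            sqs.send_message(QueueUrl=QUEUE_URL, MessageBody=json.dumps({"key": key}))
            sns.publish(TopicArn=TOPIC_ARN, Message=f"Queued {key} for migration")
        return {"statusCode": 200}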

Technical Environment: AWS (EC2, S3, Lambda, SQS & SNS, Glue, Athena, AWS Amplify, WorkSpaces, Aurora DB, etc.), Boto3, R, SQL Server, EMR, Hadoop, Oracle, Spark, Python, JSON, XML & CSV files, GitHub, Jira.
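
The second sketch referenced above: a minimal Apache Airflow DAG with two scheduled tasks. The DAG id, schedule, and task callables are illustrative assumptions.

    from datetime import datetime

    from airflow import DAG
    from airflow.operators.python import PythonOperator

    def extract():
        print("pulling source data")         # placeholder extract step

    def load():
        print("loading into the warehouse")  # placeholder load step

    # Illustrative daily two-step pipeline.
    with DAG(
        dag_id="daily_data_integration",
        start_date=datetime(2021, 1, 1),
        schedule_interval="@daily",
        catchup=False,
    ) as dag:
        extract_task = PythonOperator(task_id="extract", python_callable=extract)
        load_task = PythonOperator(task_id="load", python_callable=load)
        extract_task >> load_task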

Confidential, Richmond, VA

Data Engineer

Responsibilities:

  • Leveraged the Python development environment for data analysis and report building.
  • Built General Ledger reports using the 4sight reporting tool, which runs on Apache Tomcat.
  • Utilized Amazon Web Services to efficiently move on-premises data to the cloud.
  • Replaced VBA macros with Python using PyXLL, a Python add-in for Microsoft Excel (see the sketch after this list).
  • Worked with the SQL Server and Oracle database engines to write stored procedures and triggers and to query data.
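
A minimal sketch of a PyXLL worksheet function of the kind used to replace VBA macros; it assumes the PyXLL add-in is installed and configured in Excel, and the function itself is a hypothetical example.

    from pyxll import xl_func

    @xl_func
    def gl_variance(actual, budget):
        """Hypothetical worksheet function: variance between actual and budgeted
        General Ledger amounts, callable from a cell in place of a VBA macro."""
        return actual - budget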

Technical Environment: AWS (EC2, S3, Redshift & CFT), R, SQL Server, Oracle, Spark, Python, Oracle Java, XML & CSV files.

Confidential, Richmond, VA

Data Analyst/Python Developer

Responsibilities:

  • Automated one of the Enterprise Operational Risk Management reports as part of the Risk Analytical Solutions team.
  • Used the openpyxl module in Python to format Excel files.
  • Used the Python win32com.client library to write macros as a replacement for Visual Basic in Excel.
  • Wrote Python scripts to pull data from the Redshift database, manipulate it as required with conditional functions, and store it in data frames.
  • Loaded data from the pandas data frames into the team's user-defined space in the Redshift database using the COPY command from an AWS S3 bucket (see the first sketch after this list).
  • Created action filters to make the Tableau dashboard interactive.
  • Followed Tableau performance guidelines, such as extracting data from the database as a view rather than a custom SQL query and aggregating the data to test functionality before loading the complete data set.
  • Scheduled the reports for quarterly refresh on the SAMGW server.
  • Played a key role in moving data to the cloud (Oracle OBIEE to AWS Redshift) as part of the HR Data Management team.
  • Parsed XML files and JSON documents with Python scripts to load their data into the database (see the second sketch at the end of this section).
  • Extensively used Python modules such as NumPy, pandas, xmltodict, pycompare, datetime, and SQLAlchemy to perform data analysis.
  • Managed storage in AWS using S3; created volumes and configured snapshots.
  • Created EC2 instances to run automated Python scripts.
  • Automated EC2 instances using AWS CloudFormation templates.
  • Wrote Python scripts to validate and test the source-to-target mapping (STTM) migration from Oracle to Redshift.
  • Reimplemented in Python ETL logic that was originally written in Scala.
  • Used Hydrograph as an ETL tool for loading the data.
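
The first sketch referenced in the list above: loading S3 data into Redshift with the COPY command via psycopg2. The cluster endpoint, schema, table, bucket, and IAM role are placeholders.

    import psycopg2

    # Placeholder connection details and resource names.
    conn = psycopg2.connect(
        host="example-cluster.abc123.us-east-1.redshift.amazonaws.com",
        port=5439,
        dbname="analytics",
        user="etl_user",
        password="***",
    )

    COPY_SQL = """
        COPY team_space.risk_report
        FROM 's3://example-bucket/exports/risk_report.csv'
        IAM_ROLE 'arn:aws:iam::123456789012:role/redshift-copy-role'
        FORMAT AS CSV
        IGNOREHEADER 1;
    """

    # The data frame is written to S3 first (e.g. df.to_csv plus a boto3 upload);
    # Redshift then reads the file directly from the bucket.
    with conn, conn.cursor() as cur:
        cur.execute(COPY_SQL)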

Technical Environment: AWS (EC2, S3, Redshift & CFT), Python, Oracle OBIEE, Hydrograph, SAMGW server, XML & CSV files, Scala.
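
The second sketch referenced in the list above: parsing an XML export with xmltodict and loading it into a staging table with pandas and SQLAlchemy. File names, element names, and the connection string are placeholders; JSON documents would be handled the same way via json.load.

    import pandas as pd
    import xmltodict
    from sqlalchemy import create_engine

    # Parse the XML export into a dict; the element names are hypothetical.
    with open("hr_extract.xml") as f:
        doc = xmltodict.parse(f.read())
    records = doc["employees"]["employee"]

    # Flatten the parsed records and load them into a staging table.
    df = pd.json_normalize(records)
    engine = create_engine("postgresql+psycopg2://etl_user:***@example-host:5439/analytics")
    df.to_sql("stg_employees", engine, if_exists="replace", index=False)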
