Python Developer/ Data Engineer Resume

OH

SUMMARY

  • 8 years of experience as a Python Developer/Data Engineer, with expertise in the design, development, and implementation of applications based on the Django and Flask web frameworks and client-server technologies, RESTful services, AWS, C, and SQL.
  • Skilled in Python with proven expertise in using tools and libraries like NumPy, SciPy, matplotlib, PyTest, python-twitter, Pandas, requests, urllib2, etc.
  • Good experience in developing web applications implementing Model-View-Controller architecture using the Django and Flask web application frameworks.
  • Experienced in working with various Python Integrated Development Environments like PyCharm, Spyder, PyStudio, PyDev, and Sublime Text.
  • Knowledge of big data technologies like Hadoop and Spark/PySpark.
  • Extensively worked on Spark Streaming and Apache Kafka to fetch live stream data (a minimal sketch follows this list).
  • Strong expertise in the development of web-based applications using Python, Django, HTML, XML, AngularJS, CSS, REST APIs, JavaScript, JSON, and jQuery.
  • Experience with Agile methodologies, Scrum stories, and sprints in a Python-based environment, along with data analytics, data wrangling, and Excel data extracts.
  • Experienced with setup, configuration, and maintenance of the ELK stack (Elasticsearch, Logstash, and Kibana).
  • Strictly follow the PEP 8 coding standard and test programs across test cases to ensure the validity and effectiveness of code, using PyChecker and Pylint.
  • Experience with both relational (SQL) and non-relational (NoSQL) databases like MongoDB, Redis, Oracle, MySQL, SQLite, PostgreSQL, and Greenplum, using ORM and ODM.
  • Hands-on experience with AWS EC2, S3, Redshift, EMR, RDS, Lambda, AMI, VPC, Elastic IP addresses, and Load Balancer, plus cloud system design, implementation, and troubleshooting (AWS, Rackspace, Google Cloud, Azure).
  • Experience in using various version control systems like CVS, Git, and GitHub, and in deployment using Heroku.
  • Write, maintain, and improve automation scripts in Python and Bash.
  • Working knowledge of UNIX and Linux shell environments using command-line utilities; also set up development environments in Linux/OS X.
  • Proficient in writing SQL queries, stored procedures, functions, packages, tables, views, and triggers using relational databases like Oracle, DB2, and MySQL.
  • Experience in working with Python ORM libraries, including the Django ORM.
  • Experience in Test Driven Development and Behavior Driven Development methodologies for consulting firms and enterprise projects.
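
Illustrative sketch (not from any specific project) of the Spark Streaming plus Kafka pattern referenced above: a minimal PySpark Structured Streaming job that subscribes to a Kafka topic. The broker address, topic name, and console sink are assumptions chosen for the example, and the job requires the spark-sql-kafka connector package on the classpath.

# Minimal Spark Structured Streaming sketch for reading a live Kafka topic.
# Broker address, topic name, and the console sink are placeholders.
from pyspark.sql import SparkSession
from pyspark.sql.functions import col

spark = (SparkSession.builder
         .appName("kafka-stream-example")
         .getOrCreate())

# Subscribe to a Kafka topic; each record arrives with binary key/value columns.
events = (spark.readStream
          .format("kafka")
          .option("kafka.bootstrap.servers", "broker1:9092")
          .option("subscribe", "events-topic")
          .load()
          .select(col("value").cast("string").alias("payload"),
                  col("timestamp")))

# Write the decoded payloads to the console for inspection.
query = (events.writeStream
         .format("console")
         .outputMode("append")
         .option("truncate", "false")
         .start())

query.awaitTermination()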

TECHNICAL SKILLS

Languages: Python, C, Ruby, shell scripting.

Web Design: HTML5, XHTML, CSS3, JSP, AJAX

Databases: Microsoft SQL Server, SQLite, MySQL, PostgreSQL, DB2, MongoDB, Cassandra, Redis

Frameworks: Django, Flask, Pyramid, Pyjamas, Jython, AngularJS, Node.js, Spring, Hibernate

Python Libraries: ReportLab, NumPy, SciPy, Matplotlib, httplib2, urllib2, Beautiful Soup, Pickle, Pandas

Application and Web Servers: Apache Tomcat, JBoss, WEBrick, Phusion Passenger

Big Data Ecosystems: HDFS, Apache Spark, AWS EMR, PySpark

Version Control Systems: CVS, SVN, Git and GitHub.

Deployment tools: Amazon EC2, Heroku

Operating Systems: Windows, Linux, Unix

Protocols: HTTP/HTTPS, TCP/IP, SOAP, SMTP

Other Tools: MS Office (MS-Excel, MS-PowerPoint, MS-Project 2013), Visio 2013

PROFESSIONAL EXPERIENCE

Confidential, OH

Python Developer/ Data Engineer

Responsibilities:

  • Analyzed the SQL and SAS scripts and designed the solution to implement using Pandas and PySpark.
  • Built web pages using HTML, CSS, JavaScript and jQuery for QC Reports on data.
  • Developed a Python package to connect to Teradata from Spark and generate QC reports.
  • Worked on high configuration EMR cluster to run Spark jobs.
  • Developed a web application using Python and Django.
  • Worked in Agile methodology, attending daily stand-ups and completing tasks in sprints.
  • Analyzed and resolved data load issues by working with business and technical teams.
  • Built new connections from Spark and Python to databases such as ORE, Snowflake, Redshift, and Teradata.
  • Rewrote legacy Teradata SQL queries for Snowflake as part of the database migration.
  • Analyzed requirements in business meetings and assessed the impact of requirements on different applications.
  • Analyzed, designed, and migrated systems using current technologies like Python and Spark and cloud infrastructure on Amazon Web Services to provide long-term supportability and sustainability.
  • Designed and implemented rules for processing workflows based on new or updated business requirements, using current Python and Spark framework versions.
  • Involved in business requirements, data analysis, and system design meetings.
  • Created an entire web application using Python, Django, and MySQL.
  • Used HTML, CSS and JavaScript to create front end pages using Django Templates and wrote Django Views to implement application functions and business logic.
  • Extracted data from multiple sources, integrated it into a common data model, and loaded it into a target database, application, or file using efficient programming processes.
  • Designed and developed data management system using MySQL and optimized the database queries to improve the performance.
  • Added support for AWS S3 and RDS to host static/media files and the database in the Amazon cloud.
  • Tuned the code with performance and consistency as the main factors of consideration.
  • Developed entire frontend and backend modules using Python on Django Web Framework.
  • Designed and developed data management system using MySQL.
  • Wrote Python scripts to parse XML documents and load the data into the database (see the sketch following this list).
  • Used the GitHub version control tool to coordinate team development.
  • Responsible for debugging and troubleshooting the web application.
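
Illustrative sketch of the XML-to-database load described above, assuming a hypothetical orders.xml layout, an orders table, and placeholder MySQL credentials (using mysql-connector-python).

# Sketch: parse an XML document and load records into MySQL.
# The orders.xml structure, table name, and credentials are hypothetical.
import xml.etree.ElementTree as ET
import mysql.connector

def load_orders(xml_path):
    # Collect (order_id, customer, amount) tuples from the XML document.
    tree = ET.parse(xml_path)
    rows = []
    for order in tree.getroot().findall("order"):
        rows.append((order.get("id"),
                     order.findtext("customer"),
                     float(order.findtext("amount", default="0"))))

    # Placeholder connection details; adjust to the target environment.
    conn = mysql.connector.connect(host="localhost", user="app",
                                   password="secret", database="reports")
    try:
        cur = conn.cursor()
        cur.executemany(
            "INSERT INTO orders (order_id, customer, amount) VALUES (%s, %s, %s)",
            rows)
        conn.commit()
    finally:
        conn.close()

if __name__ == "__main__":
    load_orders("orders.xml")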

Environment: Python 2.7, SQL, Spark 2.1.0, Snowflake, Amazon S3, Elastic MapReduce, Django 1.9, JavaScript, HTML, XHTML, jQuery, JSON, XML, CSS, MySQL, Bootstrap, Git, Linux.

Confidential, Bentonville AR

Data Engineer

Responsibilities:

  • Requirement gathering for ongoing projects by working closely with subject matter experts and product selection leads.
  • Sqoop jobs for migrating data from sources such as Oracle and SQL Server to an Amazon S3 bucket.
  • Web API creation to ingest data from the S3 bucket (AWS) to the web application.
  • Use of AWS AppSync (GraphQL) for web API creation and data synchronization into Aurora Postgres or the DynamoDB engine.
  • AWS Glue for cataloging the S3 bucket (data lake) and making it queryable in Athena.
  • AWS Lambda to trigger an SQS queue to migrate data back and forth from S3 bucket and AWS SNS as pub-sub topic creation.
  • Python and Spark programs on EMR clusters for data integration.
  • DAG jobs in Apache Airflow for scheduling (see the sketch after this list).
  • Solution design document creation and Architecture for an Enterprise project.
  • Data engineering tasks that involved applying statistical analysis to a high-volume input file (30 GB) and running it as a parallel process in a high-performance cluster.
  • Gurobi optimization in Python for predictive analysis.
  • Data modeling in MS Visio of all the data sources ingested into a web application.
  • Jira for project management and GitHub for the code repository.
  • Power BI and Tableau for Data Visualization and Analytics.
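
Illustrative sketch of an Apache Airflow DAG of the kind referenced above; the DAG id, daily schedule, and task callables are placeholders, written against the Airflow 2.x PythonOperator import path.

# Sketch of a daily Airflow DAG; DAG id, schedule, and callables are placeholders.
from datetime import datetime, timedelta

from airflow import DAG
from airflow.operators.python import PythonOperator

def extract_from_s3(**context):
    # Placeholder: pull the day's file from S3 (e.g., with boto3).
    pass

def load_to_postgres(**context):
    # Placeholder: load the staged data into Aurora Postgres.
    pass

default_args = {"retries": 1, "retry_delay": timedelta(minutes=5)}

with DAG(
    dag_id="s3_to_postgres_daily",
    start_date=datetime(2021, 1, 1),
    schedule_interval="@daily",
    default_args=default_args,
    catchup=False,
) as dag:
    extract = PythonOperator(task_id="extract_from_s3",
                             python_callable=extract_from_s3)
    load = PythonOperator(task_id="load_to_postgres",
                          python_callable=load_to_postgres)

    # Run the load only after the extract succeeds.
    extract >> load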

Environment: AWS (EC2, S3, Lambda, SQS & SNS, Glue, Athena, AWS Amplify, Workspace, Aurora DB, etc.), Boto3, R, SQL Server, EMR, Hadoop, Oracle, Spark, Python, JSON, XML & CSV files, GitHub, Jira.

Confidential, Malvern PA

Python Developer/Data Engineer

Responsibilities:

  • Performed efficient delivery of code based on the principles of Test-Driven Development (TDD) and continuous integration, in line with Agile software methodology principles.
  • Developed a fully automated continuous integration system using Git, Gerrit, Jenkins, MySQL, and custom tools developed in Python and Bash.
  • Designed and developed the UI of the website using HTML, AJAX, CSS, and JavaScript.
  • Worked on CSS Bootstrap to develop web applications.
  • Designed ETL Process using Informatica to load data from Flat Files, and Excel Files to target Oracle Data Warehouse database.
  • Interacted with the business community and database administrators to identify the business requirements and data realities.
  • Created various transformations according to the business logic like Source Qualifier, Normalizer, Lookup, Stored Procedure, Sequence Generator, Router, Filter, Aggregator, Joiner, Expression and Update Strategy.
  • Created Informatica mappings using various Transformations like Joiner, Aggregate, Expression, Filter and Update Strategy.
  • Improved workflow performance by shifting filters as close as possible to the source and selecting tables with fewer rows as the master during joins.
  • Used connected and unconnected lookups whenever appropriate, along wif the use of appropriate caches.
  • Created tasks and workflows in the Workflow Manager and monitored the sessions in the Workflow Monitor.
  • Performed maintenance, including managing space, removing bad files, removing cache files, and monitoring services.
  • Set up Permissions for Groups and Users in all Development Environments.
  • Migrated developed objects across different environments.
  • Designed and developed Web services using XML and jQuery.
  • Improved performance by using a more modular approach and more built-in methods.
  • Experienced in Agile methodologies and the Scrum process.
  • Maintained program libraries, user manuals, and technical documentation.
  • Wrote unit test cases for testing tools.
  • Involved in the entire lifecycle of the projects, including design, development, deployment, testing, implementation, and support.
  • Built various graphs for business decision making using the Python matplotlib library (a sketch follows this list).
  • Worked on development of applications, especially in a UNIX environment, and familiar with all its commands.
  • Used NumPy for numerical analysis of insurance premiums.
  • Handled day-to-day issues and fine-tuned the applications for enhanced performance.
  • Implemented code in Python to retrieve and manipulate data.
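
Illustrative sketch of premium analysis and plotting with NumPy and matplotlib, as referenced above; the monthly premium figures and labels are made-up sample data.

# Sketch: summarize premium figures with NumPy and plot them with matplotlib.
# The sample premiums array and labels are illustrative only.
import numpy as np
import matplotlib.pyplot as plt

# Hypothetical monthly premium totals (in USD).
premiums = np.array([1250.0, 1310.5, 1275.2, 1402.9, 1388.4, 1450.1])
months = ["Jan", "Feb", "Mar", "Apr", "May", "Jun"]

mean_premium = premiums.mean()
# Month-over-month percentage change.
pct_change = np.diff(premiums) / premiums[:-1] * 100
print("Month-over-month change (%):", np.round(pct_change, 1))

fig, ax = plt.subplots()
ax.bar(months, premiums, color="steelblue")
ax.axhline(mean_premium, color="firebrick", linestyle="--",
           label=f"Mean = {mean_premium:.0f}")
ax.set_ylabel("Premium (USD)")
ax.set_title("Monthly insurance premiums")
ax.legend()
fig.savefig("premiums.png")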

Environment: Python, Django, MySQL, Linux, Informatica PowerCenter 9.6.1, PL/SQL, HTML, XHTML, CSS, AJAX, JavaScript, Apache Web Server, NoSQL, jQuery.

Confidential, Tampa FL

Data Engineer

Responsibilities:

  • Worked on Cloudera CDH 5.4 distribution of Hadoop.
  • Contributed to new and existing projects using Python, Django, and GraphQL, with deployment to the cloud (AWS).
  • Experience with Amazon Web Services (AWS), in particular EC2, EBS, S3, and SQS.
  • Extensively worked with MySQL to identify required tables and views to export into HDFS.
  • Responsible for moving data from MySQL to HDFS on the development cluster for validation and cleansing.
  • Developed Hive tables on the data using different SerDes, storage formats, and compression techniques.
  • Optimized the datasets by creating dynamic partitioning and bucketing in Hive.
  • Used Pig Latin to analyze datasets and perform transformation according to requirements.
  • Implemented Hive custom UDFs for comprehensive data analysis.
  • Involved in loading data from local file systems to Hadoop Distributed File System.
  • Experience working with Spark SQL and creating RDDs using PySpark (see the sketch following this list).
  • Extensive experience with ETL of large datasets using PySpark in Spark on HDFS.
  • Developed ETL workflow which pushes web server logs to an Amazon S3 bucket.
  • Developed a workflow in Oozie to automate the tasks of loading data into HDFS and pre-processing with Sqoop scripts, Pig scripts, and Hive queries.
  • Exported data from the HDFS environment into an RDBMS using Sqoop.
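
Illustrative sketch of reading raw data from HDFS as an RDD, converting it to a DataFrame, and querying it with Spark SQL, as referenced above; the HDFS path, field layout, and Hive table name are hypothetical.

# Sketch: read raw text from HDFS as an RDD, convert to a DataFrame,
# and query it with Spark SQL. Paths, fields, and table names are placeholders.
from pyspark.sql import SparkSession, Row

spark = (SparkSession.builder
         .appName("hdfs-sparksql-example")
         .enableHiveSupport()
         .getOrCreate())

# Parse pipe-delimited log lines into Rows via the RDD API.
lines = spark.sparkContext.textFile("hdfs:///data/raw/web_logs/")
records = (lines.map(lambda l: l.split("|"))
                .map(lambda f: Row(user_id=f[0], page=f[1], bytes=int(f[2]))))

logs = spark.createDataFrame(records)
logs.createOrReplaceTempView("web_logs")

# Aggregate with Spark SQL and persist the result as a Hive table.
top_pages = spark.sql("""
    SELECT page, COUNT(*) AS hits, SUM(bytes) AS total_bytes
    FROM web_logs
    GROUP BY page
    ORDER BY hits DESC
""")
top_pages.write.mode("overwrite").saveAsTable("analytics.top_pages")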

Environment: Hadoop, PySpark, AWS, SQL, ETL, Sqoop, Hive
