Sr. Data Engineer Resume
New York, NY
SUMMARY
- 8+ years of IT experience in the analysis, design, development, management, and implementation of stand-alone and client-server enterprise applications in Python across various domains.
- Experienced in Agile methodologies, Scrum stories, and sprints in Python-based environments.
- Experience in developing Web-Applications implementing Model View Template architecture using Python and Django web application framework.
- Hands-on experience in GCP: BigQuery, GCS buckets, Cloud Functions, Cloud Dataflow, Pub/Sub, Cloud Shell, gsutil and bq command-line utilities, Dataproc, and Stackdriver.
- Experienced in web application development using Django/Python, Node.js, Angular, AngularJS, Dojo, and jQuery, with HTML/CSS/JS for server-side rendered applications.
- Experienced in implementing Cloud solutions in AWS.
- Experience in using version control systems such as Git and GitHub, and in deploying to Amazon EC2 and Heroku.
- Relevant Experience in working with various SDLC methodologies like Agile Scrum for developing and delivering applications.
- Demonstrated experience in delivering data and analytic solutions leveraging AWS, Azure or similar cloud Data Lake.
- Experience in developing web services (WSDL, SOAP and REST) and consuming web services with python programming language.
- Well versed with design and development of presentation layer for web applications using technologies like HTML5, CSS3, and JavaScript, Bootstrap.
- Experience in working with Python ORM Libraries including Django ORM, SQLAlchemy.
- Experience in working with various Python IDEs and editors, including PyCharm, Eclipse, Sublime Text, and Notepad++.
- Worked with Cloudera and Hortonworks distributions.
- Experienced in performing code reviews and close involvement in smoke testing sessions, retrospective sessions.
- Experienced in Microsoft Business Intelligence tools, developing SSIS (Integration Services), SSAS (Analysis Services), and SSRS (Reporting Services), building Key Performance Indicators and OLAP cubes.
- Good exposure to star and snowflake schemas, data modeling, and a range of data warehouse projects.
- Experience in GCP Dataproc, GCS, Cloud Functions, and BigQuery.
- Expertise in writing DDL and DML scripts in SQL and HQL for analytics applications on RDBMS.
- Proficient in writing SQL queries, stored procedures, functions, packages, tables, views, and triggers using relational databases such as Oracle and MySQL, and non-relational databases (MongoDB).
- Experience in working with NoSQL databases like HBase and Cassandra.
- Proficient in using defect/issue/bug tracking tools such as Atlassian Jira and Bugzilla.
- Excellent interpersonal and communication skills, efficient time management and organization skills, ability to handle multiple tasks and work well in a team environment.
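The DDL/DML and star-schema skills above can be illustrated with a minimal sketch; the schema and data below are hypothetical, and SQLite stands in for Oracle/MySQL so the example is self-contained:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
cur = conn.cursor()

# DDL: a minimal star-schema pair -- one dimension, one fact table.
cur.executescript("""
CREATE TABLE dim_product (
    product_id INTEGER PRIMARY KEY,
    name       TEXT NOT NULL
);
CREATE TABLE fact_sales (
    sale_id    INTEGER PRIMARY KEY,
    product_id INTEGER REFERENCES dim_product(product_id),
    amount     REAL NOT NULL
);
""")

# DML: load rows, then run an analytic aggregate over the join.
cur.executemany("INSERT INTO dim_product VALUES (?, ?)",
                [(1, "widget"), (2, "gadget")])
cur.executemany("INSERT INTO fact_sales VALUES (?, ?, ?)",
                [(1, 1, 10.0), (2, 1, 5.0), (3, 2, 7.5)])

rows = cur.execute("""
    SELECT p.name, SUM(f.amount) AS total
    FROM fact_sales f JOIN dim_product p USING (product_id)
    GROUP BY p.name ORDER BY total DESC
""").fetchall()
print(rows)  # [('widget', 15.0), ('gadget', 7.5)]
```

The same fact/dimension split and GROUP BY aggregation carry over directly to a warehouse-scale star schema.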
PROFESSIONAL EXPERIENCE
Sr. Data Engineer
Confidential, New York, NY
Responsibilities:
- Participated in various stages of the Software Development Life Cycle (SDLC) and worked in an Agile (Scrum) development environment.
- Developed REST-based applications using Python, Django, Angular, CSS, HTML, TypeScript, and Node.js, following W3C standards.
- Built best-practice ETLs with Apache Spark to load and transform raw data into easy-to-use dimensional data for self-service reporting.
- Developed Angular application using Typescript, and the Angular CLI front end from scratch.
- Wrote helper scripts with Boto3 to launch EC2 instances and monitor instance status with email notifications.
- Developed operational analytics, financial analytics, model building and enrichment, and a prediction engine for both batch and real-time workloads using Java, Storm, Kafka, Akka, Spark MLlib, and scikit-learn.
- Worked with Python OpenStack APIs and used Python scripts to update content in the database and manipulate files.
- Developed automated Python scripts using the Boto3 library for AWS security auditing and reporting via AWS Lambda across multiple AWS accounts.
- Used Beautiful Soup 4 (Python library) for web scraping to extract data for building graphs.
- Worked with PyQuery for selecting particular DOM elements when parsing HTML.
- Used GitHub pull requests to improve code quality and conducted peer review meetings.
- Created local virtual repositories for project and release builds, managed repositories in Maven to share snapshots, and worked with the NoSQL database MongoDB.
- Involved in the CI/CD (Jenkins) process for application deployments, enforcing strong source code repository management and securing configuration files away from application source code for improved security.
- Used the version control tool Git.
- Worked on Jira for managing the tasks and improving the individual performance.
Environment: Python, Django, Angular, TypeScript, Node.js, NPM, HTML5, Bootstrap, Visual Studio, AWS, S3, EC2, Beautiful Soup, Jenkins, Maven, Git, Jira, Agile, Windows.
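The web-scraping work above used Beautiful Soup 4; as a self-contained illustration of the same kind of extraction, the sketch below uses Python's standard-library html.parser instead (the markup and the PriceExtractor class are hypothetical):

```python
from html.parser import HTMLParser

class PriceExtractor(HTMLParser):
    """Collect numeric values inside <span class="price"> elements."""
    def __init__(self):
        super().__init__()
        self._in_price = False
        self.prices = []

    def handle_starttag(self, tag, attrs):
        # attrs is a list of (name, value) pairs.
        if tag == "span" and ("class", "price") in attrs:
            self._in_price = True

    def handle_endtag(self, tag):
        if tag == "span":
            self._in_price = False

    def handle_data(self, data):
        if self._in_price:
            self.prices.append(float(data.strip().lstrip("$")))

html = '<div><span class="price">$10.50</span><span class="price">$3.25</span></div>'
parser = PriceExtractor()
parser.feed(html)
print(parser.prices)  # [10.5, 3.25]
```

With Beautiful Soup the same extraction would be a one-line `soup.select("span.price")`; the stdlib version is shown only so the sketch runs without third-party dependencies.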
Sr. Data Engineer
Confidential, Evansville, IN
Responsibilities:
- Worked in an Agile environment, actively participating in sprint planning to apply coding standards and guidelines for efficient and effective Python programming.
- Involved in developing web applications and implementing Model-View-Controller (MVC) architecture using server-side frameworks like Django.
- Developed automated Python scripts using the Boto3 library for AWS security auditing and reporting via AWS Lambda across multiple AWS accounts.
- Developed back-end web services for the worker using Python and REST APIs, and implemented MVC architecture in the web application with the Django framework.
- Created a secure login/registration application using the Django Auth package, secured on the front end with proper RESTful services.
- Worked with WEB API's to make calls to the web services using URLs, which would perform GET, PUT, POST and DELETE operations on the server.
- Implemented AJAX for dynamic functionality of web pages in front-end applications.
- Used REST web services for creating rate summaries.
- Created indexes on MySQL tables to improve performance by eliminating full table scans, and created views to hide the underlying tables and reduce the complexity of large queries.
- Developed GUI using Python and Django for dynamically displaying the test block documentation and other features of Python code using a web browser.
- Used Jenkins automation for continuous integration and continuous delivery (CI/CD) on Amazon EC2.
- Maintained the Version and Backups of the source using GitHub.
- Updated the storyboard, organized the sprint dashboard, and participated in story grooming for future sprint planning and preparation.
Environment: Python, Django, Pandas, HTML5, CSS3, Bootstrap, XML, AWS, EC2, Boto3, REST API, AJAX, Jenkins, PyCharm, MySQL, GitHub, Agile, Windows.
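The MySQL indexing work described above rests on replacing full table scans with index lookups; the sketch below demonstrates the principle with SQLite standing in for MySQL so it is self-contained (table and index names are hypothetical):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
cur = conn.cursor()
cur.execute("CREATE TABLE orders (id INTEGER PRIMARY KEY, customer_id INTEGER, total REAL)")
cur.executemany("INSERT INTO orders VALUES (?, ?, ?)",
                [(i, i % 100, float(i)) for i in range(1000)])

# Without an index, the lookup scans every row.
plan_before = cur.execute(
    "EXPLAIN QUERY PLAN SELECT * FROM orders WHERE customer_id = 42").fetchall()

# After creating the index, the same query resolves via an index search.
cur.execute("CREATE INDEX idx_orders_customer ON orders (customer_id)")
plan_after = cur.execute(
    "EXPLAIN QUERY PLAN SELECT * FROM orders WHERE customer_id = 42").fetchall()

print(plan_before[0][3])  # detail column: typically a SCAN of orders
print(plan_after[0][3])   # detail column: SEARCH ... USING INDEX idx_orders_customer
```

The EXPLAIN QUERY PLAN detail string is the quickest way to confirm an index is actually being used before and after a schema change.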
Data Engineer
Confidential, Santa Rosa, NM
Responsibilities:
- Coded model-level validation, provided guidance on long-term architectural design decisions, and followed Agile methodology and the Scrum process.
- Reused existing Python and Django modules, rewriting them to deliver data in the required formats.
- Built database Model, APIs and Views utilizing Python, to build an interactive web-based solution.
- Worked on object-oriented programming (OOP) concepts using Python and Linux.
- Embedded AJAX in UI to update small portions of the web page avoiding the need to reload the entire page.
- Used Python's built-in urllib2 module together with the Beautiful Soup library for web scraping.
- Designed Django REST web services using Python and Django to get and post data.
- Used Python and Django for creating graphics, XML processing of documents, data exchange, and business logic implementation between servers.
- Handled exceptions and wrote test cases in Python scripts to keep the website from rendering error codes.
- Using Gitlab for continuous integration and deployment and Git version control system for collaborating with teammates and maintaining code versions.
- Logged user stories and acceptance criteria in JIRA for features by evaluating output requirements and formats.
Environment: Python, Django, HTML5, CSS3, Oracle, Git, Jira, Agile, Windows.
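The exception-handling and test-case work above can be sketched as a small, self-contained example; the helper function and payloads are hypothetical:

```python
def safe_fetch_status(payload):
    """Return the 'status' field from a parsed response dict, lowercased,
    falling back to 'error' instead of raising and surfacing an error code."""
    try:
        return payload["status"].lower()
    except (KeyError, AttributeError):
        # Missing key or a non-string value -- degrade gracefully.
        return "error"

# Test cases exercising the normal path and both failure paths.
assert safe_fetch_status({"status": "OK"}) == "ok"   # normal response
assert safe_fetch_status({}) == "error"              # field missing
assert safe_fetch_status({"status": None}) == "error"  # malformed value
print("all checks passed")
```

Catching the specific exceptions (rather than a bare `except`) keeps genuine bugs visible while preventing predictable bad inputs from leaking error codes to the page.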
Hadoop Engineer/Developer
Confidential
Responsibilities:
- Designed and developed applications on the data lake to transform data according to business users' analytics needs.
- Responsible to manage data coming from different sources and involved in HDFS maintenance and loading of structured and unstructured data.
- Worked with different file formats such as CSV, TXT, and fixed-width to load data from various sources into raw tables.
- Conducted data model reviews with team members and captured technical metadata through modelling tools.
- Implemented the ETL process: wrote and optimized SQL queries to perform data extraction and merging from the SQL Server database.
- Loaded logs from multiple sources into HDFS using Flume.
- Worked with NoSQL databases like HBase in creating HBase tables to store large sets of semi-structured data coming from various data sources.
- Involved in designing and developing tables in HBase and storing aggregated data from Hive tables.
- Involved in converting Hive/SQL queries into Spark transformations using Spark RDDs and Scala.
- Performed data cleaning, pre-processing, and modelling using Spark and Python.
- Strong experience in writing SQL queries.
- Responsible for triggering the jobs using the Control-M.
Environment: Python, SQL, ETL, Hadoop, HDFS, Spark, Scala, Kafka, HBase, MySQL, Netezza, Web Services, Shell Script, Control-M.
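Converting Hive/SQL aggregates into Spark transformations typically follows a filter → map → reduceByKey pattern; the sketch below mimics that pattern in plain Python so it runs without a Spark cluster (the rows and keys are hypothetical, and PySpark's RDD API is swapped for stdlib equivalents):

```python
from collections import defaultdict

# Hypothetical raw rows, equivalent to a Hive table of (region, amount).
rows = [("east", 10.0), ("west", 5.0), ("east", -1.0), ("west", 7.5)]

# filter: drop bad records (negative amounts), like rdd.filter(...)
cleaned = filter(lambda r: r[1] >= 0, rows)

# map: key each record by region, like rdd.map(lambda r: (r[0], r[1]))
keyed = map(lambda r: (r[0], r[1]), cleaned)

# reduceByKey(add): sum amounts per region.
totals = defaultdict(float)
for region, amount in keyed:
    totals[region] += amount

print(dict(totals))  # {'east': 10.0, 'west': 12.5}
```

In PySpark the three stages chain lazily on an RDD and only execute on an action such as `collect()`; the plain-Python version evaluates eagerly but expresses the same per-key aggregation a Hive `GROUP BY` would.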
Data Analyst
Confidential
Responsibilities:
- Documented the system requirements to meet end-state requirements and compiled the Software Requirement Specification and Use Case documents.
- Prepared ETL (Extract, Transform and Load) standards, naming conventions and wrote ETL flow documentation.
- Used Microsoft SharePoint to upload, manage all project related documents and have version control.
- Analyzed business requirements and segregated them into high-level and low-level requirements; converted business requirements into a Functional Requirements Document.
- Prepared dashboards using calculations and parameters in Tableau and generated KPI reports to be analyzed by management.
- Worked on SQL queries for data manipulation.
- Arranged weekly team meetings to assign testing tasks and acquisition of status reports from individual team members.
- Effectively managed change by deploying change management techniques such as change assessment, impact analysis, and root cause analysis.
- Used advanced Excel functions to generate spreadsheets and pivot tables.
- Presented solutions in written reports while analyzing, designing, testing and monitoring systems in a waterfall methodology.
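The pivot-table reporting described above can be sketched in plain Python; the records are hypothetical, and a nested dict stands in for an Excel pivot so the example is self-contained:

```python
from collections import defaultdict

# Hypothetical rows as they might come from a SQL extract: (month, category, amount).
records = [
    ("Jan", "hardware", 100.0),
    ("Jan", "software", 40.0),
    ("Feb", "hardware", 60.0),
    ("Feb", "software", 80.0),
]

# Pivot: months as rows, categories as columns, summed amounts as values.
pivot = defaultdict(lambda: defaultdict(float))
for month, category, amount in records:
    pivot[month][category] += amount

for month, cols in pivot.items():
    print(month, dict(cols))
# Jan {'hardware': 100.0, 'software': 40.0}
# Feb {'hardware': 60.0, 'software': 80.0}
```

The same row/column/value split is what an Excel pivot table or a Tableau crosstab computes; doing it in code makes the aggregation reproducible.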