Data Engineer Resume Phoenix, AZ - Hire IT People

SUMMARY:

Professional experience in the IT industry as a software engineer with a background in the design, development, and testing of applications.
Worked in various domains including Media, Finance, Manufacturing, and E-commerce
Dedicated professional Data Engineer with a solid background in the Hadoop ecosystem, including HDFS, MapReduce, Spark, Hive, Kafka, Pig, Sqoop, and ZooKeeper
Have a deep understanding of workload management, schedulers, scalability and distributed platform architectures
Proficient in Spark programming with Scala and Python for high-volume data processing
Experience in collecting, processing, and aggregating large amounts of streaming data using Kafka and Spark Streaming
Experience in writing Pig Latin scripts and HiveQL Queries for preprocessing and analyzing large volumes of data
Proficient in writing MapReduce programs with Java for data processing in Hadoop
Experience in importing and exporting bulk data between HDFS/Hive/HBase and RDBMS using Sqoop
Experience in working with RDBMS including Oracle and MySQL
Experience in developing scalable solutions using NoSQL databases including Cassandra and HBase
Knowledge of data serialization and familiarity with data formats including SequenceFile, Avro, Parquet, XML, and JSON
Experience with commercial distributions of Hadoop including Hortonworks HDP, Cloudera CDH, and MapR
Experience in working with AWS services such as EC2, EMR, and S3
Involved in Hadoop cluster administration & performance tuning
Experience in all the phases of Data warehouse life cycle involving requirement analysis, design, coding, testing, and deployment
Strong in Core Java, Data Structures and Algorithms, and Object-Oriented Design
Experience in unit testing with JUnit, ScalaTest, and Python unittest
Familiar with various web development technologies including JavaScript, Bootstrap, Ajax, jQuery, Node.js, AngularJS, Hibernate, and Spring
Familiar with software development tools like Git, SVN, JIRA, and Jenkins.
Exposure to various software development methodologies like Agile and Waterfall.
A good team player and self-motivated learner who can work independently in a fast-paced, multitasking environment

TECHNICAL SKILLS:

Hadoop/Spark Ecosystem: Hadoop 2.x, MapReduce, Spark 2.x, Pig 0.12, Hive 0.14, Sqoop 1.4.6, Kafka 0.9.x, Yarn, Mesos, Zookeeper 3.4.x

Programming Language: Java, Scala, Python, SQL, Unix/Bash shell, JavaScript, HTML, CSS, XML

Web Development Framework: JQuery, Ajax, AngularJS, Bootstrap, Hibernate, Spring

Database: Oracle 10g, MySQL 5.x, HBase 0.98, Cassandra 2.1.x

Operating System: Linux, Mac OS, Windows

Cloud Platform: Amazon Web Services EC2/EMR/S3

Environment & Tools: Git/Github, Agile/Scrum, SVN, JIRA, Jenkins

IDE: IntelliJ IDEA, Eclipse, Visual Studio Code

PROFESSIONAL EXPERIENCE:

Confidential, Phoenix, AZ

Data Engineer

Responsibilities:

Designed, developed, implemented, tested, and maintained data ingestion and integrated new enterprise ETL pipelines for the Amex Email Marketing Department using Kafka, batch processing, PySpark, Spark Streaming, and Hive
Developed Kafka producers and consumers using Java and Python to move data from various data sources to different departments
Developed Spark Streaming programs using Python to process real-time data from Kafka along with batch processing
Used Hive as the data warehouse to store structured data.
Validated Avro schemas of data from different data sources
Stored processed data in Hive tables for the Machine Learning team to analyze
Used Git for version control and JIRA for project tracking
Involved in reviewing Functional requirements and designing solutions
Documented system processes and procedures for future reference
Involved in requirements gathering, design, development, and testing
Used shell scripts for administration, maintenance and troubleshooting
Involved in story-driven Agile development methodology and actively participated in daily Scrum meetings

Environment: Hadoop 2.x, HDFS, Kafka, Spark 2.x, Spark Streaming, Hive, Avro, Java, Python 3.5, Python unittest, Maven, Jenkins, Git, JIRA

Confidential, Sunnyvale, CA

Data Engineer

Responsibilities:

Involved in ETL processes including data processing and data storage.
Used Spark with Scala for batch data processing
Designed, developed, and implemented Kafka streaming in Scala, including producers and consumers.
Processed data between different Kafka topics in Avro format
Used Sqoop to import and export data between an Oracle database and HDFS
Configured the Sqoop incremental import job to import updated input data
Converted raw data to serialized data formats such as Avro to reduce data processing time and increase data transfer efficiency over the network
Involved in application performance tuning and troubleshooting
Collaborated and tracked work with Git and JIRA
Actively participated in and provided constructive feedback during daily stand-up meetings and weekly iteration review meetings

Environment: Hadoop 2.x, Kafka, HDFS, Sqoop 1.4.6, Spark 2.x, Scala, ScalaTest, Jenkins, Git, JIRA, Agile

Confidential, Piscataway, NJ

Data Engineer

Responsibilities:

Designed, developed, implemented, tested, and maintained data ingestion and integration ETL pipelines using Kafka, batch processing, Spark Streaming, and Cassandra
Developed Kafka consumers that efficiently ingested data from various data sources
Developed Spark Streaming programs to process real-time data from Kafka with both stateless and stateful transformations
Developed Spark programs with Scala and applied principles of functional programming to do batch processing
Used Spark SQL with the DataFrame API for efficient structured data processing.
Built Cassandra data models based on different requirements
Stored both the raw data and processed results in Cassandra for future decision support and BI analytics
Configured ZooKeeper to coordinate and support Kafka, Spark, Cassandra, and HDFS
Deployed services on AWS and used Lambda functions to trigger the data pipeline.
Performed unit testing using ScalaTest
Used Git for version control and JIRA for project tracking

Environment: Hadoop 2.x, HDFS, Kafka 0.9.x, Spark 2.x, Spark Streaming, Spark SQL, Cassandra 2.1.x, Zookeeper 3.4.x, ScalaTest, AWS, Git, JIRA

Confidential, Chicago, IL

Hadoop Developer

Responsibilities:

Involved in ETL processes including data processing and data storage.
Used Spark with Scala for batch data processing and stored the output in HBase for scalable storage and fast queries
Designed and created Hive tables and worked on performance optimizations such as partitioning and bucketing
Implemented custom Hive UDFs and analyzed large data sets using HiveQL
Migrated MapReduce jobs and Hive queries to Spark transformations and actions to improve performance
Used Sqoop to import and export data between an Oracle database and HDFS
Configured the Sqoop incremental import job to import updated input data
Converted raw data to serialized data formats such as Avro and Parquet to reduce data processing time and increase data transfer efficiency over the network
Involved in application performance tuning and troubleshooting
Collaborated and tracked work with Git and JIRA
Actively participated in and provided constructive feedback during daily stand-up meetings and weekly iteration review meetings

Environment: Hadoop 2.x, MapReduce, HDFS, Sqoop 1.4.6, Hive 0.14, Spark 1.4.x, Scala, HBase 0.98, Git, JIRA, Agile

Confidential, Shenyang, CN

Java Developer

Responsibilities:

Developed user interface using HTML, CSS3 and JavaScript for the presentation tier
Used JSP and JavaScript for encapsulating presentation for sales module
Developed a controller servlet to handle all requests and MySQL database access.
Involved in integration with Spring and developing ORM using Hibernate
Installed and configured Apache Tomcat
Deployed the application and supported and maintained its regular operation on the server.

Environment: Java, Servlet 3.0, JSP 2.2, HTML, CSS3, JavaScript, Spring MVC, Hibernate 4.0, Apache Tomcat 7.0, MySQL 5.1.54, Eclipse

Confidential, Shenyang, CN

Java Developer

Responsibilities:

Designed and coded application components with JSP, Servlet and AJAX.
Implemented data persistence using JDBC for database connectivity and Hibernate for database/Java object mapping.
Designed the logical and physical data models and generated DDL and DML scripts.
Designed the user interface and used JavaScript for input validation.
Wrote MySQL queries, stored procedures and database triggers as required on the database objects.
