Spark Developer Resume
NC
SUMMARY:
- 6 years of professional IT experience, including 4 years of comprehensive experience as an Apache Hadoop and Spark developer working with related technologies.
- Expertise in writing Hadoop jobs in Java and Scala.
- In-depth understanding of Hadoop architecture and its components, including HDFS, JobTracker, TaskTracker, NameNode, DataNode, MapReduce, Spark, and Spark SQL.
- Experience converting Hive/SQL queries into Spark transformations using Spark RDDs (see the code sketch at the end of this list).
- Experience developing SQL scripts in Spark for handling different data sets and verifying their performance against equivalent MapReduce jobs.
- Experience importing and exporting multi-terabyte volumes of data between HDFS and relational database systems (RDBMS) using Sqoop.
- Experienced in working with Hadoop/Big Data storage and analytical frameworks on the Amazon AWS cloud using tools such as SSH, PuTTY, and MindTerm.
- Good understanding of data mining and machine learning techniques such as Random Forest, Logistic Regression, and K-Means.
- Experience implementing custom Partitioners and Combiners for effective data distribution.
- Experience writing simple to complex ad hoc Pig scripts and Pig UDFs.
- Experience writing simple to complex ad hoc Hive scripts as well as Hive UDFs, UDTFs, and UDAFs.
- Experience writing shell scripts to dump shared data from MySQL and Oracle servers to HDFS.
- Good knowledge of building event-processing data pipelines using Kafka and Storm.
- Good understanding of configuring simple to complex workflows using Oozie.
- Good understanding of NoSQL databases such as MongoDB and Cassandra.
- Proficient in working with development tools including Eclipse and VMware.
- Very good experience in customer specification study, requirements gathering and analysis, design, development, testing, and implementation.
- Worked on different operating systems, including UNIX/Linux and Windows.
- Exceptional ability to quickly master new concepts; capable of working in a group as well as independently, with excellent communication skills.
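Below is a minimal sketch of the Hive-to-Spark conversion pattern referenced above. The table and column names (events, user_id, amount) are hypothetical, used only to illustrate rewriting a Hive aggregation as pair-RDD transformations.

    import org.apache.spark.api.java.JavaPairRDD;
    import org.apache.spark.sql.Dataset;
    import org.apache.spark.sql.Row;
    import org.apache.spark.sql.SparkSession;
    import scala.Tuple2;

    public class HiveToSparkSketch {
        public static void main(String[] args) {
            SparkSession spark = SparkSession.builder()
                    .appName("HiveToSparkSketch")
                    .enableHiveSupport()
                    .getOrCreate();

            // Original Hive query:
            //   SELECT user_id, SUM(amount) FROM events GROUP BY user_id
            Dataset<Row> events = spark.table("events"); // hypothetical table

            // Equivalent expressed as pair-RDD transformations.
            JavaPairRDD<String, Double> totals = events.javaRDD()
                    .mapToPair(row -> new Tuple2<>(
                            row.<String>getAs("user_id"),
                            row.<Double>getAs("amount")))
                    .reduceByKey(Double::sum);

            totals.take(10).forEach(System.out::println);
            spark.stop();
        }
    }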
TECHNICAL SKILLS:
Languages and Technologies: Java, Scala, R, C, C++, XML, SQL, Shell Script, Pig Latin, Impala, MapReduce, Hive, Sqoop, Spark, Spark SQL, AWS, ZooKeeper, HBase, Kafka, Oozie, Storm, Flume
Operating Systems: Linux, Windows
Databases: MySQL, MSSQL, MongoDB, Cassandra
Tools: Eclipse, WinSCP, Wireshark, JIRA, IBM Tivoli
Scripting Languages: Scala, JavaScript, PHP, Python
Others: HTML, XML, JSON, REST, SOAP
PROFESSIONAL EXPERIENCE:
Confidential
Spark Developer
Responsibilities:
- Developed Spark code using Scala and Spark SQL/Spark Streaming for faster data processing.
- Prepared Spark builds from source code and ran Pig scripts on Spark rather than as MapReduce jobs for better performance.
- Used Sqoop to load data from MySQL into HDFS on a regular basis.
- Implemented machine learning techniques such as Random Forest, K-Means, and Logistic Regression for prediction and pattern identification using Spark MLlib.
- Developed scripts and batch jobs to schedule various Hadoop programs.
- Used the Spark API over Cloudera Hadoop YARN to perform analytics on data in Hive.
- Wrote Hive queries for data analysis to meet the business requirements.
- Developed Kafka producers and consumers for message handling (a minimal producer sketch follows this list).
- Responsible for analyzing multi-platform applications using Python.
- Installed and configured multi-node Apache Hadoop clusters on AWS EC2.
- Used Storm as an automated mechanism to analyze large amounts of non-unique data points with low latency and high throughput.
- Migrated servers, databases, and applications from on-premises environments to AWS, Azure, and Google Cloud Platform.
- Developed MapReduce jobs in Python for data cleaning and data processing.
- Explored Spark to improve the performance and optimization of existing algorithms in Hadoop using Spark Context, Spark SQL, DataFrames, pair RDDs, and Spark on YARN.
- Attended weekly meetings with technical collaborators and actively participated in code review sessions with senior and junior developers.
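A minimal sketch of the producer side of the Kafka message handling described above; the broker address and the topic name "events" are placeholders, not values from the actual project.

    import java.util.Properties;
    import org.apache.kafka.clients.producer.KafkaProducer;
    import org.apache.kafka.clients.producer.ProducerConfig;
    import org.apache.kafka.clients.producer.ProducerRecord;

    public class EventProducer {
        public static void main(String[] args) {
            Properties props = new Properties();
            // Broker address is a placeholder; point this at the real cluster.
            props.put(ProducerConfig.BOOTSTRAP_SERVERS_CONFIG, "localhost:9092");
            props.put(ProducerConfig.KEY_SERIALIZER_CLASS_CONFIG,
                    "org.apache.kafka.common.serialization.StringSerializer");
            props.put(ProducerConfig.VALUE_SERIALIZER_CLASS_CONFIG,
                    "org.apache.kafka.common.serialization.StringSerializer");

            try (KafkaProducer<String, String> producer = new KafkaProducer<>(props)) {
                // The key drives partition assignment; "events" is an illustrative topic.
                producer.send(new ProducerRecord<>("events", "user-42", "login"));
                producer.flush();
            }
        }
    }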
Environment: CDH4, Scala, Spark, HDFS, AWS, Hive, Pig, Linux, Python, MySQL, MySQL Workbench, Eclipse, PL/SQL, SQL connector.
Confidential, NC
Hadoop Developer
Responsibilities:
- Worked as a Hadoop developer analyzing large amounts of data for regulatory reports by creating MapReduce jobs in Java.
- Imported data into HDFS and Hive using Sqoop for report analysis.
- Worked on User Defined Functions in Hive to run aggregation functions over multiple rows of data loaded from HDFS.
- Created a MapReduce job to perform look-ups of specific entries using key-value pairs.
- Developed Pig Latin scripts to load data from output files and store it in HDFS.
- Monitored and managed the Hadoop cluster using the Cloudera Manager web interface.
- Developed and implemented custom Hive UDFs involving date functions (see the sketch after this list).
- Used the Oozie workflow engine to run multiple Hive and Pig jobs.
- Created stored procedures, triggers, and functions to operate on report data in MySQL.
- Implemented a POC to migrate MapReduce jobs into Spark RDD transformations.
- Attended weekly meetings with technical collaborators and actively participated in code review sessions with senior and junior developers.
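A minimal sketch of the kind of date-oriented Hive UDF described above; the MM/dd/yyyy input format and ISO output format are assumptions made for illustration.

    import java.text.ParseException;
    import java.text.SimpleDateFormat;
    import org.apache.hadoop.hive.ql.exec.UDF;

    // Normalizes a date string; the input format is an assumption.
    public final class ToIsoDate extends UDF {
        private final SimpleDateFormat in  = new SimpleDateFormat("MM/dd/yyyy");
        private final SimpleDateFormat out = new SimpleDateFormat("yyyy-MM-dd");

        // Hive resolves evaluate() by reflection; null input yields null.
        public String evaluate(String raw) {
            if (raw == null) {
                return null;
            }
            try {
                return out.format(in.parse(raw));
            } catch (ParseException e) {
                return null; // unparseable dates become null instead of failing the query
            }
        }
    }

Such a UDF would be packaged in a jar and registered in Hive with ADD JAR and CREATE TEMPORARY FUNCTION before use.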
Environment: UNIX Scripting, Java, Hadoop, MapReduce, HDFS, Pig, Sqoop, Hive, Oracle, Teradata, and Eclipse.
Confidential, Durham, NC
Hadoop Developer/ Administrator
Responsibilities:
- Installed and configured Hadoop HDFS, MapReduce, Pig, Hive, and Sqoop.
- Wrote MapReduce jobs in Java to run on Hadoop clusters (a simplified job sketch follows this list).
- Implemented a High Availability and automatic failover infrastructure using ZooKeeper services to eliminate the NameNode as a single point of failure.
- Developed Pig scripts to transform raw data into intelligent data as specified by business users.
- Worked on the Hadoop cluster and used Hive as the data querying tool to store and retrieve data.
- Reviewed and managed Hadoop log files by consolidating logs from multiple machines using Flume.
- Exported analyzed data from HDFS using Sqoop for generating reports.
- Imported and exported data into HDFS and Hive using Sqoop and Flume.
- Worked with the Oozie workflow engine to run multiple MapReduce jobs.
- Worked with the applications team to install Hadoop updates and upgrades as required.
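A stripped-down word-count job illustrating the structure of the Java MapReduce jobs described above; input and output paths come from the command line, and the reducer doubles as a combiner to cut shuffle volume. This is a generic sketch, not the project's actual job.

    import java.io.IOException;
    import org.apache.hadoop.conf.Configuration;
    import org.apache.hadoop.fs.Path;
    import org.apache.hadoop.io.IntWritable;
    import org.apache.hadoop.io.LongWritable;
    import org.apache.hadoop.io.Text;
    import org.apache.hadoop.mapreduce.Job;
    import org.apache.hadoop.mapreduce.Mapper;
    import org.apache.hadoop.mapreduce.Reducer;
    import org.apache.hadoop.mapreduce.lib.input.FileInputFormat;
    import org.apache.hadoop.mapreduce.lib.output.FileOutputFormat;

    public class WordCount {
        public static class TokenMapper
                extends Mapper<LongWritable, Text, Text, IntWritable> {
            private static final IntWritable ONE = new IntWritable(1);
            private final Text word = new Text();

            @Override
            protected void map(LongWritable key, Text value, Context context)
                    throws IOException, InterruptedException {
                for (String token : value.toString().split("\\s+")) {
                    if (!token.isEmpty()) {
                        word.set(token);
                        context.write(word, ONE); // emit (word, 1)
                    }
                }
            }
        }

        public static class SumReducer
                extends Reducer<Text, IntWritable, Text, IntWritable> {
            @Override
            protected void reduce(Text key, Iterable<IntWritable> values, Context context)
                    throws IOException, InterruptedException {
                int sum = 0;
                for (IntWritable v : values) {
                    sum += v.get();
                }
                context.write(key, new IntWritable(sum)); // total count per word
            }
        }

        public static void main(String[] args) throws Exception {
            Job job = Job.getInstance(new Configuration(), "word count");
            job.setJarByClass(WordCount.class);
            job.setMapperClass(TokenMapper.class);
            job.setCombinerClass(SumReducer.class); // combiner reduces shuffle volume
            job.setReducerClass(SumReducer.class);
            job.setOutputKeyClass(Text.class);
            job.setOutputValueClass(IntWritable.class);
            FileInputFormat.addInputPath(job, new Path(args[0]));
            FileOutputFormat.setOutputPath(job, new Path(args[1]));
            System.exit(job.waitForCompletion(true) ? 0 : 1);
        }
    }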
Environment: Hadoop, MapReduce, HDFS, Pig, Sqoop, Hive, Oracle, Teradata, Eclipse and Unix Scripting.
Confidential
Java Developer
Responsibilities:
- Followed Agile software development with Scrum methodology.
- Designed and developed various modules of the application with OOAD.
- Implemented Java/J2EE design patterns such as Factory, Singleton, DTO, DAO, and Session Facade.
- Utilized J2SE 7 extensively to develop business logic.
- Implemented dynamic screen functionality using jQuery and asynchronous data retrieval using AJAX.
- Responsible for designing and coding user interfaces using the Spring MVC framework.
- Implemented AJAX components to fetch dynamic values from the database and update forms.
- Developed both front-end and back-end code.
- Developed classes in the DAO and service layers (a minimal sketch follows this list).
- Consumed RESTful web services using JAX-RS.
- Used SVN as the source control repository.
- Used supervised machine learning techniques, including logistic regression, for developing prediction models.
- Used third-party libraries such as JFreeChart for data visualization.
- Used Swing components in creating the dashboard.
- Configured Spring with Hibernate properties and validations for dependency injection.
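A minimal sketch of the DAO/service layering described above, built around a hypothetical User entity; in the actual application the DAO would be wired in through Spring and backed by Hibernate rather than constructed by hand.

    import java.util.List;

    // Hypothetical domain object, used only to illustrate the layering.
    class User {
        private final long id;
        private final String name;

        User(long id, String name) {
            this.id = id;
            this.name = name;
        }

        long getId() { return id; }
        String getName() { return name; }
    }

    // DAO interface: the service layer depends on this abstraction,
    // so the persistence technology (JDBC, Hibernate) can be swapped.
    interface UserDao {
        User findById(long id);
        List<User> findAll();
        void save(User user);
    }

    // Service layer holds business logic and delegates persistence to the DAO;
    // in practice the DAO would be injected by Spring.
    class UserService {
        private final UserDao userDao;

        UserService(UserDao userDao) {
            this.userDao = userDao;
        }

        User getUser(long id) {
            User user = userDao.findById(id);
            if (user == null) {
                throw new IllegalArgumentException("No user with id " + id);
            }
            return user;
        }
    }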
Environment: Java, JSP, Servlets, Web Sphere Application Server, Eclipse, Java Script, Oracle, PL/SQL and JDBC.