Spark/Hadoop Developer Resume
Nashville, TN
PROFESSIONAL SUMMARY
- Overall 8+ years of experience in all phases of software application development, including requirement analysis, design, development and maintenance of Hadoop/Big Data applications and web applications using Java/J2EE technologies.
- 3+ years of hands-on experience with Big Data ecosystems including Hadoop (1.0 and YARN), MapReduce, Spark, Pig, Hive, Sqoop, Flume, Oozie and ZooKeeper in a range of industries.
- Good understanding/knowledge of Hadoop architecture and various components such as HDFS, JobTracker, TaskTracker, NameNode, DataNode and the MapReduce programming paradigm.
- Expertise in Hadoop ecosystem components - Spark, Hive, Pig, Sqoop, HBase, Flume, Kafka, Oozie, Hue, Zeppelin, NiFi - and EC2 cloud computing with AWS.
- Experience in importing and exporting data using Sqoop between relational database systems and HDFS.
- Developed Oozie workflows by integrating all tasks related to a project and scheduled the jobs as per requirements.
- Used HBase alongside Hive as and when required for real-time, low-latency queries.
- In-depth knowledge of Spark architecture and real-time streaming using Spark.
- Extensively used the Spark SQL, PySpark and Scala APIs for querying and transforming data residing in Hive (a minimal sketch follows this list).
- Good knowledge of Amazon Web Services (AWS) cloud services such as EC2, S3, EBS, RDS and VPC.
- Hands-on experience spinning up different AWS instances, including EC2-Classic and EC2-VPC, using CloudFormation templates.
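A minimal Java sketch of the kind of Spark SQL access to Hive described above; the database, table and column names (sales.orders, customer_id, amount) and the output path are hypothetical placeholders, not details from any specific project.

```java
import org.apache.spark.sql.Dataset;
import org.apache.spark.sql.Row;
import org.apache.spark.sql.SparkSession;

public class HiveQueryExample {
    public static void main(String[] args) {
        // SparkSession with Hive support so existing Hive tables are visible
        SparkSession spark = SparkSession.builder()
                .appName("HiveQueryExample")
                .enableHiveSupport()
                .getOrCreate();

        // Hypothetical database/table/column names, used purely for illustration
        Dataset<Row> totals = spark.sql(
                "SELECT customer_id, SUM(amount) AS total_amount "
                + "FROM sales.orders GROUP BY customer_id");

        // Persist the aggregated result as Parquet (path is a placeholder)
        totals.write().mode("overwrite").parquet("/data/output/customer_totals");

        spark.stop();
    }
}
```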
TECHNICAL SKILLS
Hadoop Ecosystem: Hadoop, MapReduce, HDFS, Kafka, Hive, Pig, Sqoop, Impala, Oozie, Flume, YARN, ZooKeeper, HBase.
Spark components: Spark, Spark SQL, Spark Streaming, Python.
AWS Cloud Services: S3, EBS, EC2, VPC, Redshift, EMR
Programming Languages: Java, Scala, SQL, Shell scripting, AngularJS, HTML5, and CSS.
Operating Systems: Windows, UNIX, Linux distributions (CentOS, Ubuntu).
Databases: Cassandra, HBase, Oracle, DB2, MySQL, SQLite, MS SQL Server 2008/2012.
Development Processes: RUP, Agile, Scrum.
Big Data Platforms: Cloudera, Hortonworks, Amazon EMR.
PROFESSIONAL EXPERIENCE
Confidential, Nashville, TN
Spark/Hadoop Developer
Responsibilities:
- Interacted with multiple teams to understand their business requirements for designing flexible and common components.
- Developed data pipelines using Sqoop and Flume to ingest data from Teradata into HDFS for further processing through Spark.
- Implemented partitioning and bucketing in Hive, using file formats and compression techniques with optimizations.
- Experience working with Avro and Parquet file formats with Snappy compression.
- Developed Spark scripts using Scala and Spark SQL to access Hive tables in Spark for faster data processing.
- Extensively worked on Text, ORC, Avro and Parquet file formats and compression techniques like Snappy, Gzip and Zlib.
- Conceived and designed the producer and consumer using Kafka 0.10 and Spark Streaming (a minimal consumer sketch follows this list).
- Automated the Hadoop pipeline using Oozie and scheduled it with a coordinator based on time frequency and data availability.
- Participated in evaluation and selection of new technologies to support system efficiency.
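A minimal Java sketch of a Spark Streaming consumer for Kafka 0.10 using the spark-streaming-kafka-0-10 direct-stream API; the broker address, topic, consumer group and the per-batch count are illustrative placeholders rather than the actual pipeline logic.

```java
import java.util.Collection;
import java.util.Collections;
import java.util.HashMap;
import java.util.Map;

import org.apache.kafka.clients.consumer.ConsumerRecord;
import org.apache.kafka.common.serialization.StringDeserializer;
import org.apache.spark.SparkConf;
import org.apache.spark.streaming.Durations;
import org.apache.spark.streaming.api.java.JavaInputDStream;
import org.apache.spark.streaming.api.java.JavaStreamingContext;
import org.apache.spark.streaming.kafka010.ConsumerStrategies;
import org.apache.spark.streaming.kafka010.KafkaUtils;
import org.apache.spark.streaming.kafka010.LocationStrategies;

public class KafkaStreamConsumer {
    public static void main(String[] args) throws InterruptedException {
        SparkConf conf = new SparkConf().setAppName("KafkaStreamConsumer");
        JavaStreamingContext jssc = new JavaStreamingContext(conf, Durations.seconds(10));

        // Kafka 0.10 consumer configuration; broker, group and topic are placeholders
        Map<String, Object> kafkaParams = new HashMap<>();
        kafkaParams.put("bootstrap.servers", "broker1:9092");
        kafkaParams.put("key.deserializer", StringDeserializer.class);
        kafkaParams.put("value.deserializer", StringDeserializer.class);
        kafkaParams.put("group.id", "spark-consumer-group");
        kafkaParams.put("auto.offset.reset", "latest");

        Collection<String> topics = Collections.singletonList("events");

        // Direct stream: Spark executors read partitions straight from Kafka
        JavaInputDStream<ConsumerRecord<String, String>> stream =
                KafkaUtils.createDirectStream(
                        jssc,
                        LocationStrategies.PreferConsistent(),
                        ConsumerStrategies.<String, String>Subscribe(topics, kafkaParams));

        // Count records per micro-batch as a stand-in for the real business logic
        stream.map(ConsumerRecord::value)
              .count()
              .print();

        jssc.start();
        jssc.awaitTermination();
    }
}
```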
Environment: Hadoop, Cloudera, HDFS, Hive, Impala, Spark, Autosys, Kafka, Sqoop, Pig, Java, Scala, Eclipse, Teradata, UNIX, and Maven.
Confidential, St. Louis, Missouri
Hadoop Developer
Responsibilities:
- Developed Spark Programs for Batch processing.
- Worked on Spark SQL and Spark Streaming.
- Developed Spark scripts using Python and Scala shell commands as per the requirements.
- Used Spark API over Cloudera Hadoop YARN to perform analytics on data in Hive.
- Used Spark SQL with Scala for creating DataFrames and performed transformations on them.
- Implemented Spark SQL to access Hive tables in Spark for faster processing of data.
- Installed and configured Hive and wrote Hive UDFs (a minimal UDF sketch follows this list).
- Involved in creating Hive tables, loading data and writing Hive queries.
- Imported and exported data into HDFS using Sqoop, including incremental loading.
- Experienced in defining job flows and in managing and reviewing Hadoop log files.
- Responsible for managing data coming from different sources.
- Supported MapReduce programs running on the cluster.
- Managed jobs using the Fair Scheduler and cluster coordination services through ZooKeeper.
- Involved in loading data from UNIX file system to HDFS.
- Hands-on experience in Oozie job scheduling.
- Worked closely with AWS to migrate entire data centers to the cloud using VPC, EC2, S3 and EMR.
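A minimal Java sketch of a classic Hive UDF of the kind mentioned above; the package, class name, registered function name and upper-casing logic are hypothetical examples, not the project's actual UDFs.

```java
package com.example;

import org.apache.hadoop.hive.ql.exec.UDF;
import org.apache.hadoop.io.Text;

// Hypothetical UDF that upper-cases a string column.
// Registered in Hive with: CREATE TEMPORARY FUNCTION to_upper AS 'com.example.ToUpperUDF';
public class ToUpperUDF extends UDF {
    public Text evaluate(Text input) {
        if (input == null) {
            return null;   // propagate NULLs as Hive expects
        }
        return new Text(input.toString().toUpperCase());
    }
}
```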
Environment: Hadoop, MapReduce, HDFS, Pig, Hive, Java, Scala, Spark, Hortonworks, HBase, Amazon EMR, EC2, S3.
Confidential
Hadoop Developer
Responsibilities:
- Part of the team for developing and writing Pig scripts.
- Loaded data from the RDBMS server into Hive using Sqoop.
- Created Hive tables to store the processed results in a tabular format.
- Developed Sqoop scripts to enable interaction between Hive and the MySQL database.
- Developed Java Mapper and Reducer programs for complex business requirements (a minimal sketch follows this list).
- Developed custom Java record readers, partitioners and serialization techniques.
- Created Managed tables and External tables in Hive and loaded data from HDFS.
- Performed complex HiveQL queries on Hive tables and created custom user-defined functions in Hive.
- Optimized Hive tables using techniques like partitioning and bucketing to provide better performance with HiveQL queries.
- Performed Sqoop imports from Oracle to load data into HDFS and directly into Hive tables.
- Performed incremental data movement to Hadoop using Sqoop.
- Scheduled MapReduce jobs in the production environment using the Oozie scheduler.
- Analyzed Hadoop logs using Pig scripts to track down errors caused by the team.
- Experience in gathering requirements from the client, giving estimates for developing projects and delivering projects on time.
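A minimal Java Mapper/Reducer sketch showing the structure of such programs; the token-counting logic and class names are illustrative stand-ins for the project-specific business rules, custom record readers and partitioners.

```java
import java.io.IOException;

import org.apache.hadoop.io.IntWritable;
import org.apache.hadoop.io.LongWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Mapper;
import org.apache.hadoop.mapreduce.Reducer;

// Mapper: emits (token, 1) for each whitespace-separated token in a line
public class TokenMapper extends Mapper<LongWritable, Text, Text, IntWritable> {
    private static final IntWritable ONE = new IntWritable(1);
    private final Text word = new Text();

    @Override
    protected void map(LongWritable key, Text value, Context context)
            throws IOException, InterruptedException {
        for (String token : value.toString().split("\\s+")) {
            if (!token.isEmpty()) {
                word.set(token);
                context.write(word, ONE);
            }
        }
    }
}

// Reducer: sums the counts emitted for each token
class TokenReducer extends Reducer<Text, IntWritable, Text, IntWritable> {
    @Override
    protected void reduce(Text key, Iterable<IntWritable> values, Context context)
            throws IOException, InterruptedException {
        int sum = 0;
        for (IntWritable v : values) {
            sum += v.get();
        }
        context.write(key, new IntWritable(sum));
    }
}
```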
Environment: Java, Hadoop, MapReduce, HDFS, Pig, Hive, Spark, Scala, Hortonworks, HBase.
Confidential
Java Developer
Responsibilities:
- Involved in analysis, design, development and testing of the application.
- Incorporated UML diagrams (Class diagrams, Activity diagrams, Sequence diagrams) as part of design documentation and other system documentation.
- Enhanced the Port search functionality by adding a VPN Extension tab.
- Created end-to-end functionality for viewing and editing VPN Extension details.
- Used the Agile process to develop the application, as it allows faster development compared to RUP.
- Used Hibernate as the persistence framework.
- Used Struts MVC framework and WebLogic Application Server in this application.
- Involved in creating DAOs and used Hibernate for ORM mapping (a minimal DAO sketch follows this list).
- Implemented the Spring Framework for rapid development and ease of maintenance.
- Wrote procedures and triggers for validating the consistency of metadata.
- Wrote SQL code blocks using cursors for shifting records between tables based on checks.
- Fixed defects and generated input XMLs to run on the SOA client to generate output XML for testing web services.
- Wrote Java classes to test the UI and web services through JUnit and JWebUnit.
- Extensively involved in release/deployment related critical activities.
- Performed functional and integration testing of the entire application using JUnit and JWebUnit.
- Used Log4J to log both user-interface and domain-level messages, and CVS for version control.
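A minimal Java sketch of the classic Hibernate session/transaction DAO pattern referenced above; the VpnExtension entity, its fields and the DAO class name are hypothetical placeholders, and the Hibernate mapping itself is omitted.

```java
import org.hibernate.Session;
import org.hibernate.SessionFactory;
import org.hibernate.Transaction;

// Minimal stand-in for a Hibernate-mapped entity (mapping file/annotations omitted)
class VpnExtension {
    private Long id;
    private String name;

    public Long getId() { return id; }
    public void setId(Long id) { this.id = id; }
    public String getName() { return name; }
    public void setName(String name) { this.name = name; }
}

// Hypothetical DAO wrapping the usual open-session/transaction/commit pattern
public class VpnExtensionDao {
    private final SessionFactory sessionFactory;

    public VpnExtensionDao(SessionFactory sessionFactory) {
        this.sessionFactory = sessionFactory;
    }

    public void save(VpnExtension extension) {
        Session session = sessionFactory.openSession();
        Transaction tx = session.beginTransaction();
        try {
            session.save(extension);   // persist the mapped entity
            tx.commit();
        } catch (RuntimeException e) {
            tx.rollback();             // roll back on failure so data stays consistent
            throw e;
        } finally {
            session.close();
        }
    }

    public VpnExtension findById(Long id) {
        Session session = sessionFactory.openSession();
        try {
            return (VpnExtension) session.get(VpnExtension.class, id);
        } finally {
            session.close();
        }
    }
}
```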
Environment: Java, JSP, Servlets, J2EE, EJB, Struts Framework, JDBC, WebLogic Application Server, Hibernate, Spring Framework, Oracle 9i, UNIX, Web Services, CVS, Eclipse, JUnit, JWebUnit.