
Sr. Hadoop, Spark Developer Resume


Wichita, KS

SUMMARY:

  • IT Professional with 9+ years of experience in Design, Analysis, Development, Testing, Documentation, Deployment, Integration, and Maintenance of web-based and client-server applications using Java, Scala, and Big Data platforms.
  • 4+ years of professional experience using Big Data technologies including HDFS, MapReduce, Hive, Spark, Kafka, Sqoop, Oozie, YARN, HBase, Flume, and ZooKeeper.
  • Expertise in installing and configuring Hadoop components such as Hive, Pig, Sqoop, HBase, and ZooKeeper.
  • Experience in fetching live streams of data from DB2 into Hive tables using Spark Streaming and Apache Kafka (see the streaming sketch after this list).
  • Experience in Apache Spark for rapid analytics on object relationships.
  • Expertise in data transformations, RDDs, DataFrames, and Spark SQL.
  • Experience in creating databases, tables, users, views, triggers, macros, stored procedures, functions, packages in Oracle Database.
  • Experienced in writing complex queries using Cloudera Impala.
  • Expertise in non-relational and relational data modelling and database engineering.
  • Expertise in building scalable distributed data solutions using HBase, MongoDB and Cassandra.
  • Expertise in building clusters on AWS using Amazon EC2 and Cloudera Manager.
  • Expertise in Big Data platforms such as Cloudera, Hortonworks, Apache Hadoop, and Amazon AWS.
  • Excellent knowledge on Agile methodology and Scrum process.
  • Versatile team player and quick learner with good analytical, interpersonal, and problem-solving skills.
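
A minimal sketch of the Kafka-to-Hive streaming pattern referenced above, written in Java with Spark Structured Streaming. It is illustrative only: the broker address, topic name, and warehouse paths are hypothetical, and it assumes Spark 2.x with the spark-sql-kafka connector on the classpath.

    import org.apache.spark.sql.Dataset;
    import org.apache.spark.sql.Row;
    import org.apache.spark.sql.SparkSession;
    import org.apache.spark.sql.streaming.StreamingQuery;
    import org.apache.spark.sql.streaming.StreamingQueryException;

    public class KafkaToHiveStream {
        public static void main(String[] args) throws StreamingQueryException {
            SparkSession spark = SparkSession.builder()
                    .appName("KafkaToHiveStream")
                    .enableHiveSupport()
                    .getOrCreate();

            // Consume the change feed published to Kafka (topic name is hypothetical).
            Dataset<Row> feed = spark.readStream()
                    .format("kafka")
                    .option("kafka.bootstrap.servers", "broker1:9092")
                    .option("subscribe", "db2.orders.feed")
                    .load()
                    .selectExpr("CAST(key AS STRING)", "CAST(value AS STRING)");

            // Land the records as Parquet under the Hive warehouse path so an
            // external Hive table defined over that location picks them up.
            StreamingQuery query = feed.writeStream()
                    .format("parquet")
                    .option("path", "/apps/hive/warehouse/orders_stream")
                    .option("checkpointLocation", "/tmp/checkpoints/orders_stream")
                    .start();

            query.awaitTermination();
        }
    }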

TECHNICAL SKILLS:

Hadoop Ecosystem: Hadoop, MapReduce, HDFS, Kafka, Hive, Pig, Sqoop, Oozie, Storm, YARN, ZooKeeper, Spark 2.0, Spark Core, Spark SQL, Solr, Hortonworks Hadoop Stack

Languages: Java, Scala, Python, SQL, Shell scripting, HTML5 and CSS

Web Technologies: JavaScript, JDBC

Operating System: Windows, Linux

Databases: Cassandra, MongoDB, HBase, Oracle, DB2, MySQL

Methodology: Agile, Scrum

Defect Tracking: Bugzilla, HP Quality Center 9.2, HP ALM

Other Tools: SOA Client, Putty, Scrum Works 1.8.3, Stylus Studio 2008 XML Enterprise Suite

Big data Platforms: Hortonworks HDP 2.5, Cloudera CDH 5.x, Amazon AWS

Applications: JIRA, Amazon EC2, S3, EMR, MySQL, MS Office

PROFESSIONAL EXPERIENCE:

Confidential, Wichita, KS

Sr. Hadoop, Spark Developer

Responsibilities:

  • Involved in analysis, design, testing phases and responsible for documenting technical specifications.
  • Developed Spark code using Spark SQL and Spark Streaming for faster testing and processing of data.
  • Developed a data pipeline using Kafka and Storm to store data into HDFS.
  • Explored Spark to improve the performance and optimization of existing algorithms in Hadoop using Spark Context, Spark SQL, DataFrames, pair RDDs, and Spark on YARN.
  • Installed/Configured/Maintained Hortonworks Hadoop clusters for application development.
  • Used the Spark API over Hortonworks Hadoop YARN to perform analytics on data in Hive (see the Spark SQL sketch after this list).
  • Responsible for building scalable distributed data solutions using a Hadoop cluster environment with AWS infrastructure services: Amazon Simple Storage Service (Amazon S3), EMR, and Amazon Elastic Compute Cloud (Amazon EC2).
  • Loaded the data into Spark RDD and performed in-memory data computation to generate the output response.
  • Developed and executed shell scripts to automate jobs and wrote complex Hive queries and UDFs.
  • Worked on reading multiple data formats on HDFS using Spark.
  • Involved in converting Hive/SQL queries into Spark transformations using Spark RDDs, Python and Scala.
  • Developed multiple POCs using Spark and deployed them on the YARN cluster; compared the performance of Spark with Hive and SQL/Teradata.
  • Analyzed the SQL scripts and designed the solution to implement using Spark.
  • Involved in loading data from UNIX file system to HDFS, AWS S3.
  • Extracted the data from Teradata into HDFS using Sqoop.
  • Handled importing of data from various data sources like AWS S3, Cassandra.
  • Performed transformations using Hive, MapReduce, and Spark, and loaded data into HDFS.
  • Managed and reviewed Hadoop log files.
  • Developed Kafka producers and consumers, HBase clients, Spark and Hadoop MapReduce jobs, along with components on HDFS and Hive using AWS EMR (see the Kafka producer sketch after this list).
  • Used Apache Atlas for metadata exchange between MariaDB and Hive.
  • Facilitated the daily scrum meetings, sprint planning, sprint review, and sprint retrospective.
  • Worked extensively on the Spark Core and Spark SQL modules.
  • Involved in running Hadoop streaming jobs to process terabytes of data from AWS S3.
  • Implemented Oozie job for importing real-time data to Hadoop using Kafka and for daily imports.
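
A minimal sketch of the Spark-over-Hive analytics pattern referenced in this list: a Java Spark SQL job, submitted to YARN via spark-submit, that reads a Hive table, aggregates it, and writes the result back to Hive. The database, table, and column names are hypothetical.

    import org.apache.spark.sql.Dataset;
    import org.apache.spark.sql.Row;
    import org.apache.spark.sql.SparkSession;

    public class HiveAnalyticsJob {
        public static void main(String[] args) {
            // Hive support lets Spark SQL read metastore tables directly;
            // the job runs on YARN when submitted with --master yarn.
            SparkSession spark = SparkSession.builder()
                    .appName("HiveAnalyticsJob")
                    .enableHiveSupport()
                    .getOrCreate();

            // Table and column names are illustrative placeholders.
            Dataset<Row> clicks = spark.table("analytics.click_events");

            Dataset<Row> dailyCounts = clicks
                    .filter("event_date >= '2017-01-01'")
                    .groupBy("event_date", "page_id")
                    .count();

            // Persist the aggregate back to Hive for downstream reporting.
            dailyCounts.write().mode("overwrite").saveAsTable("analytics.daily_page_counts");

            spark.stop();
        }
    }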
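
A minimal sketch of a Kafka producer of the kind mentioned in this list, using the standard Java client. The broker address, topic, key, and payload are hypothetical placeholders.

    import java.util.Properties;
    import org.apache.kafka.clients.producer.KafkaProducer;
    import org.apache.kafka.clients.producer.ProducerRecord;

    public class EventProducer {
        public static void main(String[] args) {
            Properties props = new Properties();
            props.put("bootstrap.servers", "broker1:9092");   // hypothetical broker
            props.put("key.serializer",
                    "org.apache.kafka.common.serialization.StringSerializer");
            props.put("value.serializer",
                    "org.apache.kafka.common.serialization.StringSerializer");
            props.put("acks", "all");                          // wait for full replication

            try (KafkaProducer<String, String> producer = new KafkaProducer<>(props)) {
                // Keyed records so related events land in the same partition.
                producer.send(new ProducerRecord<>("events", "order-42", "{\"status\":\"shipped\"}"));
                producer.flush();
            }
        }
    }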

Environment: Hadoop, HDFS, Hive, Scala, Spark, SQL, MongoDB, MariaDB, UNIX Shell Scripting, AWS S3, EMR, Hortonworks HDP 2.5 Hadoop Stack, Apache Ranger and Apache Atlas

Confidential, Irvine, CA

Sr. Hadoop Developer

Responsibilities:

  • Involved in gathering requirements from the client, providing estimates for development, and delivering projects on time.
  • Designed conceptual model with Spark for performance optimization.
  • Implemented Spark using Scala and Spark SQL for faster testing and processing of data.
  • Implemented advanced procedures such as text analytics and processing using the in-memory computing capabilities of Apache Spark with Scala.
  • Managed and reviewed Hadoop log files. Used Scala to integrate Spark with Hadoop.
  • Used Ambari to manage the Hortonworks distribution of Hadoop, especially for fault-tolerant workflows and error handling.
  • Developed Pig scripts and loaded data from an RDBMS server into Hive using Sqoop.
  • Created Hive tables to store the processed results in a tabular format.
  • Developed Sqoop scripts to move data between Hive and the MySQL database.
  • Developed Java Mapper and Reducer programs for complex business requirements.
  • Developed custom Java record readers, partitioners, and serialization techniques.
  • Used different data formats (Text format and Avro format) while loading the data into HDFS.
  • Created Managed tables and External tables in Hive and loaded data from HDFS.
  • Performed complex HiveQL queries on Hive tables and created custom user-defined functions in Hive (see the UDF sketch after this list).
  • Optimized the Hive tables using optimization techniques like partitions and bucketing to provide better performance with HiveQL queries.
  • Created partitioned tables and loaded data using both static partition and dynamic partition method.
  • Performed Sqoop imports from MongoDB to ingest the data into HDFS and directly into Hive tables.
  • Performed incremental data movement to Hadoop using Sqoop.
  • Scheduled MapReduce jobs in the production environment using the Oozie scheduler.
  • Analyzed Hadoop logs using Pig scripts to review errors encountered by the team.
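
A minimal sketch of a custom Hive UDF of the kind mentioned in this list, using the classic org.apache.hadoop.hive.ql.exec.UDF base class from that era of Hive. The class name, function name, and normalization logic are illustrative.

    import org.apache.hadoop.hive.ql.exec.UDF;
    import org.apache.hadoop.io.Text;

    /**
     * Simple Hive UDF that normalizes free-text codes (trim whitespace, uppercase).
     * Registered in Hive with, for example:
     *   ADD JAR /path/to/udfs.jar;
     *   CREATE TEMPORARY FUNCTION normalize_code AS 'NormalizeCode';
     */
    public final class NormalizeCode extends UDF {
        public Text evaluate(Text input) {
            if (input == null) {
                return null;       // pass NULLs through unchanged
            }
            return new Text(input.toString().trim().toUpperCase());
        }
    }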

Environment: HDFS, MapReduce, Apache Spark, Hive, Sqoop, Pig, Flume, HBase, Oozie Scheduler, Java, Oracle, Shell Scripts, Hortonworks, AWS S3, EMR, EC2

Confidential, Gaithersburg, MD

Hadoop Developer

Responsibilities:

  • Involved in loading data from LINUX file system to HDFS.
  • Implemented test scripts to support test driven development and continuous integration.
  • Responsible to manage data coming from different sources.
  • Loaded and transformed large sets of structured, semi-structured, and unstructured data.
  • Worked on managing and reviewing Hadoop log files, managing and scheduling Jobs on a Hadoop cluster.
  • Worked on Hive for exposing data for further analysis and for transforming files from different analytical formats into text files.
  • Installed and configured Hadoop MapReduce and HDFS; developed multiple MapReduce jobs in Java for data cleaning and preprocessing (see the mapper sketch after this list).
  • Imported and exported data into HDFS and Hive using Sqoop.
  • Involved in creating Hive tables, loading them with data, and writing Hive queries that run internally as MapReduce jobs.
  • Used Pig as an ETL tool to perform transformations, event joins, bot-traffic filtering, and some pre-aggregations before storing the data on HDFS.
  • Wrote Hive queries for data analysis to meet the business requirements.
  • Involved in writing Hive scripts to extract, transform and load the data into Database.
  • Used JIRA for bug tracking and used CVS for version control.
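
A minimal sketch of a map-only data-cleaning job of the kind mentioned in this list: the mapper drops malformed pipe-delimited records and trims the remaining fields. The delimiter and expected field count are illustrative assumptions; the driver would set the number of reducers to zero.

    import java.io.IOException;
    import org.apache.hadoop.io.LongWritable;
    import org.apache.hadoop.io.NullWritable;
    import org.apache.hadoop.io.Text;
    import org.apache.hadoop.mapreduce.Mapper;

    /** Map-only cleaning step: skips malformed delimited records and trims fields. */
    public class CleanRecordsMapper
            extends Mapper<LongWritable, Text, NullWritable, Text> {

        private static final int EXPECTED_FIELDS = 8;   // illustrative schema width

        @Override
        protected void map(LongWritable key, Text value, Context context)
                throws IOException, InterruptedException {
            String[] fields = value.toString().split("\\|", -1);
            if (fields.length != EXPECTED_FIELDS) {
                context.getCounter("cleaning", "malformed").increment(1);
                return;                                  // drop malformed rows
            }
            StringBuilder cleaned = new StringBuilder();
            for (int i = 0; i < fields.length; i++) {
                if (i > 0) {
                    cleaned.append('|');
                }
                cleaned.append(fields[i].trim());
            }
            context.write(NullWritable.get(), new Text(cleaned.toString()));
        }
    }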

Environment: Hadoop, Hive, Linux, MapReduce, HDFS, Pig, Sqoop, Shell Scripting, Python, Java (JDK 1.6), Eclipse, Control-M Scheduler, Oracle 10g, PL/SQL, SQL*Plus, Toad 9.6, JIRA, CVS

Confidential, Charlotte, NC

Java Developer

Responsibilities:

  • Involved in Analysis, Design, Development and Testing of the applications.
  • Incorporated UML diagrams (Class diagrams, Activity diagrams, Sequence diagrams) as part of design documentation and other system documentation.
  • Enhanced the Port search functionality by adding a VPN Extension Tab.
  • Created end to end functionality for view and edit of VPN Extension details.
  • Used Hibernate as the persistence framework.
  • Used the Struts MVC framework and WebLogic Application Server in this application.
  • Involved in creating DAOs and used Hibernate for ORM mapping (see the DAO sketch after this list).
  • Wrote procedures, and triggers for validating the consistency of metadata.
  • Wrote SQL code blocks using cursors for shifting records from various tables based on checks.
  • Fixed defects and generated input XMLs to run on SOA Client to generate output XML for testing web services.
  • Wrote Java classes to test UI and Web services through JUnit and JWebUnit.
  • Extensively involved in release/deployment related critical activities.
  • Performed functional and integration testing and tested the entire application using JUnit and JWebUnit.
  • Used Log4j to log both User Interface and Domain Level messages and used CVS for version control.
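
A minimal sketch of a Hibernate DAO in the style described in this list, using the classic SessionFactory/Transaction pattern. The VpnExtension entity is a hypothetical stand-in (in this era it would be mapped via an hbm.xml file), and the query is illustrative.

    import java.util.List;
    import org.hibernate.Session;
    import org.hibernate.SessionFactory;
    import org.hibernate.Transaction;

    /** Hypothetical persistent entity, mapped via VpnExtension.hbm.xml. */
    class VpnExtension {
        private Long id;
        private String portId;

        public Long getId() { return id; }
        public void setId(Long id) { this.id = id; }
        public String getPortId() { return portId; }
        public void setPortId(String portId) { this.portId = portId; }
    }

    /** DAO following the classic SessionFactory/Transaction pattern. */
    public class VpnExtensionDao {
        private final SessionFactory sessionFactory;

        public VpnExtensionDao(SessionFactory sessionFactory) {
            this.sessionFactory = sessionFactory;
        }

        public void save(VpnExtension extension) {
            Session session = sessionFactory.openSession();
            Transaction tx = session.beginTransaction();
            try {
                session.saveOrUpdate(extension);
                tx.commit();
            } catch (RuntimeException e) {
                tx.rollback();                // roll back on any failure
                throw e;
            } finally {
                session.close();
            }
        }

        @SuppressWarnings("unchecked")
        public List<VpnExtension> findByPort(String portId) {
            Session session = sessionFactory.openSession();
            try {
                return session.createQuery("from VpnExtension v where v.portId = :portId")
                              .setParameter("portId", portId)
                              .list();
            } finally {
                session.close();
            }
        }
    }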

Environment: JAVA, JSP, servlets, J2EE, EJB, Struts Framework, JDBC, WebLogic Application Server, Hibernate, Spring Framework, Oracle 9i, Unix, Web Services, CVS, Eclipse, JUnit, JWebUnit

Confidential

Java Developer

Responsibilities:

  • Configured Hibernate ORM as the persistence layer to provide persistence services and persistent objects to the application from the database.
  • Developed the DAO layer using Spring MVC configuration XMLs for Hibernate to manage CRUD operations such as insert, update, and delete.
  • Implemented reusable services using BPEL to transfer data.
  • Implemented dependency injection using the Spring framework.
  • Developed JUnit classes and created JUnit test cases (see the test sketch after this list).
  • Configured logging (enable/disable) using log4j for the application.
  • Created the user interface using HTML, CSS, JSP, jQuery, AJAX, JavaScript, and JSTL.
  • Implemented database operations using PL/SQL procedures and queries.
  • Developed shell scripts for UNIX environment to deploy EAR and read log files.
  • Implemented log4j for logging.
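
A minimal sketch of a JUnit 4 test case in the style described in this list. The service under test is a small hypothetical stand-in defined inline so the example is self-contained.

    import static org.junit.Assert.assertEquals;

    import org.junit.Before;
    import org.junit.Test;

    /** JUnit 4 test for a small (hypothetical) order-total service. */
    public class OrderTotalServiceTest {

        /** Minimal service under test; stands in for the real application class. */
        static class OrderTotalService {
            double totalWithTax(double subtotal, double taxRate) {
                if (subtotal < 0 || taxRate < 0) {
                    throw new IllegalArgumentException("negative amounts are not allowed");
                }
                return subtotal * (1 + taxRate);
            }
        }

        private OrderTotalService service;

        @Before
        public void setUp() {
            service = new OrderTotalService();
        }

        @Test
        public void addsTaxToSubtotal() {
            assertEquals(107.0, service.totalWithTax(100.0, 0.07), 0.0001);
        }

        @Test(expected = IllegalArgumentException.class)
        public void rejectsNegativeSubtotal() {
            service.totalWithTax(-1.0, 0.07);
        }
    }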

Environment: Java, Jest, SQL, JUnit, PL/SQL, SOA Suite 10g BPEL, Struts, Spring, Hibernate, Web Services (JAX-WS), JMS, EJB, WebLogic 10.1 Server, JDeveloper, HTML, LDAP, Maven, XML, CSS, JavaScript, JSON, Oracle, CVS and UNIX
