Sr. Hadoop Developer Resume NEW YORK, NY - Hire IT People

PROFESSIONAL SUMMARY:

Good understanding of Spark Streaming with Kafka for real-time processing.
Around 7 years of professional IT experience in all phases of the Software Development Life Cycle, including hands-on experience in Java/J2EE technologies and Big Data analytics.
Experience in analysis, design, development, and integration using Big Data and Hadoop technologies such as MapReduce, Hive, Pig, Sqoop, Oozie, Kafka, HBase, AWS, Cloudera, Hortonworks, Impala, Avro, data processing, Java/J2EE, and SQL.
Good knowledge of Hadoop architecture and its components, such as HDFS, MapReduce, JobTracker, TaskTracker, NameNode, and DataNode.
Hands-on experience in installing, configuring, and using Hadoop ecosystem components such as HDFS, Hive, Spark, Scala, Spark SQL, MapReduce, Pig, Sqoop, Flume, HBase, ZooKeeper, and Oozie.
Extensive knowledge of Hadoop technology, with experience in data storage, query writing, processing, and analysis.
Experience in extending Pig and Hive functionality with custom UDFs for data analysis and file processing, running Pig Latin scripts and using Hive Query Language.
Experience working with the Amazon AWS cloud, including services such as EC2, S3, RDS, EBS, Elastic Beanstalk, and CloudWatch.
Worked on data modelling using various machine learning (ML) algorithms in R and Python.
Experienced in transferring data from different data sources into HDFS using Kafka.
Experience configuring the Hive Metastore with MySQL, which stores the metadata for Hive tables.
Strong experience and knowledge of real-time data analytics using Storm, Kafka, Flume, and Spark.
Strong knowledge of using Flume to stream data to HDFS.
Good knowledge of job scheduling and monitoring tools such as Oozie and ZooKeeper.
Proficient in developing web-based user interfaces using HTML5, CSS3, JavaScript, jQuery, AJAX, XML, JSON, jQuery UI, Bootstrap, AngularJS, Node.js, and Ext JS.
Expertise in working with various databases, writing SQL queries, stored procedures, functions, and triggers using PL/SQL and SQL.
Experience with NoSQL databases such as Cassandra, HBase, MongoDB, and FiloDB, and their integration with a Hadoop cluster.
Experience in installing, configuring, supporting, and managing Hadoop clusters using Apache and Cloudera (CDH 5.x) distributions and on Amazon Web Services (AWS).
Strong experience in troubleshooting operating systems such as Linux, Red Hat, and UNIX, and in resolving cluster issues and Java-related bugs.
Experience in developing Spark jobs using Scala in a test environment for faster data processing, and in using Spark SQL for querying.
Good exposure to Service Oriented Architectures (SOA) built on Web services (WSDL) using SOAP protocol.
Well experienced in OOP principles (inheritance, encapsulation, polymorphism) and core Java concepts (collections, multithreading, synchronization, exception handling).
Java experience: created applications in core Java and built client-server applications requiring database access and constant connectivity, using JDBC, JSP, Spring, and Hibernate.
Extensive experience in middle-tier development using J2EE technologies like JDBC, JNDI, JSP, Servlets, JSF, Struts, Spring, Hibernate, EJB.
Good knowledge of developing responsive front-end components with JSP, HTML, XHTML, JavaScript, DOM, Servlets, JSF, Node.js, Ajax, jQuery, and AngularJS.

TECHNICAL SKILLS:

Programming Languages: Java, J2EE, C, SQL, PL/SQL, Pig Latin, Scala, HTML, XML

Hadoop: HDFS, MapReduce, HBase, Hive, Pig, Impala, Sqoop, Flume, Oozie, Spark, Spark SQL, ZooKeeper, AWS, Cloudera, Hortonworks, Kafka, Avro.

Web Technologies: JDBC, JSP, JavaScript, AJAX, SOAP.

Scripting Languages: JavaScript, Pig Latin, Python 2.7, and Scala.

RDBMS: Oracle, Microsoft SQL Server, MySQL.

NoSQL: MongoDB, HBase, Apache Cassandra, FiloDB.

SOA: Web Services (SOAP, WSDL)

IDEs: MyEclipse, Eclipse, and RAD

Operating System: Linux, Windows, UNIX, CentOS.

Methodologies: Agile, Waterfall model.

Hadoop Testing: MRUnit testing, Quality Center, Hive testing.

Other Tools: SVN, Apache Ant, JUnit, StarUML, TOAD, PL/SQL Developer, JIRA, Visual Source, QC, Agile methodology

PROFESSIONAL EXPERIENCE:

Confidential, NEW YORK, NY

Sr. Hadoop Developer

Responsibilities:

Responsible for building scalable distributed data solutions using Hadoop.
Developed Spark jobs and Hive jobs to summarize and transform data.
Developed Spark scripts for data analysis in both Python and Scala.
Built on-premises data pipelines using Kafka and Spark for real-time data analysis.
Created reports in Tableau to visualize the resulting data sets, and tested native Drill, Impala, and Spark connectors.
Analysed the SQL scripts and designed the solution for implementation in Scala.
Implemented complex Hive UDFs to execute business logic within Hive queries.
Responsible for bulk-loading large amounts of data into HBase using MapReduce by directly creating HFiles and loading them.
Evaluated the performance of Spark SQL vs. Impala vs. Drill on offline data as part of a PoC.
Worked on Solr configuration and customizations based on requirements.
Imported data from different data sources into HDFS using Sqoop, performed transformations using Hive and MapReduce, and loaded the results back into HDFS.
Designed and developed SSIS (ETL) packages to validate, extract, transform, and load data from the OLTP system to the data warehouse and report data mart.
Exported result sets from Hive to MySQL using the Sqoop export tool for further processing.
Experienced in loading and transforming large sets of structured, semi-structured, and unstructured data.
Collected and aggregated large amounts of log data using Flume and staged the data in HDFS for further analysis.
Responsible for developing data pipeline by implementing Kafka producers and consumers.
Worked on the ETL scripts and fixed issues arising during data loads from various data sources.
Implemented Spark applications in Scala using higher-order functions for both batch and interactive analysis requirements.
Performed data analysis with HBase using Apache Phoenix.
Managed and reviewed Hadoop log files to resolve configuration issues.
Developed a program to extract named entities from OCR files.
Used Git for version control.

Environment: MapR, Cloudera, Hadoop, HDFS, AWS, Pig, Hive, Impala, Drill, Spark SQL, OCR, MapReduce, Flume, Sqoop, Oozie, Storm, Zeppelin, Mesos, Docker, Solr, Kafka, MapR-DB, Spark, Scala, HBase, ZooKeeper, Shell Scripting, Gerrit, Java, Redis.

Confidential, Nashville, TN

Hadoop/Spark Developer

Responsibilities:

Worked with Hadoop Ecosystem components like Cassandra, Sqoop, Flume, Oozie, Hive and Pig.
Developed Pig and Hive UDFs in Java to extend Pig and Hive functionality, and wrote Pig scripts for sorting, joining, filtering, and grouping data.
Developed Spark programs using Scala, created Spark SQL queries, and developed Oozie workflows for Spark jobs.
Developed Oozie workflows with Sqoop actions to migrate data from relational databases such as Oracle and Teradata to HDFS.
Developed Hive queries to analyse the data and generate the end reports used by business users.
Implemented Spark jobs in Scala, using DataFrames and the Spark SQL API for faster data processing.
Developed Spark code and Spark-SQL/Streaming for faster testing and processing of data.
Developed a data pipeline using Kafka, Cassandra, and Hive to ingest, transform, and analyse customer behavioural data.
Worked extensively with Hive joins and used HQL to query the databases, eventually developing complex Hive UDFs.
Responsible for migrating iterative MapReduce programs into Spark transformations using Spark and Scala.
Used Scala to write the code for all the use cases in Spark and Spark SQL.
Implemented Spark applications in Scala using higher-order functions for both batch and interactive analysis requirements. Implemented Spark batch jobs.
Worked with the Spark Core, Spark Streaming, and Spark SQL modules.
Involved in converting Hive/SQL queries into Spark transformations using Spark RDDs, Python and Scala.
Explored various Spark modules and worked with DataFrames, RDDs, and SparkContext.
Developed a data pipeline using Spark and Hive to ingest, transform, and analyse data.
Developed Spark scripts using Scala shell commands as per the requirements.
Experienced in performance tuning of Spark applications, including setting the right batch interval, the correct level of parallelism, and memory tuning.
Analysed the SQL scripts and designed the solution for implementation in Scala.
Responsible for developing a data pipeline with Amazon AWS to extract data from weblogs and store it in HDFS.
Implemented schema extraction for Parquet and Avro file Formats in Hive.
Implemented partitioning, dynamic partitions, and buckets in Hive.
Worked with and learned a great deal from AWS cloud services such as EC2, S3, EMR, and RDS.
Imported data from AWS S3 into Spark RDDs and performed transformations and actions on the RDDs.

Environment: Hadoop YARN, Spark Core, Spark Streaming, Spark SQL, Scala, Kafka, Hive, Cassandra, Sqoop, Amazon AWS, Tableau, Oozie, Cloudera, Oracle, Linux.

Confidential, WI

Hadoop/Spark Developer

Responsibilities:

Worked on creating Kafka topics and partitions and writing custom partitioner classes.
Experienced in writing Spark Applications in Scala and Python (PySpark).
Imported Avro files using Apache Kafka and performed analytics using Spark in Scala.
Extracted real-time data using Kafka and Spark Streaming by creating DStreams, converting them into RDDs, processing them, and storing the results in Cassandra.
Experience in building Real-time Data Pipelines with Kafka Connect and Spark Streaming.
Configured, deployed and maintained multi-node Dev and Test Kafka Clusters.
Used Spark Streaming APIs to perform transformations and actions on the fly to build the common learner data model, which receives data from Kafka in near real time and persists it into Cassandra.
Developed scripts that load data into Spark DataFrames and perform in-memory computation to generate the output response.
Used sbt to build Scala-based Spark projects and executed them using spark-submit.
Developed Spark code using Scala and Spark-SQL/Streaming for faster testing and processing of data.
Developed batch scripts to fetch data from AWS S3 storage and perform the required transformations in Scala using the Spark framework.
Built Cassandra nodes on AWS and set up the Cassandra cluster using Ansible automation tools.
Worked with and learned a great deal from Amazon Web Services (AWS) cloud services such as EC2, S3, EMR, EBS, RDS, and VPC.
Developed Scala scripts and UDFs using both DataFrames/SQL and RDD/MapReduce in Spark for data aggregation and queries, and wrote data back into RDBMS through Sqoop.
Involved in executing various Oozie workflows and automating parallel Hadoop MapReduce jobs.
Developed Oozie Bundles to Schedule Pig, Sqoop and Hive jobs to create data pipelines.
Experience in using ORC, Avro, Parquet, RCFile and JSON file formats and developed UDFs using Hive and Pig.
Developed Hive queries to analyse the data and generate the end reports used by business users.
Used Spark and Spark SQL to read the Parquet data and create the tables in Hive using the Scala API.
Designed solutions for various system components using Microsoft Azure.
Wrote a generic, extensive data quality check framework for use by the application, using Impala.
Worked on migrating MapReduce programs into Spark transformations using Spark and Scala; the migration was initially done in Python (PySpark).
Involved in Cassandra data modelling and in building efficient data structures.
Understanding of Kerberos authentication in Oozie workflow for Hive and Cassandra.

Environment: Hadoop, Hive, Impala, Oracle, Spark, Python, Pig, Sqoop, Oozie, MapReduce, Git, HDFS, Cassandra, Apache Kafka, Storm, Linux, Solr, Confluence, Jenkins.

Confidential

Java Developer

Responsibilities:

Developed modules in Java and integrated with MySQL database.
Responsible for coding using Java Servlets, Java Beans and XML.
Worked with OOP concepts such as inheritance, encapsulation, abstraction, and polymorphism.
Worked with core Java features such as collections, exception handling, and multithreading.
Developed web applications using Spring MVC framework.
Involved in Analysis, Design and Development of different phases of Process Flow module.
Designed and developed highly customized front-end screens as a Rich Internet Application (RIA) using the Sencha Ext JS framework, JavaScript, HTML, and CSS.
Designed graphical user interfaces using JSPs.
Worked with various design patterns, UML, and Enterprise Application Integration.
Implemented Action classes and ActionForms using Struts.
Worked on the design of the entire end-to-end architecture for the Classification Web Application.
Added dynamic functionality to the user interface using JavaScript.
Implemented components and wireframes using cross-browser compatible JavaScript, jQuery, and AJAX.
Experience in Programming with SQL, PL/SQL.
Used JDBC for administering and managing users and clients.
Implemented XSLT transformation for converting XML to HTML.
Implemented database tables, middleware designing, client-side web programming and server-side java programming.
Followed the Scrum Agile methodology for the iterative development of the application.
Scripted test cases based on the specifications received for each request.
Used various testing methodologies to test the application at different levels, such as system testing and integration testing.

Environment: Java, JavaScript, JSP, Java Beans, Struts, Java Servlets, jQuery, Apache Tomcat, Eclipse, AJAX, Windows, PL/SQL, JDBC, XML, CSS, HTML.
