Senior Hadoop Developer Resume Union, NJ - Hire IT People

SUMMARY:

About 8+ years of experience in the IT industry, including technical proficiency in Big Data environments with extensive expertise in development on the Hadoop ecosystem and Java.
Extensive experience with Hadoop platform components: MapReduce (MRv1, YARN), Hive, Pig, Sqoop, Oozie, HBase, Spark, Spark Streaming, Spark SQL, Elasticsearch, and Scala.
Experience working on Amazon Web Services (AWS) cloud infrastructure.
Extensive knowledge of MongoDB concepts and good knowledge of its administration.
Good experience in developing and implementing Spark and its streaming functionality using Scala and Python to work with real-time data.
Proficient in writing MapReduce programs and using the Apache Hadoop Java API to analyze structured and unstructured data.
Extensive experience in fine-tuning and optimizing the performance of Spark and Spark Streaming jobs.
Worked on replacing MR jobs and Hive scripts with Spark SQL and Spark data transformations for efficient data processing.
Hands-on experience converting complex MapReduce programs into Spark RDD operations such as transformations and actions.
Worked on loading Parquet/TXT files in the Spark framework using Java/Scala, and created Spark DataFrames and RDDs to process the data and save it in Parquet format in HDFS for loading into a fact table using the ORC reader.
Knowledge of Apache Spark with Cassandra.
Monitoring the Data Streaming (DS) between web sources and HDFS (Hadoop Distributed File System).
Installation, configuration, management, support, and monitoring of Hadoop clusters using various distributions and tools such as Apache Spark, Cloudera, and the AWS service console.
Developed a Spark Streaming consumer application integrated with Kafka.
Good understanding of Hadoop architecture and hands-on experience with Hadoop components such as NameNode, DataNode, and MapReduce concepts, Spark execution concepts, and the HDFS framework.
Familiar with MongoDB clusters and Java scripting to load unstructured data into a sharded environment.
Used Apache Kafka to aggregate log data from multiple servers and make it available to downstream systems for analysis using Spark Streaming.
Involved in designing the various stages of migrating data from RDBMS to Cassandra.
Experience launching EMR clusters, Redshift clusters, EC2 instances, Amazon Data Pipeline, and Simple Workflow Service.
Expert in working with the Hive data warehouse tool: creating tables, distributing data by implementing partitioning and bucketing, and writing and optimizing HiveQL queries.
Experience in writing Pig Latin scripts to sort, group, join and filter the data.
Experience writing UDFs in Java for Hive and Pig.
Successfully retrieved consumer group lag from Kafka using its API.
Hands-on experience setting up workflows using the Apache Oozie workflow engine for managing and scheduling Hadoop jobs.
Strong knowledge of NoSQL databases such as Cassandra (column-oriented) and MongoDB (document-oriented) and their integration with Hadoop clusters. Working experience with HBase and Elasticsearch.
Good knowledge of Object-Oriented Analysis and Design (OOAD) and Java design patterns, and a good level of expertise in Core Java.
Comprehensive knowledge of Software Development Life Cycle, Agile methodology, coupled with excellent communication skills.
Strong analytical and problem-solving skills.
Implemented microservices in Scala with Apache Kafka.
Experience working in both team and individual environments. Always eager to learn new technologies and implement them in challenging environments.
Team player with good interpersonal, communication, and presentation skills. Exceptional ability to learn and master new technologies and to deliver results on short deadlines.

TECHNICAL SKILLS:

Hadoop Platform: MapReduce, Hive, HBase, Pig, Sqoop, Oozie, Impala, Spark Streaming, Spark SQL

NoSQL Databases: Cassandra, MongoDB, HBase, Bigtable, Elasticsearch

Programming: Core Java, SQL, Shell scripting, C, C++

AWS Hadoop Services: S3, EMR, Simple Workflow, Data Pipeline, Redshift

Operating Systems: Linux (Red Hat, CentOS), Windows XP/7/8, Mac OS

ETL: Pentaho Report Designer, Logstash

BI Tools: Tableau, Kibana

Hadoop Platform Distributions: Hadoop, HDP, Cloudera (CDH3, CDH4, CDH5), Pivotal HD (2.0), AWS, GCP

PROFESSIONAL EXPERIENCE:

Confidential, Union, NJ

Senior Hadoop Developer

Responsibilities:

Involved in building the data engineering platform on AWS for ingesting, aggregating, and visualizing real-time streaming data from multiple sources.
Developed Spark Streaming jobs that stream data from Kafka topics and perform transformations on it.
Worked extensively on the Spark framework using Scala to perform ETL operations.
Involved in end-to-end development, testing, deployment, and performance tuning of Spark jobs.
Worked on developing parsers using the Scala API for parsing data from different sources and data formats such as byte code, JSON, and CSV.
Designed and configured topics in a new Kafka cluster across all environments.
Worked extensively on optimizing and tuning the Spark Streaming application to provide real-time access to data.
Managed Amazon Web Services (AWS) resources: ELB, EC2, S3, EMR, and CloudWatch.
Worked with both the receiver-based approach and the direct stream approach for streaming real-time data from Kafka using Spark Streaming.
Deployed EMR clusters on AWS.
Installed Kafka Manager to track consumer lag and monitor Kafka metrics; it was also used to add topics, partitions, etc.
Involved in multiple code improvements that significantly reduced the processing time for a single streaming batch, optimizing the performance of the pipeline.
Hands-on experience working with the Amazon EMR framework and transferring data to EC2 servers.
Worked on developing a parser for converting network data in byte code format to JSON format using the Scala API.
Developed automated scripts for provisioning clusters for Kafka, Zookeeper, and Elasticsearch.

Environment: Scala, Spark, Spark Streaming, Kafka, ElasticSearch, Zookeeper, Python, Java, Shell Scripting, AWS EMR.

Confidential, Tampa, FL

Hadoop Developer

Responsibilities:

Involved in working on Spark SQL code as an alternative approach for faster data processing and better performance.
Proposed an automated system using shell scripts for the Hadoop job deployment process.
Created and maintained technical documentation for launching Hadoop clusters and for executing Hive queries and Pig scripts; involved in writing Hive and Pig scripts for complex transformations.
Used Kafka features such as its distributed, partitioned, replicated commit log service to maintain message feeds, and created applications that monitor consumer lag within Apache Kafka clusters.
Implemented custom Hive UDFs to achieve comprehensive data analysis.
Wrote Oozie workflows to run multiple Hive, shell script, and Pig jobs that run independently based on time and data availability.
Prepared Pig scripts and Spark SQL to handle all the transformations specified in the S2TMs and to handle SCD1 and SCD2 scenarios.
Used Apache NiFi to implement a system to store, send and ingest data from hundreds of devices.
Loaded data into Spark RDDs and cached them to avoid repeated shuffling; experienced with batch processing of data sources using Apache Spark.
Experience developing APIs and frameworks for YARN applications using Apache Tez.
Developed a system to monitor Agile teams and performed log analysis on the ELK Stack.
Experience in managing large-scale, geographically distributed database systems, including relational (Oracle, SQL Server) and NoSQL (MongoDB, Cassandra) systems.
Involved in ingesting data into IDW staging directly through Spark Sqoop to push data into HDFS.
Handled installation, administration and configuration of ELK Stack on AWS and performed log analysis.
Experience in developing custom processors in Apache NiFi.
Designed a messaging system using Apache Kafka to send messages across teams.
Used Shell scripting for automation of scripts.
Worked on QA support activities, test data creation and Unit testing activities.
Worked in Agile development approach.

Environment: Hortonworks Data Platform, Apache Tez, HDFS, Kafka, Spark RDD, HBase, Hive, Java, Sqoop, Oracle, MySQL, Spark, Storm, NoSQL, Apache NiFi, ELK Stack.

Confidential, Omaha, NE

Hadoop Developer

Responsibilities:

Developed Big Data Solutions that enabled the business and technology teams to make data-driven decisions on the best ways to acquire customers and provide them business solutions.
Worked on automating delta feeds from Teradata using Sqoop, and from FTP servers to Hive.
Involved in loading one of the largest tables (SCAN table) from Teradata to Hadoop using the TPT utility.
Developed Spark code using Scala and Spark SQL for faster testing and processing of data.
Responsible for loading, aggregating, and moving large amounts of log data using Flume.
Worked extensively on the Spark Core and Spark SQL modules.
Developed custom MapReduce programs to analyze data and used Pig Latin to clean unwanted data.
Strong skills in SQL, Hive, and Impala to extract data from SQL Server, Oracle, and Hadoop databases.
Involved in analyzing the existing BTEQ scripts on mainframes and implementing the same logic in Hadoop.
Created complex queries aggregating large datasets in Impala to perform data quality checks for the project.
Involved in exporting data from Hadoop to Greenplum using the gpload utility.
Developed MapReduce programs to cleanse data in HDFS obtained from heterogeneous data sources to make it suitable for ingestion into the Hive schema for analysis.
Used Sqoop to import data from RDBMS into the Hadoop Distributed File System (HDFS) and later analyzed the imported data using Hadoop components.
Performed various performance optimizations, such as using the distributed cache for small datasets, partitioning and bucketing in Hive, and map-side joins.
Involved in loading and transforming large sets of structured, semi-structured, and unstructured data and analyzed them by running Hive queries and Pig scripts.
Participated in requirements gathering from the experts and business partners and converted the requirements into technical specifications.
Implemented a daily workflow for extraction, processing, and analysis of data with Oozie.
Involved in loading data from the Linux file system to HDFS.

Environment: Hadoop (Cloudera, Pivotal HD), Teradata 13.0, Pig, Hive, Sqoop, Flume, MapReduce, HDFS, Linux, Oozie, Spark, Impala.

Confidential, Boston, MA

Java-Hadoop Developer

Responsibilities:

Worked on analyzing the Hadoop cluster using different big data analytics tools, including Pig, Hive, and MapReduce.
Worked on debugging and performance tuning of Hive and Pig jobs.
Worked on tuning the performance of Pig queries.
Experience working on processing unstructured data using Pig and Hive.
Worked on evaluating complex business metrics in Pig and MapReduce.
Created Hive scripts to process the data for analysis.
Focused on programming different Java modules and integration.
Implemented Java mail services for email notifications.
Actively involved in the design and development of Java/JEE components.

Environment: Kafka, Data Pipeline, MapReduce (Java), Hive, Pig

Confidential

Java Developer

Responsibilities:

Used Java, JSP, and JSTL to enhance functionality, and was responsible for creating database tables on DB2.
Wrote JavaScript code for front-end validation.
Involved in various phases of the software development life cycle (SDLC), such as requirements gathering, data modeling analysis, architecture design, and development for the project.
Worked on Java Message Service (JMS) for developing messaging services.
Developed server-side services using Java concepts. Involved in core Java technologies, multithreading, and exception handling.
Involved in developing front-end applications that interact with the mainframe applications using J2C connectors.
Used JDBC for object-relational mapping and persistence.
Designed and implemented a scalable, RESTful, microservices-based back end. The back end is written in Java using Spring Boot for simplicity and scalability.
Used JUnit to develop test cases for unit testing.
Used JIRA as a bug reporting tool for updating the bug report.
Developed new and maintained existing functionality using Spring MVC and Hibernate.

Environment: HTML, JavaScript, CSS, Servlets, JSP, XML, ANT, SOAP, JIRA, JUnit, Ajax, Git

Confidential

Junior Java Developer

Responsibilities:

Involved in gathering business requirements, analyzing the project and creating UML diagrams such as Use cases, class diagrams and flow charts.
Developed front end using JSTL, JSP, HTML and JavaScript.
Created new and maintained existing web pages built with JSP and Servlets.
Worked extensively on views, stored procedures, triggers, and SQL queries for loading the data (staging) to enhance and maintain the existing functionality.
Coded and developed multi-tiered architecture in Java, J2EE, Servlets.
Consumed Web Services (WSDL, SOAP, UDDI) from third party for authorized payments to/from customers.
Developed Hibernate mapping (.hbm.xml) files for mapping declarations.
Actively involved from the start of the project, from requirements gathering through quality assurance testing.
Wrote and modified database queries and stored procedures for Oracle9i.

Environment: Java JDK 1.5, Oracle, Java/J2EE, JSP, WebLogic Application Server, HTML, Servlets, UML, XML, WSDL, SOAP, UDDI.
