Hadoop Developer Resume
Overland Park, KS
SUMMARY:
- 8 years of experience in developing, implementing, and configuring the Hadoop ecosystem and in developing various web applications using Java and J2EE.
- 4 years of experience in Big Data analytics using Hadoop, HDFS, MapReduce, Hive, Pig, HBase, Sqoop, YARN, Spark, Scala, Oozie, Kafka, and Flume.
- Work experience with different Hadoop distributions, including Hortonworks and Cloudera.
- Excellent understanding of the Hadoop Distributed File System and experience in developing efficient MapReduce and YARN programs to process large datasets.
- Good working knowledge of using Sqoop to transfer bulk data between relational databases and HDFS, and of Flume for ingesting streaming data into HDFS.
- Good knowledge of using Apache NiFi to automate data movement between different Hadoop systems.
- Implemented Talend jobs to load data from different sources and integrated them with Kafka.
- Highly skilled in integrating Kafka with Spark Streaming for high-speed data processing.
- Very good at loading data into Spark schema RDDs and querying them using Spark SQL.
- Good at writing custom RDDs in Scala to apply data-specific transformations, and implemented design patterns to improve performance.
- Experienced in using Apache Hue and Ambari to manage and monitor Hadoop clusters.
- Experience in analyzing large amounts of data using Pig Latin scripts and Hive Query Language, and assisted with performance tuning.
- Adept at writing customized UDFs in Java to extend Pig and Hive functionality (see the sketch following this summary).
- Sound knowledge of using Apache Solr to search against structured and unstructured data.
- Worked with the Azkaban and Oozie workflow schedulers to run recurring Hadoop jobs.
- Experience in implementing the Kerberos authentication protocol in Hadoop for data security.
- Experience in creating dashboards and generating reports with Tableau by connecting to tables in Hive and HBase.
- Experience in using SequenceFile, RCFile, ORC, and Avro file formats and compression techniques.
- Worked on NoSQL databases such as HBase and Cassandra to store structured and unstructured data.
- Good knowledge of cloud integration with Amazon Elastic MapReduce (EMR), Amazon Elastic Compute Cloud (EC2), Amazon Simple Storage Service (S3), and Microsoft Azure.
- Hands-on experience with UNIX environments and shell scripting.
- Experienced in using the version control systems SVN and Git, the build tool Maven, and the continuous integration tool Jenkins.
- Expertise in developing web applications using J2EE technologies such as Servlets, JSP, Web Services, Spring, Hibernate, HTML, jQuery, and Ajax.
- Implemented design patterns to improve the quality and performance of applications.
- Worked on JUnit to test the functionality of Java methods and used Python for automation.
- Good experience in using the relational databases Oracle, SQL Server, and PostgreSQL.
- Worked with the Agile, Scrum, and Confidential software development frameworks for managing product development.
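A minimal sketch of the kind of custom Hive UDF in Java referenced above; the class name, input semantics, and cleanup logic are illustrative assumptions, not the actual production code.

```java
import org.apache.hadoop.hive.ql.exec.UDF;
import org.apache.hadoop.io.Text;

// Hypothetical UDF that normalizes free-text values before aggregation.
// Registered in Hive with: ADD JAR ...; CREATE TEMPORARY FUNCTION clean_text AS '...';
public class CleanTextUDF extends UDF {

    // Hive resolves evaluate() by reflection; null input yields null output.
    public Text evaluate(Text input) {
        if (input == null) {
            return null;
        }
        String cleaned = input.toString()
                .trim()
                .toLowerCase()
                .replaceAll("\\s+", " "); // collapse repeated whitespace
        return new Text(cleaned);
    }
}
```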
TECHNICAL SKILLS:
Hadoop Ecosystem: HDFS, MapReduce, Pig, Hive, Sqoop, Flume, ZooKeeper, Oozie, Kafka, Storm, Talend, Spark, NiFi, Impala, Solr, Avro, and Crunch.
Programming languages: Java, Python, Scala.
NoSQL Databases: HBase, Cassandra, MongoDB
Databases: Oracle, SQL Server, PostgreSQL
Web Technologies: HTML, jQuery, Ajax, CSS, JavaScript, JSON, XML.
Business Intelligence Tools: Tableau, Jasper Reports.
Testing: Hadoop Testing, Hive Testing, MRUnit.
Operating Systems: Linux Red Hat/Ubuntu/CentOS, Windows 10/8.1/7/XP.
Hadoop Distributions: Cloudera Enterprise, Hortonworks.
Technologies and Tools: Servlets, JSP, Spring, Web Services, Hibernate, Maven, GitHub.
Application Servers: Tomcat, JBoss
IDEs: Eclipse, NetBeans, IntelliJ
PROFESSIONAL EXPERIENCE:
Confidential, Overland Park, KS
Hadoop Developer
Responsibilities:
- Responsible for building scalable distributed data solutions in a Hadoop cluster environment with the Hortonworks distribution on 110 data nodes.
- Worked on the Kafka REST API to collect and load data onto the Hadoop file system, and used Sqoop to load data from relational databases.
- Used Spark Streaming APIs to perform the necessary transformations and actions on data received from Kafka and persisted it into Cassandra (see the sketch following this section).
- Started using Apache NiFi to copy data from the local file system to HDFS.
- Developed Spark scripts by writing custom RDDs in Scala for data transformations and performing actions on RDDs.
- Worked with the Avro and ORC file formats and compression techniques such as LZO.
- Used Hive to form an abstraction on top of structured data residing in HDFS, and implemented partitions, dynamic partitions, and buckets on Hive tables.
- Used the Spark API over Hadoop YARN as the execution engine for data analytics using Hive.
- Performed advanced procedures such as text analytics and processing using the in-memory computing capabilities of Spark with Scala.
- Worked on migrating MapReduce programs into Spark transformations using Scala.
- Designed and developed data integration programs in a Hadoop environment with the NoSQL data store Cassandra for data access and analysis.
- Used the job management scheduler Apache Oozie to execute workflows.
- Used Ambari to monitor node health and the status of jobs in Hadoop clusters.
- Worked on Tableau to build customized interactive reports, worksheets and dashboards.
- Implemented Kerberos for strong authentication to provide data security.
- Worked on Apache Solr for indexing and load-balanced querying to search for specific data in larger datasets.
- Involved in performance tuning of Spark jobs by caching data and taking full advantage of the cluster environment.
Environment: Hadoop, HDFS, Spark, Scala, Kafka, Hive, Sqoop, Ambari, Solr, Oozie, Cassandra, Tableau, Jenkins, Bitbucket, Hortonworks and Red Hat Linux.
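A minimal sketch, written in Java for consistency with the other examples, of the Kafka-to-Spark-Streaming pattern described in this section; the broker address, topic name, and batch interval are hypothetical, and the Cassandra persistence step, which would typically go through the DataStax spark-cassandra-connector, is only indicated in a comment.

```java
import java.util.Collections;
import java.util.HashMap;
import java.util.Map;
import org.apache.kafka.clients.consumer.ConsumerRecord;
import org.apache.kafka.common.serialization.StringDeserializer;
import org.apache.spark.SparkConf;
import org.apache.spark.streaming.Durations;
import org.apache.spark.streaming.api.java.JavaInputDStream;
import org.apache.spark.streaming.api.java.JavaStreamingContext;
import org.apache.spark.streaming.kafka010.ConsumerStrategies;
import org.apache.spark.streaming.kafka010.KafkaUtils;
import org.apache.spark.streaming.kafka010.LocationStrategies;

public class KafkaToCassandraSketch {
    public static void main(String[] args) throws Exception {
        SparkConf conf = new SparkConf().setAppName("kafka-stream-sketch");
        JavaStreamingContext jssc = new JavaStreamingContext(conf, Durations.seconds(10));

        Map<String, Object> kafkaParams = new HashMap<>();
        kafkaParams.put("bootstrap.servers", "broker1:9092"); // hypothetical broker
        kafkaParams.put("key.deserializer", StringDeserializer.class);
        kafkaParams.put("value.deserializer", StringDeserializer.class);
        kafkaParams.put("group.id", "stream-sketch");

        JavaInputDStream<ConsumerRecord<String, String>> stream =
            KafkaUtils.createDirectStream(
                jssc,
                LocationStrategies.PreferConsistent(),
                ConsumerStrategies.<String, String>Subscribe(
                    Collections.singletonList("events"), kafkaParams)); // hypothetical topic

        // Transform each micro-batch; in the real pipeline the resulting RDD
        // would be written to Cassandra via the spark-cassandra-connector
        // inside foreachRDD (omitted here).
        stream.map(ConsumerRecord::value)
              .filter(v -> v != null && !v.isEmpty())
              .foreachRDD(rdd -> System.out.println("batch size: " + rdd.count()));

        jssc.start();
        jssc.awaitTermination();
    }
}
```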
Confidential, Kansas City, MO
Hadoop Developer
Responsibilities:
- Designed and developed analytic systems to extract meaningful data from large-scale structured and unstructured health data.
- Created Sqoop jobs to populate data from relational databases into Hive tables.
- Developed UDFs in Java to enhance the functionality of Pig and Hive scripts.
- Solved performance issues in Pig and Hive scripts with a deep understanding of joins, groups, and aggregations and of how these jobs translate into MapReduce jobs.
- Involved in creating Hive external tables, loading data, and writing Hive queries.
- Loaded the processed data into HBase for faster querying and random access (see the sketch following this section).
- Defined job flows using the Azkaban scheduler to automate Hadoop jobs and installed ZooKeeper for automatic node failover.
- Managed and reviewed Hadoop log files to find the source of job failures and debugged scripts for code optimization.
- Developed complex MapReduce programs to analyze data that exists on the cluster.
- Developed processes to load data from server logs into HDFS using Flume, as well as loading data from the UNIX file system to HDFS.
- Built a platform to query and display the analysis results in dashboards using Tableau.
- Used the Apache Hue web interface to monitor the Hadoop cluster and run jobs.
- Implemented Apache Sentry for role-based authorization of data access.
- Developed shell scripts to automate routine DBA tasks (e.g., data refreshes, backups).
- Involved in the performance tuning for Pig Scripts and Hive Queries.
Environment: HDFS, MapReduce, Pig, Hive, Sqoop, Flume, HBase, Azkaban, Tableau, Java, Maven, Git, Cloudera, Eclipse and Shell Scripting.
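A minimal sketch of loading and reading processed records with the HBase Java client, as referenced above; the table name, row key, and column family are hypothetical.

```java
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.hbase.HBaseConfiguration;
import org.apache.hadoop.hbase.TableName;
import org.apache.hadoop.hbase.client.Connection;
import org.apache.hadoop.hbase.client.ConnectionFactory;
import org.apache.hadoop.hbase.client.Get;
import org.apache.hadoop.hbase.client.Put;
import org.apache.hadoop.hbase.client.Result;
import org.apache.hadoop.hbase.client.Table;
import org.apache.hadoop.hbase.util.Bytes;

public class HBaseWriteSketch {
    public static void main(String[] args) throws Exception {
        Configuration conf = HBaseConfiguration.create();
        try (Connection connection = ConnectionFactory.createConnection(conf);
             Table table = connection.getTable(TableName.valueOf("processed_records"))) { // hypothetical table

            // Row key chosen for direct lookup by record id; column family "d" holds the data.
            Put put = new Put(Bytes.toBytes("rec-00042"));
            put.addColumn(Bytes.toBytes("d"), Bytes.toBytes("status"), Bytes.toBytes("processed"));
            table.put(put);

            // Point lookup, the random-access pattern HBase is used for here.
            Get get = new Get(Bytes.toBytes("rec-00042"));
            Result result = table.get(get);
            byte[] status = result.getValue(Bytes.toBytes("d"), Bytes.toBytes("status"));
            System.out.println(Bytes.toString(status));
        }
    }
}
```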
Confidential, Sunnyvale, CA
Hadoop Developer
Responsibilities:
- Worked on a Cloudera Hadoop cluster of 50 data nodes with Red Hat Enterprise Linux installed.
- Involved in loading data from UNIX file system to HDFS using Shell Scripting.
- Imported and exported data between HDFS and an Oracle 10.2 database using Sqoop.
- Developed ETL processes to load data from multiple data sources into HDFS using Sqoop, analyzing the data with MapReduce, Hive, and Pig Latin.
- Developed custom UDFs for Pig scripts to clean unstructured data, and used different joins and groups where required to optimize the Pig scripts.
- Created Hive external tables on top of processed data to easily manage and query the data using HiveQL.
- Involved in performance tuning of Hive queries by implementing dynamic partitions and buckets in Hive.
- Integrated MapReduce with HBase to import bulk data using MR programs.
- Used Flume to collect, aggregate, and store web log data from different sources such as web servers and pushed it to HDFS.
- Wrote MapReduce jobs in Java to parse the web logs stored in HDFS, and used MRUnit to test and debug the MapReduce programs (see the sketch following this section).
- Implemented the workflows using Apache Oozie framework to automate tasks.
- Coordinated with team in resolving the issues technically as well as functionally.
Environment: Cloudera, HDFS, MapReduce, Pig, Hive, Flume, Sqoop, HBase, Oozie, Maven, Git, Java, Python and Linux.
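A minimal sketch of a web-log mapper with an MRUnit test, as referenced above; the log format and field positions are simplified assumptions, not the actual parsing logic.

```java
import java.io.IOException;
import org.apache.hadoop.io.IntWritable;
import org.apache.hadoop.io.LongWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Mapper;
import org.apache.hadoop.mrunit.mapreduce.MapDriver;
import org.junit.Before;
import org.junit.Test;

public class StatusCodeMapperTest {

    // Hypothetical mapper that emits the HTTP status code from each log line.
    public static class StatusCodeMapper
            extends Mapper<LongWritable, Text, Text, IntWritable> {
        private static final IntWritable ONE = new IntWritable(1);

        @Override
        protected void map(LongWritable key, Text value, Context context)
                throws IOException, InterruptedException {
            // Assumes a simplified space-delimited line where the status code
            // is the second-to-last field, e.g. "... /index.html 200 1043".
            String[] fields = value.toString().split(" ");
            if (fields.length >= 2) {
                context.write(new Text(fields[fields.length - 2]), ONE);
            }
        }
    }

    private MapDriver<LongWritable, Text, Text, IntWritable> mapDriver;

    @Before
    public void setUp() {
        mapDriver = MapDriver.newMapDriver(new StatusCodeMapper());
    }

    @Test
    public void emitsStatusCode() throws IOException {
        mapDriver.withInput(new LongWritable(0),
                            new Text("10.0.0.1 - - GET /index.html 200 1043"))
                 .withOutput(new Text("200"), new IntWritable(1))
                 .runTest();
    }
}
```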
Confidential
Java Developer
Responsibilities:
- Involved in designing and developing the project using Java and J2EE technologies, following the MVC architecture in which JSPs are the views and Servlets are the controllers.
- Designed network and use-case diagrams in StarUML to track the workflow.
- Wrote server-side programs using RESTful web services to handle requests coming from different types of devices, such as iOS.
- Implemented design patterns such as CacheManager and Factory classes to improve the performance of the application (see the sketch following this section).
- Used the Hibernate ORM tool to store and retrieve data from a PostgreSQL database.
- Involved in writing test cases for the application using JUnit.
- Followed the Agile software development process for this project and achieved fast development.
Environment: JSP, Servlets, Ajax, RESTful Web Services, Hibernate, Design Patterns, StarUML, Eclipse and PostgreSQL.
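A minimal sketch of the singleton CacheManager pattern mentioned above; the class and method names are illustrative rather than the application's actual code.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical application-level cache used to avoid repeated database lookups.
// The singleton is eagerly initialized, which keeps it simple and thread-safe.
public final class CacheManager {

    private static final CacheManager INSTANCE = new CacheManager();

    private final Map<String, Object> cache = new ConcurrentHashMap<>();

    private CacheManager() {
        // private constructor prevents outside instantiation
    }

    public static CacheManager getInstance() {
        return INSTANCE;
    }

    public void put(String key, Object value) {
        cache.put(key, value);
    }

    public Object get(String key) {
        return cache.get(key);
    }

    public void evict(String key) {
        cache.remove(key);
    }
}
```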
Confidential
Java Developer
Responsibilities:
- Designed and implemented the training and reports modules of the application using Servlets, JSP and Ajax.
- Developed custom JSP tags for the application.
- Wrote queries to fetch and manipulate data using the ORM framework iBatis.
- Used the Quartz scheduler to run jobs sequentially at given times (see the sketch following this section).
- Implemented design patterns like Filter, Cache Manager and Singleton to improve the performance of the application.
- Implemented the reports module of the application using Jasper Reports to display dynamically generated reports for business intelligence.
- Deployed the application at the client's location on a Tomcat server.
Environment: HTML, JavaScript, Ajax, Servlets, JSP, iBatis, Tomcat Server, PostgreSQL, Jasper Reports.
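A minimal sketch of scheduling a recurring job with Quartz, as referenced above; the job class, group names, and cron expression are hypothetical.

```java
import org.quartz.CronScheduleBuilder;
import org.quartz.Job;
import org.quartz.JobBuilder;
import org.quartz.JobDetail;
import org.quartz.JobExecutionContext;
import org.quartz.JobExecutionException;
import org.quartz.Scheduler;
import org.quartz.SchedulerException;
import org.quartz.Trigger;
import org.quartz.TriggerBuilder;
import org.quartz.impl.StdSchedulerFactory;

public class ReportJob implements Job {

    @Override
    public void execute(JobExecutionContext context) throws JobExecutionException {
        // In the real application this step would assemble the Jasper report data.
        System.out.println("Running report job at " + context.getFireTime());
    }

    public static void main(String[] args) throws SchedulerException {
        JobDetail job = JobBuilder.newJob(ReportJob.class)
                .withIdentity("reportJob", "reports")
                .build();

        // Fire every day at 02:00; CronScheduleBuilder follows Quartz cron syntax.
        Trigger trigger = TriggerBuilder.newTrigger()
                .withIdentity("reportTrigger", "reports")
                .withSchedule(CronScheduleBuilder.cronSchedule("0 0 2 * * ?"))
                .build();

        Scheduler scheduler = StdSchedulerFactory.getDefaultScheduler();
        scheduler.scheduleJob(job, trigger);
        scheduler.start();
    }
}
```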