Sr. Hadoop Developer Resume
Hudson, Ohio
PROFESSIONAL SUMMARY:
- 8 years of experience in developing, implementing, and configuring Hadoop ecosystem components and in building various web applications using Java and J2EE.
- Good experience in Big Data analytics using Apache Hadoop, HDFS, MapReduce, Hive, Pig, HBase, Sqoop, YARN, Mesos, Spark, Scala, Oozie, Kafka, and Flume.
- Work experience with the Hortonworks and Cloudera Hadoop distributions.
- Excellent understanding of the Hadoop Distributed File System and experienced in developing efficient MapReduce jobs to process large datasets.
- Good working knowledge in using Sqoop and Flume for data ingestion into HDFS.
- Good knowledge of using Apache NiFi to automate data movement between different Hadoop systems.
- Implemented Talend jobs to load data from different sources and integrated them with Kafka.
- Highly skilled in integrating Kafka with Spark Streaming for high-speed data processing.
- Very good at loading data into Spark schema RDDs and querying them using Spark SQL.
- Good at writing custom RDDs in Scala and implementing design patterns to improve performance.
- Experienced in using Apache Hue and Ambari to manage and monitor Hadoop clusters.
- Experience in analyzing large amounts of data using Pig and Hive scripts.
- Mastery in writing custom UDFs in Java to extend Pig and Hive functionality (see the sketch after this list).
- Sound knowledge in using Apache Solr to search against structured and unstructured data.
- Worked with the Azkaban and Oozie workflow schedulers to run recurring Hadoop jobs.
- Experience in writing ad-hoc queries to move data from HDFS to Hive and analyzing the data using HiveQL.
- Experience in implementing Kerberos authentication protocol in Hadoop for data security.
- Experience in creating dashboards and generating reports using QlikSense.
- Experience in using SequenceFile, ORC, Parquet, and Avro file formats and compression techniques like LZO.
- Developed Spark code and Spark SQL/Streaming jobs for faster testing and processing of data.
- Worked on NoSQL databases like HBase, Cassandra and MongoDB to store the processed data.
- Good knowledge of cloud integration with Amazon Elastic MapReduce (EMR), Amazon Elastic Compute Cloud (EC2), Amazon Simple Storage Service (S3), and Microsoft Azure.
- Hands on experience on UNIX environment and shell scripting.
- Experienced in using version control system GIT, build tool Maven and integration tool Jenkins.
- Expertise in development of web applications using J2EE technologies such as Servlets, JSP, Web Services, Spring, Hibernate, HTML, jQuery, and Ajax.
- Implemented design patterns to improve quality and performance of the applications.
- Used core Java concepts such as the Collections framework, algorithms, data structures, and multithreading.
- Used JUnit to test the functionality of Java methods and Python for automation.
- Good experience in using Relational databases Oracle, SQL Server, and PostgreSQL.
- Experience in developing the J2EE applications using technologies like Java, JDBC and Servlets.
- Worked with Waterfall and Agile (Scrum/sprint-based) software development frameworks for managing product development.
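
The Hive/Pig UDF experience above can be illustrated with a minimal sketch in Java; the class name and the normalization it performs are hypothetical examples, not code from a specific project.

```java
import org.apache.hadoop.hive.ql.exec.UDF;
import org.apache.hadoop.io.Text;

// Minimal Hive UDF sketch: trims and lower-cases a free-text column.
// Class name and behavior are hypothetical.
public final class NormalizeText extends UDF {
    public Text evaluate(Text input) {
        if (input == null) {
            return null;                      // pass NULLs through unchanged
        }
        return new Text(input.toString().trim().toLowerCase());
    }
}
```

Packaged into a JAR, a UDF like this would be registered in Hive with ADD JAR and CREATE TEMPORARY FUNCTION and then invoked from HiveQL like any built-in function; a Pig UDF would extend EvalFunc instead.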
AREAS OF EXPERTISE:
Hadoop Ecosystem: HDFS, MapReduce, Pig, Hive, Sqoop, Flume, ZooKeeper, Oozie, Kafka, Storm, Talend, Spark, NiFi, Mesos, Avro, and Crunch.
Programming languages: Java, Python, Scala.
NoSQL Databases: HBase, Cassandra, MongoDB.
Databases: Oracle, SQL Server, PostgreSQL.
Web Technologies: HTML, jQuery, Ajax, CSS, JavaScript, JSON, XML.
Business Intelligence Tools: QlikSense, Jasper Reports.
Testing: Hadoop Testing, Hive Testing, MRUnit.
Operating Systems: Linux Red Hat/Ubuntu/CentOS, Windows 10/8.1/7/XP.
Hadoop Distributions: Cloudera Enterprise, Hortonworks.
Technologies and Tools: Servlets, JSP, Spring (Boot, MVC, Batch, Security), Web Services, Hibernate, Maven, GitHub.
Application Servers: Tomcat, JBoss.
IDEs: Eclipse, NetBeans, IntelliJ.
WORK EXPERIENCE:
Confidential, Hudson, Ohio
Sr. Hadoop Developer
Responsibilities:
- Responsible for building scalable distributed data solutions using Hadoop cluster environment with Hortonworks distribution.
- Worked with Kafka and REST APIs to collect and load data into HDFS; also used Sqoop to load data from relational databases.
- Implemented Talend jobs to load data from Excel sheets.
- Used Spark Streaming APIs to perform the necessary transformations and actions on data received from Kafka and persisted the results into Cassandra (see the sketch after this section).
- Developed Spark scripts by writing custom RDDs in Scala and Python for data transformations and actions on RDDs.
- Performed advanced procedures such as text analytics and processing using the in-memory computing capabilities of Spark with Scala and Python.
- Worked on reading multiple data formats on HDFS using Scala.
- Worked with Python to develop analytical jobs using Spark's lightweight PySpark API.
- Worked with Avro and ORC file formats and compression techniques like LZO.
- Analyzed the SQL scripts and designed the solution to implement using Scala.
- Used Hive to form an abstraction on top of structured data residing in HDFS and implemented partitions, dynamic partitions, and buckets on Hive tables.
- Used the Spark API over Hadoop YARN as the execution engine for data analytics with Hive.
- Worked on migrating MapReduce programs into Spark transformations using Scala.
- Designed, developed data integration programs in a Hadoop environment with NoSQL data store Cassandra for data access and analysis.
- Used the Apache Oozie job scheduler to execute workflows.
- Used Ambari to monitor node health and job status and to run analytics jobs on the Hadoop clusters.
- Worked on QlikSense to build customized interactive reports, worksheets, and dashboards.
- Implemented Kerberos for strong authentication to provide data security.
- Involved in performance tuning of Spark jobs by caching data and taking full advantage of the cluster environment.
Environment: Hadoop, HDP, Spark, Scala, Python, Kafka, Hive, Sqoop, Ambari, Talend, Oozie, Cassandra, QlikSense, Jenkins, Hortonworks.
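
A minimal sketch of the Kafka-to-Cassandra streaming path described in this section. The resume cites Scala for this work; the sketch below uses Spark's Java API (spark-streaming-kafka-0-10) and the DataStax Java driver for consistency with the other examples, and the broker, topic, keyspace, and table names are hypothetical.

```java
import java.util.*;
import org.apache.kafka.clients.consumer.ConsumerRecord;
import org.apache.kafka.common.serialization.StringDeserializer;
import org.apache.spark.SparkConf;
import org.apache.spark.streaming.Durations;
import org.apache.spark.streaming.api.java.*;
import org.apache.spark.streaming.kafka010.*;
import com.datastax.driver.core.Cluster;
import com.datastax.driver.core.PreparedStatement;
import com.datastax.driver.core.Session;

public class KafkaToCassandraJob {
    public static void main(String[] args) throws InterruptedException {
        SparkConf conf = new SparkConf().setAppName("KafkaToCassandra");
        JavaStreamingContext jssc = new JavaStreamingContext(conf, Durations.seconds(10));

        Map<String, Object> kafkaParams = new HashMap<>();
        kafkaParams.put("bootstrap.servers", "broker1:9092");        // hypothetical broker
        kafkaParams.put("key.deserializer", StringDeserializer.class);
        kafkaParams.put("value.deserializer", StringDeserializer.class);
        kafkaParams.put("group.id", "events-consumer");
        kafkaParams.put("auto.offset.reset", "latest");

        JavaInputDStream<ConsumerRecord<String, String>> stream = KafkaUtils.createDirectStream(
                jssc,
                LocationStrategies.PreferConsistent(),
                ConsumerStrategies.<String, String>Subscribe(
                        Collections.singletonList("events"), kafkaParams));  // hypothetical topic

        // Transform the raw messages, then persist each partition to Cassandra.
        stream.map(ConsumerRecord::value)
              .foreachRDD(rdd -> rdd.foreachPartition(records -> {
                  // For the sketch, one connection per partition; a real job would
                  // reuse a pooled session (e.g. via the spark-cassandra-connector).
                  try (Cluster cluster = Cluster.builder().addContactPoint("cassandra-host").build();
                       Session session = cluster.connect("events_ks")) {     // hypothetical keyspace
                      PreparedStatement insert = session.prepare(
                              "INSERT INTO raw_events (id, payload) VALUES (uuid(), ?)");
                      records.forEachRemaining(payload -> session.execute(insert.bind(payload)));
                  }
              }));

        jssc.start();
        jssc.awaitTermination();
    }
}
```

A production version of this job would also manage Kafka offsets (for example through checkpointing) so the pipeline can recover cleanly after a restart.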
Confidential, Bowie, Maryland
Hadoop Developer
Responsibilities:
- Designed and developed analytic systems to extract meaningful data from large-scale structured and unstructured health data.
- Created Sqoop jobs to populate data present in relational databases to hive tables.
- Developed UDFs in Java to enhance the functionality of Pig and Hive scripts.
- Solved performance issues in Pig and Hive scripts with a deep understanding of joins, groups, and aggregations and of how these operations translate into MapReduce jobs.
- Involved in creating Hive external tables, loading data, and writing Hive queries.
- Stored the processed data in HBase for faster querying and random access (see the sketch after this section).
- Defined job flows using the Azkaban scheduler to automate Hadoop jobs and installed ZooKeeper for automatic node failover.
- Managed and reviewed Hadoop log files to find the source of job failures and debugged the scripts for code optimization.
- Developed complex MapReduce programs to analyze data that exists on the cluster.
- Developed processes to load data from server logs into HDFS using Flume and to load data from the UNIX file system into HDFS.
- Built a platform to query and display the analysis results in dashboards using QlikSense.
- Used the Apache Hue web interface to monitor the Hadoop cluster and run jobs.
- Implemented Apache Sentry for role-based authorization to access the data.
- Developed shell scripts to automate routine DBA tasks (e.g., data refreshes, backups).
- Involved in the performance tuning for Pig Scripts and Hive Queries.
Environment: HDFS, MapReduce, Pig, Hive, Sqoop, Flume, HBase, Azkaban, QlikSense, Java, Maven, Git, Cloudera, Eclipse, and Shell Scripting.
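
A minimal sketch of the HBase random-access pattern mentioned in this section, using the standard HBase Java client; the table name, column family, and row-key scheme are hypothetical.

```java
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.hbase.HBaseConfiguration;
import org.apache.hadoop.hbase.TableName;
import org.apache.hadoop.hbase.client.Connection;
import org.apache.hadoop.hbase.client.ConnectionFactory;
import org.apache.hadoop.hbase.client.Get;
import org.apache.hadoop.hbase.client.Put;
import org.apache.hadoop.hbase.client.Result;
import org.apache.hadoop.hbase.client.Table;
import org.apache.hadoop.hbase.util.Bytes;

public class ProcessedDataStore {
    public static void main(String[] args) throws Exception {
        Configuration conf = HBaseConfiguration.create();   // reads hbase-site.xml from the classpath
        try (Connection connection = ConnectionFactory.createConnection(conf);
             Table table = connection.getTable(TableName.valueOf("processed_claims"))) { // hypothetical table

            // Write one processed record, keyed by member id + date for fast lookups.
            Put put = new Put(Bytes.toBytes("M1001#2016-03-01"));
            put.addColumn(Bytes.toBytes("d"), Bytes.toBytes("visit_count"), Bytes.toBytes("4"));
            table.put(put);

            // Random read of the same row.
            Result result = table.get(new Get(Bytes.toBytes("M1001#2016-03-01")));
            byte[] value = result.getValue(Bytes.toBytes("d"), Bytes.toBytes("visit_count"));
            System.out.println("visit_count = " + Bytes.toString(value));
        }
    }
}
```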
Confidential, Atlanta, Georgia
Hadoop Developer
Responsibilities:
- Worked on a Cloudera Hadoop cluster of 50 data nodes running Red Hat Enterprise Linux.
- Involved in loading data from UNIX file system to HDFS using Shell Scripting.
- Imported and exported data between an Oracle 10.2 database and HDFS using Sqoop.
- Developed ETL processes to load data from multiple data sources into HDFS using Sqoop and analyzed the data using MapReduce, Hive, and Pig Latin.
- Developed MapReduce jobs in Python for data cleaning and data processing.
- Developed custom UDFs for Pig scripts to clean unstructured data and used different joins and groups where required to optimize the Pig scripts.
- Created Hive external tables on top of the processed data to easily manage and query the data using HiveQL.
- Involved in performance tuning of Hive queries by implementing dynamic partitions and buckets in Hive to improve performance.
- Integrated MapReduce with HBase to bulk-import data using MR programs.
- Used Flume to collect and aggregate web log data from different sources such as web servers and push it to HDFS.
- Wrote MapReduce jobs in Java to parse the web logs stored in HDFS and used MRUnit to test and debug the MapReduce programs (see the sketch after this section).
- Implemented the workflows using Apache Oozie framework to automate tasks.
- Coordinated with team in resolving the issues technically as well as functionally.
Environment: Cloudera, HDFS, MapReduce, Pig, Hive, Flume, Sqoop, HBase, Oozie, Maven, Git, Java, Python and Linux.
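
A minimal sketch of a web-log-parsing MapReduce job like the one described in this section: it counts requests per HTTP status code. The class names and the assumption that the status code is the ninth space-separated field (as in common Apache access-log formats) are illustrative, not taken from the actual project.

```java
import java.io.IOException;
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.IntWritable;
import org.apache.hadoop.io.LongWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Job;
import org.apache.hadoop.mapreduce.Mapper;
import org.apache.hadoop.mapreduce.Reducer;
import org.apache.hadoop.mapreduce.lib.input.FileInputFormat;
import org.apache.hadoop.mapreduce.lib.output.FileOutputFormat;

public class StatusCodeCount {

    public static class LogMapper extends Mapper<LongWritable, Text, Text, IntWritable> {
        private static final IntWritable ONE = new IntWritable(1);
        private final Text statusCode = new Text();

        @Override
        protected void map(LongWritable key, Text value, Context context)
                throws IOException, InterruptedException {
            String[] fields = value.toString().split(" ");
            if (fields.length > 8) {                 // common log format: status is field 9
                statusCode.set(fields[8]);
                context.write(statusCode, ONE);
            }
        }
    }

    public static class SumReducer extends Reducer<Text, IntWritable, Text, IntWritable> {
        @Override
        protected void reduce(Text key, Iterable<IntWritable> values, Context context)
                throws IOException, InterruptedException {
            int sum = 0;
            for (IntWritable v : values) {
                sum += v.get();
            }
            context.write(key, new IntWritable(sum));
        }
    }

    public static void main(String[] args) throws Exception {
        Job job = Job.getInstance(new Configuration(), "status code count");
        job.setJarByClass(StatusCodeCount.class);
        job.setMapperClass(LogMapper.class);
        job.setCombinerClass(SumReducer.class);
        job.setReducerClass(SumReducer.class);
        job.setOutputKeyClass(Text.class);
        job.setOutputValueClass(IntWritable.class);
        FileInputFormat.addInputPath(job, new Path(args[0]));
        FileOutputFormat.setOutputPath(job, new Path(args[1]));
        System.exit(job.waitForCompletion(true) ? 0 : 1);
    }
}
```

MRUnit's MapDriver and ReduceDriver harnesses would then exercise LogMapper and SumReducer against small in-memory inputs, in line with the testing bullet above.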
Confidential
Java Developer
Responsibilities:
- Designed and implemented the training and reports modules of the application using Servlets, JSP and Ajax.
- Developed custom JSP tags for the application.
- Wrote queries for fetching and manipulating data using the iBATIS ORM framework.
- Used the Quartz scheduler to run jobs sequentially at given times (see the sketch after this section).
- Implemented design patterns such as Filter, Cache Manager, and Singleton to improve the performance of the application.
- Implemented the reports module of the application using Jasper Reports to display dynamically generated reports for business intelligence.
- Deployed the application at the client's location on a Tomcat server.
Environment: HTML, JavaScript, Ajax, Java, Servlets, JSP, iBATIS, Tomcat Server, SQL Server, Jasper Reports.
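
A minimal sketch of the Quartz scheduling mentioned in this section (Quartz 2.x API); the job class and cron expression are hypothetical.

```java
import org.quartz.CronScheduleBuilder;
import org.quartz.Job;
import org.quartz.JobBuilder;
import org.quartz.JobDetail;
import org.quartz.JobExecutionContext;
import org.quartz.Scheduler;
import org.quartz.Trigger;
import org.quartz.TriggerBuilder;
import org.quartz.impl.StdSchedulerFactory;

public class ReportScheduling {

    // Hypothetical job: regenerates the nightly reports data.
    public static class NightlyReportJob implements Job {
        @Override
        public void execute(JobExecutionContext context) {
            System.out.println("Generating nightly reports...");
        }
    }

    public static void main(String[] args) throws Exception {
        JobDetail job = JobBuilder.newJob(NightlyReportJob.class)
                .withIdentity("nightlyReport", "reports")
                .build();

        // Fire every day at 2:00 AM.
        Trigger trigger = TriggerBuilder.newTrigger()
                .withIdentity("nightlyReportTrigger", "reports")
                .withSchedule(CronScheduleBuilder.cronSchedule("0 0 2 * * ?"))
                .build();

        Scheduler scheduler = StdSchedulerFactory.getDefaultScheduler();
        scheduler.start();
        scheduler.scheduleJob(job, trigger);
    }
}
```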
Confidential
Java Developer
Responsibilities:
- Involved in the design and development of the project using Java and J2EE technologies, following an MVC architecture in which JSPs serve as views and Servlets as controllers.
- Designed network and use-case diagrams in StarUML to model the workflow.
- Wrote server-side programs using RESTful web services to handle requests coming from different types of devices, such as iOS clients.
- Implemented design patterns such as Cache Manager and Factory classes to improve the performance of the application (see the sketch after this section).
- Used the Hibernate ORM tool to store and retrieve data from a PostgreSQL database.
- Involved in writing test cases for the application using JUnit.
- Followed the Agile software development process for this project to achieve fast development.
Environment: JSP, Spring MVC, Spring Security, Servlets, Ajax, RESTful, Hibernate, Design Patterns, StarUML, Eclipse and PostgreSQL.
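
A minimal sketch of the Cache Manager pattern cited in this section: a thread-safe singleton wrapping a ConcurrentHashMap. The class name and cached types are hypothetical.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

// Singleton cache manager: one shared in-memory cache for the application.
public final class CacheManager {

    private static final CacheManager INSTANCE = new CacheManager();

    private final Map<String, Object> cache = new ConcurrentHashMap<>();

    private CacheManager() {
        // private constructor prevents outside instantiation
    }

    public static CacheManager getInstance() {
        return INSTANCE;
    }

    public void put(String key, Object value) {
        cache.put(key, value);
    }

    @SuppressWarnings("unchecked")
    public <T> T get(String key) {
        return (T) cache.get(key);
    }

    public void evict(String key) {
        cache.remove(key);
    }
}
```

A caller would fetch a cached lookup table with CacheManager.getInstance().get("stateCodes") and fall back to the database on a miss.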
