Sr Big Data / Hadoop Administrator Resume
Washington, DC
SUMMARY:
- Around 9 years of experience in the IT industry, with strong experience in Big Data/Hadoop.
- 4+ years of experience in Big Data/Hadoop administration and development.
- Hands-on experience in designing and implementing solutions using Hadoop 2.4.0, HDFS 2.7, MapReduce2, HBase 1.1, Hive 1.2, Oozie 4.2.0, Tez 0.7.0, YARN 2.7.0, Sqoop 1.4.6, Solr, ZooKeeper, and MongoDB.
- Knowledge of implementing Hortonworks (HDP 2.1, 2.3, and 2.4), Cloudera (CDH3, CDH4, CDH5), and MapR on Linux.
- Experience in configuring NameNode high availability and NameNode federation.
- Experience in Disaster recovery and Backup activities
- Experience in Multi-node setup of Hadoop cluster
- Experience in Performance tuning and benchmarking of Hadoop Cluster
- Experience in Monitoring, maintenance and troubleshooting of Hadoop cluster.
- Experience in Security integration of Hadoop Cluster.
- Good knowledge on Kerberos Security.
- Setting up and integrating Hadoop ecosystem tools such as HBase, Hive, Pig, and Sqoop.
- Familiar with installing and configuring Solr 5.2.1 in Hadoop cluster and implementation of Solr collections.
- Familiar with writing Oozie workflows and job controllers for job automation, including Hive automation.
- Experience in importing and exporting data between relational databases such as MySQL and HDFS/HBase using Sqoop (see the sketch at the end of this summary).
- Strong knowledge in configuring High Availability for Name Node, HBase, Hive and Resource Manager
- Experience in deploying and managing the multi-node development and production Hadoop cluster with different Hadoop components (HIVE, PIG, SQOOP, OOZIE, FLUME, HCATALOG, HBASE, ZOOKEEPER) using Hortonworks Ambari.
- Hands-on experience in installation, configuration, support, and management of Hadoop clusters using Hortonworks, Cloudera, and plain Apache Hadoop distributions.
- Hadoop Cluster capacity planning, performance tuning, cluster Monitoring, Troubleshooting.
- Excellent command of creating backup, recovery, and disaster recovery procedures, and implementing backup and recovery strategies for offline and online backups.
- Involved in benchmarking Hadoop/HBase cluster file systems with various batch jobs and workloads.
- Prepared Hadoop clusters for development teams working on POCs.
- Experience in minor and major upgrades of Hadoop and the Hadoop ecosystem.
- Experience monitoring and troubleshooting issues with Linux memory, CPU, OS, storage and network
- Hands-on experience in analyzing log files for Hadoop and ecosystem services and finding root causes.
- Experience on Commissioning, Decommissioning, Balancing, and Managing Nodes and tuning server for optimal performance of the cluster.
- Experience in importing and exporting the data using Sqoop from HDFS to Relational Database systems/mainframe and vice-versa.
- Experience in importing and exporting the logs using Flume.
- Hands-on experience in Linux admin activities on RHEL and CentOS.
- Experience in deploying Hadoop 2.0(YARN).
- Good knowledge on cluster monitoring tools like Ganglia and Nagios.
- In depth understanding/knowledge of Hadoop Architecture and various components such as HDFS, Job Tracker, Task Tracker, Name Node, Data Node and MapReduce concepts
- Hands on experience on Unix/Linux environments, which included software installations/upgrades, shell scripting for job automation and other maintenance activities.
- Experience in developing Oozie workflows and job controllers for job automation, including Hive automation and scheduling jobs in the Hue browser.
- Experience in developing Hive queries and optimizing them by setting up different queues.
- Gained optimal performance with data compression, region splits, and manually managed compactions in HBase.
- Experience in upgrading HDP clusters from HDP 2.1 to HDP 2.2 and then to HDP 2.3.
- Worked on upgrading cluster from HDP 2.3 to HDP 2.4
- Working experience in Map Reduce programming model and Hadoop Distributed File System.
- Thorough knowledge and experience in SQL and PL/SQL concepts.
- Sound knowledge of Oracle 9i, Core Java, JSP, and Servlets.
- Dedication: willingness to walk the extra mile to achieve excellence.
- Good Knowledge on database stored procedures, functions and Triggers.
- Enthusiasm: High level of motivation.
- Scheduling: Time sense.
- Self-starter and team player, capable of working independently and motivating a team of professionals.
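Illustrative Sqoop commands for the import/export work described above (a minimal sketch; hostnames, database, table, and HDFS paths are placeholders, not actual project values):

    # Import a MySQL table into HDFS (placeholder connection details)
    sqoop import \
      --connect jdbc:mysql://dbhost:3306/salesdb \
      --username etl_user -P \
      --table customers \
      --target-dir /data/raw/customers \
      --num-mappers 4

    # Export processed results from HDFS back into MySQL
    sqoop export \
      --connect jdbc:mysql://dbhost:3306/salesdb \
      --username etl_user -P \
      --table customers_agg \
      --export-dir /data/curated/customers_agg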
TECHNICAL SKILLS:
Programming Languages: Core Java, C++
Distribution Frameworks: Hadoop
Hadoop Distributions: Hortonworks (HDP 2.1, 2.3, and 2.4), Cloudera (CDH 4.7, 5.4)
Hadoop Technologies: MapReduce, HBase 0.98, Hive 0.13, Sqoop 1.4.4, Pig 0.12.1, Oozie 4.0.0
J2EE Components: Servlets, JSP.
Frameworks: Hibernate.
Operating Systems: Windows 2000/XP, Linux & Unix
RDBMS: Oracle 9i, 10g, MySQL
Scripting Languages: JavaScript
Markup Languages: HTML
Web/Application Servers: Tomcat 6.0, Weblogic 8.1
IDE: Eclipse
PROFESSIONAL EXPERIENCE:
Confidential,Washington,DC
Sr Big Data / Hadoop Administrator
Responsibilities:- Designed, installed, configured, and maintained an HDP 2.4 cluster for application development and production.
- Designed multi-node clusters for the production environment based on future data growth.
- Performed cluster sizing exercises with stakeholders to understand data ingestion patterns and provided recommendations.
- Designed and implemented non-production multi-node environments.
- Upgraded clusters from HDP 2.1 to HDP 2.3.
- Responsible for Cluster maintenance, Adding and removing cluster nodes, Cluster Monitoring, troubleshooting, manage and review data backups, and manage and review Hadoop log files.
- Loading data from SAP DSO to Hadoop environment using Sqoop.
- Developed Hive queries and created necessary views to implement update process.
- Responsible for Hadoop Cluster monitoring using the tools like Nagios, Ganglia and Ambari.
- Worked with development teams to deploy Oozie workflow jobs to run multiple Hive and Pig jobs which run independently with time and data availability.
- Wrote shell scripts to monitor the health of Hadoop daemon services and respond to warning or failure conditions (see the sketch at the end of this section).
- Deployed a Kafka cluster with a separate ZooKeeper ensemble to enable real-time processing of data using Spark Streaming and storage in HBase.
- Implemented the Capacity Scheduler to securely share available resources among multiple groups.
- Set up HDFS quotas and resource quotas for different groups in a multi-tenant environment.
- Analyzed data using Pig and wrote Pig scripts for grouping, joining, and sorting the data.
- Secured Hadoop Cluster by implementing Kerberos with Active Directory.
- Integrated Active Directory for authorization of users and groups using Ranger, and implemented perimeter security using Apache Knox.
- Collected log data from web servers and ingested it into HDFS using Flume.
- Environment: Hadoop, HBase, HDFS, Hive, Java (jdk1.6), Pig, Zookeeper, Oozie, Flume.
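A minimal sketch of the kind of daemon health-check script mentioned above (the service list, alert address, and use of a local mail command are illustrative assumptions, not the actual production script):

    #!/bin/bash
    # Check that core Hadoop daemons are running on this host and alert if any are missing.
    SERVICES="NameNode DataNode ResourceManager NodeManager"
    ALERT_TO="hadoop-ops@example.com"   # placeholder recipient

    for svc in $SERVICES; do
      # jps lists running JVMs; grep for the daemon's main class name
      if ! jps | grep -qw "$svc"; then
        echo "$(date) WARNING: $svc is not running on $(hostname)" \
          | mail -s "Hadoop daemon alert: $svc down" "$ALERT_TO"
      fi
    done

    # Basic HDFS health summary for the monitoring log
    hdfs dfsadmin -report | head -n 20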
Confidential
Sr Big Data / Hadoop Administrator
Responsibilities:- Installed and configured Hadoop MapReduce and HDFS; developed multiple MapReduce jobs in Java for data cleaning and preprocessing.
- Worked on installing and configuring Hortonworks HDP 2.x and Cloudera (CDH 5.5.1) clusters in development and production environments.
- Worked on Capacity planning for the Production Cluster
- Installed HUE Browser.
- Involved in loading data from UNIX file system to HDFS using Sqoop.
- Involved in creating Hive tables, loading data, and writing Hive queries that run internally as MapReduce jobs.
- Worked on installation of Hortonworks HDP 2.1 on Azure Linux servers.
- Worked on Configuring Oozie Jobs.
- Worked on Configuring High Availability for Name Node in HDP 2.1.
- Worked on Configuring Kerberos Authentication in the cluster.
- Worked on cluster upgrades from HDP 2.1 to HDP 2.3.
- Worked on Configuring queues in capacity scheduler.
- Worked on installing and configuring Solr 5.2.1 in Hadoop cluster.
- Worked on taking snapshot backups of HBase tables (see the sketch at the end of this section).
- Worked on troubleshooting and fixing Hadoop cluster issues.
- Involved in cluster monitoring, backup, restore, and troubleshooting activities.
- Responsible for implementation and ongoing administration of Hadoop infrastructure
- Managed and reviewed Hadoop log files.
- Imported and exported data between relational databases such as MySQL and HDFS/HBase using Sqoop.
- Worked on indexing HBase tables using Solr, including indexing JSON and nested data.
- Worked on configuring queues in Oozie scheduler
- Worked on performance optimization of Hive queries.
- Worked on performance tuning at the cluster level.
- Worked on adding users to the clusters.
- Responsible for Cluster maintenance, Monitoring, commissioning and decommissioning Data nodes, Troubleshooting, Manage and review data backups, Manage & review log files.
- Day-to-day responsibilities included resolving developer issues, deploying code from one environment to another, providing access to new users, and providing quick solutions to reduce impact while documenting them to prevent future issues.
- Added and removed components through Ambari.
- Collaborating with application teams to install operating system and Hadoop updates, patches, version upgrades.
- Monitored workload and job performance, and performed capacity planning.
- Involved in analyzing system failures, identifying root causes, and recommending courses of action.
- Created and deployed corresponding SolrCloud collections.
- Created collections and configurations, and registered Lily HBase Indexer configurations with the Lily HBase Indexer Service.
- Created and managed cron jobs.
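A sketch of the HBase snapshot backups and cron scheduling described above (table name, snapshot naming, destination cluster, and script path are assumptions for illustration):

    # Take a dated snapshot of an HBase table through the HBase shell
    echo "snapshot 'customer_events', 'customer_events_$(date +%Y%m%d)'" | hbase shell

    # Optionally copy the snapshot to another cluster for disaster recovery
    hbase org.apache.hadoop.hbase.snapshot.ExportSnapshot \
      -snapshot customer_events_$(date +%Y%m%d) \
      -copy-to hdfs://dr-cluster:8020/hbase \
      -mappers 4

    # Example crontab entry running a backup script nightly at 1 AM
    0 1 * * * /opt/scripts/hbase_snapshot_backup.sh >> /var/log/hbase_backup.log 2>&1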
Confidential, Atlanta,GA
Sr Big Data / Hadoop Administrator
Responsibilities:- Installed and configured Hadoop MapReduce and HDFS; developed multiple MapReduce jobs in Java for data cleaning and preprocessing.
- Involved in loading data from UNIX file system to HDFS.
- Installed and configured Hive and wrote Hive UDFs.
- Involved in creating Hive tables, loading them with data, and writing Hive queries that run internally as MapReduce jobs (see the sketch at the end of this section).
- Gained very good business knowledge on health insurance, claim processing, fraud suspect identification, appeals process etc.
- Responsible for Cluster maintenance, Monitoring, commissioning and decommissioning Data nodes, Troubleshooting, Manage and review data backups, Manage & review log files.
- Day-to-day responsibilities included resolving developer issues, deploying code from one environment to another, providing access to new users, and providing quick solutions to reduce impact while documenting them to prevent future issues.
- Added and removed components through Ambari.
- Collaborating with application teams to install operating system and Hadoop updates, patches, version upgrades.
- Monitored workload and job performance, and performed capacity planning.
- Involved in analyzing system failures, identifying root causes, and recommending courses of action.
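A minimal sketch of the load-to-HDFS and Hive table work described above; the paths, table name, and schema are hypothetical (the claims domain is borrowed from the business context mentioned in this section):

    # Copy files from the local UNIX file system into HDFS
    hdfs dfs -mkdir -p /data/claims/raw
    hdfs dfs -put /staging/claims/*.csv /data/claims/raw/

    # Create an external Hive table over the loaded files and run a query
    # (on this cluster version, Hive queries execute as MapReduce jobs)
    hive -e "
      CREATE EXTERNAL TABLE IF NOT EXISTS claims_raw (
        claim_id STRING, member_id STRING, amount DOUBLE, claim_date STRING)
      ROW FORMAT DELIMITED FIELDS TERMINATED BY ','
      LOCATION '/data/claims/raw';
      SELECT claim_date, COUNT(*) FROM claims_raw GROUP BY claim_date;"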
Confidential
Big Data / Hadoop Administrator
Responsibilities:- Installed and configured Hadoop on a cluster.
- Wrote multiple Java-based MapReduce jobs for data cleaning and preprocessing.
- Experienced in defining job flows using Oozie
- Experienced in managing and reviewing Hadoop log files
- Loaded and transformed large sets of structured, semi-structured, and unstructured data.
- Responsible for managing data coming from different sources and applications.
- Supported MapReduce programs running on the cluster.
- Involved in loading data from UNIX file system to HDFS.
- Installed and configured Hive and wrote Hive UDFs.
- Involved in creating Hive tables, loading them with data, and writing Hive queries that run internally as MapReduce jobs.
- Responsible for cluster maintenance, monitoring, commissioning and decommissioning data nodes (see the sketch at the end of this section), troubleshooting, managing and reviewing data backups, and managing and reviewing log files.
- Day-to-day responsibilities included resolving developer issues, deploying code from one environment to another, providing access to new users, and providing quick solutions to reduce impact while documenting them to prevent future issues.
- Added and removed components through Ambari.
- Collaborating with application teams to install operating system and Hadoop updates, patches, version upgrades.
- Monitored workload and job performance, and performed capacity planning.
- Involved in analyzing system failures, identifying root causes, and recommending courses of action.
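A sketch of the DataNode decommissioning flow referenced above; the exclude-file paths and hostname are assumptions, and the files must already be referenced by dfs.hosts.exclude and yarn.resourcemanager.nodes.exclude-path in the cluster configuration:

    # Add the host to the HDFS exclude file (path assumed from dfs.hosts.exclude)
    echo "datanode07.example.com" >> /etc/hadoop/conf/dfs.exclude

    # Tell the NameNode to re-read the include/exclude lists; the node moves to
    # "Decommission In Progress" while its blocks are re-replicated elsewhere
    hdfs dfsadmin -refreshNodes

    # Watch progress until the node shows "Decommissioned"
    hdfs dfsadmin -report | grep -A 2 datanode07.example.com

    # Decommission the NodeManager on the YARN side as well
    echo "datanode07.example.com" >> /etc/hadoop/conf/yarn.exclude
    yarn rmadmin -refreshNodes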
Confidential
Java J2EE Developer
Responsibilities:- Involved in collecting requirements for enhancements and new functionality.
- Coding, Unit testing and SIT.
- Involved in code reviews.
- Coded the business methods according to the IBM Rational Rose UML model.
- Used the Apache Log4j logging framework for trace logging and auditing.
- Extensively used Core Java, Servlets, JSP and XML.
- Used Struts 1.2 in presentation tier.
- Used IBM WebSphere Application Server.
- Generated the Hibernate XML and Java Mappings for the schemas
- Used DB2 Database to store the system data
- Used IBM Rational Clearcase as the version controller.
- Used Asynchronous JavaScript and XML (AJAX) for better and faster interactive Front-End.
- Used Rational Application Developer (RAD) as Integrated Development Environment (IDE).
- Performed unit testing for all components using JUnit.
Confidential
Java J2EE Developer
Responsibilities:- Designed Entegrate screens with Java Swing for displaying the transactions.
- Involved in the development of code for connecting to the database using JDBC, with the help of Oracle JDeveloper 9i.
- Involved in the development of database coding including Procedures, Triggers in Oracle.
- Worked as Research Assistant and a Development Team Member
- Coordinated with Business Analysts to gather the requirement and prepare data flow diagrams and technical documents.
- Identified Use Cases and generated Class, Sequence and State diagrams using UML.
- Used JMS for the asynchronous exchange of critical business data and events among J2EE components and legacy system.
- Worked on designing, coding, and maintaining entity beans and session beans using the EJB 2.1 specification.
- Worked on development of the web interface using the Struts MVC framework.
- Developed the user interface using JSP and tag libraries, CSS, HTML, and JavaScript.
- Made database connections using properties files.
- Used a session filter to implement timeouts for idle users.
- Used Stored Procedure to interact with database.
- Developed the persistence layer using the DAO pattern and the Hibernate framework.
- Used Log4j for logging.