Hadoop Administrator Resume

Alpharetta, Georgia

SUMMARY

  • 8+ years of Professional experience in IT Software and Services in design, development and testing various applications in telecom and financial domains.
  • Expertise in HDFS, MapReduce, YARN, Hive, HBase, Pig, Phoenix, Sqoop, Flume, ZooKeeper, Oozie and various other ecosystem components, with knowledge of Hadoop cluster setup, integrations, and installations.
  • Good understanding of Business Intelligence, ETL transformations and Hadoop clusters.
  • Experience in building data Ingestion, extraction, transformation for various datasets onto HDFS and Hive for data processing.
  • Experience with Cloudera's Distribution of Hadoop (CDH) and Hortonworks, and knowledge of MapR.
  • Experience in deploying, maintaining, monitoring, troubleshooting and upgrading Hadoop Clusters.
  • Strong knowledge of Hadoop components like Hadoop MapReduce, HDFS, YARN, ZooKeeper, Oozie, Hive, HBase, Sqoop, Pig, Flume.
  • Good understanding of Hadoop HDFS architecture, cluster planning and the MapReduce framework, both MRv1 and MRv2 (YARN).
  • Experience in importing and exporting data between different databases and HDFS using Sqoop and performing data transformations using Hive and Pig.
  • Excellent skills in writing SQL and Hive queries.
  • Experience in configuring and enabling cluster coordination services with ZooKeeper.
  • Experience in using Flume to collect large volumes of log data and load log files into HDFS, and Oozie to configure job flows.
  • Experience in setting up and working with HBase.
  • Strong knowledge of setting up High Availability and recovering NameNode metadata residing in the cluster.
  • Experience in commissioning and decommissioning nodes, performing major and minor cluster upgrades to the latest release, and installing Hadoop patches.
  • Experience in configuring and enabling Kerberos security and LDAP integration.
  • Excellent knowledge of developing shell scripts to check file system health and automate system management tasks.
  • Experienced in Performance Monitoring and Tuning of Hadoop cluster.
  • Experienced in analyzing big data using Hadoop environment.
  • Knowledge of Spark installation and configuration. Experience with Talend Big Data Platform Studio; implemented ETL transformation flows for financial audits.
  • Implemented Splunk Enterprise environment in HDP Cluster for log aggregation to analyze ecosystem issues.
  • Implemented AppDynamics in HDP to understand the cluster behavior for alerting and monitoring.
  • Good understanding of Java Object Oriented Concepts and development of multi-tier enterprise web applications.
  • Knowledge of cloud services like Azure and AWS EC2 instances.
  • Experience with operating systems like Windows, Linux, and Macintosh.
  • Good understanding of the complete SDLC and STLC.
  • Strong troubleshooting and production support skills, and the ability to interact with end users.
  • Motivated to take independent responsibility, with the ability to contribute as a productive team member.
  • Self-motivated, with a strong desire to learn, and an effective team player.

TECHNICAL SKILLS

Hadoop Ecosystem: Hadoop 2.7.1, MapReduce, YARN, HBase 1.1.2, Sqoop 1.4.6, Oozie 4.2.0, Hive 1.2.1, Pig 0.15.0, ZooKeeper 3.4.6, Splunk, AppDynamics, AWS EC2, Talend Big Data Platform 6.2, Flume, Kafka, NDM, SFTP, MoveIt, Spark, Solr

Cluster: Hortonworks 2.4, Cloudera CDH5.7, IBM Big Insights 4.1

Languages: Core Java, J2EE, C, JSP, EAI, Shell Script

Web Technologies: HTML5, CSS3, JavaScript, jQuery, XML, XHTML

Servers: Putty, WebSphere, WebLogic, JBoss, Apache Tomcat.

Database: MySQL, Oracle, PL/SQL, NoSQL (HBase)

IDEs: Eclipse Mars.1

PROFESSIONAL EXPERIENCE

Confidential - Alpharetta, Georgia

Hadoop Administrator

Responsibilities:

  • Ingest SAP and generic client data into HDFS via the EY Helix UI web application.
  • Create clients, engagements, workspaces and file sets for each client's financial audit year to analyse GL, PP and RR for meaningful business insights.
  • Responsible for implementation and ongoing administration of Hadoop infrastructure.
  • Working with data delivery teams to set up new Hadoop users. This job includes setting up Linux users, setting up Kerberos principals and testing HDFS, Hive, Pig and MapReduce access for the new users.
  • Import data from HDFS to Hive to do ETL Transformations based on individual client’s data.
  • SAP data transformation can be performed via Talend ETL. Generic data (Oracle, CSV etc.) transformations performed via Oozie workflow.
  • Writing Hive UDFs for complex transformations in staging.
  • Experience in loading raw files into HDFS and moving the data to a Hive database for analytics.
  • Developing Talend packages using HQL and creating Hive Data warehousing in Hadoop.
  • Used Hive to do analysis on the data and identify different correlations.
  • Tuning MapReduce jobs in HDP to improve job performance.
  • Parallelized sequential jobs in Talend and optimized query performance for the SAP dataset.
  • ETL data provision from various stages to CDM and RDM Transformation HBase
  • Coordinate the MTK central to upload the various templates on Spot-fire to generate reports.
  • Experience with the log management tool Splunk Enterprise; provided Splunk search queries to aggregate log events and analyse the root causes of issues.
  • Experience with AppDynamics, a DevOps application, to monitor the cluster environment and health and to generate reports, alerts w.r.t
  • Integrated AppDynamics and Splunk to monitor the cluster and analyse events.
  • Used Spot-fire for visualizing and to generate reports.
  • Weekly meetings with technical collaborators and active participation in code review sessions with senior and junior developers.
  • Migrating the cluster to the cloud: ongoing implementation of Azure Big Data Platform.

Environment: HDP 2.4 (Cent OS): HUEY(Helix User Interface), Map Reduce/YARN, Hive 1.2.1, HBase 1.1.2, Phoenix, Splunk, AppDynamics, Oozie 4.2.0, Ranger 0.6.0, Ambari 2.4.0, Spot-Fire, TFS(Team Foundation Server), Git, Db Visualizer.

Confidential - Fort Worth, Texas

Hadoop Administration /Support

Responsibilities:

  • Worked on a live Big Data Hadoop production environment with 100 nodes.
  • Loaded data into the cluster from dynamically generated files using Flume and from relational database management systems using Sqoop.
  • Actively involved in installing, configuring and monitoring HDP Enterprise cluster.
  • Participated in configuring Hadoop cluster on AWS.
  • Developed cost effective and fault tolerant systems using AWS EC2 instances, auto scaling and Elastic Load Balance.
  • Responsible for commissioning and decommissioning, troubleshooting and load balancing.
  • Implemented High Availability for Name Node and configured zookeeper for coordination services.
  • Imported and exported data in and out of HDFS and Hive using Sqoop; also loaded data from the UNIX file system into HDFS.
  • Integrated the SAS and Tableau tools with Hadoop for easier access to data in HDFS and Hive.
  • Installed and configured HIVE, involved in writing HIVE UDFs.
  • Set up 60 node Cloudera 5.4 Disaster Recovery cluster.
  • Successfully upgraded the Hadoop cluster from HDP 2.3 to HDP 2.4.
  • Supported the management and review of Hadoop log files and data backups.
  • Implemented Kerberos and LDAP for authentication and security of the Hadoop cluster.
  • Enabled resource management using fair scheduler.
  • Developed shell scripts for monitoring file system health, running the balancer and automating other tasks.
  • Involved in writing Python scripts to move data into S3 buckets.
  • Provided highly available and durable data using AWS S3 data store.

Environment: HDP 2.3 (Cent OS): Apache Hadoop 2.7.1, HDFS, MapReduce/YARN, HBase 1.1.2, Sqoop 1.4.6, Oozie 4.2.0, Hive 1.2.1, NoSQL, ETL, MySQL, Teradata, AWS (Amazon Web Services)

Confidential, Fairfax, VA

Hadoop Administrator

Responsibilities:

  • Configured Hadoop components including Hive, Pig, HBase, Sqoop, Oozie and Hue in the client environment.
  • Responsible to manage data coming from different sources and involved in HDFS maintenance and loading of structured and unstructured data.
  • Developed MapReduce programs to parse the raw data, populate staging tables and store the refined data in partitioned tables in HDFS.
  • Cluster maintenance as well as creation and removal of nodes using tools like Cloudera Manager Enterprise.
  • Manage and review Hadoop log files.
  • File system management and monitoring.
  • HDFS support and maintenance.
  • Diligently teaming with the infrastructure, network, database, application and business intelligence teams to guarantee high data quality and availability.
  • Performance tuning of Hadoop clusters and Hadoop MapReduce routines.
  • Monitor Hadoop cluster job performance and perform capacity planning.
  • Monitor Hadoop cluster connectivity and security.
  • Created Hive queries that helped market analysts spot emerging trends by comparing fresh data with HDFS reference tables and historical metrics.
  • Enabled speedy reviews and first mover advantages by defining the job flow in Oozie to automate data loading into the Hadoop Distributed File System and PIG to pre-process the data.
  • Designed HBase schema to avoid Hotspotting and exposed the data from HBase tables to REST API on UI.
  • Developed Pig scripts to transform raw data from several data sources into baseline data and loaded the data into HBase tables.
  • Involved in creating POCs to ingest and process streaming data using Spark and HDFS.
  • Used Flume to collect, aggregate, and store the log data from different web servers.
  • Developed Shell Scripts to automate the batch processing and processed the daily jobs through Maestro scheduler.
  • Provided design recommendations and thought leadership to sponsors/stakeholders that improved review processes and resolved technical problems.
  • Coordinated with the offshore team and cross-functional teams to ensure that applications are properly tested, configured, and deployed.
  • Used Tableau for visualizing and to generate reports.

Environment: CDH 5.7.0 (Cent OS): Apache Hadoop 2.7.1, MapReduce, HBase 1.1.2, Pig 0.15.0, Sqoop 1.4.6, Oozie 4.2.0, Java 8, Autosys, Hive 1.2.1, Impala, ZooKeeper 3.4.6, Oracle 11g, PL/SQL, SQL Developer 4.0, UNIX, REST API, Web Services REST, SQL, ANT, Shell Script, JAVA, J2EE

Confidential

Big Data -Technical Lead

Responsibilities:

  • Participated in brainstorming sessions on finalizing the data ingestion requirements and design.
  • Worked on SQOOP to import data from various relational data sources.
  • Worked with Flume to bring in click stream data from front facing application logs.
  • Worked on strategizing SQOOP jobs to parallelize data loads from source systems.
  • Participated in providing inputs for design of the ingestion patterns.
  • Participated in strategizing loads without impacting front facing applications.
  • Worked on design on Hive data store to store the data from various data sources.
  • Developed MapReduce and MRUnit jobs to operate on streaming data.
  • Involved in providing inputs to analyst team for functional testing.
  • Worked with source system load testing teams to perform loads while ingestion jobs are in progress.
  • Worked on performing data standardization using PIG scripts.
  • Worked on building analytical data stores for data science team’s model development.
  • Worked on design and development of Oozie workflows to orchestrate Pig and Hive jobs.
  • Worked on performance tuning of Hive queries with partitioning and bucketing.
  • Worked with the Ambari UI to configure alerts for Hadoop ecosystem components.
  • Participated in tuning various components in Hadoop Eco System.

Environment: Hadoop 2.2 (Hortonworks) - PIG 0.14.0, Map Reduce, Hive 0.14.0, TEZ 0.5.2, HDFS 2.6.0, Apache Sqoop 1.4.5, Oozie 4.1.0, HBase 0.98.4, Ambari 2.0.0, JUnit, Zookeeper, Maven, Hadoop Data Lake with Linux-Cent OS.

Confidential

Technical Lead

Responsibilities:

  • Involved in data loading strategies by using Sqoop for importing and exporting the data from HDFS to Relational Database systems and vice-versa.
  • Involved in design and creating Hive external tables. Optimized the query performance by partitioning & bucketing.
  • Involved in writing Pig scripts for data cleansing.
  • Developed Pig Latin scripts by using HCatalog to read the data from hdfs and load into the hive tables.
  • Developed Hive queries for the data analysts.
  • Developed workflows in Oozie to automate the tasks of loading the data into HDFS and pre-processing with Pig and MR.
  • Wrote extensive JUnit test cases for this system to test the application.
  • Implemented a logging mechanism using Log4j with the help of the Spring AOP framework.

Environment: IBM Big Insights, MapReduce, Hive, PIG, Sqoop, Oozie, Zookeeper, Oracle 11g, SQL Developer 4.1.3, Unix/Shell Scripting.

Confidential 

Application Developer

Responsibilities:

  • Implemented CDR batch processing for MSC, IN, ADSL, and GPRS streams using object-oriented programming.
  • Involved in the review and analysis of the Functional Specifications, and Requirements Clarification Defects etc.
  • Involved in the analysis and design of the initiatives.
  • Involved in the development of the mediation platform using Java.
  • Involved in design and implementation of migration of Mediation legacy platform to mediation zone.
  • Involved in writing JUnit tests for unit testing.
  • Involved in Regression Testing of SAT and Development Environments.
  • Involved in writing the SQL queries and stored procedures.
  • Participated in the test case reviews, and manual testing of the enhancements during Release 1.5.
  • Involved in fixing the defects during integration testing.
  • Built and deployed the application to dev and testing environments using Ant scripts.
  • Participated in code reviews for various initiatives and performed static code analysis to follow best practices for performance and security.

Environment: Mediation Zone (Digital Route), Legacy Zone, Core Java 6

Confidential 

Application Developer & Tester

Responsibilities:

  • Analyzing the High Level Design (HLD) and Business Requirements.
  • Providing daily updates to the on-site team over call and making enhancements.
  • Implemented SQL and PL/SQL scripts including Stored Procedures, functions, packages and triggers.
  • Interpreting product requirements into test requirements, writing test plans and test cases.
  • Made enhancements to the application which presented me with the opportunity to go through the entire SDLC.
  • Preparing the Test Cases and Test Execution Plan.
  • Code coverage and Test case presentation.
  • Involved in Test Planning and Test case execution.
  • Preparing TPI in the test planning phase and the execution phase.
  • Raised defects in QC.
  • Reviewing the test plan document, so that no functionality or requirements are missing.
  • Used Web Services like SOAP and WSDL to communicate over internet.
  • Automated the tests using the integration tool (SOAP-UI).

Environment: WebMethods, SOAP-UI, Siebel, Citrix, Putty, IBM Rational - ClearQuest.
