
Hadoop Administrator Resume


Bridgewater, NJ

SUMMARY

  • Over 8 years of professional work experience in the IT industry, including around 5 years of experience with Hadoop, HDFS, MapReduce and the Hadoop ecosystem (Pig, Hive, HBase).
  • Excellent understanding of Hadoop architecture, the MapReduce programming paradigm and components such as HDFS, JobTracker, TaskTracker, NameNode and DataNode.
  • Experience with leveraging Hadoop ecosystem components including Pig and Hive for data analysis, Sqoop for data migration, Oozie for scheduling and HBase as a NoSQL data store.
  • Good exposure to Apache Hadoop MapReduce programming, Pig scripting, distributed applications and HDFS.
  • Experience with the NoSQL databases MongoDB and Cassandra. Hands-on experience installing, configuring and using Hadoop ecosystem components such as MapReduce, HDFS, HBase, Oozie, Hive, Sqoop, Pig and Flume.
  • Experience in installing, configuring, supporting and managing Cloudera's Hadoop platform, including CDH3 and CDH4 clusters.
  • Experience in managing and reviewing Hadoop log files.
  • Experience in analyzing data using HiveQL, Pig Latin, HBase and custom MapReduce programs in Java.
  • Extending Hive and Pig core functionality by writing custom UDFs (see the sketch after this list).
  • Experience in data management and implementation of Big Data applications using Hadoop frameworks.
  • Experience in importing and exporting data using Sqoop from HDFS to Relational Database Systems and vice-versa.
  • Strong experience in installation and configuration of Cloudera Distribution of Hadoop (CDH) 4 and 5 and Hortonworks Data Platform 2.1 and 2.2.
  • 4 years of experience with Red Hat Linux.
  • Extensive experience in installation, configuration, maintenance, design, development, implementation and support on Red Hat Linux 3, 4, 5 and 6.
  • General Linux system administration including design, configuration, installs, automation, and monitoring
  • Extensive experience in Red Hat Linux / UNIX administration and Enterprise Server Integration.
  • Involved in and executed migration tasks, such as SAN migrations (both host-based and array-based) from Solaris to Linux.
  • Experienced in deployment of Hadoop Cluster using Puppet tool.
  • Experience with Hadoop shell commands, writing MapReduce programs, and verifying, managing and reviewing Hadoop log files.
  • Proficient in configuring Zookeeper, Cassandra & Flume to the existing Hadoop cluster.
  • Strong analytical skills with ability to quickly understand client’s business needs and create specifications.
  • Strong knowledge of Mahout machine learning and MongoDB. Experience working with geographically dispersed teams.
  • Data warehousing ETL experience in Informatica Power Center 9.0/8.6/8.5/8.1.1/8.0/7.1.2/7.1.1/7.0/6.2 and Power Mart 6.2/5.1.1/5.0/4.7.2.
  • Experience in the implementation of full-lifecycle data warehouses and business data marts with Star and Snowflake schemas.
  • Relational and Dimensional Data Modeling experience.
  • Extensive experience in Data warehousing tools including Informatica, Business Objects using different databases.
  • Extensively worked on data warehousing with data marts, data modeling, data cleansing, the ETL tool Informatica, and data analysis and reporting tools.
  • Experience in working with data extraction, transformation and loading using Informatica Power Center.
  • Experience working with Informatica modules such as Server Manager, Repository Manager and Designer.
  • Expertise in administering the Informatica Repository, including creating and managing users.
  • Extensive experience in supporting production tasks, data enhancements and code fixes.
  • Good working Knowledge in OOA and OOD, using UML and designing use cases.
  • Hands on experience in application development using Java, RDBMS, and Linux shell scripting.
  • Well versed in Object Oriented languages like C++ and Java.
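To illustrate the custom UDF work noted above, here is a minimal Hive UDF sketch in Java; the class name, the trim/lower-case rule and the registration details in the note below are assumptions for illustration, not details from the projects described later.

```java
import org.apache.hadoop.hive.ql.exec.UDF;
import org.apache.hadoop.io.Text;

// Minimal Hive UDF sketch: trims and lower-cases a string column.
// Class name and cleaning rule are illustrative only.
public final class NormalizeText extends UDF {
    public Text evaluate(Text input) {
        if (input == null) {
            return null; // pass NULLs through unchanged
        }
        return new Text(input.toString().trim().toLowerCase());
    }
}
```

In a Hive session such a UDF would typically be registered with ADD JAR and CREATE TEMPORARY FUNCTION before being used in queries (jar and function names assumed).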

TECHNICAL SKILLS

Hadoop / Big Data: HDFS, MapReduce, Hive, Pig, HBase, Sqoop, Flume, Oozie, Zookeeper, MongoDB, Cassandra, Hortonworks

Programming and Scripting languages: Java, SQL, Unix Shell Scripting.

ETL Informatica Tools: Informatica Power Center 9.0/8.6/8.5/8.1.1/7.1.3/6.2/5.0, Informatica Power Mart 5.1, Power Analyzer 3.5x

Database: MS-SQL Server, Oracle 11g/10g/9i, MySQL, DB2, MS-Access

Frameworks: MVC, Struts, Spring, Hibernate

Web Technologies: TCP/IP, UDP, HTTP

Application Server: Apache Tomcat, JBoss, Web Sphere, Web Logic

Operating Systems: Windows, Red Hat Linux, UNIX

PROFESSIONAL EXPERIENCE

Confidential, Bridgewater NJ

Hadoop Administrator

Environment: Windows 7/Linux, Hadoop 2.0, YARN, SharePoint 2014, AWS, Hive, Pig, MapReduce, Sqoop, Zookeeper, TFS, VS 2015, HDFS, Putty, MySQL, Cloudera, Agile

Responsibilities:

  • Strong knowledge of multi-clustered environments and setting up the Cloudera Hadoop ecosystem. Experience in installation, configuration and management of Hadoop clusters.
  • Experience writing MapReduce jobs and HiveQL; worked with the Enterprise Analytics team to transform analytics requirements into Hadoop-centric technologies (see the MapReduce sketch after this list).
  • Successfully performed installation of CDH4 and CDH5 (Cloudera's Distribution including Apache Hadoop) through Cloudera Manager.
  • Installed and configured Hive and HBase.
  • Worked on a live 100-node Hadoop cluster running on Hortonworks.
  • Experienced in setting up a Hortonworks cluster and installing all the ecosystem components through Ambari and manually from the command line.
  • Identity, Authorization and Authentication including Kerberos Setup.
  • Working with data delivery teams to setup new Hadoop users. This job includes setting up Linux users, setting up Kerberos principals and testing HDFS, Hive, Pig and MapReduce access for the new users.
  • Performed both major and minor upgrades to the existing Cloudera Hadoop clusters.
  • Involved in moving all log files generated from various sources to HDFS for further processing.
  • Wrote Apache Pig scripts to process the HDFS data.
  • Successfully performed installation of Analytics tools Datameer and Platfora.
  • Performed Hadoop clusters administration through Cloudera Manager.
  • Audited and reported Hadoop operations and activities using Cloudera Navigator.
  • Enabled speedy reviews and first mover advantages by using Oozie to automate data loading into the Hadoop Distributed File System and PIG to pre-process the data.
  • Provided design recommendations and thought leadership to sponsors/stakeholders that improved review processes and resolved technical problems.
  • Working on POC and implementation & integration of Cloudera & Hortonworks for multiple clients.
  • Working on Hadoop ecosystem components Map Reduce, YARN, Hive, SQOOP, PIG, HBase, ZooKeeper and Flume.
  • Utilized the Apache Hadoop environment provided by Hortonworks.
  • Managed and reviewed Hadoop log files.
  • Tested raw data and executed performance scripts.
  • Shared responsibility for administration of Hadoop, Hive and Pig.
  • Installed and configured Flume, Sqoop, Pig, Hive, HBase on Hadoop clusters.
  • Managed Hadoop clusters, including adding and removing cluster nodes for maintenance and capacity needs.
  • Administered a heterogeneous environment comprising MS Windows 2003 Server, Red Hat Linux. Administration involved installation, configuration, evaluation, implementation and support of strategic Business systems in a heterogeneous environment.
  • Experience supporting Redhat Linux servers running Oracle databases.
  • Provide detailed reporting of work as required by project status reports.
  • Managed project aspects, acting as the gatekeeper to the production environment.
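As a sketch of the MapReduce work over log files staged in HDFS referenced in the list above, the following Java job counts log lines per severity level; the class names and the assumed "date time LEVEL message" log layout are hypothetical.

```java
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.IntWritable;
import org.apache.hadoop.io.LongWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Job;
import org.apache.hadoop.mapreduce.Mapper;
import org.apache.hadoop.mapreduce.Reducer;
import org.apache.hadoop.mapreduce.lib.input.FileInputFormat;
import org.apache.hadoop.mapreduce.lib.output.FileOutputFormat;

import java.io.IOException;

// Counts log lines per severity level (e.g. INFO/WARN/ERROR) from files staged in HDFS.
public class LogLevelCount {

    public static class LevelMapper extends Mapper<LongWritable, Text, Text, IntWritable> {
        private static final IntWritable ONE = new IntWritable(1);
        private final Text level = new Text();

        @Override
        protected void map(LongWritable key, Text value, Context context)
                throws IOException, InterruptedException {
            String[] fields = value.toString().split("\\s+");
            if (fields.length > 2) {            // assumes "date time LEVEL message" layout
                level.set(fields[2]);
                context.write(level, ONE);
            }
        }
    }

    public static class SumReducer extends Reducer<Text, IntWritable, Text, IntWritable> {
        @Override
        protected void reduce(Text key, Iterable<IntWritable> values, Context context)
                throws IOException, InterruptedException {
            int sum = 0;
            for (IntWritable v : values) {
                sum += v.get();
            }
            context.write(key, new IntWritable(sum));
        }
    }

    public static void main(String[] args) throws Exception {
        Job job = Job.getInstance(new Configuration(), "log level count");
        job.setJarByClass(LogLevelCount.class);
        job.setMapperClass(LevelMapper.class);
        job.setCombinerClass(SumReducer.class);
        job.setReducerClass(SumReducer.class);
        job.setOutputKeyClass(Text.class);
        job.setOutputValueClass(IntWritable.class);
        FileInputFormat.addInputPath(job, new Path(args[0]));
        FileOutputFormat.setOutputPath(job, new Path(args[1]));
        System.exit(job.waitForCompletion(true) ? 0 : 1);
    }
}
```

A job like this would normally be packaged into a jar and submitted with the hadoop jar command, passing input and output HDFS paths as arguments.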

Confidential, Folsom, CA

Hadoop Administrator

Environment: Windows 7/Linux, Hadoop 2.0, SharePoint 2013, Eclipse, HDFS, Hive, Pig, MapReduce, Sqoop, HBase, Zookeeper, HPALM, Putty, MySQL, Cloudera, Agile

Responsibilities:

  • Designed and implemented ETL processes in the Hadoop ecosystem.
  • Hands-on experience with Hortonworks, as well as with migrating data from Oracle using Sqoop.
  • Installation, configuration and administration experience with the Big Data platforms Cloudera CDH, Hortonworks Ambari and Apache Hadoop on Red Hat and CentOS, used as data storage, retrieval and processing systems.
  • Importing and exporting data into HDFS using Sqoop.
  • Installed and configured Hadoop MapReduce, HDFS and developed multiple MapReduce jobs in Java for data cleansing and preprocessing.
  • Very good experience with both MapReduce 1 (JobTracker/TaskTracker) and MapReduce 2 (YARN).
  • Implemented a de-duplication process to avoid duplicates in the daily load (see the sketch after this list).
  • Design and implementation of delta data load systems in Hive, which increased efficiency by more than 60%.
  • Management of Linux user accounts, groups, directories and file permissions.
  • Installation and administration of Cloudera Hadoop (CDH 4.1.0) on a 10-node Linux cluster.
  • Monitored workload, job performance and capacity planning using Cloudera Manager.
  • Monitored all MapReduce read jobs running on the cluster using Cloudera Manager and ensured that they were able to read the data from HDFS without any issues.
  • Loaded daily data from websites to Hadoop cluster by using Flume.
  • Involved in loading data from UNIX file system to HDFS.
  • Configuration and administration of DNS, NFS and Sendmail on Red Hat Linux.
  • Installation and configuration of Linux for new build environment.
  • Created volume groups, logical volumes and partitions on the Linux servers and mounted file systems on the created partitions.
  • Experience with Linux internals, virtual machines, and open source tools/platforms.
  • Improve system performance by working with the development team to analyze, identify and resolve issues quickly.
  • Working with data delivery teams to set up new Hadoop users. This includes setting up Linux users, setting up Kerberos principals and testing HDFS and Hive access for the new users.
  • Worked on High Availability for NameNode using Cloudera Manager to avoid single point of failure.
  • Design and implementation of pattern mining application using Mahout FP-Growth Algorithm.
  • Developed several advanced Map Reduce programs as part of functional requirements.
  • Developed Hive scripts as part of functional requirements.
  • Implemented Oozie workflows for Map Reduce, Hive and Sqoop actions.
  • Developed and deployed several web services on the Digital Airline platform for the processed data.
  • Good knowledge of Splunk.
  • Successfully integrated Hive tables with MySQL database.
  • Experience working on NoSQL databases including Hbase.
  • Involved in using Kerberos authentication.
  • Experienced in service monitoring, service and log management, auditing and alerts, Hadoop platform security and configuring Kerberos.
  • Experience in deploying code changes using TeamCity builds.
  • Involved in handling code fixes during production release.
  • Implemented Hadoop cluster on Ubuntu Linux.
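A minimal sketch of the de-duplication idea noted in the list above, assuming duplicates are exact line-level matches; in practice the key would more likely be a business identifier, so treat the class and key choices as illustrative.

```java
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.LongWritable;
import org.apache.hadoop.io.NullWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Job;
import org.apache.hadoop.mapreduce.Mapper;
import org.apache.hadoop.mapreduce.Reducer;
import org.apache.hadoop.mapreduce.lib.input.FileInputFormat;
import org.apache.hadoop.mapreduce.lib.output.FileOutputFormat;

import java.io.IOException;

// Emits each record as a key so the shuffle groups exact duplicates; the reducer then writes each record once.
public class DedupJob {

    public static class RecordMapper extends Mapper<LongWritable, Text, Text, NullWritable> {
        @Override
        protected void map(LongWritable key, Text value, Context context)
                throws IOException, InterruptedException {
            context.write(value, NullWritable.get());
        }
    }

    public static class DedupReducer extends Reducer<Text, NullWritable, Text, NullWritable> {
        @Override
        protected void reduce(Text key, Iterable<NullWritable> values, Context context)
                throws IOException, InterruptedException {
            context.write(key, NullWritable.get());
        }
    }

    public static void main(String[] args) throws Exception {
        Job job = Job.getInstance(new Configuration(), "daily load de-duplication");
        job.setJarByClass(DedupJob.class);
        job.setMapperClass(RecordMapper.class);
        job.setReducerClass(DedupReducer.class);
        job.setOutputKeyClass(Text.class);
        job.setOutputValueClass(NullWritable.class);
        FileInputFormat.addInputPath(job, new Path(args[0]));
        FileOutputFormat.setOutputPath(job, new Path(args[1]));
        System.exit(job.waitForCompletion(true) ? 0 : 1);
    }
}
```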

Confidential, Houston

Hadoop Administrator

Environment: Windows 7/Linux, SharePoint 2013, Apache 1.2, Eclipse, Pig, Hive, Flume, Sqoop, HBase, Putty, HPALM, WinSCP, Agile, Cloudera, HDFS

Responsibilities:

  • Worked on Cloudera to search/analyze real time data.
  • Responsible for building scalable distributed data solutions using Hadoop
  • Extensive experience in writing Pig scripts to transform raw data from several data sources into baseline data.
  • Developed several advanced Map Reduce programs to process data files received.
  • Developed Pig scripts, Pig UDFs, Hive scripts and Hive UDFs to load data files into Hadoop (see the Pig UDF sketch after this list).
  • Continuously monitored and managed the Hadoop cluster through Cloudera Manager.
  • Administered YARN, Sqoop, Flume, Hive, Spark, Storm, Pig, Oozie, Zookeeper and Cloudera Manager.
  • Extracted feeds from social media sites such as Twitter.
  • Designed, configured and managed the backup and disaster recovery for HDFS data.
  • Developed Hive scripts for end user / analyst requirements for ad hoc analysis.
  • Involved in loading data from UNIX file system to HDFS.
  • Worked on Tableau for generating reports on HDFS data.
  • Implemented HDFS snapshot feature.
  • Involved in gathering business requirements and prepared detailed specifications that follow project guidelines required to develop written programs.
  • Very good understanding of partitioning and bucketing concepts in Hive; designed both managed and external tables in Hive for optimized performance.
  • Usage of Sqoop to import data into HDFS from MySQL database and vice-versa.
  • Bulk loaded data into the HBase NoSQL store.
  • Provisioning Red Hat Enterprise Linux Server using PXE Boot according to requirements.
  • Performed Red Hat Linux Kickstart installations on Red Hat 4.x/5.x, and performed Red Hat Linux kernel tuning and memory upgrades.
  • Developed Java programs to apply verbatim cleaning rules for responses.
  • Experience in storing and retrieving documents in Apache Tomcat.
  • Used Sqoop to import data into HDFS and Hive from other data systems.
  • Knowledge transfers sessions on the developed applications to colleagues.
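For the Pig UDF work referenced above, a minimal Java EvalFunc sketch; the class name and the character-stripping rule are assumptions for illustration.

```java
import org.apache.pig.EvalFunc;
import org.apache.pig.data.Tuple;

import java.io.IOException;

// Pig EvalFunc sketch: strips non-alphanumeric characters from a chararray field
// before the data is loaded onward into Hadoop tables.
public class CleanField extends EvalFunc<String> {
    @Override
    public String exec(Tuple input) throws IOException {
        if (input == null || input.size() == 0 || input.get(0) == null) {
            return null; // nothing to clean
        }
        return input.get(0).toString().replaceAll("[^A-Za-z0-9 ]", "").trim();
    }
}
```

In a Pig script the jar would be REGISTERed and the function applied inside a FOREACH ... GENERATE statement (jar, relation and field names assumed).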

Confidential, Warren, MI

Java Developer

Environment: Windows 7, SharePoint 2010, SQL, Java, SCRUM, JSP, Visual Studio, Agile/scrum, Eclipse

Responsibilities:

  • Involved in complete requirement analysis, design, coding and testing phases of the project.
  • Implemented the project according to the Software Development Life Cycle (SDLC).
  • Developed JavaScript behavior code for user interaction.
  • Used HTML, JavaScript, and JSP and developed UI.
  • Used JDBC to manage connectivity for inserting, querying and data management, including stored procedures and triggers (see the JDBC sketch after this list).
  • Designed the logical and physical data model, generated DDL scripts, and wrote DML scripts for the SQL Server database.
  • Part of a team, which is responsible for metadata maintenance and synchronization of data from database.
  • Involved in the design and coding of the data capture templates, presentation and component templates.
  • Developed an API to write XML documents from database.
  • Used JavaScript to design the user interface and perform validation checks.
  • Developed JUnit test cases and validated user input using regular expressions in JavaScript as well as on the server side.
  • Developed complex SQL stored procedures, functions and triggers.
  • Mapped business objects to database using Hibernate.
  • Wrote SQL queries, stored procedures and database triggers as required on the database objects
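A minimal sketch of the JDBC data-access pattern referenced above (a parameterized query plus a stored-procedure call); the table, column, procedure name and connection URL are hypothetical.

```java
import java.sql.CallableStatement;
import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.PreparedStatement;
import java.sql.ResultSet;
import java.sql.SQLException;

// Minimal JDBC data-access helper: a parameterized query plus a stored-procedure call.
public class AccountDao {

    private final String url;      // e.g. jdbc:sqlserver://host:1433;databaseName=accounts (hypothetical)
    private final String user;
    private final String password;

    public AccountDao(String url, String user, String password) {
        this.url = url;
        this.user = user;
        this.password = password;
    }

    public String findAccountName(int accountId) throws SQLException {
        String sql = "SELECT name FROM account WHERE id = ?";
        try (Connection con = DriverManager.getConnection(url, user, password);
             PreparedStatement ps = con.prepareStatement(sql)) {
            ps.setInt(1, accountId);
            try (ResultSet rs = ps.executeQuery()) {
                return rs.next() ? rs.getString("name") : null;
            }
        }
    }

    public void archiveAccount(int accountId) throws SQLException {
        // Calls a stored procedure; the procedure name is illustrative only.
        try (Connection con = DriverManager.getConnection(url, user, password);
             CallableStatement cs = con.prepareCall("{call usp_archive_account(?)}")) {
            cs.setInt(1, accountId);
            cs.execute();
        }
    }
}
```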

Confidential, NC

Java Developer

Environment: HPALM, Java, PL/SQL, XML, Agile/scrum, MS SharePoint 2010, Junit, JSP, XML

Responsibilities:

  • Involved in almost all the phases of SDLC.
  • Complete involvement in Requirement Analysis and documentation on Requirement Specification.
  • Developed a prototype based on the requirements using the Struts 2 framework as part of a POC (Proof of Concept).
  • Prepared use-case diagrams, class diagrams and sequence diagrams as part of requirement specification documentation.
  • Involved in design of the core implementation logic using MVC architecture.
  • Used Apache Maven to build and configure the application.
  • Configured the struts.xml file with the required action mappings for all the required services.
  • Developed implementation logic using the Struts 2 framework.
  • Developed JAX-WS web services to provide services to other systems (see the sketch after this list).
  • Developed JAX-WS clients to consume a few of the services provided by other systems.
  • Involved in developing EJB 3.0 stateless session beans for the business tier to expose business logic to the services component as well as the web tier.
  • Implemented Hibernate at DAO layer by configuring hibernate configuration file for different databases.
  • Developed business services to utilize Hibernate service classes that connect to the database and perform the required action.
  • Developed JSP pages using struts JSP-tags and in-house tags to meet business requirements.
  • Developed JavaScript validations to validate form fields.
  • Performed unit testing for the developed code using JUnit.
  • Developed design documents for the code developed.
  • Used SVN repository for version control of the developed code.
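A minimal JAX-WS sketch of the kind of web service referenced above; the service name, operation and endpoint URL are assumptions for illustration.

```java
import javax.jws.WebMethod;
import javax.jws.WebService;
import javax.xml.ws.Endpoint;

// JAX-WS service sketch: annotation-driven SEI and implementation in one class.
@WebService
public class OrderStatusService {

    @WebMethod
    public String getStatus(String orderId) {
        // The lookup is stubbed here; a real implementation would delegate to the service/DAO layer.
        return orderId == null ? "UNKNOWN" : "PROCESSING";
    }

    public static void main(String[] args) {
        // Endpoint.publish is convenient for local testing; in the project the service
        // would be deployed on the application server instead.
        Endpoint.publish("http://localhost:8080/services/orderStatus", new OrderStatusService());
    }
}
```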

Confidential, Township of Warren, NJ

Jr Java Developer

Environment: Java, JDK 1.5, J2EE, JavaBeans, Spring 2.5, Hibernate 3.3, Servlets, ExtJS, JSP, JSF, EJB 3.0, Web Logic 10.0, Eclipse 3.4, Oracle 10g, Log4j, Maven, Ant 1.7.0, JUnit4.4, Windows, JavaScript, CSS, HTML, XML, VSS

Responsibilities:

  • Gathered requirements, developed system, performed system and integration testing
  • Developed classes in the account selector front-end, data access layer and service layer
  • Involved in design discussions, doing UI mockups
  • Communicated requirements to our offshore team, and integrated and reviewed their code
  • Implemented validations using the JSF validation framework and JavaScript
  • Developed Web pages using JSP, CSS and JavaScript
  • Used Asynchronous JavaScript and XML (AJAX) for better, faster interactive Front-End
  • Developed Servlets which act as controllers in the MVC architecture
  • Implemented AOP concept using Spring MVC Framework
  • Implemented middleware framework using Hibernate and Spring Framework
  • Used both SQL and HQL as the query languages in Hibernate Mapping
  • Implemented Spring ORM with Hibernate and Spring AOP for declarative transactions using Spring proxy beans (see the sketch after this list)
  • Implemented design patterns like business delegate, business objects and data access objects
  • Used multithreading in programming to improve overall performance
  • Implemented logs for error tracking using Log4J
  • Implemented agile methodology
  • Used VSS for version control throughout the application.
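A minimal sketch of the Spring ORM / Hibernate pattern referenced above, using declarative transactions via Spring's @Transactional; the generic DAO shape is illustrative and assumes the entity mappings and SessionFactory wiring are defined elsewhere.

```java
import java.io.Serializable;
import java.util.List;

import org.hibernate.SessionFactory;
import org.springframework.transaction.annotation.Transactional;

// Generic Hibernate-backed DAO; transactions are applied declaratively through Spring proxies.
// Entity classes (mapped via hbm.xml or annotations) are assumed to be registered with the SessionFactory.
public class GenericHibernateDao<T> {

    private final SessionFactory sessionFactory;
    private final Class<T> entityClass;

    public GenericHibernateDao(SessionFactory sessionFactory, Class<T> entityClass) {
        this.sessionFactory = sessionFactory;
        this.entityClass = entityClass;
    }

    @Transactional
    public void save(T entity) {
        sessionFactory.getCurrentSession().saveOrUpdate(entity);
    }

    @Transactional(readOnly = true)
    @SuppressWarnings("unchecked")
    public T findById(Serializable id) {
        return (T) sessionFactory.getCurrentSession().get(entityClass, id);
    }

    @Transactional(readOnly = true)
    @SuppressWarnings("unchecked")
    public List<T> findAll() {
        // HQL over the mapped entity name; relies on the mapping being present at runtime.
        return sessionFactory.getCurrentSession()
                .createQuery("from " + entityClass.getName())
                .list();
    }
}
```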
