
Hadoop Administrator Resume


MD

SUMMARY

  • 8+ years of professional experience in IT, including 4+ years of work experience in Big Data, Hadoop and Ecosystem Analytics.
  • Passionate about working in Big Data and Analytics environments.
  • Well versed in Installation, Configuration, Supporting and Managing of Big Data and underlying infrastructure of Hadoop Cluster.
  • Experience in designing and implementing complete end-to-end Hadoop infrastructure using MapReduce, Spark, Kafka, Pig, Hive, Impala, Sqoop, Oozie, Flume and HBase.
  • In-depth knowledge of Hadoop architecture and Hadoop daemons such as NameNode, Secondary NameNode, DataNode, JobTracker and TaskTracker.
  • Experience in writing MapReduce programs using Apache Hadoop for analyzing Big Data.
  • Hands-on experience in writing ad-hoc queries for moving data from HDFS to Hive and analyzing the data using HiveQL.
  • Experience in importing and exporting data using Sqoop from HDFS to Relational Database Systems and vice-versa.
  • Experience in writing Hadoop Jobs for analyzing data using Pig Latin Commands.
  • Experience in Integrating Hive and Sqoop with HBase and analyzing data in HBase.
  • Good Knowledge in NoSQL Databases like HBase, Cassandra and MongoDB.
  • Knowledge in extending Hive and Pig core functionality by writing custom UDFs like UDAFs and UDTFs.
  • Real-time data streaming using Spark and Kafka.
  • Good working experience in PySpark and Spark SQL.
  • Familiar with Scala, closures, higher order functions, monads.
  • Knowledge of administrative tasks such as installing Hadoop and its ecosystem components such as Hive and Pig in Pseudo-Distributed Mode.
  • Experience in understanding the security requirements for Hadoop and integrating with Kerberos authentication and authorization infrastructure.
  • Experience in using Apache Flume for collecting, aggregating and moving large amounts of data from application servers.
  • Experience in using Zookeeper and Oozie Operational Services for coordinating the cluster and scheduling workflows.
  • Good Knowledge in configuring and monitoring tools like Ganglia and Nagios.
  • Good knowledge of Amazon AWS concepts like EMR and EC2 web services, which provide fast and efficient processing of Big Data.
  • Experience in launching EC2 instances in Amazon EMR using the console.
  • Knowledge of reporting tools like Tableau, which is used for analytics on data in the cloud.
  • Extensive experience with SQL, PL/SQL and database concepts.
  • Experience in developing applications using Java & J2EE technologies.
  • Extensive Knowledge in Java, J2EE, Servlets, JSP, JDBC, Struts and Spring Framework.
  • Experience in working with popular frameworks like Struts 2.0, Hibernate 3.0, Spring IOC, and Spring MVC.
  • Experience in Web Services using XML, HTML and SOAP.
  • Experience in using version control management tools like CVS, SVN and Rational Clear Case.
  • Experience in loading data to HDFS from UNIX (Ubuntu, Fedora, CentOS) file systems.
  • Highly motivated, self-starter with a positive attitude, willingness to learn new concepts and acceptance of challenges.

TECHNICAL SKILLS

Big Data Ecosystem: HDFS, MapReduce, Spark, Spark SQL, Scala, Impala, YARN, Hive, Pig, Zookeeper, Sqoop, Oozie, Flume, Kafka.

Web Technologies: HTML, XML, JDBC, JSP, JavaScript, AJAX, REST, SOAP, jQuery.

Methodologies: Agile, UML, Design Patterns (Core Java and J2EE)

NOSQL Technologies: HBase, MongoDB, Cassandra

Databases: Oracle 11g/10g/9i, DB2, MS SQL Server, Confidential, IBM, MySQL, MS Access

Frameworks: MVC, Hibernate

Languages: C, C++, Java, SQL, PL/SQL, Python, Scala, Unix shell scripting, VB

Web Servers: WebLogic 10.3, WebSphere 6.1, Apache Tomcat 5.5/6.0.

Tools & Utilities: Eclipse, PuTTY, Cygwin, MS Office, Crystal Reports, Access Report Designer, SVN, Git, Maven, Jira.

Reporting Tool: Tableau

Operating System: Windows XP, UNIX (Solaris, Linux)

Software Package: MS Office 2010.

PROFESSIONAL EXPERIENCE

Confidential, MD

Hadoop Administrator

Responsibilities:

  • Involved in all phases of the Software Development Life Cycle (SDLC) and worked on all activities related to the development, implementation, administration and support of Hadoop.
  • Installed and configured Apache Hadoop clusters for application development, along with Hadoop tools like Hive, Pig, HBase, Zookeeper and Sqoop.
  • Implemented multiple MapReduce jobs in Java for data cleansing and pre-processing.
  • Worked with the team to expand the cluster; additional data nodes were configured through the Hadoop commissioning process.
  • Responsible for cluster maintenance, adding and removing cluster nodes, cluster monitoring and troubleshooting, and managing and reviewing data backups and log files.
  • Worked with the systems engineering team to plan and deploy new Hadoop environments and expand existing Hadoop clusters.
  • Managed and scheduled jobs on a Hadoop cluster.
  • Involved in defining job flows, managing and reviewing log files.
  • Installed Oozie workflow engine to run multiple Map Reduce, Hive HQL and Pig jobs.
  • Collected the log data from web servers and integrated into HDFS using Flume.
  • Involved in HDFS maintenance and administering it through the Hadoop Java API.
  • Created and maintained technical documentation for launching Hadoop clusters and for executing Hive queries and Pig scripts.
  • Experience in managing and reviewing Hadoop log files.
  • Worked on setting up the Kerberos installation.
  • Installed Oozie workflow engine to run multiple Hive and Pig jobs.
  • Analyzed large amounts of data sets to determine optimal way to aggregate and report on it.
  • Supported setting up the QA environment and updating configurations for implementing scripts with Pig and Sqoop. Worked on tuning the performance of Pig queries.
  • Implemented best income logic using Pig scripts and UDFs.
  • Performed component unit testing using the Azure Emulator.
  • Analyzed escalated incidents within the Azure SQL database. Implemented test scripts to support test-driven development and continuous integration.

Environment: Hadoop, MapReduce, Spark, Kafka, HDFS, ZooKeeper, Hive, Pig, Oozie, Core Java, Eclipse, HBase, Sqoop, Flume, Oracle 11g, Knox, SQL, SharePoint, Ranger, UNIX Shell Scripting.

Confidential, MD

Hadoop Administrator

Responsibilities:

  • Installed and configured Hadoop MapReduce and HDFS, and developed multiple MapReduce jobs.
  • Deployed a Hadoop cluster and integrated it with Nagios and Ganglia.
  • Extensively involved in cluster capacity planning, hardware planning, installation and performance tuning of the Hadoop cluster.
  • Worked on installing the cluster, commissioning and decommissioning of DataNodes, NameNode recovery, capacity planning, Cassandra and slots configuration.
  • Hands-on experience in provisioning and managing multi-node Hadoop clusters on the Amazon Web Services (AWS) EC2 public cloud and on private cloud infrastructure.
  • Monitored multiple cluster environments using Metrics and Nagios.
  • Experienced in securing Hadoop clusters with Kerberos.
  • Transferred data from a MySQL database to HDFS and vice-versa using Sqoop.
  • Used Ganglia and Nagios to monitor the cluster around the clock.
  • Copied data from one cluster to another using DistCp, and automated the procedure using shell scripts.
  • Involved in moving all log files generated from various sources to HDFS for further processing through Flume.
  • Worked on analyzing data with Hive and Pig.
  • Implemented Kerberos for authenticating all the services in the Hadoop cluster.
  • Configured ZooKeeper to implement node coordination in support of clustering.
  • Configured Flume for efficiently collecting, aggregating and moving large amounts of log data from many different sources to HDFS.

Environment: HDFS, MapReduce, Hive, Sqoop, Pig, Cloudera, Flume, SQL Server, UNIX, RedHat and CentOS.

Confidential, OH

Hadoop Administrator

Responsibilities:

  • Responsible for cluster maintenance, adding and removing cluster nodes, cluster monitoring and troubleshooting, managing and reviewing data backups, and managing and reviewing Hadoop log files on clusters.
  • Responsible for onboarding new users to the Hadoop cluster (adding a home directory for each user and providing access to the datasets).
  • Played a responsible role in deciding the hardware configurations for the cluster along with other teams in the company.
  • Resolved user-submitted tickets and P1 issues; troubleshot, documented and resolved errors.
  • Experienced in writing automated scripts for monitoring the file systems and key MapR services.
  • Responsible for giving presentations about new ecosystems to be implemented in the cluster with the teams and managers.
  • Helped the users in production deployments throughout the process.
  • Managed and reviewed Hadoop log files as part of administration for troubleshooting purposes. Communicated and escalated issues appropriately.
  • Applied patches to cluster.
  • Added new Data Nodes when needed and ran balancer.
  • Responsible for building scalable distributed data solutions using Hadoop.
  • Continuously monitored and managed the Hadoop cluster through Ganglia and Nagios.
  • Installed Oozie workflow engine to run multiple Hive and Pig jobs, which run independently based on time and data availability.
  • Performed major and minor upgrades to the Hadoop cluster.
  • Upgraded the Cloudera Hadoop ecosystems in the cluster using Cloudera distribution packages.
  • Performed stress testing, performance testing and benchmarking of the cluster.
  • Commissioned and decommissioned DataNodes in the cluster when problems arose.
  • Debugged and resolved major issues with Cloudera Manager by working with the Cloudera team.

Environment: Flume, Oozie, Pig, Sqoop, MongoDB, HBase, Hive, MapReduce, YARN, Cloudera Manager.

Confidential

Java Developer

Responsibilities:

  • Involved in various SDLC phases like Design, Development and Testing.
  • Developed front end using Struts and JSP.
  • Developed web pages using HTML, JavaScript, JQuery and CSS.
  • Used various Core Java concepts such as Exception Handling and Collection APIs to implement various features and enhancements.
  • Developed server-side Servlet components for the application.
  • Involved in coding, maintaining, and administering Servlets and JSP components to be deployed on a WebSphere application server.
  • Implemented Hibernate ORM to map relational data directly to Java objects.
  • Worked with complex SQL queries, functions and stored procedures.
  • Involved in developing with the Spring Web MVC framework for the portals application.
  • Implemented the logging mechanism using log4j framework.
  • Developed REST API, Web Services.
  • Wrote test cases in Junit for unit testing of classes.
  • Used Maven to build the J2EE application.
  • Used SVN to track and maintain the different versions of the application.
  • Involved in maintenance of different applications with the onshore team.
  • Good working experience in Tapestry processing claims.
  • Working experience with professional billing claims.

Environment: Java, Spring Framework, Struts, Hibernate, RAD, SVN, Maven, WebSphere Application Server, Web Services, Oracle Database 11g, IBM MQ, JMS, HTML, JavaScript, XML, CSS, REST API.

Confidential

Java/J2EE Developer

Responsibilities:

  • Involved in client requirement gathering, analysis & application design.
  • Used UML to draw use case diagrams, class & sequence diagrams.
  • Implemented client-side data validations using JavaScript.
  • Implemented server-side data validations using Java Beans.
  • Implemented views using JSP & JSTL 1.0.
  • Developed Business Logic using Session Beans.
  • Implemented Entity Beans for Object Relational mapping.
  • Implemented Service Locator Pattern using local caching.
  • Worked with collections.
  • Implemented Session Facade Pattern using Session and Entity Beans.
  • Developed message-driven beans to listen to JMS.
  • Performed application-level logging using log4j for debugging purposes.
  • Involved in fine-tuning of application.
  • Thoroughly involved in testing phase and implemented test cases using Junit.
  • Involved in the development of Entity Relationship Diagrams using Rational Data Modeler.

Environment: Java SDK 1.4, Entity Bean, Session Bean, JSP, Servlet, JSTL 1.0, CVS, JavaScript, Oracle 9i, SQL, JBoss 3.0, Eclipse 2.1
