Hadoop Admin Resume
New York City, NY
PROFESSIONAL SUMMARY:
- Good experience as a Software Engineer across IT technologies, with strong working knowledge of Java and the Big Data Hadoop ecosystem.
- Good experience with Hadoop infrastructure, including MapReduce, Hive, Oozie, Sqoop, HBase, Pig, HDFS, YARN, Spark and Impala configuration projects in direct client-facing roles.
- Good knowledge of Data Structures, Algorithms, Object-Oriented Design and Data Modelling.
- Strong knowledge of Core Java programming using Collections, Generics, Exception handling and multithreading.
- Good knowledge of Data Warehousing, ETL development, Distributed Computing and large-scale data processing.
- Good knowledge of the design and implementation of big data pipelines.
- Knowledge of installing, configuring and administering Hadoop clusters for major Hadoop distributions such as CDH5 and HDP.
- Knowledge of implementing ETL/ELT processes with MapReduce, Pig and Hive.
- Hands-on experience with major components in the Hadoop ecosystem, including Hive, HBase, HBase and Hive integration, Sqoop and Flume, with knowledge of the MapReduce/HDFS framework.
- Worked on NoSQL databases including HBase, Cassandra and MongoDB.
- Strong knowledge of creating and monitoring Hadoop clusters on VMs, Hortonworks Data Platform 2.1 and 2.2, and CDH3/CDH4 with Cloudera Manager on Linux (Ubuntu).
- Knowledge of MS SQL Server 2012/2008/2005 and Oracle 11g/10g/9i.
- Strong knowledge of the Software Development Life Cycle (SDLC).
- Strong understanding of Agile and Waterfall SDLC methodologies.
- Experienced in developing MapReduce programs using Apache Hadoop for working with Big Data.
- Good knowledge of creating reports using QlikView and Qlik Sense.
- Experienced in installing, configuring and administering Hadoop clusters (see the sketch below).
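A minimal sketch of what the cluster installation and administration work above typically involves, assuming a standard Apache Hadoop layout with HADOOP_HOME set and the *-site.xml files already configured; the commands are illustrative, not tied to any specific engagement:

    # Bring up and verify a newly configured Hadoop cluster (sketch, not a production runbook).
    hdfs namenode -format                 # one-time format of NameNode metadata (new clusters only)
    $HADOOP_HOME/sbin/start-dfs.sh        # start the NameNode and DataNodes
    $HADOOP_HOME/sbin/start-yarn.sh       # start the ResourceManager and NodeManagers
    hdfs dfsadmin -report                 # confirm DataNodes registered and capacity looks right
    yarn node -list                       # confirm NodeManagers registered with YARN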
TECHNICAL SKILLS:
Big Data: Hadoop, HDFS, MapReduce, Hive, Sqoop, Pig, Ambari, HBase, MongoDB, Cassandra, Spark, Flume, Impala, Kafka, Oozie, Zookeeper, Cloudera Manager
Hadoop Distributions: Cloudera, Hortonworks, AWS
Analysis Tools: DataRobot, Trifacta
Project Management: MS-Project
Programming & Scripting Languages: Java, SQL, PL-SQL, JavaScript, Scala, Unix Shell Scripting, C, Python, R programming
Reporting Tools: QlikView, Qlik Sense, Tableau, SoapUI
IDE/GUI: Eclipse, IntelliJ IDEA, NetBeans, Visual Studio
Database: MS-SQL, Oracle Database, MS-Access, AWS, Teradata, MongoDB, Cassandra
Other Tools: Wireshark, Cisco Packet Tracer
Operating Systems: Windows 10, Windows 8, Windows 7, Windows Server 2008/2003, Mac OS, Ubuntu, Red Hat Linux, Linux, UNIX
PROFESSIONAL EXPERIENCE:
Confidential, New York City, NY
Hadoop Admin
Responsibilities:
- Configuring hosts as edge nodes with the desired filesystems and adding them to the cluster.
- Involved in the weekly releases and in maintaining sync between different environments such as Production, Disaster Recovery and Pre-Production.
- Deleting or adding users per request in Hue, DataRobot and Trifacta.
- Actively involved in the OS Patching activities, Cloudera Upgrade and other maintenance activities.
- Actively involved in the planning and implementation of Rack Awareness in the different environments.
- Involved in migrating the MySQL database to Oracle database and PSQL database to Oracle database.
- Performed Requirement Analysis, Planning, Architecture Design and Installation of the Hadoop cluster
- Acted as a point of contact between the vendor and my team on different issues.
- Actively involved in the planning and implementation of the Load Balancer with a single GTM and multiple LTMs.
- Involved in writing automation scripts for different applications and purposes, such as installing applications.
- Involved in configuring LDAP on different applications for secure login.
- Actively involved in troubleshooting user issues on a 24/7 basis.
- Implemented a strategy to upgrade the OS on all cluster nodes from RHEL5 to RHEL6 while ensuring the cluster remained up and running.
- Involved in cluster-level security: perimeter security (authentication via Cloudera Manager, Active Directory and Kerberos), access (authorization and permissions via Sentry), visibility (audit and lineage via Navigator) and data (encryption at rest).
- Worked on the YARN capacity scheduler by creating queues to allocate resource guarantees to specific groups.
- Worked on installing the production cluster, commissioning and decommissioning of DataNodes, NameNode recovery, capacity planning and slot configuration (see the sketch below).
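A minimal sketch of the graceful DataNode decommissioning flow referenced above, assuming the standard dfs.hosts.exclude mechanism; the host name and exclude-file path are illustrative assumptions:

    # Gracefully decommission a DataNode (illustrative host and path).
    echo "datanode07.example.com" >> /etc/hadoop/conf/dfs.exclude    # file referenced by dfs.hosts.exclude in hdfs-site.xml
    hdfs dfsadmin -refreshNodes                                      # make the NameNode re-read the include/exclude lists
    hdfs dfsadmin -report | grep -A 3 datanode07.example.com         # wait for "Decommissioned" before stopping the node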
Environment: Hadoop, HDFS, Kerberos, Sentry, YARN, Hive, Pig, Java, SQL, Cloudera Manager, Sqoop, Flume, Oozie, MongoDB, Cassandra, HBase, Eclipse, Oracle, LDAP, DataRobot and Trifacta
Confidential, Richardson, Texas
Hadoop Developer/Admin
Responsibilities:
- Creation of Java classes and interfaces to implement the system.
- Interacted with Cloudera support, logged issues in the Cloudera portal and fixed them as per the recommendations.
- Scheduled several time-based Oozie workflows by developing Python scripts.
- Orchestrated hundreds of Sqoop scripts, Python scripts and Hive queries using Oozie workflows and sub-workflows.
- Implemented custom interceptors for Flume to filter data and defined channel selectors to multiplex the data into different sinks.
- Created instances in AWS and migrated data from the data center to AWS using Snowball and the AWS migration service.
- Extended the functionality of Hive and Pig with custom UDFs and UDAFs in Java.
- Involved in extracting the data from various sources into Hadoop HDFS for processing
- Worked on analyzing Hadoop cluster and different big data analytic tools including Pig, HBase database and Sqoop
- Creating and truncating HBase tables in Hue and taking backups of submitter ID(s).
- Responsible for building scalable distributed data solutions using Hadoop
- Commissioned and decommissioned nodes on a CDH5 Hadoop cluster on Red Hat Linux.
- Worked with BI teams in generating the reports and designing ETL workflows on Tableau
- Configured, supported and maintained all network, firewall, storage, load balancer, operating system and software components in AWS EC2, and created detailed AWS Security Groups that acted as virtual firewalls controlling the traffic allowed to reach one or more EC2 instances.
- Hands-on experience in Hadoop administration and support activities, installing and configuring Apache big data tools and Hadoop clusters using Cloudera Manager.
- Strong capability with Unix shell programming; able to diagnose and resolve complex configuration issues and to adapt Unix environments for Hadoop tools.
- Maintained EC2 (Elastic Compute Cloud) and RDS (Relational Database Service) resources in Amazon Web Services.
- Created Hive external tables, loaded data into them and queried the data using HQL (see the sketch below).
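A minimal sketch of the Hive external table pattern in the last bullet, run from the shell; the table, columns and HDFS location are illustrative assumptions:

    # Create a Hive external table over existing HDFS data and query it with HQL (sketch).
    hive -e "
      CREATE EXTERNAL TABLE IF NOT EXISTS web_orders (
        order_id   BIGINT,
        customer   STRING,
        amount     DOUBLE,
        order_date STRING
      )
      ROW FORMAT DELIMITED FIELDS TERMINATED BY ','
      STORED AS TEXTFILE
      LOCATION '/data/raw/web_orders';

      SELECT order_date, SUM(amount) AS daily_total
      FROM web_orders
      GROUP BY order_date;
    "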
Environment: Hadoop, MapReduce, HDFS, Hive, Pig, Java, SQL, Cloudera Manager, Sqoop, Flume, Oozie, CDH3, MongoDB, Cassandra, HBase, Java, Eclipse, Oracle and Unix/Linux.
Confidential
Hadoop Developer
Responsibilities:
- Responsible for building scalable distributed data solutions using Hadoop.
- Involved in Installing and configuring Hive, Pig, Sqoop, Flume and Oozie on the Hadoop cluster.
- Handled importing of data from various data sources, performed transformations using Hive and MapReduce, loaded data into HDFS and extracted data from MySQL into HDFS using Sqoop.
- Hands-on experience in Hadoop administration and support activities, installing and configuring Apache big data tools and Hadoop clusters using Cloudera Manager.
- Capable of handling Hadoop cluster installations in various environments such as Unix, Linux and Windows; able to implement and execute Pig Latin scripts in the Grunt shell.
- Experienced with file manipulation and advanced research to resolve problems and correct data integrity for critical big data issues in the Hadoop HDFS/NoSQL stores.
- Collected log data from web servers and integrated it into HDFS using Flume.
- Implemented NameNode backup using NFS for high availability.
- Developed Pig Latin scripts to extract data from the web server output files to load into HDFS.
- Involved in the installation of CDH3 and the upgrade from CDH3 to CDH4.
- Created Hive external tables, loaded data into them and queried the data using HQL.
- Worked with application teams to install operating system, Hadoop updates, patches, version upgrades as required.
- Involved in installing the Oozie workflow engine to run multiple Hive and Pig jobs.
- Exported the analyzed data to relational databases using Sqoop for visualization and to generate reports for the BI team (see the sketch below).
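A minimal sketch of the Sqoop export in the last bullet; the JDBC URL, credentials, table name and export directory are illustrative assumptions:

    # Export analyzed HDFS/Hive data to a relational table for BI reporting (sketch).
    sqoop export \
      --connect jdbc:mysql://reporting-db.example.com:3306/analytics \
      --username etl_user \
      --password-file /user/etl_user/.db_password \
      --table daily_sales_summary \
      --export-dir /user/hive/warehouse/daily_sales_summary \
      --input-fields-terminated-by '\001' \
      -m 4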
Environment: Hadoop, MapReduce, HDFS, Hive, Pig, Java, SQL, Cloudera Manager, Sqoop, Flume, Oozie, CDH3, MongoDB, Cassandra, HBase, Java, Eclipse, Oracle and Unix/Linux.