Sr. Hadoop Administrator Resume
Manhattan, NYC
SUMMARY
- Around 8 years of professional IT experience, including 6 years in Big Data solutions across domains such as Financial, Telecom, Health and Retail.
- Experience in installing, upgrading, configuring and using Hadoop ecosystem components like MapReduce, YARN, HDFS, HIVE, IMPALA, HBase, OOZIE, SQOOP, SPARK, KAFKA, ZOOKEEPER, SENTRY, RANGER and TLS/SSL services.
- Experience in applying patches, fixing minor bugs and upgrading Cloudera clusters from 5.x and 6.x to CDP.
- Helped in planning, implementing, testing and documenting performance benchmarking of the Hadoop platform.
- Experience in both on-premises and cloud environments (AWS and Azure). Experience in launching and setting up Hadoop clusters on AWS, including installing and configuring Hadoop components, managing HA, and commissioning and decommissioning DataNodes. Managed cluster connectivity and security by creating users, groups, roles and profiles, assigning users to groups, and granting privileges and permissions to the appropriate groups.
- Experience in securing Hadoop clusters through Kerberos KDC installation, LDAP integration, data transport encryption with TLS, and data-at-rest encryption with Cloudera Navigator Encrypt.
- Implemented authorization for HDFS, HIVE and IMPALA using Apache Sentry/Ranger and third-party access management tools.
- Experience in Data Warehousing and ETL processes such as configuring Sqoop and importing/exporting data to and from HDFS and HIVE tables.
- Experience in moving data with ETL tools like Sqoop, Talend and Spark.
- Experience in integrating Tableau, Teradata, DB2 and ORACLE connections with Hadoop using ODBC/JDBC drivers.
- Experience in using distcp, sftp, scp and peer replication to migrate data between and across the clusters.
- Experience in configuring various configuration files like core-site.xml, hdfs-site.xml, mapred-site.xml based upon the job requirement.
- Experience in installing and configuring Zookeeper to coordinate the Hadoop daemons.
- Experience in following standard backup measures to ensure high availability of the cluster.
- Implementing Rack Awareness for data locality optimization.
- Monitoring systems and services; architecture design and implementation of Hadoop deployments, configuration management, backup, and disaster recovery systems using data replication, snapshots and Cloudera BDR utilities.
- Hands-on experience in Linux administration on RHEL and CentOS. Good knowledge of implementing and using cluster monitoring tools like Cloudera Manager, Ganglia and Nagios.
- Experienced in maintaining auditing tools like Cloudera Navigator.
- Experience in performing minor and major upgrades of Hadoop clusters.
- Experience in analyzing log files and failures, identifying root causes, and taking or recommending a course of action.
- Additional responsibilities include interacting with offshore team on a daily basis, communicating the requirement and delegating the tasks to offshore/on-site team members and reviewing their delivery.
- Providing 24x7x365 on-call and weekend production support.
- Excellent interpersonal and communication skills, creative, research-minded with problem solving skills, ability to learn and use new technologies/tools quickly.
TECHNICAL SKILLS
Frameworks: Hadoop, YARN, HDFS, HBase, Zookeeper, Hive, Pig, Sqoop, Oozie, Spark, Tez, Impala
Web Technologies: HTML, CSS, XML, JSON, RESTful
Scripting Languages: Python, SQL, Bash
Database: Oracle, MySQL, PostgreSQL, DB2, MS SQL, Teradata
NOSQL Data Storage: Cassandra, S3
Cluster Security: Active Directory, Kerberos, Ranger, Sentry, TLS/SSL
Cloud Computing: AWS, Google Cloud, Azure
Reporting Tools: Tableau, JIRA, Maven
Operating Systems: Windows, Linux, UNIX
ETL Tools: Talend, Kafka, Sqoop, Spark, NiFi, Flume, SFTP, distcp
Cluster Management Tools: Cloudera Manager, Ambari
Methodology: Agile SDLC, OLTP, OLAP
Version/SCM Control: Git, GitHub, SVN (Subversion)
Protocols: TCP/IP, UDP, SMTP, Routing
PROFESSIONAL EXPERIENCE
Sr. Hadoop Administrator
Confidential, Manhattan, NYC
Responsibilities:
- Involved in installation, configuration, deployment, maintenance, monitoring and troubleshooting of CDH clusters in multiple environments such as Development and Production.
- Extensively worked with Cloudera Distribution Hadoop CDH 5.x.
- Extensively involved in cluster capacity planning, hardware planning, installation, performance tuning, cluster monitoring and troubleshooting of the Hadoop cluster.
- Implemented Load Balancing and HA for NameNode, HIVE and IMPALA Proxy using Cloudera Manager.
- Installed Hue for GUI access to Hive, Pig and OOZIE, and resolved issues reported by Hue users.
- Responsible for managing and scheduling jobs on a Hadoop cluster.
- Administered and optimized the Hadoop clusters, monitored Hadoop jobs and worked with the development team to fix issues.
- Configured Flume and Talend for data transfer from web servers to the Hadoop cluster.
- Involved in loading data from UNIX file system to HDFS.
- Used Sqoop to import data from RDBMS to HDFS.
- Set up MySQL as external backup databases for Cloudera Manager and other Hadoop components.
- Implemented Kerberos Security Authentication protocol for all services in Hadoop cluster.
- Implemented TLS for CDH Services and for Cloudera Manager.
- Worked with a user management tool for user creation, granting users permissions on tables and databases, and assigning group permissions.
- Worked with data delivery teams to set up new Hadoop users, including setting up Linux users and Kerberos principals.
- Deployed a test cluster on AWS cloud services such as EC2, and installed and configured all the Hadoop ecosystem services.
- Chose appropriate file formats for Hadoop file systems (Text, Avro, Parquet) and compression techniques such as Snappy, bz2 and LZO.
- Improved communication between teams, which led to an increase in the number of simultaneous projects and in average billable hours.
Environment: Cloudera Distribution Hadoop (CDH), HA NameNode, Yarn, Hive, Spark, Impala, Pig, Sqoop, Flume, Cloudera Navigator, Oozie, Hue, Ganglia, Nagios, HBase, Cassandra, Kafka, Zookeeper, Jenkins, Maven
Hadoop Administrator
Confidential, Phoenix, AZ
Responsibilities:
- Worked on cluster installation, commissioning and decommissioning of DataNodes, capacity planning, slots configuration and performance tuning. Scripted Hadoop package installation and configuration to support fully automated deployments.
- Built high availability for major production cluster and designed automatic failover control using Zookeeper Failover Controller (ZKFC) and Quorum Journal nodes.
- Implemented HA for HIVE and HUE using Cloudera Manager.
- Responsible for cluster maintenance, monitoring and troubleshooting; managed and reviewed data backups and Hadoop log files.
- Implemented rack topology scripts for the Hadoop cluster.
- Loaded data from various data sources into HDFS using Flume.
- Loaded and transformed large sets of structured, semi structured and unstructured data using Sqoop.
- Managed data coming from different sources, such as other clusters and the UNIX file system.
- Worked extensively on importing metadata into Hive, and migrated existing tables and applications to work on Hive and Spark.
- Responsible for migrating from Hadoop to the Spark framework for in-memory distributed computing in real time.
- Implemented Partitioning, Dynamic Partitions, Buckets in HIVE.
- Implemented the right file formats in Hadoop file systems Text, Avro, Parquet and compression techniques such as Snappy, bz2.
- Implemented TLS for CDH Services and for Cloudera Manager.
- Integrated Kerberos into Hadoop to make the cluster more secure against unauthorized users.
- Experience working on LDAP user accounts and configuring LDAP on client machines.
- Worked closely with business stakeholders, BI analysts, developers, and SAS users to establish SLAs and acceptable performance metrics for the Hadoop-as-a-service offering.
- Formulated procedures for installation of Hadoop patches, updates and version upgrades.
Environment: Hadoop, MapReduce, HDFS, Hive, Oracle 11g, Flume, SQOOP, Pig, Kerberos, LDAP, CDH 4 and CDH 5, Zookeeper, Maven.
Hadoop Admin
Confidential, Minneapolis, MN
Responsibilities:
- Working experience in designing and implementing a complete end-to-end Hadoop infrastructure including MapReduce, Hive, Sqoop, Oozie and Zookeeper.
- Built multiple clusters running Cloudera as per the business requirements. Managed and scheduled jobs on a Hadoop cluster.
- Used Sqoop to migrate data to HDFS from MySQL/Oracle and deployed Hive and HBase integration to perform OLAP operations on HBase data.
- Involved in loading and transforming large sets of structured, semi-structured and unstructured data from relational databases into HDFS using Sqoop imports.
- Worked on an Oozie scheduler to automate the pipeline workflow and orchestrate the Sqoop, Hive and Pig jobs that extract the data in a timely manner. Created and managed cron jobs.
- Involved in analyzing system failures, identifying root causes, and recommending courses of action.
- Created Users, Groups, Roles, Profiles and assigned users to groups and granted privileges and permissions to appropriate groups.
- Transformed the data using Hive and Pig for the BI team to perform visual analytics, according to the client requirements.
- Experience in analyzing the Cassandra database and comparing it with other open-source NoSQL databases to find which one best suits the current requirements.
- Implemented the Fair Scheduler on the JobTracker to share cluster resources among the MapReduce jobs submitted by users.
Environment: Cloudera CDH 4 Distribution, HDFS, MapReduce, Cassandra, Hive, Oozie, Pig, Shell Scripting, MySQL
SQL DBA
Confidential, Englewood Cliffs, NJ
Responsibilities:
- Monitored and tuned database resources and activities for SQL Server databases.
- Performed and automated SQL Server version upgrades and patch installs.
- Designed, developed and maintained relational databases.
- Developed, implemented and maintained enterprise business information systems.
- Developed SSRS reports and configured SSRS subscriptions per specifications provided by internal and external clients.
