Hadoop Administrator Resume
Irvine, CA
PROFESSIONAL SUMMARY:
- Seeking a challenging Hadoop Big Data Administration and Systems Administration/Electrical Engineering career with a progressive, results-oriented organization that offers ample opportunity to prove, improve, and grow professionally, in a challenging and rewarding technical environment where I can expand my existing skill base while contributing to a lively team.
- Experience in installing, configuring, and administering Hadoop clusters for major Hadoop distributions such as Cloudera CDH 5.7, 5.9, 5.10, 5.13, and 5.15
- Experience in setting up automated monitoring and escalation infrastructure for Hadoop Cluster using Ganglia and Nagios
- Experience with Hadoop infrastructure including MapReduce, Hive, Oozie, Sqoop, HBase, Pig, HDFS, YARN, HUE, Spark, Kafka, and the Key-Value Store Indexer, in direct client roles
- Strong experience in Linux/UNIX administration, with expertise in Red Hat Enterprise Linux 4, 5, and 6 and familiarity with Solaris 9 & 10 and IBM AIX 6
- Excellent knowledge of NoSQL databases such as HBase and Cassandra.
- Experience in monitoring and troubleshooting issues with Linux memory, CPU, OS, storage and network
- Strong knowledge of Hadoop platforms and other distributed data processing platforms
- Worked with business users to extract clear requirements to create business value
- Investigated new technologies such as Spark to keep pace with industry developments.
- Exceptionally well organized; demonstrates self-motivation, creativity, and initiative, is extremely dedicated, and actively learns new technologies within a short span of time
- Strong experience with Splunk configuration files and regular expressions, comfort using the Linux CLI and Windows, and experience with Splunk real-time processing architecture, deployment, and dashboard design
TECHNICAL SKILLS:
Big Data Technologies: HDFS, Hive, MapReduce, Cassandra, Pig, HCatalog, Phoenix, Falcon, Sqoop, Flume, ZooKeeper, Mahout, Oozie, Avro, HBase, Storm, CDH 5.3, CDH 5.5
Monitoring Tools: Cloudera Manager, Ambari, Nagios, Ganglia
Scripting Languages: Shell Scripting, Puppet, Python, Bash, CSH.
Programming Languages: C, Python, SQL, and PL/SQL.
Front End Technologies: HTML, XHTML, XML.
Application Servers: Apache Tomcat, WebLogic Server, WebSphere
Databases: Oracle 11g, MySQL, MS SQL Server, IBM DB2.
NoSQL Databases: HBase, Cassandra, MongoDB
Operating Systems: Linux, UNIX, macOS, Windows NT/98/2000/XP/Vista, Windows 7, Windows 8.
Networks: HTTP, HTTPS, FTP, UDP, TCP/IP, SNMP, SMTP.
Security: Kerberos, Ranger.
WORK EXPERIENCE:
Hadoop Administrator
Confidential, Irvine, CA
Responsibilities:
- Installed and configured Hadoop clusters and ecosystem components such as Spark, Hive, Scala, YARN, MapReduce, and HBase.
- Worked on cluster installation, commissioning and decommissioning of DataNodes, NameNode recovery, capacity planning, and slot configuration.
- Collaborated with the infrastructure, network, database, application and BI teams to ensure data quality and availability.
- Installed security components such as Kerberos authentication
- Supporting Cloudera Hadoop infrastructure and other services in its ecosystem
- Providing subject matter expertise in the management of Cloudera Hadoop infrastructure
- Maintaining and providing day-to-day administration of the Cloudera Hadoop infrastructure
- Helping with ongoing Splunk workload and LASR migration
- Managing regular patching and upgrades of CDH
- Experience in managing Microsoft Azure cloud-based and on-premise Cloudera clusters
- Used the Spark API over Cloudera Hadoop YARN to perform analytics on data in Hive (see the sketch after this list).
- Developed Spark scripts using Scala shell commands as per requirements.
- Excellent command of implementing high availability, backup and recovery, and disaster recovery procedures.
- Currently handling four on-premise (CDH 5.13.3) and four Microsoft Azure (CDH 5.15) Cloudera clusters
- Worked closely with Informatica BDM Team and Autosys Teams
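A minimal sketch of the Spark-over-YARN analytics pattern described above, using PySpark; the table and column names (web_logs, status, log_date) are hypothetical placeholders, not taken from the actual engagement:

```python
from pyspark.sql import SparkSession
from pyspark.sql import functions as F

# Hive support picks up the cluster metastore; the master is typically
# set to "yarn" in spark-defaults.conf rather than hard-coded here.
spark = (
    SparkSession.builder
    .appName("hive-analytics-sketch")
    .enableHiveSupport()
    .getOrCreate()
)

# web_logs is a hypothetical Hive table used purely for illustration.
daily_errors = (
    spark.table("web_logs")
    .where(F.col("status") >= 500)   # server-side errors only
    .groupBy("log_date")
    .count()
    .orderBy("log_date")
)

daily_errors.show(20)
spark.stop()
```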
Hadoop Admin (Cloudera)
Confidential, CA
Responsibilities:
- Managed critical data pipelines (CDH 5.13) that power analytics for various business units.
- Responsible for installing, configuring, supporting, and managing Hadoop clusters.
- Worked on performance tuning of Hive SQL queries.
- Created external Hive tables with proper partitions for efficiency and loaded the structured data produced by MapReduce jobs into HDFS (see the sketch after this list).
- Monitored all MapReduce read jobs running on the cluster using Cloudera Manager and ensured they could read data from HDFS without any issues.
- Involved in moving all log files generated from various sources to HDFS for further processing.
- Involved in collecting metrics for Hadoop clusters using Ganglia.
- Worked on a Kerberized Hadoop cluster with 100+ nodes.
- Created Hive tables and loaded data from the local file system into HDFS.
- Responsible for deploying patches and remediating vulnerabilities.
- Experience in setting up Test, QA, and Prod environments.
- Involved in loading data from UNIX file system to HDFS.
- Led root cause analysis (RCA) efforts for high-severity incidents.
- Worked hands-on with the ETL process, importing data from various data sources and performing transformations.
- Coordinated with on-call support when human intervention was required for problem solving.
- Ensured that analytics data was available on time for customers, in turn providing them insight and helping them make key business decisions.
- Aimed at providing a delightful data experience to our customers, the different business groups across the organization.
- Worked on alerting mechanisms to support production clusters, workflows, and daily jobs effectively and meet SLAs.
- Provided operational support to the platform and followed best practices to optimize the performance of the environment.
- Worked with various onshore and offshore teams to understand which data is important
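A minimal sketch of the external, partitioned Hive table pattern mentioned above, issued through PySpark for consistency with the other examples; the table name, columns, and HDFS paths are hypothetical:

```python
from pyspark.sql import SparkSession

spark = (
    SparkSession.builder
    .appName("external-table-sketch")
    .enableHiveSupport()
    .getOrCreate()
)

# EXTERNAL means dropping the table leaves the MapReduce output files
# in HDFS intact; Hive only manages the metadata.
spark.sql("""
    CREATE EXTERNAL TABLE IF NOT EXISTS clickstream (
        user_id STRING,
        url     STRING,
        ts      TIMESTAMP
    )
    PARTITIONED BY (log_date STRING)
    STORED AS PARQUET
    LOCATION '/data/clickstream'
""")

# Register one day's output directory as a partition (path illustrative).
spark.sql("""
    ALTER TABLE clickstream
    ADD IF NOT EXISTS PARTITION (log_date = '2019-01-15')
    LOCATION '/data/clickstream/log_date=2019-01-15'
""")

spark.stop()
```

Partitioning by date keeps queries that filter on log_date from scanning the whole dataset, which is the efficiency gain the bullet above refers to.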
Hadoop Administrator
Confidential, Sunnyvale, CA
Responsibilities:
- Cluster maintenance: adding and removing cluster nodes, cluster monitoring, and troubleshooting
- Managed and reviewed data backups and Hadoop log files on Cloudera clusters
- Architected Hadoop clusters with Cloudera CDH 5.7 and CDH 5.9; built UAT and production Cloudera clusters on CDH 5.9
- Commissioned and decommissioned DataNodes in the cluster when problems arose
- Debugged and resolved major Cloudera Manager issues by interacting with the Cloudera support team
- Continuously monitored and managed the Hadoop cluster through Ganglia and Nagios
- Gave presentations to teams and managers about new ecosystem components to be implemented in the cluster
- Helped users with production deployments throughout the process
- Resolved tickets submitted by users (P2 and P3 issues): troubleshooting, documenting, and resolving the errors
- Onboarded new users to the Hadoop cluster, adding a home directory for each user and providing access to datasets (see the onboarding sketch after this list)
- Installed the Oozie workflow engine to run multiple Hive and Pig jobs, which run independently based on time and data availability.
- Performed major and minor upgrades to the Hadoop cluster
- Monitored the Hadoop cluster through Cloudera Manager and implemented alerts based on error messages
- Provided reports to management on cluster usage metrics
- Benchmarked and stress-tested the Hadoop cluster with TeraSort, TeraGen, TeraValidate, and TestDFSIO
- Experience with Splunk real-time processing architecture and deployment
- Comfortable using the Linux CLI and Windows
- Experience with Splunk configuration files
- Experience with Python, HTML, and XML
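A minimal sketch of the user-onboarding step described above, wrapping the standard hdfs dfs CLI from Python; the username and dataset path are hypothetical, and the ACL step assumes HDFS ACLs are enabled on the cluster:

```python
import subprocess

def onboard_user(username: str, dataset: str) -> None:
    """Create an HDFS home directory for a new user and grant dataset access."""
    home = f"/user/{username}"
    # Create the home directory and hand ownership to the new user.
    subprocess.run(["hdfs", "dfs", "-mkdir", "-p", home], check=True)
    subprocess.run(["hdfs", "dfs", "-chown", f"{username}:{username}", home], check=True)
    # Grant read/execute on the shared dataset via an HDFS ACL entry
    # (requires dfs.namenode.acls.enabled=true).
    subprocess.run(
        ["hdfs", "dfs", "-setfacl", "-m", f"user:{username}:r-x", dataset],
        check=True,
    )

onboard_user("jdoe", "/data/shared/clickstream")
```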
Python Developer
Confidential, San Jose, CA
Responsibilities:
- Used the Django framework to develop web applications implementing the model-view-controller architecture.
- Used Django configuration to manage URLs and application parameters (see the sketch after this list)
- Designed and developed a data management system using MySQL
- Developed frontend and backend modules using Python on the Django web framework
- Wrote SQL queries, stored procedures, triggers, and functions in MySQL databases
- Worked with various Python IDEs such as NetBeans, PyCharm, PyStudio, PyDev, and Sublime Text
- Used MySQL as the backend database and Python's MySQLdb module as the database connector to interact with the MySQL server
- Used Python and Django for creating graphics, XML document processing, data exchange, and business logic implementation between servers.
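A minimal sketch of Django URL configuration of the kind described above; the routes and view names are hypothetical, and the views are defined inline so the module is self-contained:

```python
# urls.py-style routing sketch; a real project would import views
# from the application package instead of defining them inline.
from django.http import HttpResponse, JsonResponse
from django.urls import path

def index(request):
    return HttpResponse("Data management home")

def record_detail(request, record_id):
    # Illustrative payload; a real view would query the MySQL-backed models.
    return JsonResponse({"id": record_id})

urlpatterns = [
    path("", index, name="index"),
    path("records/<int:record_id>/", record_detail, name="record-detail"),
]
```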
Python Developer
Confidential
Responsibilities:
- Involved in the analysis, specification, design, implementation, and testing phases of the Software Development Life Cycle (SDLC), using agile methodology to develop the application.
- Worked as an application developer on controllers, views, and models in Django
- Used SaltStack to configure and manage the infrastructure
- Built RESTful web services using a Python REST API framework (see the sketch below).
- Implemented the application using Python, Spring IoC (Inversion of Control), and the Django framework; installed and configured PyBuilder for application builds and deployment.
- Environment: Python, Django Web Framework, HTML, CSS, NoSQL, JavaScript, jQuery, Sublime Text, Jira, Git, PyBuilder, unittest, Firebug, Web Services.
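A minimal sketch of a RESTful endpoint of the kind described above, written with plain Django JSON views rather than any specific REST framework; the resource (/api/tasks/) and its in-memory store are hypothetical:

```python
import json

from django.http import JsonResponse
from django.views.decorators.csrf import csrf_exempt

TASKS = []  # in-memory store, for illustration only

@csrf_exempt
def tasks(request):
    """GET lists tasks; POST creates one from a JSON body."""
    if request.method == "POST":
        payload = json.loads(request.body)
        task = {"id": len(TASKS) + 1, "title": payload["title"]}
        TASKS.append(task)
        return JsonResponse(task, status=201)
    return JsonResponse({"tasks": TASKS})
```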