Hadoop Admin Resume
Santa Clara, CA
PROFESSIONAL SUMMARY:
- Over 5 years of experience with Apache Hadoop and related components such as HDFS, MapReduce, Hive, HBase, Pig, Sqoop, Nagios, Chef, Puppet, Spark, Impala, Oozie, Kafka and Flume, for Big Data and Big Data analytics.
- Experienced in integrating and configuring the Hadoop framework with technologies such as Flume and Kafka.
- Experience managing Hadoop environment with configuration management tools such as Chef and Puppet
- In-depth understanding of Hadoop architecture and its components such as HDFS, NameNode, JobTracker, DataNode and TaskTracker, and of MapReduce concepts.
- Experience in installation, configuration, support and management of a Hadoop Cluster.
- Experience in task automation using Oozie, cluster coordination through Tidal, and MapReduce job scheduling using the Fair Scheduler.
- Experience in analyzing data using HiveQL, Pig Latin and custom MapReduce programs in Java.
- Experience in writing custom UDFs to extend Hive and Pig core functionality.
- Experience in managing and reviewing Hadoop log files.
- Worked with Sqoop to move (import/export) data between relational databases and Hadoop, and used Flume to collect data and populate Hadoop.
- Worked with HBase to perform quick lookups (updates, inserts and deletes) in Hadoop.
- Experience in working with cloud infrastructure like Amazon Web Services (AWS) and Rackspace.
- Experience in Core Java and Hadoop MapReduce programming. Used Hive to transfer data from RDBMS to the Hive data warehouse.
- Experience in writing Pig Latin scripts and using the Pig interpreter to run MapReduce jobs.
- Experience in storing and managing data with the HCatalog data model.
- Experience in writing SQL queries to perform joins on Hive tables and NoSQL databases.
- Experience in Agile methodology, microservice management, and task and bug tracking using JIRA.
- Working experience in designing and implementing complete end-to-end Hadoop infrastructure including Pig, Hive, Sqoop, Oozie and Zookeeper.
- Experience in Data Modeling, Data Extraction, Data Migration, Data Integration, Data Testing and Data Warehousing using Ab Initio.
- Configured the Informatica environment to connect to different databases using the DB Config, Input Table, Output Table and Update Table components.
- Able to interact effectively with Business Engineering, Quality Assurance, users and other teams involved in the System Development Life Cycle.
TECHNICAL SKILLS:
Big Data Ecosystem: Cloudera, Hortonworks, Hadoop, MapR, HDFS, HBase, Zookeeper, Nagios, Chef, Puppet, Hive, Pig, Ambari, Spark, Impala
Utilities: Oozie, Sqoop, HBase, NoSQL, Cassandra, Flume.
Data Warehousing Tools: Informatica 6.1/7.1x, 9.x
Data Modeling: Star-Schema Modeling, Snowflake Modeling, Erwin 4.0, Visio
RDBMS: Oracle 11g/10g/9i/8i, Teradata 13.0, Teradata V2R6, Teradata 4.6.2, DB2, MS SQL Server 2000/2005/2008
Programming: UNIX Shell Scripting, C/C++, Java, Korn Shell, SQL*Plus, PL/SQL, HTML
Operating Systems: Windows NT/XP/2000, UNIX, Linux (Red Hat)
BI Tools: OBIEE, Tableau
PROFESSIONAL EXPERIENCE:
Confidential, Santa Clara, CA
Hadoop Admin
- Worked on a Hadoop cluster running CDH 5.9.
- Involved in the end-to-end process of Hadoop cluster setup, including installation, configuration and monitoring of the cluster.
- Responsible for cluster maintenance, commissioning and decommissioning DataNodes, cluster monitoring, troubleshooting, managing and reviewing data backups, and managing and reviewing Hadoop log files.
- Handled monitoring of systems and services, architecture design and implementation of the Hadoop deployment, configuration management, backup, and disaster recovery systems and procedures.
- Responsible for the ETL tools Sqoop, Attunity, dataservice and webMethods.
- Worked with the BI tools Tableau and AtScale connecting to Hadoop.
- Knowledge of AtScale for building virtual OLAP on HDFS.
- Knowledge of Solr, Sentry and Hue.
- Involved in building disaster recovery (DR) for Hadoop.
- Involved in analyzing system failures, identifying root causes and recommending courses of action. Documented system processes and procedures for future reference.
- Worked with the systems engineering team to plan and deploy new Hadoop environments and expand existing Hadoop clusters.
- Delivered an end-to-end project, from requirements gathering to development and testing.
- Ingested data from different sources into Hadoop.
- Monitored multiple Hadoop cluster environments using Ganglia and Nagios. Monitored workload and job performance and performed capacity planning using Cloudera Manager.
- Worked on performance tuning of Hive SQL queries and Pig scripts.
Confidential, Mountain View, CA
Hadoop Admin
Responsibilities:
- Worked on a 450-node Hadoop cluster running CDH 5.4.
- Involved in the end-to-end process of Hadoop cluster setup, including installation, configuration and monitoring of the cluster.
- Responsible for cluster maintenance, commissioning and decommissioning DataNodes, cluster monitoring, troubleshooting, managing and reviewing data backups, and managing and reviewing Hadoop log files.
- Handled monitoring of systems and services, architecture design and implementation of the Hadoop deployment, configuration management, backup, and disaster recovery systems and procedures.
- Configured property files such as core-site.xml, hdfs-site.xml and mapred-site.xml based on job requirements.
- Worked with various onshore and offshore teams to understand the data imported from their sources.
- Involved in data visualization; provided the files required by the team by analyzing data in Hive, and developed Pig scripts for advanced analytics on the data.
- Involved in analyzing system failures, identifying root causes and recommending courses of action. Documented system processes and procedures for future reference.
- Worked with the systems engineering team to plan and deploy new Hadoop environments and expand existing Hadoop clusters.
- Delivered an end-to-end project, from requirements gathering to development and testing.
- Ingested data from different sources into Hadoop.
- Monitored multiple Hadoop cluster environments using Ganglia and Nagios. Monitored workload and job performance and performed capacity planning using Cloudera Manager.
- Worked on performance tuning of Hive SQL queries and Pig scripts.
- Loaded data from Hive to Netezza and built Tableau reports for end users.
- Held weekly meetings with business partners and actively participated in review sessions with other developers and the manager.
Environment: Hadoop, HDFS, MapReduce, Yarn, Hive, Pig, Sqoop, Oozie, Flume, Zookeeper, Tidal, CheckMK, Grafana, Vertica, Tableau, Netezza, Oracle
Confidential, Pittsburgh, PA
Hadoop Admin
Responsibilities:
- Installed, configured and maintained Apache Hadoop 2 clusters for application development, along with Hadoop tools such as Hive, HBase, Zookeeper and Sqoop.
- Extensively involved in installation and configuration of the Cloudera distribution of Hadoop: NameNode, Secondary NameNode, ResourceManager, NodeManager and DataNodes.
- Collected log data from web servers and integrated it into HDFS using Flume.
- Worked on cluster installation, commissioning and decommissioning of DataNodes, NameNode recovery, capacity planning, and slots configuration.
- Installed the Oozie workflow engine to run multiple Hive jobs.
- Developed a data pipeline using Flume, Sqoop and Java MapReduce to ingest customer behavioral data and financial histories into HDFS for analysis.
- Experience in implementing NameNode high availability and in Hadoop cluster capacity planning to add and remove nodes.
- Installed and configured Hive and HBase.
- Handled identity, authorization and authentication, including Kerberos setup.
- Configured Sqoop for importing data into and exporting data from HDFS.
- Experienced in loading data from the UNIX local file system to HDFS.
- Configured Hadoop 2 NameNode high availability and NameNode federation.
- Used Sqoop to import and export data between HDFS and relational databases.
- Performed data analysis by running Hive queries.
Environment: Hadoop, HDFS, MapReduce, Yarn, Hive, Pig, Sqoop, Oozie, Flume, Zookeeper, Chef, Puppet, Ubuntu
Confidential, Germantown, MD
Hadoop Admin
Responsibilities:
- Developed and implemented platform architecture as per established standards.
- Supported integration of reference architectures and standards.
- Utilized Big Data technologies to produce technical designs, and prepared architectures and blueprints for Big Data implementation.
- Assisted in the design, development and architecture of Hadoop clusters and HBase systems.
- Coordinated with technical teams for installation of Hadoop and related third-party applications on systems.
- Formulated procedures for planning and execution of system upgrades for all existing Hadoop clusters.
- Integrated Kerberos Security with Hadoop.
- Involved in backup and recovery procedures and in configuring disaster recovery procedures.
- Supported technical team members for automation, installation and configuration tasks.
- Provided technical assistance for configuration, administration and monitoring of Hadoop clusters.
- Evaluated and documented use cases and proof of concepts, participated in learning of tools in Big Data systems.
- Developed process frameworks and supported data migration on Hadoop systems.
- Worked on a Data Lake architecture to collate all enterprise data into a single place for ease of correlation and data analysis, in order to find operational and functional issues in the enterprise workflow.
- Designed ETL flows to get data from various sources, transform it for further processing and load it into Hadoop/HDFS for easy access and analysis by various tools.
- Developed multiple proofs of concept to justify the viability of the ETL solution, including performance and compliance with non-functional requirements.
- Conducted Hadoop training workshops for the development teams as well as directors and the management team to increase awareness.
- Prepared presentations of solutions to Big Data/Hadoop business cases and presented them to company directors to get the go-ahead on implementation.
- Collaborated with the Hortonworks team for technical consultation on business problems and to validate the proposed architecture/design.
- Designed the end-to-end ETL flow for a feed with an inflow of millions of records daily. Used the Apache tools/frameworks Hive, Pig, Sqoop and HBase for the entire ETL workflow.
- Set up the Hadoop cluster, built Hadoop expertise across development, production support and testing teams, enabled production support functions, and optimized Hadoop cluster performance in isolation as well as in the context of production workloads/jobs.
- Designed the Data Model to be used for correlation in Hadoop/Hortonworks.
- Designed Data flow and transformation functions for cleansing call records generated on various networks as well as reference data.
- Supported technical team members in management and review of Hadoop log files and data backups.
- Designed and proposed an end-to-end data pipeline using Falcon and Oozie through POCs.
- Used Nagios to configure cluster/server-level alerts and notifications in case of a failure or glitch in the service.
Environment: Hadoop, MapReduce, Hive, HDFS, Pig, Sqoop, Oozie, Cloudera, Flume, HBase, Spark, Impala, ZooKeeper, Nagios, Microservices, Hortonworks HDP 2.0/2.1, MongoDB, Cassandra, Kafka, Oracle, NoSQL and Unix/Linux.
