Hadoop Admin Resume
VA
SUMMARY:
- 6+ years of experience, including 3 years in Hadoop and related technologies.
- Worked on installation, configuration, and maintenance of a 100+ node Hadoop cluster.
- Involved in collecting requirements from business users and in designing and implementing end-to-end data pipelines and ETL workflows.
- Experience performing major and minor Hadoop upgrades in large environments.
- Experience securing Hadoop clusters using Kerberos.
- Experience using Cloudera Manager for installation and management of Hadoop clusters.
- Monitoring and support through Nagios and Ganglia.
- Benchmarked Hadoop clusters to validate the hardware before and after installation, tweaking configurations to obtain better performance.
- Experience performing POCs to test the usability of new tools on top of Hadoop.
- Experience working in large environments and leading infrastructure support and operations.
- Migrated applications from existing systems such as MySQL, Oracle, DB2, and Teradata to Hadoop.
- Expertise with Hadoop, MapReduce, Pig, Sqoop, Oozie, and Hive.
- Developed and automated daily Hive queries.
- Extensive knowledge of migrating applications from existing sources.
- Experience driving OS upgrades on large Hadoop clusters without downtime.
- Expertise in collaborating across multiple technology groups to get things done.
PROFESSIONAL EXPERIENCE:
Confidential, VA
Hadoop Admin
Responsibilities:
- Gathered the business requirements from business partners and subject matter experts.
- Installed and configured the Apache Hadoop, Hive, and Pig environment on the prototype server.
- Configured a MySQL database to store Hive metadata.
- Responsible for loading unstructured data into the Hadoop Distributed File System (HDFS).
- Maintained documentation for the corporate data dictionary with attributes, table names, and constraints.
- Involved in running Hadoop streaming jobs to process terabytes of XML-format data.
- Involved in managing and reviewing Hadoop log files.
- Imported and exported data into HDFS and Hive using Sqoop.
- Supported MapReduce programs running on the cluster.
- Imported data using Sqoop to load data from MySQL to HDFS on a regular basis.
- Developed scripts and batch jobs to schedule various Hadoop programs.
- Created jobs to load data from MongoDB into the data warehouse.
- Wrote Hive queries for data analysis to meet the business requirements.
- Extensively worked with SQL scripts to validate data pre- and post-load.
- Wrote Java MapReduce jobs to process the tagging functionality for each chapter, section, and subsection of the data stored in HDFS.
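A recurring Sqoop import of the kind described above is typically wrapped in a small scheduled shell script. The sketch below only assembles the command string; the host, database, and directory names are placeholders, not details from this role:

```shell
#!/usr/bin/env bash
# Minimal sketch, assuming a MySQL source and a dated HDFS landing directory.
# build_sqoop_import only builds the command; a batch scheduler would run it.
build_sqoop_import() {
  local table="$1" day="$2"
  printf 'sqoop import --connect jdbc:mysql://dbhost/appdb --table %s --target-dir /data/raw/%s/dt=%s -m 4' \
    "$table" "$table" "$day"
}

# Example: print the command a daily batch job would execute.
build_sqoop_import orders 2016-01-15
```

Separating command construction from execution keeps the script easy to dry-run and log before the job actually touches the cluster.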
Confidential, Plymouth, MN
Hadoop Administrator
Responsibilities:
- Responsible for building and managing Hadoop clusters; worked with Cloudera Manager.
- Involved in source system analysis, data analysis, and data modeling through ETL (Extract, Transform, Load) and HiveQL.
- Strong experience in installation and configuration of Hadoop ecosystem components such as YARN, HBase, Flume, Pig, and Sqoop.
- Expertise in Hadoop cluster tasks such as adding and removing nodes without any effect on running jobs and data.
- Loaded log data into HDFS using Flume; worked extensively on creating MapReduce jobs to power data for search and aggregation.
- Worked extensively with Sqoop for importing metadata from Oracle.
- Designed a data warehouse using Hive.
- Created partitioned tables in Hive.
- Mentored analysts and the test team in writing Hive queries.
- Extensively used Pig for data cleansing.
- Developed Pig Latin scripts to extract the data from web server output files and load it into HDFS.
- Developed Pig UDFs to pre-process the data for analysis.
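The cleansing step above was done in Pig Latin; purely as an illustration, the same kind of record filter can be sketched in plain shell (the log format and status-code pattern are assumptions, not the actual Pig logic from this role):

```shell
#!/usr/bin/env bash
# Illustrative stand-in for a Pig FILTER: keep only web-server log lines
# that contain a well-formed quoted HTTP request and a 3-digit status code.
clean_logs() {
  grep -E '"[A-Z]+ [^"]+ HTTP/[0-9.]+" [0-9]{3} '
}

# Example: feed two lines through the filter; only the well-formed one survives.
printf '%s\n' \
  '10.0.0.1 - - [01/Jan/2016] "GET /index.html HTTP/1.1" 200 512' \
  'corrupted record with no request field' | clean_logs
```

In Pig the equivalent would be a `FILTER ... BY` with a matching pattern, applied before any joins or aggregations so malformed rows never reach the expensive steps.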
Confidential, Memphis, TN
Jr. Hadoop Admin
Responsibilities:
- Installed, configured, and maintained Apache Hadoop clusters for application development, along with Hadoop tools such as Hive, Pig, HBase, ZooKeeper, and Sqoop.
- Extensively involved in installation and configuration of Cloudera's Hadoop distribution (CDH 2 and 3), including the NameNode, Secondary NameNode, JobTracker, TaskTrackers, and DataNodes.
- Wrote shell scripts to monitor the health of Hadoop daemon services and respond to any warning or failure conditions.
- Installed and configured Hadoop, MapReduce, and HDFS (Hadoop Distributed File System); developed multiple MapReduce jobs for data cleaning.
- Involved in setting up a Hadoop cluster across a network of 70 nodes.
- Experienced in loading data from the UNIX local file system to HDFS.
- Developed data pipelines using Flume, Sqoop, Pig, and Java MapReduce to ingest customer behavioral data and financial histories into HDFS for analysis.
- Involved in collecting and aggregating large amounts of log data using Apache Flume and staging the data in HDFS for further analysis.
- Involved in developing new workflow MapReduce jobs using the Oozie framework.
- Collected log data from web servers and integrated it into HDFS using Flume.
- Worked on installing the cluster, commissioning and decommissioning DataNodes, NameNode recovery, capacity planning, and slots configuration.
- Developed Pig Latin scripts to extract the data from web server output files and load it into HDFS.
- Used Pig as an ETL tool to do transformations, event joins, and some pre-aggregations before storing the data in HDFS.
- Involved in the installation of CDH3 and the upgrade from CDH3 to CDH4.
- Responsible for developing a data pipeline using HDInsight, Flume, Sqoop, and Pig to extract data from weblogs and store it in HDFS.
- Installed the Oozie workflow engine to run multiple Hive and Pig jobs.
- Used Sqoop to import and export data between HDFS and RDBMS.
- Created Hive external and internal tables; involved in data loading and in writing Hive UDFs.
- Exported the analyzed data to relational databases using Sqoop for visualization and report generation.
- Involved in migrating ETL processes from Oracle to Hive to test ease of data manipulation.
- Used Hive to analyze partitioned and bucketed data and compute various metrics for reporting.
- Worked on importing and exporting data between Oracle/DB2 and HDFS/Hive using Sqoop.
- Worked on NoSQL databases including HBase, MongoDB, and Cassandra, as well as Kafka.
- Created Hive external tables, loaded data into them, and queried the data using HiveQL.
- Created Hive queries to compare raw data with EDW tables and perform aggregates.
- Wrote shell scripts to automate rolling day-to-day processes.
- Automated workflows using shell scripts to pull data from various databases into Hadoop.
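Daemon health checks like the ones mentioned above reduce to a small function; in this sketch the daemon list and the use of `jps`-style output are assumptions, and a monitoring check (e.g. in Nagios) would alert whenever the output is non-empty:

```shell
#!/usr/bin/env bash
# Minimal sketch of a daemon health check: given `jps`-style process output,
# print the expected Hadoop daemons that are not running.
missing_daemons() {
  local jps_output="$1" missing="" d
  for d in NameNode DataNode ResourceManager NodeManager; do
    # POSIX-friendly substring check via grep on the captured output.
    printf '%s\n' "$jps_output" | grep -q "$d" || missing="$missing $d"
  done
  echo "${missing# }"
}

# Example: in a real script the argument would be "$(jps)".
missing_daemons "1234 NameNode
5678 DataNode"
```

Keeping the check as a pure function of captured output makes it easy to test without a live cluster and to reuse across Nagios, cron, and on-call tooling.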
Confidential
Linux Admin
Responsibilities:
- Configuring and tuning system and network parameters for optimum performance.
- Gained troubleshooting and problem-solving skills, including application- and network-level troubleshooting.
- Gained knowledge of and experience in writing shell scripts to automate tasks.
- Identifying and triaging outages; monitoring and remediating system and network performance.
- Developing tools to automate teh deployment, administration, and monitoring of a large-scale Linux environment.
- Performing server tuning, operating system upgrades.
- Participating in teh planning phase for system requirements on various projects for deployment of business functions.
- Participating in 24x7 on-call rotation and maintenance windows.
- Communication and coordination with internal/external groups and operations.
TECHNICAL SKILLS:
Hadoop ecosystem components: Hadoop, MapReduce, YARN, Hive, Pig, Sqoop, Flume, Impala, Oozie.
Tools: SVN, Tableau, MicroStrategy integrations with Hive.
Programming Languages: Unix shell scripting, Java, SQL.
Monitoring and Alerting: Nagios, Ganglia.
Operating Systems: Linux CentOS 5 and 6, Red Hat 7.