
Sr. Hadoop Administrator Resume


Portland, Oregon

SUMMARY

  • Over 9 years of experience in Systems Administration and Data Analytics, specializing in the Hadoop ecosystem and Big Data technologies, including 5 years of extensive experience as a Hadoop Administrator with strong expertise in MapReduce and Hive.
  • Strong experience in the implementation and administration of Hadoop infrastructure.
  • Strong hands-on experience designing optimized solutions using Hadoop components such as MapReduce, Hive, Sqoop, Pig, HDFS, Flume, and Oozie.
  • Strong understanding of the Cassandra and HBase NoSQL databases.
  • Strong skills in writing UNIX and Linux shell scripts.
  • Experience in writing Python scripts.
  • Experience in backup and disaster recovery strategies.
  • Strong hands-on experience adding and removing cluster nodes and monitoring them with tools such as Nagios and Ganglia.
  • Designed and deployed Hortonworks three-tier architectures.
  • Experience collaborating with application teams to install operating system and Hadoop updates, patches, and version upgrades as required.
  • Implemented large-scale Hadoop solutions using the Cloudera and Hortonworks distributions.
  • Strong experience working with Cloudera and Hortonworks teams to ensure rapid response to customer questions and requirements.
  • Performance tuning of Hadoop clusters and Hadoop MapReduce routines.
  • Strong experience architecting large-scale storage and implementing globally distributed solutions.
  • Strong analytical skills with proficiency in debugging and problem solving; experience in sizing and scaling distributed databases.
  • Implemented Kerberos in the Hadoop cluster environment; Kerberos acts as the security gateway that authenticates any user entering the Hadoop cluster.
  • The Kerberos security system comprises a Key Distribution Center (KDC), user principals, and HDFS nodes; implemented these components and handled tickets related to the Hadoop security system (see the sketch below).
  • Communicated with the Cloudera team whenever a critical issue could not be resolved easily, since the Hadoop-Kerberos security environment is backed by Cloudera support.
  • Experience in supporting Spark clusters.
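
A minimal sketch of this Kerberos workflow, assuming an illustrative realm, hostname, and keytab path (none of these are the actual environment):

    # On the KDC: create a service principal and export its keytab
    kadmin.local -q "addprinc -randkey hdfs/namenode01.example.com@EXAMPLE.COM"
    kadmin.local -q "xst -k /etc/security/keytabs/hdfs.service.keytab hdfs/namenode01.example.com@EXAMPLE.COM"

    # A user obtains a ticket before touching the secured cluster
    kinit analyst@EXAMPLE.COM
    klist                  # verify the ticket-granting ticket
    hdfs dfs -ls /         # HDFS access now authenticates via Kerberos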

TECHNICAL SKILLS

Operating Systems: Windows 2000 Server, Windows 2000 Advanced Server, Windows Server 2003, Windows NT, Windows 98/XP, UNIX, Linux (RHEL, CentOS, Debian, Fedora)

Databases: MS SQL Server 2000/2005/2008, MS Access, Teradata, Oracle 9i/10g, Cassandra, HBase

Languages: Java, C, C++, Pig Latin, HiveQL

Tools/Utilities: MapReduce, Sqoop, Flume, Oozie, SQL Profiler

Reporting Tools: Tableau, Impala, QlikView, Datameer

Web Utilities: HTTP, IIS Administration, Apache

PROFESSIONAL EXPERIENCE

Confidential, Portland, Oregon

Sr. Hadoop Administrator

Responsibilities:

  • Designed, developed, monitored, and optimized a petabyte-scale Hadoop cluster on the AWS cloud.
  • Automated major components within the Hadoop ecosystem.
  • Implemented Active Directory (AD) integration with Kerberos.
  • Analyzed query performance and fine-tuned queries.
  • Automated deployments.
  • Installed and configured Sentry as an additional security layer for access to Hive tables (see the sketch below).
  • Migrated local users to the AD setup.
  • Upgraded the Hortonworks Hadoop distribution to higher versions.
  • Implemented Hive and Impala best practices to enhance query performance.
  • Implemented Apache Ranger configurations (see the second sketch below).
  • Managed and reviewed Hadoop log files.
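
A minimal sketch of the Sentry authorization layer for Hive tables; the role, group, table, and HiveServer2 host are illustrative assumptions:

    # Grant a group read access to a Hive table through Sentry (run as a Sentry admin)
    beeline -u "jdbc:hive2://hiveserver01.example.com:10000/default;principal=hive/_HOST@EXAMPLE.COM" \
      -e "CREATE ROLE analyst_role;
          GRANT ROLE analyst_role TO GROUP analysts;
          GRANT SELECT ON TABLE sales TO ROLE analyst_role;"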

Environment/Tools: Hadoop MapReduce, Hive, Impala, AWS, Kerberos AD integration, Puppet, HBase, Hortonworks, Apache Ranger
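
A minimal sketch of scripting an Apache Ranger policy through Ranger's public REST API; the admin URL, credentials, service name, and path are illustrative assumptions:

    # Create an HDFS path policy through Ranger's public REST API
    curl -u admin:secret -H "Content-Type: application/json" \
      -X POST http://ranger.example.com:6080/service/public/v2/api/policy \
      -d '{
            "service": "cluster_hdfs",
            "name": "etl-landing-zone",
            "resources": {"path": {"values": ["/data/landing"], "isRecursive": true}},
            "policyItems": [{"users": ["etl"],
                             "accesses": [{"type": "read",  "isAllowed": true},
                                          {"type": "write", "isAllowed": true}]}]
          }'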

Confidential, Bentonville, Arkansas

Sr Hadoop Administrator

Responsibilities:

  • Worked extensively with Apache Hadoop, including creating and debugging production-level jobs.
  • Analyzed complex distributed production deployments and made recommendations to optimize performance.
  • Successfully drove HDP POCs with various lines of business.
  • Designed and developed an automated data archival system using Hadoop HDFS; the system has a configurable limit on the amount of archived data, for efficient use of HDFS disk space.
  • Configured Apache Hive tables for analytic jobs and created HiveQL scripts for offline jobs.
  • Designed Hive tables with partitioning and bucketing based on different use cases (see the sketch below).
  • Developed UDFs to extend Apache Pig and Hive with client-specific data-filtering logic.
  • Designed and implemented a stream-filtering system on top of Apache Kafka to reduce stream size (see the second sketch below).
  • Wrote a Kafka REST API to collect events from the front end.
  • Implemented Apache Ranger configurations in the Hortonworks distribution.
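
A minimal sketch of the partitioning and bucketing pattern; the table, columns, bucket count, and staging table are illustrative assumptions:

    # Partitioned, bucketed Hive table plus a typical dynamic-partition load
    hive -e "
      SET hive.enforce.bucketing=true;
      SET hive.exec.dynamic.partition=true;
      SET hive.exec.dynamic.partition.mode=nonstrict;
      CREATE TABLE clicks (user_id BIGINT, url STRING)
      PARTITIONED BY (dt STRING)
      CLUSTERED BY (user_id) INTO 32 BUCKETS
      STORED AS ORC;
      INSERT OVERWRITE TABLE clicks PARTITION (dt)
      SELECT user_id, url, dt FROM clicks_staging;"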

Environment: Hadoop, Kafka, Storm, Cassandra, Java MapReduce, Hive, HiveQL, HDP, Apache Ranger
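
And a minimal sketch of standing up and smoke-testing a Kafka topic of the kind the stream filter consumed; the hosts and topic name are illustrative assumptions, and the flags match the 0.8.x-era CLI:

    # Create the topic with replication, then verify the event flow end to end
    kafka-topics.sh --create --zookeeper zk01.example.com:2181 \
      --replication-factor 3 --partitions 8 --topic frontend-events

    echo '{"event":"click","user":42}' | \
      kafka-console-producer.sh --broker-list kafka01.example.com:9092 --topic frontend-events

    kafka-console-consumer.sh --zookeeper zk01.example.com:2181 \
      --topic frontend-events --from-beginning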

Confidential, Austin, Texas

Hadoop Engineer/Administrator

Responsibilities:

  • Installed and configured a multi-node, fully distributed Hadoop cluster.
  • Involved in installing Hadoop ecosystem components.
  • Responsible for managing data coming from different sources.
  • Performed Hadoop cluster administration, including adding and removing cluster nodes, cluster capacity planning, and performance tuning.
  • Developed multiple MapReduce jobs in Java for data cleaning and processing.
  • Wrote complex MapReduce programs.
  • Handled importing of data from various data sources, performed transformations using Hive and MapReduce, and loaded the data into HDFS.
  • Involved in HDFS maintenance and administered it through the Hadoop Java API.
  • Configured the Fair Scheduler to provide service-level agreements for multiple users of the cluster.
  • Loaded data into the cluster from dynamically generated files using Flume and from RDBMS sources using Sqoop (see the sketches below).
  • Involved in writing Java APIs for interacting with HBase.
  • Involved in writing Flume and Hive scripts to extract, transform, and load data into the database.
  • Used HBase as the data store.
  • Experience in using Spark.
  • Installed Storm and Kafka on a 40-node cluster.
  • Wrote a Storm topology to accept events from a Kafka producer and emit them into Cassandra.
  • Wrote JUnit test cases for the Storm topology.
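
A minimal sketch of the Sqoop side of those loads; the JDBC URL, table, and target directory are illustrative assumptions:

    # Import an RDBMS table into HDFS, splitting the work across four mappers
    sqoop import \
      --connect jdbc:mysql://db01.example.com/sales \
      --username etl --password-file /user/etl/.db_password \
      --table orders \
      --target-dir /data/raw/orders \
      --split-by order_id \
      --num-mappers 4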

Environment: Java MapReduce, Hive, HBase, Flume, Sqoop, Spark, Storm, Kafka, Cassandra
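
And a minimal sketch of a Flume agent for the dynamically generated files; the agent name, spool directory, and HDFS path are illustrative assumptions:

    # flume.conf (properties file) — spool a directory of generated files into HDFS
    a1.sources  = src1
    a1.channels = ch1
    a1.sinks    = snk1
    a1.sources.src1.type     = spooldir
    a1.sources.src1.spoolDir = /var/log/app/incoming
    a1.sources.src1.channels = ch1
    a1.channels.ch1.type     = memory
    a1.sinks.snk1.type       = hdfs
    a1.sinks.snk1.hdfs.path  = hdfs://namenode01:8020/data/raw/applogs
    a1.sinks.snk1.channel    = ch1

    # start the agent against that file
    flume-ng agent --conf ./ --conf-file flume.conf --name a1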

Confidential, Richmond, VA

Hadoop Engineer

Responsibilities:

  • Identified and decided on the tools and utilities used to extract data into Hadoop.
  • Created MapReduce jobs for data transformation and data parsing.
  • Created Hive scripts for extracting summarized information from Hive tables.
  • Collaborated with cross-functional teams to gather requirements and ensure applications were properly configured and deployed.
  • Wrote Apache Pig Latin scripts to filter data in HDFS and load it into Hive tables (see the sketch below).
  • Installed and configured Sentry as a security layer for data access.
  • Developed UDFs to extend Apache Pig and Hive with client-specific data-filtering logic.
  • Experience in using Spark.
  • Wrote MapReduce jobs.
  • Wrote automation scripts using Puppet (see the second sketch below).
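
A minimal sketch of the Pig Latin filtering pattern; the input path, schema, and output location are illustrative assumptions:

    -- filter.pig: keep only valid records and store them where a Hive table can pick them up
    raw   = LOAD '/data/raw/events' USING PigStorage('\t')
            AS (user_id:long, url:chararray, status:int);
    valid = FILTER raw BY status == 200 AND user_id IS NOT NULL;
    STORE valid INTO '/apps/hive/warehouse/events_clean' USING PigStorage('\t');

    -- run with: pig -f filter.pig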

Environment: UNIX, Linux, CDH 4 distribution, Tableau, Hive, MapReduce, Puppet
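
And a minimal sketch of the Puppet automation; the package and service names are illustrative assumptions (they follow CDH-style naming):

    # hadoop_node.pp: ensure a worker has the DataNode package installed and running
    package { 'hadoop-hdfs-datanode':
      ensure => installed,
    }
    service { 'hadoop-hdfs-datanode':
      ensure  => running,
      enable  => true,
      require => Package['hadoop-hdfs-datanode'],
    }
    # apply with: puppet apply hadoop_node.pp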

Confidential, Lake Forest, Illinois

Hadoop Administrator/Engineer

Responsibilities:

  • Solid understanding of Hadoop HDFS, MapReduce, and other ecosystem projects.
  • Installed and configured the Hadoop cluster.
  • Worked with the Cloudera support team to fine-tune the cluster.
  • Worked closely with the SA team to ensure all hardware and software were properly set up for optimal use of resources.
  • Developed a custom file system plugin for Hadoop so it can access files on the Hitachi Data Platform.
  • The plugin allows Hadoop MapReduce programs, HBase, Pig, and Hive to work unmodified and access files directly.
  • The plugin also provides data locality for Hadoop across host nodes and virtual machines.
  • Wrote data ingesters and MapReduce programs.
  • Developed MapReduce jobs to analyze data and produce heuristics reports.
  • Good experience writing data ingesters and complex MapReduce jobs in Java for data cleaning and preprocessing, and fine-tuning them per data set.
  • Performed extensive data validation using Hive and wrote Hive UDFs.
  • Involved in creating Hive tables, loading them with data, and writing Hive queries that run internally as MapReduce jobs.
  • Extensive scripting (Python and shell) to provision and spin up virtualized Hadoop clusters.
  • Added, decommissioned, and rebalanced nodes (see the sketches below).
  • Created a POC to store server log data in Cassandra to identify system alert metrics.
  • Rack-aware configuration (see the second sketch below).
  • Configured client machines.
  • Configured monitoring and management tools.
  • HDFS support and maintenance.
  • Cluster HA setup.
  • Applied patches and performed version upgrades.
  • Incident management, problem management, and change management.
  • Performance management and reporting.
  • Recovered from NameNode failures.
  • Scheduled MapReduce jobs using the FIFO and Fair share schedulers.
  • Installed and configured other open-source software such as Pig, Hive, HBase, Flume, and Sqoop.
  • Integrated with RDBMS using Sqoop and JDBC connectors.
  • Worked with the dev team to tune jobs; knowledge of writing Hive jobs.
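
A minimal sketch of the decommission-and-rebalance flow; the hostname and excludes-file path are illustrative assumptions:

    # Drain a worker node gracefully, then rebalance the remaining nodes
    echo "datanode07.example.com" >> /etc/hadoop/conf/dfs.exclude   # file named by dfs.hosts.exclude
    hdfs dfsadmin -refreshNodes        # NameNode starts re-replicating the node's blocks
    hdfs dfsadmin -report              # watch for the node to reach "Decommissioned"
    hdfs balancer -threshold 10        # even out block distribution afterwards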

Environment: Windows 2000/2003, UNIX, Linux, Java, Apache HDFS, MapReduce, Pig, Hive, HBase, Flume, Sqoop, Cassandra, NoSQL
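
And a minimal sketch of the rack-aware configuration: a topology script of the kind HDFS invokes through net.topology.script.file.name; the subnets and rack names are illustrative assumptions:

    #!/bin/bash
    # topology.sh: HDFS passes one or more node addresses; print one rack per argument
    while [ $# -gt 0 ]; do
      case "$1" in
        10.1.1.*) echo "/dc1/rack1" ;;
        10.1.2.*) echo "/dc1/rack2" ;;
        *)        echo "/default-rack" ;;
      esac
      shift
    done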

Confidential, Northbrook, Illinois

SQL/Hadoop Developer

Responsibilities:

  • Developed on the Hadoop ecosystem: Hadoop, MapReduce, HBase, Sqoop, and Amazon Elastic MapReduce (EMR).
  • Developed a scalable, cost-effective, and fault-tolerant data warehouse system on the Amazon EC2 cloud.
  • Developed MapReduce/EMR jobs to analyze the data and produce heuristics and reports; the heuristics were used to improve campaign targeting and efficiency.
  • Because response times for web services built on a typical LAMP (PHP) stack were too slow, developed a high-performance, high-volume, highly scalable platform for real-time bidding.
  • Worked with the client to understand the extraction process and decide on the load strategy, i.e., whether they wanted historical data or the current view.
  • Wrote complex HQL queries to generate the data required for the final reports, and passed these HQL queries to Ruby programs that converted them into MapReduce programs.
  • Imported and exported data between HDFS and Hive using Sqoop.
  • Responsible for loading unstructured data into the Hadoop Distributed File System (HDFS).
  • Created and scheduled jobs for maintenance.
  • Configured Database Mail.
  • Monitored file growth.
  • Maintained operators, categories, alerts, notifications, jobs, and schedules.
  • Maintained database response times and proactively generated performance reports.
  • Automated most DBA tasks and monitoring stats.
  • Developed complex stored procedures, views, clustered/non-clustered indexes, triggers (DDL, DML, LOGON), and user-defined functions.
  • Created a mirrored database using database mirroring in high-performance mode.
  • Created database snapshots and stored procedures to load data from the snapshot database into the report database (see the sketch below).
  • Restored development and staging databases from production as required.
  • Involved in resolving deadlock and performance issues.
  • Performed query optimization and performance tuning for long-running queries, and created new indexes on tables for faster I/O.
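
A minimal sketch of the snapshot-to-report pattern; the database, logical file, path, and server names are illustrative assumptions:

    -- snapshot.sql: point-in-time snapshot feeding the report database
    CREATE DATABASE Sales_Snap_0600 ON
      ( NAME = Sales_Data, FILENAME = 'D:\Snapshots\Sales_Snap_0600.ss' )
    AS SNAPSHOT OF Sales;
    GO
    SELECT TOP 10 * FROM Sales_Snap_0600.dbo.Orders;
    GO
    -- run with: sqlcmd -S PRODSQL01 -E -i snapshot.sql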

Environment: MS SQL Server 2005/2000, Windows 2000/2003 Server, DTS, WebLogic, Red Hat Enterprise Linux, MS Access, XML, Hadoop, MapReduce, HBase, Sqoop, Amazon Elastic MapReduce, CDH, Cassandra, NoSQL, Teradata

Confidential, Cedar Rapids, Iowa

SQL Administrator

Responsibilities:

  • Installed and configured Linux-based systems.
  • Installed, configured, and maintained open-source Linux operating systems (CentOS, Debian, Fedora).
  • Monitored the health and stability of Linux and Windows system environments.
  • Diagnosed and resolved problems associated with DNS, DHCP, VPN, NFS, and Apache.
  • Scripting expertise including Bash, PHP, Perl, JavaScript, and UNIX shell.
  • Maintained and monitored replication by managing the profile parameters.
  • Implemented log shipping and database mirroring.
  • Used the BCP utility and BULK INSERT for bulk data operations (see the sketch below).
  • Automated and enhanced daily administrative tasks, including disk space management, backup, and recovery.
  • Used DTS and SSIS to import and export various forms of data.
  • Performed performance tuning, capacity planning, server partitioning, and database security configuration on a regular basis to maintain consistency.
  • Created alerts and notifications to report system errors.
  • Used SQL Server Profiler for troubleshooting, monitoring, and optimization of SQL Server.
  • Worked with developers to create stored procedures, triggers, and user-defined functions to handle complex business rules, data, and audit analysis.
  • Provided 24x7 on-call support.
  • Generated daily, weekly, and monthly reports.
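
A minimal sketch of a BCP export/import round trip; the server, database, and table names are illustrative assumptions:

    # Export a table to a flat file, then bulk-load it into another server
    bcp Sales.dbo.Orders out orders.dat -S PRODSQL01 -T -c
    bcp Staging.dbo.Orders in orders.dat -S DEVSQL01 -T -c -b 10000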

Confidential

SQL Server Admin

Responsibilities:

  • Set up SQL Server configuration settings.
  • Exported and imported data from other data sources, such as flat files, using DTS Import/Export.
  • Backed up, packaged, and distributed databases more efficiently using Red Gate tools.
  • Automated common tasks and extended application functionality using Red Gate tools.
  • Rebuilt indexes at regular intervals for better performance (see the sketch below).
  • Designed and implemented a comprehensive backup plan and disaster recovery strategies.
  • Involved in troubleshooting and fine-tuning databases for performance and concurrency.
  • Expertise and interests include administration, database design, performance analysis, and production support for very large (VLDB) and complex databases.
  • Monitored and improved performance using execution plans and index tuning.
  • Managed the clustered environment.
  • Used log shipping for database synchronization.
  • Implemented SQL logins, roles, and authentication modes as part of the security policies for various categories of user support.
  • Monitored SQL Server performance using Profiler to find performance problems and deadlocks.
  • Maintained database consistency by running DBCC checks at regular intervals.
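
A minimal sketch of that periodic maintenance; the database, table, and server names are illustrative assumptions (ALTER INDEX is SQL Server 2005+ syntax):

    -- maintenance.sql: rebuild indexes, then verify consistency
    ALTER INDEX ALL ON dbo.Orders REBUILD;
    GO
    DBCC CHECKDB ('Sales') WITH NO_INFOMSGS;
    GO
    -- schedule with: sqlcmd -S PRODSQL01 -E -i maintenance.sql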
