Hadoop Admin Resume

Irvine, CA

PROFESSIONAL SUMMARY:

  • 8+ years of experience in the IT industry, including roles as an MS SQL DBA and Big Data consultant for banking, telecom, and financial clients.
  • 3+ years of comprehensive experience as a Hadoop Consultant and Administrator working with HDFS, MapReduce, Hive, Pig, Sqoop, Flume, Spark, Kafka, Drill, Zookeeper, Avro, Oozie, and HBase.
  • Excellent understanding of Hadoop architecture and its components: HDFS, JobTracker, TaskTracker, NameNode, and DataNode (Hadoop 1.x); YARN concepts such as ResourceManager and NodeManager (Hadoop 2.x); and the Hadoop MapReduce programming paradigm.
  • Hands-on experience installing, configuring, and using Hadoop ecosystem components (MapReduce, HDFS, Spark, Oozie, Hive, Sqoop, Pig, HBase) on CDH 3-5.8 and HDP 1-2.3 clusters.
  • Good understanding of Kerberos and how it interacts with Hadoop and LDAP.
  • Practical knowledge of the function of each Hadoop daemon, the interactions between them, resource utilization, and dynamic tuning to keep the cluster available and efficient.
  • Strong expertise in MapReduce programming, Pig scripting, distributed applications, and HDFS.
  • Experience in handling Hadoop Cluster and monitoring the cluster using Cloudera Manager, Ambari, Nagios and Ganglia.
  • Experience managing and reviewing log files for Hadoop and its ecosystem tools.
  • Dealt with projects involving importing and exporting data between HDFS and relational database systems using Sqoop.
  • Hands on experience in writing shell scripts.
  • Experienced in deployment of Hadoop Cluster using Puppet tool.
  • Expert in setting up SSH, SCP, SFTP connectivity between UNIX hosts.
  • Experience with Hadoop shell commands and with verifying, managing, and reviewing Hadoop log files.
  • In depth knowledge of JobTracker, TaskTracker, NameNode, DataNodes and MapReduce concepts.
  • Strong analytical skills with ability to quickly understand client’s business needs and create specifications.
  • Hands-on experience installing and configuring the Capacity Scheduler on Hadoop clusters.
  • Experience with Knox and Ranger for authentication and authorization on Hadoop.
  • Application development using RDBMS and Linux shell scripting.
  • Experience performing major and minor upgrades of Hadoop clusters in Apache and Cloudera distributions.
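The shell-scripting experience mentioned above typically involves small helpers such as the following sketch, which builds a date-partitioned HDFS target path for a daily load. The function name and directory layout are illustrative, not taken from any specific project:

```shell
# Hedged sketch: helper for shell-scripted ingestion workflows.
# Builds an HDFS target path for a daily, date-partitioned load.
build_target_dir() {
  local table="$1"
  local load_date="$2"   # expected format: YYYY-MM-DD
  printf '/data/raw/%s/load_date=%s\n' "$table" "$load_date"
}

build_target_dir transactions 2016-01-15
# → /data/raw/transactions/load_date=2016-01-15
```

The resulting path would then be passed to, e.g., `hadoop fs -mkdir -p` or a Sqoop `--target-dir` before the nightly import runs.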

TECHNICAL SKILLS:

Big Data Ecosystem: Hadoop, MapReduce, HDFS, HBase, Cassandra, Zookeeper, Avro, Drill, Ambari, Spark, Hive, Pig, Sqoop, Oozie, Falcon and Flume.

J2EE Technologies: JSP 2.1, Servlets 2.3, JDBC, JMS, JNDI, JAXP, Java Beans

Operating Systems: DOS, Windows NT/2000/2003/2008/2012, Sun Solaris 2.6-8, Linux, and MacOS

Database: MS SQL Server 2000/2005/2008/2008 R2/2012/2014, Oracle 8/9/10, Access, Sybase, and IBM DB2

Database Tools: Profiler, DTS, Management Studio/Enterprise Manager, Query Analyzer

Web Servers: IIS 5.0/6.0, Apache Tomcat and Apache Http web server

Languages: T-SQL, C, C++, Java, VB.NET, VB 6.0, XML, ASP.NET, CGI, JSP, Unix shell scripting

Reporting Tools: SQL Server Reporting Services, Crystal Reports

Applications: Microsoft Office, Erwin, MS VISIO 2003/2005

Networking: TCP/IP, Named Pipes, VIA

Design Tools: Erwin, Visio 2000, Designer 2000, Developer 2000, DTS, OLAP, Crystal Reports, PeopleSoft SQR, Informatica, Ab Initio

ETL Tools: Ascential DataStage, Informatica PowerCenter / PowerMart, DecisionStream, DBArtisan

PROFESSIONAL EXPERIENCE:

Confidential, Irvine, CA

Hadoop Admin

Responsibilities:

  • Installed, configured, and maintained Hadoop clusters for application development, along with Hadoop tools such as Hive, Pig, HBase, Oozie, Flume, Zookeeper, and Sqoop.
  • Installed and upgraded Cloudera CDH on production clusters and Hortonworks HDP versions on test clusters.
  • Moved (redistributed) services from one host to another within the cluster to help secure the cluster and ensure high availability of services.
  • Installed and configured Hadoop, MapReduce, and HDFS (Hadoop Distributed File System); developed multiple MapReduce jobs in Java for data cleaning.
  • Worked on cluster installation, commissioning and decommissioning of DataNodes, NameNode recovery, capacity planning, and slot configuration.
  • Implemented NameNode backup using NFS for high availability.
  • Used Kerberos 5 (1.6.1) on CentOS 6 with CDH 5.
  • Installed Oozie workflow engine to run multiple Hive and Pig Jobs.
  • Used Sqoop to import and export data from HDFS to RDBMS and vice-versa.
  • Created Hive tables and involved in data loading and writing Hive UDFs.
  • Used Hive to analyze the partitioned and bucketed data and compute various metrics for reporting.
  • Worked on HBase, Hive, and Impala.
  • Automated workflows using shell scripts to pull data from various databases into Hadoop.
  • Deployed Hadoop Cluster in Fully Distributed and Pseudo-distributed modes.
  • Used Nagios and Ganglia as monitoring tools.
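The Sqoop import/export work described above usually comes down to invocations along these lines. This is a hedged sketch: the connection string, credentials, table names, and directories are illustrative placeholders, not values from the engagement:

```shell
# Hedged sketch: import an RDBMS table into HDFS with Sqoop.
# -P prompts for the password instead of putting it on the command line.
sqoop import \
  --connect jdbc:mysql://db.example.com:3306/sales \
  --username etl_user -P \
  --table transactions \
  --target-dir /data/raw/transactions \
  --num-mappers 4 \
  --fields-terminated-by '\t'

# The reverse direction exports analyzed results back to the RDBMS:
sqoop export \
  --connect jdbc:mysql://db.example.com:3306/sales \
  --username etl_user -P \
  --table transaction_summary \
  --export-dir /data/processed/transaction_summary
```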

Environment: Hadoop, MapReduce, Hive, HDFS, PIG, Sqoop, Oozie, Flume, HBase, Zookeeper, Ranger, Knox, MongoDB, Cassandra, Oracle, NoSQL and Unix/Linux.

Confidential, Mason, OH

Hadoop Administrator

Responsibilities:

  • Installed and configured a CDH cluster, using Cloudera Manager for easy management of the existing Hadoop cluster.
  • Extensively used Cloudera Manager to manage multiple clusters with petabytes of data.
  • Implemented Oracle Big Data Appliance for the production environment.
  • Set up machines with network controls, static IPs, disabled firewalls, and configured swap memory.
  • Regularly commissioned and decommissioned nodes depending on the volume of data.
  • Worked on setting up high availability for major production cluster.
  • Performed Hadoop version updates using automation tools.
  • Implemented rack aware topology on the Hadoop cluster.
  • Importing and exporting structured data from different relational databases into HDFS and Hive using Sqoop.
  • Configured Flume for efficiently collecting, aggregating and moving large amounts of log data from many different sources to HDFS.
  • Managed load balancers, firewalls in a production environment.
  • Involved in collecting and aggregating large amounts of streaming data into HDFS using Flume and defined channel selectors to multiplex data into different sinks.
  • Exported the analyzed data to the relational databases using Sqoop for visualization and to generate reports for the BI team.
  • Gained solid experience troubleshooting production-level issues in the cluster and its functionality.
  • Backed up data on a regular basis to a remote cluster using DistCp.
  • Managed and scheduled jobs on a Hadoop cluster.
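The regular remote-cluster backups with DistCp mentioned above reduce to a single command of roughly this shape; the NameNode hostnames and directory paths below are illustrative:

```shell
# Hedged sketch: mirror a warehouse directory to a DR cluster with DistCp.
# -update copies only files that changed at the source;
# -delete removes files on the target that no longer exist at the source;
# -m caps the number of map tasks doing the copy.
hadoop distcp \
  -update -delete \
  -m 20 \
  hdfs://prod-nn:8020/data/warehouse \
  hdfs://dr-nn:8020/backups/warehouse
```

In practice a command like this would be wrapped in a shell script and scheduled (e.g., via cron or Oozie) for nightly runs.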

Environment: Hadoop, MapReduce, Hive, HDFS, PIG, Sqoop, Oozie, Flume, HBase, Zookeeper, Cloudera Distributed Hadoop, Cloudera Manager.

Confidential, San Diego, CA

Hadoop Administrator/ SQL Server DBA

Responsibilities:

  • Monitored workload, job performance and capacity planning using Cloudera Manager.
  • Analyzed system failures, identified root causes, and recommended courses of action.
  • Imported logs from web servers with Flume to ingest the data into HDFS.
  • Retrieved data from HDFS into relational databases with Sqoop; parsed, cleansed, and mined useful data in HDFS using MapReduce for further analysis.
  • Partitioned and queried the data in Hive for further analysis by the BI team.
  • Extended the functionality of Hive and Pig with custom UDFs and UDAFs.
  • Involved in extracting the data from various sources into Hadoop HDFS for processing.
  • Effectively used Sqoop to transfer data between databases and HDFS.
  • Designed and implemented Pig UDFs for evaluating, filtering, loading, and storing data.
  • Used the Oozie workflow engine to manage interdependent Hadoop jobs and to automate several recurring data-processing workflows.
  • Installed and configured SQL Server 2008 R2/2012 on DEV, TEST, and PRODUCTION servers.
  • Migrated SQL Server 2005 to 2008, SQL Server 2008 to 2012, and DTS to SSIS (ETL).
  • Scheduled jobs and monitored job failures to diagnose problems on a daily basis.
  • Experience with AlwaysOn Availability Groups, transactional replication, clustering, and log shipping.
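Partitioning Hive data for the BI team, as described above, typically looks like the following sketch run through the Hive CLI. The table, columns, storage format, and dates are illustrative, not from the actual project:

```shell
# Hedged sketch: create a date-partitioned Hive table and query one partition.
hive -e "
CREATE TABLE IF NOT EXISTS weblogs (
  ip STRING,
  url STRING,
  response_ms INT
)
PARTITIONED BY (log_date STRING)
STORED AS ORC;

-- BI queries then prune to a single partition instead of scanning everything:
SELECT url, AVG(response_ms)
FROM weblogs
WHERE log_date = '2016-01-15'
GROUP BY url;
"
```

Because `log_date` is a partition column, the WHERE clause lets Hive read only that day's directory, which is the main win of partitioning for recurring BI reports.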

Environment: Hadoop, HDFS, MapReduce, Hive, Oozie, Java (jdk1.6), Cloudera, MySQL, Windows Server 2008 R2/2012/2012 R2, MS SQL Server 2012/2008 R2, SSIS, SSRS, SSAS, Erwin, Visual Basic 6.0, SQL Azure, Crystal Reports and Ganglia.

Confidential

SQL Server DBA

Responsibilities:

  • Responsible for logical and physical database design for tables in SQL Server 2008/2008 R2/2005 databases.
  • Tuned databases to perform efficiently by growing or shrinking database files as needed, and monitored and managed the transaction log size using DBCC SHRINKFILE.
  • Managed databases on multiple disks using Disk Mirroring and RAID technology.
  • Evaluated data storage considerations to store databases and transaction logs.
  • Created database stored procedures and functions using SQL Server Management Studio.
  • Performed Extract, Transform, and Load (ETL) of data between source and destination servers by means of SSIS/DTS packages and workflows.
  • Used log shipping for synchronization of databases and supported Active/Passive clusters.
  • Maintained the database consistency with DBCC at regular intervals. Involved in troubleshooting and fine-tuning of databases for its performance and concurrency.
  • Completed documentation about the database. Analyzed long running, slow queries and tuned the same to optimize application and system performance.
  • Monitored event and server error logs for troubleshooting.
  • Setup SQL Server configuration settings and transactional replication on production SQL servers.
  • Performed daily backups and created named backup devices to back up the servers regularly.
  • Generated Script files of the databases whenever changes were made to stored procedures or views.
  • Scheduled tasks for transformation of data from heterogeneous environment.
  • Worked on DTS and BCP Import and Export utility for transferring data.
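The transaction-log maintenance described earlier (monitoring log growth, then shrinking with DBCC SHRINKFILE) can be sketched with sqlcmd. The server, database, logical file name, and backup path below are illustrative placeholders:

```shell
# Hedged sketch: check log usage, back up the log, then shrink the log file.
# DBCC SQLPERF(LOGSPACE) reports log size and percent used per database;
# backing up the log first frees inactive log space so the shrink can succeed.
sqlcmd -S PRODSQL01 -d SalesDB -Q "
DBCC SQLPERF(LOGSPACE);
BACKUP LOG SalesDB TO DISK = 'E:\Backups\SalesDB_log.bak';
DBCC SHRINKFILE (SalesDB_Log, 512);
"
```

Routine shrinking is usually avoided on healthy databases; a sequence like this is for one-off recovery from unexpected log growth.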

Environment: Windows Server 2008/2008 R2, MS SQL Server 2008/2008 R2, Visual Basic 6.0, .NET, PL/SQL, VB, ASP.

Confidential

Jr. SQL Server DBA/ Developer

Responsibilities:

  • Managed production SQL Servers in virtual and physical environments.
  • Installed and configured SQL Server 2005/2008.
  • Supported the business 24/7, maintaining 99.999% uptime depending on the application and business SLA requirements.
  • Involved in upgrading SQL Server 2000 instances to SQL Server 2008.
  • Supported the configuration, deployment and administration of vendor and in-house applications interfacing with Oracle and SQL Server databases.
  • Supported installation of Oracle Applications R12/11i in single-node and multi-node configurations.
  • Involved in database design, database standards, and T-SQL code reviews.
  • Configured Active/Active and Active/passive SQL Server Clusters.
  • Implemented Mirroring and Log Shipping for disaster recovery.
  • Used Send Mail, Bulk Insert, Execute SQL, Data Flow, and Import/Export controls extensively in SSIS.
  • Performed multi-file imports, package configuration, and debugging of tasks and scripts in SSIS.
  • Scheduled package execution in SSIS.
  • Installed packages on multiple servers via SSIS.
  • Worked extensively with SSIS for data manipulation, data extraction, and ETL loads.
  • Created extensive reports using SSRS (Tabular, Matrix).
  • Configured transactional and snapshot replication and managed publications, articles.
  • Proactively involved in SQL Server Performance tuning at Database level, T-SQL level and operating system level. Maintained database response times and proactively generated performance reports.
  • Automated most of the DBA Tasks and monitoring stats.
  • Responsible for SQL Server Edition upgrades and SQL Server patch management.
  • Created a mirrored database for reporting using Database Mirroring with high performance mode.
  • Created database snapshots and stored procedures to load data from the snapshot database to the report database.
  • Involved in data modeling for the application and created ER diagrams using Erwin and VISIO.
  • Created Schemas, Logins, Tables, Clustered and Non-Clustered Indexes, Views, Functions and Stored Procedures.
  • Troubleshot and supported all user problems using SQL Server Profiler, network traces, Windows logs, etc.
  • Migrated DTS Packages to SSIS Package.
  • Monitored event and server error logs for troubleshooting.

Environment: MS SQL Server 2000/2005/2008, Windows 2003/2008, Enterprise Manager, Query Analyzer, SQL Server Profiler, SSIS, SSRS, Erwin, VISIO.
