Hadoop Admin Resume

Chicago, IL

SUMMARY:

  • 8+ years of IT experience in the analysis, design, development, implementation, and testing of enterprise-wide applications, Data Warehouse, Client/Server technologies, and Web-based applications.
  • Expertise with Apache Hadoop components such as HDFS, MapReduce, Hive, HBase, Pig, Sqoop, Nagios, Spark, Impala, Oozie, and Flume for Big Data and Big Data Analytics.
  • Experienced in administrative tasks such as Hadoop installation in pseudo-distributed mode, multi-node cluster setup, and installation of Apache Ambari on Hortonworks Data Platform (HDP 2.5).
  • Installation, configuration, supporting and managing Hortonworks Hadoop cluster.
  • In-depth understanding of Hadoop architecture and its components, such as HDFS, NameNode, JobTracker, DataNode, TaskTracker, and MapReduce concepts.
  • Experience in installation, configuration, support and management of a Hadoop Cluster.
  • Experience in task automation using Oozie, cluster coordination through Pentaho, and MapReduce job scheduling using the Fair Scheduler.
  • Experience in analyzing data using HiveQL, Pig Latin and custom Map Reduce programs in Java.
  • Experience in writing custom UDF's to extend Hive and Pig core functionality.
  • Experienced in managing and reviewing Hadoop log files.
  • Worked with Sqoop to move (import/export) data from a relational database into Hadoop and used FLUME to collect data and populate Hadoop.
  • Worked with HBase to conduct quick look ups (updates, inserts and deletes) in Hadoop.
  • Experience in working with cloud infrastructure like Amazon Web Services (AWS) and Rackspace.
  • Experience in storing and managing data with the HCatalog data model.
  • Experience in writing SQL queries to perform joins on Hive tables and NoSQL databases.
  • Worked with transform components such as Aggregate, Router, Sort, Filter by Expression, Join, Normalize, and Scan; created the appropriate DMLs; and automated load processes using Autosys.
  • Extensively worked on ETL assignments to extract, transform, and load data into tables as part of Data Warehouse development with highly complex data models using Relational, Star, and Snowflake schemas.
  • Experienced in all phases of Software Development Life Cycle (SDLC).
  • Experience in Data Modeling, Data Extraction, Data Migration, Data Integration, Data Testing, and Data Warehousing using Ab Initio.
  • Configured the Informatica environment to connect to different databases using DB Config, Input Table, Output Table, and Update Table components.
  • Performed systems analysis for several information systems documenting and identifying performance and administrative bottlenecks.
  • Good understanding of Big Data and experience in developing predictive analytics applications using open source technologies.
  • Interested in Applied analytics and real-time web mining techniques.
  • Good understanding and extensive work experience on SQL and PL/SQL.
  • Knowledge on working with Pentaho Data Integration.
  • Able to interact effectively with members of Business Engineering, Quality Assurance, user groups, and other teams involved in the System Development Life Cycle.
  • Excellent communication skills in interacting with people at all levels across projects, while also playing an active role in Business Analysis.

WORK EXPERIENCE:

Hadoop Admin

Confidential, Chicago, IL

Responsibilities:

  • Worked on importing and exporting data from Oracle and DB2 into HDFS and HIVE using Sqoop.
  • Worked on installing the cluster, commissioning & decommissioning of DataNodes, NameNode recovery, capacity planning, Cassandra, and slots configuration.
  • Installed and managed a 50+ node Hadoop production cluster with 10 PB of storage on the HDP distribution, using Ambari 1.7 and HDP 2.1.3.
  • Upgraded the production cluster from Ambari 1.7 to 2.1 and HDP 2.1 to 2.2.6.
  • Installed, configured, and administered a small Hadoop cluster of 10 nodes.
  • Monitored the cluster for performance, networking, and data integrity issues.
  • Responsible for troubleshooting issues in the execution of MapReduce jobs by inspecting and reviewing log files.
  • Formulated procedures for installation of Hadoop patches, updates and version upgrades.
  • Installed/configured/maintained Apache Hadoop clusters for application development, along with Hadoop tools like Hive, Pig, HBase, Zookeeper, and Sqoop.
  • Responsible for developing a data pipeline using HDInsight, Flume, Sqoop, and Pig to extract data from weblogs and store it in HDFS.
  • Involved in migrating ETL processes from Oracle to Hive to simplify data manipulation.
  • Managed log files, backups and capacity.
  • Found and troubleshot Hadoop errors
  • Created Ambari Views for Tez, Hive and HDFS.
  • Architected and designed a 30-node Hadoop innovation cluster with Sqrrl, Spark, Puppet, and HDP 2.2.4.
  • Worked with data delivery teams to set up new Hadoop users, including creating Linux accounts, setting up Kerberos principals, and testing HDFS and Hive access.
  • Managed 350+ node HDP 2.2.4 clusters holding 4 PB of data, using Ambari 2.0 and CentOS 6.5.
  • Complete end-to-end design and development of an Apache NiFi flow that acts as the agent between the middleware and EBI teams and executes all the actions mentioned above.
  • Created 25+ Linux Bash scripts for users, groups, data distribution, capacity planning, and system monitoring.
  • Upgraded the Hadoop cluster from CDH4.7 to CDH5.2.
  • Supported MapReduce Programs and distributed applications running on the Hadoop cluster.
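The Sqoop-based imports from Oracle into Hive mentioned above typically follow this pattern. A minimal sketch, not runnable without a live cluster; the host, SID, schema, table, and user names are illustrative placeholders, not values from any actual engagement:

```shell
# Hypothetical Sqoop import from Oracle into a Hive table.
# Connection string, credentials, and table names are placeholders.
sqoop import \
  --connect jdbc:oracle:thin:@//oracle-host:1521/ORCL \
  --username etl_user -P \
  --table SALES.ORDERS \
  --hive-import --hive-table staging.orders \
  --split-by ORDER_ID \
  --num-mappers 4
```

`--split-by` names the column Sqoop uses to partition work across the mappers; a sequential numeric key keeps the four map tasks roughly balanced.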

Environment: Hive, Pig, HBase, Apache NiFi, PL/SQL, Java, Unix shell scripting, Sqoop, ETL, Ambari 2.0, CentOS, MongoDB, Cassandra, Ganglia, and Cloudera Manager.

Hadoop Admin

Confidential, Grand Rapids, MI

Responsibilities:

  • Analyzed Hadoop cluster and other big data analysis tools including Pig
  • Implemented multiple nodes on a CDH3 Hadoop cluster on Red Hat Linux
  • Built a scalable distributed data solution
  • Imported data from Linux file system to HDFS
  • Transmitted data from SQL databases to HBase using Sqoop
  • Worked with a team to successfully tune the performance of Pig queries
  • Excelled in managing and reviewing Hadoop log files
  • Worked on evaluating, architecting, installation/setup of Hortonworks 2.1/1.8 Big Data ecosystem which includes Apache Hadoop HDFS, Pig, Hive and Sqoop.
  • Used Apache Spark API over Hortonworks Hadoop YARN cluster to perform analytics on data in Hive.
  • Fitted Oozie to run multiple Hive and Pig jobs
  • Maintained and backed up metadata
  • Used data integration tools like Flume and Sqoop
  • Set up automated processes to analyze the system and find errors
  • Supported IT department in cluster hardware upgrades
  • Contributed to building hands-on tutorials for the community on using Hortonworks Data Platform (powered by Hadoop) and Hortonworks DataFlow (powered by NiFi), covering categories such as real-world use cases and operations.
  • Installed, upgraded, and managed Hadoop clusters on Hortonworks
  • Setup, configured, and managed security for the Cloudera Hadoop cluster.
  • Loaded log data into HDFS using Flume
  • Created multi-cluster test to test the system's performance and failover
  • Improved a high-performance cache, leading to a greater stability and improved performance.
  • Responsible for cluster maintenance; monitoring; commissioning and decommissioning DataNodes; troubleshooting; and managing and reviewing data backups and log files
  • Responsible for designing and developing business components using Java.
  • Created Java classes and interfaces to implement the system.
  • Designed and developed automation test scripts using Python.
  • Azure Cloud Infrastructure design and implementation utilizing ARM templates.
  • Involved in converting Hive/SQL queries into Spark transformations using Spark RDDs, Python and Scala.
  • Built a scalable Hadoop cluster for data solution
  • Responsible for maintenance and creation of nodes
  • Worked with other teams to decide the hardware configuration
  • Implemented cluster high availability
  • Scheduled jobs using Fair Scheduler
  • Helped design scalable Big Data clusters and solutions.
  • Commissioning and Decommissioning Nodes from time to time
  • Hands-on experience with cluster upgrades and patch upgrades without data loss, backed by proper backup plans.
  • Configured alerts to find possible errors
  • Handled patches and updates
  • Worked with developers to set up a full Hadoop system on AWS
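Commissioning and decommissioning nodes, as described above, is usually a matter of editing the HDFS exclude file and asking the NameNode to re-read it. A sketch of the decommission side, assuming a typical cluster layout; the hostname and config path are illustrative and the commands need a live cluster:

```shell
# Hypothetical decommission of one DataNode (hostname and path are placeholders).
echo "dn42.example.com" >> /etc/hadoop/conf/dfs.exclude   # mark node for retirement
hdfs dfsadmin -refreshNodes                               # NameNode re-reads the exclude list
hdfs dfsadmin -report | grep -A 3 "dn42.example.com"      # wait for "Decommissioned" status
```

HDFS re-replicates the node's blocks elsewhere before reporting it decommissioned, which is what makes the process safe to run without data loss.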

Environment: HDFS, CDH3, CDH4, HBase, NoSQL, RHEL 4/5/6, Hive, Pig, Perl scripting, AWS (S3, EC2), Sqoop, Shell scripting, Ubuntu, Red Hat Linux

Hadoop Admin

Confidential, Dallas, TX

Responsibilities:

  • Installed, configured, and administered all UNIX/Linux servers, including the design and selection of relevant hardware to support installations/upgrades of Red Hat (5/6), CentOS 5/6, and Ubuntu operating systems.
  • Network traffic control, IPsec, QoS, VLAN, proxy, and RADIUS integration on Cisco hardware via Red Hat Linux software.
  • Responsible for managing Chef client nodes and uploading cookbooks to the Chef server from the workstation
  • Worked in an Agile/Scrum environment, using Jenkins and GitHub for Continuous Integration and Deployment
  • Provisioning, building, and support of Linux servers, both physical and virtual, using VMware for Production, QA, and Development environments.
  • Troubleshooting; managing and reviewing data backups and Hadoop log files.
  • Deployed Datalake cluster with Hortonworks Ambari on AWS using EC2 and S3.
  • Hands-on experience in installing and configuring Cloudera, MapR, and Hortonworks clusters and using Hadoop ecosystem components like Pig, Hive, HBase, Sqoop, Kafka, Oozie, Flume, and Zookeeper
  • Expertise with the Hortonworks Hadoop platform (HDFS, Hive, Oozie, Sqoop, YARN)
  • Responsible for reviewing all open tickets, resolve and close any existing tickets.
  • Document solutions for any issues that have not been discovered previously.
  • Performing Linux systems administration on production and development servers (Red Hat Linux, CentOS and other UNIX utilities)
  • Installing Patches and packages on Unix/Linux Servers.
  • Installation, configuration, upgrade, and administration of Sun Solaris and Red Hat Linux.
  • Installation and Configuration of VMware vSphere client, Virtual Server creation and resource allocation.
  • Performance Tuning, Client/Server Connectivity and Database Consistency Checks using different Utilities.
  • Worked on importing and exporting data from Oracle and DB2 into HDFS and HIVE using Sqoop.
  • Worked on installing the cluster, commissioning & decommissioning of DataNodes, NameNode recovery, capacity planning, Cassandra, and slots configuration.
  • Installed, configured, and administered a small Hadoop cluster of 10 nodes. Monitored the cluster for performance, networking, and data integrity issues.
  • Responsible for troubleshooting issues in the execution of MapReduce jobs by inspecting and reviewing log files.
  • Formulated procedures for installation of Hadoop patches, updates and version upgrades.
  • Shell scripting for Linux/Unix Systems Administration and related tasks. Point of Contact for Vendor escalation.
  • Automated administration tasks through scripting and job scheduling using cron.
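A cron-driven health check of the kind described above might look like this minimal bash sketch; the 80% threshold, the alert format, and the script path in the usage note are assumptions for illustration, not details from any actual deployment:

```shell
#!/usr/bin/env bash
# Minimal disk-usage alert a cron job might run on each node.
# The default threshold (percent) is an assumed value; override via $1.
THRESHOLD=${1:-80}

check_usage() {
  # $1 = used percentage as a bare integer (e.g. parsed from `df` output)
  if [ "$1" -ge "$THRESHOLD" ]; then
    echo "ALERT: usage ${1}% >= ${THRESHOLD}%"
  else
    echo "OK: usage ${1}%"
  fi
}

check_usage 91   # prints an ALERT line
check_usage 42   # prints an OK line
```

A crontab entry such as `*/15 * * * * /usr/local/bin/disk_check.sh` (hypothetical path) would run it every 15 minutes, with output mailed or logged per the local cron setup.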

Hadoop Admin

Confidential, Atlanta, GA

Responsibilities:

  • Tested raw data and executed performance scripts
  • Shared responsibility for administration of Hadoop, Hive and Pig
  • Aided in developing Pig scripts to report data for the analysis
  • Moved data between HDFS and RDBMS using Sqoop
  • Analyzed MapReduce jobs for data coordination
  • Setup, configured, and managed security for the Cloudera Hadoop cluster.
  • Built a scalable Hadoop cluster for data solution.
  • Responsible for maintenance and creation of nodes.
  • Managed log files, backups and capacity
  • Found and troubleshot Hadoop errors
  • Experience in installing Hadoop clusters using different distributions of Apache Hadoop (Cloudera, Hortonworks). Capable of processing large sets of structured, semi-structured, and unstructured data and supporting systems application architecture.
  • Deployment and management of multi-node HDP (Hortonworks) clusters.
  • Worked with other teams to decide the hardware configuration
  • Implemented cluster high availability
  • Worked with management to determine the optimal way to report on data sets
  • Installed, configured, and monitored Hadoop clusters using Cloudera
  • Installed, upgraded, and patched ecosystem products using Cloudera Manager
  • Balanced and tuned HDFS, Hive, Impala, MapReduce, and Oozie workflows
  • Scheduled jobs using Fair Scheduler
  • Configured alerts to find possible errors
  • Handled patches and updates.
  • Diligently teaming with the infrastructure, network, database, application, and business intelligence teams to guarantee high data quality and availability.
  • Configured Fair scheduler to share the resources of the cluster.
  • Experience designing data queries against data in the HDFS environment using tools such as Apache Hive.
  • Imported data from MySQL server to HDFS using Sqoop.
  • Manage the day-to-day operations of the cluster for backup and support.
  • Implemented Hive custom UDF's to integrate the Weather and geographical data with business data to achieve comprehensive data analysis.
  • Installed and configured Hadoop components on large multi-node, fully distributed Hadoop clusters.
  • Monitored the cluster for performance, networking, and data integrity issues.
  • Responsible for troubleshooting issues in the execution of MapReduce jobs by inspecting and reviewing log files
  • Used Oozie scripts for deployment of the application and Perforce as the secure versioning software.
  • Performed unit testing, system testing and integration testing.
  • Provided input to the documentation team.
  • Scripting Hadoop package installation and configuration to support fully-automated deployments.
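Integrating a custom Hive UDF, as in the weather/geo enrichment above, generally follows this registration pattern. A hedged sketch only: the jar path, class name, function name, and table/column names are hypothetical placeholders, and the command needs a live Hive installation:

```shell
# Hypothetical registration and use of a custom Hive UDF.
# Jar path, class, function, and table names are illustrative only.
hive -e "
  ADD JAR /opt/udfs/geo-enrich.jar;
  CREATE TEMPORARY FUNCTION geo_lookup AS 'com.example.udf.GeoLookup';
  SELECT geo_lookup(store_id) AS region, SUM(sales) AS total_sales
  FROM business_data
  GROUP BY geo_lookup(store_id);
"
```

`CREATE TEMPORARY FUNCTION` scopes the UDF to the session; a permanent function would instead be created once per database so every analyst's query can use it.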

Environment: Linux, Map Reduce, HDFS, Hive, Pig, Shell Scripting

Hadoop Admin

Confidential, New York, NY

Responsibilities:

  • Involved in installing, configuring and using Hadoop Ecosystems (Hortonworks).
  • Involved in Importing and exporting data into HDFS and Hive using Sqoop.
  • Experienced in managing and reviewing Hadoop log files.
  • Installation, configuration, and administration experience with Big Data platforms (Hortonworks Ambari, Apache Hadoop) on Red Hat and CentOS as data storage, retrieval, and processing systems
  • Involved in development/implementation of Ubuntu Hadoop environment.
  • Responsible for managing data coming from different sources.
  • Supported MapReduce programs running on the cluster.
  • Involved in creating Hive tables, loading them with data, and writing Hive queries, which run internally as MapReduce jobs.
  • Used AWS remote computing services such as S3, EC2.
  • Involved in upgrading Hadoop Cluster from HDP 1.3 to HDP 2.0.
  • Involved in loading data from UNIX file system to HDFS.
  • Tested raw data and executed performance scripts.
  • Shared responsibility for administration of Hadoop, Hive and Pig.
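Reviewing Hadoop log files, as mentioned above, often starts with pulling error-level lines out of a job log. A minimal sketch run against a synthetic sample log (the log contents and path are fabricated purely for illustration):

```shell
# Create a small synthetic job log, then extract ERROR/FATAL lines from it.
cat > /tmp/sample-job.log <<'EOF'
INFO mapreduce.Job: map 100% reduce 67%
WARN hdfs.DFSClient: Slow ReadProcessor read
ERROR mapred.Task: java.lang.OutOfMemoryError: Java heap space
FATAL yarn.YarnUncaughtExceptionHandler: Thread terminated
EOF

# Error-level lines are the usual starting point for triage.
grep -E '^(ERROR|FATAL)' /tmp/sample-job.log
```

On real clusters the same filter is typically applied across `/var/log/hadoop*/` or via `yarn logs -applicationId <id>` rather than a single file.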

Environment: HDFS, Hive, Sqoop, Zookeeper, HBase, Unix, Linux, Java, MapReduce, Pig, Flume, Kafka, Shell scripting.
