Hadoop Admin Resume
PA
SUMMARY
- 9+ years of IT experience as a Big Data Admin, Developer, Designer and Quality Tester, with cross-platform integration experience using Hadoop, Java, J2EE and software functional testing.
- Hands-on experience with Hadoop, HDFS, MapReduce and Hadoop ecosystem tools such as Pig, Hive, Impala, Oozie, Zookeeper, Sqoop, Flume, Spark, Kafka and HBase.
- Hands-on experience with the Cloudera and Hortonworks Hadoop distributions.
- Hands-on experience installing and configuring Apache Hadoop ecosystems using Cloudera Manager, Apache Ambari, Puppet and Chef.
- Strong understanding of Hadoop services and the MapReduce and YARN architectures.
- Responsible for writing MapReduce programs.
- Experienced in importing and exporting data into HDFS using Sqoop.
- Loaded log data into HDFS using Flume.
- Experienced in loading data into Hive partitions and creating buckets in Hive.
- Implemented logical data models for, and interacted with, HBase.
- Developed MapReduce jobs to automate data transfer from HBase.
- Wrote MapReduce programs using Hadoop, Pig, Hive and Scala.
- Expertise in analysis using Pig, Hive and MapReduce.
- Worked on installation and configuration across multiple environments.
- Experienced in developing UDFs for Hive using Java.
- Strong understanding of NoSQL databases such as HBase, MongoDB and Cassandra.
- Familiar with handling complex data processing workflows using Oozie.
- Scheduled all Hadoop/Hive/Sqoop/HBase jobs using Oozie.
- Experienced in SQL; worked on databases including Oracle, IBM DB2, MySQL and MongoDB.
- Quick learner with fluent communication and productive interpersonal skills, able to understand and meet group requirements efficiently.
- Dedicated to successful project completion with the ability to work in a team or as an individual, and as a liaison between different teams
- Experienced in setting up clusters on Amazon EC2 and S3, including automating cluster setup and extension in the AWS cloud.
- Developed core modules in large cross-platform applications using Java, J2EE, Spring, Struts, Hibernate, JAX-WS Web Services and JMS.
- Worked with debugging tools such as DTrace, truss and top. Expert in setting up SSH, SCP and SFTP connectivity between UNIX hosts.
- Good understanding of Scrum methodologies, Test Driven Development and continuous integration.
- Major strengths: familiarity with multiple software systems; ability to learn new technologies quickly and adapt to new environments; self-motivated, focused and adaptive team player with excellent interpersonal, technical and communication skills.
- Experienced in defining detailed application software test plans, including organization, participants, schedule, and test and application coverage scope.
- Experienced in gathering and defining functional and user interface requirements for software applications.
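The Sqoop import/export work summarized above typically takes the form below; a minimal sketch in which the host, database, table and target directory are hypothetical placeholders, not systems from this resume:

```shell
# Hypothetical example: import an "orders" table from MySQL into HDFS.
# All connection details and paths are illustrative placeholders.
sqoop import \
  --connect jdbc:mysql://db-host:3306/sales \
  --username etl_user -P \
  --table orders \
  --target-dir /data/raw/orders \
  --num-mappers 4 \
  --fields-terminated-by '\t'
```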
TECHNICAL SKILLS
Hadoop/Big Data: Hadoop, MapReduce, HDFS, Zookeeper, Kafka, Hive, Pig, Sqoop, Oozie, Flume, YARN, HBase, Spark with Scala
NoSQL Databases: HBase, Cassandra, MongoDB
Languages: C, C++, Java, J2EE, PL/SQL, Pig Latin, HiveQL, UNIX shell scripts
Java/J2EE Technologies: Applets, Swing, JDBC, JNDI, JSON, JSTL, RMI, JMS, JavaScript, JSP, Servlets, EJB, JSF, jQuery
Frameworks: MVC, Struts, Spring, Hibernate
Operating Systems: Sun Solaris, HP-UX, Red Hat Linux, Ubuntu Linux and Windows XP/Vista/7/8
Web Technologies: HTML, DHTML, XML, AJAX, WSDL, SOAP
Web/Application servers: Apache Tomcat, WebLogic, JBoss
Databases: Oracle 9i/10g/11g, DB2, SQL Server, MySQL, Teradata
Tools and IDEs: Eclipse, NetBeans, Toad, Maven, Ant, Hudson, Sonar, JDeveloper, Assent PMD, DB Visualizer
Network Protocols: TCP/IP, UDP, HTTP, DNS, DHCP
PROFESSIONAL EXPERIENCE
Confidential, PA
Hadoop Admin
Responsibilities:
- Worked on analyzing the Hadoop cluster and different big data analytic tools including Pig, the HBase database and Sqoop.
- Responsible for building scalable distributed data solutions using Hadoop.
- Involved in loading data from LINUX file system to HDFS.
- Performed architecture design, data modeling and implementation of the Big Data platform and analytic applications for the consumer products.
- Analyzed the latest Big Data analytic technologies and their innovative applications in both business intelligence analysis and new service offerings.
- Worked on cluster installation, commissioning and decommissioning of DataNodes, NameNode recovery, capacity planning and slots configuration.
- Implemented test scripts to support test driven development and continuous integration.
- Worked on tuning the performance of MapReduce jobs.
- Implemented MapReduce using Hadoop, Pig, Hive and Scala.
- Responsible for managing data coming from different sources.
- Loaded and transformed large sets of structured, semi-structured and unstructured data.
- Experience in managing and reviewing Hadoop log files.
- Managed jobs using the Fair Scheduler.
- Exported the analyzed data to relational databases using Sqoop for visualization and to generate reports for the BI team.
- Responsible for maintaining Content Management System on daily basis.
- Responsible for cluster maintenance, adding and removing cluster nodes, cluster monitoring and troubleshooting, manage and review data backups, manage and review Hadoop log files.
- Created Oozie workflows to run multiple MapReduce, Hive and Pig jobs.
- Supported in setting up QA environment and updating configurations for implementing scripts with Pig and Sqoop.
Environment: Hadoop, HDFS, Pig, Sqoop, HBase, Shell Scripting, Ubuntu, Red Hat Linux, Spark, Scala
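The DataNode commissioning/decommissioning work above usually follows the standard HDFS exclude-file procedure; a sketch, assuming `dfs.hosts.exclude` in hdfs-site.xml points at the exclude file shown (the hostname and path are placeholders):

```shell
# Sketch of decommissioning a DataNode; hostname and file path are placeholders.
# Assumes dfs.hosts.exclude in hdfs-site.xml is set to /etc/hadoop/conf/dfs.exclude.
echo "worker-node-07.example.com" >> /etc/hadoop/conf/dfs.exclude
hdfs dfsadmin -refreshNodes      # NameNode begins re-replicating the node's blocks
hdfs dfsadmin -report            # watch the node's state move to "Decommissioned"
```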
Confidential
Hadoop Admin/Developer
Responsibilities:
- Installed and configured Hadoop MapReduce and HDFS, and developed multiple MapReduce jobs in Java for data cleansing and preprocessing.
- Extensive experience designing and implementing data flow pipelines from RDBMS sources.
- Designed the ETL data pipeline flow to ingest data from RDBMS sources into Hadoop using shell scripts, Sqoop, SSIS packages and MySQL.
- Created Sqoop jobs to import data and load it into HDFS.
- Involved in setting up Multi Node cluster in Amazon Cloud by creating instances on Amazon EC2.
- Created MapReduce Jobs on Amazon Elastic Map Reduce (Amazon EMR).
- Involved in loading data from UNIX file system to HDFS.
- Installed and configured Hive and wrote Hive UDFs.
- Evaluated business requirements and prepared detailed specifications that follow project guidelines required to develop written programs.
- Devised procedures that solve complex business problems with due considerations for hardware/software capacity and limitations, operating times and desired results.
- Analyzed large amounts of data sets to determine optimal way to aggregate and report on it.
- Provided quick response to ad hoc internal and external client requests for data and experienced in creating ad hoc reports.
- Responsible for building scalable distributed data solutions using Hadoop.
- Responsible for cluster maintenance, adding and removing cluster nodes, cluster monitoring and troubleshooting, manage and review data backups, manage and review Hadoop log files.
- Handled importing of data from various data sources, performed transformations using Hive, MapReduce, and loaded data into HDFS.
- Extracted the data from Teradata into HDFS using Sqoop.
- Analyzed the data by performing Hive queries and running Pig scripts to know user behavior like shopping enthusiasts, travelers, music lovers etc.
- Exported the patterns analyzed back into Teradata using Sqoop.
- Continuous monitoring and managing the Hadoop cluster through Cloudera Manager.
- Installed the Oozie workflow engine to run multiple Hive jobs.
- Developed Hive queries to process the data and generate the data cubes for visualizing.
- Processed real-time data with Spark.
Environment: Hadoop, MapReduce, HDFS, Hive, Oozie, Java (JDK 1.6), Cloudera, NoSQL, Oracle 11g, SQL*Plus, Toad 9.6, Windows NT, UNIX Shell Scripting, Spark.
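Repeatable Sqoop imports like those described above are often packaged as saved Sqoop jobs; a sketch with hypothetical connection details, showing an incremental-append job that remembers its last imported key between runs:

```shell
# Hypothetical saved Sqoop job for repeatable, incremental imports into HDFS.
# Connection string, table and directory are placeholders.
sqoop job --create orders_incremental -- import \
  --connect jdbc:mysql://db-host:3306/sales \
  --username etl_user -P \
  --table orders \
  --target-dir /data/raw/orders \
  --incremental append \
  --check-column order_id \
  --last-value 0

# Each execution imports only rows with order_id above the stored last-value.
sqoop job --exec orders_incremental
```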
Confidential
Hadoop Developer
Responsibilities:
- Worked on analyzing Hadoop cluster using different big data analytic tools including Pig, Hive, and MapReduce on EC2.
- Worked with the Data Science team to gather requirements for various data mining projects.
- Collecting and aggregating large amounts of log data using Apache Flume and staging data in HDFS for further analysis.
- Worked on debugging, performance tuning of Hive & Pig Jobs.
- Implemented test scripts to support test driven development and continuous integration.
- Involved in running Hadoop jobs to process millions of records and apply compression techniques.
- Developed multiple MapReduce jobs in java for data cleaning and pre-processing.
- Worked on tuning the performance of Pig queries.
- Developed data pipeline using Flume, Sqoop, Pig and Java MapReduce to ingest behavioral data into HDFS for analysis.
- Moved Relational Database data using Sqoop into Hive Dynamic partition tables using staging tables.
- Optimized Hive queries using partitioning and bucketing techniques to control data distribution.
- Involved in loading data from LINUX file system to HDFS.
- Imported and exported data into HDFS and HBase using Sqoop from MySQL.
- Experience working on processing semi-structured data using Pig and Hive.
- Supported MapReduce programs running on the cluster.
- Gained experience in managing and reviewing Hadoop log and JSON files.
- Used the Oozie workflow engine to manage interdependent Hadoop jobs and to automate several types of Hadoop jobs such as Java MapReduce, Hive, Pig and Sqoop.
- Created and maintained Technical documentation for launching HADOOP Clusters and for executing Hive queries and Pig Scripts.
- Designing and documenting the project use cases, writing test cases, leading offshore team, and interacting with client.
- Experience with professional software engineering practices and best practices for the full software development life cycle including coding standards, code reviews, source control management and build processes.
Environment: Hadoop, HDFS, HBase, Pig, Hive, MapReduce, Sqoop, Oozie, LINUX, S3, EC2, AWS and Big Data.
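The staging-table-to-dynamic-partition Hive load described above can be sketched as follows; table and column names are illustrative, not taken from the original project:

```shell
# Sketch of loading a dynamically partitioned Hive table from a staging table.
# Tables and columns are hypothetical placeholders.
hive -e "
SET hive.exec.dynamic.partition=true;
SET hive.exec.dynamic.partition.mode=nonstrict;

INSERT OVERWRITE TABLE orders_partitioned PARTITION (order_date)
SELECT order_id, customer_id, amount, order_date  -- partition column must come last
FROM   orders_staging;
"
```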
Confidential
Hadoop Admin
Responsibilities:
- Attended requirements meetings with Business Analysts and Business Users.
- Worked on a multi-node clustered environment and set up Cloudera for the Hadoop ecosystem.
- Performed basic Hadoop Administration responsibilities including software installation, configuration, software upgrades, backup and recovery, commissioning and decommissioning data nodes, cluster setup, cluster performance and monitoring on a daily basis.
- Involved in analyzing system failures, identifying the root causes and recommending actions to be taken.
- Created user accounts and set user access in the Hadoop cluster.
- Configured Hadoop ecosystem tools including Pig, Hive, HBase, Sqoop, Kafka, Oozie, Zookeeper and Spark in the Cloudera environment.
- Performed capacity planning, performance tuning, cluster monitoring and troubleshooting.
- Created Hive tables, loaded data and wrote Hive queries that run internally as MapReduce jobs.
- Working experience importing and exporting data into HDFS and Hive using Sqoop.
- Created and managed database objects such as tables, indexes and views.
- Experienced in importing and exporting data from MySQL to Hive using Sqoop.
- Troubleshot cluster issues such as DataNodes going down, network failures and missing data blocks.
- Implemented Kerberos to authenticate all services in the Hadoop cluster and to manage security.
- Managed the Hadoop cluster and components such as HDFS, JobTracker, TaskTracker, NameNode, DataNode and MapReduce.
- Hands-on setup of a data pipeline using the Kafka and Spark platforms.
- Assigned permissions on topics to different consumers and groups; managed Spark RDDs; worked with Datasets and DataFrames; saved data as Hive tables using the HCatalog server.
- Managed different file formats for Hive tables, such as Text, RCFile, ORC, SequenceFile, Parquet and Avro.
- Understanding of the AWS cloud computing platform and related services.
Environment: Hadoop, HDFS, Pig, Sqoop, HBase, Shell Scripting, Ubuntu, Linux Red Hat, Spark, Scala
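Granting consumers and groups permission on Kafka topics, as described above, is typically done with the `kafka-acls` tool; a sketch in which the principal, topic, group and ZooKeeper address are placeholders:

```shell
# Sketch: allow the "analytics" user to consume the "clickstream" topic
# as part of the "analytics-readers" group. All names are hypothetical.
kafka-acls --authorizer-properties zookeeper.connect=zk-host:2181 \
  --add --allow-principal User:analytics \
  --consumer --topic clickstream --group analytics-readers
```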
Confidential
Hadoop Developer/QA
Responsibilities:
- Experienced in importing and exporting data into HDFS and Hive using Sqoop.
- Developed Pig program for loading and filtering the streaming data into HDFS using Flume.
- Experienced in handling data from different data sets, joining them and pre-processing using Pig join operations.
- Moved bulk data into HBase using MapReduce integration.
- Developed MapReduce programs to clean and aggregate the data.
- Developed HBase data model on top of HDFS data to perform real time analytics using Java API.
- Developed different kinds of custom filters and handled pre-defined filters on HBase data using the API.
- Imported and exported data from Teradata to HDFS and vice-versa.
- Strong understanding of Hadoop eco system such as HDFS, MapReduce, HBase, Zookeeper, Pig, Hadoop streaming, Sqoop, Oozie and Hive
- Implemented counters on HBase data to count total records in different tables.
- Experienced in handling Avro data files by passing the schema into HDFS using Avro tools and MapReduce.
- Worked on custom Pig Loaders and Storage classes to work with a variety of data formats such as JSON, Compressed CSV, etc.
- Used Amazon Web Services to perform big data analytics.
- Implemented secondary sorting to sort reducer output globally in MapReduce.
- Implemented a data pipeline by chaining multiple mappers using ChainMapper.
- Created Hive dynamic partitions to load time-series data.
- Experienced in handling different types of joins in Hive, such as map joins, bucket map joins and sorted bucket map joins.
- Created tables, partitions and buckets, and performed analytics using Hive ad-hoc queries.
- Experienced in importing/exporting data into HDFS/Hive from relational databases and Teradata using Sqoop.
- Handled continuous streaming data from different sources using Flume, with HDFS as the destination.
- Integrated Spring schedulers with the Oozie client as beans to handle cron jobs.
- Experience with CDH distribution and Cloudera Manager to manage and monitor Hadoop clusters
- Actively participated in software development lifecycle (scope, design, implement, deploy, test), including design and code reviews.
- Involved in story-driven agile development methodology and actively participated in daily scrum meetings.
Environment: Hadoop, HDFS, MapReduce, Hive, Pig, HBase, Sqoop, RDBMS/DB, flat files, Teradata, MySQL, CSV, Avro data files.
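The bucket map joins mentioned above depend on session settings and on both tables being bucketed on the join key; a sketch with hypothetical table names:

```shell
# Sketch of the session settings behind a bucket map join.
# Both tables are assumed bucketed (and sorted, for the sort-merge variant)
# on customer_id; all table names are placeholders.
hive -e "
SET hive.optimize.bucketmapjoin=true;
SET hive.optimize.bucketmapjoin.sortedmerge=true;

SELECT /*+ MAPJOIN(d) */ o.order_id, d.name
FROM   orders_bucketed o
JOIN   dim_customer_bucketed d ON o.customer_id = d.customer_id;
"
```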