Sr. Hadoop Admin Resume
Irvine, CA
SUMMARY:
- Over 9 years of experience with an emphasis on Big Data technologies, administration, and the development and design of Java-based enterprise applications.
- Excellent understanding of Hadoop architecture and its components, including HDFS, JobTracker, TaskTracker, NameNode, DataNode and the MapReduce programming paradigm.
- Experience installing, configuring, supporting and managing Hadoop clusters using Apache and Cloudera (CDH3, CDH4, CDH5) distributions, as well as on Amazon Web Services (AWS).
- Hands-on experience with major components of the Hadoop ecosystem, including Hive, HBase, HBase/Hive integration, Sqoop and Flume, plus knowledge of the MapReduce/HDFS framework.
- Set up standards and processes for Hadoop based application design and implementation.
- Worked on NoSQL databases including HBase, Cassandra and MongoDB.
- Experience with Hortonworks and Cloudera Hadoop environments.
- Talend administrator with hands-on Big Data (Hadoop) experience on the Cloudera framework.
- Set up data in AWS using S3 buckets and configured instance backups to S3.
- Good experience in analysis using Pig and Hive, and a solid understanding of Sqoop and Puppet.
- Expertise in database performance tuning and data modeling.
- Good experience in Talend DI administration, Talend Data Quality and Talend Data Mapping.
- Experience designing, installing and configuring the complete Hadoop ecosystem (HDFS, MapReduce, Pig, Hive, Oozie, Flume, ZooKeeper).
- Experience managing cluster resources by implementing the Fair Scheduler and Capacity Scheduler.
- Experienced in developing MapReduce programs using Apache Hadoop for working with Big Data.
- Experience with tools like Puppet to automate Hadoop installation, configuration and monitoring.
- Used the Spark-Cassandra Connector to load data to and from Cassandra.
- Experience in creating databases, users, tables, triggers, macros, views, stored procedures, functions, packages, joins and hash indexes in Teradata.
- Experience in developing ETL processes using Hive, Pig, Sqoop and the MapReduce framework.
- Involved in log file management: logs older than 7 days were removed from the log folder, loaded into HDFS and retained for 3 months.
- Experienced in using Talend database, file and processing components based on requirements.
- Expertise in development support activities including installation, configuration and successful deployment of changes across all environments.
- Loaded data into EMR from various sources such as S3 and processed it using Hive scripts.
- Familiarity and experience with data warehousing and ETL tools.
- Good working knowledge of OOA/OOD using UML and designing use cases.
- Good understanding of Scrum methodologies, Test Driven Development and continuous integration.
- Experience in production support and application support by fixing bugs.
- Used HP Quality Center for logging test cases and defects.
- Major strengths include familiarity with multiple software systems and the ability to quickly learn new technologies and adapt to new environments; a self-motivated, focused team player and quick learner with excellent interpersonal, technical and communication skills.
TECHNICAL SKILLS:
Big Data: Hadoop, Hive, Sqoop, Pig, Puppet, Ambari, HBase, MongoDB, Cassandra, Power Pivot, Datameer, Pentaho, Spark, Flume, SolrCloud, Impala
Operating Systems: Windows, Ubuntu, Red Hat Linux, Linux, UNIX
Project Management: Plan View, MS-Project
Programming & Scripting Languages: Java, SQL, Unix Shell Scripting, C, Python
Modeling Tools: UML, Rational Rose
IDE/GUI: Eclipse
Framework: Struts, Hibernate
Database: MS-SQL, Oracle, MS-Access
Middleware: WebSphere, TIBCO
ETL: Informatica, Pentaho
Business Intelligence: OBIEE, Business Objects
Testing: Quality Center, WinRunner, LoadRunner, QTP
PROFESSIONAL EXPERIENCE:
Confidential, Irvine, CA
Sr. Hadoop Admin
Responsibilities:
- Worked on setting up Hadoop cluster for the Production Environment.
- Installed Hadoop 2/YARN, Spark, Scala IDE and the Java JRE on three machines; configured the machines as a cluster with one NameNode and two DataNodes.
- Developed a data pipeline using Flume, Sqoop, Pig and Java MapReduce to ingest customer behavioral data and financial histories into HDFS for analysis.
- Implemented AWS solutions using EC2, S3 and load balancers.
- Installed application on AWS EC2 instances and also configured the storage on S3 buckets.
- Stored and loaded data between HDFS and Amazon S3 and backed up the namespace data.
- Worked on Hadoop clusters capacity planning and management.
- Monitored and Debugged Hadoop jobs/Applications running in production.
- Set up and monitored a Cloudera CDH4 cluster running Hadoop 2/YARN to read data from the cluster.
- Involved in implementing high availability and automatic failover infrastructure for the NameNode using ZooKeeper services, removing a single point of failure.
- Used Talend Big Data components to create connections to third-party tools for transferring, storing and analyzing big data, such as Sqoop, MongoDB and BigQuery, in order to quickly load, extract, transform and process large and diverse data sets.
- Expert in processing bulk data into the data warehouse using complex SQL and Talend components.
- Troubleshot, managed and reviewed data backups and Hadoop log files.
- Worked with data delivery teams to set up new Hadoop users; this included setting up Linux users and Kerberos principals and testing HDFS and Hive access.
- Involved in collecting and aggregating large amounts of log data using Apache Flume and staging data in HDFS for further analysis.
- Performed maintenance, monitoring, deployments, and upgrades across infrastructure that supports all our Hadoop clusters.
- Used Ganglia to monitor the cluster and Nagios to send alerts around the clock.
- Enabled concurrent access to Hive tables through shared and exclusive locking, backed by the ZooKeeper implementation in the cluster.
- Worked on importing and exporting data from Oracle and DB2 into HDFS and HIVE using Sqoop for analysis, visualization and to generate reports.
- Implemented PySpark and Spark SQL for faster testing and processing of data.
- Developed multiple MapReduce jobs in Java for data cleaning.
- Ran many performance tests using the Cassandra-stress tool in order to measure and improve the read and write performance of the cluster.
- Worked on migrating MapReduce programs into PySpark transformations.
- Built wrapper shell scripts around Oozie workflows.
- Used Talend's debug mode to debug jobs and fix errors.
- Involved in converting Hive/SQL queries into Spark transformations using Spark RDDs, Python and Scala (see the sketch after this list).
- Provided ad-hoc queries and data metrics to the Business Users using Hive, Pig.
- Used Hive to analyze the partitioned and bucketed data and compute various metrics for reporting.
- Wrote shell scripts to monitor the health of Hadoop daemon services and respond to any warning or failure conditions.
- Wrote and automated shell scripts for rolling day-to-day processes.
- Documented and managed failure/recovery procedures.
- Involved in story-driven agile development methodology and actively participated in daily scrum meetings.
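A minimal PySpark sketch of the kind of Hive-to-Spark conversion described above; the table and column names (customer_txns, region, amount) are hypothetical stand-ins for the actual production tables.

# Minimal PySpark sketch: replacing a Hive aggregate query with Spark SQL
# and an equivalent RDD-style transformation. Table and column names are
# illustrative only.
from pyspark.sql import SparkSession

spark = (SparkSession.builder
         .appName("hive-to-spark-example")
         .enableHiveSupport()          # read tables registered in the Hive metastore
         .getOrCreate())

# 1) The original HiveQL, executed through Spark SQL instead of Hive/MapReduce
totals_df = spark.sql("""
    SELECT region, SUM(amount) AS total_amount
    FROM customer_txns
    GROUP BY region
""")

# 2) The same aggregation expressed as RDD transformations
totals_rdd = (spark.table("customer_txns").rdd
              .map(lambda row: (row["region"], row["amount"]))
              .reduceByKey(lambda a, b: a + b))

totals_df.show()
print(totals_rdd.take(10))

Both forms compute the same aggregate; the Spark SQL version is generally preferable because it goes through the Catalyst optimizer.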
Environment: Hadoop 2, MapReduce, Hive, HDFS, Pig, Sqoop, Oozie, Cloudera CDH5, Flume, HBase, ZooKeeper, MongoDB, Cassandra, Talend, Oracle, NoSQL, Unix/Linux, Kafka, Amazon Web Services.
Confidential, Houston, TX
Hadoop Admin
Responsibilities:
- Experienced with the Cloudera, MapR and Hortonworks distributions of Hadoop.
- Analyzed the client's existing Hadoop infrastructure, identified performance bottlenecks and tuned performance accordingly.
- Installed Hadoop on new servers and rebuilt existing servers.
- Set up automated 24x7 monitoring and escalation infrastructure for the Hadoop cluster using Nagios and Ganglia.
- Expert in processing bulk data into the data warehouse using complex SQL and Talend components.
- Worked with the Talend ETL tool to simplify MapReduce jobs from the front end.
- Expertise in using Oozie for configuring job flows
- Installed/Configured/Maintained Apache Hadoop clusters for application development and Hadoop tools like Hive, Pig, HBase, Zookeeper and Sqoop.
- Deployed the Hadoop cluster across its nodes.
- Managed Hadoop clusters: setup, monitoring and maintenance.
- Strong troubleshooting and performance tuning skills
- Configured high availability for control services such as the NameNode and JobTracker.
- Performed an upgrade of the development environment from CDH 4.2 to CDH 4.6.
- Involved in analyzing system failures, identifying root causes and recommending courses of action.
- Worked on Hive to expose data for further analysis and to transform files from various analytical formats to text files.
- Worked on analyzing the Hadoop cluster and different big data analytic tools, including Pig, the HBase database and Sqoop.
- Involved in creating Hadoop streaming jobs using Python.
- Handled importing of data from various data sources, performed transformations using Hive, Pig and Spark and loaded data into HDFS.
- Troubleshot, managed and reviewed data backups and Hadoop log files.
- Wrote queries to create, alter, insert and delete elements from lists, sets and maps in DataStax Cassandra (see the sketch after this list).
- Developed multiple MapReduce jobs in Java for data cleaning and access.
- Imported and exported data into HDFS and Hive using Sqoop.
- Expert in Talend job migration and deployment across environments; successfully scheduled jobs in the TAC.
- Implemented NameNode backup using NFS for high availability.
- Monitored workload and job performance and performed capacity planning using Cloudera Manager.
- Managed and reviewed log files as part of administration for troubleshooting purposes; communicated and escalated issues appropriately.
- Documented and managed failure/recovery.
- Implemented the Fair Scheduler on the JobTracker to share cluster resources among the MapReduce jobs submitted by users.
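A minimal sketch of the CQL collection operations mentioned above (lists, sets and maps), issued through the DataStax Python driver; the contact point, keyspace, table and column names are hypothetical, and the keyspace is assumed to already exist.

# Minimal sketch of CQL list/set/map operations via the DataStax Python driver.
# Host, keyspace and schema below are hypothetical.
from cassandra.cluster import Cluster

cluster = Cluster(["127.0.0.1"])        # contact point for the cluster
session = cluster.connect("demo_ks")    # hypothetical, pre-existing keyspace

# Table with set, list and map columns
session.execute("""
    CREATE TABLE IF NOT EXISTS user_profiles (
        user_id text PRIMARY KEY,
        emails  set<text>,
        tags    list<text>,
        prefs   map<text, text>
    )
""")

# Insert a row; the driver maps Python set/list/dict to CQL collections
session.execute(
    "INSERT INTO user_profiles (user_id, emails, tags, prefs) VALUES (%s, %s, %s, %s)",
    ("u1", {"a@example.com"}, ["new"], {"theme": "dark"}),
)

# Add to the set, append to the list, set a map entry
session.execute("UPDATE user_profiles SET emails = emails + {'b@example.com'} WHERE user_id = 'u1'")
session.execute("UPDATE user_profiles SET tags = tags + ['vip'] WHERE user_id = 'u1'")
session.execute("UPDATE user_profiles SET prefs['lang'] = 'en' WHERE user_id = 'u1'")

# Delete a single map key, then remove an element from the set
session.execute("DELETE prefs['lang'] FROM user_profiles WHERE user_id = 'u1'")
session.execute("UPDATE user_profiles SET emails = emails - {'a@example.com'} WHERE user_id = 'u1'")

cluster.shutdown()

Collection columns can be modified element by element this way, without reading and rewriting the whole row.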
Environment: Hadoop 1.x, Hive, Pig, HBase, Sqoop, Flume, ZooKeeper, Talend, HDFS, Ambari, Cassandra, Oracle, CDH4, HDP 2.2
Confidential, Dallas, TX
Hadoop Developer
Responsibilities:
- Exported data from DB2 to HDFS using Sqoop.
- Developed MapReduce jobs using Java API.
- Installed and configured Pig and also wrote Pig Latin scripts.
- Wrote MapReduce jobs using Pig Latin.
- Developed workflow using Oozie for running MapReduce jobs and Hive Queries.
- Worked on Cluster coordination services through Zookeeper.
- Worked on loading log data directly into HDFS using Flume.
- Involved in loading data from LINUX file system to HDFS.
- Responsible for managing data from multiple sources.
- Experienced in running Hadoop streaming jobs to process terabytes of XML-format data (see the sketch after this list).
- Assisted in exporting analyzed data to relational databases using Sqoop.
- Implemented JMS for asynchronous auditing purposes.
- Created and maintained Technical documentation for launching Cloudera Hadoop Clusters and for executing Hive queries and Pig Scripts
- Experience with the CDH distribution and Cloudera Manager for managing and monitoring Hadoop clusters.
- Experience in defining, designing and developing Java applications, especially using Hadoop MapReduce and leveraging frameworks such as Cascading and Hive.
- Developed monitoring and performance metrics for Hadoop clusters.
- Documented designs and procedures for building and managing Hadoop clusters.
- Strong experience troubleshooting operating system issues, cluster issues and Java-related bugs.
- Imported/exported data between HDFS/Hive and relational databases, including Teradata, using Sqoop.
- Successfully loaded files to Hive and HDFS from MongoDB and Solr.
- Automated deployment, management and self-serve troubleshooting of applications.
- Defined and evolved the existing architecture to scale with growth in data volume, users and usage.
- Designed and developed a Java API (Commerce API) that provides connectivity to Cassandra through Java services.
- Responsible for cluster availability and provided 24x7 on-call support.
- Experience with disaster recovery and business continuity practices in the Hadoop stack.
- Installed and configured Hive and wrote Hive UDFs.
- Experience managing CVS and migrating to Subversion.
- Experience managing development time, bug tracking, project releases, development velocity, release forecasting and scheduling.
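A minimal sketch of a Python mapper for a Hadoop streaming job over XML data, assuming each input line carries one small, self-contained XML record (a hypothetical <event> element); multi-line XML would instead need Hadoop's StreamXmlRecordReader or a pre-splitting step.

#!/usr/bin/env python
# Minimal Hadoop streaming mapper sketch for line-oriented XML records.
# Assumes one <event .../> record per input line (hypothetical schema) and
# emits tab-separated key/value pairs for a downstream reducer or Hive load.
import sys
import xml.etree.ElementTree as ET

for line in sys.stdin:
    line = line.strip()
    if not line:
        continue
    try:
        record = ET.fromstring(line)   # e.g. <event type="click" user="u1"/>
    except ET.ParseError:
        continue                       # skip malformed records
    event_type = record.get("type", "unknown")
    user_id = record.get("user", "unknown")
    # key <TAB> value is the format Hadoop streaming expects on stdout
    print("%s\t%s" % (event_type, user_id))

The script would be wired into a job with the hadoop-streaming jar, passed as the -mapper (with -file to ship it to the cluster) alongside an optional reducer.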
Environment: Hadoop, HDFS, Hive, Flume, Sqoop, HBase, Pig, MySQL, Ubuntu, ZooKeeper, Java (JDK 1.6)
Confidential
Java Developer
Responsibilities:
- Created the database, user, environment, activity and class diagrams for the project (UML).
- Implemented the database using the Oracle database engine.
- Designed and developed a fully functional, generic n-tiered J2EE application platform in an Oracle technology-driven environment. The entire infrastructure application was developed using Oracle JDeveloper in conjunction with Oracle ADF-BC and Oracle ADF Rich Faces.
- Created entity objects (business rules and policies, validation logic, default value logic, security).
- Created View objects, View Links, Association Objects, Application modules with data validation rules, LOV, dropdown, value defaulting, transaction management features.
- Web application development using J2EE: JSP, Servlets, JDBC, Java Beans, Struts, Ajax, JSF, JSTL, Custom Tags, EJB, JNDI, Hibernate, ANT, JUnit and Apache Log4J, Web Services, Message Queue (MQ).
- Designed GUI prototypes using ADF 11g GUI components before finalizing them for development.
- Created reusable components (ADF Library and ADF Task Flow).
- Experience using version control systems such as CVS, PVCS and Rational ClearCase.
- Created modules using bounded and unbounded task flows.
- Generated WSDLs (web services) and created workflows using BPEL.
- Handled AJAX functions (partial trigger, partial submit, auto submit).
- Created the skin for the layout.
Environment: Core Java, Servlets, JSF, ADF Rich Client UI Framework, ADF-BC (BC4J) 11g, Web Services using Oracle SOA (BPEL), Oracle WebLogic.