
Hadoop Administrator/ Developer Resume


Dublin, California

SUMMARY

  • Over 6 years of overall experience in IT as a Developer, Designer, and Database Administrator, with cross-platform integration experience using Hadoop and Java/J2EE
  • Around 4 years of experience focused on the Big Data ecosystem using the Hadoop framework and related technologies such as HDFS, HBase, MapReduce, Hive, Pig, Flume, Oozie, Sqoop, and ZooKeeper
  • Experience in installing, configuring, supporting, and managing Hadoop clusters using Apache and Cloudera (CDH3, CDH4, and CDH5) distributions, and on Amazon Web Services (AWS)
  • Experience with Hortonworks and Cloudera Hadoop environments.
  • Developed automated scripts using Unix shell for performing RUNSTATS, REORG, REBIND, COPY, LOAD, BACKUP, IMPORT, EXPORT, and other database-related activities.
  • Good experience in data analysis using Pig and Hive, and an understanding of Sqoop and Puppet.
  • Expertise in database performance tuning and data modeling.
  • Experienced in developing MapReduce programs using Apache Hadoop for working with Big Data.
  • Experience with SequenceFile, Avro, and HAR file formats and compression.
  • Experience in tuning and troubleshooting performance issues in Hadoop cluster.
  • Worked on a project using Apache Kafka, which aims to provide a unified, high-throughput, low-latency platform for handling real-time data feeds.
  • Experience on monitoring, performance tuning, SLA, scaling and security in Big Data systems.
  • Hands on NoSQL database experience with HBase, MongoDB and Cassandra.
  • Extensive experience in Data Ingestion, In-Stream data processing, Batch Analytics and Data Persistence strategy.
  • Explored Spark to improve the performance and optimization of existing algorithms in Hadoop using Spark Context, Spark SQL, DataFrames, pair RDDs, and Spark on YARN (see the Spark sketch after this summary).
  • Good experience in using databases: MongoDB, SQL Server, T-SQL, stored procedures, constraints, and triggers.
  • Experience in managing Hadoop clusters with IBM BigInsights and the Hortonworks Data Platform.
  • Experience in implementing Spark using Scala and Spark SQL for faster analysis and processing of data.
  • Good knowledge of building Apache Spark applications using Scala.
  • Good understanding of XML methodologies (XML, XSL, XSD) including Web Services and SOAP.
  • Experience in integration of various data sources like Oracle, DB2, Sybase, SQL server, MS access and non-relational sources like flat files into staging area.
  • Experience in working with Teradata utilities like BTEQ, FastExport, FastLoad, and MultiLoad to export and load data to/from different source systems, including flat files.
  • Implemented cascading parameters on SSRS reports, when applicable, to allow maximum flexibility for report users.
  • Experience in creating databases, users, tables, triggers, macros, views, stored procedures, functions, packages, joins, and hash indexes in the Teradata database.
  • Good knowledge of Amazon AWS concepts like EMR and EC2 web services, which provide fast and efficient processing of Big Data, as well as Windows Azure.
  • Knowledge in DevOps tools like Maven, Git/GitHub and Jenkins.
  • Hands-on experience testing Hadoop applications using the MRUnit framework.
  • Experience in developing and designing Web Services (SOAP and Restful Web services).
  • Excellent Java development skills using J2EE, Spring, J2SE, Servlets, JUnit, MRUnit, JSP, JDBC
  • Strong ability to handle multiple priorities and workloads, and to understand and adapt to new technologies and environments quickly.
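
A minimal Spark sketch in Scala illustrating the pair-RDD and Spark SQL usage claimed in this summary; the input path, delimiter, and column names are hypothetical placeholders.

    import org.apache.spark.sql.SparkSession

    object ClickstreamCounts {
      def main(args: Array[String]): Unit = {
        val spark = SparkSession.builder().appName("ClickstreamCounts").getOrCreate()

        // Pair RDD: count events per user from a tab-delimited log in HDFS (hypothetical path).
        val events = spark.sparkContext.textFile("hdfs:///data/raw/clickstream")
        val countsByUser = events
          .map(_.split("\t"))
          .filter(_.length > 1)
          .map(fields => (fields(0), 1L))   // (userId, 1)
          .reduceByKey(_ + _)               // aggregate per key

        // The same data exposed through the DataFrame / Spark SQL API.
        import spark.implicits._
        countsByUser.toDF("user_id", "event_count").createOrReplaceTempView("user_counts")
        spark.sql("SELECT user_id, event_count FROM user_counts ORDER BY event_count DESC LIMIT 20").show()

        spark.stop()
      }
    }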

TECHNICAL SKILLS

Big Data: Hadoop Ecosystem, HDFS, MapReduce, Pig, Hive, HBase, Sqoop, Flume, Zookeeper, Oozie, Spark, Scala, SolrCloud

Java Technologies: Java, J2EE, JSTL, JDBC 3.0/2.1, JSP 1.2/1.1, Java Servlets, JMS, JUNIT, Log4j

Frameworks: Struts 1.2, Spring 3.0, Hibernate 3.2

Languages: C, C++, Java, Unix Shell Scripts, Python

Client Technologies: JavaScript, CSS, HTML5, XHTML, jQuery

Web services: XML, SOAP, WSDL, SOA, JAX- WS, DOM, SAX, XPATH, XSLT, UDDI, JAX-RPC, REST, and JAXB 2.0

Databases: Oracle 9i/10g/11g, MySQL, SQL, PL/SQL

Web/Application Servers: Apache Tomcat 5.x, BEA WebLogic 8.x, IBM WebSphere 6.0/5.1.1, AWS

Analytical Tools: Tableau

OOAD Modeling Tools: Rational Rose, Rational Clear Case, Enterprise Architect, Microsoft Visio

IDE/Development Tools: Eclipse 3.5, NetBeans, MyEclipse, Oracle JDeveloper 10.1.3, SOAP UI, Ant, Maven, RAD

DB Tools: TOAD, MySQL, MySQL Developer

ETL Tools: IBM Datastage 8.1, Netezza

Operating Systems: Windows 9x/NT/2000, Linux

PROFESSIONAL EXPERIENCE

Confidential, Dublin, California

Hadoop Administrator/ Developer

Responsibilities:

  • Evaluated business requirements and prepared detailed specifications that follow project guidelines required to develop written programs.
  • Involved in installing, configuring, supporting, and managing Hadoop clusters; cluster administration included commissioning and decommissioning of DataNodes, capacity planning, slot configuration, performance tuning, cluster monitoring, and troubleshooting.
  • Responsible for building scalable distributed data solutions using Hadoop.
  • Analyzed large amounts of data sets to determine optimal way to aggregate and report on it.
  • Developed simple to complex MapReduce jobs using Hive and Pig.
  • Administered, installed, upgraded, and managed Hadoop distributions (CDH3, CDH4 with Cloudera Manager), Hive, and HBase.
  • Planned and executed system upgrades for existing Hadoop clusters.
  • Imported/exported data between RDBMS and HDFS using data ingestion tools like Sqoop.
  • Commissioned and decommissioned nodes in the Hadoop cluster.
  • Used the Fair Scheduler to manage MapReduce jobs so that each job gets roughly the same amount of CPU time.
  • Recovered from node failures and troubleshot common Hadoop cluster issues.
  • Scripted Hadoop package installation and configuration to support fully automated deployments.
  • Optimized MapReduce jobs to use HDFS efficiently by using various compression mechanisms.
  • Handled importing of data from various data sources, performed transformations using Hive and MapReduce, loaded data into HDFS, and extracted data from MySQL into HDFS using Sqoop.
  • Imported data from different sources like HDFS/HBase into Spark RDDs.
  • Experienced in implementing Spark RDD transformations and actions to perform business analysis.
  • Migrated HiveQL queries on structured data to Spark SQL to improve performance (see the Spark SQL sketch after this list).
  • Developed multiple POCs using PySpark, deployed them on the YARN cluster, and compared the performance of Spark with Hive and SQL/Teradata.
  • Installed Oozie workflow engine to run multiple Hive and Pig jobs which run independently.
  • Worked with Kafka on raw data, transforming it into new Kafka topics for further consumption (see the Kafka sketch at the end of this project).
  • Developed Spark code using Scala and Spark-SQL/Streaming for faster testing and processing of data.
  • Worked on NoSQL databases including HBase, MongoDB, and Cassandra.
  • Worked on the Ad hoc queries, Indexing, Replication, Load balancing, Aggregation in MongoDB.
  • Exported the analyzed data to the relational databases using Sqoop for visualization and to generate reports for the BI team.
  • Used Kafka features such as partitioning and the replicated commit log to maintain messaging feeds in a distributed fashion.
  • Used Kafka for publish-subscribe messaging as a distributed commit log; experienced with its speed, scalability, and durability.
  • Managed and reviewed Hadoop log files.
  • Involved in creating Hive tables, loading with data and writing hive queries which run internally in MapReduce.
  • Used Hive to analyze the partitioned and bucketed data and compute various metrics for reporting.
  • Installed and configured Pig and wrote Pig Latin scripts.
  • Developed Pig Latin scripts to extract the data from the web server output files to load into HDFS.
  • Loaded and transformed large sets of structured, semi-structured, and unstructured data.
  • Worked with application teams to install operating system and Hadoop updates, patches, and version upgrades as required.
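
A minimal sketch of the HiveQL-to-Spark SQL migration described above, assuming a SparkSession with Hive support so existing metastore tables can be queried directly; the table and column names are hypothetical.

    import org.apache.spark.sql.SparkSession

    object HiveToSparkSql {
      def main(args: Array[String]): Unit = {
        // enableHiveSupport lets Spark SQL read tables registered in the existing Hive metastore,
        // so the original HiveQL aggregate can be re-run as Spark SQL without moving data.
        val spark = SparkSession.builder()
          .appName("HiveToSparkSql")
          .enableHiveSupport()
          .getOrCreate()

        // Hypothetical table and columns standing in for the original HiveQL query.
        val dailyTotals = spark.sql(
          """SELECT txn_date, SUM(amount) AS total_amount
            |FROM transactions
            |GROUP BY txn_date""".stripMargin)

        dailyTotals.write.mode("overwrite").saveAsTable("daily_totals")
        spark.stop()
      }
    }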

Environment: Hadoop, MapReduce, HDFS, Hive, Pig, Java, SQL, Spark, Kafka, MongoDB, Cassandra, Sqoop, Nagios, Ganglia, Zookeeper, Fair Scheduler, Java (jdk1.6)
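
A minimal Scala sketch of the Kafka usage described in this project: reading raw records and republishing them to a new topic for downstream consumers. It uses the standard org.apache.kafka.clients producer; the broker address, topic name, and normalization step are hypothetical.

    import java.util.Properties
    import org.apache.kafka.clients.producer.{KafkaProducer, ProducerRecord}

    object RawFeedRepublisher {
      def main(args: Array[String]): Unit = {
        val props = new Properties()
        props.put("bootstrap.servers", "broker1:9092") // hypothetical broker
        props.put("key.serializer", "org.apache.kafka.common.serialization.StringSerializer")
        props.put("value.serializer", "org.apache.kafka.common.serialization.StringSerializer")

        val producer = new KafkaProducer[String, String](props)
        // Read raw lines (stdin stands in for the raw feed), normalize them,
        // and republish to a cleaned topic for further consumption.
        scala.io.Source.stdin.getLines()
          .map(_.trim)
          .filter(_.nonEmpty)
          .foreach(line => producer.send(new ProducerRecord[String, String]("clean-events", line)))
        producer.close()
      }
    }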

Confidential, Charlotte, NC

Hadoop Administrator/Developer

Responsibilities:

  • Installed and configured Hadoop MapReduce and HDFS; developed multiple MapReduce jobs in Java for data cleaning and preprocessing.
  • Installed and configured Pig and wrote Pig Latin scripts.
  • Solid understanding of the REST architectural style and its application to well-performing web sites for global usage.
  • Involved in ETL, data integration, and migration; imported data using Sqoop to load data from Oracle to HDFS on a regular basis.
  • Developed scripts and batch jobs to schedule various Hadoop programs.
  • Wrote Hive queries for data analysis to meet the business requirements.
  • Implemented test scripts to support test driven development and continuous integration.
  • Responsible for managing data coming from different sources.
  • Loaded and transformed large sets of structured, semi-structured, and unstructured data.
  • Experience in managing and reviewing Hadoop log files.
  • Designed front-end, user-interactive (UI) web pages using web technologies like HTML5/CSS3, AngularJS, and Bootstrap.
  • Worked on Hive for exposing data for further analysis and for transforming files from different analytical formats to text files.
  • Managing and scheduling Jobs on a Hadoop cluster.
  • Importing and exporting data into HDFS and Hive using Sqoop.
  • Involved in creating Hive tables, loading them with data, and writing Hive queries that run internally as MapReduce jobs.
  • Used Pig as ETL tool to do transformations, event joins, filter bot traffic and some pre-aggregations before storing the data onto HDFS.
  • Involved in writing Hive scripts to extract, transform, and load data into the database.
  • Used JIRA for bug tracking.
  • Used CVS for version control.

Environment: Hadoop, MapReduce, HDFS, Hive, Pig, Sqoop, Shell Scripting, Java (JDK 1.6), Eclipse, Oracle 10g, PL/SQL, SQL*Plus, Toad 9.6, Linux, JIRA, CVS.

Confidential, Richmond, VA

Hadoop Administrator/ Developer

Responsibilities:

  • Installed/Configured/Maintained Apache Hadoop clusters for application development and Hadoop tools like Hive, Pig, HBase, Zookeeper and Sqoop.
  • Installed the cluster, performed commissioning and decommissioning of DataNodes, NameNode recovery, and capacity planning; configured Hadoop MapReduce and HDFS, and developed multiple MapReduce jobs in Java for data cleaning.
  • Wrote the shell scripts to monitor the health check of Hadoop daemon services and respond accordingly to any warning or failure conditions.
  • Developed data pipeline using Flume, Sqoop, Pig and Java MapReduce to ingest customer behavioral data and financial histories into HDFS for analysis.
  • Involved in collecting and aggregating large amounts of log data using Apache Flume and staging data in HDFS for further analysis.
  • Collected the logs data from web servers and integrated in to HDFS using Flume.
  • Worked on installation planning and slot configuration.
  • Implemented NameNode backup using NFS for high availability.
  • Developed Pig Latin scripts to extract data from the web server output files and load it into HDFS.
  • Installed Oozie workflow engine to run multiple Hive and Pig Jobs.
  • Used Sqoop to import and export data between HDFS and RDBMS.
  • Created Hive tables, loaded them with data, and wrote Hive UDFs (see the UDF sketch after this list).
  • Exported the analyzed data to the relational databases using Sqoop for visualization and to generate reports.
  • Involved in migrating ETL processes from Oracle to Hive to test ease of data manipulation.
  • Worked on NoSQL databases including HBase, MongoDB, and Cassandra.
  • Supported in setting up QA environment and updating configurations for implementing scripts with Pig and Sqoop.
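
A minimal sketch of a Hive UDF of the kind mentioned above, written in Scala against the classic org.apache.hadoop.hive.ql.exec.UDF API; the masking logic and names are hypothetical.

    import org.apache.hadoop.hive.ql.exec.UDF

    // Masks all but the last four characters of an account string (hypothetical example).
    class MaskAccount extends UDF {
      def evaluate(account: String): String = {
        if (account == null) null
        else if (account.length <= 4) account
        else "*" * (account.length - 4) + account.takeRight(4)
      }
    }

Once packaged into a JAR, a UDF like this would typically be registered in Hive with ADD JAR and CREATE TEMPORARY FUNCTION before being called from HiveQL queries.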

Environment: NoSQL, Cassandra, MongoDB, Sqoop, HDFS, HBase, PIG Latin, Hive, Flume, MapReduce, JAVA, Eclipse, NetBeans.

Confidential

Java/J2EE Developer

Responsibilities:

  • Involved in various phases of the Software Development Life Cycle (SDLC), such as design, development, and unit testing.
  • Involved in analysis and design of the application.
  • Involved in preparing the detailed design document for the project.
  • Developed the application using J2EE architecture.
  • Involved in developing JSP forms.
  • Designed and developed web pages using HTML and JSP.
  • Designed various applets using JBuilder.
  • Designed and developed Servlets to communicate between presentation and business layer.
  • Used EJB as middleware in developing a three-tier distributed application.
  • Developed Session Beans and Entity Beans for business and data processing.
  • Used JMS in the project for sending and receiving the messages on the queue.
  • Developed the Servlets for processing the data on the server.
  • Processed data was transferred to the database through Entity Beans.
  • Used JDBC for database connectivity with MySQL Server.
  • Used query statements and advanced PreparedStatements; designed tables and indexes.
  • Wrote complex SQL queries and stored procedures.
  • Used CVS for version control.
  • Involved in unit testing using JUnit.
  • Prepared the installation, customer, and configuration guides, which were delivered to the customer along with the product.

Environment: Core Java, J2EE, JSP, Servlets, XML, XSLT, EJB, JDBC, JBuilder 8.0, JBoss, Swing, JavaScript, JMS, HTML, CSS, MySQL Server, CVS, Windows 2000.
