
Hadoop Administrator/ Developer Resume


Dublin, California

SUMMARY

  • Over 6 years of overall experience in IT as a Developer, Designer, and Database Administrator, with cross-platform integration experience using Hadoop and Java/J2EE
  • Around 4 years of experience focused on the Big Data ecosystem using the Hadoop framework and related technologies such as HDFS, HBase, MapReduce, Hive, Pig, Flume, Oozie, Sqoop, and ZooKeeper
  • Experience in installing, configuring, supporting, and managing Hadoop clusters using Apache and Cloudera (CDH3, CDH4, and CDH5) distributions, and on Amazon Web Services (AWS)
  • Experience with Hortonworks and Cloudera Hadoop environments.
  • Developed automated scripts using Unix shell for performing RUNSTATS, REORG, REBIND, COPY, LOAD, BACKUP, IMPORT, EXPORT, and other database-related activities.
  • Good experience in data analysis using Pig and Hive, and an understanding of Sqoop and Puppet.
  • Expertise in database performance tuning and data modeling.
  • Experienced in developing MapReduce programs using Apache Hadoop for working with Big Data.
  • Experience with SequenceFile, Avro, and HAR file formats and compression.
  • Experience in tuning and troubleshooting performance issues in Hadoop cluster.
  • Worked on a project using Apache Kafka, which aims to provide a unified, high-throughput, low-latency platform for handling real-time data feeds.
  • Experience on monitoring, performance tuning, SLA, scaling and security in Big Data systems.
  • Hands on NoSQL database experience with HBase, MongoDB and Cassandra.
  • Extensive experience in Data Ingestion, In-Stream data processing, Batch Analytics and Data Persistence strategy.
  • Explored Spark to improve the performance and optimization of existing algorithms in Hadoop using Spark Context, Spark SQL, DataFrames, pair RDDs, and Spark on YARN (see the Spark sketch after this summary).
  • Good experience in using databases: MongoDB, SQL Server, T-SQL, stored procedures, constraints, and triggers.
  • Experience in managing Hadoop clusters with IBM BigInsights and the Hortonworks Data Platform.
  • Experience in implementing Spark using Scala and Spark SQL for faster analysis and processing of data.
  • Good knowledge of building Apache Spark applications using Scala.
  • Good understanding of XML methodologies (XML, XSL, XSD) including Web Services and SOAP.
  • Experience in integration of various data sources like Oracle, DB2, Sybase, SQL server, MS access and non-relational sources like flat files into staging area.
  • Experience in working with Teradata utilities like BTEQ, FastExport, FastLoad, and MultiLoad to export and load data to/from different source systems, including flat files.
  • Implemented cascading parameters on SSRS reports, when applicable, to allow maximum flexibility for report users.
  • Experience in creating databases, users, tables, triggers, macros, views, stored procedures, functions, packages, joins, and hash indexes in the Teradata database.
  • Good knowledge of Amazon AWS concepts like EMR and EC2 web services, which provide fast and efficient processing of Big Data, as well as Windows Azure.
  • Knowledge in DevOps tools like Maven, Git/GitHub and Jenkins.
  • Hands-on experience testing Hadoop applications using the MRUnit framework.
  • Experience in developing and designing Web Services (SOAP and Restful Web services).
  • Excellent Java development skills using J2EE, Spring, J2SE, Servlets, JUnit, MRUnit, JSP, JDBC
  • Strong ability to handle multiple priorities and workloads, and to understand and adapt to new technologies and environments quickly.
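
A minimal Spark sketch in Scala illustrating the pair-RDD and Spark SQL usage claimed in this summary; the input path, delimiter, and column names are hypothetical placeholders.

    import org.apache.spark.sql.SparkSession

    object ClickstreamCounts {
      def main(args: Array[String]): Unit = {
        val spark = SparkSession.builder().appName("ClickstreamCounts").getOrCreate()

        // Pair RDD: count events per user from a tab-delimited log in HDFS (hypothetical path).
        val events = spark.sparkContext.textFile("hdfs:///data/raw/clickstream")
        val countsByUser = events
          .map(_.split("\t"))
          .filter(_.length > 1)
          .map(fields => (fields(0), 1L))   // (userId, 1)
          .reduceByKey(_ + _)               // aggregate per key

        // The same data exposed through the DataFrame / Spark SQL API.
        import spark.implicits._
        countsByUser.toDF("user_id", "event_count").createOrReplaceTempView("user_counts")
        spark.sql("SELECT user_id, event_count FROM user_counts ORDER BY event_count DESC LIMIT 20").show()

        spark.stop()
      }
    }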

TECHNICAL SKILLS

Big Data: Hadoop Ecosystem, HDFS, MapReduce, Pig, Hive, HBase, Sqoop, Flume, Zookeeper, Oozie, Spark, Scala, SolrCloud

Java Technologies: Java, J2EE, JSTL, JDBC 3.0/2.1, JSP 1.2/1.1, Java Servlets, JMS, JUNIT, Log4j

Frameworks: Struts 1.2, Spring 3.0, Hibernate 3.2

Languages: C, C++, Java, Unix Shell Scripts, Python

Client Technologies: JavaScript, CSS, HTML5, XHTML, jQuery

Web services: XML, SOAP, WSDL, SOA, JAX- WS, DOM, SAX, XPATH, XSLT, UDDI, JAX-RPC, REST, and JAXB 2.0

Databases: Oracle 9i/10g/11g, MySQL, SQL, PL/SQL

Web/Application Servers: Apache Tomcat 5.x, BEA WebLogic 8.x, IBM WebSphere 6.0/5.1.1, AWS

Analytical Tools: Tableau

OOAD Modeling Tools: Rational Rose, Rational Clear Case, Enterprise Architect, Microsoft Visio

IDE/Development Tools: Eclipse 3.5, NetBeans, MyEclipse, Oracle JDeveloper 10.1.3, SOAP UI, Ant, Maven, RAD

DB Tools: TOAD, MySQL, MySQL Developer

ETL Tools: IBM Datastage 8.1, Netezza

Operating Systems: Windows 9x/NT/2000, Linux

PROFESSIONAL EXPERIENCE

Confidential, Dublin, California

Hadoop Administrator/ Developer

Responsibilities:

  • Evaluated business requirements and prepared detailed specifications that follow project guidelines required to develop written programs.
  • Involved in installing, configuring, supporting, and managing Hadoop clusters; cluster administration included commissioning and decommissioning of DataNodes, capacity planning, slot configuration, performance tuning, cluster monitoring, and troubleshooting.
  • Responsible for building scalable distributed data solutions using Hadoop.
  • Analyzed large amounts of data sets to determine optimal way to aggregate and report on it.
  • Developed simple to complex MapReduce jobs using Hive and Pig.
  • Administered, installed, upgraded, and managed Hadoop distributions (CDH3, CDH4 with Cloudera Manager), Hive, and HBase.
  • Planned and executed system upgrades for existing Hadoop clusters.
  • Imported/exported data between RDBMS and HDFS using data ingestion tools like Sqoop.
  • Commissioned and decommissioned nodes in the Hadoop cluster.
  • Used the Fair Scheduler to manage MapReduce jobs so that each job gets roughly the same amount of CPU time.
  • Recovered from node failures and troubleshot common Hadoop cluster issues.
  • Scripted Hadoop package installation and configuration to support fully automated deployments.
  • Optimized MapReduce jobs to use HDFS efficiently by using various compression mechanisms.
  • Handled importing of data from various data sources, performed transformations using Hive and MapReduce, loaded data into HDFS, and extracted data from MySQL into HDFS using Sqoop.
  • Imported data from different sources like HDFS/HBase into Spark RDDs.
  • Experienced in implementing Spark RDD transformations and actions to perform business analysis.
  • Migrated HiveQL queries on structured data to Spark SQL to improve performance (see the Spark SQL sketch after this list).
  • Developed multiple POCs using PySpark, deployed them on the YARN cluster, and compared the performance of Spark with Hive and SQL/Teradata.
  • Installed Oozie workflow engine to run multiple Hive and Pig jobs which run independently.
  • Worked with Kafka on raw data, transforming it into new Kafka topics for further consumption (see the Kafka sketch at the end of this project).
  • Developed Spark code using Scala and Spark-SQL/Streaming for faster testing and processing of data.
  • Worked on NoSQL databases including HBase, MongoDB, and Cassandra.
  • Worked on the Ad hoc queries, Indexing, Replication, Load balancing, Aggregation in MongoDB.
  • Exported the analyzed data to the relational databases using Sqoop for visualization and to generate reports for the BI team.
  • Used Kafka features such as partitioning and the replicated commit log to maintain messaging feeds in a distributed fashion.
  • Used Kafka for publish-subscribe messaging as a distributed commit log; experienced with its speed, scalability, and durability.
  • Managed and reviewed Hadoop log files.
  • Involved in creating Hive tables, loading with data and writing hive queries which run internally in MapReduce.
  • Used Hive to analyze the partitioned and bucketed data and compute various metrics for reporting.
  • Installed and configured Pig and wrote Pig Latin scripts.
  • Developed Pig Latin scripts to extract the data from the web server output files to load into HDFS.
  • Loaded and transformed large sets of structured, semi-structured, and unstructured data.
  • Worked with application teams to install operating system and Hadoop updates, patches, and version upgrades as required.
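
A minimal sketch of the HiveQL-to-Spark SQL migration described above, assuming a SparkSession with Hive support so existing metastore tables can be queried directly; the table and column names are hypothetical.

    import org.apache.spark.sql.SparkSession

    object HiveToSparkSql {
      def main(args: Array[String]): Unit = {
        // enableHiveSupport lets Spark SQL read tables registered in the existing Hive metastore,
        // so the original HiveQL aggregate can be re-run as Spark SQL without moving data.
        val spark = SparkSession.builder()
          .appName("HiveToSparkSql")
          .enableHiveSupport()
          .getOrCreate()

        // Hypothetical table and columns standing in for the original HiveQL query.
        val dailyTotals = spark.sql(
          """SELECT txn_date, SUM(amount) AS total_amount
            |FROM transactions
            |GROUP BY txn_date""".stripMargin)

        dailyTotals.write.mode("overwrite").saveAsTable("daily_totals")
        spark.stop()
      }
    }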

Environment: Hadoop, MapReduce, HDFS, Hive, Pig, Java, SQL, Spark, Kafka, MongoDB, Cassandra, Sqoop, Nagios, Ganglia, Zookeeper, Fair Scheduler, Java (jdk1.6)
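
A minimal Scala sketch of the Kafka usage described in this project: reading raw records and republishing them to a new topic for downstream consumers. It uses the standard org.apache.kafka.clients producer; the broker address, topic name, and normalization step are hypothetical.

    import java.util.Properties
    import org.apache.kafka.clients.producer.{KafkaProducer, ProducerRecord}

    object RawFeedRepublisher {
      def main(args: Array[String]): Unit = {
        val props = new Properties()
        props.put("bootstrap.servers", "broker1:9092") // hypothetical broker
        props.put("key.serializer", "org.apache.kafka.common.serialization.StringSerializer")
        props.put("value.serializer", "org.apache.kafka.common.serialization.StringSerializer")

        val producer = new KafkaProducer[String, String](props)
        // Read raw lines (stdin stands in for the raw feed), normalize them,
        // and republish to a cleaned topic for further consumption.
        scala.io.Source.stdin.getLines()
          .map(_.trim)
          .filter(_.nonEmpty)
          .foreach(line => producer.send(new ProducerRecord[String, String]("clean-events", line)))
        producer.close()
      }
    }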

Confidential, Charlotte, NC

Hadoop Administrator/Developer

Responsibilities:

  • Installed and configured Hadoop MapReduce and HDFS; developed multiple MapReduce jobs in Java for data cleaning and preprocessing.
  • Installed and configured Pig and wrote Pig Latin scripts.
  • Solid understanding of the REST architectural style and its application to well-performing web sites for global usage.
  • Involved in ETL, data integration, and migration; imported data using Sqoop to load data from Oracle to HDFS on a regular basis.
  • Developed scripts and batch jobs to schedule various Hadoop programs.
  • Wrote Hive queries for data analysis to meet the business requirements.
  • Implemented test scripts to support test driven development and continuous integration.
  • Responsible for managing data coming from different sources.
  • Loaded and transformed large sets of structured, semi-structured, and unstructured data.
  • Experience in managing and reviewing Hadoop log files.
  • Designed front-end, user-interactive (UI) web pages using web technologies like HTML5/CSS3, AngularJS, and Bootstrap.
  • Worked on Hive for exposing data for further analysis and for transforming files from different analytical formats to text files.
  • Managing and scheduling Jobs on a Hadoop cluster.
  • Importing and exporting data into HDFS and Hive using Sqoop.
  • Involved in creating Hive tables, loading them with data, and writing Hive queries that run internally as MapReduce jobs.
  • Used Pig as ETL tool to do transformations, event joins, filter bot traffic and some pre-aggregations before storing the data onto HDFS.
  • Involved in writing Hive scripts to extract, transform, and load data into the database.
  • Used JIRA for bug tracking.
  • Used CVS for version control.

Environment: Hadoop, MapReduce, HDFS, Hive, Pig, Sqoop, Shell Scripting, Java (JDK 1.6), Eclipse, Oracle 10g, PL/SQL, SQL*Plus, Toad 9.6, Linux, JIRA, CVS.

Confidential, Richmond, VA

Hadoop Administrator/ Developer

Responsibilities:

  • Installed/Configured/Maintained Apache Hadoop clusters for application development and Hadoop tools like Hive, Pig, HBase, Zookeeper and Sqoop.
  • Installed the cluster, performed commissioning and decommissioning of DataNodes, NameNode recovery, and capacity planning; configured Hadoop MapReduce and HDFS, and developed multiple MapReduce jobs in Java for data cleaning.
  • Wrote the shell scripts to monitor the health check of Hadoop daemon services and respond accordingly to any warning or failure conditions.
  • Developed data pipeline using Flume, Sqoop, Pig and Java MapReduce to ingest customer behavioral data and financial histories into HDFS for analysis.
  • Involved in collecting and aggregating large amounts of log data using Apache Flume and staging data in HDFS for further analysis.
  • Collected the logs data from web servers and integrated in to HDFS using Flume.
  • Worked on installation planning and slot configuration.
  • Implemented NameNode backup using NFS for high availability.
  • Developed Pig Latin scripts to extract data from the web server output files and load it into HDFS.
  • Installed Oozie workflow engine to run multiple Hive and Pig Jobs.
  • Used Sqoop to import and export data between HDFS and RDBMS.
  • Created Hive tables, loaded them with data, and wrote Hive UDFs (see the UDF sketch after this list).
  • Exported the analyzed data to the relational databases using Sqoop for visualization and to generate reports.
  • Involved in migrating ETL processes from Oracle to Hive to test ease of data manipulation.
  • Worked on NoSQL databases including HBase, MongoDB, and Cassandra.
  • Supported in setting up QA environment and updating configurations for implementing scripts with Pig and Sqoop.
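
A minimal sketch of a Hive UDF of the kind mentioned above, written in Scala against the classic org.apache.hadoop.hive.ql.exec.UDF API; the masking logic and names are hypothetical.

    import org.apache.hadoop.hive.ql.exec.UDF

    // Masks all but the last four characters of an account string (hypothetical example).
    class MaskAccount extends UDF {
      def evaluate(account: String): String = {
        if (account == null) null
        else if (account.length <= 4) account
        else "*" * (account.length - 4) + account.takeRight(4)
      }
    }

Once packaged into a JAR, a UDF like this would typically be registered in Hive with ADD JAR and CREATE TEMPORARY FUNCTION before being called from HiveQL queries.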

Environment: NoSQL, Cassandra, MongoDB, Sqoop, HDFS, HBase, PIG Latin, Hive, Flume, MapReduce, JAVA, Eclipse, NetBeans.

Confidential

Java/J2EE Developer

Responsibilities:

  • Involved in various phases of the Software Development Life Cycle (SDLC), such as design, development, and unit testing.
  • Involved in analysis and design of the application.
  • Involved in preparing the detailed design document for the project.
  • Developed the application using J2EE architecture.
  • Involved in developing JSP forms.
  • Designed and developed web pages using HTML and JSP.
  • Designed various applets using JBuilder.
  • Designed and developed Servlets to communicate between presentation and business layer.
  • Used EJB as middleware in developing a three-tier distributed application.
  • Developed Session Beans and Entity Beans for business and data processing.
  • Used JMS in the project for sending and receiving the messages on the queue.
  • Developed the Servlets for processing the data on the server.
  • Processed data was transferred to the database through Entity Beans.
  • Used JDBC for database connectivity with MySQL Server.
  • Used query statements and advanced PreparedStatements; designed tables and indexes.
  • Wrote complex SQL queries and stored procedures.
  • Used CVS for version control.
  • Involved in unit testing using JUnit.
  • Prepared the installation, customer, and configuration guides, which were delivered to the customer along with the product.

Environment: Core Java, J2EE, JSP, Servlets, XML, XSLT, EJB, JDBC, JBuilder 8.0, JBoss, Swing, JavaScript, JMS, HTML, CSS, MySQL Server, CVS, Windows 2000.
