Sr. Hadoop Developer Resume

Woodland Hills, CA

SUMMARY

  • 7 years of professional experience in the IT industry developing, implementing, configuring and testing Hadoop ecosystem components, and maintaining various web-based applications using Java and J2EE.
  • 3+ years' experience in the Hadoop framework and its ecosystem.
  • Extensive work experience in the areas of Banking, Finance, Insurance and Marketing Industries.
  • Familiar with data architecture including data ingestion pipeline design, Hadoop information architecture, data modelling and data mining, machine learning and advanced data processing.
  • Hands-on experience in Hadoop/Big Data technologies for storage, querying, processing and analysis of data.
  • Excellent knowledge of Hadoop architecture and ecosystem components such as HDFS, Hive, Pig, Sqoop, Job Tracker, Task Tracker, Name Node and Data Node.
  • Expertise in writing Hadoop jobs for analyzing data using MapReduce, Hive and Pig.
  • Knowledge of installing, configuring and using Hadoop ecosystem components such as Hadoop MapReduce, HDFS, HBase, Oozie, Hive, Sqoop, Pig, Spark, Kafka, Storm, Zookeeper and Flume.
  • Experience in managing and reviewing Hadoop log files.
  • Experience in analyzing data using HiveQL, Pig Latin, HBase and custom Map Reduce programs in Java.
  • Experience in importing and exporting data using Sqoop from HDFS to RDBMS and vice versa.
  • Experienced in extending Hive and Pig core functionality by writing custom UDFs using Java.
  • Experience in building and maintaining multiple Hadoop clusters of different sizes and configurations, and in setting up the rack topology for large clusters.
  • Experience in installing, configuring, supporting and managing Cloudera's Hadoop platform along with CDH3 and CDH4 clusters.
  • Experience in NoSQL databases such as HBase and Cassandra.
  • Experienced with job workflow scheduling tools such as Oozie.
  • Experienced in managing Hadoop clusters using the Cloudera Manager tool.
  • Experience in performance tuning by identifying bottlenecks in sources, mappings, targets and partitioning.
  • Generated Java APIs for retrieval and analysis on NoSQL databases such as HBase and Cassandra.
  • Experience in Object Oriented Analysis, Design and development of software using UML Methodology.
  • Excellent Java development skills using J2EE, Spring, J2SE, Servlets, JUnit, MRUnit, JSP and JDBC.
  • Experienced in backend database programming using SQL, PL/SQL, Stored Procedures, Functions, Macros, Indexes, Joins, Views, Packages and Database Triggers.
  • Experience in application development using Java, RDBMS, Talend, Linux shell scripting, Greenplum and DB2.
  • Ability to adapt to evolving technology, strong sense of responsibility and accomplishment.
  • Excellent interpersonal and communication skills; creative, research-minded, technically competent and result-oriented with problem-solving and leadership skills.
  • Extensive experience in working with MS Visio, PowerPoint, Word and Excel.

PROFESSIONAL EXPERIENCE

Confidential, Woodland Hills, CA

Sr. Hadoop Developer

Responsibilities:

  • Worked on analyzing the Hadoop stack and different big data analytic tools including Pig, Hive, the HBase database and Sqoop.
  • Designed high level ETL architecture for overall data transfer from the OLTP to OLAP.
  • Created various Documents such as Source-To-Target Data mapping Document, Unit Test Cases and Data Migration Document.
  • Installed and configured Pig for ETL jobs.
  • Wrote MapReduce jobs to perform operations such as copying data on HDFS and defining job flows on an EC2 server, and to load and transform large sets of structured, semi-structured and unstructured data.
  • Imported and exported database dumps through Oracle Server 10g and 11g (using IMPDP and EXPDP) and SQL Server.
  • Installed Oozie workflow engine to run multiple MapReduce jobs.
  • Developed workflow in Oozie to automate the tasks of loading the data into HDFS and processing with Pig.
  • Worked on cluster installation and on commissioning and decommissioning of data nodes.
  • Implemented best income logic using Pig scripts and UDFs.
  • Extracted files from CouchDB through Sqoop, placed them in HDFS and processed them.
  • Reviewed Hadoop log files to detect failures.
  • Supported in setting up QA environment and updating configurations for implementing scripts.
  • Worked on Hive to expose data for further analysis and to transform files from different analytical formats into text files.
  • Imported data from MySQL server and other relational databases to Apache Hadoop with the help of Apache Sqoop.
  • Developed PL/SQL procedures for processing business logic in the database.
  • Imported data using Sqoop from Teradata using Teradata connector.
  • Created Hive tables and worked on them for data analysis in order to meet the business requirements.
  • Gained strong business knowledge of health insurance, claim processing, fraud suspect identification and the appeals process.
  • Implemented a script to transmit sys print information from Oracle to HBase using Sqoop.
  • Implemented Kafka and Storm topologies capable of handling and channeling high-volume data streams, and integrated the Storm topologies with Esper to filter and process that data across multiple clusters for complex event processing.
  • Worked on the Analytics Infrastructure team to develop a stream filtering system on top of Apache Kafka.
  • Worked on a POC on Spark and Scala parallel processing.
  • Streamed data in real time using Spark with Kafka.
  • Implemented data ingestion and cluster handling for real-time processing using Kafka.
  • Experience with Core Distributed computing and Data Mining Library using Apache Spark.
  • Used Hive for data processing and batch data filtering. Used Spark for any other value-centric data filtering.
  • Designed and implemented Spark test bench application to evaluate quality of recommendations made by the engine.
  • Monitored and identified performance bottlenecks in ETL code. Worked on data utilizing a Hadoop, Zookeeper and Accumulo stack, aiding in the development of specialized indexes for performant queries on big data implementations.
  • Used Zookeeper for various types of centralized configurations, SVN for version control, Maven for project management, Jira for internal bug/defect management, MapReduce.
  • Installed the Operating System on Solaris and Linux servers and Blades over the network.
  • Gained good experience with NoSQL databases.
  • Worked on MongoDB database concepts such as locking, transactions, indexes, Sharding, replication, schema design.
  • Configured high availability using geographically distributed MongoDB replica sets across multiple data centers.
  • Generated Java APIs for retrieval and analysis on NoSQL databases such as HBase and Cassandra.
  • Designed and implemented a MapReduce-based large-scale parallel relation-learning system.
  • Installed and benchmarked Hadoop/HBase clusters for internal use.
  • Wrote HBase client programs in Java and web services.

Environment: Informatica Power Center 9.5, Hadoop, Spark, HDFS, Hive, Pig, HBase, Oozie, Sqoop, Kafka, Storm, Zookeeper, MongoDB, MapReduce, Cassandra, Linux, XML, Toad, Maven, NoSQL, MySQL Workbench, Java 6, Eclipse, Oracle 10g, PL/SQL, SQL*PLUS.

Confidential, Tampa, FL

Hadoop Developer

Responsibilities:

  • Handled importing of data from various data sources, performed data transformations using HAWQ, Map Reduce.
  • Developed Hive queries on data logs to perform a trend analysis of user behavior on various online modules.
  • Developed Pig UDFs to pre-process the data for analysis.
  • Involved in the setup and deployment of a Hadoop cluster.
  • Developed Map Reduce programs for some refined queries on big data.
  • Involved in loading data from UNIX file system to HDFS.
  • Loaded data into HDFS and extracted the data from MySQL into HDFS using Sqoop.
  • Exported the analyzed data to the relational databases using Sqoop and generated reports for the BI team.
  • Managed and scheduled jobs on a Hadoop cluster using Oozie.
  • Worked with the Infrastructure team to design and develop a Kafka and Storm based data pipeline.
  • Developed a Storm monitoring bolt for validating pump tag values against high/low and high-high/low-low values from preloaded metadata.
  • Installed and configured Talend ETL on single and multi-server environments.
  • Troubleshot, debugged and fixed Talend-specific issues while maintaining the health and performance of the ETL environment.
  • Developed simple to complex MapReduce jobs using Hive.
  • Implemented Partitioning and bucketing in Hive.
  • Mentored analyst and test team for writing Hive Queries.
  • Involved in setting up HBase to use HDFS.
  • Installed patches and packages using RPM and YUM on Red Hat and SUSE Linux, and using patchadd and pkgadd on the Solaris 10 operating system.
  • Extensively used Pig for data cleansing.
  • Integrated the Kafka and Storm based data pipeline with Amazon Web Services EMR, S3 and RDS.
  • Planned and implemented UNIX shell scripting to automate cross-domain file flow, moving time-sensitive files from the high-side network down and out to other sites.
  • Involved in creating data-models for customer data using Cassandra Query Language.
  • Performed benchmarking of the NoSQL databases Cassandra and HBase.
  • Implemented Spark using Scala and Spark SQL for faster testing and processing of data.
  • Knowledgeable in Spark and Scala, mainly through framework exploration for the transition from Hadoop/MapReduce to Spark.
  • Supported in setting up QA environment and updating configurations for implementing scripts with Pig and Sqoop.
  • Collaborated with Business users for requirement gathering for building Tableau reports per Business needs.
  • Monitored the performance and identified performance bottlenecks in ETL code.
  • Involved in testing the SQL Scripts for report development, Tableau reports, Dashboards, Scorecards and handled the performance issues effectively.
  • Involved in collecting and aggregating large amounts of log data using Apache Flume and staging data in HDFS for further analysis.
  • Configured Flume to extract the data from the web server output files to load into HDFS.
  • Developed workflow in Oozie to automate the tasks of loading the data into HDFS and pre-processing with Pig.

Environment: Unix Shell Scripting, Oracle 11g, DB2, Erwin 4.0, HDFS, Kafka, Storm, Spark, ETL, Java (jdk1.7), Pig, Linux, Cassandra, MapReduce, MS Access, Toad, SQL, MySQL Workbench, XML, NoSQL, HBase, Hive, Sqoop, Tableau, Flume, Talend, Oozie.

Confidential, Franklin Lakes, NJ

Java/J2EE Developer

Responsibilities:

  • Involved in gathering business requirements, analyzing the project and creating UML diagrams such as Use Cases, Class Diagrams, Sequence Diagrams and flowcharts for the Optimization module using Microsoft Visio.
  • Designed and developed Optimization UI screens for Rate Structure, Operating Cost, Temperature and Predicted Loads using JSF MyFaces, JSP, JavaScript and HTML.
  • Configured faces-config.xml for the page navigation rules and created managed and backing beans for the Optimization module.
  • Developed JSP web pages for Rate Structure and Operating Cost using the JSF HTML and JSF Core tag libraries.
  • Designed and developed the framework for the IMAT application implementing all the six phases of JSF life cycle and wrote Ant build, deployment scripts to package and deploy on JBoss application server.
  • Designed and developed Simulated annealing algorithm to generate random Optimization schedules and developed neural networks for the CHP system using Session Beans.
  • Integrated EJB 3.0 with JSF and managed application state management, business process management (BPM) using JBoss Seam.
  • Wrote AngularJS controllers, views, and services for new website features.
  • Developed Cost function to calculate the total cost for each CHP Optimization schedule generated by the Simulated Annealing algorithm using EJBs.
  • Implemented Spring web flow for the Diagnostics Module to define page flows with actions and views and created POJOs and used annotations to map them to SQL Server database using EJB.
  • Wrote DAO classes, EJB 3.0 QL queries for Optimization schedule and CHP data retrievals from SQL Server database.
  • Used Eclipse as the IDE to develop the application and JIRA for bug and issue tracking.
  • Created combined deployment descriptors using XML for all the session and entity beans.
  • Wrote JSF and JavaScript validations to validate data on the UI for Optimization and Diagnostics, and developed Web Services to access the external system (WCC) for the Diagnostics module.
  • Wrote a Message Driven Bean to implement the Diagnostic Engine, configured the JMS queue details, and was involved in performance tuning of the application using JProbe and JProfiler.
  • Designed and coded application components in an Agile environment utilizing a test driven development approach.
  • Skilled in test driven development and Agile development.
  • Wrote JUnit test cases to test the Optimization Module and created functions, sub queries and stored procedures using PL/SQL.
  • Tested the Simulated Annealing algorithm with different input schedules (always-on, always-off, human-optimized schedule and five random input schedules) and stored the test results in a spreadsheet.
  • Created technical design document for the Diagnostics Module and Optimization module covering Cost function and Simulated Annealing approach.
  • Involved in code reviews and followed versioning guidelines.

Confidential, Irvine, CA

Java/J2EE Developer

Responsibilities:

  • Created the UI tool using Java, XML, DHTML and JavaScript.
  • Wrote stored procedures using PL/SQL for data retrieval from different tables.
  • Worked extensively on bug fixes on the server side and made cosmetic changes on the UI side.
  • Served on the performance tuning team and implemented a caching mechanism and other changes.
  • Recreated the system architecture diagram and created numerous new class and sequence diagrams.
  • Designed and developed UI using HTML, JSP and Struts where users have all the items listed for auctions.
  • Developed Authentication and Authorization modules so that only authorized persons can access the inventory-related operations.
  • Developed Controller Servlets, Action and Form objects for process of interacting with Oracle database and retrieving dynamic data.
  • Responsible for coding SQL Statements and Stored procedures for back end communication using JDBC.
  • Developed the Login screen so that only authorized and authenticated administrators can access the application.
  • Developed various features such as transaction history and product search that enable users to understand the system efficiently.
  • Involved in preparing the Documentation of the project to understand the system efficiently.

Environment: JDK1.2, JavaScript, HTML, DHTML, XML, Struts, JSP, Servlet, JNDI, J2EE, Tomcat, Rational Rose, Oracle.
