
Sr. Hadoop Developer Resume


Irving, TX

SUMMARY

  • Over 9 years of experience in the analysis, design, development, implementation, and testing of web-based distributed applications
  • 4 years of experience in Hadoop, its ecosystem, and Java/J2EE related technologies
  • Primary technical skills in HDFS, MapReduce, Pig, Hive, Impala, Sqoop, HBase, Aster Teradata, Zookeeper.
  • Expertise in writing SQL queries using Teradata.
  • Worked extensively with Teradata utilities such as MultiLoad, TPump, FastLoad, and FastExport, writing SQL queries and loading data into Data Warehouses/Data Marts.
  • Expertise in processing Big Data and analyzing it using Pig Latin and HiveQL.
  • Experience in importing data into and exporting data out of Hadoop to/from RDBMS using Sqoop.
  • Good experience with high-level Python web frameworks.
  • Experience with object-oriented programming (OOP) concepts using Python and Java.
  • Experienced in LAMP (Linux, Apache, MySQL, and Python) Architecture.
  • Experienced in developing web-based applications using Python, Java, XML, CSS, HTML, DHTML and JavaScript.
  • Experienced in developing Web Services with Python programming language.
  • Excellent understanding of Hadoop architecture and its components such as HDFS, JobTracker, TaskTracker, NameNode, DataNode and MapReduce programming paradigm.
  • Good experience in extracting data and generating statistical analysis using the Business Intelligence tool Tableau for better analysis of data.
  • Extensive experience in developing applications using JSP, Servlets, JavaBeans, JSTL, JSP Custom Tag Libraries, JDBC, JMS publish/subscribe, JavaScript, XML and XSLT.
  • Good knowledge of understanding and writing scripts in Perl, Python, and shell.
  • Experience in creating complex SQL queries, SQL tuning, and writing PL/SQL blocks such as stored procedures, functions, cursors, indexes, triggers, and packages.
  • Good knowledge and understanding of Storm and its functionality.
  • Very good knowledge of and hands-on experience with Cassandra, Flume, and Spark (YARN).
  • Good understanding of the machine learning tool Mahout.
  • Good knowledge of the distributed coordination system ZooKeeper and the search platforms Solr and Elasticsearch.
  • Good Knowledge on Tez.
  • Experience with performance tuning Apache Spark systems.
  • Exposure to Cloudera development environment and management using Cloudera Manager.
  • In-depth understanding of Data Structures and Algorithms and Optimization.
  • Expertise in all major phases of an SDLC, including design, development, deployment, implementation, and support.
  • Excellent hands-on experience with Java/J2EE and UNIX environments.
  • Working experience in Agile and Waterfall models.
  • Good experience in implementation and testing of Web Services using SOAP and REST based architecture.
  • Good experience in MVC architecture and proficient in OOP concepts.
  • Designed projects using MVC architecture, providing multiple views of the same model and thereby achieving efficient modularity and scalability.
  • Expertise in preparing the test cases, documenting and performing unit testing and Integration testing.
  • Expertise in cross-platform (PC/Mac, desktop, laptop, tablet) and cross-browser (IE, Chrome, Firefox, Safari) development.
  • Skilled in problem solving and troubleshooting, strong organizational and interpersonal skills.
  • Professional and cooperative attitude with an adaptable approach to problem analysis and solution definition.
  • Good team player with strong analytical and communication skills.

TECHNICAL SKILLS

Programming Languages: Java, C, C++, JMS, PHP 4 & 5, C#, SQL, PL/SQL, Python, Bash

BigData Technologies: Hadoop, HDFS, MapReduce, Hive, Pig, HBase, Impala, Sqoop, Oozie, ZooKeeper, Spark, Solr, Flume

NoSQL: HBase, Cassandra

Development Tools: Eclipse, .NET Framework 3.0/3.5, Visual Studio 2003/2005/2008, ASP.NET, MS Visio 2007

Web Development: JSP, Struts, ATG, ASP.NET, HTML, CSS, DHTML, XML Web Services, AJAX

Database Tools: Oracle 9.x/10.x, SQL Server 2005/2008, MS Access 2003/2007/2010, Teradata

Design & Analysis: Design patterns, UML modelling, MVVM, MVC

Frameworks: ATG Framework, JSF, Struts, Spring, MVC

PROFESSIONAL EXPERIENCE

Confidential, Irving, TX

Sr. Hadoop Developer

Responsibilities:

  • Worked closely with business users to gather and understand requirements.
  • Prepared documentation for all requirements and report enhancements.
  • Set up a 64-node cluster and configured the entire Hadoop platform.
  • Migrated the required data from Oracle and MySQL into HDFS using Sqoop and imported flat files of various formats into HDFS.
  • Proposed an automated system using shell scripts to run the Sqoop jobs.
  • Wrote shell scripts to dump the shared data from the MySQL server to HDFS.
  • Streamed data in real time using Spark with Kafka.
  • Installed Storm and Kafka on a 4-node cluster.
  • Wrote JUnit test cases for the Storm topology.
  • Wrote a Storm topology to accept events from the Kafka producer and emit them into the Cassandra DB (a bolt sketch follows this list).
  • Configured Spark Streaming to receive real-time data from Kafka and store the stream data to HDFS using Scala.
  • Implemented Spark jobs using Scala and Spark SQL for faster testing and processing of data.
  • Writing/Modifying Shell scripts to support various types of network elements in a simulated network.
  • Involved in collecting, aggregating and moving data from servers to HDFS using Apache Flume.
  • Used Flume to collect all the web logs from the online ad servers and push them into HDFS.
  • Worked with the Apache Crunch library to write, test and run Hadoop MapReduce pipeline jobs.
  • Performed joins and data aggregation using Apache Crunch.
  • Worked on Oozie workflow engine for job scheduling.
  • Integrated Oozie with the rest of the Hadoop stack supporting several types of Hadoop jobs out of the box (such as Map-Reduce, Pig, Hive, and Sqoop) as well as system specific jobs (such as Java programs and shell scripts).
  • Developed job flows in Oozie to automate the workflow for extraction of data from warehouses and weblogs.
  • Developed custom MapReduce programs to analyze data and used Pig Latin to clean unwanted data.
  • Developed a front-end GUI as a stand-alone Python application.
  • Created business logic using Python.
  • Developed Python scripts for extracting data from HTML files.
  • Developed Python scripts which format and create daily transmission files.
  • Generated property list for every application dynamically using Python.
  • Worked in an Agile development approach.
  • Created the estimates and defined the sprint stages.
  • Developed a strategy for Full load and incremental load using Sqoop.
  • Mainly worked on Hive queries to categorize data of different claims.
  • Integrated the Hive warehouse with HBase.
  • Wrote customized Hive UDFs in Java where the required functionality was too complex (a UDF sketch follows this list).
  • Implemented partitioning, dynamic partitions, and buckets in Hive.
  • Involved in loading and transforming large sets of Structured, Semi-Structured and Unstructured data and analyzed them by running Hive queries and Pig scripts.
  • Extracted the data from Teradata into HDFS using Sqoop.
  • Developed ETL processes using Informatica and Teradata FastLoad and MultiLoad scripts.
  • Extracted data stored in different sources such as Oracle 9i, SQL Server, and flat files, loading it into staging tables first and then into Teradata.
  • Worked on Teradata utilities like BTEQ, FastLoad, and MultiLoad scripts.
  • Implemented cross-connectivity of Virtual Private Clouds (AWS VPC) across regions using Openswan over IPsec as part of disaster recovery in AWS.
  • Worked on integrating AWS APIs for automated network configuration and server/application provisioning.
  • Used different file formats like Text files, Sequence Files, Avro.
  • Used Zookeeper to manage coordination among the clusters.
  • Created and maintained Technical documentation for launching Hadoop Clusters and for executing Hive queries and Pig Scripts.
  • Used Impala to pull data from Hive tables.
  • Generate final reporting data using Tableau for testing by connecting to the corresponding Hive tables using Hive ODBC connector.
  • Maintained System integrity of all sub-components (primarily HDFS, MR, HBase, Hive and Pig).
  • Monitored system health and logs and responded to any warning or failure conditions.
  • Presented data and dataflow using Talend for reusability.
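
For illustration, a minimal sketch of the kind of customized Hive UDF mentioned above; the class name, column semantics, and normalization logic are assumptions rather than details from the actual project:

    import org.apache.hadoop.hive.ql.exec.UDF;
    import org.apache.hadoop.io.Text;

    // Hypothetical UDF that normalizes a claim code before Hive queries group on it.
    public final class NormalizeClaimCode extends UDF {
        public Text evaluate(Text input) {
            if (input == null) {
                return null;
            }
            // Trim whitespace and upper-case the code so downstream joins and group-bys match.
            return new Text(input.toString().trim().toUpperCase());
        }
    }

Such a UDF would typically be packaged in a JAR, loaded with ADD JAR, and registered via CREATE TEMPORARY FUNCTION before being called from HiveQL.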

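A hedged sketch of a terminal Storm bolt of the sort described above, persisting Kafka events into Cassandra; the keyspace, table, and tuple field names are assumptions, and the actual topology wiring and Kafka spout configuration are not shown:

    import java.util.Map;

    import backtype.storm.task.TopologyContext;
    import backtype.storm.topology.BasicOutputCollector;
    import backtype.storm.topology.OutputFieldsDeclarer;
    import backtype.storm.topology.base.BaseBasicBolt;
    import backtype.storm.tuple.Tuple;

    import com.datastax.driver.core.Cluster;
    import com.datastax.driver.core.PreparedStatement;
    import com.datastax.driver.core.Session;

    // Terminal bolt: consumes events emitted by the Kafka spout and writes them to Cassandra.
    public class CassandraWriterBolt extends BaseBasicBolt {

        private transient Session session;
        private transient PreparedStatement insert;

        @Override
        public void prepare(Map stormConf, TopologyContext context) {
            Cluster cluster = Cluster.builder().addContactPoint("cassandra-host").build();
            session = cluster.connect("events_ks");
            insert = session.prepare("INSERT INTO raw_events (event_id, payload) VALUES (?, ?)");
        }

        @Override
        public void execute(Tuple input, BasicOutputCollector collector) {
            // Field names must match whatever the upstream Kafka spout declares (assumed here).
            session.execute(insert.bind(input.getStringByField("id"), input.getStringByField("event")));
        }

        @Override
        public void declareOutputFields(OutputFieldsDeclarer declarer) {
            // Nothing is emitted downstream; this bolt only persists to Cassandra.
        }
    }
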
Environment: HDFS, Hadoop, Pig, Hive, Sqoop, Cloudera CDH4, Flume, Storm, MapReduce, Oozie, Java 6/7, Oracle 10g, Python, Subversion, Toad, YARN, UNIX Shell Scripting, Apache Crunch, Teradata, SOAP, REST services, MySQL, Tableau, Talend, Agile Methodology, JIRA, AutoSys

Confidential, San Jose, CA

Big Data / Hadoop Developer

Responsibilities:

  • Imported Data from Different Relational Data Sources like RDBMS, Teradata to HDFS using Sqoop.
  • Development and Testing using Teradata Utilities and Stored Procedures.
  • Involved in running Hadoop streaming jobs to process terabytes of text data.
  • Imported bulk data into HBase using MapReduce programs (a mapper sketch follows this list).
  • Imported data using Sqoop from Teradata using the Teradata connector.
  • Performed analytics on time-series data stored in HBase using the HBase API (a scan sketch follows this list).
  • Designed and implemented incremental imports into Hive tables.
  • Used the REST API to access HBase data and perform analytics.
  • Designed and developed Use-Case Diagrams, Class Diagrams, and Object Diagrams using UML in Rational Rose.
  • Implemented Flume to collect data from various sources and load it into HDFS.
  • Added test cases in Python to thoroughly test the developed library features.
  • Developed Python scripts to parse XML documents and load the data in database.
  • Used Python scripts to update content in the database and manipulate files.
  • Worked in Loading and transforming large sets of structured, semi structured and unstructured data.
  • Involved in maintaining various Unix Shell scripts.
  • Written Hive jobs to parse the logs and structure them in tabular format to facilitate effective querying on the log data.
  • Involved in creating Hive tables, loading them with data, and writing Hive queries that run internally as MapReduce jobs.
  • Worked on configuration and administration of Load Balancers, Network and Auto scaling for subdomains in AWS VPC.
  • Experienced in managing and reviewing the Hadoop log files.
  • Migrated ETL jobs to Pig scripts that performed transformations, joins, and some pre-aggregations before storing the data onto HDFS.
  • Worked with Avro Data Serialization system to work with JSON data formats.
  • Worked on different file formats like Sequence files, XML files and Map files using MapReduce Programs.
  • Involved in Unit testing and delivered Unit test plans and results documents using Junit and MRUnit.
  • Exported data from HDFS environment into RDBMS using Sqoop for report generation and visualization purpose.
  • Worked on Oozie workflow engine for job scheduling.
  • Created and maintained Technical documentation for launching HADOOP Clusters and for executing Pig Scripts.
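
A hedged sketch of the kind of time-series read done through the older HBase client API; the table name, column family, and time window are illustrative assumptions:

    import java.io.IOException;

    import org.apache.hadoop.conf.Configuration;
    import org.apache.hadoop.hbase.HBaseConfiguration;
    import org.apache.hadoop.hbase.client.HTable;
    import org.apache.hadoop.hbase.client.Result;
    import org.apache.hadoop.hbase.client.ResultScanner;
    import org.apache.hadoop.hbase.client.Scan;
    import org.apache.hadoop.hbase.util.Bytes;

    // Scans one hour of time-series cells from an HBase table and prints them.
    public class TimeSeriesScan {
        public static void main(String[] args) throws IOException {
            Configuration conf = HBaseConfiguration.create();
            HTable table = new HTable(conf, "sensor_readings");
            try {
                Scan scan = new Scan();
                scan.addColumn(Bytes.toBytes("d"), Bytes.toBytes("value"));
                long now = System.currentTimeMillis();
                // Restrict the scan to cells written within the last hour.
                scan.setTimeRange(now - 3600000L, now);
                ResultScanner scanner = table.getScanner(scan);
                try {
                    for (Result row : scanner) {
                        byte[] value = row.getValue(Bytes.toBytes("d"), Bytes.toBytes("value"));
                        System.out.println(Bytes.toString(row.getRow()) + " -> " + Bytes.toString(value));
                    }
                } finally {
                    scanner.close();
                }
            } finally {
                table.close();
            }
        }
    }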

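Similarly, a minimal sketch of a map-only bulk import into HBase; the input layout ("rowkey,value" per line), column family, and driver wiring are assumptions:

    import java.io.IOException;

    import org.apache.hadoop.hbase.client.Put;
    import org.apache.hadoop.hbase.io.ImmutableBytesWritable;
    import org.apache.hadoop.hbase.util.Bytes;
    import org.apache.hadoop.io.LongWritable;
    import org.apache.hadoop.io.Text;
    import org.apache.hadoop.mapreduce.Mapper;

    // Each input line ("rowkey,value") becomes one HBase Put. The driver would set
    // TableOutputFormat as the output format, point TableOutputFormat.OUTPUT_TABLE at
    // the target table, and run with zero reducers.
    public class BulkImportMapper
            extends Mapper<LongWritable, Text, ImmutableBytesWritable, Put> {

        private static final byte[] FAMILY = Bytes.toBytes("d");
        private static final byte[] QUALIFIER = Bytes.toBytes("value");

        @Override
        protected void map(LongWritable offset, Text line, Context context)
                throws IOException, InterruptedException {
            String[] fields = line.toString().split(",", 2);
            if (fields.length < 2) {
                return; // skip malformed records
            }
            byte[] rowKey = Bytes.toBytes(fields[0]);
            Put put = new Put(rowKey);
            put.add(FAMILY, QUALIFIER, Bytes.toBytes(fields[1]));
            context.write(new ImmutableBytesWritable(rowKey), put);
        }
    }
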
Environment: Hadoop, HDFS, MapReduce, Hive, Oozie, Sqoop, Pig, Java, Teradata, REST API, Maven, MRUnit, JUnit.

Confidential, Sunnyvale, CA

Java Developer

Responsibilities:

  • Involved in analysis, design and high-level coding phase.
  • Developed the application using J2EE Design Patterns like Singleton and Factory pattern.
  • Used MVC at the presentation layer.
  • Developed front-end content using JSP, Servlets, DHTML, JavaScript and CSS.
  • Created a data source for interaction with the database.
  • Involved in writing the Spring configuration XML file that contains bean declarations; business classes are wired up to the front-end managed beans using the Spring IoC pattern.
  • Involved in integration of layers (UI, Business & DB access layers).
  • Coded classes to invoke Web Services.
  • Used Spring framework AOP features and JDBC module features to persist data to the database for a few applications (a JdbcTemplate sketch follows this list).
  • Also used the Spring IoC feature to obtain the Hibernate session factory and resolve other bean dependencies.
  • Monitored the error logs using Log4J and fixed the problems.
  • Developed, implemented, and maintained an asynchronous, AJAX based rich client for improved customer experience.
  • Validated the UI components using AJAX Validation Framework.
  • Implemented Ajax with jQuery to refresh user selections.
  • Developed the Action classes and Form Beans.
  • Developed XML converter classes based on JDOM, XPath, and XML technologies to obtain and persist data.
  • Developed and worked with JSP custom tags.
  • Involved in system, Unit and Integration testing.
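
A hedged sketch of the Spring JDBC module usage mentioned above; the DAO, table, and column names are hypothetical, and the DataSource would be injected through Spring IoC configuration:

    import java.util.List;

    import javax.sql.DataSource;

    import org.springframework.jdbc.core.JdbcTemplate;

    // Hypothetical DAO persisting data via Spring's JdbcTemplate.
    public class OrderDao {

        private final JdbcTemplate jdbcTemplate;

        public OrderDao(DataSource dataSource) {
            this.jdbcTemplate = new JdbcTemplate(dataSource);
        }

        public void save(String orderId, double amount) {
            jdbcTemplate.update("INSERT INTO orders (order_id, amount) VALUES (?, ?)",
                    orderId, amount);
        }

        public List<String> findOrderIds() {
            return jdbcTemplate.queryForList("SELECT order_id FROM orders", String.class);
        }
    }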

Environment: Servlets, JSP, DHTML, Struts, Spring, JavaScript, UML, Web Services, HTML, CSS, Eclipse, Java 1.5, J2EE, SQL Server 2008, Ant, Ajax, jQuery, Sun Solaris UNIX OS, Log4J and Oracle 10g.

Confidential

Java Developer

Responsibilities:

  • Involved in client requirement gathering, analysis & application design.
  • Used UML to draw use case diagrams, class & sequence diagrams.
  • Implemented client side data validations using JavaScript.
  • Implemented server side data validations using Java Beans.
  • Implemented views using JSP & JSTL1.0.
  • Developed Business Logic using Session Beans.
  • Implemented Entity Beans for Object Relational mapping.
  • Implemented the Service Locator pattern using local caching.
  • Worked with collections.
  • Implemented Session Facade Pattern using Session and Entity Beans.
  • Developed message-driven beans to listen to JMS destinations (a sketch follows this list).
  • Performed application level logging using log4j for debugging purpose.
  • Involved in fine tuning of application.
  • Thoroughly involved in testing phase and implemented test cases using Junit.
  • Involved in the development of Entity Relationship Diagrams using Rational Data Modeler.
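
A hedged, EJB 2.x-style sketch of the kind of message-driven bean described above; the bean name and message handling are assumptions, and the JMS destination would be bound in the deployment descriptor:

    import javax.ejb.EJBException;
    import javax.ejb.MessageDrivenBean;
    import javax.ejb.MessageDrivenContext;
    import javax.jms.JMSException;
    import javax.jms.Message;
    import javax.jms.MessageListener;
    import javax.jms.TextMessage;

    // Message-driven bean that listens to a JMS destination configured in ejb-jar.xml.
    public class OrderEventMDB implements MessageDrivenBean, MessageListener {

        private MessageDrivenContext context;

        public void setMessageDrivenContext(MessageDrivenContext ctx) throws EJBException {
            this.context = ctx;
        }

        public void ejbCreate() {
            // No resources to initialize for this sketch.
        }

        public void ejbRemove() throws EJBException {
            // No resources to release.
        }

        public void onMessage(Message message) {
            try {
                if (message instanceof TextMessage) {
                    String payload = ((TextMessage) message).getText();
                    // Real processing (e.g., delegating to a session bean) is omitted here.
                    System.out.println("Received JMS message: " + payload);
                }
            } catch (JMSException e) {
                throw new EJBException(e);
            }
        }
    }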

Environment: Java SDK 1.4, Entity Bean, Session Bean, JSP, Servlet, JSTL 1.0, CVS, JavaScript, Oracle 9i, SQL, PL/SQL, Triggers, Stored Procedures, JBoss 3.0, Eclipse 2.1, Solaris UNIX.

Confidential

SQL Developer

Responsibilities:

  • Performance tuning of SQL and PL/SQL scripts.
  • Writing PL/SQL scripts, procedures, and packages.
  • Performed data loads using SQL*Loader.
  • Tuned SQL and PL/SQL code using Oracle Enterprise Manager Tools and Explain plan.
  • Supported Oracle developers.
  • Performed database tuning, created database reorganization procedures, scripted database alerts, and monitored scripts.

Environment: Oracle, SQL, PL/SQL Scripts.
