
Hadoop Developer Resume


Richardson, TX

SUMMARY

  • 8 years of IT experience in software development, including 4+ years of experience in Big Data Hadoop and NoSQL technologies across domains such as Automobile, Finance, Insurance, Healthcare and Telecom.
  • Extensive experience in analyzing data using the Hadoop ecosystem, including Hive, Pig, Sqoop, Flume, HBase, HBase-Hive integration, Avro, Oozie, Solr and ZooKeeper.
  • Experience in developing custom MapReduce programs in Java using Apache Hadoop for analyzing Big Data as per the requirement.
  • Extensive knowledge of Hadoop architecture and hands-on experience with major components in the Hadoop ecosystem such as HDFS, JobTracker, TaskTracker, NameNode, DataNode and MapReduce programming.
  • Experience in extending Hive and Pig core functionality by using custom UDFs (a sample UDF sketch follows this summary).
  • Knowledge of NoSQL databases such as HBase, Cassandra and MongoDB.
  • Experience in importing and exporting the data using Sqoop from HDFS to Relational Database systems/mainframe and vice-versa.
  • Experience with Oozie Workflow Engine in running workflow jobs with actions that run Hadoop MapReduce and Pig jobs.
  • Work experience with cloud infrastructure like Amazon Web Services (AWS).
  • Involved in business requirements gathering for successful implementation and POC (proof-of-concept) of Hadoop and its ecosystem
  • Good understanding of Data Mining and Machine Learning techniques.
  • Experience in managing and reviewing Hadoop Log files.
  • Experience in setting up automated monitoring and escalation infrastructure for Hadoop Cluster using Ganglia and Nagios.
  • Experience in installation, configuration, management and deployment of Big Data solutions and the underlying infrastructure of Hadoop Cluster.
  • Experience in the integration of various data sources like Java, RDBMS, Shell Scripting, Spreadsheets and Text files.
  • Solid background in object-oriented analysis and design (OOAD), with good knowledge of various design patterns, UML and Enterprise Application Integration (EAI).
  • Experience in Web Services using XML, HTML and SOAP.
  • Extensive experience with SQL, PL/SQL and database concepts
  • Diverse experience in utilizing Java tools in business, web, and client-server environments, including Java Platform, J2EE, EJB, JSP, Java Servlets, JUnit and Java Database Connectivity (JDBC) technologies, and application servers such as WebSphere and WebLogic.
  • Familiarity in working with popular frameworks like Struts, Hibernate, Spring MVC and AJAX.
  • Involved in exploring, mining and visualization of Big Data utilizing BI tools to provide various interesting insights and possibilities.
  • Ability to blend technical expertise with strong conceptual, business and analytical skills to provide quality solutions, result-oriented problem-solving techniques and leadership skills.
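
A sample of the kind of custom Hive UDF referenced above, as a minimal hedged sketch; the class name, masking rule and registration statement are illustrative assumptions rather than code from any specific engagement.

```java
import org.apache.hadoop.hive.ql.exec.UDF;
import org.apache.hadoop.io.Text;

// Minimal Hive UDF sketch: masks all but the last four characters of a string column.
// Registered in Hive with, e.g.: ADD JAR mask-udf.jar; CREATE TEMPORARY FUNCTION mask_id AS 'MaskId';
public class MaskId extends UDF {
    public Text evaluate(Text input) {
        if (input == null) {
            return null;                       // preserve NULLs instead of failing the query
        }
        String value = input.toString();
        int visible = Math.min(4, value.length());
        StringBuilder masked = new StringBuilder();
        for (int i = 0; i < value.length() - visible; i++) {
            masked.append('*');
        }
        masked.append(value.substring(value.length() - visible));
        return new Text(masked.toString());
    }
}
```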

TECHNICAL SKILLS

Big data/Hadoop Ecosystem: HDFS, Map Reduce, HIVE, PIG, HBase, Sqoop, Flume, Oozie, Storm and Avro

Java / J2EE Technologies: Core Java, Servlets, JSP, JDBC, XML, REST, SOAP, WSDL

Programming Languages: C, C++, Java, Scala, SQL, PL/SQL, Linux shell scripts.

NoSQL Databases: MongoDB, Cassandra, HBase

Database: Oracle 11g/10g, DB2, MS-SQL Server, MySQL, Teradata.

Web Technologies: HTML, XML, JDBC, JSP, JavaScript, AJAX, SOAP

Frameworks: MVC, Struts 2/1, Hibernate 3, Spring 3/2.5/2.

Tools Used: Eclipse, IntelliJ, Git, PuTTY, WinSCP

Operating System: Ubuntu (Linux), Win 95/98/2000/XP, Mac OS, RedHat

ETL Tools: Informatica, Pentaho.

Testing: Hadoop Testing, Hive Testing, Quality Center (QC)

Monitoring and Reporting tools: Ganglia, Nagios, Custom Shell scripts.

PROFESSIONAL EXPERIENCE

Confidential, Richardson, TX

Hadoop Developer

Responsibilities:

  • Involved in installing, configuring, supporting and managing Hadoop clusters, including cluster administration tasks such as commissioning and decommissioning of DataNodes, capacity planning, slot configuration, performance tuning, cluster monitoring and troubleshooting.
  • Worked on Hadoop cluster scaling from 6 nodes in the development environment to 10 nodes in the pre-production stage and up to 32 nodes in production.
  • Involved in complete Implementation lifecycle, specialized in writing custom MapReduce, Pig and Hive programs.
  • Extensively used Sqoop to get data from RDBMS sources like Teradata and Netezza.
  • Participated in development and execution of system and disaster recovery processes.
  • Automated processes for troubleshooting, resolution and tuning of Hadoop clusters.
  • Extensively used Hive/HiveQL queries to search for particular strings in Hive tables stored in HDFS.
  • Analyzed the Cassandra database and compared it with other open-source NoSQL databases to determine which one best suited the current requirements.
  • Supported tuple processing and writing data with Storm by providing Storm-Kafka connectors.
  • Experience in developing custom UDFs in Java to extend Hive and Pig Latin functionality.
  • Created HBase tables to store various data formats of data coming from different sources.
  • Administered, installed, upgraded and managed Hadoop distributions (CDH3, CDH4, Cloudera Manager), Hive and HBase.
  • Streamed data in real time using Spark with Kafka and stored the streamed data in HDFS using Scala.
  • Responsible for building scalable distributed data solutions using the Cloudera Hadoop distribution.
  • Optimized Map/Reduce Jobs to use HDFS efficiently by using various compression mechanisms.
  • Supported Hadoop developers and assisted in optimization of MapReduce jobs, Pig Latin scripts, Hive scripts, and HBase ingestion as required.
  • Prepared the Oozie workflow engine to run multiple Hive and Pig jobs that run independently based on time and data availability.
  • Enabled Kerberos for authorization and authentication.
  • Tested Hadoop components on sample datasets in local pseudo-distributed mode.
  • Implemented unit testing of MapReduce code using the MRUnit framework (see the sketch after this list).
  • Experience in configuring Java components using Spark.
  • Used File System check (FSCK) to check the health of files in HDFS.
  • Developed the UNIX shell scripts for creating the reports from Hive data.
  • Used Flume extensively in gathering and moving log data files from application servers to a central location in the Hadoop Distributed File System (HDFS).
  • Enabled HA for the NameNode, Resource Manager and Hive Metastore.
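
A minimal sketch of the MRUnit-style unit test mentioned above; the mapper, record layout and expected output are illustrative assumptions.

```java
import java.io.IOException;

import org.apache.hadoop.io.IntWritable;
import org.apache.hadoop.io.LongWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Mapper;
import org.apache.hadoop.mrunit.mapreduce.MapDriver;
import org.junit.Test;

public class EventCountMapperTest {

    // Illustrative mapper: emits (event_type, 1) for each tab-delimited log line.
    public static class EventCountMapper extends Mapper<LongWritable, Text, Text, IntWritable> {
        private static final IntWritable ONE = new IntWritable(1);

        @Override
        protected void map(LongWritable key, Text value, Context context)
                throws IOException, InterruptedException {
            String[] fields = value.toString().split("\t");
            context.write(new Text(fields[1]), ONE);
        }
    }

    @Test
    public void emitsEventTypeWithCountOfOne() throws IOException {
        MapDriver<LongWritable, Text, Text, IntWritable> driver =
                MapDriver.newMapDriver(new EventCountMapper());
        driver.withInput(new LongWritable(0), new Text("2014-01-01\tLOGIN\tuser42"))
              .withOutput(new Text("LOGIN"), new IntWritable(1))
              .runTest();
    }
}
```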

Environment: Hadoop 1.x, HDFS, MapReduce, Hive 0.10, Pig 0.11, Sqoop, HBase, Shell Scripting, Apache Solr, Java.

Confidential, San Francisco,CA

Hadoop Developer

Responsibilities:

  • Developed solutions to process data into HDFS (Hadoop Distributed File System), process within Hadoop and emit the summary results from Hadoop to downstream systems.
  • Developed a wrapper script around the Teradata Connector for Hadoop (TDCH) to support optional parameters.
  • Used Sqoop extensively to ingest data from various source systems into HDFS.
  • Hive was used to produce results quickly based on the report that was requested.
  • Integrated HiveServer2 with Tableau using the Hortonworks Hive ODBC driver for auto-generation of Hive queries for non-technical business users.
  • Integrated data from multiple sources (SQL Server, DB2, Teradata) into the Hadoop cluster and analyzed the data using Hive-HBase integration.
  • Developed Pig UDFs for needed functionality, such as a custom Pig loader known as the timestamp loader (see the sketch after this list).
  • Oozie and Zookeeper were used to automate the flow of jobs and coordination in the cluster respectively.
  • Worked on different file formats like Text files, Sequence Files, Avro, Record columnar files (RC).
  • Developed several shell scripts that act as wrappers to start these Hadoop jobs and set the configuration parameters.
  • Kerberos security was implemented to safeguard the cluster.
  • Worked on a stand-alone as well as a distributed Hadoop application.
  • Tested the performance of the data sets on various NoSQL databases.
  • Understood complex data structures of different types (structured, semi-structured) and de-normalized them for storage in Hadoop.
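
A minimal sketch in the spirit of the custom Pig UDFs noted above; the timestamp format, field position and class name are illustrative assumptions.

```java
import java.io.IOException;
import java.text.ParseException;
import java.text.SimpleDateFormat;
import java.util.Date;

import org.apache.pig.EvalFunc;
import org.apache.pig.PigWarning;
import org.apache.pig.data.Tuple;

// Pig EvalFunc sketch: normalizes a source timestamp string to epoch milliseconds.
// Usage in Pig Latin: REGISTER ts-udf.jar; DEFINE TO_EPOCH NormalizeTimestamp();
public class NormalizeTimestamp extends EvalFunc<Long> {

    private static final String SOURCE_FORMAT = "yyyy-MM-dd HH:mm:ss"; // assumed input format

    @Override
    public Long exec(Tuple input) throws IOException {
        if (input == null || input.size() == 0 || input.get(0) == null) {
            return null; // propagate nulls instead of failing the task
        }
        try {
            Date parsed = new SimpleDateFormat(SOURCE_FORMAT).parse(input.get(0).toString());
            return parsed.getTime();
        } catch (ParseException e) {
            warn("Unparseable timestamp: " + input.get(0), PigWarning.UDF_WARNING_1);
            return null;
        }
    }
}
```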

Environment: Hadoop, HDFS, Pig 0.10, Hive, MapReduce, Sqoop, Java Eclipse, SQL Server, Shell Scripting.

Confidential, Penfield, NY

Hadoop Developer

Responsibilities:

  • Worked on Hadoop cluster (CDH 5) with 30 nodes.
  • Worked with approximately 90 TB of structured and semi-structured data with a replication factor of 3.
  • Extracted the data from Oracle, MySQL, and SQL server databases into HDFS using Sqoop.
  • Extracted data from weblogs and social media using Flume and loaded it into HDFS.
  • Created jobs in Sqoop with incremental load and populated Hive tables.
  • Developed software to process, cleanse, and report on vehicle data utilizing analytics and REST APIs built with Java, Scala and Akka (an asynchronous programming framework).
  • Involved in developing the Asset Tracking project, which collected real-time vehicle location data from a JMS queue using IBM Streams and processed that data for vehicle tracking using ESRI GIS mapping software, Scala and the Akka actor model.
  • Involved in developing web services using REST, the HBase Native API and the BigSQL client to query data from HBase (see the HBase sketch after this list).
  • Developed Hive queries in the BigSQL client for various use cases.
  • Developed a few shell scripts and automated them using the cron job scheduler.
  • Implemented test scripts to support test driven development and continuous integration.
  • Responsible to manage data coming from different sources.
  • Experienced in loading and transforming large sets of structured, semi-structured and unstructured data.
  • Analyzed large amounts of data sets to determine optimal way to aggregate and report on it.
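
A minimal sketch of the kind of HBase Native API read used behind the web services above; the table, column family and qualifier names are illustrative assumptions, and the pre-1.0 client API is assumed.

```java
import java.io.IOException;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.hbase.HBaseConfiguration;
import org.apache.hadoop.hbase.client.Get;
import org.apache.hadoop.hbase.client.HTable;
import org.apache.hadoop.hbase.client.Result;
import org.apache.hadoop.hbase.util.Bytes;

public class VehicleLocationDao {

    // Fetches the last reported latitude/longitude for a vehicle id (row key).
    public String lookupLocation(String vehicleId) throws IOException {
        Configuration conf = HBaseConfiguration.create();
        HTable table = new HTable(conf, "vehicle_location"); // illustrative table name
        try {
            Get get = new Get(Bytes.toBytes(vehicleId));
            get.addColumn(Bytes.toBytes("loc"), Bytes.toBytes("lat"));
            get.addColumn(Bytes.toBytes("loc"), Bytes.toBytes("lon"));
            Result result = table.get(get);
            if (result.isEmpty()) {
                return null;
            }
            String lat = Bytes.toString(result.getValue(Bytes.toBytes("loc"), Bytes.toBytes("lat")));
            String lon = Bytes.toString(result.getValue(Bytes.toBytes("loc"), Bytes.toBytes("lon")));
            return lat + "," + lon;
        } finally {
            table.close();
        }
    }
}
```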

Environment: Hadoop 1.x, Hive 0.10, Pig 0.11, Sqoop, HBase, UNIX Shell Scripting, Scala, Akka, IBM InfoSphere BigInsights, IBM InfoSphere Streams, IBM BigSQL, Java

Confidential, Memphis,TN

Hadoop Developer

Responsibilities:

  • Worked with approximately 100 TB of structured and semi-structured data with a replication factor of 3.
  • Involved in complete Implementation lifecycle, specialized in writing custom MapReduce, Pig and Hive programs.
  • Exported the analyzed data to the relational databases using Sqoop for visualization and to generate reports for the BI team.
  • Extensively used Hive/HiveQL queries to search for particular strings in Hive tables stored in HDFS.
  • Performed various performance optimizations such as using the distributed cache for small datasets, partitioning and bucketing in Hive, and map-side joins (see the sketch after this list).
  • Experience in developing custom UDFs in Java to extend Hive and Pig Latin functionality.
  • Created HBase tables to store various data formats of data coming from different portfolios.
  • Managing and scheduling Jobs to remove the duplicate log data files in HDFS using Oozie.
  • Used Flume extensively in gathering and moving log data files from Application Servers to a central location in Hadoop Distributed File System (HDFS).
  • Experienced with Solr for indexing and search.
  • Analyzed the Cassandra database and compared it with other open-source NoSQL databases to determine which one best suited the current requirements.
  • Used File System check (FSCK) to check the health of files in HDFS.
  • Developed the UNIX shell scripts for creating the reports from Hive data.
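
A minimal sketch of the map-side join via the distributed cache mentioned above; the file name, field layout and use of the Hadoop 2 Job API are assumptions.

```java
import java.io.BufferedReader;
import java.io.FileReader;
import java.io.IOException;
import java.util.HashMap;
import java.util.Map;

import org.apache.hadoop.io.LongWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Mapper;

// Map-side join: a small lookup file is shipped to every task via the distributed cache
// (driver side: job.addCacheFile(new URI("/lookup/portfolio_codes.txt#portfolio_codes.txt")))
// and loaded into memory in setup(), so the join needs no reduce phase.
public class PortfolioJoinMapper extends Mapper<LongWritable, Text, Text, Text> {

    private final Map<String, String> portfolioNames = new HashMap<String, String>();

    @Override
    protected void setup(Context context) throws IOException {
        // Read the cached copy through the symlink created in the task working directory.
        BufferedReader reader = new BufferedReader(new FileReader("portfolio_codes.txt"));
        try {
            String line;
            while ((line = reader.readLine()) != null) {
                String[] parts = line.split("\t");
                portfolioNames.put(parts[0], parts[1]); // code -> portfolio name
            }
        } finally {
            reader.close();
        }
    }

    @Override
    protected void map(LongWritable key, Text value, Context context)
            throws IOException, InterruptedException {
        String[] fields = value.toString().split("\t");
        String name = portfolioNames.get(fields[0]);
        context.write(new Text(fields[0]),
                new Text((name != null ? name : "UNKNOWN") + "\t" + fields[1]));
    }
}
```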

Environment: Java, UNIX, HDFS, Pig, Hive, Spark, Scala, MapReduce, Flume, Sqoop, Kafka, HBase, Cassandra, Cloudera Distribution, Oozie, Ambari, Ganglia, Yarn, Shell scripting

Confidential -Plano, TX

Java/J2EE/Hadoop Developer

Responsibilities:

  • Participated in requirement gathering and converting the requirements into technical specifications.
  • Created UML diagrams like use cases, class diagrams, interaction diagrams, and activity diagrams.
  • Developed the application using Struts Framework that leverages classical Model View Controller (MVC) architecture.
  • Extensively worked on User Interface for few modules using JSPs, JavaScript and Ajax.
  • Created business logic using Servlets and POJOs and deployed them on the WebLogic server.
  • Wrote complex SQL queries and stored procedures.
  • Developed the XML Schema and Web services for the data maintenance and structures.
  • Implemented the Web Service client for the login authentication, credit reports and applicant information using Apache Axis 2 Web Service.
  • Responsible to manage data coming from different sources.
  • Developed MapReduce algorithms.
  • Gained good experience with NoSQL databases.
  • Involved in loading data from the UNIX file system to HDFS (see the sketch after this list).
  • Installed and configured Hive and wrote Hive UDFs.
  • Integrated Hadoop with Solr and implemented search algorithms.
  • Worked with cloud services like Amazon Web Services (AWS).
  • Designed the logical and physical data model, generated DDL scripts, and wrote DML scripts for Oracle 10g database.
  • Used the Hibernate ORM framework with the Spring framework for data persistence and transaction management.
  • Used the Struts validation framework for form-level validation.
  • Wrote test cases in JUnit for unit testing of classes.
  • Involved in creating templates and screens in HTML and JavaScript.
  • Involved in integrating Web Services using SOAP.
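
A minimal sketch of loading files from the UNIX file system into HDFS as mentioned above, using the Hadoop FileSystem API; the local and HDFS paths are illustrative assumptions.

```java
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;

public class HdfsLoader {

    // Copies a local log file into an HDFS landing directory using the FileSystem API.
    public static void main(String[] args) throws Exception {
        Configuration conf = new Configuration(); // picks up core-site.xml/hdfs-site.xml on the classpath
        FileSystem fs = FileSystem.get(conf);
        fs.copyFromLocalFile(new Path("/var/log/app/applicants.log"),
                             new Path("/data/raw/applicants/"));
        fs.close();
    }
}
```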

Environment: Hive 0.7.1, Apache Solr 3.x, HBase 0.90.x/0.20.x, JDK 1.5, Struts 1.3, WebSphere 6.1, HTML, XML, JavaScript, JUnit 3.8, Oracle 10g, Amazon Web Services.

Confidential - McLean, VA

Java/J2EE Developer

Responsibilities:

  • Responsible for gathering business and functional requirements for the development and support of in-house and vendor developed applications
  • Gathered and analyzed information for developing, supporting, and modifying existing web applications based on prioritized business needs
  • Played key role in design and development of new application using J2EE, Servlets, and Spring technologies/frameworks using Service Oriented Architecture (SOA)
  • Wrote Action classes, Request Processor, Business Delegate, Business Objects, Service classes and JSP pages
  • Played a key role in designing the presentation tier components by customizing the Spring framework components, which includes configuring web modules, request processors, error handling components, etc.
  • Implemented the Web Services functionality in the application to allow external applications to access data
  • Used Apache Axis as the Web Service framework for creating and deploying Web Service Clients using SOAP and WSDL
  • Worked on Spring to develop different modules to assist the product in handling different requirements
  • Developed validation using Spring's Validator interface (see the sketch after this list) and used Spring Core and Spring MVC to develop the applications and access data
  • Implemented Spring Beans using IOC and Transaction management features to handle the transactions and business logic
  • Designed and developed various PL/SQL blocks and stored procedures in the DB2 database
  • Involved in writing DAO layer using Hibernate to access the database
  • Involved in deploying and testing the application using WebSphere Application Server
  • Developed and implemented several test cases using the JUnit framework
  • Involved in troubleshooting technical issues, conducting code reviews, and enforcing best practices
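
A minimal sketch of validation via Spring's Validator interface as mentioned above; the AccountRequest form-backing object and error codes are hypothetical, added only to make the sketch self-contained.

```java
import java.math.BigDecimal;

import org.springframework.validation.Errors;
import org.springframework.validation.ValidationUtils;
import org.springframework.validation.Validator;

// Spring Validator sketch for a hypothetical AccountRequest form-backing object.
public class AccountRequestValidator implements Validator {

    // Hypothetical form-backing object, defined here only so the sketch compiles.
    public static class AccountRequest {
        private String accountNumber;
        private BigDecimal amount;
        public String getAccountNumber() { return accountNumber; }
        public BigDecimal getAmount() { return amount; }
    }

    public boolean supports(Class<?> clazz) {
        return AccountRequest.class.isAssignableFrom(clazz);
    }

    public void validate(Object target, Errors errors) {
        // Reject blank mandatory fields with a message-resource error code.
        ValidationUtils.rejectIfEmptyOrWhitespace(errors, "accountNumber", "accountNumber.required");
        AccountRequest request = (AccountRequest) target;
        if (request.getAmount() != null && request.getAmount().signum() < 0) {
            errors.rejectValue("amount", "amount.negative");
        }
    }
}
```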

Environment: Java SE 6, J2EE 6, JSP 2.1, Servlets 2.5, JavaScript, IBM WebSphere 7, DB2, HTML, XML, Spring 3, Hibernate 3, JUnit, Windows 7, Eclipse 3.5

Confidential - Seattle, WA

Java/J2EE Developer

Responsibilities:

  • Involved in various phases of the Software Development Life Cycle (SDLC) such as design, development and unit testing.
  • Developed and deployed UI layer logic of sites using JSP, XML, JavaScript, HTML/DHTML, and Ajax.
  • CSS and JavaScript were used to build rich internet pages.
  • The Agile Scrum methodology was followed for the development process.
  • Designed different design specifications for application development that includes front-end, back-end using design patterns.
  • Developed proto-type test screens in HTML and JavaScript.
  • Involved in developing JSPs for client data presentation and client-side data validation within the forms.
  • Developed the application by using the Spring MVC framework.
  • Used the Collections framework to transfer objects between the different layers of the application.
  • Developed data mapping to create a communication bridge between various application interfaces using XML, and XSL.
  • Used Spring IoC to inject the values for dynamic parameters.
  • Developed JUnit tests for unit-level testing.
  • Actively involved in code review and bug fixing for improving the performance.
  • Documented application for its functionality and its enhanced features.
  • Created connections through JDBC and used JDBC statements to call stored procedures (see the sketch after this list).
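
A minimal sketch of calling a stored procedure through a JDBC CallableStatement as mentioned above; the connection URL, credentials and procedure name are illustrative assumptions.

```java
import java.sql.CallableStatement;
import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.SQLException;

public class OrderStatusDao {

    // Calls a hypothetical stored procedure using the standard JDBC escape syntax.
    public void markShipped(long orderId) throws SQLException {
        Connection conn = DriverManager.getConnection(
                "jdbc:oracle:thin:@dbhost:1521:orcl", "app_user", "secret"); // illustrative URL/credentials
        try {
            CallableStatement stmt = conn.prepareCall("{call update_order_status(?, ?)}");
            try {
                stmt.setLong(1, orderId);
                stmt.setString(2, "SHIPPED");
                stmt.execute();
            } finally {
                stmt.close();
            }
        } finally {
            conn.close();
        }
    }
}
```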

Environment: Spring MVC, Oracle 11g, J2EE, Java, JDBC, Servlets, JSP, XML, Design Patterns, CSS, HTML, JavaScript 1.2, JUnit, Apache Tomcat, MS SQL Server 2008.

Confidential

Application Developer

Responsibilities:

  • Developed the application under the JEE architecture; designed and developed dynamic, browser-compatible user interfaces using JSP, custom tags, HTML, CSS, and JavaScript.
  • Deployed and maintained the JSP and Servlet components on WebLogic 8.0.
  • Developed the application server persistence layer using JDBC and SQL.
  • Used JDBC to connect the web applications to Databases.
  • Implemented test-first unit testing using JUnit.
  • Developed and utilized J2EE services and JMS components for messaging communication in WebLogic (see the sketch after this list).
  • Configured the development environment using the WebLogic application server for developers' integration testing.
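
A minimal sketch of the JMS messaging mentioned above; the JNDI names and message payload are illustrative assumptions.

```java
import javax.jms.Connection;
import javax.jms.ConnectionFactory;
import javax.jms.MessageProducer;
import javax.jms.Queue;
import javax.jms.Session;
import javax.jms.TextMessage;
import javax.naming.InitialContext;

public class OrderMessageSender {

    // Looks up server-hosted JMS resources from JNDI and sends a text message to a queue.
    public void send(String payload) throws Exception {
        InitialContext ctx = new InitialContext(); // JNDI environment assumed to be configured externally
        ConnectionFactory factory = (ConnectionFactory) ctx.lookup("jms/AppConnectionFactory");
        Queue queue = (Queue) ctx.lookup("jms/OrderQueue");

        Connection connection = factory.createConnection();
        try {
            Session session = connection.createSession(false, Session.AUTO_ACKNOWLEDGE);
            MessageProducer producer = session.createProducer(queue);
            TextMessage message = session.createTextMessage(payload);
            producer.send(message);
        } finally {
            connection.close(); // closing the connection also closes its sessions and producers
        }
    }
}
```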

Environment: Java/J2EE, SQL, Oracle 10g, JSP 2.0, EJB, AJAX, JavaScript, WebLogic 8.0, HTML, JDBC 3.0, XML, JMS, log4j, JUnit, Servlets, MVC
