Hadoop Developer Resume

TX

SUMMARY:

  • 9+ years of professional IT experience, including 4 years with Big Data technologies and Hadoop ecosystem components such as Spark, MapReduce, Hive, Pig, YARN, HDFS, Oozie, Sqoop, Flume and Kafka, and NoSQL systems such as HBase and Cassandra.
  • Experience with different Hadoop distributions, including Cloudera (CDH3, 4 and 5) and Hortonworks (HDP).
  • Strong knowledge of the architecture of distributed systems and parallel processing; in-depth understanding of the MapReduce framework and the Spark execution framework.
  • Experience using Apache Flume and Kafka to collect, aggregate and move large volumes of data from sources such as web servers and telnet sources.
  • Good knowledge of and experience with the real-time streaming technologies Storm and Kafka.
  • Used the Java HBase API to ingest processed data into HBase tables.
  • Strong experience working in UNIX/Linux environments and writing shell scripts.
  • Hands-on experience with NoSQL databases such as HBase and MongoDB.
  • Extensive experience writing Pig scripts to transform raw data from several data sources into baseline data.
  • Extensive experience importing and exporting data between RDBMS and the Hadoop ecosystem using Apache Sqoop.
  • Used Tableau as a reporting and data visualization tool.
  • Expertise in back-end/server-side Java technologies such as web services and Java Database Connectivity (JDBC).
  • Experience using Kafka to handle thousands of megabytes of reads and writes per second on streaming data.
  • Application development experience in Java (client/server), JSP, Servlet programming, Enterprise Java Beans, Struts, JSF, JDBC, Spring, Spring Integration and Hibernate.
  • Implemented a proof of concept (POC) in Spark using Java and Scala.
  • Good experience optimizing MapReduce algorithms using combiners and custom partitioners.
  • Experience using version control tools such as GitHub and SVN.
  • Good knowledge of Oracle 8i, 9i and 10g databases and strong skills in writing SQL queries.
  • Performed performance tuning and productivity improvement activities.
  • Proactive in time management and problem solving; self-motivated with good analytical skills.
  • Experience building and deploying web applications on multiple application servers and middleware platforms, including Apache Tomcat and JBoss.
  • Experience writing test cases in a Java environment using JUnit.
  • Demonstrated technical expertise, organization and client service skills in various projects undertaken.
  • Detailed understanding of Software Development Life Cycle (SDLC) and sound knowledge of project implementation methodologies including Waterfall and Agile.

TECHNICAL SKILLS:

Big Data Ecosystem: Hadoop, MapReduce, YARN, HDFS, HBase, Zookeeper, Hive, Hue, Pig, Sqoop, Cassandra, Spark, Oozie, Storm, Flume, Cloudera Manager, Amazon AWS, Hortonworks clusters, CASK-Hydrator.

Languages: C, Java, PL/SQL, Pig Latin, HiveQL, Scala, SQL

Java/J2EE Web Technologies: J2EE, Servlets, JSP, JSTL, HTML, XHTML, CSS, XML, AJAX.

Frameworks: Struts, ORM (Hibernate), JDBC

Web Services: SOAP, RESTful

Web Servers: WebLogic, WebSphere, Apache Tomcat.

Scripting Languages: Shell Scripting, JavaScript.

Database: Oracle 9i/10g, Microsoft SQL Server, MySQL, RDBMS, MongoDB, HBase

IDE & Build Tools: Eclipse, NetBeans, ANT and Maven.

Version Control System: CVS, SVN, GitHub.

PROFESSIONAL EXPERIENCE:

Confidential, TX

Hadoop Developer

Responsibilities:

  • Migrated raw data from external relational databases into HDFS using Sqoop bulk load with CASK-Hydrator.
  • Implemented pipelines using CASK-Hydrator.
  • Responsible for building scalable, distributed, end-to-end data processing pipelines using Hadoop ecosystem tools.
  • Monitored and managed the Hadoop cluster using Apache Ambari.
  • Worked on loading log data directly into HDFS using Flume.
  • Wrote various Linux scripts to stream data from multiple data sources, such as Oracle and Teradata, onto the data lake.
  • Developed Pig Latin scripts for processing semi-structured data.
  • Involved in creating Hive tables, loading them with data and writing Hive queries.
  • Loaded and transformed large sets of structured, semi-structured and unstructured data, including Avro, sequence and XML files.
  • Analyzed web log data using HiveQL to extract the number of unique visitors per day, page views, visit duration and the most purchased product on the website.
  • Developed an end-to-end data pipeline using FTP Adaptor, Spark, Hive and Impala.
  • Developed product profiles using Pig and commodity UDFs.
  • Used the Spark Cassandra Connector to load data to and from Cassandra.
  • Implemented a prototype for real-time data streaming using Spark Streaming with Kafka.
  • Extensive experience working with structured data using HiveQL, including join operations, writing custom UDFs and optimizing Hive queries.
  • Tuned the performance of Hive queries.
  • Developed Oozie workflows to run multiple Hive and Pig jobs, which run independently based on time and data availability.
  • Exported the analyzed data to relational databases using Sqoop for visualization and report generation.
  • Experienced in managing and reviewing Hadoop log files.

Environment: HDFS, Hadoop 2.x, Pig, Hive, Sqoop, Flume, MapReduce, Scala, Oozie, Oracle 11g, Hortonworks cluster

Confidential, NYC, NY

Hadoop Developer

Responsibilities:

  • Developed Big Data Solutions that enabled the business and technology teams to make data-driven decisions on the best ways to acquire customers and provide them business solutions.
  • Automated delta feeds from mainframes using Sqoop, and from FTP servers to Hive.
  • Implemented a Flume agent that reads logs from a port and stores them in the data lake.
  • Streamed data in real time using Storm and Kafka for faster processing.
  • Inserted the data into the data lake after processing in Storm.
  • Implemented Hive tables on top of the inserted data.
  • Optimized the Hive tables using partitioning and bucketing to improve the performance of HiveQL queries.
  • Scheduled meetings with different IT groups, coordinated deployment activities, planned and prepared project cutover activities, and prepared and ran the deployment plan.
  • Analyzed the data by running Hive queries (HiveQL) and Pig scripts (Pig Latin) to study customer behavior.
  • Used Hive to analyze the partitioned and bucketed data and compute various metrics for reporting.
  • Created components such as Hive UDFs to supply functionality missing from Hive for analytics.
  • Implemented Hive tables and HQL queries for the reports. Wrote and used complex data types in Hive.
  • Implemented the Fair Scheduler on the JobTracker to allocate resources fairly to small jobs.
  • Implemented automatic failover using ZooKeeper and the ZooKeeper Failover Controller.
  • Worked with system architects and DBAs to ensure decisions met long-term enterprise growth needs and reusability factors.
  • Installed and configured a multi-node cluster in the cloud on Amazon Web Services (AWS) EC2.
  • Worked with BI tools like Tableau for report creation and further analysis from the front end.

Environment: Hadoop 2.x, Hive, HQL, HDFS, MapReduce, Sqoop, Flume, Oozie, Python, Java, Maven, Eclipse, PuTTY, Cloudera Manager 4 and CDH 4, Zookeeper, Tableau

Confidential, Summit, NJ

Hadoop Developer

Responsibilities:

  • Developed MapReduce code for parsing bills and loading the data into HDFS.
  • Implemented Sqoop bulk load to import production data.
  • Implemented aggregations and join queries using Hive.
  • Wrote custom UDFs in Hive.
  • Participated in design reviews and daily project Scrum meetings.
  • Worked closely with the business analysts to convert business requirements into technical requirements, and prepared low- and high-level documentation.
  • Involved in installing and configuring Hive and Sqoop on the Hadoop cluster.
  • Managed and audited the support and maintenance process by tracking activities using Request Items, Incidents and Change Requests.
  • Tracked tasks and projects between onsite and offshore teams to keep them on time and within scope.
  • Supported data analysts in running MapReduce programs.
  • Gained very good business knowledge of fraud suspect identification, the appeals process, etc.
  • Created and managed source-to-target mapping documents for required jobs.
  • Ran Hadoop ecosystem tools and applications through Apache Hue.
  • Optimized Hadoop MapReduce code and Hive/Pig scripts for better scalability, reliability and performance.
  • Performed feasibility analysis for the deliverables, evaluating the requirements against complexity and timelines.
  • Developed a custom file system plug-in for Hadoop so it can access files on the Data Platform.

Environment: Hadoop, HDFS, MapReduce, Hive, Pig, Sqoop, MySQL, PuTTY

Confidential, Somerset, NJ

Java Developer

Responsibilities:

  • Involved in business requirement gathering and technical specifications.
  • Involved in designing and developing modules on both the client and server side.
  • Developed Java classes and helper classes in the business layer and tested them using JUnit.
  • Designed and implemented a front-end application using JavaScript, CSS, HTML, JSPs and Spring MVC for registering and managing new entries, and used Hibernate to configure and connect to the database.
  • Developed and tested the Efficiency Management module using EJB, Servlets, JSP and core Java components on WebLogic Application Server.
  • Implemented Hibernate in the data access object layer to access and update information in the Oracle database.
  • Performed data manipulation (update, modify) using SQL queries.
  • Developed stored procedures and complex SQL queries.
  • Achieved object-relational mapping by configuring deployment descriptors in Hibernate.
  • Created stored procedures using PL/SQL for data modification (DML insert, update, delete) in Oracle.
  • Worked closely with QA, business and architecture teams to resolve defects quickly and meet deadlines.
  • Conducted Design reviews and Technical reviews with other project stakeholders.

Environment: J2EE, JDBC, Java 1.4, Servlets, JSP, JSF, Struts, Hibernate, Web services, Design Patterns, MVC, HTML, JavaScript 1.2, WebLogic 8.0, XML, JUnit, Oracle 9i/10g, MyEclipse

Confidential

Java Developer

Responsibilities:

  • Closely involved with design, development and implementation of the application.
  • Followed an Agile development strategy for developing the application.
  • Developed UI screens for the application using JSP and CSS.
  • Worked with JavaScript events and functions for client-side validation.
  • Implemented the database connectivity to Oracle using JDBC.
  • Designed and developed SQL queries, views and procedures.
  • Designed, developed and deployed necessary stored procedures and functions using SQL and PL/SQL.
  • Supported the deployed applications through debugging, fixing, and maintenance releases.
  • Prepared and executed unit test cases.
  • Worked with minimal supervision in a fast-paced, dynamic environment.
  • Wrote test plans and test cases for the developed screens.
  • Executed test cases and fixed bugs through unit testing.

Environment: MyEclipse, Oracle, JSP, RDBMS, SQL, PL/SQL, JavaScript, jQuery, JDBC, Servlet, JSF, Struts, Hibernate, Web services, Design Patterns