Hadoop Developer/Spark Resume

NJ

SUMMARY

  • 8+ years of IT experience, including 4 years in Java and mainframe technology and around 4 years as a Hadoop Developer.
  • Hands-on experience with Hadoop ecosystem components such as HDFS, MapReduce, Pig, Hive, HBase, Oozie, ZooKeeper, Sqoop, Flume, Impala, Kafka, Storm and YARN.
  • Involved in creating Hive tables, loading them with data and writing Hive queries that invoke and run MapReduce jobs in the backend.
  • Excellent understanding of Hadoop distributed system architecture and design principles.
  • Experience in importing and exporting data with Sqoop between HDFS and relational database systems.
  • Expertise in writing Hadoop jobs for analyzing data using Hive and Pig.
  • Experience in analyzing data using HiveQL, Pig Latin and custom MapReduce programs in Java.
  • Hands-on experience configuring and working with Flume to load data from multiple sources directly into HDFS, and transferring large datasets between Hadoop and RDBMS with Sqoop.
  • Hands-on experience developing applications on HBase, with expertise in SQL and PL/SQL database concepts.
  • Expertise in scheduling and monitoring Hadoop workflows using Oozie and ZooKeeper.
  • Experience in Extraction, Transformation and Loading (ETL) of data from multiple sources such as flat files, XML files and databases; used Informatica for ETL processing based on business requirements.
  • Worked on Amazon Redshift, the AWS data warehouse product.
  • Good knowledge of Spark and Scala, and experience writing MapReduce programs in Java.
  • Good understanding of NoSQL databases like HBase and MongoDB.
  • Used HCatalog to access Hive table metadata from MapReduce and Pig code.
  • Converted Hive/SQL queries into Spark transformations using Spark RDDs (a minimal sketch follows this summary).
  • Experienced with the Spark ecosystem, using Spark SQL and Scala queries on formats such as text and CSV files.
  • Expertise in implementing Spark with Scala and Spark SQL for faster testing and processing of data from different sources.
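
A minimal Java sketch of the Hive-to-Spark conversion noted above. The table name web_logs and column status are illustrative assumptions, not taken from a real project; the Dataset API is shown, and the underlying RDD remains reachable via .javaRDD() when RDD-level transformations are needed.

```java
import org.apache.spark.sql.Dataset;
import org.apache.spark.sql.Row;
import org.apache.spark.sql.SparkSession;

public class HiveToSparkSketch {
    public static void main(String[] args) {
        // Hive support makes existing Hive tables visible to Spark.
        SparkSession spark = SparkSession.builder()
                .appName("HiveToSparkSketch")
                .enableHiveSupport()
                .getOrCreate();

        // Original HiveQL, run as-is through Spark SQL.
        Dataset<Row> viaSql = spark.sql(
                "SELECT status, COUNT(*) AS hits FROM web_logs GROUP BY status");

        // The same query rewritten as Spark transformations.
        Dataset<Row> viaTransformations = spark.table("web_logs")
                .groupBy("status")
                .count();

        viaSql.show();
        viaTransformations.show();
        spark.stop();
    }
}
```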

TECHNICAL SKILLS

Big Data Technologies: Hadoop, MapReduce, HDFS, Hive, Pig, HBase, Sqoop, Flume, ZooKeeper, Oozie, Kafka, YARN, Spark, Scala, MongoDB and Cassandra.

Databases: Oracle, MySQL, Teradata, Microsoft SQL Server, MS Access, DB2 and NoSQL

Programming Languages: C, C++, Java, J2EE, Scala, SQL, PL/SQL, Unix shell scripting and Bash shell scripting.

Frameworks: MVC, Struts, Spring, JUnit and Hibernate

Development Tools: Eclipse, NetBeans, Toad, Maven and Ant

Web Languages: XML, HTML, HTML5, DHTML, DOM, JavaScript, AJAX, jQuery, JSON and CSS

Operating Systems & Others: Linux (CentOS, Ubuntu), Unix, Windows XP, Server 2003, PuTTY, WinSCP, FileZilla, AWS and Microsoft Office Suite

PROFESSIONAL EXPERIENCE

Confidential - NJ

Hadoop Developer/Spark

Responsibilities:

  • Installed and configured Apache Hadoop clusters for application development, along with Hadoop tools such as Hive, Pig, Oozie, ZooKeeper, HBase, Flume and Sqoop.
  • Developed simple to complex MapReduce jobs using Java, Hive and Pig for data cleaning and preprocessing.
  • Used Kafka for log aggregation, gathering physical log files off servers and placing them in a central location such as HDFS for processing (a minimal producer sketch follows this list).
  • Imported and exported data into HDFS and Hive using Sqoop.
  • Involved in collecting, aggregating and moving data from servers to HDFS using Apache Flume.
  • Loaded and transformed large sets of structured, semi-structured and unstructured data.
  • Managed and reviewed Hadoop log files.
  • Used Storm for real-time processing.
  • Developed Pig UDFs to preprocess the data for analysis.
  • Implemented partitioning and bucketing in Hive for better organization of the data.
  • Developed a Scala program for data extraction using Spark Streaming.
  • Implemented Spark using Scala and Spark SQL for faster testing and processing of data.
  • Experienced with batch processing of data sources using Apache Spark and Elasticsearch.
  • Implemented Spark RDD transformations and actions to support business analysis.
  • Used ZooKeeper to manage coordination among the clusters.
  • Designed and developed weekly and monthly reports for the financial departments using Teradata SQL.
  • Used AWS services such as EC2 and S3 for small data sets.
  • Worked with the Spark ecosystem using Spark SQL and Scala queries on formats such as text and CSV files.
  • Installed the Oozie workflow engine to run multiple MapReduce, HiveQL and Pig jobs.
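
A minimal Java sketch of the Kafka log-shipping pattern described above; the broker address, topic name (server-logs) and log path are placeholder assumptions. A downstream consumer (or a tool such as Flume) would land the records in HDFS.

```java
import java.nio.file.Files;
import java.nio.file.Paths;
import java.util.Properties;

import org.apache.kafka.clients.producer.KafkaProducer;
import org.apache.kafka.clients.producer.Producer;
import org.apache.kafka.clients.producer.ProducerRecord;

public class LogShipperSketch {
    public static void main(String[] args) throws Exception {
        Properties props = new Properties();
        props.put("bootstrap.servers", "broker1:9092"); // placeholder broker
        props.put("key.serializer",
                "org.apache.kafka.common.serialization.StringSerializer");
        props.put("value.serializer",
                "org.apache.kafka.common.serialization.StringSerializer");

        // Ship each line of a local log file as one Kafka record.
        try (Producer<String, String> producer = new KafkaProducer<>(props)) {
            Files.lines(Paths.get("/var/log/app/app.log")) // placeholder path
                 .forEach(line -> producer.send(
                         new ProducerRecord<>("server-logs", line)));
        }
    }
}
```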

Environment: Hadoop, MapReduce, HDFS, Hive, Cloudera, Core Java, Scala, SQL, Flume, Spark, Pig, Sqoop, Oozie, Impala, Python, AWS, Ruby, HBase, Kafka, Cassandra, ETL, Informatica, Oracle, Unix.

Confidential - Redmond

Hadoop Developer

Responsibilities:

  • Designed and developed components of big data processing using HDFS, MapReduce, Pig and Hive.
  • Analyzed data using the Hadoop components Hive and Pig.
  • Managed data coming from different sources; involved in HDFS maintenance and in loading structured and unstructured data.
  • Developed Pig scripts for source data validation and transformation.
  • Worked extensively with Sqoop to import and export data between HDFS and relational database systems/mainframes, loading data into HDFS.
  • Loaded data into Spark RDDs and performed in-memory computation to generate the output response (see the sketch after this list).
  • Developed a data pipeline using Kafka and Storm to store data in HDFS.
  • Used Pig as an ETL tool for transformations, joins and some pre-aggregations before storing the data in HDFS.
  • Involved in creating Hive tables, loading them with data and writing Hive queries that run internally as MapReduce jobs.
  • Used Flume, Sqoop, Hadoop, Spark and Oozie for building data pipelines.
  • Scheduled and managed jobs on a Hadoop cluster using Oozie workflows.
  • Developed Oozie workflows to automate loading data into HDFS and preprocessing it with Pig.
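
A minimal Java sketch of the Spark RDD in-memory computation mentioned above, here a simple word count; the HDFS input and output paths are placeholder assumptions.

```java
import java.util.Arrays;

import org.apache.spark.SparkConf;
import org.apache.spark.api.java.JavaPairRDD;
import org.apache.spark.api.java.JavaRDD;
import org.apache.spark.api.java.JavaSparkContext;

import scala.Tuple2;

public class RddComputationSketch {
    public static void main(String[] args) {
        SparkConf conf = new SparkConf().setAppName("RddComputationSketch");
        try (JavaSparkContext sc = new JavaSparkContext(conf)) {
            // Load raw records from HDFS into an RDD.
            JavaRDD<String> lines = sc.textFile("hdfs:///data/input/events");

            // In-memory computation: tokenize, pair and aggregate counts.
            JavaPairRDD<String, Integer> counts = lines
                    .flatMap(line -> Arrays.asList(line.split("\\s+")).iterator())
                    .mapToPair(word -> new Tuple2<>(word, 1))
                    .reduceByKey(Integer::sum);

            // The action triggers execution and writes the output to HDFS.
            counts.saveAsTextFile("hdfs:///data/output/word-counts");
        }
    }
}
```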

Environment: Hadoop, MapReduce, HDFS, Hive, Cloudera, Core Java, Scala, SQL, Flume, Spark, Pig, Sqoop, Oozie, Impala, Ruby, AWS, HBase, Kafka, Cassandra, ETL, Informatica, Oracle, Python, Unix.

Confidential - Los Angeles, CA

Hadoop Developer

Responsibilities:

  • Worked on a Hadoop cluster that ranged from 4-8 nodes during the pre-production stage and was at times extended up to 24 nodes during production.
  • Used Sqoop to import data from RDBMS into the Hadoop Distributed File System (HDFS) and later analyzed the imported data using Hadoop components.
  • Wrote custom MapReduce programs to analyze data and used Pig Latin to clean unwanted data.
  • Developed well-tested, readable, reusable Ruby, JavaScript, HTML and CSS.
  • Developed test frameworks in Selenium for UI regression test automation and, when necessary, unit test automation in Java.
  • Performed performance optimizations such as using the distributed cache for small datasets, partitioning and bucketing in Hive, and map-side joins.
  • Shaped the technology roadmap, streamlining it by 40% by delivering the technical architecture for migration to a cloud-based AWS solution.
  • Created Pig and Hive UDFs in Java to analyze the data efficiently (a minimal UDF sketch follows this list).
  • Responsible for loading data from Oracle and Teradata databases into HDFS using Sqoop.
  • Implemented AJAX, JSON and JavaScript to create interactive web screens.
  • Wrote data ingestion systems to pull data from traditional RDBMS platforms such as Oracle and Teradata and store it in NoSQL databases such as MongoDB.
  • Involved in creating Hive tables and applying HiveQL to them, which invokes and runs MapReduce jobs automatically.
  • Delivered an enterprise data platform, reducing delivery time by 24% through enhanced cloud-based architectures leveraging AWS.
  • Worked on AWS RDS, migrating data from Oracle and SQL Server databases of other applications over the cloud.
  • Supported applications running on Linux machines.
  • Developed data-driven web applications and deployment scripts using HTML5, XHTML, CSS and client-side scripting with JavaScript.
  • Involved in loading and transforming large sets of structured, semi-structured and unstructured data, and analyzed them by running Hive queries and Pig scripts.
  • Participated in requirements gathering from subject-matter experts and business partners, converting the requirements into technical specifications.
  • Analyzed the Cassandra database and compared it with other open-source NoSQL databases to determine which best suited the current requirements.
  • Created and maintained technical documentation for launching Hadoop clusters and for executing Hive queries and Pig scripts.
  • Installed the Oozie workflow engine to run multiple Hive and Pig jobs that run independently based on time and data availability.
  • Assisted application teams in installing Hadoop updates, operating system patches and version upgrades when required.
  • Assisted in cluster maintenance, monitoring and troubleshooting, and managed and reviewed data backups and log files.
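
As a minimal illustration of the Hive UDFs mentioned above, the sketch below normalizes a string column; the class and function names are hypothetical. After packaging it into a jar, it would be registered in Hive with ADD JAR and CREATE TEMPORARY FUNCTION.

```java
import org.apache.hadoop.hive.ql.exec.UDF;
import org.apache.hadoop.io.Text;

// Hypothetical UDF: trims and lower-cases a string column.
// In HiveQL: SELECT normalize_str(raw_name) FROM customers;
public class NormalizeStringUdf extends UDF {
    public Text evaluate(Text input) {
        // Hive may pass NULLs; propagate them unchanged.
        if (input == null) {
            return null;
        }
        return new Text(input.toString().trim().toLowerCase());
    }
}
```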

Environment: Hadoop, MapReduce, HDFS, Hive, Cloudera, Core Java, SQL, Flume, NoSQL, Pig, Sqoop, Oozie, HBase, Cassandra, ETL, Informatica, Oracle, Unix, AJAX, JSON

Java Developer

Confidential

Responsibilities:

  • Involved in all test cases and fixed bugs and issues identified during the testing period.
  • Worked with IE Developer Tools to debug HTML.
  • Wrote unit test cases using JUnit.
  • Implemented a logging mechanism using log4j.
  • Created a RESTful web service in the Doc-delete application to delete documents older than a given expiration date (a minimal sketch follows this list).
  • Followed the Agile development methodology throughout and tested the application in each iteration.
  • Designed and developed websites using cXML and REST for Cisco and multiple other clients.
  • Migrated the production database from SQL Server 2000 to SQL Server 2008 and upgraded production JBoss application servers.
  • Designed user interfaces using JavaScript, AJAX, CSS and jQuery.
  • Used Swing for sophisticated GUI components.
  • Wrote Java utility classes.
  • Troubleshot and resolved defects.
  • Used IntelliJ as the IDE for application development and integration of the frameworks.
  • Designed the application by implementing the Struts 2.0 MVC architecture.
  • Performed development, enhancement, maintenance and support of Java/J2EE applications.
  • Developed JSPs and Servlets to dynamically generate HTML and display data on the client side.
  • Implemented JSON along with AJAX to improve processing speed.
  • Deployed the applications on the Tomcat application server.
  • Prepared high- and low-level design documents for the business modules for future reference and updates.
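
A minimal sketch of the document-expiry delete endpoint described above, written here with JAX-RS annotations; the resource path, query parameter and DocumentRepository helper are illustrative assumptions rather than the original implementation.

```java
import java.time.LocalDate;

import javax.ws.rs.DELETE;
import javax.ws.rs.Path;
import javax.ws.rs.QueryParam;
import javax.ws.rs.core.Response;

@Path("/documents")
public class DocDeleteResource {

    private final DocumentRepository repository = new DocumentRepository();

    // e.g. DELETE /documents?expiresBefore=2015-01-01
    @DELETE
    public Response deleteExpired(@QueryParam("expiresBefore") String isoDate) {
        LocalDate cutoff = LocalDate.parse(isoDate);
        int removed = repository.deleteOlderThan(cutoff);
        return Response.ok("Deleted " + removed + " documents").build();
    }

    // Placeholder repository; a real one would query the document store.
    static class DocumentRepository {
        int deleteOlderThan(LocalDate cutoff) {
            return 0; // stub
        }
    }
}
```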

ENVIRONMENT: Java, Apache Maven, SVN, Jenkins, Spring 3.2, Spring Integration, JBoss, Spring Boot, log4j, JUnit, IBM MQ, JMS, Web Services, HTML, jQuery, JavaScript, Java 1.5, Servlets 2.3, JSP 2.x, Hibernate.

Java Developer

Confidential

Responsibilities:

  • Involved in coding Java Servlets and created web pages using JSPs to generate pages dynamically.
  • Involved in developing forms using HTML.
  • Developed Enterprise JavaBeans for the business flow and business objects.
  • Designed, coded and configured server-side J2EE components such as JSPs, Servlets, JavaBeans and XML.
  • Responsible for implementing the business requirements using Spring Core, Spring Boot and Spring Data.
  • Made extensive use of the Struts framework for controller and view components.
  • Used XML for client communication, and consumed and created RESTful web services.
  • Developed database interaction classes using JDBC and Java.
  • Rigorously followed Test-Driven Development (TDD) in coding.
  • Implemented Action classes and server-side validations for account activity, payment history and transactions.
  • Implemented views using Struts tags, JSTL 2.0 and the Expression Language.
  • Worked with Java patterns such as Service Locator and Factory at the business layer for effective object behavior.
  • Used Hibernate to move application data between the application and the database.
  • Worked with the Java Collections API for handling data objects between the business layers and the front end.
  • Worked with JAXB, SAX and XML Schema for exporting data into XML format and importing data from XML into the database, and used JAXB for marshalling and unmarshalling web service request/response data.
  • Responsible for coding MySQL statements and stored procedures for back-end communication using JDBC (a minimal sketch follows this list).
  • Developed an API to write XML documents from a database; utilized XML and XSL transformations for dynamic web content and database connectivity.
  • Developed a RESTful web service using the Spring framework.
  • Involved in implementing the Hibernate API for database connectivity.
  • Maintained the source code and designed, developed and deployed the application on the Apache Tomcat server.
  • Used Maven for continuous integration of the builds and Ant for deploying the web applications.
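
A minimal JDBC sketch of the stored-procedure calls mentioned above; the connection URL, credentials and procedure name (get_payment_history) are placeholder assumptions.

```java
import java.sql.CallableStatement;
import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.ResultSet;

public class PaymentHistoryDao {
    public static void main(String[] args) throws Exception {
        String url = "jdbc:mysql://localhost:3306/appdb"; // placeholder URL
        try (Connection conn = DriverManager.getConnection(url, "user", "pass");
             CallableStatement stmt =
                     conn.prepareCall("{call get_payment_history(?)}")) {
            stmt.setLong(1, 42L); // hypothetical account id
            try (ResultSet rs = stmt.executeQuery()) {
                while (rs.next()) {
                    System.out.println(
                            rs.getString("txn_date") + " " + rs.getBigDecimal("amount"));
                }
            }
        }
    }
}
```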

Environment: Java/J2EE, Struts 1.2, Tiles, EJB, JMS, Servlets, JSP, JDBC, HTML, CSS, JavaScript, JUnit, WebSphere 7.0, Eclipse, SQL Server 2000, log4j, Subversion, Jenkins.
