
Lead Hadoop/Spark Consultant Resume


Tampa, FL

SUMMARY:

  • Around 13 years of experience in Big Data (Hadoop/Spark) and Java/J2EE development, including requirements analysis, design, implementation, support, maintenance, and enhancements in the Finance and Insurance domains.
  • 4+ years of experience as a Hadoop/Spark developer with good knowledge of Java MapReduce, Hive, Pig Latin, Scala, and Spark.
  • Organizing data into tables, performing transformations, and simplifying complex queries with Hive.
  • Performing real-time interactive analysis on massive data sets stored in HDFS.
  • Strong knowledge and experience with Hadoop architecture and various components such as HDFS, YARN, Pig, Hive, Sqoop, Oozie, Flume, Spark, Kafka, and the MapReduce programming paradigm.
  • Developed many MapReduce programs.
  • Experience in analyzing data using Spark SQL, HiveQL, and Pig Latin, and in developing custom UDFs using Pig and Hive.
  • Experience in importing and exporting data using Sqoop from HDFS to Relational Database Systems and vice-versa.
  • Good knowledge in using job scheduling tools like Oozie.
  • Experienced in using IDE tools such as Eclipse 3.x and IBM RAD 7.0.
  • Experience in requirement gathering, analysis, planning, designing, coding and unit testing.
  • Strong work ethic with desire to succeed and make significant contributions to the organization.
  • Strong problem-solving, communication, and interpersonal skills; a good team player.
  • Motivated to take on independent responsibility as well as to contribute as a productive team member.
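As a minimal illustration of the Spark SQL / HiveQL analysis described above, a sketch in Scala using the Spark 1.6-era HiveContext API (the table and column names here are hypothetical placeholders, not from an actual engagement):

```scala
// Hypothetical sketch: running a HiveQL aggregation through Spark SQL.
import org.apache.spark.{SparkConf, SparkContext}
import org.apache.spark.sql.hive.HiveContext

object SparkSqlAnalysis {
  def main(args: Array[String]): Unit = {
    val sc = new SparkContext(new SparkConf().setAppName("SparkSqlAnalysis"))
    val hiveContext = new HiveContext(sc)

    // Aggregate a Hive table with HiveQL executed by Spark SQL
    // ("transactions" and its columns are placeholder names).
    val totals = hiveContext.sql(
      """SELECT account_id, SUM(amount) AS total_amount
        |FROM transactions
        |GROUP BY account_id""".stripMargin)

    totals.show()
    sc.stop()
  }
}
```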

TECHNICAL SKILLS:

Hadoop Technologies: Hadoop, HDFS, Hadoop MapReduce, Hive, HBase, Sqoop, Oozie, Avro, Pig Latin, Hue, CDH, Parquet, Impala, Scala, Spark, Python, Kafka, AWS, S3, DynamoDB, EMR, Apache NiFi

NoSQL: HBase

IDE/Tools: RAD, Eclipse

Web and Application Servers: WebSphere, JBoss, Tomcat

Core Competency Technologies: Java, OOP, design patterns, JSP, Servlets, JDBC, Java 5/6/7, C, C++, shell scripting, Spark, SAS EG, Scala, Spark Streaming, Kafka

Web presentation frameworks: JavaScript, HTML, AJAX, jQuery, CSS, JSON

Testing & Issue Log tools: JUnit 4, Bugzilla, HP Quality Center

SCM/Version control tools: PVCS, CVS, Subversion

Modeling tools: Visio 2007

Build and continuous Integration: Maven, ANT

Database: Oracle 8i/9i/10g, DB2 & MySQL 4.x/5.x

OS: UNIX, Linux, Windows, AIX

PROFESSIONAL EXPERIENCE:

Confidential, Tampa, FL

Lead Hadoop/Spark Consultant

Responsibilities:

  • Used Spark API over Cloudera Hadoop YARN to perform analytics using Scala programming.
  • Created DataFrames and performed various transformations to generate the recommendation strategy.
  • Created Hive tables and views for the business to access the recommendation strategy.
  • Regenerated the card-wise recommendation strategy for downstream applications based on updated recommendation strategy from business users.
  • Used Spark Streaming on Kafka to achieve real-time data analytics.
  • Created DStreams and DataFrames from streaming data and performed transformations.
  • Extracted real-time feeds using Kafka and Spark Streaming, converted them to RDDs, and processed the data as DataFrames.
  • Persisted the streaming data in the HBase NoSQL database.
  • Wrote shell scripts to automate the jobs in UNIX.
  • Performed tuning of Spark applications, setting the right batch interval, the correct level of parallelism, and memory configuration.
  • Handled large datasets using partitions, Spark in-memory capabilities, broadcast variables, and efficient joins and transformations.
  • Used Spark SQL to save the data into multiple Hive tables.
  • Developed Hive queries to process the data for visualizing.
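The Kafka-to-Hive streaming flow in the bullets above could be sketched roughly as follows (Spark 1.6 / Scala; broker address, topic, schema, and table names are hypothetical):

```scala
// Hypothetical sketch: Kafka direct stream -> DStream -> DataFrame -> Hive.
import kafka.serializer.StringDecoder
import org.apache.spark.SparkConf
import org.apache.spark.streaming.{Seconds, StreamingContext}
import org.apache.spark.streaming.kafka.KafkaUtils
import org.apache.spark.sql.hive.HiveContext

case class RawEvent(payload: String) // placeholder schema

object KafkaSparkStreaming {
  def main(args: Array[String]): Unit = {
    val conf = new SparkConf().setAppName("KafkaSparkStreaming")
    // The batch interval is one of the main tuning knobs mentioned above.
    val ssc = new StreamingContext(conf, Seconds(10))
    val kafkaParams = Map("metadata.broker.list" -> "broker1:9092")

    // Direct DStream over a Kafka topic (placeholder topic name).
    val stream = KafkaUtils.createDirectStream[String, String, StringDecoder, StringDecoder](
      ssc, kafkaParams, Set("card-events"))

    // Convert each micro-batch to a DataFrame and append it to a Hive table.
    stream.map(record => RawEvent(record._2)).foreachRDD { rdd =>
      val hiveContext = new HiveContext(rdd.sparkContext)
      import hiveContext.implicits._
      rdd.toDF().write.mode("append").saveAsTable("card_events_raw")
    }

    ssc.start()
    ssc.awaitTermination()
  }
}
```

Persisting to HBase instead of Hive would follow the same foreachRDD pattern, writing through the HBase client API inside each partition.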

Environment: Spark 1.6, Hadoop, Hive 2.0, HDFS, Kafka, Sqoop 1.4.6, Java 1.8, Scala 2.11, CDH 5.8.2, Oozie, Eclipse, Parquet, Shell Scripting, Bitbucket

Confidential, Charlotte, NC

Hadoop/Spark Consultant

Responsibilities:

  • Used Spark API to perform analytics on data in Hive using Scala programming.
  • Optimized existing Hadoop algorithms using Spark context, DataFrames, and Hive context.
  • Created Spark RDDs in Scala for all the data files, which then underwent transformations.
  • Aggregated and transformed the filtered RDDs based on the business rules, converted them into DataFrames, and saved them as temporary Hive tables for intermediate processing.
  • Applied various transformations and actions to the RDDs and DataFrames and stored the results in HDFS as Parquet files to create Impala views.
  • Used Oozie scheduler to create workflows and scheduled jobs in Hadoop Cluster.
  • Wrote Hive UDFs to extract data from staging tables.
  • Involved in creating Hive tables & views to load, transform the data.
  • Involved in writing Pig scripts.
  • Supported and Monitored Map Reduce Programs running on the cluster.
  • Worked on data quality framework to generate reports to business on the quality of data processed in Hadoop.
  • Worked on developing Web UI using J2EE for reports generated on the final Basel views.
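The RDD-to-Parquet pipeline described above could be sketched along these lines (Spark 1.6 / Scala; paths, schema, and filter rules are hypothetical placeholders):

```scala
// Hypothetical sketch: text files -> RDD -> DataFrame -> Parquet for Impala views.
import org.apache.spark.{SparkConf, SparkContext}
import org.apache.spark.sql.hive.HiveContext

case class Exposure(dealId: String, region: String, amount: Double) // placeholder schema

object ExposurePipeline {
  def main(args: Array[String]): Unit = {
    val sc = new SparkContext(new SparkConf().setAppName("ExposurePipeline"))
    val hiveContext = new HiveContext(sc)
    import hiveContext.implicits._

    // Load raw files into an RDD and apply simple business-rule filters.
    val exposures = sc.textFile("hdfs:///data/raw/exposures/*")
      .map(_.split(","))
      .filter(_.length == 3)
      .map(f => Exposure(f(0), f(1), f(2).toDouble))

    // Convert to a DataFrame, register a temp table for intermediate
    // processing, then write Parquet that Impala views can be built over.
    val df = exposures.toDF()
    df.registerTempTable("exposures_stage")
    df.groupBy("region").sum("amount")
      .write.mode("overwrite").parquet("hdfs:///data/marts/exposure_totals")

    sc.stop()
  }
}
```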

Environment: Java 1.8, Scala 2.11, Spark 1.6, Hadoop, Pig 0.12, Hive 1.1, MapReduce, HDFS, MySQL, Sqoop 1.4.6, CDH 5.8.2, Oozie, Eclipse, Avro, Parquet, Toad, Shell Scripting, Teradata, Impala, J2EE, SAS EG

Confidential, NJ

Hadoop Developer

Responsibilities:

  • Worked on analyzing the Hadoop cluster and different big data analytic tools, including Pig and the HBase NoSQL database.
  • Imported and exported data between HDFS and Hive using Sqoop.
  • Designed and developed MapReduce jobs to process data from upstream.
  • Experience with NoSQL databases.
  • Worked on data ingestion to bring data into HDFS and Hive.
  • Wrote Hive UDFs to extract data from staging tables.
  • Involved in creating Hive tables and loading them with data.
  • Wrote MapReduce code to convert unstructured data into structured data and to insert data into HBase from HDFS.
  • Created the integration between Hive and HBase.
  • Used the Oozie scheduler to submit workflows.
  • Reviewed QA test cases with the QA team.

Environment: Java 6, Eclipse, Hadoop, Pig 0.12, Hive 0.13, CentOS 6.4, MapReduce, HDFS, MySQL, Sqoop 1.4.4, CDH4, Hue, Oozie, Toad, HBase

Confidential

Sr. Java Developer

Responsibilities:

  • Involved in software development on web-based front-end applications.
  • Involved in development of the CSV files using the data load process.
  • Performed unit testing of the developed modules.
  • Involved in bug fixing, writing SQL queries & unit test cases.
  • Used Rational Application Developer (RAD).
  • Used Oracle as the Backend Database.
  • Involved in configuration and deployment of front-end application on RAD.
  • Involved in developing JSPs for the graphical user interface.
  • Implemented code for validating the input fields and displaying the error messages.

Environment: Java, JSP, Servlets, Apache Struts framework, WebSphere, RAD, Oracle, PVCS, TOAD

Confidential

Java Developer

Responsibilities:

  • Participated in the implementation of efforts like coding, unit testing.
  • Implemented a web-based application using Servlets and JSP.
  • Implemented client-side validation using JavaScript.
  • Involved in unit integration and bug fixing.
  • Involved in acceptance testing with test cases and code reviews.
  • Developed code for handling exceptions using exception handling.
  • Involved in writing and executing queries in MySQL.
  • Developed the application on Eclipse.
  • Involved in deploying application on WebSphere server.
  • Prepared test case document and performed unit testing and system testing.

Environment: Java, JSP, Servlets, JavaScript, WebSphere, MySQL, Eclipse, TOAD

Confidential

Java Developer

Responsibilities:

  • Involved in the design of the applications using J2EE.
  • Created user-friendly GUI interface and Web pages using HTML embedded in JSP.
  • Used Rational Application Developer (RAD).
  • Implemented client-side validation using JavaScript.
  • Involved in understanding the business processes and defining the requirements.
  • Built test cases and performed unit testing.
  • Implemented logging using Log4j.
  • Used PVCS for version control.
  • Developed Stored Procedures, Triggers and Functions in Oracle.
  • Used JDBC to make connection to Oracle and retrieve necessary data from it.

Environment: Java, JSP, HTML, RAD, JavaScript, Oracle, PL/SQL, WebSphere, Servlets, log4j, PVCS

Confidential

Java Developer

Responsibilities:

  • Involved in various phases of Software Development Life Cycle (SDLC/SCRUM).
  • Worked with various types of controllers, such as SimpleFormController, AbstractController, and the Controller interface.
  • Integrated Spring DAO for data access using Hibernate, and used HQL for querying databases.
  • Developed UI modules using HTML, JSP, JavaScript and CSS.
  • Implemented the logging mechanism using Log4j framework.
  • Designed and developed batch processing using multi-threading to process payments.
  • Used Eclipse as the IDE for developing the J2EE application.
  • Involved in writing ANT scripts to build the application.
  • Involved in production support and fixed the issues based on the priority.
  • Used Concurrent Version System (CVS) as source control tool to keep track of the system state.
  • Created and Configured Connection pools in WebSphere Application Server.
  • Used JUnit for testing and debugging.

Environment: Java, JSP, WebSphere Application Server, HTML, ANT, JUnit, Log4j, CVS, Eclipse, COBOL, DB2, JCL.
