
Hadoop/Spark Developer Resume


San Jose, CA

PROFESSIONAL SUMMARY:

  • 7+ years of programming experience with skills in analysis, design, testing, and deployment of various software applications, including 3+ years of strong work experience in the Hadoop ecosystem and Big Data analytics
  • Experience with Hadoop MapReduce, HDFS, and Hadoop ecosystem tools including HBase, Hive, Pig, Sqoop, Flume, and Apache Oozie
  • Experience building data-driven applications using Hadoop and Spark
  • Built a data pipeline from scratch using Spark on Hadoop with YARN as the cluster management service
  • Good experience with Spark and Cassandra
  • Hands-on experience migrating from Hive to Spark SQL for better performance
  • Development expertise with RDBMSs such as Oracle, Sybase, Teradata, Netezza, and MS SQL, as well as NoSQL databases such as HBase and Cassandra
  • Experience with all major Hadoop distributions, including Cloudera, Hortonworks, and MapR
  • Hands-on experience with cloud computing infrastructure such as Amazon EMR, S3, and EC2 instances
  • Hands-on experience using Hadoop and Spark APIs
  • Developed several MapReduce programs using the Hadoop Java API as well as Hive and Pig
  • Worked on custom Pig loaders and storage classes to handle a variety of data formats such as JSON, compressed CSV, ORC, and Avro
  • Developed MapReduce programs to transform and process data
  • Experience importing and exporting data between different data sources using Sqoop
  • Hands-on experience analyzing data with Hive and Pig
  • Experience writing Hive UDFs and Pig UDFs based on requirements
  • Hands-on experience writing Hive queries and Pig scripts
  • Expert in working with Hive tables, distributing data by implementing partitioning and bucketing, and writing and optimizing HiveQL queries (see the sketch after this list)
  • Clear understanding of Hadoop MRv1 architectural components (HDFS, JobTracker, TaskTracker, NameNode, DataNode, Secondary NameNode) and of YARN architectural components (ResourceManager, NodeManager, ApplicationMaster)
  • Expertise in job/workflow scheduling using Oozie and in monitoring tools such as Nagios and Ganglia
  • Extensive experience in middle-tier development using J2EE technologies such as JDBC, JNDI, JSP, Servlets, JSF, Struts, Spring, Hibernate, and EJB
  • Involved in implementing enterprise integration with web services and legacy systems using SOAP and REST (with the Axis and Jersey frameworks)
  • Strong technical and interpersonal skills combined with a strong commitment to meeting deadlines
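
A minimal, illustrative sketch of the partitioning and bucketing approach mentioned above, expressed through the Spark DataFrameWriter API (Spark 2.x+) rather than raw HiveQL DDL; the database, table, and column names are hypothetical placeholders.

```scala
import org.apache.spark.sql.SparkSession

object PartitionBucketSketch {
  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder()
      .appName("PartitionBucketSketch")
      .enableHiveSupport()
      .getOrCreate()

    // Hypothetical orders dataset already registered in the metastore.
    val orders = spark.table("staging.orders")

    // Partition by load date so date-filtered queries prune whole directories;
    // bucket by customer_id so joins on that key can avoid a full shuffle.
    orders.write
      .partitionBy("load_date")
      .bucketBy(32, "customer_id")
      .sortBy("customer_id")
      .mode("overwrite")
      .saveAsTable("analytics.orders_bucketed")

    spark.stop()
  }
}
```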

PROFESSIONAL EXPERIENCE

Confidential, San Jose, CA

Hadoop/Spark Developer

Responsibilities:

  • Built an enterprise-level application capable of what-if analysis for changes in product-level and component-level costs and of identifying items eligible for savings
  • Migrated all batch processes from Oracle to Hadoop
  • Imported source data from Oracle and SAP HANA and created an enterprise-level data lake on the MapR Hadoop distribution
  • Imported and exported data between Oracle and SAP HANA
  • Denormalized Oracle source data in Hive and created Bill of Materials data and other historical data
  • Generated analytical views using Spark SQL, such as material cost savings views for Cisco products
  • Wrote Spark jobs in Scala/Java using Spark SQL to identify items eligible for savings (see the sketch after this list)
  • Ran Spark jobs to perform cost rollups at both the component level and the product-family level
  • Implemented SAP HANA-like features in Spark
  • Generated views with cost margin, BU/PF cost, total sent, and savings-eligible items
  • Designed and built the reporting application, which uses Spark SQL to fetch data and generate reports
  • Created snapshots and versions that show the impact on total cost of changes in individual product costs
  • Hands-on experience configuring, launching, and performance-tuning Spark jobs in yarn-client mode
  • Built a POC connecting Spark to a custom front end built in AngularJS
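
A simplified sketch of the kind of Spark SQL cost-rollup job described above, written in Scala against the Spark 2.x API; the table names, column names, and savings rule are illustrative assumptions, not the actual business logic.

```scala
import org.apache.spark.sql.SparkSession
import org.apache.spark.sql.functions._

object CostRollup {
  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder()
      .appName("CostRollup")
      .enableHiveSupport()   // read the denormalized Hive tables directly
      .getOrCreate()

    // Hypothetical Hive table holding denormalized Bill of Materials rows:
    // one row per (product_id, component_id) with current and proposed unit costs.
    val bom = spark.table("datalake.bom_costs")

    // Roll up component-level cost changes to the product level.
    val productRollup = bom
      .withColumn("delta", col("proposed_cost") - col("current_cost"))
      .groupBy("product_family", "product_id")
      .agg(
        sum("current_cost").as("total_current_cost"),
        sum("proposed_cost").as("total_proposed_cost"),
        sum("delta").as("total_delta")
      )

    // Items eligible for savings: products whose proposed rollup is cheaper.
    val savingsEligible = productRollup.filter(col("total_delta") < 0)

    // Persist as an analytical table for the reporting application.
    savingsEligible.write.mode("overwrite")
      .saveAsTable("analytics.savings_eligible_products")

    spark.stop()
  }
}
```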

Confidential, Medford, MA

Hadoop Developer

Responsibilities:

  • Set up an auto-scalable MapR cluster on AWS and set up Spark 0.9.0 in yarn-client mode for POC purposes
  • Loaded data from various data sources into S3
  • Involved in designing and implementing a MapReduce-based system that handles aggregation-based analysis of unstructured data
  • Built a POC on including Spark in the data pipeline to perform batch processing
  • Used Spark SQL to perform this batch processing
  • Developed MapReduce code for several calculations on CSV files and wrote the results back to DynamoDB
  • Monitored cluster status and debugged issues
  • Designed the mobile team dashboard using Geckoboard to display jobs from the last day and the last seven days, push notifications, job start and stop times, etc.
  • Set up bucket policies on Amazon S3
  • Read data files and performed the same transformations using Spark SQL to demonstrate Spark's performance over Hive (see the sketch after this list)
  • Processed the data based on business logic and loaded it into DynamoDB
  • Set up bucket policies and enabled versioning and lifecycle rules for these buckets on Amazon S3
  • Developed Hive queries to pre-process the data for analysis by imposing a read-only structure on the streaming data
  • Implemented business logic by writing UDFs in Java and used Hive UDFs and SerDes for processing email logs
  • Developed Pig scripts to handle joins across different data sets and the cleansing process
  • Automated all jobs for pulling data from the FTP server and loading it into Hive tables using Oozie workflows
  • Developed UNIX shell scripts to send an e-mail on successful completion of the process, indicating the destination folder where the files are available
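
A rough illustration of the Spark SQL batch-processing comparison referenced above: a minimal Scala sketch that reads a CSV file and runs the same kind of aggregation previously expressed in HiveQL. Note that it uses the later SparkSession/DataFrame API for clarity rather than the Spark 0.9.0 API actually deployed, and the bucket, file, and column names are hypothetical.

```scala
import org.apache.spark.sql.SparkSession

object CsvBatchDemo {
  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder()
      .appName("CsvBatchDemo")
      .getOrCreate()

    // Hypothetical CSV of log events staged in S3.
    val events = spark.read
      .option("header", "true")
      .option("inferSchema", "true")
      .csv("s3a://example-bucket/logs/events.csv")

    // The same aggregation previously written as a HiveQL query,
    // run here through Spark SQL for the performance comparison.
    events.createOrReplaceTempView("events")
    val dailyCounts = spark.sql(
      """SELECT event_date, event_type, COUNT(*) AS cnt
        |FROM events
        |GROUP BY event_date, event_type""".stripMargin)

    dailyCounts.write.mode("overwrite")
      .parquet("s3a://example-bucket/output/daily_counts")

    spark.stop()
  }
}
```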

Confidential, Sunnyvale, CA

Hadoop Developer

Responsibilities:

  • Followed Agile methodology to collect requirements and translate complex functional and technical requirements into detailed architecture and design
  • Worked on cloud-based applications using AWS EC2, EMR, and other AWS features
  • Developed scalable distributed data solutions using Hadoop and migrated legacy retail application ETL to Hadoop
  • Imported data from Teradata into the Hadoop data lake
  • Used Avro SerDes to handle Avro-format data in Hive and Impala
  • Responsible for loading customer data and event logs from Oracle and Teradata databases into HDFS/Hive
  • Imported data from RDBMSs such as Oracle into HBase and Hive using Sqoop
  • Experienced in handling CRUD operations on HBase data using the Java API (see the sketch after this list)
  • Developed bulk loading of data into HBase using MapReduce
  • Designed and developed queries to perform analytics on time-series data using HBase
  • Migrated the complete ETL process using Pig Latin operations, transformations, and UDFs
  • Developed Pig Latin scripts to extract data from web server output files and load it into HDFS
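
A small illustrative sketch of HBase CRUD operations through the Java client API (HBase 1.x+ style), called here from Scala; the table name, column family, and row key are hypothetical.

```scala
import org.apache.hadoop.hbase.{HBaseConfiguration, TableName}
import org.apache.hadoop.hbase.client.{ConnectionFactory, Delete, Get, Put}
import org.apache.hadoop.hbase.util.Bytes

object HBaseCrudSketch {
  def main(args: Array[String]): Unit = {
    val conf = HBaseConfiguration.create()            // picks up hbase-site.xml from the classpath
    val connection = ConnectionFactory.createConnection(conf)
    // Hypothetical table "customer_events" with a single column family "cf".
    val table = connection.getTable(TableName.valueOf("customer_events"))

    try {
      // Create/update: write one cell for a row key.
      val put = new Put(Bytes.toBytes("row-001"))
      put.addColumn(Bytes.toBytes("cf"), Bytes.toBytes("event_type"), Bytes.toBytes("login"))
      table.put(put)

      // Read: fetch the row back and decode the cell value.
      val result = table.get(new Get(Bytes.toBytes("row-001")))
      val eventType = Bytes.toString(
        result.getValue(Bytes.toBytes("cf"), Bytes.toBytes("event_type")))
      println(s"event_type = $eventType")

      // Delete: remove the row.
      table.delete(new Delete(Bytes.toBytes("row-001")))
    } finally {
      table.close()
      connection.close()
    }
  }
}
```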

Confidential - Duluth, GA

Big Data engineer

Responsibilities:

  • Involved in all phases of the product development cycle, from product definition and design through implementation
  • Responsible for complete SDLC management using Agile methodology
  • Ran MapReduce jobs to transform unstructured data into structured data and then insert it into HBase from HDFS
  • Involved in moving data from mainframe tables to HDFS and HBase tables using Sqoop
  • Prepared Avro schema files for generating Hive tables, and shell scripts for executing Hadoop commands in a single execution
  • Exported the results into relational databases using Sqoop for visualization and further analysis by the Business Intelligence team
  • Used MapReduce and Sqoop to load, aggregate, store, and analyze web log data from different web servers
  • Automated jobs for loading data into Hive tables (see the sketch after this list)
  • Ran cluster coordination services through ZooKeeper
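
The sketch below illustrates the kind of automated Hive-table load referenced above, shown here as a Scala Spark job for brevity rather than the MapReduce jobs actually used; it appends a day's staged files into a date-partitioned Hive table. The HDFS path, table, and column names are hypothetical, and in practice a scheduler such as Oozie or cron would trigger the job.

```scala
import org.apache.spark.sql.SparkSession
import org.apache.spark.sql.functions._

object LoadWebLogsToHive {
  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder()
      .appName("LoadWebLogsToHive")
      .enableHiveSupport()
      .getOrCreate()

    // Hypothetical staging location in HDFS populated by the upstream job.
    val raw = spark.read
      .option("header", "true")
      .csv("hdfs:///staging/weblogs/current/")

    // Minimal structuring step: derive a partition column from the timestamp.
    val structured = raw.withColumn("log_date", to_date(col("event_ts")))

    // Append the current load cycle's data into a date-partitioned Hive table.
    structured.write
      .mode("append")
      .partitionBy("log_date")
      .saveAsTable("analytics.web_logs")

    spark.stop()
  }
}
```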

Confidential - Little Rock, AR

Java/JEE Developer

Responsibilities:

  • Involved in SDLC requirements gathering, analysis, design, development, and testing of the application, developed using Agile methodology
  • Developed the Technical Design Document (TRD), Functional Specification Documents, and Unit Test Cases
  • Implemented J2EE design patterns such as Factory, DAO, Session Façade, Singleton, and Value Object
  • Built the code from the version control system and deployed it to the target WebLogic server using Hudson jobs
  • Implemented features such as logging and user session validation using the Spring AOP module
  • Developed client request validation and processing using JavaScript
  • Used the Spring Framework at the business tier and Spring's BeanFactory for initializing services
  • Used Session Beans for business logic and Entity Beans for database persistence
  • Developed server-side services using Java multithreading, Spring, and Web Services (SOAP, Axis)
  • Wrote the application front end with HTML, JSP, and Ajax/jQuery; wrote custom JSP tags for role-based sorting and filtering
  • Used software development best practices for MVC, Spring, and databases
  • Developed and executed unit test plans using JUnit, ensuring that results were documented and reviewed with the Quality Assurance teams responsible for integration testing
  • Provided extensive pre-delivery support through bug fixing and code reviews

Confidential

Java/JEE Developer

Responsibilities:

  • Worked with business analysts and project managers to understand the requirements and developed the required functionality
  • Reviewed the requirements to ensure technical feasibility and defined the scope within the iteration plan
  • Created high-level designs and ensured the design conformed to the specifications
  • Developed and supported the application using an HSBC client-specific framework involving Spring, Struts, Servlets, JSPs, and Hibernate
  • Visualized the timelines for the newly created enhancements
  • Involved in code reviews to verify that code changes made by team members met the client's standards and Java standards
  • Extensively used Spring MVC for servlet configuration during both application development and testing
  • Involved in developing the user interface using JSPs, JSTL, HTML, Struts, and Servlets
  • Performed version control management using ClearCase
  • Involved in the preparation of Test Cases for Integration Testing
  • Responsible for Component Integration Testing and supporting System Integration Testing
  • Developed automated build scripts using Maven to deploy and test the application
  • Used JDBC, SQL queries, prepared statements, and batch processing
  • Involved in each phase of delivery to ensure quality deliverables
