
Hadoop/Spark Developer Resume


San Jose, CA

PROFESSIONAL SUMMARY:

  • 7+ years of programming experience with skills in analysis, design, testing, and deployment of various software applications, including 3+ years of strong work experience in the Hadoop ecosystem and Big Data analytics
  • Experience with Hadoop MapReduce, HDFS, and Hadoop ecosystem tools including HBase, Hive, Pig, Sqoop, Flume, and Apache Oozie
  • Experience building data-driven applications using Hadoop and Spark
  • Built a data pipeline from scratch using Spark on Hadoop with YARN as the cluster management service
  • Good experience with Spark and Cassandra
  • Hands-on experience migrating from Hive to Spark SQL for better performance
  • Development expertise with RDBMSs such as Oracle, Sybase, Teradata, Netezza, and MS SQL, as well as NoSQL databases such as HBase and Cassandra
  • Experience with all major Hadoop distributions, including Cloudera, Hortonworks, and MapR
  • Hands-on experience with cloud computing infrastructure such as Amazon EMR, S3, and EC2 instances
  • Hands-on experience using Hadoop and Spark APIs
  • Developed several MapReduce programs using the Hadoop Java API as well as Hive and Pig
  • Worked on custom Pig loaders and storage classes to handle a variety of data formats such as JSON, compressed CSV, ORC, and Avro
  • Developed MapReduce programs to transform and process data
  • Experience importing and exporting data between different data sources using Sqoop
  • Hands-on experience analyzing data with Hive and Pig
  • Experience writing Hive UDFs and Pig UDFs based on requirements
  • Hands-on experience writing Hive queries and Pig scripts
  • Expert in working with Hive tables, distributing data by implementing partitioning and bucketing, and writing and optimizing HiveQL queries (see the sketch after this list)
  • Clear understanding of Hadoop MRv1 architectural components (HDFS, JobTracker, TaskTracker, NameNode, DataNode, Secondary NameNode) and of YARN architectural components (ResourceManager, NodeManager, ApplicationMaster)
  • Expertise in job/workflow scheduling using Oozie and in monitoring tools such as Nagios and Ganglia
  • Extensive experience in middle-tier development using J2EE technologies such as JDBC, JNDI, JSP, Servlets, JSF, Struts, Spring, Hibernate, and EJB
  • Involved in implementing enterprise integration with web services and legacy systems using SOAP and REST (with the Axis and Jersey frameworks)
  • Strong technical and interpersonal skills combined with a strong commitment to meeting deadlines
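
A minimal, illustrative sketch of the partitioning and bucketing approach mentioned above, expressed through the Spark DataFrameWriter API (Spark 2.x+) rather than raw HiveQL DDL; the database, table, and column names are hypothetical placeholders.

```scala
import org.apache.spark.sql.SparkSession

object PartitionBucketSketch {
  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder()
      .appName("PartitionBucketSketch")
      .enableHiveSupport()
      .getOrCreate()

    // Hypothetical orders dataset already registered in the metastore.
    val orders = spark.table("staging.orders")

    // Partition by load date so date-filtered queries prune whole directories;
    // bucket by customer_id so joins on that key can avoid a full shuffle.
    orders.write
      .partitionBy("load_date")
      .bucketBy(32, "customer_id")
      .sortBy("customer_id")
      .mode("overwrite")
      .saveAsTable("analytics.orders_bucketed")

    spark.stop()
  }
}
```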

PROFESSIONAL EXPERIENCE

Confidential, San Jose, CA

Hadoop/Spark Developer

Responsibilities:

  • Built an enterprise-level application capable of what-if analysis for changes in product-level and component-level costs and of identifying items eligible for savings
  • Migrated all batch processes from Oracle to Hadoop
  • Imported source data from Oracle and SAP HANA and created an enterprise-level data lake on the MapR Hadoop distribution
  • Imported and exported data between Oracle and SAP HANA
  • Denormalized Oracle source data in Hive and created Bill of Materials data and other historical data
  • Generated analytical views using Spark SQL, such as material cost savings views for Cisco products
  • Wrote Spark jobs in Scala/Java using Spark SQL to identify items eligible for savings (see the sketch after this list)
  • Ran Spark jobs to perform cost rollups at both the component level and the product-family level
  • Implemented SAP HANA-like features in Spark
  • Generated views with cost margin, BU/PF cost, total sent, and savings-eligible items
  • Designed and built the reporting application, which uses Spark SQL to fetch data and generate reports
  • Created snapshots and versions that show the impact on total cost of changes in individual product costs
  • Hands-on experience configuring, launching, and performance-tuning Spark jobs in yarn-client mode
  • Built a POC connecting Spark to a custom front end built in AngularJS
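
A simplified sketch of the kind of Spark SQL cost-rollup job described above, written in Scala against the Spark 2.x API; the table names, column names, and savings rule are illustrative assumptions, not the actual business logic.

```scala
import org.apache.spark.sql.SparkSession
import org.apache.spark.sql.functions._

object CostRollup {
  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder()
      .appName("CostRollup")
      .enableHiveSupport()   // read the denormalized Hive tables directly
      .getOrCreate()

    // Hypothetical Hive table holding denormalized Bill of Materials rows:
    // one row per (product_id, component_id) with current and proposed unit costs.
    val bom = spark.table("datalake.bom_costs")

    // Roll up component-level cost changes to the product level.
    val productRollup = bom
      .withColumn("delta", col("proposed_cost") - col("current_cost"))
      .groupBy("product_family", "product_id")
      .agg(
        sum("current_cost").as("total_current_cost"),
        sum("proposed_cost").as("total_proposed_cost"),
        sum("delta").as("total_delta")
      )

    // Items eligible for savings: products whose proposed rollup is cheaper.
    val savingsEligible = productRollup.filter(col("total_delta") < 0)

    // Persist as an analytical table for the reporting application.
    savingsEligible.write.mode("overwrite")
      .saveAsTable("analytics.savings_eligible_products")

    spark.stop()
  }
}
```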

Confidential, Medford, MA

Hadoop Developer

Responsibilities:

  • Set up an auto-scalable MapR cluster on AWS and set up Spark 0.9.0 in yarn-client mode for POC purposes
  • Loaded data from various data sources into S3
  • Involved in designing and implementing a MapReduce-based system that handles aggregation-based analysis of unstructured data
  • Built a POC on including Spark in the data pipeline to perform batch processing
  • Used Spark SQL to perform this batch processing
  • Developed MapReduce code for several calculations on CSV files and wrote the results back to DynamoDB
  • Monitored cluster status and debugged issues
  • Designed the mobile team dashboard using Geckoboard to display jobs from the last day and the last seven days, push notifications, job start and stop times, etc.
  • Set up bucket policies on Amazon S3
  • Read data files and performed the same transformations using Spark SQL to demonstrate Spark's performance over Hive (see the sketch after this list)
  • Processed the data based on business logic and loaded it into DynamoDB
  • Set up bucket policies and enabled versioning and lifecycle rules for these buckets on Amazon S3
  • Developed Hive queries to pre-process the data for analysis by imposing a read-only structure on the streaming data
  • Implemented business logic by writing UDFs in Java and used Hive UDFs and SerDes for processing email logs
  • Developed Pig scripts to handle joins across different data sets and the cleansing process
  • Automated all jobs for pulling data from the FTP server and loading it into Hive tables using Oozie workflows
  • Developed UNIX shell scripts to send an e-mail on successful completion of the process, indicating the destination folder where the files are available
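
A rough illustration of the Spark SQL batch-processing comparison referenced above: a minimal Scala sketch that reads a CSV file and runs the same kind of aggregation previously expressed in HiveQL. Note that it uses the later SparkSession/DataFrame API for clarity rather than the Spark 0.9.0 API actually deployed, and the bucket, file, and column names are hypothetical.

```scala
import org.apache.spark.sql.SparkSession

object CsvBatchDemo {
  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder()
      .appName("CsvBatchDemo")
      .getOrCreate()

    // Hypothetical CSV of log events staged in S3.
    val events = spark.read
      .option("header", "true")
      .option("inferSchema", "true")
      .csv("s3a://example-bucket/logs/events.csv")

    // The same aggregation previously written as a HiveQL query,
    // run here through Spark SQL for the performance comparison.
    events.createOrReplaceTempView("events")
    val dailyCounts = spark.sql(
      """SELECT event_date, event_type, COUNT(*) AS cnt
        |FROM events
        |GROUP BY event_date, event_type""".stripMargin)

    dailyCounts.write.mode("overwrite")
      .parquet("s3a://example-bucket/output/daily_counts")

    spark.stop()
  }
}
```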

Confidential, Sunnyvale, CA

Hadoop Developer

Responsibilities:

  • Followed Agile methodology to collect requirements and translate complex functional and technical requirements into detailed architecture and design
  • Worked on cloud-based applications using AWS EC2, EMR, and other AWS features
  • Developed scalable distributed data solutions using Hadoop and migrated legacy retail application ETL to Hadoop
  • Imported data from Teradata into the Hadoop data lake
  • Used Avro SerDes to handle Avro-format data in Hive and Impala
  • Responsible for loading customer data and event logs from Oracle and Teradata databases into HDFS/Hive
  • Imported data from RDBMSs such as Oracle into HBase and Hive using Sqoop
  • Experienced in handling CRUD operations on HBase data using the Java API (see the sketch after this list)
  • Developed bulk loading of data into HBase using MapReduce
  • Designed and developed queries to perform analytics on time-series data using HBase
  • Migrated the complete ETL process using Pig Latin operations, transformations, and UDFs
  • Developed Pig Latin scripts to extract data from web server output files and load it into HDFS
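
A small illustrative sketch of HBase CRUD operations through the Java client API (HBase 1.x+ style), called here from Scala; the table name, column family, and row key are hypothetical.

```scala
import org.apache.hadoop.hbase.{HBaseConfiguration, TableName}
import org.apache.hadoop.hbase.client.{ConnectionFactory, Delete, Get, Put}
import org.apache.hadoop.hbase.util.Bytes

object HBaseCrudSketch {
  def main(args: Array[String]): Unit = {
    val conf = HBaseConfiguration.create()            // picks up hbase-site.xml from the classpath
    val connection = ConnectionFactory.createConnection(conf)
    // Hypothetical table "customer_events" with a single column family "cf".
    val table = connection.getTable(TableName.valueOf("customer_events"))

    try {
      // Create/update: write one cell for a row key.
      val put = new Put(Bytes.toBytes("row-001"))
      put.addColumn(Bytes.toBytes("cf"), Bytes.toBytes("event_type"), Bytes.toBytes("login"))
      table.put(put)

      // Read: fetch the row back and decode the cell value.
      val result = table.get(new Get(Bytes.toBytes("row-001")))
      val eventType = Bytes.toString(
        result.getValue(Bytes.toBytes("cf"), Bytes.toBytes("event_type")))
      println(s"event_type = $eventType")

      // Delete: remove the row.
      table.delete(new Delete(Bytes.toBytes("row-001")))
    } finally {
      table.close()
      connection.close()
    }
  }
}
```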

Confidential - Duluth, GA

Big Data engineer

Responsibilities:

  • Involved in all phases of the product development cycle, from product definition and design through implementation
  • Responsible for complete SDLC management using Agile methodology
  • Ran MapReduce jobs to transform unstructured data into structured data and then insert it into HBase from HDFS
  • Involved in moving data from mainframe tables to HDFS and HBase tables using Sqoop
  • Prepared Avro schema files for generating Hive tables, and shell scripts for executing Hadoop commands in a single execution
  • Exported the results into relational databases using Sqoop for visualization and further analysis by the Business Intelligence team
  • Used MapReduce and Sqoop to load, aggregate, store, and analyze web log data from different web servers
  • Automated jobs for loading data into Hive tables (see the sketch after this list)
  • Ran cluster coordination services through ZooKeeper
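
The sketch below illustrates the kind of automated Hive-table load referenced above, shown here as a Scala Spark job for brevity rather than the MapReduce jobs actually used; it appends a day's staged files into a date-partitioned Hive table. The HDFS path, table, and column names are hypothetical, and in practice a scheduler such as Oozie or cron would trigger the job.

```scala
import org.apache.spark.sql.SparkSession
import org.apache.spark.sql.functions._

object LoadWebLogsToHive {
  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder()
      .appName("LoadWebLogsToHive")
      .enableHiveSupport()
      .getOrCreate()

    // Hypothetical staging location in HDFS populated by the upstream job.
    val raw = spark.read
      .option("header", "true")
      .csv("hdfs:///staging/weblogs/current/")

    // Minimal structuring step: derive a partition column from the timestamp.
    val structured = raw.withColumn("log_date", to_date(col("event_ts")))

    // Append the current load cycle's data into a date-partitioned Hive table.
    structured.write
      .mode("append")
      .partitionBy("log_date")
      .saveAsTable("analytics.web_logs")

    spark.stop()
  }
}
```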

Confidential - Little Rock, AR

Java/JEE Developer

Responsibilities:

  • Involved in SDLC requirements gathering, analysis, design, development, and testing of the application, developed using Agile methodology
  • Developed the Technical Design Document (TRD), Functional Specification Documents, and Unit Test Cases
  • Implemented J2EE design patterns such as Factory, DAO, Session Façade, Singleton, and Value Object
  • Built the code from the version control system and deployed it to the target WebLogic server using Hudson jobs
  • Implemented features such as logging and user session validation using the Spring AOP module
  • Developed client request validation and processing using JavaScript
  • Used the Spring Framework at the business tier and Spring's BeanFactory for initializing services
  • Used Session Beans for business logic and Entity Beans for database persistence
  • Developed server-side services using Java multithreading, Spring, and Web Services (SOAP, Axis)
  • Wrote the application front end with HTML, JSP, and Ajax/jQuery; wrote custom JSP tags for role-based sorting and filtering
  • Used software development best practices for MVC, Spring, and databases
  • Developed and executed unit test plans using JUnit, ensuring that results were documented and reviewed with the Quality Assurance teams responsible for integration testing
  • Provided extensive pre-delivery support through bug fixing and code reviews

Confidential

Java/JEE Developer

Responsibilities:

  • Worked with business analysts and project managers to understand the requirements and developed the required functionality
  • Reviewed the requirements to ensure technical feasibility and defined the scope within the iteration plan
  • Created high-level designs and ensured the design conformed to the specifications
  • Developed and supported the application using an HSBC client-specific framework involving Spring, Struts, Servlets, JSPs, and Hibernate
  • Visualized the timelines for the newly created enhancements
  • Involved in code reviews to verify that code changes made by team members met the client's standards and Java standards
  • Extensively used Spring MVC for servlet configuration during both application development and testing
  • Involved in developing the user interface using JSPs, JSTL, HTML, Struts, and Servlets
  • Performed version control management using ClearCase
  • Involved in the preparation of Test Cases for Integration Testing
  • Responsible for Component Integration Testing and supporting System Integration Testing
  • Developed automated build scripts using Maven to deploy and test the application
  • Used JDBC, SQL queries, prepared statements, and batch processing
  • Involved in each phase of delivery to ensure quality deliverables
