Spark Developer Resume
Murray, UT
SUMMARY
- 8+ years of experience in application development and design using Hadoop ecosystem tools, Spark, and Java/Spring frameworks.
- 1+ years of experience in PySpark, Spark Streaming, and Spark SQL.
- 3+ years of expertise in developing custom MapReduce programs to analyze data, and in configuring and using components such as HDFS, Flume, Hive, Pig, Sqoop & Oozie.
- Experience in installing and configuring Apache, Hortonworks (HDP) & Cloudera (CDH) distributions.
- Knowledge of cleansing and analyzing data using HiveQL and Pig Latin, including custom UDFs and UDAFs that extend Hive and Pig core functionality.
- Experience in importing and exporting data between HDFS and RDBMS using Sqoop.
- Worked on Amazon Web Services S3, EC2, and Redshift.
- Experience writing queries against RDBMS and NoSQL databases such as MongoDB & Cassandra.
- Working knowledge of data warehousing applications using Teradata & Greenplum DB.
- Experience supporting applications that integrate with BI tools such as Tableau.
- Designed and developed various web and enterprise applications using Core Java, RESTful services, SOA, Spring, AJAX, Hibernate, jQuery, JavaScript, JAX-RS.
- Implemented project modules using Agile methodology.
- Good knowledge of application servers such as WebSphere and Tomcat.
TECHNICAL SKILLS
DB Skills: Oracle, MSSQL, Teradata, Cassandra
Languages: Java, Scala, Pig, Python, PHP, R
Hadoop/Spark Ecosystem: HDFS, Spark Streaming, Hive, MLlib, Storm, Kafka, Flume, MapReduce, Oozie, Sqoop
Hadoop Distributions: Hortonworks, Cloudera, Apache
Application Server: WebSphere & Tomcat
Java Technologies: AJAX, jQuery, JavaScript, Core Java, Hibernate, REST/RESTful services, SOA, Spring, JAX-RS, JDBC
DevOps: SVN, Git, Maven & Jenkins
PROFESSIONAL EXPERIENCE
Confidential, Murray, UT
Spark Developer
Responsibilities:
- Developed simple and complex MapReduce programs in Java for Data Analysis on different data formats.
- Developed Spark scripts using Python and Scala shell commands as required.
- Developed and implemented core API services using Scala and Spark.
- Adopted Spark Streaming in place of Storm where needed.
- Optimized MapReduce Jobs to use HDFS efficiently by using various compression mechanisms.
- Worked on partitioning Hive tables and running scripts in parallel to reduce their run-time.
- Worked on data serialization formats, converting complex objects into byte sequences using Avro, Parquet, and JSON formats.
- Responsible for analyzing and cleansing raw data by performing Hive queries and running Pig scripts on data.
- Used Spark SQL rather than HiveQL to optimize query performance (see the sketch below).
- Created HBase tables to store variable data formats coming from different portfolios.
- Created Hive tables, loaded data, and wrote Hive queries that execute internally as MapReduce jobs.
- Implemented business logic by writing Pig UDFs in Java.
- Used Oozie Operational Services for batch processing and scheduling workflows dynamically.
- Developed Sqoop scripts to import/export data from relational sources and handled incremental loads of customer and transaction data by date.
- Populated HDFS and HBase with huge amounts of data using Apache Kafka.
- Created ALTER, INSERT, and DELETE queries involving lists, sets, and maps in Cassandra.
- Designed and developed a Java API that connects to MongoDB through Java services.
- Moved log files generated from various sources to HDFS for further processing via Flume.
Environment: MapReduce, HDFS, Hive, Pig, PySpark, Scala, Storm, Sqoop, Flume, Oozie, Apache Kafka, ZooKeeper.
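A minimal sketch of the Spark SQL approach referenced above, assuming a Hive-enabled SparkSession; the table and column names are illustrative, not the actual project schema:

```java
import org.apache.spark.sql.Dataset;
import org.apache.spark.sql.Row;
import org.apache.spark.sql.SparkSession;

public class TransactionSummary {
    public static void main(String[] args) {
        // Hive-enabled session so Spark SQL can read existing Hive tables.
        SparkSession spark = SparkSession.builder()
                .appName("TransactionSummary")
                .enableHiveSupport()
                .getOrCreate();

        // Aggregate over a partitioned Hive table with Spark SQL, letting
        // the Catalyst optimizer plan the job instead of HiveQL/MapReduce.
        Dataset<Row> summary = spark.sql(
                "SELECT customer_id, SUM(amount) AS total_amount "
              + "FROM transactions WHERE load_date = '2016-01-01' "
              + "GROUP BY customer_id");

        summary.write().mode("overwrite").parquet("/data/summary/transactions");
        spark.stop();
    }
}
```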
Confidential, Minnetonka, MN
Big Data Developer
Responsibilities:
- Analyzed data by performing Hive queries and running Pig scripts to understand user behavior.
- Responsible for loading customer's data and event logs into HBase using Java API.
- Created HBase tables to store variable data formats of input data coming from different portfolios
- Installed Oozie workflow engine to run multiple Hive and Pig jobs
- Worked on big data integration and analytics based on Hadoop, Spark, Kafka, Storm, and webMethods technologies.
- Involved in loading and transforming large sets of structured, semi-structured, and unstructured data from relational databases into HDFS using Sqoop imports.
- Developed Pig scripts to transform raw data into intelligent data as specified by business users.
- Migrated MapReduce programs into Spark transformations using Spark and Scala.
- Configured Sqoop and developed scripts to extract data from MySQL into HDFS. Created HBase tables to store various data formats of PII data coming from different portfolios
- Provided cluster coordination services through ZooKeeper.
- Optimized Pig jobs by using different compression techniques and performance enhancers.
- Optimized complex joins in Pig using techniques such as skewed joins and hash-based aggregations.
- Used Spark Streaming to collect data from Kafka in near real time and perform the necessary processing (see the sketch below).
- Installed and configured Hive and wrote Hive UDFs in Java and Python.
- Worked on installing and configuring EC2 instances on Amazon Web Services (AWS) to establish clusters in the cloud.
- Wrote shell and Python scripts to automate jobs.
Environment: Hadoop, MapReduce, HDFS, Hive, Java, Scala, Cassandra, Pig, Sqoop, Oozie, ZooKeeper, MySQL, HBase
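A minimal sketch of the Kafka-to-Spark-Streaming path described above, assuming the spark-streaming-kafka-0-10 direct-stream integration; the broker address, topic name, consumer group, and output path are placeholder values:

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.Map;

import org.apache.kafka.clients.consumer.ConsumerRecord;
import org.apache.kafka.common.serialization.StringDeserializer;
import org.apache.spark.SparkConf;
import org.apache.spark.streaming.Durations;
import org.apache.spark.streaming.api.java.JavaDStream;
import org.apache.spark.streaming.api.java.JavaInputDStream;
import org.apache.spark.streaming.api.java.JavaStreamingContext;
import org.apache.spark.streaming.kafka010.ConsumerStrategies;
import org.apache.spark.streaming.kafka010.KafkaUtils;
import org.apache.spark.streaming.kafka010.LocationStrategies;

public class EventStream {
    public static void main(String[] args) throws InterruptedException {
        SparkConf conf = new SparkConf().setAppName("EventStream");
        JavaStreamingContext jssc = new JavaStreamingContext(conf, Durations.seconds(10));

        Map<String, Object> kafkaParams = new HashMap<>();
        kafkaParams.put("bootstrap.servers", "broker1:9092"); // placeholder broker
        kafkaParams.put("key.deserializer", StringDeserializer.class);
        kafkaParams.put("value.deserializer", StringDeserializer.class);
        kafkaParams.put("group.id", "event-stream");

        // Direct stream from the (placeholder) "events" topic.
        JavaInputDStream<ConsumerRecord<String, String>> stream =
                KafkaUtils.createDirectStream(
                        jssc,
                        LocationStrategies.PreferConsistent(),
                        ConsumerStrategies.<String, String>Subscribe(
                                Arrays.asList("events"), kafkaParams));

        // Write each micro-batch to HDFS for downstream Hive/Pig processing.
        JavaDStream<String> values = stream.map(ConsumerRecord::value);
        values.foreachRDD((rdd, time) ->
                rdd.saveAsTextFile("hdfs:///data/events/batch-" + time.milliseconds()));

        jssc.start();
        jssc.awaitTermination();
    }
}
```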
Confidential, Palo Alto, CA
Java/Spring Developer
Responsibilities:
- Developed the application using the Spring Framework, which leverages the classical Model View Controller (MVC) architecture. UML diagrams such as use cases, class diagrams, interaction diagrams (sequence and collaboration), and activity diagrams were used.
- Gathered business requirements and wrote functional specifications and detailed design documents
- Extensively used Core Java, Servlets, JSP and XML
- Designed the logical and physical data model, generated DDL scripts, and wrote DML scripts for Oracle database
- Developed the EJB tier using the Session Facade, Singleton, and DAO design patterns to encapsulate business logic and database access functions.
- Created Spring beans and configured autowiring via XML and annotations.
- Implemented an enterprise logging service using JMS and Apache CXF.
- Developed unit test cases and used JUnit for unit testing of the application.
- Involved in design process using UML & RUP (Rational Unified Process).
- Extensively used SQL queries, PL/SQL stored procedures, and triggers to retrieve and update information in the Oracle database via JDBC.
- Used aspect-oriented programming (AOP) with J2SE dynamic proxies and CGLIB.
- Wrote Hibernate Query Language (HQL) and JPQL queries and tuned them for better performance.
- Developed RESTful web services with JAX-RS and XML to obtain quotes and receive quote updates, customer information, status updates, and confirmations (see the sketch below).
- Used SVN Version Control for Project Configuration Management.
- Wrote build files using Ant and used Maven in conjunction with Ant to manage builds.
- Ran nightly builds to deploy the application to different servers.
- Extensively worked in a UNIX environment.
Environment: Java, Spring Core, REST web services, JMS, JDK, AJAX, SAX, Git, Maven, Jenkins, JUnit, XML, UML
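A minimal sketch of a JAX-RS resource of the kind described above, returning XML via JAXB; the Quote payload, its fields, and the stubbed values are hypothetical, and a real implementation would delegate to a Spring-managed service bean:

```java
import javax.ws.rs.GET;
import javax.ws.rs.Path;
import javax.ws.rs.PathParam;
import javax.ws.rs.Produces;
import javax.ws.rs.core.MediaType;
import javax.xml.bind.annotation.XmlRootElement;

// Illustrative payload; JAXB marshals public fields to XML.
@XmlRootElement
class Quote {
    public String id;
    public String status;
    public double amount;
}

// Resource exposing quote lookup over REST.
@Path("/quotes")
public class QuoteResource {

    @GET
    @Path("/{id}")
    @Produces(MediaType.APPLICATION_XML)
    public Quote getQuote(@PathParam("id") String id) {
        // Stubbed lookup with placeholder values.
        Quote quote = new Quote();
        quote.id = id;
        quote.status = "CONFIRMED";
        quote.amount = 100.0;
        return quote;
    }
}
```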
Confidential
Java Developer
Responsibilities:
- Developed, implemented, and maintained an asynchronous, AJAX-based rich client for an improved customer experience using XML data and XSLT templates.
- Used the O/R mapping tool Hibernate for database interaction; involved in writing Hibernate queries and Hibernate-specific configuration and mapping files.
- Developed SQL stored procedures and prepared statements for updating and accessing data from database.
- Implemented the presentation layer with HTML, CSS and JavaScript.
- Developed web components using JSP, Servlets and JDBC
- Implemented secured cookies using Servlets.
- Involved in developing JSP pages and custom tags for the presentation layer in the Spring framework.
- Wrote complex SQL queries and stored procedures.
- Implemented the persistence layer using the Hibernate API.
- Implemented transaction and session handling using Hibernate utilities.
- Implemented search queries using the Hibernate Criteria interface (see the sketch below).
- Used AGILE methodology for developing the application.
- Involved in writing validation rule classes for general server-side validation, implementing validation rules as part of the Observer J2EE design pattern.
Environment: Java, Servlets, JSP, Hibernate, JUnit, Oracle DB, SQL
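A minimal sketch of a Hibernate Criteria search as referenced above; the Customer entity and its properties are illustrative stand-ins for the project's mapped classes:

```java
import java.util.List;

import org.hibernate.Criteria;
import org.hibernate.Session;
import org.hibernate.criterion.Order;
import org.hibernate.criterion.Restrictions;

// Minimal stand-in for a mapped entity; the real mapping lives in
// annotations or hbm.xml configuration.
class Customer {
    private Long id;
    private String city;
    private String lastName;
    // getters/setters omitted for brevity
}

public class CustomerSearch {

    // Criteria-based search: filter by city, order by last name.
    @SuppressWarnings("unchecked")
    public static List<Customer> findByCity(Session session, String city) {
        Criteria criteria = session.createCriteria(Customer.class)
                .add(Restrictions.eq("city", city))
                .addOrder(Order.asc("lastName"));
        return criteria.list();
    }
}
```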
Confidential
Web Developer
Responsibilities:
- Designed and developed UI using JSP, dynamic JSP and page validations using JavaScript.
- Developed websites using PHP and used jQuery for plug-ins.
- Created Forms for user login and register using HTML, CSS and JavaScript
- Implemented a search engine on the website using AJAX.
- Created databases for the websites in MySQL
- Used 'mailto' functionality on the Contact Us page to receive messages from users.
- Maintained the website and managed regular updates from the client.
- Used templates and designed websites with content management systems (CMS) such as Joomla.
- Involved in performance tuning, debugging issues in testing and deployment phases.
- Utilized PL/SQL for querying the database.
- Developed customer care related web pages using JSP, JSP tags, and Servlets (see the sketch below).
Environment: HTML, CSS, PHP, JSP, JavaScript, DHTML, jQuery, AJAX, MySQL, Joomla.
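A minimal sketch of a servlet-backed JSP page as referenced above; the servlet name, request parameter, and JSP path are hypothetical:

```java
import java.io.IOException;

import javax.servlet.ServletException;
import javax.servlet.http.HttpServlet;
import javax.servlet.http.HttpServletRequest;
import javax.servlet.http.HttpServletResponse;

public class CustomerCareServlet extends HttpServlet {

    @Override
    protected void doGet(HttpServletRequest req, HttpServletResponse resp)
            throws ServletException, IOException {
        // Pass a request parameter to the JSP view, then forward to it.
        req.setAttribute("customerName", req.getParameter("name"));
        req.getRequestDispatcher("/WEB-INF/jsp/customerCare.jsp")
           .forward(req, resp);
    }
}
```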