
Sr. Hadoop / Spark Developer Resume


Washington, DC

SUMMARY

  • 8+ years of overall IT experience in a variety of industries, including 3+ years of hands-on experience in Big Data analytics and development.
  • Expertise with tools in the Hadoop ecosystem, including Pig, Hive, HDFS, MapReduce, Sqoop, Storm, Spark, Kafka, YARN, Oozie, and ZooKeeper.
  • Excellent knowledge of Hadoop ecosystem components such as HDFS, Resource Manager, Node Manager, Name Node, Data Node, and the MapReduce programming paradigm.
  • Experience in designing and developing applications in Spark using Scala to compare the performance of Spark with Hive and SQL/Oracle (a minimal sketch of such a comparison follows this list).
  • Experience in analyzing large datasets and finding patterns and insights within structured and unstructured data.
  • Strong experience with Hadoop distributions such as Cloudera, MapR, and Hortonworks.
  • Good understanding of NoSQL databases and hands-on experience writing applications on NoSQL databases such as HBase and Cassandra.
  • Experience in migrating data using Sqoop from HDFS to relational database systems and vice versa, according to client requirements.
  • Extensive experience importing and exporting data using stream processing platforms such as Flume and Kafka.
  • Experienced in writing complex MapReduce programs that work with different file formats such as Text, Sequence, XML, and JSON.
  • Experience with the Oozie workflow scheduler to manage Hadoop jobs as Directed Acyclic Graphs (DAGs) of actions with control flows.
  • Experienced in working with Amazon Web Services (AWS), using EMR for computing and S3 for storage.
  • Strong experience in Object-Oriented Design, Analysis, Development, Testing and Maintenance.
  • Key participant in all phases of the software development life cycle, including analysis, design, development, integration, implementation, debugging, and testing of software applications in client-server environments, object-oriented technology, and web-based applications.
  • Excellent Java development skills using J2EE, J2SE, Servlets, JSP, EJB, JDBC, SOAP and RESTful web services.
  • Excellent implementation knowledge of Enterprise/Web/Client Server using Java, J2EE.
  • Strong experience with data warehousing ETL concepts using Informatica PowerCenter, OLAP, OLTP, and Control-M.
  • Experience in database design using PL/SQL to write stored procedures, functions, and triggers, and strong experience writing complex queries for Oracle.
  • Experienced in using agile approaches, including Extreme Programming, Test-Driven Development and Agile Scrum.
  • Worked in large and small teams on systems requirements, design, and development.
  • Experience using various IDEs (Eclipse, IntelliJ) and repositories (SVN, Git).
  • Experience using the build tools Ant and Maven.
  • Prepared standard coding guidelines and analysis and testing documentation.
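
One bullet above mentions comparing Spark's performance against Hive and SQL/Oracle. As a minimal sketch of what such a comparison can look like in Spark with Scala, the snippet below runs the same aggregate against a Hive table and against Oracle over JDBC; the table name, JDBC URL, and credentials are placeholders, not details from any actual engagement.

```scala
import org.apache.spark.sql.SparkSession

object ClaimsCompare {
  def main(args: Array[String]): Unit = {
    // enableHiveSupport lets the same SparkSession query existing Hive tables.
    val spark = SparkSession.builder()
      .appName("spark-vs-hive-comparison")
      .enableHiveSupport()
      .getOrCreate()

    // Aggregate computed by Spark over a Hive table (table name is hypothetical).
    val fromHive = spark.sql(
      "SELECT region, COUNT(*) AS n FROM claims GROUP BY region")

    // The same aggregate pushed down to Oracle over JDBC for comparison
    // (URL, credentials, and the subquery alias are placeholders).
    val fromOracle = spark.read.format("jdbc")
      .option("url", "jdbc:oracle:thin:@//dbhost:1521/ORCL")
      .option("dbtable", "(SELECT region, COUNT(*) n FROM claims GROUP BY region) t")
      .option("user", "etl_user")
      .option("password", "****")
      .load()

    fromHive.show()
    fromOracle.show()
    spark.stop()
  }
}
```

Timing each side (for example with spark.time or wall-clock logging) gives the head-to-head numbers such a comparison is after.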

TECHNICAL SKILLS

Big Data/Hadoop Technologies: HDFS, YARN, MapReduce, Hive, Pig, Impala, Sqoop, Flume, Spark, Kafka, ZooKeeper, and Oozie

NoSQL Databases: HBase, Cassandra

Languages: C, Java, Scala, SQL, PL/SQL, Pig Latin, HiveQL, Shell Scripting

Java & J2EE Technologies: Core Java, Servlets, Hibernate, Spring, Struts, JMS, EJB, RESTful

Application Servers: WebLogic, WebSphere

Cloud Computing Tools: Amazon AWS

Databases: Microsoft SQL Server, MySQL, Oracle

Operating Systems: UNIX, Linux, Windows

Build Tools: Jenkins, Maven, ANT

Business Intelligence Tools: Tableau, Splunk

Development Tools: Microsoft SQL Studio, Eclipse, IntelliJ

Development Methodologies: Agile/Scrum, Waterfall

Version Control Tools: Git, SVN

PROFESSIONAL EXPERIENCE

Confidential - Washington, DC

Sr. Hadoop / Spark Developer

Responsibilities:

  • Managed the full project life cycle, from initiation through implementation, including gathering requirements, creating tasks in Jira, defining scope and prioritization, development and testing, approvals, troubleshooting, and production support.
  • Coordinated the design and implementation of a scalable big data ETL solution for the Payment Integrity Compass (PIC) product under Revenue Cycle Solution, using Apache Spark with Scala and an Oracle database to analyze healthcare datasets in the terabyte range.
  • Improved PIC performance by reverse engineering 10+ year old cursor-style PL/SQL and migrating it to big data batch processing in Spark (a sketch of this pattern follows this list).
  • Developed ETL code using Apache Spark 1.6.3 with Scala, later upgraded to Spark 2.1.0.
  • Wrote and executed unit tests and integration tests using the ScalaTest framework to ensure software quality.
  • Worked with the QA team in creating and implementing test scenarios for end-to-end system testing using built-in internal products.
  • Initiated and implemented report generation for PIC application analysis after each nightly processing job using Spark and Zeppelin, providing a better visual representation of application performance.
  • Designed, developed, and maintained data pipelines in a Hadoop and RDBMS environment with both traditional and non-traditional source systems.
  • Implemented a Kafka event log producer to publish logs to a Kafka topic, generating reports and alerts on Spark application performance from driver logs with Splunk.
  • Worked with the AWS team in testing the Apache Spark ETL application on EMR/EC2 using S3.
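
The cursor-to-batch migration mentioned above follows a common pattern: logic a PL/SQL cursor loop applied row by row is re-expressed as set-based Spark transformations over the whole dataset. The sketch below illustrates the pattern only; the JDBC URL, table, and column names are hypothetical, not the actual PIC schema.

```scala
import org.apache.spark.sql.SparkSession
import org.apache.spark.sql.functions._

object PaymentBatch {
  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder().appName("pic-batch-etl").getOrCreate()

    // Source table read over JDBC; connection details are placeholders.
    val payments = spark.read.format("jdbc")
      .option("url", "jdbc:oracle:thin:@//dbhost:1521/ORCL")
      .option("dbtable", "payments")
      .option("user", "etl_user")
      .option("password", "****")
      .load()

    // What the cursor loop computed per row becomes declarative column
    // expressions over the whole dataset, executed in parallel by Spark.
    val variance = payments
      .filter(col("status") === "POSTED")
      .withColumn("variance", col("billed_amt") - col("paid_amt"))
      .groupBy("payer_id")
      .agg(sum("variance").as("total_variance"))

    variance.write.mode("overwrite").parquet("/data/pic/variance")
    spark.stop()
  }
}
```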

Environment: Hadoop YARN, Spark Core, Spark Streaming, Spark SQL, Scala, PL/SQL, Kafka, StreamSets, Amazon AWS, Tableau, Oozie, Informatica, Splunk, Cloudera, Oracle 12c, Linux

Confidential - San Diego, CA

Hadoop Developer

Responsibilities:

  • Worked on analyzing data using different big data analytics tools, including Pig, Hive, and Spark.
  • Worked on installing and configuring ZooKeeper to coordinate and monitor cluster resources.
  • Created Pig Latin scripts to sort, group, join, and filter the enterprise-wide data.
  • Executed Hive queries on tables stored in Avro format to perform data analysis and meet business requirements.
  • Implemented partitioning, dynamic partitions, and buckets in Hive.
  • Extended Hive and Pig core functionality with custom User Defined Functions (UDFs), User Defined Table-Generating Functions (UDTFs), and User Defined Aggregate Functions (UDAFs) written in Java (a UDF sketch follows this list).
  • Implemented Kafka consumers to store data into HDFS and queried it by creating Hive tables on top of it.
  • Implemented data integrity and data quality checks in Hadoop using Hive and Linux scripts.
  • Used the Snappy compression technique to compress files before loading them into Hive.
  • Experienced in performing CRUD operations in HBase.
  • Created HBase tables to store Avro data from different portfolios and queried on top of them.
  • Involved in scheduling the Oozie workflow engine to run multiple Hive and Pig jobs.
  • Imported data from external sources (MS SQL Server, Teradata) into Hadoop using Sqoop and the Teradata connector; experienced in running Hadoop streaming jobs to process terabytes of data.
  • Worked with MS SQL Server for storing metadata and performing lookup requests.
  • Worked on automating the process for efficient testing of the application.
  • Performed a POC implementing Apache Spark to evaluate the benefits of adopting it in the project.
  • Expertise in platform-related Hadoop production support tasks, analyzing job logs.
  • Responsible for continuous build/integration with Jenkins and deploying applications into production using XL Deploy.
  • Actively involved in code review and bug fixing to improve performance.
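
The UDFs above were written in Java, per the bullet; since a Hive UDF is just a JVM class, an equivalent minimal sketch is shown here in Scala to keep one language across these examples. The class name and masking rule are illustrative only.

```scala
import org.apache.hadoop.hive.ql.exec.UDF
import org.apache.hadoop.io.Text

// Minimal Hive UDF: masks all but the last four digits of an ID string.
// After packaging into a jar, register it from Hive with:
//   ADD JAR mask_udf.jar;
//   CREATE TEMPORARY FUNCTION mask_id AS 'MaskId';
class MaskId extends UDF {
  def evaluate(input: Text): Text = {
    if (input == null) null
    else new Text(input.toString.replaceAll("\\d(?=\\d{4})", "*"))
  }
}
```

Once registered, it can be called like any built-in function, e.g. SELECT mask_id(member_id) FROM members (table and column hypothetical).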

Environment: HDFS, Apache Hive, Pig, Spark, Solr, Sqoop, Core Java, shell scripting, HBase, Ambari, Hortonworks, NiFi, MS SQL Server 2012, Teradata, ZooKeeper, Git, Jenkins.

Confidential - Austin, TX

Hadoop Java Developer

Responsibilities:

  • Developed MapReduce jobs for data cleansing and data processing of flat files (a mapper sketch follows this list).
  • Responsible for importing flat files from external environments to ACC in Hadoop.
  • Designed, developed, and implemented the main flow component for the end-to-end data flow process within the platform.
  • Responsible for creating RESTful clients for consuming web service requests.
  • Involved in analysis and design for setting up an edge node per client requirements.
  • Created Pig Latin scripts to sort, group, join, and filter the enterprise-wide data.
  • Expertise in writing Hive scripts for comparing large datasets.
  • Expertise in performance optimization and memory tuning of MapReduce applications.
  • Used MS SQL Server for auditing purposes on the cluster.
  • Used the Gzip compression technique to compress files before loading them into Hive.
  • Responsible for the Hortonworks upgrade in both production and non-production environments.
  • Wrote shell scripts for data extraction and data cleansing to perform member-specific analytics.
  • Coordinated with the administrator team to analyze MapReduce job performance and resolve cluster-related issues.
  • Expertise in platform-related Hadoop production support tasks, analyzing job logs.
  • Transferred data from external MS SQL Server sources to Hadoop using Sqoop.
  • Coordinated with different teams to determine root causes and took steps to resolve them.
  • Responsible for continuous integration with Jenkins and deploying applications into production using XL Deploy.
  • Managed and reviewed Hadoop log files to identify issues when jobs fail and find the root cause.
  • Used Jira to provide application support for existing clients.
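
As a minimal sketch of the data-cleansing MapReduce work described above: a mapper that drops malformed rows from a pipe-delimited flat file. The delimiter, expected field count, and class name are assumptions for illustration.

```scala
import org.apache.hadoop.io.{LongWritable, NullWritable, Text}
import org.apache.hadoop.mapreduce.Mapper

// Emits only rows that have the expected number of fields and a
// non-empty key column; everything else is silently dropped.
class CleansingMapper extends Mapper[LongWritable, Text, NullWritable, Text] {
  private val ExpectedFields = 12 // placeholder for the real record width

  override def map(key: LongWritable, value: Text,
                   context: Mapper[LongWritable, Text, NullWritable, Text]#Context): Unit = {
    val fields = value.toString.split("\\|", -1).map(_.trim)
    if (fields.length == ExpectedFields && fields(0).nonEmpty)
      context.write(NullWritable.get(), new Text(fields.mkString("|")))
  }
}
```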

Environment: Hadoop YARN, MapReduce, Hive, Pig, Java, Shell Scripting, REST web services, MySQL, Hortonworks, Control-M, Agile, Git, Jira.

Confidential

Sr. Java/J2EE Developer

Responsibilities:

  • Involved in Requirement Analysis, Design, Development and Testing of the risk workflow system.
  • Involved in the implementation of design using vital phases of the Software development life cycle (SDLC) that includes Development, Testing, Implementation and Maintenance Support.
  • Applied OOAD principle for the analysis and design of the system.
  • Implemented XML Schemas as part of the XQuery query language.
  • Applied J2EE design patterns such as Singleton, Business Delegate, Service Locator, Data Transfer Object (DTO), Data Access Object (DAO), and Adapter during the development of components (a DAO/DTO sketch follows this list).
  • Used RAD for the Development, Testing and Debugging of the application.
  • Used Websphere Application Server to deploy the build.
  • Developed front-end screens using Struts, JSP, HTML, AJAX, jQuery, JavaScript, JSON, and CSS.
  • Used J2EE for the development of business layer services.
  • Developed Struts Action Forms, Action classes and performed action mapping using Struts.
  • Performed data validation in Struts Form beans and Action Classes.
  • Developed a POJO-based programming model using the Spring framework.
  • Used the Inversion of Control (IoC) pattern and dependency injection in the Spring framework for wiring and managing business objects.
  • Used Web Services to connect to mainframe for the validation of the data.
  • Used SOAP as the protocol to send requests and responses in the form of XML messages.
  • Used the JDBC framework to connect the application to the database.
  • Used Eclipse for the Development, Testing and Debugging of the application.
  • Used the Log4j framework for logging debug, info, and error data.
  • Used the Hibernate framework for object-relational mapping.
  • Used Oracle 10g database for data persistence and SQL Developer was used as a database client.
  • Extensively worked on Windows and UNIX operating systems.
  • Used SecureCRT to transfer file from local system to UNIX system.
  • Performed Test Driven Development (TDD) using JUnit.
  • Used Ant script for build automation.
  • Used the SVN version control system to check in and check out developed artifacts; the version control system was integrated with the Eclipse IDE.
  • Used Rational ClearQuest for defect logging and issue tracking.
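
Of the patterns listed above, DTO and DAO are the most mechanical to illustrate. The sketch below is a minimal Scala rendering over plain JDBC; the table, columns, and connection details are placeholders.

```scala
import java.sql.DriverManager

// DTO: an immutable carrier for data crossing layer boundaries.
case class Account(id: Long, owner: String)

// DAO: hides JDBC plumbing behind a narrow, testable interface.
class AccountDao(url: String, user: String, password: String) {
  def findById(id: Long): Option[Account] = {
    val conn = DriverManager.getConnection(url, user, password)
    try {
      val stmt = conn.prepareStatement(
        "SELECT id, owner FROM accounts WHERE id = ?")
      stmt.setLong(1, id)
      val rs = stmt.executeQuery()
      if (rs.next()) Some(Account(rs.getLong("id"), rs.getString("owner")))
      else None
    } finally conn.close()
  }
}
```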

Environment: Windows XP, RAD 7.0, Core Java, J2EE, Struts, Spring, Hibernate, Web Services, Design Patterns, WebSphere, Ant, Servlets, JSP, HTML, AJAX, JavaScript, CSS, jQuery, JSON, SOAP, WSDL, XML, Eclipse, Agile, Jira, Oracle 10g, WinSCP, Log4j, JUnit.

Confidential

Java/J2EE Developer

Responsibilities:

  • Designed and developed the application using agile methodology.
  • Implemented new modules and change requests, and fixed defects identified in pre-production and production environments.
  • Wrote technical design document with class, sequence, and activity diagrams in each use case.
  • Created Wiki pages using Confluence Documentation.
  • Developed various reusable helper and utility classes which were used across all modules of the application.
  • Involved in developing XML compilers using XQuery.
  • Developed the application using the Spring MVC framework by implementing Controller and Service classes (a sketch follows this list).
  • Involved in writing the Spring configuration XML file containing bean declarations and the declarations of their dependent objects.
  • Used Hibernate as the persistence framework; created DAOs and used Hibernate for ORM mapping.
  • Wrote Java classes to test the UI and web services through JUnit.
  • Performed functional and integration testing and was extensively involved in critical release/deployment activities. Responsible for designing rich user interface applications using JSP, JSP tag libraries, Spring tag libraries, JavaScript, CSS, and HTML.
  • Used SVN for version control; Log4j was used to log both user-interface and domain-level messages.
  • Used SoapUI for testing the web services.
  • Used Maven for dependency management and project structure.
  • Created deployment documents for various environments such as Test, QC, and UAT.
  • Involved in system wide enhancements supporting the entire system and fixing reported bugs.
  • Explored Spring MVC, Spring IOC, Spring AOP, and Hibernate in creating the POC.
  • Performed data manipulation on the front end using JavaScript and JSON.
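
A minimal sketch of the Controller/Service split mentioned above, written in Scala to stay consistent with the other examples (Spring works with any JVM language). The bean names are illustrative, and @GetMapping is the later Spring 4.3+ shorthand; the project-era equivalent would be @RequestMapping.

```scala
import org.springframework.beans.factory.annotation.Autowired
import org.springframework.stereotype.{Controller, Service}
import org.springframework.web.bind.annotation.{GetMapping, PathVariable, ResponseBody}

// Service bean: holds the business logic.
@Service
class GreetingService {
  def greet(name: String): String = s"Hello, $name"
}

// Controller bean: handles the HTTP request and delegates to the
// service, which Spring injects through the constructor.
@Controller
class GreetingController @Autowired() (service: GreetingService) {
  @GetMapping(Array("/greet/{name}"))
  @ResponseBody
  def greet(@PathVariable("name") name: String): String = service.greet(name)
}
```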

Environment: Java, J2EE, JSP, Spring, Hibernate, CSS, JavaScript, Oracle, JBoss, Maven, Eclipse, JUnit, Log4j, AJAX, RESTful web services, JNDI, JMS, HTML, XML, XSD, XML Schema, SVN, Git.
