- 15+ years of total experience in enterprise application programming using Spark, Hadoop, MapReduce, microservices, Scala, Java, and J2EE.
- 8+ years of recent work experience in Telecom, Entertainment, and E-commerce, including 3+ years of hands-on experience in Big Data analytics and Hadoop development.
- Expertise in developing applications that integrate Apache Kafka with the Spark Streaming API, including configuring brokers, ZooKeeper, and topics.
- Excellent knowledge of Kafka broker architecture, topics, partitions, and partition management, and of processing DStreams with batching, windowing, and checkpointing.
- Working knowledge of NiFi integrated with Kafka.
- More than 3 years of expertise in developing end-to-end big data solutions with Spark, HDFS, HBase, Kafka, Hive, YARN, Flume, and Sqoop.
- In-depth knowledge of the Hadoop 2.0 architecture and YARN for designing and developing big data applications with cluster computing.
- Excellent knowledge of distributed storage (HDFS) and distributed processing for real-time and batch workloads (Spark Core, Spark SQL, Spark Streaming, Hive, HBase).
- Expertise in Hadoop components such as HDFS, NameNode, DataNode, NodeManager, and ResourceManager (YARN).
- Hands-on experience using Hive partitioning and bucketing, and writing ad hoc HiveQL queries to store data in HDFS and analyze data from HDFS.
- Experienced in using ZooKeeper and Oozie operational services for coordinating clusters and scheduling workflows.
- Solid understanding of using Apache Flume for collecting, aggregating, and moving large amounts of data from application servers to Hadoop.
- Proficient in writing transformations using Spark Core and Spark SQL with the Dataset/DataFrame API in Scala.
- Hands-on experience importing and exporting data between databases such as MySQL, Oracle, and Teradata and HDFS using Sqoop.
- Proficiency in manipulating/analyzing large datasets, deriving patterns or insights within structured and unstructured data.
- Strong experience with Hadoop distributions like Confidential and Cloudera.
- Hands-on experience with NoSQL databases such as HBase and Cassandra, including writing applications on them.
- Experience in writing and executing MapReduce and Spark programs that work with different file formats like text, CSV, XML, JSON, Parquet, and Avro.
- Experience in working with Amazon Web Services (AWS) using EC2 for cloud computing.
- Familiarity with analytical tools like Zeppelin, Elasticsearch, and NiFi.
- Hands-on work experience in developing microservices using Spring Boot.
- Expert in Java development using J2EE, J2SE, Servlets, Oracle ATG, JSP, EJB, JDBC, SOAP, and RESTful web services.
- Experience with build tools like Ant and Maven, and with source control, CI, and supporting tools like SVN, Git (Stash and GitHub), Jenkins, Splunk, Sonar, TeamCity, and HP Fortify for security analysis.
- Experience with application servers like BEA WebLogic 10.3, IBM WebSphere 6.0, Tomcat 7.x, and JBoss 4.0.
- Worked with IDEs such as Eclipse, IntelliJ, Atom, Visual Studio, and Brackets.
- Executed projects in Agile Scrum (Rally), RUP, and Waterfall SDLC methodologies; familiar with XP and TDD.
- Drives issue resolution management.
Big Data Technologies: Hadoop, HDFS, Kafka, Spark, YARN, MapReduce, Sqoop, Hive, HBase, Cassandra, Oozie, Flume.
Query Languages: HiveQL, SQL, PL/SQL
Databases: MySQL, Oracle, DB2, SQL Server.
Build/CI Tools: Maven, Ant, Jenkins, SVN, Git, Stash, HP-Fortify Security, Splunk.
Frameworks: Spark, MapReduce, J2EE, Microservices, Hibernate, REST, Spring, Struts, ATG, ExtJS, Angular, jQuery
Programming Languages: Scala, Python, Java, JavaScript, TypeScript, HTML5
Cloud: AWS EC2
Operating Systems: Windows, Unix, Linux
IDE: Eclipse, IntelliJ, Atom, Visual Studio, Brackets
Confidential, Denver, CO
Sr. Big Data Developer
- Responsible for building scalable, distributed real-time and batch data processing solutions using the Hadoop 2 architecture.
- Integrated Kafka with Spark Streaming to analyze near-real-time data, with checkpointing for failure recovery and efficient storage processing.
- Responsible for developing streaming applications that integrate Kafka with Spark Streaming for web analytics and log processing.
- Good working knowledge of Kafka 0.10 and Spark Streaming components: brokers, topics, partitions, checkpointing, DStreams, ZooKeeper, the Producer API, and the Spark Streaming API, using Scala 2.11.
- Worked on a POC using NiFi to filter tweets related to Confidential.
- Developed Kafka producer applications for web analytics and log processing, publishing to Kafka topics both synchronously and asynchronously and/or loading Kafka through Flume, balancing high throughput, reliability, and low latency.
- Worked closely with the BI/DMG team, integrating Spark Streaming with Kafka to help them build reports using HBase and Tableau for the Omniture reporting dashboard and analytics.
- Managed jobs using the Fair Scheduler and developed job processing scripts using Oozie workflows.
- Developed Scala scripts and UDFs using the DataFrame/SQL/Dataset and RDD APIs in Spark 1.6 for data aggregation, queries, and writing data back to storage systems.
- Developed Spark scripts using Scala shell commands as required.
- Designed workflows and coordinators in Oozie to automate and parallelize Hive jobs on Cloudera's Apache Hadoop distribution (CDH 5.4.2).
- Involved in performance tuning of Spark applications: setting the right batch interval, the correct level of parallelism, and memory tuning.
- Handled large datasets using partitioning, Spark's in-memory capabilities, broadcast variables, efficient joins, and transformations and actions during the ingestion process itself.
- Worked on a 400-node cluster.
- Analyzed SQL scripts and designed a solution to implement them using the Scala API.
- Involved in creating Hive tables and in loading and analyzing data using Hive queries.
- Developed Hive queries to process the data and generate data cubes for visualization.
- Worked with data serialization formats, including Avro, JSON, XML, CSV, Parquet, and text, to convert complex objects into sequences of bits.
- Involved in loading and transforming large sets of structured, semi-structured, and unstructured data from relational databases into HDFS using Sqoop imports.
- Implemented partitioning, dynamic partitions, and bucketing in Hive.
- Collaborated with infrastructure, network, database, application and BI teams to ensure data quality and availability.
Environment: CDH 5.4.2, Hadoop 2.7.1, Spark 1.6.2, Spark 2.0, Scala 2.10/2.11, Sqoop 1.4.6, Hive 2.1.0, HBase 1.1.2, Oozie 4.2.0, spark-streaming-kafka-0-10_2.11, CentOS 6.4, Maven, Git/Stash, IntelliJ IDEA.
Confidential, Denver, CO
Lead Java/J2EE Developer
- As IT Principal Developer, responsible for working on projects with advanced client-side scripting in the context of best.
- Performed the lead role for onshore and offshore activities such as Agile Rally tracking, user story reviews, scrum meetings, LOE reviews, design reviews, and code reviews.
- Responsible for developing micro web services using Spring Boot.
- Responsible for continuous integration using Ant, Jenkins, Git, Stash, Splunk, HP Fortify (security), and Subversion sync, and for generating nightly build reports for secure-coding analysis.
- Responsible for LOE, analysis, unit testing (JUnit), code delivery, test support, and production support.
Environment: Agile (Rally), micro web services (Spring Boot), JDK 1.7 and J2EE (JSP, Ext JS 3.2.1, XML, DOM, JAXB, Servlets, CSS, HTML), SOAP 1.2 web services, Ant 1.6.5, LDAP, JDBC, JavaScript, DHTML/HTML, Oracle 10g, Log4j, Splunk, Jenkins, HP Fortify, WebLogic 10.3.3, IntelliJ 10.5.4, SVN, Git/Stash, HP Quality Center, Unix 4.0
Confidential, Denver, CO
- Implemented the entire project using the RUP methodology and TDD, delivering quality ROI for the client.
- Released design artifacts using UML.
- Developed model, view, and controller layers using JSPX, Struts Action controllers, form objects, and DTOs for communication with business logic.
- Developed a generic UI layout using Tiles; implemented tag library, validator, and upload packages in Apache Struts.
- Analyzed the business domain of the application for J2EE development work. Implemented a J2EE-based pluggable BRE (Business Rule Engine) using Drools (JBoss Rules framework).
- Developed message publisher and subscriber beans using the JMS API.
- Injected the message publisher and subscriber beans into the ufeed processes using the Spring Framework's inversion of control mechanism.
- Reworked the publish and subscribe mechanism completely on Spring JMS using JmsTemplate and the Spring message listener container.
- Implemented a scheduler using the Spring Framework's QuartzJobBean for customer and stream summarization.
- Implemented multi-connection FTP pooling, applying a core creational Java design pattern, to optimize memory usage.
- Implemented a Feed File utility class for input stream readers and writers across the entire project using the Java NIO package.
- Involved in designing and developing the Java-based object-relational persistence layer with Hibernate 3.x to persist business data in an Oracle database, together with the J2EE-based Spring Framework.
- The entire application was developed, compiled, packaged, and deployed using Eclipse 3.4, Maven 2, and the Drools plugin.
- Assisted in the testing phase of the project (development testing, unit testing, system testing, and integration testing).
- Involved in writing Perl scripts for upstream modules and in documentation activities such as preparing overviews and clarifications for each module.
- Coding and code reviews.
- Involved in configuration management activities using Hudson.
- Responsible for project execution, project environment setup, software installations, development, testing, and coordinating with the team.
- Responsible for generating build scripts using Maven 2.
Environment: JDK 1.5, Eclipse 3.3, Struts 1.2, Spring IoC, Spring AOP, Spring JMS, Drools, Hibernate, ActiveMQ, UNIX, Maven 2, Subversion, Apache Tomcat 5.6, Oracle 9i