- 8+ years of professional experience in Requirements Analysis, Design, Development and Implementation of Java, J2EE and Big Data technologies.
- 4+ years of exclusive experience in Big Data technologies and Hadoop ecosystem components like Spark, MapReduce, Hive, Pig, YARN, HDFS, Sqoop, Flume, Kafka and NoSQL systems like HBase, Cassandra.
- Strong knowledge of distributed systems architecture and parallel processing, with an in-depth understanding of the MapReduce framework and the Spark execution framework.
- Expertise in writing end-to-end data processing jobs to analyze data using MapReduce, Spark and Hive.
- Extensive experience working with structured data using HiveQL and join operations, writing custom UDFs, and optimizing Hive queries.
- Experience using various Hadoop Distributions (Cloudera, Hortonworks, Amazon AWS) to fully implement and leverage new Hadoop features.
- Proficient in using Cloudera Manager, an end-to-end tool to manage Hadoop operations in Cloudera Cluster.
- Experience with Apache Flume and Kafka for collecting, aggregating and moving large volumes of data from sources such as web servers and telnet sources.
- Extensive experience in writing Pig scripts to transform raw data from several data sources into baseline data.
- Extensive experience in importing and exporting data between RDBMS and the Hadoop ecosystem using Apache Sqoop.
- Worked with the Java HBase API to ingest processed data into HBase tables.
- Strong experience in working with UNIX/LINUX environments, writing shell scripts.
- Good knowledge of and hands-on experience with the real-time streaming technologies Spark and Kafka.
- Experience in optimizing MapReduce jobs using Combiners and Partitioners to deliver the best results (a sketch follows this summary).
- Very good understanding of partitioning and bucketing concepts in Hive; designed both managed and external tables in Hive to optimize performance.
- Extensive experience working with semi-structured and unstructured data by implementing complex MapReduce programs using design patterns.
- Sound knowledge of J2EE architecture, design patterns and object modeling using various J2EE technologies and frameworks.
- Adept at creating Unified Modeling Language (UML) diagrams such as Use Case diagrams, Activity diagrams, Class diagrams and Sequence diagrams using Rational Rose and Microsoft Visio.
- Experienced in using Agile methodologies, including Extreme Programming, Scrum and Test Driven Development (TDD).
- Proficient in integrating and configuring the object-relational mapping tool Hibernate in J2EE applications, along with other open-source frameworks such as Struts and Spring.
- Experience in building and deploying web applications on multiple application servers and middleware platforms, including WebLogic, WebSphere, Apache Tomcat and JBoss.
- Experience in writing test cases in Java Environment using JUnit.
- Hands on experience in development of logging standards and mechanism based on Log4j.
- Experience in building, deploying and integrating applications with ANT and Maven.
- Good knowledge of Web Services (SOAP, WSDL), XML parsers such as SAX and DOM, and front-end technologies including AngularJS and responsive design with Bootstrap.
- Demonstrated technical expertise, organization and client service skills in various projects undertaken.
- Strong commitment to organizational work ethics, value based decision-making and managerial skills.
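A minimal sketch of the Combiner/Partitioner optimization referenced above, as a hypothetical word-count style job; the class names and the first-character partitioning scheme are illustrative, not taken from a specific project:

```java
import java.io.IOException;
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.IntWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Job;
import org.apache.hadoop.mapreduce.Mapper;
import org.apache.hadoop.mapreduce.Partitioner;
import org.apache.hadoop.mapreduce.Reducer;
import org.apache.hadoop.mapreduce.lib.input.FileInputFormat;
import org.apache.hadoop.mapreduce.lib.output.FileOutputFormat;

public class TokenCountJob {

    public static class TokenMapper extends Mapper<Object, Text, Text, IntWritable> {
        private static final IntWritable ONE = new IntWritable(1);
        private final Text word = new Text();

        @Override
        protected void map(Object key, Text value, Context ctx)
                throws IOException, InterruptedException {
            for (String token : value.toString().split("\\s+")) {
                word.set(token);
                ctx.write(word, ONE);                    // emit (token, 1)
            }
        }
    }

    // The reducer doubles as the combiner: summation is associative, so
    // map-side pre-aggregation cuts the data shuffled to the reducers.
    public static class SumReducer extends Reducer<Text, IntWritable, Text, IntWritable> {
        @Override
        protected void reduce(Text key, Iterable<IntWritable> values, Context ctx)
                throws IOException, InterruptedException {
            int sum = 0;
            for (IntWritable v : values) sum += v.get();
            ctx.write(key, new IntWritable(sum));
        }
    }

    // Custom partitioner: spread keys across reducers by first character.
    public static class FirstCharPartitioner extends Partitioner<Text, IntWritable> {
        @Override
        public int getPartition(Text key, IntWritable value, int numPartitions) {
            if (key.getLength() == 0) return 0;
            return (key.charAt(0) & Integer.MAX_VALUE) % numPartitions;
        }
    }

    public static void main(String[] args) throws Exception {
        Job job = Job.getInstance(new Configuration(), "token count");
        job.setJarByClass(TokenCountJob.class);
        job.setMapperClass(TokenMapper.class);
        job.setCombinerClass(SumReducer.class);          // map-side pre-aggregation
        job.setPartitionerClass(FirstCharPartitioner.class);
        job.setReducerClass(SumReducer.class);
        job.setOutputKeyClass(Text.class);
        job.setOutputValueClass(IntWritable.class);
        FileInputFormat.addInputPath(job, new Path(args[0]));
        FileOutputFormat.setOutputPath(job, new Path(args[1]));
        System.exit(job.waitForCompletion(true) ? 0 : 1);
    }
}
```

Reusing the reducer as the combiner is safe here only because addition is associative and commutative; a non-associative aggregation would need a separate combiner class.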
Big Data Ecosystem: Hadoop, MapReduce, YARN, HDFS, HBase, ZooKeeper, Hive, Hue, Pig, Sqoop, Cassandra, Spark, Oozie, Storm, Flume, Talend, Cloudera Manager, Amazon AWS, Hortonworks clusters
Languages: C, Java, Scala, Python, SQL, PL/SQL, HiveQL, Pig Latin
Java/J2EE & Web Technologies: J2EE, EJB, JSF, Servlets, JSP, JSTL, HTML, XHTML, CSS, XML, AngularJS, AJAX
Frameworks: Struts, Spring 3.x, Hibernate (ORM), JPA, JDBC
Web Services: SOAP, RESTful, JAX-WS
Web/Application Servers: WebLogic, WebSphere, Apache Tomcat
Scripting Languages: Shell scripting, JavaScript
Databases: Oracle 9i/10g, Microsoft SQL Server, MySQL, DB2, Teradata, MongoDB, Cassandra, HBase
Design: UML, Rational Rose, E-R Modeling, Microsoft Visio
IDE & Build Tools: Eclipse, NetBeans, ANT, Maven
Version Control Systems: CVS, SVN, GitHub
Confidential, Wilmington, DE
Sr. Hadoop Developer
Responsibilities:
- Implemented best practices for the full software development life cycle including coding standards, code reviews, source control management and build processes.
- Processed data ingested into HDFS using Sqoop and custom HDFS adaptors, analyzed it using Spark, Hive and MapReduce, and delivered summary results from Hadoop to downstream systems.
- Created and modified shell scripts to schedule data-cleansing scripts and the ETL loading process.
- Developed Spark applications to perform all the data transformations on User behavioral data coming from multiple sources.
- Configured Spark Streaming to receive real-time data from Kafka and store the stream data to HDFS using Scala, as a prototype (a Java sketch follows this project's Environment line).
- Implemented Spark applications using Scala and Spark SQL for faster testing and processing of data.
- Used Sqoop to import the data to Hadoop Distributed File System (HDFS) from RDBMS.
- Created components such as Hive UDFs to provide functionality missing from Hive for analytics.
- Worked on various performance optimizations, such as using the distributed cache for small datasets, partitioning and bucketing in Hive, and map-side joins.
- Used Oozie and Oozie coordinators to deploy end-to-end data processing pipelines and schedule the workflows.
Environment: HDFS, Hadoop 2.x, Pig, Hive, Sqoop, Flume, Spark, MapReduce, Scala, Oozie, Oracle 11g, YARN, UNIX Shell Scripting, Agile Methodology, JIRA, Cloudera 5.4
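Below is a minimal Java rendering of the Kafka-to-HDFS streaming prototype from this project (the original was in Scala); the topic name, ZooKeeper quorum and output path are placeholders, and the Spark 1.6-era receiver-based KafkaUtils.createStream API is assumed:

```java
import java.util.Collections;
import java.util.Map;
import org.apache.spark.SparkConf;
import org.apache.spark.api.java.JavaRDD;
import org.apache.spark.streaming.Durations;
import org.apache.spark.streaming.api.java.JavaDStream;
import org.apache.spark.streaming.api.java.JavaPairReceiverInputDStream;
import org.apache.spark.streaming.api.java.JavaStreamingContext;
import org.apache.spark.streaming.kafka.KafkaUtils;

public class KafkaToHdfs {
    public static void main(String[] args) throws InterruptedException {
        SparkConf conf = new SparkConf().setAppName("KafkaToHdfs");
        JavaStreamingContext jssc = new JavaStreamingContext(conf, Durations.seconds(10));

        // One receiver thread on the "events" topic (placeholder name).
        Map<String, Integer> topics = Collections.singletonMap("events", 1);
        JavaPairReceiverInputDStream<String, String> stream =
                KafkaUtils.createStream(jssc, "zk-host:2181", "hdfs-writer", topics);

        // Keep only the message payload and land each non-empty batch in HDFS.
        JavaDStream<String> lines = stream.map(record -> record._2());
        lines.foreachRDD((JavaRDD<String> rdd) -> {
            if (!rdd.isEmpty()) {
                rdd.saveAsTextFile("hdfs:///data/streams/events/" + System.currentTimeMillis());
            }
        });

        jssc.start();
        jssc.awaitTermination();
    }
}
```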
Confidential, Herndon, VA
Sr. Hadoop Developer
Responsibilities:
- Developed Big Data Solutions that enabled the business and technology teams to make data-driven decisions on the best ways to acquire customers and provide them business solutions.
- Worked on automating delta feeds from Teradata into Hive using Sqoop, as well as from FTP servers to Hive.
- Implemented Hive tables and HQL queries for the reports; wrote and used complex data types in Hive.
- Developed Hive queries to analyze reducer output data.
- Designed workflows by scheduling Hive processes for log-file data streamed into HDFS using Flume.
- Responsible for managing data coming from different sources.
- Developed MapReduce (YARN) programs to cleanse the data in HDFS obtained from heterogeneous data sources, making it suitable for ingestion into a Hive schema for analysis (a simplified sketch follows this project).
- Designed and developed MapReduce jobs to process data arriving in different file formats such as XML, CSV and JSON.
- Used Sqoop to import the data from RDBMS to Hadoop Distributed File System (HDFS) and later analyzed the imported data using Hadoop Components.
- Involved in loading and transforming large sets of Structured, Semi-Structured and Unstructured data and analyzed them by running Hive queries and Pig scripts.
- Implemented daily workflow for extraction, processing and analysis of data with Oozie.
Environment: Hadoop 2.x, Hive, HQL, HDFS, MapReduce, Sqoop, Flume, Oozie, Python, Java, Maven, Eclipse, Putty, Cloudera Manager 4 and CDH 4
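A simplified sketch of the cleansing step described above: a map-only job that rejects malformed delimited records and normalizes fields before Hive ingestion. The five-field layout and pipe delimiter are assumptions made for illustration:

```java
import java.io.IOException;
import org.apache.hadoop.io.LongWritable;
import org.apache.hadoop.io.NullWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Mapper;

public class RecordCleanserMapper
        extends Mapper<LongWritable, Text, NullWritable, Text> {

    private static final int EXPECTED_FIELDS = 5;    // assumed schema width
    private final Text cleaned = new Text();

    @Override
    protected void map(LongWritable key, Text value, Context ctx)
            throws IOException, InterruptedException {
        String[] fields = value.toString().split("\\|", -1);

        // Reject records that do not match the expected Hive schema,
        // counting them so the job report shows the rejection rate.
        if (fields.length != EXPECTED_FIELDS) {
            ctx.getCounter("cleanse", "malformed").increment(1);
            return;
        }

        // Normalize: trim whitespace and replace empty fields with \N,
        // Hive's default null marker for text tables.
        StringBuilder out = new StringBuilder();
        for (int i = 0; i < fields.length; i++) {
            String f = fields[i].trim();
            out.append(f.isEmpty() ? "\\N" : f);
            if (i < fields.length - 1) out.append('|');
        }
        cleaned.set(out.toString());
        ctx.write(NullWritable.get(), cleaned);
    }
}
```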
Confidential, Boston, MA
- Understood business needs, analyzed functional specifications and mapped them to the design and development of MapReduce programs and algorithms.
- Created Hive Tables, loaded transactional data from Teradata using Sqoop.
- Developed MapReduce jobs for cleaning, accessing and validating the data.
- Implemented Hive generic UDFs to incorporate business logic into Hive queries (a sketch follows this project's Environment line).
- Analyzed the web log data using HiveQL to extract the number of unique visitors per day, page views, visit duration and the most visited pages on the website.
- Wrote Hive jobs to parse the logs and structure them in tabular format to facilitate effective querying of the log data.
- Wrote Pig scripts to transform raw data from several data sources.
- Monitored workload, job performance and capacity planning using Cloudera Manager.
- Built applications using Maven and integrated them with Continuous Integration servers such as Jenkins to build jobs.
- Involved in End-to-End implementation of ETL logic.
- Performed data migration from legacy RDBMS databases to HDFS using Sqoop.
- Worked collaboratively with all levels of business stakeholders to architect, implement and test a Big Data analytical solution drawing on disparate sources.
- Involved in Agile methodologies, daily Scrum meetings, Sprint planning.
Environment: Hadoop 1.x, HDFS, MapReduce, Hive, Pig, HBase, Sqoop, Oozie, Maven, Shell Scripting, CDH3, Cloudera Manager
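A sketch of the kind of Hive generic UDF referenced above: this one trims and upper-cases a string column, returning NULL for NULL input. The function name and behavior are illustrative, not a specific production UDF:

```java
import org.apache.hadoop.hive.ql.exec.UDFArgumentException;
import org.apache.hadoop.hive.ql.exec.UDFArgumentLengthException;
import org.apache.hadoop.hive.ql.metadata.HiveException;
import org.apache.hadoop.hive.ql.udf.generic.GenericUDF;
import org.apache.hadoop.hive.serde2.objectinspector.ObjectInspector;
import org.apache.hadoop.hive.serde2.objectinspector.primitive.PrimitiveObjectInspectorFactory;
import org.apache.hadoop.hive.serde2.objectinspector.primitive.StringObjectInspector;
import org.apache.hadoop.io.Text;

public class NormalizeStringUDF extends GenericUDF {
    private StringObjectInspector inputOI;

    @Override
    public ObjectInspector initialize(ObjectInspector[] args) throws UDFArgumentException {
        if (args.length != 1) {
            throw new UDFArgumentLengthException("normalize_string takes one argument");
        }
        inputOI = (StringObjectInspector) args[0];   // assumes a string column
        return PrimitiveObjectInspectorFactory.writableStringObjectInspector;
    }

    @Override
    public Object evaluate(DeferredObject[] args) throws HiveException {
        Object arg = args[0].get();
        if (arg == null) return null;                // NULL in, NULL out
        String raw = inputOI.getPrimitiveJavaObject(arg);
        return new Text(raw.trim().toUpperCase());
    }

    @Override
    public String getDisplayString(String[] children) {
        return "normalize_string(" + children[0] + ")";
    }
}
```

After adding the JAR to the Hive session, the function would be registered with something like CREATE TEMPORARY FUNCTION normalize_string AS 'NormalizeStringUDF';.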
Confidential, Chicago, IL
Sr. Java Developer
Responsibilities:
- Created the design documents with use case and sequence diagrams in conjunction to developing the project framework.
- Used core Java concepts (Collections, multithreading, garbage collection and OOP) and APIs to encrypt and compress incoming requests, providing security and reducing the memory footprint.
- Followed Scrum development cycle to streamline process with iterative and incremental development.
- Developed JSP screens using Tiles, custom TagLibs, JSP templates and JSTL.
- Implemented the associated business modules integration using Spring and Hibernate.
- Developed various business components like Session Beans and managed database transactions.
- Developed the service framework functionality following the design patterns.
- Used JSF framework in developing user interfaces using JSF UI Components, Validator, Events.
- Supported JSF components with AJAX-enabled functionality and used Facelets for templates.
- Developed PL/SQL statements, including stored procedures and triggers, to implement the business logic.
- Implemented Spring service layer with dependency wiring, transaction, DAO and annotations.
- Performed code review, unit testing, functional testing, system testing and integration testing.
- Implemented the Log4j logging framework and custom code for configuring logging levels (a sketch follows this project).
- Provided production support and troubleshot requests from end users.
- Involved in database design and coding SQL for Oracle.
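A minimal sketch of the Log4j pattern described above: a per-class logger plus a programmatic hook for adjusting logging levels at runtime. The class, method and logger names are illustrative:

```java
import org.apache.log4j.Level;
import org.apache.log4j.Logger;

public class OrderService {
    private static final Logger LOG = Logger.getLogger(OrderService.class);

    public void process(String orderId) {
        LOG.debug("Processing order " + orderId);
        try {
            // ... business logic ...
            LOG.info("Order " + orderId + " processed");
        } catch (RuntimeException e) {
            LOG.error("Failed to process order " + orderId, e);
            throw e;
        }
    }

    // Custom hook to raise or lower verbosity at runtime, for example from
    // an admin screen, without redeploying the application.
    public static void setLogLevel(String loggerName, String level) {
        Logger.getLogger(loggerName).setLevel(Level.toLevel(level));
    }
}
```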
Sr. Java Application Developer
Responsibilities:
- Implemented new features such as highly performant, multi-threaded transforms to process incoming messages into a trading object model using Java and Struts 1.2.
- Coded JDBC calls in the servlets to access the Oracle database tables and invoked EJB 2.1 Stateless Session Beans for business service implementation.
- Used Spring Batch for reading, validating and writing the daily batch files into the database.
- Developed user management screens using JSF framework, business components using Spring framework and DAO classes using Hibernate framework for persistence management and involved in integrating the frameworks for the project.
- Implemented J2EE design patterns such as Session Facade, Factory, DAO, DTO and MVC (a DAO/DTO sketch follows this project).
- Designed and developed the persistence service using the Hibernate framework.
- Configured and Integrated JSF, Spring and Hibernate frameworks.
- Responsible for writing Java code to convert HTML files to PDF file using Apache FOP.
- Involved in the performance tuning of PL/SQL statements.
- Developed database triggers and procedures to update the real-time cash balances.
- Worked closely with the testing team in creating new test cases and created the use cases for the module before the testing phase.
- Wrote ANT build scripts to compile Java classes, create JARs and package them into EAR files, and performed unit testing.
- Coordinated work with DB team, QA team, Business Analysts and Client Reps to complete the client requirements efficiently.
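An illustrative sketch of the DAO/DTO pattern noted above, using plain JDBC with a PreparedStatement; the table, columns and class names are hypothetical:

```java
import java.sql.Connection;
import java.sql.PreparedStatement;
import java.sql.ResultSet;
import java.sql.SQLException;
import javax.sql.DataSource;

class TradeDTO {                        // transfer object: a simple data carrier
    final long id;
    final String symbol;
    TradeDTO(long id, String symbol) { this.id = id; this.symbol = symbol; }
}

class TradeDAO {                        // data-access object: hides the SQL
    private final DataSource dataSource;

    TradeDAO(DataSource dataSource) { this.dataSource = dataSource; }

    TradeDTO findById(long id) throws SQLException {
        String sql = "SELECT trade_id, symbol FROM trades WHERE trade_id = ?";
        try (Connection con = dataSource.getConnection();
             PreparedStatement ps = con.prepareStatement(sql)) {
            ps.setLong(1, id);
            try (ResultSet rs = ps.executeQuery()) {
                return rs.next()
                        ? new TradeDTO(rs.getLong("trade_id"), rs.getString("symbol"))
                        : null;
            }
        }
    }
}
```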
Java Application Developer
Responsibilities:
- Assisted in designing and programming for the system, which includes development of Process Flow Diagram, Entity Relationship Diagram, Data Flow Diagram and Database Design.
- Involved in the Transactions, Login and Reporting modules and customized report generation using Controllers; tested and debugged the whole project for proper functionality and documented the modules developed.
- Designed front end components using JSF.
- Involved in developing Java APIs, which communicate with the Java Beans.
- Implemented MVC architecture using Java, Custom and JSTL tag libraries.
- Involved in developing POJO classes and writing Hibernate Query Language (HQL) queries.
- Implemented MVC architecture and DAO design pattern for maximum abstraction of the application and code reusability.
- Created stored procedures using PL/SQL for data modification.
- Used XML and XSL for data presentation, report generation and customer feedback documents.
- Used Java Beans to automate the generation of Dynamic Reports and for customer transactions.
- Developed JUnit test cases for regression testing and integrated them with the ANT build (a sketch follows this project).
- Implemented Logging framework using Log4J.
- Involved in code review and documentation review of technical artifacts.
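A minimal JUnit 4 sketch of the regression-test style referenced above; the class under test is a hypothetical stand-in:

```java
import static org.junit.Assert.assertEquals;
import org.junit.Before;
import org.junit.Test;

class InterestCalculator {              // hypothetical class under test
    double simpleInterest(double principal, double rate, int years) {
        return principal * rate * years;
    }
}

public class InterestCalculatorTest {
    private InterestCalculator calculator;

    @Before
    public void setUp() {
        calculator = new InterestCalculator();
    }

    @Test
    public void simpleInterestOnPrincipal() {
        // 1000 at 5% for 2 years -> 100 in simple interest
        assertEquals(100.0, calculator.simpleInterest(1000.0, 0.05, 2), 1e-9);
    }

    @Test
    public void zeroPrincipalYieldsZeroInterest() {
        assertEquals(0.0, calculator.simpleInterest(0.0, 0.05, 2), 1e-9);
    }
}
```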