- An accomplished IT professional with 10+ years of diverse experience, with emphasis on the Big Data/Hadoop ecosystem, SQL/NoSQL databases, and Java/J2EE technologies.
- Extensive development experience with Hadoop tools, including Spark, Hive, Oozie, Sqoop, Pig, HBase, and MapReduce programming.
- Experience with Hadoop distributions, both cloud-based (AWS EMR, Google Dataproc) and in-house (Cloudera and Hortonworks).
- Experience in data ingestion, including ingesting data into Hadoop from sources such as Oracle and MySQL using Sqoop.
- Experience in data processing, including developing Spark applications in Scala to transform raw data sets using Spark SQL and the DataFrame API.
- Good experience with real-time streaming data using Kafka and Spark Streaming.
- Experience designing Oozie workflows to schedule and manage data pipelines.
- Good experience in designing and implementing end-to-end data security within the Hadoop platform using Kerberos and Apache Sentry.
- Experience working with NoSQL databases such as HBase, Cassandra, and MongoDB.
- Experience in ETL processes, including data sourcing, mapping, transformation, conversion, and loading.
- Experience in Apache Spark Core, Spark SQL, Spark Streaming, and Spark ML.
- Experience with file formats such as Avro (row-based) and ORC and Parquet (columnar).
- Experience integrating Hadoop with Amazon S3 and Redshift.
- Expertise in back-end/server-side Java technologies such as web services, Java Persistence API (JPA), Java Message Service (JMS), and Java Database Connectivity (JDBC).
- Good experience in object-oriented programming using Java and J2EE (Servlets, JSP, Java Beans, EJB, JDBC, RMI, XML, JMS, Web Services, AJAX).
- Ability to write well-documented, well-commented, clear, maintainable, and efficient code for application development.
Big Data Ecosystem: Hadoop, MapReduce, YARN, HDFS, HBase, ZooKeeper, Hive, Hue, Pig, Sqoop, Spark, Oozie, Storm, Flume, Talend, Cloudera Manager, Amazon AWS, NiFi, Apache Ambari, Presto, Hortonworks, Impala, Informatica, Redshift, Teradata, Podium Data
Languages: C, Java, Advanced PL/SQL, Pig Latin, Python, HiveQL, Scala, SQL
Java/J2EE: J2EE, Servlets, JSP
Frameworks: Struts, Spring 3.x, Hibernate (ORM), JPA, JDBC
Web Services: SOAP, RESTful, JAX-WS
Web Servers: WebLogic, WebSphere, Apache Tomcat, GlassFish 4.0
Database: Oracle 9i/10g, Microsoft SQL Server, MySQL, DB2, Teradata
NoSQL Databases: MongoDB, Cassandra, HBase
IDE & Build Tools: NetBeans, Eclipse, IntelliJ, Ant, Maven, SBT, Jenkins
Version Control System: GitHub, CVS, SVN
Confidential - Northbrook, IL
Sr. Hadoop Engineer
- Developed custom input adaptors for ingesting clickstream data from external sources such as FTP servers and S3 buckets on a daily basis.
- Created Spark applications in Scala to enrich this clickstream data with enterprise user data.
- Implemented batch processing of jobs using the Spark Scala API.
- Developed Sqoop scripts to import/export data from Oracle to HDFS and into Hive tables.
- Stored the data in columnar formats using Hive.
- Involved in building and managing NoSQL database models using HBase.
- Used Spark to read data from Hive and write it to HBase.
- Optimized Hive tables using techniques such as partitioning and bucketing to improve the performance of HiveQL queries.
- Used different data formats while loading the data into HDFS.
- Implemented MapReduce programs to handle semi-structured and unstructured data such as XML and JSON files, and sequence files for log data.
- Loaded the final processed data into HBase tables to allow the downstream application team to build rich, data-driven applications.
- Worked with a team to improve the performance of existing Hadoop algorithms using Spark, Spark SQL, and DataFrames.
- Implemented business logic in Hive and wrote UDFs to process the data for analysis.
- Used Oozie to define a workflow to coordinate the execution of Spark, Hive and Sqoop jobs.
- Addressed issues arising from the large volume of data and transitions.
- Documented operational problems in JIRA, following established standards and procedures.
Environment: Hadoop 2.x, Spark, Scala, Hive, Sqoop, Oozie, Kafka, Hortonworks, ZooKeeper, HBase, YARN, JIRA, Kerberos, Shell Scripting, SBT, GitHub, Maven
Confidential - Dallas, TX
Hadoop & Spark Developer
- Involved in requirement analysis, design, coding and implementation phases of the project.
- Used Sqoop to load structured data from relational databases into HDFS.
- Loaded transactional data from Teradata using Sqoop and created Hive Tables.
- Worked on automation of delta feeds from Teradata using Sqoop and from FTP Servers to Hive.
- Performed transformations such as denormalization, data set cleansing, date transformations, and parsing of complex columns.
- Worked with different compression codecs like GZIP, SNAPPY and BZIP2 in MapReduce, Pig and Hive for better performance.
- Handled Avro, JSON, and Apache log data in Hive using custom Hive SerDes.
- Worked on batch processing and scheduled workflows using Oozie.
- Installed and configured a multi-node cluster in the cloud on Amazon Web Services (AWS) EC2.
- Deployed Hadoop applications in the cloud using S3 and used Elastic MapReduce (EMR) to run MapReduce jobs.
- Used HiveQL to create partitioned RC and ORC tables, and applied compression to optimize data processing and speed up retrieval.
- Implemented partitioning, dynamic partitioning, and bucketing in Hive for efficient data access.
Environment: HDFS, Hadoop, Pig, Hive, HBase, Sqoop, Teradata, Flume, MapReduce, Oozie, Java 6/7, Oracle 10g, YARN, UNIX Shell Scripting, Maven, Agile Methodology, JIRA, Linux
Confidential - Austin, TX
- Extracted data from flat files and RDBMS databases into a staging area and ingested it into Hadoop.
- Installed and configured Hadoop MapReduce and HDFS, and developed multiple MapReduce jobs in Java for data cleansing and preprocessing.
- Imported and exported data into and out of HDFS and Hive using Sqoop.
- Responsible for coding batch pipelines, RESTful services, MapReduce programs, and Hive queries, as well as testing, debugging, peer code review, troubleshooting, and maintaining status reports.
- Implemented MapReduce programs to classify data into different categories based on record type.
- Implemented complex MapReduce programs in Java to perform map-side joins using the Distributed Cache.
- Wrote Flume configuration files for importing streaming log data into HBase.
- Performed masking of sensitive customer data using Flume interceptors.
- Involved in migrating tables from RDBMS into Hive using Sqoop and later generating visualizations using Tableau.
- Involved in creating Hive tables, loading them with data, and writing Hive queries that run internally as MapReduce jobs.
- Installed the Oozie workflow engine and scheduled it to run data- and time-dependent Hive and Pig jobs.
- Involved in Agile methodologies, daily Scrum meetings, and Sprint planning.
Environment: HDFS, MapReduce, Cassandra, Hive, Pig, Sqoop, Tableau, NoSQL, Shell Scripting, Maven, Git, HDP Distribution, Eclipse, Log4j, JUnit, Linux
Senior Java Developer
- Actively involved in all phases from requirements gathering through development and implementation.
- Implemented design patterns using the MVC 2 web framework.
- Implemented views using Struts tags, JSTL and Expression Language.
- Used Spring dependency injection to plug Hibernate DAO objects into the business layer.
- Created Spring Interceptors to validate web service requests and enable notifications.
- Integrated Hibernate ORM framework with Spring framework for data persistence and transaction management.
- Involved in implementing DAO pattern for database connectivity and Hibernate for object persistence.
- Configured batch jobs, job steps, job listeners, readers, writers, and tasklets using Spring Batch.
- Integrated Spring Batch and Apache Camel using Spring XML to define service beans, batch jobs, Camel routes, and Camel endpoints.
- Applied Object Oriented Programming (OOP) concepts (including UML use cases, class diagrams, and interaction diagrams).
- Designed REST APIs that allow sophisticated, effective, and low-cost application integration.
- Worked with core Java concepts such as JVM internals, multithreading, and garbage collection.
- Implemented Java Message Services (JMS) using JMS API.
- Adopted J2EE design patterns such as Singleton, Service Locator, and Business Facade.
- Developed POJO classes and used annotations to map them to database tables.
- Developed the application front end using Spring Framework 3.0, which uses the MVC design pattern.
- Used the Spring Framework as the middle-tier component and integrated it with Hibernate for back-end development.
- Coordinated with interface design architects to meet accessibility standards at the code level.
- Designed and built UIs on the server platform in a team environment.
- Involved in the configuration of the Struts Framework, Spring Framework, and Hibernate mapping tool.
- Used Jasper Reports for designing multiple reports.
- Implemented a web service client program to access the Affiliates web service using SOAP/REST web services.
- Involved in production support, resolving the production issues and maintaining the application server.
- Used Agile/Scrum methodology (SDLC) to manage projects and the team.
- Unit tested all classes using JUnit at the class and method levels.
- Worked with the testing team on all test cases and created test cases from use cases.
Environment: J2EE, Hibernate 3.0, JSF, Rational Rose, Spring 1.2, JSP 2.0, Servlet 2.3, XML, JDBC, JNDI, JUnit, IBM WAS 6.0, RAD 7.0, Oracle 9i, PL/SQL, Log4j, Linux, RESTful Web Services
- Involved in all phases of the Software Development Life Cycle, analyzing user requirements and converting them into software requirement specifications using an object-oriented approach.
- Implemented AJAX to provide dynamic loading, improved interaction, and a rich look for the admin portal user interface.
- Implemented J2EE design patterns such as Singleton, Session Facade, and Data Access Objects.
- Used Hibernate for object-relational mapping (ORM) of Java classes.
- Used Spring 3.0 with JMS to establish interactive communication between different domains.
- Designed and developed a web-based client using Servlets, JSP, JavaScript, Tag Libraries, CSS, HTML, and XML.
- Designed Java classes using the Spring Framework to implement the Model-View-Controller (MVC) architecture.
- Consumed and exposed SOAP and RESTful web services.
- Wrote complex SQL queries and programmed stored procedures, packages and triggers using Oracle 10g.
- Performed module- and unit-level testing with JUnit and Log4j.
- Used JBoss 6.0 as the application server.