- Overall 8 years of experience with strong emphasis on Design, Development, Implementation, Testing and Deployment of Software Applications.
- Over 4 years of comprehensive IT experience in Big Data and Big Data Analytics: Hadoop, HDFS, MapReduce, YARN, the Hadoop ecosystem and Shell scripting.
- 5+ years of development experience using Java, J2EE, JSP and Servlets.
- Highly capable of processing large sets of structured, semi-structured and unstructured datasets and supporting Big Data applications.
- Hands on experience with Hadoop Ecosystem components like Map Reduce (Processing), HDFS (Storage), YARN, Sqoop, Pig, Hive, HBase, Oozie, ZooKeeper and Spark for data storage and analysis.
- Expertise in transferring data between the Hadoop ecosystem and structured data storage in an RDBMS such as MySQL, Oracle, Teradata and DB2 using Sqoop.
- Experience in NoSQL databases like MongoDB, HBase and Cassandra.
- Experience in Apache Spark cluster and streams processing using Spark Streaming.
- Expertise in moving large amounts of log, streaming event data and Transactional data using Flume.
- Experience in developing MapReduce jobs in Java for data cleaning and preprocessing.
- Expertise in writing Pig Latin and Hive scripts and extending their functionality using User Defined Functions (UDFs).
- Expertise in organizing data layouts using Partitions and Bucketing in Hive.
- Extensively used Microservices and Postman for hitting the Kubernetes DEV and Hadoop clusters.
- Deployed various services like Spark, MongoDB and Cassandra in Kubernetes and Hadoop clusters using Docker.
- Expertise in preparing interactive data visualizations using Tableau from different sources.
- Hands on experience in developing workflows that execute MapReduce, Sqoop, Pig, Hive and Shell scripts using Oozie.
- Experience working with Cloudera Hue Interface and Impala.
- Hands on experience developing Solr indexes using the MapReduce Indexer Tool.
- Expertise in Object-Oriented Analysis and Design (OOAD) using UML and various design patterns.
- Experience in Java, JSP, Servlets, EJB, WebLogic, WebSphere, Hibernate, Spring, JBoss, JDBC, RMI, JavaScript, Ajax, jQuery, XML and HTML.
- Fluent with the core Java concepts like I/O, multi-threading, exceptions, regular expressions, data structures and serialization.
- Performed unit testing using the JUnit testing framework and Log4J to monitor the error logs.
- Experience in process Improvement, Normalization/De-normalization, Data extraction, cleansing and Manipulation.
- Experience converting requirement specifications and source-system understanding into Conceptual, Logical and Physical Data Models and Data Flow Diagrams (DFDs).
- Expertise in working with transactional databases like Oracle, SQL Server, MySQL and DB2.
- Expertise in developing SQL queries and Stored Procedures, and excellent development experience with Agile methodology.
- Ability to adapt to evolving technology and a strong sense of responsibility.
- Excellent leadership, interpersonal, problem solving and time management skills.
- Excellent communication skills both Written (documentation) and Verbal (presentation).
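The Hive partitioning and bucketing expertise above can be illustrated with a minimal Python sketch of the underlying layout logic; the column names, bucket count and records here are hypothetical, and Hive's real hash function differs from this simplified stand-in.

```python
# Sketch of Hive-style partitioning and bucketing: partitioning groups
# rows by a partition column (here, a date), and bucketing spreads rows
# within a partition across N bucket files via hash(column) % N.
# Simplified stand-in; Hive's actual hash function is different.

NUM_BUCKETS = 4  # assumed CLUSTERED BY (user_id) INTO 4 BUCKETS

def bucket_for(user_id: str, num_buckets: int = NUM_BUCKETS) -> int:
    """Mimic the hash-mod bucket assignment Hive performs per row."""
    return sum(ord(c) for c in user_id) % num_buckets

rows = [("u1", "2020-01-01"), ("u2", "2020-01-01"), ("u3", "2020-01-02")]

# partition -> bucket -> list of rows, mirroring the on-disk layout.
layout: dict = {}
for user_id, dt in rows:
    layout.setdefault(dt, {}).setdefault(bucket_for(user_id), []).append(user_id)
```

Partition pruning is why this layout matters: a query filtered on the date column only scans the matching partition directory, and bucketing further narrows joins and sampling to specific files.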
Languages: Core Java, J2EE, SQL, Python, C, C++, PL/SQL, Hive, Pig
Application/Web Servers: Apache Tomcat 4.0/5.0/6.0, WebLogic 8.1/9.1, WebSphere 7.0, WebSphere Application Server 8.0 & RAD 7.5/8.5, JBoss
Java Technologies: J2EE/J2SE (JSP, JSTL, JavaBeans, Servlets, Web services, RESTful APIs, JDBC)
Web Technologies: Servlets, JSP, JDBC, JSF, Spring, Hibernate, AngularJS, Node.js, HTML, HTML4, HTML5, CSS, CSS3, DHTML, AJAX, JavaScript, jQuery, Bootstrap, JSON, XML, XSL, XSLT, REST/SOAP Web services, GWT, JNDI, JSTL, JMS, JPA, EJB, WSDL, JAX-RS, JAX-WS, Dojo and JavaBeans.
XML Technologies: XML, DTD, XSD, XSLT, SOAP, DOM Parser and SAX Parser
Big Data Technologies: HDFS, Hive, MapReduce, Pig, Sqoop, Oozie, Zookeeper, YARN and Spark.
Databases: Oracle 8i/9i/10g/11g/12c, MySQL, MS SQL Server, DB2, MongoDB, MS Access, Cassandra.
Frameworks: Struts, Spring (Core, Spring MVC, Spring Context, Spring AOP, Spring DAO, Spring IOC, Spring JDBC, Spring with Hibernate, Dependency Injection, Factory Pattern), Hibernate, DWR.
Development Tools: Eclipse, WSAD, IntelliJ IDEA, NetBeans, RAD, Dojo, WID (WebSphere Integration Designer)
Confidential, Bloomington, MN
Sr. Hadoop Developer
- Performed performance tuning and troubleshooting of MapReduce jobs by analyzing and reviewing Hadoop log files.
- Involved in low-level design for MapReduce, Hive, Impala and Shell scripts to process data.
- Involved in the complete Big Data flow of the application, starting from data ingestion upstream to HDFS, processing the data in HDFS and analyzing the data.
- Experience handling Hive queries using Spark SQL that integrates with the Spark environment, implemented in Scala.
- Used the Spark Streaming API with Kafka to build live dashboards; worked on transformations and actions in RDDs, Spark Streaming, Pair RDD operations, checkpointing and SBT.
- Wrote Junit tests and Integration test cases for those microservices.
- Implemented a POC to migrate MapReduce jobs into Spark RDD transformations using the Scala IDE for Eclipse.
- Created Hive tables to import large data sets from various relational databases using Sqoop, and exported the analyzed data back for visualization and report generation by the BI team.
- Installed and configured Hive, Sqoop, Flume and Oozie on the Hadoop clusters.
- Involved in scheduling the Oozie workflow engine to run multiple Hive and Pig jobs.
- Developed a process for batch ingestion of CSV files and Sqoop imports from different sources, and generated views on the data source using Shell scripting and Python.
- Integrated a shell script to create collections/morphlines and Solr indexes on top of table directories using the MapReduce Indexer Tool within the Batch Ingestion Framework.
- Implemented partitioning, dynamic partitions and buckets in HIVE.
- Developed Hive scripts to create the views and apply transformation logic in the target database.
- Involved in the design of the Data Mart and Data Lake to provide faster insight into the data.
- Used the StreamSets Data Collector tool and created data flows for one of the streaming applications.
- Experienced in using Kafka as a data pipeline between JMS (producer) and a Spark Streaming application (consumer). Involved in creating a UI using Node.js and called different microservices to set up the frontend.
- Involved in the development of a Spark Streaming application for one of the data sources using Scala and Spark by applying the transformations.
- Developed a script in Scala to read all the Parquet tables in a database and parse them as JSON files, and another script to parse them as structured tables in Hive.
- Designed and maintained Oozie workflows to manage the flow of jobs in the cluster.
- Configured Zookeeper for Cluster co-ordination services.
- Developed a unit test script to read a Parquet file for testing PySpark on the cluster.
- Involved in exploration of new technologies like AWS, Apache Flink and Apache NiFi that can increase business value.
Environment: Hadoop, HDFS, MapReduce, Hive, HBase, Zookeeper, Impala, Java (JDK 1.6), Cloudera, Oracle, SQL Server, UNIX Shell Scripting, Flume, Oozie, Scala, Spark, Sqoop, Python, Kafka, PySpark.
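The transformations-and-actions model from the Spark work above can be sketched in plain Python: transformations build a lazy pipeline and nothing executes until an action forces evaluation. The event data and names here are illustrative, not the project's actual code or PySpark itself.

```python
# Plain-Python sketch of Spark's lazy transformation/action model:
# generator expressions play the role of RDD transformations (lazy),
# and list()/sum() play the role of actions that force evaluation.
# All data and names are illustrative.

events = ["click:3", "view:1", "click:5", "view:2"]

# "Transformations": build a lazy pipeline; nothing is computed yet.
parsed = ((e.split(":")[0], int(e.split(":")[1])) for e in events)
clicks = (value for kind, value in parsed if kind == "click")

# "Actions": force the whole pipeline to run, like collect()/reduce().
click_values = list(clicks)
total = sum(click_values)
```

The design point mirrors Spark: because transformations are lazy, the engine sees the whole pipeline before running it and can fuse steps, which is why chaining maps and filters is cheap.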
Confidential, Dallas, TX
- Responsible for writing MapReduce jobs to perform operations like copying data on HDFS and defining job flows on EC2 servers, and for loading and transforming large sets of structured, semi-structured and unstructured data.
- Developed a process for importing data with Sqoop from multiple sources like SQL Server, Oracle and Teradata.
- Responsible for creating the mapping document from source fields to destination fields.
- Developed a shell script to create staging and landing tables with the same schema as the source, and to generate the properties used by Oozie jobs.
- Developed Oozie workflows for executing Sqoop and Hive actions.
- Worked with NoSQL databases like HBase, creating HBase tables to load large sets of semi-structured data coming from various sources.
- Performed performance optimizations on Spark/Scala; diagnosed and resolved performance issues.
- Responsible for developing Python wrapper scripts that extract a specific date range using Sqoop by passing the custom properties required for the workflow.
- Developed scripts to run Oozie workflows, capture the logs of all jobs that run on the cluster and create a metadata table which specifies the execution time of each job.
- Developed Hive scripts for performing transformation logic and loading the data from the staging zone to the final landing zone.
- Worked with the Parquet file format to get better storage and performance for published tables.
- Involved in loading transactional data into HDFS using Flume for Fraud Analytics.
- Developed a Python utility to validate HDFS tables against source tables.
- Designed and developed UDFs to extend the functionality of both Pig and Hive.
- Imported and exported data using Sqoop between MySQL and HDFS on a regular basis.
- Responsible for developing multiple Kafka producers and consumers from scratch as per the software requirement specifications.
- Used the CA7 tool to set up dependencies at each level (table data, file and time).
- Automated all the jobs for pulling data from the FTP server to load data into Hive tables using Oozie workflows.
- Involved in developing Spark code using Scala and Spark SQL for faster testing and processing of data, and explored optimizations using Spark Context, Spark SQL, Pair RDDs and Spark on YARN.
- Migrated the needed data from Oracle and MySQL into HDFS using Sqoop, and imported various formats of flat files into HDFS.
Environment: Hadoop, HDFS, MapReduce, Hive, HBase, Kafka, Zookeeper, Oozie, Impala, Java (JDK 1.6), Cloudera, Oracle, Teradata, SQL Server, UNIX Shell Scripting, Flume, Scala, Spark, Sqoop, Python.
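The Python wrapper scripts described above, which drive Sqoop with a custom date range, can be sketched as follows. The JDBC connection string, table and column names are hypothetical; a real wrapper would hand the assembled command to `subprocess.run`.

```python
# Sketch of a Python wrapper that builds a Sqoop import command for a
# custom date range, in the spirit of the wrapper scripts above.
# Connection string, table name and txn_date column are hypothetical.

def build_sqoop_import(table: str, start: str, end: str) -> list:
    """Assemble a sqoop import restricted to [start, end) by txn_date."""
    where = f"txn_date >= '{start}' AND txn_date < '{end}'"
    return [
        "sqoop", "import",
        "--connect", "jdbc:oracle:thin:@//db-host:1521/ORCL",  # hypothetical
        "--table", table,
        "--where", where,
        "--target-dir", f"/data/staging/{table}/{start}",
        "--num-mappers", "4",
    ]

cmd = build_sqoop_import("TRANSACTIONS", "2019-01-01", "2019-02-01")
```

Building the command as a list (rather than one shell string) keeps the date-range properties safely quoted when the workflow passes them through to Sqoop.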
Confidential, El Segundo, CA
- Responsible for managing, analyzing and transforming petabytes of data, and for quick validation checks on FTP file arrivals from the S3 bucket to HDFS.
- Responsible for analyzing large data sets and deriving customer usage patterns by developing new MapReduce programs.
- Experienced in creating Hive tables and loading data incrementally into the tables using dynamic partitioning; worked on Avro files and JSON records.
- Experienced in using Pig for data cleansing and developed Pig Latin scripts to extract the data from web server output files to load into HDFS.
- Worked on Hive by creating external and internal tables, loading it with data and writing Hive queries.
- Involved in the development and use of UDTFs and UDAFs for decoding log record fields and conversions, generating minute buckets for the specified time intervals, and a JSON field extractor.
- Developed Pig and Hive UDFs to analyze the complex data to find specific user behavior.
- Responsible for debugging and optimizing Hive scripts, and implemented deduplication logic in Hive using a rank key function (UDF).
- Experienced in writing Hive validation scripts used in the validation framework (for daily analysis through graphs presented to business users).
- Developed workflows in Oozie to automate the tasks of loading data into HDFS and pre-processing with Pig and Hive.
- Involved in Cassandra database schema design.
- Pushed data to Cassandra databases using the BULK LOAD utility.
- Responsible for creating Dashboards on Tableau Server.
- Generated reports for Hive tables in different scenarios using Tableau.
- Responsible for scheduling using ActiveBatch jobs and cron jobs.
- Experienced in JAR builds that are triggered by commits to GitHub using Jenkins.
- Explored new tools for data tagging like Tealium (POC report).
- Actively updated upper management daily on the progress of the project, including the classification levels achieved on the data.
Environment: Hadoop, MapReduce, HDFS, Pig, Hive, HBase, Zookeeper, Oozie, Impala, Cassandra, Java (JDK 1.6), Cloudera, Oracle 11g/10g, Windows NT, UNIX Shell Scripting, Tableau, Tealium.
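The rank-based deduplication mentioned above can be sketched in plain Python: rank records within each key by recency and keep rank 1, the same effect a Hive `ROW_NUMBER()`-style rank UDF over a partition produces. The record fields below are hypothetical.

```python
# Sketch of rank-key deduplication: group records by a business key,
# rank within each group by load timestamp, keep only rank 1 - the
# effect of a Hive rank/ROW_NUMBER UDF over a partitioned window.
# Field names are hypothetical.

def dedupe_latest(records: list) -> list:
    """Keep only the most recent record per customer_id (rank 1 by load_ts)."""
    latest = {}
    for rec in records:
        key = rec["customer_id"]
        if key not in latest or rec["load_ts"] > latest[key]["load_ts"]:
            latest[key] = rec
    return sorted(latest.values(), key=lambda r: r["customer_id"])

records = [
    {"customer_id": "c1", "load_ts": "2019-03-01", "status": "old"},
    {"customer_id": "c1", "load_ts": "2019-03-05", "status": "new"},
    {"customer_id": "c2", "load_ts": "2019-03-02", "status": "only"},
]
deduped = dedupe_latest(records)
```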
Confidential, Cincinnati, OH
Sr. Java Developer
- Responsible for understanding the scope of the project and requirements gathering.
- Used MapReduce to index the large amount of data to easily access specific records.
- Supported MapReduce programs running on the cluster.
- Developed MapReduce programs to perform data filtering for unstructured data.
- Designed the application by implementing the Struts Framework based on MVC architecture.
- Developed a framework for data processing using design patterns, Java and XML.
- Implemented J2EE standards, MVC2 architecture using Struts Framework.
- Implemented Servlets, JSP and Ajax to design the user interface.
- Used JSP, JavaScript, HTML5 and CSS for manipulating, validating and customizing error messages in the user interface.
- Used the lightweight container of the Spring Framework to provide architectural flexibility for Inversion of Control (IoC).
- Used Spring IoC for dependency injection into Hibernate and Spring frameworks.
- Designed and developed Session beans to implement the business logic.
- Developed EJB components that are deployed on the WebLogic Application Server.
- Wrote unit tests using the JUnit framework; logging was done using the Log4J framework.
- Designed and developed various configuration files for Hibernate mappings.
- Designed and Developed SQL queries and Stored Procedures.
- Used XML, XSLT and XPath to extract data from Web Services output XML.
- Used ANT scripts to fetch, build, and deploy application to development environment.
- Developed Web Services for sending and getting data from different applications using SOAP messages.
- Actively involved in code reviews and bug fixing.
- Applied CSS (Cascading Style Sheets) across the entire site for standardization.
- Offshore co-ordination and User acceptance testing support.
- Involved in the analysis and design of the application using Rational Rose.
- Developed the various action classes to handle requests and responses.
- Involved in the design of the Referential Data Service module to interface with various databases using JDBC.
- Used the Hibernate framework to persist employee work hours to the database.
- Developed classes and interfaces with the underlying web services layer.
- Prepared documentation and participated in preparing the user's manual for the application.
- Prepared Use Cases, Business Process Models and Data flow diagrams, User Interface models.
- Gathered and analyzed requirements for EAuto and designed process flow diagrams.
- Defined business processes related to the project and provided technical direction to the development workgroup.
- Analyzed the legacy system and the Financial Data Warehouse.
- Participated in database design sessions and database normalization meetings.
- Managed Change Request Management and Defect Management.
- Managed UAT testing and developed test strategies, test plans, reviewed QA test plans for appropriate test coverage.
- Involved in developing JSPs, action classes, form beans, response beans and EJBs.
- Extensively used XML to code configuration files.
- Developed PL/SQL stored procedures, triggers.
- Performed functional, integration, system and validation testing.
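The Spring IoC / dependency-injection pattern used throughout this project can be sketched language-agnostically; Python stands in here for brevity, and all class names are illustrative rather than the project's actual components.

```python
# Language-agnostic sketch of Inversion of Control / dependency
# injection as used with the Spring container above: the DAO is
# injected into the service by external wiring, not constructed
# inside it. All names are illustrative.

class HoursDao:
    """Stands in for a Hibernate-backed DAO."""
    def save(self, employee: str, hours: int) -> str:
        return f"saved {hours}h for {employee}"

class HoursService:
    def __init__(self, dao):  # constructor injection
        self.dao = dao

    def record(self, employee: str, hours: int) -> str:
        return self.dao.save(employee, hours)

# Minimal "container": wiring happens here, outside HoursService,
# so the service can be tested with a mock DAO swapped in.
container = {"dao": HoursDao()}
service = HoursService(container["dao"])
result = service.record("jdoe", 8)
```

The benefit mirrored from Spring is testability: because `HoursService` never constructs its DAO, a unit test can inject a stub without touching the database.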