- 8+ years of professional IT experience, including Big Data and Hadoop ecosystem technologies, in the Banking, Retail, Insurance and Communication sectors.
- Expertise with Hadoop ecosystem tools including Pig, Hive, HDFS, MapReduce, Sqoop, Spark, Kafka, YARN, Oozie, and ZooKeeper.
- Excellent knowledge of Hadoop components such as HDFS, JobTracker, TaskTracker, NameNode and DataNode, and of the MapReduce programming paradigm.
- Experience in designing and developing applications in Spark using Scala to compare the performance of Spark with Hive and SQL/Oracle.
- Experience in manipulating and analyzing large datasets and finding patterns and insights within structured and unstructured data.
- Experienced in using agile approaches, including Extreme Programming, Test-Driven Development and Agile Scrum.
- Participated in Agile and Scrum meetings.
- Strong experience with the IBM BigInsights Hadoop distribution.
- Experienced in writing complex MapReduce programs that work with different file formats such as text and XML.
- Experience in migrating data with Sqoop between HDFS and relational database systems according to client requirements.
- Extensive experience importing and exporting data using streaming platforms such as Flume and Kafka.
- Experience in database design using PL/SQL to write stored procedures, functions and triggers, and strong experience in writing complex queries for Oracle.
- Strong experience in Object-Oriented Design, Analysis, Development, Testing and Maintenance.
- Expert in optimizing Hive and Pig scripts.
- Worked in large and small teams on system requirements, design and development.
- Key participant in all phases of the software development life cycle, including analysis, design, development, integration, implementation, debugging and testing of software applications in client-server environments, object-oriented technology and web-based applications.
- Experience using various IDEs, including Eclipse and Java JDE.
- Experience using build tools such as sbt and Maven.
- Prepared standard coding guidelines and analysis and testing documentation.
- Domain knowledge in insurance.
- Excellent client-facing, negotiation and conflict resolution skills; a highly motivated self-starter and team player who interacts effectively with stakeholders to translate business requirements into IT deliverables.
Programming Languages: SQL, T-SQL, PL/SQL, C, C++, C#, CSS, HTML, Java, JDBC
Databases: NoSQL (HBase), MySQL, MS SQL Server.
IDEs & Utilities: Eclipse, JCreator, NetBeans.
Web Dev. Technologies: HTML, XML.
Protocols: TCP/IP, HTTP and HTTPS.
Operating Systems: Windows 7, 8, 10, Ubuntu, Unix, Linux, Red Hat.
ETL tools: Tableau
Hadoop ecosystem: Hadoop, HDFS, MapReduce, Sqoop, Hive, Pig, HBase, ZooKeeper, Oozie, and Kafka.
Confidential, California, CA
- Created ETL processes to move data from source systems to Hadoop.
- Created MapReduce code to convert source files from EBCDIC to ASCII format.
- Created a data quality framework to perform basic validation of the source data.
- Created the Key and Split framework for adding key columns and splitting NPI/non-NPI data.
- Experience in transforming and analyzing data using HiveQL and Pig Latin.
- Experience in developing custom UDFs/UDAFs, handling updates in Hive, and Apache Sentry.
- Experience in optimizing Hive queries and performance tuning.
- Registered datasets in a metadata registry that controls admittance into Hadoop.
- Good understanding of Hadoop data classification and directory structure options.
- In-depth understanding of Hadoop architecture and its components, such as HDFS, JobTracker, TaskTracker, NameNode, DataNode, MRv1 and MRv2 (YARN).
- Coordinated with the offshore team and provided them with analysis and guidance.
- Involved in daily Scrum meetings to discuss sprint development and progress, and was active in making Scrum meetings more productive.
Environment: Hadoop (Cloudera Distribution), UNIX, Teradata, MapReduce, HDFS, Pig, Hive, Sqoop, Unix shell scripting.
Confidential, Louisville, KY
Hadoop/Java J2EE Developer
- Performed Sqoop imports of data from the data warehouse platform to HDFS and built Hive tables on top of the datasets.
- Built ETL workflows to process data in Hive tables.
- Used Hue to create Oozie workflows to perform different kinds of actions, such as Hive, Java and MapReduce.
- Worked extensively in Hive, using features such as UDFs and UDAFs.
- Used SequenceFile and Avro file formats and Snappy compression when storing data in HDFS. Used efficient columnar storage such as Parquet for data used by the business.
- Worked extensively in MapReduce using Java. Well versed in features such as multiple outputs in MapReduce.
- Worked on features such as reading a Hive table from MapReduce and making it available to all data nodes through the distributed cache. Used both Hue and XML for Oozie.
- Participated in building a CDH4 test cluster for implementing Kerberos authentication. Installed Cloudera Manager and Hue.
Environment: Hadoop, CDH4, Hue, MapReduce, Hive, Pig, Sqoop, Oozie, Impala, Core Java/J2EE, JSON, Netezza, Maven, SVN, and Eclipse.
Confidential, Orlando, FL
- Designing and creating stories for the development and testing of the application.
- Configuring and performance tuning the Sqoop jobs for importing the raw data from the data warehouse.
- Developing Hive queries using partitioning, bucketing and windowing functions.
- Optimized Hive joins for large tables and developed MapReduce code for the full outer join of two large tables.
- Designed and developed entire pipeline from data ingestion to reporting tables.
- Creating the raw Avro data for an efficient feed to the MapReduce processing.
- Designing and developing Pig Latin scripts and Pig command-line transformations for data joins and custom processing of MapReduce outputs.
- Creating HBase tables for random reads/writes by the MapReduce programs.
- Creating Hive tables on the imported data for validation and debugging.
- Creating the data model, schemas and stored procedures for the reporting database.
- Designing and creating Oozie workflows to schedule and manage Hadoop, Pig and Sqoop jobs.
- Implemented a custom workflow scheduler service to manage multiple independent workflows. Implemented a web application that uses the Oozie REST API to schedule jobs.
- Experience in using Sqoop to migrate data between HDFS and MySQL or Oracle, and deployed Hive and HBase integration to perform OLAP operations on HBase data.
- Assisted SQL Server Database Developers in code review and optimizing SQL queries.
- Experience in database design, entity relationships, database analysis, SQL programming, and PL/SQL stored procedures, packages and triggers in Oracle and SQL Server on Windows and Linux.
- Actively involved in developing a front-end Spring web application for consumers to create custom profiles for data processing.
- Experience migrating MapReduce programs into Spark transformations using Spark and Scala.
- Actively involved in deploying and testing the application in different environments.
- Configuring the Hadoop environment: Kerberos authentication, DataNodes, NameNodes, MapReduce, Hive, Pig, Sqoop and the Oozie workflow engine.
Environment: Hadoop, MapReduce, HDFS, Pig, Hive, HBase, Oozie, Cloudera CDH 4.5 and 5.8, SQL, Linux, Java (JDK 1.6), Eclipse IDE, Web services, DB2.
- Analyzed the object-oriented design and presented it with UML sequence and class diagrams.
- Managed connectivity using JDBC for querying/inserting & data management including triggers and stored procedures.
- Developed components using Java multithreading.
- Developed various EJBs (session and entity beans) for handling business logic and data manipulations from database.
- Involved in the design of JSPs and servlets for navigation among the modules.
- Designed the cascading style sheets, XSLT and XML for the Order Entry and Product Search modules, and implemented client-side validation with JavaScript.
- Hosted the application on WebSphere.