- 8+ years of hands-on experience as a Developer/Designer, including 5 years of experience with Big Data management and Hadoop-related components (HDFS, MapReduce, Pig, Hive, YARN, Sqoop, Flume, Spark, Scala, Kafka) on Big Data platforms. Actively involved in all phases of the project: requirements gathering, business flow and business logic design, technical documentation, identifying key abstract classes, layered architecture analysis, prototype design, coding, and unit testing.
- Extensive experience in analysis, design, and development of Big Data platform, distributed and web applications using Java, Hadoop, Hadoop Ecosystem, JSP, Servlets, Hibernate, Web services, XML and XSL.
- Hands-on experience writing MapReduce jobs and working with Pig, HBase, and Hive.
- Hands-on experience installing, configuring, and using ecosystem components such as Hadoop MapReduce, HDFS, HBase, ZooKeeper, Hive, Sqoop, and Pig.
- Knowledge of and work experience on projects in the Storage, Banking, Health Insurance, and E-commerce domains.
- Experience with Cloudera Distribution CDH 4.x.
- Experience with MVC and Cairngorm architecture.
- Exceptional team-building skills along with the ability to support and assist in every area to meet project goals
- Experience in interacting with clients in requirements gathering
- Specializing in object-oriented approaches for development and implementation of various projects using Java, J2EE, Integration tools and related technologies
- Designed technical solutions to complex business requirements within tight deadlines. Implemented and introduced effective design methods and support protocols.
- Experience in the design and development of data-driven web applications using Flex, ActionScript, JSP, HTML, CSS, and JavaScript.
- Experience in the design, implementation, and maintenance of system architectures using application servers such as WebLogic 8.1, Oracle 10g, and the Tomcat web server.
- Worked extensively with the NetBeans and Eclipse integrated development environments (IDEs).
- Experience writing PL/SQL with relational databases (Oracle, MySQL).
- Effective testing skills: writing and executing Java and Flex test cases and reporting their results for functional, integration, and regression testing. Defined and created various functional documents for the systems to be developed.
- Experience interacting with customers and making architecture and design decisions together with them.
- Good communication and interpersonal skills, a quick learner, and able to meet deadlines.
- Good knowledge in Design Patterns.
Hadoop/Big Data: HDFS, MapReduce, Hive, Pig, YARN, Sqoop, Flume, Oozie, Cloudera Manager, RHadoop
Database/NoSQL: SQL, PL/SQL, HBase
XML/Web Services: SOAP/REST
Methodologies: Agile, Waterfall
Build Tools: Maven, Ant, Log4j
Tools and other: Rational Rose, Microsoft Visio
Operating Systems: Linux/Unix, Windows
Confidential, Long Beach, CA
Talend/Big Data Developer
- Worked in an Agile environment and completed stories related to ingestion, transformation, and publication of data on time.
- Worked with Talend Enterprise Studio 6.3.1 to ingest data into the Hadoop Data Lake.
- Worked on MapReduce code for Omniture (a hit data capturing tool) and other business scenarios.
- Analyzed large data sets by using Hive queries.
- Hands-on experience with the AWS platform, including EC2, S3, and EMR.
- Knowledge of Amazon EC2 Spot integration and Amazon S3 integration.
- Experienced in working with pharma vertical data.
- Converted complex SQL stored procedures into HiveQL.
- Optimized Hadoop Environments to meet the requirements.
- Created, debugged, and executed Talend mappings, sessions, and tasks.
- Implemented Spark using Scala and Spark SQL for faster testing and processing of data.
- Ingested data sets from different DBs and Servers using Sqoop Import tool and MFT (Managed file transfer) Inbound process.
- Involved in implementing source code management by using GitHub.
- Spearheaded research and development with big data technology: managed a small new internal Hadoop cluster using Cloudera Express and built an ETL and log ingestion system, with storage into Impala/Hive Parquet tables mirroring request patterns.
- Extensive experience in designing and development of software applications with JDK 1.6/1.5, Servlets, Struts, JSP, Spring, Hibernate, JPA 2.0, HTML, XML, Ajax, Java Beans, EJB 3.0, JNDI.
- Worked with JUnit 4.0, JavaScript, jQuery, and AngularJS.
- Good experience in Requirement Gathering, Analysis, Design, UI Prototype and Development.
- Wrote Pig Latin Scripts and Hive Queries using Avro schemas to transform the Data sets in HDFS.
- Used Datameer, Tableau, Sqoop Export tool and MFT (Managed file transfer) outbound processes to provide the transformed data to Clients.
- Used Path SerDe to process XML data files.
- Wrote a custom RecordReader for MapReduce programs.
- As part of support, responsible for troubleshooting MapReduce jobs, Pig jobs, and Hive.
- Worked on performance tuning of Hive & Pig Jobs.
- Used Maven to build the JARs for MapReduce, Pig, and Hive UDFs.
- Created SVN usage guidelines for the team to maintain the Code repository in branches, trunk and tags.
- Created Migration Process to productionize the developed code and standardized the Build to Run Document to hand over the code to run team for production support.
Environment: Agile Scrum, HDP 2.1, Talend 6.3.1, MapReduce, Hive, Pig, Sqoop, Tableau, MFT, SerDe, Oozie, Flume, Linux and SVN.
Confidential, Detroit, MI
- Participated in gathering and analyzing requirements and in designing technical documents for business requirements.
- Involved in different phases of big data projects, including data acquisition, data processing, and data serving through dashboards.
- Proficient in Java design patterns including Singleton, Dependency Injection, Factory, Model-View-Controller (MVC), Service Locator, Data Access Object (DAO), and Business Delegate.
- Well versed in core Java concepts such as Collections, Multi-Threading, Serialization, and JavaBeans.
- Imported and exported data between an Oracle database and HDFS using Sqoop, Hue, and JDBC.
- Gathered data from different sources such as the Internet, sensors, and user behavior, and moved it to HDFS using optimized join-based MapReduce programs.
- Implemented custom InputFormats that handle input files received from Java applications for processing in MapReduce.
- Experience in using Pig as an ETL tool for event joins, filters, transformations, and pre-aggregations.
- Created partitions and buckets by state in Hive to handle structured data.
- Implemented dashboards that run HiveQL queries internally, including aggregation functions, basic Hive operations, and different kinds of join operations.
- Implemented state-based business logic in Hive using Generic UDFs. Used HBase-Hive integration.
- Managed and scheduled batch jobs on a Hadoop cluster using Oozie.
- Created production jobs using Oozie workflows that integrated different actions such as MapReduce, Sqoop, and Hive.
- Experience in managing and reviewing Hadoop Log files.
- Responsible for analyzing multi-platform applications using Python.
- Developed MapReduce jobs in Python for data cleaning and data processing.
Environment: Big Data, Hadoop, MapReduce, Pig, Hive, Sqoop, Oozie, Scala, Spark, Storm, Kafka, Cassandra, Linux, Python, Oracle 10g, Cloudera Manager.
Confidential, Detroit, MI
DataStage Consultant
- Providing Data Migration strategy for each system based on data volume, data refresh requirements
- Coordinating with DBAs to create an Ally instance on Oracle database to host mortgage data
- Providing ETL framework using DataStage for migrating historical data. This includes migrating terabytes of data within a stipulated SLA
- Performing gap analysis during historical data migration and syncing up till cutover date
- Providing ETL framework using DataStage for daily and monthly interfaces
- Providing Data Reconciliation scripts for validating migrated data
- Migrating existing PL/SQL batch jobs to DataStage environment
- Orchestrating Developers, Testing team, Production support team and Tidal scheduling team
Confidential, Detroit, MI
DataStage Consultant
- Profiling Mortgage Originations Systems, Mortgage Servicing Systems
- Understanding Confidential requirements, DataMart and ODS data models
- Interacting with business users to gather transformation rules and with Data Architects for data standardization
- Providing reusable and scalable architecture for designing DataStage jobs loading staging and the Operational Data Store
- Interacting with Enterprise Data Architects in building and maintaining Master Data Repository
- Understanding the volume of data and tuning DataStage job configurations and Oracle queries to meet SLA requirements
- Coordinating with developers on unit testing and reviewing the code, fixing defects, and deploying and testing in higher environments
- Interacting with production support team and Tidal scheduling team to configure and schedule the jobs
- Documenting modifications and enhancements made to the warehouse as required by the project, in addition to architecture design documents, technical design documents, and runbooks