Big Data / Hadoop Lead Resume
Dallas, TX
SUMMARY:
- Over 10 years of experience in data analysis, data modeling, and implementation of enterprise-class systems spanning Big Data, data integration, object-oriented programming, data warehousing, and advanced analytics.
- 4 years of experience with Hadoop, HDFS, MapReduce, and the Hadoop ecosystem (Hive, Oozie, Kafka, Impala, Spark, Avro, JSON).
- Good knowledge of Hive optimization with ORC, Partitions and Bucketing.
- Created scheduled data ingestion jobs using Sqoop and the Oozie scheduler.
- Hands-on experience writing MapReduce jobs in Java.
- Hands-on experience writing Pig Latin scripts, Pig commands, and Hive queries.
- Good knowledge of and experience with Spark and Kafka.
- Hands-on experience installing, configuring, and using ecosystem components such as Hadoop MapReduce, HDFS, Sqoop, Pig, Scala, Hive, Impala, and Spark.
- Experience in database development using SQL and PL/SQL and experience working on databases like Oracle 9i/10g, Informix, and SQL Server
- Experience working on NoSQL databases including HBase & MongoDB.
- Experience using Sqoop to import data into HDFS from RDBMS and vice versa.
- Experience in database design and development using relational databases (Oracle, MS SQL Server 2005/2008, MySQL) and NoSQL databases (MongoDB, Cassandra, HBase).
- Effective team player with excellent communication skills and the insight to determine priorities, schedule work, and meet critical deadlines.
- Good working experience with file formats such as Avro, JSON, and Parquet in Hadoop tools using SerDe concepts.
- Experience in analyzing data in Spark using Scala and PySpark.
- Optimized the Hive tables using optimization techniques like partitions and bucketing to provide better performance with HiveQL queries.
- Experience in importing streaming data into HDFS using Flume sources and sinks and transforming the data using Flume interceptors.
- Experience using Java tools in business, web, and client-server environments, including Java, JDBC, Servlets, JSP, Struts Framework, Jasper Reports, and SQL.
- Experienced with different scripting languages like Python and shell scripting.
- Proficient in using various IDEs such as Eclipse and NetBeans.
TECHNICAL SKILLS:
Big Data Ecosystem: Hadoop, MapReduce, HDFS, HBase, Spark, Scala, Impala, Hive, Pig, Oozie, Sqoop, Flume, Kafka, CDH4, JSON, Avro
Java Technologies: Java 5, Java 6, JAXP, AJAX, I18N, JFC Swing, Log4j, Java Help API
Methodologies: Agile, UML, Design Patterns
Database: Oracle 10g, DB2, MySQL, NoSQL (MongoDB), HBase
Application Server: Apache Tomcat 5.x/6.0, JBoss 4.0
Web Tools: HTML, JavaScript, XML, DTD, Schemas, XSL, XSLT, XPath, DOM, XQuery
Tools: SQL Developer, DbVisualizer, Hortonworks
IDE / Testing Tools: NetBeans, Eclipse, WSAD, RAD, MATLAB
Operating System: Windows, Linux
Scripts: Bash, Python, Ant
Testing API: JUnit
PROFESSIONAL EXPERIENCE:
Confidential, Dallas, TX
Big Data / Hadoop Lead
Responsibilities:
- Led the AML Cards North America development and DQ team to successfully implement the compliance project.
- Involved in the project from the POC stage onward, working from data staging through population of the data mart and reporting.
- Extensive experience with Amazon Web Services (Amazon EC2, Amazon S3, Amazon SimpleDB, Amazon RDS, Amazon Elastic Load Balancing, Amazon SQS, AWS Identity and Access Management, Amazon CloudWatch, Amazon EBS, and Amazon CloudFront).
- Deployment, performance, and scalability fine-tuning of web/application servers such as WebLogic, WebSphere, JBoss, Pramati, and Tomcat.
- Expertise in the Spring framework, including Spring IoC, Spring DAO support, Spring ORM, Spring Microservices, Spring AOP, Spring Security, Spring MVC, Spring Cache, Spring Integration, Spring Boot, and Spring REST.
- Expertise in developing applications using RESTful Web Services, SOAP, Java, J2EE, Servlets, EJB, JPA, WebSphere Commerce, Hibernate, Spring Framework, Jasper Reports Server, Ext JS, JSP, JMS, Struts, XML, Eclipse, NetBeans, jQuery, Visual SourceSafe, CVS, SVN, JDBC, JNDI, JIRA, Ant, Maven, iReport, Apache Tiles, Spring Batch, Spring Security, Spring Web Flow, Spring Data JPA, JSF, ICEfaces, HTML, and JavaScript.
- Expertise in developing microservices using Spring Boot and Node.js to build physically separated, modular applications that improve application scalability, availability, and agility.
- Experience building modern Spring applications with Spring Boot.
- Worked in an onsite-offshore environment.
- Solely responsible for creating the data model for storing and processing data and for generating and reporting alerts. This model is being implemented as the standard across all regions as a global solution.
- Participated in discussions with other regional teams and guided them on the Citi Big Data platform and the AML Cards data model and strategy.
- Responsible for technical design and review of data dictionary (Business requirement).
- Responsible for providing technical solutions and workarounds.
- Migrated the needed data from the data warehouse and product processors into HDFS using Talend and Sqoop, and imported flat files in various formats into HDFS.
- Used Spark Streaming to bring all credit card transactions into the Hadoop environment.
- Involved in the design of the overall Citigroup Big Data architecture.
- Involved in discussions with source systems on DQ issues in the data.
- Integrated the Hive warehouse with Spark and Impala; later replaced Impala with Spark because of an Impala security issue.
- Comfortable with Scala functional programming idioms and very familiar with Iteratee/Enumerator streaming patterns. Almost the entire DQ and end-to-end reconciliation is done in Scala and Spark.
- Implemented partitioning, dynamic partitioning, indexing, and bucketing in Hive.
- Created custom UDFs in Java to overcome Hive limitations on Cloudera CDH5.
- Used Hive for data processing and batch data filtering; used Spark/Impala for other value-centric data filtering.
- Supported and monitored MapReduce programs running on the cluster.
- Monitored logs and responded accordingly to any warning or failure conditions.
- Responsible for preserving code and design integrity using SVN and SharePoint.
- Gave a demo to business users on using Datameer for analytics.
Environment: Apache Hadoop, HDFS, Hive, MapReduce, Java, Talend, Spark, Impala, Scala, Sqoop, Cloudera CDH5, Platform, SVN, SharePoint, Datameer and Maven.
Confidential, Santa Clara, CA
Big Data / Hadoop Developer
Responsibilities:
- Set up a 64-node cluster and configured the entire Hadoop platform.
- Migrated the needed data from MySQL and MongoDB into HDFS using Sqoop, and imported flat files in various formats into HDFS.
- Mainly worked on Hive queries to categorize data of different claims.
- Integrated the Hive warehouse with HBase.
- Used Kafka to store all online communications in HBase.
- Wrote custom Hive UDFs in Java where the required functionality was too complex.
- Designed and created Hive external tables with partitioning, dynamic partitioning, and buckets, using a shared metastore instead of Derby.
- Experience with continuous integration / continuous delivery (CI/CD) tools such as Docker and Jenkins to deploy this application to AWS.
- Experience writing JUnit and TestNG test cases and executing them as part of the automated build process from Jenkins jobs.
- Hands-on experience in designing and implementing a Selenium WebDriver automation framework for smoke and regression tests using TestNG.
- Experience in developing end-to-end automation using Selenium WebDriver, Grid, POM, JUnit, TestNG, Cucumber, Object Repository, and Web Services (REST, SOAP).
- Excellent knowledge and experience in SQL queries, PL/SQL, stored procedures, functions and triggers to interact with SQL, MySQL, Oracle databases.
- Experience with Maven (pom.xml) and with Jenkins as the CI/CD tool; configured Log4j for logging.
- Wrote HiveQL scripts to create, load, and query tables in Hive.
- Generated final reporting data for testing in Tableau by connecting to the corresponding Hive tables through the Hive ODBC connector.
- Supported MapReduce programs running on the cluster.
- Maintained system integrity of all Hadoop sub-components (primarily HDFS, MapReduce, HBase, and Hive).
- Monitored system health and logs and responded accordingly to any warning or failure conditions.
- Presented data and dataflow using Talend for reusability.
Environment: Apache Hadoop, HDFS, Hive, MapReduce, Java, Pig, Sqoop, Cloudera CDH4, MySQL, Tableau, Talend, Kafka, SFTP.
Confidential, Somerset, NJ
Java J2EE Developer
Responsibilities:
- Coded the business methods according to the IBM Rational Rose UML model.
- Extensively used Core Java, Servlets, JSP and XML.
- Experience implementing Struts (Model View Controller framework), Spring frameworks, and object-relational mapping (ORM) tools such as Hibernate.
- Extensive knowledge of Core Java technologies such as multithreading, exception handling, reflection, collections, streams, and file I/O.
- Strong experience in developing Java/J2EE applications using IDEs such as Eclipse and IntelliJ and web servers such as JBoss, Tomcat, and WebLogic.
- Strong working experience with mapping tools such as Hibernate (Hibernate connection pooling, HQL, Hibernate caching, transactions).
- Implemented Business Logic using POJOs and used Tomcat and WebLogic to deploy the applications.
- Good understanding of object-oriented programming (OOP) concepts.
- Good experience with Spring modules such as Spring AOP and IoC. Working experience with multithreaded, synchronous, and event-based programming.
- Extensive knowledge in deploying and maintaining the application on Tomcat and WebLogic servers.
- Used Maven for project management tasks such as build and install.
- Hands-on experience in object-oriented design and Core Java concepts such as design patterns, multithreading, exception handling, and the Collections API.
- Good experience with Spring modules such as Spring Core, IoC, AOP, and Spring MVC.
- Expertise in configuring the Spring Application Context with dependency injection and using the Spring Framework to integrate Hibernate and web services.
- Developed various Spring starter POMs for Spring Boot based REST services.
- Used Struts 1.2 in presentation tier.
- Generated the Hibernate XML and Java Mappings for the schemas
- Used DB2 Database to store the system data
- Used Rational Application Developer (RAD) as Integrated Development Environment (IDE).
- Unit tested all components using JUnit.
- Used the Apache Log4j framework for trace logging and auditing.
- Used Asynchronous JavaScript and XML (AJAX) for a faster, more interactive front end.
- Used IBM WebSphere as the application server.
- Used IBM Rational ClearCase for version control.
Environment: Java 1.6, Servlets, JSP, Struts 1.2, IBM Rational Application Developer (RAD) 6, WebSphere 6.0, iText, AJAX, Rational ClearCase, Rational Rose, Oracle 9i, Log4j.
Confidential, Chicago, IL
JAVA Developer
Responsibilities:
- Involved in the complete software development life cycle (SDLC) of the application, from requirement analysis to testing.
- Developed the modules based on the Struts MVC architecture.
- Developed the UI using JavaScript, JSP, HTML, and CSS for interactive cross-browser functionality and a complex user interface.
- Proficient in the design and development of web-based applications using Java, J2EE, Servlets, JSP, Spring, JDBC, Hibernate, Ant, Eclipse, XML, and databases.
- Experience in web application design using open-source frameworks (Struts, Spring MVC) and J2EE design patterns. Developed Ant and Maven scripts to build and deploy J2EE applications.
- Experience implementing Struts (Model View Controller framework), Spring frameworks, and object-relational mapping (ORM) tools such as Hibernate.
- Extensive knowledge of Core Java technologies such as multithreading, exception handling, reflection, collections, streams, and file I/O.
- Strong experience in developing Java/J2EE applications using IDEs such as Eclipse and IntelliJ and web servers such as JBoss, Tomcat, and WebLogic.
- Strong working experience with mapping tools such as Hibernate (Hibernate connection pooling, HQL, Hibernate caching, transactions).
- Implemented Business Logic using POJOs and used Tomcat and WebLogic to deploy the applications.
- Created Business Logic using Servlets, Session beans and deployed them on WebLogic server.
- Used the Struts MVC framework for application design.
- Created complex SQL queries, PL/SQL stored procedures, and functions for the back end.
- Prepared the Functional, Design and Test case specifications.
- Involved in writing Stored Procedures in Oracle to do some database side validations.
- Performed unit testing, system testing and integration testing
- Developed Unit Test Cases. Used JUnit for unit testing of the application.
- Provided technical support for production environments: analyzed defects, resolved issues, and provided and implemented fixes. Resolved high-priority defects on schedule.
Environment: Java 6, Eclipse, Apache Tomcat Web Server, JSP, JavaScript, AWT, Servlets, JDBC, HTML, Front Page 2000, Oracle, CVS.
