- 12+ years of total experience in the Defence, Banking, Marketing and R&D industries, spanning technologies including BigData, Mobile, UI/Web, Enterprise Applications, ERP (SAP), Data Integration, Data Warehouse and IoT, with experience in Hadoop and BigData frameworks and their enterprise-wide implementation, integration and application, and a polyglot ability to apply critical thinking and problem solving to business needs.
- Expertise in BigData architecture design, planning, installation, deployment and migration of traditional warehouse solutions to Hadoop based Integrated Data Warehouses.
- 4+ years of hands-on architecture expertise in designing large data processing frameworks with Hadoop, Spark, Pig, Hive, Flume, Sqoop, MapReduce, NoSQL and Linux data stores, the Hadoop ecosystem, and large data applications on multi-clustered environments.
- Hands-on experience with HDFS and NoSQL databases including MongoDB, HBase, Cassandra.
- Hands-on architectural experience with AWS (ELB, S3, EMR (emr, MapR), APIGateway, Lambda, etc.) and Azure cloud platforms, and in designing MicroServices and Serverless applications.
- Design and build scalable infrastructure and platforms to collect and process very large amounts of data (structured and unstructured), including streaming real-time data.
- Extensive expertise with monitoring, debugging, benchmarking and performance tuning of tools of Hadoop ecosystem.
- Expertise in building Realtime and Near-Realtime complex stream processing platforms using tools like Spark, Flume, Kafka, etc. for scalable and fault-tolerant systems.
- Architected high-performance, highly scalable applications. Actively led the integration of cross-domain applications, managing teams of multiple developers and testers through end-to-end project implementations.
- Deep understanding of BigData Security, Compliance (PII, PCI, SOX, etc.), vendor certifications and capabilities across product platforms, DataGovernance, Audit and DataLineage requirements in myriad verticals like Defence, Finance, Marketing, etc.
- Expert knowledge of multi-clustered environments and setting up Cloudera and MapR Hadoop ecosystems.
- 4+ years of experience as TeamLead and experience in work management, delegation and status tracking.
- 5+ years of experience working in an on-site/offshore model, with the capability to handle and manage multiple projects with offsite teams.
BigData Tools: Hadoop, MapReduce (MRv1, MRv2), HDFS, YARN, HBase, ZooKeeper, KUDU, Impala, Hive, Pig, Sqoop, Oozie, Spark, Spark Streaming, Spark SQL (Data Frames)
Architectures: Lambda, Kappa, Mu
Machine Learning: H2O, SparkML
NoSQL: HBase, Cassandra, MongoDB.
Open Source: ApacheDrill, Apache Kylin, Apache Flink, Apache Storm
ETL: Diyotta, Talend BigData
Security/Authorization: Sentry, Knox, NavEncrypt, AD.
Logging/Monitoring: ELK, Ganglia, Nagios
Search/Data Lineage: Cloudera Search, Solr, Navigator, Lucene
Ingestion/Integration: Kafka, Flume, HDF, Splunk/Hunk, SAS
Distributions: CDH (4.x, 5.x), MapR (4.x), HDP (2.X)
Data Visualization Tools: Apache Zeppelin, Spotfire, Tableau, ZoomData
OLAP: Splice Machine, Kylin, AtScale
Cloud: AWS (EMR, MapR), Lambda, S3, RDS, ELB, API Gateway, Elastic BeanStalk, RedShift, Kinesis, DataPipeline, Azure, Google Cloud
EventDriven: CQRS, Axon
ETL: Ab Initio, Informatica 7.1, DAC, DataStage, IBMStreams
Datawarehouse: IBM Netezza, Teradata.
IDE: Eclipse, JRebel, IntelliJ, NetBeans.
UI: AngularJS, D3.js, jQuery
Database: HBase, Oracle (8i, 10g), MySQL, MS Access
Build & Configuration: ant, maven, chef, puppet
Containers: Docker, ECS
CodeDeployment: Jenkins, RunDeck
CodeRepositories: CVS, git, svn, CodeCommit(AWS)
Platforms: Redhat (6, 7), CentOS, Solaris, Windows.
Web/App Servers: NGINX, Apache Tomcat 4.x/7.x, WebLogic, WebSphere
Enterprise Frameworks: JEE5, SPRING 4.x, Struts2, GWT, EJB
ERP: SAP (ECC 6.0), ABAP
Other tools: log4j, AOP, Junit, Hudson (Jenkins), ActiveMQ, MuleESB
Confidential, Trevose, PA
Sr. Technical Architect - Bigdata/Cloud
- Collaborated with the COO, Head of Technology and EA to provide a strategic and tactical roadmap for migrating the current enterprise technology, product and service stack to a BigData and Cloud (AWS) based ecosystem that builds the value proposition for the overall company services strategy.
- Built and designed reference architectures, models, etc. for new services that leverage Hadoop (MapR) as the core underlying platform for Data Warehouse, DataMarts, ETL vendor integrations and data ingestion (ingress, egress) using the various core Hadoop technologies.
- Provided technical direction to the FloodTheLake (DataLake) initiative and the build of Operational DataSources (ODS), ETL pipelines and DataMarts on the Hadoop ecosystem using MapRfs, Hive, Sqoop, Oozie, etc.
- Integrated Diyotta (ETL) with MapR for data integration, ETL data processing, data pipelines and eventual migration to customer-facing DataMarts.
- Migrated datacenter-based data services to cloud-based PaaS and SaaS enablement for end B2C and B2B clients using best-of-breed Hadoop (MapR) and Cloud Services.
- Integrated Tableau with MapR for the development of custom reporting and visualizations that are executed on the MapR-based ODS and consumed by end clients.
- Executed and evaluated POC for OLAP based data-cube and reporting AtScale integration on MapR (Hive, HBase) cluster.
- Executed and evaluated POC for Machine Learning using H2O on MapR and R based analytics as part of the migration of existing SAS based analytical pipelines.
- Direction to the Infrastructure, Admins, and Network teams for provisioning of virtual hardware on the cloud, data center connectivity and operational readiness.
- Direction to the Administration team on setup of clusters (PROD, QA, DEV) and installation of MapR, Profiles and Security setup, Cluster Management, Key Management, Monitoring, data migration, etc.
- Direction to the Infrastructure and admin teams for Disaster Recovery strategy for Hadoop on Cloud.
Confidential, Wilmington, DE
Sr. Solutions Architect - Bigdata
- Responsible for Enterprise and Systems Architecture, Design and Application development initiatives, strategic roadmap, solution delivery, etc. that align with the overall IT strategy of the Hadoop implementation.
- Worked with Enterprise Architecture Board in creating the technology road map, reference architecture and deriving the project strategy.
- Led the architecture and development efforts for complex real-time event processing from a myriad of internal and external data sources to ingest data into Hadoop and process it in real time for specific use cases of fraud detection, recommendation engines, dashboards and visualization, using case-specific integrated technologies like Spark Streaming, Flume, Kafka, HBase, etc.
- Introduced the Lambda architecture at BarclaycardUS, which combines batch (slow data) and real-time event (fast data) processing to provide the current state of the data.
- Led the application development of Batch Ingestion frameworks using custom MapReduce programs, Hive and HDFS, and used Pig scripting for data cleansing and transformations. Used Oozie as the workflow scheduler for the batch data pipeline.
- Developed the warehouse specific DataLake using Hive and Pig scripting and also ETL pipelines for populating the DataMarts for user/business consumption using Hive and Spark.
- Led the development effort to migrate historical data from existing warehouses to Hadoop using Sqoop for scalable processing, with the eventual insights exported back via Sqoop.
- Direction to Compliance and IT Security groups in designing, documenting and implementing Perimeter Security, Access Management, Auditing and Data Encryption for Hadoop Clusters.
- Direction to Infrastructure teams in setting up CDH 5.x Hadoop Clusters, Configuration Management, Capacity Planning, Disaster Recovery (DR) and Business Continuity (BC) clusters, and Cluster Management among the Prod, Perf, QA and DEV clusters.
- Direction to Admin teams for configuring and implementing Knox, Kerberos, Sentry, Data Encryption solutions on Hadoop Cluster.
- Responsible for evaluation, internal branding, pilot/evaluation and full-scale implementation of 3rd-party vendors (BI, Visualization, Analytics, Machine Learning, ETL, etc.) that can integrate with and leverage Hadoop as a processing platform.
- Responsible for surveying new technologies and their possible integration to build robust, scalable and configurable technology solutions that could be leveraged by new products for Confidential's banking customers.
Confidential, Cambridge, MA
Bigdata Architect/Technical Lead
- Architected and implemented a 50+ TB data application on a 40+ node cluster using the Cloudera (CDH 4.x) stack and different BigData analytic tools.
- Designed Big Data and Hadoop Distributed File System (HDFS) solutions, including estimation of infrastructure needs and effort.
- Worked collaboratively with all levels of business stakeholders to architect, implement and test Big Data based analytical solution from disparate sources.
- Designed and planned data gathering and processing.
- Hands-on experience in installing, configuring and using Hadoop ecosystem components.
- Hands-on experience in working with Hadoop ecosystem applications like Hive, Pig, Sqoop, MapReduce and Oozie.
- Strong knowledge of Hadoop, and of Pig and Hive analytical functions (UDFs).
- Captured data from existing databases that provide SQL interfaces using Sqoop.
- Efficient in building Hive, Pig and MapReduce scripts.
- Migrated huge amounts of data from different databases (e.g. Netezza, Oracle, SQL Server) to Hadoop using relevant connectors.
- Used Solr for indexing and search capabilities on HDFS.
- Experienced in managing and reviewing Hadoop log files.
- Worked on multi-clustered environments and set up the Cloudera Hadoop ecosystem.
- Ability to build clusters on AWS and Rackspace cloud systems.
Environment: Hadoop, MapReduce, HDFS, Hive, Pig, HBase, ZooKeeper, Java (jdk1.6), Custom APIs, Hadoop distribution of HortonWorks & Cloudera, SQL Server 2008/2012, Python, Perl, UNIX Shell Scripting, MySQL, Redhat Linux.
Confidential, Cambridge, MA
Sr. Solutions Consultant/Bigdata
- Worked extensively on importing data using Sqoop and Flume.
- Installed and configured Hadoop, MapReduce and HDFS; developed multiple MapReduce jobs in Java for data cleaning and preprocessing.
- Installed, configured and used Hadoop Ecosystem components.
- Responsible for creating complex tables using Hive.
- Created partitioned tables in Hive for best performance and faster querying.
- Transported data to HBase using Pig.
- Developed workflow in Oozie to automate the tasks of loading the data into HDFS and pre-processing with Pig.
- Worked collaboratively with all levels of business stakeholders to architect, implement and test Big Data based analytical solution from disparate sources
- Involved in source system analysis, data analysis and data modeling for ETL (Extract, Transform and Load).
- Wrote multiple MapReduce programs for data extraction, transformation and aggregation from multiple file formats including XML, JSON, CSV and compressed file formats.
- Handled structured and unstructured data and applied ETL processes.
- Developed Pig Latin scripts to extract the data from the web server output files to load into HDFS
- Developed Pig UDFs to pre-process the data for analysis.
- Developed Hive queries for the analysts
- Prepared Developer (Unit) Test cases and executed Developer Testing.
- Created/Modified shell scripts for scheduling various data cleansing scripts and ETL loading process.
- Supported and assisted QA Engineers in understanding, testing and troubleshooting.
- Wrote build scripts using ant and participated in the deployment of one or more production systems
- Production Rollout Support which includes monitoring the solution post go-live and resolving any issues that are discovered by the client and client services teams.
- Documented operational problems, following standards and procedures, using the reporting tool JIRA.
- Created the Cassandra Advanced Data Modeling course for DataStax.
- Successfully loaded files to Hive and HDFS from Cassandra.
- Worked in a language agnostic environment with exposure to multiple web platforms such as AWS, databases like Cassandra.
- Performed data scrubbing and processing with Oozie.
- Responsible for managing data coming from different sources.
- Gained good experience with NoSQL databases.
- Configured the Cassandra database and its C++ client library, libQtCassandra.
- Worked with large datasets in Hadoop using Hive.
Environment: Apache Hadoop, MapReduce, HDFS, Hive, Java (jdk1.6), SQL, Pig, ZooKeeper, Oozie, Cassandra, Sqoop, flat files, Oracle 11g/10g, MySQL, Windows NT, UNIX.
Confidential, Cambridge, MA
Sr. J2EE Consultant/Team Lead
- As Team Lead, led the team through the Analysis, Design and Development phases, analyzing user requirements, procedures and problems to automate and thereby enhance existing systems.
- Designed and developed Controllers and Forms using Spring framework
- Involved in high level design, Application design, and development and testing.
- Developed the front-end application using the Bootstrap (Model, View, Controller) framework.
- Used Spring framework for implementing Dependency Injection, AOP, Spring ORM
- Consumed Web Services to retrieve data from different applications using the SOAP protocol.
- Responsible for end-to-end design, development and bug fixing.
- Used Maven to build and deploy the application on WebLogic server.
- Used PL/SQL developer for writing the queries.
- Configured Hibernate's second-level cache using EHCache to reduce the number of hits to the configuration table data.
- Used the JUnit, Mockito and PowerMock frameworks for unit testing of the application and Log4j 1.2 to capture logs, including runtime exceptions.
- JUnit was used for unit testing and implementing Test Driven Development (TDD) methodology.
- Used JAX-RPC Web Services using SOAP to process the application for the customer
- Developed Web services for sending and getting data from different applications using SOAP1.1 messages.
- Used the SAX XML parser to parse XML files containing simulated test data.
- Responsible for implementing the transaction management in the application by applying Spring AOP methodology.
- Responsible for analysing, designing, implementing, testing, and maintaining all EDI processes and relationships in the environment.
- Used SVN for version control and used STS as the IDE for developing the application.
- Used Oracle 11g as the backend database on Windows OS. Involved in development of Stored Procedures.
- Integrated the application with Spring Quartz framework
- Used the ORM tool Hibernate 4 to represent entities and tune fetching strategies for optimization.
- Wrote Oracle Stored Procedures and Functions for the application.
Environment: JDK 1.7, Oracle 11g, Struts 1.3, Hibernate 3.5, Spring 3.0, JUnit, Maven, Web Service, HTML, jQuery, SVN, IntelliJ and WebLogic
Confidential, Cambridge, MA
Sr. J2EE Consultant
- Involved in requirement analysis for the guarantees and approval/routing modules; actively participated in design and technical discussions.
- Designed and developed various modules.
- Conducted requirements meetings to collect requirements from clients.
- Developed common utilities and framework-related functionality on the UI Base System.
- Applied UI Business Process knowledge to develop system enhancements requested by the agency due to the changing nature of federal and state laws.
- Prepared Functional Design and Technical Design Documents.
- Developed actions and models encapsulating the business logic.
- Developed and maintained the data layer using the ORM framework Hibernate.
- Applied J2EE Design Patterns such as Factory, Singleton, Business Delegate, DAO and DTO.
- Provided Log4j support to the application for debugging the system.
- Built PL/SQL functions and stored procedures.
- Used Clear Case for maintaining version control and synchronizing changes.
- Analyzed and designed legacy data migration using DMT.
- Implemented database design using Embarcadero.
- Designed and Implemented Batches using Quartz Batch Framework.
- Created work items using Drools engine.
- Participated in Unit Testing and Integration Testing.
- Participated in the production support and maintenance of the project.
- Prepared business use case documents by studying the business processes of each business area.
Environment: Core Java, Java 1.5, XML, Struts 1.2, Struts 2.0, Hibernate 3.0, AJAX, Log4j, ANT, JavaBeans, Spring, Drools, DB2 9.1, JUnit, jQuery, JMS, EJB, UML, PL/SQL, Rational WebSphere 6.1, JBoss 5.1, LDAP Directory Server on Linux for authentication and authorization, ClearCase version control, REST, Eclipse Galileo, Squirrel SQL client 3.2.1
Confidential, Cleveland, OH
Sr. Java/J2EE Developer
- Designed the user interfaces using JSP.
- Developed Custom tags, JSTL to support custom User Interfaces.
- Developed the application using Struts Framework that leverages classical Model View Controller (MVC) architecture.
- Implemented Business processes such as user authentication, Account Transfer using Session EJBs.
- Used WSAD 5.1.2 for writing code for JSP, Servlets, Struts and EJBs.
- Deployed the applications on IBM WebSphere Application Server.
- Used Java Messaging Services (JMS) and Backend Messaging for reliable and asynchronous exchange of important information such as payment status report.
- Developed the Ant scripts for preparing WAR files used to deploy J2EE components.
- Used JDBC for database connectivity to Oracle 8i.
- Wrote PL/SQL in Oracle Database for creating tables, triggers and select statements.
- Improved code reuse and performance by making effective use of various design patterns such as Singleton, Session Façade, Value Object, etc.
- Involved in JUnit Testing, debugging, and bug fixing.
- Used Log4j to capture the log that includes runtime exceptions and developed WAR framework to alert the client and production support in case of application failures.
Environment: Java 1.4, J2EE 4.0, JSP 2.0, Struts 1.1, EJB 2.0, JMS, JNDI, Oracle 8i, HTML, XML, WSAD 5.1.2 (WebSphere Studio Application Developer), LDAP, IBM WebSphere Application Server 5.1.2, Ant, CVS, Log4j.
- Developed Action forms & Action Classes.
- Involved in Coding Server pages using JSP.
- Developed Servlets to process the business logic.
- Developed the project using Struts Framework.
- Developed Java Beans, JSPs and Servlets.
- Involved in writing SQL queries.
- Created Java Beans for transactions between JSP pages and EJBs.
- Developed customized Tag Libraries for use in the JSP pages.
- Performed client-side validations using JavaScript.
- Developed web pages and reports using J2EE.
Environment: Struts, JSP, Servlets, EJB, JDBC, Oracle and WebLogic Application Server.