Tech Lead Resume
SUMMARY
- 15+ years of IT experience in analysis, design, development, implementation, and testing of software applications, including 4 years of Big Data development and architecture using Hadoop, MapReduce, HDFS, Hive, Spark, Kafka, Pig, Sqoop, Oozie, Flume, ZooKeeper, CDH4, and Avro.
- Extensive experience in the Banking, Financial Services, Financial Compliance, and Retail POS domains.
- Capable of processing large sets of structured, semi-structured, and unstructured data and supporting systems and application architecture.
- Excellent understanding of Hadoop architecture and its components including storage management.
- Experienced in all phases of Data Modeling - Conceptual, Logical and Physical modeling.
- Experience in developing entity-relationship diagrams and modeling transactional databases in 3NF using tools such as Erwin.
- Experience in creating HIVE tables, loading with data and writing HIVE queries.
- Experience working with data sources such as Avro files, XML files, JSON files, SQL Server, MySQL, and Oracle to load data into Hive tables.
- Experience with Spark Streaming and Spark SQL (in Scala, Python, and Java).
- Worked on performance tuning of Hadoop jobs using techniques such as map-side joins, partitioning, and bucketing (a Spark sketch of these techniques follows this summary).
- Experienced in installing and configuring Spark/Hadoop clusters and ecosystem components.
- Experience in importing/exporting data to/from HDFS.
- Experience in shell scripting and Pig scripting.
- Experience with the Oozie workflow scheduler for running MapReduce and Pig jobs.
- Experienced with NoSQL databases such as HBase and MongoDB.
- Experience in managing Hadoop clusters and services using Cloudera Manager.
- Experienced in leading teams and coordinating with offshore teams.
- Experience in importing and exporting data between HDFS and relational database management systems using Sqoop.
- Collected log data from various sources and integrated it into HDFS using Flume.
- Excellent understanding of cloud (Amazon S3, Azure) and virtualization (VMware and Hyper-V) technologies, with experience setting up a proof-of-concept multi-node virtual cluster.
- Experience in loading data into HDFS from Linux (Ubuntu, Fedora, CentOS) file systems.
- Knowledge of hardware, software, networking, and external tools.
- Solid understanding of and experience with high-volume, high-performance systems.
- Involved in the complete project life cycle, including requirements gathering, analysis, design, development, testing, and deployment.
- Experienced in Agile and Waterfall product development methodologies.
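A minimal, illustrative Spark/Scala sketch of the partitioning, bucketing, and map-side (broadcast) join techniques referenced above; the table names, column names, and HDFS paths are assumptions for illustration only and are not taken from any project described below.

    import org.apache.spark.sql.SparkSession
    import org.apache.spark.sql.functions.broadcast

    object JoinTuningSketch {
      def main(args: Array[String]): Unit = {
        val spark = SparkSession.builder()
          .appName("join-tuning-sketch")
          .enableHiveSupport()                 // read/write tables registered in the Hive metastore
          .getOrCreate()

        // Hypothetical raw fact data already landed in HDFS as Parquet
        val sales = spark.read.parquet("hdfs:///data/raw/sales")

        // Partitioning + bucketing: partition by load date, bucket by the join key
        sales.write
          .partitionBy("load_date")
          .bucketBy(32, "customer_id")
          .format("parquet")
          .mode("overwrite")
          .saveAsTable("sales_fact")

        // Map-side join: broadcast the small dimension table so the large
        // fact table is joined without shuffling it
        val customers = spark.table("customer_dim")   // small lookup table (assumed to exist)
        val enriched = spark.table("sales_fact")
          .join(broadcast(customers), Seq("customer_id"))

        enriched.write.mode("overwrite").saveAsTable("sales_enriched")
        spark.stop()
      }
    }

Broadcasting the small dimension keeps the join on the map side, so the large bucketed fact table is never shuffled.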
TECHNICAL SKILLS
Operating Systems: Windows, Linux, UNIX
Hadoop Ecosystem: HDFS, MapReduce, Spark, Kafka, Hive, HBase, Pig, Sqoop, ZooKeeper, Oozie, Flume, Cron
Hadoop Distributions: Cloudera 5.4.3, Databricks
Database/Tools: Oracle, MS SQL, MySQL, Sybase, PL/SQL Developer, Toad, DB Artisan
Scripting Tools: UNIX Shell Scripting, Pig Latin
Programming Languages: Scala, Java, Python
IDE / Testing Tools: Eclipse, IntelliJ, JUnit, ScalaTest
Data Modeling Tools: Erwin, PowerDesigner
Application Servers: Apache Tomcat, JBoss
Project Management Tools: Microsoft Project, Pivotal Tracker, BugZilla, Zendesk, Jira, Confluence
Source Control/Build Tools: SVN, Git, Perforce, Maven, SBT
Virtualization/Cloud Technologies: VMware ESX, Microsoft Hyper-V, Amazon S3, Microsoft Azure, Google Cloud
PROFESSIONAL EXPERIENCE
Confidential
Tech Lead
Technologies: Java, Oracle, Apache Hadoop, Spark, Scala, Kafka, IntelliJ, SBT, Python, SparkR, Event Hub, Data Lake, Oozie
Responsibilities:
- Implemented Kafka and Spark Streaming to receive real-time inventory data from the ECD data source (TIBCO KARIBA pipeline).
- Built a custom Kafka consumer to process data and persist it into Cassandra NoSQL storage (see the streaming sketch at the end of this section).
- Created a Hive data store for loyalty data.
- Created Spark batch jobs to move data from Event Hubs to the anomaly detection engine.
- Developed Spark Streaming, Spark SQL, and batch jobs in Scala.
- Implemented Kafka message consumption for the Spark Streaming job.
- Created Spark SQL queries to reduce response times.
- Implemented Hive queries for analyzing data using HiveQL.
- Converted data to Parquet format and read it back from Parquet.
- Orchestrated workflow jobs using the Oozie Workflow Engine.
- Microservices / applications migrated to GCP:
- MSP Content - provides the Product Content Service
- Fast Common Catalog (FCC) - Catalog Service
- Apollo / Discover - search and browse
- Built Azure Data Lake Analytics:
- Led design and development of the data ingestion layer (Sqoop/Hadoop/Hive/Spark/Data Lake)
- Led design and development of Spark processing jobs
- Led design and development of Hive schemas
- Led design and development of the enterprise data pipeline, processing Kafka/Spark streams into Cassandra data models.
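The streaming sketch referenced above: a minimal Spark Structured Streaming example in Scala of the Kafka-to-Cassandra pattern, assuming the spark-sql-kafka and spark-cassandra-connector packages are on the classpath; the broker address, topic name, message schema, keyspace, and table are hypothetical stand-ins, not the project's actual values.

    import org.apache.spark.sql.{DataFrame, SparkSession}
    import org.apache.spark.sql.functions.{col, from_json}
    import org.apache.spark.sql.types.{DoubleType, StringType, StructType, TimestampType}

    object InventoryStreamSketch {
      def main(args: Array[String]): Unit = {
        val spark = SparkSession.builder().appName("inventory-stream-sketch").getOrCreate()

        // Assumed JSON layout of the inbound inventory messages
        val schema = new StructType()
          .add("sku", StringType)
          .add("store_id", StringType)
          .add("qty_on_hand", DoubleType)
          .add("event_time", TimestampType)

        // Consume the (hypothetical) inventory topic from Kafka
        val raw = spark.readStream
          .format("kafka")
          .option("kafka.bootstrap.servers", "broker1:9092")
          .option("subscribe", "inventory-events")
          .load()

        val events = raw
          .select(from_json(col("value").cast("string"), schema).as("e"))
          .select("e.*")

        // Persist each micro-batch to Cassandra via the DataStax connector
        val query = events.writeStream
          .foreachBatch { (batch: DataFrame, _: Long) =>
            batch.write
              .format("org.apache.spark.sql.cassandra")
              .options(Map("keyspace" -> "retail", "table" -> "inventory"))
              .mode("append")
              .save()
          }
          .option("checkpointLocation", "hdfs:///checkpoints/inventory")
          .start()

        query.awaitTermination()
      }
    }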
Confidential
Consultant / Big Data Architect
Technologies: Java, Oracle, Apache Hadoop, Spark, Scala, Kafka, IntelliJ, SBT, Python, SparkR, Event Hub, Data Lake, Oozie
Responsibilities:
- Implemented Spark Streaming to receive real-time financial and securities market data from the push agents and write it to Event Hub.
- Handled importing data from source systems for data cleansing and transformation.
- Created Spark batch jobs to move data from Event Hubs to the anomaly detection engine.
- Developed Spark Streaming, Spark SQL, and batch jobs in Scala.
- Implemented Kafka message consumption for the Spark Streaming job.
- Created Spark SQL queries to reduce response times.
- Implemented Hive queries for analyzing data using HiveQL.
- Converted data to Parquet format and read it back from Parquet.
- Orchestrated workflow jobs using the Oozie Workflow Engine.
- Worked on creating RDDs and DataFrames and applied various transformations and actions (see the batch sketch at the end of this section).
- Defined the data flow within the Hadoop ecosystem and guided the team through the implementation.
- Developed custom UDFs and implemented Pig scripts.
- Supported data analysts in running Hive queries for further anomaly analysis.
- Specified the cluster size, resource pool allocation, and Hadoop distribution by writing specification texts in JSON and Parquet file formats.
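The batch sketch referenced above: a minimal Spark/Scala example of DataFrame transformations and actions together with the Parquet conversion step; the input path, column names, and output path are illustrative assumptions rather than project specifics.

    import org.apache.spark.sql.SparkSession
    import org.apache.spark.sql.functions.{avg, col}

    object MarketDataBatchSketch {
      def main(args: Array[String]): Unit = {
        val spark = SparkSession.builder().appName("market-data-batch-sketch").getOrCreate()

        // Hypothetical raw market-data extract landed in HDFS as JSON
        val raw = spark.read.json("hdfs:///data/raw/market_data")

        // Transformations are lazy: filter bad rows, derive a column, aggregate
        val cleaned = raw
          .filter(col("price") > 0 && col("symbol").isNotNull)
          .withColumn("notional", col("price") * col("quantity"))

        val bySymbol = cleaned.groupBy("symbol").agg(avg("notional").as("avg_notional"))

        // Actions trigger execution
        println(s"clean rows: ${cleaned.count()}")
        bySymbol.show(10)

        // Convert to Parquet and read it back, as in the Parquet bullet above
        cleaned.write.mode("overwrite").parquet("hdfs:///data/curated/market_data_parquet")
        val reread = spark.read.parquet("hdfs:///data/curated/market_data_parquet")

        // The same data can also be handled through the lower-level RDD API
        val symbolsRdd = reread.select("symbol").rdd.map(_.getString(0)).distinct()
        println(s"distinct symbols: ${symbolsRdd.count()}")

        spark.stop()
      }
    }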
Confidential
Data Architect / Director Software Development
Responsibilities:
- Designed, architected, and developed a high-performing data analytics platform using Hadoop backend data processing with 1 master and 4 data nodes, with Spark processing for sanctions lookups, web-scraped data, and OPEC data feeds for campaign checks and restricted-names lookups. Automated data loading into the Hadoop Distributed File System and used Pig to pre-process the data. Created Hive queries to load the Cassandra and EDW databases, and created analytical reports using Tableau Desktop. Used MongoDB for log data for troubleshooting and analytical purposes.
- Contributed to task identification, work-effort estimates, and work schedules for development and maintenance activities. Developed weekly and monthly upper-management reports using the Tableau Desktop BI tool.
- Performed web development, template development, testing, debugging, integration, documentation, and deployment in accordance with industry best practices.
- Developed workflow designs and analyzed existing production and implementation processes to ensure scalable solutions.
- Participated in the analysis, definition, and scoping of efficient, cost-effective application solutions.
- Worked with internal departments (business development, database development, web design, and end-user training) to provide the deliverables required for successful completion of development and maintenance assignments.
- Researched new technologies and provided ideas for technical and workflow/process improvements.
- Designed and developed Exceptions, AML, Pre-Trade Approval, and various business-critical reports using Crystal Reports (BOXI); wrote many critical PL/SQL procedures, triggers, and functions, and optimized queries.
- Designed, developed, and implemented a generic trade/position file and data translation engine (ETL) to feed data into the proprietary compliance calculation engine.
- Stabilized the custom integration process by directing the development and QA teams. Created and documented multiple integration processes for client onboarding. Instrumental in conducting effective sprints and scrum calls.
- Developed highly scalable, near real-time exception/violation data metrics, reports, analytics, and UI dashboards for various modules.
- Explored the AWS and Azure cloud platforms for MS SQL Server; conducted a feasibility study for migrating the SQL database to AWS/Azure.
- Implemented AWS CloudWatch and CloudFront (CDN); deployed JavaScript, resource files, images, and JSON data files.
- Conducted operational efficiency reviews, turning tactical solutions into strategic solutions.
- Developed a technology road map to meet high demand, including system decoupling, microservices, and distributed architecture.
- Implemented Spark Streaming to receive real-time data.
- Created Spark SQL queries to reduce response times.
- Developed Pig Latin scripts to extract data from web server output files and load it into HDFS (a Spark-based sketch of this flow follows this section).
- Used the Oozie Workflow Engine to run workflow jobs with actions that launch Hadoop MapReduce and Pig jobs.
- Handled importing of data from various data sources and performed transformations using Pig and Hive.
- Created automated jobs for extracting data from data sources such as SQL Server and MySQL into HDFS using Sqoop.
- Responsible for defining the data flow within the Hadoop ecosystem and guiding the team through the implementation.
- Specified the cluster size, resource pool allocation, and Hadoop distribution by writing specification texts in JSON file format.
- Designed and built many applications to handle the vast amounts of data flowing through multiple Hadoop clusters, using Pig Latin and Java-based MapReduce.
- Worked on creating DataFrames in Spark 1.3 and documented RDD performance.
- Wrote MapReduce Java programs to analyze log data for large-scale data sets.
- Developed MapReduce applications to retrieve data from HDFS and store it to HBase and Hive.
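A Spark-based sketch of the log-ingestion flow described above (shown here as an alternative to the Pig Latin scripts): it reads raw web server output files from HDFS, parses Common Log Format fields, and writes them to a Hive table for downstream HiveQL analysis. The HDFS path, log pattern, and table name are illustrative assumptions.

    import org.apache.spark.sql.SparkSession
    import org.apache.spark.sql.functions.regexp_extract

    object AccessLogLoadSketch {
      def main(args: Array[String]): Unit = {
        val spark = SparkSession.builder()
          .appName("access-log-load-sketch")
          .enableHiveSupport()
          .getOrCreate()
        import spark.implicits._

        // Hypothetical location of web server output files already copied to HDFS
        val lines = spark.read.text("hdfs:///data/logs/webserver/")

        // Common Log Format fields pulled out with regexp_extract
        val logPattern = """^(\S+) \S+ \S+ \[([^\]]+)\] "(\S+) (\S+) \S+" (\d{3}) (\S+)"""
        val parsed = lines.select(
          regexp_extract($"value", logPattern, 1).as("client_ip"),
          regexp_extract($"value", logPattern, 2).as("log_time"),
          regexp_extract($"value", logPattern, 3).as("method"),
          regexp_extract($"value", logPattern, 4).as("path"),
          regexp_extract($"value", logPattern, 5).as("status"),
          regexp_extract($"value", logPattern, 6).as("bytes"))

        // Persist as a Hive table for analysts to query with HiveQL
        parsed.write.mode("overwrite").saveAsTable("web_access_events")
        spark.stop()
      }
    }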
Confidential
Application Architect / Product Manager
Responsibilities:
- Developed login/logout, login status, show matching, criteria list, and Broker and Custodian setup screens.
- Developed alert/notification screens based on tolerance levels.
- Standardized development on Laravel 5 to provide a high-quality, low-cost, rapid development framework. Designed and reviewed use-case specification documents for various modules, developed functions and classes for various modules, and documented the deployment process.
- Technologies used: Java, PHP, MySQL, JavaScript, HTML, CSS, Smarty Template Engine, XML parser, MS SQL, Ajax, jQuery, Eclipse, DOM, Zend Studio, Web Services, SOAP, EditPlus, MS Windows, IIS, XAMPP.
- The system improved all aspects of hospitality operations: inventory, timesheets, payroll, and customer satisfaction.
- Implemented proprietary gift card and discount/loyalty card features.
- Built configurable credit card processing interfaces to several merchant payment gateways.
- Optimized the system to run on any available third-party hardware.
- Implemented AWS RDS for a centralized reporting system for multi-location business owners.
- Developed the back-end administration web application in Java, PHP, MySQL, jQuery, Bootstrap, HTML, CSS, and the Laravel 5.x framework.
- Built Menu Category, Group, Menu, User Management, Reference Data, Happy Hour & Discounts, and Gift Card Inventory data entry screens.
- BaselineKPI: Created an operational workflow web application to track service calls, performance indicators, job management, onboarding, billing, maintenance contracts and receivables, customer relationships, and inventory management, using Java, MySQL, jQuery, Bootstrap, HTML, CSS, and the Laravel 5.x framework.
Confidential
Solution Architect / Consultant
Responsibilities:
- Developed a parallel process for new and old VAL reports.
- Developed Dodd-Frank Volcker US Base Metrics project requirements and specifications.
- Built, mentored, and managed a 5-member team.
- Developed the system data process to calculate Volcker metrics in both real-time and scheduled batch modes.
- Integrated seamlessly with global systems. Explored SQR for accessing, manipulating, and reporting on enterprise data spread across heterogeneous systems. Built complex procedures that perform multiple calls to multiple data sources and implement nested, hierarchical, or object-oriented program logic for KPI reports.
- Designed and developed a reporting database for management reports and produced various performance metrics, including VaR and Stress VaR, Risk Factor Sensitivities, Risk and Position Limits, Comprehensive P&L Attribution, Inventory Risk Turnover, Inventory Aging, Customer-Facing Trade Ratio, Comprehensive P&L, Portfolio P&L, Fee Income & Expense, Spread P&L, VaR Exceedance, Volatility of Comprehensive and Portfolio P&L, Comprehensive and Portfolio P&L to Volatility Ratio, Unprofitable Trading Days, Skewness & Kurtosis of Portfolio P&L, and Pay-to-Receive Spread Ratio (a simplified metric sketch follows this section).
- Developed the ETL process to load the analytical data into the QlikView BI system; provided support and ongoing enhancements.
- Mentored the QA team in developing test cases and UAT processes.
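A simplified, hypothetical Spark/Scala sketch of how one of the listed metrics (Unprofitable Trading Days) could be computed from a daily P&L table; the input path and column names are assumptions for illustration, and this is neither the regulatory definition nor the project's actual implementation.

    import org.apache.spark.sql.SparkSession
    import org.apache.spark.sql.functions.{col, count, sum, when}

    object UnprofitableDaysSketch {
      def main(args: Array[String]): Unit = {
        val spark = SparkSession.builder().appName("unprofitable-days-sketch").getOrCreate()

        // Hypothetical daily P&L table: one row per desk per trading day
        val dailyPnl = spark.read.parquet("hdfs:///data/metrics/daily_pnl")

        // Per desk: trading days observed and days with negative comprehensive P&L
        val metric = dailyPnl
          .groupBy("desk")
          .agg(
            count(col("trade_date")).as("trading_days"),
            sum(when(col("comprehensive_pnl") < 0, 1).otherwise(0)).as("unprofitable_days"))

        metric.show()
        spark.stop()
      }
    }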