Solution Architect Resume
San Diego, CA
SUMMARY:
- A senior IT professional with 17+ years of experience in delivery management and practice management across the Big Data & Analytics, Data Warehouse, Data Lake, Data Engineering, and Data Science domains.
- Currently working as a Solution Architect for Data Engineering and Data Science.
- Responsible for the development and maintenance of data infrastructure and customer analytics capabilities to support the information needs of business leaders.
- Develop and deliver key initiatives in the areas of data governance, data management, Data Lake, and customer analytics roadmaps in line with future business goals.
- Experienced in providing technical and strategic consulting to large-scale Data Engineering and Data Science programs.
- Proven track record of setting up and guiding offshore development centers and managing Big Data & Analytics, Data Lake implementations (Hadoop/Cloud), Data Engineering, and Data Science programs.
- Drive global Big Data and Analytics programs end to end, with a focus on building corporate competencies and providing strategic direction.
- Adept at managing multiple programs in an onsite-offshore agile delivery model and building Centers of Excellence for Big Data & Predictive Analytics, Machine Learning, Business Intelligence, and Data Integration.
- Strong working knowledge of data engineering and Data science technologies.
- Experience working with modern cloud data lake architectures (Azure and AWS).
- Experience in delivering insights through Analytics tools such as Tableau & Qlikview.
- Experience in leading cross-functional teams to develop advanced analytics, machine learning and reporting systems.
- Working knowledge of Cloud technologies in Big Data and Data Science space.
- Strong domain experience in Banking, Insurance, Retail and Telecom.
- Proactively recommend improvement opportunities to stakeholders through a customer-focused approach to analytics and the delivery of simple, effective solutions to complex problems.
- Drive continuous improvement of delivery and meet delivery SLAs.
- Lead and supervise Big Data platform, Business Intelligence, and predictive and statistical analytics programs.
- Provide architectural solutions, leadership, consulting, and information expertise to achieve the goals of the organization's Data Strategy and Customer Analytics.
- Redesigned and improved application performance by introducing Big Data, open-source software, and robust design principles (scalability, high availability, and higher throughput), resulting in hard-dollar savings.
- Oversee data strategy and customer analytics projects to ensure they align with the customer's overall long-term business strategy.
- Direct the delivery of a complex data and analytics project portfolio in close collaboration with business and Enterprise Architecture teams, including the management and implementation of data warehouse solutions.
- Strengthen the distributed delivery network and focus on constant business improvement through innovation.
- Leverage agile development methodologies and tools to increase the throughput and success of development efforts.
- Enable continuous integration and delivery (CI/CD) by implementing agile and DevOps principles to deliver business value faster.
- Provide subject matter expertise to drive a step change in the data space by improving delivery speed and reducing cost while simplifying the data environment.
- Work closely with business sponsors and stakeholders to drive execution of data strategies and solutions, ensuring portfolio projects are delivered on time, on budget, and with high quality.
- Partner with business stakeholders to build Data Strategy, Advanced Analytics and Reporting roadmaps for future years.
- Responsible for the design and delivery of predictive and statistical analytics platforms, machine learning applications, and solutions.
- Architected a next-generation EDW for banking and e-commerce customers handling streaming data using Kafka, Hadoop, and NoSQL, and provided data security for PII with data encryption.
- Designed and implemented data compression techniques with file formats such as Parquet with Snappy and ORC, along with data archival and retention mechanisms.
- Steered efforts to highlight company products by building proactive POC models, benchmarking various components for customers, and providing roadmaps for implementation and integration services.
- Developed predictive models using machine learning algorithms such as Decision Tree, Random Forest, Naive Bayes, Logistic Regression, Cluster Analysis, XGBoost, KNN, SVM, Linear Regression, and k-means (see the illustrative sketch at the end of this summary).
- Skilled in Regression and Classification Modeling, Correlation, Multivariate Analysis, and application of Statistical Concepts.
- Architected a next-generation EDW for banking and e-commerce customers using Hadoop, Hive, and NoSQL, and added intelligence with deep learning and neural networks.
- Proven ability to lead and manage data strategy, business intelligence, and customer analytics programs.
- Practical, intuitive problem solver with a demonstrated ability to translate business objectives into actionable data tasks and translate quantitative analysis into actionable business strategies.
- Ability to work collaboratively with a variety of business, data science and technical stakeholders.
- Excellent program management skills, with the ability to work independently and manage multiple projects.
- Ability to create a work environment that emphasizes innovation, empowerment, teamwork, continuous improvement, and a passion for providing service.
- Leads by example demonstrating self-confidence, energy and enthusiasm.
- Commitment to creating a positive, respectful atmosphere to attract and retain existing and future talent.
- Develops and maintains effective relationships with internal and external stakeholders across the organization, and fosters a positive climate to build effective teams committed to organizational goals and initiatives. Developed one of the largest healthcare data warehouse systems and, under my leadership, successfully delivered large data integration programs, decommission programs, and a data lake for the organization, using product management principles and agile practices.
- Champion embedding data and analytics in daily decision making. Manage and supervise the risks that exist in the business area, ensuring mechanisms are in place to identify, report, manage, and mitigate risk. Initiated a Smart Operations innovation in which systems learn from failures, apply resolutions, and predict abends to avoid potential failures (enabled by ML and data science), resulting in a 70% cost reduction. Implemented project plans within preset budgets and deadlines in cross-country, cross-time-zone, multi-vendor environments.
- Strong problem-solving and technical skills coupled with decision-making that enables effective solutions, leading to customer satisfaction and low operational costs.
- Prepare project status presentations for senior management and ensure that best practices, lessons learned, and standards are documented, communicated, and implemented.
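Below is a minimal sketch of the kind of classification modeling referenced in the summary bullets above. It uses scikit-learn on synthetic data, so the dataset, feature count, and hyperparameters are purely illustrative assumptions, not work from any specific engagement.

    # Minimal sketch: train and compare two of the listed classifiers with scikit-learn.
    # The data is synthetic; in practice the features would come from the curated data lake.
    from sklearn.datasets import make_classification
    from sklearn.ensemble import RandomForestClassifier
    from sklearn.linear_model import LogisticRegression
    from sklearn.metrics import roc_auc_score
    from sklearn.model_selection import train_test_split

    # Synthetic stand-in for customer-level features (tenure, usage, spend, etc.).
    X, y = make_classification(n_samples=5000, n_features=20, n_informative=8, random_state=42)
    X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)

    # Compare a simple baseline against an ensemble model on held-out data.
    models = {
        "logistic_regression": LogisticRegression(max_iter=1000),
        "random_forest": RandomForestClassifier(n_estimators=200, random_state=42),
    }
    for name, model in models.items():
        model.fit(X_train, y_train)
        auc = roc_auc_score(y_test, model.predict_proba(X_test)[:, 1])
        print(f"{name}: test AUC = {auc:.3f}")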
MY FOCUS AREA:
Data sourcing/ingestion, data processing, Big Data & predictive analytics, Data Lake implementation (Hadoop/Cloud), data governance, predictive and statistical modelling, machine learning, Data Science, and real-time dashboards.
TECHNICAL SKILLS:
Hadoop Ecosystem: Hadoop, HDFS 2, Hive, Pig, MapReduce/YARN, Impala, Oozie, Sqoop, Flume, Spark, Kafka, Splunk.
Big Data Platforms: Hortonworks, Cloudera, Amazon AWS, MapR, Platfora.
Data Science: Apache Spark, MATLAB, scikit-learn, TensorFlow, and Amazon SageMaker.
Analytical Tools: Spark SQL, R, Python, Snowflake, Redshift, Tableau, and Elasticsearch (ELK).
Data Modeling Tools: Erwin r9.6/r9.5/r9.1/8.x, ER Studio, and Oracle Designer.
OLAP Tools: Tableau, SAP BO, SSAS, Business Objects, QlikView, and QuickSight.
ETL Tools: SSIS, DataStage, Informatica PowerCenter, AWS Glue.
Programming Languages: Java, SQL, T-SQL, UNIX shell scripting, PL/SQL, PySpark, Scala.
Database Tools: Microsoft SQL Server 2014/2012/2008/2005, Teradata, PostgreSQL, Netezza, Oracle, Redshift, Greenplum.
NoSQL: MongoDB, Cassandra, DynamoDB.
Reporting Tools: Tableau and QlikView.
Cloud Technologies: AWS (EC2, S3, EBS, VPC, ELB, RDS, DynamoDB, IAM, CloudWatch, EMR, Lambda, Data Pipeline), GCP, Azure.
PROFESSIONAL EXPERIENCE:
Confidential, San Diego, CA
Solution Architect
Responsibilities:
- Designing and estimating the functional and technical architecture against several benchmarks.
- Responsible for identifying the components to capture various data formats, including streaming data from sensors, RDBMS data, and CSV files.
- Designing and implementing data compression techniques with file formats such as Parquet with Snappy and ORC, along with data archival and retention mechanisms.
- Configuring the Kafka cluster to capture streaming data, and SAP Vora to capture inventory information and CSV files from other sources.
- Developed business logic in PySpark to apply transformations and load the transformed data into the S3 data lake, ensuring it serves as the single source of truth across the enterprise (a representative sketch appears after this role's environment list).
- Creating the data pipeline to automate and run these processes at scheduled intervals, and responsible for a tightly integrated solution without any data loss.
- Providing data access to other teams with read-only permissions, allowing them to run ad hoc SQL queries.
- Designing the data load to Redshift and building the data model and association rules in Redshift.
- Responsible for meeting the business team's regular warehousing needs for decision making.
- Developing predictive analytics, sales forecasting, and early-alerting models by applying data science and machine learning algorithms to the data in the data lake, and building the visualizations the teams need.
- Delivering plug-and-play applications that run end to end on a serverless architecture and are mature enough to support business decision making.
- Integrated Git with AWS CodePipeline to automate the code checkout process, and defined the release process and policy for projects early in the SDLC.
- Used AWS CodePipeline for CI/CD, build-failure alerts, and management of machine failures.
- Carried out builds and deployments across environments using Puppet, including configuration management of hosted instances within AWS.
- Utilized CloudFormation and Puppet to create DevOps processes for a consistent and reliable deployment methodology.
- Used Airflow and AWS Glue to automate and schedule jobs.
- Worked closely with QA and testing teams on automation, testing, and builds, and on fixing errors during the deployment and release phases.
- Deployed and monitored scalable infrastructure on Amazon Web Services (AWS), with configuration management using Puppet.
- Managed user accounts (IAM) and AWS services including RDS, Route 53, VPC, DynamoDB, SES, SQS, SNS, ELB, EC2, and S3.
- Hands-on experience setting up databases in AWS using RDS, storage using S3 buckets, and configuring instance backups to S3 to ensure fault tolerance and high availability.
- Experience managing and maintaining IAM policies for organizations in AWS to define groups, create users, assign roles, and define rules for role-based access to AWS resources.
- Providing an early-alerting mechanism to the concerned teams by defining the rule engine and implementing the business logic in ETL on a serverless architecture, using various models to anticipate system failures.
Environment: AWS (EC2, S3, VPC, IAM, EBS, RDS, CloudFormation, CloudWatch, ELB, SNS, SQS, Lambda, Glue), Kafka, Spark, Git, AWS CodePipeline, Jenkins, Ansible, Puppet, Nagios, Python.
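A minimal PySpark sketch of the transform-and-load pattern described in this role (raw files curated into Snappy-compressed Parquet on S3). The bucket paths, column names, and filter logic are illustrative assumptions, not the actual production pipeline.

    # Illustrative PySpark job: read raw CSV, apply a simple transformation, and
    # write Snappy-compressed Parquet to a curated S3 data lake path (paths are hypothetical).
    from pyspark.sql import SparkSession
    from pyspark.sql import functions as F

    spark = SparkSession.builder.appName("raw_to_datalake").getOrCreate()

    # Raw zone: CSV files landed from source systems (hypothetical bucket/prefix).
    raw_df = spark.read.option("header", "true").csv("s3://example-raw-zone/sensors/")

    # Example transformation: cast the reading, drop bad records, add a load-date partition column.
    curated_df = (
        raw_df
        .withColumn("reading", F.col("reading").cast("double"))
        .filter(F.col("reading").isNotNull())
        .withColumn("load_date", F.current_date())
    )

    # Curated zone: partitioned, Snappy-compressed Parquet acting as the single enterprise copy.
    (
        curated_df.write
        .mode("append")
        .partitionBy("load_date")
        .option("compression", "snappy")
        .parquet("s3://example-data-lake/curated/sensors/")
    )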
Confidential, Alpharetta, GA
Principal Big Data Engineer
Responsibilities:
- Provided the functional and technical architecture for implementing the business requirements, and calculated data sizing, replication, and data process flow.
- Installed agents to integrate signal and log data from different regions, brought the streaming data to the processing layer, and defined the business fields for further analysis.
- Converted unstructured data to a structured format, stored it in Hive tables, and ran SQL queries for regular analytics.
- Generated ratings for each program by region and time.
- Used Hive-Spark integration for better storage, performance, and query tuning (see the sketch after this role's technology list).
- Integrated Hive/Spark SQL with Tableau and built the visualization layer.
- Provided historical data analysis and built models for predictions.
- Implemented a continuous delivery framework using Jenkins in a Linux environment, including build automation and build pipeline development.
- Enhanced overall performance by increasing the instance class of specific instances.
- Implemented high availability using Elastic Load Balancing to balance traffic across instances in a multi-node cluster.
Technologies Used: Flume, Kafka, PySpark, Hive, Spark shell (Scala), Spark SQL, MLlib, Tableau/QlikView, AWS Data Pipeline, Glue, API Gateway, CI/CD, ELB.
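A short sketch of the Hive-Spark SQL pattern used above for program ratings; enableHiveSupport exposes Hive metastore tables to Spark SQL. The table and column names (viewership_events, viewer_id, and so on) are hypothetical placeholders for the actual schema.

    # Illustrative Spark-on-Hive aggregation: program ratings by region and hour.
    from pyspark.sql import SparkSession

    spark = (
        SparkSession.builder
        .appName("program_ratings")
        .enableHiveSupport()  # lets Spark SQL query tables registered in the Hive metastore
        .getOrCreate()
    )

    ratings = spark.sql("""
        SELECT program_id,
               region,
               date_format(event_time, 'yyyy-MM-dd HH:00') AS hour_bucket,
               COUNT(DISTINCT viewer_id)                    AS viewers
        FROM   viewership_events
        GROUP  BY program_id, region, date_format(event_time, 'yyyy-MM-dd HH:00')
    """)

    # Persist the aggregate back to Hive so the Tableau layer can read it directly.
    ratings.write.mode("overwrite").saveAsTable("analytics.program_ratings_hourly")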
Confidential, Dallas, TX
Bigdata Consultant
Responsibilities:
- Responsible for data strategy and implementation. As part of this transformation, the data platform was implemented to ingest and aggregate data from multiple customer interaction points such as mobile, social media, stores, e-commerce sites, and third-party partners.
- Drove the architectural implementation and roadmap for delivering value through all omni-channel organizations and third-party partners, creating transparency and scale within the Data Management and Engineering teams.
- The objective of this initiative was to empower business stakeholders across the organization to make decisions swiftly, accurately, and with confidence.
- The business driver for the Azure Data Lake implementation and the customer 360 was to shape new product lines and new marketing campaigns.
- These data-driven customer insights support key business use cases such as improving customer conversion rates, personalizing campaigns to increase revenue, predicting and avoiding customer churn, lowering customer acquisition costs, and brand propensity modeling.
- Responsible for end-to-end delivery of data projects leveraging Big Data and AI/ML technologies.
- Provided technology leadership for all data project solutions leveraging Big Data Lake and AI/ML technologies, and was involved in implementing the strategic vision for the data and analytics group.
- Implemented the strategy for harnessing data and interpreting it into meaningful, actionable information for consumer behavior and marketing analytics.
- Helped the LOB increase its ability to identify profitable customers, expand wallet share with them, identify relevant cross-sell and up-sell opportunities, and acquire new profitable customers through targeted marketing campaigns.
- Key predictive and prescriptive analytics models implemented: customer segmentation, channel mix modelling, trigger-based cross-sell, and social media listening and measurement (see the segmentation sketch after this role's technology list).
- Responsible for building and leading a central data team for the data lake implementation on the Hortonworks Data Platform. The objective of this initiative was to enhance customer experiences by creating a holistic 360-degree view of existing and potential insurance customers and their interactions across channels for various products, tailored to their lifestyles and key life stages.
- As part of the customer's digital transformation initiative, one key initiative was to map the insurance customer journey to various cross-sell and up-sell opportunities. Two dimensions of the customer journey are especially important: the lifecycle of an insurance contract and the insured's journey through different life stages.
- Different insurance and financial services needs emerge with age, as a person reaches driving age, attains the age of majority, reaches middle age, nears retirement, or becomes elderly.
- Other journey stages are based on events such as marriage, the birth of a child, or the purchase of a house. This involves creating individualized experiences for consumers throughout their path to purchasing different products across channels, identifying key moments along the consumer journey, and serving them in real time with personalized offers and content. It supports multichannel campaigns and triggered marketing, resulting in improved customer satisfaction, higher lifetime value, and ROI.
Technology landscape: Hortonworks Data Platform, HBase, Hive, Spark Streaming, AWS, Spark, Python, and Tableau.
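A compact sketch of the customer segmentation model mentioned above, using Spark MLlib k-means. The RFM-style feature names (recency_days, frequency, monetary), the sample rows, and the choice of k=3 are assumptions for illustration only.

    # Illustrative customer segmentation with Spark MLlib k-means on RFM-style features.
    from pyspark.sql import SparkSession
    from pyspark.ml import Pipeline
    from pyspark.ml.clustering import KMeans
    from pyspark.ml.feature import StandardScaler, VectorAssembler

    spark = SparkSession.builder.appName("customer_segmentation").getOrCreate()

    # Hypothetical customer feature table; in practice this comes from the data lake.
    customers = spark.createDataFrame(
        [(1, 5.0, 12.0, 340.0), (2, 45.0, 2.0, 60.0), (3, 10.0, 8.0, 210.0),
         (4, 60.0, 1.0, 40.0), (5, 7.0, 15.0, 400.0), (6, 30.0, 4.0, 120.0)],
        ["customer_id", "recency_days", "frequency", "monetary"],
    )

    pipeline = Pipeline(stages=[
        VectorAssembler(inputCols=["recency_days", "frequency", "monetary"], outputCol="features_raw"),
        StandardScaler(inputCol="features_raw", outputCol="features"),
        KMeans(k=3, seed=42, featuresCol="features", predictionCol="segment"),
    ])

    # Fit the pipeline and attach a segment label to each customer.
    model = pipeline.fit(customers)
    model.transform(customers).select("customer_id", "segment").show()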
Confidential, Sunnyvale, CA
Data Architect
Responsibilities:
- Strengthened the distributed delivery network and focused on constant business improvement through innovation.
- Built and managed the unified data warehouse system and delivered integration and decommission programs for legacy assets, with complete ownership from scoping to implementation.
- Led the operations support team and met the SLAs.
- Recruited and built teams with the right mix of skills.
- Built engineering practices and adopted agile practices and a product culture.
- Met financial goals and constantly looked for opportunities to improve the IOI.
- Attained continuous improvement of delivery and met the delivery SLAs.
- Prepared project status presentations and presented weekly, monthly, and annual reports to senior management.
- Ensured that best practices, lessons learned, and standards were documented, communicated, and implemented.
- Devised performance goals, provided feedback to project managers, team leads, and architects, and guided them on career paths and continuous improvement.
- Provided consulting on the digital transformation program.
- Led the global strategy program for data solutions.
- Provided thought leadership through the innovation advocate program.
- Successfully integrated 100+ source systems and built a robust ETL design with optimal performance.
- Redesigned and improved application performance by introducing Big Data, open-source software, and robust design principles (scalability, high availability, and higher throughput), resulting in hard-dollar savings.
- Recorded the highest claims submissions for the 3R business for three consecutive years.
- Identified several opportunities for predictive analytics; successfully completed two machine learning POCs and am working on the others to convert them into revenue-generating sources.
- Built Smart Operations solutions to reduce manual effort and improve application stability, a significant win in reducing manual effort.
- Successfully decommissioned 10+ legacy assets and integrated their functionality into the existing unified warehouse systems.
- Also provided clickstream analytics and user-based search models and predictions.
Technology landscape: SQL Server, SSIS, SSRS, Pig, Hadoop, ER, and ER-Studio.