Director / Sr. AWS Lead Solution Architect Resume
Indianapolis, IN
SUMMARY
- 17+ years of combined IT experience as an AWS Solution/Data Architect and Data Engineer (DWBI/Cloud/Big Data) using a range of ETL tools such as AWS Glue (Python/Scala), PySpark API, Matillion for Redshift, Informatica PowerCenter/Cloud, Talend, and AWS services (EC2, S3, Glacier, Redshift, DMS, DynamoDB, RDS, Postgres, SageMaker/ML, EMR, Data Pipeline, SNS, SQS, SWF, VPC, AWS Snowball, Route 53, ELB, CloudWatch, CLI config/scripts, AMI, Athena, Elasticsearch, Auto Scaling, CFT, IAM, AWS Billing & Cost Management, device management), plus Sqoop, MSBI (SSIS, SSAS), OBIA/OBIEE 10g & 11g, Oracle, MS SQL Server, Teradata, and repository databases.
- Extensive lead experience with AWS architecture, cloud engineering, analytics/data science, and DWBI teams, leading multiple end-to-end cloud initiatives and application/data migration projects; collaborate with other teams, engage project stakeholders in architecture design reviews, reusable framework models, and best-fit strategic solution discussions; able to handle diverse situations with rapidly changing priorities.
- Expert in creating cloud architectures, including private, public, and hybrid architectures and IaaS, PaaS, and SaaS models, for any domain.
- Expert in AWS CLI script automation for EMR (end-to-end) and other AWS services; build serverless architectures using Lambda (Boto3) and Step Functions.
- Build competency and capability around new and innovative technology frameworks for advanced ML/AI analytics solutions; define the complete project life cycle for POCs and prototype design execution from Dev to Prod, and keep communication channels with business stakeholders open and transparent on a regular basis.
- Instrumental in digital transformations, building and architecting solutions and roadmaps to grow the practice; support sales and pre-sales teams as needed to drive techno-functional solution presentations and negotiations with end clients and business stakeholders.
- Best-practice solutions for big data applications such as Hadoop (HDFS), Spark (Python/Scala), and EMR (optimized solutions) with minimal instance provisioning; submit Scala jobs as EMR steps, apply step-level dependencies for sequential runs, and issue multiple spark-submit jobs for parallel step runs.
- Experience driving enterprise cloud services adoption across medium-to-large organizations, with proven success handling homogeneous/heterogeneous sources, streaming data, and different file formats at high volumes.
- Identify continuous service improvement and automation areas and prepare strategy scripts that help automate manual tasks and improve service quality, reliability, durability, and elasticity.
- Expertise in analysis, design, and development of ETL programs per business specifications, assisting with troubleshooting of complex business performance issues, especially for Glue ETL and custom PySpark/Scala jobs, by increasing DPU/worker capacity based on source data volume and setting the right properties in the Glue UI/Designer (a minimal sizing sketch follows this summary).
- Experience architecting, deploying, and managing cost-effective and secure AWS environments across multiple Availability Zones and Regions using services such as EC2, S3, RDS, and VPC; extensive experience architecting and building data lakes in AWS S3 for advanced analytics/ML and Athena queries, and building object/hierarchy-level warehouses to maintain the lowest level of granularity on files ingested into S3 daily.
- SME for technical and architectural Informatica DW/BI, with domain experience in Healthcare, Logistics, Life Sciences, Pharmacy, Clinical Trials, Telecom, Finance/Investment, Operations, process improvement, and re-engineering.
- Expert in building and designing best-fit industry practices and company standards for security, compliance, and risk while maintaining system performance and availability.
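To illustrate the Glue capacity-sizing point above, here is a minimal Boto3 sketch of registering a Glue ETL job with an explicit worker type and count; the job name, script location, and IAM role ARN are hypothetical placeholders, not values from any specific engagement.

```python
import boto3

glue = boto3.client("glue")

# Register a Glue ETL job; WorkerType/NumberOfWorkers are tuned to source volume
glue.create_job(
    Name="curated-orders-etl",                       # hypothetical job name
    Role="arn:aws:iam::123456789012:role/glue-etl",  # hypothetical IAM role
    Command={
        "Name": "glueetl",
        "ScriptLocation": "s3://my-etl-scripts/curated_orders.py",  # hypothetical script path
        "PythonVersion": "3",
    },
    GlueVersion="3.0",
    WorkerType="G.1X",    # switch to G.2X for memory-heavy transforms
    NumberOfWorkers=10,   # scale up as daily source volume grows
)
```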
PROFESSIONAL EXPERIENCE
Confidential - Indianapolis, IN
Director / Sr. AWS Lead Solution Architect
Responsibilities:
- Build the data lake architecture and design the S3 bucket structure, folders, and partitions; articulate key data lake attributes (storage, ingestion, processing, and data security) and define the right IAM policies for business users and stakeholders.
- Responsible for PySpark framework development for extracting data from heterogeneous sources; design, build, and maintain standard cloud design patterns across business entities to keep code consistent.
- Lead the AWS/data engineering team to provide best-fit design patterns and to guide and implement strategic performance-tuning techniques, supporting the data science team in selecting the best ML/AI algorithms and designs.
- Plan, review, and implement designs to create enterprise solutions built on AWS and other cloud providers, keeping a clear, transparent communication channel with all business stakeholders.
- Responsible for configuring and building the complete cloud infrastructure, including security/firewalls for public and private networks and NACLs; define role-based permissions (IAM) and set up separate account environments for Dev, QA, and Prod.
- Build CI/CD pipelines using CodeCommit, CodePipeline, CodeBuild, and CodeDeploy, with CFT-based code migration from Dev to SIT to UAT to Prod.
- Lead and drive best-fit, innovative, domain-based architectural and technical solutions; able to drive infrastructure/architectural/technical solutions as an individual contributor, working directly with project stakeholders in a client-facing role.
- Initiate new strategic technology prototypes and business solutions, influencing the right tools, technologies, and services within short durations to help senior management make business decisions rapidly.
- Work collaboratively with data engineers, data scientists, the DWBI team, leadership, and business stakeholders/customers to drive successful implementations and ensure zero-defect deliverables.
- Create Spark jobs in Scala to process source files from S3 to RDS (Aurora) and S3 destinations.
- Build Glue ETL jobs with custom Scala code to process data from S3 to Redshift data marts and fact tables.
- Create Glue Catalog crawlers to crawl metadata, create external tables in the Athena schema, write advanced analytics queries, and send the output results to end customers.
- Create Lambda functions triggered by events (S3, CloudWatch Logs, SQS, SNS) to spin up EC2 instances via an EMR cluster and submit Spark (Scala) programs as steps, building serverless architecture solutions and using Step Functions for orchestration (see the sketch after the technology list below).
- Create role-based permissions and bucket policies to grant users appropriate access.
- Spin up and build scalable architecture based on client requirements and configure the instances.
- Create security groups/firewall rules and KMS keys and attach them to the right instances.
- Create public and private subnets, route tables, IGWs, IP address allocations, and EIPs; provision web applications, DB services, and other applications on the appropriate subnets; and configure the NAT gateway and inbound/outbound rules for each application's traffic.
Technology: AWS (S3, Spark (Scala), PySpark API, Redshift, SageMaker/ML, EC2, DynamoDB, SQS, SNS, CloudWatch, Airflow), DMS (SCT), Athena, SQL Server, Oracle, UNIX, Oracle Toad, Jira, Matillion for Redshift, IAM, KMS, EMR, RDS (Aurora), Boto3 (Python), Lambda, Step Functions, SIMBA ODBC, Glue ETL (Python/Scala)
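To illustrate the event-driven EMR orchestration described above, here is a minimal sketch of a Lambda handler (Boto3) that reacts to an S3 put event and launches a transient EMR cluster with a single spark-submit step. The cluster name, instance types, script path, and role names are illustrative assumptions, not the actual project configuration.

```python
import boto3
from urllib.parse import unquote_plus

emr = boto3.client("emr")

def lambda_handler(event, context):
    # The S3 put event carries the bucket/key of the newly landed file
    record = event["Records"][0]["s3"]
    key = unquote_plus(record["object"]["key"])
    source = f"s3://{record['bucket']['name']}/{key}"

    response = emr.run_job_flow(
        Name="spark-ingest",                 # hypothetical cluster name
        ReleaseLabel="emr-6.4.0",
        Applications=[{"Name": "Spark"}],
        Instances={
            "InstanceGroups": [
                {"Name": "Master", "InstanceRole": "MASTER", "InstanceType": "m5.xlarge", "InstanceCount": 1},
                {"Name": "Core", "InstanceRole": "CORE", "InstanceType": "m5.xlarge", "InstanceCount": 2},
            ],
            "KeepJobFlowAliveWhenNoSteps": False,  # terminate the cluster once the step finishes
        },
        Steps=[{
            "Name": "process-file",
            "ActionOnFailure": "TERMINATE_CLUSTER",
            "HadoopJarStep": {
                "Jar": "command-runner.jar",
                "Args": ["spark-submit", "--deploy-mode", "cluster",
                         "s3://my-artifacts/jobs/process.py", source],  # hypothetical job script
            },
        }],
        JobFlowRole="EMR_EC2_DefaultRole",
        ServiceRole="EMR_DefaultRole",
    )
    return response["JobFlowId"]
```

In practice this handler would be one task inside a Step Functions state machine so that downstream loads and notifications run only after the EMR step succeeds.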
Confidential, Pittsburgh, PA
AWS Lead Solution Architect / Data Engineer
Responsibilities:
- AWS lead for the cloud infrastructure/data engineering team; coach and train the team on best-fit technical solution patterns and build reusable PySpark/Scala framework models to process heterogeneous source files, reducing development/code overhead, while keeping communication channels with the client and leadership team open and transparent.
- Create data pipelines for the CI/CD process using Jenkins and a Git repository for PySpark, Scala, and custom libraries used by Glue jobs.
- Support the DevOps integration workflow across the complete life cycle and ensure all technical documents meet industry standards; also responsible for the data visualization layer in Tableau/Salesforce.com (Cloud).
- Build Spark framework models for data processing using Spark SQL/Scala for microservices and on-prem-to-cloud migration; used Splunk and Sumo Logic (POC) for microservice/service log monitoring and session logs.
- Design and build cloud service models including Infrastructure-as-a-Service, Platform-as-a-Service, and Software-as-a-Service; define IAM policies, security, and encryption, and create KMS keys for more sensitive data.
- Modify systems to improve efficiency, reliability, and performance for delivering applications and infrastructure in the cloud.
- Lead data engineering teams; responsible for building the data lake on S3 and curating it for data governance/HIPAA compliance, the business glossary, and business/technical metadata; work collaboratively with other project teams on dependencies during deployments/releases in an agile/sprint model.
- Create a data catalog using Glue and crawl metadata for advanced analytics and reusable metadata for EMR and the wider ecosystem; write advanced analytics queries in Athena and Redshift Spectrum for high data volumes (see the catalog/query sketch after the technology list below).
- Responsible for data extraction from different source systems (Veeva CRM, IMS, MDM, CDIS, Data One, and IRDA DB); conduct systems design, feasibility, and cost studies and recommend cost-effective cloud solutions; design the AWS cloud migration approach from on-prem and build an MVP covering the complete migration life cycle and project blueprint.
- Define forward-looking technology solutions for competency growth and recommend design/architectural changes that help the data engineering team grow and generate innovative ideas.
- Responsible for project compliance and technology integration across multiple applications with zero-defect deliverables; provide support to the compliance/audit-trail team; also define and manage dimension tables, fact tables, and data marts on the cloud OLTP side and frame the data lake on S3.
- Automate cloud services end to end through Lambda/CFT.
- Work closely with all business stakeholders to set the right expectations and define the best strategic ways to increase target revenue.
- Provide best-fit industry strategic solutions for multiple projects; develop serverless solutions and minimize infrastructure by automating the complete data flow, reducing service costs as both a long-term and a workaround solution to increase corporate revenue.
Technology: Talend Big Data Integration, PySpark, AWS (S3, EMR, Spark (Scala), Boto3 (Python), Redshift, Glue ETL (Python/Scala), Lambda, Redshift Spectrum, EC2, DynamoDB, Sqoop, Data Pipeline, SQS, SNS, CloudWatch, Kinesis, Airflow), CLI scripts, SQL Server, Oracle, UNIX, Oracle Toad, Jira, Veeva CRM, IMS, MDM, CDIS, Data One
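As a companion to the Glue catalog and Athena bullets above, here is a minimal Boto3 sketch that starts a Glue crawler and then issues an Athena query against the crawled table; the crawler name, database/table names, and results bucket are hypothetical placeholders.

```python
import boto3

glue = boto3.client("glue")
athena = boto3.client("athena")

# Re-crawl the curated zone so newly landed partitions appear in the Glue Data Catalog
glue.start_crawler(Name="curated-zone-crawler")  # hypothetical crawler name

# Query the crawled table through Athena; results are written to the results bucket
athena.start_query_execution(
    QueryString="SELECT brand, SUM(units) AS units FROM curated_orders GROUP BY brand",
    QueryExecutionContext={"Database": "sales"},                  # hypothetical database
    ResultConfiguration={"OutputLocation": "s3://my-athena-results/"},  # hypothetical bucket
)
```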
Confidential, New York, NY
AWS Solution Architect / Data Engineer
Responsibilities:
- Build best-fit, industry-standard AWS cloud solutions with secure, cost-effective, serverless architecture, and prepare the blueprint for migration approaches from on-prem to the AWS cloud.
- Lead the AWS data engineering team; coach on and implement best cloud design patterns, define reusable cloud framework models, design and implement serverless architecture, and keep business naming standards/versions consistent across environments to reduce overhead on Prod fixes; maintain clear verbal and written communication with project stakeholders.
- Responsible for building the S3 bucket architecture, folders, and partitions, and building a data lake covering 10 years of historical data in RR3 storage space.
- Delivery of programmable infrastructure (Infrastructure as Code) and automation/orchestration of OS and applications.
- Work with the customer's IT, Engineering, DevOps, and Security teams to identify and prioritize business and technical requirements.
- Set up and maintain the AWS cloud infrastructure per the standards and guidelines set forth.
- Create credibility and accelerate adoption of managed public cloud; drive cost-reduction/efficiency initiatives (e.g., moving EC2 from on-demand to spot instances).
- Requirements gathering and creation of data marts from the DWH, with Oracle and Veeva systems as sources, using Informatica Cloud as the data extraction tool.
- Develop full SDLC project plans to implement the ETL solution and identify resource requirements.
- Discuss options with the customer, considering business needs, security, cost, and operational requirements.
- Set up and access the DynamoDB environment; create tables, on-demand backups, access control, etc. (see the DynamoDB sketch after the technology list below).
- Design the cloud infrastructure and recommend the on-demand software/instances.
- Lead architect for governance and steering of the AWS public cloud at the customer.
- Define the tools and technologies for the project scope and articulate the development framework and BRD design for development activities.
- Lead the team and work with other projects' AWS cloud engineering teams to manage day-to-day development activities; participate in design, design reviews, code reviews, and implementation dependencies for each two-week sprint.
- Align proposed solutions with the client's public and private cloud offerings and, where needed, identify alternative solutions to fill gaps.
Technology: Informatica/Cloud, Spark, PySpark, AWS (Redshift, S3, CloudWatch, Glacier, Direct Connect, VPC, EC2, DynamoDB, Athena, Glue Catalog, HDFS, DMS (Oracle), EMR, Glue ETL (Scala/Python), Data Pipeline, Kinesis), Oracle, UNIX, Oracle Toad, Jira, SQL Server, Maestro
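To illustrate the DynamoDB environment setup mentioned above, a minimal Boto3 sketch follows that creates an on-demand table and takes an on-demand backup; the table name and key attributes are illustrative assumptions.

```python
import boto3

dynamodb = boto3.client("dynamodb")

# On-demand (PAY_PER_REQUEST) table keyed on a session id and event timestamp -- names are illustrative
dynamodb.create_table(
    TableName="session_events",
    AttributeDefinitions=[
        {"AttributeName": "session_id", "AttributeType": "S"},
        {"AttributeName": "event_ts", "AttributeType": "N"},
    ],
    KeySchema=[
        {"AttributeName": "session_id", "KeyType": "HASH"},
        {"AttributeName": "event_ts", "KeyType": "RANGE"},
    ],
    BillingMode="PAY_PER_REQUEST",
)

# Wait for the table to become ACTIVE, then take an on-demand backup
dynamodb.get_waiter("table_exists").wait(TableName="session_events")
dynamodb.create_backup(TableName="session_events", BackupName="session_events-initial")
```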
Confidential, Bridgewater, NJ
Lead Enterprise Solution/Data Architect / Data Engineer
Responsibilities:
- Lead the AWS team; responsible for design, development, and implementation of cloud framework solutions and requirements gathering; conduct and drive cloud architecture meetings with the respective business stakeholders, involve solution architects/SMEs from other lines of business in design reviews, and incorporate the best design suggestions into the existing framework.
- Install third-party tools on the AWS cloud platform and create the right S3 bucket infrastructure (landing zone, raw zone, process zone, curated zone) with IAM security policies at the bucket and folder level for appropriate access.
- Define the S3 data lake architecture and security policies, set life-cycle policies for Glacier, and set up S3 storage tiers (see the life-cycle sketch after the technology list below).
- Build Lambda and Step Functions serverless orchestration from S3 to Redshift and RDS services.
- Work with business and system analysts to transform business requirements into technical designs in the EADW schema.
- Build Scala programs on the Spark ecosystem and submit jobs through EMR for big data applications and analytical solutions using Hadoop, Redshift, Kinesis, S3, RDS, and DMS services.
- Responsible for security requirements and deployments, including the use of encryption, key pairs, MFA, ... ; database solutions on EC2 using on-demand, spot, and reserved instances for high availability.
- Build a data lake in S3 for current, ongoing, and historical data to support the Athena query engine.
- Translate business requirements and operational strategy into a long-term, executable solution plan and roadmap
- Disaster recovery solutions, storage and network solutions, and CloudFormation automation.
- Assist with product evaluation, selection, and implementation for managed services that require third-party products to support the application or business.
- Participate in planning, implementation, and growth of our customer's Amazon Web Services (AWS) foundational footprint.
Technology: Informatica/Cloud, Spark (Python API, Scala), EMR, AWS (Redshift, EC2, S3, Sqoop, DynamoDB, RDS (SQL Server), Data Pipeline, CloudWatch, CloudTrail, CFT), Tableau, Cognos, Informatica 9.5, Oracle 11g, UNIX, Oracle Toad, SQL Server 2008/12, Teradata 12/13
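To illustrate the S3 life-cycle policies for Glacier described above, here is a minimal Boto3 sketch that transitions a curated-zone prefix to Glacier after 90 days and expires it after 10 years; the bucket name, prefix, and retention periods are hypothetical assumptions, not values from any specific project.

```python
import boto3

s3 = boto3.client("s3")

# Transition curated-zone objects to Glacier after 90 days and expire them after ~10 years
s3.put_bucket_lifecycle_configuration(
    Bucket="curated-data-lake",  # hypothetical bucket name
    LifecycleConfiguration={
        "Rules": [{
            "ID": "archive-curated",
            "Filter": {"Prefix": "curated/"},   # hypothetical curated-zone prefix
            "Status": "Enabled",
            "Transitions": [{"Days": 90, "StorageClass": "GLACIER"}],
            "Expiration": {"Days": 3650},
        }]
    },
)
```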