Cloud Architect / ML Data Engineer Resume
Pleasanton, CA
SUMMARY:
- Worked as an ML engineer / cloud solution architect / big data engineer designing the strategy for migrating on-prem Hadoop to AWS and Microsoft Azure cloud.
- Worked extensively on AWS, building a data lake from scratch and migrating data from legacy systems (MySQL, Oracle, SQL Server, Teradata) into the cloud using PaaS, IaaS, SaaS, database-as-a-service, and networking-as-a-service offerings: S3, EMR, Step Functions, EC2, RDS, Lambda, Glue, Athena, KMS, CloudWatch, CloudTrail, VPC, SageMaker, Kinesis, Macie, QuickSight.
- Vast experience building on the Azure stack (including but not limited to Key Vault, Azure SQL Data Warehouse, Data Factory, Cosmos DB, Power BI, Event Hubs, Stream Analytics).
- Designed Customer, Inventory, and Product data marts used across the entire organization.
- Hands-on experience with Python and PySpark implementations on AWS EMR, building data pipeline infrastructure to support machine learning model deployments; data analysis and cleansing; and statistical modeling with extensive use of Python, Pandas, NumPy, XGBoost, and scikit-learn for predictions, with visualization in Matplotlib and Seaborn.
- Created cloud and big data solutions for batch and real-time data processing using HDFS, Hive, Pig, HBase, Kafka, Spark, Storm, Flink, Solr, Elasticsearch, HCatalog, Oozie, Flume, Sqoop, Java, Scala, and Python.
- Implemented API calls to GET and POST data from EMR
- Implemented machine learning algorithms for auto-qualified leads, campaign performance, and fulfillment planning, using Python, R, RStudio, PySpark, and MATLAB for regression and classification problems.
- Full-life cycle development utilizing Agile, XP, and RUP methodologies.
- Implemented Azure cloud solutions using HDInsight, Event Hubs, Cosmos DB, Cognitive Services, and Key Vault.
- Implemented web analytics solutions including scoring, search engine optimization, email automation, auto lead generation, auto-qualified leads, and social media marketing.
- Designed and implemented data governance strategies for cloud and big data solutions, including PCI and PII data.
- Implemented a customer success management solution from scratch using Gainsight and Master Data Management.
- Extensive domain knowledge in supply chain, services, e-commerce, retail, banking, and media.
- Possesses 18+ years of commercial experience as an enterprise software architect, technical lead, project manager, analyst, and developer.
PROFESSIONAL EXPERIENCE:
Confidential, Pleasanton
Cloud Architect/ML Data Engineer
Responsibilities:
- Designed the data flow to ingest data from Azure and surface output in the UI
- Performed technical analysis to identify use cases for SageMaker
- Converted SQL procedures in SQL Server to Spark SQL
- Converted Azure ML procedures to PySpark scripts
- Responsible for data cleansing and transformation of time-series data with PySpark
- Responsible for streaming time-series data from PI Server to AWS S3 using Kinesis
- Implemented data pipeline infrastructure to support machine learning systems
- Used APIs with Python to extract and push prediction outputs in EMR
- Built CI/CD pipelines and deployed machine learning models on EMR
- Cataloged data using Glue and Athena
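The time-series cleansing described above can be sketched in plain Python (the production work used PySpark; the field layout and thresholds here are hypothetical):

```python
# Illustrative time-series cleansing: forward-fill gaps and drop
# out-of-range sensor readings (thresholds are hypothetical).

def cleanse(readings, lo=-50.0, hi=150.0):
    """readings: list of (timestamp, value-or-None), assumed time-ordered."""
    cleaned = []
    last = None
    for ts, val in readings:
        if val is None:          # gap: forward-fill from the last good value
            val = last
        if val is None or not (lo <= val <= hi):
            continue             # no fill available, or out of range: drop
        cleaned.append((ts, val))
        last = val
    return cleaned

raw = [(1, 20.5), (2, None), (3, 999.0), (4, 21.0)]
print(cleanse(raw))  # [(1, 20.5), (2, 20.5), (4, 21.0)]
```

In PySpark the same logic maps onto window functions (`last` with `ignorenulls`) and a range filter, so it scales over partitioned EMR data.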
Environment: AWS (S3, EMR, EC2, Step Functions, Glue, Athena, Lambda, CloudWatch, RDS, VPC, subnets); Azure (SQL Server/SSMS, SQL procedures, Data Factory, pipelines); Python, Spark, API services, Vagrant, GitLab
Confidential, Oakland
Sr Cloud Architect / Sr. Data Architect
Responsibilities:
- Designed and developed a solution to migrate data, logs, and external data from MySQL, Oracle, and SQL Server into an AWS data lake
- Implemented data analysis using Athena and DynamoDB for auditing
- Used Macie for data classification and data security
- Implemented AWS Glue crawlers and ETL jobs to catalog the data available on S3
- Implemented machine learning algorithms using SageMaker and TensorFlow for prediction of container availability and service cancellation
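At inference time, the availability and cancellation predictions above reduce to scoring a feature vector; a minimal stdlib sketch of that scoring step (the weights, features, and bias are hypothetical — the production models were trained in SageMaker with TensorFlow):

```python
import math

# Illustrative inference-time scoring for a binary classifier such as a
# service-cancellation model (weights and feature names are made up).

def predict_proba(features, weights, bias=0.0):
    """Logistic score: sigmoid of the weighted feature sum."""
    z = bias + sum(w * x for w, x in zip(weights, features))
    return 1.0 / (1.0 + math.exp(-z))

# e.g. features = [days_since_last_booking, support_tickets, tenure_years]
p = predict_proba([30, 4, 1.5], [0.02, 0.3, -0.5], bias=-1.0)
print(round(p, 3))
```

The same sigmoid-over-weighted-sum shape is what a trained model endpoint evaluates, which makes the pipeline around it (feature extraction in, probability out) easy to test independently of the model.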
Environment: AWS (S3, EC2, EMR, VPC, Elasticsearch, KMS, Redshift, DynamoDB, Glue, CloudWatch, CloudTrail, Lambda, SageMaker, Macie, Kinesis), MySQL, Oracle, SQL Server, Teradata, Cloudera Hadoop platform
Confidential, San Francisco, CA
Big Data Architect / Sr. Big Data Engineer
Responsibilities:
- Architected data marts for Customer, Inventory, Product, and Promotions
- Designed a cloud solution to share data with vendors such as IBM and Stanford, along with third-party data, for analysis
- Ingested database, application, and system log data into Azure using crawlers
- Used Azure Cognitive Services to build models
- Defined data ingestion processes for sensitive data to Azure cloud
- Built data pipelines for on-prem to Microsoft Azure cloud migration using serverless compute functions
- Defined key control management using the Azure Key Vault API
- Implemented machine learning algorithms to build patterns for the fulfillment planner using IBM Watson
- Performed classification and clustering of customer survey data
- Built feature importance for survey responses utilizing decision tree methods
- Built customer insight analytics platform for customer behavior and churn
- Built solutions for real time data processing using Kafka mirroring, Kafka queues, Key Vault, Azure Data Warehouse, Power BI, Analysis Services, Cosmos DB
- Built a Customer 360 data lake to enable data as a service.
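The feature-importance work on survey responses can be sketched with information gain — the quantity decision trees use to choose splits. This stdlib version uses a made-up toy survey (the actual analysis ran on the platform's survey data):

```python
import math
from collections import Counter

# Illustrative feature ranking via information gain, the split criterion
# behind decision-tree feature importance (toy data, hypothetical labels).

def entropy(labels):
    n = len(labels)
    return -sum((c / n) * math.log2(c / n) for c in Counter(labels).values())

def information_gain(rows, labels, feature_idx):
    """Entropy reduction from splitting on one feature's values."""
    base = entropy(labels)
    groups = {}
    for row, label in zip(rows, labels):
        groups.setdefault(row[feature_idx], []).append(label)
    weighted = sum(len(g) / len(labels) * entropy(g) for g in groups.values())
    return base - weighted

# Toy survey: feature 0 perfectly predicts churn, feature 1 is noise.
rows = [("yes", "a"), ("yes", "b"), ("no", "a"), ("no", "b")]
churn = [1, 1, 0, 0]
print(information_gain(rows, churn, 0))  # 1.0 -- most informative
print(information_gain(rows, churn, 1))  # 0.0
```

Ranking features by this gain is, in essence, what reading importances off a fitted tree model gives you.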
Environment: Azure (Key Vault, Blob Storage, Functions, Cosmos DB, Power BI), Hortonworks Data Platform, Python, Spark (Spark SQL, MLlib), HDFS, Hive, Teradata.
Confidential, CA
Big Data Architect / Sr. Big Data Engineer
Responsibilities:
- Identified product/service sales trends across multiple dimensions.
- Delivered end-to-end campaign response management using machine learning.
- Built customer insights on customer/service utilization, bookings & CRM data using Gainsight.
- Processed campaign data using Eloqua, Unica, and BlueKai for personalized targeting.
- Ingested Akamai server logs into HDFS using Flume to generate customer scoring and auto leads.
- Applied clustering and classification techniques in Mahout to create insights.
- Analyzed weblog data to build swimlanes representing the customer journey.
- Analyzed large customer and product log files to identify root causes of business impact.
- Designed and implemented digital marketing solution, customer success management solution, campaign management and data governance solutions
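The log-driven scoring above can be sketched as a small parser that counts product-page hits per visitor and flags heavy visitors as auto-leads (the log format, URL paths, and threshold are hypothetical; production ingestion went through Flume into HDFS):

```python
import re
from collections import Counter

# Illustrative lead scoring from web server logs (all specifics made up).
LINE = re.compile(r'^(?P<ip>\S+) \S+ \S+ \[[^\]]+\] "GET (?P<path>\S+)')

def score_leads(log_lines, threshold=3):
    """Count /product/ page hits per source IP; keep IPs at/over threshold."""
    hits = Counter()
    for line in log_lines:
        m = LINE.match(line)
        if m and m.group("path").startswith("/product/"):
            hits[m.group("ip")] += 1
    return {ip: n for ip, n in hits.items() if n >= threshold}

logs = [
    '1.2.3.4 - - [01/Jan/2016:10:00:00] "GET /product/router HTTP/1.1" 200',
    '1.2.3.4 - - [01/Jan/2016:10:01:00] "GET /product/switch HTTP/1.1" 200',
    '1.2.3.4 - - [01/Jan/2016:10:02:00] "GET /product/router HTTP/1.1" 200',
    '5.6.7.8 - - [01/Jan/2016:10:03:00] "GET /about HTTP/1.1" 200',
]
print(score_leads(logs))  # {'1.2.3.4': 3}
```

At scale the same aggregation is a group-by over the parsed log table, which is why the counting logic ports directly to Hive or Spark.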
Environment: MapR distribution platform with HDFS, Spark, Sqoop, Flume, Hive, Pig, Kafka, HBase, PySpark, Jupyter, Python.
Confidential, CA
Big Data Engineer
Responsibilities:
- Developed and implemented data migration from legacy systems to Hadoop FS.
Environment: Hortonworks Data Platform (HDP 2.3): HDFS, Hive, Pig, HCatalog, Oozie.
Confidential, IL
Technical Business Analyst
Responsibilities:
- Involved in MDM Rollout - Supply chain SME for data modeling
- Created data models using Erwin Data Modeler, identifying attributes and properties
Environment: JDK, J2EE, Tomcat, MySQL, Informatica, SAP MDM, Oracle, mainframes z/OS, COBOL, JCL, DB2, VSAM, CICS, Erwin Data Modeler
Confidential
Associate Developer
Responsibilities:
- Worked as a developer implementing the credit card business in Copenhagen and coordinating credit card operations across the Nordic region, including Denmark, Finland, and Latvia.
Environment: Mainframes z/OS, COBOL, JCL, DB2, VSAM, CICS, VisionPLUS
TECHNICAL SKILLS:
Operating Systems: UNIX, LINUX, Windows, MVS z/OS.
Cloud Platforms: AWS, AZURE
Programming Languages: Java, R, Python
Frameworks: Apache Spark, Map Reduce, Mahout, Apache Lucene, J2EE
Databases/Storage: HDFS, MySQL, Oracle, SQL Server, HBase, YARN, Spark
Query Languages: Hive 0.9, Pig, Sqoop 1.4.4, Spark-SQL
Streaming: Flume 1.6, Spark Streaming, Streaming Analytics
Marketing Tools: SAS, Tableau, Platfora, Eloqua (SFDC), Unica, BlueKai, Gainsight (SFDC), Talend
Messaging Frameworks: Kafka, Azure Event Hubs
Distributions: MapR, Cloudera, Hortonworks
Reporting Platforms: Tableau, Power BI, Platfora
Data Warehousing Platforms: Azure SQL Data Warehouse, EDW