Redshift Architect, Data-warehousing Data-engineer/dev-dba/lead Developer Resume
NJ
SUMMARY:
- Technical and strategic leader with more than twenty years of experience leading cross-functional, technical, analytics and digital strategy initiatives, with extensive recent experience in cloud computing.
- Currently working primarily as AWS Redshift Architect and hands-on leader versed in data science, big data and analytical techniques. Has worked extensively on massively parallel processing (MPP) projects with relational databases as well as NoSQL and unstructured-data platforms, and on data visualization projects, both in the cloud and on-premises.
- Application Architecture and Implementation - Has led application design, development, operations and management for both start-ups and Fortune 500 companies. Prior to moving to an Application Architect role, held development and technology roles for eight years, staying hands-on to maintain leading-edge knowledge.
- Used, managed or PoC'ed the following ETL tools: Matillion, SnapLogic, Informatica, Talend, DataStage.
- Managed and led multi-disciplinary teams including strategy, analytics, CRM and Business Intelligence to over-deliver on complex digital initiatives on time and on budget.
- Experienced in various industries - Pharma, Finance, Retail, Hi-Tech, Ad-Tech, Gamification, Consumer Packaged Goods/Manufacturing, Nuclear, Telecommunications, Government, Military.
- Has worked and consulted for companies like Merck, Confidential & Confidential, BMS, GE, Confidential, Arthur Andersen, Jardine Fleming, HK Land, BAE-Sema, CapGemini, Confidential, 1-800-Flowers, Oriental Trading (OTC), Kapitall.
TECHNICAL SKILLS:
Skills:
- Strategy and Roadmap, Technology Solutions Architecture, Pre-sales/PoC, Post-sales/dev+ops
- MPP, RDBMS/SQL, ODS/NoSQL, Python, Configuration and Optimization
- Big Data, Data Science, Business Intelligence & Visualization, Star-Schema, Data-Lake
- Analytics, CRM, MDM, Cross-functional Collaboration, Multi-cultural/Global teams
- Automation, Work-Flow Mgt, Security, Campaign Mgt & Tracking, DBA/DW
- Amazon Redshift, S3, EC2, RDS, IAM, CloudFormation, CloudWatch, Boto3/SDK
PROFESSIONAL EXPERIENCE:
Confidential, NJ
Redshift Architect, Data-Warehousing Data-Engineer/Dev-DBA/Lead Developer
Responsibilities:
- Hands-on architect/developer for migration of call-center data (Salesforce/Veeva) to AWS:Redshift to replace legacy Oracle DW for a global Top-10 pharma client.
- Ran ETL PoCs with Informatica, Attunity, Sqoop, Talend and Matillion.
- Built utilities in Python, AWS-Boto, Bash and UDFs for: orchestration & automation, preparing AWS:Redshift data on AWS:EC2, assembling DDLs and Sqoop queries.
- Designed tables with distribution styles and sort orders, 3NF vs dimensional models, and date-wise partitioning.
- Created a schema of tables in Redshift to reproduce reports from Salesforce for Confidential t/Clinic data.
- Conducted proof-of-concept with Matillion ETL/ELT to assess its viability, including meetings with CTO/team and ongoing discussions with CEO.
Confidential, NJ
Redshift Architect, Data-Warehousing Data-Engineer/Dev-DBA/Lead Developer
Responsibilities:
- Technical architect/developer for migration of legacy DW to AWS:Redshift for a global Top-25 client.
- Provided strategic direction and tactical solutions, using S3, Redshift, RDS, EC2, Python/Boto, Azkaban, Alteryx, Tableau.
- Ran proof-of-concept performance benchmarks for RFPs to migrate data-warehousing to the AWS cloud, using Python to (a) pre-process, (b) cleanse and (c) profile data for pre-optimized load into Redshift.
- Established data standards for extraction/harvesting, cleaning/conforming and publishing.
- Encouraged use of data-profiling for data-loading and data-quality management during discovery phase.
- Instructed on-shore and off-shore teams in developing physical models that leverage AWS:Redshift performance. Trained teams in tuning at (i) table level (distribution style, sort key, compression, bulk loading, etc.), (ii) intra-table (dimensional models, SQL window functions, etc.), (iii) cluster (scaling up, AWS:WLM, etc.) and (iv) intra-cluster (scaling out, orchestration, cloning via snapshot, etc.).
- Ran proof-of-concept for iPaaS, including ETL tools such as Informatica and SnapLogic.
- Established best practices for receipt of vendor data ("push'n-forget"): "data-lake" and the "passive" archiving of extracts in AWS:S3, with pre-prepped (split/zipped) data feeds and ancillary files (e.g. manifest, control and trigger).
- Collaborated with infrastructure group to define AWS:IAM role-based security policies.
- Benchmarked bulk loads (e.g. 20n rows, 1TB, 50 tables) across different AWS:Redshift cluster configurations.
- Benchmarked long-running queries (e.g. pivot-up/down; Teradata vs AWS:Redshift across a combination of cluster sizes; 8x improvement).
Confidential, NY/NJ
Redshift Architect, Data-Warehousing Data-Engineer/Dev-DBA/Lead Developer
Responsibilities:
- Designed, implemented and operated AWS:Redshift solutions for a Top-100 internet retailer. Phase 1 built in 5 weeks with 5.5 bn row history (~1TB) and daily incremental loads.
- Wrote a metadata-driven work-scheduling ETL tool in Python/Boto, orchestrating several clusters from a separate central metadata cluster. Intended to run on AWS:EC2 using AWS:DataPipeline and AWS:SNS.
- End-user access via Tableau BI tool and SQL clients (SQL Workbench/J, DbVisualizer, Aginity), with a future SAS connector as available.
- Developed Python tools for: task-scheduling, data-profiling, JSON security-file generation, AWS:CloudFormation.
- Focused on sales/marketing data; integrated third-party data feeds for: (a) customer cleansing/de-duping and (b) internet ad-tech clickstream data.
- Specified and designed end-user cluster-orchestration capabilities so that power-users could price activities, spin up their own clusters, populate them, run their work and quickly drop the clusters when finished.
- Worked in a team of three, performing development and presenting to clients.
- Sourced ad-tech/clickstream data from Neustar (a.k.a. Aggregate Knowledge) and MDM/sales data from Merkle.
Confidential, Omaha, NE
Responsibilities:
- Wrote proposal in response to RFP to transition an existing Teradata data-warehouse to AWS:Redshift.
- Proposal included architectures, costs/benefits and a series of proofs-of-concept and demonstrations.
Confidential, NYC NY
Responsibilities:
- Consulted on-site for a start-up in the brokerage/gamification sector that was in the midst of fund-raising rounds and pivots in business direction. Documented systems and applications for investors intending to conduct due diligence prior to investment.
- Developed prototypes of a MongoDB database (using Python for BSON/JSON manipulation and querying) for a multi-OLTP-system enterprise requiring an operational data-store (ODS) for use in call-center/help-desk.
Confidential, NY/NJ
Senior Developer, Application-Architect
Responsibilities:
- Moved to architecting, designing and delivering applications that leveraged the successful work of the previous 8 years in basic marketing-automation and DW/BI/Analytics.
- Created first production versions of Campaign Planning/Tracking application based on a distributed, thick-client model (i.e. Citrix, VB.NET, SQL-Server/Oracle, MS-Office integration; Excel, Access, Outlook).
- Integrated embedded P+L benchmark "engine" into Campaign application to automate actuarial modeling (i.e. Excel, VBA, VB.NET, SQL-Server/Oracle, XML, VS/VSS) for on-demand actuarial analysis of marketing campaigns' viability.
- Using upwards of 2000 variables and controlled by marketing staff, in either ad-hoc or batch-processing modes, this vastly simplified, or removed entirely, the onerous HO reporting requirements, freeing up marketing staff to actually do marketing.
- As tools and prototypes became stable and critical to operations, previously developed tools were outsourced and rewritten by offshore teams and consultancies.
Confidential, NY
Application-Architect, Senior Developer
Responsibilities:
- Changed to a more hands-on role as Architect/Designer for Confidential's first generation of Distributed/Federated data-warehousing tools for Sales/Marketing Automation supporting the Personal-Lines (business-unit) Insurance.
- Designed an architecture that was flexible enough to cater for many sizes of marketing department at different levels of maturity. It allowed choices of different tools, platforms and objectives, yet provided economies of scale through consistency of setup and operation.
- Continued to integrate third-party products (Cognos, Oracle/SQL-Server, SAS) with internally developed automation tools, written in VB.NET and C#.
- Eventually installed in ~50 countries over 8 years. Led the roll-out by coordinating and supporting technical & marketing teams on the ground.
- Tools included ETL, segmentation, List-Extraction, Data-Quality, File-Transfer, de-duplication/cleansing, automated data-mart generation and BI report generation/distribution.
Confidential, NY
Information-based Marketing Project Manager
Responsibilities:
- Moved to NYC head-office to manage outsourced initiative to implement Confidential's first Commercial-Insurance (business-unit) data-warehouse.
- Integrated third-party products (Cognos, Oracle/SQL-Server, SAS) with internally developed automation tools.
- Used intensively, post-9/11, to exclude doing business with blacklisted companies and their affiliates (via Dun & Bradstreet extracts) and later for anti-money-laundering activities.
- Developed suite of tools including: ETL, List-Extraction, Data-Quality/Data-Profiling, File-Transfer, de-duplication/cleansing, automated data-mart generation and BI report distribution.
- Developed major application to automate Cognos cube generation, report creation and delivery.
- Developed tools/proofs-of-concept with Data-Mining staff to automate prediction and fraud detection in blue-collar workmen's compensation insurance, for use as Alerts in Quotation system. Later adapted post-9/11 for money-laundering detection.
Confidential
Campaign Manager, Technology Lead, Developer
Responsibilities:
- Began data-driven marketing initiatives to support up-sell/cross-sell/retention strategies, within HK territory.
- Initially responsible for building data-marts from OLTP extracts and leveraging them to conduct repeatable marketing operations (mailings, emails, list-to-sales-reps, outbound t/m, etc.). This included merging/de-duplicating across three companies in HK (life, non-life and bank) using HK-ID.
- Gained more responsibility in technology leadership, designing and architecting solutions. Used BI (Cognos, Informatica) and automation ETL (DataStage/Trillium) to support marketing, analytics (SAS) and call-center (Siebel).
- This was a very reactive project that needed to be agile in responding to marketing & sales needs. With foresight, it still gained economies of scale through repeated processes (extract, clean, load), end-user visibility (SQL tools, BI, SAS, etc.) and quality (tools and scripts written in VBA/VB6).
- Developed tools with Data-Miners to predict propensity to renew auto-insurance policies, for use by Telemarketers.