
Sr. Big Data/Data Science Lead/Architect/Manager Resume


Seattle, WA

SUMMARY

  • Over 18 years of IT experience delivering industrial business solutions for clients' eDW/BI requirements in roles such as Sr. eDW/eBI/Big Data/Data Science Architect/Lead/Manager (Data, ETL, Data Quality, Master Data Management (MDM), Big Data, Machine Learning, Data Science).
  • Core competencies: managing technical resources, on-time project delivery, controlling project budgets, people management, mentoring direct reports, strong organizational skills, identifying opportunities to provide more innovative solutions for large-scale enterprise data warehouse design, development, implementation, and support, and technical problem-solving that drives continuous productivity in a fast-paced, hyper-growth environment to accomplish project deliverables.
  • Hands-on experience writing Spark code in Scala to ingest files in near-real-time and batch modes.
  • Industries: Health Care, Insurance, Finance, Hospitality, Retail, Auction, Entertainment, Software, and Internet.
  • Project and resource planning, managing resources, scheduling, executing, and implementing analytical reporting platforms in the cloud, along with risk management.
  • ODS, OLTP, OLAP, and MDM system design; building eDW/eBI platforms; NLP, machine learning, and data science (data analytics, business intelligence, statistics). Work closely with business users, DBAs, developers, and managers to accomplish cross-functional tasks.
  • Using Confidential technologies such as cloud platform storage, SQL, BigQuery, Dremel, Pantheon, Clarinet, Tableau, and the Hadoop ecosystem (HDFS, HBase, Cassandra 2.0, MapReduce, Pig, Hive, Avro, Sqoop, ZooKeeper, Oozie) to build the unstructured big data warehouse for machine logs and clickstream data.
  • Knowledge of Hibernate, Spring, and Scala on the JVM, as well as Tomcat and Apache servers.

TECHNICAL SKILLS

Management Tools: MS Project, PowerPoint, Excel, Word, Mind-map

Data Science: Python 3.2 (NumPy, Pandas, scikit-learn), R 3.4, SAS, Spark MLlib

Distributed Computing: Apache Hadoop 0.20/1.1.1/2.0 (HDFS, MapReduce), Pig 0.10.0, Hive 0.90, Cloudera CDH3 & 4, Hortonworks 0.20.X, 1.1.1

Streaming: Flume for logs, Sqoop for relational DBs, Storm

Reporting Tools: Tableau 8/9.X, OBIEE 10.x, MicroStrategy 7.0, Cognos 6.X, BusinessObjects 5.X, Crystal Reports

ETL: Informatica PowerCenter 8.x/7.X/6.X/5.X/4.X, RETL, RFX (10.3.3, 11.1), DataStage 7.0, OWB 9.0, Pentaho, Talend

Programming Languages: Java, J2EE, Visual Basic 6.X/5.X/4.X, C, C++, HTML

Scripting Languages: Perl, Shell (Korn, Bash), JavaScript, Python

IDE Tools: Eclipse SDK Luna (4.4.2), PyCharm 2016.2

ERP: SAP NetWeaver® 7.0 R/3 & BW, Oracle 11i Application, SAP CRM

Data Modeling Tools: IBM Power Designer 15.X, Erwin 3.5.2, 4.0 & 7.1, Designer 2000, ER Studio 9.0, Microsoft Visio

E-Business: ORACLE 11i Application (ERP, CRM, HRMS)

Web Servers: BEA Weblogic 5.1, 6.0, IIS 4.0

Data Quality Tool: Trillium 5.X/6.X/7.X/11.X, First Logic 5.0

Relational (Row) Databases: Oracle 12c, 11g, 10g, 9i (9.2), 8i (8.1.7, 8.1.6), 8, 7.x (7.3, 7.1), Teradata V2R5.1, V2R6.0, Greenplum 3.1, MS SQL Server 2005

Relational (Column) Databases: Confidential BigQuery, HBase, Redshift

NoSQL Databases: Confidential BigTable, Cassandra 2.0

Distributed File Systems: Confidential CNS, GFS, Hadoop HDFS, AWS S3

PROFESSIONAL EXPERIENCE

Confidential, Seattle, WA

Sr. Big Data/Data Science Lead/Architect/Manager

Responsibilities:

  • Work with senior executives (VPs, Directors) to translate their vision into action and build a hybrid Business Intelligence and Advanced Analytics platform to support the multi-platform media Disney-ABC TV business.
  • Build descriptive analytical models for conversion, reach and frequency, and audience analytics (loyalty, retention, segmentation) using R, Python, and Tableau.
  • Developed batch scripts to fetch data from AWS S3 storage and perform the required transformations in Scala using the Spark framework (a minimal ingestion sketch follows this list).
  • Developed ETL data pipelines using Spark, Spark streaming and Scala.
  • Hands-on experience writing Spark code in Scala to ingest files in near-real-time and batch modes.
  • Built predictive and prescriptive analytical models for RFM (reach, frequency, and monetary), churn, and lift using supervised and unsupervised algorithms with SAS, Python, and R (see the churn-model sketch below).
  • Use NLTK to build sentiment analysis based on viewers' watch behavior and preferences (see the sentiment sketch below).
  • Developed Spark code and Spark SQL for faster testing and processing of data using Scala and Python.
  • Provided an enterprise information architecture roadmap covering eDW/eBI, MDM, big data, and data science (supervised and unsupervised learning).
  • Architectural roadmap for a Master Data Management solution for Series, Episode, Geo, Date, TimeZone, Network, etc.
  • Develop an end-to-end data quality solution for data loads and apply linear regression to predict data growth for capacity planning (see the regression sketch below).
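
The Spark ingestion bullets above describe Scala code; as a minimal illustration only, the sketch below shows the same batch read-transform-write pattern (plus a Spark SQL step) in PySpark. The bucket, paths, and column names are hypothetical placeholders, not the actual pipeline.

    # Minimal PySpark sketch: batch ingestion from S3, a transformation, and a Spark SQL step.
    # Bucket, paths, and column names are hypothetical.
    from pyspark.sql import SparkSession, functions as F

    spark = SparkSession.builder.appName("s3-batch-ingest").getOrCreate()

    # Batch mode: read raw JSON events landed in S3 and apply basic transformations.
    events = (spark.read
              .json("s3a://example-bucket/raw/events/2017-06-01/")
              .withColumn("event_date", F.to_date("event_ts"))
              .dropDuplicates(["event_id"]))

    # Register the frame and use Spark SQL for quick validation and processing.
    events.createOrReplaceTempView("events")
    daily_counts = spark.sql(
        "SELECT event_date, network, COUNT(*) AS views "
        "FROM events GROUP BY event_date, network")

    # Write the curated output back to S3 as Parquet, partitioned by date.
    (daily_counts.write.mode("overwrite")
     .partitionBy("event_date")
     .parquet("s3a://example-bucket/curated/daily_counts/"))

A near-real-time variant would swap spark.read for spark.readStream with a matching writeStream sink; the structure is otherwise the same.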
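
As a hedged sketch of the supervised churn modeling described above, the snippet below fits a simple scikit-learn classifier; the feature file and column names are hypothetical.

    # Hypothetical churn-model sketch with scikit-learn; file and column names are placeholders.
    import pandas as pd
    from sklearn.model_selection import train_test_split
    from sklearn.linear_model import LogisticRegression
    from sklearn.metrics import roc_auc_score

    df = pd.read_csv("viewer_features.csv")            # RFM-style feature set (hypothetical)
    X = df[["recency_days", "frequency", "monetary"]]
    y = df["churned"]                                   # 1 = churned, 0 = retained

    X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.3, random_state=42)
    model = LogisticRegression(max_iter=1000).fit(X_train, y_train)
    print("AUC:", roc_auc_score(y_test, model.predict_proba(X_test)[:, 1]))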
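
For the sentiment-analysis bullet, a minimal NLTK sketch (using the VADER analyzer) might look like the following; the comment strings are invented examples.

    # Minimal NLTK/VADER sentiment sketch over viewer comments; the inputs are made up.
    import nltk
    from nltk.sentiment.vader import SentimentIntensityAnalyzer

    nltk.download("vader_lexicon", quiet=True)
    analyzer = SentimentIntensityAnalyzer()

    comments = [
        "Loved last night's episode, can't wait for the finale!",
        "The new schedule is confusing and the app keeps buffering.",
    ]
    for text in comments:
        scores = analyzer.polarity_scores(text)   # neg / neu / pos / compound
        print(round(scores["compound"], 2), text)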
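
The regression-based capacity planning can be sketched as a one-variable linear trend; the monthly volumes below are illustrative only.

    # Hedged sketch: fit a linear trend to monthly data volume and project the next quarter.
    import numpy as np
    from sklearn.linear_model import LinearRegression

    months = np.arange(1, 13).reshape(-1, 1)                  # month index 1..12
    volume_tb = np.array([4.1, 4.4, 4.9, 5.2, 5.8, 6.1,       # observed TB per month (made up)
                          6.7, 7.0, 7.6, 8.1, 8.8, 9.3])

    model = LinearRegression().fit(months, volume_tb)
    projection = model.predict(np.array([[13], [14], [15]]))
    print("Projected TB for the next three months:", projection.round(1))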

Confidential, Mountain View, CA

Responsibilities:

  • Manage a team, 120+ engagements, and more than $2 million worth of Tableau Desktop licenses; conduct weekly team meetings and in-and-out-of-box Corp-Engineering Reporting Platform solution design review meetings; contributed to winning the CIO Award for the CorpEng Reporting Platform.
  • Define CERP SDLC phases and processes; review reference architecture (eDW/eBI, IoT, big data lambda, web, data visualization), processes, and other artifacts; facilitated the rollout of the CERP self-service reporting platform (PaaS) for various business functional verticals such as BizApps (Global Business - Marketing, Supply Chain, Finance, HR, Legal, Treasury, Guts ticketing) and Science and Technologies (YouTube, Netops, gFiber, Technical Infrastructure, Confidential +, Android) use cases.
  • Mentoring and providing analytical solutions to users; worked with various teams such as SRE, CE Data Ops, Cloud Platform teams (Helix, BigQuery), BizApps, and CE-Engineering to stabilize new features for user use cases on the CERP platform; provide SOA and multi-tenant architecture along with SaaS.
  • Facilitated integration of R with the CERP platform to provide machine learning (regression, pattern matching) and data science solutions for business use cases (pricing and prediction models) such as regression, predictive modeling, and statistical inference.
  • Regulated data privacy, protection, and security clearance for user data on the CERP platform and integrated groups (Confidential, LDAP, Autorole, and AD) to manage CERP stack components for user credentials.
  • Design data warehouse schemas, a data lake for the big data solution, ETL, data ingestion (batch and event driven) and extraction pipelines, and visualization dashboards for engagement growth forecasts, license distributions, sales forecasts, YouTube, and Netops.
  • Automated several processes, such as Tableau license data extracts and Tableau Server upgrade user test cases, using Java, Python, Selenium, and Eclipse (a minimal Selenium sketch follows this list).
  • Utilize Confidential cloud storage, SQL, BigQuery, Dremel/SQL, computing, analytics, file systems (CNS, GFS), Borg software infrastructure and machine (Borglet cell) resources for various business users' reporting use cases, along with the Hadoop ecosystem (Sqoop, Flume, Storm, HDFS, MapReduce, Pig, Hive, HBase) and R (see the BigQuery sketch below).
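
For the Tableau automation bullet, a hedged Selenium sketch of a post-upgrade smoke test might look like this; the URL and element names are hypothetical placeholders.

    # Hypothetical Selenium smoke test for a Tableau Server sign-in page after an upgrade.
    from selenium import webdriver
    from selenium.webdriver.common.by import By

    driver = webdriver.Chrome()
    try:
        driver.get("https://tableau.example.com")
        # Confirm the sign-in form rendered and the page title looks right.
        assert driver.find_element(By.NAME, "username").is_displayed()
        assert "Tableau" in driver.title
        print("Sign-in page smoke test passed")
    finally:
        driver.quit()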
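
The BigQuery-backed reporting described above can be illustrated with the Python client library; the project, dataset, and table names below are made up.

    # Hedged sketch of pulling reporting data from BigQuery with the Python client.
    from google.cloud import bigquery

    client = bigquery.Client(project="example-project")
    sql = """
        SELECT report_date, vertical, COUNT(DISTINCT user_id) AS active_users
        FROM `example-project.cerp_reporting.usage_events`
        WHERE report_date >= DATE_SUB(CURRENT_DATE(), INTERVAL 30 DAY)
        GROUP BY report_date, vertical
    """
    for row in client.query(sql).result():
        print(row.report_date, row.vertical, row.active_users)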

Confidential, San Jose, CA

Responsibilities:

  • As a manager, my responsibilities include IT analysis/analytical thinking, business acumen, change management, decision making, prioritization frameworks, enterprise perspective, strategic alignment, and team effectiveness.
  • Cost management, planning and prioritization, PM process knowledge, risk identification and management, time and productivity management, and business networking.
  • Provide an architectural roadmap to build the next generation of near-real-time data warehouse for Confidential Inc.
  • Coaching Peers, Conflict Management, Cultural Adaptability, Virtual Remote Teaming
  • Provide roadmaps for project plans and budgets ranging from $500K to more than $5 million; build technical teams of around 15 to 20 on-site and offshore resources (DBAs, BAs, ETL developers, BI reporting developers, and QA) in matrix and cross-functional arrangements to accomplish project deliverables on time.
  • Extensively involved in building the talent pool (team) and in project and resource planning, execution, implementation, and delivery of enterprise data warehouse projects to business stakeholders using various SDLC methodologies: Agile, Scrum, and Waterfall.
  • Provide a roadmap for EDW architectural methodology, data reconciliation solution design, source acquisition design processes from various source systems, and business intelligence solutions.
  • Implementation of the following data marts: HR, Sales, Finance, CRM, AutoSupport, Weblog, and Clickstream.
  • Guided logical and physical data modeling of the EDW, forward and reverse engineering of schemas, volumetric analysis projections for the EDW schema, and design of ER diagrams for all subject areas using Erwin, plus UML diagrams for workflow design.
  • Coordinate with business users, DBAs, developers, managers, and directors to accomplish cross-functional tasks.

Confidential

Principal eDW Architect - Data, ETL, Data Quality

Responsibilities:

  • Gather business requirements from BAs and business users and convert them into conceptual, logical, and physical models of star or snowflake schemas for several data marts, such as SAR, Fraud, Customer Service, Seller Resumption, Customer First Event, Risk (acceptance user policy, machine identification, advanced risk science), and Call Credit.
  • Provided an end-to-end, solution-oriented ETL architecture for the clickstream data warehouse.
  • Initial study, requirements gathering, analysis, design, and development of the QC process for data warehouse data integrity, and redesign of the ETL batch to standards.
  • Establish standards, guidelines, and data quality rules for ETL (heterogeneous source acquisition, stage loads), the quality control process, and so on.
  • Develop Trillium batch projects for 170 international countries including US, CA, UK, DE, FR, IT, MX, AU, etc.
  • Design a universal cleansing adaptor with router, parser, geocoder, and re-constructor components for integrating all international countries in Informatica, along with mapplets, mappings, worklets, workflows, and AEPs (Advanced External Procedures) for winkey, parser/geocoder, and matcher processing of consumer profiles (name and address data).
  • Experience with extract, transform, and load (ETL) coding and master data management for all data marts, as well as best practices for common ETL design techniques such as change data capture, key generation, and optimization for several data marts including SAR, Fraud, Customer Service, Seller Resumption, Customer First Event, Risk (acceptance user policy), Machine Identification, and Advanced Risk Science (a change-data-capture sketch follows this list).
  • Used Teradata FastLoad, FastExport, TPump, and MultiLoad utilities to bulk-load data into tables and to export data from tables.
  • Performance tuning of SQL and ETL processes, and system performance monitoring (ETL servers, Teradata, Oracle).
  • Worked on the Retek retail application for inventory, allocation, sales, and finance for Confidential Inc.
  • Monitored operating system response in terms of CPU usage, disk space, swap space, and paging using various UNIX utilities such as sar, vmstat, and top.
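
The change-data-capture technique referenced above can be sketched as a hash comparison between the incoming extract and the previously loaded snapshot; the files and column names are hypothetical, and the real implementations used Informatica and Teradata rather than pandas.

    # Hedged CDC sketch: flag inserts and updates by hashing business columns.
    import hashlib
    import pandas as pd

    def row_hash(df, cols):
        """MD5 of the concatenated business columns, used to spot changed rows."""
        return df[cols].astype(str).agg("|".join, axis=1).map(
            lambda s: hashlib.md5(s.encode()).hexdigest())

    prev = pd.read_parquet("dw_snapshot.parquet")   # previously loaded rows (hypothetical)
    curr = pd.read_csv("source_extract.csv")        # today's source extract (hypothetical)
    cols = ["customer_id", "status", "balance"]

    prev["hash"] = row_hash(prev, cols)
    curr["hash"] = row_hash(curr, cols)

    merged = curr.merge(prev[["customer_id", "hash"]], on="customer_id",
                        how="left", suffixes=("", "_prev"))
    inserts = merged[merged["hash_prev"].isna()]
    updates = merged[merged["hash_prev"].notna() & (merged["hash"] != merged["hash_prev"])]
    print(len(inserts), "new rows;", len(updates), "changed rows")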

Confidential, Foster City, CA

Oracle DBA, Lead Informatica Admin

Responsibilities:

  • Data modeling for several data marts, such as funds, fraud, and risk, using Oracle Designer 2000.
  • Installation, configuration, upgrade from 8.X to 9.X (REPO, AP), backup, recovery, cloning of Oracle Databases.
  • Installation, configuration, upgrade from 5.1 to 6.2, backup, recovery, folder and mapping migration across DEV, QA, and PROD environments, and user, security, and permission management for Informatica 5.1 and 6.2.
  • Developed complex source extraction mappings for pre-staging, staging, and core dimension and fact population; they extract ODS data from various sources such as Oracle, SQL Server, flat files, and XML across several data marts and load it into the GWS EDW pre-staging area.
  • Created PL/SQL stored procedures, triggers, materialized views, and packages for the EDW.
  • Create sessions, batches, and email variables through Informatica Server Manager.
  • 24x7 on-call support of databases (7.x, 8.x, 8i, 9i) on Sun Solaris with HA Veritas Cluster Server and Quick I/O for raw devices, plus production Informatica (ETL) loads. Performance tuning involving SQL, memory, and disk I/O.
  • Wrote shell scripts to monitor CPU usage, memory, disk space, swap space, and paging using UNIX utilities such as sar, vmstat, and top, to automate analysis, gather statistical data, clear alert logs and trace dump files, and set up cron jobs for all of these (an illustrative monitoring sketch follows this list).
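
The original monitoring relied on shell scripts with sar, vmstat, and top under cron; purely as an illustrative Python alternative (not the scripts actually used), a psutil-based check might look like this, with hypothetical thresholds.

    # Illustrative psutil-based resource check; thresholds are hypothetical.
    import psutil

    cpu = psutil.cpu_percent(interval=1)        # % CPU over a one-second sample
    mem = psutil.virtual_memory().percent       # % RAM in use
    swap = psutil.swap_memory().percent         # % swap in use
    disk = psutil.disk_usage("/").percent       # % of root filesystem used

    for name, value, limit in [("cpu", cpu, 90), ("memory", mem, 85),
                               ("swap", swap, 50), ("disk /", disk, 80)]:
        status = "ALERT" if value > limit else "ok"
        print(f"{status}: {name} at {value:.1f}% (threshold {limit}%)")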

Confidential

Oracle DBA & Programmer

Responsibilities:

  • Installation, configuration, tuning, backup and recovery, and database instance creation.
  • Improve performance; maintain user privileges, roles, and resources; create database objects such as tables, views, indexes, and synonyms, and physical structures such as data files, log files, and control files. Migrated the Oracle database from V7.1 to V7.3. Import and export data between different database systems.

Environment: Oracle 7.1, 7.3 / Erwin / SQL Server 7.0, Visual Basic 4.0, MS-Access, Oracle Enterprise Manager 1.1, Win95, Win NT.
