Lead Big Data Architect - Engineer / Lead Cloud Architect - Engineer / Data Scientist Resume
Jacksonville, FL
WORK HISTORY:
Confidential, Jacksonville, FL
Lead Big Data Architect - Engineer / Lead Cloud Architect-Engineer / Data Scientist
Responsibilities:
- Design/Plan/Architect Pivotal Big Data Suite roadmap including use of the HDP / HDF Hadoop technology stack
- Design/Plan/Architect data ingestion strategy from 400+ data sources into (HDP) Hadoop Data Lake
- Design/Plan/Architect ETL strategies for real-time data pipeline ingestion to (HDP/HDF) Hadoop Data Lake
- Design/Plan/Architect Storm, Kafka and Spark architecture included in HDP/HDF real-time data solutions (see the Kafka-to-Spark streaming sketch after this list)
- Data Discovery, Data Profiling, Predictive modelling, Machine Learning, R & Python development
- Led the creation of Data Governance vision, charter, framework, committees and processes for the enterprise.
- Led the implementation, design (one full lifecycle) of Master Data Management (MDM) using MuleSoft / Talend.
- Proven "hands-on" MDM experience with expertise in MDM strategy proposal, roadmap and planning
- Phased implementation leveraging best practices and a strong focus on data quality
- Experience in design/architecture of MDM Hub, data integration, data governance process and data quality.
- Proficient using Talend, Profisee Maestro, Informatica Siperian and IBM InfoSphere MDM, DQ & DG tools.
- R & Python with comprehensive proficiency; Scala - Architecture & Collection Library, REPL, Scaladoc, Reflection, Macros
- Deployment of Hadoop and Spark ecosystems.
- Using Erwin for logical and physical database design, database optimization, loading strategy design and implementation, conducting business analysis, event modeling & using knowledge of standard commercial databases (Oracle, Teradata, DB2).
- Working in Big Data and microservices technologies such as Hadoop, MapReduce frameworks, Cassandra, Kafka, Spark, HBase, Hive, Spring Boot, Node.js, etc.
- Developing database solutions by designing the proposed system; defining database physical structure, functional capabilities, security, backup, and recovery specifications; and providing database support by coding utilities, responding to user queries and troubleshooting issues
- Interacting and collaborating with cross functional teams including application development, peer reviews, testing, operations, security and compliance and project management office, as well as business customers and external vendors.
- Machine Learning Frameworks - Amazon Machine Learning / Azure Machine Learning / Singa / H2O / Spark MLlib
- Machine Learning Frameworks (Streams) - Massive Online Analysis / Spark MLlib
- Regression, trees, neural networks, survival analysis, cluster analysis, forecasting, anomaly detection, association rules.
- Detailed understanding of machine learning pipelines and ability to discuss concepts such as feature discovery/engineering, model evaluation/validation, online vs. offline learning, and model deployment.
- Create predictive and clustering models utilizing Oracle, SQL Server and HDFS data sources (see the clustering sketch after this list)
- Define when predictive or clustering models could be utilized, and the type of data required to make them insightful
- Develop, extract and maintain logical and physical data models for data analytics within Direct Energy
- Enhancing data collection procedures to include information that is relevant for building analytic systems
- Data mining using state-of-the-art methods to produce actionable insight
- Selecting features, building and optimizing classifiers using machine learning techniques
- Design and develop predictive models and machine learning algorithms using advanced methodologies
- Architect - AWS, AWS RDS, AWS Data Warehouse, AWS Redshift & AWS Storage solutions
- Architect - EC2, S3, CloudFormation, RDS, CloudFront, VPC, Route53, IAM, CloudWatch, Beanstalk, Lambda
- Architect - Design, build and implement high-volume, high-scale data analytics and machine learning Snowflake solutions
- Architect - Data Migrations from Oracle, SQL Server and Hadoop (Hive / HDFS) to Snowflake databases (see the Snowflake load sketch after this list)
- Engineer - Snowflake Data Warehouse - Analyze and performance-tune the query processing engine within the Snowflake DW.
- Engineer - Snowflake Data Warehouse - Data Migration Strategy from On-Prem to the Snowflake DW solution - Ingestion Plan
- Engineer - Deploy cloud infrastructure (Security Groups and load balancers needed to support EBS environment)
- Engineer - Create and manage TFS continuous integration builds on VSTS
- Engineer - Responsible for maintaining AWS instances as part of EBS deployment
- Engineer - Systems administration with Windows / Unix scripting
- Engineer - Support on AWS services and DevOps deploying applications
- Architect - Technical / Solution SME within the Data Integration across on-premise and AWS data sources / applications
- Architect - MapR Data Fabric for Kubernetes (FlexVolume, PersistentVolume) - UDF and UDAF requirements
- Architect - Talend Data Fabric through Spark and AWS EMR for Big Data Batch Jobs - UDF and UDAF requirements
- Architect - Azure Data Factory, Data Pipeline Design, Azure Data Lake / Azure Storage - Oracle, DB2, SQL Server, MySQL
- Engineer - Azure Data Factory, Data Pipeline Development - SQL, SSIS, PowerShell and ETL scripting
- Engineer – Azure Data Flow, Data Modeling in Azure, and Azure Ad Hoc Reporting (design / development)
- Architect – ETL from AWS to Google Cloud to Azure and from/to other On-Prem data sources / targets.
- Architect – Google Cloud Platform utilizing the Data Analytics, Data Stream Analytics, Hadoop, Data Lake and BI toolset
- Engineer – Google Cloud Platform Data ingestion, Analytics datasets, data lake integration, data migration to Google Cloud
- Engineer – Google Cloud Platform to Kafka and Spark cluster solutions, Google Cloud Platform to Azure via HDFS/Hive
- Architect – Google BigQuery for use cases where other Hadoop solutions didn’t provide the results needed by the business.
- Engineer – Develop / Design data patterns via microservices into data pipelines across the Azure Technology Stack.
- Architect & Administrator (AWS) GemFire XD, PostgreSQL, Greenplum, HAWQ & Kafka environments; Go (Golang) programming
- Architect & Administrator (Azure) Azure SQL DB, Hadoop, Hadoop Spark w/ NoSQL (Mongo, Cassandra & Couchbase)
- Architect – Salesforce Data Extractions into Azure Data Lake & Cosmos DB / MuleSoft Microsoft Service Bus Connector
- Architect – MuleSoft Anypoint Platform to Azure API – Data Ingestion / Data Services from/to Salesforce, SAP, Databricks
- Architect – Designed / Developed Data Migration Strategy from On Premise to Cloud (SQL & NoSQL Technology)
- Architect & Administrator (Google Cloud) Hadoop, MongoDB, Couchbase, Hbase, PostgreSQL, Cassandra (Spark / Storm)
- Architect - Callidus Cloud w/ SAP HANA; ETL Sales Data via Kafka to Data Lake (Hadoop); Data Visualizations
- Architect – Callidus Cloud integration to enterprise data stores via both SAP data, non-SAP data and Master Data Management
- Engineer – Deployment MapR Data Fabric for Kubernetes (FlexVolume, PersistentVolume) – UDF and UDAF requirements
- Engineer – Deployment Talend Data Fabric, Spark within AWS EMR – UDF and UDAF requirements
- Engineer – NetApp Data Fabric architecture for UDF and UDAF deployments
- Engineer - Azure Data Bricks, Azure Data Lake Service, Azure SQL Data Warehouse, Azure Data Catalog
- Engineer - Technical / Solution SME within the Data Integration of Azure, Blob Storage, Log Analytics
- Architect & Administrator (Azure) Cosmos DB – Schema Design, Data ingestion, Performance and Query optimization
- Oracle GoldenGate / Oracle Data Integrator / Hive / PostgreSQL data integration design & configuration
- Oracle 12c Enterprise Metadata Management installation and deployment
- Data Integration with MuleSoft EBS, JMS Transport - TIBCO Suite (EMS) for EAI, SOA and BPM
- MuleSoft integration patterns – Migration, Broadcast, Aggregation, Bidirectional Sync, Correlation.
- MuleSoft Anypoint Platform, Connectors / Transports, Enterprise Service Bus, Integration Services
- Installation and Deployment – Oracle Enterprise Metadata Management for Hadoop / PostgreSQL Data Store
- Implement Talend Data Integration- Reading an input file, transforming data, combining columns, Joining data sources
- Implement Talend Data Integration- Creating database metadata, Joining data, Master Data Management model design
- Define Cloud Native approach (microservices, containers/Docker, Agile, behavior-driven development (BDD))
- Define data ingestion strategies; Kafka, Storm, NiFi, Zookeeper, Oozie, Sqoop – Lambda Architecture
- Cloud to/from On-Prem – “Apache” - Kafka, NiFi, Storm, Flume, Sqoop, Samza, Chukwa
- Cloud to/from On-Prem - Wavefront, DataTorrent, Amazon Kinesis, Syncsort, Gobblin, Fluentd, Cloudera Morphlines
- Cloud to/from On-Prem - White Elephant, Heka, Scribe, Databus
- Tools were tested, compared and benchmarked via POC and performance testing to determine the ingestion strategy / tool outcome
- Delta Architecture Ingestion - Acquisition -> Sqoop, Flume, Python; Messaging -> Kafka, Pulsar
- Delta Architecture Ingestion - Stateful -> Flink; Query / Processing / Lambda -> Hadoop, MapReduce, Hive, NoSQL
- Neo4j - Graph Data – Query and analyze highly connected data; Native Graph Storage, Native Graph Processing
- Neo4j - Graph scalability, high availability, Graph Clustering – Graphs on Spark, Graphs in Azure Cloud Graph Platform
- HBase - Cluster design and management, including backup/recovery, replication, cluster failover, and disaster recovery
- HBase – Large structured and unstructured datasets from multiple data sources – pipeline to Hadoop / Kafka clusters
- Couchbase - 5.0/4.6.x (15 Node/Cluster) Document Data modeling, Cluster Management w/ Hadoop HDF
- Couchbase - Node Configuration, Data Conversion to JSON, Hadoop/HDP
- Cassandra - DataStax (25 Node/Cluster) Transaction data with replicas across 3 data centers
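
The Kafka/Spark bullets above describe real-time ingestion into the Hadoop Data Lake. A minimal PySpark Structured Streaming sketch of that pattern, assuming a hypothetical events topic, broker address, event schema, and HDFS landing paths (none taken from the engagement itself):

    from pyspark.sql import SparkSession
    from pyspark.sql.functions import col, from_json
    from pyspark.sql.types import StructType, StringType, TimestampType

    # Hypothetical schema for an ingested event; real feeds would differ.
    schema = (StructType()
              .add("source_id", StringType())
              .add("payload", StringType())
              .add("event_time", TimestampType()))

    spark = SparkSession.builder.appName("kafka-ingest").getOrCreate()

    # Read the Kafka topic as a stream; broker address and topic are placeholders.
    raw = (spark.readStream.format("kafka")
           .option("kafka.bootstrap.servers", "broker:9092")
           .option("subscribe", "events")
           .load())

    # Kafka delivers bytes; decode the value column and parse the JSON payload.
    parsed = (raw.selectExpr("CAST(value AS STRING) AS json")
              .select(from_json(col("json"), schema).alias("e"))
              .select("e.*"))

    # Land the parsed stream in the data lake as Parquet (HDFS paths illustrative).
    query = (parsed.writeStream.format("parquet")
             .option("path", "hdfs:///data/lake/events")
             .option("checkpointLocation", "hdfs:///chk/events")
             .start())
    query.awaitTermination()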
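For the predictive/clustering modeling bullets, a minimal clustering sketch using scikit-learn; the input file, feature names, and k=5 are illustrative assumptions, not details from the project:

    import pandas as pd
    from sklearn.preprocessing import StandardScaler
    from sklearn.cluster import KMeans

    # Hypothetical feature extract; in practice this would come from the
    # Oracle, SQL Server, or HDFS sources described above.
    df = pd.read_parquet("customer_features.parquet")
    features = df[["usage_kwh", "tenure_months", "avg_bill"]]

    # Standardize so no single feature dominates the distance metric.
    X = StandardScaler().fit_transform(features)

    # Fit k-means with an assumed k=5; in practice k is chosen via elbow/silhouette.
    model = KMeans(n_clusters=5, n_init=10, random_state=42).fit(X)
    df["segment"] = model.labels_
    print(df.groupby("segment")[["usage_kwh", "avg_bill"]].mean())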
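The Snowflake migration bullets center on staging extracted files and bulk-loading them. A minimal sketch with the Snowflake Python connector, assuming hypothetical connection parameters and an ORDERS table stage:

    import snowflake.connector

    # Connection parameters are placeholders.
    conn = snowflake.connector.connect(
        account="myaccount", user="etl_user", password="...",
        warehouse="LOAD_WH", database="ANALYTICS", schema="STAGING")

    cur = conn.cursor()
    # Stage files exported from the source system (e.g., Oracle/Hive extracts).
    cur.execute("PUT file:///exports/orders_*.csv @%ORDERS")
    # Bulk-load the staged files; COPY INTO is Snowflake's ingestion workhorse.
    cur.execute("""
        COPY INTO ORDERS
        FROM @%ORDERS
        FILE_FORMAT = (TYPE = CSV FIELD_OPTIONALLY_ENCLOSED_BY = '"' SKIP_HEADER = 1)
    """)
    cur.close()
    conn.close()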
Confidential, Santa Monica, CA
Lead Big Data Architect-Engineer / Lead Cloud Architect-Engineer / Enterprise Data Architect / Data Scientist
Responsibilities:
- Design/Plan/Architect Pivotal Big Data Suite roadmap including use of the HDP / HDF Hadoop technology stack
- Design/Plan/Architect data ingestion strategy from 2500+ data sources into (HDP) Hadoop Data Lake
- Design/Plan/Architect ETL strategies for real-time data pipeline ingestion to (HDP/HDF) Hadoop Data Lake
- Design/Plan/Architect Storm, Kafka and Spark architecture included in HDP/HDF real-time data solutions
- Data Discovery, Data Profiling, Predictive modelling, Machine Learning, R & Python development
- Architect - AWS, AWS RDS, AWS Data Warehouse, AWS Redshift & AWS Storage solutions
- Architect – EC2, S3, CloudFormation, RDS, CloudFront, VPC, Route53, IAM, CloudWatch, Beanstalk, Lambda
- Architect - Design, build and implement high-volume, high-scale data analytics and machine learning Snowflake solutions
- Architect – Azure Data Factory, Data Pipeline Design, Azure Data Lake / Azure Storage – Oracle, DB2, SQL Server, MySQL
- Engineer – Azure Data Factory, Data Pipeline Development – SQL, SSIS, PowerShell and ETL scripting
- Engineer – Snowflake Data Warehouse - Analyze and performance-tune the query processing engine within the Snowflake DW.
- Engineer – Snowflake Data Warehouse - Data Migration Strategy from On-Prem to the Snowflake DW solution - Ingestion Plan
- Engineer - Deploy cloud infrastructure (Security Groups and load balancers needed to support EBS environment)
- Engineer - Create and manage TFS continuous integration builds on VSTS
- Engineer - Responsible for maintaining AWS instances as part of EBS deployment
- Engineer - Systems administration with Windows / Unix scripting
- Excellent grasp of integrating multiple data sources into an enterprise data management platform; can lead data storage solution design
- Ability to understand business requirements and build pragmatic/cost-effective solutions using agile project methodologies
- Participate in Agile/Scrum ceremonies, including two-week release sprints
- Perform requirements analysis and high quality code development
- Review the code of coworkers and offer feedback
- Design frameworks, libraries, and components that are reusable
- Engineer - Support on AWS services and DevOps deploying applications
- Architect - Technical / Solution SME within the Data Integration across on-premise and AWS data sources / applications
- Architect – MapR Data Fabric for Kubernetes (FlexVolume, PersistentVolume) – UDF and UDAF requirements
- Architect – Talend Data Fabric through Spark and AWS EMR for Big Data Batch Jobs – UDF and UDAF requirements
- Engineer – Azure Data Flow, Data Modeling in Azure, and Azure Ad Hoc Reporting (design / development)
- Architect – ETL from AWS to Google Cloud to Azure and from/to other On-Prem data sources / targets.
- Architect – Google Cloud Platform utilizing the Data Analytics, Data Stream Analytics, Hadoop, Data Lake and BI toolset
- Engineer – Google Cloud Platform Data ingestion, Analytics datasets, data lake integration, data migration to Google Cloud
- Engineer – Google Cloud Platform to Kafka and Spark cluster solutions, Google Cloud Platform to Azure via HDFS/Hive
- Architect – Google BigQuery for use cases where other Hadoop solutions didn’t provide the results needed by the business.
- Engineer – Develop / Design data patterns via microservices into data pipelines across the Azure Technology Stack.
- Architect & Administrator (AWS) GemFire XD, PostgreSQL, Greenplum, HAWQ & Kafka environments; Go (Golang) programming
- Architect & Administrator (Azure) Azure SQL DB, Hadoop, Hadoop Spark w/ NoSQL (Mongo, Cassandra & Couchbase)
- Architect – Designed / Developed Data Migration Strategy from On Premise to Cloud (SQL & NoSQL Technology)
- Architect & Administrator (Google Cloud) Hadoop, MongoDB, Couchbase, Hbase, PostgreSQL, Cassandra (Spark / Storm)
- Architect - Callidus Cloud w/ SAP Hana; ETL Sales Data via Kafka to Data Lake (Hadoop); Data Visualizations
- Architect – Callidus Cloud integration to enterprise data stores via both SAP data, non-SAP data and Master Data Management
- Engineer – Deployment MapR Data Fabric for Kubernetes (FlexVolume, PersistentVolume) – UDF and UDAF requirements
- Engineer – Deployment Talend Data Fabric, Spark within AWS EMR – UDF and UDAF requirements
- Engineer – NetApp Data Fabric architecture for UDF and UDAF deployments
- Engineer - Azure Data Bricks, Azure Data Lake Service, Azure SQL Data Warehouse, Azure Data Catalog
- Engineer - Technical / Solution SME within the Data Integration of Azure, Blob Storage, Log Analytics
- Architect & Administrator (Azure) Cosmos DB – Schema Design, Data ingestion, Performance and Query optimization
- Engineer – ETL from Azure SQL to multitude of data targets and to/from data targets/sources.
- DevOps – Automation for support, deployment, patching, configuration, SDLC, migration efforts, and sync with on-premise systems
- DevOps - Build/Release/Deployment/Operations; Tools (Datical, Jenkins, SolarWinds, Splunk, Vagrant, Nagios)
- DevOps - Linux/Unix/Windows Administration
- Led the creation of Data Governance vision, charter, framework, committees and processes for the enterprise.
- Led the implementation, design (one full lifecycle) of Master Data Management (MDM) using Profisee Maestro / Talend.
- Proven "hands-on" MDM experience with expertise in MDM strategy proposal, roadmap and planning
- Phased implementation leveraging best practices and a strong focus on data quality
- Experience in design/architecture of MDM Hub, data integration, data governance process and data quality.
- Proficient using Talend, Profisee Maestro, Informatica Siperian and IBM InfoSphere MDM, DQ & DG tools.
- R & Python with comprehensive proficiency; Scala – Architecture & Collection Library, REPL, Scaladoc, Reflection, Macros
- Machine Learning Frameworks - Amazon Machine Learning / Azure Machine Learning / Singa / H2O / Spark MLlib
- Machine Learning Frameworks (Streams) - Massive Online Analysis / Spark MLlib
- Regression, trees, neural networks, survival analysis, cluster analysis, forecasting, anomaly detection, association rules.
- Understanding of machine learning pipelines, feature discovery/engineering, model evaluation/validation/deployment.
- Translate complex business issues into achievable analytical learning objectives and actionable analytic projects
- Create predictive and clustering models utilizing Oracle, SQL Server and HDFS data sources
- Processing, cleansing, and verifying the integrity of data used for analytics
- Selecting features, building and optimizing classifiers using machine learning techniques
- Design and develop predictive models and machine learning algorithms using advanced methodologies
- Oracle GoldenGate / Oracle Data Integrator / Hive / PostgreSQL data integration design & configuration
- Oracle 12c Enterprise Metadata Management installation and deployment
- Installation and Deployment – Oracle Enterprise Metadata Management for Hadoop / PostgreSQL Data Store
- Implement Talend Data Integration- Reading an input file, transforming data, combining columns, Joining data sources
- Implement Talend Data Integration- Creating database metadata, Joining data, Master Data Management model design
- Define SOA based applications and micro-services for Data Pipeline Architecture
- Define Cloud Native approach (microservices, containers/Docker, Agile, behavior-driven development (BDD))
- Define data ingestion strategies; Kafka, Storm, NiFi, Zookeeper, Oozie, Sqoop – Lambda Architecture
- Cloud to/from On-Prem – “Apache” - Kafka, NiFi, Storm, Flume, Sqoop, Samza, Chukwa
- Cloud to/from On-Prem - Wavefront, DataTorrent, Amazon Kinesis, Syncsort, Gobblin, Fluentd, Cloudera Morphlines
- Cloud to/from On-Prem - White Elephant, Heka, Scribe, Databus
- Tools were tested, compared and benchmarked via POC and performance testing to determine the ingestion strategy / tool outcome
- Delta Architecture Ingestion - Acquisition -> Sqoop, Flume, Python; Messaging -> Kafka, Pulsar
- Delta Architecture Ingestion - Stateful -> Flink; Query / Processing / Lambda -> Hadoop, MapReduce, Hive, NoSQL
- Database Refactoring, Database Upgrades, Database Migrations, Database Platform Changes (SQL->SQL, SQL->NoSQL)
- Database connection pooling and configuration (Oracle, SQL Server, DB2, PostgreSQL)
- Migration from Informatica ETL / ELT methods into Azure Data Factory Data Pipeline architecture.
- Data Pipeline from relational database into cloud data lake and data storage using ETL extraction via Azure Data Factory
- SQL Env – Oracle 12c (450+ PRD DBs); SQL Server 2012/2014/2016 (425+ PRD DBs); DB2 9/10 (75 PRD DBs)
- Oracle – Oracle Performance Tuning, backup & recovery, DevOps Automated Changes, Data Migrations, ETL
- SQL – Oracle 11g, 12c, 12c R2, OEM 13c; SQL Server 2012, 2014, 2016; DB2 11.1, 10.5, 10.1, 9.8, 9.7
- Sybase – (ASE 15.x/16.x) – 30+ Dev/TST/PRD DBs (Log Shipping, Replication Server, DB Mirror)
- MySQL – (Percona 5.7.x) – 15+ PRD DBs Percona Cluster; (MariaDB 10.1.x, 10.2.x, 10.3.x) 20+ TST/DEV
- MySQL – Replication, cluster configurations, sharding – IaaS (On-Prem & Cloud), indexing, Elasticsearch
- PostgreSQL DBA 9.5/9.6 (20+ PRD DBs) – Data ingestion, data model optimization, table/index optimization
- PostgreSQL DBA – Backup/Recovery, Performance Tuning, Data loads, Connectors to other environments, SQL Tuning
- PostgreSQL DBA – Replication and Replica Management; Sync Management and Optimization;
- PostgreSQL DBA – Query, Storage, Index – Data Analytics Optimization
- Exadata – X5 migration to X7, 4-node NVMe, 18-core configuration
- Exadata - Migration from Oracle RAC to Oracle Exadata multi-tenant (RAC) Cluster
- Exadata – Parallelization Optimization, Index Optimization, Partition optimization, Stats optimization
- Exadata – Perf. tuning, Optimizer optimization, Configuration optimization, Smart Scans optimization
- Exadata – EXACHK, ASM, Clusterware, DCLI, CellCLI – Storage, Operating System and Network
- Oracle – Oracle 12c Pluggable Databases through Database Consolidation; Redaction Policy; Top-N Query and Fetch
- ODI – Oracle 12c Data Integrator configuration with Hadoop/Hive & Oracle DB integration
- All Databases – Meta Data Management, Log Management, In-Memory Optimization, Database Cluster Tuning
- SQL Server – SQL Profiler, Indexing Optimization, Parallel Query Optimization, Storage Optimization (Data Files, Logs)
- Netezza – (15+ PRD DBs) – 30+ Dev/TST/PRD DBs (Log Shipping, Replication Server, DB Mirror)
- Netezza – Provided technical efficiency, performance and security functions of Netezza databases.
- Netezza - Implemented procedures for allocation of hardware resources and performance tuning
- MongoDB – 3.6/3.4 (15 MongoDB PRD DBs) w/ sharding across a 20-node cluster (see the sharding sketch after this list)
- MongoDB – Full high availability within two data centers; document store for more than 5,000 users.
- Neo4j - Graph Data – Query and analyze highly connected data; Native Graph Storage, Native Graph Processing (see the Cypher query sketch after this list)
- Neo4j - Graph scalability, high availability, Graph Clustering – Graphs on Spark, Graphs in Azure Cloud Graph Platform
- HBase - Cluster design and management, including backup/recovery, replication, cluster failover, and disaster recovery
- HBase – Large structured and unstructured datasets from multiple data sources – pipeline to Hadoop / Kafka clusters
- Greenplum – (15+ PRD DBs), 10-node cluster, HAWQ, Pivotal HD & Big Data Suite
- Greenplum – Data Ingestion, Backup/Recovery, Performance Tuning, Connection Pooling, Query Tuning
- Couchbase - 5.0/4.6.x (15 Node/Cluster) Document Data modeling, Cluster Management w/ Hadoop HDF
- Couchbase - Node Configuration, Data Conversion to JSON, Hadoop/HDP
- Cassandra - DataStax (25 Node/Cluster) Transaction data with replicas across 3 data centers
- Oracle EBS 12.2.5 / DB 11gR2 Vision installations at 5 different locations for 11i-to-12c upgrade planning
- Oracle EBS upgrade from 12.1.3 to 12.2.5 / RDBMS upgrade from 11gR2 to 12c / EBS performance tuning
- Oracle EBS to Oracle PeopleSoft bi-directional data replication and data ETL into Big Data (Hadoop) Data Lake (Batch)
- PeopleSoft Prod, Test, QA & Dev support & administration
- PeopleSoft upgrade from 9.1 to 9.2 via Upgrade Assistant; PeopleSoft Tools upgrade from 8.51 to 8.54 via Change Assistant
- PeopleSoft Integration Broker, Gateway Properties, App Messages – Optimization
- PeopleSoft database to Oracle Exadata platform – Optimized: Statistics, Indexes, Configurations, Scans & Parallelization
- PeopleSoft Application Server, Process Scheduler, REN Server, Gateway, WebLogic & File Server support/upgrade
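
A minimal sketch of the MongoDB sharding setup described above, using pymongo admin commands; the database, collection, shard key, and router address are hypothetical:

    from pymongo import MongoClient

    # Connect through a mongos router; host/port are placeholders.
    client = MongoClient("mongodb://mongos-router:27017")

    # Enable sharding on the database, then shard the collection on a hashed
    # key so documents spread evenly across the cluster's shards.
    client.admin.command("enableSharding", "sales")
    client.admin.command(
        "shardCollection", "sales.orders",
        key={"customer_id": "hashed"})

    # Writes now route by the shard key; a quick smoke test:
    client.sales.orders.insert_one({"customer_id": "c-1001", "total": 42.50})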
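For the Neo4j bullets, a minimal Cypher query sketch via the official Python driver; the Account/TRANSFERRED_TO graph model and connection details are invented for illustration:

    from neo4j import GraphDatabase

    # Connection details are placeholders.
    driver = GraphDatabase.driver("bolt://neo4j-host:7687", auth=("neo4j", "..."))

    # Example of the "highly connected data" queries mentioned above:
    # find accounts within two hops of a flagged account.
    CYPHER = """
    MATCH (a:Account {flagged: true})-[:TRANSFERRED_TO*1..2]->(b:Account)
    RETURN DISTINCT b.id AS related_account
    """

    with driver.session() as session:
        for record in session.run(CYPHER):
            print(record["related_account"])

    driver.close()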
Confidential
Big Data Architect / Enterprise Architect / Data Scientist / Big Data Technical Lead / SQL Server 2016 DBA / Azure Cloud Architect
Responsibilities:
- Build services that help categorize data based on usage and underlying attributes coming from a variety of systems.
- Create systems that help quickly make anomalous patterns in data pipelines known to teams throughout enterprise.
- Provide requirements and techniques into systems that help cleanse data being used in key business data pipelines.
- Analyze data originating from many different source systems and database technologies.
- Work with people/teams throughout the enterprise to find opportunities to improve data quality for overall data products.
- Build features to support data categorization models, data quality anomaly detection and better data cleansing processes (a simple anomaly check is sketched after this list).
- Identify and improve data elements within existing data lakes and new data lakes still in design phase.
- Design and develop data requirements and samples that can be incorporated into engineering (technical) processes.
- Machine Learning Frameworks - Amazon Machine Learning / Azure Machine Learning / H2O / Spark MLlib
- Machine Learning Frameworks (Streams) - Massive Online Analysis / Spark MLlib
- Utilizing Event Stream Processing and Complex Event Processing, Egress, Visualization and Utilization
- Edit Python and R code for optimization and performance improvements.
- R & Python with comprehensive proficiency; Scala – Architecture & Collection Library, REPL, Scaladoc, Reflection, Macros
- Detailed understanding of machine learning pipelines and ability to discuss concepts such as feature discovery/engineering, model evaluation/validation, online vs. offline learning, and model deployment.
- Deployment of Hadoop and Spark ecosystems.
- Regression, trees, neural networks, survival analysis, cluster analysis, forecasting, anomaly detection, association rules.
- Big Data Architecture and Security, Maintenance and Governance
- Big Data Design Patterns - Data Ingress, Data Wrangling, Data Storage
- Big Data Solution Patterns - Data Processing, Data Analysis, Data Egress, Data Visualization
- Defined scope for the Big Data Platform and identified/selected initial use cases that would drive the Big Data project
- Big Data Strategy – Developed Initial Approach and Selected Initial Technology Stack
- Big Data Strategy – performance management, data exploration, social analytics, data science
- Architect & Admin. (Azure) PostgreSQL, Cloudera & Kafka environments; Go (Golang) programming
- Architect & Admin. (Azure) Hadoop Cloudera, Hadoop YARN, Hadoop Spark w/ Mongo, Cassandra & Couchbase
- Architect & Admin. (Azure) Cloudera, Hadoop YARN, Storm, NiFi w/ Mongo, Cassandra & Couchbase
- Architect – Azure Data Factory, Data Pipeline Design, Azure Data Lake / Azure Storage – HDFS, SQL, NoSQL
- Engineer – Deployment MapR Data Fabric for Kubernetes (FlexVolume, PersistentVolume) – UDF and UDAF requirements
- Engineer – Deployment Talend Data Fabric, Spark within AWS EMR – UDF and UDAF requirements
- Architect – MapR Data Fabric for Kubernetes (FlexVolume, PersistentVolume) – UDF and UDAF requirements
- Support API and Java developer teams with both administration of the total cluster and data requests from legacy and cloud systems
- DevOps – Automation for support, deployment, patching, configuration, SDLC, migration efforts, and sync with on-premise systems
- DevOps - Build/Release/Deployment/Operations; Tools (Datical, Jenkins, SolarWinds, Splunk, Vagrant, Nagios)
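
A minimal sketch of the data-quality anomaly detection mentioned above, flagging unusual daily row counts with a rolling z-score; the input file, column names, and 3-sigma threshold are assumptions for illustration:

    import pandas as pd

    # Hypothetical daily row counts per pipeline feed.
    counts = pd.read_csv("feed_row_counts.csv", parse_dates=["day"])

    # Rolling mean/std over a 30-day window, computed per feed.
    g = counts.sort_values("day").groupby("feed_name")["rows"]
    counts["mean30"] = g.transform(lambda s: s.rolling(30, min_periods=7).mean())
    counts["std30"] = g.transform(lambda s: s.rolling(30, min_periods=7).std())

    # Flag days whose volume is more than 3 standard deviations from the norm.
    counts["zscore"] = (counts["rows"] - counts["mean30"]) / counts["std30"]
    anomalies = counts[counts["zscore"].abs() > 3]
    print(anomalies[["feed_name", "day", "rows", "zscore"]])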
Confidential
Database Administrator / Database Engineer / Big Data Architect / Big Data Administrator / Hadoop Administrator / Hadoop Technical Lead
Responsibilities:
- Drafted Enterprise Big Data Platform policy, which was incorporated into executive Project Management guidance
- Defined scope for the Big Data Platform and identified/selected initial use cases that would drive the Big Data project
- Big Data Strategy – Developed Initial Approach and Selected Initial Technology Stack
- Big Data Strategy – performance management, data exploration, social analytics, data science
- Architect & Administrator Hadoop Cluster, MongoDB
- Architect & Administrator Hadoop Cluster; Hadoop HDFS; Hadoop Hive; Hadoop Map Reduce; Hadoop Pig
- Oracle 12c Enterprise Metadata Management installation and deployment
- Database connection pooling and configuration (Oracle, SQL Server, DB2, MySQL – ODBC & JDBC; see the pooling sketch after this list)
- Oracle Enterprise Metadata Management - Impact Analysis, Annotation and Tagging functions, Reporting Source Lineage
- Oracle Exadata - Migration from Oracle RAC to Oracle Exadata multi-tenant (RAC) Cluster
- Oracle Exadata – Parallelization Optimization, Index Optimization, Partition optimization, Statistics optimization
- Oracle Exadata – Performance tuning, Optimizer optimization, Configuration optimization, Smart Scans optimization
- Oracle GoldenGate / Oracle Data Integrator / Hive / PostgreSQL data integration design & configuration
- SQL DBA - Log Shipping, Database Restore, Database Refreshes, Monitoring
- SQL DBA – Meta Data Management, Log Management, In-Memory Optimization, Database Cluster Tuning
- SQL DBA – SQL Profiler, Indexing Optimization, Parallel Query Optimization, Storage Optimization (Data Files, Logs)
- DB2/UDB DBA – Backups, Performance Tuning, Parameter/Configuration Optimization, Partitioning, Query Optimization
- DB2/UDB DBA - Log Shipping, Database Restore, Database Refreshes, Monitoring
- DB2/UDB DBA – Meta Data Management, Log Management, In-Memory Optimization, Database Cluster Tuning
- DB2/UDB DBA – Indexing Optimization, Parallel Query Optimization, Storage Optimization (Data Files, Logs)
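
A minimal connection-pooling sketch for the bullet above, using SQLAlchemy over pyodbc against SQL Server; the DSN, credentials, and pool sizes are placeholders, and the same pattern applies to Oracle, DB2, or MySQL drivers:

    from sqlalchemy import create_engine, text

    # Pooled engine against SQL Server via ODBC; DSN/credentials are placeholders.
    # pool_size / max_overflow bound concurrent connections; pool_pre_ping
    # validates connections before reuse so stale ones are recycled.
    engine = create_engine(
        "mssql+pyodbc://etl_user:secret@MY_DSN",
        pool_size=10,
        max_overflow=5,
        pool_recycle=1800,   # recycle connections after 30 minutes
        pool_pre_ping=True,
    )

    # Each block checks a connection out of the pool and returns it on exit.
    with engine.connect() as conn:
        row_count = conn.execute(text("SELECT COUNT(*) FROM dbo.orders")).scalar()
        print(row_count)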
Confidential, McLean, VA
Big Data Architect / Data Governance and Master Data Management
Responsibilities:
- Database Refactoring, Database Upgrades, Database Migrations, Database Platform Changes (SQL->SQL, SQL->NoSQL)
- Defined scope for the Big Data Platform and identified/selected initial use cases that would drive the Big Data project
- Big Data Strategy – Developed Initial Approach and Selected Initial Technology Stack
- Technical Architect of analytics platform collecting usage from billions of records.
- Database connection pooling and configuration (Oracle, SQL Server – ODBC & JDBC)
- Led a team of engineers and coordinated the QA effort for prod-ops, QA, and presentations to product and executive teams.
- Implemented a high-speed caching engine directly serving millions of customers, which had not been possible previously.
- Database Refactoring, Database Upgrades, Database Migrations, Database Platform Changes (SQL->SQL)
- Led the creation of Data Governance vision, charter, framework, committees and processes for the enterprise.
- Led the implementation, design (one full lifecycle) of Master Data Management (MDM).
- Proven "hands-on" MDM experience with expertise in MDM strategy proposal, roadmap and planning
- Phased implementation leveraging best practices and a strong focus on data quality
- Experience in design/architecture of MDM Hub, data integration, data governance process and data quality.
- Documented cloud strategy for Big Data Platform and showed value for use cases selected for project
- Set up initial ETL design for Big Data Platform from the production EDW (replacing Informatica with the Big Data Platform; a Hive extract sketch follows this list)
- Architect & Administrator Hadoop Cluster; Hadoop HDFS; Hadoop Hive; Hadoop Pig
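
As a sketch of the Hive side of that ETL design, a minimal PySpark job that reads a Hive-registered table and writes a partitioned extract; the database, table, and column names are hypothetical:

    from pyspark.sql import SparkSession
    from pyspark.sql.functions import col

    # Hive support lets Spark read tables registered in the Hive metastore.
    spark = (SparkSession.builder
             .appName("edw-extract")
             .enableHiveSupport()
             .getOrCreate())

    # Pull current-year transactions from a hypothetical EDW staging table.
    txns = spark.sql("SELECT * FROM edw_staging.transactions WHERE txn_year = 2018")

    # Light transformation, then write back to the lake partitioned by month.
    (txns.filter(col("amount") > 0)
         .write.mode("overwrite")
         .partitionBy("txn_month")
         .saveAsTable("analytics.transactions_clean"))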