Lead Big Data Architect - Engineer Resume
PROFESSIONAL EXPERIENCE
Confidential
Lead Big Data Architect - Engineer
Responsibilities:
- Design/Plan/Architect Pivotal Big Data Suite roadmap including use of the HDP / HDF Hadoop technology stack
- Design/Plan/Architect data ingestion strategy from 400+ data sources into (HDP) Hadoop Data Lake
- Design/Plan/Architect ETL strategies for real-time data pipeline ingestion to the (HDP/HDF) Hadoop Data Lake
- Design/Plan/Architect Storm, Kafka and Spark architecture included in HDP/HDF real-time data solutions (a minimal ingestion sketch follows this list)
- Data Discovery, Data Profiling, Predictive Modeling, Machine Learning, R & Python development
- Led the creation of Data Governance vision, charter, framework, committees and processes for the enterprise.
- Led the implementation, design (one full lifecycle) of Master Data Management (MDM) using MuleSoft / Talend.
- Proven "hands on" MDM experience with expertise in MDM strategy proposal, roadmap and planning
- Phased implementation leveraging best practices and strong focus in data quality
- Experience in design/architecture of MDM Hub, data integration, data governance process and data quality.
- Proficient using Talend, Profisee Maestro, Informatica Siperion and IBM Infosphere MDM, DQ & DG tools.
- R & Python with comprehensive proficiency; Scala - Architecture & Collection Library, REPL, Scala, Refection, Macros
- Deployment of Hadoop and Spark ecosystems.
- Using Erwin for logical and physical database design, database optimization, loading strategy design and implementation, conducting business analysis, event modeling & using knowledge of standard commercial databases (Oracle, Teradata, DB2).
- Working in Big Data and Microservices technologies such as Hadoop, MapReduce frameworks, Cassandra, Kafka, Spark, HBase, Hive, Spring Boot, Node.js, etc.
- Developing database solutions by designing the proposed system; defining database physical structure, functional capabilities, security, backup, and recovery specifications; and providing database support by coding utilities, responding to user queries, and troubleshooting issues
- Interacting and collaborating with cross-functional teams including application development, peer reviews, testing, operations, security and compliance, and the project management office, as well as business customers and external vendors.
- Machine Learning Frameworks - Amazon Machine Learning / Azure Machine Learning / Singa / H2O / Spark MLlib
- Machine Learning Frameworks (Streams) - Massive Online Analysis / Spark MLlib
- Regression, trees, neural networks, survival analysis, cluster analysis, forecasting, anomaly detection, association rules.
- Detailed understanding of machine learning pipelines and ability to discuss concepts such as feature discovery/engineering, model evaluation/validation, online vs. offline learning, and model deployment.
- Create predictive and clustering models utilizing Oracle, SQL Server and HDFS data sources
- Define when predictive or clustering models could be utilized, and the type of data required to make them insightful
- Develop, extract and maintain logical and physical data models for data analytics within Direct Energy
- Enhancing data collection procedures to include information that is relevant for building analytic systems
- Data mining using state-of-the-art methods to produce actionable insights
- Selecting features, building and optimizing classifiers using machine learning techniques
- Design and develop predictive models and machine learning algorithms using advanced methodologies
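The ingestion bullets above pair Kafka topics with Spark for real-time loads into the HDP data lake. Below is a minimal sketch of that pattern using PySpark Structured Streaming; the broker address, topic name, event schema, and HDFS paths are hypothetical placeholders, not details of the actual engagement.

```python
# Minimal sketch: streaming ingestion from Kafka into an HDFS-backed data lake
# with Spark Structured Streaming. Requires the spark-sql-kafka connector on
# the classpath. All broker/topic/path names below are illustrative.
from pyspark.sql import SparkSession
from pyspark.sql.functions import col, from_json
from pyspark.sql.types import StringType, StructField, StructType, TimestampType

spark = SparkSession.builder.appName("kafka-to-data-lake").getOrCreate()

# Assumed event contract; each real feed would carry its own schema.
event_schema = StructType([
    StructField("source_id", StringType()),
    StructField("payload", StringType()),
    StructField("event_ts", TimestampType()),
])

raw = (spark.readStream
       .format("kafka")
       .option("kafka.bootstrap.servers", "broker1:9092")  # hypothetical broker
       .option("subscribe", "ingest.events")                # hypothetical topic
       .load())

# Kafka delivers key/value as bytes; parse the JSON value into typed columns.
events = (raw.selectExpr("CAST(value AS STRING) AS json")
          .select(from_json(col("json"), event_schema).alias("e"))
          .select("e.*"))

# Land partitioned Parquet in the data lake; the checkpoint gives the stream
# at-least-once recovery across restarts.
(events.writeStream
 .format("parquet")
 .option("path", "hdfs:///data/lake/raw/events")            # hypothetical path
 .option("checkpointLocation", "hdfs:///data/lake/_chk/events")
 .partitionBy("source_id")
 .start()
 .awaitTermination())
```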
Confidential
Lead Big Data Architect-Engineer
Responsibilities:
- Design/Plan/Architect Pivotal Big Data Suite roadmap including use of the HDP / HDF Hadoop technology stack
- Design/Plan/Architect data ingestion strategy from 2500+ data sources into (HDP) Hadoop Data Lake
- Design/Plan/Architect ETL strategies for real-time data pipeline ingestion to the (HDP/HDF) Hadoop Data Lake
- Design/Plan/Architect Storm, Kafka and Spark architecture included in HDP/HDF real-time data solutions
- Data Discovery, Data Profiling, Predictive Modeling, Machine Learning, R & Python development
- Architect - AWS relational database (RDS), data warehouse (Redshift) & AWS storage solutions
- Architect - EC2, S3, CloudFormation, RDS, CloudFront, VPC, Route53, IAM, CloudWatch, Elastic Beanstalk, Lambda
- Architect - Design and implement high-volume, high-scale data analytics and machine learning solutions on Snowflake
- Architect - Azure Data Factory, Data Pipeline Design, Azure Data Lake / Azure Storage - Oracle, DB2, SQL Server, MySQL
- Engineer - Azure Data Factory, Data Pipeline Development - SQL, SSIS, PowerShell and ETL scripting
- Engineer - Snowflake Data Warehouse - Analyze and performance-tune the query processing engine within the Snowflake DW.
- Engineer - Snowflake Data Warehouse - Data migration strategy from on-prem to the Snowflake DW solution; ingestion plan (a staged-load sketch follows this list)
- Engineer - Deploy cloud infrastructure (Security Groups and load balancers needed to support EBS environment)
- Engineer - Create and manage TFS continuous integration (CI) builds on VSTS
- Engineer - Responsible for maintaining AWS instances as part of EBS deployment
- Engineer - Systems administration with Windows / Unix scripting
- Excellent grasp of integrating multiple data sources into an enterprise data management platform; able to lead data storage solution design
- Ability to understand business requirements and build pragmatic, cost-effective solutions using agile project methodologies
- Participate in Agile/Scrum ceremonies, including two-week release sprints
- Perform requirements analysis and high quality code development
- Review the code of coworkers and offer feedback
- Design frameworks, libraries, and components that are reusable
- Engineer - Support on AWS services and DevOps deploying applications
- Architect - Technical / Solution SME within the Data Integration across on-premise and AWS data sources / applications
- Architect - MapR Data Fabric for Kubernetes (FlexVolume, PersistentVolume) - UDF and UDAF requirements
- Architect – Talend Data Fabric through Spark and AWS EMR for Big Data Batch Jobs – UDF and UDAF requirements
- Engineer – Azure Data Flow, Data Modeling in Azure, and Azure Ad-Hoc Reporting (design / development)
- Architect – ETL from AWS to Google Cloud to Azure and from/to other On-Prem data sources / targets.
- Architect – Google Cloud Platform utilizing the Data Analytics, Data Stream Analytics, Hadoop, Data Lake and BI toolset
- Engineer – Google Cloud Platform Data ingestion, Analytics datasets, data lake integration, data migration to Google Cloud
- Engineer – Google Cloud Platform to Kafka and Spark cluster solutions, Google Cloud Platform to Azure via HDFS/Hive
- Architect – Google BigQuery for use cases where other Hadoop solutions didn't provide the results needed by the business.
- Engineer – Develop / Design data patterns via microservices into data pipelines across the Azure Technology Stack.
- Architect & Administrator (AWS) GenGireXD, PostgresSQL, Greenplum, Hawq & Kafka environments; GoLANG Program
- Architect & Administrator (Azure) Azure SQL DB, Hadoop, Hadoop Spark w/ NoSQL (Mongo, Cassandra & Couchbase)
- Architect – Designed / Developed Data Migration Strategy from On Premise to Cloud (SQL & NoSQL Technology)
- Architect & Administrator (Google Cloud) Hadoop, MongoDB, Couchbase, Hbase, PostgreSQL, Cassandra (Spark / Storm)
- Architect - Callidus Cloud w/ SAP HANA; ETL sales data via Kafka to the Data Lake (Hadoop); data visualizations
- Architect – Callidus Cloud integration to enterprise data stores via SAP data, non-SAP data and Master Data Management.
- Engineer – Deployment MapR Data Fabric for Kubernetes (FlexVolume, PersistentVolume) – UDF and UDAF requirements
- Engineer – Deployment Talend Data Fabric, Spark within AWS EMR – UDF and UDAF requirements
- Engineer – NetApp Data Fabric architecture for UDF and UDAF deployments
- Engineer - Azure Data Bricks, Azure Data Lake Service, Azure SQL Data Warehouse, Azure Data Catalog
- Engineer - Technical / Solution SME within the Data Integration of Azure, Blob Storage, Log Analytics
- Architect & Administrator (Azure) Cosmos DB – Schema Design, Data ingestion, Performance and Query optimization
- Engineer – ETL from Azure SQL to a multitude of data targets, and to/from other data sources and targets.
- DevOps – Automation for support, deployment, patching, configuration, SDLC, migration efforts, and sync with on-premise systems
- DevOps - Build/Release/Deployment/Operations; Tools (Datical, Jenkins, SolarWinds, Splunk, Vagrant, Nagios)
- DevOps - Linux/Unix/Windows Administration
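The Snowflake migration bullets above describe moving on-prem tables into a Snowflake warehouse under a defined ingestion plan. The following is a minimal sketch of the common stage-and-COPY load pattern using the snowflake-connector-python package; the account, credentials, stage, table, and file-format settings are illustrative assumptions, not values from the actual engagement.

```python
# Minimal sketch of the stage-and-COPY ingestion pattern for loading on-prem
# extracts into Snowflake. All connection values and object names are
# hypothetical placeholders.
import snowflake.connector

conn = snowflake.connector.connect(
    account="xy12345",   # hypothetical account locator
    user="LOADER",
    password="...",      # use a secrets manager in practice
    warehouse="LOAD_WH",
    database="ANALYTICS",
    schema="RAW",
)

cur = conn.cursor()
try:
    # Extract files are first pushed to an internal stage (e.g. via PUT from
    # the extract host), then bulk-loaded into the target table with COPY INTO.
    cur.execute("CREATE STAGE IF NOT EXISTS onprem_stage")
    cur.execute("""
        COPY INTO RAW.ORDERS
        FROM @onprem_stage/orders/
        FILE_FORMAT = (TYPE = CSV SKIP_HEADER = 1)
        ON_ERROR = 'CONTINUE'
    """)
finally:
    cur.close()
    conn.close()
```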
Confidential
Big Data Architect
Responsibilities:
- Build services that help categorize data based on usage and underlying attributes coming from a variety of systems.
- Create systems that quickly surface anomalous patterns in data pipelines to teams throughout the enterprise (a minimal volume-check sketch follows this list).
- Provide requirements and techniques into systems that help cleanse data being used in key business data pipelines.
- Analyze data originating from many different source systems and database technologies.
- Work with people and teams throughout the enterprise to find opportunities to improve data quality for overall data products.
- Build features to support data categorization models, data quality anomaly detection and better data cleansing processes.
- Identify and improve data elements within existing data lakes and new data lakes still in design phase.
- Design and develop data requirements and samples that can be incorporated into engineering (technical) processes.
- Machine Learning Frameworks - Amazon Machine Learning / Azure Machine Learning / H2O / Spark MLlib
- Machine Learning Frameworks (Streams) - Massive Online Analysis / Spark MLlib
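One bullet above covers surfacing anomalous patterns in data pipelines. As a minimal illustration of the idea (not the production system), the sketch below flags days whose load volume deviates sharply from a trailing window using a rolling z-score in pandas; the window size, threshold, column semantics, and sample data are all assumptions.

```python
# Minimal sketch of pipeline anomaly detection on daily row counts: flag days
# whose volume deviates strongly from the trailing mean. Thresholds and the
# sample data are illustrative assumptions.
import pandas as pd

def flag_volume_anomalies(counts: pd.Series, window: int = 14, z: float = 3.0) -> pd.Series:
    """Return a boolean Series marking days with anomalous load volume."""
    rolling_mean = counts.rolling(window, min_periods=window).mean()
    rolling_std = counts.rolling(window, min_periods=window).std()
    zscores = (counts - rolling_mean) / rolling_std
    return zscores.abs() > z

# Example: the sudden drop on the last day is flagged for the owning team.
daily_counts = pd.Series(
    [10_200, 10_150, 9_980, 10_300, 10_050, 10_120, 10_210,
     10_090, 10_180, 10_240, 10_110, 10_060, 10_300, 10_190, 1_250],
    index=pd.date_range("2024-01-01", periods=15),
)
print(daily_counts[flag_volume_anomalies(daily_counts)])
```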
Confidential
Database Administrator / Database Engineer
Responsibilities:
- Drafted Enterprise Big Data Platform policy, which was incorporated into executive Project Management guidance
- Defined scope for the Big Data Platform and identified/selected initial use cases that would drive the Big Data Project
- Big Data Strategy – Developed Initial Approach and Selected Initial Technology Stack
- Big Data Strategy – performance management, data exploration, social analytics, data science
- Architect & Administrator Hadoop, MongoDB, Hadoop Cluster
- Architect & Administrator Hadoop Cluster; Hadoop HDFS; Hadoop Hive; Hadoop Map Reduce; Hadoop Pig
- Oracle 12c Enterprise Metadata Management installation and deployment
- Database connection pooling and configuration (Oracle, SQL Server, DB2, MySQL – ODBC & JDBC); a pooling sketch follows this list
- Oracle Enterprise Metadata Management - Impact Analysis, Annotation and Tagging functions, Reporting Source Lineage
- Oracle Exadata - Migration from Oracle RAC to Oracle Exadata multi-tenant (RAC) Cluster
- Oracle Exadata – Parallelization Optimization, Index Optimization, Partition optimization, Statistics optimization
- Oracle Exadata – Performance tuning, Optimizer optimization, Configuration optimization, Smart Scans optimization
- Oracle GoldenGate / Oracle Data Integrator / Hive / PostgreSQL data integration design & configuration
- SQL DBA - Log Shipping, Database Restore, Database Refreshes, Monitoring
- SQL DBA – Meta Data Management, Log Management, In-Memory Optimization, Database Cluster Tuning
- SQL DBA – SQL Profiler, Indexing Optimization, Parallel Query Optimization, Storage Optimization (Data Files, Logs)
- DB2/UDB DBA – Backups, Performance Tuning, Parameter/Configuration Optimization, Partitioning, Query Optimization
- DB2/UDB DBA - Log Shipping, Database Restore, Database Refreshes, Monitoring
- DB2/UDB DBA – Meta Data Management, Log Management, In-Memory Optimization, Database Cluster Tuning
- DB2/UDB DBA – Indexing Optimization, Parallel Query Optimization, Storage Optimization (Data Files, Logs)
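The connection pooling bullet above spans Oracle, SQL Server, DB2, and MySQL via ODBC/JDBC. As a minimal application-side analogue, the sketch below configures a pooled SQLAlchemy engine against a hypothetical Oracle service; the DSN, credentials, table name, and pool parameters are illustrative assumptions rather than the actual configuration.

```python
# Minimal sketch of application-side connection pooling with SQLAlchemy,
# analogous to the ODBC/JDBC pool configuration described above. The DSN,
# credentials, and pool sizes are hypothetical placeholders.
from sqlalchemy import create_engine, text

engine = create_engine(
    "oracle+cx_oracle://app_user:secret@db-host:1521/?service_name=ORCLPDB1",
    pool_size=10,        # steady-state connections kept open
    max_overflow=5,      # extra connections allowed under burst load
    pool_timeout=30,     # seconds to wait for a free connection
    pool_recycle=1800,   # recycle connections to survive idle timeouts
    pool_pre_ping=True,  # validate each connection before handing it out
)

# Each checkout borrows from the pool and returns the connection on exit.
with engine.connect() as conn:
    count = conn.execute(text("SELECT COUNT(*) FROM orders")).scalar_one()
    print(f"orders: {count}")
```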