Lead Big Data Consultant Resume

Manhattan, NY

SUMMARY

  • Seasoned data professional with 12+ years of overall IT experience in delivering Big Data/Data warehousing solutions
  • Experience working in Financial, Telecom, Energy, Healthcare and Pharmaceutical domains
  • Expertise in delivering solutions in waterfall, iterative development and agile delivery methodologies
  • Excellent interpersonal skills and strong verbal and written communication skills

TECHNICAL SKILLS

Big Data: Cloudera/Hortonworks/MapR Hadoop, MapReduce, Hive, HDFS, Pig, Apache Spark, Scala

Data Warehousing: Informatica PowerCenter, OLAP, OLTP, SQL*Plus, SQL*Loader, Informatica PowerConnect for DB2

Data Modeling: Dimensional Data Modeling, Star Schema Modeling, Snowflake Schema Modeling, Fact and Dimension Tables, Physical and Logical Data Modeling

Project Management: PMP Certified PM, Agile, Kanban, Waterfall, Rally

DevOps: JIRA, GitHub, Jenkins

PROFESSIONAL EXPERIENCE

Confidential, Manhattan NY

Lead Big Data Consultant

Responsibilities:

  • Delivered numerous data ingestion solutions for Commercial and Consumer lines of business, ingesting data in proprietary formats from both internal and external partners into the AMEX Cornerstone raw data layer
  • Created internal/external tables in the Raw and Feed layers of Cornerstone using industry-standard compression and storage techniques
  • Created partitioned/bucketed tables where appropriate to maximize performance
  • Led the HOME (Home of Member Engagement) re-engineering project to migrate legacy Hive code to Spark using Scala 2.x and the IntelliJ IDE
  • Wrote numerous Spark scripts in Scala for the HOME ETL re-engineering project using both the IntelliJ IDE and the Scala REPL
  • Processed billions of AMEX campaign events using techniques such as partitioning, repartitioning/coalescing, and driver/executor parameter tuning (a sketch follows this list)
  • Maintained legacy Hive/Pig scripts for the EFS (Expert Feedback System) project
  • Extensively used IntelliJ/GitHub/Jenkins for development/deployment
  • Led scrum teams, working with Product Owners and other stakeholders for end-to-end delivery in an agile fashion
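
A minimal sketch of the repartition/coalesce and partitioned-write pattern referenced above; the table names, column names, partition counts, and tuning values are illustrative placeholders, not the actual Cornerstone schema or configuration.

```scala
import org.apache.spark.sql.SparkSession
import org.apache.spark.sql.functions.col

object CampaignEvents {
  def main(args: Array[String]): Unit = {
    // Driver/executor sizing would normally be supplied via spark-submit flags
    // such as --num-executors, --executor-cores, and --executor-memory.
    val spark = SparkSession.builder()
      .appName("campaign-events")
      .enableHiveSupport()
      .getOrCreate()

    // Repartition the raw events so the transformation spreads evenly across executors
    val events = spark.table("raw_db.campaign_events")   // hypothetical raw-layer table
      .repartition(400, col("event_date"))
      .filter("event_type IS NOT NULL")

    // Coalesce before writing to avoid thousands of small output files,
    // and partition the target table by date for partition pruning
    events
      .coalesce(50)
      .write
      .mode("overwrite")
      .partitionBy("event_date")
      .saveAsTable("feed_db.campaign_events_curated")    // hypothetical feed-layer table

    spark.stop()
  }
}
```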

Environment: MapR, Hive, Pig, Spark 2.6, Scala 2.3, IntelliJ, GitHub, Jenkins

Confidential, Madison WI

Big Data Consultant

Responsibilities:

  • Served as a Big Data Consultant on the HEDIS project, integrating Gundersen/PPIC systems into the Quartz data lake/data warehouse using Cloudera Big Data technologies
  • Worked with the Scrum Master, project team, and Product Owner to deliver sprint deliverables in a timely fashion
  • Extensively used Hive and Spark (Scala) for data ingestion, transformation, and loading into the data lake, data warehouse, and department data marts
  • Worked with cross-functional teams to receive source files onto the Hadoop edge server and send HEDIS files out to CMS HEDIS
  • Worked with the Data Architect on the creation of internal/external tables in the staging and atomic layers
  • Wrote Hive scripts to load data from files on the edge server into the staging layer
  • Wrote Scala scripts to process data from the staging layer into the atomic layer (a sketch follows this list)
  • Wrote Scala utilities for accessing audit metadata stored in an RDBMS
  • Used the ORC file format with Snappy compression to improve Hive performance
  • Used partitioning and bucketing on Hive tables to improve performance
  • Used Hive execution engines such as Tez and Spark to improve performance
  • Extensively used Scala collections for processing data from the staging/atomic layers
  • Used Spark DataFrames (via Scala) for data access and transformation
  • Used Spark persistence/caching for improved performance
  • Created entire applications in CA Workload Automation for running the Informatica jobs
  • Performed daily monitoring and support of all data load jobs
  • Addressed data problems related to systems integration, compatibility, and multi-platform integration
  • Provided level-3 production support for data warehouse jobs on a rotational basis
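
A minimal sketch of the staging-to-atomic pattern described above, combining DataFrame transformations, caching of a reused intermediate, and an ORC/Snappy write; the database, table, and column names are illustrative and not the actual Quartz model.

```scala
import org.apache.spark.sql.SparkSession

object StagingToAtomic {
  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder()
      .appName("staging-to-atomic")
      .enableHiveSupport()
      .getOrCreate()
    import spark.implicits._

    // Cache the staged DataFrame because it feeds more than one downstream write
    val staged = spark.table("staging.member_claims")    // hypothetical staging table
      .filter($"claim_status".isNotNull)
      .cache()

    val atomic = staged
      .withColumnRenamed("mbr_id", "member_id")
      .dropDuplicates("claim_id")

    // Persist to the atomic layer as ORC with Snappy compression
    atomic.write
      .mode("overwrite")
      .format("orc")
      .option("compression", "snappy")
      .saveAsTable("atomic.member_claims")               // hypothetical atomic-layer table

    staged.unpersist()
    spark.stop()
  }
}
```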

Environment: Cloudera Hadoop, Netezza, CA Workload Automation, Agile, Spark 2.1, Scala, Eclipse, Hive, GitHub, Jenkins

Confidential, Chicago, IL

EDW Lead/Big Data Developer

Responsibilities:

  • Worked on numerous EDW projects using Informatica 10.2
  • Created numerous mappings/workflows, parameter files for loading data into all layers of EDW
  • Designed and coded efficient mappings with consideration for Initial Loads/Reloads
  • Used JIRA for task tracking and team communication
  • Worked on multiple Big Data projects ingesting data into data lake and sending data to external vendors
  • Used Hive, Pig, and Scala to extract data from the raw layer of the data lake, load it into the consumption layer, and send files to external vendors (a sketch follows this list)
  • Extensively dealt with external vendors/SMEs for requirement gathering and issue resolution
  • Created numerous internal/external Hive tables in the RAW and SMITH layers of the data lake
  • Processed a variety of file formats in Hadoop
  • Created numerous UDFs for Hive
  • Worked with the data architect on creation of the Master Inventory Sheet for access to the data lake
  • Worked with admins on applying Ranger policies to gain access to the data lake
  • Created Pig scripts for moving data from the ingestion layer to the consumption layer
  • Wrote numerous Scala scripts using RDDs and collections to access data from Hive/HDFS and generate files
  • Worked with source SMEs on configuring directories on edge nodes for receipt of files
  • Worked with Systems Analyst for refining mapping documents
  • Worked in an Agile delivery model, attending all Scrum events
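
A minimal sketch of the raw-to-consumption extract and vendor-file generation pattern referenced above; the layer names, tables, columns, and output path are illustrative placeholders.

```scala
import org.apache.spark.sql.SparkSession

object VendorExtract {
  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder()
      .appName("vendor-extract")
      .enableHiveSupport()
      .getOrCreate()

    // Pull only the columns the downstream consumers need from the raw layer
    val extract = spark.table("raw.member_events")       // hypothetical raw-layer table
      .select("member_id", "event_code", "event_ts")

    // Load the consumption layer
    extract.write.mode("overwrite").saveAsTable("consumption.member_events")

    // Generate a single pipe-delimited file for the external vendor
    extract.coalesce(1)
      .write
      .mode("overwrite")
      .option("sep", "|")
      .option("header", "true")
      .csv("/outbound/vendor/member_events")             // hypothetical outbound path

    spark.stop()
  }
}
```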

Environment: Informatica 10.6/10.1, Oracle Exadata, Hortonworks HDP, Pig, Hive, Spark, Scala, Kafka, JIRA, GitHub, Jenkins, Eclipse

Confidential, Bensenville IL

EDW Lead

Responsibilities:

  • Delivered multiple EDW releases (Device Financing, Device Installments, Device Upgrade, Early Upgrade, Notification Hub) using Informatica/Oracle Exadata, successfully taking each from scoping through deployment
  • Actively participated in requirement gathering, Gap Analysis and model review with SANDS/BI/Architecture team contributing to a solid data model design during planning sprints
  • Estimated work effort based on complexity of mappings.
  • Led daily scrum with project teams managing daily execution and extensively used JIRA for managing communication
  • Created numerous design documents for handoff to developers
  • Designed and coded efficient mappings with consideration for Initial Loads/Reloads
  • Designed mappings to conform to the EDW and ABAC framework architecture
  • Extensively used Oracle hints and Oracle OEM to resolve performance issues
  • Worked with ETQA on defect resolution, ensuring resolution SLAs were met for Sev1, Sev2, and Sev3 defects
  • Worked with business users to conduct UAT
  • Worked with project team and conducted Dress Rehearsal and Deployment activities
  • Worked on the New Feature team in the role of Scrum Master, working closely with Product Owners and the project team
  • Successfully offloaded data from numerous Exadata tables into the data lake on Hadoop
  • Used Sqoop jobs to load data from the operational system into the staging layer of the Hadoop platform
  • Led a big data project through successful deployment, ingesting multiple tables into the Hadoop layer using Hive, HDFS, Hue, Sqoop, and Scala
  • Worked closely with Cloudera admins in determining appropriate roles and privileges for URIs
  • Designed the ETL architecture for jobs running on Hadoop Cluster
  • Worked with Cloudera admins for resolution of issues
  • Created External as well as Internal tables in Hive
  • Worked with JSON and other file formats in Hadoop
  • Wrote numerous Scala scripts for accessing and manipulating data within the Hadoop ecosystem (a sketch follows this list)
  • Led numerous EDW releases through the complete SDLC using Informatica
  • Used Informatica BDM to access/manipulate data in Hadoop platform
  • Delivered EDW releases in both waterfall and agile fashion
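
A minimal sketch of the external-table creation and JSON handling referenced above; the database names, columns, and HDFS paths are illustrative placeholders, and the Sqoop landing step is assumed to have already run.

```scala
import org.apache.spark.sql.SparkSession

object LakeIngestion {
  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder()
      .appName("lake-ingestion")
      .enableHiveSupport()
      .getOrCreate()

    // External table over files already landed in HDFS (e.g. by a Sqoop job);
    // dropping the table leaves the underlying data in place
    spark.sql(
      """CREATE EXTERNAL TABLE IF NOT EXISTS staging.device_upgrades (
        |  account_id STRING,
        |  device_id  STRING,
        |  upgrade_dt STRING)
        |ROW FORMAT DELIMITED FIELDS TERMINATED BY ','
        |LOCATION '/data/staging/device_upgrades'""".stripMargin)

    // JSON files are read directly into a DataFrame and saved as a managed table
    spark.read.json("/data/landing/notifications/*.json")
      .write.mode("overwrite").saveAsTable("staging.notifications")

    spark.stop()
  }
}
```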

Environment: Cloudera Hadoop, Informatica 10, Informatica BDE, Unix, Oracle Exadata, CDH 5.8, Hive, Pig, HDFS, Sqoop, Scala, Hue, Kafka, Waterfall, Agile

Confidential, Oakbrook IL

Data Integration Lead

Responsibilities:

  • Gathered requirements from Business users and stakeholders for numerous projects and NPD items
  • Created logical data models for the UPS CampusShip and GT.com employee webservice projects
  • Created high-level interface documents for the UPS CampusShip, GT.com employee webservice, and Deltek interfaces
  • Created Operations guides for different interfaces
  • Designed, and maintained the GT.com employee webservice for consumption by external vendor Siteworx
  • Worked with appropriate stakeholders (Internal & External) to ensure that the GT.com employee webservice was successfully setup and functioning
  • Worked extensively with Siteworx to address and resolve any issues arising during integration
  • Led a team of 5 developers, assigning tasks and overseeing their deliverables
  • Created templates for Interface Design Documents and Operational Guides document to be used across the board for the Data Integration team
  • Led the requirements gathering for an enterprise scheduler
  • Used encryption tools such as GPG and PGP
  • Defined the encryption/decryption requirements and evaluated the Linoma PGP software
  • Gathered requirements from the business users for Dimensions and Facts for the GT data warehouse
  • Designed and developed Dimensions and Facts for the GT data warehouse

Environment: Informatica 9.1, SQL Server 2008, Oracle 11i, Linoma PGP, TFS, SOAP, XML, Lawson, ERWIN

Confidential, Bensenville, IL

ETL Lead

Responsibilities:

  • Was involved in requirement gathering with the client for the Audit Balancing and Control (ABAC) framework
  • Created logical and physical data models for the ABAC tables
  • Understood the client's data retention requirements for the ETL jobs and designed and implemented a solution to meet them
  • Created the Functional Specification documents for SDF
  • Designed the PROCESS DEPENDENCIES table of SDF
  • Designed and implemented the launch keys evaluation module of SDF, which sources the PROCESS DEPENDENCIES table
  • Designed and implemented the ‘Selective Process Enabling’ module of SDF, using Perl to selectively disable jobs in SDF
  • Designed and implemented the ‘Error Reporting’ module of SDF, which uses information in the ERROR STATISTICS table of SDF and Informatica's PMERR tables to write session errors to a text file that is then read by a Perl script to send error information to users
  • Created a session to extract reject counts from SDF tables to be written to a file and later emailed to the support personnel.
  • Created Functional Spec documents and Detail Level Design documents for each of the feeds for Outperform project
  • Created Outperform specific production support document detailing steps to be followed to resolve issues.
  • Tuned memory settings for mappings/sessions to streamline performance for the Outperform project
  • Extensively FTP'd files across the Development, QA, and Production servers of USC and also to the FTP servers of Synygy
  • Extensively dealt with parameter files while making code changes.
  • Created High Level and Detailed Level design documents from the Functional Design document for the ‘Provisioning’ track of the PSMS project
  • Designed, developed, and tested mappings/sessions/workflows for each of the 6 USC markets for both the ‘MDN Initial’ and ‘MDN Provisioning’ feeds, all conforming to SDF standards
  • Setup and tested 6 SDF batches each for the ‘Initial’ and ‘Provisioning’ feeds
  • Wrote a Perl script that takes the raw files generated by the ‘Initial’ and ‘Provisioning’ feeds, adds a header and footer, and renames the files with a timestamp of when each file was created (a sketch of this logic follows this list)
  • Provided warranty-period production support for PSMS jobs
  • Created various mappings for the Clearinghouse
  • Setup and tested the SDF batches for Clearinghouse project
  • Supported project teams and the user community on an ad-hoc, as-required basis
  • Extensively used TOAD to access databases
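
A minimal sketch of the header/footer/timestamp post-processing described above, rendered here in Scala rather than the original Perl; the directory argument, file extension, and header/trailer layouts are illustrative assumptions.

```scala
import java.io.{File, PrintWriter}
import java.time.LocalDateTime
import java.time.format.DateTimeFormatter
import scala.io.Source

object FeedPostProcessor {
  private val stamp = DateTimeFormatter.ofPattern("yyyyMMddHHmmss")

  def process(rawFile: File): Unit = {
    val src   = Source.fromFile(rawFile)
    val lines = try src.getLines().toList finally src.close()

    val header = s"HDR|${rawFile.getName}|${lines.size}"   // illustrative header layout
    val footer = s"TRL|${lines.size}"                      // illustrative trailer layout

    // Write header + data + footer to a new file whose name carries the creation timestamp
    val target = new File(rawFile.getParent, s"${rawFile.getName}.${LocalDateTime.now.format(stamp)}")
    val out    = new PrintWriter(target)
    try (header :: lines ::: List(footer)).foreach(out.println) finally out.close()

    rawFile.delete()                                       // remove the original raw file
  }

  def main(args: Array[String]): Unit =
    new File(args(0)).listFiles().filter(_.getName.endsWith(".dat")).foreach(process)
}
```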

Environment: Informatica PowerCenter 8.6.1, Oracle 10g, Oracle 11i, TOAD, IBM AIX, SQL*Loader

Confidential, Nashville TN

Lead ETL Consultant

Responsibilities:

  • Created new and altered existing mappings for the Contract Modeling project
  • Worked with the UNIX admin to configure sftp for communication between local and remote servers
  • Wrote an FTP script to transfer files for the ‘Contract Modeling’ project to a remote server
  • Converted heavy SQL overrides in existing mappings of the QSR and CMS projects to transformation objects
  • Used features such as Dynamic Lookup, Target override in various mappings
  • Extensively worked with parameters and parameter files
  • Created numerous ‘Change Requests’ and liaised with the DBAs, Informatica admins to have code migrated across environments
  • Worked with Oracle analytic functions
  • The CMS load is parameter-driven, and certain parameters previously had to be set manually every time the load was run; automated the parameter generation, obviating the need for manual intervention
  • Wrote a Korn shell script that uses the pmcmd utility to monitor a workflow running under a folder and send notifications if the runtime exceeds a threshold (a sketch of this logic follows this list)
  • Used Informatica 8 features such as flat-file headers, user defined parameters
  • Wrote a Perl script to rename session-generated XML files, reading both the file names and the suffix to be appended from a parameter file
  • Used Perl's in-place editing feature to remove empty tags from an XML file
  • Wrote a Perl script that generates an SFTP batch script, used with the sftp command to securely transfer files to the remote server for the Crowe project
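
A minimal sketch of the workflow-runtime monitor described above, rendered here in Scala rather than the original Korn shell; the pmcmd arguments (service, domain, credentials, folder, workflow name), the status check, the threshold, and the notification step are all illustrative assumptions to be adapted to the actual environment.

```scala
import scala.sys.process._
import java.time.{Duration, Instant}

object WorkflowMonitor {
  // Hypothetical pmcmd invocation; verify flag names against your Informatica version
  val statusCmd: Seq[String] = Seq(
    "pmcmd", "getworkflowdetails",
    "-sv", "INT_SVC", "-d", "DOMAIN", "-u", "USER", "-p", "PASS",
    "-f", "FOLDER", "wf_cms_load")

  val threshold    = Duration.ofMinutes(90)   // illustrative runtime threshold
  val pollInterval = 5 * 60 * 1000L           // poll every 5 minutes

  // Placeholder for the notification step (the original script sent e-mail alerts)
  def notifySupport(msg: String): Unit = println(s"ALERT: $msg")

  def main(args: Array[String]): Unit = {
    val start = Instant.now()
    var done  = false
    while (!done) {
      val output  = statusCmd.!!                    // capture pmcmd output
      val running = output.contains("Running")      // naive status check
      val elapsed = Duration.between(start, Instant.now())
      if (running && elapsed.compareTo(threshold) > 0) {
        notifySupport(s"Workflow still running after ${elapsed.toMinutes} minutes")
        done = true
      } else if (!running) {
        done = true
      } else {
        Thread.sleep(pollInterval)
      }
    }
  }
}
```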

Environment: Informatica PowerCenter 7.1.3, Oracle 10g, TOAD, IBM AIX, Rational ClearCase, Rational ClearQuest

Confidential, Bensenville IL

ETL Lead

Responsibilities:

  • Served as Technical Team Lead for the Rewards project
  • Participated in Requirement analysis during Business Case Analysis phase of the Rewards project.
  • Prepared technical design documents and Interface design documents, working closely with Rewards program vendor.
  • Coordinated and supported system testing, User Acceptance testing of the project.
  • Acted as a SPOC for warranty support for the Rewards project.
  • Worked on call for various initiatives supported by the Data Movement and Interfaces team, resolving and closing out multiple tickets
  • Liaised with external vendors to resolve issues related to incoming/outgoing files
  • Altered an existing Pro*C script for the MARS project to fix an issue related to MYTHOS project
  • Altered existing UNIX scripts of the Clearinghouse utility to move archival to a stand-alone script to reduce execution time
  • Altered an existing Perl script for the PSMS Repository track to verify that incoming files have been received and processed by an upstream process, and to fail the script if that is not the case for all the files
  • Set up public-key authentication between servers for passwordless login
  • Worked with the Rational ClearCase command-line interface to check in and check out files related to numerous fixes
  • For a retroactive data fix to the SUBSCRIBER table for MARS, wrote SQL*Loader scripts to load staging tables from flat files

Environment: Informatica PowerCenter 7.1.3, Oracle 10g, TOAD, IBM AIX, Rational ClearCase, Rational ClearQuest
