Lead Big Data Consultant Resume

Manhattan, NY

SUMMARY

  • Seasoned data professional with 12+ years of overall IT experience in delivering Big Data/Data warehousing solutions
  • Experience working in Financial, Telecom, Energy, Healthcare and Pharmaceutical domains
  • Expertise in delivering solutions in waterfall, iterative development and agile delivery methodologies
  • Excellent interpersonal skills and strong verbal and written communication skills

TECHNICAL SKILLS

Big Data: Cloudera/Hortonworks/MapR Hadoop, MapReduce, Hive, HDFS, Pig, Apache Spark, Scala

Data Warehousing: Informatica PowerCenter, OLAP, OLTP, SQL*Plus, SQL*Loader, Informatica PowerConnect for DB2

Data Modeling: Dimensional Data Modeling, Star Schema Modeling, Snowflake Schema Modeling, Fact and Dimension Tables, Physical and Logical Data Modeling

Project Management: PMP Certified PM, Agile, Kanban, Waterfall, Rally

DevOps: JIRA, GitHub, Jenkins

PROFESSIONAL EXPERIENCE

Confidential, Manhattan NY

Lead Big Data Consultant

Responsibilities:

  • Delivered numerous data ingestion solutions for Commercial and Consumer lines of business, ingesting data in proprietary formats from both internal and external partners into the AMEX Cornerstone raw data layer
  • Created internal/external tables in the Raw and Feed layers of Cornerstone using industry-standard compression and storage techniques
  • Created partitioned/bucketed tables where appropriate to maximize performance
  • Led the HOME (Home of Member Engagement) re-engineering project to migrate legacy Hive code to Spark using Scala 2.x and the IntelliJ IDE
  • Wrote numerous Spark scripts in Scala for the HOME ETL re-engineering project using both the IntelliJ IDE and the Scala REPL
  • Processed billions of AMEX campaign events using techniques such as partitioning, repartitioning/coalescing, and driver/executor parameter tuning (a sketch follows this list)
  • Maintained legacy Hive/Pig scripts for the EFS (Expert Feedback System) project
  • Extensively used IntelliJ/GitHub/Jenkins for development/deployment
  • Led scrum teams, working with Product Owners and other stakeholders for end-to-end delivery in an agile fashion
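
A minimal sketch of the repartition/coalesce and partitioned-write pattern referenced above; the table names, column names, partition counts, and tuning values are illustrative placeholders, not the actual Cornerstone schema or configuration.

```scala
import org.apache.spark.sql.SparkSession
import org.apache.spark.sql.functions.col

object CampaignEvents {
  def main(args: Array[String]): Unit = {
    // Driver/executor sizing would normally be supplied via spark-submit flags
    // such as --num-executors, --executor-cores, and --executor-memory.
    val spark = SparkSession.builder()
      .appName("campaign-events")
      .enableHiveSupport()
      .getOrCreate()

    // Repartition the raw events so the transformation spreads evenly across executors
    val events = spark.table("raw_db.campaign_events")   // hypothetical raw-layer table
      .repartition(400, col("event_date"))
      .filter("event_type IS NOT NULL")

    // Coalesce before writing to avoid thousands of small output files,
    // and partition the target table by date for partition pruning
    events
      .coalesce(50)
      .write
      .mode("overwrite")
      .partitionBy("event_date")
      .saveAsTable("feed_db.campaign_events_curated")    // hypothetical feed-layer table

    spark.stop()
  }
}
```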

Environment: MapR, Hive, Pig, Spark 2.6, Scala 2.3, IntelliJ, GitHub, Jenkins

Confidential, Madison WI

Big Data Consultant

Responsibilities:

  • Served as a Big Data Consultant on the HEDIS project, integrating Gundersen/PPIC systems into the Quartz data lake/data warehouse using Cloudera Big Data technologies
  • Worked with the Scrum Master, project team, and Product Owner to deliver sprint deliverables in a timely fashion
  • Extensively used Hive and Spark (Scala) for data ingestion, transformation, and loading into the data lake, data warehouse, and department data marts
  • Worked with cross-functional teams to receive source files onto the Hadoop edge server and send HEDIS files out to CMS HEDIS
  • Worked with the Data Architect on the creation of internal/external tables in the staging and atomic layers
  • Wrote Hive scripts to load data from files on the edge server into the staging layer
  • Wrote Scala scripts to process data from the staging layer into the atomic layer (a sketch follows this list)
  • Wrote Scala utilities for accessing audit metadata stored in an RDBMS
  • Used the ORC file format with Snappy compression to improve Hive performance
  • Used partitioning and bucketing on Hive tables to improve performance
  • Used Hive execution engines such as Tez and Spark to improve performance
  • Extensively used Scala collections for processing data from the staging/atomic layers
  • Used Spark DataFrames (via Scala) for data access and transformation
  • Used Spark persistence/caching for improved performance
  • Created entire applications in CA Workload Automation for running the Informatica jobs
  • Performed daily monitoring and support of all data load jobs
  • Addressed data problems related to systems integration, compatibility, and multi-platform integration
  • Provided level-3 production support for data warehouse jobs on a rotational basis
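
A minimal sketch of the staging-to-atomic pattern described above, combining DataFrame transformations, caching of a reused intermediate, and an ORC/Snappy write; the database, table, and column names are illustrative and not the actual Quartz model.

```scala
import org.apache.spark.sql.SparkSession

object StagingToAtomic {
  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder()
      .appName("staging-to-atomic")
      .enableHiveSupport()
      .getOrCreate()
    import spark.implicits._

    // Cache the staged DataFrame because it feeds more than one downstream write
    val staged = spark.table("staging.member_claims")    // hypothetical staging table
      .filter($"claim_status".isNotNull)
      .cache()

    val atomic = staged
      .withColumnRenamed("mbr_id", "member_id")
      .dropDuplicates("claim_id")

    // Persist to the atomic layer as ORC with Snappy compression
    atomic.write
      .mode("overwrite")
      .format("orc")
      .option("compression", "snappy")
      .saveAsTable("atomic.member_claims")               // hypothetical atomic-layer table

    staged.unpersist()
    spark.stop()
  }
}
```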

Environment: Cloudera Hadoop, Netezza, CA Workload Automation, Agile, Spark 2.1, Scala, Eclipse, Hive, GitHub, Jenkins

Confidential, Chicago, IL

EDW Lead/Big Data Developer

Responsibilities:

  • Worked on numerous EDW projects using Informatica 10.2
  • Created numerous mappings/workflows, parameter files for loading data into all layers of EDW
  • Designed and coded efficient mappings with consideration for Initial Loads/Reloads
  • Used JIRA for task tracking and team communication
  • Worked on multiple Big Data projects ingesting data into data lake and sending data to external vendors
  • Used Hive, Pig, and Scala to extract data from the raw layer of the data lake, load it into the consumption layer, and send files to external vendors (a sketch follows this list)
  • Extensively dealt with external vendors/SMEs for requirement gathering and issue resolution
  • Created numerous internal/external Hive tables in the RAW and SMITH layers of the data lake
  • Processed a variety of file formats in Hadoop
  • Created numerous UDFs for Hive
  • Worked with the data architect on creation of the Master Inventory Sheet for access to the data lake
  • Worked with admins on applying Ranger policies to gain access to the data lake
  • Created Pig scripts for moving data from the ingestion layer to the consumption layer
  • Wrote numerous Scala scripts using RDDs and collections to access data from Hive/HDFS and generate files
  • Worked with source SMEs on configuring directories on edge nodes for receipt of files
  • Worked with Systems Analyst for refining mapping documents
  • Worked in an Agile delivery model, attending all Scrum events
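
A minimal sketch of the raw-to-consumption extract and vendor-file generation pattern referenced above; the layer names, tables, columns, and output path are illustrative placeholders.

```scala
import org.apache.spark.sql.SparkSession

object VendorExtract {
  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder()
      .appName("vendor-extract")
      .enableHiveSupport()
      .getOrCreate()

    // Pull only the columns the downstream consumers need from the raw layer
    val extract = spark.table("raw.member_events")       // hypothetical raw-layer table
      .select("member_id", "event_code", "event_ts")

    // Load the consumption layer
    extract.write.mode("overwrite").saveAsTable("consumption.member_events")

    // Generate a single pipe-delimited file for the external vendor
    extract.coalesce(1)
      .write
      .mode("overwrite")
      .option("sep", "|")
      .option("header", "true")
      .csv("/outbound/vendor/member_events")             // hypothetical outbound path

    spark.stop()
  }
}
```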

Environment: Informatica 10.6/10.1, Oracle Exadata, Hortonworks HDP, Pig, Hive, Spark, Scala, Kafka, JIRA, GitHub, Jenkins, Eclipse

Confidential, Bensenville IL

EDW Lead

Responsibilities:

  • Delivered multiple EDW releases (Device Financing, Device Installments, Device Upgrade, Early Upgrade, Notification Hub) using Informatica/Oracle Exadata, successfully taking each from scoping through deployment
  • Actively participated in requirement gathering, Gap Analysis and model review with SANDS/BI/Architecture team contributing to a solid data model design during planning sprints
  • Estimated work effort based on complexity of mappings.
  • Led daily scrum with project teams managing daily execution and extensively used JIRA for managing communication
  • Created numerous design documents for handoff to developers
  • Designed and coded efficient mappings with consideration for Initial Loads/Reloads
  • Designed mappings to conform to the EDW and ABAC framework architecture
  • Extensively used Oracle hints and Oracle OEM to resolve performance issues
  • Worked with ETQA on defect resolution, ensuring resolution SLAs were met for Sev1, Sev2, and Sev3 defects
  • Worked with business users to conduct UAT
  • Worked with project team and conducted Dress Rehearsal and Deployment activities
  • Worked on the New Feature team in the role of Scrum Master, working closely with Product Owners and the project team
  • Successfully offloaded data from numerous Exadata tables into the data lake on Hadoop
  • Used Sqoop jobs to load data from the operational system into the staging layer of the Hadoop platform
  • Led a big data project through successful deployment, ingesting multiple tables into the Hadoop layer using Hive, HDFS, Hue, Sqoop, and Scala
  • Worked closely with Cloudera admins in determining appropriate roles and privileges for URIs
  • Designed the ETL architecture for jobs running on Hadoop Cluster
  • Worked with Cloudera admins for resolution of issues
  • Created External as well as Internal tables in Hive
  • Worked with JSON and other file formats in Hadoop
  • Wrote numerous Scala scripts for accessing and manipulating data within the Hadoop ecosystem (a sketch follows this list)
  • Led numerous EDW releases through the complete SDLC using Informatica
  • Used Informatica BDM to access/manipulate data in Hadoop platform
  • Delivered EDW releases in both waterfall and agile fashion
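
A minimal sketch of the external-table creation and JSON handling referenced above; the database names, columns, and HDFS paths are illustrative placeholders, and the Sqoop landing step is assumed to have already run.

```scala
import org.apache.spark.sql.SparkSession

object LakeIngestion {
  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder()
      .appName("lake-ingestion")
      .enableHiveSupport()
      .getOrCreate()

    // External table over files already landed in HDFS (e.g. by a Sqoop job);
    // dropping the table leaves the underlying data in place
    spark.sql(
      """CREATE EXTERNAL TABLE IF NOT EXISTS staging.device_upgrades (
        |  account_id STRING,
        |  device_id  STRING,
        |  upgrade_dt STRING)
        |ROW FORMAT DELIMITED FIELDS TERMINATED BY ','
        |LOCATION '/data/staging/device_upgrades'""".stripMargin)

    // JSON files are read directly into a DataFrame and saved as a managed table
    spark.read.json("/data/landing/notifications/*.json")
      .write.mode("overwrite").saveAsTable("staging.notifications")

    spark.stop()
  }
}
```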

Environment: Cloudera Hadoop, Informatica 10, Informatica BDE, Unix, Oracle Exadata, CDH 5.8, Hive, Pig, HDFS, Sqoop, Scala, Hue, Kafka, Waterfall, Agile

Confidential, Oakbrook IL

Data Integration Lead

Responsibilities:

  • Gathered requirements from Business users and stakeholders for numerous projects and NPD items
  • Created logical data models for the UPS CampusShip and GT.com employee webservice projects
  • Created high-level interface documents for the UPS CampusShip, GT.com employee webservice, and Deltek interfaces
  • Created Operations guides for different interfaces
  • Designed, and maintained the GT.com employee webservice for consumption by external vendor Siteworx
  • Worked with appropriate stakeholders (Internal & External) to ensure that the GT.com employee webservice was successfully setup and functioning
  • Worked extensively with Siteworx to address and resolve any issues arising during integration
  • Led a team of 5 developers, assigning tasks and overseeing their deliverables
  • Created templates for Interface Design Documents and Operational Guides document to be used across the board for the Data Integration team
  • Led the requirements gathering for an enterprise scheduler
  • Used encryption tools such as GPG and PGP
  • Defined the encryption/decryption requirements and evaluated the Linoma PGP software
  • Gathered requirements from the business users for Dimensions and Facts for the GT data warehouse
  • Designed and developed Dimensions and Facts for the GT data warehouse

Environment: Informatica 9.1, SQL Server 2008, Oracle 11i, Linoma PGP, TFS, SOAP, XML, Lawson, ERWIN

Confidential, Bensenville, IL

ETL Lead

Responsibilities:

  • Was involved in requirement gathering with the client for the Audit Balancing and Control (ABAC) framework
  • Created logical and physical data models for the ABAC tables
  • Understood the client's data retention requirements for the ETL jobs and designed and implemented a solution to meet them
  • Created the Functional Specification documents for SDF
  • Designed the PROCESS DEPENDENCIES table of SDF
  • Designed and implemented the launch keys evaluation module of SDF, which sources the PROCESS DEPENDENCIES table
  • Designed and implemented the ‘Selective Process Enabling’ module of SDF, using Perl to selectively disable jobs in SDF
  • Designed and implemented the ‘Error Reporting’ module of SDF, which uses information in the ERROR STATISTICS table of SDF and Informatica's PMERR tables to write session errors to a text file that is then read by a Perl script to send error information to users
  • Created a session to extract reject counts from SDF tables to be written to a file and later emailed to the support personnel.
  • Created Functional Spec documents and Detail Level Design documents for each of the feeds for Outperform project
  • Created Outperform specific production support document detailing steps to be followed to resolve issues.
  • Tuned memory settings for mappings/sessions to streamline performance for the Outperform project
  • Extensively FTP'd files across the Development, QA, and Production servers of USC and also to the FTP servers of Synygy
  • Extensively dealt with parameter files while making code changes.
  • Created High Level and Detailed Level design documents from the Functional Design document for the ‘Provisioning’ track of the PSMS project
  • Designed, developed, and tested mappings/sessions/workflows for each of the 6 USC markets for both the ‘MDN Initial’ and ‘MDN Provisioning’ feeds, all conforming to SDF standards
  • Setup and tested 6 SDF batches each for the ‘Initial’ and ‘Provisioning’ feeds
  • Wrote a Perl script that takes the raw files generated by the ‘Initial’ and ‘Provisioning’ feeds, adds a header and footer, and renames the files with a timestamp of when each file was created (a sketch of this logic follows this list)
  • Provided warranty-period production support for PSMS jobs
  • Created various mappings for the Clearinghouse
  • Setup and tested the SDF batches for Clearinghouse project
  • Supported project teams and the user community on an ad-hoc, as-required basis
  • Extensively used TOAD to access databases
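
A minimal sketch of the header/footer/timestamp post-processing described above, rendered here in Scala rather than the original Perl; the directory argument, file extension, and header/trailer layouts are illustrative assumptions.

```scala
import java.io.{File, PrintWriter}
import java.time.LocalDateTime
import java.time.format.DateTimeFormatter
import scala.io.Source

object FeedPostProcessor {
  private val stamp = DateTimeFormatter.ofPattern("yyyyMMddHHmmss")

  def process(rawFile: File): Unit = {
    val src   = Source.fromFile(rawFile)
    val lines = try src.getLines().toList finally src.close()

    val header = s"HDR|${rawFile.getName}|${lines.size}"   // illustrative header layout
    val footer = s"TRL|${lines.size}"                      // illustrative trailer layout

    // Write header + data + footer to a new file whose name carries the creation timestamp
    val target = new File(rawFile.getParent, s"${rawFile.getName}.${LocalDateTime.now.format(stamp)}")
    val out    = new PrintWriter(target)
    try (header :: lines ::: List(footer)).foreach(out.println) finally out.close()

    rawFile.delete()                                       // remove the original raw file
  }

  def main(args: Array[String]): Unit =
    new File(args(0)).listFiles().filter(_.getName.endsWith(".dat")).foreach(process)
}
```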

Environment: Informatica PowerCenter 8.6.1, Oracle 10g, Oracle 11i, TOAD, IBM AIX, SQL*Loader

Confidential, Nashville TN

Lead ETL Consultant

Responsibilities:

  • Created new and altered existing mappings for the Contract Modeling project
  • Worked with the UNIX admin to configure sftp for communication between local and remote servers
  • Wrote an FTP script to transfer files for the ‘Contract Modeling’ project to a remote server
  • Converted heavy SQL overrides in existing mappings of the QSR and CMS projects to transformation objects
  • Used features such as Dynamic Lookup, Target override in various mappings
  • Extensively worked with parameters and parameter files
  • Created numerous ‘Change Requests’ and liaised with the DBAs, Informatica admins to have code migrated across environments
  • Worked with Oracle analytic functions
  • The CMS load is parameter-driven, and certain parameters previously had to be set manually every time the load was run; automated the parameter generation, obviating the need for manual intervention
  • Wrote a Korn shell script that uses the pmcmd utility to monitor a workflow running under a folder and send notifications if the runtime exceeds a threshold (a sketch of this logic follows this list)
  • Used Informatica 8 features such as flat-file headers, user defined parameters
  • Wrote a Perl script to rename session-generated XML files, reading both the file names and the suffix to be appended from a parameter file
  • Used Perl's in-place editing feature to remove empty tags from an XML file
  • Wrote a Perl script that generates an SFTP batch script, used with the sftp command to securely transfer files to the remote server for the Crowe project
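
A minimal sketch of the workflow-runtime monitor described above, rendered here in Scala rather than the original Korn shell; the pmcmd arguments (service, domain, credentials, folder, workflow name), the status check, the threshold, and the notification step are all illustrative assumptions to be adapted to the actual environment.

```scala
import scala.sys.process._
import java.time.{Duration, Instant}

object WorkflowMonitor {
  // Hypothetical pmcmd invocation; verify flag names against your Informatica version
  val statusCmd: Seq[String] = Seq(
    "pmcmd", "getworkflowdetails",
    "-sv", "INT_SVC", "-d", "DOMAIN", "-u", "USER", "-p", "PASS",
    "-f", "FOLDER", "wf_cms_load")

  val threshold    = Duration.ofMinutes(90)   // illustrative runtime threshold
  val pollInterval = 5 * 60 * 1000L           // poll every 5 minutes

  // Placeholder for the notification step (the original script sent e-mail alerts)
  def notifySupport(msg: String): Unit = println(s"ALERT: $msg")

  def main(args: Array[String]): Unit = {
    val start = Instant.now()
    var done  = false
    while (!done) {
      val output  = statusCmd.!!                    // capture pmcmd output
      val running = output.contains("Running")      // naive status check
      val elapsed = Duration.between(start, Instant.now())
      if (running && elapsed.compareTo(threshold) > 0) {
        notifySupport(s"Workflow still running after ${elapsed.toMinutes} minutes")
        done = true
      } else if (!running) {
        done = true
      } else {
        Thread.sleep(pollInterval)
      }
    }
  }
}
```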

Environment: Informatica PowerCenter 7.1.3, Oracle 10g, TOAD, IBM AIX, Rational ClearCase, Rational ClearQuest

Confidential, Bensenville IL

ETL Lead

Responsibilities:

  • Served as Technical Team Lead for the Rewards project
  • Participated in Requirement analysis during Business Case Analysis phase of the Rewards project.
  • Prepared technical design documents and Interface design documents, working closely with Rewards program vendor.
  • Coordinated and supported system testing, User Acceptance testing of the project.
  • Acted as a SPOC for warranty support for the Rewards project.
  • Worked on call for various initiatives supported by the Data Movement and Interfaces team, resolving and closing out multiple tickets
  • Liaised with external vendors to resolve issues related to incoming/outgoing files
  • Altered an existing Pro*C script for the MARS project to fix an issue related to MYTHOS project
  • Altered existing UNIX scripts of the Clearinghouse utility to move archival to a stand-alone script to reduce execution time
  • Altered an existing Perl script for the PSMS Repository track to verify that incoming files have been received and processed by an upstream process, and to fail the script if that is not the case for all the files
  • Set up public-key authentication between servers for passwordless login
  • Worked with the Rational ClearCase command-line interface to check in and check out files related to numerous fixes
  • For a retroactive data fix to the SUBSCRIBER table for MARS, wrote SQL*Loader scripts to load staging tables from flat files

Environment: Informatica PowerCenter 7.1.3, Oracle 10g, TOAD, IBM AIX, Rational ClearCase, Rational ClearQuest
