Data Analytics Lead Resume
Detroit, MI
TECHNICAL SKILLS
Languages & Libraries: Python (SciPy, PySpark, pandas, NumPy, scikit-learn), SQL, Hive (HiveQL), MATLAB; Mainframe: COBOL, PL/I, JCL
Machine Learning: Decision Tree classification, Linear Regression, and related algorithms
RPA: Automation Anywhere, UiPath, Pega
Big Data Technologies/Tools: Hadoop, Hive, Impala, HBase, Sqoop, Pig, Spark, Splunk
Databases: Netezza, MS Access, Oracle, DB2, IMS
Tools: Data Scientist Workbench, Anaconda, Jupyter, PyCharm, IPython, IBM InfoSphere Information Analyzer, Trillium, DataFlux, SAS/EG, Hue, Data Studio, SQL Developer, MS Office (Word, Excel, PowerPoint, Visio), RMS, SVN
PROFESSIONAL EXPERIENCE
Confidential, Detroit, MI
Data Analytics Lead
Responsibilities:
- Perform data analysis using Hive, Impala, HBase, Sqoop, and Pig in the Cloudera Distribution Hadoop (CDH) ecosystem.
- Run Sqoop import jobs through Hue to extract data from Oracle RDBMS tables into HDFS flat files and Hive tables during non-business hours.
- Move data from the EDW/RDBMS as flat files into the CDH environment and create Hive/Impala tables over it.
- Write Pig scripts that apply transformations to fix data type incompatibilities, remove unwanted columns, and store the resulting files back in HDFS.
- Use Sqoop incremental imports to append daily delta updates to existing Hive directories.
- Schedule Python, Pig, and Hive scripts to run in sequence through Oozie jobs.
- Identify the required data elements and join them with HiveQL queries to create the required views (see the PySpark sketch after this list).
- Perform exploratory data analysis, profiling, and assessment to evaluate data accuracy and quality with the Python libraries Matplotlib, NumPy, and pandas.
- Write Python scripts that automate business rules to find correlations among data elements.
- Analyze completeness and cleanse anomalies using imputation techniques in pandas.
- Identify correlations among features and perform advanced feature engineering.
- Develop baseline predictive models with classification and regression algorithms from the scikit-learn Python library (see the second sketch after this list).
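A minimal sketch of the HiveQL join step described above, expressed through PySpark with Hive support enabled; the table, column, and view names (customers, claims, claim_vw) are hypothetical placeholders, not the actual project schema.

    from pyspark.sql import SparkSession

    # Hypothetical sketch only; table and column names are placeholders.
    spark = (SparkSession.builder
             .appName("hive-view-sketch")
             .enableHiveSupport()  # read Hive tables registered in the metastore
             .getOrCreate())

    # Join the required data elements from two Hive tables into one view.
    spark.sql("""
        CREATE OR REPLACE VIEW claim_vw AS
        SELECT c.customer_id, c.name, cl.claim_id, cl.claim_amount
        FROM customers c
        JOIN claims cl ON c.customer_id = cl.customer_id
    """)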
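A minimal sketch of the pandas imputation and scikit-learn baseline modeling described above; the input file, column names, and label are hypothetical, chosen only to illustrate the workflow.

    import pandas as pd
    from sklearn.model_selection import train_test_split
    from sklearn.tree import DecisionTreeClassifier
    from sklearn.metrics import accuracy_score

    # Hypothetical input; file and column names are placeholders.
    df = pd.read_csv("claims.csv")

    # Profile completeness: fraction of missing values per column.
    print(df.isna().mean())

    # Impute: median for numeric columns, most frequent value otherwise.
    for col in df.columns:
        if pd.api.types.is_numeric_dtype(df[col]):
            df[col] = df[col].fillna(df[col].median())
        else:
            df[col] = df[col].fillna(df[col].mode().iloc[0])

    # Baseline decision tree classifier on an assumed binary label column.
    X = pd.get_dummies(df.drop(columns=["label"]))
    y = df["label"]
    X_train, X_test, y_train, y_test = train_test_split(
        X, y, test_size=0.2, random_state=42)
    model = DecisionTreeClassifier(max_depth=5, random_state=42)
    model.fit(X_train, y_train)
    print("accuracy:", accuracy_score(y_test, model.predict(X_test)))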
Confidential
Data Architect
Responsibilities:
- Managed and implemented database models, data flow diagrams, structures, and data standards to support a robust data management infrastructure.
- Designed the ETL strategy for large-scale data conversion, design, and implementation across platforms, including legacy-to-ERP migrations.
- Developed the Enterprise Data Strategy roadmap for the Confidential initiative, identifying business drivers and reporting metrics and establishing a data governance model.
- Built a data quality (DQ) framework for CIA, covering in-house and packaged solutions for the end-to-end DQ cycle and BPM.
- Developed best practices for data management, maintenance, reporting, and security.
- Assisted the team on data management projects as needed.
- Integrated and automated DQ toolsets and functions within the DQ framework.
Confidential
Data Architect
Responsibilities:
- Identified use cases with business teams for the robotic process automation (RPA) PoC.
- Collaborated with the architecture team to down-select three Confidential tools/vendors from a field of 20+.
- Identified two use cases for the proof of concept, one each from AML and Finance.
- Configured the Confidential tools UiPath, Pega (OpenSpan), and Automation Anywhere (AA) in the Ally test lab.
- Completed use case development and demonstrated the working prototypes to AML and Finance business leaders.
- Demonstrated Confidential/AI prototypes and their benefits to Corporate Technology and CIA leadership.
- Rated the Confidential tools used in the PoC against Ally architectural and functional requirements.
- Collaborated with the Corporate Technology team to procure the Automation Anywhere tool for Confidential project execution this year.
Confidential
Data Architect
Responsibilities:
- Designed and developed the data model and process model to accommodate Confidential requirements in the Confidential Data Lake initiative.
- Developed file and reference data specifications to capture customer consent data by line of business (LOB).
- Reviewed each LOB's processes, systems, and data requirements to capture and communicate revocation details to the enterprise system.
- Designed and developed an integration process in the Confidential system to create/update customer master data using details from all LOBs and manage duplicates.
- Designed the system, interfaces, and incoming and outgoing data flows between the LOBs and the CIA system to support sign-off.
Confidential
Senior Data Analyst
Responsibilities:
- Identified the impacts of the IA 11.5 Confidential within DQ.
- Developed comprehensive deployment instructions for the IA 11.5 Confidential.
- Tested the entire DQ cycle in 11.5 to identify gaps, compatibility issues, and defects.
- Implemented the changes into production seamlessly.
- Coached and mentored GDC resources on leveraging 11.5 features, such as SQL virtual tables, to write complex SQL rules.
Confidential, Bloomington, IL
Senior Data Analyst
Responsibilities:
- Used Sqoop to extract data from mainframe DB2 into HDFS flat files/Hive tables during non-business hours.
- Moved mainframe data as flat files into the Hadoop environment and created Hive tables over it.
- Maintained a nine-node cluster on the Hortonworks distribution.
- Removed unwanted columns with Pig scripts and stored the resulting files back in HDFS.
- Created Hive external tables on top of the directories where the Pig output files are stored.
- Wrote HiveQL scripts to compare two tables and store the result in a table exposed to downstream consumers (pivot tables, Tableau); see the sketch after this list.
- Used Sqoop incremental imports to append daily delta updates to existing Hive directories.
- Helped the scheduling team set up an Oozie job to execute the Pig and Hive scripts in sequence.
- Built scalable distributed data solutions using Hadoop.
- Wrote scripts to automate offloading files from one environment into the Hadoop environment.
- Resolved errors by examining the corresponding log files.
- Worked with analysts and the test team to write Hive queries.
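A minimal sketch of the table-comparison step described above, done here in PySpark; the table names (policy_current, policy_previous, policy_diff) are hypothetical placeholders, not the actual project schema.

    from pyspark.sql import SparkSession

    # Hypothetical sketch only; table names are placeholders.
    spark = (SparkSession.builder
             .appName("table-compare-sketch")
             .enableHiveSupport()
             .getOrCreate())

    current = spark.table("policy_current")    # latest load
    previous = spark.table("policy_previous")  # prior load

    # Rows in the current load that are absent from the prior load.
    diff = current.subtract(previous)

    # Persist the differences to a Hive table for downstream reporting.
    diff.write.mode("overwrite").saveAsTable("policy_diff")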
Confidential
Senior Data Analyst
Responsibilities:
- Built a DQ framework that assesses, analyzes, reports, and resolves data quality problems through proactive management.
- Configured multiple data sources in Trillium and resolved numerous issues.
- Profiled and assessed data, developed DQ rules in Trillium/SQL, and implemented them in production.
- Automated monthly DQ report validation using SQL scripts.
- Proposed and created materialized views to improve the SQL performance of DQ rules and DQ report metric calculations.
- Mentored an intern on DQ activities and proposed process documentation and deliverable artifacts.
- Identified DQ metric calculations for a new engagement, aligning with existing standards.
Confidential
Data Analyst
Responsibilities:
- Designed target IDAA DB2 tables for the hierarchical IMS database.
- Built data models, data mapping documents, and Confidential rules.
- Developed COBOL and PL/I programs to extract IMS data for loading.
- Documented corporate metadata definitions for enterprise data stores and established naming standards, data types, volumetrics, and domain definitions.
- Developed DQ rules to analyze data, produce profiling statistics, and identify anomalies.
- Developed programs to calculate DQ metrics for DQ reporting in Cognos dashboards (see the sketch after this list).
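A minimal sketch of the kind of DQ metric calculation described above, done here in pandas rather than COBOL/PL/I; the file, columns, and rules are hypothetical placeholders, not the actual project logic.

    import pandas as pd

    # Hypothetical sketch only; file, columns, and rules are placeholders.
    df = pd.read_csv("extract.csv", dtype=str)

    metrics = {
        # Completeness: share of non-null values per column.
        "completeness": df.notna().mean().round(4).to_dict(),
        # Validity: share of zip codes matching a 5-digit pattern.
        "zip_validity": float(df["zip"].str.fullmatch(r"\d{5}").fillna(False).mean()),
        # Uniqueness: share of distinct customer IDs.
        "customer_id_uniqueness": df["customer_id"].nunique() / len(df),
    }
    print(metrics)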
Confidential, Chennai, TN
Data Analyst
Responsibilities:
- Planned projects for all new data quality engagements.
- Extracted data from hierarchical mainframe IMS databases and loaded it into Trillium for Confidential.
- Developed matching COBOL copybooks from metadata to create entities.
- Profiled and assessed data and created reports.
- Brainstormed with business teams and SMEs to develop and configure business rules.
- Executed rules and identified anomalies.
- Performed cost/benefit analysis on fixing anomalies with business teams, SMEs, and sponsors.
- Summarized and presented assessment findings to business partners, data business analysts, subject matter experts, and stakeholders.
- Wrote requirement specifications for data cleansing activities.
Confidential
Mainframe Application Developer
Responsibilities:
- Coordinated and communicated with the onsite lead on engagements.
- Coordinated the DBVal application for large integrated projects by identifying impacts and implementing changes.
- Developed and modified COBOL/PL/I programs per new enhancement requirements.
- Created component specs, developed programs, tested and prepared UTRs, made SCIT entries (documenting changes), and uploaded them into RMS (Revision Management System).
- Supported and sustained DBVal programs, job monitoring, and daily/monthly Control-D reports.
- Wrote and executed SQL statements to fix database discrepancies identified through database validation.