Data Analyst Resume
Scottsdale, AZ
SUMMARY
- Strong working experience in Data Analysis, Data Validation, Data Verification, and identifying data mismatches between source and target systems (a sketch of such a check appears at the end of this summary).
- Strong experience in implementing data warehouse solutions in AWS Redshift; worked on various projects to migrate data from on-premises databases to AWS Redshift, RDS, and S3.
- Experience with cloud databases and data warehouses (SQL Azure and AWS Redshift/RDS).
- In-depth knowledge of Tableau Desktop, Tableau Reader, and Tableau Server.
- Involved in designing the Dimensional Model and created Star Schemas using ER/Studio.
- Worked extensively with Data Analysts and Data Modelers to design and understand the structures of dimension and fact tables and the Technical Specification Document.
- Expertise in OLTP/OLAP System Study, Analysis and E-R modeling, developing Database Schemas like Star schema and Snowflake schema (Fact Tables, Dimension Tables) used in relational, dimensional and multidimensional modeling
- Worked extensively with Informatica Designer to design robust end-to-end ETL processes involving complex transformations (Source Qualifier, Lookup, Update Strategy, Router, Aggregator, Sequence Generator, Filter, Expression) for efficient extraction, transformation, and loading of data to staging and then to the Data Mart (Data Warehouse), verifying the complex logic for computing the facts.
- Extensively used reusable transformations, mappings, and mapplets for faster development and standardization.
- Proficient in the Tableau data visualization tool; analyze large datasets to obtain insights and create visually compelling, actionable interactive reports and dashboards.
- Mastered the ability to design and deploy rich graphic visualizations using Tableau.
- Deployed Tableau Server in a clustered environment by adding worker machines.
- Experienced in developing web services with the Python programming language.
- Experienced in developing web-based applications using Python, Django, Qt, C++, XML, CSS, JSON, HTML, DHTML, JavaScript, and jQuery.
- Good experience with Django, a high-level Python web framework.
- Experience with object-oriented programming (OOP) concepts using Python and C++.
- Experienced in WAMP (Windows, Apache, MySQL, and Python/PHP) and MVC Struts.
- Experienced in developing web-based applications using Python, Django, Java, HTML, DHTML, JavaScript, and jQuery.
- Good knowledge of Python and the Python web framework Django.
- Experienced with Python frameworks like webapp2 and Flask.
- Strong experience in performing data analysis and data profiling using complex SQL on various source systems, including Oracle and Teradata.
- Used Teradata utilities (BTEQ, FastLoad, FastExport, MultiLoad, Teradata Administrator, SQL Assistant, PMON/Viewpoint, Data Mover).
- Involved in creating dashboards using Level of Detail (LOD) expressions.
- Good knowledge of data warehousing methodologies, SAS Business Intelligence tools, and database concepts.
- Able to work individually with minimal supervision or as a member of a large team; a quick learner with strong analytical, debugging, and problem-solving skills.
- Excellent report writing, communication, presentation and interpersonal skills.
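As a hedged illustration of the data validation and mismatch checks listed above, the sketch below shows how a source-to-target key comparison can be scripted in Python with pandas; the file names and the customer_id key are hypothetical placeholders, not project assets.

    # Minimal sketch of a source-vs-target mismatch check; files and key column
    # are placeholders.
    import pandas as pd

    source = pd.read_csv("source_extract.csv")   # hypothetical source extract
    target = pd.read_csv("target_extract.csv")   # hypothetical warehouse extract

    # Outer-join on the business key and flag rows present on only one side.
    merged = source.merge(target, on="customer_id", how="outer",
                          suffixes=("_src", "_tgt"), indicator=True)
    mismatches = merged[merged["_merge"] != "both"]

    print("Source rows:", len(source), "Target rows:", len(target))
    print("Unmatched keys:", len(mismatches))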
PROFESSIONAL EXPERIENCE
Confidential, Scottsdale, AZ
Data Analyst
Responsibilities:
- Worked with key stakeholders to understand data requirements and translated strategic requirements into usable enterprise information architecture.
- Ensured existing data/information assets were identified, documented, stewarded, and leveraged across the organization.
- Used Erwin for effective model management, sharing, dividing, and reusing model information and designs for productivity improvement.
- Performed data modeling, design, mapping, and analysis using both ERD and dimensional data models.
- Worked on Business Intelligence standardization to create database layers with user-friendly views in MS SQL Server that could be used for development of various Tableau reports/dashboards.
- Delivered fit-for-purpose analytics by building comprehensive performance reports that cover the big-picture storyline and allow for relevant drill-downs.
- Developed Informatica mappings/sessions/workflows using DT Studio and XSDs.
- Involved in performance tuning in Informatica for sources, transformations, targets, mappings, and sessions.
- Created variables and parameter files for mappings and sessions so they could be migrated easily across environments and databases.
- Effectively used the data blending feature in Tableau and defined best practices for Tableau report development.
- Developed tools to automate some base tasks using Python, Shell scripting and XML.
- Working with various Teradata tools and utilities like Teradata Viewpoint, MultiLoad, ARC, Teradata Administrator, BTEQ and other Teradata Utilities.
- Experienced in ERAssist and SQL Workbench for Logical and Physical database modeling of the warehouse, responsible for database schemas creation based on the logical models.
- Analyze change requests for mapping of multiple source systems for understanding of Enterprise wide information architecture to devise Technical Solutions.
- Worked on the AWS Redshift data warehouse for columnar data storage.
- Worked on migrating the EDW to AWS using EMR and various other technologies.
- Developed data and metadata policies and procedures to build, maintain, and leverage data models, ensuring compliance with corporate data standards.
- Involved in building database Model, APIs, and Views utilizing Python technologies to build web based applications.
- Identified the Facts and Dimensions from the business requirements and developed the logical and physical models using Erwin.
- Used Python to place data into JSON files for testing Django websites.
- Used Python scripts to update content in the database and manipulate files.
- Used the Django framework to develop the application.
- Generated Python Django forms to record data of online users.
- Used Django APIs for database access.
- Worked with Python OO design code for manufacturing quality, monitoring, logging, debugging, and code optimization.
- Wrote a Python module to connect to and view the status of an Apache Cassandra instance (see the sketch after this section).
- Developed tools using Python, shell scripting, and XML to automate some of the menial tasks.
- Used Python unit and functional testing modules such as unittest, unittest2, mock, and custom frameworks in line with Agile software development methodologies.
- Used Teradata Fast Load utility to load large volumes of data into empty Teradata tables in the Data mart for the Initial load and Used Teradata Fast Export utility to export large volumes of data from Teradata tables and views for processing and reporting needs.
- Performed data migration from an RDBMS to a NoSQL database, providing a complete picture of the data deployed across the various data systems.
- Generated comprehensive analytical reports by running SQL queries against current databases to conduct data analysis.
- Transformed staging-area data into a star schema, which was then used for developing embedded Tableau dashboards.
- Handled performance requirements for databases in OLTP and OLAP models.
- Understand and analyze business data requirements and architect an accurate, extensible, flexible and logical data model.
- Prepared complex SQL/R scripts for ODBC and Teradata servers for analysis and modeling.
- Worked in importing and cleansing of data from various sources like Teradata, Oracle, flat files, SQL Server with high volume data
- Developed complex Stored Procedures for SSRS (SQL Server Reporting Services) and created database objects like tables, indexes, synonyms, views, materialized views etc.
- Documented ER Diagrams, Logical and Physical models, business process diagrams and process flow diagrams.
- Wrote several UNIX Korn shell scripts for file transfers, error logging, data archiving, checking log files, and the cleanup process.
- Created complex SQL Server 2012 and Oracle stored procedures and triggers with error handling in support of the applications.
- Created mappings, mapplets, and reusable transformations in Informatica, using transformations such as Expression, Aggregator, Lookup, Joiner, Normalizer, Router, Update Strategy, Filter, Union, Sequence Generator, Stored Procedure, Source Qualifier, XML transformation, etc.
- Enabled speedy reviews and first mover advantages by using Oozie to automate data loading into the Hadoop Distributed File System and PIG to pre-process the data.
- Wrote shell scripts to export/import database backups from RDS and store them in S3 (AWS storage).
- Extracted data from the databases (Oracle and SQL Server, DB2, FLAT FILES) using Informatica to load it into a single data warehouse repository.
- Used SQL for Querying the database in UNIX environment
- Fine-tuned the mappings using push down optimization, session partitioning, bulk load, fast load, etc.
Environment: Tableau (Desktop/Server), Python, Teradata, Agile, SQL Assistant, BTEQ, AWS Redshift, SAS Desktop, SAS EG, PuTTY, Erwin r9.6, Informatica PowerCenter, Oracle 12c, Microsoft SQL Server 2012, SVN, R, MultiLoad, FastLoad, UNIX scripting, PL/SQL programming, Hadoop, Hive, Pig.
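As a hedged illustration of the Cassandra status module noted above, the sketch below shows how such a check might look with the DataStax Python driver; the contact point is a placeholder and this is not the original module.

    from cassandra.cluster import Cluster

    def cassandra_status(contact_points=("127.0.0.1",)):
        """Connect to a Cassandra instance and report which hosts are up."""
        cluster = Cluster(list(contact_points))
        session = cluster.connect()
        row = session.execute("SELECT release_version FROM system.local").one()
        print("Cassandra release:", row.release_version)
        for host in cluster.metadata.all_hosts():
            print(host.address, "UP" if host.is_up else "DOWN")
        cluster.shutdown()

    if __name__ == "__main__":
        cassandra_status()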
Confidential, Charlotte, NC
Sr. Data Analyst
Responsibilities:
- Involved in extensive data validation by writing several complex SQL queries; involved in back-end testing and worked on data quality issues.
- Key focus on the eLending and eMortgage business strategy (eNotes, eMortgage, eSignature, SMARTdocs, Imaging, MISMO data, eVaults).
- Interfaced and maintained relationships with major Mortgage LOS system vendors. Worked with cross-functional IT teams to create solutions for credit order and response files based on MBA MISMO dictated guidelines, LOS vendor changes, Mortgage Solutions changes, and government regulations.
- Developed and kept current a high-level data strategy that fit the data warehouse standards and the overall strategy of the organization.
- Mapped client data files into XML, which is used by a document composition engine to create client communications.
- Performed SQL Testing on AWS Redshift database.
- Connected Tableau directly to the Hortonworks Data Platform through the Hive interface so that business users could quickly find insights in the Hadoop deployment.
- Worked on the extract, transform, and load process and developed transformation standards and processes using Informatica best practices.
- Connected Tableau to Greenplum Database, a fully SQL-compliant database that sits on top of Hadoop, for real-time analytics.
- Used various transformations like Source Qualifier, Lookup, Router, Filter, Update Strategy, Expression, Aggregator, Sorter, and Sequence Generator.
- Worked with a team of developers on Python applications for risk management.
- Developed Python/Django application for Google Analytics aggregation and reporting.
- Used Django configuration to manage URLs and application parameters.
- Generated Python Django forms to record data of online users (see the sketch after this section).
- Used Python and Django for creating graphics, XML processing, data exchange, and business logic implementation.
- Developed all the mappings according to the design document and mapping specs
- Created workflow documentation for each of the developed mappings.
- Used PowerDesigner for reverse engineering, connecting to the existing database and ODS to create graphical representations in the form of entity relationships and elicit more information.
- Generated data extractions, summary tables, and graphical representations in SAS, Excel, R, and PowerPoint.
- Used Python scripts to update the content in database and manipulate files.
- Tested the ETL process for both before data validation and after data validation process. Tested the messages published by ETL tool and data loaded into various databases.
- Made strategic recommendations based on data analysis and insights incorporating business requirements. Streamline reporting, improve office processes, and adapt existing systems to new requirements.
- Effectively used data blending, filters, actions, and hierarchies features in Tableau.
- Combined views and reports into interactive dashboards in Tableau Desktop that were presented to business users, program managers, and end users.
- Projected and forecasted future growth in terms of number of subscribers based on upgrading to LTE by developing Area Maps to show details on which states were connected the most.
- Developed storytelling dashboards in Tableau Desktop and published them to Tableau Server, allowing end users to understand the data on the fly using quick filters for on-demand information.
- Created source-to-target SQL scripts for all the transformation rules mentioned in the mapping document.
- Used IBM DB2 Utilities in Creation of database objects and Import, Export and Loading data.
- Analyzed data from different perspectives and summarized it into useful information by Data Mining Techniques.
- Prepared Tableau reports and dashboards with calculated fields, parameters, sets, groups, and bins, and published them on the server.
- Performed data analysis and data profiling using complex SQL on various source systems including Oracle, SQL Server, and DB2.
- Used Teradata SQL Assistant, Teradata Administrator, PMON and data load/export utilities like BTEQ, Fast Load, Multi Load, Fast Export and Tpump on UNIX/Windows environments and running the batch process for Teradata.
Environment: AWS Redshift, PowerDesigner, PL/SQL, Oracle 11g, Python, Navigator, UNIX, MS Access, Teradata 14.1, R scripting, Netezza, Teradata SQL Assistant, Microsoft Visio, IBM DB2, SQL Server 2008, Tableau, MS PowerPoint, Microsoft Excel.
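The Django forms noted above for recording online-user data might look like the hedged sketch below; the form name and fields are hypothetical, not the original application code.

    from django import forms

    class OnlineUserForm(forms.Form):
        # Hypothetical fields for capturing online-user details.
        name = forms.CharField(max_length=100)
        email = forms.EmailField()
        signup_source = forms.CharField(max_length=50, required=False)

    # Typical view-side usage:
    # form = OnlineUserForm(request.POST)
    # if form.is_valid():
    #     record = form.cleaned_data  # persist via the model layer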
Confidential, Washington, DC
Data Analyst
Responsibilities:
- Designed the ER diagrams, logical model (relationships, cardinality, attributes, and candidate keys), and physical database (capacity planning, object creation, and aggregation strategies) for Oracle and Teradata per business requirements using Erwin.
- Produced entity relationship diagrams, logical model diagrams, physical model diagrams and logical to physical database mapping.
- Involved in UNIX shell programming using Bash and set up crontab jobs for SAS application batch runs.
- Used Informatica PowerCenter for extraction, transformation and loading data from heterogeneous source systems
- Good knowledge in data warehouse concepts like Star Schema, Snow Flake, Dimension and Fact tables
- Reverse engineered the reports and identified the Data Elements (in the source systems), Dimensions, Facts and Measures required for new enhancements of reports.
- Designed and customized data models for the data warehouse, supporting data from multiple sources in real time.
- Deep experience with the design and development of Tableau visualization solutions.
- Responsible for creating reports using Tableau Desktop and publishing them to Tableau Server.
- Configured different data sources using Tableau Desktop.
- Created best-fit visualization reports per the business requirements using Tableau Desktop Professional Edition.
- Blended data using multiple data sources (primary and secondary).
- Connected Tableau to Google BigQuery for fast, interactive analysis of massive data.
- Enabled Tableau actions to drill down from dashboards to worksheets.
- Experience with creation of users, groups, projects, workbooks, and the appropriate permission sets for Tableau Server logons and security checks.
- Created the Physical Data Model (PDM), generated DDL, and worked with the DBA to implement the physical data model.
- Developed and supported the extraction, transformation, and load (ETL) process for a data warehouse.
- Validated data accuracy across systems using AWS Redshift, S3, and MySQL (see the sketch after this section).
- Defined the Primary Keys (PKs) and Foreign Keys (FKs) for the entities and created dimensional-model star and snowflake schemas using the Kimball methodology.
- Developed Ab-Initio graphs using Ab-Initio Parallelism techniques, Data Parallelism and MFS Techniques with the Conditional Components and Conditional DML.
- Identified the required dependencies between ETL processes and triggers to schedule the Jobs to populate Data Marts on scheduled basis
- Carried out performance tuning on the Ab Initio graphs to reduce processing time.
- Worked in importing and cleansing of data from various sources like Teradata, Oracle, flat files, SQL Server with high volume data
- Converted a Visual Basic application to Python and MySQL.
- Used Python scripts to update content in the database and manipulate files.
- Developed tools to automate some base tasks using Python, shell scripting, and XML.
- Resolved data issues and updates for multiple applications using SQL queries/scripts.
- Involved in the validation of the OLAP Unit testing and System Testing of the OLAP Report Functionality and data displayed in the reports.
- Extracted data from the databases (Oracle and SQL Server) using Informatica to load it into a single data warehouse repository of Teradata
- Redefined attributes and relationships in the model and cleansed unwanted tables/columns as part of data analysis responsibilities.
- Designed different types of star schemas for detailed data marts and plan data marts in the OLAP environment.
- Supported the DBA in the physical implementation of the tables in Netezza, Oracle and DB2 databases.
- Developed complex SQL scripts for Teradata database for creating BI layer on DW for Tableau reporting.
- Created Inbound Interfaces to pull the data from source to Teradata.
- Involved in modeling business processes through UML diagrams.
- Created entity process association matrices, functional decomposition diagrams and data flow diagrams from business requirements documents.
Environment: AWS Redshift, Teradata, XMLSpy, Erwin, Business Objects XI R2, DB2, Oracle 10g, SQL Server 2005, Tableau, Oracle DBMS, R, Ab Initio, PL/SQL, Python, UNIX shell scripts, Loader, MS Access, MS PowerPoint, MS Excel.
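The cross-system validation against Redshift and MySQL mentioned above could be scripted along the lines of this hedged sketch; the hosts, credentials, and table name are placeholders, and Redshift is reached through psycopg2 since it speaks the PostgreSQL wire protocol.

    import psycopg2
    import mysql.connector

    redshift = psycopg2.connect(host="redshift-endpoint", port=5439,
                                dbname="dw", user="analyst", password="***")
    mysql_db = mysql.connector.connect(host="mysql-host", database="ods",
                                       user="analyst", password="***")

    query = "SELECT COUNT(*) FROM customer_dim"   # hypothetical table
    with redshift.cursor() as rs_cur:
        rs_cur.execute(query)
        rs_count = rs_cur.fetchone()[0]

    my_cur = mysql_db.cursor()
    my_cur.execute(query)
    my_count = my_cur.fetchone()[0]

    print("Redshift:", rs_count, "MySQL:", my_count,
          "MATCH" if rs_count == my_count else "MISMATCH")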
Confidential, Atlanta, GA
Data Analyst
Responsibilities:
- Performed data analysis and data profiling using complex SQL on various source systems, including Oracle and Teradata.
- Worked at the conceptual/logical/physical data model level using PowerDesigner according to requirements.
- Extensively used ETL methodology for supporting data extraction, transformation, and loading processes in a complex EDW using Informatica.
- Wrote and executed various MySQL database queries from Python using the MySQL Connector and MySQLdb packages (see the sketch after this section).
- Performance tuning of the Informatica mappings using various components like Parameter files, Variables and Dynamic Cache
- Responsible for handling and smoke testing WFXML requests and responses.
- Various Ab Initio commands such as m_ls, m_cp, m_mv, m_db, and m_touch were used extensively to operate on multifiles.
- Performed data cleansing and data validation using various Ab Initio functions like is_valid, is_defined, is_error, is_null, string_trim, etc.
- Involved in writing procedures, functions, and packages for uploading data from the database.
- Implemented component-level, pipeline, and data parallelism using Ab Initio for the ETL process.
- Involved in code reviews and performance tuning strategies at the Ab Initio and database level.
- Created attribute mappings from one system to another, helping normalize the data; interpreted data, analyzed results using statistical techniques, and provided ongoing reports.
- Developed and maintained data dictionary to create metadata reports for technical and business purpose.
- Responsible for analyzing various data sources such as flat files, ASCII Data, EBCDIC Data, Relational Data (Oracle, DB2 UDB and MS SQL Server) from various heterogeneous data sources.
- Designed a star schema for the detailed data marts and plan data marts involving conformed dimensions.
- Created new Informatica interfaces to pull data from the Teradata EDW to Oracle.
- Involved in Uploading Master and Transactional data from flat files and preparation of Test cases, Sub System Testing.
- Used DB2 Movement Utilities like Export, Import and Load.
- Tested the database to check field size validation, check constraints, stored procedures and cross verifying the field size defined within the application with metadata.
- Responsible for different Data mapping activities from Source systems to Teradata.
- Involved in working with BTEQ, TPT, and FastLoad to load the data.
- Created the test environment for Staging area, loading the Staging area with data from multiple sources.
- Excellent experience in troubleshooting test scripts, SQL queries, ETL jobs, data warehouse/data mart/data store models.
- Excellent in creating various artifacts for projects which include specification documents, data mapping and data analysis documents.
Environment: Teradata SQL Assistant, IBM DB2, PowerDesigner, WFXML validation, MS Visio, MS Access, Microsoft Excel, MicroStrategy, SQL, R, Teradata, Ab Initio, Netezza.
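The MySQL work noted above (queries run from Python via MySQLdb) could look roughly like the sketch below; the connection parameters, table, and query are illustrative placeholders only.

    import MySQLdb

    db = MySQLdb.connect(host="localhost", user="analyst",
                         passwd="***", db="staging")
    cursor = db.cursor()
    cursor.execute(
        "SELECT status, COUNT(*) FROM orders GROUP BY status"  # hypothetical table
    )
    for status, cnt in cursor.fetchall():
        print(status, cnt)
    cursor.close()
    db.close()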
Confidential, Dallas, TX
Data Analyst
Responsibilities:
- Developed SAS programs with the use of Base SAS and SAS/Macros for ad hoc jobs.
- Worked in risk-centric analytics, customer-centric analytics and finance-centric analytics.
- Modified the existing programs according to the new business rules.
- Ensured the accuracy and timeliness of daily P&L and position reports using SAS reporting procedures and VBA tools.
- Automated data extraction, transformation, and loading of the results into final datasets using SAS and SQL.
- Generated Reports, Summary tables, Charts and Graphs for different users using SAS/ASSIST.
- Used advanced techniques, including MP CONNECT and SAS/ACCESS, to reduce process execution times in SAS.
- Designed and developed complex aggregate, join, look up transformation rules (business rules) to generate consolidated (fact/summary) data identified by dimensions using Informatica PowerCenter 6.x/7.0 tool
- Conducted data validation using SAS Enterprise Guide.
- Used different Ab-Initio components like Subscribe, BatchSubscribe, Publish, Multi publish, Continuous update table, Read XML, Write XML, partition by key and sort, dedup, rollup, scan, reformat, join and fuse in various graphs.
- Also used components like Trash, Run Program, and Run SQL to run UNIX and SQL commands in Ab Initio.
- Implemented component-level, pipeline, and data parallelism using Ab Initio for the ETL process.
- Performed statistical analysis with statistical and univariate procedures from Base SAS and SAS/STAT, and created reports with PROC REPORT and ODS.
- Applied forecasting and analytical methodologies and contributed to their ongoing continuous improvement.
- Participated in application design and architecture discussions and assessed the impact and applicability of forecasting to clients' business problem solutions.
- Completed all levels of application testing and prepared all required documentation.
- Ran the monthly and quarterly maintenance jobs on the mainframe.
- Documented the run instructions for two different processes in a project.
- Participated in designing, coding, testing, debugging and documenting SAS Programs
- Statistical analysis included coding the datasets, calculating simple percentage distributions, producing graphs, comparing rates across agencies with t-tests and chi-square tests, and developing multiple tables and reports using SAS/STAT.
- Extensively used various SAS DATA step functions, SAS procedures, and SQL to write report logic for SAS Stored Processes.
- Created customized SAS reports using the DATA _NULL_ technique.
- Developed routine SAS macros to create tables, graphs and listings for inclusion in Clinical study reports and regulatory submissions and maintained existing ones.
- Produced quality customized reports using PROC TABULATE, PROC REPORT, and PROC SUMMARY, and provided descriptive statistics using PROC MEANS, PROC FREQ, and PROC UNIVARIATE (a rough Python analogue is sketched after this section).
- Created survival graphs in MS-Excel by transporting SAS data sets into Excel spreadsheets.
- Formatted HTML and RTF reports using the SAS Output Delivery System (ODS).
- Used the SAS Macro facility to produce weekly and monthly reports.
- Validated programs and databases.
- Created SAS programs using Proc Report, Proc Summary to generate reports on patients who deviate from protocol specifications.
- Worked closely with the statistician to make sure everything was in the right place.
Environment: Base SAS, SAS Enterprise Guide, Ab Initio, SAS Macro, SAS/SQL, SAS/STAT, SAS/GRAPH, WFXML validation
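The reporting in this role was done in SAS (PROC MEANS/FREQ/UNIVARIATE, as listed above); the snippet below is only a rough Python/pandas analogue of that kind of descriptive summary, with a hypothetical input file and columns, not a translation of the actual SAS programs.

    import pandas as pd

    df = pd.read_csv("positions.csv")                 # placeholder extract

    # PROC MEANS-style summary of a numeric measure by a class variable.
    summary = df.groupby("desk")["pnl"].agg(["count", "mean", "std", "min", "max"])
    print(summary)

    # PROC FREQ-style one-way frequency table.
    print(df["product_type"].value_counts())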
Confidential, Los Angeles, CA
SQL Developer
Responsibilities:
- Primarily worked on a project to develop an internal ETL product to handle complex, large-volume healthcare claims data. Designed the ETL framework and developed a number of packages to extract, transform, and load data into the data warehouse using SQL Server Integration Services (SSIS) to facilitate BI and analytics.
- Performed data source investigation, developed source-to-destination mappings, and performed data cleansing while loading the data into staging/ODS regions.
- Involved in various Transformation and data cleansing activities using various Control flow and data flow tasks in SSIS packages during data migration
- Applied various data transformations like Lookup, Aggregate, Sort, Multicasting, Conditional Split, Derived column etc.
- Developed Mappings, Sessions, and Workflows to extract, validate, and transform data according to the business rules using Informatica.
- Queried data, created stored procedures, and wrote complex queries and T-SQL joins to address various reporting operations as well as ad hoc data requests.
- Performed performance monitoring and index optimization tasks using Performance Monitor, SQL Profiler, Database Engine Tuning Advisor, and the Index Tuning Wizard.
- Acted as point of contact to resolve locking/blocking and performance issues.
- Wrote scripts and an indexing strategy for a migration to Amazon Redshift from SQL Server and MySQL databases.
- Worked on AWS Data Pipeline to configure data loads from S3 into Redshift.
- Used a JSON schema to define table and column mappings from S3 data to Redshift (see the sketch after this section).
- Wrote indexing and data distribution strategies optimized for sub-second query response
Environment: SQL Server, Oracle, Amazon Redshift, AWS Data Pipeline, S3, SQL Server Reporting Services (SSRS), SQL Server Integration Services (SSIS), SSAS, SharePoint, TFS, MS Project, DB2, MS Access, VBA.
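The S3-to-Redshift mapping described above can be pictured with this hedged sketch of a COPY command driven from Python; the bucket, IAM role, and table names are placeholders, and this is not the original Data Pipeline definition.

    import psycopg2

    conn = psycopg2.connect(host="redshift-endpoint", port=5439,
                            dbname="dw", user="loader", password="***")
    conn.autocommit = True

    copy_sql = """
        COPY claims_staging
        FROM 's3://example-bucket/claims/'
        IAM_ROLE 'arn:aws:iam::123456789012:role/RedshiftCopyRole'
        FORMAT AS JSON 's3://example-bucket/mappings/claims_jsonpaths.json'
        TIMEFORMAT 'auto';
    """
    with conn.cursor() as cur:
        cur.execute(copy_sql)   # Redshift pulls the files directly from S3
    conn.close()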
Confidential, Thousand oaks, CA
SQL Developer
Responsibilities:
- Involved in various stages like development, system analysis, design, and creating stored procedures.
- Involved with SSIS for sending and reviewing data from heterogeneous sources (Excel, CSV, Oracle, flat files, text-format data); see the sketch after this list.
- Created database objects like tables, views, indexes, stored-procedures, triggers and cursors.
- Migrated DTS packages to SQL Server Integration Services (SSIS) and designed/implemented backup policies for databases.
- Experience in performance tuning of mappings, ETL procedures, and processes.
- Experienced in complete life-cycle implementation of a data warehouse.
- Experienced in OLTP/OLAP system study, analysis, and ER modeling; developed database schemas like star schema and snowflake schema used in relational and multidimensional modeling using Erwin and Visio.
- Ability to write complex SQL needed for ETL jobs and data analysis; proficient with databases and sources like Oracle 11g/10g, SQL Server, Excel sheets, flat files, and XML files.
- Expertise in Administration tasks including Importing/Exporting mappings, copying folders over the DEV/QA/PRD environments, managing Users, Groups, associated privileges and performing backups of the repository.
- Well experienced in error handling and troubleshooting.
- Good exposure to Development, Testing, Debugging, Implementation, Documentation, End-user training and Production support.
- Expertise in doing Unit Testing, Integration Testing, System Testing and Data Validation for Developed Informatica Mappings.
- Created various DDL/DML triggers in SQL Server 2005/2000
- Created and maintained users and roles, granted privileges, and rebuilt indexes on various tables.
- Developed complex SQL to implement business logic and optimized query performance with modifications to T-SQL queries.
- Configured a mail profile for sending automatic emails to the respective people when a job fails or succeeds. Maintained SQL scripts for the creation of database objects.
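The SSIS-based loads from heterogeneous files described above could be pictured, as a rough Python analogue only (not the actual SSIS packages), with pandas and SQLAlchemy; the connection string, file names, and staging table are placeholders.

    import pandas as pd
    from sqlalchemy import create_engine

    engine = create_engine(
        "mssql+pyodbc://user:***@sql-server-host/StagingDB"
        "?driver=ODBC+Driver+17+for+SQL+Server"
    )

    frames = [
        pd.read_excel("members.xlsx"),   # hypothetical Excel source
        pd.read_csv("members.csv"),      # hypothetical CSV source
    ]
    staged = pd.concat(frames, ignore_index=True)

    # Append into the staging table; downstream cleansing happens in the database.
    staged.to_sql("stg_members", engine, if_exists="append", index=False)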