Data Analyst Resume
Marlborough, MA
SUMMARY
- 8+ years of IT experience in Data/Business Analysis, ETL Development, Data Modeling, and Project Management, with 2+ years of experience in Big Data and related Hadoop technologies.
- Excellent knowledge of the internal workings of HDFS and MapReduce
- In-depth knowledge of Hadoop ecosystem components such as Pig, Hive, Sqoop, Flume, Oozie, ZooKeeper, and Cloudera Manager
- Experience in deploying Hadoop clusters on public and private cloud environments such as Amazon AWS, Rackspace, and OpenStack
- Experience in using HCatalog for Hive, Pig and HBase
- Exposure to NoSQL databases HBase and Cassandra
- Excellent knowledge of Python collections and multithreading.
- Skilled in Python, with proven expertise in adopting new tools and technical developments
- Strong at improving performance with very large datasets in SAS
- Used SAS/SQL for extract, transform, and load (ETL) methodology and processes
- Worked with several Python packages such as NumPy, SciPy, pandas, and PyTables.
- Strong experience in Business and Data Analysis, Data Profiling, Data Migration, Data Conversion, Data Quality, Data Integration, Metadata Management Services, and Configuration Management
- Proficient with multiple databases, including Teradata, MongoDB, Cassandra, MySQL, Oracle, and MS SQL Server
- Experience in creating Teradata SQL scripts using OLAP functions such as RANK() OVER to improve query performance when pulling data from large tables (see the sketch at the end of this section)
- Designed and implemented data extraction, transformation, and loading (using SQL*Loader, Informatica, and other ETL tools); analyzed and migrated Oracle and SQL Server data.
- Comprehensive knowledge and experience in process improvement, normalization/de-normalization, data extraction, data cleansing, data manipulation
- Strong experience in interacting with stakeholders/customers, gathering requirements through interviews, workshops, and existing system documentation or procedures, defining business processes, identifying and analyzing risks using appropriate templates and analysis tools.
- Expertise in creating complex SSRS reports against OLTP, OLAP databases.
- Experience in various phases of Software Development life cycle (Analysis, Requirements gathering, Designing) with expertise in documenting various requirement specifications, functional specifications, Test Plans, Source to Target mappings, SQL Joins.
- Experience in Normalization and De-Normalization techniques for both OLTP and OLAP systems in creating Database Objects like tables, Constraints (Primary key, Foreign Key, Unique, Default), Indexes.
- Good understanding of Relational Database Design, Data Warehouse/OLAP concepts and methodologies
- Performed Logical & Physical Data Modeling and delivering Normalized, De-Normalized & Dimensional schemas.
- In-depth experience with AWS, including EC2, volume and snapshot management, DynamoDB, S3, RDS, VPC, Route 53, Elastic Beanstalk, and IAM services.
- Managed Amazon instances by taking AMIs, and performed administration and monitoring of Amazon instances using Amazon CloudWatch.
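A minimal sketch of the windowed RANK() OVER pull referenced above, assuming the teradatasql driver; the host, credentials, and table/column names are hypothetical placeholders:

```python
# Minimal Teradata ad hoc pull using RANK() OVER.
# Assumes the teradatasql driver; host, credentials, and table/column
# names are hypothetical placeholders.
import teradatasql

query = """
SELECT customer_id,
       order_total,
       RANK() OVER (PARTITION BY region ORDER BY order_total DESC) AS region_rank
FROM   sales.orders
QUALIFY RANK() OVER (PARTITION BY region ORDER BY order_total DESC) <= 10
"""

with teradatasql.connect(host="tdprod", user="analyst", password="***") as con:
    with con.cursor() as cur:
        cur.execute(query)
        for customer_id, order_total, region_rank in cur.fetchall():
            print(region_rank, customer_id, order_total)
```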
TECHNICAL SKILLS
ETL TOOLS: Informatica PowerCenter, SQL*Loader, Ascential DataStage
DATA MODELING: Erwin 4.0, PowerDesigner, Microsoft Visio 2003, ER Studio
DATABASES: Oracle 10g, MS SQL Server, MS Access, Teradata
OPERATING SYSTEMS: Windows, UNIX, Sun Solaris, AIX, HP-UX
PROGRAMMING LANGUAGES: SQL, PL/SQL, Visual Basic .NET, C, C++, UNIX Shell Scripting, XML
REPORTING/DSS/ANALYSIS: Business Objects 5.0, COGNOS 7.0 and Crystal Reports
PROFESSIONAL EXPERIENCE
Data Analyst
Confidential, Marlborough, MA
Responsibilities:
- Wrote several Teradata SQL queries using Teradata SQL Assistant for ad hoc data pull requests.
- Developed Python programs to manipulate data read from various Teradata tables and consolidate it into a single CSV file.
- Performed statistical data analysis and data visualization using Python and R
- Worked on creating filters, parameters and calculated sets for preparing dashboards and worksheets in Tableau.
- Created data models in Splunk using pivot tables, analyzing vast amounts of data and extracting key information to suit various business requirements.
- Created new scripts for Splunk scripted input for collecting CPU, system and OS data.
- Interacted with other data scientists and architected custom solutions for data visualization using tools such as Tableau, R packages, and R Shiny.
- Implemented data refreshes on Tableau Server for biweekly and monthly increments based on business change to ensure that the views and dashboards were displaying the changed data accurately.
- Maintained large data sets, combining data from various sources using Excel, SAS Grid, SAS Enterprise, Access, and SQL queries.
- Analyzed data sets with SAS programming, R, and Excel.
- Published interactive dashboards and scheduled automatic data refreshes
- Performed Tableau administration using Tableau admin commands.
- Created Hive queries that helped market analysts spot emerging trends by comparing incremental data with Teradata reference tables and historical metrics.
- Responsible for creating Hive tables, loading structured data resulting from MapReduce jobs into the tables, and writing Hive queries to further analyze the logs to identify issues and behavioral patterns.
- Developed normalized Logical and Physical database models for designing an OLTP application.
- Knowledgeable in the AWS environment for loading data files from on-premises systems to a Redshift cluster
- Performed SQL Testing on AWS Redshift databases
- Developed Teradata SQL scripts using OLAP functions such as RANK() OVER to improve query performance when pulling data from large tables.
- Involved in running MapReduce jobs for processing millions of records.
- Wrote complex SQL queries using joins and OLAP functions such as CSUM, COUNT, and RANK.
- Involved in extensive routine operational reporting, ad hoc reporting, and data manipulation to produce routine metrics and dashboards for management
- Created action filters, parameters and calculated sets for preparing dashboards and worksheets in Tableau.
- Built and published customized interactive reports and dashboards, with report scheduling, using Tableau Server.
- Experienced in migrating HiveQL into Impala to minimize query response time.
- Responsible for Data Modeling as per our requirement in HBase and for managing and scheduling Jobs on a Hadoop cluster using Oozie jobs.
- Worked on Spark SQL and DataFrames for faster execution of Hive queries using Spark SQLContext (see the PySpark sketch at the end of this list).
- Design and development of ETL processes using Informatica ETL tool for dimension and fact file creation.
- Develop and automate solutions for a new billing and membership Enterprise data Warehouse including ETL routines, tables, maps, materialized views, and stored procedures incorporating Informatica and Oracle PL/SQL toolsets.
- Performed analysis on implementing Spark using Scala and wrote sample Spark programs using PySpark.
- Created UDFs to calculate the pending payment for a given residential or small business customer's quotation data, and used them in Pig and Hive scripts.
- Experienced in moving data from Hive tables into HBase for real time analytics on Hive tables.
- Handled importing data from various data sources and performed transformations using Hive (external tables, partitioning).
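As referenced above, a minimal PySpark sketch of querying Hive tables through Spark SQL; the database, table, and column names are illustrative only:

```python
# Minimal PySpark sketch: compare incremental data with a reference table
# through Spark SQL. Assumes a Hive metastore is available to the session;
# database, table, and column names are illustrative placeholders.
from pyspark.sql import SparkSession

spark = (
    SparkSession.builder
    .appName("hive-trend-analysis")
    .enableHiveSupport()   # lets spark.sql() see existing Hive tables
    .getOrCreate()
)

trends = spark.sql("""
    SELECT i.region,
           i.product,
           SUM(i.sales_amt)        AS current_sales,
           MAX(r.trailing_90d_avg) AS baseline_sales
    FROM   staging.daily_sales_incremental i
    JOIN   reference.sales_baseline        r
           ON i.product = r.product AND i.region = r.region
    GROUP  BY i.region, i.product
""")

# Flag products selling well above their historical baseline.
trends.where("current_sales > 1.5 * baseline_sales").show()
spark.stop()
```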
Environment: SQL Server, Oracle 9i, MS Office, Teradata, Informatica, ER Studio, XML, Hive, HDFS, Flume, Sqoop, R connector, Python, R, Tableau 9.2
Data Analyst
Confidential, Dorchester, MA
Responsibilities:
- Work with users to identify the most appropriate source of record required to define the asset data for financing
- Performed data profiling in Target DWH
- Experience in using OLAP functions such as COUNT, SUM, and CSUM
- Performed data analysis and data profiling using complex SQL on various source systems, including Oracle and Teradata.
- Hands-on experience with Sqoop.
- Developed normalized Logical and Physical database models for designing an OLTP application.
- Developed new scripts to gather network and storage inventory data and ingest it into Splunk.
- Imported customer data into Python using the pandas library and performed various data analyses, finding patterns in the data that informed key decisions for the company (see the pandas sketch at the end of this list)
- Created tables in Hive and loaded structured data resulting from MapReduce jobs
- Developed many HiveQL queries and extracted the required information.
- Exported the required data to an RDBMS using Sqoop to make it available to the claims processing team to assist in processing claims based on the data.
- Designed and deployed rich graphical visualizations with drill-down and drop-down menu options and parameterization using Tableau.
- Extracted data from the database using SAS/ACCESS and SAS SQL procedures and created SAS data sets.
- Created Teradata SQL scripts using OLAP functions like RANK () to improve the query performance while pulling the data from large tables.
- Worked on MongoDB database concepts such as locking, transactions, indexes, Sharding, replication, schema design, etc.
- Performed Data analysis using Python Pandas.
- Good experience with Agile methodologies, Scrum stories, and sprints in a Python-based environment, along with data analytics and Excel data extracts.
- Created Hive queries that helped market analysts spot emerging trends by comparing fresh data with EDW reference tables and historical metrics.
- Involved in defining the source to target data mappings, business rules, business and data definitions
- Responsible for defining the key identifiers for each mapping/interface
- Responsible for defining the functional requirement documents for each source to target interface.
- Hands-on experience with pivot tables and graphs in MS Excel
- Used advanced Excel features such as pivot tables and charts for generating graphs.
- Designed and developed weekly and monthly reports using MS Excel techniques (charts, graphs, pivot tables) and PowerPoint presentations.
- Strong Excel skills, including pivot tables, VLOOKUP, conditional formatting, and large record sets, as well as data manipulation and cleaning.
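A minimal pandas sketch of the customer-data analysis referenced above; the file name and columns are hypothetical placeholders:

```python
# Minimal pandas sketch: load customer claim data and summarize claim
# volume and size by segment and month to surface patterns.
# File name and column names are hypothetical placeholders.
import pandas as pd

customers = pd.read_csv("customer_claims.csv", parse_dates=["claim_date"])

monthly = (
    customers
    .assign(claim_month=customers["claim_date"].dt.to_period("M"))
    .groupby(["segment", "claim_month"])["claim_amount"]
    .agg(claim_count="count", avg_claim="mean")
    .reset_index()
    .sort_values(["segment", "claim_month"])
)

print(monthly.head(12))
```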
Environment: SQL Server, Oracle 9i, MS Office, Teradata, Informatica, ER Studio, XML, Hive, HDFS, Flume, Sqoop, R connector, Python, R, Tableau 9.2
Data Analyst
Confidential, Waltham, MA
Responsibilities:
- Experienced in developing business reports by writing complex SQL queries using views, volatile tables
- Experienced in Automating and Scheduling the Teradata SQL Scripts in UNIX using Korn Shell scripting.
- Wrote several Teradata SQL Queries using Teradata SQL Assistant for Ad Hoc Data Pull request.
- Extensive experience working with Tableau Desktop, Tableau Server, and Tableau Reader across Tableau versions 7.0, 8.0, 8.3, 9.0, and 10 as a Developer and Analyst.
- Worked on Offshore-Onshore Model.
- Interacted with the client and end users to understand requirements and design high- and low-level documentation.
- Analysis of functional and non-functional categorized data elements for data profiling and mapping from source to target data environment. Developed working documents to support findings and assign specific tasks.
- Designed and prototyped accurate and scalable prediction algorithms using R/RStudio
- Analyzed different types of data to derive insights about relationships between locations, statistical measurements and qualitatively assess the data using R/R Studio
- Performed data profiling using SQL and Informatica to identify patterns in the source data, improve data quality, and help the business better understand the converted data and define accurate business rules (see the profiling sketch at the end of this list).
- Involved with data profiling for multiple sources and answered complex business questions by providing data to business users.
- Implemented indexes, collected statistics, and applied constraints while creating tables
- Created action filters, parameters and calculated sets for preparing dashboards and worksheets in Tableau.
- Built and published customized interactive reports and dashboards, with report scheduling, using Tableau Server.
- Designed and deployed rich graphical visualizations with drill-down and drop-down menu options and parameterization using Tableau.
- Created side-by-side bars, scatter plots, stacked bars, heat maps, filled maps, and symbol maps according to deliverable specifications.
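A minimal data-profiling sketch in pandas, summarizing the kinds of checks referenced above (null rates, distinct counts, value ranges); the source file is a hypothetical extract from one of the source systems:

```python
# Minimal data-profiling sketch: per-column null rates, distinct counts,
# and numeric ranges for a source extract. The file is a hypothetical
# placeholder for one of the source systems.
import pandas as pd

source = pd.read_csv("source_extract.csv")

profile = pd.DataFrame({
    "dtype":          source.dtypes.astype(str),
    "null_pct":       (source.isna().mean() * 100).round(1),
    "distinct_count": source.nunique(),
    "min":            source.min(numeric_only=True),
    "max":            source.max(numeric_only=True),
})

print(profile)
```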
Environment: Oracle 10g, MS Office, Teradata 13, Tableau 9.2
Data Analyst
Confidential, Boston, MA
Responsibilities:
- Created new reports based on requirements; responsible for generating weekly ad hoc reports
- Planned, coordinated, and monitored project levels of performance and activities to ensure on-time project completion.
- Automated and scheduled recurring reporting processes using UNIX shell scripting and Teradata utilities such as MLOAD, BTEQ, and FastLoad
- Experience with Perl
- Worked in a Scrum Agile process, writing stories in two-week iterations and delivering product each iteration
- Worked on transferring data files to the vendor through SFTP and FTP processes (see the SFTP sketch at the end of this list)
- Involved in defining and constructing customer-to-customer relationships based on association to an account and customer
- Created action filters, parameters and calculated sets for preparing dashboards and worksheets in Tableau.
- Performed Tableau administration using Tableau admin commands.
- Worked with architects, assisting in the development of current- and target-state enterprise-level data architectures
- Worked with project team representatives to ensure that logical and physical data models were developed in line with corporate standards and guidelines.
- Involved in defining the source to target data mappings, business rules and data definitions.
- Responsible for defining the key identifiers for each mapping/interface.
- Performed data analysis and data profiling using complex SQL on various source systems, including Oracle and Teradata.
- Migrated three critical reporting systems to Business Objects and Web Intelligence on a Teradata platform
- Created Excel charts and pivot tables for ad hoc data pulls
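A minimal Python sketch of the vendor SFTP transfer referenced above, using paramiko; the original work used UNIX shell scripting, so this shows only the equivalent idea, and the host, credentials, and paths are placeholders:

```python
# Minimal SFTP upload sketch using paramiko. Host, credentials, and paths
# are hypothetical placeholders; the original process was scripted in
# UNIX shell, so this is only an equivalent illustration.
import paramiko

host = "vendor.example.com"
local_file = "weekly_report.csv"
remote_file = "/incoming/weekly_report.csv"

client = paramiko.SSHClient()
client.set_missing_host_key_policy(paramiko.AutoAddPolicy())
client.connect(host, username="transfer_user", password="***")

sftp = client.open_sftp()
sftp.put(local_file, remote_file)   # upload the report to the vendor drop location
sftp.close()
client.close()
```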
Environment: Teradata 13.1, Informatica 6.2.1, Ab Initio, Business Objects, Oracle 9i, PL/SQL, Microsoft Office Suite (Excel, VLOOKUP, Pivot Tables, Access, PowerPoint), Visio, VBA, MicroStrategy, Tableau, UNIX Shell Scripting, ERWIN.
Data Analyst
Confidential, Holyoke, MA
Responsibilities:
- Interacted with business users to identify and understand business requirements and identified the scope of the projects.
- Identified and designed business entities, attributes, and relationships between the entities to develop a logical model, and later translated the model into a physical model.
- Developed normalized Logical and Physical database models for designing an OLTP application.
- Enforced referential integrity (RI) for consistent relationships between parent and child tables. Worked with users to identify the most appropriate source of record and profile the data required for sales and service.
- Involved in defining the business/transformation rules applied for ICP data.
- Defined the list codes and code conversions between the source systems and the data mart.
- Developed the financing reporting requirements by analyzing the existing Business Objects reports
- Utilized Informatica toolset (Informatica Data Explorer, and Informatica Data Quality) to analyze legacy data for data profiling.
- Reverse-engineered the data models, identified the data elements in the source systems, and added new data elements to the existing data models.
- Created XSDs for applications to connect the interface and the database.
- Compared data with original source documents and validated data accuracy.
- Used reverse engineering to create Graphical Representation (E-R diagram) and to connect to existing database.
- Generated weekly and monthly asset inventory reports.
- Evaluated data profiling, cleansing, integration and extraction tools (e.g. Informatica)
- Coordinated with business users to design new reporting in an appropriate, effective, and efficient way, based on user needs and the existing functionality
- Also assessed the impact of low-quality and/or missing data on the performance of the data warehouse client
- Worked with NZLoad to load flat-file data into Netezza tables.
- Good understanding of Netezza architecture.
- Identified design flaws in the data warehouse
- Executed DDL to create databases, tables and views.
- Generated comprehensive analytical reports by running SQL queries against current databases to conduct data analysis (see the sketch at the end of this list).
- Involved in Data Mapping activities for the data warehouse
- Created and configured workflows, worklets, and sessions to transport the data to target warehouse Netezza tables using Informatica Workflow Manager.
- Extensively worked on Performance Tuning and understanding Joins and Data distribution.
- Experienced in generating and documenting Metadata while designing application.
- Coordinated with DBAs and generated SQL codes from data models.
- Generated reports for better communication between business teams.
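A minimal sketch of the DDL and analytical reporting queries referenced above, run over a generic ODBC connection with pyodbc; the DSN, credentials, and table/column names are hypothetical:

```python
# Minimal sketch: run DDL and an analytical reporting query over ODBC.
# Assumes a configured ODBC DSN for the warehouse (e.g. Netezza); the DSN,
# credentials, and table/column names are hypothetical placeholders.
import pyodbc

con = pyodbc.connect("DSN=DWH;UID=analyst;PWD=***")
cur = con.cursor()

# DDL: a reporting view over the asset inventory table.
cur.execute("""
    CREATE VIEW asset_inventory_summary AS
    SELECT asset_type,
           COUNT(*)        AS asset_count,
           SUM(book_value) AS total_book_value
    FROM   asset_inventory
    GROUP  BY asset_type
""")
con.commit()

# Analytical pull for the weekly asset inventory report.
rows = cur.execute(
    "SELECT asset_type, asset_count, total_book_value FROM asset_inventory_summary"
)
for asset_type, asset_count, total_book_value in rows:
    print(asset_type, asset_count, total_book_value)

cur.close()
con.close()
```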
Environment: SQL Server, Oracle 9i, MS Office, Embarcadero, Crystal Reports, Netezza, Teradata, Enterprise Architect, Toad, Informatica, ER Studio, XML, OBIEE