Data Analyst Resume
Philadelphia, PA
SUMMARY
- Around 8 years of experience in NLP/NLU/NLG, AI, machine learning, computer vision, probabilistic graphical models, inferential statistics, graph theory, and system design.
- Proficient in Spark Core, Spark SQL, Spark Streaming, and Spark MLlib; part of an R&D team building new analytics POCs using Apache Spark, Scala, R, and machine learning.
- Expert-level understanding of application design, development, and testing in mainframe environments using PL/1, COBOL, EGL, Easytrieve, DB2, JCL, QC, and VAG.
- Expertise in data analysis and development using SQL Server Reporting Services (SSRS), SQL Server Integration Services (SSIS), and SQL Server Analysis Services (SSAS).
- Experience in machine learning, statistics, and regression (linear, logistic, Poisson, binomial), applied to building AI bots that assist or replace humans in various business domains.
- Good experience designing and scheduling complex SSIS packages for data migration from various sources such as SQL Server, Oracle Database, and Excel.
- Experience in logical and physical data modeling using ER Studio and Microsoft Visio; proficient in requirement analysis, system design, and Object-Oriented Design (UML).
- Proficient in database performance optimization, debugging, and tuning using Query Analyzer, SQL Profiler, SQL Server Debugger, and data viewers, and in performance tuning of ETL data flows.
- Experience using Jira Service Desk as a ticketing system to resolve end users' tickets, and Jira as part of an Agile framework to manage team projects.
- Expertise in designing and developing efficient dashboards using Tableau 2018.x/10.x/9.x/8.x/7.x (Desktop/Server) with data from multiple sources such as Oracle, PostgreSQL, Teradata, DB2, and flat files.
- Experience migrating SQL databases to Azure Data Lake with Azure Data Factory (ADF), plus Azure Data Lake Analytics, Azure SQL Database, Databricks, and Azure SQL Data Warehouse; controlled and granted database access.
- Experience optimizing mappings and implementing complex business rules by creating reusable transformations, mapplets, and PL/SQL stored procedures.
- SQL BI experience developing interactive drill-down reports using slicers and dicers in Power BI and PowerPivot; hands-on experience embedding Tableau reports and dashboards in external websites using JavaScript API scripting.
- Experience with cloud databases and data warehouses (SQL Azure and Confidential Redshift/RDS), plus version control tools such as Git, Subversion (SVN), Perforce, TFS, and UCM ClearCase.
- Experienced in automating, configuring, and deploying instances in AWS and Azure environments and data centers; familiar with EC2, CloudWatch, and CloudFormation, and with managing security groups on AWS.
- Defined loading strategies for initial and delta data loads and scheduled and monitored data loads; strong experience conducting workshops and training sessions for users in BW/HANA reporting.
- Created SAS datasets, reports, tables, listings, summaries, and graphs according to the given specifications and departmental SOP guidelines on Windows and UNIX platforms.
- Experience in VB, C#.NET, ASP.NET, and ADO.NET; worked on implementation, roll-off, upgrade, and maintenance projects.
- Experience modeling and manipulating data using Python, Alteryx, and Informatica, and in Tableau reporting; administrative knowledge of ALM workflows, maintenance, and customization.
- Experience with all phases of the software development lifecycle (SDLC) and project methodologies; performed Tableau type-conversion functions when connected to relational data sources.
- Experience in OLTP/OLAP system study, analysis, and modeling; developed data warehouse schemas such as the star and snowflake schemas used in dimensional and multi-dimensional modeling.
- Single-handedly designed and built an entire information-extraction bot POC for KYC extraction; the bot uses adaptive learning techniques and custom supervised classifiers for entity and relation extraction.
- Performed summarization using NLP techniques to draw meaningful insights from customer reviews, which increased revenue.
- Extensively worked with Python 3.5/2.7 (NumPy, Pandas, Matplotlib, TensorFlow, NLTK, and Scikit-learn); Hadoop/Big Data technology experience in the storage, querying, processing, and analysis of data.
- Experience building solutions for enterprises involving context awareness, pervasive computing, and applications of machine learning.
- Research and development of machine learning pipelines for handwritten Optical Character Recognition, an anomaly detection system using a multivariate Gaussian model (a brief sketch follows this section), and healthcare diagnostics systems using probabilistic graphical models (Bayesian networks).
- Hands-on experience with data mining algorithms and approaches; comfortable presenting to senior management, business stakeholders, and external partners.
- Strong in algorithms, design techniques, and architecture; fluent in modern programming languages such as Java.
- Designed reusable server components for web and mobile applications; strong programming expertise in Python and strong database SQL skills.
- Proficient in Python, with experience building and productionizing end-to-end systems; solid coding and engineering skills in machine learning.
- Experience with file systems, server architectures, databases, SQL, and data movement (ETL); knowledge of information extraction and NLP algorithms coupled with deep learning.
- Experience installing, configuring, and maintaining Jenkins for continuous integration and for end-to-end automation of all builds and deployments.
- Experience in branching, merging, and maintaining versions using tools such as Bitbucket; performed and deployed builds for SIT, UAT, and production environments.
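A minimal sketch of the multivariate-Gaussian anomaly detection approach mentioned above. The data, feature dimensions, and threshold value here are illustrative assumptions, not the production system:

import numpy as np
from scipy.stats import multivariate_normal

def fit_gaussian(X):
    # Estimate mean vector and covariance matrix from normal (non-anomalous) examples.
    return X.mean(axis=0), np.cov(X, rowvar=False)

def flag_anomalies(X, mu, sigma, epsilon=1e-4):
    # Flag rows whose density under the fitted Gaussian falls below epsilon;
    # epsilon would be tuned on a labeled validation set (e.g., by F1 score).
    return multivariate_normal(mean=mu, cov=sigma).pdf(X) < epsilon

rng = np.random.default_rng(0)                       # synthetic training data
mu, sigma = fit_gaussian(rng.normal(size=(1000, 3)))
X_new = np.vstack([rng.normal(size=(5, 3)), [[8.0, 8.0, 8.0]]])
print(flag_anomalies(X_new, mu, sigma))              # the outlier row prints True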
TECHNICAL SKILLS
Languages: Python, R, Scala, Java, C, C++, SQL, T-SQL, PL/SQL, ASP, Visual Basic, XML, HTML, UNIX shell scripting, Perl
Machine Learning Libraries: Spark ML, Spark MLlib, Scikit-learn, NLTK, Stanford NLP
Deep Learning Frameworks: TensorFlow, Google Dialogflow
Big Data Frameworks: Apache Spark, Apache Hadoop, Kafka, MongoDB, Cassandra
Machine Learning: Linear Regression, Logistic Regression, Naive Bayes, SVM, Decision Trees, Random Forest, Boosting, Bagging, K-means
Big Data Distributions: Cloudera, Amazon EMR
Web Technologies: Flask, Django, Spring MVC
Front-End Technologies: JSP, HTML5, Ajax, jQuery, XML
Web Servers: Apache2, Nginx, WebSphere, Tomcat
Visualization Tools: Tableau
Databases: Oracle 11g/12c, MySQL, PostgreSQL, MS Access, SQL Server 2012/2014, Sybase, DB2, Teradata 14/15, Hive
NoSQL: MongoDB, Cassandra
Operating Systems: Linux, Windows
Scheduling Tools: Airflow, Oozie
PROFESSIONAL EXPERIENCE
Confidential - Philadelphia, PA
Data Analyst
Responsibilities:
- Created various database objects (tables, indexes, views, stored procedures, and triggers) and implemented referential integrity constraints to enforce data integrity and business rules.
- Performed T-SQL tuning and optimization of long-running queries using MS SQL Profiler and Database Engine Tuning Advisor.
- Performed data mapping between source and local systems, performed logical data modeling, created class diagrams and ER diagrams, and used SQL queries to filter data.
- Reverse-engineered the backend database to modify T-SQL scripts and created views, stored procedures, triggers, and functions that drastically improved performance.
- Performed data modeling to understand the relationships between entities in the database; applied data transformations such as Aggregate, Sort, Multicast, Conditional Split, and Derived Column in SSIS.
- Developed merge jobs in Python to extract and load data into a MySQL database; also worked on Python ETL file loading using regular expressions (see the sketch at the end of this list).
- Expert with Business Intelligence tools: Microsoft SSRS, SSIS, and SSAS, Visual Studio, and Informatica PowerCenter.
- SQL Server and Oracle database development experience using tables, triggers, views, packages, and stored procedures in PL/SQL and PostgreSQL; strong RDBMS fundamentals.
- Worked in an Agile methodology; used data viewers and performance tuning on ETL data flows, and created VB.NET scripts for data flow and error handling using the Script component in SSIS.
- Designed and developed Power BI graphical and visualization solutions with Python, based on business requirement documents and plans for creating interactive dashboards.
- Involved in designing, developing, and testing the ETL strategy to populate data from various source systems using SSIS.
- Worked with tabular, matrix, gauge and chart, parameterized, sub-, and ad-hoc reports in a SQL Server Reporting Services environment using SQL BI, SSIS, SSRS, and ETL processes.
- Participated in reviews of issues related to SSRS reports and worked on troubleshooting, fixing, testing, and deploying them.
- Completed a highly immersive data science program involving data manipulation and visualization, web scraping, machine learning, Python programming, SQL, Git, Unix commands, NoSQL, MongoDB, and Hadoop.
- Used pandas, numpy, seaborn, scipy, matplotlib, scikit-learn, NLTK in Python for developing various machine learning algorithms.
- Installed and used the Caffe deep learning framework; set up storage and data analysis tools in the Amazon Web Services cloud computing infrastructure.
- Developed a voice bot using AI (IVR), improving the interaction between humans and the virtual assistant; worked with data formats such as JSON and XML and ran machine learning algorithms in Python.
- Developed and deployed with Google Dialogflow Enterprise; worked with data architects and IT architects to understand data movement and storage, using ER Studio 9.7.
- Participated in all phases of data mining: data collection, data cleaning, model development, validation, and visualization; performed gap analysis.
- Implemented an Agile methodology for building an internal application; performed data manipulation and aggregation from different sources using Nexus, Toad, Business Objects, Power BI, and Smart View.
- Worked with Informatica MDM (strengthened by its acquisition of Identity Systems); extracted data from HDFS and prepared it for exploratory analysis using data munging.
- Rapidly created models in Python using pandas, NumPy, and scikit-learn, with Plotly for data visualization; these models were then implemented in SAS, interfaced with MSSQL databases, and scheduled to refresh on a regular basis.
- Performed data analysis using regression, data cleaning, Excel VLOOKUP, histograms, and the TOAD client; presented the analysis and suggested solutions to investors.
- Built and maintained dashboards and reporting based on the statistical models to identify and track key metrics and risk indicators.
- Attained good knowledge of Hadoop Data Lake implementation and Hadoop architecture for client business data management.
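A minimal sketch of the regex-driven Python ETL file load described in the merge-jobs bullet above. The line layout, table schema, and column names are assumptions for illustration, and sqlite3 stands in here for the MySQL target used in the actual work:

import re
import sqlite3

# Assumed toy line layout: "2016-01-05 ORD-123 49.99"
LINE_RE = re.compile(r"(\d{4}-\d{2}-\d{2})\s+(ORD-\d+)\s+([\d.]+)")

def parse_file(path):
    # Extract (date, order_id, amount) tuples from lines matching LINE_RE.
    with open(path) as fh:
        return [(m.group(1), m.group(2), float(m.group(3)))
                for m in map(LINE_RE.match, fh) if m]

def load(rows, conn):
    # Bulk-insert parsed rows; against MySQL the placeholders would be %s.
    conn.execute("CREATE TABLE IF NOT EXISTS orders (dt TEXT, order_id TEXT, amount REAL)")
    conn.executemany("INSERT INTO orders VALUES (?, ?, ?)", rows)
    conn.commit()

# Usage, assuming an orders.log file in the layout above.
load(parse_file("orders.log"), sqlite3.connect("staging.db"))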
Environment: ER Studio 9.7, Tableau 9.03, AWS, Teradata 15, MDM, Git, Unix, Python 3.5.2, MLlib, SAS, regression, logistic regression, random forest, Hadoop, NoSQL, OLTP, OLAP, HDFS, ODS, NLTK, SVM, JSON, XML, MapReduce, Google Dialogflow.
Confidential - Plano, TX
Data Analyst
Responsibilities:
- Worked on the migration process to move SSIS packages for SQL Server 2008 R2 from one data center to another.
- Involved in the complete Software Development Life Cycle (SDLC) by analyzing business requirements and understanding the functional workflow of information from source systems to destination systems.
- Designed and implemented complex SSIS packages to migrate data from multiple data sources; handled data analysis, deployment, and dynamic configuration of the SSIS packages.
- Hands-on experience with Automic UC4 for scheduling jobs (file, database, processing, real-time, and debug stages) and for creating sequence jobs.
- Developed complex stored procedures and functions and incorporated them into Crystal Reports to enable on-the-fly report generation (SSRS).
- Used data joining, blending, and other advanced Tableau features on various data sources such as MySQL tables and flat files; monitored, tuned, and analyzed database performance and allocated server resources for optimal performance.
- Worked extensively on modifying and updating existing Oracle code, including object types, views, PL/SQL stored procedures, packages, functions, and triggers, based on business requirements.
- Good experience architecting and configuring secure cloud VPCs using private and public networks through subnets in AWS.
- Worked on SQL Server 2005 concepts: SSIS (SQL Server Integration Services), SSAS (Analysis Services), and SSRS (Reporting Services).
- Documented the installation process for setup and recovery in the new environment for the technical support team; created SQL Server reports and developed PostgreSQL queries for generating drill-down reports in SSRS.
- Involved in complex SSIS package development following Microsoft SQL BI standards and development strategies to pull data from various data sources (SQL Server, Excel, and flat files).
- Maintained the existing data migration program with occasional upgrades and enhancements; executed data migration in coordination with management and technical services personnel.
- Worked with several R packages, including knitr, dplyr, SparkR, CausalInfer, and spacetime; granted permissions and resources to users and managed user roles and permissions with AWS IAM.
- Implemented end-to-end systems for data analytics and data automation, integrated with custom visualization tools, using R, Mahout, Hadoop, and MongoDB.
- Gathered all required data from multiple data sources and created datasets for analysis; performed exploratory data analysis and data visualization using R and Tableau.
- Worked with data governance, data quality, data lineage, and data architects to design various models and processes; performed thorough EDA, with univariate and bivariate analysis, to understand intrinsic and combined effects.
- Independently coded new programs and designed tables to load and test programs effectively for the given POCs using Big Data/Hadoop.
- Designed data models and data flow diagrams using Erwin and MS Visio; researched improvements to the IVR used internally at J&J; performed data cleaning and imputation of missing values using R.
- Developed an IVR for clinics so that callers can receive anonymous access to test results; worked with the Hadoop ecosystem (HDFS, HBase, YARN, and MapReduce), handling requests from different departments and locations.
- Determined regression model predictors using a correlation matrix for factor analysis in R; built a regression model with Scikit-learn in Python to understand an order-fulfillment time-lag issue (see the sketch at the end of this list).
- Utilized Spark, Scala, Hadoop, HBase, Kafka, Spark Streaming, MLlib, and Python with a broad variety of machine learning methods, including classification, regression, and dimensionality reduction.
- Empowered decision makers with data analysis dashboards using Tableau and Power BI; interfaced with other technology teams to extract, transform, and load (ETL) data from a wide variety of data sources.
- Provided input and recommendations on technical issues to BI engineers, business and data analysts, and data scientists.
- Developed, implemented, and maintained conceptual, logical, and physical data models using Erwin for forward- and reverse-engineered databases.
- Established data architecture strategy, best practices, standards, and roadmaps as an architect; implemented an MDM hub to provide clean, consistent data for an SOA implementation.
- Implemented event tasks to execute applications automatically; broad knowledge of programming and scripting, especially in R, Java, and Python.
- Developed and presented a data analytics data-hub prototype with the help of other members of the emerging solutions team.
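A minimal sketch of the correlation-based predictor screening and Scikit-learn regression from the order-fulfillment bullet above; the column names, correlation cutoff, and synthetic data are assumptions for illustration:

import numpy as np
import pandas as pd
from sklearn.linear_model import LinearRegression

rng = np.random.default_rng(1)                      # stand-in data, not client data
df = pd.DataFrame({
    "order_qty": rng.integers(1, 50, 200),
    "warehouse_dist": rng.uniform(1, 500, 200),
    "staff_on_shift": rng.integers(2, 20, 200),
})
df["fulfillment_lag"] = 0.4 * df["warehouse_dist"] + rng.normal(0, 10, 200)

# Keep predictors whose absolute correlation with the target clears a cutoff.
corr = df.corr()["fulfillment_lag"].drop("fulfillment_lag")
predictors = corr[corr.abs() > 0.3].index.tolist()

model = LinearRegression().fit(df[predictors], df["fulfillment_lag"])
print(predictors, model.coef_)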
Environment: R 3.0, Erwin 9.5, Tableau 8.0, MDM, QlikView, MLlib, PL/SQL, Teradata 14.1, JSON, Hadoop (HDFS), MapReduce, Pig, Spark, RStudio, Mahout, Java, Hive, AWS.
Confidential, New York, NY
Data Analyst
Responsibilities:
- Designed and created multi-dimensional analysis (configured OLAP cubes, dimensions, measures, and MDX queries); met with business users, gathered business requirements, and prepared documentation for requirement analysis.
- Created a centralized data warehouse (ODS) and developed de-normalized structures based on the requirements, improving query performance for end users.
- Created ETL design specifications; performed unit testing, data validations, integration testing, and UAT; deployed SSIS packages to development and test environments.
- Worked with the SSAS tabular model (BISM) and created dashboards and reports using Power BI views, Power Pivot, Datazen, and SSRS 2014; used Team Foundation Server for version control of all BI projects.
- Developed Python APIs to dump the processor's array structures at the failure point for debugging of the breakpoint tool, which used Perl and a Java user interface.
- Implemented end-to-end systems for data analytics with Mahout, Hadoop, MySQL, and Tableau, using HP Quality Center v11 for defect tracking.
- Worked on AWS Data Pipeline to configure data loads from S3 into Redshift; debugged using breakpoints and data viewers, and performance-tuned ETL data flows.
- Created various SSIS packages to populate data from flat files, Excel, and Access into the ODS (SQL Server); performed full loads for current data and incremental loads for historical (transaction-based) data.
- Created calculated columns, KPIs, and custom hierarchies to accommodate business requirements; partitioned the cubes to improve performance and decrease processing time.
- Designed SSAS tabular models, created dashboards and reports using SQL BI views, and deployed them to SharePoint Server 2013.
- Developed drill-down, drill-through, matrix, and sub-reports using SSRS 2014; created multiple dashboards with calculations and KPIs using Tableau 9.3/10 and deployed them to Tableau Server.
- Involved in defining source-to-target data mappings, business rules, and data definitions; performed data profiling on the various source systems required for transferring data to ECH.
- Defined the list codes and code conversions between the source systems and the data mart using Reference Data Management (RDM).
- Utilized the Informatica toolset (Informatica Data Explorer and Informatica Data Quality) to analyze legacy data for data profiling.
- Worked on DTS packages and DTS Import/Export for transferring data between SQL Server 2000 and 2005; involved in upgrading DTS packages to SSIS packages (ETL).
- Performed end-to-end Informatica ETL testing of the custom tables by writing complex SQL queries on the source database and comparing the results against the target database.
- Applied data mining and optimization techniques in B2B and B2C industries; proficient in machine learning, data/text mining, statistical analysis, and predictive modeling.
- Extracted source data from Oracle tables, MS SQL Server, sequential files, and Excel sheets; performed predictive modeling using state-of-the-art methods.
- Developed and maintained a data dictionary to create metadata reports for technical and business purposes; maintained dashboards and reporting based on the statistical models to identify and track key metrics and risk indicators.
- Developed MapReduce/Spark Python modules for machine learning and predictive analytics in Hadoop on AWS; parsed and manipulated raw, complex data streams to prepare them for loading into an analytical tool (see the sketch at the end of this list).
- Migrated Informatica mappings from SQL Server to Netezza; fostered a culture of continuous engineering improvement through mentoring, feedback, and metrics.
- Involved in developing the Patches & Updates module; proven experience building sustainable, trusting relationships with senior leaders.
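A minimal sketch of a Spark ML module of the kind described in the MapReduce/Spark bullet above; the HDFS path, feature columns, and binary label column are hypothetical assumptions:

from pyspark.sql import SparkSession
from pyspark.ml.feature import VectorAssembler
from pyspark.ml.classification import LogisticRegression

spark = SparkSession.builder.appName("predictive-analytics-sketch").getOrCreate()

# Hypothetical input; the real modules parsed raw, complex streams first.
df = spark.read.csv("hdfs:///data/claims.csv", header=True, inferSchema=True).dropna()

features = VectorAssembler(
    inputCols=["age", "visits", "charges"],    # assumed numeric feature columns
    outputCol="features",
).transform(df)

train, test = features.randomSplit([0.8, 0.2], seed=42)
model = LogisticRegression(labelCol="label", featuresCol="features").fit(train)
print(model.evaluate(test).areaUnderROC)       # assumes a binary 0/1 "label" column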
Environment: Erwin 8, Teradata 13, SQL Server 2008, Oracle 9i, SQL*Loader, PL/SQL, ODS, OLAP, OLTP, SSAS, Informatica Power Center 8.1.
Confidential
Software Developer
Responsibilities:
- Analyzed business requirements and built logical and physical data models that describe all the data and the relationships between the data, using ERwin.
- Developed drill-down and drill-through reports from multidimensional objects such as star and snowflake schemas using SSRS and PerformancePoint Server.
- Used SSIS and T-SQL stored procedures to transfer data from OLTP databases to the staging area and then into data marts (see the sketch at the end of this list); worked in Power BI, handled XML, and performance-tuned ETL data flows.
- Created new database objects such as stored procedures, functions, triggers, indexes, and views using T-SQL in the development and production environments for SQL Server 2008.
- Designed various types of reports and deployed them to the production server using Microsoft Office SharePoint Services (MOSS).
- Created various reports for analysis of revenue, claims, customer interactions, and shipping data using SQL Server Reporting Services and Report Builder.
- Designed and created distributed reports in multiple formats such as Excel, PDF, XML, HTML, and CSV using SQL Server 2012 Reporting Services (SSRS)
- Involved in defining source-to-target data mappings, business rules, and data definitions; performed forward and reverse engineering for database development and design using ERwin.
- Performed document analysis, creating use cases and use-case narrations in Microsoft Visio to present the gathered requirements effectively.
- Calculated and analyzed claims data for provider incentive and supplemental benefit analysis using Microsoft Access and Oracle SQL.
- Worked with BTEQ to submit SQL statements, import and export data, and generate reports in Teradata; analyzed business information requirements and modeled class diagrams and conceptual domain models.
- Responsible for defining the key identifiers for each mapping/interface; gathered and reviewed customer information requirements for OLAP and for building the data mart.
- Coordinated meetings with vendors to define requirements and to document the system interaction agreement between the client and vendor systems.
- Updated the Enterprise Metadata Library with any changes or updates; analyzed business process workflows and assisted in developing ETL procedures for mapping data from source to target systems.
- Documented data quality and traceability for each source interface; responsible for defining the functional requirement documents for each source-to-target interface.
- Managed project requirements, documents, and use cases with IBM Rational RequisitePro; assisted in building an integrated logical data design and proposed a physical database design for the data mart.
- Documented all data mapping and transformation processes in the functional design documents based on the business requirements; established standards of procedure and generated weekly and monthly asset inventory reports.
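A minimal sketch of driving a T-SQL staging load from Python with pyodbc, in the spirit of the SSIS/stored-procedure bullet above; the server, database, and procedure name (dbo.usp_LoadStagingOrders) are hypothetical:

import pyodbc

# Connection details are illustrative assumptions.
conn = pyodbc.connect(
    "DRIVER={ODBC Driver 17 for SQL Server};"
    "SERVER=oltp-server;DATABASE=Sales;Trusted_Connection=yes;"
)

def refresh_staging(batch_date):
    # Invoke a stored procedure that copies OLTP rows into the staging area.
    cur = conn.cursor()
    cur.execute("EXEC dbo.usp_LoadStagingOrders @BatchDate = ?", batch_date)
    conn.commit()
    cur.close()

refresh_staging("2009-06-30")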
Environment: SQL Server 2008R2/2005 Enterprise, SSRS, SSIS, Crystal Reports, Windows Enterprise Server 2000, DTS, SQL Profiler, and Query Analyzer.