Sr. Data Engineer/Analyst Resume

Malvern, PA

SUMMARY:

  • Experience in developing and deploying projects for Predictive Modeling, Data Validation, Data Acquisition, Data Visualization and Data Mining with large datasets of structured and unstructured data.
  • 6+ years of experience in various domains as a Data Analyst.
  • Passionate about gleaning insightful information from massive data assets and developing a culture of sound, data-driven decision making.
  • Solid ability to write and optimize diverse SQL queries; working knowledge of RDBMS like MySQL and SQL Server.
  • Questionnaire Design, Data Collection, Data Analysis and Visualization, Statistical Analysis and Modeling, System Model Analysis and Design, Survey Design, Time Series Analysis, ANOVA, Data Mining, Econometric Analysis, Time Series Procedures (e.g., ARIMA, ARCH/GARCH), Regression Analysis, Regression Discontinuity, Logistic Regression, Neural Networks, Discriminant Analysis, Decision Tree Analysis, Association Analysis, Propensity Score Matching, Survival Analysis, Difference in Differences, Predictive Analysis, Estimation and Forecasting, Qualitative and Quantitative Research.
  • Proficient in using editors and IDEs such as Eclipse, Jupyter, Spyder and Notepad++.
  • Proficient using R for Big Data processing, statistical analysis and visualization in Tableau, Power BI, Google Data Studio, ggplot2 and d3.js.
  • Experience with data analytics, data reporting, Ad-hoc reporting, Graphs, Scales, PivotTables and OLAP reporting.
  • Proficient in creating Triggers, Tables, Stored Procedures, Functions, Views, Indexes and Constraints.
  • Experience in visualization tools like Tableau, Power BI and Tibco Spotfire for creating reports and dashboards.
  • Worked on all activities related to high level design of ETL DTS package for integrating data from multiple heterogeneous data sources (Excel, Flat File, Text format Data) using SSIS.
  • Extensive experience querying data using SQL, R, SAS and SAP BI.
  • Proficient in Data Analysis with sound knowledge in extraction of data from various database sources like MySQL, Oracle, SAP BI and other database systems.
  • Worked in creating different Visualizations in Tableau using Bar charts, Line charts, Pie charts, Maps, Scatter Plot charts, Heat maps and Table reports.
  • Experienced in working with Agile-based SDLC (Scrum) and Waterfall software development life cycle.
  • Maintained UNIX/Perl scripts for build and release tasks.
  • Experience in enhancing and deploying the SSIS Packages from development server to production server.
  • Good understanding and hands on experience in setting up and maintaining NoSQL Databases like Cassandra and HBase.
  • Expertise in normalization/denormalization techniques for effective and optimum performance in OLTP and OLAP environments.
  • Extensively worked on statistical analysis tools and adept at writing code in Advanced Excel, R, MATLAB, Python, and SAS.
  • Proficient in design and development of various dashboards and reports using visualizations like bar graphs, scatter plots, pie charts and geographic visualizations, making use of actions and local and global filters according to end-user requirements.
  • Experienced in Python to manipulate data for data loading and extraction, and worked with Python libraries like Matplotlib, NumPy, SciPy, scikit-learn, statsmodels and Pandas for data analysis.
  • Implemented Bagging and Boosting to enhance the model performance.
  • Excellent understanding of Agile and Scrum development methodologies.
  • Excellent communication skills. Self-motivated, enthusiastic learner who works successfully in fast-paced, multitasking environments, both independently and in a collaborative team.

TECHNICAL SKILLS:

Databases: Oracle 11g/10g/9i/8i, SQL Server 2008/2005/2000, Sybase, DB2, MS Access

Languages: Python, SQL, PL/SQL, Unix Shell Scripting (Korn/C/Bourne), Pro*C, Java, C, C++

Web Technologies: JSP, JavaScript, HTML, XML, HTTP

Operating System: UNIX, LINUX, Windows NT/XP/Vista/98/95

Reporting Tools: Business Objects Developer Suite 5.1, Business Objects XI/XIR2/XIR3.1, COGNOS Suite, COGNOS Report Net, Crystal Reports, Oracle Reports 10g/9i/6i

Oracle Utilities: SQL*Plus, SQL*Loader, SQL Developer, TOAD

Other Tools: Erwin 8.1/7.3, Visio 2007, Oracle Forms and Reports, APEX, GoldenGate, AutoSys, Subversion, Revision Control System, BMC Control-M

PROFESSIONAL EXPERIENCE:

Confidential, Malvern, PA

Sr. Data Engineer/Analyst

Responsibilities:

  • Involved in data mapping, matching the old business structure to the new structure and determining the source and target mappings.
  • Developed analytical databases from different sources and created a master data set.
  • Responsible for data identification, collection, exploration, cleaning for modeling.
  • Predicted sales and profits using machine learning and deep learning techniques.
  • Updated and manipulated content and files using Python scripts. Worked on Python OpenStack APIs. Experienced in job workflow scheduling and monitoring tools.
  • Designed and developed applications in Spark using Scala to compare the performance of Spark with Hive and SQL.
  • Scheduled Oozie workflows to run multiple Hive and Pig jobs.
  • Created complex Stored Procedures, Triggers, Functions, Indexes, Tables, Views, SQL Joins and other PL/SQL code to support business logic.
  • Imported data by executing Hive scripts.
  • Performed data processing using Python libraries like NumPy and Pandas.
  • Performed data imputation using Scikit-learn package in Python.
  • Created data visualizations with the Matplotlib and seaborn libraries to better understand the data.
  • Worked on SSIS performance tuning using counters, error handling, event handling, re-running of failed SSIS packages using checkpoints.
  • Developed, deployed and monitored SSIS Packages including upgrading DTS to SSIS.
  • Used clustering-based and tree-based anomaly detection models to predict anomalies.
  • Wrote various Windows scripts and Python scripts to facilitate the operations of these data stores. Wrote technical specifications to document requirements, data transformation rules, design and workflow.
  • Used ad hoc queries and Pandas with Python for querying and analyzing the data; participated in data profiling, data analysis, data validation and data mining.
  • Designed ETL (Extract, Transform, and Load) strategy to transfer data using OLEDB providers from the existing diversified data sources.
  • Involved in requirement gathering/analysis, Design, Development, Testing and Production rollover of Reporting and Analysis projects.
  • Created Data Quality Scripts using SQL and Hive to validate successful data load and quality of the data. Created various types of data visualizations using Python.
  • Worked extensively with Python to optimize code for better performance.
  • Handled importing data from various data sources, performed transformations using Hive, MapReduce and loaded data into HDFS.
  • Built Hadoop based ETL workflow to transform and aggregate data.
  • Created OLAP models based on Dimension and Facts for efficient loads of data based on Star Schema structure.
  • Conducted JAD sessions with business users and SMEs for better understanding of the reporting requirements.
  • Used R and Python for Exploratory Data Analysis to compare and identify the effectiveness of the data.
  • Designed and developed an end-to-end ETL process from various source systems to the staging area, and from staging to data marts.
  • Attended daily Scrum meetings to update the Scrum Master on the progress of daily activities and to flag any blockers and dependencies.
  • Created independent libraries in Python which can be used by multiple projects which have common functionalities.
  • Installed and configured a multi-node cluster in the cloud using Amazon Web Services (AWS) EC2.
  • Developed the data model of reporting data marts by introducing the star and snowflake concepts using Erwin.
  • Developed and tested programs using SQL and PL/SQL for Data Warehouse and Business Intelligence applications, including dimensional data marts.
  • Created various Proof of Concepts (PoC) and gap analysis and gathered necessary data for analysis from different sources, prepared data for data exploration using data munging.
  • Supported the operations of Data Warehouse and Business Intelligence solutions.
  • Analyzed the informational and programming requirements for data migration and performed ETL operations in the Data Warehouse using SAS DI.

Environment: JIRA, DB2, Oracle 11g, Oracle Cloud, MS Project 2002, PL/SQL, SAS, SAS Integration Studio, SSIS, AWS, SCRUM, UML, MS PowerPoint, MS SQL Server, SVN, Erwin 9.5, JSON, Teradata, Hadoop, HDFS, Hive, Pig, Sqoop, MapReduce, SQL, Python, MVS, Oracle PL/SQL, Data Mining.

Confidential

Data Analyst/ Engineer

Responsibilities:

  • Worked on Data mining with large data sets of Structured and Unstructured data, Data Acquisition, Data Validation, Predictive modeling, Data Visualization.
  • Participated in all phases of research including data collection, data cleaning, data mining, developing models and visualizations.
  • Responsible for loading, extracting and validation of client data.
  • Translated business requirements to technical terms for call flows and other call center systems.
  • Used AWS to manage the data in the cloud.
  • Good knowledge of Hadoop components such as HDFS, JobTracker, TaskTracker, NameNode, DataNode, and MapReduce concepts.
  • Collaborated with data engineers and operation team to collect data from internal system to fit the analytical requirements.
  • Worked on managing and reviewing Hadoop log files. Tested and reported defects in an Agile Methodology perspective.
  • Configured the Hive Metastore with MySQL, which stores the metadata for Hive tables.
  • Created Use Case Diagrams using UML to define the functional requirements of the application.
  • Worked on configuring and managing disaster recovery and backup on Cassandra Data.
  • Created PL/SQL packages and Database Triggers and developed user procedures and prepared user manuals for the new programs.
  • Created jobs and transformation in Pentaho Data Integration to generate reports and transfer data from HBase to RDBMS.
  • Designed the HBase schemas based on the requirements and performed HBase data migration and validation.
  • Participated in writing data mapping documents and performing gap analysis to verify the compatibility of existing system with new business requirements.
  • Utilized Tableau for visual analytics and dashboard development.
  • Created Tableau views with complex calculations and hierarchies making it possible to analyze and obtain insights into large data sets.
  • Worked on data validation, ad hoc reports and report validation using Excel, VBA macros, Crystal Reports and Business Objects.
  • Wrote complex SQL statements in a Teradata database environment for reporting purposes or data mining.
  • Used Python for creating graphics, data exchange and business logic implementation.
  • Maintained updated Log files using Python.
  • Designed data profiles for processing, including using Python for data acquisition and data integrity checks, which consisted of dataset comparisons and dataset schema checks.
  • Created Data Quality Scripts using SQL and Hive to validate successful data load and quality of the data.
  • Created various types of data visualizations using Python.
  • Performed data profiling on datasets with millions of rows on Oracle environment, validating key gen elements, ensuring correctness of codes and identifiers, and recommending mapping changes.
  • Generated reports using Tableau and published them to Tableau Server.
  • Designed and developed Project document templates based on AGILE methodology.
  • Followed Agile Software Development Methodology for the application development.
  • Assisted the Project Manager in setting realistic project expectations, evaluating the impact of changes on the organization and planning accordingly, and conducted project-related presentations.
  • Identified internal and external system Requirements, design and configuration set-up.
  • Validated and profiled flat file data loaded into Oracle tables using Unix shell scripts.
  • Performed Data Stewardship by implementing Dimensional Modeling (Data Marts, Facts and Dimensions), Data Migration, Data Cleansing, ETL Processes, Data Integration, and Data Mining.
  • Developed strategic partnerships with the Business units to develop a solid knowledge base of the Business line, including the Business Plan, Products, and Process.
  • Used JIRA for reporting bugs and tracking progress of defect fixing for each release in product production.
  • Developed Business Requirement Document as well as High-Level Project Plan.
  • Worked with developers to implement COTS products into the web application.
  • Worked with QA team to ensure that the test results matched the requirements.
  • Used Quality Center to get an easy interface to manage and organize activities like requirements coverage, test case management, test execution reporting, defect management and test automation.

Environment: SCRUM, MS Project 2002, PL/SQL, MS Visio, MS Word, MS Excel, Test Director, Oracle 10g, Windows NT/2000, UNIX, Python, SQL Server 2005/2008/2012, Tableau, Agile methodology, T-SQL, TOAD, SQL*Loader.

Confidential, Dayton, OH

Data Analyst

Responsibilities:

  • Responsible for gathering requirements from Business Analyst and Operational Analyst and identifying the data sources required for the request.
  • Created Stored Procedures to migrate data from flat file structure to a normalized Data structure.
  • Created Test Report Dashboards in Tableau that help visualize the status of several thousand apps.
  • Created Tableau dashboards to show the app status of machines by user/organization.
  • Created ad hoc reports in Tableau to visualize the top 10 incompatible apps based on user count.
  • Developed stacked bar chart in Tableau for the top-level management to view the test status of apps across the organization.
  • Developed various Dashboards using objects such as Chart Box (Drill Down, Drill Up), List, crosstab etc. using Tableau.
  • Created complex formulas and calculations within Tableau to meet the needs of complex business logic.
  • Combined Tableau visualizations into Interactive Dashboards using filter actions, highlight actions etc., and published them to the web.
  • Developed various data connections from data source to Tableau Desktop for report and dashboard development.
  • Created Technical Documentation for reports and maintenance/enhancements of existing reports.
  • Involved in project planning, scheduling for database module with project managers.
  • Discussed with business people to understand the rules for defining the test status of apps at the report level.
  • Created a new data source and replaced the existing data source with it. Created schedules and extracted the data into the Tableau Data Engine.
  • Built and published customized interactive reports and dashboards and scheduled reports using Tableau Server.
  • Created action filters, parameters and calculated sets for preparing dashboards and worksheets in Tableau.
  • Effectively used the data blending feature in Tableau. Defined best practices for Tableau report development.
  • Developed Tabular, drill down, parameterized, cascaded and sub-reports using Tableau.

Environment: SQL Server 2008 R2/2005 Enterprise, SSRS, SSIS, Crystal Reports, Windows Enterprise Server 2000, DTS, SQL Profiler, Tableau, QlikView, ad hoc reporting, SharePoint and Query Analyzer.

Confidential, Houston, TX

Data Analyst

Responsibilities:

  • Responsible for gathering requirements from Business Analyst and Operational Analyst and identifying the data sources required for the request.
  • Performed data analysis on many ad hoc requests and critical projects that informed critical business decisions.
  • Worked on data verification and validation to evaluate whether the data generated according to the requirements was appropriate and consistent.
  • Utilized ODBC for connectivity to Teradata & MS Excel for automating reports and graphical representation of data to the Business and Operational Analyst.
  • Served as team database captain, which included monitoring the container space allocated to the team, running weekly usage reports, checking tables for skew and statistics, and recommending changes to users.
  • Optimized the Data environment in order to efficiently access data Marts and implemented efficient data extraction routines for the delivery of data.
  • Designed and developed weekly, monthly reports related to the marketing and financial departments using Teradata SQL.
  • Analyzed, designed, coded, tested, implemented and supported data warehousing extract programs and end-user reports and queries.
  • Worked with data investigation, discovery and mapping tools to scan every single data record from many sources.
  • Aggregate functions were executed on measures in the OLAP cube to generate information about dynamic trends including bandwidth consumption and their cost analysis.
  • Generated periodic reports from the OLAP cubes based on statistical analysis of trends in location and bandwidth data from various time frames and departments, using SQL Server Reporting Services (SSRS) to project various KPIs.
  • Worked on loading data from flat files to Teradata tables using SAS Proc Import, FastLoad and MS Excel macro techniques.
  • Expertise in writing MS Excel VBA code; created numerous tools as per business requirements using Teradata as the database server, along with user forms and various other functions in VBA.
  • Helped in Development and implementation of specialized training for effective use of data and reporting resources.
  • Converted an Xcelsius dashboard to a Tableau dashboard with richer visualization and greater flexibility.
  • Worked on Bar charts, line Charts, combo charts, heat maps and incorporated them into dashboards.
  • Validated automated scheduled reports to catch any data issues or missing data before sending them out to business customers.
  • Worked with the ETL team to fix issues in production data such as load delays, missing data and data quality problems. Also involved in modification and creation of new data warehouse table designs.
  • Good experience in creating complex logic to meet business requirements when building reports or serving data needs, while maintaining data quality.
  • Wrote hundreds of DDL scripts to create tables and views in the company data warehouse, and developed ad hoc reports using Teradata SQL and UNIX.

Environment: Oracle 10g/11g, PL/SQL, Cognos, QlikView, SAS, Tableau, Spotfire, SQL Developer, Teradata SQL ETL, Data Processing, Data migration, Data modeling, Adobe Analytics, MS-Windows/Microsoft Office Suite, VB.NET, TOAD, Excel, PowerPoint, Visio.

Confidential

Business Data Analyst

Responsibilities:

  • Liaised between the Business and Functional owners during risk engineering and high-level review sessions to derive and execute action plans, deadlines and standards.
  • Conducted gap analysis of business rules/requirements, business and system process flows, and user administration.
  • Participated in daily scrum meetings, giving insights into prioritizing user stories, and was responsible for assigning story points to user stories.
  • Designed and developed Use Case, Activity and Sequence diagrams using Unified Modeling Language (UML) in MS Visio.
  • Resolved data quality issues by performing root cause analysis.
  • Deployed data visualizations and analyzed data by developing dashboards in QlikView and Tableau that empower business users to make business decisions.
  • Created daily reports in Spotfire and presented dashboards to users.
  • Performed ETL on an Oracle data warehouse and wrote SQL queries to troubleshoot and solve user problems.
  • Designed high-level ETL architecture for overall data transfer from OLTP to OLAP with the help of SSIS.
  • Involved in designing the DB2 process to extract, transform and load data from the OLTP Oracle database system to the Teradata data warehouse.
  • Provided production support and troubleshooting of data quality and integrity issues and backed up daily updates.
  • Worked with the QA testing team to design test methods to verify application functions of consumer loan and commercial loan activities in an Agile environment.
  • Participated in the full SDLC.

Environment: Oracle 10g/11g, PL/SQL, Teradata SQL, Spotfire, Tableau, SAP, Excel, SQL Developer, MS-Windows/Microsoft Office Suite, VB.NET, TOAD, MS Visio

Confidential

Junior Data Analyst

Responsibilities:

  • Analyzed, developed and maintained business requirements throughout the project lifecycle, including customer-facing work and development of comprehensive documentation.
  • Designed, developed and maintained company databases that address various user and management needs.
  • Developed reports and dashboards for use by various business, operational and executive stakeholders that provide key insights into performance.
  • Designed and developed weekly and monthly reports related to the marketing and financial departments using Teradata SQL.
  • Performed data manipulation, migration and mining to improve data quality and efficiency.
  • Involved in designing the DB2 process to extract, transform and load data from the OLTP Oracle database system to the Teradata data warehouse.
  • Performed ELT using Talend Big Data to process data; transformations included joins, reformats, sorts and if-else conditional transforms.
  • Presented results obtained from analytical tools (Tableau, Spotfire, QlikView), including financial analysis, KPIs (key performance indicators) and healthcare analysis, to higher management.
  • Developed SQL stored procedures to handle processing for the application database.
  • Maintained and supported production and test servers for hospital systems.

Environment: SQL Server, Linux Environment, Ubuntu Operating System, SQL queries (Complex Joins, Sub Queries), SSIS, SSAS, SSRS, ER diagrams, Dashboards.
