Data Analyst / ML Engineer Resume
Baltimore, MD
SUMMARY
- Data Analyst and Software Developer with 7+ years of experience in machine learning, data mining with large structured and unstructured data sets, data acquisition, data validation, predictive modelling, and data visualization, using programming languages such as R, Python, and Java, as well as AWS services.
- More than 5 years of experience in all phases of the Software Development Life Cycle (SDLC), including system analysis, design, data modelling, implementation, support, and maintenance of applications in both OLTP and OLAP systems.
- Expertise in transforming business requirements into analytical models, designing algorithms, building models, developing data mining and reporting solutions that scale across a massive volume of structured and unstructured data.
- Extensive experience in relational data modelling, dimensional data modelling, logical/physical design, ER diagrams, and OLTP and OLAP system study and analysis.
- Hands-on experience writing test cases for PL/SQL subprograms, performing unit testing of SQL/PL/SQL code, deploying it to production, and handling subsequent maintenance.
- Expertise in gathering requirements from clients through production maintenance, as well as query optimization, execution-plan analysis, and performance tuning of SQL queries.
- Possess strong documentation and knowledge-sharing skills; conducted data modelling sessions for different user groups, facilitated common data models between applications, and participated in requirement sessions to identify logical entities.
- Experienced in writing PL/SQL Stored Procedures, Triggers, and Functions.
- Experience in designing, developing, and debugging multi-tier applications, and in providing modular solutions based on object-oriented principles.
- Hold expertise in Java, J2EE, JDBC, JSP, Servlets, Spring Core, Spring MVC, Spring Boot, and Hibernate, in addition to Python, machine learning, and visualization tools.
- Experience in various phases of the Software Development Life Cycle (analysis, requirements gathering, design), with expertise in documenting requirement specifications, functional specifications, test plans, source-to-target mappings, and SQL joins.
- Experience in conducting Joint Application Development (JAD) sessions for requirements gathering, analysis, and design, and Rapid Application Development (RAD) sessions to converge early toward a design acceptable to the customer and feasible for the developers, limiting a project's exposure to the forces of change.
- Experience working with different industries such as Financial, Healthcare, and Education.
- Experienced in implementing optimization techniques for better performance on both the ETL and the database side.
- Strong experience in Business and Data Analysis, Data Profiling, Data Migration, Data Conversion, Data Quality, Data Integration, Metadata Management Services, and Configuration Management.
TECHNICAL SKILLS
Operating systems: Windows XP/7/8/10, Unix, Linux
Languages: R, SQL, Python, Shell Scripting, JavaScript; markup languages: HTML, XML.
Databases: Oracle 11g, SQL Server, MS Access, MySQL, MongoDB, PostgreSQL, PL/SQL, ETL.
Cloud Platform: AWS
IDE: R Studio, Jupyter Notebook, Eclipse, NetBeans.
BI and visualization: Tableau, Power BI, QlikView
ML Tools/Libraries: scikit-learn, Pandas, NumPy, MS Excel, Predictive Modelling, Classification, Regression, Clustering, Tree-Based Algorithms, Bagging & Boosting, Linear/Logistic Regression, Random Forest, Naive Bayes, SVM, KNN, K-Means, Parameter Tuning, Feature Engineering.
Version Controls: GIT, SVN
PROFESSIONAL EXPERIENCE
Confidential, Baltimore, MD
Data Analyst / ML Engineer
Responsibilities:
- Consulted with application development business analysts to translate business requirements into data design requirements used for driving innovative data designs that meet business objectives.
- Worked as business requirements analysts with subject matter experts to identify and understand requirements.
- Used machine learning algorithms to predict and map accessibility for people with disabilities.
- Integrated GIS data with other data sources to create a comprehensive data model for the project.
- Used Python libraries such as NumPy and Pandas in conjunction with GIS tools to manipulate vector data in DataFrames.
- Applied various machine learning algorithms, including decision trees, regression models, neural networks, SVM, and clustering, to identify profiles using the scikit-learn package in Python.
- Used Principal Component Analysis in feature engineering to analyse high dimensional data.
- Created and designed reports that will use gathered metrics to infer and draw logical conclusions of past and future behaviour.
- Used SQL queries to retrieve and join relevant data from different tables to support machine learning and GIS analysis.
- Used Tableau to create visualizations that incorporated GIS data and provided insights into accessibility trends.
- Collaborated with GIS specialists to ensure that all GIS data was accurate, up-to-date, and met the project's requirements and properly stored and managed in the database.
- Processed a large volume of commercial data containing 100,000 orders from various marketplaces.
- Created a customer portal and a summary dashboard using Python Streamlit to provide a comprehensive overview of customer behaviour and purchasing patterns in the e-commerce industry.
- Performed time series forecasting with ARIMA, segmented customers with K-Means and DBSCAN, and designed an interactive R Shiny dashboard for 100,000 orders from an e-commerce platform.
- Mined data from GSMArena and performed feature engineering by mapping the Centurion Mark score with fuzzy logic, achieving a 15% mean absolute percentage error with Random Forest regression.
- Visualized clusters of different wildfires using GeoPandas and predicted arson wildfires with 95.14% accuracy using Random Forest, after EDA and feature engineering.
- Conducted gap analysis to assess the variance between system capabilities and business requirements.
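The PCA-plus-Random-Forest workflow described above can be sketched as follows. This is a minimal illustration on synthetic data, assuming scikit-learn; the real feature set, dimensionality, and model parameters are not given in the resume and are assumptions here.

```python
# Sketch: PCA for dimensionality reduction feeding a Random Forest classifier.
# Synthetic data stands in for the real high-dimensional feature set.
from sklearn.datasets import make_classification
from sklearn.decomposition import PCA
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import train_test_split
from sklearn.metrics import accuracy_score

# Synthetic stand-in: 500 samples, 50 features, 10 of them informative
X, y = make_classification(n_samples=500, n_features=50, n_informative=10,
                           random_state=42)

# Reduce 50 features to the 10 principal components carrying most variance
pca = PCA(n_components=10, random_state=42)
X_reduced = pca.fit_transform(X)

X_train, X_test, y_train, y_test = train_test_split(
    X_reduced, y, test_size=0.2, random_state=42)

clf = RandomForestClassifier(n_estimators=100, random_state=42)
clf.fit(X_train, y_train)
acc = accuracy_score(y_test, clf.predict(X_test))
print(f"Accuracy on held-out synthetic data: {acc:.2f}")
```

The same shape of pipeline applies whether the classifier predicts accessibility profiles or anything else: reduce dimensionality first, then fit and evaluate on a held-out split.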
Environment: Oracle 8i, PL/SQL, SQL, Tableau, ML, Python, SAS.
Confidential, Chicago, IL
Data Analyst
Responsibilities:
- Conducted analysis of healthcare data, including patient outcomes, hospital operations, and financial performance, to identify trends and patterns using statistical software such as SAS, Python, and R.
- Developed and maintained dashboards and reports using Tableau and Power BI to communicate data insights to clinical and administrative stakeholders.
- Created complex queries using SQL to extract and manipulate data from various sources, including electronic health records (EHRs) and claims data, to support analysis.
- Conducted data quality assessments to ensure the accuracy and completeness of EHR data and other healthcare data sources.
- Collaborated with IT and data management teams to identify and resolve data integrity issues using SQL and Python.
- Led the design and development of predictive models to identify high-risk patients and support care management programs, using SAS and Python.
- Conducted ad hoc analysis to support strategic decision-making, such as evaluating the impact of new programs or policies on patient outcomes and costs, using SAS, Python and SQL.
- Developed and implemented data-driven initiatives, such as population health management and value-based care, using Tableau, SAS, and Python.
- Worked closely with clinical and administrative stakeholders to understand their data needs and develop custom reports and analyses to support their work.
- Maintained up-to-date knowledge of healthcare regulations and compliance requirements, such as HIPAA and HITECH, to ensure adherence to industry standards when working with sensitive patient data.
- Gathered user requirements and business goals through JAD sessions and blended technical and business knowledge.
- Analysed data models and performed data validation using SQL queries against an Oracle database.
- Created reports and identified data hierarchies through list and drill-down reports.
- Developed process documents, reporting specs and templates, training materials, and presentations for application development teams and management.
- Collaborated with data modelers and ETL developers to create Data Functional Design documents and ensure conformity to best practices and change management.
- Created and maintained process documentation and data deliverables such as data profiling and source-to-client maps.
- Documented stakeholder requirements in a Software Requirements Specification document and created use cases and diagrams using Rational Rose.
- Analysed emerging business trends and production patterns and forecasted demand using BI reports.
- Involved in identifying reporting needs, helping the business create specifications, and working closely with the report development team to achieve reporting goals.
Environment: PL/SQL, Tableau, Python, SAS, Orange, Excel, SQL Server, Oracle 9i, MS Office, XML, Business Objects.
Confidential
Data Analyst
Responsibilities:
- Developed Spark programs for data ingestion to HDFS from sources such as MySQL, Vertica, and Oracle.
- Leveraged Hive's ACID properties to support row-level transactions.
- Developed Python APIs to perform data quality checks on data before it was used by subsequent processes or published to downstream users.
- Used Git version control system for project files, UNIX scripts.
- Created Sqoop jobs for efficiently transferring bulk data between Apache Hadoop and relational databases.
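A data-quality-check API of the kind mentioned above might look like the sketch below. The rule names and the record schema are illustrative assumptions; the idea is to reject or quarantine records before downstream processes consume them.

```python
# Sketch: a small pre-publication data-quality-check API.
def check_record(record, required_fields, non_negative_fields=()):
    """Return a list of rule violations for one record (empty list = clean)."""
    violations = []
    for field in required_fields:
        if record.get(field) in (None, ""):
            violations.append(f"missing required field: {field}")
    for field in non_negative_fields:
        value = record.get(field)
        if isinstance(value, (int, float)) and value < 0:
            violations.append(f"negative value in: {field}")
    return violations

def partition(records, required_fields, non_negative_fields=()):
    """Split a batch into (clean, rejected_with_reasons)."""
    clean, rejected = [], []
    for rec in records:
        problems = check_record(rec, required_fields, non_negative_fields)
        if problems:
            rejected.append((rec, problems))
        else:
            clean.append(rec)
    return clean, rejected

batch = [
    {"order_id": 1, "amount": 19.99},
    {"order_id": 2, "amount": -5.00},    # fails the non-negative rule
    {"order_id": None, "amount": 3.50},  # fails the required-field rule
]
clean, rejected = partition(batch, required_fields=["order_id"],
                            non_negative_fields=["amount"])
print(f"{len(clean)} clean, {len(rejected)} rejected")
```

In a Hadoop pipeline, checks like these would typically run as a validation stage between ingestion and publication to downstream users.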
Environment: Python, SQL, Hadoop, UNIX.
Confidential
Software Engineer / Database Developer
Responsibilities:
- Built the user interface (UI) with HTML, CSS, and JavaScript in the Spring Framework for the Confidential acquiring system, which processed 12.44 billion transactions for notable clients such as RAKBANK.
- Recognized for building a shell script for data and log file purging, helping the organization address data-purging issues under PCI requirements and improving retention times by 30%.
- Led the end-to-end payment cycle by managing Confidential internal batches such as pre-cleaning and EOD, using Core Java, MySQL, and Jasper Reports to onboard 50+ banks.
- Developed and deployed mathematics-based enterprise ETL solutions to optimize the integration of large and complex business datasets used to instantiate and configure the Confidential software platform.
- Collaborated with senior management on multiple functions such as code integration, release building, Production support, issues and deployment incorporating core agile practices.
- Streamlined the processing of data provided by banks and stored it in a database using ETL, Spring Boot, and Hibernate; created Stored Procedures, Functions, Packages, and Triggers.
- Worked with Oracle Collections (including Records, PL/SQL Tables, and Nested Tables) along with subqueries and joins to extract data from multiple tables.
- Responsible for optimizing SQL queries through performance tuning and indexing resulting in a 30% reduction of database deadlocks and improved response time.
- Worked in the Confidential file processor, which is a Java multithreading framework to facilitate the end-to-end transaction.
- Mentored and trained new hires by providing sessions on digital banking framework, best practices.
- Conducted risk analysis and prioritized functional requirements to identify project critical success factors.
- Managed changes to requirements to ensure requirements integrity.
- Collaborated with business leaders, IT groups, and vendors to ensure clear and accurate communication of issues.
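The retention-based purge logic described above (originally a shell script for PCI log retention) can be sketched in Python for illustration. The 30-day window and the directory layout are assumptions; here the script is demonstrated against a throwaway temp directory.

```python
# Sketch: delete files older than a retention window, as in log purging.
import os
import tempfile
import time
from pathlib import Path

RETENTION_DAYS = 30

def purge_old_files(log_dir, retention_days=RETENTION_DAYS):
    """Delete files whose modification time is older than the retention window."""
    cutoff = time.time() - retention_days * 86400
    removed = []
    for path in Path(log_dir).iterdir():
        if path.is_file() and path.stat().st_mtime < cutoff:
            path.unlink()
            removed.append(path.name)
    return removed

# Demonstrate against a throwaway directory
log_dir = tempfile.mkdtemp()
old_file = Path(log_dir, "app-old.log")
new_file = Path(log_dir, "app-today.log")
old_file.write_text("stale")
new_file.write_text("fresh")
# Backdate the old file's mtime by 60 days so it falls outside the window
stale = time.time() - 60 * 86400
os.utime(old_file, (stale, stale))

removed = purge_old_files(log_dir)
print(f"purged: {removed}")
```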
Environment: Oracle 8i/9i, PL/SQL, SQL, ETL, Jasper Reports, Spring Framework, Junit.
Confidential
Data Analyst / Quality Analyst
Responsibilities:
- Collaborated with business leaders, IT groups, and vendors to ensure issues were communicated clearly and accurately.
- Performed data analysis to evaluate data quality and resolve data-related issues.
- Responsible for improving data quality and for designing and presenting conclusions drawn from data analysis, using Microsoft Excel as a statistical tool.
- Performed data reconciliation of sources, referential integrity checks, null and default value checks, and business-rule validation.
- Debugged data issues between source systems and EDW.
- Profiled data in the sources prior to defining the data cleaning rules.
- Verified the data loads as well as the reports.
- Involved in identifying reporting needs, helping the business create specifications, and working closely with the report development team to achieve reporting goals.
- Assisted the project manager in setting up the direction for business analysis team.
- Developed custom routines for generating test cases.
- Created the Test Specifications, Test Scripts and Test Categories for the testing of data in the EDW.
- Attended defect triage meetings with the end users and developers.
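The source-to-EDW reconciliation checks listed above amount to comparing row counts, key coverage, and nulls between a source extract and the warehouse load. A minimal sketch, with illustrative data and key names assumed:

```python
# Sketch: reconcile a source extract against its EDW load.
source = [
    {"id": 1, "amount": 100.0},
    {"id": 2, "amount": 250.0},
    {"id": 3, "amount": None},
]
edw = [
    {"id": 1, "amount": 100.0},
    {"id": 3, "amount": None},
]

# Row-count reconciliation
count_diff = len(source) - len(edw)

# Key coverage: source keys that never arrived in the warehouse
missing_keys = {r["id"] for r in source} - {r["id"] for r in edw}

# Null check on a business-critical column in the target
null_amounts = sum(1 for r in edw if r["amount"] is None)

print(f"count diff: {count_diff}, missing keys: {sorted(missing_keys)}, "
      f"null amounts in EDW: {null_amounts}")
```

In practice each check would be run per table and the violations logged as defects for the triage meetings mentioned above.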