- Data Analytics and Business Intelligence professional with 8 years of experience across the data science life cycle
- Skilled at driving all aspects of data gathering, data cleansing, requirement analysis, data mining, modeling, and data visualization using SQL, Python, R, and Tableau
- Develops and implements data collection systems and other strategies that optimize statistical efficiency and data quality
- Proven track record of delivering strategic analytics solutions by developing advanced analytics solutions and strategy
- Adept at leveraging machine learning to improve customer experience and maximize business growth opportunities
- Works on large data sets and on data architecture design and development, ensuring successful completion of analytics projects
- Competent in managing client expectations throughout the project life cycle by understanding the business requirements
- Deft at consulting on best practices and aligning business goals with technology solutions to drive process improvements
- Consistently exceeded client expectations and met clients' analytics needs through hands-on data engineering
- Skilled at building data architectures and applying advanced analytical techniques for clients across multiple industry verticals
Languages and Databases: Python (NumPy, Pandas, Matplotlib, Scikit-learn), PL/SQL, Oracle DB, Teradata, DB2, R, SAS, Java
Big Data Ecosystem: HDFS, Hive, Pig, Sqoop, Spark SQL, Hadoop, MapReduce
Data Visualization: Tableau, Alteryx, QlikView
Machine Learning: Linear and Logistic Regression, Neural Networks, KNN, K-Means Clustering, Decision Trees, Random Forests
Tools: Anaconda (Jupyter), Google Cloud, MS Office (Excel), SQL Developer, Teradata SQL Assistant, Toad, OBIEE, Splunk
ETL Tools: Informatica, IBM DataStage, Ab Initio
Cloud Services: Google Cloud, Salesforce Service Cloud (CRM)
Confidential, Santa Clara, CA
Senior Data Analyst
- Designed and built data maps & high-quality ETL data solutions for migrating existing source data from heterogeneous sources to Salesforce Service Cloud (CRM).
- Managed the entire analytics life cycle by performing Data Analysis, Quality Analysis, Data Migration, Data Cleansing, Transformation, Integration, Data Import, and Data Export on source data using Python, SQL, and ETL tools.
- Reviewed, analyzed, and ensured the quality of data loaded into the database system by using SQL and Python.
- Performed data validation checks, ensuring data integrity and consistency, by developing automation scripts in Python.
- Facilitated improved decision making and provided valuable business insights by developing visualizations using Tableau.
- Oversaw and mentored a team and served as a liaison between the data analysts and other departments.
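The automated validation checks described above can be sketched as a small Python routine; the record layout, field names, and check set here are invented for illustration, not the actual migration scripts:

```python
def validate_rows(rows, key_fields, required_fields):
    """Basic integrity checks on extracted records (field names are illustrative)."""
    keys = [tuple(r.get(f) for f in key_fields) for r in rows]
    return {
        # The extract must not be empty.
        "non_empty": len(rows) > 0,
        # Required fields must be populated on every record.
        "null_free": all(r.get(f) is not None for r in rows for f in required_fields),
        # Business keys must be unique across the extract.
        "keys_unique": len(keys) == len(set(keys)),
    }

records = [
    {"account_id": 1, "email": "a@example.com"},
    {"account_id": 2, "email": "b@example.com"},
]
checks = validate_rows(records, key_fields=["account_id"], required_fields=["account_id", "email"])
assert all(checks.values())
```

In practice such checks would run against each batch before loading it into the target CRM, with any failing batch routed back for cleansing.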
- Migrated existing source data in Teradata and DB2 to Google Cloud (Google Cloud Platform) as per company’s standards
- Designed and built data solutions, which resulted in increased accuracy and a significant reduction in run time.
- Designed Sqoop scripts to load data from Teradata and DB2 into the Hadoop environment, and shell scripts to transfer data from Hadoop to Google Cloud Storage (GCS) and from GCS to Google BigQuery
- Managed data validation, validated Sqoop jobs and shell scripts, and verified that data was loaded accurately
- Analyzed analytical results, performance, and KPIs by designing Tableau dashboards for executive leadership
- Identified and interpreted trends / patterns in complex data sets to prioritize business and information needs
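One common way to verify that a migration loaded accurately, as in the Sqoop-to-BigQuery work above, is row-count reconciliation. A minimal sketch follows; in practice the dicts would be built from `COUNT(*)` queries against Teradata/DB2 and BigQuery, but here they are stubbed with literal values:

```python
def reconcile_counts(source_counts, target_counts):
    """Return (table, source_rows, target_rows) for any table whose counts differ."""
    return [
        (table, src, target_counts.get(table, 0))
        for table, src in source_counts.items()
        if src != target_counts.get(table, 0)
    ]

# Stubbed counts; table names are hypothetical.
source = {"customers": 1200, "orders": 5400, "payments": 310}
target = {"customers": 1200, "orders": 5399, "payments": 310}
mismatches = reconcile_counts(source, target)
# Any entry in `mismatches` flags a table to reload or investigate.
```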
- Defined new data collection and analysis processes to build data solutions, and migrated existing source data from sources (Oracle, SQL, UNIX, and flat files) to Atlas
- Played a key role in devising simple and complex Hive and SQL scripts to validate data flow in varied applications
- Actively worked on Data Lake (Hadoop Environment) and evaluated huge volumes of business data
- Utilized MHUB to check data profiling and data lineage, and developed reports using Cognos for efficient data validation
- Conducted Technical Data Quality (TDQ) validations, including header/footer validation, record counts, data lineage, and more
- Performed profiling, checksum, empty-file, duplicate, delimiter, threshold, and CDC validations for all data sources
- Created Tableau dashboards and engineering reports for data visualization through data analysis and issue reporting
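The TDQ validations listed above (header/footer, record count, delimiter, empty file, duplicates, checksum) can be illustrated with a short Python sketch; the header layout and the `TRL|<count>` trailer convention are assumptions for illustration, not a real feed specification:

```python
import hashlib

def tdq_validate(raw, delimiter="|", expected_header="ID|NAME|AMOUNT"):
    """Technical data quality (TDQ) checks on a delimited feed (illustrative)."""
    lines = raw.strip().splitlines()
    if not lines:
        return {"empty_file": True}
    header, *body = lines
    # Trailer convention assumed here: last line "TRL|<record count>".
    trailer = body.pop() if body and body[-1].startswith("TRL") else None
    return {
        "empty_file": False,
        "header_ok": header == expected_header,
        # Trailer record count must match the number of detail rows.
        "count_ok": trailer is not None and int(trailer.split(delimiter)[1]) == len(body),
        # Every detail row must carry the same number of fields as the header.
        "delimiter_ok": all(l.count(delimiter) == header.count(delimiter) for l in body),
        "no_duplicates": len(body) == len(set(body)),
        # Checksum for comparing the feed against its source copy.
        "md5": hashlib.md5(raw.encode()).hexdigest(),
    }

feed = "ID|NAME|AMOUNT\n1|Acme|100\n2|Beta|250\nTRL|2"
results = tdq_validate(feed)
assert results["header_ok"] and results["count_ok"] and results["delimiter_ok"]
```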
- Devised simple and complex SQL scripts to check and validate data flow in various applications
- Performed data analysis, migration, data cleansing, extraction, transformation, and data loading (ETL) using Informatica
- Played a vital role in devising PL/SQL objects: stored procedures, triggers, views, and packages
- Utilized indexing, aggregation, and materialized views efficiently to optimize query performance
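The effect of indexing on query performance, mentioned above, can be demonstrated with a minimal SQLite sketch (table, column, and index names invented; materialized views are Oracle-specific and not covered here): the query plan changes from a full table scan to an index search once the index exists.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
cur = conn.cursor()
cur.execute("CREATE TABLE orders (id INTEGER PRIMARY KEY, customer_id INTEGER, total REAL)")
cur.executemany(
    "INSERT INTO orders (customer_id, total) VALUES (?, ?)",
    [(i % 100, i * 1.5) for i in range(1000)],
)

# Without an index, the filter forces a scan of the whole table.
plan_before = cur.execute(
    "EXPLAIN QUERY PLAN SELECT SUM(total) FROM orders WHERE customer_id = 7"
).fetchall()

# With an index on the filter column, SQLite seeks directly to matching rows.
cur.execute("CREATE INDEX idx_orders_customer ON orders (customer_id)")
plan_after = cur.execute(
    "EXPLAIN QUERY PLAN SELECT SUM(total) FROM orders WHERE customer_id = 7"
).fetchall()

# The last column of each plan row is a human-readable description
# such as a table scan before, and a search using the index after.
print(plan_before[-1][-1])
print(plan_after[-1][-1])
```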