- 7+ years of experience as a data analyst.
- Strong SQL skills in retrieving data.
- Fluent in Python, working in Jupyter Notebook.
- Extensive experience with ETL data pipelines for extracting, transforming, and loading data.
- Extensive experience designing dashboards with Tableau and Power BI.
- Experience with big data environments, including Spark and Hive.
- Experience with AWS (S3, Lambda, DynamoDB).
- Experience in machine learning and deep learning.
- Extensive experience designing data pipelines for data collection, cleaning, exploration, preprocessing, engineering, and modeling.
- Experience in building models with Python for classification, regression, and clustering problems.
- Excellent Excel skills for querying, analyzing, visualizing, and modeling data.
- Experience formulating linear programming models in Excel (Solver) to find the optimal configuration for production planning and inventory control in order to maximize profit.
- Data Analysis | Python (Pandas)
- Visualization (Tableau, Power BI)
- Machine Learning | Python (Scikit-learn, Keras, pickle), Random Forest, Decision Tree, Logistic Regression, Naïve Bayes, K-Means
- AWS: S3
- Excel (Solver)
- Translated organizational goals and questions into quantitative analysis.
- Wrote and optimized SQL queries to acquire data, using joins, selections, aggregations, unions, CASE WHEN expressions, and window functions.
- Extracted data with SQL, Python, and PySpark from various types of sources, such as SQL databases, offline files, and APIs, in formats including CSV, XLS, JSON, and XML.
- Transformed data with SQL and Python for further analysis and modeling, including row operations, joins, sorting, aggregation, cleaning, merging, and concatenation.
- Built MySQL-connected applications and pipelines with SQLAlchemy in Jupyter Notebook to extract data from the database, transform it, and perform analysis and visualization.
- Pulled data from an API using Python Requests.
- Implemented data preparation in Python, including data cleansing, handling missing data, identifying outliers, and transforming data.
- Conducted data processing and analysis using Spark SQL; implemented graph analytics and machine learning in Spark.
- Visualized data with Matplotlib, Seaborn, Excel, Tableau, and Power BI.
- Built BI dashboards with Tableau and Power BI.
- Facilitated advanced and scalable machine learning models on structured or unstructured data to solve classification, regression, and clustering problems.
- Identified patterns and trends in data; provided insights to enhance business decision-making.
- Collaborated with product, science, engineering, and business development teams to develop and deliver data-science-driven solutions that brought real business value.
- Documented, summarized, and presented findings to a group of peers and stakeholders.
- Determined the optimal production and supply chain configuration to maximize profit by solving a linear programming model in Excel (Solver).
- Wrote fast, reusable SQL to query, extract, and transform data from data sources to obtain the required information.
- Wrote HiveQL to query data.
- Developed and maintained dashboards using BI tools such as Tableau and Power BI.
- Explored and analyzed data with tools such as Excel to deliver insights, answers, and decision support, and reported the results to stakeholders.
- Visualized data insights and answers using Excel, Tableau, and Python to communicate findings to teams across the organization.
- Forecasted supply quantities with time-series models such as moving average and exponential smoothing, which saved warehouse storage space.
- Created WBS (Work Breakdown Structure) and network diagrams with Visio.
- Formulated production planning and shift planning using operations research (linear programming) in Excel (Solver).
- Performed hypothesis testing to confirm whether the purity of chemical materials met requirements, based on confidence intervals.
- Managed a database of chemical compounds.
- Developed and presented reports that tell cohesive, easily understood stories.
- Transferred data from paper formats into computer files or database systems using keyboards, data recorders, or optical scanners.
- Created error-free spreadsheets containing large numbers of figures.
- Sorted and organized paperwork after entering data to ensure it was not lost.
- Retrieved data from Excel or SQL as requested.