Data Analyst/Data Engineer Resume

Milpitas, CA

SUMMARY:

  • 5 years' experience as a data analyst and 3 years' experience as a data engineer, with broad knowledge of big data analysis, data cleaning, visualization, and exploration, building ETL pipelines, and cloud computing
  • Experience identifying, analyzing and interpreting trends and patterns in complex datasets, drawing insights to answer business questions and identify opportunities for improvement
  • Worked on data cleaning, exploratory data analysis, data visualization, and data mining using SQL, Python, and R
  • Advanced knowledge of Tableau for designing interactive dashboards and reports from different databases and presenting key metrics and data-driven insights to managers and other staff
  • Experience defining key metrics to assess overall business performance and delivering in-depth analysis and recommendations
  • Experience building ETL pipelines to query data from relational databases like MySQL, SQLite, and SQL Server and non-relational databases like MongoDB, and scheduling tasks in a specific order with Airflow
  • Hands-on experience with cloud computing, using AWS and Google Cloud to run big data analysis
  • Knowledge of database design, including creating views and tables, managing access permissions to databases, tables, and views, normalizing and denormalizing databases, and partitioning tables
  • Hands-on experience using Hadoop for big data processing and ETL with Hive and Spark
  • Experience developing machine learning models with Linear Regression, Logistic Regression, Decision Tree, K-Nearest Neighbors, Support Vector Machines, Random Forests, Boosting, and K-means
  • Experience collaborating with marketing and strategy teams to identify business goals and translating those goals into IT specifications
  • Hands-on experience using Git to track changes across versions of source code and coordinate work on those versions with other people
  • Experience using Jira to capture, track, and resolve issues and to configure custom workflows and projects

SKILLS:

Programming Languages: SQL | Python (NumPy, Pandas, Statsmodels, Matplotlib, Seaborn, Folium, spaCy, Scikit-Learn, BeautifulSoup, PySpark, Flask, TensorFlow) | R (dplyr, ggplot2, Shiny, lme4)

Databases: MySQL | SQL Server | SQLite | MongoDB

Tools: Tableau | Hadoop (Hive, Spark) | Visual Studio | Jupyter Notebook | RStudio | Git | Jira | Microsoft Office (Word, Excel, PowerPoint)

Clouds: AWS (S3, EC2, DynamoDB) | Google Cloud Platform (BigQuery, Cloud Storage, Kubernetes)

Transferable skills: teamwork ability | critical thinking and problem solving | attention to detail | self-motivated individual | written and verbal communication

Machine Learning: Linear Regression | Logistic Regression | SVM | K-Nearest Neighbors | Naïve Bayes | Random Forest | Gradient Boosting | K-means | Time Series | NLP

PROFESSIONAL EXPERIENCE:

Confidential, Milpitas, CA

Data Analyst/Data Engineer

Responsibilities:

  • Created database objects like tables, views, procedures, and functions using SQL to provide definition, structure and to maintain data efficiently
  • Created dashboards and interactive charts using Tableau to provide insights for managers and stakeholders and enable decision-making for market development
  • Designed ETL pipelines to load datasets from MySQL and MongoDB into an AWS S3 bucket, and managed bucket and object access permissions
  • Performed data cleaning and wrangling using Python with a cluster computing framework like Spark
  • Managed multiple project tasks with changing priorities and tight deadlines in an Agile environment
  • Applied statistical analysis in R to check hypothesis-test assumptions and select features for machine learning
  • Worked with a cross-functional team to design, develop, and implement a BI solution for marketing strategies
  • Implemented feature engineering in Spark to tokenize text data and transform features with scaling, normalization, and imputation
  • Helped build machine learning pipelines in Spark for customer segmentation, using PCA for dimensionality reduction and K-means for clustering, and assisted the Data Science team in implementing association rule mining
  • Developed presentations using MS PowerPoint for internal and external audiences

Confidential

Data Analyst/Data Engineer

Responsibilities:

  • Collaborated with the engineering team to design and maintain MySQL databases for storing and retrieving customer review data
  • Used SQL to build ETL pipelines that filter, aggregate, and join various tables to retrieve the desired data from MySQL databases
  • Ingested, explored, cleaned, and integrated data from MySQL and MongoDB databases on AWS EC2 using Python and Hadoop to perform initial investigation, discover patterns, and check assumptions
  • Provided BI Analysis for the marketing team to review impact on key metrics in relation to the project
  • Used R to query the data, run statistical analysis and create reports or dashboards
  • Prepared project progress and status reports and submitted them to the management team on an ongoing basis
  • Built compelling visualizations and dashboards using Tableau to deliver actionable insights
  • Built feature engineering pipelines in Python to normalize and scale numerical features and tokenize categorical features, and applied PCA to reduce dimensionality
  • Contributed to building machine learning models with the scikit-learn library in Python, including Logistic Regression, SVM, Random Forest, and Naive Bayes models

Confidential

Data Analyst

Responsibilities:

  • Collaborated with data managers to define and implement data standards and common data elements for data collection
  • Built an ETL pipeline using SQL to query telecom data from a MySQL database by filtering, joining, and aggregating various tables
  • Used Tableau to design and maintain reports and dashboards to track and communicate customer churn prediction performance
  • Manipulated raw data with the NumPy and Pandas libraries in Python for data cleaning, exploratory analysis, and feature engineering
  • Generated interactive charts with the Matplotlib and Seaborn libraries in Python for exploring and explaining data
  • Collaborated with marketing managers to identify root causes of customer discontent and built dashboards that reflect these findings
  • Designed A/B tests to identify variables that contributed to customer churn and used the Shiny library in R to turn analyses into dashboards
  • Applied data mining in Spark to extract diverse features that provide additional information to enhance the churn prediction
  • Supported the construction of machine learning models using the scikit-learn library in Python to predict customer churn, including Decision Tree, Random Forest, and Gradient Boosting models

Confidential

Data Analyst Intern

Responsibilities:

  • Assisted in the maintenance of all MySQL database applications and resolved database-related issues submitted to the help desk ticketing system
  • Retrieved raw data and applied data cleaning, transformation, and exploratory analysis with the NumPy and Pandas libraries in Python
  • Used the Matplotlib library in Python to monitor and analyze weekly, monthly, and yearly sales data to identify market trends and patterns
  • Designed and conducted statistical analysis with the lme4 library in R to identify and remediate data quality and integrity issues and to identify metrics used to monitor product performance
  • Developed dashboards and frameworks with Tableau to monitor business and product performance
  • Supported the business lead on special assignments to ensure production efficiency