Data Analyst Resume San Jose, CA - Hire IT People

SUMMARY

5 years of experience in data analysis and modeling, with strong technical and communication skills
Developed and implemented databases, data collection systems, data analytics and other strategies that optimize statistical efficiency and quality
Interpreted data, analyzed results using statistical techniques, and provided ongoing reports
Strong analytical and problem-solving skills, capable of addressing relevant facts, recommending solutions, and working with teams and cross-functional departments
Experience in design, development, testing, implementation and support of Enterprise Resource Planning (ERP) and strong understanding of business process flows
Gathered and confirmed project requirements by studying user requirements and conferring with others on the project team
Worked with both relational and non-relational databases, using SQL and NoSQL syntax to query data and integrate with big data frameworks
Proficient in querying data with MySQL, MSSQL, PostgreSQL, etc., including building relational databases, selecting and joining tables with aggregate functions, and creating views, functions, and procedures
Hands-on experience with MongoDB and DynamoDB to build, retrieve, store, and modify NoSQL databases
Worked on data preprocessing, exploratory data analysis, and building Machine Learning models such as SVM, KNN, Random Forest, etc. in Python
Analyzed data by building Machine Learning models for regression, classification, and clustering with feature selection, feature engineering, hyper-parameter tuning, evaluation, and validation in Python
Experience using Tableau with database join, nested sorting, integration, visualization by creating diverse charts, maps, trend lines, and predictive analysis
Hands-on experience with the big data framework Hadoop, including HBase for big data storage, data arrangement in HDFS, processing with MapReduce or Spark, and data analysis with Hive
Hands-on experience in data mining, cleaning, warehousing, and ETL processes using Talend, Informatica, AWS Glue, and Hive with big data frameworks
Worked with the cloud computing platforms Amazon Web Services (AWS) and Google Cloud Platform (GCP) and various services including EC2, S3 storage, EMR, DynamoDB, BigQuery, Kubernetes, etc.
Performed A/B testing and hypothesis testing to deliver business insights
Built statistical models and performed exploratory analysis in RStudio
Experience with Deep Learning models for NLP and image recognition by using Python
Reported and presented data patterns and analytical results to other teams by building charts, graphs, and dashboards in MS PowerPoint and Excel
Experience with Agile project management using Jira
Used Git repositories for version control
Highly self-motivated quick learner who adapts readily to new tools and team environments

TECHNICAL SKILLS

Programming Language: Python, R, SQL, Bash, Julia; JavaScript, STATA, MATLAB (Exposure)

Database: MongoDB, DynamoDB, Redshift, PostgreSQL, MySQL, MSSQL, Oracle, MariaDB

Packages: NumPy, pandas, Scikit-learn, TensorFlow, PyTorch, Keras, Caffe, seaborn, Matplotlib, SciPy, NLTK

Cloud Platform: Amazon Web Services (AWS), Google Cloud Platform (GCP)

Machine Learning: Linear Regression, Logistic Regression, Decision Tree, Random Forest, KNN, SVM, Gradient Boosting, Multi-Layer Perceptron, Neural Networks, Natural Language Processing

IDE: Jupyter Notebook, PyCharm, Visual Studio Code, Spyder, Atom, RStudio

Big Data Platform: Hadoop, Spark, MapReduce, Hive, HBase

Analytic and Reporting Tools: Tableau, Power BI, Microsoft Office, SSRS, SSAS

Management Tools: Jira, Slack, Trello, Airflow

Operating System: Linux, Windows, macOS

PROFESSIONAL EXPERIENCE

Confidential, San Jose, CA

Data Analyst

Responsibilities:

Analyzed complex, high-volume, high-dimensionality data from multiple sources using different data analysis techniques and tools to formulate recommendations, learning and test plans
Queried and analyzed relational databases containing millions of customer records using SQL, integrated with big data frameworks to handle and analyze data, and built Machine Learning models to provide business insight
Maintained a NoSQL database on MongoDB to handle unstructured data; cleaned the data by removing invalid records, unifying formats, and rearranging the structure, then loaded it for subsequent steps
Ran the big data framework Hadoop on an AWS EMR cluster, interacting with S3 for file storage
Used HiveQL to merge and aggregate the different datasets and loaded the results into the database as a full ETL process with AWS Glue
Built relational databases on Amazon Aurora MySQL: created/altered tables, wrote stored procedures, triggers, and user-defined functions, and implemented batch processing
Used MySQL features such as joins, window functions, subqueries, and aggregate functions to clean data and optimize database performance
Designed story-telling visualizations and dashboards from different databases in Tableau to enable ongoing monitoring and reporting across departments
Ran Python scripts on AWS EC2 to extract data from different departments, cleaned data by creating functions based on business logic, and streamlined data processing with pandas and NumPy
Implemented EDA, PCA, and feature engineering to extract features, and applied SMOTE and other resampling techniques to address imbalanced data and improve the F1 score of machine learning models in Python
Developed Random Forest and Gradient Boosting models and performed hyperparameter tuning using GridSearchCV from the Scikit-learn package in Python
Ran A/B tests for new features across different market channels
Identified data patterns and visualized model results by connecting Tableau with Python and MySQL
Reported Tableau dashboards with different charts and presented them with MS PowerPoint
Used Git for version control and Jira for team-wide project management
Summarized information and reports to deliver insights to the team and client
Implemented the design, analysis, and interpretation of a variety of reports and analytical solutions
Engaged constructively with project teams to support project objectives through the application principle

Confidential

Data Analyst

Responsibilities:

Monitored and analyzed customer information data and sales data to identify product and market trends
Provided data-driven insights to enable decision-making for product and market development by interpreting data, analyzing results and using statistical techniques to provide ongoing reports
Created effective marketing strategies and promotion plans to retain existing high-value customers
Developed intuitive KPI dashboards in Tableau for senior management that provided insight into the performance of department strategies
Worked with SQL to manipulate relational databases containing millions of customer insurance records, and built Machine Learning models with big data frameworks to predict premiums, risk level, etc. and deliver business insight
Participated in maintaining NoSQL databases with MongoDB and Cloud Bigtable to manipulate unstructured data, and extracted data from different sources
Extracted and manipulated large-scale data in Hadoop and Spark environments
Worked with HiveQL scripts to process data stored in HBase and ran MapReduce jobs
Used Google Cloud and Snowflake for ETL processes, data warehousing, and large-scale computing
Designed relational databases and storage in MariaDB (MySQL), and maintained and optimized the databases by creating/altering tables, writing stored procedures, triggers, and user-defined functions, and implementing batch processing
Fetched relational data using SQL to query and merge millions of records from different tables
Connected Tableau to different databases to create interactive charts and trend lines for presenting analytical results and providing customer segmentation suggestions
Identified trends and patterns in the data through EDA in Python, supporting further analysis by the engineering and sales teams
Performed feature engineering to extract features and visualized them with matplotlib and seaborn in Python
Determined key factors affecting potential risk level through feature selection processes such as PCA
Applied Machine Learning models including Gradient Boosting and SVM and evaluated the models
Built unsupervised machine learning models such as k-means for customer segmentation and personalization in Python
Ran A/B tests for new features across different market channels for product research
Used R for statistical modeling, including Bayesian analysis and hypothesis testing, with the dplyr and BAS packages, and visualized test results in R to deliver business insight
Validated models using confusion matrices, ROC curves, and AUC, and developed diagnostic tables and graphs that demonstrated how the model can be used to improve the efficiency of the selection process
Presented and reported business insights with SSRS and Tableau dashboards combined with different diagrams
Used Jira for project management and Git for version control while building the program
Reported and displayed the analysis results in the web browser with HTML and JavaScript
Engaged constructively with project teams, supported project goals, and delivered insights to the team and client

Confidential

Data Analyst

Responsibilities:

Collaborated with the product team to develop a database with MySQL, perform data analysis using Microsoft Excel and Python, and deliver analysis for business insight in different visual formats
Used Microsoft Excel to clean the data and explore data features using sorting, filtering, conditional formatting, charts, and pivot tables
Analyzed data sources and formats using different tools, built relational databases, and modified the schema in MySQL
Queried and merged data in MySQL, removing invalid data, unifying formats, and rearranging the structure
Performed exploratory data analysis, results interpretation, and report preparation in support of business process management and supply chain cycle
Identified data patterns and feature importance in Python with pandas, NumPy, and sklearn
Built models such as linear regression and logistic regression with different features and made predictions to advise supply chain management
Adjusted performance by feature tuning with GridSearchCV to improve the MSRE
Visualized the results in Python with matplotlib and seaborn, and also in Data Studio, to deliver insights
Interpreted data and drew conclusions, presented in a visual format to the management team using Microsoft PowerPoint
Reported and displayed the analysis results in the web browser with HTML and JavaScript
