Data Scientist Resume
New York
SUMMARY:
- Seeking Software Development Engineer / Data Scientist / Business Intelligence Analysis internship and job opportunities
TECHNICAL SKILLS:
Full Stack:
- Big Data Algorithms & Architecture
- Enterprise Data Warehouse Design and Build-up
- Deep Learning: Convolutional Neural Network (CNN), LSTM, Recurrent Neural Network (RNN), BP Network, Hopfield Network, Gradient Descent, and forward and back propagation via Python Keras and Google TensorFlow
- Machine Learning: Logistic Regression, KNN, C5.0, Random Forest, Bayes Belief Network, Apriori, AdaBoost, PCA, XGBoost, SVM, Cross Validation, ROC/AUC, Natural Language Processing (NLP) via Python Pandas/NumPy/Scikit-learn/nltk, R
- Big data platforms: Hadoop / Hive / HBase / Spark / Mahout / MapReduce in Java
- Data Visualization tools: Tableau, IBM Cognos, XLCubed
- ETL tools: IBM DataStage, TM1
- Databases and SQL: PL/SQL, Oracle DBA & Performance Tuning, SQL Server
- Microsoft BI: SSAS / SSIS / SSRS, MDX / DAX, Power BI, Datazen
- Scripting languages: Perl, Awk, Sed, Shell on Linux
- Software Development Engineering: Java/J2EE framework, jQuery, JDBC, JS & Servlets; methodologies such as Agile / Waterfall; tools such as Jira, Git and Maven
- Cloud Servers & Infrastructure: Microsoft Azure, Amazon AWS, Redshift
PROFESSIONAL EXPERIENCE:
Data Scientist
Confidential, New York
Responsibilities:
- As a Research Scientist, I mainly work on the ETL of acquisition data for the Data Lake, which the Cognitive Enterprise Data Platform uses as a source for Watson. I also carry out internal deep learning and machine learning analyses.
Senior Data Consultant
Confidential
Responsibilities:
- Bayer is a global enterprise with core competencies in Life Science, Healthcare & Materials.
- As the BI expert and Big Data Consultant, I am mainly responsible for a series of local and global Business Intelligence and Data Warehouse projects covering CRM and Sales data. I also build a variety of visualization reports and Java portals for the business.
ETL & Dashboard Developer
Confidential
Responsibilities:
- Construct LSTM-RNN models for sequence prediction and context analysis, and CNNs for image classification; apply Python scikit-learn, nltk and SPSS to analyze stock prices, opportunity loss and contract tokens/entities/relations
- Implement parallel ETL jobs and SQL procedures to extract data from multiple sources
- Global Core X
- Elicit requirements from Bayer managers and issue the URS
- Analyze the daily interaction dataset on Hadoop via MapReduce and R, using a collaborative filtering algorithm, to investigate the customers’ p and sales rep behaviors
- Work with the Singapore team to query the daily analysis reports via Hive and NoSQL in HBase
- Develop feature-based, cross-validated machine learning models such as linear regression, k-nearest neighbors and time series models to predict customer purchases
- Coordinate with the Switzerland, Germany and US teams; responsible for project management of the China part, including proposals, change control and budget
- Implement data warehouses using the Snowflake Model & Slowly Changing Dimensions
- Develop cubes, ETL and reports via Microsoft BI and MDX; work closely with the Singapore team to build the DEV, QA & PROD environments
- Troubleshoot a JVM heap memory leak on the BI Web Portal; develop the portal in J2EE
- User Analysis of China Mobile
Confidential
Responsibilities:
- Update the original PL/SQL stored procedures, replacing the time-consuming cursors and loops with Oracle analytic functions to shorten the whole ETL process, which handles around 5 million customer details and 100 million transaction records. The data grows by 2 TB per day; the process previously took 52 minutes and now finishes in less than 5 minutes
- Use Tableau to present data analysis conclusions more intuitively for leaders
- Build Hadoop clusters and MapReduce jobs in Java
- Implement classification and clustering models based on Bayes Belief Network & K-means via Mahout