Machine Learning Engineer/Data Scientist Resume
Katy, TX
SUMMARY:
- Over 7 years of hands-on experience in machine learning, statistical modeling, and big data analytics.
- Experienced in supervised learning, unsupervised learning, and natural language processing.
- Expertise in machine learning across multiple projects using deep learning/neural networks, kNN, Naïve Bayes, SGD, SVM, SOM, random forest, and boosted trees, together with NLP.
COMPUTER/ANALYTICS SKILLS:
Familiar with machine learning, NLP, and deep learning with TensorFlow, Theano, PyTorch, and CNTK.
Proficient in Java, Python, R, C/C++, SQL, and Tableau.
Certified in SAS Base and Advanced Programming for SAS 9.
Experienced in Spark MLlib and data analytics on AWS, MS Azure, and Google Cloud.
Familiar with Hadoop, HDFS, MapReduce, Hive, NoSQL (HBase, MongoDB), YARN, and Spark/MLlib.
Proficient in Microsoft Windows, DOS/PowerShell, and Unix/Linux/Ubuntu.
WORK EXPERIENCE:
Confidential, Katy, TX
Machine Learning Engineer/Data Scientist
Responsibilities:
- Worked on machine learning projects using Python, R, SQL, Spark MLlib, and SAS. Performed exploratory data analysis, data visualization, feature selection, and model validation. Improved model performance on test data; for example, in a customer segmentation/campaign and credit risk prediction project, the scoring metric rose from 0.57 for the legacy model to 0.83 for the final model.
- Applied machine learning algorithms, including kNN, NB, ANN, random forest and boosted trees, SVM, SGD (stochastic gradient descent regression), PCA, singular value decomposition, NLTK/TextBlob, neural networks, Spark MLlib, and deep learning using TensorFlow/Theano.
- Performed data analysis, natural language processing, and statistical analysis; generated reports, listings, and graphs.
- Big data analytics with Hadoop, HiveQL, Spark RDD, and Spark SQL.
- Tested Python/SAS on AWS cloud services and CNTK modeling on MS Azure cloud services.
Confidential, Houston, TX
Data Processor/Geophysical Data Scientist
Responsibilities:
- Built prediction models of major subsurface properties for subsurface imaging, geologic interpretation, and drilling decisions. Utilized advanced methods of big data analytics, machine learning, artificial intelligence, wave equation modeling, and statistical analysis. Provided executive summaries of oil/gas seismic data and well profiles, and conducted predictive analyses and data mining to support interpretation and operations.
- Applied a cross-correlation-based data analysis method in Python and R across multiple offset wells to help predict models and pore pressure a short distance ahead during real-time drilling. Built big data models incorporating seismic, rock physics, statistical analysis, well logs, and geological information into the 'beyond image'.
- Using Python and Java, developed, operationalized, and productionized machine learning models with significant impact on geological pattern identification and subsurface model prediction. Analyzed seismic and log data with sub-group analysis (classification-clustering, hybrid approach using PCA and SOM) and model prediction methods (regression, random forest, boosted trees, standard neural networks, etc.).
- Used SAS statistical regression methods to simulate the anisotropic trend.
- Tested the migrated data processing system on Google Cloud with velocity model updating tasks.
- Performed ETL to convert unstructured data to structured data and imported the data into Hadoop HDFS. Utilized MapR as a low-risk big data solution to build a digital oilfield. Efficiently integrated and analyzed the data to increase drilling performance and interpretation quality. Analyzed sensor and well log data in HDFS with HiveQL and prepared it for predictive learning models.
- Constantly monitored the data and models to identify scope for improvement in processing and business operations. Manipulated and prepared the data for data visualization and report generation. Performed data analysis and statistical analysis; generated reports, listings, and graphs.
- Co-leader of the mathematics community, Schlumberger Eureka, 2015.
Confidential, Houston, TX
Data Processor/Advanced Software Engineer
Responsibilities:
- Using C/C++, developed and supported modules for grid-based model operations and acquisition geometry simulations.
- Supported script-based workflows, including I/O of large datasets, calls to geophysical modules, and communication between parallel processes.
- Clustered seismic attributes using an unsupervised neural network to make better use of large volumes of geophysical data for geological feature recognition and oil production risk control.
- Simulated acquisition geometry using MATLAB/Unix scripts.
Confidential, Houston, TX
Seismic Data Processor
Responsibilities:
- Seismic data processing and interpretation; generated high-quality subsurface images as part of a team, with strong problem-solving ability, efficient communication, and cooperation.
- Loaded acquisition data for processing with Oracle database management.
- Performed data analysis in different domains using both geophysical and statistical methods. Detected noisy/bad traces based on anomaly analysis. Predicted and subtracted noise models and surface-related multiple models based on pattern recognition, convolution methods, and amplitude ratio identification.
- Cleaned and archived datasets on schedule. Processed seismic surveys with terabyte-scale datasets.
Confidential, Woodlands, TX
Financial Analyst
Responsibilities:
- Provided financial and business analytics services using SAS and C. Generated prediction and regression models using statistical supervised learning methods.
- Collaborated with the analyst group to analyze the portfolio and evaluate the models. Estimated Value-at-Risk using Monte Carlo simulation.
- Performed data analysis and statistical analysis; generated reports, listings, and graphs.
- Performed financial time series autocorrelation analysis and regression in SAS using SAS/SGPLOT and PROC AUTOREG, with investigation of model fit statistics such as AIC, AICC, MAPE, and R-Square.
- Trained stock market prediction models and selected the final model using data mining methods.
