
Data Scientist Resume


Newark, NJ

SUMMARY

  • 7+ years of professional experience in Statistical Modeling, Machine Learning, Conversational AI, and Data Visualization.
  • Expertise in transforming business resources and requirements into manageable data formats and analytical models; designing algorithms, building models, and developing data mining and reporting solutions that scale.
  • Proficient in managing the entire data science project life cycle, including Data Acquisition, Data Preparation, Data Manipulation, Feature Engineering, Statistical Modeling, Testing and Validation, and Visualization and Reporting.
  • Proficient in machine learning algorithms such as Linear Regression, Ridge, Lasso, Elastic Net Regression, Decision Trees, and Random Forests, as well as more advanced algorithms such as ANN, CNN, and RNN, and ensemble methods like Bagging, Boosting, and Stacking.
  • Built chatbots using RASA for more than 4 years, worked with chatbots in several projects, and have good experience with feature extraction and analysis methods.
  • Excellent performance in model validation and model tuning, with model selection, k-fold cross-validation, the hold-out scheme, and hyperparameter tuning via grid search and HyperOpt (see the sketch at the end of this summary).
  • Advanced experience with Python (2.x, 3.x) and its libraries such as NumPy, Pandas, scikit-learn, XGBoost, LightGBM, Keras, Matplotlib, and Seaborn.
  • Strong knowledge of statistical methodologies such as Hypothesis Testing, ANOVA, Principal Component Analysis (PCA), Monte Carlo Sampling, and Time Series Analysis.
  • Strong experience with R for developing machine learning models and performing hypothesis testing.
  • Basic knowledge of Hadoop and Spark, and experience with Big Data tools such as PySpark, Pig, and Hive.
  • Experience in building machine learning solutions using PySpark for large data sets on Hadoop systems.
  • Experience using Amazon Web Services (AWS) cloud services, including EC2, S3, AWS Lambda, and EMR.
  • Proficient at building and publishing interactive Tableau reports and dashboards with design customizations based on stakeholders' needs.
  • Experienced in RDBMS such as SQL Server 2012 and Oracle 9i/10g, and NoSQL databases like MongoDB and DynamoDB.
  • Expert in SQL, writing queries, temp tables, CTEs, stored procedures, user-defined functions, views, and indexes.
  • Responsible for creating ETL packages, migrating data from flat files and MS Excel, cleaning data and backing up data files, and synchronizing daily transactions using SSIS.
  • Quick learner in new business domains and software environments, delivering solutions adapted to new requirements and challenges.
  • Knowledge and experience in GitHub/Git version control tools.
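
For illustration, a minimal sketch of the model-tuning workflow referenced above (k-fold cross-validation with grid search over a random forest) using scikit-learn; the toy dataset, parameter grid, and scoring metric are assumptions for the example, not taken from any specific project:

    # A minimal sketch: 5-fold cross-validation with grid search in
    # scikit-learn. The toy data, parameter grid, and ROC-AUC scoring
    # are illustrative assumptions.
    from sklearn.datasets import make_classification
    from sklearn.ensemble import RandomForestClassifier
    from sklearn.model_selection import GridSearchCV, KFold

    X, y = make_classification(n_samples=500, n_features=20, random_state=42)

    param_grid = {"n_estimators": [100, 300], "max_depth": [5, 10, None]}

    # Each parameter combination is scored on every held-out fold.
    cv = KFold(n_splits=5, shuffle=True, random_state=42)
    search = GridSearchCV(RandomForestClassifier(random_state=42),
                          param_grid, cv=cv, scoring="roc_auc", n_jobs=-1)
    search.fit(X, y)

    print(search.best_params_, search.best_score_)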

TECHNICAL SKILLS

Deep Learning Frameworks and Libraries: TensorFlow, Keras, TFLearn, NumPy, Matplotlib, Seaborn, Pandas, scikit-learn, NLTK, Kubeflow, Vertex AI, Conversational AI.

ML and DL Algorithms: Linear and Logistic Regression, Support Vector Machines, K-Means, Decision Trees, Random Forests, single- and multi-layer perceptrons, CNNs, chatbots, ensemble models, YOLO, object detection, feature extraction, SSD, RASA, etc.

Programming Languages and Tools: Python, C++, MATLAB, MySQL, Visual Basic, Xcode, Jupyter, Spyder, OpenCV, Tableau, Google Cloud Platform.

Operating Systems: Microsoft Windows, Mac, Linux.

PROFESSIONAL EXPERIENCE

Confidential - Newark, NJ

Data Scientist

Responsibilities:

  • Worked as a Data Scientist; developed and deployed predictive models for analyzing customer churn and retention.
  • Performed data extraction, data manipulation, and data analysis on TBs of structured and unstructured data.
  • Developed machine learning models using Logistic Regression, Naïve Bayes, Random Forest and KNN.
  • Performed data imputation using Python's scikit-learn package.
  • Created interactive analytic dashboards using Tableau
  • Conducted analysis assessing customer consumption behaviors and discovering customer value with RFM analysis; applied customer segmentation with clustering algorithms such as K-Means Clustering and Hierarchical Clustering (see the sketch after this list).
  • Collaborated with data engineers and the operations team to implement the ETL process; wrote and optimized SQL queries to perform data extraction to fit the analytical requirements.
  • Performed sentiment analysis, capturing customer sentiments and categorizing positive, negative, angry, and happy customers from feedback forms.
  • Ensured that the model had a low false positive rate; performed text classification and sentiment analysis on unstructured and semi-structured data.
  • Launched Amazon EC2 cloud instances using Amazon Machine Images (Linux/Ubuntu) and configured launched instances with respect to specific applications.
  • Integrated SAS datasets into Excel using Dynamic Data Exchange; used SAS to analyze data and produce statistical tables, listings, and graphs for reports.
  • Used Principal Component Analysis in feature engineering to analyze high-dimensional data.
  • Used MLlib, Spark's machine learning library, to build and evaluate different models.
  • Performed data management tasks such as merging, concatenating, and interleaving SAS datasets using MERGE, UNION, and SET statements in DATA steps and PROC SQL.
  • Used SAS to read, write, import, and export data across file formats, including delimited files, spreadsheets, and Microsoft Excel and Access tables.
  • Communicated with team members, leadership, and stakeholders on findings to ensure models were well understood and incorporated into business processes.
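
For illustration, a minimal sketch of the RFM segmentation approach referenced above, computing recency, frequency, and monetary features and clustering them with K-Means; the transaction table, column names, and cluster count are illustrative assumptions:

    # A minimal sketch of RFM-based customer segmentation with K-Means.
    # The toy transactions and column names are illustrative assumptions.
    import pandas as pd
    from sklearn.cluster import KMeans
    from sklearn.preprocessing import StandardScaler

    txns = pd.DataFrame({
        "customer_id": [1, 1, 2, 3, 3, 3],
        "txn_date": pd.to_datetime(
            ["2023-01-05", "2023-03-01", "2023-02-10",
             "2023-01-20", "2023-02-25", "2023-03-10"]),
        "amount": [50.0, 20.0, 200.0, 35.0, 60.0, 45.0],
    })

    # Recency: days since last purchase; Frequency: purchase count;
    # Monetary: total spend per customer.
    snapshot = txns["txn_date"].max() + pd.Timedelta(days=1)
    rfm = txns.groupby("customer_id").agg(
        recency=("txn_date", lambda d: (snapshot - d.max()).days),
        frequency=("txn_date", "count"),
        monetary=("amount", "sum"),
    )

    # Standardize so no single RFM dimension dominates the distance metric.
    scaled = StandardScaler().fit_transform(rfm)
    rfm["segment"] = KMeans(n_clusters=2, n_init=10,
                            random_state=42).fit_predict(scaled)
    print(rfm)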

Environment: Python, R, MS Excel, Perl, MS SQL Server, HIPAA, EDI, Power BI, Tableau, T-SQL, ETL, MS Access, XML, JSON, MS Office 2010, Outlook.

Confidential - Austin, TX

Data Analyst

Responsibilities:

  • Performed statistical modeling with ML to derive insights from data under the guidance of the Principal Data Scientist.
  • Performed data ingestion with Sqoop and Flume.
  • Used SVN to commit changes into the main EMM application trunk.
  • Understood and implemented text mining concepts, graph processing, and semi-structured and unstructured data processing.
  • Worked with Ajax API calls to communicate with Hadoop through an Impala connection and SQL to render the required data; these API calls are similar to Microsoft Cognitive API calls.
  • Ran MapReduce programs on nodes running on the cluster.
  • Developed multiple MapReduce jobs in Scala for data cleaning and pre-processing.
  • Analyzed the partitioned and bucketed data and computed various metrics for reporting.
  • Involved in loading data from RDBMS and web logs into HDFS using Sqoop and Flume.
  • Worked on loading the data from MySQL to HBase, where necessary, using Sqoop.
  • Developed Hive queries for analysis across different banners.
  • Extracted data from Twitter using Java and the Twitter API; parsed JSON-formatted Twitter data and uploaded it to a database.
  • Exported the result set from Hive to MySQL using Sqoop after processing the data.
  • Analyzed the data by performing Hive queries and running Pig scripts to study customer behavior.
  • Gained hands-on experience working with SequenceFile, Avro, and HAR file formats and compression.
  • Used Hive to partition and bucket data (see the sketch after this list).
  • Wrote MapReduce programs with the Java API to cleanse structured and unstructured data.
  • Wrote Pig scripts to perform ETL procedures on the data in HDFS.
  • Created HBase tables to store data in various formats coming from different portfolios.
  • Worked on improving the performance of existing Pig and Hive queries.
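
For illustration, a minimal PySpark sketch of writing a partitioned, bucketed Hive table and computing metrics over it, as referenced above; the table name, toy columns, and bucket count are illustrative assumptions, and the job assumes a Spark install with Hive support enabled:

    # A minimal sketch of partitioning and bucketing a Hive table with
    # PySpark. Table name, columns, and bucket count are illustrative.
    from pyspark.sql import SparkSession

    spark = (SparkSession.builder
             .appName("partition-bucket-sketch")
             .enableHiveSupport()
             .getOrCreate())

    df = spark.createDataFrame(
        [("2023-01-01", "east", 120.0),
         ("2023-01-01", "west", 75.5),
         ("2023-01-02", "east", 98.0)],
        ["txn_date", "region", "amount"])

    (df.write
       .partitionBy("txn_date")   # one directory per txn_date, as in Hive PARTITIONED BY
       .bucketBy(4, "region")     # hash region into 4 buckets, as in CLUSTERED BY ... INTO 4 BUCKETS
       .sortBy("region")
       .mode("overwrite")
       .saveAsTable("sales_bucketed"))

    # Compute a per-partition metric for reporting.
    spark.sql("""
        SELECT txn_date, region, SUM(amount) AS total_amount
        FROM sales_bucketed
        GROUP BY txn_date, region
    """).show()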

Confidential

SQL Developer

Responsibilities:

  • Created and modified SQL stored procedures, triggers, user-defined functions, views, and cursors.
  • Performed performance tuning of stored procedures utilizing Database Tuning Advisor and SQL Profiler.
  • Performed ad hoc reporting per clients' requests.
  • Built and modified SSIS packages utilizing BIDS 2012.
  • Built complex queries utilizing recursive CTEs (see the sketch after this list).
  • Documented both business and application processes utilizing Microsoft Visio.
  • Built SSIS configuration files to adhere to deployment processes.
  • Utilized SSIS script tasks extensively to load data from Excel files to database tables.
  • Utilized SVN as the main source control repository.
  • Modified a Web interface to adhere to business logic pertaining to the leasing of company assets.
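
For illustration, a minimal sketch of a recursive CTE of the kind referenced above; it uses Python's built-in sqlite3 so the example is self-contained (the actual work used SQL Server / T-SQL), and the org-chart schema is an illustrative assumption:

    # A minimal sketch of a recursive CTE, shown with sqlite3 for a
    # self-contained example; the org-chart schema is illustrative.
    import sqlite3

    conn = sqlite3.connect(":memory:")
    conn.executescript("""
        CREATE TABLE employees (id INTEGER PRIMARY KEY,
                                name TEXT, manager_id INTEGER);
        INSERT INTO employees VALUES
            (1, 'Ada', NULL), (2, 'Ben', 1), (3, 'Cam', 2), (4, 'Dee', 1);
    """)

    # Walk the reporting chain from the root down, tracking depth.
    rows = conn.execute("""
        WITH RECURSIVE chain(id, name, depth) AS (
            SELECT id, name, 0 FROM employees WHERE manager_id IS NULL
            UNION ALL
            SELECT e.id, e.name, c.depth + 1
            FROM employees e JOIN chain c ON e.manager_id = c.id
        )
        SELECT name, depth FROM chain ORDER BY depth, name
    """).fetchall()

    print(rows)  # [('Ada', 0), ('Ben', 1), ('Dee', 1), ('Cam', 2)]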
