
Sr. Data Scientist/Data Architect Resume


Santa Ana, CA

SUMMARY:

  • An accomplished, performance-driven professional with over 11 years of technical and managerial experience in the IT sector across technical consulting and the design of future-proof data architectures for telecom and financial institutions, covering Billing, IN and VAS, DWH, CRM, CC, HRIS, Big Data tools, RDBMS, ERP, POS, and data analysis.
  • Expertise in Business Intelligence, Data Warehousing, and Reporting tools in Financial, Trading & Telecom industry
  • 4 years of experience working with Tableau Desktop and Tableau Server across versions 8.x/9.x/10.x.
  • 4 years of work experience with statistical data analysis, including linear models, multivariate analysis, data mining, and machine learning techniques.
  • Hands-on experience with Python (2.x, 3.x) for developing analytic models and mapper/reducer solutions.
  • Hands-on experience in creating insightful Tableau worksheets, dashboards to generate segment analysis and financial forecasting reports.
  • Proficient in creating data models for 360-degree customer views and customer behavior analysis.
  • Strong skill set in PL/SQL, ETL, Business Intelligence, SQL Server Integration Services (SSIS), and SQL Server Reporting Services (SSRS).
  • Proficient in Data Cleansing and Data Validation checks during staging before loading the data into the Data warehouse.
  • Highly proficient at using PL/SQL for developing complex stored procedures, triggers, indexes, tables, user-defined procedures, relational database models, and SQL joins to support data manipulation and conversion tasks.
  • Highly skilled in creating, maintaining, and deploying Extract, Transform and Load (ETL) packages to Integration Services using the Project Deployment and Package Deployment models.
  • Outstanding interpersonal communication, problem solving, documentation and business analytical skills.

TECHNICAL SKILLS:

Data Analytics Tools/Programming: Python (NumPy, SciPy, pandas), MATLAB, Microsoft SQL Server, Oracle PL/SQL.

Data Visualization: Tableau, Visualization packages, Microsoft Excel.

Machine Learning Algorithms: Classifications, Regression, Clustering, Feature Engineering.

Data Modeling: Star Schema, Snow-Flake Schema.

Big Data Tools: Hadoop, MapReduce, SQOOP, Pig, Hive, NOSQL, Spark.

Databases: Oracle, SQL Server, Teradata.

ETL: Informatica, SSIS.

Others: Deep Learning, Text Mining, C, JavaScript, Shell Scripting, Spark MLlib, SPSS, Cognos.

PROFESSIONAL EXPERIENCE:

Confidential, Santa Ana, CA

Sr. Data Scientist/Data Architect

Responsibilities:

  • Involved in developing analytics solutions based on Machine Learning platform and demonstrated creative problem-solving approach and strong analytical skills.
  • Interacted with other departments to understand and identify data needs and requirements, and worked with other members of the IT organization to deliver QlikView-based data visualization and reporting solutions to address those needs.
  • Worked with Architecture team to get the metadata approved for the new data elements that are added for this project.
  • Data storyteller, mining data from different data sources such as SQL Server, Oracle, cube databases, web analytics, Business Objects, and Hadoop. Provided ad hoc analysis and reports to the executive-level management team.
  • Created various B2B predictive and descriptive analytics using R and Tableau.
  • Exploratory analysis and model building to develop predictive insights and visualize, interpret, report findings and develop strategic uses of data.
  • Utilized Spark, Scala, Hadoop, HBase, Kafka, Spark Streaming, MLlib, R, and a broad variety of machine learning methods including classification, regression, and dimensionality reduction.
  • Designed and provisioned the platform architecture to execute Hadoop and machine learning use cases under Cloud infrastructure.
  • Selected statistical algorithms (Two-Class Logistic Regression, Boosted Decision Tree, Decision Forest classifiers, etc.).
  • Used MLlib, Spark's Machine learning library to build and evaluate different models.
  • Involved in creating a Data Lake by extracting customers' Big Data from various data sources into Hadoop HDFS. This included data from Excel, flat files, Oracle, SQL Server, MongoDB, HBase, and Teradata, as well as log data from servers.
  • Created high-level ETL design documents and assisted ETL developers in the detailed design and development of ETL maps using Informatica.
  • Used R and SQL to create statistical algorithms involving multivariate regression, linear regression, logistic regression, PCA, random forest models, decision trees, and support vector machines for estimating the risks of welfare dependency.
  • Helped in migration and conversion of data from the Oracle database, preparing mapping documents and developing partial SQL scripts as required.
  • Generated ad-hoc SQL queries using joins, database connections and transformation rules to fetch data from legacy Oracle and SQL Server database systems.
  • Worked on predictive and what-if analysis using R on data from HDFS; successfully loaded files to HDFS and from HDFS into Hive.
  • Analyzed data and predicted end-customer behavior and product performance by applying machine learning algorithms using Spark MLlib.
  • Performed data mining using very complex SQL queries, discovered patterns, and used extensive SQL for data profiling/analysis to provide guidance in building the data model.
  • Created numerous dashboards in Tableau Desktop based on data collected from zonal and compass, blending data from MS Excel and CSV files with MS SQL Server databases.
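As an illustration of the logistic regression modeling listed above (not code from the engagement itself), a minimal logistic regression fitted by batch gradient descent can be sketched in plain Python; the data and names here are hypothetical, and plain Python stands in for the R/Spark MLlib stack:

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def train_logistic(X, y, lr=0.1, epochs=5000):
    """Fit weights plus a bias term by batch gradient descent on log-loss."""
    w = [0.0] * (len(X[0]) + 1)           # last entry is the bias
    for _ in range(epochs):
        grad = [0.0] * len(w)
        for xi, yi in zip(X, y):
            p = sigmoid(sum(wj * xj for wj, xj in zip(w, xi)) + w[-1])
            err = p - yi                  # derivative of log-loss w.r.t. the logit
            for j, xj in enumerate(xi):
                grad[j] += err * xj
            grad[-1] += err
        for j in range(len(w)):
            w[j] -= lr * grad[j] / len(X)
    return w

def predict(w, xi):
    """Return the 0/1 class for one feature vector."""
    return 1 if sigmoid(sum(wj * xj for wj, xj in zip(w, xi)) + w[-1]) >= 0.5 else 0

# Hypothetical 1-D data: label 1 roughly when the feature exceeds ~0.6
X = [[0.0], [1.0], [0.2], [1.5], [2.0], [-1.0]]
y = [0, 1, 0, 1, 1, 0]
w = train_logistic(X, y)
```

In practice a library implementation (scikit-learn, Spark MLlib, or R's `glm`) would be used; the sketch only shows the underlying mechanics.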

Environment: Python, R, Machine Learning, Teradata 14, Hadoop MapReduce, PySpark, Spark, Spark MLlib, Tableau, Informatica, Informatica MDM, SQL, Excel, CSV, Oracle, Cognos, SQL Server 2012, DB2, SPSS, T-SQL, PL/SQL, flat files, and XML.

Confidential, Miami, FL

Sr. Data Scientist/Data Architect

Responsibilities:

  • Cleaned and manipulated complex healthcare datasets to create the data foundation for further analytics and the development of key insights (MS SQL Server, R, Tableau, Excel).
  • Applied various machine learning algorithms and statistical models such as decision trees, regression models, neural networks, SVM, and clustering to identify volume, using the scikit-learn package in Python and MATLAB.
  • Utilized Apache Spark with Python to develop and execute Big Data analytics and machine learning applications; executed machine learning use cases under Spark ML and MLlib.
  • Led technical implementation of advanced analytics projects; defined the mathematical approaches, developed new and effective analytics algorithms, and wrote the key pieces of mission-critical source code implementing advanced machine learning algorithms utilizing Caffe, TensorFlow, Spark, MLlib, R, and other tools and languages as needed.
  • Performed K-means clustering, Multivariate analysis and Support Vector Machines in Python and R.
  • Professional Tableau user (Desktop, Online, and Server); experience with Keras and TensorFlow.
  • Involved in creating a Data Lake by extracting customers' Big Data from various data sources into Hadoop HDFS. This included data from Excel, flat files, Oracle, SQL Server, HBase, and Teradata, as well as log data from servers.
  • Created MapReduce jobs running over HDFS for data mining and analysis, loading and storing data with Pig scripts and R for MapReduce operations, and created various types of data visualizations using R and Tableau.
  • Worked on machine learning on large size data using Spark and MapReduce.
  • Performed data analysis by using Hive to retrieve the data from Hadoop cluster, SQL to retrieve data from Oracle database.
  • Developed Spark/Scala, Python for regular expression (regex) project in the Hadoop/Hive environment with Linux/Windows for big data resources.
  • Performed multinomial logistic regression, random forest, decision tree, and SVM to classify whether a package would be delivered on time on a new route.
  • Responsible for planning & scheduling new product releases and promotional offers.
  • Used pandas, numpy, seaborn, scipy, matplotlib, scikit-learn, NLTK in Python for developing various machine learning algorithms.
  • Worked on NOSQL databases like MongoDB, HBase.
  • Created Data Quality Scripts using SQL and Hive to validate successful data load and quality of the data. Created various types of data visualizations using Python and Tableau.
  • Worked on data pre-processing and cleaning the data to perform feature engineering and performed data imputation techniques for the missing values in the dataset using Python.
  • Extracted data from HDFS and prepared data for exploratory analysis using data munging.
  • Worked on Text Analytics, Naive Bayes, Sentiment analysis, creating word clouds and retrieving data from Twitter and other social networking platforms.
  • Worked on different data formats such as JSON, XML and performed machine learning algorithms in Python.
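The Naive Bayes sentiment analysis mentioned above can be sketched with a minimal multinomial Naive Bayes classifier in plain Python; this is an illustration only — the toy corpus is hypothetical, and in practice NLTK or scikit-learn would supply the classifier:

```python
import math
from collections import Counter, defaultdict

def train_nb(docs):
    """docs: list of (tokens, label). Multinomial NB with Laplace smoothing."""
    class_count = defaultdict(int)
    word_count = defaultdict(Counter)
    vocab = set()
    for tokens, label in docs:
        class_count[label] += 1
        word_count[label].update(tokens)
        vocab.update(tokens)
    total = sum(class_count.values())
    model = {}
    for label in class_count:
        denom = sum(word_count[label].values()) + len(vocab)
        model[label] = (
            math.log(class_count[label] / total),                          # log prior
            {w: math.log((word_count[label][w] + 1) / denom) for w in vocab},
            math.log(1.0 / denom),                                         # unseen-word fallback
        )
    return model

def classify(model, tokens):
    def score(label):
        prior, likes, unk = model[label]
        return prior + sum(likes.get(w, unk) for w in tokens)
    return max(model, key=score)

# Hypothetical toy corpus of labeled tweets
docs = [
    (["great", "service"], "pos"),
    (["love", "it"], "pos"),
    (["terrible", "service"], "neg"),
    (["hate", "it"], "neg"),
]
model = train_nb(docs)
```

Real tweet data would first be tokenized and cleaned (stop words, handles, URLs) before training.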

Environment: Python, MongoDB, JavaScript, SQL Server, HDFS, Pig, Hive, Oracle, DB2, Tableau, ETL (Informatica), SQL, T-SQL, Hadoop Framework, Spark SQL, Spark MLlib, NLP, MATLAB, HBase, R, PySpark, Tableau Desktop, Excel, Linux, Informatica MDM.

Confidential, Toronto, CA

BI Architect / Data Scientist (Tableau / Statistical Algorithms)

Responsibilities:

  • Building data warehouse architecture aligned with business requirements and the foundation logical model (OCDM).
  • Customizing the logical model, preparing data marts for presentation-layer tools, building hierarchies and reporting objects, and developing customized tables and aggregations.
  • Leading the design, development, and implementation of data models to support data warehouse, data delivery, and BI analytic applications.
  • Using expert knowledge of BI/DW technologies and business processes to provide in depth analysis of application/system and develop solutions to meet business/customer requirements.
  • Bringing expertise in new technologies, platforms, and approaches in data architecture and data analytics, including Big Data tools, unstructured data analysis, social network data analytics, and machine learning algorithms for analysis purposes.
  • Responsible for technical process documentation and standardization within team & perform design and code reviews and be accountable for all design decisions (Integrations with OCDM).
  • Establishing the overall strategy and vision for internal systems design and development, systems planning, programming computer operations, networks, data warehousing, architecture, data processing, data security, telecommunications, systems support, and data analysis to ensure alignment with corporate mission, objectives, and strategies.
  • Designing SSIS packages to extract, transform, and load (ETL) data across different platforms, validate the data, and archive the data files into the database.
  • Working on data cleaning, ensuring data quality, consistency, and integrity using pandas.
  • Performed decision tree classification to predict whether a customer would rent a car.
  • Using the Big Data tool Flume for log file data analysis.
  • Creating a machine learning model for sentiment analysis of tweet messages in R.
  • Creating a machine learning forecasting model for 6-12 month customer prediction using time series / ARIMA models in R.
  • Designing, developing, and maintaining daily and monthly summary reports in Tableau.
  • Creating a machine learning clustering model in R for marketing campaign purposes.
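The clustering work above can be illustrated with a minimal k-means (Lloyd's algorithm) sketch. This is not the project code — plain Python stands in for R, the initialization is a naive first-k choice, and the 2-D points are hypothetical:

```python
def kmeans(points, k, iters=100):
    """Minimal Lloyd's algorithm on 2-D points.
    Uses the first k points as initial centroids (naive init, for illustration)."""
    def dist2(p, c):
        return (p[0] - c[0]) ** 2 + (p[1] - c[1]) ** 2

    centroids = [tuple(p) for p in points[:k]]
    for _ in range(iters):
        # assignment step: each point joins its nearest centroid's cluster
        clusters = [[] for _ in range(k)]
        for p in points:
            clusters[min(range(k), key=lambda c: dist2(p, centroids[c]))].append(p)
        # update step: move each centroid to the mean of its cluster
        new = [
            (sum(p[0] for p in cl) / len(cl), sum(p[1] for p in cl) / len(cl))
            if cl else centroids[i]
            for i, cl in enumerate(clusters)
        ]
        if new == centroids:  # converged
            break
        centroids = new
    labels = [min(range(k), key=lambda c: dist2(p, centroids[c])) for p in points]
    return centroids, labels

# Hypothetical customer features forming two well-separated groups
pts = [(0.0, 0.0), (0.0, 1.0), (1.0, 0.0), (10.0, 10.0), (10.0, 11.0), (11.0, 10.0)]
centroids, labels = kmeans(pts, 2)
```

For campaign segmentation at scale, a library implementation (R's `kmeans`, scikit-learn, or Spark MLlib) with k-means++ initialization would be preferred.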

Environment: Oracle, Python, Hadoop, Spark, Machine Learning (k-means, UBCF, Naive Bayes, SVM, decision trees, text mining, sentiment analysis, logistic regression, etc.), PL/SQL, SQLite, MySQL, DBA, Hive, Pig, HBase, Informatica, Sqoop, Agile, Java frameworks, Talend, Mapper & Reducer, C, R, MS SQL Server, SSIS, SSRS, Tableau.

Confidential

Data Architect (Big Data Tools / Statistical Algorithms)

Responsibilities:

  • Skilled in data science solutions, database design (RDBMS), billing systems, system migrations/integrations, and enterprise-wide implementations of Big Data tools using Hadoop.
  • Spearheading the technical team for product delivery services, with accountability for customer service management, SoW development, proposal development, competitive bidding, and negotiations.
  • Responsible for business analysis, P&L, client satisfaction, and overall delivery execution of complex information and technology services and reporting tools.
  • Analyzing market and sales strategies, deal requirements, and product and plan development from potential and financial aspects.
  • Preparing RFPs and business cases for products and services; closing new business deals by coordinating requirements, developing and negotiating contracts, and integrating contract requirements.
  • Formulating the marketing strategy for the organization to ensure an enriched customer portfolio and build differentiation for the brand; delivering ICT cloud presentations and supporting targeted sales opportunities.
  • Preparing Enterprise Services HW & SW OPEX and CAPEX plans; developing marketing product plans, segmenting subscribers using clustering (machine learning) and probability methods, and creating promotional products for subscribers.
  • Building 360-degree analytical data models of subscribers to support marketing activities; developing marketing reports and detailed analyses to measure promotion performance, using machine learning for data analytics.
  • Developing monthly consolidated subscriber reports and roaming & interconnect revenue and data usage reports, and creating ETL for all streams to the DWH and other vendors' applications.
  • Accountable for planning, managing, and overseeing all execution teams for capacity growth, and coordinating with vendors to ensure reduced operating expenditure and an improved fiscal position; strong team communication and reporting skills.
  • Designing HDFS storage capacity for data analysis; managing the installation, configuration, and commissioning of switches, OS, and firewalls; and monitoring systems availability, business continuity, and DWH performance as a measure of preventive maintenance.
  • Leading, managing, and delivering data modelling and DWH establishment, and creating successful machine learning algorithms.
  • Managing and reducing risks, ensuring change management, and applying control procedures by improving the quality of services offered by enterprise services; risk management and IT auditing of security services.
  • Implementing data packages on the Huawei platform integrated with the IN API platform (Ericsson); implementing the Ericsson IN charging system, creating tariff plans for IN and integrating them with the billing system, and managing inbound and outbound roaming with interconnect services.
  • Started up the IT department and implemented IN, EMM, EMA, billing, provisioning, mediation, rating/charging, cloud-based CRM, and Call Center/IVR, with deployment of IT infrastructure commissioning Cisco switches, routers, Wi-Fi, firewalls, and HP hardware; maintained daily operations, rate plans, the call center, and the DWH; and created IT operations processes and procedures to meet the required IT standards and policies (ITIL).
  • Managing projects and multitasking; retaining clients through relationship management, strategic planning, and fostering good working relationships, especially with senior management, to facilitate flawless execution of projects.
  • Continuously gathering knowledge of competitors and designing strategies to effectively position the company's services against them; responsible for service augmentation.
  • Developed and implemented several types of financial reports (Income Statement, Profit & Loss Statement, EBIT, BI reports).
  • Developed reports (gross margin, revenue based on geographic regions, profitability based on web sales and smartphone app sales); ran the reports every month and distributed them to the respective departments through mailing server subscriptions and a SharePoint server portal.
  • Hybris/SAP billing software integration: Subscription Order Management, Invoicing, Document Management, Customer Financial Management, and consolidated CDRs.
  • Implementing IRM/MPOS, smart app, cloud CRM, and ICT Huawei integration, with staging and mapping in the IN DWH.
  • Implementing BI DWH and dashboards; Informatica ETL integration with the Oracle DWH.
  • Deploying VMAX 500 TB DR and VMware, HP, and Cisco infrastructure; Itrac integration with VS Oracle databases.
  • STK dual SIM cards, a mobile money banking solution, and scratch cards using Oracle integration and ODI ETLs.
  • Using Oracle database integration for the convergent billing system and CRM integration with IN and ERP (GP Dynamics).
  • Creating CDR streaming to the DWH using ODI ETLs and a full-fledged integration between the IN and billing systems.
  • Implementing an Ericsson GSM start-up: IN, EMM, EMA charging system, roaming, interconnect, Call Center/IVR, HLR, ERP, and GP.
  • Using machine learning (k-means) for clustering and segmenting subscribers to run campaign management.
  • Creating CDR streaming data analysis for reporting purposes using Tableau and Big Data tools: Sqoop, HBase, and Hadoop/HDFS.
  • Fraud detection using HDFS Big Data tools, with Spark and Spark SQL data streaming for real-time analysis.
  • Revenue forecasting using a fuzzy clustering model for customer segmentation analysis.
  • Building a 360-degree subscriber data model for data analytics purposes using Hive over HDFS.
  • Developing subscriber behavior data models using Big Data analytics tools (R) and a logistic regression model for data analysis.
  • Developed churn prediction for subscribers using ML and RGS; extended the BI DWH with a new data mart.
  • Creating association rules with the Apriori algorithm using Big Data tools in R for subscriber behavior analysis.
  • Measuring ROI of activities (ATL/BTL), segmenting the subscriber base, and understanding customer behavior using hierarchical clustering machine learning algorithms.
  • Building real-time reporting using Talend Big Data tools for network availability.
  • Reconciling subscribers and fetching network node failures using Big Data streaming tools (mapper & reducer / HDFS).
  • Creating video streaming content packages using Big Data tools (Kafka) and Python.
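The Apriori association-rule mining mentioned above (frequent-itemset step only) can be sketched in plain Python rather than R; the subscriber "baskets" below are hypothetical, and a real deployment would use R's arules or a Spark implementation:

```python
def apriori(transactions, min_support):
    """Frequent-itemset mining (Apriori): grow candidate itemsets level by
    level, keeping only those whose support meets min_support."""
    tx = [frozenset(t) for t in transactions]
    n = len(tx)

    def support(itemset):
        return sum(1 for t in tx if itemset <= t) / n

    frequent = {}
    # level 1: all single items that are frequent enough
    current = {frozenset([i]) for t in tx for i in t}
    current = {s for s in current if support(s) >= min_support}
    k = 1
    while current:
        for s in current:
            frequent[s] = support(s)
        # join step: combine frequent k-itemsets into (k+1)-item candidates
        candidates = {a | b for a in current for b in current if len(a | b) == k + 1}
        current = {c for c in candidates if support(c) >= min_support}
        k += 1
    return frequent

# Hypothetical per-subscriber service usage records
baskets = [["voice", "data"], ["voice", "data", "sms"], ["voice", "sms"], ["data"]]
itemsets = apriori(baskets, min_support=0.5)
```

Association rules (e.g. voice → data) would then be derived from these frequent itemsets by filtering on confidence.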

Environment: Oracle, Python, Hadoop, Spark, Machine Learning (k-means, UBCF, Naive Bayes, SVM, decision trees, text mining, sentiment analysis, logistic regression), PL/SQL, SQLite, MySQL, DBA, Hive, Pig, HBase, Informatica, Sqoop, Agile, Java frameworks, Talend, Mapper & Reducer, C, R, MS SQL Server, SSIS, SSRS, Tableau.
