Sr. Data Scientist- Big Data Architect Resume

Middle Town, NJ


  • An accomplished, performance-driven professional with over 14 years of technical and managerial experience in the IT sector across technical consulting and designing future-proof data architectures for telecom and financial institutions, covering Billing, IN and VAS, DWH, CRM, CC, HRIS, Big Data tools, RDBMS, ERP, POS, and data analysis.
  • Expertise in Business Intelligence, Data Warehousing, and Reporting tools in Financial, Trading & Telecom industry
  • 4 years of experience working with Tableau Desktop and Tableau Server across versions 8.x, 9.x, and 10.x.
  • 4 years of experience with statistical data analysis, including linear models, multivariate analysis, data mining, and machine learning techniques.
  • Expertise in statistical data analysis such as linear models, statistical analysis, and machine learning techniques.
  • Hands-on experience using Python (2.x, 3.x) to develop analytic models and mapper and reducer solutions.
  • Hands-on experience in creating insightful Tableau worksheets and dashboards to generate segment analysis and financial forecasting reports.
  • Proficient in data modelling for a 360-degree customer view and customer behavior analysis.
  • Strong skill set in PL/SQL, ETL/ODI, Business Intelligence, SQL Server Integration Services (SSIS), and SQL Server Reporting Services (SSRS).
  • Proficient in data cleansing and data validation checks during staging before loading the data into the data warehouse.
  • Highly proficient in using PL/SQL to develop complex stored procedures, triggers, indexes, tables, user-defined procedures, relational database models, and SQL joins to support data manipulation and conversion tasks.
  • Highly skilled in creating, maintaining, and deploying Extract, Transform, and Load (ETL, Oracle) packages to the Integration Server using Project Deployment and Package Deployment models.
  • Outstanding interpersonal communication, problem solving, documentation and business analytical skills.


Data Analytics Tools/Programming: Python (NumPy, SciPy, pandas), MATLAB, Microsoft SQL Server, Oracle PL/SQL.

Data Visualization: Tableau, visualization packages, Microsoft Excel.

Machine Learning Algorithms: Classification, Regression, Clustering, Feature Engineering.

Data Modeling & Tools: Star Schema, Snowflake Schema.

Big Data Tools: Hadoop, MapReduce, Sqoop, Pig, Hive, NoSQL, PySpark.

Databases: Oracle, SQL Server, Azure.

ETL: Informatica, SSIS, ODI.

Others: Deep Learning, Text Mining, C, JavaScript, Shell Scripting, PySpark, MLlib, Cognos, CNN, RNN, LSTM, Reinforcement Learning, TensorFlow, Agile, SPSS, AWS, Unix & Linux.


  • Business Intelligence
  • Machine Learning
  • Clarity
  • Spark MLLib
  • Predictive Analytics
  • Tableau
  • Business Objects
  • Python
  • SQL, PL/SQL, SQL Server
  • Big Data & Hadoop
  • ERP


Confidential, Middle Town, NJ

Sr. Data Scientist- Big Data Architect


  • Developed analytics solutions on a machine learning platform, including a chatbot, and demonstrated a creative problem-solving approach and strong analytical skills.
  • Interacted with the development and support teams to understand and identify data needs and requirements, and worked with them and the IT organization to deliver QlikView-based data visualization and reporting solutions that address those needs.
  • Worked with the Architecture team to get the metadata approved for the new data elements added for this project.
  • Data storyteller: mined data from different data sources such as SQL Server, Oracle, cube databases, web analytics, Business Objects, and Hadoop. Provided ad hoc analysis and reports to the executive-level management team.
  • Created various B2B predictive and descriptive analytics using R and Tableau.
  • Performed exploratory analysis and model building to develop predictive insights; visualized, interpreted, and reported findings; and developed strategic uses of data.
  • Utilized Spark, Scala, Hadoop, HBase, Kafka, Spark Streaming, MLlib, and R with a broad variety of machine learning methods, including classification, regression, and dimensionality reduction.
  • Designed and provisioned the platform architecture to execute Hadoop and machine learning use cases on cloud infrastructure.
  • Selected statistical algorithms (linear regression, logistic regression, decision tree, decision forest classifiers, etc.).
  • Used MLlib, Spark's machine learning library, to build and evaluate different models.
  • Involved in creating a data lake by extracting the customer's Big Data from various data sources into Hadoop HDFS. This included data from Excel, flat files, Oracle, SQL Server, MongoDB, and HBase, as well as log data from servers.
  • Created a high-level ETL design document and assisted ETL developers in the detailed design and development of ETL maps using Informatica, Oracle/ODI, PowerDesigner, and ER Studio Data Architect.
  • Used R and SQL to create statistical algorithms involving multivariate regression, linear regression, logistic regression, PCA, random forest models, decision trees, and support vector machines for estimating the risks of welfare dependency.
  • Helped in the migration and conversion of data from the Oracle database, preparing mapping documents and developing partial SQL scripts as required.
  • Applied data governance to ensure high quality of existing data throughout the complete lifecycle: availability, usability, integrity, and security.
  • Worked on predictive and what-if analysis using R on data from HDFS; successfully loaded files to HDFS and from HDFS into Hive.
  • Generated ad hoc SQL queries using joins, database connections, and transformation rules to fetch data from legacy Oracle and SQL Server database systems.
  • Analyzed data and predicted end-customer behaviors and product performance by applying machine learning algorithms using Spark MLlib.
  • Performed data mining using very complex SQL queries to discover patterns, used extensive SQL for data profiling and analysis to provide guidance in building the data model, and used Power BI for data analysis.
  • Created numerous dashboards in Tableau Desktop based on the data collected from zonal and compass, while blending data from MS Excel and CSV files with MS SQL Server databases.

Environment: Python, R, Machine Learning, Teradata 14, Hadoop MapReduce, PySpark, Spark, Spark MLlib, Tableau, ERWIN, Informatica, SQL, Excel, CSV, Oracle, Azure, ODI, Informatica MDM, Cognos, Denodo, SQL Server 2012, DB2, T-SQL, PL/SQL, Flat Files, XML, TensorFlow, CNN, RNN, LSTM, WorkFusion, RapidMiner and Process Mining tools, RiveScript, Reinforcement Learning, SPSS, AWS, Agile, Master Data Management, Unix & Linux.

Confidential, Livonia MI

Sr. Data Scientist/Big Data Architect


  • Cleaned and manipulated complex datasets to create the data foundation for further analytics and the development of key insights (MS SQL Server, R, Tableau, Excel).
  • Applied various machine learning algorithms and statistical models, such as decision trees, regression models, neural networks, SVM, and clustering, to identify volume using the scikit-learn package in Python and MATLAB.
  • Utilized Apache Spark with Python to develop and execute Big Data analytics and machine learning applications; executed machine learning use cases under Spark ML and MLlib.
  • Led technical implementation of advanced analytics projects, defined the mathematical approaches, developed new and effective analytics algorithms, and wrote the key pieces of mission-critical source code implementing advanced machine learning algorithms using Caffe, TensorFlow, Spark, MLlib, R, and other tools and languages as needed.
  • Performed K-means clustering, multivariate analysis, and support vector machines in Python and R.
  • Professional Tableau user (Desktop, Online, and Server); experience with Keras and TensorFlow.
  • Involved in creating a data lake by extracting the customer's Big Data from various data sources into Hadoop HDFS. This included data from Excel, flat files, Oracle, SQL Server, and HBase, as well as log data from servers.
  • Created MapReduce jobs running over HDFS for data mining and analysis using R, loaded and stored data with Pig scripts and R for MapReduce operations, and created various types of data visualizations using R and Tableau.
  • Worked on machine learning on large-scale data using Spark and MapReduce.
  • Performed data analysis using Hive to retrieve data from the Hadoop cluster and SQL to retrieve data from the Oracle database, along with ER Studio Data Architect.
  • Developed Spark/Scala and Python code for a regular expression (regex) project in the Hadoop/Hive environment with Linux/Windows for big data resources.
  • Performed multinomial logistic regression, random forest, decision tree, and SVM to classify whether a package will be delivered on time for the new route.
  • Responsible for planning & scheduling new product releases and promotional offers.
  • Used pandas, NumPy, seaborn, SciPy, matplotlib, scikit-learn, and NLTK (Natural Language Toolkit) in Python for developing various machine learning algorithms.
  • Worked on NoSQL databases such as MongoDB and HBase, used OBIEE, and performed data modelling using ERWIN.
  • Created data quality scripts using SQL and Hive to validate successful data loads and the quality of the data. Created various types of data visualizations using Python and Tableau.
  • Worked on data pre-processing and cleaning to perform feature engineering, and applied data imputation techniques for the missing values in the dataset using Python.
  • Key focus on data governance: availability, usability, integrity, and security.
  • Extracted data from HDFS and prepared data for exploratory analysis using data munging.
  • Worked on text analytics, Naive Bayes, sentiment analysis, creating word clouds, and retrieving data from Twitter and other social networking platforms.
  • Worked on different data formats such as JSON and XML, and applied machine learning algorithms in Python.

Environment: Python, MongoDB, JavaScript, SQL Server, HDFS, Pig, Hive, Oracle/ODI, DB2, Tableau, ETL (Informatica), SQL, T-SQL, Hadoop Framework, Spark SQL, Spark MLlib, Denodo, NLP, MATLAB, ERWIN, HBase, R, PySpark, Power BI, AWS, Tableau Desktop, Azure, RapidMiner and Process Mining tools, WorkFusion, SPSS, ER Studio & PowerDesigner Data Architect, Excel, Agile, Master Data Management (MDM), Unix & Linux.

Confidential, Dubai, UAE

Sr. Data Architect


  • Skilled in data science solutions, database design (RDBMS), billing systems, system migrations/integrations, and enterprise-wide implementations of Big Data tools using Hadoop.
  • Spearheading the technical team for product delivery services, with accountability for customer service management, SoW development, proposal development, competitive bidding, and negotiations.
  • Responsible for business analysis, P&L, client satisfaction, and overall delivery execution of complex information and technology services and reporting tools.
  • Analyzing market and sales strategies, deal requirements, and product and plan development from potential and financial aspects, along with eCommerce analysis.
  • Preparing RFPs and business cases for products and services, and closing new business deals by coordinating requirements, developing and negotiating contracts, and integrating contract requirements.
  • Formulating the marketing strategy for the organization to ensure an enriched customer portfolio, building differentiation for the brand, delivering ICT cloud presentations, and supporting targeted sales opportunities.
  • Preparing Enterprise Services HW and SW OPEX and CAPEX, developing marketing product plans, performing segmentation using clustering machine learning and probability methods, and creating promotional products for subscribers.
  • Building 360-degree analytical data models of subscribers to support marketing activities, developing marketing reports and detailed analyses to measure the performance of promotions, and using machine learning for data analytics.
  • Developing monthly consolidated subscriber reports and roaming and interconnect revenue and data usage reports, and creating ETL (Oracle/ODI) for all streams to the DWH and other vendors' applications.
  • Accountable for planning, managing, and overseeing all execution teams for capacity growth, and coordinating with vendors to ensure reduced operating expenditure and an improved fiscal position, with strong team communication and reporting skills.
  • Designing HDFS storage capacity for data analysis; managing the installation, configuration, and commissioning of switches, OS, and firewalls; and monitoring system availability, business continuity, and DWH performance as a measure of preventive maintenance.
  • Leading, managing, and delivering data modelling using ERWIN and Oracle DWH, and establishing and creating successful machine learning algorithms.
  • Managing and reducing risks, ensuring change management, and applying control procedures by improving the quality of services offered by enterprise services, risk management, and IT auditing of security services.
  • Implementing data packages on the Huawei platform integrated with the IN API platform/Ericsson; implementing the Ericsson IN charging system; creating tariff plans for IN and integrating them with the billing system; managing inbound and outbound roaming with interconnect services.
  • Started up the IT department and implemented IN, EMM, EMA, and Billing, provisioning, mediation, rating/charging, cloud-based CRM, and Call Center/IVR, with deployment of IT infrastructure (commissioning Cisco switches, routers, Wi-Fi, firewalls, and HP HW); maintained overall daily operations, rate plans, call center, and DWH; and created IT operating processes and procedures to meet the required IT standards and policies (ITIL).
  • Managing projects, multitasking, and retaining clients through relationship management, strategic planning, and fostering a good working relationship, especially with senior management, to facilitate flawless execution of projects.
  • Continuously gathering knowledge of competitors and designing strategies to effectively position the company's services against them; responsible for services augmentation.
  • Developed and implemented several types of financial reports (Income Statement, Profit & Loss Statement, EBIT, BI reports).
  • Developed reports (gross margin, revenue based on geographic regions, profitability based on web sales and smartphone app sales), ran the reports every month, and distributed them to the respective departments through mailing server subscriptions and the SharePoint server portal.
  • Hybris/SAP billing software integration, Subscription Order Management, Invoicing, Document Management, Customer Financial Management, and consolidated CDRs.
  • Implementing IRM/MPOS, Smart app, CRM cloud, and ICT Huawei integration, with staging and mapping in the IN DWH.
  • Implementing BI DWH, OBIEE, and dashboards, with ETL Informatica integration with the Oracle DWH and data modelling.
  • Deploying VMAX 500 TB DR and VMware, HP, and Cisco infrastructure, and Itrac integration with VS Oracle databases.
  • STK dual SIM cards, mobile money banking solution, and scratch cards using Oracle integration and ODI ETLs.
  • Using Oracle database integration for the Convergent Billing system and CRM integration with IN, ERP GP Dynamic.
  • Creating CDR streaming to the DWH using ODI ETLs, and a full-fledged integration solution between IN and the billing system.
  • Implementing Ericsson GSM start-up, IN, EMM, EMA charging system, roaming, interconnect, call center/IVR, HLR, ERP, and GP.
  • Using k-means machine learning to cluster and segment subscribers for campaign management.
  • Creating CDR streaming data analysis for reporting purposes using Tableau and Big Data tools: Sqoop, HBase, and Hadoop/HDFS.
  • Detecting fraud using HDFS Big Data tools, Spark, and Spark SQL data streaming for real-time analysis.
  • Forecasting revenue using a fuzzy clustering module for customer segmentation analysis.
  • Building a 360-degree subscriber view data model for data analytics using Hive over HDFS.
  • Developing subscriber behavior data models in Oracle and ERWIN, using Big Data analytics tools, R, and an ML logistic regression module for data analysis.
  • Developed churn prediction using ML on RGS subscribers; built out the BI DWH with a new data mart.
  • Creating association rules with the Apriori algorithm using Big Data tools in R for subscriber behavior analysis.
  • Measuring ROI of ATL/BTL activities, segmenting base subscribers, and understanding customer behavior using hierarchical clustering.
  • Building real-time reporting using Talend Big Data tools for network availability.
  • Reconciling subscribers and fetching network node failures using Big Data streaming mapper and reducer tools over HDFS.
  • Creating content package video streaming using Big Data tools, Kafka, and Python.

Environment: Oracle, Python, Hadoop, Spark, Machine Learning (K-means, UBCF, Naive Bayes, SVM, Decision Tree, text mining tools, Process Mining tools, sentiment analysis, logistic regression), PL/SQL, SQLite, MySQL, DBA, Hive, Pig, HRIS, HBase, Informatica, Sqoop, Agile, Java frameworks, BI, OBIEE, Talend, Mapper & Reducer, C, R, MS SQL Server, SSIS, SSRS, Unix & Linux, Tableau.
