
Sr. Data Scientist- Big Data Architect Resume

Middletown, NJ

SUMMARY

  • Accomplished, performance-driven professional with over 14 years of technical and managerial experience in the IT sector, spanning technical consulting and the design of future-proof data architectures for telecom and financial institutions, covering Billing, IN and VAS, DWH, CRM, CC, HRIS, Big Data tools, RDBMS, ERP, POS, and data analysis.
  • Expertise in Business Intelligence, Data Warehousing, and reporting tools across the financial, trading, and telecom industries.
  • Four years' experience with Tableau Desktop and Tableau Server across versions 8.x, 9.x, and 10.x.
  • Four years of experience with statistical data analysis, including linear models, multivariate analysis, data mining, and machine learning techniques.
  • Hands-on experience with Python (2.x, 3.x) for developing analytic models and MapReduce mapper/reducer solutions.
  • Hands-on experience in creating insightful Tableau worksheets, dashboards to generate segment analysis and financial forecasting reports.
  • Proficient in data modeling for 360-degree customer views and customer-behavior analysis.
  • Strong skill set in PL/SQL, ETL/ODI, Business Intelligence, SQL Server Integration Services (SSIS), and SQL Server Reporting Services (SSRS).
  • Proficient in Data Cleansing and Data Validation checks during staging before loading the data into the Data warehouse.
  • Highly proficient in PL/SQL for developing complex stored procedures, triggers, indexes, tables, user-defined procedures, relational database models, and SQL joins to support data manipulation and conversion tasks.
  • Highly skilled in creating, maintaining, and deploying Extract, Transform, and Load (ETL) packages for Oracle to the integration server using the Project Deployment and Package Deployment models.
  • Outstanding interpersonal communication, problem solving, documentation and business analytical skills.
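The mapper/reducer experience cited above follows the standard MapReduce pattern. A minimal pure-Python sketch of that pattern (a word count; the dataset and function names are illustrative, not from any engagement described here):

```python
from collections import defaultdict
from itertools import chain

def mapper(line):
    # Map phase: emit a (word, 1) pair for every token, the way a
    # Hadoop Streaming mapper writes key/value records.
    return [(word.lower(), 1) for word in line.split()]

def reducer(pairs):
    # Reduce phase: after the shuffle/sort groups records by key,
    # sum the counts for each word.
    totals = defaultdict(int)
    for key, count in pairs:
        totals[key] += count
    return dict(totals)

lines = ["big data big tools", "data tools"]
counts = reducer(chain.from_iterable(mapper(line) for line in lines))
# counts == {"big": 2, "data": 2, "tools": 2}
```

In a real Hadoop job the mapper and reducer run as separate processes over HDFS splits; the in-memory chain here stands in for the shuffle stage.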

TECHNICAL SKILLS

Data Analytics Tools/Programming: Python (NumPy, SciPy, pandas), MATLAB, Microsoft SQL Server, Oracle PL/SQL.

Data Visualization: Tableau, Visualization packages, Microsoft Excel.

Machine Learning Algorithms: Classifications, Regression, Clustering, Feature Engineering.

Data Modeling & Tools: Star Schema, Snowflake Schema.

Big Data Tools: Hadoop, MapReduce, Sqoop, Pig, Hive, NoSQL, PySpark.

Databases: Oracle, SQL Server, Azure.

ETL: Informatica, SSIS, ODI.

Others: Deep Learning, Text Mining, C, JavaScript, Shell Scripting, PySpark, MLlib, Cognos, CNN, RNN, LSTM, Reinforcement Learning, TensorFlow, Agile, SPSS, AWS, Unix & Linux

PROFESSIONAL EXPERIENCE

Confidential, Middletown, NJ

Sr. Data Scientist- Big Data Architect

Responsibilities:

  • Developed analytics solutions on a machine learning platform, including a chatbot, demonstrating a creative problem-solving approach and strong analytical skills.
  • Interacted with the development and support teams to understand and identify data needs and requirements, and worked with them across the IT organization to deliver QlikView-based data visualization and reporting solutions addressing those needs.
  • Worked with Architecture team to get the metadata approved for the new data elements that are added for this project.
  • Data storyteller; mined data from different data sources such as SQL Server, Oracle, cube databases, web analytics, Business Objects, and Hadoop. Provided ad hoc analysis and reports to the executive-level management team.
  • Created various B2B predictive and descriptive analytics using R and Tableau.
  • Performed exploratory analysis and model building to develop predictive insights; visualized, interpreted, and reported findings and developed strategic uses of data.
  • Utilized Spark, Scala, Hadoop, HBase, Kafka, Spark Streaming, MLlib, and R, along with a broad variety of machine learning methods including classification, regression, and dimensionality reduction.
  • Designed and provisioned the platform architecture to execute Hadoop and machine learning use cases under Cloud infrastructure.
  • Selected statistical algorithms (linear and logistic regression, decision tree, and decision forest classifiers, etc.).
  • Used MLlib, Spark's Machine learning library to build and evaluate different models.
  • Involved in creating a Data Lake by extracting customers' big data from various data sources into Hadoop HDFS. This included data from Excel, flat files, Oracle, SQL Server, MongoDB, and HBase, as well as log data from servers.
  • Created the high-level ETL design document and assisted ETL developers with the detailed design and development of ETL maps using Informatica, Oracle/ODI, PowerDesigner, and ER Studio Data Architect.
  • Used R and SQL to create statistical algorithms involving multivariate regression, linear regression, logistic regression, PCA, random forest models, decision trees, and support vector machines for estimating the risks of welfare dependency.
  • Helped in migration and conversion of data from the Oracle database, preparing mapping documents and developing partial SQL scripts as required.
  • Applied data governance to ensure high quality of existing data throughout its complete lifecycle: availability, usability, integrity, and security.
  • Worked on predictive and what-if analysis using R from HDFS and successfully loaded files to HDFS and loaded from HDFS to HIVE.
  • Generated ad-hoc SQL queries using joins, database connections and transformation rules to fetch data from legacy Oracle and SQL Server database systems.
  • Analyzed data and predicted end-customer behaviors and product performance by applying machine learning algorithms using Spark MLlib.
  • Performed data mining using very complex SQL queries, discovered patterns, and used extensive SQL for data profiling/analysis to provide guidance in building the data model; used Power BI for data analysis.
  • Created numerous dashboards in Tableau Desktop based on the data collected from Zonal and Compass, blending data from MS Excel and CSV files with MS SQL Server databases.

Environment: Python, R, Machine Learning, Teradata 14, Hadoop MapReduce, PySpark, Spark, Spark MLlib, Tableau, ERwin, Informatica, SQL, Excel, CSV, Oracle, Azure, ODI, Informatica MDM, Cognos, Denodo, SQL Server 2012, DB2, T-SQL, PL/SQL, flat files, XML, TensorFlow, CNN, RNN, LSTM, WorkFusion, RapidMiner and process mining tools, RiveScript, Reinforcement Learning, SPSS, AWS, Agile, Master Data Management, Unix & Linux
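The classification model building described in this role (logistic regression and related algorithms evaluated with Spark MLlib) can be sketched in plain Python. This is an illustrative stand-in, not the original Spark code; the toy dataset and hyperparameters are hypothetical:

```python
import math

def train_logistic(X, y, lr=0.5, epochs=2000):
    # Batch gradient descent on the logistic (cross-entropy) loss.
    n = len(X[0])
    w, b = [0.0] * n, 0.0
    for _ in range(epochs):
        gw, gb = [0.0] * n, 0.0
        for xi, yi in zip(X, y):
            z = sum(wj * xj for wj, xj in zip(w, xi)) + b
            p = 1.0 / (1.0 + math.exp(-z))  # sigmoid probability
            err = p - yi                    # gradient of the loss w.r.t. z
            for j in range(n):
                gw[j] += err * xi[j]
            gb += err
        # Average the gradients over the batch before stepping.
        w = [wj - lr * gj / len(X) for wj, gj in zip(w, gw)]
        b -= lr * gb / len(X)
    return w, b

def predict(w, b, xi):
    # Classify by the sign of the linear score (0.5 probability threshold).
    z = sum(wj * xj for wj, xj in zip(w, xi)) + b
    return 1 if z >= 0 else 0

# Toy, linearly separable data: two features per example.
X = [[0, 0], [0, 1], [1, 0], [1, 1], [2, 2], [2, 3], [3, 2], [3, 3]]
y = [0, 0, 0, 0, 1, 1, 1, 1]
w, b = train_logistic(X, y)
```

In Spark MLlib the same fit would be distributed across executors; the gradient arithmetic per example is identical.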

Confidential, Livonia, MI

Sr. Data Scientist/Big Data Architect

Responsibilities:

  • Cleaned and manipulated complex datasets to create the data foundation for further analytics and the development of key insights (MS SQL Server, R, Tableau, Excel).
  • Applied various machine learning algorithms and statistical modeling techniques (decision trees, regression models, neural networks, SVM, clustering) to identify volume, using the scikit-learn package in Python and MATLAB.
  • Utilized Apache Spark with Python to develop and execute big data analytics and machine learning applications; executed machine learning use cases under Spark ML and MLlib.
  • Led technical implementation of advanced analytics projects; defined the mathematical approaches, developed new and effective analytics algorithms, and wrote key pieces of mission-critical source code implementing advanced machine learning algorithms using Caffe, TensorFlow, Spark, MLlib, R, and other tools and languages as needed.
  • Performed K-means clustering, Multivariate analysis and Support Vector Machines in Python and R.
  • Professional Tableau user (Desktop, Online, and Server); experience with Keras and TensorFlow.
  • Involved in creating a Data Lake by extracting customers' big data from various data sources into Hadoop HDFS. This included data from Excel, flat files, Oracle, SQL Server, and HBase, as well as log data from servers.
  • Created MapReduce jobs running over HDFS for data mining and analysis using R; loaded and stored data with Pig scripts and R for MapReduce operations; and created various types of data visualizations using R and Tableau.
  • Worked on machine learning over large-scale data using Spark and MapReduce.
  • Performed data analysis using Hive to retrieve data from the Hadoop cluster and SQL to retrieve data from the Oracle database, alongside ER Studio Data Architect.
  • Developed Spark/Scala and Python code for a regular expression (regex) project in the Hadoop/Hive environment on Linux/Windows for big data resources.
  • Performed multinomial logistic regression, random forest, decision tree, and SVM to classify whether a package would be delivered on time on a new route.
  • Responsible for planning & scheduling new product releases and promotional offers.
  • Used pandas, numpy, seaborn, scipy, matplotlib, scikit-learn, NLTK (Natural Language Toolkit ) in Python for developing various machine learning algorithms.
  • Worked on NoSQL databases such as MongoDB and HBase, using OBIEE, and performed data modeling using ERwin.
  • Created data quality scripts using SQL and Hive to validate successful data loads and the quality of the data. Created various types of data visualizations using Python and Tableau.
  • Worked on data pre-processing and cleaning to perform feature engineering, and applied data imputation techniques for missing values in the dataset using Python.
  • Key focus on data governance: availability, usability, integrity, and security.
  • Extracted data from HDFS and prepared data for exploratory analysis using data munging.
  • Worked on text analytics, Naive Bayes, sentiment analysis, creating word clouds, and retrieving data from Twitter and other social networking platforms.
  • Worked on different data formats such as JSON and XML and performed machine learning algorithms in Python.

Environment: Python, MongoDB, JavaScript, SQL Server, HDFS, Pig, Hive, Oracle/ODI, DB2, Tableau, ETL (Informatica), SQL, T-SQL, Hadoop framework, Spark SQL, Spark MLlib, Denodo, NLP, MATLAB, ERwin, HBase, R, PySpark, Power BI, AWS, Tableau Desktop, Azure, RapidMiner and process mining tools, WorkFusion, SPSS, ER Studio & PowerDesigner Data Architect, Excel, Agile, Master Data Management, Unix & Linux, MDM.
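The data-imputation work described in this role (filling missing values during pre-processing in Python) commonly uses column-mean imputation. A minimal stdlib sketch; the dataset and function name are hypothetical:

```python
def impute_mean(rows):
    # Replace each None in a numeric column with that column's mean,
    # computed over the observed (non-None) values only.
    columns = list(zip(*rows))
    means = []
    for col in columns:
        observed = [v for v in col if v is not None]
        means.append(sum(observed) / len(observed))
    return [
        [means[j] if v is None else v for j, v in enumerate(row)]
        for row in rows
    ]

# Hypothetical two-column dataset with missing values.
raw = [[1.0, 10.0], [None, 14.0], [3.0, None]]
clean = impute_mean(raw)
# clean == [[1.0, 10.0], [2.0, 14.0], [3.0, 12.0]]
```

With pandas the same step is typically `df.fillna(df.mean())`; the sketch makes the per-column arithmetic explicit.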

Confidential

Sr. Data Architect

Responsibilities:

  • Skilled in data science solutions, database design (RDBMS), billing systems, system migrations/integrations, and enterprise-wide implementations of big data tools using Hadoop.
  • Spearheading the technical team for product delivery services, with accountability for customer service management, SoW development, proposal development, competitive bidding, and negotiations.
  • Responsible for business analysis, P&L, client satisfaction, and overall delivery execution of complex information and technology services and reporting tools.
  • Analyzing market and sales strategies, deal requirements, and product and plan development from potential and financial aspects, along with eCommerce analysis.
  • Preparing RFPs & business cases for products & services; closing new business deals by coordinating requirements, developing & negotiating contracts, and integrating contract requirements.
  • Formulating the marketing strategy for the organization to ensure an enriched customer portfolio and build differentiation for the brand; delivering ICT cloud presentations and supporting targeted sales opportunities.
  • Preparing enterprise services HW & SW OPEX and CAPEX plans; developing marketing product plans and subscriber segmentation using machine learning clustering and probability methods; creating promotional products for subscribers.
  • Building 360-degree analytical data models of subscribers to support marketing activities; developing marketing reports and detailed analyses to measure promotion performance, using machine learning for data analytics.
  • Developing monthly consolidated subscriber reports and roaming & interconnect revenue and data-usage reports; creating Oracle/ODI ETL for all streams to the DWH and other vendors' applications.
  • Accountable for planning, managing, and overseeing all execution teams for capacity growth, and coordinating with vendors to ensure reduced operating expenditure and an improved fiscal position; strong team communication and reporting.
  • Designing HDFS storage capacity for data analysis; managing the installation, configuration & commissioning of switches, OS & firewalls; and monitoring systems availability, business continuity, and DWH performance as preventive maintenance.
  • Leading, managing, and delivering data modeling using ERwin and the Oracle DWH, establishing successful machine learning algorithms.
  • Managing and reducing risks, ensuring change management, and applying control procedures; improving the quality of services offered by enterprise services; risk management & IT auditing of security services.
  • Implementing data packages on the Huawei platform integrated with the Ericsson IN API platform; implemented the Ericsson IN charging system, created tariff plans for IN and integrated them with the billing system, and managed inbound & outbound roaming with interconnect services.
  • Started up the IT department and implemented IN, EMM, EMA & billing, provisioning, mediation, rating/charging, cloud-based CRM, and call center/IVR, with deployment of IT infrastructure (commissioning Cisco switches, routers, Wi-Fi, firewalls, and HP hardware); maintained daily operations, rate plans, the call center, and the DWH, and created IT operations processes and procedures to meet required IT standards and policies (ITIL).
  • Managing projects and multi-tasking; retaining clients through relationship management, strategic planning, and fostering good working relationships, especially with senior management, to facilitate flawless execution of projects.
  • Continuously gathering knowledge of competitors and designing strategies to position the company's services effectively against them; responsible for services augmentation.
  • Developed and implemented several types of financial reports (Income Statement, Profit & Loss Statement, EBIT, BI reports).
  • Developed reports (gross margin, revenue by geographic region, profitability based on web sales and smartphone app sales); ran the reports monthly and distributed them to the respective departments through mailing-server subscriptions and a SharePoint server portal.
  • Hybris/SAP billing software integration: Subscription Order Management, Invoicing, Document Management, Customer Financial Management & consolidated CDRs.
  • Implementing IRM/MPOS, a smart app, cloud CRM & ICT Huawei integration, with staging & mapping into the IN DWH.
  • Implementing the BI DWH, OBIEE & dashboards, and ETL Informatica integration with the Oracle DWH, using data modeling.
  • Deploying VMAX 500 TB DR & VMware, HP & Cisco infrastructure, and Itrac integration with VS Oracle databases.
  • Delivering STK dual SIM cards, a mobile-money banking solution, and scratch cards using Oracle integration & ODI ETLs.
  • Using Oracle database integration for the convergent billing system & CRM integration with IN and the ERP (GP Dynamics).
  • Creating CDR streaming to the DWH using ODI ETLs and integration between IN & the billing system as a full-fledged solution.
  • Implementing the Ericsson GSM start-up: IN, EMM, EMA charging system, roaming, interconnect, call center/IVR, HLR, ERP & GP.
  • Using machine learning (k-means) to cluster and segment subscribers for campaign management.
  • Creating CDR-streaming data analysis for reporting purposes using Tableau & big data tools: Sqoop, HBase & Hadoop/HDFS.
  • Detecting fraud using HDFS big data tools, with Spark & Spark SQL data streaming for real-time analysis.
  • Forecasting revenue using a fuzzy clustering module for customer segmentation analysis.
  • Building a 360-degree subscriber data model for data analytics using Hive over HDFS.
  • Developing subscriber-behavior data models in Oracle & ERwin using big data analytics tools and an R ML logistic regression module for data analysis.
  • Developed churn prediction for subscribers using ML & RGS; performed BI DWH work with a new data mart.
  • Creating association rules with the Apriori algorithm using big data tools in R for subscriber-behavior analysis.
  • Analyzing ROI of ATL/BTL activities; segmenting the subscriber base and understanding customer behavior using hierarchical clustering machine learning algorithms.
  • Building real-time reporting for network availability using Talend big data tools.
  • Reconciling subscribers & fetching network node failures using big data streaming tools (mapper & reducer over HDFS).
  • Creating content packages for video streaming using big data tools: Kafka & Python.

Environment: Oracle, Python, Hadoop, Spark, Machine Learning (k-means, UBCF, Naive Bayes, SVM, Decision Tree), text mining tools, process mining tools, sentiment analysis, logistic regression, PL/SQL, SQLite, MySQL, DBA, Hive, Pig, HRIS, HBase, Informatica, Sqoop, Agile, Java frameworks, BI, OBIEE, Talend, Mapper & Reducer, C, R, MS SQL Server, SSIS, SSRS, Unix & Linux, Tableau.
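The k-means subscriber segmentation used throughout this role can be sketched with plain-Python Lloyd's algorithm. The subscriber usage data, field meanings, and function name below are hypothetical, for illustration only:

```python
import random

def kmeans(points, k, iters=50, seed=0):
    # Lloyd's algorithm: alternate between assigning each point to its
    # nearest centroid and recomputing each centroid as the cluster mean.
    rng = random.Random(seed)
    centroids = rng.sample(points, k)
    clusters = [[] for _ in range(k)]
    for _ in range(iters):
        clusters = [[] for _ in range(k)]
        for p in points:
            nearest = min(
                range(k),
                key=lambda i: sum((a - b) ** 2 for a, b in zip(p, centroids[i])),
            )
            clusters[nearest].append(p)
        centroids = [
            tuple(sum(dim) / len(c) for dim in zip(*c)) if c else centroids[i]
            for i, c in enumerate(clusters)
        ]
    return centroids, clusters

# Hypothetical subscriber usage: (voice_minutes, data_gb), two obvious segments.
subscribers = [(1, 1), (2, 1), (1, 2), (9, 9), (10, 9), (9, 10)]
centroids, clusters = kmeans(subscribers, k=2)
```

Each resulting cluster becomes one campaign segment; production runs would use far more features and a library implementation (e.g. Spark MLlib's KMeans) over HDFS rather than this in-memory loop.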
