Big Data Analyst Resume

CT

SUMMARY:

  • Microsoft- and SAS-certified, highly skilled data analyst bringing more than 3 years of expertise in Big Data technology, data mining, data warehousing, data analysis, and data visualization. Applies econometric methods to guide businesses in their decision-making and help them run efficiently; created a data lake using Hadoop, HBase, and Hive.
  • Extensive experience using R packages such as foreign, gretl, rattle, and quantmod; also familiar with statistical-analysis packages such as lme4, MASS, mice, mlogit, Rcmdr, survival, and truncreg.
  • Proficient with Python packages for statistical analysis such as statsmodels, scikit-learn, NumPy, pandas, and NLTK.
  • Competent in normalization/de-normalization techniques for optimal performance in relational and dimensional database environments, and in maintaining referential integrity using triggers and primary and foreign keys.
  • Saved $300K/year in operating costs for the Global Database Administrator team and 267 days of database team work.
  • Winner of Summer GIO Hackathon 2015 at JOHN DEERE
  • Expertise in designing concise, pertinent visualizations using Tableau, Power BI, and RStudio, and in publishing and presenting dashboards on web and desktop platforms.
  • Excels in MS Excel, with proficiency in lookups and pivot tables and an understanding of VBA macros.
  • Extensive experience in in-depth data analysis across different databases and structures. Strong knowledge of writing T-SQL and PL/SQL queries, dynamic queries, sub-queries, CTEs, and complex joins.
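For illustration, the kind of data profiling and cleaning work summarized above can be sketched in pandas (the column names and figures here are hypothetical, not taken from any actual project):

```python
import pandas as pd

def profile(df: pd.DataFrame) -> pd.DataFrame:
    """Return a per-column profile: null counts, distinct counts, dtypes."""
    return pd.DataFrame({
        "nulls": df.isna().sum(),        # missing values per column
        "distinct": df.nunique(),        # distinct non-null values per column
        "dtype": df.dtypes.astype(str),  # inferred column types
    })

# Hypothetical raw telematics-style extract with gaps
usage = pd.DataFrame({
    "device_id": ["a1", "a2", "a2", None],
    "mb_used": [120.5, 98.0, None, 47.2],
})
report = profile(usage)
```

A profile like this is typically the first pass before deciding which records need to be appended, corrected, or dropped.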

TECHNICAL SKILLS:

Programming Languages: Java, PL/SQL, JCL, COBOL

Analytics Languages: R, Python, SQL, Scala, SAS

IDEs: RStudio, Eclipse, IntelliJ, PyCharm, PySpark, Weka

AWS Technologies: S3, EC2, SQS, SNS, EMR

Databases: Mainframe, Oracle, NoSQL, MySQL, ETL

Web Development: Java Script, HTML, CSS

Big Data Technologies: MapReduce, Hive, Spark, HBase, Pig, YARN, Azure

Data Visualization: Tableau Desktop/Server, Weka, Power BI, Pivot Tables, VBA, VLOOKUP

SAS Skills: SAS-BASE, GRAPH, MACRO, SQL, ODS, STAT, MINER

Competencies: Logistic and Linear Regression, Time Series Analysis, CHAID, Factor Analysis, CART, Survival Analysis

PROFESSIONAL EXPERIENCE:

Confidential, CT

Big Data Analyst

Responsibilities:

  • Leveraged the Azure statistical analysis library to group multiple data plans together, creating a 15% cushion capacity to avoid overage across different vendor plans.
  • Achieved approximately $160K in savings over a one-and-a-half-month tenure for more than 5 teams.
  • Analyzed, retrieved, and aggregated data from multiple vehicle vendors to perform data mapping, precisely appending incorrectly mapped or missing data; handled storage and transfer using HDFS and Hive queries.
  • Responsible for building and maintaining effective working relationships with business teams, as well as with external vendor and customer data management teams.
  • Identified inconsistencies in data collected from different sources, worked with business owners/stakeholders to assess business and risk impact, and provided solutions to business owners.
  • Worked with consumers and different teams to gain insights into their telematics data usage and cost-saving targets through data usage planning analysis; generated a data lake using Hadoop to ingest raw data.
  • Analyzed business data requirements, data plan requirements, and data overage requirement specifications; documented data plan modifications and usage monitoring; created data storage in Hadoop and ran queries using Spark.
  • Communicated with data vendors such as Vodafone, Verizon, Iridium, and ORBCOMM to understand their business data plan strategies.
  • Accessed Caterpillar telematics data plan information from multiple data providers' portals in SQL, SAS, and Oracle formats, and ran data profiling, data cleaning, and data analysis on raw data using R packages and advanced SQL querying.
  • Modelled advanced Visual Basic Application macros on various vendor data reports to plan data usage structure with minimum overage cost, using Excel and VBA.
  • Employed a time series algorithm on data usage to forecast future data plan usage, estimate overage, and generate a cost-saving data usage plan for teams.
  • Created a backdated data plan process to avoid unnecessary data overage costs for 5 teams with the help of VBA, Tableau, and SQL query report generation.
  • Performed data analysis and data profiling using complex SQL queries on various source systems.
  • Developed a SharePoint documentation template to support findings, track project status, and assign specific tasks.
  • Performed data profiling of multiple sources using SQL Server Management Studio and presented initial discoveries in Excel tables and reports.
  • Used project management tools such as Kanban and SharePoint to keep stakeholders updated about the project.
  • Leveraged sentiment analysis to establish a consumer feedback system using MapReduce and text mining in Java.
  • Developed and systematized an end-to-end statistical model on high-volume data sets by manipulating data with Hive queries and Spark on Hadoop for faster results, and resolved stakeholder issues under tight deadlines.
  • Achieved: 343 products showed sales growth after profiling.
  • Gathered business requirements through one-to-one and group meetings with vendors, the Order Management team, and the Supply Chain team. Presented initial KPI frameworks to gain line of sight on the project.
  • Employed a time series algorithm on parts sales to forecast future part and vehicle requirements, estimate the required inventory of various vehicle parts, and generate a cost-saving inventory management plan for the team.
  • Extracted, compiled, tracked, and analyzed data to generate reports in a variety of layouts (Excel, PDF, Tableau, and SAS dashboards), and modelled data structures for multiple projects using Mainframe and Oracle.
  • Maintained the data integrity during extraction, ingestion, manipulation, processing, analysis and storage.
  • Presented more than 15 impactful time series visualization dashboards and stories using Tableau Desktop and Server, Excel, pivot tables, SQL queries, Power BI, SAS, and Visual Basic macros.
  • Built basic analytical models using Python and R through SparkR and an API on a 25% threshold of the data.
  • Achieved a 15% improvement in inventory planning accuracy.
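The time-series-based overage estimation described in the bullets above can be sketched minimally as follows. The resume does not specify which algorithm was used, so this uses simple exponential smoothing as a stand-in, and the usage figures and plan cap are hypothetical:

```python
def exp_smooth(series, alpha=0.5):
    """One-step-ahead forecast via simple exponential smoothing.

    Each observation is blended with the running forecast, so recent
    usage is weighted more heavily than older usage.
    """
    forecast = series[0]
    for x in series[1:]:
        forecast = alpha * x + (1 - alpha) * forecast
    return forecast

# Hypothetical monthly data usage (MB) for one vendor plan
monthly_mb = [900, 950, 1000, 1100]
plan_cap_mb = 1024  # assumed plan allowance

next_month = exp_smooth(monthly_mb)      # projected usage for next month
overage_risk = next_month > plan_cap_mb  # flag plans likely to exceed cap
```

Flagging plans whose projected usage exceeds the cap is what lets usage be regrouped or backdated before overage charges accrue, as described above.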