Big Data Analyst Resume

Fremont, CA

SUMMARY

  • 4.5 years of IT experience in programming and developing software applications, including 1.5 years of experience in Big Data using Hadoop, Hive, Pig, Sqoop, HBase and MapReduce programming.
  • Hands-on experience and good knowledge of Hadoop ecosystem tools such as MapReduce, HDFS, Pig, Hive, Spark, Sqoop, Kafka, Scala, Flume, YARN and Oozie.
  • Experience in analyzing data using HiveQL, Spark SQL and Pig Latin, and in developing programs in Java and Python.
  • Developed scripts and batch jobs to schedule various Hadoop programs.
  • Imported and exported data between relational database systems such as MySQL and HDFS and Hive using Sqoop.
  • Experience working with Java, J2EE, JSP, Servlets, Hibernate, Angular JS and Spring.
  • Experience in developing and debugging mobile applications using the Android SDK.
  • Worked in complete Software Development Life Cycle (SDLC) in Waterfall and Agile models.
  • Strong programming and analytical skills with an ability to work in a fast-paced, team-oriented environment.
  • Quick learner and good communicator who is proactive in acquiring new skills.
  • Oracle Certified Professional, Java SE 6 Programmer.

TECHNICAL SKILLS

Big Data Ecosystems: Hadoop, MapReduce, HDFS, HBase, Hive, Pig, Sqoop, Spark, Scala, Oozie, Flume, Kafka

Programming Languages: Java, Python, SQL, J2EE, ASP.NET, C/C++

Development Tools: Cloudera, Eclipse IDE, Android SDK

Scripting Languages: JavaScript, HTML, CSS, XML

Databases: MongoDB, MySQL, Cassandra

Methodologies: Agile, Waterfall

PROFESSIONAL EXPERIENCE

Confidential, Fremont, CA

Big Data Analyst

Responsibilities:

  • Acted as Data Lead for data ingestion.
  • Gathered and analyzed requirements, and handled business process documentation, system architecture design, use case creation, test cases and user acceptance testing using agile development processes and practices.
  • Prepared system architecture and component design documentation and performed code reviews.
  • Designed the architecture for data ingestion from multiple sources such as RDBMS and Amazon S3.
  • Worked with structured and unstructured data, including RDBMS and CSV data.
  • Worked on a live 30-node Hadoop cluster running CDH5.
  • Used Python libraries (Pandas) for data analysis and manipulation.
  • Imported data from HDFS into Hive using Hive commands.
  • Developed a Sqoop incremental import job, shell script and cron job for importing data into HDFS.
  • Created Hive partition on dates and stocks for imported data.
  • Developed a PySpark script which dynamically downloads the Amazon S3 Data files into the HDFS system.
  • Developed Python scripts to scrape data from sources and store it on S3 buckets to run analytics.
  • Developed Spark programs in Python to handle arithmetic calculations using RDDs and deployed them on the cluster.
  • Used Spark SQL for querying and analyzing the data.
  • Created Tableau reports to display graphical visualization of stock data.

Technologies: Hadoop Ecosystem, HDFS, Sqoop, Hive, PySpark, Spark SQL, Python, Tableau.

Confidential

Big Data Analyst

Responsibilities:

  • Built data pipelines to load and transform large sets of structured, semi-structured and unstructured data.
  • Ingested data from various sources (Unix/Mainframe) and loaded CSV files into a MySQL database.
  • Imported data into HDFS and Hive using Sqoop. Wrote Hive UDFs for masking certain information.
  • Inserted masked data into the Hive tables using the Hive UDF. Performed Sqoop export to MySQL DB tables from the masked Hive tables.
  • Developed Hive queries to process the data for visualizing.
  • Imported data from HDFS into Hive using HiveQL.
  • Involved in creating Hive tables, and in loading and analyzing data using Hive queries.
  • Created Hive Partitioned and Bucketed tables to improve performance.
  • Developed a Sqoop import job, shell script and cron job for importing data into HDFS.
  • Used Tableau for visualization and building dashboards.
  • Processed millions of records using Hadoop jobs.
  • Implemented Spark code using Python for RDD transformations & actions in Spark application.
  • Defined and contributed to the development of standards, guidelines, design patterns and common development frameworks.
  • Created Tableau reports to display graphical visualization of stock data.

Technologies: Hadoop Ecosystem, HDFS, MapReduce, Flume, Sqoop, Hive, Spark, Spark SQL, Python.

Associate Software Engineer

Confidential

Responsibilities:

  • Developed applications using the Android SDK and Java/J2EE, and tested code and applications.
  • Worked with a healthcare provider to create a patient portal tool providing a 360-degree view of patient details, resulting in a personalized health care management system.
  • Involved in Agile sprint methodologies for requirements gathering, analysis, architecture and planning.
  • Involved in development, testing and integration of the application.
  • Developed web pages using HTML, CSS and JavaScript.
  • Developed consumer-based features using JavaScript, jQuery, HTML and CSS with behavior-driven development.
  • Developed AngularJS controllers, directives, factory and service resources and events.
  • Used the Android SDK and Android Studio for application development.
  • Tested the application across different Android versions and phones to ensure quality and performance.
  • Developed user-friendly interfaces using widgets such as menus, dialogs, layouts, buttons and edit boxes, and selection widgets such as list view and scroll view, per client needs; developed activities and UI layers.
  • Worked closely with another mobile application developer, leading the other platform development.
  • Involved in product development as part of customer intelligence and insights initiative for launching customer 360-degree product via analytics and reporting using Big Data and Hadoop ecosystem.

Junior Developer, Intern

Confidential

Responsibilities:

  • Used the .NET platform to create an academic automation system that helps the institution manage database access for students, teachers and college administrators.
  • Wrote business requirement documents and created dashboards to improve data quality, efficiency and accuracy.
  • Created effective dashboards and reports to provide real-time snapshots of projects using Salesforce.