Big Data Analyst Resume
Fremont, CA
SUMMARY
- 4.5 years of IT experience in programming and developing software applications, including 1.5 years of experience in Big Data using Hadoop, Hive, Pig, Sqoop, HBase and MapReduce programming.
- Hands-on experience and good knowledge of Hadoop ecosystem tools such as MapReduce, HDFS, Pig, Hive, Spark, Sqoop, Kafka, Scala, Flume, YARN and Oozie.
- Experience in analyzing data using HiveQL, Spark SQL and Pig Latin, and in developing programs in Java and Python.
- Developed scripts and batch jobs to schedule various Hadoop programs.
- Worked on importing and exporting data between relational database systems such as MySQL and HDFS/Hive using Sqoop.
- Experience working with Java, J2EE, JSP, Servlets, Hibernate, AngularJS and Spring.
- Experience in developing and debugging mobile applications using the Android SDK.
- Worked across the complete Software Development Life Cycle (SDLC) in Waterfall and Agile models.
- Strong programming and analytical skills with an ability to work in a fast-paced, team-oriented environment.
- Quick learner and good communicator, proactive in acquiring new skills.
- Oracle Certified Professional, Java SE 6 Programmer.
TECHNICAL SKILLS
Big Data Ecosystems: Hadoop, MapReduce, HDFS, HBase, Hive, Pig, Sqoop, Spark, Scala, Oozie, Flume, Kafka
Programming Languages: Java, Python, SQL, J2EE, ASP.NET, C/C++
Development Tools: Cloudera, Eclipse IDE, Android SDK
Scripting Languages: JavaScript, HTML, CSS, XML
Databases: MongoDB, MySQL, Cassandra
Methodologies: Agile, Waterfall
PROFESSIONAL EXPERIENCE
Confidential, Fremont, CA
Big Data Analyst
Responsibilities:
- Acted as Data Lead for data ingestion.
- Performed requirements gathering and analysis, business process documentation, system architecture design, use case and test case creation, and user acceptance testing using Agile development processes and practices.
- Prepared system architecture and component design documentation and performed code reviews.
- Created the design architecture for data ingestion from multiple sources such as RDBMS and Amazon S3.
- Worked with structured and unstructured data, including RDBMS and CSV data.
- Worked on a live 30-node Hadoop cluster running CDH5.
- Used Python libraries (Pandas) for data analysis and manipulation.
- Imported data from HDFS into Hive using Hive commands.
- Developed a Sqoop incremental import job, shell script and cron job for importing data into HDFS.
- Created Hive partitions by date and stock for the imported data.
- Developed a PySpark script that dynamically downloads Amazon S3 data files into HDFS.
- Developed Python scripts to scrape data from sources and store it in S3 buckets for analytics.
- Developed PySpark programs that perform arithmetic calculations using RDDs and deployed them on the cluster.
- Used Spark SQL for querying and analyzing the data.
- Created Tableau reports to display graphical visualization of stock data.
Technologies: Hadoop Ecosystem, HDFS, Sqoop, Hive, PySpark, Spark SQL, Python, Tableau.
Confidential
Big Data Analyst
Responsibilities:
- Built data pipelines to load and transform large sets of structured, semi-structured and unstructured data.
- Received data from various sources (Unix/Mainframe) and loaded CSV files into a MySQL database.
- Imported data into HDFS and Hive using Sqoop. Wrote Hive UDFs to mask certain information.
- Inserted masked data into Hive tables using the Hive UDFs. Performed Sqoop export from the masked Hive tables to MySQL tables.
- Developed Hive queries to process the data for visualization.
- Imported data from HDFS into Hive using HiveQL.
- Involved in creating Hive tables, and in loading and analyzing data using Hive queries.
- Created partitioned and bucketed Hive tables to improve performance.
- Developed a Sqoop import job, shell script and cron job for importing data into HDFS.
- Used Tableau for visualization and building dashboards.
- Processed millions of records using Hadoop jobs.
- Implemented RDD transformations and actions in Python for Spark applications.
- Defined and contributed to the development of standards, guidelines, design patterns and common development frameworks.
- Created Tableau reports to display graphical visualization of stock data.
Technologies: Hadoop Ecosystem, HDFS, MapReduce, Flume, Sqoop, Hive, Spark, Spark SQL, Python.
Associate Software Engineer
Confidential
Responsibilities:
- Developed applications using the Android SDK and Java/J2EE, and tested code and applications.
- Worked with a healthcare provider to create a patient portal tool providing a 360-degree view of patient details, resulting in a personalized health care management system.
- Participated in Agile sprints for requirements gathering, analysis, architecture and planning.
- Involved in development, testing and integration of the application.
- Developed web pages using HTML, CSS and JavaScript.
- Developed consumer-based features using JavaScript, jQuery, HTML and CSS, following behavior-driven development.
- Developed AngularJS controllers, directives, factories, services, resources and events.
- Used the Android SDK and Android Studio for application development.
- Tested the application across different Android versions and phones to ensure quality and performance.
- Developed user-friendly user interfaces using widgets such as menus, dialogs, layouts, buttons and edit boxes, and selection widgets such as list views and scroll views, as per client needs; developed activities and UI layers.
- Worked closely with another mobile application developer who led development for the other platform.
- Involved in product development as part of a customer intelligence and insights initiative to launch a customer 360-degree product, with analytics and reporting built on Big Data and the Hadoop ecosystem.
Junior Developer, Intern
Confidential
Responsibilities:
- Used the .NET platform to create an academic automation system that helps the institution manage database access for students, teachers and college administrators.
- Wrote business requirement documents and created dashboards to improve data quality, efficiency and accuracy.
- Created effective dashboards and reports to provide real-time snapshots of projects using Salesforce.
