Spark Developer Resume
San Jose, CA
CAREER OBJECTIVE:
- Spark and Hadoop Developer with a focus on machine learning and artificial intelligence
PROFESSIONAL SUMMARY:
- Over 5 years of IT experience, including expertise in Big Data, Hadoop, Apache Spark, Java, and Scala technologies
- Solid background in mathematics and probability, with broad practical experience in statistical data mining techniques
- Strong technical, administration, and mentoring knowledge of Linux and Big Data/Hadoop technologies
- Hands-on experience with major components of the Hadoop ecosystem, including MapReduce, Spark SQL, Spark Streaming, Kafka, HDFS, Hive, Pig, HBase, ZooKeeper, Sqoop, Oozie, and Flume
- Working experience designing, building, and configuring large Hadoop environments
- Work experience with Hadoop/Spark platforms such as Hortonworks, Cloudera, and Databricks
- Experience importing and exporting data using Sqoop between HDFS and relational database systems/mainframes
- Excellent understanding of Hadoop architecture and Big Data ecosystem components such as HDFS, Ambari, JobTracker, and TaskTracker
- Hands-on experience installing, configuring, and using Hadoop ecosystem components such as Spark, HBase, Oozie, Hive, Sqoop, Drill, Pig, Storm, Kafka, ZooKeeper, and YARN
- Experienced in monitoring Hadoop cluster environments using Ambari and Oozie
- Experience in object-oriented analysis and design using core Java design patterns
- Experience with SQL and NoSQL databases, including MySQL and HBase
- Articulate in written and verbal communication along with strong interpersonal, analytical and organizational skills
- Ability to adapt to evolving technology, strong sense of responsibility and accomplishment
- Ability to meet deadlines and handle multiple tasks, with a flexible work schedule and excellent communication skills
TECHNICAL SKILLS:
Big Data Ecosystem: Hadoop, MapReduce, HDFS, HBase, ZooKeeper, Hive, Pig, Sqoop, Cassandra, Oozie, Spark, Flume
Programming Languages: Java, C++, Scala, MATLAB, Spark SQL, HiveQL
Web Technologies: JavaScript, XML, HTML5
Databases: MySQL, SQL Server, Oracle, HBase (NoSQL)
IDE Tools: Eclipse, JDeveloper, NetBeans, MS Visual Studio
Tools: Adobe, SQL, Flume, Sqoop, Storm
Operating Systems: Windows, Unix, Linux, Mac OS X
EXPERIENCE:
Spark Developer
Confidential, San Jose, CA
Responsibilities:
- Created a Spark Streaming application that consumed data from Kafka, parsed it, and stored it in HBase (see the first sketch after this section)
- Developed HBase tables to store large sets of structured data arriving from Spark Streaming
- Used the Oozie scheduler to automate batch machine learning model runs
- Wrote Hive object creation scripts
- Designed administrative objects such as Kafka topics and HBase and Hive tables
- Loaded and transformed large sets of structured data in real time
- Managed a spark.ml application that read data from HBase and generated recommendations using the ALS algorithm (see the second sketch after this section)
- Collaborated with the infrastructure, network, database, application, and Business Intelligence teams to ensure data quality and availability
Environment: Apache Spark, Spark Streaming, Kafka, HBase, Hortonworks, CentOS
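A minimal sketch of the Kafka-to-HBase streaming flow described above (Scala, Spark Streaming with the kafka-0-10 integration). The broker address, topic, table, column family, and CSV record layout are illustrative assumptions, not details from the original project.

```scala
import org.apache.hadoop.hbase.{HBaseConfiguration, TableName}
import org.apache.hadoop.hbase.client.{ConnectionFactory, Put}
import org.apache.hadoop.hbase.util.Bytes
import org.apache.kafka.common.serialization.StringDeserializer
import org.apache.spark.SparkConf
import org.apache.spark.streaming.{Seconds, StreamingContext}
import org.apache.spark.streaming.kafka010._

object KafkaToHBase {
  def main(args: Array[String]): Unit = {
    val ssc = new StreamingContext(new SparkConf().setAppName("KafkaToHBase"), Seconds(10))

    val kafkaParams = Map[String, Object](
      "bootstrap.servers"  -> "broker1:9092",              // hypothetical broker
      "key.deserializer"   -> classOf[StringDeserializer],
      "value.deserializer" -> classOf[StringDeserializer],
      "group.id"           -> "events-consumer",           // hypothetical group
      "auto.offset.reset"  -> "latest"
    )

    val stream = KafkaUtils.createDirectStream[String, String](
      ssc,
      LocationStrategies.PreferConsistent,
      ConsumerStrategies.Subscribe[String, String](Seq("events"), kafkaParams)) // hypothetical topic

    stream.foreachRDD { rdd =>
      rdd.foreachPartition { records =>
        // Open one HBase connection per partition rather than per record
        val conn  = ConnectionFactory.createConnection(HBaseConfiguration.create())
        val table = conn.getTable(TableName.valueOf("events")) // hypothetical table
        records.foreach { rec =>
          // Assume CSV-encoded values: id,field1
          val cols = rec.value().split(",")
          val put  = new Put(Bytes.toBytes(cols(0)))
          put.addColumn(Bytes.toBytes("d"), Bytes.toBytes("field1"), Bytes.toBytes(cols(1)))
          table.put(put)
        }
        table.close()
        conn.close()
      }
    }

    ssc.start()
    ssc.awaitTermination()
  }
}
```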
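A minimal spark.ml ALS sketch matching the recommendation bullet above. The HBase read is elided (it would typically go through an HBase-Spark connector or TableInputFormat); the inline ratings data, column names, and hyperparameters are illustrative assumptions.

```scala
import org.apache.spark.ml.recommendation.ALS
import org.apache.spark.sql.SparkSession

object AlsRecommend {
  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder().appName("AlsRecommend").getOrCreate()
    import spark.implicits._

    // Stand-in for ratings loaded from HBase: (userId, itemId, rating)
    val ratings = Seq((1, 10, 4.0f), (1, 11, 2.0f), (2, 10, 5.0f), (2, 12, 3.0f))
      .toDF("userId", "itemId", "rating")

    val als = new ALS()
      .setUserCol("userId")
      .setItemCol("itemId")
      .setRatingCol("rating")
      .setRank(10)        // illustrative hyperparameters
      .setMaxIter(10)
      .setRegParam(0.1)

    val model = als.fit(ratings)
    model.recommendForAllUsers(5).show()  // top 5 items per user

    spark.stop()
  }
}
```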
Big Data Engineer
Confidential, San Jose, CA
Responsibilities:
- Collaborated on insights with other Data Scientists, Business Analysts, and Partners.
- Created data lakes and provided data to continuously improve the efficiency and accuracy of the data science team's existing predictive models.
- Utilized various data analysis and visualization tools for analysis, report design, and report delivery.
- Developed DataFrames using Spark SQL from external databases (see the first sketch after this section).
- Loaded data into Hadoop Hive and combined new tables with existing databases.
- Implemented backup strategies for the data in MongoDB.
- Created data lakes using Hive and Apache Spark on a provisioned Hadoop cluster.
- Implemented an ETL design to export the data lakes to MongoDB (see the second sketch after this section).
- Implemented a POC using Apache Impala for data processing on top of Hive.
- Imported data from relational databases into HDFS using Flume at near-real-time latency.
- Demonstrated excellent understanding of data storage and retrieval techniques, ETL, and databases.
Environment: Apache Spark, Spark SQL, Scala, Flume, MongoDB, Hive, Storm
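A minimal sketch of building DataFrames from an external database with Spark's JDBC source and landing them in Hive; the connection URL, credential handling, and table names are hypothetical, and the matching JDBC driver must be on the classpath.

```scala
import org.apache.spark.sql.{SaveMode, SparkSession}

object JdbcToHive {
  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder()
      .appName("JdbcToHive")
      .enableHiveSupport()   // needed to write managed Hive tables
      .getOrCreate()

    // Read the source table into a DataFrame over JDBC
    val orders = spark.read.format("jdbc")
      .option("url", "jdbc:mysql://dbhost:3306/sales")   // hypothetical URL
      .option("dbtable", "orders")                       // hypothetical table
      .option("user", "etl_user")
      .option("password", sys.env.getOrElse("DB_PASSWORD", ""))
      .load()

    // Append the extract to an existing Hive table
    orders.write.mode(SaveMode.Append).saveAsTable("sales.orders_raw")

    spark.stop()
  }
}
```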
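A minimal sketch of the data-lake-to-MongoDB ETL step, assuming the MongoDB Spark Connector (the format name and write options vary across connector versions); the Hive table, database, and collection names are hypothetical.

```scala
import org.apache.spark.sql.{SaveMode, SparkSession}

object HiveToMongo {
  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder()
      .appName("HiveToMongo")
      .enableHiveSupport()
      .getOrCreate()

    // Read a curated data-lake table from Hive (hypothetical name)
    val curated = spark.sql("SELECT * FROM lake.customer_metrics")

    // Dump to MongoDB via the MongoDB Spark Connector
    curated.write.format("mongo")                    // "mongodb" in connector 10.x
      .mode(SaveMode.Overwrite)
      .option("uri", "mongodb://mongohost:27017")    // hypothetical host
      .option("database", "analytics")
      .option("collection", "customer_metrics")
      .save()

    spark.stop()
  }
}
```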
Hadoop Developer
Confidential
Responsibilities:
- Built a data flow pipeline using Flume, Java MapReduce, and Pig
- Used Flume to capture streaming mobile sensor data and store it on HDFS
- Used Hive scripts to compute aggregates and store them in HBase for low-latency applications (see the sketch after this section)
- Analyzed HBase and compared it with other open-source NoSQL databases to determine which best suited the current requirements
- Integrated HBase as a distributed persistent metadata store to provide metadata resolution for network entities
- Used Oozie to orchestrate the scheduling of MapReduce jobs and Pig scripts
- Created HBase tables to load large sets of structured, semi-structured, and unstructured data
Environment: JDK 1.7, Ubuntu Linux, Hive, Pig, Sqoop, Flume, ZooKeeper, Oozie, DB2, HBase
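The original Hive scripts are not included in this resume; the sketch below recreates the same aggregate-then-serve idea in Spark/Scala terms: compute hourly averages over a hypothetical Hive sensor table, then write one row per device-hour into HBase for low-latency reads. All table, column, and column-family names are assumptions.

```scala
import org.apache.hadoop.hbase.{HBaseConfiguration, TableName}
import org.apache.hadoop.hbase.client.{ConnectionFactory, Put}
import org.apache.hadoop.hbase.util.Bytes
import org.apache.spark.sql.SparkSession

object SensorAggregatesToHBase {
  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder()
      .appName("SensorAggregatesToHBase")
      .enableHiveSupport()
      .getOrCreate()

    // Hourly average per device from a hypothetical Hive table
    val hourly = spark.sql(
      """SELECT device_id,
        |       date_format(event_time, 'yyyyMMddHH') AS hour,
        |       avg(reading) AS avg_reading
        |FROM sensors.readings
        |GROUP BY device_id, date_format(event_time, 'yyyyMMddHH')""".stripMargin)

    // Key each HBase row by device#hour for fast point lookups
    hourly.rdd.foreachPartition { rows =>
      val conn  = ConnectionFactory.createConnection(HBaseConfiguration.create())
      val table = conn.getTable(TableName.valueOf("sensor_hourly")) // hypothetical
      rows.foreach { row =>
        val put = new Put(Bytes.toBytes(row.getString(0) + "#" + row.getString(1)))
        put.addColumn(Bytes.toBytes("m"), Bytes.toBytes("avg"),
          Bytes.toBytes(row.getDouble(2)))
        table.put(put)
      }
      table.close()
      conn.close()
    }

    spark.stop()
  }
}
```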