Big Data Developer Resume, Toronto

PROFILE SUMMARY:

MBA professional with 3+ years of experience in Big Data
Proficiency in developing, testing and deploying ETL solutions
Extensive experience in transforming data with Spark RDDs and DataFrames
Experience in developing and using Hive Query Language for data analytics
Proficiency in Kafka, Hive, Impala, Sqoop, Flume, Oozie and MongoDB
Good knowledge of Hadoop Architecture and its various components such as HDFS and YARN
Proficiency in using Flume, NiFi and Kafka to ingest real-time and near-real-time streaming data into HDFS
Experience in using Spark Streaming to process streaming data from Kafka and Flume
Proficiency in database programming using MySQL, including creating indexes, functions, views and joins
Experience in using Spark SQL and in performance tuning of Spark jobs
Strong team player, able to work both independently and in a team
Ability to adapt to a rapidly changing environment, with a strong commitment to learning
Experience in using both major Hadoop distributions, Cloudera and Hortonworks
Experience in different phases of Software Development Life Cycle (SDLC)
Programming knowledge of Java and Scala

TECHNICAL SKILLS:

PROFESSIONAL EXPERIENCE:

Big Data Developer

Confidential, Toronto

Responsibilities:

Designed, created and tested complete data pipelines
Worked closely with the data architect
Took care of the complete ETL process
Processed 600 GB of structured, semi-structured and unstructured data every day
Created a complete system by using different Big Data tools
Used NiFi to ingest real-time streaming data from different sources into HDFS
Automated an ETL job using NiFi to load JSON data and server data into MongoDB
Transferred disparate data sets between RDBMSs such as MySQL and HDFS, in both directions, using Sqoop
Created Hive schemas optimized for performance using partitioning and bucketing
Wrote HQL queries for data processing
Used Flume to load log data from an e-commerce application into HDFS
Used Spring Boot to publish log data from web applications to a Kafka topic
Developed Spark jobs in Scala for data processing using Spark SQL and DataFrames
Used NoSQL databases such as MongoDB as the data-access tier for data streaming
Used Spring Boot to receive log data from a web application and publish it to a Kafka topic
Consumed messages from the Kafka topic in Spark Streaming for data processing
Stored the processed log data in the NoSQL database MongoDB
Deployed Docker containers to streamline the workflow and improve performance
Automated build and deployment using Jenkins to speed up the process
Migrated data between MySQL and HDFS/Hive, in both directions, using Sqoop in the hybrid system

Big Data Developer

Confidential

Responsibilities:

Imported and exported data between HDFS/Hive and RDBMSs using Sqoop
Replaced the default Derby metastore for Hive with a MySQL-backed metastore
Created Spark RDDs from data files and performed transformations and actions on them
Used Spark SQL to run analysis on huge datasets
Created Hive tables with partitions and bucketing for efficiency
Developed Hive queries for the analysts
Worked in the Apache Hadoop environment provided by Hortonworks
Automated dataflow process using Apache NiFi
Worked on import and export of data into HDFS and Hive using Sqoop
Involved in managing and reviewing Hadoop log files
Created Hadoop Streaming jobs to process gigabytes of XML data
Created transformations for large sets of structured, semi-structured and unstructured data
Worked with Hive partitioned tables to load data for analysis

Hadoop Developer

Confidential

Responsibilities:

Used Talend for data integration and data management
Developed the whole process using the Talend ETL Big Data tool
Imported data to HDFS using Sqoop
Analyzed the data using Spark
Filtered, mapped and reduced RDDs using Spark
Created Hive schemas using partitioned tables and bucketing
Developed Scala scripts for running Spark code
Used Sqoop to transfer data back to the RDBMS
Developed an Oozie workflow to implement jobs
Made a REST API call to get JSON data
Stored the data in a local directory
Transferred the data from the local directory to HDFS
Read JSON data from HDFS using Spark, converted it to a DataFrame and then saved it as a table
Developed Hive scripts to query the table
Used Oozie to run the jobs

MySQL Developer

Confidential

Responsibilities:

Monitored and fine-tuned the running database server
Performed database development and implementation
Planned database growth in terms of capacity and scalability
Enabled extraction, transformation and loading of data and packages
Ensured continuous database availability, integrity and security in the production environment