
Big Data Developer Resume

Atlanta, GA

PROFESSIONAL SUMMARY:

  • 5+ years of overall experience in Data analysis, Big Data, Data Integration, Object Oriented programming and Advanced Analytics.
  • Excellent understanding of Hadoop architecture and daemons such as the NameNode, DataNode, JobTracker, and TaskTracker, and of HDFS.
  • Hands-on experience in installing, configuring, and using Hadoop ecosystem components such as HDFS, MapReduce, Hive, Pig, Sqoop, HBase, Oozie, ZooKeeper, Kafka, Spark, and Cassandra with the Cloudera and Hortonworks distributions.
  • Extracted data from Cassandra through Sqoop, placed it in HDFS, and processed it.
  • Involved in converting Cassandra/Hive/SQL queries into Spark transformations using RDDs and Scala (illustrated in the sketch following this list).
  • Analyzed Cassandra/SQL scripts and designed solutions implemented in Scala.
  • Expertise in Big Data technologies and Hadoop ecosystem tools such as Flume, Sqoop, HBase, ZooKeeper, Oozie, MapReduce, Hive, Pig, and YARN.
  • Hands-on experience in the installation, configuration, management, and deployment of Big Data solutions and the underlying infrastructure of Hadoop clusters using the Cloudera and Hortonworks distributions.
  • Good experience in writing Spark applications using Scala.
  • Hands-on experience with various Hadoop distributions: IBM BigInsights, Cloudera, AWS EMR, and Hortonworks.
  • In-depth understanding of Spark architecture, including Spark Core, Spark SQL, DataFrames, and Spark Streaming.
  • Expertise in writing Spark RDD transformations, actions, DataFrames, and case classes for the required input data, and in performing data transformations using Spark Core.
  • Expertise in developing Real-Time Streaming Solutions using Spark Streaming.
  • Proficient in big data ingestion and streaming tools like Flume, Sqoop, Spark, Kafka and Storm.
  • Hands-on experience in developing MapReduce programs using Apache Hadoop for analyzing big data.
  • Expertise in implementing ad-hoc MapReduce jobs using Pig scripts.
  • Experience in importing and exporting data between RDBMS and HDFS, Hive tables, and HBase using Sqoop.
  • Experience in importing streaming data into HDFS using Flume sources and sinks, and transforming the data using Flume interceptors.
  • Exposure to Apache Kafka for building data pipelines of logs as streams of messages using producers and consumers.
  • Experience in integrating Apache Kafka pipelines for real-time processing.
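The sketch below is a minimal, illustrative example of converting a Hive/SQL-style query into equivalent Spark DataFrame and RDD transformations in Scala. The table, column, and output path names are hypothetical placeholders, not taken from any project described above.

```scala
import org.apache.spark.sql.SparkSession
import org.apache.spark.sql.functions._

object ClaimsAggregation {
  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder()
      .appName("ClaimsAggregation")
      .enableHiveSupport()
      .getOrCreate()

    // Original Hive/SQL form of the query (hypothetical table and columns).
    val viaSql = spark.sql(
      """SELECT policy_id, SUM(claim_amount) AS total_claims
        |FROM claims
        |WHERE claim_status = 'APPROVED'
        |GROUP BY policy_id""".stripMargin)

    // Equivalent DataFrame transformations.
    val viaDataFrame = spark.table("claims")
      .filter(col("claim_status") === "APPROVED")
      .groupBy("policy_id")
      .agg(sum("claim_amount").as("total_claims"))

    // Equivalent RDD transformations: map to (key, value) pairs and reduceByKey.
    val viaRdd = spark.table("claims").rdd
      .filter(row => row.getAs[String]("claim_status") == "APPROVED")
      .map(row => (row.getAs[String]("policy_id"), row.getAs[Double]("claim_amount")))
      .reduceByKey(_ + _)

    viaDataFrame.write.mode("overwrite").parquet("hdfs:///tmp/claims_totals")
    spark.stop()
  }
}
```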

SKILLS:

Big Data Technologies: HDFS, Hive, Spark, MapReduce, Cassandra, Pig, HCatalog, Phoenix, Falcon, Sqoop, Flume, ZooKeeper, Mahout, Oozie, Avro, HBase, Storm

Programming Languages: Java, C/C++, HTML, SQL, PL/SQL, AVS & JVS, Scala, Python

Monitoring Tools: Cloudera Manager, Ambari, Nagios, Ganglia

Scripting Languages: Shell Scripting, Puppet, Python, Bash, CSH

Operating Systems: UNIX, Windows, LINUX

Application Servers: IBM WebSphere, Tomcat, WebLogic

Web technologies: JSP, Servlets, JNDI, JDBC, Java Beans, JavaScript, Web Services

Databases: Oracle, MySQL, MS SQL, Teradata

WORK EXPERIENCE:

Confidential, Atlanta, GA

Big Data Developer

Responsibilities:

  • Developed Sqoop commands to ingest data into HDFS from distributed databases.
  • Created shell scripts for data transformation and for data movement into different zones.
  • Involved in converting Hive/SQL queries into Spark transformations using Spark RDDs, Python, and Scala.
  • Consumed messages from Kafka topics and loaded them into HDFS paths (see the sketch following this list).
  • Developed Spark scripts by using Scala shell commands as per the requirement.
  • Used Spark with Scala and Spark SQL to perform different types of transformations on health insurance data.
  • Involved in migrating Hive queries and UDFs to Spark SQL.
  • Extensively used Sqoop to import/export data between RDBMS and Hive tables, performed incremental imports, and created Sqoop jobs keyed on the last saved value.
  • Created a customized BI tool for the management team that performs query analytics using Hive.
  • Involved in data migration from an Oracle database to MongoDB.
  • Created Hive generic UDFs to process business logic that varies based on policy.
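A minimal sketch of consuming messages from a Kafka topic and landing them in an HDFS path, shown here with Spark Structured Streaming in Scala and assuming the spark-sql-kafka connector is on the classpath. The broker address, topic name, and paths are hypothetical placeholders.

```scala
import org.apache.spark.sql.SparkSession

object KafkaToHdfs {
  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder().appName("KafkaToHdfs").getOrCreate()

    // Read the topic as a stream; broker and topic names are placeholders.
    val messages = spark.readStream
      .format("kafka")
      .option("kafka.bootstrap.servers", "broker1:9092")
      .option("subscribe", "claims-events")
      .option("startingOffsets", "latest")
      .load()
      .selectExpr("CAST(key AS STRING)", "CAST(value AS STRING)")

    // Append each micro-batch to an HDFS path as Parquet files.
    val query = messages.writeStream
      .format("parquet")
      .option("path", "hdfs:///data/raw/claims_events")
      .option("checkpointLocation", "hdfs:///checkpoints/claims_events")
      .start()

    query.awaitTermination()
  }
}
```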

Environment: Hadoop, Cloudera (CDH 4), HDFS, Hive, HBase, Sqoop, Kafka, Scala, UNIX shell scripting, Python scripting.

Confidential, Bellevue, WA

Hadoop Developer

Responsibilities:

  • Upgraded the project's Spark version from 1.6.2 to 2.1.0 for better stability.
  • Developed Scala scripts using DataFrames/Datasets/SQL as well as RDDs/MapReduce in Spark for data aggregation and queries, and wrote data back into the OLTP system through Sqoop.
  • Loaded the data as SQL tables from blob storage.
  • Developed SCOPE scripts to transform the streams in Cosmos.
  • Created metadata from the incoming data in SQL Server.
  • Implemented Spark RDDs in Scala.
  • Implemented Spark using Scala and Spark SQL for faster testing and processing of data.
  • Developed Spark programs to analyze reports.
  • Good exposure to development with HTML, Bootstrap, and Scala.
  • Worked in an AWS environment for development and deployment of custom Hadoop applications.
  • Involved in converting Hive/SQL queries into Spark transformations using Spark RDDs, Python, and Scala.
  • Collected data from an AWS S3 bucket in near real time using Spark Streaming, performed the necessary transformations and aggregations to build the data model, and persisted the data in HDFS (see the sketch following this list).
  • Developed Spark scripts using Scala shell commands as per the requirements.
  • Customized the MapReduce framework at different levels, such as input formats and data types.
  • Used Flume to efficiently collect, aggregate, and move large amounts of log data.
  • Developed MapReduce Java programs with custom Writables to load web server logs into HBase using Flume.
  • Created Oozie workflows to automate data ingestion using Sqoop and process incremental log data ingested by Flume using Pig.
  • Involved in migrating Hive queries and UDFs to Spark SQL.
  • Extensively used Sqoop to import/export data between RDBMS and Hive tables, performed incremental imports, and created Sqoop jobs keyed on the last saved value.
  • Created a customized BI tool for the management team that performs query analytics using Hive.
  • Involved in data migration from an Oracle database to MongoDB.
  • Created Hive generic UDFs to process business logic that varies based on policy.
  • Created Hive tables, loaded them with data, and wrote Hive queries that run internally as MapReduce jobs.
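The sketch below illustrates, under assumed inputs, collecting data from an S3 bucket, applying transformations and aggregations, and persisting the result to HDFS with Spark in Scala. The bucket, paths, and column names are hypothetical.

```scala
import org.apache.spark.sql.SparkSession
import org.apache.spark.sql.functions._

object S3ToHdfsModel {
  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder().appName("S3ToHdfsModel").getOrCreate()

    // Hypothetical bucket, paths, and columns, for illustration only.
    val events = spark.read.json("s3a://example-bucket/incoming/events/")

    // Basic cleansing and aggregation to build the data model.
    val dailyUsage = events
      .filter(col("event_type") === "usage")
      .withColumn("event_date", to_date(col("event_ts")))
      .groupBy("account_id", "event_date")
      .agg(count(lit(1)).as("event_count"), sum("bytes_used").as("total_bytes"))

    // Persist the modeled data in HDFS, partitioned by date.
    dailyUsage.write
      .mode("append")
      .partitionBy("event_date")
      .parquet("hdfs:///data/model/daily_usage")

    spark.stop()
  }
}
```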

Environment: Hadoop, Cloudera (CDH 4), HDFS, Hive, HBase, Flume, Sqoop, Pig, Kafka, Java, Eclipse, Tableau, Ubuntu, UNIX, and Maven.

Confidential, Bellevue, WA

Hadoop Developer

Responsibilities:

  • Worked on importing data from various sources and performed transformations using MapReduce and Hive to load data into HDFS.
  • Migrated payment account data to a structured form using the DataFrame and Dataset features in Spark.
  • Loaded raw data into Datasets and performed transformations using Spark SQL.
  • Used Amazon S3 on AWS to store large volumes of data.
  • Developed custom Spark UDFs (user-defined functions) to calculate data points such as the number of active subscription days.
  • Installed and configured Hive. Developed Hive UDFs.
  • Used Spark for interactive queries, processing of streaming data, and integration with HDFS for storing large volumes of data.
  • Tuned Spark job performance by adjusting configuration properties and using broadcast variables and accumulators (see the sketch following this list).
  • Performed transformations and actions on RDDs, DataFrames, and Datasets.
  • Worked on setting up Hive and HBase on multiple nodes and developed using Pig, Hive, HBase, and MapReduce.
  • Created Spark Clusters using AWS EMR with required configurations.
  • Configured Sqoop jobs to import data from RDBMS into HDFS using Oozie workflows.
  • Involved in the process of data acquisition, data pre-processing, and data exploration using Scala.
  • Experienced in MapReduce coding.
  • Solved the small-files problem using SequenceFile processing in MapReduce.
  • Wrote various Hive and Pig scripts.
  • Experienced in upgrading CDH and HDP clusters.
  • Involved in the process of data acquisition, data pre-processing, and data exploration for a telecommunications project in Scala.
  • Created HBase tables to store variable data formats coming from different portfolios.
  • Experienced in upgrading the Hadoop cluster (HBase/ZooKeeper) from CDH3 to CDH4.
  • Performed real-time analytics on HBase using the Java API and REST API.
  • Developed MapReduce programs to cleanse data in HDFS obtained from heterogeneous data sources and make it suitable for ingestion into the Hive schema for analysis.
  • Implemented complex MapReduce programs to perform joins on the Map side using distributed cache.
  • Set up Flume for different sources to bring log messages from outside into HDFS.
  • Worked on compression mechanisms to optimize MapReduce jobs.
  • Hands-on experience with real-time analytics and BI.
  • Implemented business logic by writing UDFs in Java and used various UDFs from Piggybank and other sources.
  • Continuously monitored and managed the Hadoop cluster using Cloudera Manager.
  • Unit tested and tuned SQL and ETL code for better performance.
  • Monitored the performance and identified performance bottlenecks in ETL code.
  • Exported the analyzed data to the relational databases using Sqoop for visualization and to generate reports for the BI team.
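A minimal sketch of the broadcast-variable and accumulator tuning pattern mentioned above, in Scala. The lookup map, input format, and paths are hypothetical, for illustration only.

```scala
import org.apache.spark.sql.SparkSession

object BroadcastAndAccumulator {
  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder().appName("BroadcastAndAccumulator").getOrCreate()
    val sc = spark.sparkContext

    // Small reference table shipped once per executor instead of with every task.
    val planNames = sc.broadcast(Map("P1" -> "Basic", "P2" -> "Premium"))

    // Accumulator to count malformed records seen across the cluster.
    val badRecords = sc.longAccumulator("badRecords")

    // Hypothetical input of "accountId,planCode" lines.
    val enriched = sc.textFile("hdfs:///data/raw/subscriptions.csv").flatMap { line =>
      line.split(",") match {
        case Array(accountId, planCode) =>
          Some((accountId, planNames.value.getOrElse(planCode, "Unknown")))
        case _ =>
          badRecords.add(1L) // track, but skip, malformed lines
          None
      }
    }

    enriched.saveAsTextFile("hdfs:///data/enriched/subscriptions")
    println(s"Malformed records skipped: ${badRecords.value}")
    spark.stop()
  }
}
```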

Environment: MapReduce, HBase, HDFS, Hive, Pig, Java, SQL, Cloudera Manager, AWS EMR, Sqoop, Flume, Zookeeper, YARN, Oozie, Eclipse

Confidential, Texas

Software Developer

Responsibilities:

  • Developed Scala programs to extract and transform the data sets; the results were exported back to HDFS.
  • Experienced in running Spark Streaming jobs to process large volumes of XML-format data.
  • Involved in efficiently collecting and aggregating large amounts of log data into the Spark cluster using Apache Flume.
  • Involved in creating Hive tables, loading them with data, and writing Hive queries.
  • Worked on both external and managed Hive tables for optimized performance (see the sketch following this list).
  • Used Oozie workflow engine to manage interdependent Hadoop jobs and to automate several types of Hadoop jobs such as Java MapReduce, Hive and Sqoop as well as system specific jobs.
  • Worked with data analysts to construct creative solutions for their analysis tasks.
  • Installed and configured Hortonworks Ambari on an existing cluster.
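The sketch below contrasts managed and external Hive tables, issued through Spark SQL in Scala with Hive support enabled. Table names, columns, and the HDFS location are hypothetical placeholders.

```scala
import org.apache.spark.sql.SparkSession

object HiveTableSetup {
  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder()
      .appName("HiveTableSetup")
      .enableHiveSupport()
      .getOrCreate()

    // Managed table: Hive owns the data; dropping the table deletes the files.
    spark.sql(
      """CREATE TABLE IF NOT EXISTS orders_managed (
        |  order_id STRING,
        |  amount   DOUBLE
        |) STORED AS PARQUET""".stripMargin)

    // External table: only metadata is registered; the HDFS files survive a DROP TABLE.
    spark.sql(
      """CREATE EXTERNAL TABLE IF NOT EXISTS orders_external (
        |  order_id STRING,
        |  amount   DOUBLE
        |) STORED AS PARQUET
        |LOCATION 'hdfs:///data/landing/orders'""".stripMargin)

    // Example query against the external table.
    spark.sql("SELECT COUNT(*) AS order_count FROM orders_external").show()
    spark.stop()
  }
}
```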

Environment: Spark, Scala, Ambari, Hive, HDFS, Flume, Java 8, IntelliJ IDEA, Apache Kafka, Git.

Confidential

Software Developer

Responsibilities:

  • Involved in requirements gathering and analysis from the existing system.
  • Worked with Agile Software Development.
  • Designed and developed business components using Spring AOP, Spring IoC, and Spring Batch.
  • Implemented the DAO layer using Hibernate and AOP, and the service layer using Spring following the MVC design.
  • Developed Java server components using Spring, Spring MVC, Hibernate, and Web Services technologies.
  • Used Hibernate as persistence framework for DAO layer to access the database.
  • Worked with the JavaScript framework AngularJS.
  • Designed and developed Restful APIs for different modules in the project as per the requirement.
  • Developed JSP pages using Custom tags and Tiles framework.
  • Developed the User Interface Screens for presentation logic using JSP and HTML.
  • Used Spring IoC to inject services and their dependencies through the dependency injection mechanism.
  • Developed SQL queries to interact with the SQL Server database and wrote PL/SQL code for procedures and functions.
  • Developed the persistence layer (DAL) and the presentation layer.
  • Created AngularJS controllers, directives, and models for different modules in the frontend.
  • Used Maven as the build framework and Jenkins for the continuous build system.
  • Developed the GUI using front-end technologies: JSP, JSTL, AJAX, HTML, CSS, and JavaScript.
  • Developed code for web services using XML and SOAP and used the SoapUI tool for testing the services; proficient in testing web page functionality and raising defects.
  • Involved in writing the Spring configuration XML file that contains bean declarations; business classes are wired up to the frontend managed beans using the Spring IoC pattern.

Environment: Java, J2EE, Spring Core, Spring Data, Spring MVC, Spring AOP, Spring Batch, Spring Scheduler, RESTful Web Services, SOAP Web Services, Hibernate, Eclipse IDE, AngularJS, JSP, JSTL, HTML5, CSS, JavaScript, WebLogic, Tomcat, XML, XSD, Unix, Linux, UML, Oracle, Maven, SVN, SOA, Design Patterns, JMS, JUnit, Log4j, WSDL, JSON, JNDI.
