
Big Data Developer Resume

Indianapolis, IN

PROFESSIONAL SUMMARY:

  • Experienced and self-motivated Big Data Developer with over 5 years of extensive experience in the Big Data ecosystem and Enterprise Data Warehouse systems; Agile development experience with a proven track record of successful implementations.
  • Experience using various Hadoop ecosystem tools, particularly Spark, Pig, Hive, Oozie, Sqoop, Flume, ZooKeeper, Impala, Hue, YARN, and SQL databases.
  • Expertise in Java 6, JavaScript, JEE, JUnit, CSS, jQuery, HTML, XML, SOAP, Spring MVC, Hibernate, JSP, Servlets, and Web Services.
  • Experience in developing MapReduce programs using Apache Hadoop for processing Big Data.
  • Experience in importing and exporting data between HDFS and relational database systems using Sqoop.
  • In-depth understanding of Hadoop architecture and its components, including HDFS, JobTracker, TaskTracker, NameNode, DataNode, and MapReduce concepts.
  • Experience in analyzing data using HiveQL, Pig Latin, and custom MapReduce programs in Java.
  • Hands-on experience in writing Pig Latin scripts and Pig commands.
  • Thorough knowledge of Monitoring, Replication and Sharding Techniques in MongoDB.
  • Experienced in designing partitions in cubes using SSAS to improve performance.
  • Strong knowledge of UNIX/Linux environments, writing shell scripts and PL/SQL stored procedures.
  • Good knowledge of using Hibernate for mapping Java classes to database tables and using Hibernate Query Language (HQL).
  • Worked on Java/J2EE systems with different databases, including Oracle, MySQL, and DB2.
  • Experience in designing and maintaining high-performing ETL/ELT processes.
  • Extensive experience working with Oracle, DB2, SQL Server, and MySQL databases.
  • Committed hardworking individual with strong communication and organizational skills.
  • Ability to adapt to evolving technology and a strong sense of responsibility.

WORK HISTORY:

Big Data Developer

Confidential, Indianapolis, IN

Responsibilities:

  • Worked on analyzing the Hadoop cluster and different Big Data analytic tools, including MapReduce, Hive, and Spark.
  • Prepared Linux shell scripts for automating the process and implemented Impala for data analysis.
  • Implemented batch processing of data sources using Apache Spark.
  • Executed Spark RDD transformations and actions as per business analysis needs (a minimal sketch follows this list).
  • Migrated Hive queries into Spark SQL to improve performance.
  • Developed predictive analytics using Apache Spark Scala APIs.
  • Developed solutions to pre-process large sets of structured and semi-structured data in different file formats (Text, Avro, SequenceFile, XML, JSON, and Parquet).
  • Wrote Pig Latin scripts for the analysis of semi-structured data.
  • Performed optimization on Pig scripts and Hive queries to increase efficiency and add new features to existing code.
  • Created Hive tables, loaded data into them, and wrote Hive UDFs.
  • Imported data from MySQL into HDFS using Sqoop and managed Hadoop log files.
  • Performed data cleansing and processing using Pig Latin operations and UDFs.
  • Wrote Hive Scripts for analyzing data in Hive warehouse using HiveQL.
  • Collected log data from various sources and integrated it into HDFS using Flume.
  • Ran Hadoop streaming jobs to process terabytes of XML-format data.
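
The RDD work referenced above can be illustrated with a minimal sketch using the Spark 1.x Java API; the class name, HDFS path, delimiter, and column layout are hypothetical placeholders rather than the actual business logic.

```java
import org.apache.spark.SparkConf;
import org.apache.spark.api.java.JavaRDD;
import org.apache.spark.api.java.JavaSparkContext;
import org.apache.spark.api.java.function.Function;
import org.apache.spark.api.java.function.Function2;

public class OrderAmountTotals {
    public static void main(String[] args) {
        SparkConf conf = new SparkConf().setAppName("OrderAmountTotals");
        JavaSparkContext sc = new JavaSparkContext(conf);

        // Transformation: read raw text records from HDFS (path is a placeholder)
        JavaRDD<String> lines = sc.textFile("hdfs:///data/orders/input");

        // Transformation: keep only well-formed, comma-delimited records
        JavaRDD<String> valid = lines.filter(new Function<String, Boolean>() {
            public Boolean call(String line) {
                return line != null && line.split(",").length >= 3;
            }
        });

        // Transformation: project the (hypothetical) amount column as a double
        JavaRDD<Double> amounts = valid.map(new Function<String, Double>() {
            public Double call(String line) {
                return Double.parseDouble(line.split(",")[2]);
            }
        });

        // Actions: count() and reduce() trigger the actual computation
        long records = amounts.count();
        double total = amounts.reduce(new Function2<Double, Double, Double>() {
            public Double call(Double a, Double b) {
                return a + b;
            }
        });

        System.out.println("records=" + records + ", total=" + total);
        sc.stop();
    }
}
```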

Environment: HDFS, CDH 5.1.2, Apache Spark 1.4.0, Pig, Hive, Sqoop, SQL, Shell scripting, Java 7.0, Oracle 10g/11g.

Hadoop/Spark Developer

Confidential, Albany, NY

Responsibilities:

  • Developed Spark code and Spark-SQL/Streaming for faster testing and processing of data.
  • Developed a data pipeline using Kafka and Storm to store data in HDFS.
  • Explored Spark to improve the performance and optimization of existing algorithms in Hadoop using Spark Context, Spark SQL, DataFrames, pair RDDs, and Spark on YARN.
  • Installed/Configured/Maintained Hortonworks Hadoop clusters for application development.
  • Used Spark API over Hortonworks Hadoop YARN to perform analytics on data in Hive.
  • Responsible for building scalable distributed data solutions using a Hadoop cluster environment with AWS infrastructure services: Amazon Simple Storage Service (Amazon S3), EMR, and Amazon Elastic Compute Cloud (Amazon EC2).
  • Loaded data into Spark RDDs and performed in-memory computation to generate the output response.
  • Developed and executed shell scripts to automate jobs, and wrote complex Hive queries and UDFs.
  • Worked on reading multiple data formats on HDFS using Spark.
  • Involved in converting Hive/SQL queries into Spark transformations using Spark RDDs, Python and Scala.
  • Developed multiple POCs using Spark, deployed them on the YARN cluster, and compared the performance of Spark with Hive and SQL/Teradata. Analyzed the SQL scripts and designed solutions to implement them using Spark.
  • Involved in loading data from the UNIX file system to HDFS and AWS S3.
  • Extracted the data from Teradata into HDFS using Sqoop.
  • Handled importing of data from various data sources such as AWS S3 and MongoDB, performed transformations using Hive, MapReduce, and Spark, and loaded the data into HDFS.
  • Managed and reviewed Hadoop log files.
  • Involved in analysis, design, testing phases and responsible for documenting technical specifications.
  • Developed Kafka producers and consumers, HBase clients, and Spark and Hadoop MapReduce jobs, along with components on HDFS and Hive, using AWS EMR (a minimal producer sketch follows this list).
  • Used Apache Atlas to exchange metadata between MariaDB and Hive.
  • Facilitated daily scrum meetings, sprint planning, sprint reviews, and sprint retrospectives.
  • Worked on the core and Spark SQL modules of Spark extensively.
  • Ran Hadoop streaming jobs to process terabytes of data from AWS S3.
  • Implemented Oozie jobs for importing real-time data into Hadoop using Kafka and for daily imports.
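
A minimal sketch of a Kafka producer of the kind referenced above, using the standard Kafka producer API; the broker address, topic name, keys, and payloads are hypothetical placeholders, and the write to HDFS is assumed to happen downstream (for example, via the Storm topology or a consumer).

```java
import java.util.Properties;
import org.apache.kafka.clients.producer.KafkaProducer;
import org.apache.kafka.clients.producer.Producer;
import org.apache.kafka.clients.producer.ProducerRecord;

public class EventProducer {
    public static void main(String[] args) {
        Properties props = new Properties();
        props.put("bootstrap.servers", "broker1:9092");   // placeholder broker
        props.put("acks", "all");
        props.put("key.serializer",
                  "org.apache.kafka.common.serialization.StringSerializer");
        props.put("value.serializer",
                  "org.apache.kafka.common.serialization.StringSerializer");

        Producer<String, String> producer = new KafkaProducer<String, String>(props);
        try {
            // Each record is keyed so related events land in the same partition;
            // a downstream consumer or topology writes them on to HDFS.
            for (int i = 0; i < 10; i++) {
                producer.send(new ProducerRecord<String, String>(
                        "events", "key-" + i, "payload-" + i));
            }
        } finally {
            producer.close();
        }
    }
}
```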

Environment: Hadoop, HDFS, Hive, Scala, Spark, SQL, MongoDB, MariaDB, UNIX Shell Scripting, AWS S3, EMR, Hortonworks HDP 2.5, Hadoop Stack, Apache Ranger and Apache Atlas.

Hadoop Developer

Confidential, Lexington, KY

Responsibilities:

  • Involved in loading data from LINUX file system to HDFS.
  • Implemented test scripts to support test driven development and continuous integration.
  • Responsible for managing data coming from different sources.
  • Loaded and transformed large sets of structured, semi-structured, and unstructured data.
  • Managed and reviewed Hadoop log files, and managed and scheduled jobs on the Hadoop cluster.
  • Worked on Hive to expose data for further analysis and to transform files from different analytical formats into text files.
  • Installed and configured Hadoop MapReduce and HDFS; developed multiple MapReduce jobs in Java for data cleaning and preprocessing (a minimal mapper sketch follows this list).
  • Imported and exported data into HDFS and Hive using Sqoop.
  • Involved in creating Hive tables, loading them with data, and writing Hive queries that run internally as MapReduce jobs.
  • Used Pig as an ETL tool to perform transformations, event joins, bot-traffic filtering, and pre-aggregations before storing the data in HDFS.
  • Wrote Hive queries for data analysis to meet business requirements.
  • Involved in writing Hive scripts to extract, transform, and load the data into the database.
  • Used JIRA for bug tracking and used CVS for version control.
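
A minimal sketch of the kind of Java MapReduce data-cleaning job mentioned above: a map-only step that drops malformed rows and normalizes fields. The class name, delimiter, and expected column count are hypothetical.

```java
import java.io.IOException;
import org.apache.hadoop.io.LongWritable;
import org.apache.hadoop.io.NullWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Mapper;

public class CleanRecordsMapper
        extends Mapper<LongWritable, Text, NullWritable, Text> {

    private static final int EXPECTED_COLUMNS = 5;   // hypothetical record width

    @Override
    protected void map(LongWritable key, Text value, Context context)
            throws IOException, InterruptedException {
        String line = value.toString().trim();
        if (line.isEmpty()) {
            return;                       // skip blank lines
        }
        String[] fields = line.split("\\t", -1);
        if (fields.length != EXPECTED_COLUMNS) {
            context.getCounter("clean", "malformed").increment(1);
            return;                       // skip malformed rows, keep a counter
        }
        StringBuilder out = new StringBuilder();
        for (int i = 0; i < fields.length; i++) {
            if (i > 0) {
                out.append('\t');
            }
            out.append(fields[i].trim().toLowerCase());
        }
        context.write(NullWritable.get(), new Text(out.toString()));
    }
}
```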

Environment: Hadoop, Hive, Linux, MapReduce, HDFS, Pig, Sqoop, Shell Scripting, Python, Java 6 (JDK 1.6), Eclipse, Control-M scheduler, Oracle 10g, PL/SQL, SQL*Plus, Toad 9.6, JIRA 5.1/5.2, CVS.

Java Developer

Confidential

Responsibilities:

  • Configured the Hibernate ORM tool as the persistence layer to provide persistence services and persistent objects to the application from the database.
  • Developed the DAO layer using Spring configuration XMLs for Hibernate to manage CRUD operations such as insert, update, and delete (a minimal sketch follows this list).
  • Implemented reusable services using BPEL to transfer data.
  • Configured dependency injection using the Spring framework.
  • Developed a web-based reporting system with JSP, DAO, and Apache Struts Validator using the Struts framework.
  • Designed the controller using Servlets.
  • Developed JUnit classes and created JUnit test cases.
  • Configured logging (enable/disable) using log4j for the application.
  • Created the user interface using HTML, CSS, JSP, jQuery, AJAX, JavaScript, and JSTL.
  • Implemented database operations using PL/SQL procedures and queries.
  • Developed shell scripts for UNIX environment to deploy EAR and read log files.
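
A minimal sketch of a Hibernate-backed DAO for CRUD operations as described above, assuming the SessionFactory and transaction manager are wired through the Spring XML configuration; the class and method names are illustrative, not the project's actual code.

```java
import java.io.Serializable;
import java.util.List;
import org.hibernate.Session;
import org.hibernate.SessionFactory;
import org.springframework.transaction.annotation.Transactional;

public class GenericHibernateDao<T> {

    private final Class<T> entityClass;
    private SessionFactory sessionFactory;

    public GenericHibernateDao(Class<T> entityClass) {
        this.entityClass = entityClass;
    }

    // Setter used by Spring's XML-based dependency injection
    public void setSessionFactory(SessionFactory sessionFactory) {
        this.sessionFactory = sessionFactory;
    }

    @Transactional
    public Serializable insert(T entity) {
        return currentSession().save(entity);      // insert
    }

    @Transactional
    public void update(T entity) {
        currentSession().update(entity);           // update
    }

    @Transactional
    public void delete(T entity) {
        currentSession().delete(entity);           // delete
    }

    @Transactional(readOnly = true)
    @SuppressWarnings("unchecked")
    public List<T> findAll() {
        // HQL query against the mapped entity name, not the table
        return currentSession()
                .createQuery("from " + entityClass.getSimpleName())
                .list();
    }

    private Session currentSession() {
        return sessionFactory.getCurrentSession();
    }
}
```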

Environment: Java, Jest, SQL, JUnit, PL/SQL, SOA Suite 10g BPEL, Struts, Spring, Hibernate, Web Services (JAX-WS), JMS, EJB, WebLogic 10.1 Server, JDeveloper, HTML, LDAP, Maven, XML, CSS, JavaScript, JSON, Oracle, CVS, and UNIX.

Java Developer

Confidential

Responsibilities:

  • Involved in designing and developing modules on both the client and server side.
  • Developed the UI using JSP, JavaScript and HTML.
  • Responsible for validating the data at the client side using JavaScript.
  • Interacted with external services to retrieve user information using SOAP web service calls.
  • Developed web components using JSP, Servlets and JDBC.
  • Performed technical analysis, design, development, and documentation with a focus on implementation and agile development.
  • Accessed backend database Oracle using JDBC.
  • Developed and wrote UNIX Shell scripts to automate various tasks.
  • Developed user and technical documentation.
  • Developed business objects, request handlers and JSPs for this project using Java Servlets and XML.
  • Developed core Spring components for some of the modules and integrated them with the existing Struts framework.
  • Actively participated in testing and designed user interface using HTML and JSPs.
  • Implemented the database connectivity to Oracle using JDBC, designed and created tables using SQL.
  • Implemented the server-side processing using Java Servlets (a minimal sketch follows this list).
  • Installed and configured the Apache Web server and deployed JSPs and Servlets in Tomcat Server.
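
A minimal sketch of the servlet-plus-JDBC pattern described above; the servlet name, JDBC URL, credentials, and table are hypothetical placeholders (in a real deployment the connection would typically come from a container-managed DataSource rather than DriverManager).

```java
import java.io.IOException;
import java.io.PrintWriter;
import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.PreparedStatement;
import java.sql.ResultSet;
import java.sql.SQLException;
import javax.servlet.ServletException;
import javax.servlet.http.HttpServlet;
import javax.servlet.http.HttpServletRequest;
import javax.servlet.http.HttpServletResponse;

public class UserLookupServlet extends HttpServlet {

    private static final String JDBC_URL = "jdbc:oracle:thin:@dbhost:1521:ORCL"; // placeholder

    protected void doGet(HttpServletRequest request, HttpServletResponse response)
            throws ServletException, IOException {
        String userId = request.getParameter("id");
        response.setContentType("text/html");
        PrintWriter out = response.getWriter();

        try {
            Class.forName("oracle.jdbc.OracleDriver");
            Connection conn = DriverManager.getConnection(JDBC_URL, "appuser", "secret");
            try {
                // Parameterized query against a hypothetical USERS table
                PreparedStatement ps =
                        conn.prepareStatement("SELECT name FROM users WHERE id = ?");
                ps.setString(1, userId);
                ResultSet rs = ps.executeQuery();
                out.println("<html><body>");
                if (rs.next()) {
                    out.println("<p>User: " + rs.getString("name") + "</p>");
                } else {
                    out.println("<p>No user found.</p>");
                }
                out.println("</body></html>");
            } finally {
                conn.close();   // also releases the statement and result set
            }
        } catch (ClassNotFoundException e) {
            throw new ServletException("Oracle JDBC driver not found", e);
        } catch (SQLException e) {
            throw new ServletException("Database error", e);
        }
    }
}
```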

Environment: Java, Servlets, JSP, JavaScript, JDBC, Unix Shell scripting, HTML, Eclipse, Oracle 8i, WebLogic.
