
Big Data Developer Resume




Big Data Developer


  • Developed data pipelines using Spark, Hive, Kafka, Java, and DB to ingest customer financial data and financial histories into the Hadoop cluster for analysis.
  • Implemented a generic framework to handle the different data collection methodologies of the client's primary data sources, validate and transform the data using Spark, and load it into Hive.
  • Collected data with Spark Streaming, consuming from Kafka in near real time, and performed the necessary transformations and aggregations on the fly to build the common learner data model, persisting the data in HDFS.
  • Improved the performance and optimization of the existing algorithms in Hadoop using Spark, including Spark Context, Spark SQL, DataFrames, pair RDDs, and Spark on YARN.
  • Developed Spark code using Java and Spark SQL/Streaming for faster testing and processing of data.
  • Converted Hive/SQL queries into Spark transformations using Spark RDDs and Scala; worked on the Spark SQL and Spark Streaming modules and used Scala and Python for all Spark use cases.
  • Used Oracle GoldenGate replication to ensure that software-as-a-service (SaaS) business applications had the best data to work with to drive the business transformation.
  • Migrated historical data to Hive and developed a reliable mechanism for processing the incremental updates.
  • Used the Oozie workflow engine to manage independent Hadoop jobs and to automate several types of Hadoop jobs (Java MapReduce, Hive, and Sqoop) as well as system-specific jobs; monitored and debugged Hadoop jobs/applications running in production.
  • Simplified and strengthened security for the Azure cloud with capabilities covering virtualization security, container security, web application firewalling, and identity and access management.
  • Provided user support and application support on the Hadoop infrastructure.
  • Evaluated and compared different tools for test data management with Hadoop, and helped the testing team with Hadoop application testing.
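The per-batch aggregation described above can be sketched in plain Java, with java.util.stream standing in for the Spark transformations; the "customerId,amount" record shape and all class names here are hypothetical illustrations, not the project's real schema:

```java
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.stream.Collectors;

public class LearnerAggregation {

    // Hypothetical message body pulled from a Kafka topic: "customerId,amount".
    static class Txn {
        final String customerId;
        final double amount;
        Txn(String customerId, double amount) {
            this.customerId = customerId;
            this.amount = amount;
        }
        static Txn parse(String line) {
            String[] parts = line.split(",");
            return new Txn(parts[0], Double.parseDouble(parts[1]));
        }
    }

    // Total amount per customer for one micro-batch, analogous to a
    // mapToPair(...).reduceByKey(Double::sum) pipeline in Spark Streaming.
    static Map<String, Double> totalsByCustomer(List<String> lines) {
        return lines.stream()
                .map(Txn::parse)
                .collect(Collectors.groupingBy(
                        t -> t.customerId,
                        Collectors.summingDouble(t -> t.amount)));
    }

    public static void main(String[] args) {
        List<String> batch = Arrays.asList("c1,10.0", "c2,5.0", "c1,2.5");
        System.out.println(totalsByCustomer(batch)); // totals keyed by customer id
    }
}
```

In the real pipeline the grouped result would be persisted to HDFS rather than printed, and the reduction would run distributed across executors.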

Environment: Java 1.8, Spark, Hive, Spark SQL, Spark Streaming, HBase, Sqoop, Kafka, AWS EC2, S3, Cloudera, Scala IDE (Eclipse), IntelliJ IDEA, Linux Shell Scripting, HDFS


Hadoop Developer


  • Developed a data pipeline using Flume, Sqoop, Pig, and Java MapReduce to ingest customer behavioral data and financial histories into HDFS for analysis.
  • Ingested data via Sqoop and HDFS put/copyFromLocal commands, and ran MapReduce jobs on the ingested data.
  • Used Pig to perform transformations and event joins, filter bot traffic, and run some pre-aggregations before storing the data in HDFS.
  • Extensive experience in ETL data ingestion, in-stream data processing, batch analytics, and data persistence strategy.
  • Implemented Hadoop-based data warehouses and integrated Hadoop with enterprise data warehouse systems; worked on the OLAP modeling process.
  • Designed and developed ETL workflows using Java for processing data in HDFS/HBase, orchestrated with Oozie.
  • Developed Pig UDFs for needed functionality that is not available out of the box in Apache Pig.
  • Expertise with the tools in the Hadoop ecosystem, including Pig, Hive, HDFS, MapReduce, Sqoop, Kafka, YARN, Oozie, and ZooKeeper, as well as the Hadoop architecture and its components.
  • Extensive experience using message-oriented middleware (MOM) with ActiveMQ, Apache Storm, Kafka, Maven, and ZooKeeper.
  • Integrated the Hadoop cluster with the Spark engine to perform batch and GraphX operations.
  • Developed Kafka producers and consumers, HBase clients, and Spark and Hadoop MapReduce jobs, along with components on HDFS and Hive.
  • Used Hive to analyze the partitioned and bucketed data and compute various metrics for reporting.
  • Created action filters, parameters, and calculated sets for preparing dashboards and worksheets in Tableau; experienced in converting CSV files to .tde files using the Tableau Extract API.
  • Developed Hive DDLs to create, alter, and drop Hive tables; also worked with Storm.
  • Created scalable, high-performance web services for data tracking.
  • Loaded data from the UNIX file system into HDFS.
  • Installed and configured Hive, wrote Hive UDFs, and used cluster coordination services through ZooKeeper.
  • Created Hive tables, loaded them with data, and wrote Hive queries that run internally as MapReduce jobs.
  • Experienced in managing the Hadoop cluster using Cloudera Manager.
  • Used HCatalog to access Hive table metadata from MapReduce and Pig code.
  • Computed metrics that define the user experience using Java MapReduce.
  • Developed a data pipeline using Flume, Sqoop, Postgres, and Pig to extract data from web logs and store it in HDFS.
  • Extracted and updated data in MongoDB using the mongoimport and mongoexport command-line utilities.
  • Developed shell scripts to orchestrate execution of the other scripts (Pig, Hive, and MapReduce) and to move data files within and outside of HDFS.
  • Involved in Hadoop testing; developed unit test cases using the JUnit, EasyMock, and MRUnit testing frameworks.
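The Java MapReduce metric work above can be illustrated with a single-process sketch that simulates the map, shuffle, and reduce phases; the "user page" web-log format is a hypothetical example, not the project's actual log schema:

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

public class PageHitMetric {

    // Map phase: emit (page, 1) per log line; shuffle: group values by key;
    // reduce phase: sum counts per page, mirroring what a distributed
    // Java MapReduce job would do on the cluster.
    static Map<String, Integer> hitsPerPage(List<String> logLines) {
        Map<String, List<Integer>> shuffled = new HashMap<>();
        for (String line : logLines) {                                  // map
            String page = line.split(" ")[1];
            shuffled.computeIfAbsent(page, k -> new ArrayList<>()).add(1);
        }
        Map<String, Integer> reduced = new HashMap<>();
        for (Map.Entry<String, List<Integer>> e : shuffled.entrySet()) { // reduce
            int sum = 0;
            for (int v : e.getValue()) sum += v;
            reduced.put(e.getKey(), sum);
        }
        return reduced;
    }

    public static void main(String[] args) {
        List<String> logs = Arrays.asList("u1 /home", "u2 /home", "u1 /cart");
        System.out.println(hitsPerPage(logs)); // hit counts keyed by page
    }
}
```

On a real cluster the shuffle is handled by the framework between the Mapper and Reducer classes; this sketch just makes the three phases visible in one place.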



Java Developer


  • Involved in various Software Development Life Cycle (SDLC) phases of the project.
  • Developed the application using the Struts framework, which is based on the Model View Controller (MVC) design pattern.
  • Extensively used Hibernate in the data access layer to perform database operations.
  • Used the Spring Framework for dependency injection and integrated it with the Struts framework and Hibernate.
  • Designed and developed the front end using the Struts framework; configured Struts DynaActionForms, Message Resources, Action Messages, Action Errors, validation.xml, and validator-rules.xml.
  • Used JSP, JavaScript, JSTL, EL, custom tag libraries, and the validations provided by the Struts framework.
  • Used web services (WSDL and SOAP) to obtain credit card information from a third party.
  • Worked on advanced Hibernate associations with multiple levels of caching and lazy loading.
  • Designed the tables required for the project in an Oracle 9i database and used stored procedures and triggers in the application.
  • Consumed RESTful web services to render data on the front page.
  • Performed unit testing using the JUnit framework.
  • Coordinated with the QA team on manual and automation testing, and worked with the DB team, QA team, business analysts, and client reps to complete the client requirements efficiently.
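The MVC flow described above, where Struts maps a request path to an Action whose execute() returns a forward name, can be sketched framework-free in plain Java; the /login path, form fields, and forward names are hypothetical illustrations:

```java
import java.util.HashMap;
import java.util.Map;

public class MiniMvc {

    // Controller contract, analogous to a Struts Action's execute() method.
    interface Action {
        String execute(Map<String, String> form); // returns a "forward" name
    }

    // Front controller: maps request paths to actions, playing the role
    // of the action mappings declared in struts-config.xml.
    static final Map<String, Action> MAPPINGS = new HashMap<>();
    static {
        MAPPINGS.put("/login", form ->
                "admin".equals(form.get("user")) ? "success" : "failure");
    }

    static String dispatch(String path, Map<String, String> form) {
        Action action = MAPPINGS.get(path);
        return action == null ? "notFound" : action.execute(form);
    }

    public static void main(String[] args) {
        Map<String, String> form = new HashMap<>();
        form.put("user", "admin");
        System.out.println(dispatch("/login", form)); // prints "success"
    }
}
```

In Struts the returned forward name is resolved to a JSP view via the configuration file; here it is simply returned to the caller.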

Environment: HTML, JSP, Servlets, JDBC, JavaScript, Tomcat 5, Eclipse IDE, XML, XSL
