Hadoop Developer Resume

Pleasanton, California

SUMMARY:

  • 6 years of experience in software development, deployment, and maintenance of various web-based applications using Java and Big Data ecosystems on Windows and Linux environments.
  • 5+ years of experience with major components of the Hadoop ecosystem, including Hadoop, MapReduce, YARN, Flume, Hive, Pig, Sqoop, HBase, Pivotal, Cloudera, MapR, Avro, Spark, and Scala.
  • Strong development skills in Hadoop, HDFS, MapReduce, Hive, Sqoop, and HBase, with a solid understanding of Hadoop internals.
  • Hands-on experience with Hadoop/Big Data technologies for storage, querying, processing, and analysis of data.
  • Worked extensively with major components of the Hadoop ecosystem such as HDFS, HBase, Hive, Sqoop, Pig, and MapReduce.
  • Developed various scripts and batch jobs to schedule Hadoop programs.
  • Experience in analyzing data using HiveQL and custom MapReduce programs in Java.
  • Hands-on experience in importing and exporting data between relational databases such as Oracle and MySQL and HDFS/Hive using Sqoop.
  • Good knowledge of NoSQL databases such as MongoDB, Cassandra, and HBase.
  • Expertise in writing Hadoop jobs to analyze data using MapReduce, Hive, Pig, Solr, and Splunk.
  • Experience in programming and developing Java modules for an existing Java-based web portal using technologies such as JSP, Servlets, JavaScript, HTML, and SOA with MVC architecture.
  • Strong programming skills in designing and implementing multi-tier applications using Java, J2EE, JDBC, JSP, JSTL, HTML, CSS, JSF, Struts, JavaScript, Servlets, POJO, and EJB.
  • Extensive experience in SOA-based solutions: Web Services, Web API, WCF, and SOAP, including RESTful API services.
  • Good knowledge of Amazon Web Services (AWS) concepts such as EMR and EC2, which provide fast and efficient processing of Teradata Big Data Analytics.
  • Experienced in collecting log data and JSON data into HDFS using Flume and processing the data using Hive/Pig.
  • Experience working with EC2 (Elastic Compute Cloud) cluster instances, setting up data buckets on S3 (Simple Storage Service), and setting up EMR (Elastic MapReduce).
  • Worked extensively with Core Java, Struts, JSF, Spring, Hibernate, Servlets, and JSP, with hands-on experience in PL/SQL, XML, and SOAP.
  • In-depth understanding and knowledge of Hadoop architecture and its components such as HDFS, JobTracker, TaskTracker, NameNode, and DataNode.
  • Well versed in working with relational database management systems such as Oracle, MS SQL Server, and MySQL.
  • Hands-on experience with advanced Big Data technologies such as the Spark ecosystem (Spark SQL, MLlib, SparkR, and Spark Streaming), Kafka, and predictive analytics.
  • Knowledge of the Software Development Life Cycle (SDLC) and of Agile and Waterfall methodologies.
  • Good knowledge of NoSQL databases such as HBase, MongoDB, and Cassandra.
  • Experience in working with Eclipse IDE, NetBeans, and Rational Application Developer.
  • Experience includes Requirement Gathering, Design, Development, Integration, Documentation, Testing and Build.

TECHNICAL SKILLS:

Big Data Ecosystems: HDFS, MapReduce, Hive, YARN, HBase, Pig, Sqoop, Kafka, Storm, Flume, Oozie, ZooKeeper, Apache Spark, Impala, NiFi, Apache Solr.

NoSQL Databases: MongoDB, HBase, Cassandra

Programming Languages: Java, Scala, Python, SQL, PL/SQL.

Frameworks: MVC, Struts, Spring, Hibernate

Operating Systems: Sun Solaris, HP-UX, Red Hat Linux, Ubuntu Linux and Windows XP/Vista/7/8

Web Technologies: HTML, DHTML, XML, AJAX, WSDL, SOAP

Web/Application servers: Apache Tomcat, WebLogic, JBoss

Version control: SVN, Git

Business Intelligence Tools: Talend, Informatica, Tableau

Databases: Oracle 9i/10g/11g, DB2, SQL Server, MySQL, Teradata

Tools and IDE: Eclipse, IntelliJ, NetBeans, Toad, Maven, Jenkins, Ant

Cloud Technologies: Amazon Web Services (Amazon Redshift, S3), Microsoft Azure Insight

PROFESSIONAL EXPERIENCE:

Confidential, Pleasanton, California

Hadoop Developer

Responsibilities:

  • Responsible for building scalable distributed data solutions using Hadoop.
  • Used Spark Streaming APIs to perform the necessary transformations and actions on the fly to build the common learner data model, which receives data from Kafka in near real time and persists it into Cassandra.
  • Experience in loading data into Spark RDDs and performing advanced procedures such as text analytics and processing, using Spark's in-memory computation capabilities with Scala to generate the output response.
  • Experienced in handling large datasets using partitions, Spark's in-memory capabilities, broadcasts, effective and efficient joins, transformations, and other operations during the ingestion process itself.
  • Used Apache Flume to produce data into Kafka.
  • Worked with Apache NiFi flows to convert raw XML data into JSON and Avro.
  • Experienced in performance tuning of Spark applications by setting the right batch interval, the correct level of parallelism, and memory tuning, and in optimizing existing algorithms in Hadoop using SparkContext, Spark SQL, DataFrames, and pair RDDs.
  • Used DataStax Spark-Cassandra connector to load data into Cassandra and used CQL to analyze data from Cassandra tables for quick searching, sorting and grouping.
  • Designed, developed, and maintained data integration programs in a Hadoop and RDBMS environment with both traditional and non-traditional source systems.
  • Experience in writing Sqoop scripts for importing and exporting data between RDBMS and HDFS.
  • Ingested data from RDBMS, performed data transformations, and then exported the transformed data to Cassandra for data access and analysis.
  • Created Hive tables for loading and analyzing data, implemented partitions and buckets, and developed Hive queries to process the data and generate data cubes for visualization.
  • Implemented schema extraction for Parquet and Avro file formats in Hive.
  • Developed Hive scripts in HiveQL to denormalize and aggregate the data.
  • Used the Spark API over Cloudera Hadoop YARN to perform analytics on data in Hive.
  • Worked on a POC comparing the processing time of Impala with Apache Hive for batch applications, in order to adopt Impala in the project.
  • Worked on migrating MapReduce programs into Spark transformations using Spark and Scala.
  • Experience in job management using the Fair Scheduler; developed job processing scripts using Oozie workflows.
  • Experience with NoSQL column-oriented databases such as Cassandra and their integration with a Hadoop cluster.
  • Experience in querying Parquet files by loading them into Spark DataFrames using Zeppelin notebooks.
  • Experience in troubleshooting problems that arise during batch data processing jobs.
  • Extracted data from Teradata into HDFS/dashboards using Spark Streaming.
  • Migrated an existing on-premises application to AWS. Used AWS services such as EC2 and S3 for processing and storing small data sets. Experienced in maintaining the Hadoop cluster on AWS EMR.

Environment: Hadoop YARN, Spark Core, Spark Streaming, Spark SQL, Scala, Python, Kafka, Hive, Sqoop, Amazon AWS, Elasticsearch, Impala, Cassandra, Tableau, Talend, Cloudera, MySQL, Linux.

Confidential, Paragould, Arkansas

Hadoop Developer

Responsibilities:

  • Designed the projects using MVC architecture, providing multiple views of the same model and thereby efficient modularity and scalability.
  • Implemented Kafka and Spark Streaming pipelines to ingest real-time streaming data.
  • Developed MapReduce programs to process Avro files, perform calculations on the data, and carry out map-side joins. Supported the MapReduce Java programs running on the cluster.
  • Imported bulk data into HBase using MapReduce programs.
  • Used the REST API to access HBase data to perform analytics.
  • Performed analytics on time series data in HBase using the HBase API.
  • Designed and implemented Incremental Imports into Hive tables.
  • Involved in creating Hive tables, loading them with data, and writing Hive queries that run internally as MapReduce jobs.
  • Involved in collecting, aggregating and moving data from servers to HDFS using Flume.
  • Imported and exported data between relational data sources such as DB2, SQL Server, and MySQL and HDFS using Sqoop.
  • Migrated complex MapReduce programs to in-memory Spark processing using transformations and actions.
  • Collected real-time data from Kafka using Spark Streaming and performed transformations and aggregations on the fly to build the common learner data model, persisting the data into HBase.
  • Worked on creating RDDs and DataFrames for the required input data and performed the data transformations using Spark with Python.
  • Involved in developing Spark SQL queries and DataFrames to import data from data sources, perform transformations and read/write operations, and save the results to an output directory in HDFS.
  • Worked on Oozie workflow engine for job scheduling.
  • Used the Enterprise Data Warehouse database to store information and make it accessible across the organization.
  • Responsible for preparing technical specifications, analyzing functional Specs, development and maintenance of code.

Environment: Hadoop, MapReduce, Spark, Kafka, HDFS, Hive, Pig, Oozie, Core Java, Python, Eclipse, HBase, Flume, Cloudera, MySQL, UNIX Shell Scripting.

Confidential, Davidson, WI

Hadoop Developer

Responsibilities:

  • Developed a data pipeline using Flume, Sqoop, Pig, and Java MapReduce to ingest customer behavioral data and financial histories into HDFS for analysis.
  • Involved in writing MapReduce jobs.
  • Involved in ingesting data using Sqoop and HDFS put or copyFromLocal.
  • Used Pig to do transformations, event joins, bot traffic filtering, and some pre-aggregations before storing the data in HDFS.
  • Involved in developing Pig UDFs for functionality not available out of the box in Apache Pig.
  • Used Hive to analyze the partitioned and bucketed data and compute various metrics for reporting.
  • Involved in developing Hive DDLs to create, alter, and drop Hive tables.
  • Involved in developing Hive UDFs for functionality not available out of the box in Apache Hive.
  • Involved in using HCatalog to access Hive table metadata from MapReduce or Pig code.
  • Used Java MapReduce to compute various metrics that define user experience, revenue, etc.
  • Responsible for developing a data pipeline using Flume, Sqoop, and Pig to extract data from weblogs and store it in HDFS.
  • Designed and implemented various metrics that can statistically signify the success of the experiment.
  • Used Eclipse and Ant to build the application.
  • Involved in using Sqoop for importing and exporting data into HDFS and Hive.
  • Involved in processing ingested raw data using MapReduce, Apache Pig and Hive.
  • Involved in developing Pig Scripts for change data capture and delta record processing between newly arrived data and already existing data in HDFS.
  • Involved in pivoting the HDFS data from rows to columns and columns to rows.
  • Involved in emitting processed data from Hadoop to relational databases or external file systems using Sqoop and HDFS get or copyToLocal.
  • Involved in developing shell scripts to orchestrate execution of all other scripts (Pig, Hive, and MapReduce) and move the data files within and outside of HDFS.

Environment: Hadoop, MapReduce, Yarn, Hive, Pig, HBase, Oozie, Sqoop, Flume, Oracle 11g, Core

Confidential

Java Developer

Responsibilities:

  • Involved in design and development phases of Software Development Life Cycle (SDLC).
  • Implemented Model View Controller (MVC) architecture using the Jakarta Struts framework at the presentation tier.
  • Developed a Dojo based front end including forms and controls and programmed event handling.
  • Developed various Enterprise JavaBean components to fulfill the business functionality.
  • Created Action Classes which route submittals to appropriate EJB components and render retrieved information.
  • Validated all forms using the Struts validation framework and implemented the Tiles framework in the presentation layer.
  • Used the Spring Framework for dependency injection and integrated it with the Struts framework.
  • Used JDBC to connect to backend databases, Oracle and SQL Server 2005.
  • Proficient in writing SQL queries, stored procedures for multiple databases, Oracle and SQL Server 2005.
  • Wrote stored procedures using PL/SQL. Performed query optimization to achieve faster indexing and make the system more scalable.
  • Deployed the application on Windows using IBM WebSphere Application Server.
  • Used Java Message Service (JMS) for reliable and asynchronous exchange of important information such as payment status reports.
  • Used Web Services - WSDL and REST for getting data from different instruments and used SAX and DOM XML parsers for data retrieval.
  • Implemented SOA with web services using JAX-WS.
  • Used Ant scripts to build the application and deployed it on WebSphere Application Server.

Environment: Core Java, J2EE, Oracle, SQL Server, JMS, EJB, Struts, Spring, JDK, JavaScript, HTML, CSS, AJAX, JUnit, Log4j, Web Services, Windows.
