Sr. Hadoop Developer Resume

San Francisco, CA

SUMMARY

  • Over 8 years of professional experience in systems analysis, software development, and training.
  • Experience in Hadoop/Big Data technologies, with expertise in Hadoop ecosystem components HDFS, MapReduce programming, Sqoop, Pig, Hive, Oozie, Flume, Impala and HBase for scalability, distributed computing and high-performance computing.
  • Hands-on experience working with Hadoop, HDFS, the MapReduce framework and ecosystem tools such as Hive, HBase, Kafka, Sqoop and Oozie.
  • Experience in installing, configuring and administering Hadoop clusters for the Cloudera, Hortonworks and MapR distributions.
  • Experience in importing and exporting data between HDFS and relational database systems using Sqoop.
  • Experience with NoSQL column-oriented databases such as HBase and their integration with the Hadoop cluster.
  • Strong Experience in Linux administration.
  • Knowledge of Kafka and Storm, and hands-on experience with Spark.
  • Integrated Splunk with Hadoop and set up jobs to export data to and from Splunk.
  • Experience using Spark as a data-processing engine that operates on distributed data collections.
  • Hands-on experience in Scala for working with Spark Core and Spark Streaming.
  • Good experience with scripting languages such as Python and Scala.
  • Worked on Oozie to manage data processing jobs for Hadoop.
  • Hands-on experience gathering data from different nodes into a Greenplum database and then performing Sqoop incremental loads into HDFS.
  • Good knowledge of the MapReduce framework, including MR daemons, the sort and shuffle phase, and task execution.
  • Experience in storing and analyzing data using HiveQL, Pig Latin, Spark SQL and custom MapReduce programs in Java; well versed in Core Java.
  • Extending Hive and Pig core functionality by writing custom UDFs.
  • Experienced in working with various kinds of data sources such as Teradata and Oracle and successfully loaded files to HDFS
  • Experience in writing and testing Map-Reduce programs to structure the data.
  • Experience with Oozie Workflow Engine to automate and parallelize Hadoop MapReduce and Spark jobs.
  • Well versed in scheduling Oozie jobs both sequentially and in parallel.
  • Good experience with MapReduce performance optimization techniques for effective utilization of cluster resources.
  • Experience working with MapR volumes and snapshots for data redundancy.
  • Good experience in Core Java and JEE technologies such as JDBC, Servlets, and JSP.
  • Knowledge of custom MapReduce programs in Java.
  • Experience in creating custom Solr Query components.
  • Extensive experience in developing SOA middleware based on Fuse ESB and Mule ESB; configured Elasticsearch, Logstash and Kibana to monitor Spring Batch jobs.
  • Working knowledge of HTML5 and expert-level proficiency in markup and scripting languages such as HTML, DHTML, XML, CSS, JavaScript and jQuery.
  • Expertise in using various Hadoop infrastructures such as MapReduce, Pig, Hive, HBase, Sqoop, Oozie, Flume.
  • Configured different topologies for the Storm cluster and deployed them on a regular basis.
  • Experienced in implementing a unified data platform to ingest data from different sources using Apache Kafka brokers and clusters with Java producers and consumers (a producer sketch follows this summary).
  • Experienced in implementing complex algorithms on semi-structured and unstructured data using MapReduce programs.
  • Experienced in working with structured data using Hive QL, join operations, Hive UDFs, partitions, bucketing and internal/external tables.
  • Experienced in migrating ETL-style operations to Pig transformations, operators and UDFs.
  • Used Spark Streaming to collect data from Kafka in near real time, perform transformations and aggregations on the fly to build a common learner data model, and persist the data to a NoSQL store (HBase).
  • Specialized in data ingestion, processing and development from various RDBMS data sources into a Hadoop cluster using MapReduce, Pig, Hive and Sqoop.
  • Excellent understanding and knowledge of NoSQL databases such as HBase, Cassandra and MongoDB, as well as Teradata and data warehousing.
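
For illustration, a minimal sketch of the Java producer pattern referenced above for publishing source-system events to a topic on a unified Kafka platform. The broker list, topic name and payload are placeholder assumptions, not values from an actual project.

```java
import java.util.Properties;

import org.apache.kafka.clients.producer.KafkaProducer;
import org.apache.kafka.clients.producer.Producer;
import org.apache.kafka.clients.producer.ProducerRecord;

public class EventProducer {
    public static void main(String[] args) {
        Properties props = new Properties();
        // Broker list, serializers and topic name below are illustrative placeholders
        props.put("bootstrap.servers", "broker1:9092,broker2:9092");
        props.put("acks", "all");
        props.put("key.serializer", "org.apache.kafka.common.serialization.StringSerializer");
        props.put("value.serializer", "org.apache.kafka.common.serialization.StringSerializer");

        try (Producer<String, String> producer = new KafkaProducer<>(props)) {
            // Each source system publishes its records, keyed by source id, to a shared topic
            producer.send(new ProducerRecord<>("ingest.events", "source-a",
                    "{\"event\":\"login\",\"ts\":1459382400}"));
        }
    }
}
```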

TECHNICAL SKILLS

Hadoop Ecosystem: Hadoop 2.2, HDFS, MapReduce, Sqoop, Hive, Pig, Impala, Oozie, Yarn, Spark, Kafka, Storm, Flume.

Hadoop Management & Security: Hortonworks, Cloudera Manager, Ubuntu.

Web Technologies: HTML, XHTML, XML, XSL, CSS, JavaScript

Server Side Scripting: UNIX Shell Scripting

Database: Oracle 10g, Teradata, Microsoft SQL Server, MySQL, DB2, SQL, RDBMS.

Programming Languages: Java, J2EE, JDBC, JSP, Java Servlets, JUNIT, Python, Scala.

Web Servers: Apache Tomcat 5.x, BEA WebLogic 8.x, IBM WebSphere 6.0/5.1.1

NoSQL Databases: HBase, MongoDB

OS/Platforms: Mac OS X 10.9.5, Windows, Linux, Unix

Client Side: JavaScript, CSS, HTML, JQuery

SDLC Methodology: Agile (SCRUM), Waterfall.

PROFESSIONAL EXPERIENCE

Confidential, San Francisco, CA

Sr. Hadoop Developer

Responsibilities:

  • Created Hive Tables, loaded retail transactional data from Teradata using Sqoop.
  • Loaded home mortgage data from the existing DWH tables (SQL Server) to HDFS using Sqoop.
  • Loaded load-ready files from mainframes to Hadoop, converting the files to ASCII format.
  • Created a fully distributed Apache Hadoop cluster installation using HDFS, MapReduce and various sub-projects: Pig, Hive, Ambari and Oozie.
  • Worked on major and minor upgrades of the HBase and Cassandra clusters.
  • Developed Unix scripts for validating source files and created transformation and load jobs for four modules (Ongoing Advice, Advice Details, Case Details, Advice Fee Payment).
  • Wrote complex SQL queries, stored procedures and triggers to access data from relational databases.
  • Extensively used Pig for data cleansing; proficient work experience with NoSQL databases such as MongoDB.
  • Wrote Python applications to interact with the MySQL database using the Spark SQL context, and accessed Hive tables using the Hive context.
  • Involved in developing Hive DDLs to create, alter and drop Hive tables.
  • Installed and configured Hadoop MapReduce and HDFS; developed multiple MapReduce jobs in Java for data cleaning and pre-processing (a sketch follows this list).
  • Implemented a Storm topology with stream groupings to perform real-time analytical operations.
  • Configured Spark Streaming to receive real-time data from Kafka and store the stream data to HDFS using Scala, as well as to databases such as HBase and MongoDB.
  • Collaborated with big data partners including Cloudera, Hortonworks and MapR on Supermicro integrated solutions; involved in processing ingested raw data using MapReduce, Apache Pig and Hive.
  • Handled importing of data from various data sources, performed transformations using Hive and MapReduce, loaded data into HDFS, and moved data between MySQL and HDFS using Sqoop.
  • Experience in collecting metrics for Hadoop clusters using Ganglia and Ambari.
  • Used Python scripts to update the content in database and manipulate files
  • Populated HDFS and Cassandra with huge amounts of data using Apache Kafka.
  • Worked with NoSQL databases such as HBase and MongoDB for POC purposes.
  • Built a recommendation engine for portfolio and research articles using Apache Spark and MongoDB.
  • Developed Spark SQL scripts and converted Hive UDFs to Spark SQL UDFs.
  • Responsible for batch processing and real time processing in HDFS and NOSQL Databases.
  • Responsible for retrieving data from Cassandra and ingesting it into Pig.
  • Experienced in customizing the MapReduce framework at various levels with custom InputFormats, RecordReaders, Partitioners and data types.
  • Experienced with multiple file formats in Hive, including Avro and SequenceFile.
  • Created and maintained Technical documentation for launching Hadoop Clusters and for executing Pig Script.
  • Implemented business logic by writing Pig UDFs in Java and used various UDFs from Piggybank and other sources.
  • Involved in Hive-HBase integration by creating Hive external tables with storage specified as HBase format.
  • Developed Spark scripts using Python shell commands as per requirements.
  • Exported the analyzed data to the relational databases using Sqoop for visualization and to generate reports for the BI team.
  • Experienced in defining job flows to run multiple Map Reduce and Pig jobs using Oozie.
  • Installed and configured Hive and wrote HiveQL scripts.
  • Worked with database products: MS SQL, Hadoop/Apache MapR, Oracle, DB2, Informix OnLine.
  • Created ETL jobs to load JSON and server data into MongoDB and to move transformed MongoDB data into the data warehouse.
  • Created reports and dashboards using structured and unstructured data.
  • Implemented HBase coprocessors (observers) for event-based analysis.
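
As referenced above, a minimal sketch of a map-only MapReduce job in Java for data cleaning; the pipe-delimited record layout and field count are assumptions for illustration only.

```java
import java.io.IOException;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.NullWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Job;
import org.apache.hadoop.mapreduce.Mapper;
import org.apache.hadoop.mapreduce.lib.input.FileInputFormat;
import org.apache.hadoop.mapreduce.lib.output.FileOutputFormat;

public class RecordCleanerJob {

    // Map-only job: drop rows that do not match the expected field layout
    public static class CleanMapper extends Mapper<Object, Text, Text, NullWritable> {
        @Override
        protected void map(Object key, Text value, Context context)
                throws IOException, InterruptedException {
            String[] fields = value.toString().split("\\|", -1);
            if (fields.length == 12 && !fields[0].isEmpty()) {   // expected layout is a placeholder
                context.write(value, NullWritable.get());
            }
        }
    }

    public static void main(String[] args) throws Exception {
        Job job = Job.getInstance(new Configuration(), "record-cleaner");
        job.setJarByClass(RecordCleanerJob.class);
        job.setMapperClass(CleanMapper.class);
        job.setNumReduceTasks(0);                      // cleaning pass needs no reducers
        job.setOutputKeyClass(Text.class);
        job.setOutputValueClass(NullWritable.class);
        FileInputFormat.addInputPath(job, new Path(args[0]));
        FileOutputFormat.setOutputPath(job, new Path(args[1]));
        System.exit(job.waitForCompletion(true) ? 0 : 1);
    }
}
```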

Environment: MapReduce, Spark SQL, Pig scripts, ETL, Flume, Kafka, Storm, MapR, Hadoop BI, Pig UDFs, Oozie, Avro, Hive, Java, Eclipse, ZooKeeper.

Confidential, Denver, CO

Sr. Hadoop Developer

Responsibilities:

  • Developed a data pipeline using Flume, Sqoop, Pig and MapReduce to ingest customer behavioral data and financial histories into HDFS for analysis.
  • Worked on analyzing the Hadoop cluster and different Big Data analytic tools including Pig, Hive, the HBase database and Sqoop.
  • Installed Hadoop, MapReduce and HDFS, and developed multiple data-cleaning and pre-processing jobs in Pig and Hive.
  • Participated in Development and Implementation of MapR environment.
  • Used Pig to do transformations, event joins, filter boot traffic and some pre-aggregations before storing the data onto HDFS.
  • Involved in developing Pig UDFs for functionality not available out of the box in Apache Pig (see the sketch after this list).
  • Implemented POC’s using Apache Kafka, Storm and Spark.
  • Importing and exporting data into HDFS and Hive using SQOOP.
  • Experienced in querying data from various servers into MapR-FS.
  • Used Pig as an ETL tool for transformations, event joins and some pre-aggregations before storing the data onto HDFS.
  • Used Hive to analyze partitioned and bucketed data and compute various metrics for reporting; carried out POC work using Spark and Kafka for real-time processing.
  • Wrote Hive jobs to parse the logs and structure them in a tabular format to facilitate effective querying of the log data.
  • Involved in creating Hive tables, loading them with data and writing Hive queries that run internally as MapReduce jobs.
  • Experienced in managing and reviewing the Hadoop log files.
  • Responsible for managing data coming from different sources.
  • Involved in Unit testing and delivered Unit test plans and results documents.
  • Exported data from HDFS environment into RDBMS using Sqoop for report generation and visualization purpose.
  • Worked on Oozie workflow engine for job scheduling.
  • Importing and exporting data into MapR-FS and Hive using Sqoop.
  • Configured, deployed and maintained multi-node Dev and Test Kafka clusters.
  • Performed transformations, cleaning and filtering on imported data using Hive, Map Reduce, and loaded final data into HDFS.
  • Loaded data into Spark RDDs and performed in-memory computation to generate the output response.
  • Migrated MapReduce jobs into Spark using Scala.
  • Experience with the Oozie workflow scheduler to manage Hadoop jobs as directed acyclic graphs (DAGs) of actions with control flows.
  • Expertise in different data Modeling and Data Warehouse design and development.
  • Used Spark API over Hortonworks Hadoop YARN to perform analytics on data in Hive.
  • Explored Spark to improve the performance and optimization of existing algorithms in Hadoop using Spark Context, Spark SQL, DataFrames, pair RDDs and Spark on YARN.
  • Developed Spark code using Scala and Spark SQL/Streaming for faster testing and processing of data.
  • Imported data from different sources such as HDFS and HBase into Spark RDDs.
  • Developed a data pipeline using Kafka and Storm to store data into HDFS.
  • Performed real time analysis on the incoming data.
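
As noted above, custom Pig UDFs were written where Apache Pig lacked the needed functionality out of the box. The sketch below is a hypothetical EvalFunc that normalizes an account-status field; the status codes are illustrative assumptions.

```java
import java.io.IOException;

import org.apache.pig.EvalFunc;
import org.apache.pig.data.Tuple;

// Hypothetical Pig UDF: normalizes a free-form status field to a small set of values.
public class NormalizeStatus extends EvalFunc<String> {
    @Override
    public String exec(Tuple input) throws IOException {
        if (input == null || input.size() == 0 || input.get(0) == null) {
            return null;   // Pig treats a null return as a null field
        }
        String raw = input.get(0).toString().trim().toUpperCase();
        switch (raw) {
            case "A":
            case "ACTIVE":
                return "ACTIVE";
            case "C":
            case "CLOSED":
                return "CLOSED";
            default:
                return "UNKNOWN";
        }
    }
}
```

After registering the jar containing the UDF, it would be invoked from a Pig script in a FOREACH ... GENERATE statement, e.g. `FOREACH accounts GENERATE id, NormalizeStatus(status);`.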

Environment: MapReduce, HDFS, Hive, Pig, Spark, Spark-Streaming, Spark SQL, MapR, Storm, Apache Kafka, Sqoop, Java, Scala, CDH4, CDH5, AWS, Eclipse, Oracle, Git, Shell Scripting and Cassandra.

Confidential, Santa Clara, CA

Hadoop Developer

Responsibilities:

  • Optimized Map/Reduce Jobs to use HDFS efficiently by using various compression mechanisms
  • Responsible for building scalable distributed data solutions using Hadoop.
  • Developed Pig Latin scripts to extract the data from the web server output files to load into HDFS.
  • Handled importing of data from various data sources, performed transformations using Hive and MapReduce, loaded data into HDFS and extracted data from Oracle into HDFS using Sqoop.
  • Implemented automated methods and industry best practices for consistent installation and configuration of Greenplum across production and non-production environments.
  • Analyzed the data by performing Hive queries and running Pig scripts to study customer behavior.
  • Installed and configured Cloudera Manager for easy management of existing Hadoop cluster.
  • Developed workflow in Oozie to automate the tasks of loading the data into HDFS and pre-processing with Pig.
  • Responsible for managing and reviewing Hadoop log files. Designed and developed data management system using MySQL.
  • Developed entire frontend and backend modules using Python on Django Web Framework.
  • Wrote Python scripts to parse XML documents and load the data in database.
  • Performed cluster maintenance as well as creation and removal of nodes using Cloudera Manager Enterprise and other tools.
  • Exported the analyzed data to the relational databases using Sqoop for visualization and to generate reports for the BI team.
  • Gained some hands-on experience in data processing using Spark.
  • Worked on NoSQL databases including HBase and ElasticSearch.
  • Worked on importing and exporting data from Oracle and DB2 into HDFS and HIVE using Sqoop.
  • Used Tableau as a reporting and data visualization tool.
  • Involved in Java, J2EE, Struts, Web Services and Hibernate in a fast paced development environment.
  • Followed the Agile methodology; interacted directly with the client to provide and take feedback on features, suggested and implemented optimal solutions, and tailored the application to customer needs.
  • Set up proxy rules for applications in the Apache server and created Spark SQL queries for faster requests (see the sketch after this list).
  • Designed and Developed database design document and database diagrams based on the Requirements.
  • Developed UI of Web Service using Struts MVC Framework.
  • Implemented Struts validation framework.
  • Installed and configured Hive, Pig, Sqoop and Oozie on the Hadoop cluster.
  • Implemented Web Service security on JBoss Server.
  • Implemented DAOs for data access using Spring ORM with Hibernate.
  • Implemented/optimized complex stored procedures for performance enhancements.
  • Designed the XML Schema for data transmission using xml documents.
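
A minimal sketch, in Java with the Spark SparkSession API, of the kind of Spark SQL query mentioned above; the database, table and column names are placeholders rather than project code.

```java
import org.apache.spark.sql.Dataset;
import org.apache.spark.sql.Row;
import org.apache.spark.sql.SparkSession;

public class CustomerBehaviorQuery {
    public static void main(String[] args) {
        // enableHiveSupport lets Spark SQL query tables registered in the Hive metastore
        SparkSession spark = SparkSession.builder()
                .appName("CustomerBehaviorQuery")
                .enableHiveSupport()
                .getOrCreate();

        // Placeholder query: top pages by hits from a Hive table of web server logs
        Dataset<Row> topPages = spark.sql(
                "SELECT page, COUNT(*) AS hits FROM weblogs.page_views "
              + "GROUP BY page ORDER BY hits DESC LIMIT 20");
        topPages.show();

        spark.stop();
    }
}
```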

Environment: HDFS, Hive, Pig, UNIX, SQL, Java MapReduce, Spark, Hadoop cluster, HBase, Sqoop, Oozie, Linux, data pipelines, Greenplum, Kafka, Python, MySQL, Storm, MapR-DB.

Confidential

Java Developer

Responsibilities:

  • Involved in the analysis, design, development and testing phases of the Software Development Lifecycle (SDLC) using an Agile development methodology.
  • Used JSP, JavaScript, HTML5 and CSS for manipulating, validating and customizing error messages in the user interface. Used JBoss for EJB and JTA, and for caching and clustering purposes.
  • Built presentation components in JSP pages using ICEfaces tag libraries.
  • Responsible for the deployment of the code on the staging/QA server.
  • Developed the GUI using JSP, AJAX, JavaScript and the Spring framework; involved in the development of Spring Framework controllers.
  • Configured URL mappings and bean classes in springapp-servlet.xml; Sybase was the database and MyBatis was used for persistence.
  • Integrated push notifications for Android/iPhone using JavaPNS and GCM in the application.
  • Worked with field-level engineers and teams to make the product more user-friendly. Performed GUI and back-end testing.
  • Wrote Web Services using SOAP for sending and getting data from the external interface.
  • Used XSL/XSLT for transforming and displaying reports; developed schemas for XML.
  • Involved in development of web interface using JSP, JSTL, Servlets, JavaScript and JDBC for administering and managing users and clients.
  • Integrated third-party custom picker plugins into the application using jQuery for iPhone/Android web browsers.
  • Used design patterns such as Business Delegate, Service Locator, Model-View-Controller, Session and DAO.
  • Responsible for the design of customizable headers and footers using the Tiles framework with Spring, and used JdbcTemplate to perform database operations on the server side (as sketched below).
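
A minimal sketch of the server-side JdbcTemplate usage mentioned in the last item; the DataSource wiring, table and column names are hypothetical.

```java
import java.util.List;

import javax.sql.DataSource;

import org.springframework.jdbc.core.JdbcTemplate;

public class FooterLinkDao {
    private final JdbcTemplate jdbcTemplate;

    public FooterLinkDao(DataSource dataSource) {
        // The DataSource would normally be injected by the Spring container
        this.jdbcTemplate = new JdbcTemplate(dataSource);
    }

    // Loads the customizable footer links for a given locale (placeholder schema)
    public List<String> loadFooterLinks(String locale) {
        return jdbcTemplate.queryForList(
                "SELECT link_text FROM footer_links WHERE locale = ? ORDER BY position",
                String.class, locale);
    }
}
```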

Environment: J2EE, Java, Servlets, JSP, SQL, XML, JavaScript, JSTL, Ajax, CSS, Agile methodology, Java multithreading, collections, WebSphere, HTML5.

Confidential

Java Developer

Responsibilities:

  • Used JDBC, SQL and PL/SQL programming for storing, retrieving and manipulating data (see the JDBC sketch after this list).
  • Responsible for creation of the project structure, development of the application with Java, J2EE and management of the code.
  • Responsible for the design and management of the DB2 database using the Toad tool.
  • Integrated third party plug-in tool for data tables with dynamic data using jQuery.
  • Responsible for the deployment of the application on the server using IBM WebSphere and PuTTY.
  • Developed the application in an Agile environment with constant changes in application scope and deadlines.
  • Involved in designing and development of the ecommerce site using JSP, Servlets, EJBs, JavaScript and JDBC.
  • Involved in client interaction and support for the application testing at the client location.
  • Used AJAX for interactive user operations and client-side validations; used XSL transforms on certain XML data.
  • Performed an active role in the Integration of various systems present in the application.
  • Responsible for providing services for mobile requests based on user requests.
  • Performed logging of all debug, error and warning messages at the code level using Log4j.
  • Involved in the UAT phase and production phase to provide continuous support to the onsite team.
  • Used HP Quality center tool to actively resolve any bugs logged in any of the testing phases.
  • Used XML for ORM mapping relations with the java classes and the database.
  • Developed ANT script for compiling and deployment. Performed unit testing using JUnit.
  • Used Subversion as the version control system. Extensively used Log4j for logging the log files.
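
A minimal sketch of the JDBC retrieval pattern mentioned at the top of this list; the DB2 connection URL, table and column names are placeholder assumptions.

```java
import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.PreparedStatement;
import java.sql.ResultSet;
import java.sql.SQLException;

public class PolicyJdbcDao {
    // Placeholder DB2 connection URL; credentials are supplied by the caller
    private static final String URL = "jdbc:db2://dbhost:50000/APPDB";

    public String findHolderName(String policyId, String user, String password) throws SQLException {
        String sql = "SELECT HOLDER_NAME FROM POLICY WHERE POLICY_ID = ?";
        try (Connection con = DriverManager.getConnection(URL, user, password);
             PreparedStatement ps = con.prepareStatement(sql)) {
            ps.setString(1, policyId);
            try (ResultSet rs = ps.executeQuery()) {
                // Return the matching holder name, or null if the policy is not found
                return rs.next() ? rs.getString("HOLDER_NAME") : null;
            }
        }
    }
}
```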

Environment: Java, J2EE, PL/SQL, JSP, HTML, AJAX, JavaScript, JDBC, XML, JMS, UML, JUnit.

Confidential

Java Developer

Responsibilities:

  • Developed the applications using Java, J2EE, Struts, JDBC.
  • Built applications for scale using JavaScript, NodeJS, and React.JS
  • Used SOAP UI Pro version for testing the Web Services.
  • Involved in preparing the High Level and Detail level design of the system using J2EE.
  • Created struts form beans, action classes, JSPs following Struts framework standards.
  • Implemented the database connectivity using JDBC with Oracle 9i database as backend.
  • Involved in the development of the underwriting process, which involved communications with outside systems using IBM MQ and JMS.
  • Created a deployment procedure utilizing Jenkins CI to run the unit tests.
  • Worked with JMS Queues for sending messages in point-to-point mode.
  • Used PL/SQL stored procedures for applications that needed to execute as part of a scheduling mechanism.
  • Developed SOAP based XML web services.
  • Used JAXB to manipulate XML documents.
  • Created XML documents using the StAX XML API to pass the XML structure to Web Services (a sketch follows this list).
  • Used Rational Clear Case for version control and JUnit for unit testing.
  • Provided troubleshooting and error handling support in multiple projects.
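
As noted above, XML documents passed to the web services were built with the StAX API; the sketch below shows the pattern with placeholder element names.

```java
import java.io.StringWriter;

import javax.xml.stream.XMLOutputFactory;
import javax.xml.stream.XMLStreamException;
import javax.xml.stream.XMLStreamWriter;

public class QuoteRequestXmlBuilder {
    // Builds a small XML payload with StAX; element names are illustrative only
    public static String build(String policyId, String amount) throws XMLStreamException {
        StringWriter out = new StringWriter();
        XMLStreamWriter writer = XMLOutputFactory.newInstance().createXMLStreamWriter(out);

        writer.writeStartDocument("UTF-8", "1.0");
        writer.writeStartElement("quoteRequest");

        writer.writeStartElement("policyId");
        writer.writeCharacters(policyId);
        writer.writeEndElement();

        writer.writeStartElement("amount");
        writer.writeCharacters(amount);
        writer.writeEndElement();

        writer.writeEndElement();   // quoteRequest
        writer.writeEndDocument();
        writer.close();

        return out.toString();
    }
}
```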

Environment: JSP 1.2, Jasper Reports, JMS, XML, SOAP, JDBC, JavaScript, UML, HTML, JNDI, Apache Tomcat, ANT and JUnit.
