
Sr. Hadoop Developer Resume


Austin, TX

PROFESSIONAL SUMMARY:

  • 8 years of professional experience in systems analysis and software development.
  • Good experience in Hadoop/Big Data technologies, with expertise in Hadoop ecosystem components including HDFS, MapReduce programming, Sqoop, Pig, Hive, Oozie, Kafka, Flume, Impala, and HBase for scalability, distributed computing, and high-performance computing.
  • Hands-on experience working with Hadoop, HDFS, and the MapReduce framework.
  • Experience in installing, configuring, and administering Hadoop clusters for the Cloudera, Hortonworks, and MapR distributions.
  • Experience in importing and exporting data using Sqoop between HDFS and relational database systems.
  • Experience in NoSQL column-oriented databases like HBase and its integration with Hadoop clusters.
  • Strong experience in Linux administration.
  • Knowledge of Kafka and Storm, and hands-on experience with Spark.
  • Integrated Splunk with Hadoop and set up jobs to export data to and from Splunk.
  • Experience using Spark as a data-processing engine that operates on distributed data collections (RDDs).
  • Hands-on experience in Scala for working with Spark Core and Spark Streaming.
  • Good experience with scripting languages such as Python and Scala.
  • Worked on Oozie to manage data processing jobs for Hadoop.
  • Hands-on experience in gathering data from different nodes into a Greenplum database and then performing Sqoop incremental loads into HDFS.
  • Good knowledge of the MapReduce framework, including the MR daemons, the sort and shuffle phases, and task execution.
  • Experience in analyzing data using HiveQL, Pig Latin, and custom MapReduce programs in Java; well versed in Core Java.
  • Extending Hive and Pig core functionality by writing custom UDFs (see the Hive UDF sketch after this list).
  • Experienced in working with various data sources such as Teradata and Oracle, and successfully loaded files to HDFS.
  • Experience in writing and testing MapReduce programs to structure the data.
  • Experience with Oozie Workflow Engine to automate and parallelize Hadoop MapReduce and Spark jobs.
  • Well versed in scheduling Oozie jobs both sequentially and parallel.
  • Good experience with MapReduce performance optimization techniques for effective utilization of cluster resources.
  • Experience in working with MapR volumes and snapshots for data redundancy.
  • Good level of experience in Core Java and Java EE technologies such as JDBC, Servlets, and JSP.
  • Knowledge of custom MapReduce programs in Java.
  • Experience in creating custom Solr Query components.
  • Extensive experience in developing SOA middleware based on Fuse ESB and Mule ESB; configured Elasticsearch, Logstash, and Kibana to monitor Spring Batch jobs.
  • Working knowledge of HTML5 and expert-level proficiency in markup and scripting languages such as HTML, DHTML, XML, CSS, JavaScript, and jQuery.
  • Configured different topologies for the Storm cluster and deployed them on a regular basis.
  • Experienced in implementing a unified data platform to collect data from different sources using Apache Kafka brokers, clusters, and Java producers and consumers.
  • Experienced in working with structured data using HiveQL, join operations, Hive UDFs, partitions, bucketing, and internal/external tables.
  • Experienced in implementing ETL-style operations using Pig transformations, operators, and UDFs.
  • Used Spark Streaming to collect data from Kafka in near real time, perform the necessary transformations and aggregations on the fly to build the common learner data model, and persist the data to a NoSQL store (HBase); a minimal sketch follows this list.
  • Excellent understanding and knowledge of NoSQL databases like HBase, Cassandra, and MongoDB, as well as Teradata and data warehousing.
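
As a flavor of the custom Hive UDF work mentioned above, here is a minimal sketch in Java, assuming a simple string-masking function; the class name and behavior are illustrative, not from an actual project:

    import org.apache.hadoop.hive.ql.exec.UDF;
    import org.apache.hadoop.io.Text;

    // Hypothetical Hive UDF: masks all but the last four characters of a string.
    // Registered in Hive with: CREATE TEMPORARY FUNCTION mask_id AS 'MaskId';
    public class MaskId extends UDF {
        public Text evaluate(Text input) {
            if (input == null) {
                return null;                 // pass nulls through, per Hive convention
            }
            String s = input.toString();
            int keep = Math.min(4, s.length());
            String masked = s.substring(0, s.length() - keep).replaceAll(".", "*")
                          + s.substring(s.length() - keep);
            return new Text(masked);
        }
    }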
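
A minimal sketch of the Spark Streaming pipeline described above (Kafka in, transform, persist to HBase), written in Java against the spark-streaming-kafka-0-10 integration; the broker address, topic name, and batch interval are assumptions, and the HBase write is only sketched as a comment:

    import java.util.Arrays;
    import java.util.Collection;
    import java.util.HashMap;
    import java.util.Map;
    import org.apache.kafka.clients.consumer.ConsumerRecord;
    import org.apache.kafka.common.serialization.StringDeserializer;
    import org.apache.spark.SparkConf;
    import org.apache.spark.streaming.Durations;
    import org.apache.spark.streaming.api.java.JavaInputDStream;
    import org.apache.spark.streaming.api.java.JavaStreamingContext;
    import org.apache.spark.streaming.kafka010.ConsumerStrategies;
    import org.apache.spark.streaming.kafka010.KafkaUtils;
    import org.apache.spark.streaming.kafka010.LocationStrategies;

    public class LearnerModelStream {
        public static void main(String[] args) throws InterruptedException {
            SparkConf conf = new SparkConf().setAppName("LearnerModelStream");
            JavaStreamingContext jssc = new JavaStreamingContext(conf, Durations.seconds(10));

            Map<String, Object> kafkaParams = new HashMap<>();
            kafkaParams.put("bootstrap.servers", "broker1:9092");    // hypothetical broker
            kafkaParams.put("key.deserializer", StringDeserializer.class);
            kafkaParams.put("value.deserializer", StringDeserializer.class);
            kafkaParams.put("group.id", "learner-model");

            Collection<String> topics = Arrays.asList("learner-events");   // hypothetical topic
            JavaInputDStream<ConsumerRecord<String, String>> stream =
                KafkaUtils.createDirectStream(
                    jssc,
                    LocationStrategies.PreferConsistent(),
                    ConsumerStrategies.<String, String>Subscribe(topics, kafkaParams));

            // Transform each micro-batch; the HBase persistence step is sketched as a comment.
            stream.map(ConsumerRecord::value)
                  .foreachRDD(rdd -> rdd.foreachPartition(events -> {
                      // open an HBase connection per partition and write Puts here
                  }));

            jssc.start();
            jssc.awaitTermination();
        }
    }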

TECHNICAL SKILLS:

Hadoop Ecosystem: Hadoop 2.2, HDFS, MapReduce, Sqoop, Hive, Pig, Impala, Oozie, YARN, Spark, Kafka, Storm, Flume.

Hadoop Management & Security: Hortonworks, Cloudera Manager, Ubuntu.

Web Technologies: HTML, XHTML, XML, XSL, CSS, JavaScript

Databases: Oracle 10g, Teradata, Microsoft SQL Server, MySQL, DB2

Server-Side Scripting: UNIX Shell Scripting

Programming Languages: Java, J2EE, JDBC, JSP, Java Servlets, JUNIT, Python, Scala.

Web Servers: Apache Tomcat 5.x, BEA WebLogic 8.x, IBM WebSphere 6.0/5.1.1

NoSQL Databases: HBase, MongoDB

OS/Platforms: Mac OS X 10.9.5, Windows, Linux, UNIX

Client Side: JavaScript, CSS, HTML, jQuery

SDLC Methodology: Agile (SCRUM), Waterfall.

WORK EXPERIENCE:

Confidential, Austin, TX

Sr. Hadoop Developer

Responsibilities:

  • Developed a detailed understanding of the existing build system and related tools, including information on various products, releases, and test results.
  • Designed and implemented MapReduce jobs to support distributed processing using Java, Hive, and Apache Pig.
  • Developed UDFs to provide custom Hive and Pig capabilities.
  • Built a mechanism for automatically moving the existing proprietary binary format data files to HDFS using a service called Ingestion service.
  • Comprehensive knowledge and experience in process improvement, normalization/de-normalization, data extraction, data cleansing, and data manipulation.
  • Performed data transformations in Hive and used partitions and buckets for performance improvements.
  • Wrote custom InputFormat and RecordReader classes for reading and processing the binary format in MapReduce.
  • Wrote custom Writable classes for Hadoop serialization and deserialization of time-series tuples (see the Writable sketch after this list).
  • Implemented a custom file loader for Pig so that large data files such as build logs can be queried directly.
  • Used Python for pattern matching in build logs to format errors and warnings
  • Developed Pig Latin scripts for validating the different query modes in Historian.
  • Created Hive external tables on the MapReduce output, with partitioning and bucketing applied on top of them.
  • Improved performance by tuning Hive and MapReduce jobs.
  • Developed Daily Test engine using Python for continuous tests.
  • Used Shell scripting for Jenkins job automation.
  • Built a custom calculation engine that can be programmed according to user needs.
  • Ingested data into Hadoop using Sqoop and applied data transformations using Pig and Hive.
  • Handled performance-improvement changes to the Pre-Ingestion service, which is responsible for generating Big Data Format binary files from older versions of Historian.
  • Worked with support teams and resolved operational & performance issues.
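
A minimal sketch of the custom Writable approach described above, assuming a simple time-series tuple with a timestamp and a value (the field names are illustrative, not the project's actual schema):

    import java.io.DataInput;
    import java.io.DataOutput;
    import java.io.IOException;
    import org.apache.hadoop.io.Writable;

    // Hypothetical time-series tuple; Hadoop calls write()/readFields() to
    // serialize and deserialize instances between the map and reduce phases.
    public class TimeSeriesWritable implements Writable {
        private long timestamp;
        private double value;

        public TimeSeriesWritable() { }          // no-arg constructor required by Hadoop

        public TimeSeriesWritable(long timestamp, double value) {
            this.timestamp = timestamp;
            this.value = value;
        }

        @Override
        public void write(DataOutput out) throws IOException {
            out.writeLong(timestamp);            // serialize fields in a fixed order
            out.writeDouble(value);
        }

        @Override
        public void readFields(DataInput in) throws IOException {
            timestamp = in.readLong();           // deserialize in the same order
            value = in.readDouble();
        }

        public long getTimestamp() { return timestamp; }
        public double getValue()   { return value; }
    }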

Environment: Apache Hadoop, Hive, Pig, HDFS, Java MapReduce, Core Java, Python, Maven, Git, Jenkins, UNIX, MySQL, Eclipse, Oozie, Sqoop, Flume, Oracle, and CDH 4.x.

Confidential, Columbus, OH

Sr. Hadoop Developer

Responsibilities:

  • Developed a data pipeline using Flume, Sqoop, Pig, and MapReduce to ingest customer behavioral data and financial histories into HDFS for analysis.
  • Worked on analyzing the Hadoop cluster and different Big Data analytic tools, including Pig, Hive, the HBase database, and Sqoop.
  • Installed Hadoop, MapReduce, and HDFS, and developed multiple MapReduce jobs in Pig and Hive for data cleaning and pre-processing.
  • Participated in Development and Implementation of MapR environment.
  • Used Pig for transformations, event joins, filtering bot traffic, and some pre-aggregations before storing the data on HDFS.
  • Involved in developing Pig UDFs for needed functionality that is not available out of the box in Apache Pig (see the Pig UDF sketch after this list).
  • Imported and exported data into HDFS and Hive using Sqoop.
  • Experienced in querying data from various servers into MapR-FS.
  • Used Hive to analyze the partitioned and bucketed data and compute various metrics for reporting. POC work is ongoing using Spark and Kafka for real-time processing.
  • Wrote Hive jobs to parse the logs and structure them in tabular format to facilitate effective querying of the log data.
  • Involved in creating Hive tables, loading them with data, and writing Hive queries that run internally as MapReduce jobs.
  • Involved in Unit testing and delivered Unit test plans and results documents.
  • Exported data from the HDFS environment into an RDBMS using Sqoop for report generation and visualization purposes.
  • Worked on Oozie workflow engine for job scheduling.
  • Importing and exporting data into MapR-FS and Hive using Sqoop.
  • Configured, deployed, and maintained multi-node Dev and Test Kafka clusters.
  • Performed transformations, cleaning, and filtering on imported data using Hive and MapReduce, and loaded the final data into HDFS.
  • Loaded data into Spark RDDs and performed in-memory computation to generate the output response (see the Spark sketch after this list).
  • Migrated MapReduce jobs into Spark using Scala.
  • Used the Oozie workflow scheduler to manage Hadoop jobs as Directed Acyclic Graphs (DAGs) of actions with control flows.
  • Expertise in data modeling and data warehouse design and development.
  • Used the Spark API over Hortonworks Hadoop YARN to perform analytics on data in Hive.
  • Explored Spark for improving the performance and optimization of existing algorithms in Hadoop using Spark Context, Spark SQL, DataFrames, pair RDDs, and Spark on YARN.
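
A minimal sketch of a Pig eval UDF of the kind described above, assuming a simple string-normalization function (the class name and behavior are illustrative only):

    import java.io.IOException;
    import org.apache.pig.EvalFunc;
    import org.apache.pig.data.Tuple;

    // Hypothetical eval UDF: trims and upper-cases a chararray field.
    public class NormalizeString extends EvalFunc<String> {
        @Override
        public String exec(Tuple input) throws IOException {
            if (input == null || input.size() == 0 || input.get(0) == null) {
                return null;                     // Pig treats null as missing data
            }
            return ((String) input.get(0)).trim().toUpperCase();
        }
    }

In a Pig script, such a UDF would be registered with REGISTER and invoked per record, e.g. NormalizeString(field).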
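
A minimal sketch of the RDD-based in-memory computation mentioned above, written in Java for illustration (the project work used Scala); the input path, field layout, and aggregation are assumptions:

    import org.apache.spark.SparkConf;
    import org.apache.spark.api.java.JavaPairRDD;
    import org.apache.spark.api.java.JavaRDD;
    import org.apache.spark.api.java.JavaSparkContext;
    import scala.Tuple2;

    public class EventCounts {
        public static void main(String[] args) {
            SparkConf conf = new SparkConf().setAppName("EventCounts");
            JavaSparkContext sc = new JavaSparkContext(conf);

            // Load raw events from HDFS into an RDD (path is hypothetical).
            JavaRDD<String> lines = sc.textFile("hdfs:///data/events/*.log");

            // Count events per type, assuming the type is the first tab-separated field.
            JavaPairRDD<String, Integer> counts = lines
                    .mapToPair(line -> new Tuple2<>(line.split("\t")[0], 1))
                    .reduceByKey(Integer::sum);

            counts.saveAsTextFile("hdfs:///output/event-counts");
            sc.stop();
        }
    }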

Environment: MapReduce, HDFS, Hive, Pig, Spark, Spark-Streaming, Spark SQL, MapR, Storm, Apache Kafka, Sqoop, Java, Scala, CDH4, CDH5, AWS, Eclipse, Oracle, Git, Shell Scripting and Cassandra.

Confidential, Atlanta, GA

Hadoop Developer

Responsibilities:

  • Handled importing of data from various data sources, performed transformations using Hive and MapReduce, loaded data into HDFS, and extracted data from Oracle into HDFS using Sqoop.
  • Implemented automated methods and industry best practices for consistent installation and configuration of Greenplum across production and non-production environments.
  • Analyzed the data by performing Hive queries and running Pig scripts to study customer behavior.
  • Installed and configured Cloudera Manager for easy management of existing Hadoop cluster.
  • Developed workflow in Oozie to automate the tasks of loading the data into HDFS and pre-processing with Pig.
  • Responsible for managing and reviewing Hadoop log files. Designed and developed a data management system using MySQL.
  • Developed entire frontend and backend modules using Python on Django Web Framework.
  • Wrote Python scripts to parse XML documents and load the data in database.
  • Performed cluster maintenance, including the creation and removal of nodes, using tools such as Cloudera Manager Enterprise.
  • Exported the analyzed data to the relational databases using Sqoop for visualization and to generate reports for the BI team.
  • Gained some hands-on experience with data processing using Spark.
  • Worked on NoSQL databases including HBase and Elasticsearch (see the HBase client sketch after this list).
  • Worked on importing and exporting data from Oracle and DB2 into HDFS and HIVE using Sqoop.
  • Used Tableau as a reporting and data visualization tool.
  • Worked with Java, J2EE, Struts, Web Services, and Hibernate in a fast-paced development environment.
  • Followed Agile methodology; interacted directly with the client to give and receive feedback on features, suggest and implement optimal solutions, and tailor the application to customer needs.
  • Set up proxy rules for applications in the Apache server and created Spark SQL queries for faster requests.
  • Designed and developed the database design document and database diagrams based on the requirements.
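
A minimal sketch of basic HBase access of the sort described above, using the standard HBase Java client API; the table, column family, and values are hypothetical:

    import java.io.IOException;
    import org.apache.hadoop.conf.Configuration;
    import org.apache.hadoop.hbase.HBaseConfiguration;
    import org.apache.hadoop.hbase.TableName;
    import org.apache.hadoop.hbase.client.*;
    import org.apache.hadoop.hbase.util.Bytes;

    public class HBaseExample {
        public static void main(String[] args) throws IOException {
            Configuration conf = HBaseConfiguration.create();   // reads hbase-site.xml
            try (Connection conn = ConnectionFactory.createConnection(conf);
                 Table table = conn.getTable(TableName.valueOf("customer_events"))) {

                // Write one cell: row key -> column family "d", qualifier "status".
                Put put = new Put(Bytes.toBytes("row-001"));
                put.addColumn(Bytes.toBytes("d"), Bytes.toBytes("status"),
                              Bytes.toBytes("active"));
                table.put(put);

                // Read it back.
                Result result = table.get(new Get(Bytes.toBytes("row-001")));
                byte[] value = result.getValue(Bytes.toBytes("d"), Bytes.toBytes("status"));
                System.out.println(Bytes.toString(value));
            }
        }
    }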

Environment: HDFS, Hive, Pig, UNIX, SQL, Java MapReduce, Spark, Hadoop cluster, HBase, Sqoop, Oozie, Linux, Data Pipeline, Greenplum, Kafka, Python, MySQL, Storm, MapR-DB.

Confidential

Java Developer

Responsibilities:

  • Involved in various phases of Software Development Life Cycle (SDLC) of the application like Requirement gathering, Design, Analysis and Code development.
  • Developed a prototype of the application and demonstrated to business users to verify the application functionality.
  • Designed, developed, and implemented an MVC-pattern-based, keyword-driven automation testing framework utilizing Java, JUnit, and Selenium WebDriver (see the Selenium sketch after this list).
  • Used automated scripts and performed functionality testing during the various phases of the application development using Selenium.
  • Prepared user documentation with screenshots for UAT (User Acceptance testing).
  • Developed and implemented the MVC Architectural Pattern using Struts Framework including JSP, Servlets, EJB, Form Bean and Action classes.
  • Implemented server side tasks using Servlets and XML.
  • Helped develop page templates using the Struts Tiles framework.
  • Implemented the Struts Validation Framework for server-side validation.
  • Developed JSPs with custom tag libraries for control of the business processes in the middle tier and was involved in their integration.
  • Implemented Struts Action classes using Struts controller component.
  • Developed Web services (SOAP) through WSDL in Apache Axis to interact with other components.
  • Integrated Spring DAO for data access using Hibernate; used HQL and SQL for querying databases.
  • Implemented EJB session beans for business logic.
  • Used parsers such as SAX and DOM for parsing XML documents and performed XML transformations using XSLT.
  • Wrote stored procedures, triggers, and cursors using Oracle PL/SQL.
  • Created and deployed web pages using HTML, JSP, JavaScript and CSS.
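
A minimal sketch of the JUnit/Selenium WebDriver testing approach described above; the URL, element locators, and expected title are hypothetical:

    import org.junit.After;
    import org.junit.Before;
    import org.junit.Test;
    import org.openqa.selenium.By;
    import org.openqa.selenium.WebDriver;
    import org.openqa.selenium.firefox.FirefoxDriver;
    import static org.junit.Assert.assertTrue;

    public class LoginPageTest {
        private WebDriver driver;

        @Before
        public void setUp() {
            driver = new FirefoxDriver();        // any WebDriver implementation works
        }

        @Test
        public void loginLandsOnDashboard() {
            driver.get("http://localhost:8080/app/login");   // hypothetical URL
            driver.findElement(By.id("username")).sendKeys("testuser");
            driver.findElement(By.id("password")).sendKeys("secret");
            driver.findElement(By.id("loginButton")).click();
            assertTrue(driver.getTitle().contains("Dashboard"));
        }

        @After
        public void tearDown() {
            driver.quit();                       // close the browser after each test
        }
    }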

Environment: Java 1.5, JSP, JDBC, Spring Core 2.0, Struts 1.2, Hibernate 3.0, Design Patterns, XML

Confidential

Java Developer

Responsibilities:

  • Used JDBC, SQL, and PL/SQL programming for storing, retrieving, and manipulating data (see the JDBC sketch after this list).
  • Responsible for creation of the project structure, development of the application with Java, J2EE and management of the code.
  • Responsible for the design and management of the database in DB2 using the Toad tool.
  • Integrated a third-party plug-in tool for data tables with dynamic data using jQuery.
  • Responsible for the deployment of the application on the server using IBM WebSphere and PuTTY.
  • Developed the application in an Agile environment with constant changes in application scope and deadlines.
  • Involved in designing and development of the ecommerce site using JSP, Servlets, EJBs, JavaScript and JDBC.
  • Involved in client interaction and support for the application testing at the client location.
  • Used AJAX for interactive user operations and client-side validations; used XSL transforms on certain XML data.
  • Performed an active role in the Integration of various systems present in the application.
  • Responsible for providing services for mobile requests based on the user request.
  • Performed logging of all debug, error, and warning messages at the code level using log4j.
  • Involved in the UAT phase and production phase to provide continuous support to the onsite team.
  • Used the HP Quality Center tool to actively resolve bugs logged in any of the testing phases.
  • Used XML for ORM mapping relations between the Java classes and the database.
  • Developed ANT script for compiling and deployment. Performed unit testing using JUnit.
  • Used Subversion as the version control system; extensively used log4j for logging the log files.
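
A minimal sketch of the JDBC data-access pattern described above; the connection URL, credentials, table, and columns are illustrative:

    import java.sql.Connection;
    import java.sql.DriverManager;
    import java.sql.PreparedStatement;
    import java.sql.ResultSet;
    import java.sql.SQLException;

    public class AccountDao {
        private static final String URL = "jdbc:db2://dbhost:50000/APPDB";  // hypothetical DB2 URL

        public double findBalance(String accountId) throws SQLException {
            String sql = "SELECT balance FROM accounts WHERE account_id = ?";
            try (Connection conn = DriverManager.getConnection(URL, "user", "password");
                 PreparedStatement ps = conn.prepareStatement(sql)) {
                ps.setString(1, accountId);      // bind parameter; avoids SQL injection
                try (ResultSet rs = ps.executeQuery()) {
                    return rs.next() ? rs.getDouble("balance") : 0.0;
                }
            }
        }
    }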

Environment: Java, J2EE, PL/SQL, JSP, HTML, AJAX, Java Script, JDBC, XML, JMS, UML, JUnit.

Confidential

Java Developer

Responsibilities:

  • Developed the applications using Java, J2EE, Struts, JDBC.
  • Built applications for scale using JavaScript, Node.js, and React.js.
  • Used AngularJS as the development framework to build a single-page application.
  • Used SOAP UI Pro version for testing the Web Services.
  • Developed an AngularJS workflow manager leveraging Angular-UI's state router for flexible configuration.
  • Involved in preparing the High Level and Detail level design of the system using J2EE.
  • Created struts form beans, action classes, JSPs following Struts framework standards.
  • Implemented the database connectivity using JDBC with Oracle 9i database as backend.
  • Involved in the development of the underwriting process, which involves communications with outside systems using IBM MQ and JMS.
  • Created a deployment procedure utilizing Jenkins CI to run the unit tests.
  • Worked with JMS queues for sending messages in point-to-point mode (see the JMS sketch after this list).
  • Used PL/SQL stored procedures for applications that needed to execute as part of a scheduling mechanism.
  • Developed SOAP based XML web services.
  • Used JAXB to manipulate XML documents.
  • Created XML documents using the StAX XML API to pass the XML structure to Web Services.
  • Used Rational Clear Case for version control and JUnit for unit testing.
  • Provided troubleshooting and error handling support in multiple projects.
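
A minimal sketch of point-to-point messaging with JMS queues as described above; the JNDI names and message payload are hypothetical and depend on the application server setup:

    import javax.jms.Connection;
    import javax.jms.ConnectionFactory;
    import javax.jms.MessageProducer;
    import javax.jms.Queue;
    import javax.jms.Session;
    import javax.jms.TextMessage;
    import javax.naming.InitialContext;

    public class UnderwritingQueueSender {
        public static void main(String[] args) throws Exception {
            InitialContext ctx = new InitialContext();
            ConnectionFactory factory = (ConnectionFactory) ctx.lookup("jms/ConnectionFactory");
            Queue queue = (Queue) ctx.lookup("jms/UnderwritingQueue");   // hypothetical queue

            Connection conn = factory.createConnection();
            try {
                Session session = conn.createSession(false, Session.AUTO_ACKNOWLEDGE);
                MessageProducer producer = session.createProducer(queue);
                TextMessage message = session.createTextMessage("<request id=\"123\"/>");
                producer.send(message);          // point-to-point: exactly one consumer receives it
            } finally {
                conn.close();
            }
        }
    }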

Environment: JSP 1.2, Jasper Reports, JMS, XML, SOAP, AngularJS, JDBC, JavaScript, UML, HTML, JNDI, Apache Tomcat, ANT, and JUnit.
