Sr. Hadoop Developer Resume
Austin, TX
PROFESSIONAL SUMMARY:
- 8 years of professional experience in systems analysis and software development.
- Good experience in Hadoop/Big Data technologies, with expertise in Hadoop ecosystem components HDFS, MapReduce programming, Sqoop, Pig, Hive, Oozie, Kafka, Flume, Impala, and HBase for scalability, distributed computing, and high-performance computing.
- Hands-on experience working with Hadoop, HDFS, and the MapReduce framework.
- Experience in installing, configuring, and administering Hadoop clusters for the Cloudera, Hortonworks, and MapR distributions.
- Experience in importing and exporting data using Sqoop between HDFS and relational database systems.
- Experience in column-oriented NoSQL databases like HBase and their integration with Hadoop clusters.
- Strong Experience in Linux administration.
- Knowledge of Kafka and Storm, and hands-on experience with Spark.
- Integrated Splunk with Hadoop and set up jobs to export data to and from Splunk.
- Solid understanding of Spark as a data-processing engine that operates on distributed data collections.
- Hands-on experience in Scala for working with Spark Core and Spark Streaming.
- Good experience with scripting languages like Python and Scala.
- Worked on Oozie to manage data processing jobs for Hadoop.
- Hands-on experience in gathering data from different nodes into a Greenplum database and then performing Sqoop incremental loads into HDFS.
- Good knowledge of the MapReduce framework, including the MR daemons, the sort-and-shuffle phase, and task execution.
- Experience in analyzing data using HiveQL, Pig Latin, and custom MapReduce programs in Java; well versed in Core Java.
- Extended Hive and Pig core functionality by writing custom UDFs (a minimal sketch follows this summary).
- Experienced in working with various data sources such as Teradata and Oracle, and successfully loaded files into HDFS.
- Experience in writing and testing MapReduce programs to structure the data.
- Experience with Oozie Workflow Engine to automate and parallelize Hadoop MapReduce and Spark jobs.
- Well versed in scheduling Oozie jobs both sequentially and in parallel.
- Good experience with MapReduce performance optimization techniques for effective utilization of cluster resources.
- Experience in working with MapR volumes and snapshots for data redundancy.
- Good level of experience in Core Java and JEE technologies such as JDBC, Servlets, and JSP.
- Knowledge of writing custom MapReduce programs in Java.
- Experience in creating custom Solr Query components.
- Extensive experience in developing SOA middleware based on Fuse ESB and Mule ESB; configured Elasticsearch, Logstash, and Kibana to monitor Spring Batch jobs.
- Working knowledge of HTML5 and expert-level proficiency in markup and scripting languages such as HTML, DHTML, XML, CSS, JavaScript, and jQuery.
- Configured different topologies for the Storm cluster and deployed them on a regular basis.
- Experienced in implementing a unified data platform to get data from different data sources using Apache Kafka brokers, clusters, and Java producers and consumers.
- Experienced in working with structured data using HiveQL, join operations, Hive UDFs, partitions, bucketing, and internal/external tables.
- Experienced in migrating ETL-style operations to Pig transformations, operators, and UDFs.
- Built Spark Streaming jobs that collect data from Kafka in near real time, perform the necessary transformations and aggregations on the fly to build the common learner data model, and persist the data in a NoSQL store (HBase).
- Excellent understanding and knowledge of NoSQL databases like HBase, Cassandra, and MongoDB, as well as Teradata and data warehousing.
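To illustrate the custom UDF work cited above, here is a minimal Hive UDF sketch against the classic org.apache.hadoop.hive.ql.exec.UDF API; the package, class name, and normalization behavior are hypothetical examples rather than code from any project below.

```java
// Minimal Hive UDF sketch; package, class, and behavior are hypothetical.
package com.example.hive.udf;

import org.apache.hadoop.hive.ql.exec.UDF;
import org.apache.hadoop.io.Text;

public final class NormalizeText extends UDF {
    // Hive resolves evaluate() by reflection; preserve null semantics.
    public Text evaluate(final Text input) {
        if (input == null) {
            return null;
        }
        return new Text(input.toString().trim().toLowerCase());
    }
}
```

Such a UDF would typically be packaged in a JAR and registered in Hive with ADD JAR and CREATE TEMPORARY FUNCTION before use.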
TECHNICAL SKILLS:
Hadoop Ecosystem: Hadoop 2.2, HDFS, MapReduce, Sqoop, Hive, Pig, Impala, Oozie, Yarn, Spark, Kafka, Storm, Flume.
Hadoop Management & Security: Hortonworks, Cloudera Manager, Ubuntu.
Web Technologies: HTML, XHTML, XML, XSL, CSS, JavaScript
Databases: Oracle 10g, Teradata, Microsoft SQL Server, MySQL, DB2, SQL
Server Side Scripting: UNIX Shell Scripting
Programming Languages: Java, J2EE, JDBC, JSP, Java Servlets, JUNIT, Python, Scala.
Web Servers: Apache Tomcat 5.x, BEA WebLogic 8.x, IBM WebSphere 6.0/5.1.1
NoSQL Databases: HBase, MongoDB
OS/Platforms: Mac OS X 10.9.5, Windows, Linux, UNIX
Client Side: JavaScript, CSS, HTML, jQuery
SDLC Methodology: Agile (SCRUM), Waterfall.
WORK EXPERIENCE:
Confidential, Austin, TX
Sr. Hadoop Developer
Responsibilities:
- Developed a detailed understanding of the existing build system and related tools, covering information on various products, releases, and test results.
- Designed and implemented MapReduce jobs to support distributed processing using Java, Hive, and Apache Pig.
- Developed UDFs to provide custom Hive and Pig capabilities.
- Built a mechanism for automatically moving the existing proprietary binary-format data files to HDFS using a service called the Ingestion service.
- Comprehensive knowledge and experience in process improvement, normalization/denormalization, data extraction, data cleansing, and data manipulation. Performed data transformations in Hive and used partitions and buckets for performance improvements.
- Wrote custom InputFormat and RecordReader classes for reading and processing the binary format in MapReduce.
- Wrote custom Writable classes for Hadoop serialization and deserialization of time-series tuples (a minimal sketch follows this list).
- Implemented a custom file loader for Pig to enable querying directly against large data files such as build logs.
- Used Python for pattern matching in build logs to format errors and warnings.
- Developed Pig Latin scripts for validating the different query modes in Historian.
- Created Hive external tables on the MapReduce output, with partitioning and bucketing applied on top for query performance.
- Improved performance by tuning Hive and MapReduce jobs.
- Developed Daily Test engine using Python for continuous tests.
- Used Shell scripting for Jenkins job automation.
- Built a custom calculation engine that can be programmed according to user needs.
- Ingested data into Hadoop using Sqoop and applied data transformations using Pig and Hive.
- Handled performance-improvement changes to the pre-ingestion service, which is responsible for generating the big-data-format binary files from older versions of Historian.
- Worked with support teams and resolved operational & performance issues.
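A minimal sketch of the custom Writable pattern referenced above, assuming a simple timestamp/value tuple; the field layout is hypothetical.

```java
// Minimal custom Writable sketch for a time-series tuple; fields are hypothetical.
import java.io.DataInput;
import java.io.DataOutput;
import java.io.IOException;

import org.apache.hadoop.io.Writable;

public class TimeSeriesTuple implements Writable {
    private long timestamp;
    private double value;

    @Override
    public void write(DataOutput out) throws IOException {
        out.writeLong(timestamp);   // serialize fields in a fixed order
        out.writeDouble(value);
    }

    @Override
    public void readFields(DataInput in) throws IOException {
        timestamp = in.readLong();  // deserialize in exactly the same order
        value = in.readDouble();
    }
}
```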
Environment: Apache Hadoop, Hive, Pig, HDFS, Java MapReduce, Core Java, Python, Maven, Git, Jenkins, UNIX, MySQL, Eclipse, Oozie, Sqoop, Flume, Oracle, and CDH4.x.
Confidential, Columbus, OH
Sr. Hadoop Developer
Responsibilities:
- Developed a data pipeline using Flume, Sqoop, Pig, and MapReduce to ingest customer behavioral data and financial histories into HDFS for analysis.
- Worked on analyzing the Hadoop cluster and different Big Data analytic tools, including Pig, Hive, the HBase database, and Sqoop.
- Installed Hadoop, MapReduce, and HDFS, and developed multiple MapReduce jobs in Pig and Hive for data cleaning and pre-processing.
- Participated in Development and Implementation of MapR environment.
- Used Pig as an ETL tool to perform transformations, event joins, filtering of bot traffic, and some pre-aggregations before storing the data on HDFS.
- Involved in developing Pig UDFs for needed functionality not available out of the box in Apache Pig.
- Imported and exported data into HDFS and Hive using Sqoop.
- Experienced in querying data from various servers into MapR-FS.
- Used Hive to analyze the partitioned and bucketed data and compute various metrics for reporting. POC work is ongoing using Spark and Kafka for real-time processing.
- Wrote Hive jobs to parse the logs and structure them in a tabular format to facilitate effective querying of the log data.
- Involved in creating Hive tables, loading them with data, and writing Hive queries that run internally as MapReduce jobs.
- Involved in Unit testing and delivered Unit test plans and results documents.
- Exported data from the HDFS environment into an RDBMS using Sqoop for report generation and visualization purposes.
- Worked on Oozie workflow engine for job scheduling.
- Importing and exporting data into MapR-FS and Hive using Sqoop.
- Configured, deployed, and maintained multi-node Dev and Test Kafka clusters.
- Performed transformations, cleaning, and filtering on imported data using Hive and MapReduce, and loaded the final data into HDFS.
- Loaded data into Spark RDDs and performed in-memory computation to generate the output response (a minimal sketch follows this list).
- Migrated MapReduce jobs to Spark using Scala.
- Experience with the Oozie workflow scheduler to manage Hadoop jobs as a Directed Acyclic Graph (DAG) of actions with control flows.
- Expertise in data modeling and data warehouse design and development.
- Used Spark API over Hortonworks Hadoop YARN to perform analytics on data in Hive.
- Explored Spark to improve the performance and optimization of existing algorithms in Hadoop using Spark Context, Spark SQL, DataFrames, pair RDDs, and Spark on YARN.
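As a sketch of the RDD-based in-memory computation above — written in Java here to keep the resume's examples in one language, and assuming Spark 2.x's Java API — a MapReduce-style word count; the input and output paths are hypothetical.

```java
// Minimal Spark (Java API) sketch of a word count migrated from MapReduce.
import java.util.Arrays;

import org.apache.spark.SparkConf;
import org.apache.spark.api.java.JavaPairRDD;
import org.apache.spark.api.java.JavaRDD;
import org.apache.spark.api.java.JavaSparkContext;
import scala.Tuple2;

public class WordCount {
    public static void main(String[] args) {
        SparkConf conf = new SparkConf().setAppName("WordCount");
        try (JavaSparkContext sc = new JavaSparkContext(conf)) {
            JavaRDD<String> lines = sc.textFile("hdfs:///data/input");  // hypothetical path
            JavaPairRDD<String, Integer> counts = lines
                .flatMap(line -> Arrays.asList(line.split("\\s+")).iterator()) // Spark 2.x returns Iterator
                .mapToPair(word -> new Tuple2<>(word, 1))
                .reduceByKey(Integer::sum);                             // in-memory aggregation
            counts.saveAsTextFile("hdfs:///data/output");               // hypothetical path
        }
    }
}
```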
Environment: MapReduce, HDFS, Hive, Pig, Spark, Spark-Streaming, Spark SQL, MapR, Storm, Apache Kafka, Sqoop, Java, Scala, CDH4, CDH5, AWS, Eclipse, Oracle, Git, Shell Scripting and Cassandra.
Confidential, Atlanta, GA
Hadoop Developer
Responsibilities:
- Handled importing of data from various data sources, performed transformations using Hive and MapReduce (a minimal sketch follows this list), loaded data into HDFS, and extracted data from Oracle into HDFS using Sqoop.
- Implemented automated methods and industry best practices for consistent installation and configuration of Greenplum across production and non-production environments.
- Analyzed the data by performing Hive queries and running Pig scripts to study customer behavior.
- Installed and configured Cloudera Manager for easy management of existing Hadoop cluster.
- Developed workflow in Oozie to automate the tasks of loading the data into HDFS and pre-processing with Pig.
- Responsible for managing and reviewing Hadoop log files. Designed and developed a data management system using MySQL.
- Developed entire frontend and backend modules using Python on the Django web framework.
- Wrote Python scripts to parse XML documents and load the data into the database.
- Performed cluster maintenance, including the creation and removal of nodes, using tools such as Cloudera Manager Enterprise.
- Exported the analyzed data to the relational databases using Sqoop for visualization and to generate reports for the BI team.
- Some hands-on data processing using Spark.
- Worked on NoSQL databases including HBase and Elasticsearch.
- Worked on importing and exporting data from Oracle and DB2 into HDFS and HIVE using Sqoop.
- Used Tableau as a reporting and data visualization tool.
- Worked with Java, J2EE, Struts, Web Services, and Hibernate in a fast-paced development environment.
- Followed Agile methodology; interacted directly with the client to give and receive feedback on features, suggest and implement optimal solutions, and tailor the application to customer needs.
- Set up proxy rules for applications in the Apache server and created Spark SQL queries for faster requests.
- Designed and Developed database design document and database diagrams based on the Requirements.
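A minimal sketch of the kind of MapReduce transformation described above: a map-only cleansing step that filters malformed rows before loading into HDFS; the delimiter and field positions are hypothetical.

```java
// Minimal map-only cleansing sketch; CSV layout and filter rule are hypothetical.
import java.io.IOException;

import org.apache.hadoop.io.LongWritable;
import org.apache.hadoop.io.NullWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Mapper;

public class CleanseMapper extends Mapper<LongWritable, Text, Text, NullWritable> {
    @Override
    protected void map(LongWritable key, Text value, Context context)
            throws IOException, InterruptedException {
        String[] fields = value.toString().split(",", -1); // hypothetical CSV layout
        // Keep only well-formed rows with a non-empty id in field 0 (hypothetical rule).
        if (fields.length >= 3 && !fields[0].isEmpty()) {
            context.write(value, NullWritable.get());
        }
    }
}
```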
Environment: HDFS, Hive, Pig, UNIX, SQL, Java MapReduce, Spark, Hadoop cluster, HBase, Sqoop, Oozie, Linux, Data Pipeline, Greenplum, Kafka, Python, MySQL, Storm, MapR-DB.
Confidential
Java Developer
Responsibilities:
- Involved in various phases of Software Development Life Cycle (SDLC) of the application like Requirement gathering, Design, Analysis and Code development.
- Developed a prototype of the application and demonstrated to business users to verify the application functionality.
- Design, develop and implement MVC Pattern based Keyword Driven automation testing framework utilizing Java, JUnit and Selenium Web Driver.
- Used automated scripts and performed functionality testing during the various phases of the application development using Selenium.
- Prepared user documentation with screenshots for UAT (User Acceptance testing).
- Developed and implemented the MVC Architectural Pattern using Struts Framework including JSP, Servlets, EJB, Form Bean and Action classes.
- Implemented server side tasks using Servlets and XML.
- Helped develop page templates using the Struts Tiles framework.
- Implemented the Struts Validation Framework for server-side validation.
- Developed JSPs with custom tag libraries for control of the business processes in the middle tier and was involved in their integration.
- Implemented Struts Action classes using Struts controller component.
- Developed Web services (SOAP) through WSDL in Apache Axis to interact with other components.
- Integrated Spring DAO for data access using Hibernate; used HQL and SQL for querying databases.
- Implemented EJB session beans for business logic.
- Used parsers like SAX and DOM for parsing XML documents and performed XML transformations using XSLT (a minimal DOM sketch follows this list).
- Wrote stored procedures, triggers, and cursors using Oracle PL/SQL.
- Created and deployed web pages using HTML, JSP, JavaScript and CSS.
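A minimal DOM-parsing sketch of the pattern above; the file name, element, and attribute names are hypothetical.

```java
// Minimal DOM parsing sketch; file/element/attribute names are hypothetical.
import java.io.File;

import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;

import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;

public class OrderParser {
    public static void main(String[] args) throws Exception {
        DocumentBuilder builder =
            DocumentBuilderFactory.newInstance().newDocumentBuilder();
        Document doc = builder.parse(new File("orders.xml")); // hypothetical file
        NodeList orders = doc.getElementsByTagName("order");  // hypothetical element
        for (int i = 0; i < orders.getLength(); i++) {
            Element order = (Element) orders.item(i);
            System.out.println(order.getAttribute("id"));     // hypothetical attribute
        }
    }
}
```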
Environment: Java 1.5, JSP, JDBC, Spring Core 2.0, Struts 1.2, Hibernate 3.0, Design Patterns, XML
Confidential
Java Developer
Responsibilities:
- Used JDBC, SQL, and PL/SQL programming for storing, retrieving, and manipulating data (a minimal JDBC sketch follows this list).
- Responsible for creation of the project structure, development of the application with Java, J2EE and management of the code.
- Responsible for the design and management of the DB2 database using the Toad tool.
- Integrated a third-party plug-in for data tables with dynamic data using jQuery.
- Responsible for the deployment of the application on the server using IBM WebSphere and PuTTY.
- Developed the application in an Agile environment with constant changes in application scope and deadlines.
- Involved in the design and development of the e-commerce site using JSP, Servlets, EJBs, JavaScript, and JDBC.
- Involved in client interaction and support for the application testing at the client location.
- Used AJAX for interactive user operations and client-side validations; used XSL transforms on certain XML data.
- Performed an active role in the Integration of various systems present in the application.
- Responsible for providing services for mobile requests based on the user's request.
- Performed logging of all debug, error, and warning messages at the code level using Log4j.
- Involved in the UAT phase and production phase to provide continuous support to the onsite team.
- Used the HP Quality Center tool to actively resolve any bugs logged in any of the testing phases.
- Used XML for ORM mapping relations between the Java classes and the database.
- Developed ANT scripts for compiling and deployment. Performed unit testing using JUnit.
- Used Subversion as the version control system.
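A minimal JDBC sketch of the store/retrieve pattern above, using a PreparedStatement; the connection URL, credentials, table, and columns are hypothetical, and try-with-resources assumes a Java 7+ compiler.

```java
// Minimal JDBC sketch; URL, credentials, and schema are hypothetical.
import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.PreparedStatement;
import java.sql.ResultSet;

public class CustomerDao {
    public static void main(String[] args) throws Exception {
        String url = "jdbc:db2://localhost:50000/SAMPLE"; // hypothetical URL
        try (Connection conn = DriverManager.getConnection(url, "user", "pass");
             PreparedStatement ps = conn.prepareStatement(
                 "SELECT NAME FROM CUSTOMER WHERE ID = ?")) { // hypothetical table
            ps.setInt(1, 42); // bind the lookup key
            try (ResultSet rs = ps.executeQuery()) {
                while (rs.next()) {
                    System.out.println(rs.getString("NAME"));
                }
            }
        }
    }
}
```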
Environment: Java, J2EE, PL/SQL, JSP, HTML, AJAX, JavaScript, JDBC, XML, JMS, UML, JUnit.
Confidential
Java Developer
Responsibilities:
- Developed the applications using Java, J2EE, Struts, JDBC.
- Built applications for scale using JavaScript, Node.js, and React.js.
- Used AngularJS as the development framework to build a single-page application.
- Used SOAP UI Pro version for testing the Web Services.
- Developed an AngularJS workflow manager leveraging Angular UI's state router for flexible configuration.
- Involved in preparing the high-level and detailed design of the system using J2EE.
- Created Struts form beans, action classes, and JSPs following Struts framework standards.
- Implemented the database connectivity using JDBC with Oracle 9i database as backend.
- Involved in the development of the underwriting process, which involves communications with outside systems using IBM MQ and JMS.
- Created a deployment procedure utilizing Jenkins CI to run the unit tests.
- Worked with JMS queues for sending messages in point-to-point mode (a minimal sketch follows this list).
- Used PL/SQL stored procedures for applications that needed to execute as part of a scheduling mechanism.
- Developed SOAP based XML web services.
- Used JAXB to manipulate XML documents.
- Created XML documents using the StAX XML API to pass the XML structure to Web Services.
- Used Rational Clear Case for version control and JUnit for unit testing.
- Provided troubleshooting and error handling support in multiple projects.
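A minimal sketch of point-to-point JMS messaging as described above, against the JMS 1.1 javax.jms API; the JNDI names and payload are hypothetical.

```java
// Minimal point-to-point JMS sketch; JNDI names and payload are hypothetical.
import javax.jms.Connection;
import javax.jms.ConnectionFactory;
import javax.jms.MessageProducer;
import javax.jms.Queue;
import javax.jms.Session;
import javax.jms.TextMessage;
import javax.naming.InitialContext;

public class QueueSender {
    public static void main(String[] args) throws Exception {
        InitialContext ctx = new InitialContext();
        ConnectionFactory factory =
            (ConnectionFactory) ctx.lookup("jms/ConnectionFactory"); // hypothetical JNDI name
        Queue queue = (Queue) ctx.lookup("jms/UnderwritingQueue");   // hypothetical JNDI name
        Connection conn = factory.createConnection();
        try {
            Session session = conn.createSession(false, Session.AUTO_ACKNOWLEDGE);
            MessageProducer producer = session.createProducer(queue);
            TextMessage msg = session.createTextMessage("policy-123"); // hypothetical payload
            producer.send(msg); // point-to-point: exactly one consumer receives it
        } finally {
            conn.close();
        }
    }
}
```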
Environment: JSP 1.2, Jasper Reports, JMS, XML, SOAP, AngularJS, JDBC, JavaScript, UML, HTML, JNDI, Apache Tomcat, ANT, and JUnit.