
Sr. Hadoop Developer Resume


Texas

SUMMARY

  • Hadoop Developer with 7+ years of professional IT experience, including 3.5 years of Big Data ecosystem experience in data ingestion, data modeling, storage, querying, processing, and analysis, and in implementing enterprise-level systems spanning Big Data and data integration.
  • Expertise in Big Data technologies and the Hadoop ecosystem, including Flume, HBase, ZooKeeper, Oozie, Hive, Sqoop, Pig, Kafka, Spark, and YARN
  • Excellent understanding of Hadoop architecture and the different daemons of Hadoop clusters, including NameNode, DataNode, JobTracker, and TaskTracker
  • Hands-on experience in installing and configuring CDH3 and CDH4 clusters and in using Hadoop ecosystem components such as MapReduce, HDFS, Pig, Hive, HBase, Cassandra, Sqoop, Oozie, Flume, and ZooKeeper
  • Knowledge of the Cloudera Hadoop distribution as well as others such as Hortonworks, MapR, and IBM BigInsights.
  • Proficient in big data ingestion and streaming tools like Flume, Sqoop, Spark, Kafka and Storm
  • Experience in supporting data analysis projects using Elastic MapReduce on the Amazon Web Services (AWS) cloud. Performed export and import of data into S3.
  • Hands-on experience in coding MapReduce programs using Java and Scala for analyzing Big Data
  • Experience in using Pig scripts to implement ad-hoc MapReduce programs.
  • Experience in importing and exporting data from various relational databases such as Oracle, MySQL, Netezza, Teradata, and DB2 into HDFS using Sqoop.
  • Extensive experience in data validation using Hive and Pig, including writing custom UDFs
  • Experience in importing streaming data into HDFS using Flume sources and sinks, and transforming the data using Flume interceptors
  • Exposure to Apache Kafka for developing data pipelines of logs as streams of messages using producers and consumers (a minimal producer sketch follows this summary).
  • Loaded datasets into Hive for ETL operations.
  • Good knowledge and understanding of NoSQL databases, with hands-on experience writing applications on NoSQL databases such as Cassandra and MongoDB
  • Strong design and programming experience as a senior Java developer on Internet applications and client/server technologies using Java, J2EE, JSP, MVC, Servlets, Struts, Hibernate, JDBC, JSF, EJB, XML, AJAX, and web-based development tools.
  • Worked on an 88 node Hadoop cluster.
  • Proficient in using various IDEs like RAD, Eclipse Luna.
  • Experience in the Agile Scrum Software Development Life Cycle (SDLC) and tools like VersionOne to track daily burndowns.
  • Implemented proofs of concept on the Hadoop stack and different big data analytics tools, including migrations from databases such as Teradata, Oracle, and MySQL to Hadoop.
  • Experienced with scripting languages such as Python and shell scripts.
  • Good at working on low-level design documents and System Specifications.
  • Very good working knowledge on Performance Tuning, Debugging, Testing on various platforms.
  • Working knowledge of UML class diagrams using MS Visio or tools like Lucidchart.
  • Experience working with databases such as Oracle, MySQL, and Teradata.
  • Hands-on experience in Linux and UNIX shell scripting.
  • Actively in touch with the community on issues, resolutions, and developments in the Hadoop ecosystem.
  • Good team player with excellent communication, interpersonal, and problem-solving skills.
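
As an illustration of the Kafka producer/consumer pipeline work mentioned above, here is a minimal producer sketch in Java. It is only indicative: the broker address, topic name, and log-line format are hypothetical, not taken from an actual project.

    import java.util.Properties;
    import org.apache.kafka.clients.producer.KafkaProducer;
    import org.apache.kafka.clients.producer.ProducerRecord;

    public class LogProducer {
        public static void main(String[] args) {
            // Broker address and serializers; values here are placeholders.
            Properties props = new Properties();
            props.put("bootstrap.servers", "localhost:9092");
            props.put("key.serializer", "org.apache.kafka.common.serialization.StringSerializer");
            props.put("value.serializer", "org.apache.kafka.common.serialization.StringSerializer");

            try (KafkaProducer<String, String> producer = new KafkaProducer<>(props)) {
                // Each web-log line is published as one message on a hypothetical "weblogs" topic.
                String logLine = "127.0.0.1 - - [01/Jan/2016:00:00:01] \"GET /index.html HTTP/1.1\" 200";
                producer.send(new ProducerRecord<>("weblogs", logLine));
            }
        }
    }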

TECHNICAL SKILLS

Hadoop: HDFS, MapReduce, Hive, Pig, Sqoop, Flume, Oozie, Avro, Hadoop Streaming, ZooKeeper, Spark, Storm, Kafka, HBase, Cassandra, YARN, Tez, Impala, Mahout, Talend, Apache Giraph

Hadoop Distributions and Platforms: Cloudera CDH3, CDH4, CentOS, Ubuntu, Amazon Web Services, Amazon EC2, S3

Languages: Objective-C, Java/J2EE, Scala, Shell Scripting, Pig Latin, HiveQL, SQL

Web Servers: WebLogic, WebSphere, Apache Tomcat, JBoss 4.0

Frameworks: Spring, Hibernate, Struts, JUnit and MRUnit

Development Environments: Eclipse Luna, NetBeans

Operating Systems: CentOS, Windows, Linux/Unix, Ubuntu 14.04, z/OS

Databases: Hive, MS SQL server, DB2, Oracle 10g, HBase, Teradata, Cassandra, MongoDB, MySQL

Analysis and Visualization Tools: Microsoft SSRS 2008/2012, Microsoft SSAS 2012, Tableau, Spotfire, Pentaho.

Scripting Languages: JavaScript, Bash, Perl, Python, Ruby

UML Tools: Visio, Lucid Chart

PROFESSIONAL EXPERIENCE

Confidential, Texas

Sr.Hadoop Developer

Responsibilities:

  • Extracted and loaded data into and out of HDFS using the Sqoop import and export command-line utility.
  • Responsible for developing a data pipeline using Flume, Sqoop, and Pig to extract data from web logs and store it in HDFS.
  • Involved in using HCatalog to access Hive table metadata from MapReduce or Pig code.
  • Involved in developing Hive UDFs for the needed functionality.
  • Involved in creating Hive tables, loading with data and writing Hive queries.
  • Managed work including indexing data, tuning relevance, and developing custom tokenizers and filters, and added functionality such as playlists, custom sorting, and regionalization with the Solr search engine.
  • Used Hive to analyze the partitioned and bucketed data and compute various metrics for reporting.
  • Used Pig to do transformations, event joins, filter boot traffic and some pre-aggregations before storing the data onto HDFS.
  • Implemented advanced procedures such as text analytics and processing using the in-memory computing capabilities of Spark
  • Enhanced and optimized product Spark code to aggregate, group, and run data mining tasks using the Spark framework (a minimal aggregation sketch follows this list).
  • Developed data pipeline using Flume, Sqoop, Pig and Java map reduce to ingest customer behavioral data and financial histories into HDFS for analysis.
  • Involved in emitting processed data from Hadoop to relational databases or external file systems using SQOOP.
  • Loaded data back into SQL Server for BASEL reporting and for business users to analyze and visualize the data using Datameer.
  • Participated in daily scrum meetings and iterative development.
  • Loaded all the airline data from the existing DWH tables (SQL Server) to HDFS using Sqoop.
  • Orchestrated hundreds of Sqoop scripts, Pig scripts, and Hive queries using Oozie workflows and sub-workflows. Loaded cache data into HBase using Sqoop and created many external Hive tables pointing to HBase tables.
  • Analyzed HBase data in Hive by creating external partitioned and bucketed tables. Worked with cache data stored in Cassandra.
  • Used the external tables in Impala for data analysis. Participated in Apache Spark POCs for analyzing sales data based on several business factors.
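
Indicative sketch of the kind of Spark aggregation described above, using the Java RDD API (the HDFS paths, the region/amount record layout, and the use of Java 8 lambdas are assumptions for illustration, not actual project details):

    import org.apache.spark.SparkConf;
    import org.apache.spark.api.java.JavaPairRDD;
    import org.apache.spark.api.java.JavaRDD;
    import org.apache.spark.api.java.JavaSparkContext;
    import scala.Tuple2;

    public class SalesAggregation {
        public static void main(String[] args) {
            SparkConf conf = new SparkConf().setAppName("SalesAggregation");
            try (JavaSparkContext sc = new JavaSparkContext(conf)) {
                // Hypothetical CSV input: region,amount
                JavaRDD<String> lines = sc.textFile("hdfs:///data/sales/*.csv");

                // Group sales amounts by region and sum them in memory.
                JavaPairRDD<String, Double> totalsByRegion = lines
                        .mapToPair(line -> {
                            String[] fields = line.split(",");
                            return new Tuple2<String, Double>(fields[0], Double.parseDouble(fields[1]));
                        })
                        .reduceByKey((a, b) -> a + b);

                totalsByRegion.saveAsTextFile("hdfs:///data/sales/totals-by-region");
            }
        }
    }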

Environment: Hadoop, MapReduce, HDFS, Pig, Hive, HBase, Impala, Sqoop, Flume, Oozie, Apache Spark, Java, Python, Linux, Maven, SQL Server, ZooKeeper, Autosys, Tableau, Cassandra.

Confidential, Jersey City, NJ

Hadoop Developer

Responsibilities:

  • Responsible for building scalable distributed data solutions using the Hadoop ecosystem.
  • Worked on monitoring the Hadoop cluster and different big data analytics tools, including Flume, Oozie, and the MongoDB database.
  • Responsible for writing MapReduce jobs to handle files in multiple formats (JSON, text, XML, binary, logs, etc.).
  • Developed Pig UDFs to perform data cleansing and transformation for ETL activities.
  • Developed a data pipeline using Flume, Sqoop, Pig, and Java MapReduce to ingest data into HDFS for analysis
  • Worked extensively on creating combiners, partitioning, and distributed cache to improve the performance of MapReduce jobs.
  • Worked on creating MapReduce jobs to parse raw web log data into delimited records (see the mapper sketch after this list).
  • Used Pig to do data transformations, event joins and some pre-aggregations before storing the data on the HDFS.
  • Developed Sqoop scripts to import and export data from and to relational sources by handling incremental data loading on the customer transaction data by date.
  • Involved in moving all log files generated from various sources to HDFS for further processing through Flume.
  • Involved in loading and transforming large sets of structured, semi structured and unstructured data from relational databases into HDFS using Sqoop imports.
  • Responsible for creating complex tables using Hive.
  • Created partitioned tables in Hive for best performance and faster querying.
  • Developed workflow in Oozie to automate the tasks of loading the data into HDFS and pre-processing with Pig. Developed Pig Scripts to pull data from HDFS.
  • Developed JavaAPIs for invocation in Pig Scripts to solve complex problems.
  • Developed Shell scripts to automate and provide Controlflow to Pig scripts.
  • Analyzed large amounts of data sets to determine optimal way to aggregate and report on it.
  • Performed extensive data analysis using Hive and Pig.
  • Performed Data scrubbing and processing with Oozie.
  • Responsible for managing data coming from different sources.
  • Worked on data serialization formats for converting complex objects into sequences of bytes using Avro, JSON, and CSV formats
  • Handled failure conditions and sent notifications via HP Service Manager.
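
A minimal sketch of a map-only job that turns raw web-log lines into delimited records, as referenced in the list above (the log layout and field handling are simplified assumptions, not the actual parsing rules):

    import java.io.IOException;
    import org.apache.hadoop.io.LongWritable;
    import org.apache.hadoop.io.NullWritable;
    import org.apache.hadoop.io.Text;
    import org.apache.hadoop.mapreduce.Mapper;

    // Emits one tab-delimited record per well-formed access-log line.
    public class WebLogParseMapper extends Mapper<LongWritable, Text, NullWritable, Text> {

        @Override
        protected void map(LongWritable key, Text value, Context context)
                throws IOException, InterruptedException {
            // Assumed common-log-style layout: ip - - [timestamp] "request" status ...
            String[] parts = value.toString().split(" ", 4);
            if (parts.length < 4) {
                return; // skip malformed lines instead of failing the job
            }
            context.write(NullWritable.get(), new Text(parts[0] + "\t" + parts[3]));
        }
    }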

Environment: Hadoop Framework, MapReduce, Hive, Sqoop, Pig, HBase, Flume, Oozie, Java (JDK 1.6), UNIX Shell Scripting, Oracle 11g, Windows, IBM DataStage 8.1.

Confidential - Richmond, VA

Hadoop Developer

Responsibilities:

  • Worked on a live Hadoop production CDH3 cluster with 35 nodes
  • Worked with highly unstructured and semi structured data of 25 TB in size
  • Good experience in benchmarking the Hadoop cluster.
  • Data pipeline/ETL design
  • Implemented Flume (multiplexing) to stream data from upstream pipes into HDFS
  • Worked on custom MapReduce programs using Java
  • Designed and developed the Apache Storm topologies for Inbound and outbound data for real time ETL to find the latest trends and keywords.
  • Designed and developed PIG data transformation scripts to work against unstructured data from various data points and created a base line.
  • Worked on creating and optimizing Hive scripts for data analysts based on the requirements.
  • Created Hive UDFs to encapsulate complex and reusable logic for the end users (a minimal UDF sketch follows this list).
  • Very good experience in working with Sequence files and compressed file formats.
  • Worked with performance issues and tuning the Pig and Hive scripts.
  • Good experience in troubleshooting performance issues and tuning Hadoop cluster.
  • Exported the analyzed data to the relational databases using Sqoop for visualization and to generate reports for the BI team.
  • Created databases using HBase and Python MapReduce to replace Oracle databases
  • Good experience in setting up and configuring clusters in AWS
  • Documented a tool to perform chunked uploads of big data into Google BigQuery.
  • Worked with the infrastructure and the admin teams to set up monitoring probes to track the health of the nodes
  • Created and maintained Technical documentation for launching Hadoop Clusters and for executing Hive queries and Pig Scripts.
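
Indicative sketch of the Hive UDF pattern mentioned in the list above, using the classic org.apache.hadoop.hive.ql.exec.UDF API (the class name and the cleansing rule are hypothetical examples, not project code):

    import org.apache.hadoop.hive.ql.exec.UDF;
    import org.apache.hadoop.io.Text;

    // Encapsulates reusable cleansing logic: trims whitespace and lower-cases a string column.
    public class NormalizeString extends UDF {

        public Text evaluate(Text input) {
            if (input == null) {
                return null;
            }
            return new Text(input.toString().trim().toLowerCase());
        }
    }

Once packaged into a JAR, a UDF like this is typically registered in Hive with ADD JAR and CREATE TEMPORARY FUNCTION before being used in queries.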

Environment: Java 7, Python, Eclipse, Oracle 10g, Cassandra, Hadoop, Flume, Storm, Kafka, Hive, HBase, Linux, MapReduce, HDFS, CDH, SQL.

Confidential - Jessup, PA

Sr. Java Developer

Responsibilities:

  • Created the Database, User, Environment, Activity, and Class diagram for the project (UML).
  • Implemented the database using the Oracle database engine.
  • Designed and developed a fully functional, generic n-tiered J2EE application platform in an Oracle technology-driven environment. The entire infrastructure application was developed using Oracle JDeveloper in conjunction with Oracle ADF-BC and Oracle ADF Rich Faces.
  • Created entity objects (business rules and policies, validation logic, default value logic, security)
  • Created View objects, View Links, Association Objects, Application modules with data validation rules (Exposing Linked Views in an Application Module), LOV, dropdown, value defaulting, transaction management features.
  • Web application development using J2EE: JSP, Servlets, JDBC, JavaBeans, Struts, Ajax, JSF, JSTL, Custom Tags, EJB, JNDI, Hibernate, ANT, JUnit, Apache Log4J, Web Services, and Message Queue (MQ).
  • Designed GUI prototypes using ADF 11g GUI components before finalizing them for development.
  • Created reusable components (ADF Library and ADF Task Flow).
  • Experience in using version control systems such as CVS and PVCS.
  • Involved in consuming, producing SOAP based web services using JAX-WS.
  • Involved in consuming and producing RESTful web services using JAX-RS (a minimal resource sketch follows this list).
  • Collaborated with ETL/Informatica team to determine the necessary data models and UI designs to support Cognos Reports.
  • JUnit was used for unit testing and as the integration testing tool.
  • Created modules using bounded and unbounded task flows
  • Generated WSDL (web services) and created workflows using BPEL
  • Created the skin for the layout
  • Tested the application's components, actions, listeners, and pages
  • Performed integration testing for the application.
  • Created dynamic reports using JFreeChart
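
Indicative sketch of a JAX-RS resource of the kind referenced in the list above (the path, payload, and class names are hypothetical and do not reflect the actual project endpoints):

    import javax.ws.rs.GET;
    import javax.ws.rs.Path;
    import javax.ws.rs.PathParam;
    import javax.ws.rs.Produces;
    import javax.ws.rs.core.MediaType;

    // Simple read-only REST resource exposed at /accounts/{id}.
    @Path("/accounts")
    public class AccountResource {

        @GET
        @Path("/{id}")
        @Produces(MediaType.APPLICATION_JSON)
        public String getAccount(@PathParam("id") String id) {
            // In the real service this would delegate to a business/DAO layer.
            return "{\"id\": \"" + id + "\", \"status\": \"ACTIVE\"}";
        }
    }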

Environment: Core Java, Servlets, JSF, ADF Rich Client UI Framework, ADF-BC (BC4J) 11g, web services using Oracle SOA, Oracle WebLogic

Confidential

Java Developer

Responsibilities:

  • Developed the user interface screens using Swing for accepting various system inputs such as contractual terms, monthly data pertaining to production, inventory and transportation.
  • Involved in designing Database Connections using JDBC.
  • Involved in design and Development of UI using HTML, JavaScript and CSS.
  • Involved in creating tables and stored procedures for data manipulation and retrieval using SQL Server 2000, and in database modification using SQL, PL/SQL, stored procedures, triggers, and views in Oracle.
  • Used DispatchAction to group related actions into a single class
  • Built the applications using the ANT tool and used Eclipse as the IDE
  • Developed the business components (in core Java) used for the calculation module (calculating various entitlement attributes).
  • Involved in the logical and physical database design and implemented it by creating suitable tables, views and triggers.
  • Applied J2EE design patterns like Business Delegate, DAO and Singleton
  • Created the related procedures and functions used by JDBC calls in the above components (a minimal JDBC sketch follows this list).
  • Actively involved in testing, debugging and deployment of the application on WebLogic Application Server.
  • Developed test cases and performed unit testing using JUnit.
  • Involved in fixing bugs and minor enhancements for the front-end modules.
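
Minimal sketch of a JDBC call to a stored procedure, as referenced in the list above (the connection string, procedure name, and parameters are placeholders, not actual project values; try-with-resources is used here only for brevity):

    import java.sql.CallableStatement;
    import java.sql.Connection;
    import java.sql.DriverManager;
    import java.sql.Types;

    public class EntitlementDao {

        // Returns the entitlement value computed by a hypothetical stored procedure.
        public double fetchEntitlement(String contractId) throws Exception {
            try (Connection conn = DriverManager.getConnection(
                    "jdbc:oracle:thin:@//dbhost:1521/ORCL", "user", "password");
                 CallableStatement stmt = conn.prepareCall("{call GET_ENTITLEMENT(?, ?)}")) {
                stmt.setString(1, contractId);
                stmt.registerOutParameter(2, Types.NUMERIC);
                stmt.execute();
                return stmt.getDouble(2);
            }
        }
    }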

Environment: Java, HTML, JavaScript, CSS, Oracle, JDBC, ANT tool, SQL, Swing and Eclipse.

Confidential

Java/J2EE Developer

Responsibilities:

  • Developed GUI-related changes using JSP and HTML, and client-side validations using JavaScript.
  • Designed and developed front end using HTML, JSP and Servlets
  • Implemented client side validation using JavaScript
  • Developed the application using Struts Framework to implement a MVC design approach
  • Validated all forms using Struts validation framework
  • Performed client-side validation using JavaScript.
  • Worked with QA team in preparation and review of test cases.
  • Wrote SQL queries to fetch the business data using Oracle as the database.
  • Implemented action classes, form beans and JSP pages interaction with these components.
  • Wrote a controller Servlet that dispatched requests to appropriate classes.
  • Created UML class diagrams that depict the code's design and its compliance with the functional requirements.
  • Developed user interface using JSP, Struts Tag Libraries to simplify the complexities of the application.
  • Developed the Web Interface using Servlets, JSP, HTML and CSS.
  • Extensively used JDBC PreparedStatement to embed SQL queries into the Java code and implemented the DAO pattern (a minimal sketch follows this list).
  • Developed business logic using stateless session beans for calculating asset depreciation on straight-line and written-down-value approaches.
  • Involved in coding SQL queries, stored procedures, and triggers.
  • Created java classes to communicate with database using JDBC.
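
Minimal sketch of the PreparedStatement/DAO pattern referenced in the list above (table, column, and class names are illustrative assumptions only; try-with-resources is used here only for brevity):

    import java.sql.Connection;
    import java.sql.PreparedStatement;
    import java.sql.ResultSet;

    // DAO-style lookup backed by a plain JDBC connection.
    public class AssetDao {

        private final Connection connection;

        public AssetDao(Connection connection) {
            this.connection = connection;
        }

        public double findDepreciationRate(long assetId) throws Exception {
            String sql = "SELECT depreciation_rate FROM assets WHERE asset_id = ?";
            try (PreparedStatement stmt = connection.prepareStatement(sql)) {
                stmt.setLong(1, assetId);
                try (ResultSet rs = stmt.executeQuery()) {
                    return rs.next() ? rs.getDouble(1) : 0.0;
                }
            }
        }
    }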

Environment: Java (JDK 1.6), Servlets, JSPs, JavaBeans, HTML, CSS, JavaScript, jQuery, SQL, JDBC, Oracle 9i/10g.
