Hadoop Developer Resume

PA

SUMMARY

  • Over 8 years of professional IT experience in developing, implementing, configuring and testing Hadoop ecosystem components, and in maintaining various web-based applications using Java/J2EE.
  • Hands-on experience in installing, configuring and using Hadoop ecosystem components such as HDFS, MapReduce, YARN, Pig, Hive, HBase, Oozie, ZooKeeper, Sqoop, Kafka and Flume.
  • Extensive hands-on experience in writing complex MapReduce jobs in Java and Python, Pig scripts and Hive data models (a minimal MapReduce sketch follows this summary).
  • Excellent understanding and knowledge of Hadoop Distributed File System (HDFS) data modeling, architecture and design principles.
  • Good understanding of classic Hadoop and YARN architecture along with the various Hadoop daemons such as JobTracker, TaskTracker, NameNode, DataNode, Secondary NameNode, ResourceManager, NodeManager, ApplicationMaster and containers.
  • Involved in writing data transformations and data cleansing using Pig operations.
  • Experience in writing Pig and Hive scripts and extending Hive and Pig core functionality by writing custom UDFs.
  • Experience in troubleshooting MapReduce programs and performance tuning of the Hadoop cluster by gathering and analyzing the existing infrastructure.
  • Experience with Oozie Workflow Engine in running workflow jobs with actions that run Hadoop Map/Reduce and Pig jobs.
  • Hands-on experience in importing and exporting data between HDFS and databases such as MySQL, MongoDB, Cassandra, Oracle, Teradata and Netezza using Sqoop.
  • Good experience creating real-time data streaming solutions using Apache Spark / Spark Streaming, Apache Storm, Kafka and Flume.
  • Implemented custom Kafka encoders for custom input formats to load data into Kafka partitions; streamed the data in real time using Spark with Kafka for faster processing.
  • Good experience using Business Intelligence tools such as Tableau and Cognos, and familiar with data warehousing concepts.
  • Experienced in Java Application Development, Client/Server Applications, Internet/Intranet based applications using Core Java, J2EE patterns, Web Services, Oracle, SQL Server and other relational databases.
  • Experience in utilizing Java tools in business, web and client-server environments, including the J2EE platform, EJB, JSP, Java Servlets, Struts and Java Database Connectivity (JDBC) technologies.
  • Good experience in developing and implementing web applications using Java, CSS, HTML, HTML5, XHTML, JavaScript, JSON, XML and JDBC.
  • Proficient in writing ANT and Maven build scripts to automate application build and deployment.
  • Extensive work experience with different SDLC approaches such as Waterfall and Agile development methodologies.
  • A great team player with the ability to effectively communicate with all levels of the organization, including technical staff, management and customers.
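
Below is a minimal sketch of the kind of data-cleansing MapReduce mapper referenced in this summary. The tab-delimited record layout and the "at least five fields with a non-empty id" validity rule are hypothetical illustrations, not the actual production logic.

    // Minimal data-cleansing mapper sketch; field layout and validity rule are hypothetical.
    import java.io.IOException;
    import org.apache.hadoop.io.LongWritable;
    import org.apache.hadoop.io.NullWritable;
    import org.apache.hadoop.io.Text;
    import org.apache.hadoop.mapreduce.Mapper;

    public class CleansingMapper extends Mapper<LongWritable, Text, NullWritable, Text> {

        @Override
        protected void map(LongWritable key, Text value, Context context)
                throws IOException, InterruptedException {
            String[] fields = value.toString().split("\t");
            // Keep only "valid" rows: here that means at least five tab-delimited fields
            // with a non-empty id in the first column (example rule only).
            if (fields.length >= 5 && !fields[0].trim().isEmpty()) {
                context.write(NullWritable.get(), value);
            }
        }
    }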

TECHNICAL SKILLS

Big Data Technologies: Hadoop, MapReduce, HDFS, Hive, Pig, HBase, Sqoop, Flume, ZooKeeper, Oozie, Kafka, YARN, Spark, Storm, MongoDB and Cassandra.

Databases: Oracle, MySQL, Teradata, Microsoft SQL Server, MS Access, DB2 and NoSQL

Programming Languages: C, C++, Java, J2EE, Scala, Python, SQL, PL/SQL and Unix Shell Scripts

Frameworks: MVC, Struts, Spring, Junit and Hibernate

Development Tools: Eclipse, NetBeans, Toad, Maven and ANT

Web Languages: XML, HTML, HTML5, DHTML, DOM, JavaScript, AJAX, jQuery, JSON and CSS

Operating Systems & Others: Linux (CentOS, Ubuntu), Unix, Windows XP, Windows Server 2003, PuTTY, WinSCP, FileZilla, AWS and Microsoft Office Suite

PROFESSIONAL EXPERIENCE

Confidential - PA

HADOOP DEVELOPER

Responsibilities:

  • Installed and configured Apache Hadoop clusters for application development, along with Hadoop tools such as Hive, Pig, Oozie, ZooKeeper, HBase, Flume and Sqoop.
  • Implemented multiple MapReduce jobs in Java for data cleaning and pre-processing.
  • Worked in a team on a 30-node cluster and expanded it by adding nodes; the additional DataNodes were configured through the Hadoop commissioning process.
  • Developed Java MapReduce programs using core concepts such as OOP, multithreading, collections and I/O.
  • Developed Pig Latin scripts to extract the data from the web server output files to load into HDFS.
  • Developed Pig UDFs to pre-process the data for analysis.
  • Wrote robust, reusable HiveQL scripts and Hive UDFs in Java (see the UDF sketch after this list).
  • Implemented partitioning, bucketing in Hive for better organization of the data.
  • Designed and built unit tests and executed operational queries on HBase.
  • Implemented a script to transmit information from Oracle to HBase using Sqoop.
  • Involved in defining job flows, managing and reviewing log files.
  • Installed Oozie workflow engine to run multiple Map Reduce, HiveQL and Pig jobs.
  • Implemented a script to transmit information from web servers to Hadoop using Flume.
  • Used Zookeeper to manage coordination among the clusters.
  • Used Apache Kafka and Apache Storm to gather log data and feed it into HDFS.
  • Developed Scala program for data extraction using Spark Streaming.
  • Worked with various compression codecs and file formats such as Snappy, Gzip, Avro, SequenceFile and plain text.
  • Compiled and built the application using Maven and used SVN as the version control system.
  • Prepared developer (unit) test cases and executed developer testing.
  • Implemented test scripts to support test driven development and continuous integration.
  • Created and maintained technical documentation for launching Hadoop clusters and for executing Hive queries and Pig scripts.
  • Worked on the visualization tool Tableau for visually analyzing the data.
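
A minimal sketch of a Hive UDF written in Java, of the kind referenced above; the class name and the trim/lower-case normalization rule are hypothetical examples. Such a UDF would typically be packaged into a JAR, added with ADD JAR and registered with CREATE TEMPORARY FUNCTION before being called from HiveQL.

    // Hypothetical Hive UDF sketch: normalizes a string column.
    import org.apache.hadoop.hive.ql.exec.UDF;
    import org.apache.hadoop.io.Text;

    public final class NormalizeString extends UDF {

        // Hive resolves evaluate() by reflection; a null input yields a null output.
        public Text evaluate(Text input) {
            if (input == null) {
                return null;
            }
            return new Text(input.toString().trim().toLowerCase());
        }
    }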

Environment: Hadoop, HDFS, MapReduce, Pig, Hive, HBase, Sqoop, Flume, Oozie, ZooKeeper, Kafka, Spark, Storm, AWS EMR, Cloudera, Java, JUnit, Python, JavaScript, Oracle, MySQL, NoSQL, Teradata, MongoDB, Cassandra, Tableau, Linux and Windows.

Confidential - Charlotte, NC

HADOOP DEVELOPER

Responsibilities:

  • Worked on a Hadoop cluster that ranged from 4 to 8 nodes during the pre-production stage and was extended to as many as 24 nodes in production.
  • Used Sqoop to import data from RDBMS sources into the Hadoop Distributed File System (HDFS) and later analyzed the imported data using Hadoop components.
  • Wrote custom MapReduce programs to analyze data and used Pig Latin to clean unwanted data.
  • Performed various performance optimizations such as using the distributed cache for small datasets, partitioning and bucketing in Hive, and map-side joins.
  • Created Pig and Hive UDFs in Java to analyze the data efficiently (see the Pig UDF sketch after this list).
  • Responsible for loading data from Oracle and Teradata databases into HDFS using Sqoop.
  • Implemented AJAX, JSON and JavaScript to create interactive web screens.
  • Wrote data ingestion systems to pull data from traditional RDBMS platforms such as Oracle and Teradata and store it in NoSQL databases such as MongoDB.
  • Involved in creating Hive tables and applying HiveQL queries to them, which automatically invoke and run MapReduce jobs.
  • Supported applications running on Linux machines.
  • Developed data-driven web applications and deployed scripts using HTML5, XHTML, CSS and client-side scripting with JavaScript.
  • Involved in loading and transforming large sets of Structured, Semi-Structured and Unstructured data and analyzed them by running Hive queries and Pig scripts.
  • Participated in requirement gathering from subject-matter experts and business partners and converted the requirements into technical specifications.
  • Analyzed the Cassandra database and compared it with other open-source NoSQL databases to determine which better suits the current requirements.
  • Created and maintained technical documentation for launching Hadoop clusters and for executing Hive queries and Pig scripts.
  • Installed the Oozie workflow engine to run multiple Hive and Pig jobs, which run independently based on time and data availability.
  • Assisted application teams in installing Hadoop updates, operating system patches and version upgrades when required.
  • Assisted in cluster maintenance, cluster monitoring and troubleshooting, and managed and reviewed data backups and log files.
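
A minimal sketch of a Pig UDF in Java, of the kind referenced above; the class name and the strip-special-characters rule are hypothetical examples. In a Pig script the compiled JAR would be registered with REGISTER and the function invoked inside a FOREACH ... GENERATE.

    // Hypothetical Pig EvalFunc sketch: removes non-alphanumeric characters from a field.
    import java.io.IOException;
    import org.apache.pig.EvalFunc;
    import org.apache.pig.data.Tuple;

    public class StripSpecialChars extends EvalFunc<String> {

        @Override
        public String exec(Tuple input) throws IOException {
            if (input == null || input.size() == 0 || input.get(0) == null) {
                return null;
            }
            String field = input.get(0).toString();
            // Keep only letters, digits and spaces (example cleanup rule).
            return field.replaceAll("[^A-Za-z0-9 ]", "");
        }
    }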

Environment: Hadoop, MapReduce, HDFS, Hive, Java, SQL, Cloudera Manager, Scala, Cassandra, Pig, Sqoop, Oozie, ZooKeeper, Teradata, NoSQL, MySQL, Python, Windows, Hortonworks and HBase

Confidential -Tampa, FL

Hadoop Developer

Responsibilities:

  • Developed a data pipeline using Flume, Sqoop, Pig and MapReduce to ingest customer behavioral data and purchase histories into HDFS for analysis.
  • Used Pig to perform transformations, event joins, bot-traffic filtering and some pre-aggregations before storing the data onto HDFS.
  • Used Hive to analyze the partitioned and bucketed data and compute various metrics for reporting on the dashboard.
  • Used Hive to analyze data ingested into HBase via the Hive-HBase integration and computed various metrics for reporting on the dashboard.
  • Loaded the aggregated data onto DB2 for reporting on the dashboard.
  • Monitoring and Debugging Hadoop jobs/Applications running in production.
  • Worked on installing a 20-node Hadoop cluster.
  • Built, packaged and deployed the code to the Hadoop servers.
  • Moved data between Oracle and HDFS using Sqoop.
  • Collected and aggregated large amounts of log data using Apache Flume and staged the data in HDFS for further analysis.
  • Worked with different file formats and compression techniques to determine standards.
  • Developed Hive scripts for implementing control tables logic in HDFS.
  • Designed and implemented static and dynamic partitioning and bucketing in Hive.
  • Created HBase tables to store data in various formats coming from different portfolios (see the HBase client sketch after this list).
  • Processed data using Spark.
  • Provided cluster coordination services through ZooKeeper.
  • Involved in moving all log files generated from various sources to HDFS for further processing through Flume.
  • Hands-on writing MapReduce code to turn unstructured data into structured data and to insert data into HBase from HDFS.
  • Extracted data from MongoDB through Sqoop, placed it in HDFS and processed it.
  • Analyzed the data by performing Hive queries and running Pig scripts to know user behavior.
  • Configured Sqoop and developed scripts to extract data from MySQL into HDFS.
  • Wrote shell scripts to monitor the health check of Hadoop daemon services and respond accordingly to any warning or failure conditions.
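
A minimal sketch of writing a row into HBase from the Java client API, of the kind referenced above; the table name "customer_events", the column family "d" and the row-key layout are hypothetical placeholders.

    // Hypothetical HBase write sketch using the standard client API.
    import org.apache.hadoop.conf.Configuration;
    import org.apache.hadoop.hbase.HBaseConfiguration;
    import org.apache.hadoop.hbase.TableName;
    import org.apache.hadoop.hbase.client.Connection;
    import org.apache.hadoop.hbase.client.ConnectionFactory;
    import org.apache.hadoop.hbase.client.Put;
    import org.apache.hadoop.hbase.client.Table;
    import org.apache.hadoop.hbase.util.Bytes;

    public class HBaseWriteExample {

        public static void main(String[] args) throws Exception {
            Configuration conf = HBaseConfiguration.create();
            try (Connection connection = ConnectionFactory.createConnection(conf);
                 Table table = connection.getTable(TableName.valueOf("customer_events"))) {

                // Row key and cell values are illustrative only.
                Put put = new Put(Bytes.toBytes("cust123|2016-01-01"));
                put.addColumn(Bytes.toBytes("d"), Bytes.toBytes("event_type"), Bytes.toBytes("purchase"));
                put.addColumn(Bytes.toBytes("d"), Bytes.toBytes("amount"), Bytes.toBytes("42.50"));
                table.put(put);
            }
        }
    }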

Environment: JDK, Ubuntu Linux, HDFS, MapReduce, Hive, Pig, Sqoop, Flume, MongoDB, ZooKeeper, HBase, Java, shell scripting, Informatica, Cognos, SQL, Teradata.

Confidential

JAVA DEVELOPER

Responsibilities:

  • Responsible for understanding the business requirement.
  • Worked with business analysts and helped represent the business domain details in technical specifications.
  • Was also actively involved in setting coding standards and writing related documentation.
  • Developed the Java Code using Eclipse as IDE.
  • Developed JSPs and Servlets to dynamically generate HTML and display the data to the client side (see the servlet sketch after this list).
  • Developed application on Struts MVC architecture utilizing Action Classes, Action Forms and validations.
  • Used Tiles as an implementation of the Composite View pattern.
  • Was responsible for implementing various J2EE design patterns such as Service Locator, Business Delegate, Session Facade and Factory.
  • Code Review & Debugging using Eclipse Debugger.
  • Was responsible for developing and deploying EJBs (session beans and message-driven beans).
  • Configured queues in WebLogic Server to which messages were published using the JMS API.
  • Consumed Web Services (WSDL, SOAP) from third party for authorizing payments to/from customers.
  • Wrote and manipulated database queries.
  • Built the web application using Maven as the build tool.
  • Used CVS for version control.
  • Performed unit testing using the JUnit testing framework and used Log4j to monitor the error log.
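
A minimal sketch of a servlet that dynamically generates HTML, of the kind referenced above; the class name, request parameter and page content are hypothetical examples, and the servlet would still need to be mapped in web.xml.

    // Hypothetical servlet sketch generating a simple dynamic HTML page.
    import java.io.IOException;
    import java.io.PrintWriter;
    import javax.servlet.ServletException;
    import javax.servlet.http.HttpServlet;
    import javax.servlet.http.HttpServletRequest;
    import javax.servlet.http.HttpServletResponse;

    public class PaymentStatusServlet extends HttpServlet {

        @Override
        protected void doGet(HttpServletRequest request, HttpServletResponse response)
                throws ServletException, IOException {
            String orderId = request.getParameter("orderId");
            response.setContentType("text/html");
            PrintWriter out = response.getWriter();
            out.println("<html><body>");
            out.println("<h1>Payment Status</h1>");
            out.println("<p>Order: " + (orderId == null ? "unknown" : orderId) + "</p>");
            out.println("</body></html>");
        }
    }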

Environment: Java/J2EE, Eclipse, WebLogic Application Server, Oracle, JSP, HTML, JavaScript, JMS, Servlets, UML, XML, Struts, Web Services, WSDL, SOAP, UDDI.

Confidential

JAVA DEVELOPER

Responsibilities:

  • Gathered requirements from end users and created functional requirements.
  • Used WebSphere for developing use cases, sequence diagrams and preliminary class diagrams for the system in UML.
  • Extensively used WebSphere Studio Application Developer for building, testing and deploying applications.
  • Used the Spring Framework based on the Model-View-Controller (MVC) pattern and designed GUI screens using HTML and JSP.
  • Developed the presentation layer and GUI framework in HTML and JSP, and performed client-side validations.
  • Involved in Java code that generated XML documents and used XSLT to translate the content into HTML for presentation in the GUI.
  • Implemented XQuery and XPath for querying and node selection based on the client input XML files to create Java Objects.
  • Used WebSphere to develop entity beans where transactional persistence was required, and used JDBC to connect to the MySQL database (see the JDBC sketch after this list).
  • Developed the user interface using JSP pages and DHTML to design dynamic HTML pages.
  • Developed session beans on WebSphere for the transactions in the application.
  • Utilized WSAD to create JSPs, Servlets and EJBs that pulled information from a DB2 database and sent it to a front-end GUI for end users.
  • On the database side, responsibilities included creating tables, triggers, stored procedures, sub-queries, joins, integrity constraints and views.
  • Worked on MQSeries with J2EE technologies (EJB, JavaMail, JMS, etc.) on the WebSphere server.
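
A minimal sketch of a JDBC lookup against MySQL, of the kind referenced above; the JDBC URL, credentials and the "accounts" table are hypothetical placeholders.

    // Hypothetical JDBC sketch: parameterized query against a MySQL table.
    import java.sql.Connection;
    import java.sql.DriverManager;
    import java.sql.PreparedStatement;
    import java.sql.ResultSet;
    import java.sql.SQLException;

    public class AccountLookup {

        public static void main(String[] args) throws SQLException {
            String url = "jdbc:mysql://localhost:3306/appdb";  // placeholder URL
            try (Connection conn = DriverManager.getConnection(url, "appuser", "secret");
                 PreparedStatement stmt = conn.prepareStatement(
                         "SELECT account_id, balance FROM accounts WHERE account_id = ?")) {
                stmt.setString(1, "ACCT-1001");  // example account id
                try (ResultSet rs = stmt.executeQuery()) {
                    while (rs.next()) {
                        System.out.println(rs.getString("account_id") + " -> " + rs.getBigDecimal("balance"));
                    }
                }
            }
        }
    }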

Environment: Java, EJB, IBM WebSphere Application Server, Spring, JSP, Servlets, JUnit, JDBC, XML, XSLT, CSS, DOM, HTML, MySQL, JavaScript, Oracle, UML, ClearCase, ANT.
