Hadoop Developer Resume
Dallas, TX
SUMMARY:
- 5+ years of IT experience across all phases of the SDLC, including requirements gathering and analysis, system requirements specification, development, testing, and deployment, with Agile software development in a variety of technologies and environments.
- 3 years of experience in Big Data analysis across multiple projects involving Hadoop MapReduce, Apache Spark, HDFS, Pig, Hive, and Sqoop.
- Excellent knowledge of Hadoop system architecture and Hadoop ecosystem components.
- Developed custom Apache Spark programs in Scala to analyze and transform unstructured data (a brief sketch follows this summary).
- Developed Spark Streaming jobs to process incoming streams of data and store them in Hive tables.
- Experienced in ingesting data from traditional database systems into the Hadoop data lake using Sqoop for analysis.
- Analyzed large data sets using Hive queries for structured data and Pig commands/scripts for unstructured and semi-structured data.
- Experienced in relational database systems (RDBMS) such as MySQL and Oracle.
- Extensive experience working with IDEs such as Eclipse and NetBeans.
- In-depth understanding of Data Structures and Algorithms.
- Excellent knowledge of Java and SQL for application development and deployment.
- Experience working with web technologies such as JDBC, JavaScript, jQuery, PHP, HTML, and CSS.
- Strong knowledge of programming languages such as C and C++.
- Experienced in Linux administration and operation.
- Exceptional ability to learn new technologies and deliver results on short deadlines.
- Excellent technical communication, analytical, problem-solving, and troubleshooting skills, with the ability to work well with people from cross-cultural backgrounds.
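A minimal sketch of the Spark-in-Scala pattern summarized above: reading unstructured text from HDFS, shaping it into columns, and persisting the result as a Hive table. The input path, record layout, and table name are hypothetical placeholders, not details from an actual engagement.

    import org.apache.spark.sql.SparkSession

    object LogTransform {
      def main(args: Array[String]): Unit = {
        // Hive support lets DataFrames be saved directly as Hive tables.
        val spark = SparkSession.builder()
          .appName("LogTransform")
          .enableHiveSupport()
          .getOrCreate()
        import spark.implicits._

        // Hypothetical input: raw, unstructured "level|message" log lines in HDFS.
        val raw = spark.read.textFile("hdfs:///data/raw/app_logs")

        // Parse each line; malformed rows are marked empty and dropped.
        val parsed = raw
          .map { line =>
            val fields = line.split('|')
            if (fields.length == 2) (fields(0).trim, fields(1).trim) else ("", "")
          }
          .filter { case (level, _) => level.nonEmpty }
          .toDF("level", "message")

        // Persist the structured result as a Hive table (name is illustrative).
        parsed.write.mode("overwrite").saveAsTable("analytics.app_logs_clean")
        spark.stop()
      }
    }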
TECHNICAL SKILLS:
Big Data: Apache Hadoop, Apache Spark, Hive, Pig, Sqoop
Hadoop Distributions: Cloudera, Hortonworks
Languages: Core Java, J2EE, C, C++, Scala
Web Services & Technologies: SOAP, REST, PHP, JavaScript, jQuery, XML, HTML, CSS, and JSON
Databases: SQL Server, MySQL, Oracle
Web/Application Servers: Apache Tomcat, WebSphere
IDE: Eclipse, NetBeans
Operating Systems: Linux, Windows XP/7/8/10
Version Control: GitHub
PROFESSIONAL EXPERIENCE:
Confidential, Dallas, TX
Hadoop Developer
Responsibilities:
- Working on a 200-node cluster with 2 PB of storage capacity.
- Upgraded the Hadoop system from CDH4 to CDH5 for better performance.
- Ingested transactional data from Oracle into HDFS using Sqoop.
- Performed data profiling and quality validation in transient Hive tables; once validation completed, the data was loaded into staging tables.
- Developed custom Apache Spark programs to validate, filter, and cleanse incoming data.
- Developed Spark Streaming jobs to process incoming streams of data from Kafka sources (sketch below).
- Used Pig to analyze subscription data and extracted metrics such as the percentage of customers subscribing to additional data.
Environment: Hadoop MapReduce, HDFS, Apache Spark, Hive, Sqoop, Kafka, Linux, Cloudera CDH 5, Pig, Teradata, Tableau and Oracle.
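A minimal sketch of the Spark Streaming consumption pattern referenced above, using the spark-streaming-kafka-0-10 direct stream API. The broker address, consumer group id, topic name, and batch interval are placeholders assumed for illustration.

    import org.apache.kafka.common.serialization.StringDeserializer
    import org.apache.spark.SparkConf
    import org.apache.spark.streaming.{Seconds, StreamingContext}
    import org.apache.spark.streaming.kafka010.{ConsumerStrategies, KafkaUtils, LocationStrategies}

    object TxnStream {
      def main(args: Array[String]): Unit = {
        val conf = new SparkConf().setAppName("TxnStream")
        val ssc = new StreamingContext(conf, Seconds(30)) // 30-second micro-batches

        // Broker, group id, and topic below are hypothetical placeholders.
        val kafkaParams = Map[String, Object](
          "bootstrap.servers" -> "broker1:9092",
          "key.deserializer" -> classOf[StringDeserializer],
          "value.deserializer" -> classOf[StringDeserializer],
          "group.id" -> "txn-consumers",
          "auto.offset.reset" -> "latest"
        )
        val stream = KafkaUtils.createDirectStream[String, String](
          ssc,
          LocationStrategies.PreferConsistent,
          ConsumerStrategies.Subscribe[String, String](Seq("transactions"), kafkaParams))

        // Keep non-empty payloads and report the record count per batch.
        stream.map(_.value).filter(_.nonEmpty)
          .foreachRDD(rdd => println(s"records in batch: ${rdd.count()}"))

        ssc.start()
        ssc.awaitTermination()
      }
    }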
Confidential, Apple Valley, MN
Hadoop Developer
Responsibilities:
- Worked on a 56-node Hadoop cluster with 896 TB of storage capacity.
- Developed custom Apache Spark programs to validate, filter, and cleanse data (sketch below).
- Loaded files into HDFS and wrote Hive queries to process the required data.
- Managed and reviewed Hadoop log files.
- Ingested data from the existing SQL Server RDBMS into HDFS using Sqoop.
- Tuned and optimized Hadoop clusters for high performance.
- Managed test data arriving from multiple sources.
- Used Pig to process small samples of transformed data for product purchase prediction, supporting an efficient product recommendation system.
Environment: Hadoop, MapReduce, Hive 0.10.1, Cloudera Hadoop distribution, Pig 0.11.1, Linux, Sqoop, Oozie 3.3.0, Java.
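A minimal sketch of the Spark validation-and-cleansing pattern mentioned above: the rules shown (non-null key, numeric amount, trimmed text) and the HDFS paths and column names are illustrative assumptions, not details from the actual pipeline.

    import org.apache.spark.sql.SparkSession
    import org.apache.spark.sql.functions._

    object CleanseRecords {
      def main(args: Array[String]): Unit = {
        val spark = SparkSession.builder().appName("CleanseRecords").getOrCreate()

        // Hypothetical CSV extract landed in HDFS by an upstream Sqoop job.
        val df = spark.read.option("header", "true")
          .csv("hdfs:///data/staging/orders")

        // Illustrative validation rules: drop rows missing a key,
        // discard non-numeric amounts, and trim stray whitespace.
        val cleansed = df
          .filter(col("order_id").isNotNull)
          .filter(col("amount").cast("double").isNotNull)
          .withColumn("customer", trim(col("customer")))

        cleansed.write.mode("overwrite").parquet("hdfs:///data/clean/orders")
        spark.stop()
      }
    }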
Confidential
Hadoop Developer
Responsibilities:
- Set up and configured a 25-node cluster running the Hadoop ecosystem.
- Troubleshot various configuration issues between different components in the ecosystem to ensure seamless performance.
- Ingested data from traditional RDBMSs into the Hadoop data lake using Sqoop.
- Experimented with various Pig commands and Pig Latin scripts on the data and analyzed the results from a business perspective.
- Developed custom MapReduce programs in Java to transform loaded data and analyzed the results for better business insights.
- Created Hive tables and implemented partitioning to improve query performance (sketch below).
- Moved large volumes of archived historical data from existing systems into the Hadoop data lake for future analysis.
Environment: Hadoop MapReduce, HDFS, Hive, Sqoop, Pig, Linux and MySQL
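A minimal sketch of the Hive partitioning technique mentioned above. The HiveQL is issued here through Spark's SQL interface only to keep these sketches in one language; the role itself used Hive directly. Database, table, and column names are hypothetical. Partitioning by a date column lets the engine prune irrelevant partition directories, so a one-day query scans only that day's data.

    import org.apache.spark.sql.SparkSession

    object PartitionedEvents {
      def main(args: Array[String]): Unit = {
        val spark = SparkSession.builder()
          .appName("PartitionedEvents")
          .enableHiveSupport()
          .getOrCreate()

        // Partitioned table definition (names are illustrative).
        spark.sql("""
          CREATE TABLE IF NOT EXISTS analytics.events (
            event_id STRING,
            payload  STRING
          )
          PARTITIONED BY (event_date STRING)
          STORED AS PARQUET
        """)

        // Dynamic partition insert from a hypothetical staging table;
        // the partition column must come last in the SELECT list.
        spark.sql("SET hive.exec.dynamic.partition=true")
        spark.sql("SET hive.exec.dynamic.partition.mode=nonstrict")
        spark.sql("""
          INSERT INTO TABLE analytics.events PARTITION (event_date)
          SELECT event_id, payload, event_date FROM staging.events_raw
        """)
        spark.stop()
      }
    }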
Confidential
Java/ J2EE Developer
Responsibilities:
- Implemented Servlets, JSP, and Ajax to design the user interface
- Used JSP, Servlets, JavaScript, HTML5, CSS, and RESTful services for manipulating, validating, and customizing error messages in the user interface
- Implemented object-relational mapping in the persistence layer using the Hibernate framework
- Built presentation components in JSP pages using ICEfaces tag libraries
- Used ICEfaces libraries across presentation pages such as Search/Inquiry and data collection pages
- Wrote all business logic across the modules in core Java
- Wrote SOAP web services for exchanging data with external interfaces
- Used XSL/XSLT for transforming and displaying reports; developed schemas for XML
- Involved in writing Ant scripts to build and deploy the application
- Developed web-based reporting for a monitoring system with HTML and Tiles using the Struts framework
- Used design patterns such as Business Delegate, Service Locator, Model View Controller, Session, and DAO
Environment: Java multithreading, collections, J2EE, EJB, UML, SQL, PHP, Sybase, Eclipse, JavaScript, WebSphere, JBoss, HTML5, DHTML, CSS, XML, Ant, JUnit, JSP, Servlets, Hibernate.