Hadoop Developer Resume
Dallas, Texas
SUMMARY
- Highly skilled software engineer with 6 years of successful, hands-on experience developing and implementing Hadoop web applications using Java, JEE, Hadoop, Spring/Hibernate, MySQL, and various web services, delivering projects through the entire SDLC using advanced development methodologies.
- Over 6 years of overall IT experience, including 3 years of Big Data experience in ingestion, storage, querying, processing, and analysis.
- Excellent understanding of HDFS, MapReduce, YARN, and tools including Pig and Hive for data analysis, Sqoop for data migration, Flume for data ingestion, Oozie for scheduling, and ZooKeeper for coordinating cluster resources.
- Worked on analyzing Hadoop clusters and various big data analytics tools, including HBase.
- Excellent understanding of Hadoop architecture and its components, including HDFS, JobTracker, TaskTracker, NameNode, DataNode, and the MapReduce programming paradigm.
- Created Hive tables, loaded them with data, and wrote Hive queries that invoke MapReduce jobs in the backend.
- Experience writing HiveQL queries to store processed data in Hive tables for analysis.
- Experience building Pig scripts to extract, transform, and load data into HDFS for processing.
- Knowledge and understanding of the latest Hadoop ecosystem developments, such as Apache Spark integration with Hadoop.
- Loaded streaming log data from various web servers into HDFS using Flume.
- Experience in data migration from RDBMS to Cassandra.
- Motivated to take on independent responsibility, with the ability to contribute as a productive team member.
- Experience in monitoring and managing 100+ node Hadoop cluster.
- Created a complete processing engine based on Cloudera distribution.
- Experienced in automating job flows using Oozie.
- Supported MapReduce programs running on the cluster.
- Worked with the application team via Scrum to provide operational support and to install Hadoop updates, patches, and version upgrades as required.
- Monitored the Hadoop cluster using tools such as Nagios, Ganglia, Ambari, and Cloudera Manager.
- Worked with system engineering team to plan and deploy Hadoop hardware and software environments.
- Worked on disaster recovery for the Hadoop cluster.
- Built an ingestion framework using Flume for streaming logs and aggregating the data into HDFS.
- Built a data transformation framework using MapReduce and Pig.
- Designed, delivered, and helped manage a device data analytics solution for Confidential, a very large storage vendor.
- Worked with business users to extract clear requirements to create business value.
- Worked with big data teams to move ETL tasks to Hadoop.
- Experienced in Linux administration tasks such as IP management (IP addressing, subnetting, Ethernet bonding, and static IP).
- Good communication and interpersonal skills, a committed team player and a quick learner.
- Managed and reviewed Hadoop logs.
TECHNICAL SKILLS
Languages: Java, Hadoop, Hadoop Cascading, Elasticsearch, JUnit, C, C++, SQL, PL/SQL
Web: JSP, Spring, Spring REST, HTML, CSS, JavaScript, jQuery
JEE Technologies: Servlets, Web Services, SOAP
Databases: NoSQL, Oracle, DB2, MySQL, SQLite, MS SQL Server, MS Access
Tools: Eclipse, IntelliJ, Dia, NetBeans, GitHub, Dropbox, Visual Studio 2010/2012
Platforms: Windows NT/2000/XP/2003/7, UNIX
SDLC: Agile, Rapid Application Development, Waterfall Model, Iterative Model
Design Patterns: Singleton, Adapter, Builder, Iterator, Template
Application Servers: WebLogic, WebSphere, Apache Tomcat, JBoss
Frameworks: Hibernate, EJB, Struts, Spring, Grails
Other: SVN, Maven, ANT
PROFESSIONAL EXPERIENCE
Confidential, Dallas, Texas
Hadoop Developer
Responsibilities:
- Helped the team to increase cluster size from 35 nodes to 113 nodes. The configuration for additional data nodes was managed using Puppet.
- Responsible for managing data coming from different sources; involved in HDFS maintenance and in loading structured and unstructured data.
- Imported and exported data between RDBMS and Hive using Sqoop.
- Partitioned Hive tables, created external tables, and applied the differences between managed and external tables (see the first sketch below).
- Optimized Hive analytics queries to improve job performance.
- Created and ran Sqoop jobs with incremental loads to populate Hive external tables.
- Developed Pig scripts in areas where extensive hand-written code needed to be reduced.
- Extensive experience writing Pig scripts to transform raw data from several data sources into baseline data.
- Created HBase tables to store data arriving in variable formats from different portfolios.
- Wrote MapReduce programs to load data from system-generated log files into HBase.
- Applied partitioning and bucketing concepts in Hive and designed both managed and external tables to optimize performance.
- Developed backend (server side) in Scala.
- Designed the technical solution for real-time analytics using Kafka and HBase (see the second sketch below).
- Solved performance issues in Hive and Pig scripts by understanding how joins, grouping, and aggregation translate into MapReduce jobs.
- Designed a conceptual model with Spark for performance optimization.
- Developed Oozie workflows for scheduling and orchestrating the ETL process.
- Developed MapReduce programs to parse raw data and store the refined data in tables (see the third sketch below).
- Analyzed data with Hive, Pig, and Hadoop Streaming.
- Created the Cassandra data model from the existing Oracle data model.
- Used CQL to query data persisted in the Cassandra cluster (see the fourth sketch below).
- Used the Hive data warehouse tool to analyze data migrated to HDFS and developed Hive queries against it.
- Used Tableau for visualization and report generation.
- Used Flume to collect, aggregate, and store the log data from different web servers.
Environment: Hadoop, MapReduce, HDFS, Hive, Pig, HBase, Sqoop, Flume, Cassandra, Scala, Spark, Oozie, Kafka, Linux, Ubuntu, Java (JDK), Tableau, Eclipse, MySQL.
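The Hive external-table and partitioning work above (the first sketch referenced from the list) can be illustrated with a minimal Java sketch against the Hive JDBC driver. The host, credentials, table name, and HDFS paths are hypothetical placeholders, not values from the actual project.

```java
import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.Statement;

public class HivePartitionSketch {
    public static void main(String[] args) throws Exception {
        // HiveServer2 JDBC driver; host, port, and user are placeholders.
        Class.forName("org.apache.hive.jdbc.HiveDriver");
        try (Connection conn = DriverManager.getConnection(
                "jdbc:hive2://hive-host:10000/default", "hdfs", "");
             Statement stmt = conn.createStatement()) {

            // External table: Hive owns only the metadata; the data stays at
            // the given HDFS location and survives a DROP TABLE.
            stmt.execute(
                "CREATE EXTERNAL TABLE IF NOT EXISTS web_logs ("
                + " ip STRING, url STRING, status INT)"
                + " PARTITIONED BY (log_date STRING)"
                + " ROW FORMAT DELIMITED FIELDS TERMINATED BY ','"
                + " LOCATION '/data/raw/web_logs'");

            // Register one partition; queries that filter on log_date then
            // read only this directory (partition pruning).
            stmt.execute(
                "ALTER TABLE web_logs ADD IF NOT EXISTS"
                + " PARTITION (log_date='2016-01-01')"
                + " LOCATION '/data/raw/web_logs/2016-01-01'");
        }
    }
}
```

A managed table declared without EXTERNAL and LOCATION behaves the opposite way: dropping it deletes the underlying data, which is the main trade-off the bullet above refers to.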
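The second sketch, for the Kafka-plus-HBase real-time design: a consumer that writes each event as an HBase row. The broker, topic, table, and column family are assumptions for illustration (a 0.9-era Kafka consumer API and HBase 1.x client are assumed); a production pipeline would add batching, offset management, and error handling.

```java
import java.util.Collections;
import java.util.Properties;
import org.apache.hadoop.hbase.HBaseConfiguration;
import org.apache.hadoop.hbase.TableName;
import org.apache.hadoop.hbase.client.Connection;
import org.apache.hadoop.hbase.client.ConnectionFactory;
import org.apache.hadoop.hbase.client.Put;
import org.apache.hadoop.hbase.client.Table;
import org.apache.hadoop.hbase.util.Bytes;
import org.apache.kafka.clients.consumer.ConsumerRecord;
import org.apache.kafka.clients.consumer.ConsumerRecords;
import org.apache.kafka.clients.consumer.KafkaConsumer;

public class EventsToHBase {
    public static void main(String[] args) throws Exception {
        Properties props = new Properties();
        props.put("bootstrap.servers", "kafka-host:9092"); // placeholder broker
        props.put("group.id", "events-to-hbase");
        props.put("key.deserializer",
                "org.apache.kafka.common.serialization.StringDeserializer");
        props.put("value.deserializer",
                "org.apache.kafka.common.serialization.StringDeserializer");

        try (KafkaConsumer<String, String> consumer = new KafkaConsumer<>(props);
             Connection hbase = ConnectionFactory.createConnection(
                     HBaseConfiguration.create());
             Table table = hbase.getTable(TableName.valueOf("events"))) {

            consumer.subscribe(Collections.singletonList("device-events"));
            while (true) {
                ConsumerRecords<String, String> records = consumer.poll(500);
                for (ConsumerRecord<String, String> rec : records) {
                    // Row key from partition/offset keeps re-consumed
                    // messages idempotent (the same row is rewritten).
                    String rowKey = rec.partition() + ":" + rec.offset();
                    Put put = new Put(Bytes.toBytes(rowKey));
                    put.addColumn(Bytes.toBytes("d"), Bytes.toBytes("payload"),
                            Bytes.toBytes(rec.value()));
                    table.put(put);
                }
            }
        }
    }
}
```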
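The third sketch: a minimal MapReduce job that parses raw comma-delimited log lines and counts records per HTTP status code. The input layout (status in the third field) is an assumption for illustration.

```java
import java.io.IOException;
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.IntWritable;
import org.apache.hadoop.io.LongWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Job;
import org.apache.hadoop.mapreduce.Mapper;
import org.apache.hadoop.mapreduce.Reducer;
import org.apache.hadoop.mapreduce.lib.input.FileInputFormat;
import org.apache.hadoop.mapreduce.lib.output.FileOutputFormat;

public class LogStatusCount {
    // Mapper: parse a comma-delimited log line and emit (status, 1).
    public static class ParseMapper
            extends Mapper<LongWritable, Text, Text, IntWritable> {
        private static final IntWritable ONE = new IntWritable(1);
        private final Text status = new Text();
        @Override
        protected void map(LongWritable key, Text value, Context ctx)
                throws IOException, InterruptedException {
            String[] fields = value.toString().split(",");
            if (fields.length >= 3) {          // skip malformed lines
                status.set(fields[2].trim());  // e.g. "200", "404"
                ctx.write(status, ONE);
            }
        }
    }

    // Reducer (also used as combiner): sum the counts per status code.
    public static class SumReducer
            extends Reducer<Text, IntWritable, Text, IntWritable> {
        @Override
        protected void reduce(Text key, Iterable<IntWritable> values, Context ctx)
                throws IOException, InterruptedException {
            int sum = 0;
            for (IntWritable v : values) sum += v.get();
            ctx.write(key, new IntWritable(sum));
        }
    }

    public static void main(String[] args) throws Exception {
        Job job = Job.getInstance(new Configuration(), "log status count");
        job.setJarByClass(LogStatusCount.class);
        job.setMapperClass(ParseMapper.class);
        job.setCombinerClass(SumReducer.class);
        job.setReducerClass(SumReducer.class);
        job.setOutputKeyClass(Text.class);
        job.setOutputValueClass(IntWritable.class);
        FileInputFormat.addInputPath(job, new Path(args[0]));
        FileOutputFormat.setOutputPath(job, new Path(args[1]));
        System.exit(job.waitForCompletion(true) ? 0 : 1);
    }
}
```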
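The fourth sketch: querying the Cassandra cluster with CQL from Java. The DataStax Java driver's 3.x API is assumed; the contact point, keyspace, and table schema are hypothetical.

```java
import com.datastax.driver.core.Cluster;
import com.datastax.driver.core.ResultSet;
import com.datastax.driver.core.Row;
import com.datastax.driver.core.Session;

public class CassandraQuerySketch {
    public static void main(String[] args) {
        // Contact point and keyspace are placeholders.
        try (Cluster cluster = Cluster.builder()
                .addContactPoint("cassandra-host").build();
             Session session = cluster.connect("analytics")) {

            // CQL requires the partition key (customer_id here) to be
            // constrained; rows within the partition come back in
            // clustering-column order.
            ResultSet rs = session.execute(
                    "SELECT order_id, total FROM orders WHERE customer_id = ?",
                    42);
            for (Row row : rs) {
                System.out.printf("%s -> %s%n",
                        row.getUUID("order_id"), row.getDecimal("total"));
            }
        }
    }
}
```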
Confidential, Boston, MA
Java/Hadoop Developer
Responsibilities:
- Implemented code according to the business requirements.
- Contributed to reporting and analytics solutions to enhance healthcare management; implemented features using Hadoop Cascading/MapReduce.
- Responsible for building scalable distributed data solutions using Hadoop.
- Responsible for cluster maintenance, adding and removing cluster nodes, cluster monitoring and troubleshooting, and managing and reviewing data backups and log files.
- Responsible for running Hadoop Streaming jobs to process terabytes of CSV data.
- Loaded and transformed large sets of structured, semi-structured, and unstructured data using Hadoop/Big Data concepts.
- Involved in loading data from the UNIX file system into HDFS.
- Handled importing data from various data sources, performed transformations using Hive and MapReduce, and loaded data into HDFS.
- Extracted data from Teradata into HDFS using Sqoop.
- Exported the analyzed patterns back to Teradata using Sqoop.
- Involved in production support.
- Used Java code conventions and language standards for maintainable and documented code.
- Resolved bugs by replicating them and running queries in Elasticsearch.
- Developed automated test tools using JUnit (see the sketch below).
- Used Maven for automated builds of the Java artifacts.
- Used Git as version control for checking files in and out.
Environment: Java, Python, Big Data, Hadoop, Bash, Linux, Protobuf, Apache API, Hadoop Cascading, MapReduce, Groovy/Grails, NoSQL (Elasticsearch), Dia, IntelliJ IDEA, Maven, Git, Sense (Chrome extension).
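A minimal sketch of the JUnit-based testing mentioned in the list above (JUnit 4 style). The parser under test is a hypothetical helper, not code from the actual project.

```java
import static org.junit.Assert.assertEquals;
import org.junit.Test;

public class CsvLineParserTest {
    // Hypothetical helper under test: splits a CSV line into trimmed fields.
    static String[] parse(String line) {
        String[] fields = line.split(",");
        for (int i = 0; i < fields.length; i++) {
            fields[i] = fields[i].trim();
        }
        return fields;
    }

    @Test
    public void parsesAndTrimsFields() {
        String[] fields = parse(" a, b ,c");
        assertEquals(3, fields.length);
        assertEquals("a", fields[0]);
        assertEquals("b", fields[1]);
        assertEquals("c", fields[2]);
    }
}
```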
Confidential, Minneapolis, MN
Java Developer
Responsibilities:
- Used CVS for maintaining the source code; designed, developed, and deployed on Apache Tomcat Server.
- Involved in analysis, design, and coding in a J2EE environment.
- Implemented MVC architecture using Struts, JSP, and EJBs.
- Worked on Hibernate object/relational mapping according to the database schema (see the sketch below).
- Designed and programmed the presentation layer with HTML, XML, XSL, JSP, JSTL, and Ajax.
- Designed, developed, and implemented the business logic required for the security presentation controller.
- Used JSP and Servlet coding under the J2EE environment.
- Designed XML files to implement most of the wiring needed for Hibernate annotations and Struts configurations.
- Responsible for developing the forms containing employee details and for generating the reports and bills.
- Involved in designing class and dataflow diagrams using UML in Rational Rose.
- Created and modified stored procedures, functions, triggers, and complex SQL commands using PL/SQL.
- Involved in the design of ERDs (Entity Relationship Diagrams) for the relational database.
- Developed shell scripts in UNIX and procedures using SQL and PL/SQL to process data from the input file and load it into the database.
- Used core Java concepts in the application, such as multithreaded programming and thread synchronization using the wait, notify, and join methods.
- Created cross-browser compatible and standards-compliant CSS-based page layouts.
- Performed Unit Testing on the applications that are developed.
Environment: Java (JDK 1.6), J2EE, JSP, Servlets, Hibernate, JavaScript, JDBC, Oracle 10g, UML, Rational Rose, WebLogic Server, Apache Ivy, JUnit, SQL, PL/SQL, CSS, HTML, XML, Eclipse
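A minimal sketch of the Hibernate object/relational mapping referenced in the list above, using JPA annotations. The entity, table, and column names are hypothetical.

```java
import javax.persistence.Column;
import javax.persistence.Entity;
import javax.persistence.GeneratedValue;
import javax.persistence.GenerationType;
import javax.persistence.Id;
import javax.persistence.Table;

// Maps one row of the (hypothetical) EMPLOYEES table to a Java object;
// Hibernate reads these annotations instead of a separate .hbm.xml file.
@Entity
@Table(name = "EMPLOYEES")
public class Employee {
    @Id
    @GeneratedValue(strategy = GenerationType.AUTO)
    @Column(name = "EMP_ID")
    private Long id;

    @Column(name = "FULL_NAME", nullable = false)
    private String fullName;

    public Long getId() { return id; }
    public String getFullName() { return fullName; }
    public void setFullName(String fullName) { this.fullName = fullName; }
}
```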
Confidential
Software Developer
Responsibilities:
- Developed various product applications using Java, J2EE and related technologies.
- Prepared estimates for design and development activities.
- Involved in business requirement analysis, the design process, and product development.
- Prepared unit test cases and sub-system test cases, and participated in unit and integration testing to ensure the delivery of quality output.
- Participated in code walkthroughs, debugging, and defect fixing.
- Involved in the coordination of the end-to-end production release process.
- Used SVN as the version control system.
- Used Eclipse IDE for product development.
- Implemented a logging service using the log4j framework (see the sketch below).
- Responsible for testing classes and methods using JUnit test cases.
- Developed build management process for all projects using Maven.
- Developed various modules using agile methodology.
- Involved in installing and configuring the Eclipse IDE and Maven for development.
Environment: Java, JEE, Servlets, JSP, XML, JSTL, Ajax, MVC, Spring Framework, DB2, Struts, EJB, Eclipse, Tomcat, MySQL, WebLogic Application Server, Oracle, JUnit, Log4j, SVN, Ubuntu, Windows
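A minimal sketch of the log4j (1.x) logging service referenced in the list above; the class and messages are hypothetical, and appenders/levels would come from log4j.properties.

```java
import org.apache.log4j.Logger;

public class ReleaseJob {
    // One static logger per class is the usual log4j 1.x pattern.
    private static final Logger LOG = Logger.getLogger(ReleaseJob.class);

    public void run() {
        LOG.info("Starting release job");
        try {
            // ... job steps ...
        } catch (RuntimeException e) {
            LOG.error("Release job failed", e);
        }
    }
}
```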