Hadoop Developer Resume
Nashville, TN
SUMMARY
- Overall 7+ years of professional experience in software development positions, building core and enterprise software using Big Data, Java/J2EE, and open source technologies.
- 2+ years of experience as a Big Data specialist working with Hadoop on Cloudera and Hortonworks distributions and with NoSQL platforms.
- Expert knowledge of Apache Hadoop components, including HDFS, MapReduce, YARN, Pig, Hive, HBase, Spark, Flume, Sqoop, Oozie, and Cassandra.
- Hands-on experience with Hadoop architecture and its components, such as HDFS, ResourceManager, NodeManager, containers, NameNode, DataNode, Standby NameNode, and MapReduce concepts.
- Experience in analyzing data using HiveQL, Pig Latin, and MapReduce programs in Java.
- Experienced in writing UDFs in both Pig and Hive.
- Designed and developed Sqoop scripts for transferring datasets between Hadoop and RDBMSs.
- Extensively used Flume to collect web logs.
- Experienced in scheduling HDFS, MapReduce, Pig, Hive, and Sqoop jobs using Oozie to automate tasks.
- Knowledgeable in managing Kafka clusters; created several Storm topologies to support real-time processing requirements.
- Good knowledge of building event-processing data pipelines using Kafka and Storm.
- Experience in using HCatalog with Hive, Pig, and HBase.
- Experience with NoSQL databases such as HBase and Cassandra.
- Experienced in writing UNIX shell scripts.
- Worked on both Cloudera Distribution (CDH) and Hortonworks Data Platform (HDP).
- Experience working with different file formats (XML, JSON, Avro, Sequence files).
- Experienced with compression codecs such as LZO, gzip, and Snappy.
- Good knowledge on build tools like Maven.
- Experience in developing distributed web applications and enterprise applications using Java/J2EE technologies (Core Java (JDK 6+), Servlets, JSP, EJB).
- Involved in application design, architecture, and implementation using J2EE, EJB, Hibernate, Struts, Web Services, Maven, and Log4j.
- Good knowledge of developing Java/J2EE applications using IDEs such as RAD, Eclipse, and MyEclipse, and servers such as Tomcat, WebSphere, and WebLogic.
- Extensive experience using SQL and PL/SQL to write stored procedures and functions.
- Highly proficient in developing and deploying Java/J2EE applications on application servers: WebLogic, IBM WebSphere, JBoss, and Apache Tomcat.
- Proficient in working with databases such as Oracle, MySQL, and DB2.
- Comfortable with testing frameworks such as JUnit for unit testing.
- Experience in business process analysis and planning of system processes and sub processes.
- Quick learner, adaptable to the dynamics of the software industry, with excellent communication, teamwork, and interpersonal skills.
- Exemplary communication and presentation skills coupled with strong analytical skills.
TECHNICAL SKILLS
Hadoop Ecosystem: HDFS, MapReduce, Pig, Hive, Sqoop, Flume, Oozie, Storm, and Spark.
NoSQL DB Technologies: HBase, Cassandra.
Programming Languages: Java, Scala
Web Technologies: J2EE, Servlets, JSP, JDBC, XML, AJAX, SOAP, WSDL
ORM Technology: Hibernate 3.0
Databases: Oracle, MySQL, SQL Server, DB2
Methodologies: Agile, UML, Design Patterns
Frameworks: Struts 1/2, Hibernate 3, Spring 2/2.5/3, MVC, Ajax
Client-Side Technologies: HTML, DHTML, XML, XSLT, JavaScript, CSS
PROFESSIONAL EXPERIENCE
Confidential, Nashville, TN
Hadoop Developer
Responsibilities:
- Involved in gathering business requirements and analyzing business use cases.
- Used Sqoop to import data from various database sources (SQL Server, Caché) into HDFS and Hive.
- Worked on Oozie workflows to schedule multiple Sqoop jobs and shell scripts.
- Wrote Apache Flume configuration files to move data from mounted directories on Linux to HDFS (using multiple sources and multiple sinks).
- Created Hive tables on top of the XML parsing output, which was further used by appeal analysts, and used dynamic partitioning on the tables.
- Worked with compression codecs such as Snappy and gzip to reduce output file size and optimize MapReduce jobs.
- Used HBase to provide fast lookups for large tables; implemented Hive integration with HBase.
- Used the Hadoop XML stream library and Spark's HadoopRDD to chunk and read large XML files during processing.
- Worked with the Scala XML API and Apache Spark to parse XML files, handling collection data and nested structures up to depth 3 by de-normalizing them into flat files.
- Handled special characters in XML/JSON files during processing.
- Used broadcast variables and accumulators to improve and debug overall application performance.
- Serialized streamed XML to strings using text decoders for further processing of the data.
- Used Solr to index files directly from HDFS for both structured and semi-structured data.
- Used Solr search capabilities such as faceted search, collapsing/grouping, and function queries.
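The de-normalization of nested XML into flat records described above can be sketched in miniature. This is a Python illustration only; the production job used the Scala XML API on Spark, and the element and field names here are hypothetical:

```python
import xml.etree.ElementTree as ET

def flatten_claims(xml_text):
    """Flatten nested <claim>/<appeal>/<note> elements (depth 3) into flat rows.

    Element and field names are hypothetical; the idea is carrying parent keys
    down to the innermost element so each output row is self-describing.
    """
    root = ET.fromstring(xml_text)
    rows = []
    for claim in root.findall("claim"):
        claim_id = claim.get("id")
        for appeal in claim.findall("appeal"):
            appeal_id = appeal.get("id")
            for note in appeal.findall("note"):
                # One flat row per innermost element, tagged with parent keys.
                rows.append({"claim_id": claim_id,
                             "appeal_id": appeal_id,
                             "note": (note.text or "").strip()})
    return rows

sample = """
<claims>
  <claim id="c1">
    <appeal id="a1">
      <note>first review</note>
      <note>second review</note>
    </appeal>
  </claim>
</claims>
"""
print(flatten_claims(sample))
```

In the Spark job, each flat row would become one line of the output file, ready for Hive tables with dynamic partitioning.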
Environment: HDFS, MapReduce, Hive, Sqoop, Flume, Spark, Cloudera Hadoop distribution, SQL Server, Caché database, UNIX shell scripting, Java (JDK 1.7), Scala
Confidential, San Mateo, CA
Hadoop Developer
Responsibilities:
- Worked on coordinating project management-related activities.
- Involved in the design and development of technical specification documents.
- Developed MapReduce programs to perform data scrubbing on unstructured data.
- Implemented optimized joins to gather data from different data sources using MapReduce joins.
- Worked in the Hortonworks environment.
- Developed Pig and Hive queries for log analysis, recommendations, and analytics.
- Developed Pig Latin scripts to cleanse datasets.
- Wrote Pig Latin UDFs in Java and used various UDFs from Piggybank and other sources to customize Pig scripts.
- Performed extensive data validation using Hive dynamic partitioning and bucketing for efficient data access.
- Executed Hive queries on Parquet tables stored in Hive to perform data analysis and meet business requirements.
- Used HBase to provide fast lookups for large tables.
- Involved in developing a data pipeline using Kafka and Storm to store data in HDFS.
- Used Sqoop to import and export data between HDFS and RDBMSs.
- Collected log data from web servers and integrated it into HDFS using Flume.
- Collected real-time data from Kafka using Spark Streaming, performed transformations and aggregations on the fly to build the common learner data model, and persisted the data into HBase.
- Scheduled the Oozie workflow engine to run multiple Hive and Pig jobs.
- Used Pig as an ETL tool for transformations, event joins, filtering, and pre-aggregations.
- Involved in loading data from Linux and UNIX file systems into HDFS.
- Strong experience in the development and testing phases of the software development life cycle.
- Worked with MRUnit for unit testing.
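The MapReduce join mentioned above (a reduce-side join) can be illustrated with a small pure-Python simulation. The table and field names are hypothetical; the real jobs were Java MapReduce on Hortonworks, but the map/shuffle/reduce flow is the same:

```python
from collections import defaultdict

def map_phase(records, source, key_field):
    """Map step: tag each record with its source table and emit (key, record)."""
    for rec in records:
        yield rec[key_field], (source, rec)

def reduce_phase(mapped):
    """Shuffle + reduce: group tagged records by key, then pair records
    from the two sources to emit joined rows."""
    groups = defaultdict(list)
    for key, tagged in mapped:
        groups[key].append(tagged)
    joined = []
    for key, tagged in sorted(groups.items()):
        left = [r for s, r in tagged if s == "users"]
        right = [r for s, r in tagged if s == "clicks"]
        for l in left:
            for r in right:
                joined.append({**l, **r})
    return joined

# Hypothetical sample inputs; field names are illustrative only.
users = [{"user_id": 1, "name": "ann"}, {"user_id": 2, "name": "bob"}]
clicks = [{"user_id": 1, "page": "/home"}, {"user_id": 1, "page": "/faq"}]

mapped = list(map_phase(users, "users", "user_id")) + \
         list(map_phase(clicks, "clicks", "user_id"))
print(reduce_phase(mapped))
```

Keys with no match on one side (user 2 here) produce no output, i.e. an inner join; emitting unmatched rows instead would give an outer join.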
Environment: MapReduce, HDFS, Pig, Hive, Sqoop, Flume, Spark, Hortonworks Hadoop distribution, Kafka, Oracle 11g, UNIX shell scripting, Java (JDK 1.6).
Confidential, San Francisco, CA
Hadoop Developer
Responsibilities:
- Responsible for developing efficient MapReduce programs to process more than 20 years of historical data.
- Uploaded and processed more than 30 terabytes of data from various structured and unstructured sources into HDFS using Sqoop and Flume.
- Played a key role in setting up the Apache Hadoop cluster, working closely with the Hadoop administration team.
- Worked with the advanced analytics team to design algorithms, then developed MapReduce programs to run them efficiently on large datasets.
- Involved in writing MapReduce programs for large datasets.
- Extensively used Pig as a processing framework for large datasets.
- Involved in writing UDFs in Java and used various UDFs from Piggybank and other sources.
- Developed Hive queries to perform transformations on datasets.
- Involved in creating external Hive tables on top of parsed data.
- Worked with Flume to transport logs to HDFS.
- Involved in loading data from the UNIX file system into HDFS.
- Created HBase tables to load large sets of structured, semi-structured, and unstructured data from UNIX and NoSQL sources.
- Involved in creating Hive tables, loading data, and writing Hive queries.
- Worked on importing and exporting data between a MySQL database and HDFS using Sqoop.
- Developed Oozie workflows to run multiple MapReduce, Pig, and Hive jobs.
- Worked in the Cloudera Manager environment.
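The UDF-based transformations mentioned above can be sketched as follows. This is a hypothetical Python stand-in for a Pig/Hive UDF (the real UDFs were written in Java), and the cleansing rules shown are illustrative only:

```python
import re

def cleanse(value):
    """A cleansing transform in the spirit of a Pig/Hive UDF: trim whitespace,
    collapse internal runs of whitespace, and lowercase. Null-safe, since UDFs
    must tolerate missing fields."""
    if value is None:
        return None
    return re.sub(r"\s+", " ", value).strip().lower()

# Applying the UDF-style function across a small hypothetical dataset,
# the way Pig's FOREACH ... GENERATE would apply it per record.
raw = ["  John  SMITH ", "MARY\tJones", None]
print([cleanse(v) for v in raw])
```

In Pig, the equivalent Java UDF would be registered with REGISTER/DEFINE and applied field-by-field inside a FOREACH statement.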
Environment: Hadoop, MapReduce, HDFS, Pig, Hive, Flume, Sqoop, Cloudera Manager, Oozie, Java (JDK 1.6).
Confidential, Bloomington IL
Java Developer
Responsibilities:
- Performed systems analysis, requirement clarification, design and documentation of the application.
- Implemented the Struts Model-View-Controller (MVC) design pattern across various modules using J2EE, JSP, and JavaBeans to control application flow in the presentation/web tier.
- Extensively used the Struts validation framework for validation.
- Involved in writing business logic using session beans and design patterns.
- Widely used J2EE design patterns such as Business Delegate, Singleton, Service Locator, and Data Access Object to keep the code reusable and maintainable.
- Used DbVisualizer to write SQL queries and stored procedures for the application.
- Implemented the Log4j framework for logging and application tracking.
- Designed and implemented JAX-WS web services using SOAP and XML to provide an interface to various clients running both Java and non-Java applications.
- Involved in code reviews and test case reviews, and gave feedback on various design aspects.
- Used Ant for build and deployment processes.
- Accessed mainframe/IMS systems via MQ.
- Interacted with external systems in development and system/integration testing.
- Used IRAD 7.0 extensively for code development and debugging.
Environment: Java, Servlets, JSP, Struts framework 1.2, EJB, Web Services, XML, JavaScript, HTML, CSS, WebLogic, Windows, UNIX shell scripting, PL/SQL, Ant, JSTL, DB2, MQ Workflow.
Confidential
Java/J2EE Developer
Responsibilities:
- Involved in Agile methodology with respect to the successful development of the project.
- Involved in the development of the application using Struts, including the validation framework, JSP, and JavaScript.
- Implemented a Model-View-Controller (MVC) architecture to achieve a layered design, isolating each layer of the application to avoid the complexity of integration and customization.
- Used WebLogic Server as the application server to host the EJBs.
- Used Eclipse IDE for developing the application.
- Used SVN for version control.
- Responsible for development of DAO's (Data Access Objects) to interact with the database using JDBC.
- Generated JUnit Test Cases to test the application.
- Used ANT to build the deployment JAR and WAR files.
- Tested web services using SOAP UI.
- Used Bugzilla for bug tracking.
- Log4j was used to log both User interface and Domain Level Messages.
Environment: Java, J2EE, JSP, Rational Rose, Servlets, Struts framework, JavaScript, Oracle, BEA WebLogic Server, JUnit, Ant, SVN, Log4j.
Confidential
Java Developer
Responsibilities:
- Involved in development, testing and implementation of the complete business solution.
- Used Java Servlets extensively and JDBC for database access.
- Designed and developed user interfaces using JSP, JavaScript and XHTML.
- Used various J2EE design patterns, such as Singleton and Command, in the implementation of the application.
- Designed, coded, and configured server-side J2EE components: JSP, Servlets, JavaBeans, JDBC, JNDI, JTS, JavaMail API, and XML.
- Involved in database design and developed SQL Queries on MySQL.
- Configured the product for different application servers.
- Involved in client side validations using JavaScript.
- Used Swing layout managers and components to create a desktop application.
Environment: J2SE 1.4, J2EE, XML, Servlets, XHTML, JSP, XSL, JavaScript, Tomcat 5, MySQL, JDBC, Eclipse.