
Hadoop/Big Data Developer Resume


Atlanta, GA

PROFESSIONAL SUMMARY:

  • 8+ years of experience in IT, including 3+ years of expertise in the design and development of scalable distributed systems using Hadoop ecosystem tools, Big Data technologies, Core Java, and J2EE.
  • Comprehensive experience in Big Data processing using the Hadoop ecosystem, including Pig, Hive, HDFS, MapReduce (MRv1 and YARN), Sqoop, Flume, Kafka, Oozie, Zookeeper, Spark, and Impala.
  • Experience with the major Hadoop distributions, including Cloudera, Hortonworks, MapR, and the Amazon Web Services distribution of Hadoop.
  • Expertise in writing Hadoop jobs for analyzing data using Hive Query Language (HQL), Pig Latin (a data-flow language), and custom MapReduce programs in Java.
  • Experience in writing Pig and Hive scripts to process structured and unstructured data, and in extending Hive and Pig core functionality by writing custom UDFs (see the UDF sketch after this list).
  • Experience in loading data from Oracle and MySQL databases into HDFS using Sqoop.
  • Good understanding of NoSQL databases such as MongoDB, Cassandra, and HBase.
  • Expertise in using MongoDB for storing large data objects, real-time analytics, logging, and full-text search.
  • Hands-on experience writing applications on HBase, and expertise in SQL and PL/SQL database concepts.
  • Good knowledge of general data analytics on distributed computing clusters such as Hadoop, using Apache Spark and Scala.
  • Experience in developing a data pipeline using Kafka to store data in HDFS.
  • Experience in using Apache Avro to provide both a serialization format for persistent data and a wire format for communication between Hadoop nodes.
  • Familiar with creating tables in Parquet format in Impala.
  • Extensive experience in UNIX shell scripting.
  • Familiar with creating custom Solr query components.
  • Hands-on experience configuring and working with Flume to load data from multiple sources directly into HDFS.
  • Expertise in scheduling and monitoring Hadoop workflows using Oozie and Zookeeper.
  • Good knowledge of writing MapReduce programs using Apache Crunch.
  • Strong experience as a Java developer in web/intranet and client/server technologies using Java and J2EE, including the Struts framework, MVC design patterns, JSP, Servlets, EJB, JDBC, JSTL, XML/XSLT, JavaScript, AJAX, JMS, JNDI, RDBMS, SOAP, Hibernate, and custom tag libraries.
  • Detailed understanding of the Software Development Life Cycle (SDLC) and sound knowledge of project implementation methodologies, including Waterfall and Agile.
  • An excellent team player and self-starter with good communication and interpersonal skills, and proven ability to finish tasks before target deadlines.
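
A minimal sketch of the kind of custom Hive UDF mentioned above, written in Java against the classic org.apache.hadoop.hive.ql.exec.UDF base class; the class name and the normalization behavior are hypothetical illustrations, not project artifacts.

```java
import org.apache.hadoop.hive.ql.exec.UDF;
import org.apache.hadoop.io.Text;

// Hypothetical Hive UDF: normalizes free-text column values so they can be
// grouped consistently in HiveQL.
public final class NormalizeText extends UDF {
    public Text evaluate(Text input) {
        if (input == null) {
            return null;  // Hive passes SQL NULL as a Java null
        }
        return new Text(input.toString().trim().toLowerCase());
    }
}
```

Packaged into a JAR, a function like this would be registered in Hive with ADD JAR and CREATE TEMPORARY FUNCTION and then invoked like any built-in function.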

TECHNICAL SKILLS:

Hadoop/Big Data: Apache Hadoop, HDFS, MapReduce, Pig, Hive, Sqoop, Flume, Hue, YARN, Oozie, Zookeeper, MapR Converged Data Platform, CDH, HDP, EMR, Apache Spark, Apache Kafka, Apache Storm, Apache Crunch, Avro, Parquet.

Programming Languages: Java, C/C++, SQL, PL/SQL, Python, Ruby, UNIX Shell Scripting.

Java Technologies: Java, J2EE, JSTL, JDBC, JSP, Java Servlets, JMS, JUnit, Log4j.

Web Technologies: AJAX, HTML5, JavaScript, CSS3, XML, SOAP, WSDL.

IDE/Development Tools: Eclipse, NetBeans, MyEclipse, SoapUI, Ant.

Frameworks: MVC, Struts, Hibernate, Spring.

Application/Web Servers: WebLogic, WebSphere, Apache Tomcat.

Databases: Oracle 11g/10g/9i, MySQL, DB2, MS SQL Server.

NoSQL Databases: HBase, MongoDB, Cassandra.

Network Protocols: TCP/IP, UDP, HTTP, DNS, DHCP.

ETL & Reporting Tools: Informatica, Pentaho, SSIS, Cognos BI, Tableau, Hyperion, SSRS.

Operating Systems: Windows, Macintosh, UNIX, Linux, Solaris.

PROFESSIONAL EXPERIENCE:

Hadoop/Big Data Developer

Confidential, Atlanta, GA 

Responsibilities:

  • Used Kafka for log aggregation: collecting physical log files from servers and placing them in a central location such as HDFS for processing.
  • Configured Spark Streaming to receive real-time data from Kafka and store the stream data in HDFS (see the sketch after this list).
  • Used Spark SQL to process structured data in Hive.
  • Involved in creating Hive tables, loading data, writing Hive queries, and generating partitions and buckets for optimization.
  • Developed simple to complex MapReduce jobs using Java, Hive, and Pig for data cleaning and preprocessing.
  • Analyzed large data sets by running Hive queries and Pig scripts.
  • Wrote Hive UDFs to sort struct fields and return complex data types.
  • Used different data formats (text and ORC) while loading data into HDFS.
  • Used HCatalog to access Hive table metadata from MapReduce and Pig code.
  • Developed shell scripts to ease execution of the other scripts (Pig, Hive, and MapReduce) and to move data files within and outside of HDFS.
  • Created indexes and tuned SQL queries in Hive using Hue.
  • Created custom Solr query components to enable optimal search matching.
  • Worked with NoSQL databases such as HBase, creating HBase tables to load large sets of semi-structured data.
  • Imported data into HBase using the HBase shell and the HBase client API.
  • Used Kafka to rebuild a user-activity tracking pipeline as a set of real-time publish-subscribe feeds.
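
A minimal sketch of the Kafka-to-HDFS flow described above, assuming the Spark 1.x receiver-based Kafka integration (spark-streaming-kafka) and Java 8 lambdas; the topic, consumer group, Zookeeper quorum, and HDFS path are hypothetical.

```java
import java.util.HashMap;
import java.util.Map;

import org.apache.spark.SparkConf;
import org.apache.spark.streaming.Durations;
import org.apache.spark.streaming.api.java.JavaPairReceiverInputDStream;
import org.apache.spark.streaming.api.java.JavaStreamingContext;
import org.apache.spark.streaming.kafka.KafkaUtils;

public final class KafkaToHdfs {
    public static void main(String[] args) throws InterruptedException {
        SparkConf conf = new SparkConf().setAppName("kafka-to-hdfs");
        JavaStreamingContext jssc = new JavaStreamingContext(conf, Durations.seconds(30));

        Map<String, Integer> topics = new HashMap<>();
        topics.put("server-logs", 1); // hypothetical topic, one receiver thread

        // Receiver-based stream: keys are Kafka message keys, values are log lines.
        JavaPairReceiverInputDStream<String, String> messages =
                KafkaUtils.createStream(jssc, "zk1:2181", "log-aggregators", topics);

        // Persist each non-empty micro-batch under a timestamped HDFS directory.
        messages.map(kv -> kv._2())
                .foreachRDD((rdd, time) -> {
                    if (!rdd.isEmpty()) {
                        rdd.saveAsTextFile("hdfs:///data/raw_logs/" + time.milliseconds());
                    }
                });

        jssc.start();
        jssc.awaitTermination();
    }
}
```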

Environment: Hadoop, Cloudera CDH4/CDH5, HDFS, MapReduce, YARN, Hive, Pig, Sqoop, HBase, Apache Spark, Oozie scheduler, Java, UNIX shell scripts, Impala, Hue, HCatalog, Kafka, Solr, Git, Maven, Bitbucket.

Hadoop Developer

Confidential, Tampa, FL

Responsibilities:

  • Coordinated with business customers to gather business requirements, and worked in an agile environment.
  • Responsible for importing log files from various sources into HDFS using Flume.
  • Processed Big Data on a Hadoop cluster consisting of 45 nodes.
  • Wrote complex HiveQL queries and statements to create, alter, and drop Hive tables.
  • Implemented partitioning, dynamic partitions, and buckets in Hive.
  • Created final tables in Parquet format.
  • Developed Pig scripts for source data validation and transformation.
  • Developed shell, Perl, and Python scripts to automate and provide control flow to Pig scripts.
  • Developed against a MongoDB NoSQL database using CRUD operations, indexing, replication, and sharding; sorted data using indexes.
  • Extracted and updated data in MongoDB using the mongoimport and mongoexport command-line utilities.
  • Performed unit testing of MapReduce jobs using MRUnit (see the sketch after this list).
  • Used Hive and Pig to generate BI reports.
  • Used the Oozie workflow engine to manage interdependent Hadoop jobs and to automate several types of Hadoop jobs, such as Java MapReduce, Hive, Pig, and Sqoop.
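
A minimal sketch of the MRUnit-based unit testing mentioned above; the LogLevelMapper under test and its tab-delimited input layout are hypothetical.

```java
import java.io.IOException;

import org.apache.hadoop.io.IntWritable;
import org.apache.hadoop.io.LongWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Mapper;
import org.apache.hadoop.mrunit.mapreduce.MapDriver;
import org.junit.Before;
import org.junit.Test;

// Hypothetical mapper under test: emits (logLevel, 1) for each log line.
class LogLevelMapper extends Mapper<LongWritable, Text, Text, IntWritable> {
    private static final IntWritable ONE = new IntWritable(1);

    @Override
    protected void map(LongWritable key, Text value, Context context)
            throws IOException, InterruptedException {
        String[] fields = value.toString().split("\t");
        if (fields.length > 1) {
            context.write(new Text(fields[1]), ONE); // fields[1] = log level
        }
    }
}

public class LogLevelMapperTest {
    private MapDriver<LongWritable, Text, Text, IntWritable> mapDriver;

    @Before
    public void setUp() {
        mapDriver = MapDriver.newMapDriver(new LogLevelMapper());
    }

    @Test
    public void emitsLogLevelWithCountOne() throws IOException {
        mapDriver.withInput(new LongWritable(0), new Text("2015-01-01\tERROR\tdisk full"))
                 .withOutput(new Text("ERROR"), new IntWritable(1))
                 .runTest();
    }
}
```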

Environment: Hadoop, HDFS, Pig, Hive, MapReduce, Java (JDK 1.7), Flume, Oozie, Linux/UNIX shell scripting, Avro, MongoDB, Python, Perl, Git, Maven, Jenkins.

Hadoop Developer

Confidential, Dallas, TX

Responsibilities:

  • Involved in the end-to-end setup of the Hadoop cluster and ecosystem: Cloudera Manager installation, configuration, and monitoring on the CDH3 distribution.
  • Worked extensively on creating MapReduce jobs for search and analytics to identify various trends.
  • Ran Hadoop streaming jobs to process terabytes of XML-format data.
  • Developed multiple MapReduce jobs in Java for data cleaning and preprocessing (see the sketch after this list).
  • Integrated bulk data into the Cassandra file system using MapReduce programs.
  • Involved in creating Hive tables, and in loading and analyzing data sets using Hive queries and Pig scripts, which run internally as MapReduce jobs.
  • Extracted data from MySQL tables into HDFS using Sqoop.
  • Involved in loading data from the Linux/UNIX file system into HDFS.
  • Wrote custom Hive and Pig UDFs based on the requirements.
  • Provided cluster coordination services through Zookeeper.
  • Designed and developed dashboards using Tableau 6.
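
A minimal sketch of a map-only data-cleaning job of the kind described above, written against the Hadoop 2.x MapReduce API; the pipe delimiter and twelve-field schema are assumptions for illustration.

```java
import java.io.IOException;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.LongWritable;
import org.apache.hadoop.io.NullWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Job;
import org.apache.hadoop.mapreduce.Mapper;
import org.apache.hadoop.mapreduce.lib.input.FileInputFormat;
import org.apache.hadoop.mapreduce.lib.output.FileOutputFormat;

// Hypothetical cleaning job: drops records that do not have the expected
// number of delimited fields.
public class RecordCleaner {

    public static class CleanMapper
            extends Mapper<LongWritable, Text, NullWritable, Text> {
        private static final int EXPECTED_FIELDS = 12; // assumed schema width

        @Override
        protected void map(LongWritable key, Text value, Context context)
                throws IOException, InterruptedException {
            if (value.toString().split("\\|", -1).length == EXPECTED_FIELDS) {
                context.write(NullWritable.get(), value); // keep valid rows only
            }
        }
    }

    public static void main(String[] args) throws Exception {
        Job job = Job.getInstance(new Configuration(), "record-cleaner");
        job.setJarByClass(RecordCleaner.class);
        job.setMapperClass(CleanMapper.class);
        job.setNumReduceTasks(0); // map-only: cleaned rows go straight to HDFS
        job.setOutputKeyClass(NullWritable.class);
        job.setOutputValueClass(Text.class);
        FileInputFormat.addInputPath(job, new Path(args[0]));
        FileOutputFormat.setOutputPath(job, new Path(args[1]));
        System.exit(job.waitForCompletion(true) ? 0 : 1);
    }
}
```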

Environment: CDH3, Hadoop, HDFS, Pig, Hive, MapReduce, Sqoop, HBase, Cloudera Manager, Cassandra, MySQL, Zookeeper, Linux (CentOS), Tableau 6, Java, SQL.

Java Developer

Confidential, Dallas, TX

Responsibilities:

  • Responsible for gathering business requirements and user specifications from business analysts.
  • Involved in web service coding, testing, and deployment of the application.
  • Worked on MVC frameworks, primarily WebWork and Struts 2.0, with Spring dependency injection for application customization and upgrades.
  • Wrote the business logic for all modules in core Java.
  • Worked on the Load Builder module to develop the Region services using SOAP web services.
  • Implemented Hibernate in the data access object layer to access and update information in the Oracle 10g database (see the sketch after this list).
  • Used JSP, JavaScript, HTML5, and CSS for manipulating, validating, and customizing error messages in the user interface.
  • Wrote JUnit test cases for unit testing of classes.
  • Made heavy use of XML/XSL transforms in the batch framework.
  • Provided technical expertise to the project team, covering application design, database design, and performance-tuning activities.
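
A minimal sketch of the Hibernate DAO-layer pattern described above; the Region entity, its fields, and the injected SessionFactory are hypothetical.

```java
import javax.persistence.Entity;
import javax.persistence.Id;

import org.hibernate.Session;
import org.hibernate.SessionFactory;
import org.hibernate.Transaction;

// Hypothetical Hibernate-mapped entity backing the Region services.
@Entity
class Region {
    @Id
    private Long id;
    private String name;
    // getters and setters omitted in this sketch
}

// DAO that reads and updates Region rows in the Oracle database.
public class RegionDao {
    private final SessionFactory sessionFactory; // assumed injected (e.g., by Spring)

    public RegionDao(SessionFactory sessionFactory) {
        this.sessionFactory = sessionFactory;
    }

    public Region findById(Long id) {
        Session session = sessionFactory.openSession();
        try {
            return (Region) session.get(Region.class, id);
        } finally {
            session.close();
        }
    }

    public void update(Region region) {
        Session session = sessionFactory.openSession();
        Transaction tx = session.beginTransaction();
        try {
            session.update(region); // changes are flushed to Oracle on commit
            tx.commit();
        } catch (RuntimeException e) {
            tx.rollback();
            throw e;
        } finally {
            session.close();
        }
    }
}
```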

Environment: Java, J2EE, Struts 2.0, Hibernate, MVC, WebLogic Application Server, JSP, Servlets, JavaScript, HTML, CSS, Ajax, Web Services, Oracle 10g, Eclipse, PL/SQL, Ant, JUnit, XML/XSL, Log4j, TortoiseSVN.

Java/J2EE Developer

Confidential

Responsibilities:

  • Involved in developing the application on the Java/J2EE platform; implemented the Model-View-Controller (MVC) structure using Struts (see the sketch after this list).
  • Created JSP screens by assembling Struts Tiles and tag libraries, using HTML for static pages and JavaScript for the view layer of the project.
  • Applied the MVC pattern of the Ajax framework, creating controllers that implement classic JavaScript event handlers, and implemented a flexible event model for managing multiple event callbacks.
  • Implemented a simulated top-down SOAP-based web service to test the business logic for the rating calculation.
  • Used Hibernate as the persistence framework, mapping ORM objects to tables with Hibernate annotations.
  • Used client-side JavaScript and jQuery to build tabs and dialog boxes.
  • Used XML to define ORM mapping relations between the Java classes and the database.
  • Integrated the Log4j logging API to log errors and messages.
  • Responsible for the overall quality and timeliness of the delivery.
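
A minimal sketch of a Struts 1.x Action for the MVC structure described above; the action name, request parameter, and forward targets are hypothetical.

```java
import javax.servlet.http.HttpServletRequest;
import javax.servlet.http.HttpServletResponse;

import org.apache.struts.action.Action;
import org.apache.struts.action.ActionForm;
import org.apache.struts.action.ActionForward;
import org.apache.struts.action.ActionMapping;

// Hypothetical controller: validates the search term and forwards to a JSP view.
public class SearchAction extends Action {
    @Override
    public ActionForward execute(ActionMapping mapping, ActionForm form,
                                 HttpServletRequest request,
                                 HttpServletResponse response) throws Exception {
        String query = request.getParameter("q");
        if (query == null || query.trim().length() == 0) {
            return mapping.findForward("failure"); // back to the search page
        }
        request.setAttribute("query", query.trim());
        return mapping.findForward("success");     // render the results JSP
    }
}
```

The "success" and "failure" forwards would map to JSP pages in struts-config.xml.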

Environment: Java/J2EE, Oracle 10g, MVC, JSP, EJB, Struts 1.x, JBoss, Web Services, jQuery, Log4j, Ant, HTML, XML, SQL/PL-SQL, Tomcat, Hibernate, JavaScript, JUnit, JDBC.

Java Developer

Confidential

Responsibilities:

  • Participated in all phases of the SDLC, including requirements collection, design and analysis of the customer specifications, and development and customization of the application.
  • Involved and participated in code reviews.
  • Responsible for designing JSP pages and writing Action classes using the Struts framework for the Security and Search modules.
  • Involved in making the security and search features separate application units of the project.
  • Designed the database and coded SQL, PL/SQL, triggers, and views using IBM DB2.
  • Created connection pools and data sources (see the sketch after this list).
  • Deployed the application, which uses the J2EE architecture model and the Struts framework, on the JBoss application server.
  • Developed server-side common utilities for the application, and front-end dynamic web pages using Servlets, JSP, custom tag libraries, JavaScript, HTML/DHTML, and CSS.
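
A minimal sketch of obtaining a pooled connection from a container-managed data source via JNDI, as described above; the JNDI name jdbc/AppDS is hypothetical.

```java
import java.sql.Connection;
import java.sql.SQLException;

import javax.naming.InitialContext;
import javax.naming.NamingException;
import javax.sql.DataSource;

// The pool itself is configured on the application server; the application
// only looks the data source up by its JNDI name.
public final class ConnectionUtil {
    private ConnectionUtil() { }

    public static Connection getConnection() throws NamingException, SQLException {
        InitialContext ctx = new InitialContext();
        DataSource ds = (DataSource) ctx.lookup("jdbc/AppDS"); // hypothetical name
        return ds.getConnection(); // borrowed from the pool; close() returns it
    }
}
```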

Environment: Java 5.0, J2EE, JSP, JavaScript, HTML/DHTML, CSS, DB2, CVS, Windows XP, Struts framework, Eclipse IDE, EJB, JMS, WebLogic Server, SQL, PL/SQL.
