Hadoop/Big Data Developer Resume
Atlanta, GA
PROFESSIONAL SUMMARY:
- 8+ years of experience in IT, including 3+ years of expertise in the design and development of scalable distributed systems using Hadoop ecosystem tools, Big Data technologies, Core Java, and J2EE.
- Comprehensive experience in Big Data processing using the Hadoop ecosystem, including Pig, Hive, HDFS, MapReduce (MRv1 and YARN), Sqoop, Flume, Kafka, Oozie, ZooKeeper, Spark, and Impala.
- Experience with all major Hadoop distributions, including Cloudera, Hortonworks, MapR, and the Amazon Web Services (EMR) distribution of Hadoop.
- Expertise in writing Hadoop jobs for analyzing data using Hive Query Language (HQL), Pig Latin (a data-flow language), and custom MapReduce programs in Java.
- Experience in writing Pig and Hive scripts to process structured and unstructured data, and in extending Hive and Pig core functionality by writing custom UDFs (see the sketch after this list).
- Experience in loading data from Oracle and MySQL databases into HDFS using Sqoop.
- Good understanding of NoSQL databases like MongoDB, Cassandra, and HBase.
- Expertise in using MongoDB for storing large data objects, real-time analytics, logging, and full-text search.
- Hands-on experience writing applications on HBase, and expertise with SQL and PL/SQL database concepts.
- Good knowledge of general data analytics on distributed computing clusters such as Hadoop, using Apache Spark and Scala.
- Experience in developing data pipelines that use Kafka to store data in HDFS.
- Experience in using Apache Avro to provide both a serialization format for persistent data and a wire format for communication between Hadoop nodes.
- Familiar with creating tables in Parquet format in Impala.
- Extensive experience in Unix Shell Scripting.
- Familiar with creating custom Solr query components.
- Hands-on experience configuring and working with Flume to load data from multiple sources directly into HDFS.
- Expertise in scheduling and monitoring Hadoop workflows using Oozie and ZooKeeper.
- Good knowledge of writing MapReduce programs using Apache Crunch.
- Strong experience as a Java developer in web/intranet and client/server technologies using Java and J2EE, including the Struts framework, MVC design patterns, JSP, Servlets, EJB, JDBC, JSTL, XML/XSLT, JavaScript, AJAX, JMS, JNDI, RDBMS, SOAP, Hibernate, and custom tag libraries.
- Detailed understanding of Software Development Life Cycle (SDLC) and sound knowledge of project implementation methodologies including Waterfall and Agile.
- An excellent team player and self-starter with good communication and interpersonal skills and a proven ability to finish tasks before target deadlines.
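To ground the custom-UDF bullet above, here is a minimal sketch of a simple Hive UDF; the package, class, and function names are hypothetical.

```java
package com.example.hive.udf; // hypothetical package

import org.apache.hadoop.hive.ql.exec.UDF;
import org.apache.hadoop.io.Text;

// Trims and lower-cases a string column. Registered in Hive with, e.g.:
//   ADD JAR normalize-udf.jar;
//   CREATE TEMPORARY FUNCTION normalize AS 'com.example.hive.udf.NormalizeUDF';
public class NormalizeUDF extends UDF {
    public Text evaluate(Text input) {
        if (input == null) {
            return null; // pass NULLs through unchanged
        }
        return new Text(input.toString().trim().toLowerCase());
    }
}
```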
TECHNICAL SKILLS:
Hadoop/Big Data: Apache Hadoop, HDFS and MapReduce, Pig, Hive, Sqoop, Flume, Hue, YARN, Oozie, ZooKeeper, MapR Converged Data Platform, CDH, HDP, EMR, Apache Spark, Apache Kafka, Apache Storm, Apache Crunch, Avro, Parquet.
Programming Languages: Java, C/C++, SQL, PL/SQL, Python, Ruby, Unix Shell Scripting.
Java Technologies: Java, J2EE, JSTL, JDBC, JSP, Java Servlets, JMS, JUnit, Log4j.
Web Technologies: AJAX, HTML5, JavaScript, CSS3, XML, SOAP, WSDL.
IDE/Development Tools: Eclipse, NetBeans, MyEclipse, SOAP UI, Ant.
Frameworks: MVC, Struts, Hibernate, Spring.
Web Servers: WebLogic, WebSphere, Apache Tomcat.
Databases: Oracle 11g/10g/9i, MySQL, DB2, MS-SQL Server.
NoSQL Databases: HBase, MongoDB, Cassandra.
Network Protocols: TCP/IP, UDP, HTTP, DNS, DHCP.
ETL & Reporting Tools: Informatica, Pentaho, SSIS, Cognos BI, Tableau, Hyperion, SSRS.
Operating Systems: Windows, Macintosh, Unix, Linux, Solaris.
PROFESSIONAL EXPERIENCE:
Hadoop/Big Data Developer
Confidential, Atlanta, GA
Responsibilities:
- Used Kafka for log aggregation: collected physical log files off servers and placed them in a central location such as HDFS for processing.
- Configured Spark Streaming to receive real-time data from Kafka and store the streamed data in HDFS (see the sketch after this list).
- Used Spark SQL to process structured data in Hive.
- Involved in creating Hive tables, loading data, writing Hive queries, and generating partitions and buckets for optimization.
- Developed simple to complex MapReduce jobs using Java, Hive, and Pig for data cleaning and preprocessing.
- Analyzed large data sets by running Hive queries and Pig scripts.
- Wrote Hive UDFs to sort struct fields and return complex data types.
- Used different data formats (text and ORC) while loading data into HDFS.
- Used HCatalog to access Hive table metadata from MapReduce and Pig code.
- Developed shell scripts to simplify the execution of the other scripts (Pig, Hive, and MapReduce) and to move data files within and outside of HDFS.
- Created indexes and tuned SQL queries in Hive using Hue.
- Created custom Solr query components to enable optimal search matching.
- Worked with NoSQL databases such as HBase, creating HBase tables to load large sets of semi-structured data.
- Imported data into HBase using both the HBase shell and the HBase client API.
- Used Kafka to rebuild a user activity tracking pipeline as a set of real-time publish-subscribe feeds.
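The Spark Streaming bullet above corresponds roughly to the sketch below, written against the receiver-based Spark 1.x / spark-streaming-kafka API of that era; the topic name, ZooKeeper quorum, and HDFS path are placeholders.

```java
import java.util.Collections;
import java.util.Map;

import org.apache.spark.SparkConf;
import org.apache.spark.streaming.Durations;
import org.apache.spark.streaming.api.java.JavaDStream;
import org.apache.spark.streaming.api.java.JavaPairReceiverInputDStream;
import org.apache.spark.streaming.api.java.JavaStreamingContext;
import org.apache.spark.streaming.kafka.KafkaUtils;

public class KafkaToHdfs {
    public static void main(String[] args) throws Exception {
        SparkConf conf = new SparkConf().setAppName("KafkaToHdfs");
        // 10-second micro-batches
        JavaStreamingContext jssc = new JavaStreamingContext(conf, Durations.seconds(10));

        // One receiver thread on a hypothetical "logs" topic.
        Map<String, Integer> topics = Collections.singletonMap("logs", 1);
        JavaPairReceiverInputDStream<String, String> stream =
                KafkaUtils.createStream(jssc, "zk1:2181", "log-consumers", topics);

        // Keep only the message payload and roll each batch out to HDFS.
        JavaDStream<String> lines = stream.map(record -> record._2());
        lines.dstream().saveAsTextFiles("hdfs:///data/logs/batch", "txt");

        jssc.start();
        jssc.awaitTermination();
    }
}
```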
Environment: Hadoop, Cloudera CDH4/CDH5, HDFS, MapReduce, YARN, Hive, Pig, Sqoop, HBase, Apache Spark, Oozie scheduler, Java, UNIX shell scripts, Impala, Hue, HCatalog, Kafka, Solr, Git, Maven, Bitbucket.
Hadoop Developer
Confidential, Tampa, FL
Responsibilities:
- Coordinated with business customers to gather business requirements and worked in an Agile environment.
- Responsible for importing log files from various sources into HDFS using Flume.
- Processed Big Data using a Hadoop cluster consisting of 45 nodes.
- Wrote complex HiveQL, including statements to create, alter, and drop Hive tables.
- Implemented partitioning, dynamic partitions, and buckets in Hive.
- Created final tables in Parquet format.
- Developed Pig scripts for source data validation and transformation.
- Developed shell, Perl, and Python scripts to automate Pig scripts and provide control flow for them.
- Developed a NoSQL database using CRUD operations, indexing, replication, and sharding in MongoDB, sorting data with indexes (see the sketch after this list).
- Extracted and updated data in MongoDB using the mongoimport and mongoexport command-line utilities.
- Involved in unit testing of MapReduce jobs using MRUnit.
- Used Hive and Pig to generate BI reports.
- Used the Oozie workflow engine to manage interdependent Hadoop jobs and to automate several types of Hadoop jobs, such as Java MapReduce, Hive, Pig, and Sqoop.
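A compact sketch of the MongoDB CRUD and indexing work above, against the MongoDB Java driver 3.x API; the database, collection, and field names are hypothetical.

```java
import com.mongodb.MongoClient;
import com.mongodb.client.MongoCollection;
import com.mongodb.client.model.Filters;
import com.mongodb.client.model.Indexes;
import org.bson.Document;

public class MongoCrudSketch {
    public static void main(String[] args) {
        MongoClient client = new MongoClient("localhost", 27017);
        try {
            MongoCollection<Document> events =
                    client.getDatabase("analytics").getCollection("events"); // hypothetical names

            // Create
            events.insertOne(new Document("user", "u123").append("action", "login"));
            // Read
            Document first = events.find(Filters.eq("user", "u123")).first();
            // Update
            events.updateOne(Filters.eq("user", "u123"),
                             new Document("$set", new Document("action", "logout")));
            // Delete
            events.deleteMany(Filters.eq("action", "logout"));

            // Secondary index so user lookups avoid full collection scans.
            events.createIndex(Indexes.ascending("user"));
            System.out.println(first == null ? "no match" : first.toJson());
        } finally {
            client.close();
        }
    }
}
```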
Environment: Hadoop, HDFS, Pig, Hive, MapReduce, Java (JDK 1.7), Flume, Oozie, Linux/UNIX shell scripting, Avro, MongoDB, Python, Perl, Git, Maven, Jenkins.
Hadoop Developer
Confidential, Dallas, TX
Responsibilities:
- Involved in the end-to-end setup of the Hadoop cluster and ecosystem, including Cloudera Manager installation, configuration, and monitoring on the CDH3 distribution.
- Worked extensively on creating MapReduce jobs for search and analytics to identify various trends.
- Ran Hadoop streaming jobs to process terabytes of XML-format data.
- Developed multiple MapReduce jobs in Java for data cleaning and preprocessing.
- Integrated bulk data into the Cassandra file system using MapReduce programs.
- Involved in creating Hive tables and in loading and analyzing data sets using Hive queries and Pig scripts, which run internally as MapReduce jobs.
- Extracted files from MySQL tables into HDFS using Sqoop.
- Involved in loading data from Linux/UNIX file system to HDFS.
- Wrote custom Hive and Pig UDFs based on the requirements (see the sketch after this list).
- Provided cluster coordination services through ZooKeeper.
- Designed and developed dashboards using Tableau 6.
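As an illustration of the custom Pig UDF work above, here is a minimal EvalFunc sketch; the package, class name, and registration comment are hypothetical.

```java
package com.example.pig; // hypothetical package

import java.io.IOException;

import org.apache.pig.EvalFunc;
import org.apache.pig.data.Tuple;

// Used from Pig Latin with, e.g.:
//   REGISTER myudfs.jar;
//   B = FOREACH A GENERATE com.example.pig.TrimLower(name);
public class TrimLower extends EvalFunc<String> {
    @Override
    public String exec(Tuple input) throws IOException {
        if (input == null || input.size() == 0 || input.get(0) == null) {
            return null; // propagate nulls rather than failing the task
        }
        return input.get(0).toString().trim().toLowerCase();
    }
}
```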
Environment: CDH3, Hadoop, HDFS, Pig, Hive, MapReduce, Sqoop, HBase, Cloudera Manager, Cassandra, MySQL, ZooKeeper, Linux (CentOS), Tableau 6, Java, SQL.
Java Developer
Confidential, Dallas, TX
Responsibilities:
- Responsible for gathering Business Requirements and User Specifications from Business Analyst.
- Involved in web service coding, testing, and deployment of the application.
- Worked on MVC frameworks, primarily WebWork and Struts 2.0, with Spring dependency injection for application customization and upgrades.
- Wrote all of the business logic across modules in core Java.
- Worked on the Load Builder module to develop the Region services using SOAP web services.
- Implemented Hibernate in the data access object layer to access and update information in the Oracle 10g database (see the sketch after this list).
- Used JSP, JavaScript, HTML5, and CSS to manipulate, validate, and customize error messages in the user interface.
- Wrote test cases in JUnit for unit testing of classes.
- The batch framework made heavy use of XML/XSL transforms.
- Provided technical expertise to the project team covering application design, database design, and performance-tuning activities.
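A skeletal data access object in the style described above, using the Hibernate Session API; the Customer entity and the HibernateUtil session-factory holder are hypothetical.

```java
import org.hibernate.Session;
import org.hibernate.SessionFactory;
import org.hibernate.Transaction;

public class CustomerDao {
    // HibernateUtil is a hypothetical holder for the configured SessionFactory.
    private final SessionFactory sessionFactory = HibernateUtil.getSessionFactory();

    public Customer findById(Long id) {
        Session session = sessionFactory.openSession();
        try {
            return (Customer) session.get(Customer.class, id);
        } finally {
            session.close();
        }
    }

    public void update(Customer customer) {
        Session session = sessionFactory.openSession();
        Transaction tx = session.beginTransaction();
        try {
            session.update(customer);
            tx.commit();
        } catch (RuntimeException e) {
            tx.rollback(); // never leave a transaction dangling on failure
            throw e;
        } finally {
            session.close();
        }
    }
}
```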
Environment: Java, J2EE, Struts 2.0, Hibernate, MVC, WebLogic Application Server, JSP, Servlets, JavaScript, HTML, CSS, Ajax, Web Services, Oracle 10g, Eclipse, PL/SQL, Ant, JUnit, XML/XSL, Log4j, TortoiseSVN.
Java/J2EE Developer
Confidential
Responsibilities:
- Involved in developing the application on the Java/J2EE platform. Implemented the Model-View-Controller (MVC) structure using Struts.
- Created JSP screens by assembling Struts Tiles and tag libraries, using HTML for static web pages and JavaScript for the view part of the project.
- Applied the MVC pattern of the Ajax framework, which involved creating controllers for classic JavaScript event handlers and implementing a flexible event model for managing multiple event callbacks.
- Implemented a simulated top-down SOAP-based web service to test the business logic for the rating calculation (see the sketch after this list).
- Used Hibernate as the persistence framework, mapping ORM objects to tables using Hibernate annotations.
- Used client-side JavaScript with jQuery to design tabs and dialog boxes.
- Used XML for ORM mapping relations between the Java classes and the database.
- Integrated the Log4j logging API to log errors and messages.
- Responsible for overall quality and timeliness of the delivery.
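The rating service above was built top-down (WSDL-first); the code-first JAX-WS sketch below only illustrates the shape of the endpoint under test. The service name and rate formula are placeholders.

```java
import javax.jws.WebMethod;
import javax.jws.WebService;
import javax.xml.ws.Endpoint;

@WebService
public class RatingService {

    @WebMethod
    public double calculateRate(double base, double riskFactor) {
        return base * riskFactor; // stand-in for the real rating calculation
    }

    public static void main(String[] args) {
        // Publish locally for a quick smoke test; a container would do this at deployment.
        Endpoint.publish("http://localhost:8080/rating", new RatingService());
    }
}
```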
Environment: Java/J2EE, Oracle 10g, MVC, JSP, EJB, Struts 1.x, JBoss, Web Services, jQuery, Log4j, Ant, HTML, XML, SQL, PL/SQL, Tomcat, Hibernate, JavaScript, JUnit, JDBC.
Java Developer
Confidential
Responsibilities:
- Participated in all the phases of SDLC including Requirements Collection, Design & Analysis of the Customer Specifications, Development and Customization of the application.
- Involved and participated in Code reviews.
- Responsible for designing the JSP pages and writing the Struts Action classes for the Security and Search modules (see the sketch after this list).
- Involved in making the security and search features separate application units of the project.
- Designed the database and coded SQL, PL/SQL, triggers, and views using IBM DB2.
- Created connection pools and data sources.
- Deployed the application, which uses the J2EE architecture model and the Struts framework, on the JBoss application server.
- Developed server-side common utilities for the application, and front-end dynamic web pages using Servlets, JSP, custom tag libraries, JavaScript, HTML/DHTML, and CSS.
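A skeletal Struts 1.x Action for the search module described above; the forward names and request parameter are hypothetical.

```java
import javax.servlet.http.HttpServletRequest;
import javax.servlet.http.HttpServletResponse;

import org.apache.struts.action.Action;
import org.apache.struts.action.ActionForm;
import org.apache.struts.action.ActionForward;
import org.apache.struts.action.ActionMapping;

public class SearchAction extends Action {
    @Override
    public ActionForward execute(ActionMapping mapping, ActionForm form,
                                 HttpServletRequest request,
                                 HttpServletResponse response) throws Exception {
        String query = request.getParameter("q");
        if (query == null || query.trim().length() == 0) {
            return mapping.findForward("failure"); // back to the search JSP
        }
        request.setAttribute("query", query.trim());
        return mapping.findForward("success");     // results JSP
    }
}
```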
Environment: Java 5.0, J2EE, JSP, JavaScript, HTML/DHTML, CSS, DB2, CVS, Windows XP, Struts framework, Eclipse IDE, EJB, JMS, WebLogic Server, SQL, PL/SQL.