Hadoop Developer Resume Profile
Chicago, IL
Summary
- Hadoop/Java Developer with over 7.5 years of overall experience as a software developer in the design, development, deployment, and support of large-scale distributed systems.
- 3 years of extensive experience as a Hadoop Developer and Big Data analyst.
- DataStax Cassandra and IBM Big Data University certified.
- Excellent understanding of Hadoop architecture and underlying framework including storage management.
- Experience in installing, configuring, and administering Hadoop clusters for major Hadoop distributions such as CDH3, CDH4, and CDH5.
- Expertise in using Hadoop ecosystem components such as MapReduce, Pig, Hive, ZooKeeper, HBase, Sqoop, Oozie, Flume, and Spark for data storage and analysis.
- Experience in developing custom UDFs for Pig and Hive to incorporate Python/Java functionality into Pig Latin and HiveQL, and in using UDFs from the Piggybank UDF repository.
- Experienced in running queries using Impala and in using BI tools to run ad-hoc queries directly on Hadoop.
- Good experience with the Oozie framework and in automating daily import jobs.
- Experienced in managing Hadoop clusters and services using Cloudera Manager.
- Experienced in troubleshooting errors in HBase Shell/API, Pig, Hive and MapReduce.
- Highly experienced in importing and exporting data between HDFS and Relational Database Management systems using Sqoop.
- Collected log data from various sources and integrated it into HDFS using Flume.
- Assisted the deployment team in setting up Hadoop clusters and services.
- Good experience in generating statistics, extracts, and reports from Hadoop.
- Good understanding of NoSQL databases and hands-on experience writing applications on NoSQL databases such as Cassandra and MongoDB.
- Good knowledge in querying data from Cassandra for searching, grouping, and sorting.
- Good knowledge of Amazon AWS concepts such as EMR and EC2, which provide fast and efficient processing of Big Data.
- Strong experience in core Java, J2EE, SQL, PL/SQL, and RESTful web services.
- Good knowledge of benchmarking and performance tuning of clusters.
- Experienced in identifying improvement areas for system stability and providing end-to-end high-availability architectural solutions.
- Extensive experience in developing applications using Core Java and multi-threading.
- Determined, committed and hardworking individual with strong communication, interpersonal and organizational skills.
Technical Skills
Hadoop Ecosystem | HDFS, MapReduce, MRUnit, YARN, Hive, Pig, HBase, Impala, ZooKeeper, Sqoop, Oozie, DataStax Apache Cassandra, Flume, Spark, Avro, AWS, Amazon EC2, S3 |
Web Technologies | HTML, XML, JDBC, JSP, JavaScript, AJAX |
RDBMS | Oracle 10g/11g, MySQL, SQL server, Teradata |
No SQL | HBase, Cassandra, Mahout. |
Web/Application servers | Tomcat, LDAP |
Java frameworks | Struts, Spring, Hibernate |
Methodologies | Agile, UML, Design Patterns (Core Java and J2EE) |
Data Bases | Oracle 11g/10g,Teradata, DB2, MS-SQL Server, MySQL, MS-Access |
Programming Languages | C, C++, Java, SQL, PL/SQL, Linux shell scripts |
Tools Used | Eclipse, Putty, Cygwin, MS Office, Crystal Reports |
BI Tools | Tableau, Datameer, Pentaho |
Professional Summary
Confidential
Hadoop Developer
Responsibilities:
- Installed and configured Hadoop MapReduce and HDFS; developed multiple MapReduce jobs in Java for data cleaning and preprocessing.
- Experienced in installing, configuring and using Hadoop Ecosystem components.
- Experienced in Importing and exporting data into HDFS and Hive using Sqoop.
- Knowledge in performance troubleshooting and tuning of Hadoop clusters.
- Participated in the development and implementation of the Cloudera Hadoop environment.
- Implemented Partitioning, Dynamic Partitions and Buckets in HIVE for efficient data access.
- Experienced in running query using Impala and used BI tools to run ad-hoc queries directly on Hadoop.
- Implemented advanced Spark procedures such as text analytics and processing using Spark's in-memory computing capabilities.
- Involved in implementing and integrating various NoSQL databases such as HBase and Cassandra.
- Installed and configured Hive, wrote Hive UDFs, and used JUnit for unit testing of MapReduce jobs.
- Used DataStax Cassandra along with Pentaho for reporting.
- Experienced in working with various kinds of data sources such as Teradata and Oracle. Successfully loaded files from Teradata into HDFS and from HDFS into Hive and Impala.
- Experienced in using Zookeeper and Oozie Operational Services for coordinating the cluster and scheduling workflows.
- Queried and analyzed data from Cassandra for quick searching, sorting and grouping through CQL.
- Used Piggybank, a repository of user-defined functions, for Pig Latin scripts.
- Experienced in managing and reviewing Hadoop log files.
- Worked on installing the cluster, commissioning and decommissioning DataNodes, NameNode recovery, capacity planning, and slot configuration.
- Supported MapReduce programs running on the cluster. Involved in loading data from the UNIX file system into HDFS.
- Loaded and transformed large sets of structured, semi-structured, and unstructured data.
- Exported the analyzed data to the relational databases using Sqoop for visualization and to generate reports for the BI team.
Environment: Hadoop, MapReduce, HDFS, Hive, Pig, Impala, Cassandra, Spark, Java, SQL, Tableau, ZooKeeper, Sqoop, Teradata, CentOS, Pentaho.
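The MapReduce data-cleaning and analysis jobs described above follow the classic map/reduce pattern. Below is a minimal sketch of that pattern in plain Java; the Hadoop plumbing (classes extending org.apache.hadoop.mapreduce.Mapper and Reducer, Writable types, job configuration) is deliberately omitted so the core logic stands alone, and the word-count task is just an illustrative stand-in.

```java
import java.util.*;

// Minimal word-count sketch of the map/reduce pattern used in the jobs above.
// In a real Hadoop job, map() and reduce() would live in Mapper/Reducer
// subclasses and operate on Writable types; this is the logic only.
public class WordCountSketch {

    // "map" phase: emit a (word, 1) pair for every token in every input line
    static List<Map.Entry<String, Integer>> map(List<String> lines) {
        List<Map.Entry<String, Integer>> pairs = new ArrayList<>();
        for (String line : lines) {
            for (String token : line.toLowerCase().split("\\s+")) {
                if (!token.isEmpty()) {
                    pairs.add(new AbstractMap.SimpleEntry<>(token, 1));
                }
            }
        }
        return pairs;
    }

    // "reduce" phase: sum the emitted counts for each distinct key
    static Map<String, Integer> reduce(List<Map.Entry<String, Integer>> pairs) {
        Map<String, Integer> counts = new TreeMap<>();
        for (Map.Entry<String, Integer> pair : pairs) {
            counts.merge(pair.getKey(), pair.getValue(), Integer::sum);
        }
        return counts;
    }
}
```

In an actual job the shuffle between the two phases is handled by the framework, which groups all pairs sharing a key before they reach the reducer.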
Confidential
Hadoop Developer
Responsibilities:
- Acted as a lead resource and build the entire Hadoop platform from scratch.
- Evaluated the suitability of Hadoop and its ecosystem for the project, implementing and validating proof-of-concept (POC) applications to eventually adopt them and benefit from the Big Data Hadoop initiative.
- Estimated the software and hardware requirements for the NameNode and DataNodes and planned the cluster.
- Extracted the needed data from the server into HDFS and bulk-loaded the cleaned data into HBase.
- Using the Spark framework, enhanced and optimized product Spark code to aggregate, group, and run data mining tasks.
- Wrote queries using Cassandra CQL to create, alter, insert, and delete elements.
- Wrote MapReduce programs and Hive UDFs in Java.
- Used JUnit for unit testing of MapReduce programs.
- Developed Hive queries for the analysts.
- Created an e-mail notification service upon completion of job for the particular team which requested for the data.
- Defined job work flows as per their dependencies in Oozie.
- Played a key role in productionizing the application after testing by BI analysts.
- Delivered a POC of Flume to handle real-time log processing for attribution reports.
- Maintained system integrity of all subcomponents related to Hadoop.
Environment: Apache Hadoop, HDFS, Spark, Hive, Cassandra, MapReduce, Java, Flume, Cloudera CDH4, Oozie, Oracle, MySQL, Amazon S3.
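The Hive UDFs written in Java for this role would center on an evaluate() method like the one below. This is a sketch only: a deployable UDF would extend org.apache.hadoop.hive.ql.exec.UDF and typically use Hadoop Text types, and the cleaning rule shown (trim, collapse whitespace, lowercase) is an illustrative assumption rather than the actual production logic.

```java
// Sketch of a Java Hive UDF's core logic. A real UDF would extend
// org.apache.hadoop.hive.ql.exec.UDF so Hive can discover evaluate() by
// reflection; only the plain-Java logic is shown here.
public class CleanTextUdf {
    public String evaluate(String input) {
        if (input == null) {
            return null; // Hive NULLs arrive as Java null and pass through
        }
        // trim, collapse internal whitespace, and normalize case
        return input.trim().replaceAll("\\s+", " ").toLowerCase();
    }
}
```

Once packaged, such a function is registered in a Hive session with ADD JAR followed by CREATE TEMPORARY FUNCTION before use in HiveQL.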
Confidential
Java Developer
Responsibilities:
- Installed and configured Apache Hadoop to test the maintenance of log files in Hadoop cluster.
- Installed and configured Hive, Pig, Sqoop, Flume, and Oozie on the Hadoop cluster.
- Installed Oozie workflow engine to run multiple Hive and Pig Jobs.
- Setup and benchmarked Hadoop/HBase clusters for internal use.
- Developed Java MapReduce programs for the analysis of sample log file stored in cluster.
- Developed simple to complex MapReduce jobs using Hive and Pig.
- Developed MapReduce programs for data analysis and data cleaning.
- Developed Pig Latin scripts for the analysis of semi-structured data.
- Developed industry-specific user-defined functions (UDFs).
- Used Hive, created Hive tables, and was involved in data loading and writing Hive UDFs.
- Used Sqoop to import data into HDFS and Hive from other data systems.
- Continuously monitored and managed the Hadoop cluster through Cloudera Manager.
- Migrated ETL processes from RDBMS to Hive to test easier data manipulation.
- Developed Hive queries to process the data for visualizing.
Environment: Apache Hadoop, HDFS, Cloudera Manager, CentOS, Java, MapReduce, Eclipse, Hive, Pig, Sqoop, Oozie, and SQL.
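The Java MapReduce log-analysis programs mentioned above reduce, at their core, to parsing log lines and aggregating by some field. The sketch below assumes a simple space-separated format with the log level first; that format, and the class and method names, are illustrative assumptions, not the actual job code.

```java
import java.util.*;

// Illustrative sketch of log-file analysis as done in Java MapReduce jobs.
// Assumes each line starts with a log level ("ERROR 2014-01-01 msg ...");
// a real job would parse whatever format the cluster's logs actually use.
public class LogLevelCounter {
    static Map<String, Integer> countByLevel(List<String> logLines) {
        Map<String, Integer> counts = new HashMap<>();
        for (String line : logLines) {
            String[] fields = line.split("\\s+", 2); // first field = level
            if (fields.length > 0 && !fields[0].isEmpty()) {
                counts.merge(fields[0], 1, Integer::sum); // tally per level
            }
        }
        return counts;
    }
}
```

In the distributed version, the per-line tallying would sit in the mapper and the merge in the reducer, letting the same logic scale across the cluster.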
Confidential
Sr. Java Developer
Responsibilities
- Involved in requirement analysis and played a key role in project planning.
- Successfully completed the architecture, detailed design, and development of modules; interacted with end users to gather, analyze, and implement requirements.
- Designed and developed web components and business modules through all tiers from presentation to persistence.
- Developed the web pages using JSP, JavaScript, CSS, AJAX and Servlets.
- Used hibernate for mapping from Java classes to database tables.
- Developed the Action classes and ActionForm classes, created JSPs using Struts tag libraries, and configured them in the Struts-config.xml and Web.xml files.
- Developed UI layout using Dreamweaver.
- Developed Java beans to interact with the UI and the database.
- Created the end-user business interfaces.
- Frequently interacted with the client and delivered solutions for their business needs.
- Developed ANT script for building and packaging J2EE components.
- Wrote PL/SQL queries and stored procedures for data retrieval.
- Created and modified DB2 schema objects such as tables and indexes.
- Created test plans and test case scripts for UI testing.
Environment: Java, JSP, Servlets, JDBC, JavaBeans, Oracle, HTML/DHTML, Microsoft FrontPage, Java Script 1.3, PL/SQL, Tomcat 4.0, Windows NT.
Confidential
Java Developer
Responsibilities:
- End to End designing of Critical Core Java Components using Java Collections and Multithreading.
- Analyzed different database schemas (transactional and data warehouse) to build extensive business reports using SQL joins.
- Developed multiple business reports in quick turnaround time, which helped the business save considerable operational costs.
- Created a program that notifies the operational team within seconds of downtime at any of the 250 pharmacies on the AP network.
- Created an interface using JSP, Servlet and MVC Struts architecture for pharmacy team to resolve stuck orders in different pharmacies.
- Performance-tuned the IMS report, fixing memory leaks and applying Java best practices to boost the performance and reliability of the application.
- Prepared detailed documentation of the results.
- Suggested some alternative options for the user interface from the user perspective.
Environment: Java 1.4, J2EE (JSP, Servlets, Java Beans, JDBC, multithreading), Linux shell and Perl scripting, and SQL.
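The core Java multithreading components above, such as the pharmacy downtime notifier, typically decouple event producers from consumers. Here is a minimal producer/consumer sketch using java.util.concurrent.BlockingQueue; the class, method, and event representation are illustrative assumptions standing in for the actual components.

```java
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.LinkedBlockingQueue;

// Minimal producer/consumer sketch of the Java multithreading pattern used in
// components like a downtime notifier: a producer thread enqueues events and
// the consumer drains them, so slow consumers never block event detection.
public class ProducerConsumerSketch {
    static int sumEvents(int eventCount) throws InterruptedException {
        BlockingQueue<Integer> queue = new LinkedBlockingQueue<>();
        Thread producer = new Thread(() -> {
            for (int i = 1; i <= eventCount; i++) {
                queue.add(i); // enqueue an "event"
            }
        });
        producer.start();
        int sum = 0;
        for (int taken = 0; taken < eventCount; taken++) {
            sum += queue.take(); // blocks until an event is available
        }
        producer.join();
        return sum;
    }
}
```

The BlockingQueue handles all locking internally, which is why this pattern is preferred over hand-rolled wait/notify synchronization.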
Confidential
JAVA Developer
Responsibilities:
- Coordinated with the users to gather and analyze the business requirements.
- Designed and developed design specifications using design patterns and OO methodology with UML (Rational Rose).
- Involved in Use Case analysis and developing User Interface using HTML/DHTML.
- Involved in the Development and Deployment of Java beans.
- Developed dynamic pages using JSP that invoke Servlet controllers.
- Developed JDBC Connection pooling to optimize database connections
- Wrote different stored procedures in Oracle using PL/SQL.
- Used JavaScript for client-side validations.
- Implemented session tracking and user authentication.
Environment: Java, JSP, Servlets, JDBC, JavaBeans, Oracle, HTML/DHTML, Microsoft FrontPage, Java Script 1.3, PL/SQL, Tomcat 4.0, Windows NT.
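The JDBC connection pooling mentioned above amounts to a fixed set of pre-created connections handed out and returned through a thread-safe queue. The sketch below shows that idea generically (Supplier&lt;T&gt;) so it runs without a live database; in the real code, T would be java.sql.Connection created via DriverManager, and the class name here is a hypothetical one for illustration.

```java
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.BlockingQueue;
import java.util.function.Supplier;

// Generic sketch of a fixed-size connection pool. Resources are created up
// front and recycled through a BlockingQueue; borrow() blocks when the pool
// is exhausted instead of opening a costly new connection.
public class SimplePool<T> {
    private final BlockingQueue<T> available;

    public SimplePool(Supplier<T> factory, int size) {
        available = new ArrayBlockingQueue<>(size);
        for (int i = 0; i < size; i++) {
            available.add(factory.get()); // pre-create all pooled resources
        }
    }

    public T borrow() throws InterruptedException {
        return available.take(); // blocks until a resource is free
    }

    public void release(T resource) {
        available.add(resource); // return the resource for reuse
    }
}
```

Bounding the pool caps the number of open database connections, which is the optimization this pattern provides over opening a connection per request.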