Hadoop Developer Resume
TX
SUMMARY:
- Over 9 years of professional IT experience, including 4 years of experience in Big Data technologies and Hadoop ecosystem components like Spark, MapReduce, Hive, Pig, YARN, HDFS, Oozie, Sqoop, Flume and Kafka, as well as NoSQL systems like HBase and Cassandra.
- Experience in different Hadoop distributions like Cloudera (CDH3, CDH4 and CDH5) and Hortonworks (HDP).
- Strong knowledge of the architecture of distributed systems and parallel processing; in-depth understanding of the MapReduce framework and the Spark execution framework.
- Experience in Apache Flume and Kafka for collecting, aggregating and moving large volumes of data from various sources such as web servers and telnet sources.
- Good knowledge and experience of real-time streaming technologies Storm and Kafka.
- Worked on the Java HBase API to ingest processed data into HBase tables.
- Strong experience in working with UNIX/Linux environments and writing shell scripts.
- Hands-on experience in NoSQL databases like HBase and MongoDB.
- Extensive experience in writing Pig scripts to transform raw data from several data sources into baseline data.
- Extensive experience in importing and exporting data between RDBMS and the Hadoop ecosystem using Apache Sqoop.
- Used Tableau as a reporting and data visualization tool.
- Expertise in back-end/server-side Java technologies such as web services and Java Database Connectivity (JDBC).
- Experience with Kafka in handling thousands of megabytes of reads and writes per second on streaming data.
- Experience includes application development in Java (client/server), JSP, Servlet programming, Enterprise Java Beans, Struts, JSF, JDBC, Spring, Spring Integration and Hibernate.
- Implemented a POC in Spark using Java and Scala.
- Good experience in optimizing MapReduce algorithms by using combiners and custom partitioners.
- Experience in using version control tools like GitHub and SVN.
- Good knowledge of Oracle 8i, 9i and 10g databases and strong skills in writing SQL queries.
- Performed performance tuning and productivity improvement activities.
- Proactive in time management and problem solving; self-motivated with good analytical skills.
- Experience in building and deploying web applications on multiple application servers and middleware platforms including Apache Tomcat and JBoss.
- Experience in writing test cases in a Java environment using JUnit.
- Demonstrated technical expertise, organization and client service skills in various projects undertaken.
- Detailed understanding of Software Development Life Cycle (SDLC) and sound knowledge of project implementation methodologies including Waterfall and Agile.
TECHNICAL SKILLS:
Big Data Ecosystem: Hadoop, MapReduce, YARN, HDFS, HBase, ZooKeeper, Hive, Hue, Pig, Sqoop, Cassandra, Spark, Oozie, Storm, Flume, Cloudera Manager, Amazon Web Services (AWS), Hortonworks clusters, CASK-Hydrator.
Languages: C, Java, PL/SQL, Pig Latin, HiveQL, Scala, SQL
Java/J2EE Web Technologies: J2EE, Servlets, JSP, JSTL, HTML, XHTML, CSS, XML, AJAX.
Frameworks: Struts, ORM (Hibernate), JDBC
Web Services: SOAP, RESTful
Web Servers: WebLogic, WebSphere, Apache Tomcat.
Scripting Languages: Shell Scripting, JavaScript.
Database: Oracle 9i/10g, Microsoft SQL Server, MySQL, RDBMS, MongoDB, HBase
IDE & Build Tools: Eclipse, NetBeans, ANT and Maven.
Version Control System: CVS, SVN, GitHub.
PROFESSIONAL EXPERIENCE:
Confidential, TX
Hadoop Developer
Responsibilities:
- Worked on migrating raw data from external relational databases into HDFS using Sqoop bulk load with CASK-HYDRATOR.
- Implemented pipelines using CASK-HYDRATOR.
- Responsible for building scalable, distributed, end-to-end data processing pipelines using Hadoop ecosystem tools.
- Monitored and managed the Hadoop cluster using Apache Ambari.
- Worked on loading log data directly into HDFS using Flume.
- Worked on writing various Linux scripts to stream data from multiple data sources like Oracle and Teradata onto the data lake.
- Developed Pig Latin scripts for processing semi-structured data.
- Involved in creating Hive tables, loading them with data and writing Hive queries.
- Loaded and transformed large sets of structured, semi-structured and unstructured data, including Avro, sequence files and XML files.
- Analyzed web log data using HiveQL to extract the number of unique visitors per day, page views, visit duration and the most purchased product on the website.
- Developed an end-to-end data pipeline using FTP Adaptor, Spark, Hive and Impala.
- Developed product profiles using Pig and commodity UDFs
- Used the Spark-Cassandra Connector to load data to and from Cassandra.
- Implemented a prototype for real-time data streaming using Spark Streaming with Kafka.
- Extensive experience in working with structured data using HiveQL, join operations, writing custom UDFs and optimizing Hive queries.
- Worked on tuning the performance of Hive queries.
- Worked on developing Oozie workflows to run multiple Hive and Pig jobs, which run independently based on time and data availability.
- Exported the analyzed data to the relational databases using Sqoop for visualization and for generating reports.
- Experienced in managing and reviewing Hadoop log files.
Environment: HDFS, Hadoop 2.x, Pig, Hive, Sqoop, Flume, MapReduce, Scala, Oozie, Oracle 11g, Hortonworks cluster
Confidential, NYC, NY
Hadoop Developer
Responsibilities:
- Developed Big Data Solutions that enabled the business and technology teams to make data-driven decisions on the best ways to acquire customers and provide them business solutions.
- Worked on automation of delta feeds from mainframes using Sqoop, and from FTP servers into Hive.
- Implemented a Flume agent that reads logs from a port and stores them in the data lake.
- Streamed data in real time using Storm and Kafka for faster processing.
- Inserted the data into the data lake after processing in Storm.
- Implemented Hive tables on top of the inserted data.
- Optimized the Hive tables using techniques like partitioning and bucketing to provide better performance with HiveQL queries.
- Scheduled meetings with different IT groups, coordinated deployment activities, planned and prepared project cutover activities, and prepared and executed the deployment plan.
- Analyzed the data by performing Hive queries (HiveQL) and running Pig scripts (Pig Latin) to study customer behavior.
- Used Hive to analyze the partitioned and bucketed data and compute various metrics for reporting.
- Created components like Hive UDFs for functionality missing in Hive for analytics.
- Implemented Hive tables and HQL queries for the reports. Wrote and used complex data types in Hive.
- Implemented the Fair Scheduler on the JobTracker to allocate resources to small jobs.
- Implemented automatic failover using ZooKeeper and the ZooKeeper Failover Controller.
- Worked with system architects and DBAs to ensure decisions meet long-term enterprise growth needs and reusability factors.
- Installed and configured a multi-node cluster in the cloud using Amazon Web Services (AWS) on EC2.
- Worked with BI tools like Tableau for report creation and further analysis from the front end.
Environment: Hadoop 2.x, Hive, HQL, HDFS, MapReduce, Sqoop, Flume, Oozie, Python, Java, Maven, Eclipse, PuTTY, Cloudera Manager 4 and CDH 4, ZooKeeper, Tableau
Confidential, Summit, NJ
Hadoop Developer
Responsibilities:
- Developed MapReduce code for parsing bills and loading the data into HDFS.
- Implemented Sqoop bulk load to import production data.
- Implemented aggregations and join queries using Hive.
- Wrote custom UDFs in Hive.
- Participated in design reviews and daily project Scrum meetings.
- Worked closely with the business analysts to convert the business requirements into technical requirements and prepared low- and high-level documentation.
- Involved in installing and configuring Hive and Sqoop on the Hadoop cluster.
- Managed and audited the support and maintenance process by tracking activities using Request Items, Incidents and Change Requests.
- Tracked tasks and projects across onsite and offshore teams to ensure completion on time and within the given scope.
- Supported data analysts in running MapReduce programs.
- Gained very good business knowledge of fraud suspect identification, the appeals process, etc.
- Created and managed source-to-target mapping documents for required jobs.
- Executed Hadoop ecosystem tools and applications through Apache Hue.
- Optimized Hadoop MapReduce code and Hive/Pig scripts for better scalability, reliability and performance.
- Performed feasibility analysis for the deliverables, evaluating the requirements against complexity and timelines.
- Developed a custom file system plug-in for Hadoop so it can access files on the Data Platform.
Environment: Hadoop, HDFS, MapReduce, Hive, Pig, Sqoop, MySQL, PuTTY
Confidential, Somerset, NJ
Java Developer
Responsibilities:
- Involved in business requirement gathering and technical specifications.
- Involved in designing and developing modules on both the client and server side.
- Developed Java classes and helper classes in the business layer and tested them using JUnit.
- Designed and implemented a front-end application using JavaScript, CSS, HTML, JSPs and Spring MVC for entry registration and management, and used Hibernate to configure and connect to the database.
- Developed and tested the Efficiency Management module using EJB, Servlets, JSP and core Java components on the WebLogic application server.
- Implemented Hibernate in the data access object layer to access and update information in the Oracle database.
- Performed data manipulation (update, modify) using SQL queries.
- Developed stored procedures and complex SQL queries.
- Achieved object relational mapping by configuring deployment descriptors in Hibernate.
- Created stored procedures using PL/SQL for data modification (DML insert, update, delete) in Oracle.
- Worked closely with QA, business and architecture teams to resolve defects quickly and meet deadlines.
- Conducted Design reviews and Technical reviews with other project stakeholders.
Environment: J2EE, JDBC, Java 1.4, Servlets, JSP, JSF, Struts, Hibernate, Web services, Design Patterns, MVC, HTML, JavaScript 1.2, WebLogic 8.0, XML, JUnit, Oracle 9i/10g, MyEclipse
Confidential
Java Developer
Responsibilities:
- Closely involved with design, development and implementation of the application.
- Followed an Agile-based development strategy for developing the application.
- Worked on development of UI screens using JSP and CSS for the application.
- Worked with JavaScript events and functions for client side validation.
- Implemented the database connectivity to Oracle using JDBC.
- Designed and developed SQL queries, views and procedures.
- Designed, developed and deployed necessary stored procedures and functions using SQL and PL/SQL.
- Supported the deployed applications through debugging, fixing, and maintenance releases.
- Prepared and executed unit test cases.
- Worked with minimal supervision in a fast paced, dynamic environment.
- Wrote test plans and test cases for the developed screens.
- Executed test cases and fixed bugs through unit testing.
Environment: MyEclipse, Oracle, JSP, RDBMS, SQL, PL/SQL, JavaScript, jQuery, JDBC, Servlet, JSF, Struts, Hibernate, Web services, Design Patterns
