Sr Hadoop Developer Resume
San Francisco, CA
SUMMARY
- Hadoop Developer with almost 8 years of IT experience developing and delivering software using a wide variety of technologies across all phases of the development life cycle.
- Expertise in Java and Big Data technologies, with proven ability in project-based leadership, teamwork, and communication.
- Developed and implemented a Big Data Management Platform (BMP) using HDFS, MapReduce, Hive, Pig, Oozie, and other Hadoop ecosystem components for data storage and retrieval.
- Experienced in transferring data from structured data stores to HDFS using Sqoop.
- Worked on writing MapReduce programs for data processing and analysis.
- Experienced in analyzing data with Hive and Pig.
- Worked on Oozie for managing Hadoop jobs.
- Experienced in cluster coordination using Zookeeper.
- Experienced in loading log files from multiple sources directly into HDFS using Flume.
- Developed batch processing jobs using Java MapReduce, Pig, and Hive.
- Experienced in using Oozie to configure job flows.
- Experienced in migrating ETL projects to the Hadoop platform.
- Experienced in designing both time-driven and data-driven automated workflows using Oozie.
- Knowledge of data structures and hands-on experience with design patterns.
- Experienced in RDBMS technologies like MySQL.
- Experienced in shell scripting for automation and monitoring.
- Implemented advanced procedures such as text analytics and processing using the in-memory computing capabilities of Spark.
- Enhanced and optimized product Spark code to aggregate, group, and run data mining tasks using the Spark framework (a sketch follows this list).
- Experienced with the concepts of Kafka, Storm, Scala, and Spark.
- Experienced in building, deploying, and integrating applications with Ant and Maven.
- Experienced in preparing and executing unit test plans and unit test cases after software development.
- Hands-on experience in Agile and Scrum methodologies.
- Experienced with IDEs such as Eclipse and NetBeans.
- Extensive experience working with customers to gather the information needed to analyze and resolve technical problems with data or code fixes; building service patches for each version release; performing unit, integration, User Acceptance, and system testing; and providing technical solution documents for users.
- Very good knowledge of object-oriented concepts, with complete software development life cycle experience: requirements gathering, conceptual design, analysis, detailed design, development, mentoring, and system and User Acceptance Testing.
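As a concrete illustration of the Spark work summarized above, below is a minimal sketch, in Java, of an aggregation job; the class name (LogAggregator), the HDFS paths, and the tab-delimited record layout are illustrative assumptions rather than code from the projects listed here.

```java
// Minimal sketch of a Spark aggregation job in Java. The class name,
// HDFS paths, and record layout are hypothetical placeholders.
import org.apache.spark.SparkConf;
import org.apache.spark.api.java.JavaPairRDD;
import org.apache.spark.api.java.JavaRDD;
import org.apache.spark.api.java.JavaSparkContext;
import scala.Tuple2;

public class LogAggregator {
    public static void main(String[] args) {
        SparkConf conf = new SparkConf().setAppName("LogAggregator");
        try (JavaSparkContext sc = new JavaSparkContext(conf)) {
            // Read tab-delimited records from HDFS.
            JavaRDD<String> lines = sc.textFile("hdfs:///data/events");
            // Key each record by its first field, then count records per key
            // in memory across the cluster.
            JavaPairRDD<String, Long> counts = lines
                    .mapToPair(line -> new Tuple2<>(line.split("\t")[0], 1L))
                    .reduceByKey(Long::sum);
            counts.saveAsTextFile("hdfs:///data/event_counts");
        }
    }
}
```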
TECHNICAL SKILLS
Hadoop/Big Data Technologies: HDFS, MapReduce, Hive, Pig, Sqoop, Flume, HBase, MongoDB, Spark, Cassandra, Oozie, Zookeeper, YARN
Programming Languages: Java, Python, SQL, C#
Operating Systems: UNIX, Windows, LINUX
Databases: NoSQL, Oracle, Sybase, SQL Server
Java IDE: Eclipse 3.x
Tools: SQL Developer
PROFESSIONAL EXPERIENCE
Confidential, San Francisco, CA
Sr Hadoop Developer
Responsibilities:
- Involved in all phases of the SDLC, including analysis, design, development, testing, and deployment on the Hadoop cluster.
- Experience with Agile development processes and practices.
- Worked on Oozie and UNIX scripts for batch processing and for scheduling workflows dynamically.
- Implemented data ingestion from multiple sources such as IBM mainframes, Teradata, Oracle, and Netezza using Sqoop and MR jobs.
- Developed Sqoop scripts to import and export data from relational sources and handled incremental and updated changes in the HDFS layer.
- Developed transformations and aggregated data for large data sets using MR, Pig, and Hive scripts.
- Worked on partitioning and bucketing in Hive tables and ran the scripts in parallel to improve performance.
- Developed test cases in MRUnit for unit testing of MR jobs (a sketch follows this section).
- Implemented a process to automatically update Hive tables by reading a change file provided by business users.
- Experience working with different file formats: Avro, Parquet, and JSON.
- Experience using Gzip, LZO, Snappy, and Bzip2 compression.
- Experience reading and writing files in HDFS using the Java FileSystem API (a second sketch follows this section).
- Developed Pig and Hive UDFs based on requirements.
- Developed workflow jobs using Oozie services to run the MR, Pig, and Hive jobs, and created JIL scripts to run the Oozie jobs.
- Evaluated MongoDB and reported on whether it could be a replacement for the existing relational database.
- Improved performance using advanced joins in Apache Pig and Apache Hive.
- Tuned MapReduce job parameters and configuration parameters to improve performance.
- Copied data between production and lower environments.
Environment: Cloudera Hadoop, HDFS, Hive, Pig, MapReduce, Oozie, Flume, Sqoop, Kafka, UNIX, Shell Scripting.
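Below is a minimal MRUnit sketch of the style of unit test mentioned above; the CleansingMapper under test and its tab-delimited record format are hypothetical stand-ins, not project code. MRUnit drives the mapper in isolation, so such tests need no running cluster.

```java
// Minimal MRUnit test sketch for a mapper. The mapper and its record
// format are hypothetical stand-ins.
import java.io.IOException;
import org.apache.hadoop.io.IntWritable;
import org.apache.hadoop.io.LongWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Mapper;
import org.apache.hadoop.mrunit.mapreduce.MapDriver;
import org.junit.Test;

public class CleansingMapperTest {

    // Hypothetical mapper under test: emits the second field with a count of 1
    // and silently drops malformed records.
    public static class CleansingMapper
            extends Mapper<LongWritable, Text, Text, IntWritable> {
        @Override
        protected void map(LongWritable key, Text value, Context context)
                throws IOException, InterruptedException {
            String[] fields = value.toString().split("\t");
            if (fields.length == 3) {
                context.write(new Text(fields[1]), new IntWritable(1));
            }
        }
    }

    @Test
    public void emitsCountryForValidRecord() throws IOException {
        MapDriver.newMapDriver(new CleansingMapper())
                 .withInput(new LongWritable(0), new Text("2015-01-01\tUS\t42"))
                 .withOutput(new Text("US"), new IntWritable(1))
                 .runTest();
    }
}
```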
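And a second sketch showing reads and writes against HDFS through the Java FileSystem API, also referenced above; the NameNode URI and file paths are placeholder assumptions.

```java
// Minimal sketch of reading and writing HDFS files with the Java
// FileSystem API. The NameNode URI and paths are placeholders.
import java.io.BufferedReader;
import java.io.InputStreamReader;
import java.nio.charset.StandardCharsets;
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FSDataOutputStream;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;

public class HdfsIoExample {
    public static void main(String[] args) throws Exception {
        Configuration conf = new Configuration();
        conf.set("fs.defaultFS", "hdfs://namenode:8020");
        try (FileSystem fs = FileSystem.get(conf)) {
            // Write a small file into HDFS.
            try (FSDataOutputStream out = fs.create(new Path("/tmp/example.txt"))) {
                out.write("hello hdfs\n".getBytes(StandardCharsets.UTF_8));
            }
            // Read it back line by line.
            try (BufferedReader in = new BufferedReader(
                    new InputStreamReader(fs.open(new Path("/tmp/example.txt")),
                                          StandardCharsets.UTF_8))) {
                String line;
                while ((line = in.readLine()) != null) {
                    System.out.println(line);
                }
            }
        }
    }
}
```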
Confidential, New York, NY
Sr Hadoop Developer
Responsibilities:
- Worked on a Hadoop cluster that ranged from 4 to 8 nodes during the pre-production stage and was sometimes extended up to 24 nodes during production.
- Used Sqoop to import data from RDBMS into the Hadoop Distributed File System (HDFS) and later analyzed the imported data using Hadoop components.
- Built custom MapReduce programs to analyze data and used Pig Latin to clean unwanted data.
- Applied various performance optimizations, such as using the distributed cache for small datasets, partitioning and bucketing in Hive, and map-side joins.
- Created Hive tables and applied HiveQL queries to them, which Hive automatically compiles into and runs as MapReduce jobs (a JDBC sketch follows this section).
- Involved in loading and transforming large sets of structured, semi-structured, and unstructured data and analyzing them by running Hive queries and Pig scripts.
- Participated in requirement gathering from the experts and business partners and converted the requirements into technical specifications.
- Used Zookeeper to manage coordination among the clusters
- Experienced in analyzing Cassandra and MongoDB databases and comparing them with other open-source NoSQL databases to find which one best suits the current requirements.
- Created and maintained technical documentation for launching Hadoop clusters and for executing Hive queries and Pig scripts.
- Installed the Oozie workflow engine to run multiple Hive and Pig jobs, which run independently based on time and data availability.
- Assisted in cluster maintenance, monitoring, and troubleshooting, and managed and reviewed data backups and log files.
Environment: Apache Hadoop 2.0.0, Pig 0.11, Hive 0.10, Sqoop 1.4.3, Flume, MapReduce, HDFS, LINUX, Oozie, Cassandra, Hue, HCatalog, Java, Eclipse, Red Hat Linux.
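As a sketch of the Hive work described above, the following Java snippet creates a partitioned, bucketed table and runs an aggregating query over JDBC; Hive compiles the query into MapReduce jobs behind the scenes. The HiveServer2 URL, table, and columns are assumptions, and the hive-jdbc driver is assumed to be on the classpath.

```java
// Sketch of creating and querying a partitioned, bucketed Hive table over
// JDBC. The endpoint, table, and columns are hypothetical.
import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.ResultSet;
import java.sql.Statement;

public class HiveQueryExample {
    public static void main(String[] args) throws Exception {
        try (Connection conn = DriverManager.getConnection(
                "jdbc:hive2://hiveserver:10000/default");
             Statement stmt = conn.createStatement()) {
            // Partitioned and bucketed table definition (DDL only).
            stmt.execute("CREATE TABLE IF NOT EXISTS clicks (user_id STRING, url STRING) "
                    + "PARTITIONED BY (dt STRING) CLUSTERED BY (user_id) INTO 16 BUCKETS");
            // Aggregating query; Hive turns this into MapReduce jobs automatically.
            try (ResultSet rs = stmt.executeQuery(
                    "SELECT url, COUNT(*) FROM clicks WHERE dt = '2015-01-01' GROUP BY url")) {
                while (rs.next()) {
                    System.out.println(rs.getString(1) + "\t" + rs.getLong(2));
                }
            }
        }
    }
}
```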
Confidential - San Mateo, CA
Java/Hadoop Engineer
Responsibilities:
- Developed MapReduce jobs in Java for data cleansing and preprocessing.
- Moved data from Oracle to HDFS and vice versa using Sqoop.
- Collected and aggregated large amounts of log data using Apache Flume and staged the data in HDFS for further analysis.
- Used Sqoop to efficiently transfer data between databases and HDFS and used Flume to stream the log data from servers.
- Developed MapReduce programs to cleanse the data in HDFS obtained from heterogeneous data sources to make it suitable for ingestion into the Hive schema for analysis.
- Created Hive tables per requirements, as internal or external tables defined with appropriate static and dynamic partitions for efficiency.
- Implemented partitioning and bucketing in Hive for better organization of the data.
- Worked with different file formats and compression techniques to determine standards.
- Developed Hive queries and UDFs to analyze and transform the data in HDFS (a UDF sketch follows this section).
- Developed Hive scripts implementing control-table logic in HDFS.
- Designed and implemented partitioning (static and dynamic) and bucketing in Hive.
- Developed Pig scripts and UDFs per the business logic.
- Analyzed and transformed data with Hive and Pig.
- Developed Oozie workflows, scheduled monthly through a scheduler.
- Involved in end-to-end implementation of ETL logic.
- Coordinated with the offshore team and managed project deliverables on time.
- Worked on QA support activities, test data creation, and unit testing.
- Monitored and debugged Hadoop jobs and applications running in production.
- Provided user support and application support on the Hadoop infrastructure.
- Reviewed ETL application use cases before onboarding them to Hadoop.
- Evaluated and compared different tools for test data management with Hadoop.
- Helped and directed the testing team to get up to speed on Hadoop application testing.
Environment: Apache Hadoop 2.0.0, Pig 0.11, Hive 0.10, Sqoop 1.4.3, Flume, MapReduce, HDFS, LINUX, Oozie, Cassandra, Hue, HCatalog, Java, Eclipse, Red Hat Linux.
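A minimal sketch of the kind of Hive UDF described above, written against Hive's classic UDF base class; the class name (NormalizeText) and its cleansing rule are illustrative assumptions. Such a UDF is typically packaged into a jar and registered in Hive with ADD JAR and CREATE TEMPORARY FUNCTION before use in a query.

```java
// Minimal Hive UDF sketch that trims and lowercases a string value.
// The class name and cleansing rule are hypothetical.
import org.apache.hadoop.hive.ql.exec.UDF;
import org.apache.hadoop.io.Text;

public final class NormalizeText extends UDF {
    public Text evaluate(Text input) {
        if (input == null) {
            return null;  // propagate SQL NULLs unchanged
        }
        return new Text(input.toString().trim().toLowerCase());
    }
}
```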
Confidential, Alpharetta, GA
Java Developer
Responsibilities:
- Involved in analysis, design, project planning, effort estimation, and development of the FTM application, based on MVC using the Struts framework and server-side J2EE technologies.
- Was part of the core agile team developing the application using the Agile development methodology.
- Involved in mentoring the team in technical discussions and in technical reviews of design documents.
- Developed code using Core Java, servlets, and the Hibernate framework's APIs.
- Used Hibernate to develop persistent classes following ORM principles (a sketch follows this section).
- Developed Hibernate configuration files for establishing database connections, and Hibernate mapping files based on POJO classes.
- Developed JUnit test cases and system test cases for all developed modules and classes; used JMeter for performance testing.
- Used SVN for source control.
- Used Maven for product lifecycle management.
- Implemented industry-standard design patterns.
- Involved in code reviews and verifying bug analysis reports.
- Created PL/SQL stored procedures, functions, and triggers for the Oracle 11g database.
- Used Eclipse Juno as the IDE and Tomcat 6.0/7.0 as the application server.
Environment: Java, J2EE 1.5, Struts 1.3, Hibernate 3.0, JSP, Servlets, XML, Tomcat 6.0/7.0, JDBC, Oracle SQL Developer, Oracle 11.2.0
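A compact sketch of a Hibernate persistent class following ORM principles, as described above; it uses JPA annotations for brevity, whereas the project itself used XML mapping files, and the Customer entity is hypothetical.

```java
// Sketch of a Hibernate persistent class. The entity, table, and column
// names are illustrative, not from the original project.
import javax.persistence.Column;
import javax.persistence.Entity;
import javax.persistence.GeneratedValue;
import javax.persistence.GenerationType;
import javax.persistence.Id;
import javax.persistence.Table;

@Entity
@Table(name = "CUSTOMER")
public class Customer {
    @Id
    @GeneratedValue(strategy = GenerationType.SEQUENCE)
    private Long id;

    @Column(name = "FULL_NAME", nullable = false)
    private String fullName;

    protected Customer() { }  // no-arg constructor required by Hibernate

    public Customer(String fullName) {
        this.fullName = fullName;
    }

    public Long getId() { return id; }
    public String getFullName() { return fullName; }
}
```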
Confidential
Java Developer
Responsibilities:
- Involved in the complete software development life cycle (SDLC) of the application, from requirement analysis to testing.
- Created complex SQL queries and PL/SQL stored procedures and functions for the back end.
- Prepared the functional, design, and test case specifications.
- Involved in writing stored procedures in Oracle for database-side validations (a JDBC sketch follows this section).
- Performed unit testing, system testing, and integration testing.
- Developed unit test cases; used JUnit for unit testing of the application.
- Provided technical support for production environments by resolving issues, analyzing defects, and providing and implementing solutions; resolved high-priority defects per the schedule.
Environment: Java, Junit, HTML, PL/SQL, Oracle
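A short sketch of how such an Oracle stored procedure might be invoked from Java via JDBC; the connection details and the validate_account procedure are hypothetical.

```java
// Sketch of calling an Oracle PL/SQL stored procedure from Java via JDBC.
// The URL, credentials, and procedure are hypothetical placeholders.
import java.sql.CallableStatement;
import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.Types;

public class StoredProcCaller {
    public static void main(String[] args) throws Exception {
        try (Connection conn = DriverManager.getConnection(
                "jdbc:oracle:thin:@dbhost:1521:ORCL", "user", "password");
             CallableStatement call = conn.prepareCall("{call validate_account(?, ?)}")) {
            call.setLong(1, 1001L);                      // IN: account id
            call.registerOutParameter(2, Types.VARCHAR); // OUT: validation status
            call.execute();
            System.out.println("Status: " + call.getString(2));
        }
    }
}
```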