
Hadoop Developer Resume

San Jose, CA


  • Over 8 years of experience in the IT industry, including 3 years of experience with Hadoop, Hive, Pig, Sqoop, HBase, ZooKeeper, Java, J2EE, JDBC, HTML, and JavaScript.
  • Well versed in developing Java MapReduce programs and in Hadoop ecosystem components such as Spark with Scala, Oozie, and Flume.
  • Experience in working with Cassandra.
  • Experience importing and exporting data between databases such as MySQL and Oracle and HDFS/Hive using Sqoop.
  • Extensive experience on several Apache Hadoop projects, writing MapReduce programs with the Hadoop Java API as well as with Hive and Pig.
  • Followed software development life cycle processes and ensured all deliverables met project specifications.
  • Extensive knowledge of current development and source-code management tools (Git, SVN).
  • Working experience in installation, configuration, and maintenance of Hortonworks Hadoop clusters for application development.
  • Extensive working experience in the Agile software development model, including running scrum sessions as Scrum Master.
  • Good communication and interpersonal skills with a self-learning attitude.
  • Hands-on experience developing enterprise Hadoop applications in a Cloudera environment.


Programming Languages: Java, C, C++, C# and Scala.

Hadoop Ecosystem: HDFS, MapReduce, Pig, Hive, Sqoop, Flume, ZooKeeper, HBase, Kafka, Hortonworks Data Platform (HDP).

IDE Tools: Eclipse, NetBeans, STS, IntelliJ.

Operating Systems: MS-DOS, Windows, Linux, Unix

Web Technologies: HTML, CSS, JavaScript and AJAX.

Databases: Oracle, MySQL and SQL Server.

Application/Web Servers: Apache Tomcat, WebLogic, TFS

Functional Testing Tools: QuickTest Pro, Selenium, LoadRunner, Quality Center, HP ALM, JIRA


Confidential, San Jose, CA

Hadoop Developer


  • Installed and configured Hadoop MapReduce and HDFS, and developed MapReduce jobs in Java for data cleaning and preprocessing.
  • Imported and exported data into HDFS and Hive using Sqoop.
  • Defined job flows and managed and reviewed Hadoop log files.
  • Participated in development/implementation of Cloudera environment.
  • Responsible for running Hadoop streaming jobs to capture and process terabytes of XML-format data coming from different sources.
  • Developed code to load data from Linux file system to HDFS.
  • Worked on implementing and integrating NoSQL databases such as HBase.
  • Supported Map Reduce Programs running on the cluster.
  • Installed and configured Hive, and wrote Hive UDFs in Java and Python.
  • Loaded and transformed large structured, semi-structured, and unstructured datasets.
  • Worked on migrating MapReduce programs into Spark transformations using Spark and Scala.
  • Developed Pig Latin scripts to process the data, and wrote UDFs in Java and Python.
  • Involved in creating Hive tables, loading them with data, and writing Hive queries that run internally as MapReduce jobs.
  • Wrote MapReduce programs in Java to achieve the required output.
  • Developed Hive queries for data analysis to meet the Business requirements.
  • Used the Oozie workflow engine to create workflows and automate MapReduce, Hive, and Pig jobs.
  • Supported setting up the QA environment and updating configurations for Pig and Sqoop scripts; handled cluster coordination through ZooKeeper.
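The MapReduce work described in the bullets above can be sketched in plain Java, without the Hadoop API, so the map/shuffle/reduce logic is visible and locally testable. This is a minimal illustration, not production job code; class and method names are illustrative.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Word count expressed as explicit map, shuffle, and reduce steps --
// the same structure a Hadoop Mapper/Reducer pair would implement.
public class WordCountSketch {

    // "Map" phase: emit a (word, 1) pair for every token in every input line.
    static List<Map.Entry<String, Integer>> map(List<String> lines) {
        List<Map.Entry<String, Integer>> pairs = new ArrayList<>();
        for (String line : lines) {
            for (String token : line.toLowerCase().split("\\s+")) {
                if (!token.isEmpty()) {
                    pairs.add(Map.entry(token, 1));
                }
            }
        }
        return pairs;
    }

    // "Shuffle" + "reduce" phases: group the pairs by key and sum the counts,
    // which is what the framework plus a summing Reducer do on a cluster.
    static Map<String, Integer> reduce(List<Map.Entry<String, Integer>> pairs) {
        Map<String, Integer> counts = new HashMap<>();
        for (Map.Entry<String, Integer> pair : pairs) {
            counts.merge(pair.getKey(), pair.getValue(), Integer::sum);
        }
        return counts;
    }

    public static Map<String, Integer> countWords(List<String> lines) {
        return reduce(map(lines));
    }

    public static void main(String[] args) {
        System.out.println(countWords(List.of("to be or not to be")));
    }
}
```

On a real cluster the two static methods become a `Mapper` and a `Reducer` class, and the in-memory `HashMap` grouping is replaced by the framework's distributed shuffle.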

Environment: Java, Hadoop, Spark, Scala, NoSQL, Hive, Kafka and Python.

Confidential, Arlington Heights IL

Hadoop Developer


  • Worked on the BI team on Big Data Hadoop cluster implementation and data integration, developing large-scale system software.
  • Created Hive queries that helped market analysts spot emerging trends by comparing fresh data with EDW reference tables and historical metrics.
  • Worked extensively with Sqoop for importing and exporting data between HDFS and relational database systems/mainframes.
  • Managed and reviewed Hadoop log files.
  • Shared responsibility for administration of Hadoop, Hive and Pig.
  • Built and maintained scalable data pipelines using the Hadoop ecosystem and other open-source components such as Hive and HBase.
  • Developed MapReduce programs to parse the raw data, populate staging tables and store the refined data in the tables in EDW.
  • Enabled speedy reviews and first-mover advantages by using Oozie to automate data loading into the Hadoop Distributed File System.
  • Captured data from existing databases that provide SQL interfaces using Sqoop.
  • Developed and maintained complex outbound notification applications that run on custom architectures, using diverse technologies including Core Java, J2EE, SOAP, XML and Web Services.
  • Tested raw data and executed performance scripts.
  • Developed code for client-side validations using the scripting languages JavaScript and Python.
  • Developed MapReduce programs using the Apache Hadoop API for analyzing the data.
  • Provided design recommendations and thought leadership to sponsors/stakeholders that improved review processes and resolved technical problems.
  • Assessed existing and available data warehousing technologies and methods to ensure the data warehouse/BI architecture met the needs of the business unit and the enterprise and allowed for business growth.
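The raw-data parsing that feeds EDW staging tables, mentioned in the bullets above, can be sketched as a small plain-Java parser. The pipe-delimited layout (id|timestamp|amount) is hypothetical; the actual feed formats are not shown in this resume.

```java
// Sketch of the per-line parsing a raw-data MapReduce mapper might do
// before populating staging tables. Field layout is an assumption.
public class StagingRecordParser {

    public static class StagingRow {
        public final long id;
        public final String timestamp;
        public final double amount;

        StagingRow(long id, String timestamp, double amount) {
            this.id = id;
            this.timestamp = timestamp;
            this.amount = amount;
        }
    }

    // Returns null for malformed rows; a real job would route those to a
    // rejects/quarantine output rather than the staging table.
    public static StagingRow parse(String rawLine) {
        String[] fields = rawLine.split("\\|", -1);
        if (fields.length != 3) {
            return null;
        }
        try {
            long id = Long.parseLong(fields[0].trim());
            double amount = Double.parseDouble(fields[2].trim());
            return new StagingRow(id, fields[1].trim(), amount);
        } catch (NumberFormatException e) {
            return null;
        }
    }

    public static void main(String[] args) {
        StagingRow row = parse("42|2014-01-01T00:00:00|19.99");
        System.out.println(row.id + " " + row.amount);  // prints 42 19.99
    }
}
```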

Environment: Hadoop, Hive, HBase, HDFS, Oozie, Sqoop, Java, XML, JavaScript and Python.

Confidential, Pittsburgh, PA

Technical Specialist - Hadoop


  • Responsible for building scalable distributed data solutions using Hadoop.
  • Responsible for cluster maintenance, adding and removing cluster nodes, cluster monitoring and troubleshooting, and managing and reviewing data backups and log files.
  • Analyzed data using Hadoop components Hive and Pig.
  • Worked hands on with ETL process.
  • Responsible for running Hadoop streaming jobs to process terabytes of XML data.
  • Loaded and transformed large sets of structured, semi-structured, and unstructured data using Hadoop/Big Data concepts.
  • Involved in loading data from UNIX file system to HDFS.
  • Responsible for creating Hive tables, loading data, and writing Hive queries.
  • Handled importing data from various data sources, performed transformations using Hive and MapReduce, and loaded data into HDFS.
  • Extracted data from Teradata into HDFS using Sqoop, and vice versa.
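Processing terabytes of XML, as in the streaming-job bullet above, calls for a pull parser that never holds the whole document in memory. A minimal sketch using the JDK's built-in StAX parser follows; the `<record><id>…</id></record>` layout is a hypothetical stand-in for the real feeds.

```java
import javax.xml.stream.XMLInputFactory;
import javax.xml.stream.XMLStreamConstants;
import javax.xml.stream.XMLStreamException;
import javax.xml.stream.XMLStreamReader;
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;

// Streaming XML extraction with StAX: memory use stays flat because events
// are pulled one at a time instead of building a full DOM tree.
public class XmlRecordExtractor {

    // Collects the text content of every <id> element in the input.
    public static List<String> extractIds(String xml) {
        List<String> ids = new ArrayList<>();
        try {
            XMLStreamReader reader = XMLInputFactory.newInstance()
                    .createXMLStreamReader(new StringReader(xml));
            boolean inId = false;
            while (reader.hasNext()) {
                int event = reader.next();
                if (event == XMLStreamConstants.START_ELEMENT
                        && reader.getLocalName().equals("id")) {
                    inId = true;
                } else if (event == XMLStreamConstants.CHARACTERS && inId) {
                    ids.add(reader.getText());
                } else if (event == XMLStreamConstants.END_ELEMENT
                        && reader.getLocalName().equals("id")) {
                    inId = false;
                }
            }
        } catch (XMLStreamException e) {
            throw new IllegalArgumentException("malformed XML", e);
        }
        return ids;
    }

    public static void main(String[] args) {
        String xml = "<records><record><id>42</id></record>"
                   + "<record><id>43</id></record></records>";
        System.out.println(extractIds(xml));  // prints [42, 43]
    }
}
```

In a streaming job each mapper would apply this kind of extraction to its input split and emit the recovered fields for downstream reduction.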

Environment: Hadoop, Hive, Pig, Unix, ETL and Sqoop.


Java/J2EE Developer


  • Used CVS for maintaining the source code; designed, developed, and deployed on the Apache Tomcat server.
  • Involved in Analysis, design and coding on J2EE Environment.
  • Developed Hibernate object/relational mapping according to database schema.
  • Designed the presentation layer and programmed using HTML, XML, XSL, JSP, JSTL and Ajax.
  • Designed, developed and implemented the business logic required for Security presentation controller.
  • Created XML files to implement most of the wiring need for Hibernate annotations and Struts configurations.
  • Responsible for developing the forms, which contains the details of the employees, and generating the reports and bills.
  • Involved in designing class and dataflow diagrams using UML in Rational Rose.
  • Created and modified Stored Procedures, Functions, Triggers and Complex SQL Commands using PL/SQL.
  • Developed Shell scripts in UNIX and procedures using SQL and PL/SQL to process the data from the input file and load into the database.
  • Used core Java concepts in the application, such as multithreaded programming and thread synchronization using the wait, notify, and join methods.
  • Created cross-browser compatible and standards-compliant CSS-based page layouts.
  • Involved in maintaining the records of the patients visited along with the prescriptions they were issued in the Database.
  • Performed Unit Testing on the applications that are developed.
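The wait/notify/join coordination mentioned in the bullets above can be shown in a minimal, self-contained sketch: one thread blocks on a shared lock until a second thread sets a flag and notifies it, and the caller joins both. Names are illustrative.

```java
// Minimal wait/notify/join demonstration with plain JDK threads.
public class HandoffDemo {

    private final Object lock = new Object();
    private boolean ready = false;
    private String received = null;

    public String run() {
        Thread consumer = new Thread(() -> {
            synchronized (lock) {
                while (!ready) {                  // guards against spurious wakeups
                    try {
                        lock.wait();              // releases the lock while waiting
                    } catch (InterruptedException e) {
                        return;
                    }
                }
                received = "done";
            }
        });
        Thread producer = new Thread(() -> {
            synchronized (lock) {
                ready = true;
                lock.notifyAll();                 // wakes the waiting consumer
            }
        });
        consumer.start();
        producer.start();
        try {
            consumer.join();                      // block until both threads finish
            producer.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        return received;
    }

    public static void main(String[] args) {
        System.out.println(new HandoffDemo().run());  // prints done
    }
}
```

The `while (!ready)` guard is the essential idiom: it makes the handoff correct even if the producer runs first or the consumer wakes spuriously.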

Environment: Unix (shell scripts), J2EE, JSP 1.0, Servlets, Hibernate, JavaScript, JDBC, Oracle 10g, UML, Rational Rose 2000, SQL, PL/SQL, CSS, HTML & XML.
