Hadoop Consultant Resume
San Francisco, CA
SUMMARY
- 7 years of overall IT experience, including Big Data experience in data ingestion, storage, querying, processing, and analysis, as well as Java development experience.
- Knowledge of loading logs from multiple sources directly into HDFS using tools such as Flume.
- Experience in importing and exporting data between HDFS and relational database systems using Sqoop.
- Solid understanding of Hadoop, especially HDFS, Hive, MapReduce, Pig, HBase, and Sqoop.
- Excellent understanding of HDFS, MapReduce, and YARN, and of tools including Pig and Hive for data analysis, Sqoop for data migration, Flume for data ingestion, Oozie for scheduling, and ZooKeeper for coordinating cluster resources.
- Developed Pig Latin scripts using operators such as LOAD, STORE, DUMP, FILTER, DISTINCT, FOREACH...GENERATE, GROUP, COGROUP, ORDER, LIMIT, UNION, and SPLIT to extract data from data files and load it into HDFS.
- Knowledge of EMC VNX, VMAX, and VMware.
- Built a data transformation framework using MapReduce and Pig.
- Worked on disaster recovery for Hadoop clusters.
- Worked with application teams via Scrum to provide operational support and install Hadoop updates, patches, and version upgrades as required.
- Experience in building Pig scripts to extract, transform, and load data into HDFS for processing.
- Knowledge and understanding of the latest Hadoop ecosystem developments, such as Apache Spark integration with Hadoop.
- Knowledge of job workflow scheduling and monitoring tools such as Oozie.
- Worked with business users to extract clear requirements to create business value.
- Designed, delivered, and helped manage a device data analytics solution at a very large storage vendor.
- Good knowledge of single-node and multi-node cluster configurations.
- Very good understanding of SQL, ETL, and data warehousing technologies.
- Experienced with Hadoop internals (MapReduce/YARN, HDFS).
- Extended Hive and Pig core functionality with custom UDFs (see the Hive UDF sketch after this list).
- Experience with Amazon Web Services (AWS) migration and development.
- Worked on event-driven programming in Python.
- Experience in developing solutions to analyze large data sets efficiently.
- Installed the Oozie workflow engine to run multiple Hive and Pig jobs, triggered independently based on time and data availability.
- Experience working under tight deadlines and rapidly changing priorities, with a proactive, creative, and focused approach to business needs and strong analytical, interpersonal, and teamwork skills.
- Motivated to take on independent responsibility, with the ability to contribute as a productive team member.
- Strong communication, collaboration, and team-building skills; adept at grasping new technical concepts quickly and applying them productively.
- Highly motivated, with good interpersonal skills and the ability to work to strict deadlines.
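A minimal sketch of the kind of custom Hive UDF referenced above; the class name, column semantics, and registration command are illustrative, not taken from the original projects:

```java
import org.apache.hadoop.hive.ql.exec.UDF;
import org.apache.hadoop.io.Text;

// Hypothetical UDF that normalizes free-text device IDs before analysis.
// Registered in Hive with: CREATE TEMPORARY FUNCTION normalize_id AS '...';
public final class NormalizeId extends UDF {
    public Text evaluate(Text input) {
        if (input == null) {
            return null; // Hive passes NULL columns through as null
        }
        // Trim whitespace and upper-case so joins on the ID column match
        return new Text(input.toString().trim().toUpperCase());
    }
}
```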
TECHNICAL SKILLS
Operating Systems: Windows, UNIX
Domain: Storage, Cloud, Big Data, Java, HDFS, Hive, Pig, HBase
Big Data Ecosystem: Hadoop, MapReduce, HDFS, HBase, Ambari, Spark, Hive, Pig, Sqoop, Oozie, and Flume
Tools: Eclipse, Dropbox, SFDC, Ubuntu
Programming Languages: C, Java, SQL, JavaScript, Python, PL/SQL
PROFESSIONAL EXPERIENCE
Confidential, San Francisco, CA
Hadoop Consultant
Responsibilities:
- Worked with business partners to gather business requirements.
- Analyzed large data sets by running Hive queries and Pig scripts.
- Involved in creating Hive tables and in loading and analyzing data using Hive queries.
- Developed simple to complex MapReduce jobs using Hive and Pig.
- Involved in running Hadoop jobs to process millions of records of text data.
- Created connections through JDBC and used JDBC callable statements to invoke stored procedures (see the JDBC sketch after this list).
- Developed Pig Latin scripts to extract data from web server output files and load it into HDFS.
- Developed Pig UDFs to pre-process the data for analysis.
- Implemented multiple MapReduce jobs in Java for data cleansing and pre-processing (a representative cleansing mapper is sketched after this list).
- Moved RDBMS data, delivered as flat files from various channels, into HDFS for further processing.
- Wrote CLI commands for managing files in HDFS.
- Developed job workflows in Oozie to automate loading data into HDFS.
- Handled importing of data from various data sources, performed transformations using Hive and MapReduce, loaded data into HDFS, and extracted data from Teradata into HDFS using Sqoop.
- Wrote script files for processing data and loading it into HDFS.
- Worked extensively with Sqoop for importing metadata from RDBMS.
- Provided cluster coordination services through ZooKeeper.
- Set up Hive with MySQL as a remote metastore.
- Moved all log/text files generated by various products into a cluster location on HDFS.
- Prepared weekly and monthly status reports.
- Attended defect calls to provide the latest status to the client.
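A minimal sketch of the JDBC stored-procedure call pattern noted above; the procedure name, parameters, and connection details are illustrative assumptions:

```java
import java.sql.CallableStatement;
import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.SQLException;

public class StoredProcCaller {
    public static void main(String[] args) throws SQLException {
        // Hypothetical connection details; real credentials came from configuration.
        String url = "jdbc:oracle:thin:@//dbhost:1521/ORCL";
        try (Connection conn = DriverManager.getConnection(url, "user", "password");
             // {call ...} is the standard JDBC escape syntax for stored procedures
             CallableStatement cs = conn.prepareCall("{call update_customer_status(?, ?)}")) {
            cs.setLong(1, 12345L);      // IN: customer id (illustrative)
            cs.setString(2, "ACTIVE");  // IN: new status (illustrative)
            cs.execute();
        }
    }
}
```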
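A representative sketch of the kind of Java data-cleansing mapper described above, using the Hadoop MapReduce API; the cleansing rule (dropping malformed pipe-delimited records) and the field count are illustrative:

```java
import java.io.IOException;
import org.apache.hadoop.io.LongWritable;
import org.apache.hadoop.io.NullWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Mapper;

// Map-only cleansing job: keep well-formed pipe-delimited records, drop the rest.
public class CleansingMapper
        extends Mapper<LongWritable, Text, Text, NullWritable> {
    private static final int EXPECTED_FIELDS = 5; // illustrative record width

    @Override
    protected void map(LongWritable key, Text value, Context context)
            throws IOException, InterruptedException {
        String line = value.toString().trim();
        // Emit only records with the expected number of fields
        if (!line.isEmpty() && line.split("\\|", -1).length == EXPECTED_FIELDS) {
            context.write(new Text(line), NullWritable.get());
        }
    }
}
```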
Environment: Java 1.7, Spring, Hadoop, YARN, Hive, NoSQL, UNIX, Git, Cloudera CDH3, Flume, Sqoop, Pig, HDFS, MapReduce, HBase.
Confidential, San Francisco, CA
Hadoop Developer/Admin
Responsibilities:
- Involved in the Software Development Life Cycle (SDLC) of the project, from analysis and design through implementation and testing.
- Created and scheduled end-to-end workflows using Oozie.
- Designed Hive and Impala tables in Parquet format for faster querying and reporting.
- Monitored the Hadoop cluster and kept it healthy.
- Developed Hive DDL and DML statements to create tables and populate data.
- Developed shell scripts to pull data from RDBMS sources and apply incremental and full loads to the Hive tables.
- Produced technical designs for end-to-end Hadoop job flows and integration components.
- Wrote Sqoop scripts to extract data from an existing Oracle RDBMS source and store it in HDFS.
- Wrote Pig scripts to perform various data transformations (ETL tasks) on input data and store the output in Hive tables.
- Designed and executed Java MapReduce programs performing complex business computations for revenue management systems that handle sensitive customer information.
- Worked on Java MapReduce jobs to process incoming data and load it into HBase and Hive tables.
- Cleansed weblog data with automated Python scripts.
- Designed, created, and loaded Hive tables for structured data.
- Designed and developed Spark applications in Scala to replace long-running MapReduce jobs (see the Spark sketch after this list).
- Integrated and executed Spark applications within Oozie workflows.
- Developed Pig and Hive UDFs for complex computations (a Pig UDF is sketched after this list).
- Extensively used Hue and Cloudera Manager in CDH.
- Designed and scheduled end-to-end Oozie workflows for various subject areas.
- Demonstrated end-to-end workflows to the Confidential audience.
- Optimized the code and ensured production stability.
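A minimal sketch of the kind of Spark job that replaced long-running MapReduce work. It uses Spark's Java API for consistency with the other sketches here (the production applications were written in Scala, per the bullet above); the paths and filter rule are illustrative:

```java
import org.apache.spark.SparkConf;
import org.apache.spark.api.java.JavaRDD;
import org.apache.spark.api.java.JavaSparkContext;
import org.apache.spark.api.java.function.Function;

public class LogFilterJob {
    public static void main(String[] args) {
        SparkConf conf = new SparkConf().setAppName("LogFilterJob");
        JavaSparkContext sc = new JavaSparkContext(conf);

        // Read raw logs from HDFS; paths are illustrative.
        JavaRDD<String> lines = sc.textFile("hdfs:///data/raw/logs");

        // Keep only error records -- the kind of scan-and-filter step that
        // previously ran as a long MapReduce job.
        JavaRDD<String> errors = lines.filter(new Function<String, Boolean>() {
            @Override
            public Boolean call(String line) {
                return line.contains("ERROR");
            }
        });

        errors.saveAsTextFile("hdfs:///data/clean/errors");
        sc.stop();
    }
}
```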
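A minimal sketch of a Pig UDF of the kind described above; the class name and cleansing logic are illustrative:

```java
import java.io.IOException;
import org.apache.pig.EvalFunc;
import org.apache.pig.data.Tuple;

// Hypothetical Pig UDF that trims and lower-cases a chararray field.
// Used in a script after: REGISTER myudfs.jar;
public class CleanField extends EvalFunc<String> {
    @Override
    public String exec(Tuple input) throws IOException {
        if (input == null || input.size() == 0 || input.get(0) == null) {
            return null; // pass nulls through, as Pig expects
        }
        return input.get(0).toString().trim().toLowerCase();
    }
}
```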
Environment: Cloudera (CDH4), HDFS, MapReduce, Java, Pig, Hive, HBase, Sqoop, Flume, Spark, Scala, Oozie, Hue, SVN, D3.js, and R
Confidential, Texas
Sr. Java developer
Responsibilities:
- Created various modules and components as per business requirement.
- Extensively used MVC architecture and JBoss for deployment purposes.
- Provided technical support for various key business releases.
- Coordinated with multiple teams to resolve items involved in major releases. As the functional owner and senior Java developer on the team, completed code reviews and provided input to make the system more agile and maintainable.
- Built the backend services consumed by the Struts action classes.
- Created SOAP web services to allow communication between the applications.
- Developed and analyzed the front-end and back-end using JSP, Servlets and Spring 3.0.
- Integrated Spring dependency injection across the different layers of the application.
- Worked with Agile methodology.
- Used the Spring framework for dependency injection and transaction management.
- Used Spring MVC controllers for the controller part of the MVC architecture (see the controller sketch after this list).
- Used the Java Message Service (JMS) for reliable, asynchronous exchange of important information such as loan status reports (a JMS sender is sketched after this list).
- Implemented various complex PL/SQL queries.
- Worked with testers in resolving defects in the application and was an integral part of the team.
- Interacted with Business Analysts to come up with better implementation designs for the application.
- Interacted with users on technical problems and mentored business users.
- Implemented best practices and performance-improvement/productivity plans.
- Coordinated activities between offshore and onsite teams.
- Developed the presentation layer and content management framework using HTML and JavaScript.
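A minimal sketch of the Spring MVC controller pattern noted above; the service interface, URL, and view name are illustrative assumptions:

```java
import org.springframework.beans.factory.annotation.Autowired;
import org.springframework.stereotype.Controller;
import org.springframework.ui.Model;
import org.springframework.web.bind.annotation.PathVariable;
import org.springframework.web.bind.annotation.RequestMapping;
import org.springframework.web.bind.annotation.RequestMethod;

// Hypothetical service interface; the real one belonged to the application.
interface LoanService {
    Object findById(long id);
}

// Illustrative controller: Spring injects the service (DI) and maps the
// request to a JSP view via the configured view resolver.
@Controller
public class LoanStatusController {

    @Autowired
    private LoanService loanService;

    @RequestMapping(value = "/loans/{id}", method = RequestMethod.GET)
    public String showLoan(@PathVariable("id") long id, Model model) {
        model.addAttribute("loan", loanService.findById(id));
        return "loanDetails"; // logical view name resolved to a JSP
    }
}
```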
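A minimal sketch of a JMS sender for the asynchronous loan-status exchange described above; the JNDI names and message payload are illustrative:

```java
import javax.jms.Connection;
import javax.jms.ConnectionFactory;
import javax.jms.MessageProducer;
import javax.jms.Queue;
import javax.jms.Session;
import javax.jms.TextMessage;
import javax.naming.InitialContext;

public class LoanStatusSender {
    public static void main(String[] args) throws Exception {
        // JNDI names are illustrative; real names came from the JBoss config.
        InitialContext ctx = new InitialContext();
        ConnectionFactory factory = (ConnectionFactory) ctx.lookup("ConnectionFactory");
        Queue queue = (Queue) ctx.lookup("queue/loanStatus");

        Connection connection = factory.createConnection();
        try {
            Session session = connection.createSession(false, Session.AUTO_ACKNOWLEDGE);
            MessageProducer producer = session.createProducer(queue);

            // Fire-and-forget: the consumer picks this up asynchronously.
            TextMessage message = session.createTextMessage("loanId=12345;status=APPROVED");
            producer.send(message);
        } finally {
            connection.close();
        }
    }
}
```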
Environment: Java 1.6, J2EE, Servlets, JMS, Spring, SOAP Web Services, HTML, JavaScript, JDBC, Agile Methodology, PL/SQL, XML, UML, UNIX, Oracle 10g, JBoss, Eclipse.
Confidential, North Brunswick, NJ
Java Developer
Responsibilities:
- Analyzed requirements for developing an automated monitoring tool.
- Managed and mentored a group of application developers, assigned responsibilities, elaborated use cases, managed project schedules, and module targets.
- Provided recommendations on OO design concepts, best practices, and exception handling, and identified and fixed potential memory, performance, and transactional issues.
- Provided solutions for new requirements and change requests.
- Enhanced reports based on user requirements.
- Formatted reports to improve their look and feel.
- Performed unit and system testing of the developed modules. Communicated with legacy systems using SOAP-based web services (SOAP, WSDL).
- Wrote service and implementation classes using EJB calls to save, update, and delete objects. Wrote an EJB stateless session bean that obtains Hibernate sessions for persistence (sketched after this list).
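A minimal sketch of the EJB stateless session bean with Hibernate described above; entity handling is reduced to a generic save (update and delete follow the same pattern), and the configuration is assumed to come from hibernate.cfg.xml:

```java
import javax.ejb.Stateless;
import org.hibernate.Session;
import org.hibernate.SessionFactory;
import org.hibernate.Transaction;
import org.hibernate.cfg.Configuration;

// Hypothetical stateless session bean wrapping Hibernate CRUD operations;
// entity types and the configuration source are illustrative.
@Stateless
public class PersistenceBean {

    // One SessionFactory per deployment, built from hibernate.cfg.xml.
    private static final SessionFactory SESSION_FACTORY =
            new Configuration().configure().buildSessionFactory();

    public void save(Object entity) {
        Session session = SESSION_FACTORY.openSession();
        Transaction tx = session.beginTransaction();
        try {
            session.save(entity);
            tx.commit();
        } catch (RuntimeException e) {
            tx.rollback();
            throw e;
        } finally {
            session.close();
        }
    }
}
```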
Environment: Java/J2EE, Struts 2.0, Spring 3.2, EJB, WebLogic 10.3, Oracle 11g, XML, SAX, DOM, JAXB, WSDL, XML Spy, SOAP, JavaScript, jQuery, AJAX, HTML5, CSS3, Maven, SVN, PL/SQL
Confidential, San Francisco, CA
Java Developer
Responsibilities:
- Analyzed requirements for developing an application for automated monitoring.
- Provided solutions for new requirements and change requests.
- Enhanced reports based on user requirements.
- Developed various monitoring reports for all batch jobs.
- Performed unit and system testing of the developed modules.
- Created data flow transformations for a new client.
- Created database objects: tables, procedures, and functions.
- Developed reports such as table, matrix, sub-report, and drill-down reports.
- Programmed business logic using Java/J2EE.
- Analyzed requirements and change requests (CRs).
Environment: Java/J2EE, PL/SQL