Hadoop Consultant Resume
Hartford, CT
SUMMARY
- 10+ years of experience in the IT industry, including hands-on experience with Big Data ecosystem technologies.
- Over 3 years of comprehensive experience as a Big Data developer.
- Experience in installing, configuring, supporting, and monitoring Hadoop clusters using Apache Hadoop and the Hortonworks Distribution (HDP 2.x).
- Hands-on experience working with ecosystem components including Hive, Pig, Sqoop, MapReduce, Flume, Oozie, Spark, and Flink.
- Expertise in different techniques for loading data onto HDFS (Flume, NiFi, Sqoop).
- Hands-on experience writing Pig Latin scripts and HiveQL queries.
- Strong knowledge of Pig and Hive analytical functions.
- Resolved performance issues in Hive and Pig scripts through an understanding of joins, grouping, and aggregation.
- Experience importing and exporting data between HDFS and relational database systems using Sqoop.
- Hands-on experience using Hive partitioning and bucketing and executing different types of joins on Hive tables (a representative sketch follows this summary).
- Extending Hive and Pig core functionality by writing custom UDFs.
- Expertise in Hadoop administration tasks such as managing clusters and reviewing Hadoop log files.
- Good experience writing complex SQL queries against databases such as Oracle 10g, MySQL, and SQL Server.
- Knowledge of job workflow scheduling and monitoring tools such as Oozie and ZooKeeper.
- Experience in designing, developing, and implementing connectivity products that allow efficient data exchange between the core database engine and the Hadoop ecosystem.
- Experience as a Java Developer using Core Java and SQL.
- Good knowledge of normalization, fact tables, and dimension tables, as well as OLAP and OLTP systems.
- Techno-functional responsibilities include interfacing with users, identifying functional and technical gaps, estimating, designing custom solutions, development, leading developers, producing documentation, and production support.
- Excellent interpersonal and communication skills; creative, research-minded, technically competent, and results-oriented, with problem-solving and leadership skills.
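A representative sketch of the Sqoop-to-Hive loading pattern referenced above (illustrative only: connection strings, schemas, tables, columns, and paths are hypothetical, and the Hive settings assume Hive 1.x behavior):

#!/usr/bin/env bash
# Illustrative sketch only: hosts, schemas, tables, and paths are hypothetical.

# 1. Pull a source table from the relational database into HDFS with Sqoop.
sqoop import \
  --connect jdbc:oracle:thin:@//dbhost:1521/ORCL \
  --username etl_user -P \
  --table ORDERS \
  --target-dir /data/staging/orders \
  --num-mappers 4 \
  --fields-terminated-by '\t'

# 2. Expose the staged files through an external Hive table, then load them
#    into a partitioned, bucketed managed table for efficient joins.
hive -e "
CREATE EXTERNAL TABLE IF NOT EXISTS orders_stg (
  order_id BIGINT, customer_id BIGINT, amount DOUBLE, order_date STRING)
ROW FORMAT DELIMITED FIELDS TERMINATED BY '\t'
LOCATION '/data/staging/orders';

CREATE TABLE IF NOT EXISTS orders (
  order_id BIGINT, customer_id BIGINT, amount DOUBLE)
PARTITIONED BY (order_date STRING)
CLUSTERED BY (customer_id) INTO 16 BUCKETS
STORED AS ORC;

SET hive.exec.dynamic.partition=true;
SET hive.exec.dynamic.partition.mode=nonstrict;
SET hive.enforce.bucketing=true;

INSERT OVERWRITE TABLE orders PARTITION (order_date)
SELECT order_id, customer_id, amount, order_date FROM orders_stg;"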
TECHNICAL SKILLS
Hadoop/Big Data: HDFS, Pig, Hive, MapReduce, HBase, Sqoop, Flume, Oozie, ZooKeeper
IDE: Eclipse
Programming languages: Java, Scala, Python, Linux shell scripts
BI Applications: OBIEE 11.x
Databases: Oracle, SQL Server, MySQL
Operating Systems: Windows, Linux, Unix
Database Tools: TOAD, SQL Developer
Database Query Language: SQL, PL/SQL
PROFESSIONAL EXPERIENCE
Confidential, Hartford CT
Hadoop Consultant
Responsibilities:
- Developed a data pipeline using Flume, Hadoop, and Hive to ingest, transform, and analyze data.
- Developed and tested Hadoop jobs and implemented a data quality solution based on the design.
- Worked on loading data from Hive into Spark using Spark SQL.
- Developed data pipeline using Flume to ingest customer behavioral data into HDFS for analysis.
- During the data exploration stage, used Hive and Impala to gain insights into the customer data.
- Used the Hive data warehouse tool to analyze the data in HDFS and developed Hive queries.
- Used Pig as an ETL tool to perform transformations, joins, and pre-aggregations before storing the data onto HDFS.
- Developed Hive scripts to meet analysts' requirements and cross-checked data loaded into Hive tables against the source data in Oracle.
- Worked on data migration and data conversion, using SQL to convert the logic into custom ETL tasks.
- Explored Spark to improve the performance and optimization of existing Hadoop algorithms using SparkContext, Spark SQL, DataFrames, pair RDDs, and YARN (a representative sketch follows this role).
- Involved in creating Hive internal and external tables, loading them with data, and writing Hive queries that require multiple join scenarios. Created partitioned and bucketed tables in Hive based on the hierarchy of the dataset.
- Tuned ETL code and SQL queries to improve performance, availability, and throughput.
- Involved in ingesting data into the data warehouse using various data loading techniques.
- Imported and exported data between RDBMS and HDFS using Sqoop.
- Involved in preparing the best practices, code review methodology, and migration documents.
- Tuned existing ETL SQL scripts by making necessary design changes to improve the performance of fact tables.
- Wrote scripts to generate MapReduce jobs and performed ETL procedures on the data in HDFS using Pig, Hive, and Oozie.
- Wrote Hive jobs to parse the logs and structure them in a tabular format to facilitate effective querying of the log data.
- Involved in developing Pig scripts for change data capture and delta record processing between newly arrived data and existing data in HDFS.
- Developed Python scripts and UDFs using both DataFrames/SQL and RDDs/MapReduce in Spark for data aggregation and queries, and wrote data back into the RDBMS through Sqoop.
Environment: Cloudera CDH 5.7, MapReduce, HDFS, Hive, Pig, Impala, Spark, Ambari, Sqoop, Flume, Oozie, Java, Scala, Eclipse, Teradata, and Unix/Linux.
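A representative sketch of the Spark-over-Hive aggregation and Sqoop write-back pattern described in this role (illustrative only: tables, columns, hosts, and paths are hypothetical; shown via spark-shell in the Spark 1.x style shipped with CDH 5.7, where sqlContext is Hive-enabled):

#!/usr/bin/env bash
# Illustrative sketch only: tables, columns, hosts, and paths are hypothetical.

# Aggregate a Hive table with Spark SQL and land the result as
# tab-delimited text on HDFS.
spark-shell --master yarn <<'EOF'
val spend = sqlContext.sql(
  """SELECT customer_id, SUM(amount) AS total_spend
     FROM orders
     GROUP BY customer_id""")
spend.map(r => r.mkString("\t")).saveAsTextFile("/data/out/customer_spend")
EOF

# Push the aggregated result back into the relational database with Sqoop.
sqoop export \
  --connect jdbc:oracle:thin:@//dbhost:1521/ORCL \
  --username etl_user -P \
  --table CUSTOMER_SPEND \
  --export-dir /data/out/customer_spend \
  --input-fields-terminated-by '\t' \
  --num-mappers 4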
Confidential, Philadelphia
Hadoop Developer
Responsibilities:
- Developed a data pipeline using Flume, Hadoop, and Hive to ingest, transform, and analyze data.
- Developed and tested Hadoop jobs and implemented a data quality solution based on the design.
- Developed data pipeline using Flume to ingest customer behavioral data into HDFS for analysis.
- Hands-on experience using Hive partitioning and bucketing and executing different types of joins on Hive tables.
- Developed and tested ETL jobs and implemented a data quality solution based on the design.
- During the data exploration stage, used Hive and Impala to gain insights into the customer data.
- Used the Hive data warehouse tool to analyze the data in HDFS and developed Hive queries.
- Used Pig as an ETL tool to perform transformations, joins, and pre-aggregations before storing the data onto HDFS.
- Developed Hive scripts to meet analysts' requirements and cross-checked data loaded into Hive tables against the source data in Oracle.
- Worked on data migration and data conversion, using SQL to convert the logic into custom ETL tasks.
- Involved in creating Hive internal and external tables, loading them with data, and writing Hive queries that require multiple join scenarios. Created partitioned and bucketed tables in Hive based on the hierarchy of the dataset.
- Tuned ETL code and SQL queries to improve performance, availability, and throughput.
- Involved in ingesting data into the data warehouse using various data loading techniques.
- Imported and exported data between RDBMS and HDFS using Sqoop.
- Involved in preparing the best practices, code review methodology, and migration documents.
- Tuned existing ETL SQL scripts by making necessary design changes to improve the performance of fact tables.
- Wrote scripts to generate MapReduce jobs and performed ETL procedures on the data in HDFS using Pig, Hive, Python, and Oozie.
- Wrote Hive jobs to parse the logs and structure them in a tabular format to facilitate effective querying of the log data.
- Involved in developing Pig scripts for change data capture and delta record processing between newly arrived data and existing data in HDFS (a Pig sketch follows this role).
Environment: Hortonworks HDP 2.5, MapReduce, HDFS, Hive, Pig, SQL, Ambari, Sqoop, Flume, Oozie, HBase, Java (JDK 1.6), Eclipse, MySQL, and Unix/Linux.
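A representative sketch of the change-data-capture / delta-processing Pig pattern mentioned in this role (illustrative only: paths, schemas, and field names are hypothetical):

#!/usr/bin/env bash
# Illustrative sketch only: paths, schemas, and field names are hypothetical.
cat > customer_delta.pig <<'PIG'
-- Keep only records that are new or whose attributes changed.
existing = LOAD '/data/master/customers' USING PigStorage('\t')
           AS (cust_id:long, name:chararray, email:chararray);
incoming = LOAD '/data/incoming/customers' USING PigStorage('\t')
           AS (cust_id:long, name:chararray, email:chararray);

joined = JOIN incoming BY cust_id LEFT OUTER, existing BY cust_id;

delta = FILTER joined BY (existing::cust_id IS NULL)
                      OR (incoming::name  != existing::name)
                      OR (incoming::email != existing::email);

changes = FOREACH delta GENERATE incoming::cust_id, incoming::name, incoming::email;
STORE changes INTO '/data/delta/customers' USING PigStorage('\t');
PIG

# Run the delta job on the cluster; the same script can be wrapped
# in an Oozie workflow action for scheduling.
pig -f customer_delta.pig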
Confidential, Stamford, CT
Hadoop Developer
Responsibilities:
- Handled importing of data from various data sources, performed transformations using Hive and MapReduce, loaded data into HDFS, and extracted data from the Oracle database into HDFS using Sqoop.
- Responsible for cluster maintenance, adding and removing cluster nodes, cluster monitoring and troubleshooting, managing and reviewing data backups, and managing and reviewing Hadoop log files.
- Developed the Pig Latin code for loading, filtering and storing the data.
- Collected and aggregated large amounts of log data using Apache Flume and staged the data in HDFS for further analysis (a Flume configuration sketch follows this role).
- Imported and exported data into HDFS and Hive using Sqoop.
- Analyzed the data by performing Hive queries and running Pig scripts to understand user behavior.
- Continuously monitored and managed the Hadoop cluster through Cloudera Manager.
- Installed the Oozie workflow engine to run multiple Hive and Pig jobs.
- Developed Hive queries to process the data and generate data cubes for visualization.
Environment: Hadoop, MapReduce, HDFS, Hive, Oracle 11g, Java, Struts, Servlets, HTML, XML, SQL, J2EE, JUnit, Tomcat 6.
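A representative sketch of the Flume log-aggregation setup described in this role (illustrative only: the agent, log, and HDFS path names are hypothetical):

#!/usr/bin/env bash
# Illustrative sketch only: agent, log, and HDFS path names are hypothetical.
cat > weblog-agent.properties <<'CONF'
# Tail an application log and stream events into date-partitioned HDFS dirs.
agent1.sources  = src1
agent1.channels = ch1
agent1.sinks    = sink1

agent1.sources.src1.type     = exec
agent1.sources.src1.command  = tail -F /var/log/app/access.log
agent1.sources.src1.channels = ch1

agent1.channels.ch1.type     = memory
agent1.channels.ch1.capacity = 10000

agent1.sinks.sink1.type                   = hdfs
agent1.sinks.sink1.channel                = ch1
agent1.sinks.sink1.hdfs.path              = /data/logs/%Y-%m-%d
agent1.sinks.sink1.hdfs.fileType          = DataStream
agent1.sinks.sink1.hdfs.rollInterval      = 300
agent1.sinks.sink1.hdfs.useLocalTimeStamp = true
CONF

# Start the agent (the configuration directory path is environment-specific).
flume-ng agent -n agent1 -c /etc/flume-ng/conf -f weblog-agent.properties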
Confidential, Boston MA
Java Developer
Responsibilities:
- Responsible for the analysis, documenting the requirements and architecting the application based on J2EE standards. Followed test-driven development (TDD) and participated in scrum status reports.
- Provided full SDLC application development services, including design, integration, testing, and deployment of enterprise mission-critical billing solutions.
- Participated in designing use case, class, and sequence diagrams for various engine components and used IBM Rational Rose to generate the UML notation.
- Developed Ant, Maven, and shell scripts to automatically compile, package, deploy, and test J2EE applications on a variety of WebSphere platforms (a build-and-deploy sketch follows this role).
- Experience in developing business applications using JBoss, WebSphere, and Tomcat.
- Used Perl scripting, shell scripting, and PL/SQL programming to resolve business problems of various kinds.
- Used application-layer integration protocols such as XMPP and AMSP, along with SIP over IP.
- Converted the standalone MS-Access reports into Oracle Reports with business logic written in PL/SQL and Java.
- Developed front-end screens using JSP with tag libraries and HTML pages.
- Implemented JSP Standard Tag Library (JSTL) tags along with Expression Language (EL).
- Cleaning up duplicate reports across business streams.
- Wrote SQL queries and stored procedures, and modified the existing database structure as required for new features.
- Performed client-side and server-side validations according to business needs. Wrote test cases, performed unit testing, and wrote and executed JUnit tests.
- Wrote Ant scripts to build the project in a Linux environment.
- Involved in production implementation and post-production support.
Environment: Spring MVC, Oracle 11g, J2EE, Java, JDBC, Servlets, JSP, XML, Design Patterns, CSS, HTML, JavaScript 1.2, JUnit, Apache Tomcat, My SQL Server 2008.
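A representative sketch of the build-and-deploy scripting described in this role (illustrative only: the artifact, host, and directory names are hypothetical, and a plain Tomcat hot-deploy is shown rather than the WebSphere-specific scripting):

#!/usr/bin/env bash
# Illustrative sketch only: artifact, host, and directory names are hypothetical.
set -euo pipefail

# Compile, run the JUnit suite, and package the application as a WAR.
mvn clean package

# Drop the WAR into Tomcat's webapps directory, which auto-deploys it.
scp target/billing-app.war deployer@appserver01:/opt/tomcat/webapps/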
Confidential
Siebel Consultant
Responsibilities:
- Preparation of the LLD
- Customization of the User Interface
- Customization of the Business Layer
- Customization of the Data Layer
- Workflows for process automations
Environment: Siebel 8.1.1 SIA (Public Sector), Oracle 11g, Windows 2003 Server SP2, Windows XP
Confidential
Siebel Consultant
Responsibilities:
- Preparation of the HLD, LLD
- Customization of the User Interface
- Creation of the Email Templates
- Creation of Workflow Policy Objects
- Creation of Workflow Policies
- Creation of Workflows to send out emails using the Templates created, via the Outbound Communication Manager.
- Creation of the VBC to display the transaction details
Environment: Siebel 8.1.1, Oracle 11g, HP-UX, Windows 2003 Server
Confidential
Siebel Consultant
Responsibilities:
- Preparation of the LLD
- Customization of the User Interface
- Customization of the Business Layer
- Customization of the Data Layer
- Workflows for process automations
Environment: Siebel 8.1.1, Oracle 11g, HP-UX, Windows 2003 Server
Confidential
Siebel Consultant
Responsibilities:
- Day-to-day Administration and Monitoring of the Application
- Part of new application upgrade releases, which happen quarterly.
- Customized Object Definitions using Siebel Tools.
- Providing solutions for the issues raised by the TPMS users.
- Shipment Data Load Monitoring and Troubleshooting
- Product Data Load Monitoring and Troubleshooting
Environment: Siebel 7.8, Oracle 9i, HP-UX, Windows 2003 Server
Confidential
Siebel Consultant
Responsibilities:
- Day-to-day Administration and Monitoring of the Application
- Part of new application upgrade releases, which happen quarterly.
- Customized Object Definitions using Siebel Tools.
- Providing solutions for the issues raised by the end users.
Environment: Siebel 2000, Oracle 8i, Actuate, Windows 2000