
Hadoop/Big Data Consultant Resume


Bellevue, WA

SUMMARY:

  • 8+ years of extensive IT experience, including 4 years of experience in Big Data and Hadoop ecosystem technologies.
  • Experience with Hadoop Ecosystem: Hortonworks 2.0/2.2/2.3, CDH 4.3.2/5.2, HDFS, MapReduce, YARN, Sqoop, Flume, Oozie, Pig, Hive, Scala, HBase, MongoDB, Cassandra, Solr, Spark, Storm and Kafka.
  • Experience in importing and exporting data using Sqoop from relational database systems to HDFS and vice versa.
  • Expertise in Hive Query Language (HiveQL), Hive Security and debugging Hive issues.
  • Experience in performing extensive data validation using Hive dynamic partitioning and bucketing.
  • Experience in developing custom UDFs for Pig and Hive to incorporate methods and functionality of Python/Java into Pig Latin and HiveQL (a minimal UDF sketch follows this summary).
  • Experience in streaming data to HDFS using Flume.
  • Expertise in writing ETL Jobs for analyzing data using Pig.
  • Experience in NoSQL column-oriented databases like HBase and Cassandra and their integration with Hadoop clusters.
  • Hands-on experience using the MapReduce programming model for batch processing of data stored in HDFS.
  • Experience in managing and reviewing Hadoop Log files.
  • Experience with Solr in implementing indexes for fast data retrieval.
  • Experience with the Oozie Workflow Engine in running workflow jobs with actions that run Hadoop MapReduce and Pig jobs.
  • Experience in Database Design, ER modeling, SQL, PL/SQL, procedures, functions, triggers.
  • Extensive working knowledge on Teradata BTEQ and M-Load Scripts.
  • Good experience in developing build scripts using Ant, Maven, Jenkins and AccuRev.
  • Good working knowledge of GitHub and Nexus code repositories.
  • Good knowledge on Spark, Kafka and Storm.
  • Experience working with Talend.
  • Proficient in development methodologies such as Agile, Scrum and Waterfall.
  • Extensive working knowledge on Shell Scripting.
  • Good knowledge of multiple file formats such as SequenceFile, Avro, RC and ORC.
  • Worked with customers and end users to formulate and document business requirements.
  • Worked closely with Data Architects for creating S2TM (Source to Target Mapping).
  • Experience in understanding source data by performing source system analysis.
  • Worked extensively on business requirements analysis, functional and non-functional requirements analysis, risk analysis and UAT.
  • Experience in database development using SQL and PL/SQL and experience working on databases like Oracle 9i/10g, SQL Server, MySQL and Teradata.
  • Good communication and interpersonal skills; a team player and contributor who delivers on schedule under tight deadlines.
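As an illustration of the custom Hive UDF work noted above, the following is a minimal sketch in Java; the class name and the trim/lower-case behavior are hypothetical examples rather than code from any of the projects below.

    // Minimal, illustrative Hive UDF written against the classic simple-UDF API
    // (org.apache.hadoop.hive.ql.exec.UDF); Hive resolves evaluate() by reflection.
    import org.apache.hadoop.hive.ql.exec.UDF;
    import org.apache.hadoop.io.Text;

    public final class TrimLower extends UDF {
        public Text evaluate(final Text input) {
            if (input == null) {
                return null;                // keep SQL NULLs as NULLs
            }
            return new Text(input.toString().trim().toLowerCase());
        }
    }

Once packaged, it would be registered from HiveQL with something like ADD JAR udfs.jar; CREATE TEMPORARY FUNCTION trim_lower AS 'TrimLower'; and then used inline in queries.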

TECHNICAL SKILLS:

Hadoop Distribution: Apache, Cloudera CDH, Hortonworks HDP

Big Data Technologies: Apache Hadoop (MRv1, MRv2), Hive, Pig, Sqoop, HBase, Flume, ZooKeeper, Oozie, Ambari, Hue, Impala, Spark, Kafka and Storm

Operating Systems: Windows, Linux & Unix

Languages: C, Java, PL/SQL, Unix Shell, Python

Frameworks: Struts, Spring and Hibernate

Web Technologies: HTML, JSP, CSS, JavaScript

IDEs: Eclipse, IBM WebSphere

Webservers /App Servers: Apache Tomcat 6.0/7.0, IBM WebSphere 6.0/7.0, JBoss 4.3

Database: Oracle 8i/9i/10g, MySQL, HBase (NoSQL), MongoDB (NoSQL), Teradata

Test Management Tools: Quality Center, Bugzilla, JIRA

PROFESSIONAL EXPERIENCE:

Confidential, Bellevue, WA

Hadoop/Big Data Consultant

Responsibilities:
  • Understood the business needs and objectives of the system; interacted with the end client/users and gathered requirements for the integrated system.
  • Worked as a source system analyst to understand various source systems by interacting with each source's SMEs.
  • Worked as a Data Modeler/Architect, helping senior architects design the target data models and S2TMs.
  • Involved in various activities of the project like information gathering, analyzing the information, documenting the functional and nonfunctional requirements.
  • Ingested data from different sources into Hadoop Data Lake using Sqoop.
  • Performed data profiling activities such as data type validation and NULL checks on the ingested data using Java/Python MapReduce programs (a mapper sketch appears at the end of this project).
  • Used Pig as an ETL tool to do the business transformations and then load the transformed data into the target tables.
  • Implemented incremental logic for data ingestion using HBase by setting a proclog entry for each data load.
  • Involved in integrating Cassandra with Hadoop.
  • Wrote a Storm topology to accept events from a Kafka producer and emit them into Cassandra.
  • Created HBase Tables for each source table for maintaining audit information.
  • Created Hive external tables for each source table in Hadoop Data Lake.
  • Created Hive partitions and buckets for tables on the basis of load date.
  • Developed Pig Scripts for processing the Ingested data as per business requirement.
  • Implemented SCD Type-2 logic in the processing layer for versioning records using Java MapReduce.
  • Converted SCD Type-1 tables to SCD Type-2 tables using Pig transformations and Java UDFs.
  • Developed Pig UDFs in Java to convert date formats as required (a sketch follows this list).
  • Created Oozie workflows to automate the actions required for each job.
  • Worked with Talend.
  • Used Control-M as a job scheduling and monitoring tool.
  • Developed Denodo views over source-like data to make data available to business users.
  • Implemented Hadoop security using Kerberos with Active Directory.
  • Used the Cloudera Hue interface for job monitoring and viewing table metadata.
  • Used HCatalog as a table management solution for both the Pig and Hive interfaces.
  • Developed shell scripts for executing and parameterizing each job.
  • Used Kafka as middleware to send data from various sources to the Hadoop Data Lake.
  • Implemented Spark as a processing engine, working with RDDs.
  • Developed Spark code using Scala and Spark-SQL for faster processing of data.
  • Developed Spark scripts by using Scala shell commands as per the requirement.
  • Analyzed the SQL scripts and redesigned the solution for implementation in Scala.
  • Exported processed data back to the Teradata staging layer using Sqoop.
  • Loaded data from the edge node into Teradata stage tables using the Teradata MultiLoad (M-Load) utility.
  • Exported Teradata tables from the stage layer to the core layer using BTEQ scripts.
  • Performed extensive unit testing by creating test cases.
  • Supported Business users during UAT.
  • Worked within development methodologies such as Agile and Waterfall.
  • Supported QA Team by fixing defects logged in Quality Center.
  • Used code repositories such as GitHub and Nexus.
  • Developed Business Objects Universe by connecting to Teradata Core tables.
  • Promoted code to QA using AccuRev.
  • Built jobs and promoted code to QA using the Jenkins Cloudbase tool.
  • Supported the Hadoop App Support team by writing SIS documents and code drop requests.
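A minimal sketch of a date-format Pig UDF of the kind described above, written in Java against the Pig EvalFunc API; the input and output formats (MM/dd/yyyy to yyyy-MM-dd) are assumptions for illustration.

    // Illustrative Pig UDF: normalizes a date string to ISO format.
    import java.io.IOException;
    import java.text.ParseException;
    import java.text.SimpleDateFormat;
    import org.apache.pig.EvalFunc;
    import org.apache.pig.data.Tuple;

    public class ToIsoDate extends EvalFunc<String> {
        private final SimpleDateFormat source = new SimpleDateFormat("MM/dd/yyyy");
        private final SimpleDateFormat target = new SimpleDateFormat("yyyy-MM-dd");

        @Override
        public String exec(Tuple input) throws IOException {
            if (input == null || input.size() == 0 || input.get(0) == null) {
                return null;                         // pass NULLs through untouched
            }
            try {
                return target.format(source.parse(input.get(0).toString()));
            } catch (ParseException e) {
                return null;                         // unparseable dates become NULL
            }
        }
    }

From Pig Latin it would be registered and called roughly as REGISTER udfs.jar; B = FOREACH A GENERATE ToIsoDate(load_date); with the field name being a placeholder.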

Environment: Apache Hadoop, Java, Python, Eclipse, Hortonworks, Spark, HBase, Cassandra, MapReduce, Pig, Scala, Hive, Kafka, Teradata, Sqoop, HCatalog, Hue, Linux, Jenkins, GitHub, Nexus, Control-M
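Closing out this project, here is a minimal sketch of the NULL-check style of data profiling mentioned above, written as a map-only Java MapReduce mapper that reports per-column NULL counts through Hadoop counters; the pipe delimiter and counter group name are assumptions.

    // Illustrative map-only profiling mapper: counts empty/NULL fields per column position.
    import java.io.IOException;
    import org.apache.hadoop.io.LongWritable;
    import org.apache.hadoop.io.NullWritable;
    import org.apache.hadoop.io.Text;
    import org.apache.hadoop.mapreduce.Mapper;

    public class NullCheckMapper
            extends Mapper<LongWritable, Text, NullWritable, NullWritable> {

        @Override
        protected void map(LongWritable key, Text value, Context context)
                throws IOException, InterruptedException {
            // Split the delimited record; -1 keeps trailing empty fields.
            String[] fields = value.toString().split("\\|", -1);
            for (int i = 0; i < fields.length; i++) {
                if (fields[i].trim().isEmpty()) {
                    // One Hadoop counter per column position, readable from the job UI.
                    context.getCounter("NULL_CHECK", "COLUMN_" + i).increment(1);
                }
            }
        }
    }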

Confidential, Charlotte, NC

Sr. Hadoop Developer

Responsibilities:
  • Worked on different Big Data tools including Pig, Hive, HBase and Sqoop.
  • Coordinated with business customers to gather business requirements, interacted with other technical peers to derive technical requirements, and delivered the BRD and TDD documents.
  • Extensively involved in Design phase and delivered Design documents.
  • Involved in Testing and coordination with business in User testing.
  • Imported and exported data into HDFS and Hive using Sqoop.
  • Developed multiple POCs using Scala and deployed them on the Hadoop cluster; compared the performance of Spark with Hive and SQL/Teradata.
  • Wrote Hive jobs to parse the logs and structure them in tabular format to facilitate effective querying of the log data.
  • Involved in creating Hive tables, loading them with data and writing Hive queries that run internally as MapReduce jobs.
  • Implemented partitioning, dynamic partitions and bucketing in Hive for efficient data access (a sketch follows this list).
  • Experienced in defining job flows.
  • Used Hive to analyze the partitioned and bucketed data and compute various metrics for reporting.
  • Used Pig as an ETL tool to perform transformations, event joins and some pre-aggregations before storing the data on HDFS.
  • Loaded and transformed large sets of structured and semi-structured data.
  • Used GitHub as a code repository and version control tool.
  • Responsible for managing data coming from different sources.
  • Involved in creating Hive Tables, loading data and writing Hive queries.
  • Involved in Unit testing and delivered Unit test plans and results documents.
  • Exported data from HDFS environment into RDBMS using Sqoop for report generation and visualization purpose.
  • Worked on Oozie workflow engine for job scheduling.
  • Implemented SCD Type-2 tables in Hadoop/HBase.
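As a companion to the partitioning and bucketing bullet above, the sketch below creates a partitioned, bucketed Hive table and loads it with dynamic partitioning over the HiveServer2 JDBC driver; the host name, credentials, table and column names are placeholders, not details from the project.

    // Illustrative only: partitioned/bucketed Hive DDL and a dynamic-partition load via JDBC.
    import java.sql.Connection;
    import java.sql.DriverManager;
    import java.sql.Statement;

    public class CreatePartitionedTable {
        public static void main(String[] args) throws Exception {
            Class.forName("org.apache.hive.jdbc.HiveDriver");
            // HiveServer2 URL; host, port and database are placeholders.
            String url = "jdbc:hive2://hiveserver2.example.com:10000/default";
            try (Connection conn = DriverManager.getConnection(url, "hive", "");
                 Statement stmt = conn.createStatement()) {
                // Partition by load date, bucket by customer id for efficient access.
                stmt.execute(
                    "CREATE TABLE IF NOT EXISTS web_logs ("
                    + " customer_id BIGINT, url STRING, status INT)"
                    + " PARTITIONED BY (load_date STRING)"
                    + " CLUSTERED BY (customer_id) INTO 32 BUCKETS"
                    + " STORED AS ORC");
                // Dynamic partitioning derives load_date from the data itself.
                stmt.execute("SET hive.exec.dynamic.partition=true");
                stmt.execute("SET hive.exec.dynamic.partition.mode=nonstrict");
                stmt.execute(
                    "INSERT INTO TABLE web_logs PARTITION (load_date)"
                    + " SELECT customer_id, url, status, load_date FROM staging_web_logs");
            }
        }
    }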

Environment: Apache Hadoop, HDFS, MapReduce, HBase, Java, Scala, Linux, Sqoop, Hive, Pig, Python, NoSQL, Flume, Oozie, GitHub.

Confidential, Minneapolis, MN

Hadoop Developer

Responsibilities:
  • Installed, configured, and maintained Apache Hadoop clusters for application development and major components of Hadoop Ecosystem: Hive, Pig, HBase, Sqoop, Flume, Oozie and Zookeeper
  • Importing and exporting data into HDFS and Hive from different RDBMS using Sqoop
  • Experienced in defining job flows to run multiple MapReduce and Pig jobs using Oozie
  • Imported log files into HDFS using Flume and loaded them into Hive tables to query the data
  • Used HBase-Hive integration and wrote multiple Hive UDFs for complex queries
  • Involved in writing APIs to read HBase tables, cleanse the data and write it to another HBase table (a sketch follows this list)
  • Created multiple Hive tables, implemented Partitioning, Dynamic Partitioning and Buckets in Hive for efficient data access
  • Wrote multiple MapReduce programs in Java for data extraction, transformation and aggregation from multiple file formats including XML, JSON, CSV and other compressed file formats
  • Experienced in running batch processes using Pig Scripts and developed Pig UDFs for data manipulation according to Business Requirements
  • Experienced in writing programs using HBase Client API
  • Involved in loading data into HBase using HBase Shell, HBase Client API, Pig and Sqoop
  • Experienced in design, development, tuning and maintenance of NoSQL database
  • Developed unit test cases for Hadoop MapReduce jobs with MRUnit
  • Excellent experience in analyzing, designing, developing, testing and implementing ETL processes, including performance tuning and database query optimization
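A hedged sketch of the read-cleanse-write pattern over the HBase client API described in this project (written against the HBase 1.x-style API); the table, column family and qualifier names are placeholders.

    // Illustrative: scan one HBase table, cleanse a column value, write it to another table.
    import java.io.IOException;
    import org.apache.hadoop.conf.Configuration;
    import org.apache.hadoop.hbase.HBaseConfiguration;
    import org.apache.hadoop.hbase.TableName;
    import org.apache.hadoop.hbase.client.Connection;
    import org.apache.hadoop.hbase.client.ConnectionFactory;
    import org.apache.hadoop.hbase.client.Put;
    import org.apache.hadoop.hbase.client.Result;
    import org.apache.hadoop.hbase.client.ResultScanner;
    import org.apache.hadoop.hbase.client.Scan;
    import org.apache.hadoop.hbase.client.Table;
    import org.apache.hadoop.hbase.util.Bytes;

    public class CleanseCopy {
        public static void main(String[] args) throws IOException {
            Configuration conf = HBaseConfiguration.create();
            try (Connection connection = ConnectionFactory.createConnection(conf);
                 Table source = connection.getTable(TableName.valueOf("raw_events"));
                 Table target = connection.getTable(TableName.valueOf("clean_events"));
                 ResultScanner scanner = source.getScanner(new Scan())) {
                byte[] cf = Bytes.toBytes("d");
                byte[] col = Bytes.toBytes("event_type");
                for (Result row : scanner) {
                    byte[] raw = row.getValue(cf, col);
                    if (raw == null) {
                        continue;                     // skip rows missing the column
                    }
                    // "Cleansing" here is just trimming and lower-casing the value.
                    Put put = new Put(row.getRow());
                    put.addColumn(cf, col, Bytes.toBytes(Bytes.toString(raw).trim().toLowerCase()));
                    target.put(put);
                }
            }
        }
    }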

Environment: Apache Hadoop, HDFS, MapReduce, Hive, Pig, HBase, Sqoop, Flume, Oozie, Java, Linux, MySQL Server, MS SQL, SQL, PL/SQL, NoSQL.

Confidential

Java Developer

Responsibilities:
  • Transformation of XML to HTML documents using XSLT style sheet.
  • Developed frontend Modules using MVC architecture using JSF 2.0.
  • Used XSLT to develop templates and process XML data into a more user-friendly format.
  • Programming and development of modules involving Struts, JPA, Spring, AJAX, Servlets, JSP, JSTL, jQuery and JavaScript.
  • Optimization of Hibernate mapping in order to boost performance of the system.
  • High-level design of SOA components to complete end-to-end B2B integration.
  • Managed and deployed the application using JBoss Application Server 6.1 with the deployment manager in a clustered environment.
  • Developed views using JSPs and Struts tags; used the Tiles framework to improve UI flexibility and provide a single point of maintenance.
  • Developed the code for asynchronous update to web page using JavaScript and Ajax.
  • Developed application using JavaScript for Web pages to add functionality, validate forms, communicate with the server.
  • Used Spring IoC, writing JavaBean classes with get and set methods for each property to be configured by Spring.
  • Implemented SOAP and REST based web services using the Apache CXF framework.
  • Modified the configuration of the Spring Framework IoC container.
  • Used the Hibernate ORM framework as the persistence engine; actively engaged in mapping and Hibernate queries.
  • Involved in writing Hibernate mapping files (HBM files) and configuration files.
  • Used Log4j for logging Errors.
  • Using JUnit, extensively wrote test cases to test the application.
  • Implemented a logging mechanism using Log4j with the help of the Spring AOP framework (a sketch follows this list).
  • Implemented server-side validations using the Struts Validator framework.
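A minimal sketch of the Log4j-plus-Spring-AOP logging mechanism mentioned above, using AspectJ annotations; the service-layer package in the pointcuts is a placeholder.

    // Illustrative logging aspect; needs <aop:aspectj-autoproxy/> (or equivalent) in the Spring config.
    import org.apache.log4j.Logger;
    import org.aspectj.lang.JoinPoint;
    import org.aspectj.lang.annotation.AfterThrowing;
    import org.aspectj.lang.annotation.Aspect;
    import org.aspectj.lang.annotation.Before;

    @Aspect
    public class LoggingAspect {
        private static final Logger LOG = Logger.getLogger(LoggingAspect.class);

        // Log every entry into the (hypothetical) service layer.
        @Before("execution(* com.example.service..*(..))")
        public void logEntry(JoinPoint jp) {
            LOG.info("Entering " + jp.getSignature().toShortString());
        }

        // Log any exception thrown out of the service layer.
        @AfterThrowing(pointcut = "execution(* com.example.service..*(..))", throwing = "ex")
        public void logError(JoinPoint jp, Throwable ex) {
            LOG.error("Error in " + jp.getSignature().toShortString(), ex);
        }
    }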

Environment: Eclipse, NetBeans, NoSQL, PL/SQL Developer, FileZilla, PuTTY, SoapUI, Java, JavaScript, XML, HTML, JSP

Confidential

Java Developer

Responsibilities:
  • Gathered user requirements followed by analysis and design. Evaluated various technologies for the client.
  • Developed HTML and JSP to present the client-side GUI.
  • Involved in development of JavaScript code for client-side validations.
  • Designed and developed the HTML-based web pages for displaying the reports.
  • Developed java classes and JSP files.
  • Extensively used JSF framework.
  • Extensively used XML documents with XSLT and CSS to translate the content into HTML for presentation in the GUI.
  • Developed dynamic content of presentation layer using JSP.
  • Developed user-defined tags using XML.
  • Developed, Tested and Debugged the Java, JSP and EJB components using Eclipse.
  • Developed JSPs as the view, Servlets as the controller and EJBs as the model in the Struts framework (a sketch follows this list).
  • Created and implemented PL/SQL stored procedures, triggers.
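A minimal sketch of the Struts controller pattern described above: the Action (dispatched by Struts' controller servlet) prepares the model and forwards to a JSP view. In the project the model came from EJB session beans; a plain list stands in for it here, and the class and forward names are placeholders.

    // Illustrative Struts 1.x Action acting as the controller in the MVC flow described above.
    import java.util.Arrays;
    import java.util.List;
    import javax.servlet.http.HttpServletRequest;
    import javax.servlet.http.HttpServletResponse;
    import org.apache.struts.action.Action;
    import org.apache.struts.action.ActionForm;
    import org.apache.struts.action.ActionForward;
    import org.apache.struts.action.ActionMapping;

    public class ShowReportAction extends Action {
        @Override
        public ActionForward execute(ActionMapping mapping, ActionForm form,
                                     HttpServletRequest request,
                                     HttpServletResponse response) throws Exception {
            // In the original flow an EJB session bean would build this model.
            List<String> report = Arrays.asList("row 1", "row 2");
            request.setAttribute("report", report);
            // "reportPage" would map to a JSP view in struts-config.xml.
            return mapping.findForward("reportPage");
        }
    }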

Environment: Java, J2EE, EJB 2.1, JSP 2.0, Servlets 2.4, JNDI 1.2, JavaMail 1.2, JDBC 3.0, Struts, HTML, XML, CORBA, XSLT, JavaScript, Eclipse 3.2, Oracle 10g, WebLogic 8.1.
