
Hadoop Developer Resume


Beaverton, OR

PROFESSIONAL SUMMARY:

  • 8+ years of experience in the IT industry, including 5+ years of application development using Hadoop and related Big Data technologies. 
  • Good knowledge of Hadoop architecture and its components such as HDFS, MapReduce, JobTracker, TaskTracker, NameNode, and DataNode
  • Hands-on experience in installing, configuring, and using Hadoop ecosystem components like HDFS, MapReduce, Pig, Hive, Sqoop, Flume, HBase, Zookeeper, and Oozie
  • Extensive Hadoop experience in storage, query writing, and data processing and analysis. 
  • Good knowledge of writing MapReduce jobs through Pig, Hive, and Sqoop
  • Extended Pig and Hive core functionality by writing custom User Defined Functions (UDFs) for data analysis and file processing in Pig Latin scripts (a minimal UDF sketch follows this summary)
  • Experience in creating Hive internal/external tables using a shared metastore
  • Wrote Sqoop queries to import data into Hadoop from Teradata/SQL Server
  • Experience working with Apache Sqoop for relational data dumps
  • Knowledge of streaming data to HDFS using Flume
  • Worked on importing data into HBase using the HBase shell
  • Used Apache Oozie for scheduling and managing Hadoop jobs. 
  • Good understanding of Zookeeper for monitoring and managing Hadoop jobs. 
  • Extensive knowledge of RDBMSs such as Oracle, Microsoft SQL Server, and MySQL 
  • Extensive experience working on various databases and database script development using SQL and PL/SQL
  • Good understanding of NoSQL databases such as HBase, Cassandra and MongoDB
  • Experience with operating systems: Linux, Red Hat, and UNIX
  • Supported MapReduce programs running on the cluster and wrote custom MapReduce scripts for data processing in Java
  • Experience developing Spark jobs using Scala in a test environment for faster data processing, and used Spark SQL for querying. 
  • Extensive knowledge of object-oriented programming (OOP) techniques and J2EE technologies such as JSP and JDBC
  • Expertise in web technologies including HTML, XML, JDBC, JSP, JavaScript, AJAX, and SOAP
  • Extensive experience with IDEs such as Eclipse and NetBeans.
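The custom Hive and Pig UDFs mentioned above were written in Java. As a minimal, illustrative sketch only (the class name and logic are assumptions, not taken from any of the projects below), a Hive UDF of that kind could look like this, using Hive's classic org.apache.hadoop.hive.ql.exec.UDF API:

    import org.apache.hadoop.hive.ql.exec.UDF;
    import org.apache.hadoop.io.Text;

    // Illustrative UDF: trims and upper-cases a string column before analysis.
    public final class NormalizeText extends UDF {
        public Text evaluate(final Text input) {
            if (input == null) {
                return null;
            }
            return new Text(input.toString().trim().toUpperCase());
        }
    }

Such a UDF would typically be packaged in a JAR, added with ADD JAR, and registered with CREATE TEMPORARY FUNCTION before use in HiveQL.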

TECHNICAL SKILLS:

Programming Languages: Java, J2EE, C, SQL, PL/SQL, Pig Latin, Scala, HTML, XML

Hadoop Ecosystem: HDFS, MapReduce, Pig, Hive, Sqoop, Flume, YARN, Oozie, Apache Kafka, Zookeeper, HBase, Scala, Spark. 

Web Technologies: JDBC, JSP, JavaScript, AJAX, SOAP.

RDBMS: Oracle, Microsoft SQL Server, MySQL.

NoSQL: MongoDB, HBase, Apache Cassandra. 

Tools/IDEs: NetBeans, Eclipse, Git, PuTTY.

Operating System: Linux, Windows, UNIX, CentOS.

Methodologies: Agile, Waterfall.

Hadoop Testing: MRUnit, Quality Center, Hive testing.

PROFESSIONAL EXPERIENCE:

Confidential, Beaverton, OR

Hadoop developer

Responsibilities:

  • Coordinated with business customers to gather business requirements, interacted with technical peers to derive technical requirements, and delivered the BRD and TDD documents 
  • Worked on analyzing a Hadoop 2.7.2 cluster and different Big Data analytic tools, including Pig 0.16.0, Hive 2.0, the HBase 1.1.2 database, and Sqoop 1.4.6
  • Implemented Spark 2.0 using Python 3.6.0 and Spark SQL for faster processing of data. 
  • Implemented algorithms for real-time analysis in Spark
  • Used Spark for interactive queries, processing of streaming data, and integration with popular NoSQL databases for huge volumes of data. 
  • Involved in validating the aggregate table based on the rollup process documented in the data mapping; developed HiveQL and Spark RDD/SQL code and automated the flow using shell scripting 
  • Developed MapReduce programs to parse the raw data and store the refined data in tables (a mapper sketch follows this list). 
  • Designed and modified database tables and used HBase queries to insert and fetch data from tables. 
  • Involved in moving all log files generated from various sources to HDFS for further processing through Flume 1.7.0. 
  • Involved in loading and transforming large sets of structured, semi structured and unstructured data from relational databases into HDFS using Sqoop imports. 
  • Responsible for analyzing and cleansing raw data by performing Hive queries and running Pig scripts on data. 
  • Developed Pig Latin scripts to extract the data from the web server output files to load into HDFS
  • Created Hive tables, loaded data, and wrote Hive queries that run internally as MapReduce jobs. 
  • Used Oozie 1.2.1 operational services for batch processing and scheduling workflows dynamically, and created UDFs to store specialized data structures in HBase and Cassandra. 
  • Used Impala to read, write and query the Hadoop data in HDFS from HBase or Cassandra and configured Kafka to read and write messages from external programs. 
  • Optimized existing algorithms in Hadoop using SparkContext, Spark SQL, DataFrames, and pair RDDs
  • Hands-on experience in application development using Java, RDBMS, and Linux shell scripting
  • Involved in fetching brand data from social media applications like Facebook and Twitter. 
  • Developed and updated social media analytics dashboards on a regular basis. 
  • Performed data mining investigations to find new insights related to customers and involved in forecast based on the present results and insights derived from data analysis. 
  • Created a complete processing engine based on Cloudera's distribution, tuned for performance. 
  • Managed and reviewed Hadoop log files. 
  • Developed and generated insights based on brand conversations, which in turn helped drive brand awareness, engagement, and traffic to social media pages. 
  • Involved in identifying and analyzing defects, questionable function errors, and inconsistencies in output.
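As a hedged illustration of the raw-data parsing referenced in the list above, a minimal Java MapReduce mapper might look like the following; the tab-delimited input layout, field positions, and class name are assumptions for the sketch, not details from the project:

    import java.io.IOException;

    import org.apache.hadoop.io.LongWritable;
    import org.apache.hadoop.io.Text;
    import org.apache.hadoop.mapreduce.Mapper;

    // Illustrative mapper: keeps well-formed records and emits (userId, rawLine).
    public class RawLogMapper extends Mapper<LongWritable, Text, Text, Text> {
        @Override
        protected void map(LongWritable key, Text value, Context context)
                throws IOException, InterruptedException {
            String[] fields = value.toString().split("\t");
            if (fields.length >= 3) {                 // skip malformed rows
                context.write(new Text(fields[0]), value);
            }
        }
    }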

Environment: Hadoop, MapReduce, YARN, Hive, Pig, HBase, Oozie, Sqoop, Flume, Oracle 11g, Core Java, Cloudera HDFS, Eclipse.

Confidential, Fairfax, VA

Hadoop developer

Responsibilities:

  • Involved in the high-level design of the Hadoop 2.6.3 architecture for the existing data structure and problem statement; set up the 64-node cluster and configured the entire Hadoop platform. 
  • Implemented a data interface to get customer information using a REST API, pre-processed the data using MapReduce 2.0, and stored it into HDFS (Hortonworks) 
  • Extracted files from MySQL, Oracle, and Teradata through Sqoop 1.4.6, placed them in HDFS (Cloudera distribution), and processed them. 
  • Configured the Hive 1.1.1 metastore, which stores the metadata for Hive tables and partitions in a relational database. 
  • Worked with various HDFS file formats like Avro 1.7.6, SequenceFile, and JSON, and various compression formats like Snappy and bzip2
  • Developed efficient MapReduce programs for filtering out the unstructured data and developed multiple MapReduce jobs to perform data cleaning and preprocessing on Hortonworks. 
  • Developed Pig 0.15.0 UDFs to pre-process the data for analysis and migrated ETL operations into the Hadoop system using Pig Latin scripts and Python 3.5.1 scripts. 
  • Used Pig as an ETL tool to do transformations, event joins, filtering, and some pre-aggregations before storing the data into HDFS
  • Troubleshot, debugged, and resolved Talend issues while maintaining the health and performance of the ETL environment. 
  • Developed Hive queries for data sampling and analysis for the analysts. 
  • Loaded data into the cluster from dynamically generated files using Flume and from relational database management systems using Sqoop. 
  • Developed custom Unix SHELL scripts to do pre and post validations of master and slave nodes, before and after configuring the name node and datanodes respectively. 
  • Experienced in running Hadoop streaming jobs to process terabytes of formatted data using Python scripts
  • Developed small distributed applications in our projects using Zookeeper 3.4.7 and scheduled the workflows using Oozie 4.2.0
  • Developed complex Talend job mappings to load the data from various sources using different components. 
  • Developed an SCP simulator which emulates the behavior of intelligent networking and interacts with the SSF
  • Created HBase tables from Hive and wrote HiveQL statements to access HBase 0.98.12.1 table data. 
  • Proficient in designing row keys and schemas for the NoSQL database HBase, with knowledge of another NoSQL database, Cassandra
  • Used Hive to perform data validation on the data ingested using Sqoop and Flume; the cleansed data set was pushed into HBase
  • Created a MapReduce program which looks at current and prior versions of HBase data to identify transactional updates (see the scan sketch after this list). These updates are loaded into Hive external tables, which are in turn referenced by Hive scripts in transactional feed generation.
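A hedged sketch of how the current-versus-prior-version comparison above might read data through the HBase Java client API; the table name, column family, and qualifier are assumptions made for illustration:

    import java.util.List;

    import org.apache.hadoop.conf.Configuration;
    import org.apache.hadoop.hbase.Cell;
    import org.apache.hadoop.hbase.HBaseConfiguration;
    import org.apache.hadoop.hbase.TableName;
    import org.apache.hadoop.hbase.client.Connection;
    import org.apache.hadoop.hbase.client.ConnectionFactory;
    import org.apache.hadoop.hbase.client.Result;
    import org.apache.hadoop.hbase.client.ResultScanner;
    import org.apache.hadoop.hbase.client.Scan;
    import org.apache.hadoop.hbase.client.Table;
    import org.apache.hadoop.hbase.util.Bytes;

    // Illustrative scan: read the two most recent cell versions so the current
    // and prior values of each transaction row can be compared.
    public class VersionScan {
        public static void main(String[] args) throws Exception {
            Configuration conf = HBaseConfiguration.create();
            try (Connection conn = ConnectionFactory.createConnection(conf);
                 Table table = conn.getTable(TableName.valueOf("transactions"))) {
                Scan scan = new Scan();
                scan.setMaxVersions(2);               // current + prior version
                try (ResultScanner scanner = table.getScanner(scan)) {
                    for (Result row : scanner) {
                        List<Cell> versions =
                            row.getColumnCells(Bytes.toBytes("cf"), Bytes.toBytes("amount"));
                        // versions.get(0) is the newest cell; compare it with the prior one here
                    }
                }
            }
        }
    }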

Environment: Hadoop (Cloudera), HDFS, MapReduce, Hive, Scala, Python, Pig, Sqoop, WebSphere, Hibernate, Spring, Oozie, REST Web Services, AWS, Solaris, DB2, UNIX Shell Scripting, JDBC.

Confidential, Wickenburg, AZ

Hadoop developer

Responsibilities:

  • Executed Hive queries that helped in analysis of market trends by comparing the new data with EDW reference tables and historical data. 
  • Managed and reviewed Hadoop log files for the JobTracker, NameNode, secondary NameNode, DataNodes, and TaskTrackers. 
  • Tested raw market data and executed performance scripts on data to reduce the runtime. 
  • Involved in loading the created Files into HBase for faster access of large sets of customer data without affecting the performance. 
  • Imported and exported data between HDFS and RDBMS using Sqoop and Kafka. 
  • Executed the Hive jobs to parse the logs and structure them in relational format to provide effective queries on the log data. 
  • Created Hive tables (internal/external) for loading data and wrote queries that run internally in MapReduce to process the data. 
  • Developed Pig scripts for capturing data changes and processing records between new data and already existing data in HDFS. 
  • Created scalable, performant machine learning applications using Mahout. 
  • Populated HDFS and Cassandra with huge amounts of data using Apache Kafka (a producer sketch follows this list). 
  • Involved in importing of data from different data sources, and performed various queries using Hive, MapReduce, and Pig Latin. 
  • Involved in loading data from local file system to HDFS using HDFS Shell commands. 
  • Experience with UNIX shell scripts for processing and loading data from various interfaces to HDFS. 
  • Developed different components of the Hadoop ecosystem processing pipeline involving MapReduce and Hive.
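As an illustration only of the Kafka-based loading mentioned above, a minimal Java producer could look like the sketch below; the broker address and topic name are assumptions, and the downstream HDFS/Cassandra consumers are not shown:

    import java.util.Properties;

    import org.apache.kafka.clients.producer.KafkaProducer;
    import org.apache.kafka.clients.producer.Producer;
    import org.apache.kafka.clients.producer.ProducerRecord;

    // Illustrative producer: publishes one record per ingested line to a topic
    // that downstream consumers land into HDFS and Cassandra.
    public class IngestProducer {
        public static void main(String[] args) {
            Properties props = new Properties();
            props.put("bootstrap.servers", "broker1:9092");   // assumed broker address
            props.put("key.serializer",
                      "org.apache.kafka.common.serialization.StringSerializer");
            props.put("value.serializer",
                      "org.apache.kafka.common.serialization.StringSerializer");
            try (Producer<String, String> producer = new KafkaProducer<>(props)) {
                producer.send(new ProducerRecord<>("customer-events", "cust-1", "sample payload"));
            }
        }
    }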

Environment: Hadoop, HDFS, Pig, Hive, MapReduce, Sqoop, Big Data, Java, Flume, Kafka, YARN, HBase, Oozie, SQL scripting, Linux shell scripting, Mahout, Eclipse, and Cloudera.

Confidential, Madison, WI

Hadoop developer

Responsibilities:

  • Worked on distributed/cloud computing (MapReduce/Hadoop, Hive, Pig, HBase, Sqoop, Spark, Avro, Zookeeper, etc.) on Cloudera's distribution of Hadoop (CDH4) 
  • Installed and configured Hadoop MapReduce and HDFS, and developed multiple MapReduce jobs in Java for data cleaning and processing. 
  • Involved in installing Hadoop Ecosystem components. 
  • Imported and exported data into HDFS, Pig, Hive, and HBase using Sqoop. 
  • Responsible for managing data coming from different sources. 
  • Loaded data from dynamically generated files using Flume and from relational database management systems using Sqoop. 
  • Involved in gathering the requirements, designing, development and testing. 
  • Worked on loading and transforming large sets of structured and semi-structured data into the Hadoop system. 
  • Developed simple and complex MapReduce programs in Java for Data Analysis. 
  • Load data from various data sources into HDFS using Flume. 
  • Developed the Pig UDF'S to pre-process the data for analysis. 
  • Worked on Hue interface for querying the data. 
  • Created Hive tables to store the processed results in a tabular format. 
  • Developed Hive Scripts for implementing dynamic Partitions. 
  • Developed Pig scripts for data analysis and extended Pig's functionality by developing custom UDFs (a UDF sketch follows this list). 
  • Extensive knowledge of Pig scripts using bags and tuples. 
  • Experience in managing and reviewing Hadoop log files. 
  • Developed workflow in Oozie to automate the tasks of loading the data into HDFS and pre-processing with Pig. 
  • Exported analyzed data to relational databases using SQOOP for visualization to generate reports for the BI team. 
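The custom Pig UDFs referred to above follow Pig's EvalFunc contract; the sketch below is illustrative only, with an assumed name and trivial logic rather than anything taken from the project:

    import java.io.IOException;

    import org.apache.pig.EvalFunc;
    import org.apache.pig.data.Tuple;

    // Illustrative Pig UDF: returns the trimmed, lower-cased form of a field,
    // the kind of clean-up applied before analysis.
    public class CleanField extends EvalFunc<String> {
        @Override
        public String exec(Tuple input) throws IOException {
            if (input == null || input.size() == 0 || input.get(0) == null) {
                return null;
            }
            return input.get(0).toString().trim().toLowerCase();
        }
    }

In a Pig Latin script the JAR would be registered with REGISTER and the function invoked like any built-in, e.g. GENERATE CleanField(name).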

Environment: Hadoop (CDH4), UNIX, Eclipse, HDFS, Java, MapReduce, Apache Pig, Hive, HBase, Oozie, Sqoop, and MySQL.

Confidential

Hadoop developer

Responsibilities:

  • Installation, configuration, and upgrade of the Solaris and Linux operating systems.
  • Experience in Extraction, Transformation, and Loading (ETL) of data from multiple sources like Flat files, XML files, and Databases. 
  • Analyzed large data sets by running Hive queries and Pig scripts. 
  • Worked with the Data Science team to gather requirements for various data mining projects. 
  • Involved in creating Hive tables and loading and analyzing data using hive queries. 
  • Developed Simple to complex MapReduce Jobs using Hive and Pig. 
  • Involved in running Hadoop jobs for processing millions of records of text data. 
  • Worked with application teams to install operating system, Hadoop updates, patches, version upgrades as required. 
  • Developed multiple MapReduce jobs in Java for data processing (a driver sketch follows this list). 
  • Installed and configured Hive and wrote Hive User Defined Functions. 
  • Load and transform large sets of structured, semi structured and unstructured data using MapReduce programming. 
  • Used Sqoop import and export functionality to handle large data set transfers between the DB2 database and HDFS. 
  • Experience in writing Hive JOIN Queries. 
  • Used Flume to stream data and load it into the Hadoop cluster. 
  • Created MapReduce programs to process the data. 
  • Used Sqoop to move the structured data from MySQL to HDFS, Hive, Pig, and HBase. 
  • Used Pig predefined functions to convert fixed-width files to delimited files.
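As a hedged sketch of how the Java MapReduce jobs above are typically wired together, a minimal driver might look like the following; the job name, input/output paths, and the reuse of the RawLogMapper sketched in the first project are illustrative assumptions:

    import org.apache.hadoop.conf.Configuration;
    import org.apache.hadoop.fs.Path;
    import org.apache.hadoop.io.Text;
    import org.apache.hadoop.mapreduce.Job;
    import org.apache.hadoop.mapreduce.lib.input.FileInputFormat;
    import org.apache.hadoop.mapreduce.lib.output.FileOutputFormat;

    // Illustrative driver: configures and submits one map-only data-processing job.
    public class ProcessingDriver {
        public static void main(String[] args) throws Exception {
            Job job = Job.getInstance(new Configuration(), "data-processing");
            job.setJarByClass(ProcessingDriver.class);
            job.setMapperClass(RawLogMapper.class);     // mapper sketched in the first project
            job.setNumReduceTasks(0);                   // map-only cleanup pass
            job.setOutputKeyClass(Text.class);
            job.setOutputValueClass(Text.class);
            FileInputFormat.addInputPath(job, new Path(args[0]));
            FileOutputFormat.setOutputPath(job, new Path(args[1]));
            System.exit(job.waitForCompletion(true) ? 0 : 1);
        }
    }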

Environment: Hadoop, HDFS, MapReduce, Hive, Pig, Sqoop, Flume, Linux, Core Java, Scala, MySQL, Teradata.

Confidential

Java Developer

Responsibilities:

  • Competency in using XML web services over SOAP to transfer data to supply chain and domain-expertise monitoring systems. 
  • Worked with the Maven build tool for building JAR files. 
  • Used the Hibernate framework (ORM) to interact with the database. 
  • Knowledge of the Struts Tiles framework for layout management. 
  • Worked on the design, analysis, development, and testing phases of the application. 
  • Developed named HQL queries and Criteria queries for use in the application (a sketch follows this list). 
  • Developed user interface using JSP and HTML. 
  • Used JDBC for the Database connectivity. 
  • Involved in projects utilizing Java, Java EE web applications in the creation of fully-integrated client management systems. 
  • Consistently met deadlines as well as requirements for all production work orders. 
  • Executed SQL statements for searching contractors based on given criteria. 
  • Development and integration of the application using Eclipse IDE. 
  • Developed JUnit tests for server-side code. 
  • Involved in building, testing and debugging of JSP pages in the system. 
  • Involved in multi-tiered J2EE design utilizing the Spring (IoC) architecture and Hibernate. 
  • Involved in the development of front end screens using technologies like JSP, HTML, AJAX and JavaScript. 
  • Configured Spring-managed beans. 
  • Used the Spring Security API to configure security. 
  • Investigated, debugged, and fixed potential bugs in the implementation code. 
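As an illustration of the named-HQL-query work listed above, a minimal sketch follows; the entity, its fields, and the query name are assumptions for the example, not taken from the application:

    import java.util.List;

    import javax.persistence.Entity;
    import javax.persistence.Id;
    import javax.persistence.NamedQuery;

    import org.hibernate.Session;

    // Illustrative entity carrying a named HQL query.
    @Entity
    @NamedQuery(name = "Contractor.byCity",
                query = "from Contractor c where c.city = :city")
    public class Contractor {
        @Id private Long id;
        private String name;
        private String city;
    }

    // Typical lookup through the named query from an open Hibernate Session.
    class ContractorDao {
        @SuppressWarnings("unchecked")
        List<Contractor> findByCity(Session session, String city) {
            return session.getNamedQuery("Contractor.byCity")
                          .setParameter("city", city)
                          .list();
        }
    }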

Environment: Java, J2EE, JSP, Hibernate, Struts, XML Schema, SOAP, JavaScript, PL/SQL, JUnit, AJAX, HQL, HTML, JDBC, Maven, Eclipse.
