Hadoop Developer Resume
WA
SUMMARY
- 8+ years of experience with emphasis on Big Data technologies and the development, administration, and design of Java-based enterprise applications.
- Expertise in Hadoop Development/Administration.
- Expertise in Software Development Life Cycle (Requirements Analysis, Design, Development, Testing, Deployment and Support).
- Sound knowledge of database design features including ER diagrams, normalization, tables, temporary tables, constraints, keys, data dictionaries, and data integrity.
- Experienced in setting up standards and processes for Hadoop based application design and implementation.
- Hands-on experience in installing, configuring and using ecosystem components like Hadoop MapReduce, HDFS, HBase, Oozie, Hive, Pig, and Flume.
- Experienced in developing Map Reduce programs using Apache Hadoop for working with Big Data.
- Developed Sqoop Scripts to extract data from DB2 EDW source databases onto HDFS.
- Experience in various data transformation and analysis tools like Map Reduce, Pig and Hive to handle files in multiple formats (JSON, text, XML, binary, logs, etc.).
- Extensive Experience on importing and exporting data using stream processing platforms like Flume and Kafka.
- Worked with Oracle and Teradata for data import/export operations from different data marts.
- Configured internode communication between Cassandra nodes and client using SSL encryption.
- Expertise in NoSQL databases including HBase.
- Developed Spark applications using Python (PySpark); a representative sketch follows this summary.
- Expertise in importing and exporting data using Sqoop from HDFS to relational database systems/mainframe and vice versa.
- Expertise with the tools in the Hadoop ecosystem, including Pig, Hive, HDFS, MapReduce, Sqoop, Spark, Kafka, YARN, Oozie, and ZooKeeper.
- Excellent knowledge of Hadoop ecosystem components such as HDFS, JobTracker, TaskTracker, NameNode, DataNode, and the MapReduce programming paradigm.
- Experience in designing and developing applications in Spark using Scala to compare the performance of Spark with Hive and SQL/Oracle
- Hands on experience in application development using Java, RDBMS, and Linux shell scripting.
- Performed data analysis using Hive and Pig.
- Loading log data into HDFS using Flume.
- Expertise in using Sqoop, ZooKeeper and Cloudera Manager.
- Expertise in back-end procedure development for RDBMS database applications using SQL and PL/SQL. Hands-on experience writing queries, stored procedures, functions, and triggers using SQL.
- Excellent communication, interpersonal, analytical skills, and strong ability to perform as part of team.
- Exceptional ability to learn new concepts.
- Hard working, Quick learner and enthusiastic.
- End to end working experience in all the SDLC phases.
- Well versed with all stages of the Software Development Life Cycle (SDLC), i.e. requirements gathering and analysis, design, and implementation using SQL mappings, sessions, and workflows.
- An excellent team player and self-starter with good communication skills and proven abilities to finish tasks before target deadlines.
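A minimal PySpark sketch in the spirit of the Spark development noted above: reading raw data from HDFS, cleansing it, and writing it to a partitioned Hive table. The paths, table, and column names are assumptions for illustration, not details from any specific project.

```python
from pyspark.sql import SparkSession
from pyspark.sql import functions as F

# Hypothetical example: read raw JSON logs from HDFS, cleanse them, and
# write the result to a partitioned Hive table. All identifiers are assumed.
spark = (SparkSession.builder
         .appName("log-cleansing")
         .enableHiveSupport()
         .getOrCreate())

raw = spark.read.json("hdfs:///data/raw/transactions/")   # assumed input path

cleaned = (raw
           .dropDuplicates(["txn_id"])                     # assumed key column
           .filter(F.col("amount").isNotNull())
           .withColumn("txn_date", F.to_date("event_ts")))

(cleaned.write
        .mode("overwrite")
        .partitionBy("txn_date")
        .saveAsTable("analytics.transactions_clean"))      # assumed Hive table
```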
TECHNICAL SKILLS
Programming Languages: SQL, PL/SQL, T-SQL, C, C++, C#, Java, HTML, CSS.
Databases: NoSQL (HBase), MySQL, MS SQL Server.
IDEs & Utilities: Eclipse, JCreator, NetBeans.
WebDev.Technologies: HTML, XML.
Protocols: TCP/IP, HTTP and HTTPS.
Operating Systems: Windows 7/8/10, UNIX, Linux, Red Hat.
ETL tools: Tableau, VMplayer.
Hadoop ecosystem: Hadoop and MapReduce, Sqoop, Hive, Pig, HBase, HDFS, ZooKeeper, Oozie, and Kafka.
PROFESSIONAL EXPERIENCE
Confidential, WA
Hadoop Developer
Responsibilities:
- Installed and configured Hadoop Map-Reduce, HDFS and developed multiple Map-Reduce jobs in Java for data cleansing and pre-processing.
- Evaluated business requirements and prepared detailed specifications that follow project guidelines required to develop written programs.
- Responsible for building scalable distributed data solutions using Hadoop.
- Analysed large amounts of data sets to determine optimal way to aggregate and report on it.
- Developed and executed hive queries for de-normalizing the data.
- Developed the Apache Storm, Kafka, and HDFS integration project to do real-time data analysis.
- Responsible for executing hive queries using Hive Command Line, Web GUI HUE and Impala to read, write and query the data into HBase.
- Moved data from HDFS to Cassandra using Map-Reduce and the BulkOutputFormat class.
- Developed bash scripts to bring T-log files from the FTP server and then process them to load into Hive tables.
- Developed Oozie workflows for daily incremental loads, which pull data from Teradata and then import it into Hive tables.
- Worked on analysing data with Hive and Pig.
- Handled importing of data from various data sources, performed transformations using Hive, Map-Reduce, and loaded data into HDFS.
- Importing and exporting data into HDFS using Sqoop.
- Wrote Map-Reduce code to convert unstructured data into semi-structured data and loaded it into Hive tables.
- Responsible for cluster maintenance, adding and removing cluster nodes, cluster monitoring and troubleshooting.
- Worked extensively in creating Map-Reduce jobs to power data for search and aggregation
- Worked extensively with Sqoop for importing metadata from Oracle.
- In-depth understanding/knowledge of Hadoop architecture and various components such as HDFS, JobTracker, TaskTracker, NameNode, DataNode, MRv1 and MRv2 (YARN).
- Extensively used Pig for data cleansing.
- Created partitioned tables in Hive (a sketch of this pattern follows this list).
- Managed and reviewed Hadoop log files.
- Involved in creating Hive tables, loading them with data, and writing Hive queries which run internally as Map-Reduce jobs.
- Used Hive to analyse the partitioned and bucketed data and compute various metrics for reporting.
- Installed and configured Pig and wrote Pig Latin scripts.
- Developed Pig Latin scripts to extract the data from the web server output files to load into HDFS.
- Created HBase tables to store data in various formats coming from different portfolios.
- Developed Map-Reduce jobs to automate transfer of data from HBase.
- Used SVN, Tortoise SVN version control tools for code management (check-ins, checkouts and synchronizing the code with repository).
- Worked hands on with ETL process.
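A minimal sketch of the partitioned Hive table pattern referenced above, driven here through the Hive command line from a Python script for consistency with the other examples. The database, table, column names, and file format are assumptions for illustration.

```python
import subprocess

# Hypothetical illustration: create a daily-partitioned Hive table and load
# one day's staged data into its partition via the Hive CLI. All identifiers
# are assumed for the example.
hql = """
CREATE TABLE IF NOT EXISTS edw.tlog_events (
    store_id INT,
    txn_id   STRING,
    amount   DOUBLE
)
PARTITIONED BY (load_date STRING)
STORED AS ORC;

INSERT OVERWRITE TABLE edw.tlog_events
PARTITION (load_date = '2016-05-01')
SELECT store_id, txn_id, amount
FROM edw.tlog_events_staging;
"""

subprocess.run(["hive", "-e", hql], check=True)
```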
Environment: Hadoop, Map-Reduce, Hive, HBase, HDFS, Java 6 (JDK 1.6), Cloudera, PL/SQL, SQL*Plus, UNIX Shell Scripting, Eclipse.
Confidential
Hadoop Developer
Responsibilities:
- Worked with business teams and created Hive queries for ad hoc access.
- Continuously monitored and managed the Hadoop cluster through Cloudera Manager.
- Involved in review of functional and non-functional requirements
- Responsible to manage data coming from different sources.
- Installed and configured Hadoop ecosystem components such as HBase, Flume, Pig and Sqoop.
- Loaded daily data from websites to Hadoop cluster by using Flume.
- Involved in loading data from UNIX file system to HDFS.
- Developed a batch processing pipeline to process data using Python and Airflow. Scheduled Spark jobs using Airflow.
- Created a new Airflow DAG to find popular items in Redshift and ingest them into the main Postgres DB via a web service call (a skeleton DAG in this style follows this list).
- Worked on migrating Map-Reduce programs into Spark transformations using Spark and Scala.
- Involved in writing, testing, and running Map-Reduce pipelines using Apache Crunch.
- Creating Hive tables and working on them using Hive QL.
- Created complex Hive tables and executed complex Hive queries on Hive warehouse.
- Wrote Map-Reduce code to convert unstructured data to semi structured data.
- Used Pig for extraction, transformation, and loading of semi-structured data.
- Installed and configured Hive and wrote Hive UDFs.
- Developed Hive queries for the analysts.
- Developed workflow in Oozie to automate the tasks of loading the data into HDFS and pre-processing with Pig.
- Provided cluster coordination services through ZooKeeper.
- Collected log data from web servers and integrated it into HDFS using Flume.
- Worked on Hive to expose data for further analysis and to transform files from different analytical formats to text files.
- Designed and implemented Map-Reduce jobs to support distributed data processing.
- Supported Map-Reduce programs running on the cluster.
- Involved in HDFS maintenance and loading of structured and unstructured data.
- Wrote Map-Reduce jobs using the Java API.
- Designed NoSQL schemas in HBase.
- Wrote the shell scripts to monitor the health check of Hadoop daemon services and respond accordingly to any warning or failure conditions.
- Involved in Hadoop cluster tasks like adding and removing nodes without any effect on running jobs and data.
- Developed Pig UDFs to pre-process the data for analysis.
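A skeleton Airflow DAG in the spirit of the pipeline described above. The DAG name, schedule, and task callable are assumptions; the actual Redshift query and Postgres ingestion logic are not reproduced here. Imports follow Airflow 2.x conventions.

```python
from datetime import datetime, timedelta

from airflow import DAG
from airflow.operators.python import PythonOperator


def find_and_ingest_popular_items():
    # Placeholder for the Redshift query and the web-service call that
    # ingests the results into Postgres; the real logic is not shown.
    pass


default_args = {
    "owner": "data-eng",              # assumed owner
    "retries": 1,
    "retry_delay": timedelta(minutes=5),
}

with DAG(
    dag_id="popular_items_daily",     # assumed DAG name
    default_args=default_args,
    start_date=datetime(2017, 1, 1),
    schedule_interval="@daily",
    catchup=False,
) as dag:
    PythonOperator(
        task_id="find_and_ingest_popular_items",
        python_callable=find_and_ingest_popular_items,
    )
```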
Environment: Hadoop, Map-Reduce, HDFS, Hive, Pig, HBase, Java 6, Cloudera, Linux, XML, MySQL, MySQL Workbench, Eclipse, Cassandra.
Confidential, Arlington, TX
Hadoop Developer
Responsibilities:
- Involved in review of functional and non-functional requirements.
- Installed and configured Pig and wrote Pig Latin scripts.
- Wrote Map-Reduce jobs using Pig Latin.
- Involved in ETL, data integration and migration.
- Imported data using Sqoop to load data from Oracle to HDFS on a regular basis (a command-level sketch follows this list).
- Developed scripts and batch jobs to schedule various Hadoop programs.
- Wrote Hive queries for data analysis to meet the business requirements.
- Creating Hive tables and working on them using Hive QL.
- Importing and exporting data into HDFS from Oracle Database and vice versa using Sqoop.
- Experienced in defining job flows.
- Gained good experience with the NoSQL database HBase.
- Involved in creating Hive tables, loading the data, and writing Hive queries that run internally as Map-Reduce jobs.
- Developed a custom File System plugin for Hadoop so it can access files on Data Platform.
- The custom File System plugin allows Hadoop Map-Reduce programs, HBase, Pig and Hive to work unmodified and access files directly.
- Designed and implemented a Map-Reduce-based large-scale parallel relation-learning system.
- Extracted feeds from social media sites such as Facebook and Twitter using Python scripts.
- Set up and benchmarked Hadoop/HBase clusters for internal use.
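A minimal sketch of the kind of Sqoop import described above, wrapped in a Python script so it can be scheduled as a batch job. The JDBC connection string, credentials file, table, and target directory are placeholders, not details from an actual environment.

```python
import subprocess

# Hypothetical Sqoop import from Oracle into HDFS. Every value below is a
# placeholder for illustration.
sqoop_cmd = [
    "sqoop", "import",
    "--connect", "jdbc:oracle:thin:@//dbhost:1521/ORCL",
    "--username", "etl_user",
    "--password-file", "/user/etl/.oracle.pwd",
    "--table", "SALES.ORDERS",
    "--target-dir", "/data/raw/orders",
    "--num-mappers", "4",
    "--fields-terminated-by", "\\t",
]

subprocess.run(sqoop_cmd, check=True)
```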
Environment: Hadoop, Map-Reduce, HDFS, Java, Cloudera distribution of Hadoop, Pig, HBase, UNIX.
Confidential
PL/SQL developer
Responsibilities:
- Wrote Stored Procedures in PL/SQL.
- Defragmented tables and worked on partitioning, compression, and indexes for improved performance and efficiency.
- Involved in table redesign with implementation of partitioned tables and partitioned indexes to make the database faster and easier to maintain.
- Used the SQL Server SSIS tool to build high-performance data integration solutions, including extraction, transformation, and load packages for data warehousing.
- Extracted data from the XML file and loaded it into the database.
- Created and modified SQL*Plus, PL/SQL and SQL*Loader scripts for data conversions.
- Worked on XML along with PL/SQL to develop and modify web forms.
- Designed data models and design specifications, and analyzed dependencies.
- Created indexes on tables to improve performance by eliminating full table scans, and created views to hide the actual tables and reduce the complexity of large queries.
- Involved in creating UNIX shell scripts.
- Maintaining Logical and Physical structure of the database.
- Created tablespaces, tables, views, and scripts for automating database activities.
- Coded various stored procedures, packages and triggers to incorporate business logic into the application (a minimal sketch follows this list).
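Purely for consistency with the other sketches, a minimal Python (cx_Oracle) illustration of creating and invoking a simple PL/SQL stored procedure of the kind described above. The connection details, table, and procedure are assumptions for the example; the original work was done directly in PL/SQL tools such as SQL*Plus and Toad.

```python
import cx_Oracle

# Hypothetical example: create and call a simple PL/SQL procedure.
# Credentials, DSN, and schema objects are placeholders.
conn = cx_Oracle.connect("app_user", "app_password", "dbhost:1521/ORCL")
cur = conn.cursor()

cur.execute("""
    CREATE OR REPLACE PROCEDURE update_order_status (
        p_order_id IN NUMBER,
        p_status   IN VARCHAR2
    ) AS
    BEGIN
        UPDATE orders
           SET status = p_status
         WHERE order_id = p_order_id;
        COMMIT;
    END update_order_status;
""")

# Invoke the procedure for a single order.
cur.callproc("update_order_status", [1001, "SHIPPED"])
conn.close()
```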
Environment: Oracle 9i, 10g, PL/SQL, Erwin 4.1, Oracle Designer 2000, Windows 2000, Toad, SQL*Plus.