Hadoop Developer Resume
Portland, OR
SUMMARY
- 9+ years of total IT experience, including 4+ years of experience in Hadoop and Big Data.
- Hands-on experience in installing Big Data ecosystem components (Hive, Pig, Sqoop, Flume, Oozie) through the Cloudera distribution.
- Worked on open source Apache Hadoop and Cloudera Enterprise (CDH).
- Good at setting up a Hadoop cluster, troubleshooting cluster-related issues, adding or removing nodes from the cluster without data loss, etc.
- Good understanding of the MapReduce framework architectures (MRv1 and YARN).
- Good knowledge of HDFS architecture and configuration.
- Hands-on experience in cleansing semi-structured and unstructured data using Pig Latin scripts.
- Hands on experience in performing aggregations on data using Hive Query Language (HQL).
- Developed MapReduce programs in Java.
- Good experience in extending the core functionality of Hive and Pig by developing user-defined functions (UDFs) to provide custom capabilities to these languages.
- Used Sqoop to import and export data between HDFS and relational databases.
- Hands-on experience with data ingestion tools such as Sqoop, Flume, and Kafka.
- Experience in designing and executing UNIX scripts to implement cron jobs that execute Hadoop jobs (MapReduce, Pig, etc.)
- Experience in working with NoSQL databases like HBase, Cassandra, etc.
- Familiar with various data warehouse and data modeling techniques
- Worked on various compression codecs such as gzip, Snappy, etc.
- Worked on the open source distributed processing framework Apache Spark to achieve near-real-time query processing and iterative processing.
- Hands-on experience working with Avro, XML, JSON, Parquet and RCFile formats.
- Implemented Apache Sentry to enforce role-based authorization to data stored in the Hadoop distributed file system.
- Worked in the Healthcare, Retail and Telecom domains.
- Hands-on experience in using Spark SQL.
- Hands-on experience in database design, defining entity relationships, database analysis, and programming SQL and PL/SQL stored procedures.
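The Sqoop import workflow summarized above can be sketched as a small Python helper that assembles the command line; the JDBC URL, table name, and target directory below are placeholder values, not details from any actual project:

```python
# Sketch of building a Sqoop import command for moving a relational table
# into HDFS. The connection details are placeholders; a real script would
# read them from configuration and run the command with subprocess.
def build_sqoop_import(jdbc_url, table, target_dir, mappers=4):
    return [
        "sqoop", "import",
        "--connect", jdbc_url,          # JDBC URL of the source database
        "--table", table,               # source table to import
        "--target-dir", target_dir,     # HDFS directory for the output
        "--num-mappers", str(mappers),  # degree of parallelism
    ]

cmd = build_sqoop_import(
    "jdbc:oracle:thin:@db-host:1521:ORCL", "ORDERS", "/data/raw/orders"
)
# a real script would then call subprocess.run(cmd, check=True)
```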
TECHNICAL SKILLS
Hadoop/Big Data platform: HDFS, MapReduce, HBase, Spark, Hive, Pig, Zookeeper, Sqoop, and Kafka.
Hadoop distribution: Cloudera, Amazon Elastic MapReduce
Programming languages: Java, UNIX shell scripts, Python, Pig Latin, PL/SQL.
Operating Systems: Red Hat Linux, CentOS
Linux Experience: System Administration Tools, Apache
Application Software: SSH, telnet, FTP, terminal clients and Remote Desktop Connection.
Data Storage and Databases: Oracle 9i, 10g, 11g, MySQL, RDBMS, HBase
PROFESSIONAL EXPERIENCE
Confidential, Portland, OR
Hadoop Developer
Responsibilities:
- Performed the initial setup of the Hadoop platform and designed the data ingestion and validation approach with the ETL flow.
- Worked on extracting data from an Oracle database and loading it into Hive.
- Developed Spark jobs to extract data from different sources and load it into Hive tables.
- Used Spark Streaming to divide streaming data into batches as input to the Spark engine for batch processing.
- Developed Python automation scripts for data integrity checks and validations.
- Developed extraction logic using Sqoop scripts to move data from relational databases to HDFS.
- Developed complex transformations using Hive QL to build aggregate/summary tables.
- Optimized query performance by examining explain plans and tuning various Hive and Spark parameters.
- Developed UDFs to implement functions that were not present in Hive and Spark.
- Solved performance issues in Hive and Spark scripts by understanding how joins, grouping, and aggregation translate into MapReduce jobs.
- Developed deployment scripts for Development, QA and Production systems through Git.
- Developed workflows using Airflow.
Environment: Hadoop, HDFS, MapReduce, Hive, Airflow, SQL Developer, Oracle, PL/SQL, SVN, Eclipse, Java, Shell scripting, Spark SQL, Python, Unix and Tableau
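The data-integrity automation described in this role can be illustrated with a minimal Python sketch; the record layout and key field are hypothetical, and a real script would pull counts and keys from Oracle and Hive rather than use in-memory lists:

```python
# Minimal sketch of a data-integrity check: reconcile record counts and
# key coverage between a source extract and its Hive load. Data is inline
# here for illustration only.
def reconcile(source_rows, target_rows, key="id"):
    """Compare two lists of dict records and report discrepancies."""
    report = {
        "source_count": len(source_rows),
        "target_count": len(target_rows),
        "count_match": len(source_rows) == len(target_rows),
    }
    source_keys = {r[key] for r in source_rows}
    target_keys = {r[key] for r in target_rows}
    report["missing_in_target"] = sorted(source_keys - target_keys)
    report["unexpected_in_target"] = sorted(target_keys - source_keys)
    return report

source = [{"id": 1}, {"id": 2}, {"id": 3}]
target = [{"id": 1}, {"id": 3}]
result = reconcile(source, target)  # flags id 2 as missing in the target
```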
Confidential, Dallas, TX
Hadoop Developer
Responsibilities:
- Involved in installing, configuring and managing Hadoop ecosystem components like Hive, Pig, Sqoop and Flume.
- Designed an approach for Data Validation and Data Ingestion Framework.
- Worked on validating and converting raw data from various sources such as traditional databases, online banking, etc.
- Developed transformation logic using Hive queries to build dimension and fact tables.
- Developed schema in HBase for faster scans
- Used HBase for storing aggregated data used for reporting
- Worked on unit testing by creating test data and comparing actual results against expected results.
- Developed deployment scripts for production release.
- Wrote MapReduce jobs to perform operations like copying data on HDFS and defining job flows on an EC2 server, and to load and transform large sets of structured, semi-structured and unstructured data.
- Extended the core functionality of the Hive language by writing UDFs.
- Participated in daily Scrum calls and tracked day-to-day activities using Rally.
- Experienced in working in an Agile environment.
Environment: Hadoop, MapReduce, Yarn, Hive, Pig, HBase, Oozie, Sqoop, Flume, Oracle, Core Java, Cloudera, HDFS, Eclipse.
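Designing the HBase schema for faster scans, as mentioned above, mostly comes down to row-key layout. A sketch of one common pattern (a salt prefix to spread writes, plus a reversed timestamp so the newest rows sort first); the field names and bucket count are hypothetical:

```python
import zlib

# Sketch of an HBase row-key builder. A salt prefix spreads writes across
# region servers, and a reversed timestamp makes the newest rows sort
# first in a scan. Key layout: <salt>|<customer_id>|<reversed_ts>
NUM_SALT_BUCKETS = 16
MAX_TS = 10**13  # larger than any epoch-millis value we expect

def make_row_key(customer_id: str, event_ts_millis: int) -> str:
    # crc32 is deterministic across runs, unlike Python's built-in hash()
    salt = zlib.crc32(customer_id.encode()) % NUM_SALT_BUCKETS
    reversed_ts = MAX_TS - event_ts_millis  # newest-first ordering
    return f"{salt:02d}|{customer_id}|{reversed_ts:013d}"
```

Because keys are fixed-width, a lexicographic scan over one customer's prefix returns events newest-first without a full-table sort.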
Confidential, TX
Hadoop Developer
Responsibilities:
- Involved in the design and development phases of the Software Development Life Cycle.
- Imported data from an Oracle source and populated it into HDFS using Sqoop.
- Automated deployment of Hadoop clusters using AWS EMR and Python boto
- Developed a data pipeline using Flume and Kafka to store data into HDFS.
- Automated the process of extracting data from warehouses and weblogs by developing workflows and coordinator jobs in Oozie.
- Performed transformations such as event joins, filtering bot traffic, and some pre-aggregations using Pig.
- Set up and maintained the continuous integration server and live production server using Jenkins.
- Developed Python automation scripts for data integrity checks and validations.
- Converted data files into Parquet file format.
- Executed Hive queries on Parquet tables to perform data analysis to meet the business requirements.
- Implemented a POC with Spark SQL to interpret complex JSON records.
- Created table definitions and made the contents available as a schema-backed RDD.
- Developed business-specific custom UDFs in Hive and Pig.
- Optimized MapReduce code and Pig scripts, and performed performance tuning and analysis.
- Exported the aggregated data to Oracle using Sqoop for reporting on the dashboard.
Environment: Eclipse, CentOS Linux, HDFS, MapReduce, Hive, Spark, Kafka, Pig, Sqoop, Oracle, Oozie, Parquet, Flume, EMR, EC2, S3, RDS, CloudFormation
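The Pig transformations in this role (filtering bot traffic, then pre-aggregating) can be sketched in plain Python; the user-agent markers and record fields below are illustrative, not from any production pipeline:

```python
from collections import Counter

BOT_MARKERS = ("bot", "crawler", "spider")  # illustrative patterns only

def is_bot(user_agent: str) -> bool:
    """Crude bot check: substring match on a lowercased user agent."""
    ua = user_agent.lower()
    return any(marker in ua for marker in BOT_MARKERS)

def page_views_by_url(events):
    """Drop bot traffic, then pre-aggregate page views per URL."""
    counts = Counter()
    for event in events:
        if not is_bot(event["user_agent"]):
            counts[event["url"]] += 1
    return dict(counts)

events = [
    {"url": "/home", "user_agent": "Mozilla/5.0"},
    {"url": "/home", "user_agent": "Googlebot/2.1"},
    {"url": "/cart", "user_agent": "Mozilla/5.0"},
]
```

In Pig this same shape is a FILTER followed by a GROUP and COUNT; the Python version just makes the two stages explicit.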
Confidential, Dallas, TX
Oracle PL/SQL Developer
Responsibilities:
- Performed database and query tuning and performance monitoring.
- Involved in requirements analysis, design, data mapping, use cases and client interaction, including analyzing the functional spec and designing the technical spec.
- Provided technical assistance for development; designed and developed logical and physical data models of the schema.
- Gathered requirements and defined the objectives of the database; gathered data, organized it into tables, specified primary keys, and created relationships among tables (relational database design).
- Monitored growth of data and managed space allocation accordingly.
- Creating and managing schema objects such as tables, views, indexes, procedures, triggers & maintaining Referential Integrity.
- Wrote extensive PL/SQL code for stored procedures, functions, packages, libraries and other database triggers, and wrote interfaces to transfer data to and from various external systems.
- Used %TYPE and %ROWTYPE for anchoring the variables to the database data types.
- Used packages such as DBMS_JOB to determine whether there are jobs to be performed or changes made to a program or procedure.
- Defined backup strategies and scheduled backups; backed up and restored databases whenever necessary.
- Actively involved in coding SQL statements in PL/SQL.
- Developed various Complex Queries, Views for Generating Reports.
- Fine-tuned the logic in Procedures, Functions and Triggers for optimum performance.
- Used Quest TOAD to tune query performance.
- Performed off-site, on-call 24x7 production support for Oracle 9i database administration activities.
- Troubleshot performance issues using Explain Plan and TKPROF.
- Used both implicit and explicit cursors to process multiple rows within a PL/SQL block, and applied business rules to them.
- Created public synonyms on the database links for easy reuse.
Environment: Oracle 9i, PL/SQL, C++, Windows 2000, SQL*Loader, TOAD 8.5, Forms 6i, Reports 6i, Windows.
Confidential, Atlanta, GA
Oracle PL/SQL Developer
Responsibilities:
- Performed review of the business process, involved in Requirements Analysis, System Design documents, Flows, Test Plan preparations and Development of Business Process and Work Flow charts.
- Developed database objects like Tables, Views, Indexes, Synonyms and Sequences.
- Involved in creation of tables, Join conditions, correlated sub queries, nested queries, views, sequences, synonyms for the business application development.
- Created Materialized views to calculate aggregated data, such as sum of monthly revenue.
- Developed complex ORACLE stored procedures and packages to calculate borrower’s debt ratio, property taxes, mortgage insurance, regular/ irregular payments, etc.
- Loaded traffic data logs into staging tables using SQL*Loader and shell scripts.
- Wrote PL/SQL code for exception handling and duplicate-data maintenance.
- Used triggers to customize error conditions.
- Developed / modified PL/SQL code like stored procedures, packages for interfaces.
- Developed UNIX shell scripts and PL/SQL procedures to extract and load data for month-end batch processing.
- Developed PL/SQL code for updating payment terms.
- Supported the DBA team in the data migration from Oracle 9i to 10g.
- Creation of Forms, Pop-up Menus and maintaining Master Detail relationship.
- Used various LOVS (List of Values) and record groups at various points in runtime.
- Involved in the design and development of User Interfaces using Forms 6i, Reports 6i and coding modules in PL/SQL.
- Development of PL/SQL program units and sharing them among multiple applications for processing business logic in the database.
- Created some routines (Before-After, Transform function) used across the project.
- Testing all forms, PL/SQL code for logic correction.
Environment: Oracle 9i (SQL, PL/SQL), SQL*Loader, Toad, Oracle Forms/Reports 6i, UNIX, DataStage.
Confidential
Oracle PL/SQL Developer
Responsibilities:
- Wrote scripts for creating Tables, Sequences, Views, Synonyms and Other Database Objects.
- Wrote PL/SQL Packages, Procedures, Functions, Triggers, Cursors, Collections and Anonymous block and other objects for the back end processing of the proposed database design.
- Involved in writing complex SQL to be processed through PL/SQL units.
- Designed the UI with HTML and JavaScript using mod_plsql.
- Performed client-side validation using JavaScript.
- Involved in Performance Tuning for SQL Queries.
- Performed code reviews of fellow developers' code.
- Conducted PL/SQL training sessions for colleagues to educate them about the features available in PL/SQL.
- Provided support to the testing, UAT, and production support teams.
- Delivered effective solutions to fix production issues.
Environment: Oracle 10g, Java, HTML, JavaScript, Linux, SQL*Plus, SQL*Loader, Windows XP.