
Big Data Hadoop Developer Resume


Fort Worth, Texas

SUMMARY

  • 4 years of experience in Big Data engineering and analytics using Hadoop; working environment includes MapReduce, HDFS, Hive, Pig, HBase, ZooKeeper, Oozie, and Sqoop.
  • Experienced in processing large datasets of different forms, including structured, semi-structured and unstructured data.
  • Hands-on experience with Cloudera and with multi-node clusters on the Hortonworks Sandbox.
  • Expertise in designing tables in Hive and MySQL, and in importing and exporting data between relational databases and HDFS using Sqoop.
  • Experienced in data architecture, including data ingestion pipeline design, Hadoop architecture, data modeling, machine learning and advanced data processing.
  • Experience optimizing ETL workflows that process data arriving from multiple sources.
  • ETL data extraction, management, aggregation and loading into HBase.
  • Expertise in developing Pig Latin scripts and Hive Query Language (HiveQL).
  • Extensive knowledge of ZooKeeper for various types of centralized configuration.
  • Experienced in integrating various data sources and technologies, including Java, JDBC, RDBMS, shell scripting, spreadsheets and text files.
  • Experience in managing and reviewing Hadoop log files using Flume and Kafka; also developed Pig UDFs and Hive UDFs to pre-process data for analysis (a minimal UDF sketch follows this list).
  • Hands-on experience with Spark for handling streaming data.
  • Hands-on experience with Spring Tool Suite for development of Scala applications.
  • Shell scripting to load and process data from various Enterprise Resource Planning (ERP) sources.
  • Hands-on experience in writing Pig Latin scripts and running them through the Pig interpreter as MapReduce jobs.
  • Expertise in Hadoop components such as YARN, Pig, Hive, HBase, Flume and Oozie, along with shell scripting in Bash.
  • Good understanding of Hadoop architecture and its daemons, including NameNode, DataNode, JobTracker, TaskTracker and ResourceManager.
  • Hands-on experience in ingesting data into a data warehouse using various data loading techniques.
  • Good knowledge of data modeling and data mining to model data per business requirements.
  • Hands-on experience with the MapReduce and Pig programming models, and with installation and configuration of Hadoop, HBase, Hive, Pig, Sqoop and Flume using Linux commands.
  • Working knowledge of Agile and waterfall development models.
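
To illustrate the Hive UDF work mentioned above, below is a minimal sketch of a string-normalizing UDF in Scala; the class name and the normalization rule are hypothetical examples, not the actual functions developed.

    import org.apache.hadoop.hive.ql.exec.UDF
    import org.apache.hadoop.io.Text

    // Hypothetical Hive UDF that normalizes free-text fields before analysis.
    // Hive invokes evaluate() via reflection on each input row.
    class NormalizeText extends UDF {
      def evaluate(input: Text): Text = {
        if (input == null) null
        else new Text(input.toString.trim.toLowerCase)
      }
    }

Packaged into a JAR and added to the Hive session, such a class can be registered with CREATE TEMPORARY FUNCTION and then applied in HiveQL like any built-in function.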

TECHNICAL SKILLS

Big Data Platform: Hortonworks (HDP 2.2)/AWS (S3, EMR, EC2)/Cloudera (CDH3/CDH4)

Analytical Tools: D3JS, Tableau, R, Python

OLAP Concepts: Data warehousing, Data mining concepts

Apache Hadoop: HDFS, HBase, Pig, Hive, Sqoop, Kafka, Zookeeper, Oozie, Ambari, Spark SQL

Source Control: GitHub, VSS, TFS

Databases and NoSQL: MS SQL Server 2012/2008, Oracle 11g (PL/SQL) and MySQL 5.6, MongoDB

Development Methodologies: Agile and Waterfall

Development Tools: Eclipse, Toad, Visual Studio

Programming Languages: Java, .Net

Scripting Languages: JavaScript, JSP, Python, XML, HTML and Bash

PROFESSIONAL EXPERIENCE

Confidential, Fort Worth, Texas

Big Data Hadoop Developer

Responsibilities:

  • Responsible for building scalable distributed data solutions using Hadoop
  • Responsible for cluster maintenance, adding and removing cluster nodes, cluster monitoring and troubleshooting, managing and reviewing data backups and Hadoop log files
  • Continuous monitoring and managing the Hadoop cluster through Cloudera Manager
  • Installed Oozie workflow engine to run multiple Hive and Pig jobs
  • Developed a mechanism that loads a large (7 GB) CSV file into Hive and exports the results into multiple CSV files (see the sketch after this list)
  • Used Oozie workflow to automate the process
  • Observed a significant increase in speed compared to the traditional ETL process
  • Extracted and loaded data into Data Lake environment (Amazon S3) which was accessed by business users and data scientists.
  • Developed PIG scripts to transform the raw data into intelligent data as specified by business users.
  • Handled importing of data from various data sources, performed transformations using Hive, loaded data into HDFS and extracted data from Teradata into HDFS using Sqoop
  • Worked extensively with Sqoop for importing metadata from Oracle
  • Experience migrating MapReduce programs into Spark transformations using Spark and Scala
  • Configured Sqoop and developed scripts to extract data from MySQL into HDFS
  • Optimized Pig jobs by using different compression techniques and performance enhancers.
  • Optimized complex joins in Pig using techniques such as skewed joins and hash-based aggregations.
  • Worked on the Data Pipeline which is an orchestration tool for all our jobs that run on AWS
  • Worked on installing and configuring EC2 instances on Amazon Web Services (AWS) for establishing clusters on cloud
  • Wrote shell scripts and Python scripts for job automation
  • Assisted with the addition of Hadoop processing to the IT infrastructure
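
As a rough illustration of the CSV-to-Hive load and multi-file export described above, the sketch below expresses the same idea with Spark and Scala (both used in this role); the paths, table name and the region partition column are hypothetical, and the actual mechanism may have relied on Hive and Oozie alone.

    import org.apache.spark.sql.SparkSession

    object CsvToHiveSketch {
      def main(args: Array[String]): Unit = {
        // Spark session with Hive support so saveAsTable writes to the Hive metastore.
        val spark = SparkSession.builder()
          .appName("csv-to-hive-sketch")
          .enableHiveSupport()
          .getOrCreate()

        // Hypothetical HDFS input path; header and schema inference keep the sketch short.
        val raw = spark.read
          .option("header", "true")
          .option("inferSchema", "true")
          .csv("hdfs:///data/raw/transactions.csv")

        // Land the data in a Hive table for downstream HiveQL queries.
        raw.write.mode("overwrite").saveAsTable("staging.transactions")

        // Export one CSV directory per (hypothetical) region, mirroring the multi-file output.
        raw.write
          .partitionBy("region")
          .option("header", "true")
          .mode("overwrite")
          .csv("hdfs:///data/exports/transactions_by_region")

        spark.stop()
      }
    }

A job of this shape can then be scheduled as a single action in the Oozie workflows mentioned above.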

Environment: Hadoop, MapReduce, HDFS, Hive, Java, SQL, Cloudera Manager, Scala, Pig, Sqoop, Oozie, ZooKeeper, Teradata, PL/SQL, MySQL, Windows, HBase.

Confidential, Denver, CO

Hadoop Developer

Responsibilities:

  • Imported data from relational data sources such as Teradata and other RDBMS systems into HDFS using Sqoop.
  • Imported bulk data into HBase using MapReduce programs.
  • Performed analytics on time-series data stored in HBase using the HBase API.
  • Designed and implemented Incremental Imports into Hive tables.
  • Used a REST API to access HBase data and perform analytics.
  • Implemented Spark using Scala and Spark SQL for faster testing and processing of data.
  • Experienced in working with various data sources such as Teradata and Oracle; successfully loaded files from Teradata into HDFS, and from HDFS into Hive and Impala.
  • Experienced in running queries using Impala and used BI tools to run ad-hoc queries directly on Hadoop.
  • Experienced with batch processing of data sources using Apache Spark and Elasticsearch.
  • Developed wrappers using shell scripting for Hive, Pig, Sqoop and Scala jobs.
  • Performed advanced procedures such as text analytics and processing using Spark's in-memory computing capabilities with Scala (a minimal Spark SQL sketch follows this list).
  • Worked in Loading and transforming large sets of structured, semi structured and unstructured data.
  • Involved in collecting, aggregating and moving data from servers to HDFS using Apache Flume.
  • Wrote Hive jobs to parse the logs and structure them in tabular format to facilitate effective querying of the log data.
  • Involved in creating Hive tables, loading them with data and writing Hive queries that run internally as MapReduce jobs.
  • Developed Java RESTful web services to upload data from local storage to Amazon S3, list S3 objects and perform file manipulation operations.
  • Successfully ran all Hadoop MapReduce programs on Amazon Elastic MapReduce framework by using Amazon S3 for input and output.
  • Designed and developed Dashboards for Analytical purposes using Tableau.
  • Migrated ETL jobs to Pig scripts to perform transformations, joins and some pre-aggregations before storing the data in HDFS.
  • Worked on different file formats such as SequenceFiles, XML files and MapFiles using MapReduce programs.
  • Involved in unit testing and delivered unit test plans and results documents using JUnit and MRUnit.
  • Exported data from HDFS into RDBMS using Sqoop for report generation and visualization purposes.
  • Worked on Oozie workflow engine for job scheduling.
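
The following is a minimal sketch, in Scala, of the kind of Spark SQL aggregation used in this role for faster testing and processing of Hive-resident log data; the table and column names (logs.web_events, status, event_date, service) are hypothetical.

    import org.apache.spark.sql.SparkSession
    import org.apache.spark.sql.functions.{count, lit}

    object LogSummarySketch {
      def main(args: Array[String]): Unit = {
        val spark = SparkSession.builder()
          .appName("log-summary-sketch")
          .enableHiveSupport()
          .getOrCreate()
        import spark.implicits._

        // Hypothetical Hive table holding log records already parsed into columns.
        val logs = spark.table("logs.web_events")

        // The same GROUP BY a Hive job would run, expressed as Spark SQL
        // transformations so it can be iterated on and tested quickly.
        val dailyErrors = logs
          .filter($"status" >= 500)
          .groupBy($"event_date", $"service")
          .agg(count(lit(1)).as("error_count"))

        dailyErrors.write.mode("overwrite").saveAsTable("logs.daily_error_summary")
        spark.stop()
      }
    }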

Environment: CDH, Map Reduce, Hive, Spark, Oozie, Sqoop, Pig, Java, Rest API, Maven, MRUnit, Junit, Tableau, Cloudera, Python.

Confidential, Boca Raton, FL

SQL BI Developer

Responsibilities:

  • Responsible for interaction with business users, gathering requirements and managing the delivery
  • Involved in installation, configuration and administration of Tableau Server
  • Built and published customized interactive reports and dashboards, and scheduled reports using Tableau Server
  • Tracked the performance of sales representatives against their respective KPIs
  • Prepared dashboards using calculations, parameters, calculated fields, groups, sets and hierarchies in Tableau
  • Published Workbooks by creating user filters so that only appropriate users can view/edit them
  • Created users, sites, groups, projects, data connections and settings as a Tableau administrator
  • Created rich dashboards using Tableau Dashboard and prepared user stories to create compelling dashboards to deliver actionable insights
  • Validated Tableau dashboards before publishing them to Tableau Server
  • Implemented security for Tableau Server reports based on user community
  • Generated Dashboards with Quick filters, Parameters and sets to handle views more efficiently
  • Published workbooks by creating user filters so that only appropriate teams can view them
  • Analyzed the source data and handled it efficiently by modifying the data types
  • Generated context filters and used performance actions while handling huge volumes of data
  • Developed tableau dashboards and checked the performance using Performance Recording
  • Generated tableau dashboards with combination charts for clear understanding.
  • Analyzed backend data from SQL Server and Oracle to create effective dashboards.
  • Created Tableau Calculations (Table Calculations, Hide Columns, Creating/Using Parameter, Totals) and Formatting (Annotations, Layout Containers, Mark Labels, Rich Text Formatting) using Tableau Desktop
  • Created Workbooks and Projects, Database Views, Data Sources and Data Connections
  • Performed backup and restoration activities and monitored Tableau Server performance for environmental changes, software upgrades and patches

Environment: Oracle, SQL Server Integration Services, Transact-SQL, Tableau, Excel Charts

Confidential, Columbia, MD

SQL BI Developer

Responsibilities:

  • Worked with the client to understand and analyze business requirements and provide possible technical solutions.
  • Reviewed and modified software programs to ensure technical accuracy and reliability.
  • Translated business requirements into software applications and models.
  • Worked with database objects such as tables, views, synonyms, sequences and database links as well as custom packages tailored to business requirements.
  • Built complex queries using SQL and wrote stored procedures using PL/SQL.
  • Used Bulk Collections, Indexes, and Materialized Views to improve the query executions.
  • Ensured compliance with standards and conventions in developing programs.
  • Created SQL scripts for conversion of legacy data (including validations) and then loaded it into the tables.
  • Created complex Stored Procedures, Triggers, Functions, Indexes, Tables, Views, Joins and Other SQL code to implement business rules.
  • Performed System Acceptance Testing (SAT) and User Acceptance Testing (UAT) for databases. Performed unit testing and path testing for application.
  • Analyzed available data from MS Excel, MS Access and SQL server.
  • Involved in implementing the data integrity validation checks.
  • Resolved and troubleshot complex issues.

Environment: MS SQL Server 2005/2008, Windows 2003/2008, SSIS, SQL Server Management Studio, SQL Server Business Intelligence Studio, SQL Profiler, Microsoft Excel and Access.
