Hadoop Developer Resume
Renton, WA
SUMMARY
- 8 years of experience in the IT industry across all phases of the Software Development Life Cycle (SDLC).
- Extensive experience with mapping, analysis, transformation and support of application software.
- Experience with Hadoop distributions such as Cloudera 5.3 (CDH5, CDH3) and Hortonworks.
- Understanding of and experience with cloud infrastructure services such as Microsoft Azure.
- Experience with Azure big data technologies such as Azure Data Lake and HDInsight.
- Provisioned and configured proof-of-concept (POC) environments for MapReduce, Hive, Oozie, Flume, HBase and other major components of the Hadoop distributed system.
- Involved in developing Spark/Scala scripts for change data capture (CDC) and delta record processing between newly arrived data and existing data in HDFS and Blob Storage (see the first sketch after this summary).
- Experience implementing and automating models created in Spark, Scala and Hive.
- Experience working with code repositories and continuous integration in Git.
- Worked with multiple databases, including RDBMS technologies and NoSQL stores.
- Implemented partitioning, dynamic partitioning and bucketing in Hive to compute data metrics (see the second sketch after this summary).
- Performed data analytics using Hive and Spark/Scala for the data architects on our team.
- Experience creating Spark jobs and automating them with shell scripts.
- Experience exporting data from MongoDB and other sources to Azure Storage using Spark/Scala, and transforming it with Hive, Pig and Spark.
- Worked with structured, semi-structured and unstructured data formats.
- Worked on Oozie workflow engine for job scheduling.
- Developed UNIX shell scripts for creating reports from Hive data and automated them using the CRON job scheduler.
- Experience using the Ambari UI to monitor the Hadoop cluster.
- Experience monitoring Hadoop log files.
- Extensively worked on Spark with Scala on the cluster for analytics; built advanced analytical applications on top of Hadoop using Spark with Hive and SQL.
- Experience with data visualization tools such as Power BI and Tableau.
- Loaded data into Power BI via Azure Blob Storage and Spark clusters.
- Responsible for generating actionable insights from complex data to drive real business results for various applications teams.
- Quick to learn new concepts; hardworking and a Hadoop enthusiast.
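A minimal Spark/Scala sketch of the change-data-capture and delta-merge work described in this summary. The paths and the customer_id key are hypothetical placeholders, and both snapshots are assumed to share one schema:

    import org.apache.spark.sql.SparkSession
    import org.apache.spark.sql.functions.col

    object DeltaMerge {
      def main(args: Array[String]): Unit = {
        val spark = SparkSession.builder().appName("cdc-delta-merge").getOrCreate()

        // Current snapshot and the newly arrived batch (placeholder paths).
        val existing = spark.read.parquet("hdfs:///data/customers/current")
        val arrived  = spark.read.parquet("hdfs:///data/customers/incoming")

        // Keep existing rows whose key is not superseded by the new batch,
        // then union in the arrived records: an upsert keyed on customer_id.
        val merged = existing
          .join(arrived.select(col("customer_id")), Seq("customer_id"), "left_anti")
          .unionByName(arrived)

        merged.write.mode("overwrite").parquet("hdfs:///data/customers/merged")
        spark.stop()
      }
    }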
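And a sketch of the Hive dynamic-partitioning bullet, issued as HiveQL through spark.sql; the events tables and columns are invented for illustration, and bucketing would add a CLUSTERED BY (user_id) INTO 32 BUCKETS clause to the same DDL:

    import org.apache.spark.sql.SparkSession

    object HiveDynamicPartitions {
      def main(args: Array[String]): Unit = {
        val spark = SparkSession.builder()
          .appName("hive-dynamic-partitions")
          .enableHiveSupport()
          .getOrCreate()

        // Both settings are required for dynamic-partition inserts.
        spark.sql("SET hive.exec.dynamic.partition=true")
        spark.sql("SET hive.exec.dynamic.partition.mode=nonstrict")

        spark.sql("""CREATE TABLE IF NOT EXISTS events_by_day
                     (user_id BIGINT, action STRING)
                     PARTITIONED BY (event_date STRING) STORED AS ORC""")

        // Hive derives each row's partition from the trailing SELECT column.
        spark.sql("""INSERT OVERWRITE TABLE events_by_day PARTITION (event_date)
                     SELECT user_id, action, event_date FROM events_raw""")

        spark.stop()
      }
    }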
TECHNICAL SKILLS
Hadoop Distributions: Cloudera, Hortonworks
Big Data Technologies: HDFS, MapReduce, Pig, Hive, Spark, YARN, Kafka, Flume, Sqoop, Impala, Oozie, Zookeeper
RDBMS: MySQL, Oracle, Teradata, MSSQL Server, DB2
Programming languages: SQL, Java, Scala
NoSQL Databases: MongoDB, HBase
Development Tools: NetBeans, Eclipse, Git, Maven, IntelliJ
Virtual Machines: VMware, VirtualBox
OS: CentOS 5.5, Unix, Red Hat Linux, Ubuntu
Cloud Environment: Microsoft Azure
BI Tools: Power BI, Tableau
PROFESSIONAL EXPERIENCE
Confidential, Renton, WA
Hadoop Developer
Responsibilities:
- Worked with cloud infrastructure services on Microsoft Azure.
- Provisioned HDInsight clusters of both the Hadoop and Spark cluster types.
- Developed Spark applications in Scala and implemented an Apache Spark data processing project to handle data from various data sources.
- Implemented and automated models created in Spark, Scala, Hive, etc.
- Hands-on experience writing Spark SQL jobs to load data into HDFS/Blob Storage in place of Sqoop imports, which improved performance.
- Developed Spark code using Scala and Spark-SQL for faster testing and data processing.
- Involved in converting Hive/SQL queries into Spark transformations using Spark RDDs and DataFrames (sketched after this list).
- Developed multiple POCs using Spark and deployed on the Yarn cluster, compared the performance of Spark, with HIVE and MySQL.
- Experience working with code repositories and continuous integration in Git.
- Used the Ambari UI to monitor the Hadoop cluster.
- Monitored Hadoop log files.
- Developed UNIX shell scripts for creating reports from Hive data and automated them using the CRON job scheduler.
- Developed Oozie coordinators to schedule Hive scripts to create Data pipeline.
- Worked with data visualization tools such as Power BI and Tableau.
- Understanding of development and project methodologies.
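A minimal sketch of the Hive-to-DataFrame conversion noted above, landing the result in Azure Blob Storage through the wasbs:// connector; the sales table, its columns and the storage account are assumptions for illustration:

    import org.apache.spark.sql.SparkSession
    import org.apache.spark.sql.functions.{avg, col}

    object HiveToDataFrame {
      def main(args: Array[String]): Unit = {
        val spark = SparkSession.builder()
          .appName("hive-to-dataframe")
          .enableHiveSupport()
          .getOrCreate()

        // The HiveQL form of a report...
        val viaSql = spark.sql(
          "SELECT region, AVG(amount) AS avg_amount FROM sales GROUP BY region")
        viaSql.show(5)

        // ...and the same logic as DataFrame transformations.
        val viaDf = spark.table("sales")
          .groupBy(col("region"))
          .agg(avg(col("amount")).alias("avg_amount"))

        // Land the result in Blob Storage (container/account are placeholders).
        viaDf.write.mode("overwrite")
          .parquet("wasbs://reports@myaccount.blob.core.windows.net/sales_by_region")

        spark.stop()
      }
    }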
Confidential, Denver, CO
Hadoop Developer
Responsibilities:
- Developed Spark applications in Scala and Java and implemented an Apache Spark data processing project to handle data from various RDBMS and streaming sources.
- Worked with Spark to improve performance and optimize existing algorithms in Hadoop using Spark Context, Spark-SQL, DataFrames, pair RDDs and Spark on YARN (see the sketch after this section).
- Created Hive tables, loaded data and analyzed data using Hive queries and HiveQL in HUE.
- Used HBase alongside Hive when real-time, low-latency queries were required.
- Performed transformations, cleaning and filtering on imported data using Hive and MapReduce.
- Worked on optimizing and tuning MapReduce jobs to achieve optimal performance.
- Used Sqoop to import data from IBM DB2 into HDFS on a regular basis.
- Developed UNIX shell scripts for creating reports from Hive data and automated them using the CRON job scheduler.
- Provided hands-on support for the UNIX, storage and backup track in day-to-day activity, troubleshooting problems across locations for multiple clients.
- Used Flume to collect and aggregate web log data from sources such as web servers and mobile and network devices, and pushed it to HDFS.
- Developed Oozie coordinators to schedule Hive scripts to create Data pipelines.
- Worked with different teams to ensure data quality and availability.
Environment: Hadoop, HDFS, MapReduce, Hive, Sqoop, Oozie, Spark, Kafka, NoSQL, HBase, UNIX Shell Scripting, Linux, Java (JDK SE 6, 7), Eclipse.
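A small sketch of the pair-RDD style optimization mentioned in this section: aggregating bytes per user with reduceByKey, which combines on the map side before the shuffle (unlike groupByKey). The log format and paths are illustrative assumptions:

    import org.apache.spark.sql.SparkSession

    object PairRddAggregation {
      def main(args: Array[String]): Unit = {
        val spark = SparkSession.builder().appName("pair-rdd-aggregation").getOrCreate()
        val sc = spark.sparkContext

        // Web-log lines assumed to look like "userId bytes".
        val logs = sc.textFile("hdfs:///logs/access")

        val bytesPerUser = logs
          .map(_.split("\\s+"))
          .collect { case Array(user, bytes) => (user, bytes.toLong) }
          .reduceByKey(_ + _) // map-side combine happens before the shuffle

        bytesPerUser.saveAsTextFile("hdfs:///logs/bytes_per_user")
        spark.stop()
      }
    }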
Confidential, Sacramento, CA
Hadoop Developer
Responsibilities:
- Worked with structured and semi-structured data of approximately 100 TB.
- Used Kafka to ingest data from source systems into HDFS for filtering (see the sketch after this section).
- Extensively used Hive/HQL to query or search for particular strings in Hive tables in HDFS.
- Developed custom UDFs in Python to extend Hive functionality.
- Analyzed large data sets to determine the optimal way to aggregate and report on them.
- Cross-examined data loaded into Hive tables against the source data in MySQL.
- Used Sqoop to retrieve data from RDBMS into HDFS and Hive as needed.
- Used UNIX shell scripting to automate required jobs to run at any time.
Environment: Hadoop, HDFS, Hive, Python, Spark, SQL, Teradata, YARN, Sqoop, Kafka, UNIX Shell Scripting.
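The Kafka ingestion above could be written several ways; this sketch assumes Spark Structured Streaming (the resume does not name the consumer used), with the topic, broker and paths as placeholders. It needs the spark-sql-kafka connector on the classpath:

    import org.apache.spark.sql.SparkSession

    object KafkaToHdfs {
      def main(args: Array[String]): Unit = {
        val spark = SparkSession.builder().appName("kafka-to-hdfs").getOrCreate()

        // Read the raw records off the topic as strings.
        val lines = spark.readStream
          .format("kafka")
          .option("kafka.bootstrap.servers", "broker1:9092")
          .option("subscribe", "weblogs")
          .load()
          .selectExpr("CAST(value AS STRING) AS line")

        // Append them to HDFS for downstream filtering in Hive.
        lines.writeStream
          .format("text")
          .option("path", "hdfs:///landing/weblogs")
          .option("checkpointLocation", "hdfs:///checkpoints/weblogs")
          .start()
          .awaitTermination()
      }
    }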
Confidential
Hadoop Developer
Responsibilities:
- Implemented complex MapReduce programs in Java to perform map-side joins using the distributed cache (an analogous Spark broadcast-join sketch follows this section).
- Developed MapReduce programs and Hive queries to analyze shipping patterns and customer satisfaction index across historical data.
- Wrote Pig user-defined functions (UDFs) and Hive UDFs.
- Used the Oozie workflow engine to manage interdependent Hadoop jobs and to automate several types of Hadoop jobs such as Java MapReduce, Hive, Pig and Sqoop.
- Created MapReduce programs to handle semi-structured and unstructured data such as XML, JSON and Avro data files, and sequence files for log data.
- Used Sqoop to import data from RDBMS to HDFS reliably.
- Developed Pig scripts to perform transformations, event joins, bot-traffic filtering and some pre-aggregations before storing the data in HDFS.
Environment: Hadoop, HDFS, MapReduce, Sqoop, Hive, Pig, Oozie, HBase, Java, Flume 1.2.0, Eclipse IDE, CDH3.
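The map-side join above is done in Java with the Hadoop distributed cache; the analogous technique in Spark is a broadcast join, sketched here with hypothetical shipment and region datasets:

    import org.apache.spark.sql.SparkSession
    import org.apache.spark.sql.functions.broadcast

    object MapSideJoin {
      def main(args: Array[String]): Unit = {
        val spark = SparkSession.builder().appName("map-side-join").getOrCreate()

        val shipments = spark.read.json("hdfs:///data/shipments") // large fact data
        val regions   = spark.read.json("hdfs:///data/regions")   // small lookup

        // broadcast() ships the small table to every executor, so the join
        // runs map-side with no shuffle -- the same idea as a distributed-cache
        // join in MapReduce.
        val joined = shipments.join(broadcast(regions), Seq("region_id"))

        joined.write.mode("overwrite").parquet("hdfs:///data/shipments_by_region")
        spark.stop()
      }
    }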
Confidential
SQL Developer
Responsibilities:
- Involved in all phases of the Software Development Life Cycle (SDLC) and created UML diagrams (use case, class and sequence diagrams) to represent the detailed design phase.
- Designed database tables, indexes, constraints etc.
- Loaded files into system using MS SQL Server 2008.
- Created and managed structured objects like tables, views, stored procedures, and triggers.
- Managed employee records as needed to keep the system up to date.
- Combined data from multiple tables using inner and left outer joins, and created sub-queries for complex queries involving multiple tables (see the sketch after this section).
- Developed and unit-tested Extract, Transform, Load (ETL) programs using SQL.
Environment: MS SQL Server 2008 and SQL Server 2008 R2, UNIX shell scripts, Eclipse, Java/J2EE, Teradata.
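A small JDBC sketch of the join and sub-query work described above, written in Scala to match the other sketches; the connection string, credentials and schema are placeholders:

    import java.sql.DriverManager

    object EmployeeReport {
      def main(args: Array[String]): Unit = {
        val url  = "jdbc:sqlserver://localhost:1433;databaseName=hr"
        val conn = DriverManager.getConnection(url, "user", "password")
        try {
          // A left outer join plus a sub-query, as described above.
          val rs = conn.createStatement().executeQuery(
            """SELECT e.name, d.dept_name
              |FROM employees e
              |LEFT OUTER JOIN departments d ON d.dept_id = e.dept_id
              |WHERE e.salary > (SELECT AVG(salary) FROM employees)""".stripMargin)
          while (rs.next())
            println(s"${rs.getString("name")}\t${rs.getString("dept_name")}")
        } finally conn.close()
      }
    }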
Confidential
SQL Developer
Responsibilities:
- Performed extensive data extraction from web and other sources and handled data preparation - missing values, formatting, transformations using SSIS.
- Wrote stored procedures and SQL scripts in both SQL Server and Oracle to implement business rules for various clients.
- Designed T-SQL scripts to identify long-running queries and blocking sessions.
- Wrote and debugged T-SQL stored procedures, views and user-defined functions.
- Performed data migration (import and export via BCP) from text files to SQL Server.
- Created database objects like tables, views, indexes, stored-procedures, triggers, and user defined functions.
- Customized the stored procedures and database triggers to meet the changing business rules.
Environment: SQL Server 2005, SQL Server 2000, Oracle 9i, Visual Basic, Excel, Tableau.