- 8+ years of experience with a strong emphasis on the design, development, implementation, testing, and deployment of software applications.
- Comprehensive IT experience in Big Data and Big Data analytics: Hadoop, HDFS, MapReduce, YARN, the Hadoop ecosystem, and shell scripting.
- Highly capable of processing large structured, semi-structured, and unstructured datasets and supporting Big Data applications.
- Expertise in transferring data between the Hadoop ecosystem and structured storage in an RDBMS such as MySQL, Oracle, Teradata, or DB2 using Sqoop.
- Experience with Apache Spark clusters and stream processing using Spark Streaming.
- Expertise in moving large volumes of log, streaming-event, and transactional data using Flume.
- Experience in developing MapReduce jobs in Java for data cleaning and pre-processing.
- Expertise in writing Pig Latin and Hive scripts and extending their functionality with user-defined functions (UDFs).
- Good knowledge of Hadoop, HBase, Hive, Pig Latin scripts, MapReduce, Sqoop, Flume, and HiveQL.
- Experience in analyzing data using Pig Latin, HiveQL, and HBase.
- Captured data from existing databases that provide SQL interfaces using Sqoop.
- Experience in importing and exporting data between HDFS and relational database systems using Sqoop.
- Implemented proofs of concept on the Hadoop stack and various big data analytics tools, including migrations from databases such as Teradata, Oracle, and MySQL to Hadoop.
- Worked on NoSQL databases including HBase, Cassandra, and MongoDB.
- Successfully loaded files to Hive and HDFS from MongoDB and HBase.
- Experience in configuring Hadoop clusters and HDFS.
- Expertise in organizing data layouts in Hive using partitioning and bucketing.
- Expertise in preparing interactive data visualizations from different sources using Tableau.
- Hands-on experience in developing Oozie workflows that execute MapReduce, Sqoop, Pig, Hive, and shell scripts.
- Experience working with the Cloudera Hue interface and Impala.
- Expertise in developing SQL queries and stored procedures, and excellent development experience with Agile methodology.
- Ability to adapt to evolving technology, strong sense of responsibility and accomplishment.
- Excellent leadership, interpersonal, problem solving and time management skills.
- Excellent communication skills both Written (documentation) and Verbal (presentation).
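The MapReduce data-cleaning jobs mentioned above all follow the same map → shuffle → reduce pattern; a minimal pure-Python sketch of that model (illustrative only, not actual cluster code — the word-count example and records are hypothetical):

```python
from collections import defaultdict

def map_phase(records):
    # Emit (word, 1) pairs, dropping empty records (a simple data-cleaning step).
    for record in records:
        for word in record.strip().lower().split():
            yield (word, 1)

def shuffle(pairs):
    # Group values by key, as the framework does between the map and reduce phases.
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups

def reduce_phase(groups):
    # Sum the counts for each key.
    return {key: sum(values) for key, values in groups.items()}

counts = reduce_phase(shuffle(map_phase(["big data", "   ", "big deal"])))
# counts == {"big": 2, "data": 1, "deal": 1}
```

On a real cluster the map and reduce phases run in parallel across HDFS blocks; the local pipeline above only shows the dataflow.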
Hadoop/Big Data: HDFS, MapReduce, Hive, Pig, Sqoop, Flume, Oozie, and Zookeeper.
NoSQL Databases: HBase, Cassandra, MongoDB
Languages: C, Python, Java, J2EE, PL/SQL, Pig Latin, HiveQL, Unix shell scripts, R Programming
Java/J2EE Technologies: Applets, Swing, JDBC, JNDI, JSON, JSTL, RMI, JMS, JavaScript, JSP, Servlets, EJB, JSF, jQuery
Frameworks: MVC, Struts, Spring, Hibernate
Operating Systems: Sun Solaris, HP-UNIX, RedHat Linux, Ubuntu Linux and Windows XP/Vista/7/8
Web Technologies: HTML, DHTML, XML, AJAX, WSDL, SOAP
Web/Application servers: Apache Tomcat, WebLogic, JBoss
Databases: Oracle 9i/10g/11g, DB2, SQL Server, MySQL, Teradata
Tools and IDEs: Eclipse, NetBeans, Toad, Maven, ANT, Hudson, Sonar, JDeveloper, Assent PMD, DB Visualizer
Network Protocols: TCP/IP, UDP, HTTP, DNS, DHCP
Confidential, Phoenix, AZ
- Involved in end-to-end data processing: ingestion, processing, quality checks, and splitting.
- Developed Spark scripts in Scala as per requirements.
- Loaded data into Spark RDDs and performed in-memory computation to generate the output response.
- Developed shell scripts to automate ETL execution in a Unix environment.
- Supported other Talend developers, providing mentoring, technical assistance, troubleshooting, and alternative development solutions.
- Remarkable experience in designing ETL processes and developing source-to-target mappings.
- Expert in developing, testing, and deploying ETL jobs in Talend.
- Performed different types of transformations and actions on RDDs to meet business requirements.
- Developed a data pipeline using Kafka, Spark, and Hive to ingest, transform, and analyze data.
- Also worked on analyzing the Hadoop cluster and various big data analytics tools, including Pig, HBase, and Sqoop.
- Good experience with AWS services, networking, storage, and cloud technology.
- Primarily responsible for designing, implementing, testing, and maintaining database solutions for Azure.
- Involved in loading data from UNIX file system to HDFS.
- Created HBase tables to store variable data formats of PII data coming from different portfolios.
- Implemented best offer logic using Pig scripts and Pig UDFs.
- Responsible for managing data coming from various sources.
- Experience in loading and transforming large sets of structured, semi-structured, and unstructured data.
- Experience in using the MapReduce programming model for batch processing of data stored in HDFS.
- Provided cluster coordination services through Zookeeper.
- Exported the analyzed data to relational databases using Sqoop for visualization and to generate reports for the BI team.
- Analyzed large data sets to determine the optimal way to aggregate and report on them.
- Responsible for setting up QA environment and updating configurations for implementing scripts with Pig and Sqoop.
- Involved in managing and reviewing Hadoop log files.
- Imported data using Sqoop to load data from MySQL to HDFS on a regular basis.
- Developed scripts and batch jobs to schedule various Hadoop programs.
- Responsible for writing Hive queries for data analysis to meet the business requirements.
- Responsible for creating Hive tables and working on them using HiveQL.
- Responsible for importing and exporting data into HDFS and Hive using Sqoop.
- Involved in creating Hive tables, loading them with data, and writing Hive queries that run internally as MapReduce jobs.
- Designed and implemented a MapReduce-based large-scale parallel relation-learning system.
- Involved in scheduling the Oozie workflow engine to run multiple Hive jobs.
- Developed parser and loader MapReduce applications to retrieve data from HDFS and store it in HBase and Hive.
- Imported unstructured data into HDFS using Flume.
- Used Oozie to orchestrate the MapReduce jobs that extract data on a schedule.
- Used the HBase Java API in a Java application.
- Automated all jobs for extracting data from sources such as MySQL and pushing the result sets to HDFS.
- Hands-on design and development of an application using Hive UDFs.
- Responsible for writing Hive queries for analyzing data in the Hive warehouse using Hive Query Language (HQL).
- Provided support to data analysts in running Pig and Hive queries.
- Worked extensively with HiveQL and Pig Latin.
- Imported and exported data between MySQL/Oracle and Hive using Sqoop.
- Responsible for defining the data flow within the Hadoop ecosystem and directing the team in implementing it.
Environment: Hadoop, MapReduce 2.7.2, Hive 2.0, Pig 0.16, Talend, Sqoop 2, Java, Oozie, HBase 0.98.19, Kafka 0.10.1.1, Spark 2.0, Scala 2.12.0, Eclipse, Linux, Oracle, Teradata.
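The RDD transformations and actions described in this role are lazy until an action runs. A toy Python stand-in for that programming model (not Spark itself; MiniRDD and the sample pipeline are purely illustrative):

```python
class MiniRDD:
    """Toy stand-in for a Spark RDD: transformations are lazy generators,
    and only an action (collect) materializes the result."""
    def __init__(self, data):
        self._data = data  # in Spark this would be partitioned across the cluster

    def map(self, fn):          # transformation: returns a new lazy dataset
        return MiniRDD(fn(x) for x in self._data)

    def filter(self, pred):     # transformation: lazy filtering
        return MiniRDD(x for x in self._data if pred(x))

    def collect(self):          # action: triggers evaluation
        return list(self._data)

result = (MiniRDD(range(10))
          .filter(lambda x: x % 2 == 0)   # keep even values
          .map(lambda x: x * x)           # square them
          .collect())
# result == [0, 4, 16, 36, 64]
```

Real Spark adds partitioning, shuffles, and fault tolerance on top of this lazy-pipeline idea; the sketch only shows why nothing executes until an action is invoked.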
Confidential, San Francisco, CA
- Worked on the Hortonworks HDP 2.5 distribution.
- Responsible for building scalable, distributed data solutions using Hadoop.
- Involved in importing data from Microsoft SQL Server, MySQL, and Teradata into HDFS using Sqoop.
- Used the Jenkins AWS CodeDeploy plugin to deploy to AWS and migrated applications to the AWS cloud.
- Played a key role in dynamic partitioning and bucketing of the data stored in Hive.
- Wrote HiveQL queries integrating different tables to create views that produce result sets.
- Collected log data from web servers and integrated it into HDFS using Flume.
- Experienced in loading and transforming large sets of structured and unstructured data.
- Used MapReduce programs for data cleaning and transformation, loading the output into Hive tables in different file formats.
- Wrote MapReduce programs to handle semi-structured and unstructured data such as JSON, Avro, and sequence files for logs.
- Created data pipelines for different events to load data from DynamoDB into an AWS S3 bucket and then into HDFS.
- Involved in loading data into the HBase NoSQL database.
- Built, managed, and scheduled Oozie workflows for end-to-end job processing.
- Experienced in extending Hive and Pig core functionality by writing custom UDFs in Java.
- Analyzed large volumes of structured data using Spark SQL.
- Wrote shell scripts to execute HiveQL.
- Used Spark as an ETL tool.
- Wrote automated shell scripts in a Linux/Unix environment using Bash.
- Migrated HiveQL queries to Spark SQL to improve performance.
- Extracted real-time feeds using Spark Streaming, converted them to RDDs, processed the data into DataFrames, and loaded it into HBase.
- Experienced in using the DataStax Spark connector to store data in and retrieve data from a Cassandra database.
- Extracted real-time feeds using Spark Streaming, converted them to RDDs, processed the data into DataFrames, and loaded it into Cassandra.
Environment: Hortonworks, Hadoop, HDFS, Pig, Sqoop, Hive, Oozie, Zookeeper, NoSQL, HBase, Shell Scripting, Scala, Spark, Spark SQL.
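The streaming work in this role slices a live feed into small batches before transforming and loading each one. A minimal pure-Python simulation of that micro-batch pattern (the feed, batch size, and per-batch aggregation are hypothetical; real Spark Streaming handles this on the cluster):

```python
from itertools import islice

def micro_batches(stream, batch_size):
    """Yield fixed-size micro-batches from an iterator, mimicking how a
    streaming job slices an unbounded feed into small batches."""
    it = iter(stream)
    while True:
        batch = list(islice(it, batch_size))
        if not batch:
            return
        yield batch

# Each batch is transformed and "loaded" (here: appended to a list),
# standing in for the feed -> RDD -> DataFrame -> HBase/Cassandra flow.
sink = []
for batch in micro_batches(range(7), batch_size=3):
    sink.append(sum(batch))  # toy per-batch aggregation
# sink == [3, 12, 6]
```

The key property the sketch shows is that each micro-batch is processed and persisted independently, which is what makes the pipeline restartable mid-stream.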
Confidential, Dallas, TX
- Developed a process for Sqooping data from multiple sources such as SQL Server, Oracle, and Teradata.
- Responsible for creating the source-to-target field mapping document.
- Developed a shell script to create staging and landing tables with the same schema as the source and to generate the properties used by Oozie jobs.
- Developed Oozie workflows for executing Sqoop and Hive actions.
- Worked with NoSQL databases like HBase, creating HBase tables to load large sets of semi-structured data coming from various sources.
- Involved in building database models, APIs, and views using Python in order to build an interactive web-based solution.
- Performed performance optimizations on Spark/Scala; diagnosed and resolved performance issues.
- Responsible for developing Python wrapper scripts that extract a specific date range using Sqoop by passing custom properties required for the workflow.
- Developed scripts to run Oozie workflows, capture the logs of all jobs that run on the cluster, and create a metadata table that records the execution time of each job.
- Developed Hive scripts for performing transformation logic and for loading data from the staging zone to the final landing zone.
- Developed monitoring and notification tools using Python.
- Worked with the Parquet file format for better storage and performance of published tables.
- Involved in loading transactional data into HDFS using Flume for fraud analytics.
- Developed a Python utility to validate HDFS tables against source tables.
- Designed and developed UDFs to extend functionality in both Pig and Hive.
- Imported and exported data between MySQL and HDFS using Sqoop on a regular basis.
- Managed datasets using pandas DataFrames and MySQL; queried the MySQL database from Python using the Python-MySQL connector and MySQLdb package to retrieve information.
- Automated all jobs for pulling data from an FTP server and loading it into Hive tables using Oozie workflows.
- Involved in developing Spark code using Scala and Spark SQL for faster testing and processing of data, and explored optimizing it using SparkContext, Spark SQL, pair RDDs, and Spark on YARN.
- Migrated the needed data from Oracle and MySQL into HDFS using Sqoop, and imported flat files of various formats into HDFS.
Environment: Hadoop, HDFS 2.6.3, Hive 1.0.1, HBase 0.98.12.1, Zookeeper 3.5.1, Oozie, Impala 1.4.1, Java (JDK 1.6), Cloudera CDH 3, Oracle, Teradata, SQL Server, UNIX Shell Scripting, Flume 1.6.0, Scala 2.11.6, Spark 1.5.0, Sqoop 1.4.6, Python 3.5.1.
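A Python wrapper that drives a date-ranged Sqoop import, as described in this role, mainly assembles the CLI arguments. A minimal sketch (the table name, connection string, date column, and target-dir layout are hypothetical placeholders, not the actual project values):

```python
def build_sqoop_import(table, start_date, end_date,
                       connect="jdbc:mysql://dbhost/sales"):
    # NOTE: connect string, table, and event_date column are illustrative only.
    where = f"event_date >= '{start_date}' AND event_date < '{end_date}'"
    return ["sqoop", "import",
            "--connect", connect,
            "--table", table,
            "--where", where,                          # restricts the pull to the date range
            "--target-dir", f"/staging/{table}/{start_date}"]  # per-range HDFS directory

cmd = build_sqoop_import("orders", "2016-01-01", "2016-02-01")
# cmd can then be handed to subprocess.run(cmd) on an edge node with Sqoop installed.
```

Keeping the command construction in a pure function like this makes the date-range logic easy to unit-test without touching the cluster.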
- Responsible for understanding the scope of the project and for requirements gathering.
- Created the database, user, environment, activity, and class diagrams for the project (UML).
- Implemented the database using the Oracle database engine.
- Created entity objects (business rules and policy, validation logic, default value logic, security).
- Web application development using J2EE, JSP, Servlets, JDBC, JavaBeans, Struts, Ajax, custom tags, EJB, Hibernate, Ant, JUnit, Apache Log4j, Web Services, and message queues (MQ).
- Created applications, connection pools, and deployments of JSPs and Servlets.
- Used Oracle and MySQL databases for storing user information.
- Developed the back end for web applications using PHP.
- Hands-on experience in all phases of the SDLC (software development life cycle).
- Used Eclipse as the IDE; configured and deployed the application onto the WebLogic application server using Maven build scripts to automate the build and deployment process.
- Developed UML diagrams using Rational Rose.
- Created UIs for web applications using HTML and CSS.
- Created desktop applications using J2EE and Swing.
- Developed the process using Waterfall model.
- Created SQL scripts for Oracle database.
Environment: Java, Servlets, JSF, ADF Rich Client UI Framework, ADF-BC (BC4J) 11g, Web Services using Oracle SOA, Oracle WebLogic.