Hadoop Developer Resume
Albany, NY
PROFESSIONAL SUMMARY:
- 8+ years of professional experience, including around 3 years as a Java Developer and 5+ years in Big Data analytics as a Hadoop Developer.
- Experience in all phases of the data warehouse life cycle, including Requirement Analysis, Design, Coding, Testing, and Deployment.
- Experience in architecting, designing, installation, configuration, and management of Apache Hadoop Clusters & Cloudera Hadoop Distribution.
- Worked closely with QA and Operations teams to understand, design, and develop end-to-end data flow requirements.
- Utilizing Oozie to schedule workflows.
- Experienced in migrating HiveQL into Impala to minimize query response time.
- Implemented the workflows using Apache Oozie framework to automate tasks.
- Experience in managing the Hadoop infrastructure with Cloudera Manager.
- Practical knowledge of the functionality of each Hadoop daemon, the interactions between them, resource utilization, and dynamic tuning to keep the cluster available and efficient.
- Experience in understanding and managing Hadoop Log Files.
- Experience with Hadoop's multiple data processing engines, such as interactive SQL, real-time streaming, data science, and batch processing, handling data stored in a single platform on YARN.
- Experience in adding and removing nodes in a Hadoop cluster.
- Experience in extracting data from RDBMS into HDFS using Sqoop.
- Experience in collecting logs from log collectors into HDFS using Flume.
- Good understanding of NoSQL databases such as HBase.
- Experience in analyzing data in HDFS through Map Reduce, Hive and Pig.
- Designed, implemented, and reviewed features and enhancements to Cassandra.
- Experience on UNIX commands and Shell Scripting.
- Experience in Data Analysis, Data Cleansing (Scrubbing), Data Validation and Verification, Data Conversion, Data Migrations and Data Mining.
- Excellent interpersonal, communication, documentation, and presentation skills.
SKILLS:
Hadoop/Big Data: MapReduce, HDFS, Hive 2.3, Pig 0.17, HBase 1.2, Zookeeper 3.4, Sqoop 1.4, Oozie, Flume 1.8, Scala 2.12, Kafka 1.0, Storm, MongoDB 3.6, Hadoop 3.0, Spark, Cassandra 3.11, Impala 2.1
Database: Oracle 12c, MySQL, MS SQL Server, Teradata 15.
Web Tools: HTML 5.1, JavaScript, XML, ODBC, JDBC, Hibernate, JSP, Servlets, Java, Struts, Spring, and Avro.
Cloud Technology: Amazon Web Services (AWS), EC2, EC3, Elasticsearch, Microsoft Azure.
Languages: Java/J2EE, SQL, Shell Scripting, C/C++, Python
Java/J2EE Technologies: JDBC, JavaScript, JSP, Servlets, jQuery
IDE and Build Tools: Eclipse, NetBeans, MS Visual Studio, Ant, Maven, JIRA, Confluence; Version Control: Git, SVN, CVS
Operating System: Windows, Unix, Linux.
Tools: Eclipse, Maven, ANT, JUnit, Jenkins, SoapUI, Log4j
Scripting Languages: JavaScript, jQuery, AJAX, CSS, XML, DOM, SOAP, REST
RELATED EXPERIENCE:
Hadoop Developer
Confidential, Albany, NY
Responsibilities:
- Understood business requirements and was involved in preparing design documents according to client requirements.
- Analyzed Teradata procedures to prepare information on the individual queries.
- Developed Hive queries according to business requirements.
- Developed UDFs in Hive for functionality not covered by Hive's built-in functions.
- Developed a UDF for converting data from a Hive table to JSON format per client requirements (see the sketch at the end of this section).
- Implemented Dynamic partitioning and Bucketing in Hive as part of performance tuning.
- Implemented the workflow and coordinator files using Oozie framework to automate tasks.
- Involved in Unit, Integration, System Testing.
- Prepared unit test case documents and flow diagrams for all scripts used in the project.
- Scheduled and managed jobs on the Hadoop cluster using Oozie workflows.
- Experienced in loading and transforming large sets of structured, semi-structured, and unstructured data.
- Transformed unstructured data into structured data using Pig.
- Imported data using Sqoop to load data from MySQL into HDFS on a regular basis.
- Designed and developed Pig Latin scripts to process data in batches to perform trend analysis.
- Good experience on Hadoop tools like MapReduce, Hive and HBase.
- Worked on both External and Managed HIVE tables for optimized performance.
- Developed Hive scripts to meet analysts' requirements for analysis.
- Maintained data import scripts using Hive and MapReduce jobs.
- Performed data design and analysis to handle large volumes of data.
- Cross-examined data loaded into Hive tables against the source data in Oracle.
- Worked closely with QA and Operations teams to understand, design, and develop end-to-end data flow requirements.
- Utilized Oozie to schedule workflows.
- Developed structured, efficient, and error-free code for Big Data requirements using knowledge of Hadoop and its ecosystem.
- Stored, processed, and analyzed huge data sets to derive valuable insights.
Environment: HDFS, MapReduce, Sqoop, Oozie, Pig, Hive, HBase, Flume, Linux, Java, Eclipse, Cassandra, PL/SQL, and UNIX Shell Scripting.
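The Hive-to-JSON UDF bullet above is illustrated by the minimal sketch below. It assumes a simple two-string-column input and a hypothetical class name; it is a sketch of the approach, not the exact production code.

```java
import org.apache.hadoop.hive.ql.exec.Description;
import org.apache.hadoop.hive.ql.exec.UDF;
import org.apache.hadoop.io.Text;

// Minimal Hive UDF sketch: wraps two string columns into a small JSON object.
// Registered in Hive with: ADD JAR to_json_udf.jar;
//                          CREATE TEMPORARY FUNCTION to_json AS 'ToJsonUDF';
@Description(name = "to_json", value = "_FUNC_(key, value) - returns a simple JSON object")
public class ToJsonUDF extends UDF {

    public Text evaluate(Text key, Text value) {
        if (key == null || value == null) {
            return null; // propagate NULLs the way built-in Hive functions do
        }
        String json = "{\"" + escape(key.toString()) + "\":\"" + escape(value.toString()) + "\"}";
        return new Text(json);
    }

    // Escape backslashes and quotes so the output stays valid JSON.
    private String escape(String s) {
        return s.replace("\\", "\\\\").replace("\"", "\\\"");
    }
}
```

Example usage from Hive, assuming the function was registered as to_json: SELECT to_json(name, city) FROM customers;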
Hadoop Engineer
Confidential, St. Louis, MO
Responsibilities:
- Responsible for building scalable distributed data solutions using Hadoop.
- Wrote multiple MapReduce programs in Java for data analysis (a simplified sketch follows at the end of this section).
- Wrote MapReduce jobs using Pig Latin and the Java API.
- Performed performance tuning and troubleshooting of MapReduce jobs by analyzing and reviewing Hadoop log files.
- Managed and reviewed Hadoop Log files as a part of administration for troubleshooting purposes. Communicate and escalate issues appropriately.
- Responsible for architecting Hadoop clusters with Hortonworks distribution platform HDP 1.3.2 and Cloudera CDH4.
- Responsible for cluster maintenance, adding and removing cluster nodes, cluster monitoring and troubleshooting, managing and reviewing data backups, and managing and reviewing Hadoop log files on Hortonworks, MapR, and Cloudera clusters.
- Collected the logs from the physical machines and the OpenStack controller and integrated into HDFS using Flume.
- Load data from various data sources into HDFS using Kafka.
- Designed and presented a plan for a POC on Impala.
- Experienced in migrating HiveQL into Impala to minimize query response time.
- Implemented Avro and Parquet data formats for Apache Hive computations to handle custom business requirements.
- Worked on Sequence files, RC files, Map side joins, bucketing, partitioning for Hive performance enhancement and storage improvement.
- Performed extensive Data Mining applications using HIVE.
- Responsible for performing extensive data validation using Hive.
- Sqoop jobs, PIG and Hive scripts were created for data ingestion from relational databases to compare with historical data.
- Utilized Storm for processing large volumes of data.
- Used Pig as an ETL tool to do transformations, event joins, filtering, and some pre-aggregations.
- Used visualization tools such as Power View for Excel and Tableau for visualizing data and generating reports.
- Set up a Hadoop cluster on Amazon EC2 using Whirr for a POC.
- Implemented test scripts to support test driven development and continuous integration.
- Involved in story-driven agile development methodology and actively participated in daily scrum meetings.
Environment: Hadoop, MapReduce, HDFS, Pig, Hive, Sqoop, Flume, Oozie, Java, Linux, Maven, Teradata, Zookeeper, SVN, Autosys, HBase.
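A simplified sketch of the kind of Java MapReduce analysis job referenced above. Counting records per status code in tab-delimited logs is an assumed example for illustration, not the actual production analysis.

```java
import java.io.IOException;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.IntWritable;
import org.apache.hadoop.io.LongWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Job;
import org.apache.hadoop.mapreduce.Mapper;
import org.apache.hadoop.mapreduce.Reducer;
import org.apache.hadoop.mapreduce.lib.input.FileInputFormat;
import org.apache.hadoop.mapreduce.lib.output.FileOutputFormat;

// Counts occurrences of the status-code field (assumed to be column 2 of a tab-delimited log).
public class StatusCodeCount {

    public static class LogMapper extends Mapper<LongWritable, Text, Text, IntWritable> {
        private static final IntWritable ONE = new IntWritable(1);
        private final Text statusCode = new Text();

        @Override
        protected void map(LongWritable key, Text value, Context context)
                throws IOException, InterruptedException {
            String[] fields = value.toString().split("\t");
            if (fields.length > 2) {
                statusCode.set(fields[2]);
                context.write(statusCode, ONE);
            }
        }
    }

    public static class SumReducer extends Reducer<Text, IntWritable, Text, IntWritable> {
        @Override
        protected void reduce(Text key, Iterable<IntWritable> values, Context context)
                throws IOException, InterruptedException {
            int sum = 0;
            for (IntWritable v : values) {
                sum += v.get();
            }
            context.write(key, new IntWritable(sum));
        }
    }

    public static void main(String[] args) throws Exception {
        Job job = Job.getInstance(new Configuration(), "status code count");
        job.setJarByClass(StatusCodeCount.class);
        job.setMapperClass(LogMapper.class);
        job.setCombinerClass(SumReducer.class); // partial aggregation on the map side
        job.setReducerClass(SumReducer.class);
        job.setOutputKeyClass(Text.class);
        job.setOutputValueClass(IntWritable.class);
        FileInputFormat.addInputPath(job, new Path(args[0]));
        FileOutputFormat.setOutputPath(job, new Path(args[1]));
        System.exit(job.waitForCompletion(true) ? 0 : 1);
    }
}
```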
Big Data Engineer
Confidential, Columbus, OH
Responsibilities:
- Used Sqoop and Java APIs to import data into Cassandra from different relational databases.
- Created tables in Cassandra and loaded large data sets of structured, semi-structured, and unstructured data from various data sources.
- Developed MapReduce jobs in Java for cleaning and preprocessing data.
- Wrote Python scripts for wrapper and utility automation.
- Implemented Storm builder topologies to perform cleansing operations before moving data into Cassandra.
- Worked on configuring Hive, Pig, Impala, Sqoop, Flume, and Oozie in Cloudera.
- Automated data movement between different Hadoop systems using Apache NiFi.
- Wrote MapReduce programs in Python using the Hadoop Streaming API.
- Worked on creating Hive tables, loading them with data, and writing Hive queries.
- Migrated ETL processes from SQL Server to Hadoop, using Pig for data manipulation.
- Developed Spark jobs using Scala in the test environment and used Spark SQL for querying.
- Worked on importing data from Oracle tables to HDFS and HBase tables using Sqoop.
- Wrote scripts to load data in to Spark RDDs and do in memory computations.
- Wrote a Spark Streaming script that consumes topics from Kafka, a distributed messaging source, and periodically pushes batches of data to Spark for real-time processing (see the sketch at the end of this section).
- Experience with Elasticsearch technologies and in creating custom Solr query components.
- Implemented custom Kafka encoders for a custom input format to load data into Kafka partitions.
- Worked on different data sources such as Oracle, Netezza, MySQL, Flat files etc.
- Extensively used Sqoop to get data from RDBMS sources like Teradata and Netezza.
- Worked with Flume to load the log data from different sources into HDFS.
- Good knowledge of using Apache NiFi to automate data movement between different Hadoop systems.
- Developed Talend jobs to move inbound files to HDFS file location based on monthly, weekly, daily, and hourly partitioning.
Environment: Cloudera, MapReduce, Spark SQL, Spark Streaming, Pig, Hive, Flume, Hue, Oozie, Java, Eclipse, Zookeeper, Cassandra, HBase, Talend, GitHub.
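The Spark jobs in this role were written in Scala; the sketch below shows the Spark Streaming-from-Kafka pattern described above in Java, for consistency with the other sketches. The broker address, topic name, consumer group, and batch interval are illustrative assumptions.

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.Map;

import org.apache.kafka.clients.consumer.ConsumerRecord;
import org.apache.kafka.common.serialization.StringDeserializer;
import org.apache.spark.SparkConf;
import org.apache.spark.streaming.Durations;
import org.apache.spark.streaming.api.java.JavaInputDStream;
import org.apache.spark.streaming.api.java.JavaStreamingContext;
import org.apache.spark.streaming.kafka010.ConsumerStrategies;
import org.apache.spark.streaming.kafka010.KafkaUtils;
import org.apache.spark.streaming.kafka010.LocationStrategies;

public class KafkaToSparkStreaming {
    public static void main(String[] args) throws InterruptedException {
        SparkConf conf = new SparkConf().setAppName("kafka-stream-sketch");
        // Micro-batch every 30 seconds (assumed interval).
        JavaStreamingContext jssc = new JavaStreamingContext(conf, Durations.seconds(30));

        Map<String, Object> kafkaParams = new HashMap<>();
        kafkaParams.put("bootstrap.servers", "broker1:9092");      // assumed broker address
        kafkaParams.put("key.deserializer", StringDeserializer.class);
        kafkaParams.put("value.deserializer", StringDeserializer.class);
        kafkaParams.put("group.id", "spark-stream-sketch");        // assumed consumer group
        kafkaParams.put("auto.offset.reset", "latest");

        // Subscribe directly to the Kafka topic (direct stream, no receiver).
        JavaInputDStream<ConsumerRecord<String, String>> stream =
                KafkaUtils.createDirectStream(
                        jssc,
                        LocationStrategies.PreferConsistent(),
                        ConsumerStrategies.<String, String>Subscribe(
                                Arrays.asList("events"), kafkaParams)); // assumed topic name

        // Each 30-second batch is handed to Spark for processing; here we only count the records.
        stream.map(ConsumerRecord::value)
              .foreachRDD(rdd -> System.out.println("records in batch: " + rdd.count()));

        jssc.start();
        jssc.awaitTermination();
    }
}
```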
Hadoop Developer
Confidential
Responsibilities:
- Worked on writing transformer/mapping Map-Reduce pipelines using Java.
- Involved in creating Hive tables, loading them with data, and writing Hive queries that invoke and run MapReduce jobs in the backend.
- Involved in loading data into HBase using the HBase shell, the HBase client API, Pig, and Sqoop (see the sketch at the end of this section).
- Designed and implemented Incremental Imports into Hive tables.
- Deployed an Apache Solr search engine server to help speed up the search of the government cultural asset.
- Involved in collecting, aggregating, and moving data from servers to HDFS using Apache Flume.
- Wrote Hive jobs to parse the logs and structure them in a tabular format to facilitate effective querying of the log data.
- Migrated ETL jobs to Pig scripts to do transformations, event joins, and some pre-aggregations before storing the data in HDFS.
- Implemented the workflows using Apache Oozie framework to automate tasks.
- Worked with Avro Data Serialization system to work with JSON data formats.
- Worked on different file formats like Sequence files, XML files and Map files using Map Reduce Programs.
- Created and maintained technical documentation for launching Hadoop clusters and for executing Pig scripts.
Environment: Hadoop, Big Data, HDFS, MapReduce, Sqoop, Oozie, Pig, Hive, HBase, Flume, Linux, Java, Eclipse, Cassandra, Cloudera Hadoop Distribution, PL/SQL, Windows NT, UNIX Shell Scripting, and PuTTY.
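A minimal sketch of loading a row through the HBase client API, as mentioned above; the table, column family, qualifier names, and row key are assumptions for illustration.

```java
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.hbase.HBaseConfiguration;
import org.apache.hadoop.hbase.TableName;
import org.apache.hadoop.hbase.client.Connection;
import org.apache.hadoop.hbase.client.ConnectionFactory;
import org.apache.hadoop.hbase.client.Put;
import org.apache.hadoop.hbase.client.Table;
import org.apache.hadoop.hbase.util.Bytes;

public class HBaseLoadExample {
    public static void main(String[] args) throws Exception {
        // Picks up hbase-site.xml from the classpath for the ZooKeeper quorum details.
        Configuration conf = HBaseConfiguration.create();

        try (Connection connection = ConnectionFactory.createConnection(conf);
             Table table = connection.getTable(TableName.valueOf("asset_catalog"))) { // assumed table name

            // One Put per row key; each addColumn call writes one cell in a column family.
            Put put = new Put(Bytes.toBytes("asset#0001"));
            put.addColumn(Bytes.toBytes("meta"), Bytes.toBytes("title"), Bytes.toBytes("sample title"));
            put.addColumn(Bytes.toBytes("meta"), Bytes.toBytes("source"), Bytes.toBytes("flume-ingest"));
            table.put(put);
        }
    }
}
```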
J2EE Developer
Confidential
Responsibilities:
- Responsible for the systems design, architecture, implementation, and integration with various technologies such as Spring Integration, Web Services, Oracle Advanced Queues, and WMQs.
- Implemented framework upgrades to Spring 3.0.5 and Spring Integration 2.0.5.
- Used OSGi container framework to install bundles (modules) developed using Spring and Spring Integration.
- Worked on UI development using JSP on Struts and Spring MVC Frameworks.
- Developed DAOs (Data Access Objects) and DOs (Data Objects) using Hibernate as the ORM to interact with the DBMS, Oracle (see the sketch at the end of this section).
- Developed modules that integrate with web services that provide global information.
- Used Log4j for application logging, tracing errors in the running system, and certain automated routine functions.
- Worked as a Web Dynpro Java developer, developing custom applications and creating Portal screens.
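A minimal sketch of the DAO pattern with Hibernate as the ORM, as described above; the Order entity, its fields, and the table name are hypothetical.

```java
// Order.java (hypothetical mapped entity, kept in its own file)
import javax.persistence.Entity;
import javax.persistence.GeneratedValue;
import javax.persistence.Id;
import javax.persistence.Table;

@Entity
@Table(name = "orders")
public class Order {
    @Id
    @GeneratedValue
    private Long id;
    private String status;
    // getters and setters omitted in this sketch
}
```

```java
// OrderDao.java - hides Hibernate session handling from the service layer
import org.hibernate.Session;
import org.hibernate.SessionFactory;
import org.hibernate.Transaction;

public class OrderDao {

    private final SessionFactory sessionFactory;

    public OrderDao(SessionFactory sessionFactory) {
        this.sessionFactory = sessionFactory;
    }

    public Long save(Order order) {
        Session session = sessionFactory.openSession();
        Transaction tx = session.beginTransaction();
        try {
            Long id = (Long) session.save(order);   // INSERT mapped by the Order entity mapping
            tx.commit();
            return id;
        } catch (RuntimeException e) {
            tx.rollback();
            throw e;
        } finally {
            session.close();
        }
    }

    public Order findById(Long id) {
        Session session = sessionFactory.openSession();
        try {
            return (Order) session.get(Order.class, id);  // SELECT by primary key
        } finally {
            session.close();
        }
    }
}
```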
JAVA Developer
Confidential
Responsibilities:
- Analysis, design, and development of a J2EE-based application using Struts and Hibernate.
- Involved in interacting with the Business Analyst and Architect during the Sprint Planning Sessions.
- Implemented point-to-point JMS queues and MDBs to fetch diagnostic details across various interfaces (see the sketch at the end of this section).
- Worked with WebSphere business integration technologies such as WebSphere MQ and Message Broker 7.0 (middleware tools) on various operating systems.
- Performed incident resolution for WebSphere Application Server, WebSphere MQ, IBM Message Broker, and Process and Portal Servers.
- Configured WebSphere resources including JDBC providers, JDBC data sources, connection pooling, and JavaMail sessions. Deployed Session and Entity EJBs in WebSphere.
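A minimal sketch of a point-to-point JMS consumer implemented as a message-driven bean, as referenced above; the queue JNDI name and the payload handling are assumptions.

```java
import javax.ejb.ActivationConfigProperty;
import javax.ejb.MessageDriven;
import javax.jms.JMSException;
import javax.jms.Message;
import javax.jms.MessageListener;
import javax.jms.TextMessage;

// MDB listening on a point-to-point queue; the container delivers each message
// to exactly one consumer instance from the bean pool.
@MessageDriven(activationConfig = {
    @ActivationConfigProperty(propertyName = "destinationType", propertyValue = "javax.jms.Queue"),
    @ActivationConfigProperty(propertyName = "destination", propertyValue = "jms/DiagnosticsQueue") // assumed JNDI name
})
public class DiagnosticsMdb implements MessageListener {

    @Override
    public void onMessage(Message message) {
        try {
            if (message instanceof TextMessage) {
                String payload = ((TextMessage) message).getText();
                // Hand the diagnostic detail off to downstream processing (omitted in this sketch).
                System.out.println("Received diagnostic message: " + payload);
            }
        } catch (JMSException e) {
            // Rethrow as unchecked so the container can redeliver or dead-letter the message.
            throw new RuntimeException("Failed to read JMS message", e);
        }
    }
}
```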