
Hadoop Developer Resume


TX

SUMMARY:

  • Proactive IT developer with eight years of experience in the design and development of scalable systems using Hadoop technologies across various environments.
  • Strong understanding of Hadoop architecture and hands-on experience with Hadoop components such as Job Tracker, Task Tracker, Name Node, Data Node, and the HDFS framework.
  • Extensive experience in analyzing data using Hadoop ecosystem tools including Sqoop, Flume, Kafka, Storm, HDFS, Hive, Pig, Impala, Oozie, Zookeeper, Solr, NiFi, Spark SQL, and Spark Streaming.
  • Capable of processing large sets of structured, semi-structured, and unstructured data and supporting systems application architecture.
  • Configured Zookeeper, Cassandra, and Flume on the existing Hadoop cluster.
  • Experience in importing and exporting data between Hadoop Distributed File System (HDFS) and relational database systems using Sqoop.
  • Expertise in writing Hadoop jobs to analyze data using HiveQL, Pig Latin (a data-flow language), and custom MapReduce programs in Java.
  • Experience in creating custom UDFs for Pig and Hive to bring Python/Java methods and functionality into Pig Latin and HiveQL (a minimal sketch of such a UDF appears after this list).
  • Knowledge of developing NiFi flow prototypes for data ingestion into HDFS.
  • Experience in converting Hive queries into Spark transformations using Spark RDDs and Scala.
  • Hands on Experience in troubleshooting errors in HBase Shell, Pig, Hive, and MapReduce.
  • Hands-on experience in provisioning and managing multi-tenant Cassandra clusters on public cloud environments: Amazon Web Services (AWS) EC2 and OpenStack.
  • Experience with NoSQL databases such as HBase, Cassandra, and MongoDB, and their integration with the Hadoop cluster.
  • Good knowledge of integrating Mesos (DC/OS) into the cloud platform.
  • Experience in maintaining the big data platform using open-source technologies such as Spark and Elasticsearch.
  • Implemented partitioning on several fact tables in the data warehouse using clustered columnstore indexes.
  • Experience in installing, configuring, supporting, and managing Hadoop clusters using Hortonworks and Cloudera (CDH3, CDH4) distributions on Amazon Web Services (AWS).
  • Experience in configuring Flume agents to transfer data from external systems into HDFS.
  • Good knowledge of the SDLC, with experience in waterfall and agile methodologies such as Scrum.
  • Experienced in building data pipelines using Kafka and Akka to handle terabytes of data.
  • Good experience in general data analytics on distributed computing clusters such as Hadoop, using Apache Spark, Impala, and Scala.
  • Good understanding of YARN and Mesos.
  • Developed and deployed Apache NiFi flows across various environments, optimized NiFi data flows, and wrote QA scripts in Python to track missing files.
  • Designed and built solutions for real-time data ingestion using Kafka, Storm, Spark Streaming, and various NoSQL databases.
  • Developed Scala scripts and UDFs using both DataFrames/SQL and RDDs/MapReduce in Spark for data aggregation and queries, and wrote data back into RDBMS through Sqoop.
  • Experience in understanding the security requirements for Hadoop and integrating with Kerberos authentication and authorization infrastructure.
  • Good hands-on experience in creating RDDs and DataFrames for the required input data and performing data transformations using Spark with Scala.
  • Good knowledge of Spark Datasets.
  • Extensive experience working with Oracle, DB2, SQL Server, PL/SQL, and MySQL databases, and with core Java concepts such as OOP, multithreading, collections, and I/O.
  • Experienced in designing web applications using HTML5, CSS3, JavaScript, JSON, jQuery, AngularJS, Bootstrap, and Ajax on the Windows operating system.
  • Experience in Service-Oriented Architecture using web services such as SOAP and REST.
  • Knowledge of service-oriented architecture (SOA), workflows, and web services using XML, SOAP, and WSDL.
  • Extensive experience in middle-tier development using J2EE technologies such as JDBC, JNDI, JSP, Servlets, JSF, Struts, Spring, Hibernate, and EJB.
  • Good interpersonal and communication skills, strong problem-solving skills, ability to pick up new technologies with ease, and a good team member.
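
The summary above mentions custom Hive UDFs written in Java. The sketch below is a minimal, hypothetical example of that pattern, not code taken from any project listed here; the package, class name, and normalization logic are assumptions.

```java
// Hypothetical Hive UDF: trims and lower-cases a string column.
// Package name, class name, and logic are illustrative assumptions.
package com.example.hive.udf;

import org.apache.hadoop.hive.ql.exec.UDF;
import org.apache.hadoop.io.Text;

public final class NormalizeText extends UDF {
    // Hive calls evaluate() once per row; returning null propagates SQL NULL.
    public Text evaluate(Text input) {
        if (input == null) {
            return null;
        }
        return new Text(input.toString().trim().toLowerCase());
    }
}
```

Such a UDF would typically be packaged as a JAR, registered in HiveQL with ADD JAR and CREATE TEMPORARY FUNCTION, and then called like any built-in function.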

TECHNICAL SKILLS:

Big Data Ecosystem: HDFS, MapReduce, Hive, YARN, Pig, Sqoop, Kafka, Storm, Flume, Oozie, Zookeeper, Apache Spark, Apache Tez, Impala, NiFi, Akka, Apache Solr, ActiveMQ, Scala

NoSQL Databases: HBase, MongoDB, Cassandra

Programming Languages: C, C++, Java, J2EE, PL/SQL, Pig Latin, Scala, Python

Java/J2EE Technologies: JDBC, JNDI, JSON, JSTL, RMI, JMS, JavaScript, JSP, Servlets, EJB, JSF, jQuery, AngularJS

Frameworks: MVC, Struts, Spring, Hibernate

Operating Systems: Sun Solaris, HP-UX, Red Hat Linux, Ubuntu Linux, and Windows XP/Vista/7/8

Web Technologies: HTML, DHTML, XML, AJAX, WSDL, SOAP, Play framework

Web/Application servers: Apache Tomcat, WebLogic, JBoss.

Version control: GIT, SVN, CVS

Network Protocols: TCP/IP, UDP, HTTP, DNS, DHCP

PROFESSIONAL EXPERIENCE:

Confidential, TX

Hadoop Developer

Responsibilities:

  • Optimized Hive queries using various file formats such as JSON, Avro, ORC, and Parquet.
  • Worked on Spark RDD transformations to implement business analysis and applied actions on top of the transformations.
  • Experienced in working with the Spark ecosystem, using Spark SQL and Scala queries on different formats such as text, Avro, and Parquet files.
  • Worked on Spark Streaming to consume real-time data from Kafka and store the streamed data in HDFS (a minimal sketch of such a job appears after this list).
  • Developed Pig Latin scripts and Pig command-line transformations for data joins and custom processing of MapReduce outputs, and loaded tables from Hadoop to various clusters.
  • Developed Talend jobs for data ingestion, enrichment, and provisioning.
  • Worked on migrating HiveQL queries to Impala to minimize query response time.
  • Involved in loading data from edge node to HDFS using shell scripting.
  • Used DataFrames for data transformation.
  • Worked with Kerberos and integrated it into the Hadoop cluster to make it more secure against unauthorized access.
  • Migrated an existing on-premises application to AWS; used AWS services such as EC2 and S3 for small data sets.
  • Created Hive tables, dynamic partitions, buckets for sampling, and working on them using HQL.
  • Worked on Spark using Scala and Spark SQL for faster testing and processing of data.
  • Implemented a proof of concept using Kafka and HBase for processing streaming data.
  • Involved in advanced procedures such as text analytics and processing, using the in-memory computing capabilities of Apache Spark written in Scala.
  • Worked with BI teams in generating reports and designing ETL workflows on Tableau; deployed data from various sources into HDFS and built reports using Tableau.
  • Written Python scripts to analyze the data of the customers.
  • Implemented Talend jobs to load data from different sources and integrated with Kafka.
  • Integrated Oozie with the rest of the Hadoop stack supporting several types of Hadoop jobs out of the box (such as Map-Reduce, Pig, Hive, and Sqoop) as well as system specific jobs (such as Java programs and shell scripts).
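
As referenced above, one responsibility was a Spark Streaming job that consumed data from Kafka and stored it in HDFS. The following is a hedged sketch of that kind of job against the spark-streaming-kafka-0-10 Java API; the broker address, topic name, group id, batch interval, and output path are assumptions, not details of the actual project.

```java
import java.util.Collections;
import java.util.HashMap;
import java.util.Map;

import org.apache.kafka.clients.consumer.ConsumerRecord;
import org.apache.kafka.common.serialization.StringDeserializer;
import org.apache.spark.SparkConf;
import org.apache.spark.streaming.Durations;
import org.apache.spark.streaming.api.java.JavaInputDStream;
import org.apache.spark.streaming.api.java.JavaStreamingContext;
import org.apache.spark.streaming.kafka010.ConsumerStrategies;
import org.apache.spark.streaming.kafka010.KafkaUtils;
import org.apache.spark.streaming.kafka010.LocationStrategies;

public class KafkaToHdfsJob {
    public static void main(String[] args) throws InterruptedException {
        SparkConf conf = new SparkConf().setAppName("kafka-to-hdfs");
        // 30-second micro-batches; the interval is an assumption.
        JavaStreamingContext ssc = new JavaStreamingContext(conf, Durations.seconds(30));

        Map<String, Object> kafkaParams = new HashMap<>();
        kafkaParams.put("bootstrap.servers", "broker1:9092");       // assumed broker
        kafkaParams.put("key.deserializer", StringDeserializer.class);
        kafkaParams.put("value.deserializer", StringDeserializer.class);
        kafkaParams.put("group.id", "hdfs-sink");                   // assumed group id

        JavaInputDStream<ConsumerRecord<String, String>> stream =
            KafkaUtils.createDirectStream(
                ssc,
                LocationStrategies.PreferConsistent(),
                ConsumerStrategies.<String, String>Subscribe(
                    Collections.singletonList("events"), kafkaParams)); // assumed topic

        // Write each non-empty micro-batch of message values to a time-stamped HDFS directory.
        stream.map(ConsumerRecord::value)
              .foreachRDD((rdd, time) -> {
                  if (!rdd.isEmpty()) {
                      rdd.saveAsTextFile("hdfs:///data/raw/events/" + time.milliseconds());
                  }
              });

        ssc.start();
        ssc.awaitTermination();
    }
}
```

In production such a job would typically also enable checkpointing and manage Kafka offsets explicitly; both are omitted here to keep the sketch short.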

Environment: MapReduce, HDFS, Spark, Scala, Python, Kafka, Hive, Pig, Spark Streaming, Talend, HBase, Tableau, Maven, Jenkins, UNIX, MRUnit, Git.

Confidential, Iowa

Hadoop Developer.

Responsibilities:

  • Worked on Spark SQL to handle structured data in Hive.
  • Involved in creating Hive tables, loading data, writing Hive queries, and creating partitions and buckets for optimization.
  • Involved in migrating tables from RDBMS into Hive tables using Sqoop and later generating visualizations using Tableau.
  • Worked on complex MapReduce programs to analyze data that exists on the cluster.
  • Analyzed substantial data sets by running Hive queries and Pig scripts.
  • Wrote Hive UDFs to sort struct fields and return complex data types.
  • Worked in AWS environment for development and deployment of custom Hadoop applications.
  • Involved in creating Shell scripts to simplify the execution of all other scripts (Pig, Hive, Sqoop, Impala and MapReduce) and move the data inside and outside of HDFS.
  • Created files and tuned SQL queries in Hive using Hue.
  • Involved in collecting and aggregating large amounts of log data using Storm and staging data in HDFS for further analysis.
  • Created the Hive external tables using Accumulo connector.
  • Managed real time data processing and real time Data Ingestion in MongoDB and Hive using Storm.
  • Created custom Solr query components to optimize search matching.
  • Developed Spark scripts by using Python shell commands.
  • Stored the processed results in the data warehouse and maintained the data using Hive.
  • Experienced in working with the Spark ecosystem, using Spark SQL and Scala queries on different formats such as text and CSV files (a minimal sketch appears after this list).
  • Created Oozie workflow and Coordinator jobs to kick off the jobs on time for data availability.
  • Worked with NoSQL databases such as MongoDB, creating MongoDB collections to load large sets of semi-structured data.
  • Installed Oozie workflow engine to run multiple Hive and Pig jobs, which run independently with time and data availability.
  • Worked with and learned a great deal about Amazon Web Services (AWS) cloud services such as EC2, S3, and EMR.
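
The list above mentions Spark SQL work over Hive data and over text/CSV inputs; below is a minimal sketch of that pattern in Java. The input path, column names, aggregation, and target table are illustrative assumptions.

```java
import org.apache.spark.sql.Dataset;
import org.apache.spark.sql.Row;
import org.apache.spark.sql.SparkSession;

public class CsvToHiveAggregation {
    public static void main(String[] args) {
        SparkSession spark = SparkSession.builder()
                .appName("csv-to-hive-aggregation")
                .enableHiveSupport()                 // allows reading/writing Hive tables
                .getOrCreate();

        // Read a header-bearing CSV from HDFS; the path and schema are assumptions.
        Dataset<Row> orders = spark.read()
                .option("header", "true")
                .option("inferSchema", "true")
                .csv("hdfs:///data/staging/orders.csv");

        orders.createOrReplaceTempView("orders");

        // Plain Spark SQL over the registered view.
        Dataset<Row> dailyTotals = spark.sql(
                "SELECT order_date, SUM(amount) AS total_amount "
                + "FROM orders GROUP BY order_date");

        // Persist the aggregate as a Hive table for downstream reporting (e.g. Tableau).
        dailyTotals.write().mode("overwrite").saveAsTable("reporting.daily_totals");

        spark.stop();
    }
}
```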

Environment: HDFS, MapReduce, Storm, Hive, Pig, Sqoop, MongoDB, Apache Spark, Python, Accumulo, Oozie Scheduler, Kerberos, AWS, Tableau, Java, UNIX Shell scripts, HUE, Solr, Git, Maven.

Confidential, MA

Hadoop Developer.

Responsibilities:

  • Responsible for importing log files from various sources into HDFS using Flume.
  • Handled big data using a Hadoop cluster consisting of 40 nodes.
  • Performed complex HiveQL queries on Hive tables.
  • Captured the data logs from web servers into HDFS using Flume & Splunk for analysis.
  • Implemented partitioning, dynamic partitions, and buckets in Hive.
  • Exported data from DB2 to HDFS using Sqoop and developed MapReduce jobs using the Java API (a minimal sketch appears after this list).
  • Created final tables in Parquet format.
  • Developed PIG scripts for source data validation and transformation.
  • Developed Shell, and Python scripts to automate and provide Control flow to Pig scripts.
  • Involved in unit testing of MapReduce jobs using MRUnit.
  • Utilized Hive and Pig to create BI reports.
  • Developed data integration programs in a Hadoop environment with the NoSQL data store Cassandra for data access and analysis.
  • Used the Oozie workflow engine to manage interdependent Hadoop jobs and to automate several types of Hadoop jobs, such as Java MapReduce, Hive, Pig, and Sqoop.
  • Worked with Informatica MDM in creating single view of the data.
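
One bullet above refers to MapReduce jobs written against the Java API over data brought in from DB2 via Sqoop. The sketch below shows the general shape of such a job; the tab-delimited input format, the choice of the first field as the grouping key, and the HDFS paths are assumptions made for illustration.

```java
import java.io.IOException;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.LongWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Job;
import org.apache.hadoop.mapreduce.Mapper;
import org.apache.hadoop.mapreduce.Reducer;
import org.apache.hadoop.mapreduce.lib.input.FileInputFormat;
import org.apache.hadoop.mapreduce.lib.output.FileOutputFormat;

public class RecordCountByKey {

    public static class KeyMapper extends Mapper<LongWritable, Text, Text, LongWritable> {
        private static final LongWritable ONE = new LongWritable(1);

        @Override
        protected void map(LongWritable offset, Text line, Context context)
                throws IOException, InterruptedException {
            // Assume the first tab-delimited field is the grouping key.
            String[] fields = line.toString().split("\t");
            context.write(new Text(fields[0]), ONE);
        }
    }

    public static class SumReducer extends Reducer<Text, LongWritable, Text, LongWritable> {
        @Override
        protected void reduce(Text key, Iterable<LongWritable> counts, Context context)
                throws IOException, InterruptedException {
            long total = 0;
            for (LongWritable count : counts) {
                total += count.get();
            }
            context.write(key, new LongWritable(total));
        }
    }

    public static void main(String[] args) throws Exception {
        Job job = Job.getInstance(new Configuration(), "record-count-by-key");
        job.setJarByClass(RecordCountByKey.class);
        job.setMapperClass(KeyMapper.class);
        job.setCombinerClass(SumReducer.class);
        job.setReducerClass(SumReducer.class);
        job.setOutputKeyClass(Text.class);
        job.setOutputValueClass(LongWritable.class);
        FileInputFormat.addInputPath(job, new Path("/data/db2_export"));      // assumed input
        FileOutputFormat.setOutputPath(job, new Path("/data/record_counts")); // assumed output
        System.exit(job.waitForCompletion(true) ? 0 : 1);
    }
}
```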

Environment: Hortonworks, HDFS, Pig, Hive, MapReduce, Java, Informatica, Oozie, Linux/Unix Shell scripting, Cassandra, Python, Perl, Java (JDK 1.7), Git, Maven, Jenkins.

Confidential, NJ

Java Developer.

Responsibilities:

  • Effectively interacted with team members and business users for requirements gathering.
  • Involved in analysis, design, and implementation phases of the software development lifecycle (SDLC).
  • Implemented Spring core J2EE patterns such as MVC, Dependency Injection (DI), and Inversion of Control (IoC).
  • Implemented REST web services with the Jersey API to handle customer requests (a minimal sketch appears after this list).
  • Developed test cases using JUnit and used Log4j as the logging framework.
  • Worked with HQL and the Criteria API for retrieving data elements from the database.
  • Developed user interface using HTML, Spring Tags, JavaScript, JQuery, and CSS.
  • Developed the application using Eclipse IDE and worked under Agile Environment.
  • Designed and implemented front-end web pages using CSS, JSP, HTML, JavaScript, Ajax, and Struts.
  • Used the Eclipse IDE as the development environment to design, develop, and deploy Spring components on WebLogic.
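
The Jersey-based REST services mentioned above would typically take the shape of the hypothetical JAX-RS resource below; the resource path, the Customer DTO, and the in-memory lookup are assumptions standing in for the real Spring-backed service.

```java
import javax.ws.rs.GET;
import javax.ws.rs.Path;
import javax.ws.rs.PathParam;
import javax.ws.rs.Produces;
import javax.ws.rs.core.MediaType;
import javax.ws.rs.core.Response;

@Path("/customers")
public class CustomerResource {

    @GET
    @Path("/{id}")
    @Produces(MediaType.APPLICATION_JSON)
    public Response getCustomer(@PathParam("id") long id) {
        // In the real application this would delegate to a Spring-managed service;
        // a placeholder lookup keeps the sketch self-contained.
        Customer customer = findCustomer(id);
        if (customer == null) {
            return Response.status(Response.Status.NOT_FOUND).build();
        }
        return Response.ok(customer).build();
    }

    private Customer findCustomer(long id) {
        return id == 1 ? new Customer(1, "Sample Customer") : null;
    }

    // Hypothetical DTO serialized to JSON by the configured provider.
    public static class Customer {
        public long id;
        public String name;

        public Customer(long id, String name) {
            this.id = id;
            this.name = name;
        }
    }
}
```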

Environment: Java, J2EE, HTML, JavaScript, CSS, jQuery, Spring 3.0, JNDI, Hibernate 3.0, JavaMail, Web Services, REST, Oracle 10g, JUnit, Log4j, Eclipse, WebLogic 10.3.

Confidential

Java Developer.

Responsibilities:

  • Designed and implemented the training and reports modules of the application using Servlets, JSP and Ajax.
  • Developed custom JSP tags for the application.
  • Wrote queries for fetching and manipulating data using the iBATIS ORM framework.
  • Used Quartz schedulers to run jobs sequentially at scheduled times (a minimal sketch appears after this list).
  • Implemented design patterns like Filter, Cache Manager, and Singleton to improve the performance of the application.
  • Implemented the reports module of the application using Jasper Reports to display dynamically generated reports for business intelligence.
  • Deployed the application in client’s location on Tomcat Server.
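
The Quartz scheduling mentioned above generally follows the pattern sketched below; the job class, trigger schedule, and report-generation placeholder are assumptions rather than details of the actual application.

```java
import org.quartz.CronScheduleBuilder;
import org.quartz.Job;
import org.quartz.JobBuilder;
import org.quartz.JobDetail;
import org.quartz.JobExecutionContext;
import org.quartz.Scheduler;
import org.quartz.SchedulerException;
import org.quartz.Trigger;
import org.quartz.TriggerBuilder;
import org.quartz.impl.StdSchedulerFactory;

public class NightlyReportScheduler {

    // The work to run; in the application this would generate the Jasper reports.
    public static class ReportJob implements Job {
        @Override
        public void execute(JobExecutionContext context) {
            System.out.println("Generating nightly reports...");
        }
    }

    public static void main(String[] args) throws SchedulerException {
        Scheduler scheduler = StdSchedulerFactory.getDefaultScheduler();

        JobDetail job = JobBuilder.newJob(ReportJob.class)
                .withIdentity("nightlyReports")
                .build();

        // Fire every day at 01:00; the schedule itself is an assumption.
        Trigger trigger = TriggerBuilder.newTrigger()
                .withSchedule(CronScheduleBuilder.dailyAtHourAndMinute(1, 0))
                .build();

        scheduler.scheduleJob(job, trigger);
        scheduler.start();
    }
}
```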

Environment: HTML, JavaScript, Ajax, Java, Servlets, JSP, iBATIS, Tomcat Server, SQL Server, Jasper Reports.
