Hadoop Developer Resume
Washington, D.C.
SUMMARY
- Overall 5 years of professional experience in the IT industry, including 2 years of experience in Hadoop development, solving problems and delivering high-quality results in a fast-paced environment, with additional experience in Core Java programming.
- In-depth understanding and knowledge of Hadoop architecture and various components such as HDFS, MapReduce, Kafka, Sqoop, Hive, Pig, Spark SQL, YARN, Hue, and HCatalog.
- Good knowledge of the Cloudera distribution of Apache Hadoop.
- Worked independently with Cloudera support on issues and concerns with the Hadoop cluster.
- Hands-on experience with messaging services such as JMS, Kafka, and Flume.
- Experience in NoSQL databases such as MongoDB, HBase, and Elasticsearch.
- Hands-on experience in developing Hive UDFs and UDAFs, and Pig macros and UDFs.
- Extensive experience in validating and cleansing data using Hive queries and Pig statements.
- Experience in writing robust/reusable Hive Queries for processing and analyzing large volumes of data.
- Read, processed, and stored data in parallel using the Hive Query Language.
- Good knowledge of systems built with Spark and Java.
- Strong analytical, problem-solving, and communication skills with the ability to work in a group or independently.
- Used Kafka and Spark Streaming for real-time stream processing (a brief sketch follows this list).
- Supported MapReduce programs running on the cluster.
- Experience in extracting source data from sequential files, XML, JSON, and other file formats, and transforming and loading it into the target data warehouse using Sqoop with Bash scripts.
- Experience in collecting, aggregating, and moving data from various sources using Kafka.
- Experience in using Spark for data manipulation, preparation, and cleansing.
- Hands-on experience with Spark Core, Spark SQL, and Spark Streaming using PySpark.
- Used Spark-SQL to perform transformations and actions on data residing in Hive and MongoDB.
- Worked on a Kerberized Hadoop cluster with 250 nodes on Cloudera distribution 5.4.5.
- Migrated existing data to Hadoop from RDBMS sources (MySQL, SQL Server, and Oracle) using Sqoop.
- Worked on external tables with proper partitions for efficiency and loaded the structured data into them.
- Experience in managing and reviewing Hadoop Log files.
- Responsible for 250+ RHEL servers in an enterprise environment: supported hardware and software issues in production, installed and configured software, applied patches, and troubleshot performance issues.
- Fine-tuned Linux systems for better performance, modifying kernel parameters to achieve optimal system performance.
- Used JMS and created MDBs, sender and receiver components, and test servlets to check program results.
- Experienced in using monitoring tools such as top, sar, vmstat, iostat, and netstat to identify resource issues on Linux servers and provide recommendations.
- Experience in development of Java applications.
- Work experience in developing programs with Core Java.
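As a minimal illustration of the Kafka and Spark Streaming experience noted above, the sketch below consumes a Kafka topic through the spark-streaming-kafka-0-10 integration and counts the records in each micro-batch. The broker address, topic name, and consumer group are placeholders, not details from any specific project.

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.Map;

import org.apache.kafka.clients.consumer.ConsumerRecord;
import org.apache.kafka.common.serialization.StringDeserializer;
import org.apache.spark.SparkConf;
import org.apache.spark.streaming.Durations;
import org.apache.spark.streaming.api.java.JavaInputDStream;
import org.apache.spark.streaming.api.java.JavaStreamingContext;
import org.apache.spark.streaming.kafka010.ConsumerStrategies;
import org.apache.spark.streaming.kafka010.KafkaUtils;
import org.apache.spark.streaming.kafka010.LocationStrategies;

public class KafkaStreamCount {
  public static void main(String[] args) throws InterruptedException {
    SparkConf conf = new SparkConf().setAppName("KafkaStreamCount");
    // 10-second micro-batches
    JavaStreamingContext jssc = new JavaStreamingContext(conf, Durations.seconds(10));

    Map<String, Object> kafkaParams = new HashMap<>();
    kafkaParams.put("bootstrap.servers", "broker1:9092");        // assumed broker address
    kafkaParams.put("key.deserializer", StringDeserializer.class);
    kafkaParams.put("value.deserializer", StringDeserializer.class);
    kafkaParams.put("group.id", "stream-count");                  // assumed consumer group
    kafkaParams.put("auto.offset.reset", "latest");

    // Direct stream over an assumed "events" topic
    JavaInputDStream<ConsumerRecord<String, String>> stream = KafkaUtils.createDirectStream(
        jssc,
        LocationStrategies.PreferConsistent(),
        ConsumerStrategies.<String, String>Subscribe(Arrays.asList("events"), kafkaParams));

    // Count records per micro-batch and print the result on the driver.
    stream.count().print();

    jssc.start();
    jssc.awaitTermination();
  }
}
```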
TECHNICAL SKILLS
Big Data Ecosystem: Hadoop, MapReduce, Spark, HDFS, YARN, Hue, HCatalog.
Ingestion Tools: Sqoop, Kafka.
Databases: HBase, MongoDB, MySQL, Oracle.
Programming Languages: Java, Scala.
Scripting/Query Languages: HiveQL, Pig Latin, Bash, XML, HTML, CSS.
Web/Application Servers: Apache HTTP Server, Apache Tomcat.
Operating Systems: Linux (Red Hat, CentOS).
Virtualization: VMware vSphere, vCenter.
System Monitoring Tools: sar, vmstat, iostat, top, tcpdump, ps.
Cloud Technologies: Amazon Web Services (AWS): EC2, EMR, VPC, RDS, Auto Scaling, S3, AWS Import/Export.
PROFESSIONAL EXPERIENCE
Confidential - Washington D.C.
Hadoop Developer
Responsibilities:
- Involved in choosing the right configurations for Hadoop.
- Requirement gathering from the Business Partners and Subject Matter Experts.
- Played a major role in Hadoop cluster installation, configuration, and monitoring.
- Developed a data pipeline using Kafka, Spark, and HBase to ingest, process, and store data.
- Selected HBase as the database since the data was a natural fit for a NoSQL store.
- Wrote Kafka configuration files for importing streamed log data into HBase.
- Analyzed and defined the research strategy and determined the system architecture and requirements needed to achieve the goals.
- Developed multiple Kafka producers and consumers, and used ZooKeeper to maintain smooth data flow as per the software requirement specifications.
- Wrote a Java program to connect Kafka with Spark Streaming, using Eclipse on the Cloudera distribution.
- Configured Spark Streaming to consume ongoing data from Kafka and store the stream in HBase (a sketch of this step follows this list).
- Wrote a Scala program to verify that data was arriving in HBase.
- Used various Spark Transformations and Actions for cleansing the input data.
- Developed shell scripts to generate Hive CREATE TABLE statements from the data and to load the data into the tables.
- Wrote MapReduce jobs using the Java API and Pig Latin.
- Optimized HiveQL and Pig scripts by using Spark as the execution engine.
- Involved in writing custom MapReduce programs using the Java API for data processing.
- Involved in developing a linear regression model, built with Spark and the Scala API, to predict a continuous measurement and improve observations on the data.
- Worked on the development of data analysis in Spark.
- Implemented Spark scripts using Scala and Spark SQL to load Hive tables into Spark for faster data processing.
- Created Hive tables as per requirements, as internal or external tables, with appropriate static or dynamic partitions and bucketing for efficiency.
- Loaded and transformed large sets of structured and semi-structured data using Hive.
- Wrote Spark jobs in Scala to analyze customer data and sales history.
- Involved in designing the HBase row key and schema for HBase tables storing Text, JSON, Parquet, and Avro format files.
- Used Spark and Spark SQL with the Scala API to read the Parquet data and create tables in Hive (a second sketch follows this list).
- Developed Hive queries for the analysts.
- Developed Oozie workflows to automate the tasks of loading data into HDFS and pre-processing it with Pig.
- Implemented Spark jobs using Scala, DataFrames, and the Spark SQL API for faster testing and processing of data.
- Involved in performing analytics and visualization on the log data, estimating the error rate and studying the probability of future errors using regression models.
- Used Kafka to build a customer activity tracking pipeline as a set of real-time publish-subscribe feeds.
- Exported the analyzed data from HBase to Oracle using Sqoop for visualization and to generate reports for the BI team.
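A rough sketch of how streamed records could be persisted to HBase from Spark Streaming, as referenced above. The table name "web_logs", the column family "d", and the row-key scheme are assumptions for illustration, not the project's actual design.

```java
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.hbase.HBaseConfiguration;
import org.apache.hadoop.hbase.TableName;
import org.apache.hadoop.hbase.client.Connection;
import org.apache.hadoop.hbase.client.ConnectionFactory;
import org.apache.hadoop.hbase.client.Put;
import org.apache.hadoop.hbase.client.Table;
import org.apache.hadoop.hbase.util.Bytes;
import org.apache.spark.streaming.api.java.JavaDStream;

public class HBaseStreamWriter {

  // Persist each micro-batch of log lines into an HBase table.
  public static void saveToHBase(JavaDStream<String> logLines) {
    logLines.foreachRDD(rdd ->
        rdd.foreachPartition(records -> {
          // One HBase connection per partition, closed automatically.
          Configuration conf = HBaseConfiguration.create();
          try (Connection connection = ConnectionFactory.createConnection(conf);
               Table table = connection.getTable(TableName.valueOf("web_logs"))) {
            while (records.hasNext()) {
              String line = records.next();
              // Row key: timestamp plus hash to spread writes; replace with the real key design.
              Put put = new Put(Bytes.toBytes(System.currentTimeMillis() + "-" + line.hashCode()));
              put.addColumn(Bytes.toBytes("d"), Bytes.toBytes("raw"), Bytes.toBytes(line));
              table.put(put);
            }
          }
        }));
  }
}
```

Opening the connection inside foreachPartition keeps the non-serializable HBase client off the driver and amortizes connection cost over each partition.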
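The second sketch shows the general Spark SQL pattern for reading Parquet data and saving the result as a Hive table; the paths, table names, and column names are illustrative placeholders only.

```java
import org.apache.spark.sql.Dataset;
import org.apache.spark.sql.Row;
import org.apache.spark.sql.SparkSession;

public class ParquetToHive {
  public static void main(String[] args) {
    // Hive support lets saveAsTable create a table in the Hive metastore.
    SparkSession spark = SparkSession.builder()
        .appName("ParquetToHive")
        .enableHiveSupport()
        .getOrCreate();

    // Read Parquet files from an assumed HDFS location.
    Dataset<Row> sales = spark.read().parquet("/data/sales/parquet");
    sales.createOrReplaceTempView("sales_raw");

    // Aggregate with Spark SQL; column names are hypothetical.
    Dataset<Row> daily = spark.sql(
        "SELECT sale_date, SUM(amount) AS total_amount FROM sales_raw GROUP BY sale_date");

    // Persist the result as a Hive table for downstream analysts.
    daily.write().mode("overwrite").saveAsTable("analytics.daily_sales");

    spark.stop();
  }
}
```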
Environment: Hadoop, Cloudera, HDFS, Pig, Hive, Kafka, Sqoop, Spark, Scala, HBase, MySQL, Oozie, Shell Scripting, Red Hat Linux, Java.
Confidential - Durham, NC.
Jr. Hadoop Developer
Responsibilities:
- Involved in collecting and aggregating large amounts of streaming data into HDFS using Flume and defined channel selectors to multiplex data into different sinks.
- Created a Twitter application with Flume to fetch data from Twitter.
- Played a major role in implementing complex MapReduce programs to perform map-side joins using the distributed cache.
- Experienced in developing complex MapReduce programs against structured and unstructured data.
- Responsible for troubleshooting issues in the execution of MapReduce jobs by inspecting and reviewing log files.
- Converted existing SQL queries into HiveQL queries.
- Experienced in loading data into Hive and accessing the data from Hive.
- Loaded and transformed large sets of structured, semi-structured, and unstructured data in various formats such as text, zip, XML, and JSON.
- Wrote multiple MapReduce programs for data extraction, transformation, and aggregation from multiple file formats including XML, JSON, CSV, and other compressed file formats (a sketch follows this list).
- Refined website clickstream data from Omniture logs and moved it into Hive.
- Developed programs using scripting languages such as Pig to manipulate the data.
- Optimized MapReduce jobs to use HDFS efficiently by using various compression mechanisms.
- Developed Pig UDFs for manipulating data according to business requirements and also worked on developing custom Pig loaders (a second sketch follows this list).
- A Pig UDF was required to extract area information from the large volumes of data received from the sensors.
- Maintained the project's tracking records.
- Created Hive tables according to the company requirements.
- Experience in working with very large data sets.
- Built programs that leverage the parallel processing capabilities of Hadoop and MPP platforms.
- Involved in the design, integration, and implementation of the MongoDB NoSQL database.
- Loaded data into the MongoDB NoSQL database.
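As a sketch of the MapReduce extraction and aggregation work described above, the job below counts clicks per page from CSV input. The field layout, class names, and input/output paths are assumptions for illustration.

```java
import java.io.IOException;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.LongWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Job;
import org.apache.hadoop.mapreduce.Mapper;
import org.apache.hadoop.mapreduce.Reducer;
import org.apache.hadoop.mapreduce.lib.input.FileInputFormat;
import org.apache.hadoop.mapreduce.lib.output.FileOutputFormat;

public class ClickAggregation {

  // Mapper: parse each CSV record and emit (pageId, 1).
  public static class ClickMapper extends Mapper<LongWritable, Text, Text, LongWritable> {
    private static final LongWritable ONE = new LongWritable(1);
    private final Text pageId = new Text();

    @Override
    protected void map(LongWritable key, Text value, Context context)
        throws IOException, InterruptedException {
      String[] fields = value.toString().split(",");
      if (fields.length > 2) {          // assumed layout: userId,pageId,timestamp
        pageId.set(fields[1]);
        context.write(pageId, ONE);
      }
    }
  }

  // Reducer: sum the click counts for each page.
  public static class ClickReducer extends Reducer<Text, LongWritable, Text, LongWritable> {
    @Override
    protected void reduce(Text key, Iterable<LongWritable> values, Context context)
        throws IOException, InterruptedException {
      long sum = 0;
      for (LongWritable v : values) {
        sum += v.get();
      }
      context.write(key, new LongWritable(sum));
    }
  }

  public static void main(String[] args) throws Exception {
    Job job = Job.getInstance(new Configuration(), "click-aggregation");
    job.setJarByClass(ClickAggregation.class);
    job.setMapperClass(ClickMapper.class);
    job.setCombinerClass(ClickReducer.class);   // combiner reuses the reducer to cut shuffle volume
    job.setReducerClass(ClickReducer.class);
    job.setOutputKeyClass(Text.class);
    job.setOutputValueClass(LongWritable.class);
    FileInputFormat.addInputPath(job, new Path(args[0]));
    FileOutputFormat.setOutputPath(job, new Path(args[1]));
    System.exit(job.waitForCompletion(true) ? 0 : 1);
  }
}
```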
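The second sketch is a hypothetical Pig EvalFunc UDF of the kind described above, extracting an area code from a pipe-delimited sensor record; the record layout and class name are assumed.

```java
import java.io.IOException;

import org.apache.pig.EvalFunc;
import org.apache.pig.data.Tuple;

// Hypothetical UDF that extracts the area code from a sensor record.
public class ExtractArea extends EvalFunc<String> {
  @Override
  public String exec(Tuple input) throws IOException {
    if (input == null || input.size() == 0 || input.get(0) == null) {
      return null;
    }
    // Assumed record layout: "<areaCode>|<sensorId>|<reading>"
    String record = input.get(0).toString();
    String[] parts = record.split("\\|");
    return parts.length > 0 ? parts[0] : null;
  }
}
```

After a REGISTER of the UDF jar in the Pig script, the function can be called as ExtractArea(line) inside a FOREACH ... GENERATE statement.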
Environment: Flume, Pig, Hive, MongoDB, Sqoop, and Cloudera Manager.
Confidential
Hadoop Administrator
Responsibilities:
- Collaborated with teams in Hadoop development on cluster planning, hardware requirements, server configurations, and network equipment to implement clusters on the Cloudera Distribution of Hadoop.
- Involved in the development and ongoing administration of the Hadoop infrastructure.
- Implemented commissioning and decommissioning of data nodes, updating the metadata of the name node, killing unresponsive task trackers, and dealing with blacklisted task trackers.
- Created HBase tables to load large sets of structured, semi-structured, and unstructured data coming from Oracle, NoSQL, and various portfolios (a brief sketch follows this list).
- Created a Derby database to store the log files generated by Hive.
- Resolved tickets submitted by users, troubleshooting the documented errors and resolving them.
- Involved in creating Hive tables and loading and analyzing data using Hive queries.
- Supported in setting up QA environment and updating configurations for implementing scripts with Pig, Hive, and Sqoop.
- Created workflow using Oozie to automate the tasks of loading the data into HDFS and pre-processing with Pig.
- Assisted in importing data to HDFS and exporting analyzed data to relational databases using Sqoop.
- Developed multiple MapReduce jobs in Java for data cleaning and preprocessing.
- Automated scripts to monitor HDFS and HBase through cron jobs.
- Supported code/design analysis, strategy development, and project planning.
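A brief, hypothetical sketch of creating one of the HBase tables mentioned above through the Java Admin API (HBase 1.x style); the table and column-family names are placeholders.

```java
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.hbase.HBaseConfiguration;
import org.apache.hadoop.hbase.HColumnDescriptor;
import org.apache.hadoop.hbase.HTableDescriptor;
import org.apache.hadoop.hbase.TableName;
import org.apache.hadoop.hbase.client.Admin;
import org.apache.hadoop.hbase.client.Connection;
import org.apache.hadoop.hbase.client.ConnectionFactory;

public class CreatePortfolioTable {
  public static void main(String[] args) throws Exception {
    // Picks up hbase-site.xml from the classpath.
    Configuration conf = HBaseConfiguration.create();
    try (Connection connection = ConnectionFactory.createConnection(conf);
         Admin admin = connection.getAdmin()) {
      TableName tableName = TableName.valueOf("portfolio_data");   // assumed table name
      if (!admin.tableExists(tableName)) {
        HTableDescriptor descriptor = new HTableDescriptor(tableName);
        descriptor.addFamily(new HColumnDescriptor("d"));          // single column family for the data
        admin.createTable(descriptor);
      }
    }
  }
}
```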
Environment: Oozie, Sqoop, Pig Latin, HBase, Oracle.
Confidential
Java Developer
Responsibilities:
- Participated in sprint planning and collaborated with product owners to identify and prioritize product and technical requirements.
- Used various Core Java techniques such as exception handling, data structures, and collections to implement various features and enhancements.
- Provide architectural solutions as needed across applications involved in the development.
- Coordinated multiple development teams to complete features.
- Developed new projects and enhancements and maintained existing programs to support the online application.
- Periodically communicated project status to stakeholders.
- Worked on design patterns and was involved in design decisions.
- Used JMS to connect the application in India with the regional services in the USA.
- Created the sender and receiver code in Java (a brief sketch follows this list).
- Developed a Message-Driven Bean so that when a customer in India received the courier, a message was sent to management.
- Developed a Hibernate persistence layer to simplify the Java application's interaction with databases such as Oracle.
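A minimal sketch of the JMS sender described above, assuming a JNDI-registered connection factory and queue; the lookup names and message text are placeholders for illustration.

```java
import javax.jms.Connection;
import javax.jms.ConnectionFactory;
import javax.jms.MessageProducer;
import javax.jms.Queue;
import javax.jms.Session;
import javax.jms.TextMessage;
import javax.naming.InitialContext;

public class CourierStatusSender {
  public static void main(String[] args) throws Exception {
    // JNDI lookups; the names are assumptions.
    InitialContext ctx = new InitialContext();
    ConnectionFactory factory = (ConnectionFactory) ctx.lookup("jms/ConnectionFactory");
    Queue queue = (Queue) ctx.lookup("jms/CourierStatusQueue");

    Connection connection = factory.createConnection();
    try {
      Session session = connection.createSession(false, Session.AUTO_ACKNOWLEDGE);
      MessageProducer producer = session.createProducer(queue);
      TextMessage message = session.createTextMessage("Courier delivered to customer");
      producer.send(message);
    } finally {
      connection.close();
    }
  }
}
```

On the receiving side, the Message-Driven Bean's onMessage callback would react to this message and notify management.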
Environment: Core Java, JMS, Hibernate Framework