Sr. Hadoop Developer Resume, Fairfax, VA

SUMMARY

9 years of professional IT work experience in Analysis, Design, Administration, Development, Deployment and Maintenance of critical software and big data applications.
Over 3 years of experience on Big Data platforms as both developer and administrator.
Hands-on experience in developing and deploying enterprise applications using major Hadoop ecosystem components such as MapReduce, YARN, Hive, Pig, HBase, Flume, Sqoop, Spark Streaming, Spark SQL, Storm, Kafka, Oozie and Cassandra.
Hands-on experience in using the MapReduce programming model for batch processing of data stored in HDFS.
Exposure to administrative tasks such as installing Hadoop and its ecosystem components, including Hive and Pig.
Installed and configured multiple Hadoop clusters of different sizes and with ecosystem components like Pig, Hive, Sqoop, Flume, HBase, Oozie and Zookeeper.
Worked on the major Hadoop distributions, Cloudera and Hortonworks.
Responsible for designing and building a Data Lake using Hadoop and its ecosystem components.
Very good experience in complete project life cycle (design, development, testing and implementation) of Client Server and Web applications.
Developed Spark applications using Scala and Java, and implemented an Apache Spark data processing project to handle data from various RDBMS and streaming sources.
Worked with Spark to improve the performance of and optimize existing Hadoop algorithms using Spark Context, Spark SQL, Spark MLlib, DataFrames, pair RDDs and Spark on YARN.
Experienced in using Apache Spark with Scala to implement advanced procedures such as text analytics and processing, using its in-memory computing capabilities.
Experience in installation, configuration, management, support and monitoring of Hadoop clusters using various distributions such as Apache and Cloudera.
Experience with middleware architecture using Sun Java technologies such as J2EE and Servlets, and application servers such as WebSphere and WebLogic.
Used different Spark modules such as Spark Core, Spark RDDs, Spark DataFrames and Spark SQL.
Converted various Hive queries into the required Spark transformations and actions.
Experience in working on the open-source Apache Hadoop distribution with technologies such as HDFS, MapReduce, Python, Pig, Hive, Hue, HBase, Sqoop, Oozie, Zookeeper, Spark, Spark Streaming, Storm, Kafka, Cassandra, Impala, Snappy, Greenplum, MongoDB and Mesos.
In-depth knowledge of Scala and experience building Spark applications using Scala.
Good experience working with Tableau and Spotfire, and enabled JDBC/ODBC data connectivity from both tools to Hive tables.
Designed neat and insightful dashboards in Tableau.
Designed and worked on an array of reports, including Crosstab, Chart, Drill-Down, Drill-Through, Customer-Segment and Geodemographic segmentation.
Deep understanding of Tableau features such as site and server administration, calculated fields, table calculations, parameters, filters (normal and quick), highlighting, level of detail, granularity, aggregation, reference lines and many more.
Adequate knowledge of Scrum, Agile and Waterfall methodologies.
Designed and developed multiple J2EE Model 2 MVC-based web applications.
Worked with various tools and IDEs such as Eclipse, IBM Rational, the Apache Ant build tool, MS Office, PL/SQL Developer and SQL*Plus.
Highly motivated, with the ability to work independently or as an integral part of a team, and committed to the highest levels of professionalism.

TECHNICAL SKILLS

Big Data Technologies: Hadoop, HDFS, Hive, MapReduce, Pig, Sqoop, Flume, Oozie, Hadoop distributions, HBase, Spark

Programming Languages: Java (5, 6, 7), Python, Scala

Databases/RDBMS: MySQL, SQL/PL-SQL, MS-SQL Server 2005, Oracle 9i/10g/11g

Scripting/Web Languages: JavaScript, HTML5, CSS3, XML, SQL, Shell

ETL Tools: Cassandra, HBase, Elasticsearch, Alteryx

Operating Systems: Linux, Windows XP/7/8

Software Life Cycles: SDLC, Waterfall and Agile models

Office Tools: MS Office, MS Project and Risk Analysis tools, Visio

Utilities/Tools: Eclipse, Tomcat, NetBeans, JUnit, SQL, SoapUI, ANT, Maven, Automation and MRUnit

Cloud Platforms: Amazon EC2

Visualization Tools: Tableau

PROFESSIONAL EXPERIENCE

Confidential, Fairfax, VA

Sr. Hadoop Developer

Responsibilities:

Worked on a Hadoop cluster that scaled from 4 nodes in the development environment to 8 nodes in pre-production and up to 24 nodes in production.
Involved in the complete implementation lifecycle, specializing in writing custom MapReduce, Pig and Hive programs.
Exported the analyzed data to relational databases using Sqoop for visualization and to generate reports for the BI team.
Extensively used Hive queries (HQL) to search for particular strings in Hive tables stored in HDFS.
Possess good Linux and Hadoop System Administration skills, networking, shell scripting and familiarity with open source configuration management and deployment tools such as Chef.
Worked with Puppet for application deployment.
Configured Kafka to read and write messages from external programs.
Configured Kafka to handle real time data.
Continuously monitored and managed the Hadoop cluster using Cloudera Manager.
Developed MapReduce and Spark jobs to discover trends in data usage by users.
Implemented Spark using Python and Spark SQL for faster processing of data.
Developed functional programs in Scala to connect the streaming data application, gather web data in JSON and XML, and pass it to Flume.
Used Spark for interactive queries, processing of streaming data and integration with a popular NoSQL database for large volumes of data.
Exported the analyzed patterns back into Teradata using Sqoop.
Used the Spark-Cassandra Connector to load data to and from Cassandra.
Streamed data in real time using Spark with Kafka.
Good knowledge of building Apache Spark applications using Scala.
Developed several business services as Java RESTful web services using the Spring MVC framework.
Managed and scheduled Oozie jobs to remove duplicate log data files in HDFS.
Used Apache Oozie for scheduling and managing Hadoop jobs. Knowledge of HCatalog for Hadoop-based storage management.
Expert in creating and designing data ingest pipelines using technologies such as Spring Integration and Apache Storm-Kafka.
Used Flume extensively to gather and move log data files from application servers to a central location in the Hadoop Distributed File System (HDFS).
Implemented test scripts to support test driven development and continuous integration.
Transferred data from HDFS to a MySQL database and vice versa using Sqoop.
Responsible for managing data coming from different sources.
Analyzed the Cassandra database and compared it with other open-source NoSQL databases to determine which best suits the current requirements.
Used File System check (FSCK) to check the health of files in HDFS.
Developed the UNIX shell scripts for creating the reports from Hive data.
Applied Java and J2EE application development skills with Object Oriented Analysis, and was extensively involved throughout the Software Development Life Cycle (SDLC).
Involved in the pilot of a Hadoop cluster hosted on Amazon Web Services (AWS).
Extensively used Sqoop to get data from RDBMS sources like Teradata and Netezza.
Involved in collecting metrics for Hadoop clusters using Ganglia and Ambari.
Extracted files from CouchDB and MongoDB through Sqoop and placed them in HDFS for processing.
Used Spark Streaming to collect data from Kafka in near real time, perform the necessary transformations and aggregations on the fly to build the common learner data model, and persist the data in a NoSQL store (HBase).
Configured Kerberos for the clusters.

Environment: Hadoop, MapReduce, HDFS, Ambari, Hive, Sqoop, Apache Kafka, Oozie, SQL, Alteryx, Flume, Spark, Cassandra, Scala, Java, AWS, GitHub.

Confidential - Atlanta, GA

Hadoop Data Analyst

Responsibilities:

Worked on a cloud platform built as a scalable distributed data solution using Hadoop on a 40-node cluster in the AWS cloud to run analysis on 25+ terabytes of customer usage data.
Worked on analyzing the Hadoop stack and different big data analytic tools, including Pig, Hive, the HBase database and Sqoop.
Designed and implemented a semi-structured data analytics platform leveraging Hadoop.
Worked on performance analysis and improvements for Hive and Pig scripts at the MapReduce job tuning level.
Installed and configured the Hadoop cluster. Worked with the Cloudera support team to fine-tune the cluster. Developed a custom file system plugin for Hadoop so it can access files on Hitachi Data Platform.
Developed connectors for Elasticsearch and Greenplum for data transfer from a Kafka topic. Performed data ingestion from multiple internal clients using Apache Kafka. Developed KStreams in Java for real-time data processing.
Involved in Optimization of Hive Queries.
Developed a framework to load and transform large sets of unstructured data from UNIX systems into Hive tables.
Involved in Data Ingestion to HDFS from various data sources.
Developed Spark code using Scala and Spark-SQL/Streaming for faster testing and processing of data.
Extensively used Apache Sqoop for efficiently transferring bulk data between Apache Hadoop and relational databases.
Automated Sqoop, Hive and Pig jobs using Oozie scheduling.
Extensive knowledge of NoSQL databases such as HBase.
Worked extensively on importing metadata into Hive, and migrated existing tables and applications to work on Hive and the AWS cloud.
Responsible for continuous monitoring and managing Elastic MapReduce (EMR) cluster through AWS console.
Good knowledge of writing and using user-defined functions in Hive, Pig and MapReduce.
Helped the business team by installing and configuring Hadoop ecosystem components alongside the Hadoop administrator.
Developed multiple Kafka Producers and Consumers from scratch as per the business requirements.
Worked on loading log data into HDFS through Flume.
Created and maintained technical documentation for executing Hive queries and Pig scripts.
Worked on debugging and performance tuning of Hive and Pig jobs.
Used Oozie to schedule various jobs on the Hadoop cluster.
Used Hive to analyze the partitioned and bucketed data.
Worked on establishing connectivity between Tableau and Hive.

Environment: Hortonworks 2.4, Hadoop, HDFS, MapReduce, MongoDB, Cloudera, Java, VMware, Hive, Eclipse, Pig, HBase, AWS, Tableau, Sqoop, Flume, Linux, UNIX

Confidential - Beaverton, OR

Hadoop Developer

Responsibilities:

Worked with business analysts and product owners to analyze and understand the requirements and provide estimates.
Implemented J2EE design patterns such as Singleton, DAO, DTO and MVC.
Developed a web application to store all system information in a central location using Spring MVC, JSP, Servlets and HTML.
Used the Spring AOP module to handle transaction management services for objects in any Spring-based application.
Implemented Spring DI and Spring Transactions in the business layer.
Developed data access components using JDBC, DAOs, and Beans for data manipulation.
Designed and developed database objects such as tables, views, stored procedures and user functions using PL/SQL and SQL Developer, and used them in web components.
Used iBATIS for dynamically building SQL queries based on parameters.
Developed JavaScript and jQuery functions for all client-side validations.
Developed JUnit test cases for unit testing and used Maven as the build and configuration tool.
Used shell scripting to create jobs that run on a daily basis.
Debugged the application using Firebug and traversed through the nodes of the tree using DOM functions.
Monitored the error logs using log4j and fixed the problems.
Used the Eclipse IDE and deployed the application on WebLogic Server.

Environment: Java, J2EE, JavaScript, XML, JDBC, Spring Framework, Hibernate, RESTful web services, WebLogic Server, Log4j, JUnit, ANT, SoapUI, Oracle 11g.

Confidential - Houston, TX

Java/Hadoop Developer

Responsibilities:

Designed and developed Java classes using object-oriented methodology.
Worked on a system built with Java, JSP and Servlets.
Developed Java classes and methods for handling data from the database.
Experience in sequence data pre-processing, extraction, model fitting and validation using ML pipelines.
Used Talend Open Studio to load files into Hadoop Hive tables and performed ETL aggregations in Hive.
Used Sqoop to import data from SQL Server into the Hadoop ecosystem.
Integrated Cassandra with Talend and automated jobs.
Scheduled jobs and monitored console output through Jenkins.
Worked in an Agile environment, using Jira to track story points.
Worked on the implementation of a toolkit that abstracted Solr and Elasticsearch.
Maintained and troubleshot the Cassandra cluster.
Installed and configured Hive and wrote Hive UDFs in Java and Python.
Attended and conducted user meetings for requirement analysis and project reporting.
Performed testing and bug fixing, and provided production support.

Environment: Hadoop, HDFS, MapReduce, Java, Hive, Eclipse, Talend, HBase, Sqoop, Flume, Cassandra, Solr.
