Hadoop Developer Resume

Boston, MA

SUMMARY

  • 7 years of overall IT experience across a variety of industries, including 3+ years of hands-on experience in Big Data technologies and in designing and implementing MapReduce solutions.
  • Around 3+ years of experience in setting up, configuring, and monitoring Hadoop clusters on Cloudera and Hortonworks distributions.
  • Excellent knowledge of Hadoop architecture and ecosystem components such as HDFS, JobTracker, TaskTracker, NameNode, DataNode, and the MapReduce programming paradigm.
  • Expertise with tools in the Hadoop ecosystem, including Pig, Hive, HDFS, MapReduce, Sqoop, Spark, Kafka, YARN, Oozie, and ZooKeeper.
  • Experience in migrating data between HDFS and relational database systems in both directions using Sqoop, according to client requirements.
  • Experience in analyzing data using HiveQL, Pig Latin, HBase, and custom MapReduce programs in Java.
  • Good knowledge of building Apache Spark applications using Scala.
  • Good understanding of NoSQL databases and hands-on experience writing applications on NoSQL databases such as HBase, Cassandra, and MongoDB.
  • Experienced in writing MapReduce programs in Java (see the sketch after this list).
  • Good knowledge of Apache Cassandra and Pentaho.
  • Scheduled Apache Hadoop jobs using the Oozie workflow manager.
  • Experience in Hadoop administration activities such as installation and configuration of clusters using Cloudera, AWS, and Hortonworks.
  • Knowledge of manipulating/analyzing large datasets and finding patterns and insights within structured and unstructured data.
  • Knowledge of importing and exporting data using Flume and Kafka.
  • Developed machine learning algorithms using Mahout for clustering and data mining.
  • Supported Apache and Tomcat applications running on Linux and Unix servers.
  • Very good experience in complete project life cycle (design, development, testing and implementation) of Client Server and Web applications.
  • Excellent Java development skills using J2EE, J2SE, Servlets, JSP, EJB, JDBC.
  • Strong experience in Object-Oriented Design, Analysis, Development, Testing and Maintenance.
  • Excellent implementation knowledge of enterprise, web, and client-server applications using Java and J2EE.
  • Experience in XML, XSLT, XSD, XQuery.
  • Knowledge of Python and shell scripting.
  • Extensive experience in Oracle database design, application development and in-depth knowledge of SQL and PL/SQL.
  • Developed stored procedures, functions, packages, and triggers as back-end database support for Java applications.
  • Experienced in using agile approaches, including Extreme Programming, Test-Driven Development and Agile Scrum.
  • Worked in large and small teams for systems requirement, design & development.
  • Key participant in all phases of the software development life cycle, including analysis, design, development, integration, implementation, debugging, and testing of software applications in client-server, object-oriented, and web-based environments.
  • Experience using IDEs such as Eclipse and MyEclipse, and repositories such as SVN and CVS.
  • Experience using the build tools Ant and Maven.
  • Experience using Talend for data integration.
  • Prepared standard coding guidelines and analysis and testing documentation.
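
As a representative illustration of the MapReduce work listed above, the following is a minimal word-count style job written in Java against the standard Hadoop MapReduce API; the class name and the HDFS input/output paths passed on the command line are illustrative only and not taken from any specific project.

    import java.io.IOException;
    import java.util.StringTokenizer;

    import org.apache.hadoop.conf.Configuration;
    import org.apache.hadoop.fs.Path;
    import org.apache.hadoop.io.IntWritable;
    import org.apache.hadoop.io.Text;
    import org.apache.hadoop.mapreduce.Job;
    import org.apache.hadoop.mapreduce.Mapper;
    import org.apache.hadoop.mapreduce.Reducer;
    import org.apache.hadoop.mapreduce.lib.input.FileInputFormat;
    import org.apache.hadoop.mapreduce.lib.output.FileOutputFormat;

    public class WordCount {

        // Mapper: emits (word, 1) for every token in the input split.
        public static class TokenizerMapper extends Mapper<Object, Text, Text, IntWritable> {
            private static final IntWritable ONE = new IntWritable(1);
            private final Text word = new Text();

            @Override
            protected void map(Object key, Text value, Context context)
                    throws IOException, InterruptedException {
                StringTokenizer itr = new StringTokenizer(value.toString());
                while (itr.hasMoreTokens()) {
                    word.set(itr.nextToken());
                    context.write(word, ONE);
                }
            }
        }

        // Reducer: sums the counts emitted for each word.
        public static class IntSumReducer extends Reducer<Text, IntWritable, Text, IntWritable> {
            private final IntWritable result = new IntWritable();

            @Override
            protected void reduce(Text key, Iterable<IntWritable> values, Context context)
                    throws IOException, InterruptedException {
                int sum = 0;
                for (IntWritable val : values) {
                    sum += val.get();
                }
                result.set(sum);
                context.write(key, result);
            }
        }

        public static void main(String[] args) throws Exception {
            Job job = Job.getInstance(new Configuration(), "word count");
            job.setJarByClass(WordCount.class);
            job.setMapperClass(TokenizerMapper.class);
            job.setCombinerClass(IntSumReducer.class);   // combiner cuts down shuffle volume
            job.setReducerClass(IntSumReducer.class);
            job.setOutputKeyClass(Text.class);
            job.setOutputValueClass(IntWritable.class);
            FileInputFormat.addInputPath(job, new Path(args[0]));   // HDFS input directory
            FileOutputFormat.setOutputPath(job, new Path(args[1])); // HDFS output directory
            System.exit(job.waitForCompletion(true) ? 0 : 1);
        }
    }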

PROFESSIONAL EXPERIENCE

Confidential, Boston, MA

Hadoop Developer

Responsibilities:

  • Responsible for building scalable distributed data solutions using Hadoop.
  • Job duties included design and development of various modules on the Hadoop Big Data platform and processing data using MapReduce, Hive, Sqoop, Pig, and Oozie.
  • Developed job processing scripts using Oozie workflows.
  • Configured different topologies for the Storm cluster and deployed them on a regular basis.
  • Implemented a POC to migrate MapReduce jobs into Spark RDD transformations using Scala.
  • Built a Spark Streaming pipeline that collects data from Kafka in near real time, performs the necessary transformations and aggregations on the fly to build the common learner data model, and persists the data in Cassandra (see the sketch after this list).
  • Implemented different machine learning techniques in Scala using the Spark machine learning library.
  • Developed Spark applications using Scala for easy Hadoop transitions.
  • Used Spark with YARN and compared the performance results with MapReduce.
  • Installed and configured Hive, Pig, Sqoop, Flume, and Oozie on the Hadoop cluster.
  • Developed simple to complex MapReduce jobs using Hive and Pig.
  • Optimized MapReduce jobs to use HDFS efficiently by applying various compression mechanisms.
  • Used Pig to analyze large data sets and to load the results back into HBase.
  • Involved in Hadoop cluster tasks such as commissioning and decommissioning nodes without affecting running jobs or data.
  • Wrote MapReduce jobs to discover trends in data usage by users.
  • Involved in running Hadoop streaming jobs to process terabytes of text data.
  • Used Talend Open Studio to load files into Hadoop Hive tables and performed ELT aggregations in Hive.
  • Designed and created ETL jobs in Talend to load large volumes of data into Cassandra, the Hadoop ecosystem, and relational databases.
  • Used Sqoop to import data from SQL Server to Cassandra.
  • Integrated Cassandra with Talend and automated jobs.
  • Performed maintenance and troubleshooting of the Cassandra cluster.
  • Integrated data using the Talend integration tool.
  • Involved in massive storage using a data lake, a large storage repository and processing engine.
  • Analyzed large data sets by running Hive queries and Pig scripts.
  • Helped the team increase the cluster size from 22 to 30 nodes.
  • Managed jobs using the Fair Scheduler.
  • Worked extensively with Sqoop for importing metadata from Oracle.
  • Responsible for working with different teams in building the Hadoop infrastructure.
  • Involved in creating Hive tables and in loading and analyzing data using Hive queries.
  • Designed, developed, and maintained data integration programs in a Hadoop and RDBMS environment with both traditional and non-traditional source systems, as well as RDBMS and NoSQL data stores, for data access and analysis. Experienced in running Hadoop streaming jobs to process terabytes of XML-format data.
  • Loaded and transformed large sets of structured, semi-structured, and unstructured data.
  • Responsible for managing data coming from different sources.
  • Assisted in exporting analyzed data to relational databases using Sqoop.
  • Wrote Hive queries and UDFs.
  • Developed Hive queries to process the data and generate data cubes for visualization.
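
A minimal sketch of the Kafka-to-Cassandra streaming flow described above, written against the Spark Streaming Kafka (0.10) API and the DataStax Spark-Cassandra connector Java API; the project work itself was in Scala, and the broker address, topic, keyspace, table, and bean fields shown here are hypothetical placeholders rather than project names.

    import java.io.Serializable;
    import java.util.Arrays;
    import java.util.Collection;
    import java.util.HashMap;
    import java.util.Map;

    import org.apache.kafka.clients.consumer.ConsumerRecord;
    import org.apache.kafka.common.serialization.StringDeserializer;
    import org.apache.spark.SparkConf;
    import org.apache.spark.streaming.Durations;
    import org.apache.spark.streaming.api.java.JavaDStream;
    import org.apache.spark.streaming.api.java.JavaInputDStream;
    import org.apache.spark.streaming.api.java.JavaStreamingContext;
    import org.apache.spark.streaming.kafka010.ConsumerStrategies;
    import org.apache.spark.streaming.kafka010.KafkaUtils;
    import org.apache.spark.streaming.kafka010.LocationStrategies;

    import static com.datastax.spark.connector.japi.CassandraJavaUtil.mapToRow;
    import static com.datastax.spark.connector.japi.CassandraStreamingJavaUtil.javaFunctions;

    public class LearnerEventStream {

        // Bean for the (hypothetical) learner data model persisted in Cassandra.
        public static class LearnerEvent implements Serializable {
            private String learnerId;
            private String eventType;
            private long eventTime;

            public LearnerEvent() { }
            public LearnerEvent(String learnerId, String eventType, long eventTime) {
                this.learnerId = learnerId;
                this.eventType = eventType;
                this.eventTime = eventTime;
            }
            public String getLearnerId() { return learnerId; }
            public void setLearnerId(String v) { this.learnerId = v; }
            public String getEventType() { return eventType; }
            public void setEventType(String v) { this.eventType = v; }
            public long getEventTime() { return eventTime; }
            public void setEventTime(long v) { this.eventTime = v; }
        }

        public static void main(String[] args) throws InterruptedException {
            SparkConf conf = new SparkConf()
                    .setAppName("learner-event-stream")
                    .set("spark.cassandra.connection.host", "127.0.0.1"); // placeholder host
            JavaStreamingContext jssc = new JavaStreamingContext(conf, Durations.seconds(10));

            Map<String, Object> kafkaParams = new HashMap<>();
            kafkaParams.put("bootstrap.servers", "localhost:9092");       // placeholder broker
            kafkaParams.put("key.deserializer", StringDeserializer.class);
            kafkaParams.put("value.deserializer", StringDeserializer.class);
            kafkaParams.put("group.id", "learner-stream");
            kafkaParams.put("auto.offset.reset", "latest");

            Collection<String> topics = Arrays.asList("learner-events");  // placeholder topic

            // Direct stream from Kafka; each record value is assumed to be a CSV line
            // of the form "learnerId,eventType,epochMillis".
            JavaInputDStream<ConsumerRecord<String, String>> stream =
                    KafkaUtils.createDirectStream(
                            jssc,
                            LocationStrategies.PreferConsistent(),
                            ConsumerStrategies.<String, String>Subscribe(topics, kafkaParams));

            // Transform raw records into the learner data model on the fly.
            JavaDStream<LearnerEvent> events = stream.map(record -> {
                String[] parts = record.value().split(",");
                return new LearnerEvent(parts[0], parts[1], Long.parseLong(parts[2]));
            });

            // Persist each micro-batch into Cassandra (keyspace/table names are placeholders).
            javaFunctions(events)
                    .writerBuilder("learner", "learner_events", mapToRow(LearnerEvent.class))
                    .saveToCassandra();

            jssc.start();
            jssc.awaitTermination();
        }
    }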

Confidential, Rockford, IL

Hadoop Developer

Responsibilities:

  • Responsible for designing and implementing an ETL process to load data from different sources, perform data mining, and analyze data using visualization/reporting tools to track the performance of OpenStack.
  • Developed Hadoop streaming MapReduce jobs using Java.
  • Collected the logs from the physical machines and the OpenStack controller and integrated them into HDFS using Flume.
  • Partitioned the collected logs by date/timestamps and host names.
  • Good knowledge of analyzing data in HBase using Hive and Pig.
  • Experience in defining job flows using Oozie.
  • Worked on HBase for data optimization.
  • Used ZooKeeper along with HBase.
  • Involved in file processing using Pig Latin.
  • Created external Hive tables and was involved in data loading and writing Hive UDFs.
  • Performed analytics on time-series data stored in HBase using the HBase API.
  • Configured Flume to transport web server logs into HDFS. Also used the Kite logging module to upload web server logs into HDFS.
  • Developed custom MapReduce programs to extract the required data from the logs.
  • Developed UDFs for Hive and wrote complex Hive queries for data analysis (see the sketch after this list).
  • Performed performance tuning and troubleshooting of MapReduce jobs by analyzing and reviewing Hadoop log files.
  • Responsible for creating Hive tables, loading the structured data resulting from MapReduce jobs into the tables, and writing Hive queries to further analyze the logs to identify issues and behavioral patterns.
  • Successfully loaded files to Hive and HDFS from Oracle, Netezza, and SQL Server using Sqoop.
  • Imported data frequently from MySQL to HDFS using Sqoop.
  • Developed unit test cases using MRUnit for MapReduce code.
  • Planned, deployed, monitored, and maintained Amazon AWS cloud infrastructure consisting of multiple EC2 nodes and VMware VMs as required in the environment.
  • Expertise in AWS data migration between different database platforms, such as SQL Server to Amazon Aurora, using RDS tooling.
  • Worked with the infrastructure team on cluster installation, commissioning and decommissioning of DataNodes, NameNode recovery, and capacity planning.
  • Supported the operations team in Hadoop cluster maintenance activities, including commissioning and decommissioning nodes and upgrades.
  • Worked with different teams on ETL, data integration, and migration to Hadoop.
  • Used different file formats such as text files, SequenceFiles, and Avro.
  • Used Tableau for visualization and report generation.
  • Used Impala to pull the data from Hive tables.
  • Tested Impala integration.
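
A minimal sketch of the kind of Hive UDF mentioned above, using the classic org.apache.hadoop.hive.ql.exec.UDF base class; the function name, jar path, and log-table columns in the comments are hypothetical, chosen only to illustrate the register-and-query pattern.

    import org.apache.hadoop.hive.ql.exec.UDF;
    import org.apache.hadoop.io.Text;

    // Normalizes host names pulled from web-server logs: lower-cases the value
    // and strips an optional port suffix.
    public final class NormalizeHost extends UDF {

        public Text evaluate(Text host) {
            if (host == null) {
                return null;
            }
            String h = host.toString().trim().toLowerCase();
            int colon = h.indexOf(':');
            if (colon >= 0) {
                h = h.substring(0, colon);   // "web01.example.com:8080" -> "web01.example.com"
            }
            return new Text(h);
        }
    }

    // Registering and using the UDF from Hive (illustrative, hypothetical names):
    //   ADD JAR /tmp/log-udfs.jar;
    //   CREATE TEMPORARY FUNCTION normalize_host AS 'NormalizeHost';
    //   SELECT normalize_host(host), COUNT(*) FROM web_logs GROUP BY normalize_host(host);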

Confidential, Dallas, TX

Java/J2EE Developer

Responsibilities:

  • Involved in Requirement Analysis, Design, Development and Testing of the risk workflow system.
  • Involved in implementing the design through the key phases of the software development life cycle (SDLC), including development, testing, implementation, and maintenance support.
  • Applied OOAD principles for the analysis and design of the system.
  • Implemented XML Schema as part of the XQuery query language.
  • Applied J2EE design patterns like Singleton, Business Delegate, Service Locator, Data Transfer Object (DTO), Data Access Objects (DAO) and Adapter during the development of components.
  • Used RAD for the Development, Testing and Debugging of the application.
  • Used WebSphere Application Server to deploy the builds.
  • Developed front-end screens using Struts, JSP, HTML, AJAX, jQuery, JavaScript, JSON, and CSS.
  • Used J2EE for the development of business layer services.
  • Developed Struts Action Forms, Action classes and performed action mapping using Struts.
  • Performed data validation in Struts Form beans and Action Classes.
  • Developed a POJO-based programming model using the Spring framework.
  • Used the IoC (Inversion of Control) pattern and dependency injection in the Spring framework for wiring and managing business objects (see the sketch after this list).
  • Used Web Services to connect to the mainframe for validation of the data.
  • SOAP was used as the protocol to send requests and responses in the form of XML messages.
  • The JDBC framework was used to connect the application to the database.
  • Used Eclipse for the development, testing, and debugging of the application.
  • The Log4j framework was used for logging debug, info, and error data.
  • Used the Hibernate framework for object-relational mapping.
  • Used Oracle 10g database for data persistence.
  • SQL Developer was used as a database client.
  • Extensively worked on Windows and UNIX operating systems.
  • Used SecureCRT to transfer files from the local system to the Unix system.
  • Performed Test Driven Development (TDD) using JUnit.
  • Used Ant scripts for build automation.
  • The PVCS version control system was used to check in and check out the developed artifacts; it was integrated with the Eclipse IDE.
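
A minimal sketch of the Spring IoC/dependency-injection wiring referred to above; the project used Spring alongside Struts and Hibernate, while this illustration uses Java-based configuration and hypothetical RiskDao/RiskService names purely to show constructor injection and container-managed business objects.

    import org.springframework.context.ApplicationContext;
    import org.springframework.context.annotation.AnnotationConfigApplicationContext;
    import org.springframework.context.annotation.Bean;
    import org.springframework.context.annotation.Configuration;

    public class SpringWiringSketch {

        // DAO abstraction; a real implementation would sit on JDBC or Hibernate.
        public interface RiskDao {
            String findRiskStatus(long riskId);
        }

        public static class JdbcRiskDao implements RiskDao {
            @Override
            public String findRiskStatus(long riskId) {
                return "OPEN";   // placeholder for a database lookup
            }
        }

        // Business object: the DAO is supplied by the container (constructor injection),
        // never instantiated inside the class itself.
        public static class RiskService {
            private final RiskDao riskDao;

            public RiskService(RiskDao riskDao) {
                this.riskDao = riskDao;
            }

            public boolean isOpen(long riskId) {
                return "OPEN".equals(riskDao.findRiskStatus(riskId));
            }
        }

        // Java-based wiring; the same object graph can equally be declared in an XML bean file.
        @Configuration
        static class AppConfig {
            @Bean
            public RiskDao riskDao() {
                return new JdbcRiskDao();
            }

            @Bean
            public RiskService riskService(RiskDao riskDao) {
                return new RiskService(riskDao);
            }
        }

        public static void main(String[] args) {
            ApplicationContext ctx = new AnnotationConfigApplicationContext(AppConfig.class);
            RiskService service = ctx.getBean(RiskService.class);
            System.out.println(service.isOpen(42L));
        }
    }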

Confidential, San Diego CA

Java/J2EE Developer

Responsibilities:

  • Used the Tomcat application server to deploy servlets, JSPs, tag libraries, JavaBeans, and database connections.
  • Analyzed requirements for the Sector Weights guideline.
  • Configured the environment for development.
  • Prepared the design document for the Sector Weights doc changes.
  • Involved in the design, development, and support phases of the software development life cycle (SDLC).
  • Extensively worked on core Java.
  • Developed the business logic layer using the Spring Framework.
  • Implemented the database using Oracle with TOAD.
  • Worked with Quality Assurance to ensure complete test coverage of customizations by creating unit test cases and executing them with the JUnit testing framework (see the sketch after this list).
  • Supported testing and resolved coding issues in the Production/QA environment.
  • Consumed Web Services for transferring data between different applications.
  • Experienced with SOAP/WSDL.
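
A minimal sketch of the JUnit-based unit testing mentioned above; the normalize helper and the Sector Weights rule it encodes are hypothetical, included only to show the shape of a JUnit 4 test case.

    import static org.junit.Assert.assertEquals;

    import org.junit.Test;

    public class SectorWeightsTest {

        // Hypothetical helper mirroring the kind of guideline logic under test:
        // raw sector weights are normalized so that they sum to 1.0.
        static double[] normalize(double[] raw) {
            double sum = 0.0;
            for (double w : raw) {
                sum += w;
            }
            double[] out = new double[raw.length];
            for (int i = 0; i < raw.length; i++) {
                out[i] = raw[i] / sum;
            }
            return out;
        }

        @Test
        public void weightsSumToOneAfterNormalization() {
            double[] normalized = normalize(new double[] {2.0, 3.0, 5.0});
            double total = 0.0;
            for (double w : normalized) {
                total += w;
            }
            assertEquals(1.0, total, 1e-9);
            assertEquals(0.2, normalized[0], 1e-9);
        }
    }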
