Hadoop Developer Resume
New York, NY
SUMMARY
- 8+ years of professional experience in software development with Linux and Hadoop/Big Data technologies.
- Experience with the Hadoop ecosystem, including HDFS, MapReduce, Hive, Pig, Flume, Sqoop, Impala, ZooKeeper, Hue, Oozie, and HBase.
- Experience implementing big data projects using Cloudera.
- Installed, configured, and maintained Apache Hadoop clusters for application development, along with Hadoop tools such as Hive, Pig, HBase, ZooKeeper, and Sqoop.
- Hands-on experience designing and implementing solutions using Apache Hadoop 2.4.0, HDFS 2.7, MapReduce2, HBase 1.1, Hive 1.2, Oozie 4.2.0, Tez 0.7.0, YARN 2.7.0, Sqoop 1.4.6, and MongoDB.
- Experience implementing Kafka producers and consumer groups to read messages from multiple partitions in parallel.
- Set up and integrated Hadoop ecosystem tools such as HBase, Hive, Pig, and Sqoop.
- Hands-on experience loading data into Spark RDDs and performing in-memory computation.
- Hands-on experience installing, configuring, and using Hadoop ecosystem components such as Hadoop MapReduce, HDFS, HBase, Hive, Sqoop, Pig, ZooKeeper, Storm, Spark, Kafka, and Flume.
- Strong understanding of data modeling and experience with data cleansing, data profiling, and data analysis.
- Configured Hadoop clusters on OpenStack and Amazon Web Services (AWS).
- Experience in ETL (DataStage) analysis, design, development, testing, and implementation, including performance tuning and query optimization of databases.
- Experience extracting source data from sequential files, XML files, and Excel files, and transforming and loading it into the target data warehouse.
- Strong experience with Java/J2EE technologies such as Core Java, JDBC, JSP, JSTL, HTML, JavaScript, and JSON.
- Experience deploying and managing multi-node development and production Hadoop clusters with different Hadoop components (Hive, Pig, Sqoop, Oozie, Flume, HCatalog, HBase, ZooKeeper) using Hortonworks Ambari.
- Gained optimal performance in HBase through data compression, region splits, and manually managed compactions.
- Upgraded clusters from HDP 2.1 to HDP 2.2 and then to HDP 2.3.
- Working experience with the MapReduce programming model and the Hadoop Distributed File System (HDFS).
- In-depth understanding of Hadoop architecture and its components, including HDFS, JobTracker, TaskTracker, NameNode, DataNode, and MapReduce.
- Hands-on experience in Unix/Linux environments, including software installations and upgrades, shell scripting for job automation, and other maintenance activities.
- Thorough knowledge and experience in SQL and PL/SQL concepts.
- Expertise in setting up standards and processes for Hadoop based application design and implementation.
TECHNICAL SKILLS
Hadoop Ecosystem: Spark Core, Spark SQL, Kafka, HDFS, YARN, Sqoop, Pig, Hive, Oozie, Flume, MapReduce, Storm
Development and Build Tools: Eclipse, NetBeans, IntelliJ, Ant, Maven, Ivy, TOAD, SQL Developer
Databases: HBase, Cassandra, Oracle, SQL Server 2008 R2/2012, MySQL, ODI
Languages: Java (JDK 1.4/1.5/1.6), C/C++, SQL, PL/SQL, Scala, Python
Operating Systems: Windows Server 2000/2003/2008, Windows XP/Vista, Mac OS, UNIX, Linux
Java Technologies: Spring 3.0, Struts 2.2.1, Hibernate 3.0, Spring-WS, Apache Kafka
Frameworks: JUnit and Jest
IDEs & Utilities: Eclipse, Maven, NetBeans
SQL Server Tools: SQL Server Management Studio, Enterprise Manager, Query Analyzer, Profiler, Export & Import (DTS)
Web Technologies: ASP.NET, HTML/HTML5, XML, CSS3, JavaScript/jQuery
PROFESSIONAL EXPERIENCE
Confidential, New York, NY
Hadoop Developer
Responsibilities:
- Developed ETL data pipelines using Spark, Spark Streaming, and Scala.
- Loaded data from RDBMS into Hadoop using Sqoop.
- Worked collaboratively to manage build-outs of large data clusters and real-time streaming with Spark.
- Responsible for building data pipelines from web servers using Sqoop, Kafka, and the Spark Streaming API.
- Developed Kafka producers, broker partitioning schemes, and consumer groups (see the Kafka/Spark Streaming sketch below).
- Used Spark for interactive queries, processing of streaming data, and integration with NoSQL databases for high data volumes.
- Developed batch jobs to fetch data from AWS S3 storage and perform the required transformations in Scala using the Spark framework.
- Implemented Spark jobs using Scala and Spark SQL for faster testing and processing of data.
- Processed data using MapReduce and YARN; worked on Kafka as a proof of concept for log processing.
- Monitored the Hive metastore and cluster nodes using Hue.
- Developed Spark code using Scala and Spark SQL/Streaming for faster processing of data.
- Created AWS EC2 instances and used JIT servers.
- Handled data-integrity checks using Hive queries, Hadoop, and Spark.
- Performed transformations and actions on RDDs and Spark Streaming data with Scala.
- Implemented machine learning algorithms using Spark with Python.
- Defined job flows and developed simple to complex MapReduce jobs per requirements.
- Optimized MapReduce jobs to use HDFS efficiently through various compression mechanisms.
- Developed Pig UDFs to manipulate data according to business requirements and also developed custom Pig loaders.
- Responsible for handling streaming data from web server console logs.
- Installed the Oozie workflow engine to run multiple Hive and Pig jobs.
- Developed Pig Latin scripts for the analysis of semi-structured data.
- Created Hive tables, loaded them with data, and wrote Hive UDFs.
- Used Sqoop to import data into HDFS and Hive from other data systems.
- Installed and configured Apache Hadoop to test the maintenance of log files in the Hadoop cluster.
- Installed and configured Hive, Pig, Sqoop, Flume and Oozie on the Hadoop cluster.
- Developed ETL processes (DataStage Open Studio) to load data from multiple data sources into HDFS using Flume and Sqoop, and performed structural modifications using MapReduce and Hive.
- Involved in NoSQL database design, integration, and implementation.
- Loaded data into the NoSQL database HBase.
- Developed Kafka producers and consumers, HBase clients, and Spark and Hadoop MapReduce jobs, along with components on HDFS and Hive.
- Very good understanding of partitioning and bucketing concepts in Hive; designed both managed and external Hive tables to optimize performance (see the DDL sketch below).
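The table design mentioned in the last bullet can be sketched as follows. This is a minimal, hypothetical example issued over the HiveServer2 JDBC driver; the host, database, table, column, and HDFS path names are illustrative only, not details from the actual project.

```scala
import java.sql.DriverManager

object HiveTableSketch {
  def main(args: Array[String]): Unit = {
    // HiveServer2 JDBC connection; host, port, database, and user are placeholders
    Class.forName("org.apache.hive.jdbc.HiveDriver")
    val conn = DriverManager.getConnection("jdbc:hive2://hive-host:10000/default", "etl_user", "")
    val stmt = conn.createStatement()

    // Managed table: partitioned by load date and bucketed by customer id
    stmt.execute(
      """CREATE TABLE IF NOT EXISTS sales_managed (
        |  order_id    BIGINT,
        |  customer_id BIGINT,
        |  amount      DOUBLE)
        |PARTITIONED BY (load_date STRING)
        |CLUSTERED BY (customer_id) INTO 32 BUCKETS
        |STORED AS ORC""".stripMargin)

    // External table over raw files already sitting on HDFS; dropping it leaves the data in place
    stmt.execute(
      """CREATE EXTERNAL TABLE IF NOT EXISTS sales_raw (
        |  order_id    BIGINT,
        |  customer_id BIGINT,
        |  amount      DOUBLE)
        |PARTITIONED BY (load_date STRING)
        |ROW FORMAT DELIMITED FIELDS TERMINATED BY ','
        |LOCATION '/data/raw/sales'""".stripMargin)

    // Dynamic-partition load from the external table into the managed one
    stmt.execute("SET hive.exec.dynamic.partition=true")
    stmt.execute("SET hive.exec.dynamic.partition.mode=nonstrict")
    stmt.execute("SET hive.enforce.bucketing=true")
    stmt.execute(
      """INSERT OVERWRITE TABLE sales_managed PARTITION (load_date)
        |SELECT order_id, customer_id, amount, load_date FROM sales_raw""".stripMargin)

    stmt.close()
    conn.close()
  }
}
```

The external table only points at files that already live on HDFS, which is why raw landing data is modeled that way, while the managed, partitioned, and bucketed table is tuned for query performance.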
Environment: Spark, Spark Streaming, Apache Kafka, Hive, AWS, ETL, Pig, UNIX, Linux, Tableau, Teradata, Sqoop, Hue, Oozie, Java, Scala, Python, Git
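A condensed sketch of the producer-plus-Spark-Streaming pattern described in this section, assuming the spark-streaming-kafka-0-10 integration; the broker address, topic name, consumer group, and record semantics are placeholders rather than details from the actual engagement.

```scala
import java.util.Properties

import org.apache.kafka.clients.producer.{KafkaProducer, ProducerRecord}
import org.apache.kafka.common.serialization.StringDeserializer
import org.apache.spark.SparkConf
import org.apache.spark.streaming.{Seconds, StreamingContext}
import org.apache.spark.streaming.kafka010.KafkaUtils
import org.apache.spark.streaming.kafka010.LocationStrategies.PreferConsistent
import org.apache.spark.streaming.kafka010.ConsumerStrategies.Subscribe

object ClickstreamPipeline {

  // Producer side: broker and topic are placeholders; keying by userId keeps a user's events in one partition
  def publish(events: Seq[(String, String)]): Unit = {
    val props = new Properties()
    props.put("bootstrap.servers", "kafka-broker:9092")
    props.put("key.serializer", "org.apache.kafka.common.serialization.StringSerializer")
    props.put("value.serializer", "org.apache.kafka.common.serialization.StringSerializer")
    val producer = new KafkaProducer[String, String](props)
    events.foreach { case (userId, json) =>
      producer.send(new ProducerRecord[String, String]("web-clicks", userId, json))
    }
    producer.close()
  }

  // Consumer side: a Spark Streaming job reading the same topic as part of a consumer group
  def main(args: Array[String]): Unit = {
    val conf = new SparkConf().setAppName("clickstream-pipeline")
    val ssc  = new StreamingContext(conf, Seconds(10))

    val kafkaParams = Map[String, Object](
      "bootstrap.servers"  -> "kafka-broker:9092",
      "key.deserializer"   -> classOf[StringDeserializer],
      "value.deserializer" -> classOf[StringDeserializer],
      "group.id"           -> "clickstream-etl",
      "auto.offset.reset"  -> "latest")

    val stream = KafkaUtils.createDirectStream[String, String](
      ssc, PreferConsistent, Subscribe[String, String](Seq("web-clicks"), kafkaParams))

    // Simple transformation/action chain: drop empty records, count clicks per user per batch
    stream.map(record => (record.key, record.value))
      .filter { case (_, v) => v != null && v.nonEmpty }
      .mapValues(_ => 1L)
      .reduceByKey(_ + _)
      .print()

    ssc.start()
    ssc.awaitTermination()
  }
}
```

Keying producer records by user ID means the consumer-group side can scale out across partitions while still seeing per-user ordering within each partition.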
Confidential - Eden Prairie, MN
Hadoop (Big Data) Developer
Responsibilities:
- Worked on analyzing the Hadoop cluster and various big data analytics tools, including Pig, the HBase database, and Sqoop.
- Responsible for building scalable distributed data solutions using Hadoop.
- Implemented a nine-node CDH3 Hadoop cluster on CentOS.
- Implemented the Apache Crunch library on top of MapReduce and Spark for data aggregation.
- Involved in loading data from the Linux file system into HDFS.
- Worked on cluster installation, commissioning and decommissioning of DataNodes, NameNode recovery, capacity planning, and slot configuration.
- Implemented a script to transfer data from Oracle to HBase using Sqoop.
- Implemented best-income logic using Pig scripts and UDFs (see the Pig UDF sketch below).
- Exported the analyzed data to relational databases using Sqoop for visualization and to generate reports for the BI team.
- Applied design patterns and OO design concepts to improve the existing Java/J2EE-based code base.
- Developed JAX-WS web services.
- Handled Type 1 and Type 2 slowly changing dimensions.
- Imported and exported data between HDFS and databases using Sqoop.
- Wrote Hive jobs to parse logs and structure them in tabular format to facilitate effective querying.
- Involved in the design, implementation, and maintenance of data warehouses.
- Involved in creating Hive tables, loading them with data, and writing Hive queries.
- Implemented custom Flume interceptors to filter data per requirements.
- Used Hive and Pig to analyze data in HDFS to identify issues and behavioral patterns.
- Created internal and external Hive tables and defined static and dynamic partitions for optimized performance.
- Wrote Pig Latin scripts for running advanced analytics on the data collected.
- Configured daily workflow for extraction, processing and analysis of data using Oozie Scheduler.
- Proactively involved in ongoing maintenance, support, and improvements in the Hadoop cluster.
Environment: Hadoop, HDFS, Pig, Sqoop, HBase, Shell Scripting, CDH3, CentOS, Oozie, UNIX, T-SQL
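A small illustration of the kind of Pig UDF mentioned above, written against Pig's EvalFunc API; the class name and the cleansing rule are hypothetical, not the actual project logic.

```scala
import java.io.IOException

import org.apache.pig.EvalFunc
import org.apache.pig.data.Tuple

/**
 * Pig EvalFunc that normalizes a free-text field: trims whitespace, lower-cases,
 * and returns null for empty input so a downstream FILTER can drop the record.
 * The normalization rule here is illustrative only.
 */
class NormalizeText extends EvalFunc[String] {
  @throws[IOException]
  override def exec(input: Tuple): String = {
    if (input == null || input.size() == 0 || input.get(0) == null) return null
    val raw = input.get(0).toString.trim
    if (raw.isEmpty) null else raw.toLowerCase
  }
}
```

In a Pig script the compiled jar would be REGISTERed and the function invoked inside a `FOREACH ... GENERATE`, followed by a `FILTER` to discard the null results.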
Confidential, San Mateo, CA
Hadoop Developer
Responsibilities:
- Provided suggestions on converting to Hadoop using MapReduce, Hive, Sqoop, Flume, and Pig Latin.
- Wrote Spark applications for data validation, cleansing, transformations, and custom aggregations (see the RDD sketch below).
- Imported data from different sources into Spark RDDs for processing.
- Developed custom aggregate functions using Spark SQL and performed interactive querying.
- Worked on cluster installation, commissioning and decommissioning of DataNodes, NameNode high availability, capacity planning, and slot configuration.
- Responsible for managing data coming from different sources.
- Imported and exported data into HDFS using Flume.
- Experienced in analyzing data with Hive and Pig.
- Involved in creating Hive tables, loading them with data, and writing Hive queries that run internally as MapReduce jobs.
- Set up and benchmarked Hadoop/HBase clusters for internal use.
- Set up a Hadoop cluster on Amazon EC2 using Whirr for a POC.
- Developed applications with Hadoop big data technologies: Pig, Hive, MapReduce, Oozie, Flume, and Kafka.
- Experienced in managing and reviewing Hadoop log files.
- Helped with big data technologies for the integration of Hive with HBase and Sqoop with HBase.
- Analyzed data with Hive, Pig and Hadoop Streaming.
- Involved in moving relational database legacy tables to HDFS and HBase tables using Sqoop, and vice versa.
- Involved in cluster coordination services through ZooKeeper and in adding new nodes to an existing cluster.
- Moved data from traditional databases such as MySQL, MS SQL Server, and Oracle into Hadoop.
- Worked on Integrating Talend and SSIS with Hadoop and performed ETL operations.
- Installed Hive, Pig, Flume, Sqoop and Oozie on the Hadoop cluster.
- Used Flume to collect, aggregate, and push log data from different log servers.
Environment: Hadoop, Hortonworks, Linux, HDFS, Hive, Sqoop, Flume, Zookeeper and HBase
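A minimal sketch of the RDD-based validation, cleansing, and aggregation work listed above; the input path, record layout (orderId, customerId, amount), and output location are assumptions for illustration, and an equivalent Spark SQL aggregate would follow the same shape.

```scala
import org.apache.spark.{SparkConf, SparkContext}

object OrderValidation {
  def main(args: Array[String]): Unit = {
    val sc = new SparkContext(new SparkConf().setAppName("order-validation"))

    // Raw CSV lines imported into an RDD; path and column layout are illustrative only
    val raw = sc.textFile("hdfs:///data/raw/orders/*.csv")

    // Validation/cleansing: keep only rows with three fields and a parseable, non-negative amount
    val parsed = raw.map(_.split(",", -1))
      .filter(_.length == 3)
      .flatMap { case Array(orderId, customerId, amount) =>
        scala.util.Try(amount.trim.toDouble).toOption
          .filter(_ >= 0.0)
          .map(a => (customerId.trim, a))
      }

    // Custom aggregation: running total and count per customer via aggregateByKey
    val totals = parsed.aggregateByKey((0.0, 0L))(
      (acc, amt) => (acc._1 + amt, acc._2 + 1L),
      (a, b) => (a._1 + b._1, a._2 + b._2))

    totals.map { case (cust, (sum, n)) => s"$cust,$sum,$n" }
      .saveAsTextFile("hdfs:///data/curated/order_totals")

    sc.stop()
  }
}
```

Malformed rows are dropped during parsing rather than failing the job, and aggregateByKey combines values map-side before the shuffle, keeping the per-customer totals in memory.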
Confidential
Java Developer
Responsibilities:
- Developed the business logic using Java Beans and Session Beans.
- Developed a system to access the legacy system database using JDBC.
- Implemented validation rules using the Struts framework.
- Developed the user interface using JSP, HTML, and Velocity templates.
- Encapsulated persistence-layer operations in Data Access Objects (DAOs) and used Hibernate for data retrieval from the database.
- Developed a web services component using XML, WSDL, and SOAP with a DOM parser to transfer and transform data between applications.
- Exposed various capabilities as web services using SOAP/WSDL.
- Used SoapUI to test the web services by sending SOAP requests.
- Used an AJAX framework for server communication and a seamless user experience.
- Created a test framework on Selenium and executed web testing in Chrome, IE, and Mozilla through WebDriver.
- Used client-side JavaScript (jQuery) for designing tabs and dialog boxes.
- Created UNIX shell scripts to automate the build process and to perform regular jobs such as file transfers between hosts.
- Designed, built, tested, and deployed enhanced web services.
- Involved in system design, coding, testing, installation, documentation and post-deployment audits, all performed in accordance with the established standards.
- Developed RESTful web services using Spring and Apache CXF.
- Created Java servlets and other classes deployed as an EAR file, connecting to an Oracle database using JDBC.
Environment: Hibernate, MVC, JavaScript, CSS, Maven, Java 1.6, XML, JUnit, SQL, PL/SQL, Eclipse, WebSphere
Confidential
Java/J2EE Consultant
Responsibilities:
- Modified application flows and the existing UML diagrams.
- Involved in preparing the change request technical solution document and implementation plan.
- Followed MVC architecture using Struts.
- Worked on the Struts framework and developed Action and Form classes for the user interface.
- Mapped event classes, HTML files, and JavaBean classes using XML.
- Used J2EE design patterns like Singleton, DAO and DTO.
- Developed the UI using HTML, JavaScript, and JSP, and developed business logic and interfacing components using business objects, XML, and JDBC.
- Designed the user interface and performed validation checks using JavaScript.
- Managed connectivity using JDBC for querying, inserting, and data management, including triggers and stored procedures.
- Developed various EJBs for handling business logic and data manipulation from the database.
- Involved in the design of JSPs and servlets for navigation among the modules.
- Designed cascading style sheets and the XML portion of the Order Entry and Product Search modules, and performed client-side validations with JavaScript.
- Developed customized interfaces for various clients using CSS and JavaScript.
- Performed code reviews for peers and maintained the code repositories using Git.
- Enhanced the mechanism of logging and tracing with Log4j.
- Generated web service clients from WSDL files.
- Involved in the development of the presentation layer using Struts and custom tag libraries.
- Performed integration testing, supported the project, and tracked the Confidential with the help of JIRA.
- Acted as the first point of contact for business queries during the development and testing phases.
- Worked closely with clients and the QA team to resolve critical issues and bugs.
Environment: EcommCore, JavaScript, CSS, Ivy, Java 1.6, YUI 2.8, Web Services, XML, XML Parsers (SAX/JAXB), JUnit, DAO/DTO, Blue Zone, Eclipse, Apache Tomcat, Git, Jenkins, Arthur