
Hadoop Developer Resume


New York, NY

SUMMARY

  • 8+ years of professional experience in software development with Linux and Hadoop/Big Data technologies.
  • Experience with the Hadoop ecosystem, including HDFS, MapReduce, Hive, Pig, Flume, Sqoop, Impala, ZooKeeper, Hue, Oozie and HBase.
  • Experience implementing big data projects using Cloudera.
  • Installed, Configured and Maintained Apache Hadoop clusters for application development and Hadoop tools like Hive, Pig, HBase, Zookeeper and Sqoop.
  • Hands-on experience in designing and implementing solutions using Apache Hadoop 2.4.0, HDFS 2.7, MapReduce2, HBase 1.1, Hive 1.2, Oozie 4.2.0, Tez 0.7.0, YARN 2.7.0, Sqoop 1.4.6 and MongoDB.
  • Experience implementing Kafka producers and consumer groups to read messages from multiple partitions in parallel (see the sketch after this list).
  • Set up and integrated Hadoop ecosystem tools such as HBase, Hive, Pig and Sqoop.
  • Hands-on experience loading data into Spark RDDs and performing in-memory computation.
  • Hands-on experience installing, configuring and using Hadoop ecosystem components such as Hadoop MapReduce, HDFS, HBase, Hive, Sqoop, Pig, ZooKeeper, Storm, Spark, Kafka and Flume.
  • Strong understanding of Data Modeling and experience with Data Cleansing, Data Profiling and Data analysis.
  • Configured Hadoop clusters in OpenStack and Amazon Web Services (AWS)
  • Experience in ETL (DataStage) analysis, design, development, testing and implementation, including performance tuning and database query optimization.
  • Experience in extracting source data from Sequential files, XML files, Excel files, transforming and loading it into the target data warehouse.
  • Strong experience with Java/J2EE technologies such as Core Java, JDBC, JSP, JSTL, HTML, JavaScript, JSON
  • Experience in deploying and managing multi-node development and production Hadoop clusters with different Hadoop components (Hive, Pig, Sqoop, Oozie, Flume, HCatalog, HBase, ZooKeeper) using Hortonworks Ambari.
  • Gained optimal performance in HBase with data compression, region splits and manually managed compactions.
  • Upgraded from HDP 2.1 to HDP 2.2 and then to HDP 2.3.
  • Working experience with the MapReduce programming model and the Hadoop Distributed File System.
  • In-depth understanding of Hadoop architecture and its components, such as HDFS, JobTracker, TaskTracker, NameNode, DataNode and MapReduce concepts.
  • Hands-on experience in Unix/Linux environments, including software installations/upgrades, shell scripting for job automation and other maintenance activities.
  • Thorough knowledge and experience in SQL and PL/SQL concepts.
  • Expertise in setting up standards and processes for Hadoop based application design and implementation.
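
A minimal sketch of the Kafka producer / consumer-group pattern referenced above, written in Scala against the kafka-clients 2.x Java API. The broker address, topic name (web-logs) and group id (log-readers) are illustrative placeholders, not values from a real engagement.

```scala
import java.util.{Collections, Properties}
import org.apache.kafka.clients.producer.{KafkaProducer, ProducerRecord}
import org.apache.kafka.clients.consumer.KafkaConsumer
import scala.collection.JavaConverters._

object KafkaRoundTrip {
  def main(args: Array[String]): Unit = {
    // Producer: writes keyed messages; the key decides the target partition.
    val producerProps = new Properties()
    producerProps.put("bootstrap.servers", "broker1:9092") // placeholder broker
    producerProps.put("key.serializer", "org.apache.kafka.common.serialization.StringSerializer")
    producerProps.put("value.serializer", "org.apache.kafka.common.serialization.StringSerializer")
    val producer = new KafkaProducer[String, String](producerProps)
    producer.send(new ProducerRecord[String, String]("web-logs", "host-01", "GET /index.html 200"))
    producer.close()

    // Consumer: every consumer sharing this group.id is assigned a disjoint
    // subset of the topic's partitions, so the partitions are read in parallel.
    val consumerProps = new Properties()
    consumerProps.put("bootstrap.servers", "broker1:9092")
    consumerProps.put("group.id", "log-readers") // placeholder consumer group
    consumerProps.put("key.deserializer", "org.apache.kafka.common.serialization.StringDeserializer")
    consumerProps.put("value.deserializer", "org.apache.kafka.common.serialization.StringDeserializer")
    val consumer = new KafkaConsumer[String, String](consumerProps)
    consumer.subscribe(Collections.singletonList("web-logs"))
    val records = consumer.poll(java.time.Duration.ofSeconds(5))
    records.asScala.foreach(r => println(s"partition=${r.partition} offset=${r.offset} value=${r.value}"))
    consumer.close()
  }
}
```

Running several copies of the consumer with the same group.id spreads the partitions across them, which is what gives the parallel read described above.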

TECHNICAL SKILLS

Hadoop Ecosystem: Spark Core, Kafka, Spark SQL, HDFS, YARN, Sqoop, Pig, Hive, Oozie, Flume, MapReduce, Storm

Development and Build Tools: Eclipse, NetBeans, IntelliJ, Ant, Maven, Ivy, TOAD, SQL Developer

Databases: HBase, Cassandra, Oracle, SQL Server 2008 R2/2012, MySQL, ODI

Languages: Java (JDK 1.4/1.5/1.6), C/C++, SQL, PL/SQL, Scala, Python

Operating Systems: Windows Server 2000/2003/2008, Windows XP/Vista, Mac OS, UNIX, LINUX

Java Technologies: Spring 3.0, Struts 2.2.1, Hibernate 3.0, Spring-WS, Apache Kafka

Frameworks: JUnit and Jest

IDE’s & Utilities: Eclipse, Maven, NetBeans.

SQL Server Tools: SQL Server Management Studio, Enterprise Manager, Query Analyser, Profiler, Export & Import (DTS).

Web Dev. Technologies: ASP.NET, HTML, HTML5, XML, CSS3, JavaScript/jQuery

PROFESSIONAL EXPERIENCE

Confidential, New York, NY

Hadoop Developer

Responsibilities:

  • Developed ETL data pipelines using Spark, Spark Streaming and Scala.
  • Loaded data from RDBMS to Hadoop using Sqoop
  • Worked collaboratively to manage build outs of large data clusters and real time streaming with Spark.
  • Responsible for loading data pipelines from web servers using Sqoop, Kafka and the Spark Streaming API (see the sketch after this list).
  • Developed Kafka producers, partitions in brokers and consumer groups.
  • Used Spark for interactive queries, processing of streaming data and integration with a popular NoSQL database for large data volumes.
  • Developed batch jobs to fetch data from AWS S3 storage and perform the required transformations in Scala using the Spark framework.
  • Implemented Spark applications using Scala and Spark SQL for faster testing and processing of data.
  • Processed data using MapReduce and YARN; worked on Kafka as a proof of concept for log processing.
  • Monitored the Hive metastore and the cluster nodes with the help of Hue.
  • Developed Spark code using Scala and Spark SQL/Streaming for faster processing of data.
  • Created AWS EC2 instances and used JIT servers.
  • Handled data integrity checks using Hive queries, Hadoop and Spark.
  • Performed transformations and actions on RDDs and Spark Streaming data with Scala.
  • Implemented machine learning algorithms using Spark with Python.
  • Defined job flows and developed simple to complex MapReduce jobs as per requirements.
  • Optimized MapReduce jobs to use HDFS efficiently through various compression mechanisms.
  • Developed Pig UDFs for manipulating data according to business requirements and developed custom Pig loaders.
  • Responsible for handling streaming data from web server console logs.
  • Installed the Oozie workflow engine to run multiple Hive and Pig jobs.
  • Developed Pig Latin scripts for the analysis of semi-structured data.
  • Created Hive tables, loaded data into them and wrote Hive UDFs.
  • Used Sqoop to import data into HDFS and Hive from other data systems.
  • Installed and configured Apache Hadoop to test the maintenance of log files in Hadoop cluster.
  • Installed and configured Hive, Pig, Sqoop, Flume and Oozie on the Hadoop cluster.
  • Developed ETL processes (DataStage Open Studio) to load data from multiple data sources to HDFS using Flume and Sqoop, and performed structural modifications using MapReduce and Hive.
  • Involved in NoSQL database design, integration and implementation
  • Loaded data into NoSQL database HBase.
  • Developed Kafka producers and consumers, HBase clients, and Spark and Hadoop MapReduce jobs, along with components on HDFS and Hive.
  • Very good understanding of partitioning and bucketing concepts in Hive; designed both managed and external tables in Hive to optimize performance.
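
A condensed sketch of the kind of Kafka-to-Spark-Streaming ingestion described above, using Scala and the spark-streaming-kafka-0-10 integration. The topic, broker address, filter rule and output path are illustrative assumptions, not details of the actual engagement.

```scala
import org.apache.kafka.common.serialization.StringDeserializer
import org.apache.spark.SparkConf
import org.apache.spark.streaming.{Seconds, StreamingContext}
import org.apache.spark.streaming.kafka010.{ConsumerStrategies, KafkaUtils, LocationStrategies}

object StreamingIngest {
  def main(args: Array[String]): Unit = {
    val conf = new SparkConf().setAppName("web-log-streaming") // placeholder app name
    val ssc  = new StreamingContext(conf, Seconds(30))

    val kafkaParams = Map[String, Object](
      "bootstrap.servers"  -> "broker1:9092",       // placeholder broker
      "key.deserializer"   -> classOf[StringDeserializer],
      "value.deserializer" -> classOf[StringDeserializer],
      "group.id"           -> "spark-web-logs")     // placeholder consumer group

    // Direct stream: one Spark partition per Kafka partition, read in parallel.
    val stream = KafkaUtils.createDirectStream[String, String](
      ssc,
      LocationStrategies.PreferConsistent,
      ConsumerStrategies.Subscribe[String, String](Seq("web-logs"), kafkaParams))

    stream.map(_.value)                        // raw log line
      .filter(_.contains(" 200 "))             // keep successful requests (illustrative rule)
      .map(line => (line.split(" ")(0), 1L))   // key by client host
      .reduceByKey(_ + _)                      // hits per host per 30-second batch
      .saveAsTextFiles("hdfs:///data/web-logs/hits") // placeholder output path

    ssc.start()
    ssc.awaitTermination()
  }
}
```

The direct stream keeps the transformations of each micro-batch fully in memory and parallel across executors, which is the behaviour the RDD/Streaming bullets above rely on.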

Environment: Spark, Spark Streaming, Apache Kafka, Hive, AWS, ETL, Pig, UNIX, Linux, Tableau, Teradata, Sqoop, Hue, Oozie, Java, Scala, Python, Git

Confidential - Eden Prairie, MN

Hadoop (Big Data) Developer

Responsibilities:

  • Analyzed the Hadoop cluster and different big data analytic tools, including Pig, the HBase database and Sqoop.
  • Responsible for building scalable distributed data solutions using Hadoop.
  • Implemented a nine-node CDH3 Hadoop cluster on CentOS.
  • Implemented the Apache Crunch library on top of MapReduce and Spark for data aggregation.
  • Involved in loading data from the Linux file system to HDFS.
  • Worked on cluster installation, commissioning and decommissioning of DataNodes, NameNode recovery, capacity planning and slot configuration.
  • Implemented a script to transmit data from Oracle to HBase using Sqoop.
  • Implemented best income logic using Pig scripts and UDFs.
  • Exported the analyzed data to the relational databases using Sqoop for visualization and to generate reports for the BI team.
  • Applied design patterns and OO design concepts to improve the existing Java/J2EE-based code base.
  • Developed JAX-WS web services.
  • Handled Type 1 and Type 2 slowly changing dimensions.
  • Imported and exported data between databases and HDFS using Sqoop.
  • Wrote Hive jobs to parse the logs and structure them in tabular format to facilitate effective querying.
  • Involved in the design, implementation, and maintenance of Data warehouses
  • Involved in creating Hive tables, loading with data and writing Hive queries
  • Implemented custom interceptors for flume to filter data as per requirement.
  • Used Hive and Pig to analyze data in HDFS to identify issues and behavioral patterns.
  • Created internal and external Hive tables and defined static and dynamic partitions for optimized performance (see the sketch after this list).
  • Wrote Pig Latin scripts for running advanced analytics on the data collected.
  • Configured daily workflow for extraction, processing and analysis of data using Oozie Scheduler.
  • Proactively involved in ongoing maintenance, support and improvements in the Hadoop cluster.
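
An illustrative sketch of internal vs. external Hive tables with static and dynamic partitions, as mentioned above. It is expressed through Spark SQL only to keep every code sample in this resume in Scala; the original work ran equivalent DDL directly in Hive, and the table names, columns and paths are placeholders.

```scala
import org.apache.spark.sql.SparkSession

object HivePartitionedTables {
  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder()
      .appName("hive-partitioning-sketch")
      .enableHiveSupport()
      .getOrCreate()

    // External table: Hive owns only the metadata; the files stay at the given HDFS location.
    spark.sql("""
      CREATE EXTERNAL TABLE IF NOT EXISTS raw_logs (host STRING, url STRING, status INT)
      PARTITIONED BY (load_date STRING)
      STORED AS TEXTFILE
      LOCATION 'hdfs:///data/raw_logs'
    """)

    // Static partition: the partition value is spelled out in the statement.
    spark.sql("ALTER TABLE raw_logs ADD IF NOT EXISTS PARTITION (load_date='2016-01-01')")

    // Managed (internal) table loaded with dynamic partitions: Hive derives the
    // partition value from the last SELECT column and owns the underlying files.
    spark.sql("SET hive.exec.dynamic.partition.mode=nonstrict")
    spark.sql("""
      CREATE TABLE IF NOT EXISTS clean_logs (host STRING, url STRING)
      PARTITIONED BY (load_date STRING)
      STORED AS ORC
    """)
    spark.sql("""
      INSERT OVERWRITE TABLE clean_logs PARTITION (load_date)
      SELECT host, url, load_date FROM raw_logs WHERE status = 200
    """)

    spark.stop()
  }
}
```

Dropping the external table removes only metadata, while dropping the managed table also deletes its files; partition pruning on load_date is what gives the query-time performance gain.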

Environment: Hadoop, HDFS, Pig, Sqoop, HBase, Shell Scripting, CDH3, CentOS, Oozie, UNIX, T-SQL

Confidential, San Mateo, CA

Hadoop Developer

Responsibilities:

  • Provided suggestions on converting to Hadoop using MapReduce, Hive, Sqoop, Flume and Pig Latin.
  • Wrote Spark applications for data validation, cleansing, transformations and custom aggregations (see the sketch after this list).
  • Imported data from different sources into Spark RDD for processing.
  • Developed custom aggregate functions using Spark SQL and performed interactive querying.
  • Worked on cluster installation, commissioning and decommissioning of DataNodes, NameNode high availability, capacity planning and slot configuration.
  • Responsible for managing data coming from different sources.
  • Imported and exported data into HDFS using Flume.
  • Experienced in analyzing data with Hive and Pig.
  • Involved in creating Hive tables, loading them with data and writing Hive queries that run internally as MapReduce jobs.
  • Set up and benchmarked Hadoop/HBase clusters for internal use.
  • Set up a Hadoop cluster on Amazon EC2 using Whirr for a POC.
  • Developed applications with Hadoop big data technologies: Pig, Hive, MapReduce, Oozie, Flume and Kafka.
  • Experienced in managing and reviewing Hadoop log files.
  • Helped with big data technologies for integration of Hive with HBase and Sqoop with HBase.
  • Analyzed data with Hive, Pig and Hadoop Streaming.
  • Involved in transferring legacy relational database tables to HDFS and HBase tables using Sqoop, and vice versa.
  • Involved in cluster coordination services through ZooKeeper and in adding new nodes to an existing cluster.
  • Moved the data from traditional databases like MySQL, MS SQL Server and Oracle into Hadoop
  • Worked on Integrating Talend and SSIS with Hadoop and performed ETL operations.
  • Installed Hive, Pig, Flume, Sqoop and Oozie on the Hadoop cluster.
  • Used Flume to collect, aggregate and push log data from different log servers.
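
A minimal sketch of the data validation, cleansing and custom-aggregation work described above, in Scala with Spark SQL. The input path, column names and the value_bucket function are hypothetical examples used only to show the pattern.

```scala
import org.apache.spark.sql.SparkSession
import org.apache.spark.sql.functions.col

object ValidateAndAggregate {
  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder().appName("validate-and-aggregate").getOrCreate()

    // Hypothetical input: CSV extracts landed on HDFS.
    val raw = spark.read.option("header", "true").csv("hdfs:///data/transactions/*.csv")

    // Validation / cleansing: drop rows missing mandatory fields,
    // cast the amount column and discard non-positive amounts.
    val clean = raw
      .na.drop(Seq("customer_id", "amount"))
      .withColumn("amount", col("amount").cast("double"))
      .filter(col("amount") > 0)

    // Custom function registered for use inside Spark SQL aggregations.
    spark.udf.register("value_bucket",
      (amount: Double) => if (amount >= 1000.0) "high" else "low")

    clean.createOrReplaceTempView("transactions")
    val summary = spark.sql("""
      SELECT customer_id,
             value_bucket(MAX(amount)) AS top_bucket,
             COUNT(*)                  AS txn_count,
             SUM(amount)               AS total_amount
      FROM transactions
      GROUP BY customer_id
    """)
    summary.show()
    spark.stop()
  }
}
```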

Environment: Hadoop, Hortonworks, Linux, HDFS, Hive, Sqoop, Flume, Zookeeper and HBase

Confidential

Java Developer

Responsibilities:

  • Developed the business logic using Java Beans and Session Beans.
  • Developed a system to access the legacy system database using JDBC.
  • Implemented validation rules using Struts framework
  • Developed user interface using JSP, HTML, Velocity template
  • Encapsulated persistence-layer operations in Data Access Objects (DAOs) and used Hibernate for data retrieval from the database.
  • Developed web services components using XML, WSDL and SOAP with a DOM parser to transfer and transform data between applications.
  • Exposed various capabilities as web services using SOAP/WSDL.
  • Used SoapUI to test the web services by sending SOAP requests.
  • Used AJAX framework for server communication and seamless user experience.
  • Created a test framework on Selenium and executed web testing in Chrome, IE and Mozilla through WebDriver.
  • Used client-side JavaScript (jQuery) for designing tabs and dialog boxes.
  • Created UNIX shell scripts to automate the build process and perform regular jobs such as file transfers between different hosts.
  • Designed, built, tested and deployed enhanced web services.
  • Involved in system design, coding, testing, installation, documentation and post-deployment audits, all performed in accordance with the established standards.
  • Developed RESTful Web Service using Spring and Apache CXF
  • Created Java servlets and other classes deployed as an EAR file, connecting to the Oracle database using JDBC (see the sketch below).
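
A minimal DAO-over-JDBC sketch of the persistence pattern used here. The original implementation was in Java; it is written in Scala below only to keep all code samples in this resume in one language, and the connection URL, table and column names are hypothetical.

```scala
import java.sql.{Connection, DriverManager}
import scala.collection.mutable.ListBuffer

// Minimal DAO sketch over plain JDBC; table and column names are placeholders.
class CustomerDao(url: String, user: String, password: String) {

  def findCustomerNames(city: String): List[String] = {
    val conn: Connection = DriverManager.getConnection(url, user, password)
    try {
      // Parameterised query: the driver handles escaping, so no string concatenation.
      val stmt = conn.prepareStatement("SELECT name FROM customers WHERE city = ?")
      stmt.setString(1, city)
      val rs = stmt.executeQuery()
      val names = ListBuffer[String]()
      while (rs.next()) names += rs.getString("name")
      names.toList
    } finally {
      conn.close() // always release the connection
    }
  }
}

// Hypothetical usage:
//   new CustomerDao("jdbc:oracle:thin:@//db-host:1521/ORCL", "app_user", "app_pass")
//     .findCustomerNames("New York")
```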

Environment: Hibernate, MVC, JavaScript, CSS, Maven, Java 1.6, XML, JUnit, SQL, PL/SQL, Eclipse, WebSphere

Confidential

Java/J2EE Consultant

Responsibilities:

  • Modified application flows and the existing UML diagrams.
  • Involved in preparing the Change Request technical solution document and implementation plan.
  • Followed MVC architecture using Struts.
  • Worked on Struts Framework and developed action and form classes for User interface.
  • Mapped event classes, HTML files and JavaBean classes using XML.
  • Used J2EE design patterns like Singleton, DAO and DTO.
  • Developed the UI using HTML, JavaScript and JSP, and developed business logic and interfacing components using business objects, XML and JDBC.
  • Designed the user interface and implemented validation checks using JavaScript.
  • Managed connectivity using JDBC for querying/inserting and data management, including triggers and stored procedures.
  • Developed various EJBs for handling business logic and data manipulations from database.
  • Involved in the design of JSPs and servlets for navigation among the modules.
  • Designed cascading style sheets and the XML part of the Order Entry and Product Search modules, and did client-side validations with JavaScript.
  • Developed customized interfaces for various clients using CSS and JavaScript.
  • Performed code reviews for peers and maintained the code repositories using Git.
  • Enhanced the mechanism of logging and tracing with Log4j.
  • Generated web service clients from WSDL files.
  • Involved in development of the presentation layer using Struts and custom tag libraries.
  • Performed integration testing, supported the project and tracked Confidential with the help of JIRA.
  • Acted as the first point of contact for business queries during the development and testing phases.
  • Worked closely with clients and the QA team to resolve critical issues/bugs.

Environment: EcommCore, JavaScript, CSS, Ivy, Java 1.6, YUI 2.8, Web Services, XML, XML Parsers (SAX/JAXB), JUnit, DAO/DTO, BlueZone, Eclipse, Apache Tomcat, Git, Jenkins, Arthur
