
Hadoop Developer Resume


Minneapolis, MN

SUMMARY

  • 8+ years of IT experience across the complete software development life cycle, applying object-oriented analysis and design with Big Data / Hadoop ecosystem, SQL, Java, and J2EE technologies.
  • Around 5 years of experience in Big Data and Data Science, building advanced customer insight and product analytics platforms on open source technologies.
  • Broad experience in data mining, real-time analytics, business intelligence, machine learning, and web development.
  • Strong skills in developing applications with Big Data technologies such as Hadoop, Spark, Elasticsearch, MapReduce, YARN, Flume, Hive, Pig, Kafka, Storm, Sqoop, HBase, Hortonworks, Cloudera, Mahout, Avro, and Scala.
  • Skilled in programming with the MapReduce framework and the Hadoop ecosystem.
  • Very good experience in designing and implementing MapReduce jobs that process large data sets in a distributed fashion across the Hadoop cluster.
  • Experience in implementing the inverted indexing algorithm using MapReduce (a brief sketch appears after this list).
  • Extensive experience in creating Hive tables, loading them with data, and writing Hive queries that run internally as MapReduce jobs.
  • Hands-on experience in migrating complex MapReduce programs into Apache Spark RDD transformations.
  • Experience in setting up standards and processes for Hadoop-based application design and implementation.
  • Good exposure to Apache Hadoop MapReduce programming, Pig scripting, and HDFS.
  • Worked on developing ETL processes to load data from multiple sources into HDFS using Flume and Sqoop, perform structural modifications using MapReduce and Hive, and analyze data using visualization/reporting tools.
  • Experience in writing Pig UDFs (Eval, Filter, Load, and Store) and macros.
  • Experience in developing custom UDFs in Java to extend Hive and Pig Latin functionality.
  • Exposure to using Apache Kafka to build log data pipelines as streams of messages with producers and consumers.
  • Experience in integrating Apache Kafka with Apache Storm and creating Storm data pipelines for real-time processing.
  • Developed and maintained operational best practices for smooth operation of Cassandra/Hadoop clusters.
  • Very good understanding of NoSQL databases such as MongoDB, Cassandra, and HBase.
  • Extracted data from MongoDB through Sqoop, placed it in HDFS, and processed it.
  • Experience in coordinating Cluster services through ZooKeeper.
  • Hands-on experience in setting up Apache Hadoop, MapR, and Hortonworks clusters.
  • Good knowledge of Apache Hadoop cluster planning, including choosing the hardware and operating systems to host an Apache Hadoop cluster.
  • Experience with Hadoop distributions such as Cloudera, Hortonworks, BigInsights, MapR, and Windows Azure, and with Impala.
  • Experience using integrated development environments such as Eclipse, NetBeans, JDeveloper, and MyEclipse.
  • Excellent understanding of relational databases as they pertain to application development, covering several RDBMS including IBM DB2, Oracle 10g, MS SQL Server 2005/2008, and MySQL, with strong database skills in SQL, stored procedures, and PL/SQL.
  • Working knowledge of J2EE development with the Spring, Struts, and Hibernate frameworks in various projects, and expertise in web services development (JAXB, SOAP, WSDL, RESTful).
  • Experience in writing tests using Specs2, ScalaTest, Selenium, TestNG, and JUnit.
  • Ability to work on diverse application servers such as JBoss, Apache Tomcat, and WebSphere.
  • Worked on different operating systems, including UNIX/Linux, Windows XP, and other Windows versions.
  • A passion for learning new things (new languages or new implementations) has kept me up to date with the latest trends and industry standards.
  • Proficient in adapting to the new Work Environment and Technologies.
  • Quick learner and self-motivated team player with excellent interpersonal skills.
  • Well focused and able to meet expected deadlines.
  • Good understanding of Scrum methodologies, Test-Driven Development, and continuous integration.
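
Illustrative sketch of the inverted-index MapReduce idea mentioned above (not actual project code; class names and the tokenization are simplified assumptions):

    import java.io.IOException;
    import org.apache.hadoop.io.LongWritable;
    import org.apache.hadoop.io.Text;
    import org.apache.hadoop.mapreduce.Mapper;
    import org.apache.hadoop.mapreduce.Reducer;
    import org.apache.hadoop.mapreduce.lib.input.FileSplit;

    // Mapper: emit (word, documentId) for every token in the input split.
    public class InvertedIndexMapper extends Mapper<LongWritable, Text, Text, Text> {
        private final Text word = new Text();
        private final Text docId = new Text();

        @Override
        protected void map(LongWritable key, Text value, Context context)
                throws IOException, InterruptedException {
            // Use the source file name as a simple document identifier.
            docId.set(((FileSplit) context.getInputSplit()).getPath().getName());
            for (String token : value.toString().toLowerCase().split("\\W+")) {
                if (token.isEmpty()) continue;
                word.set(token);
                context.write(word, docId);
            }
        }
    }

    // Reducer: collapse the document ids for each word into one posting list.
    class InvertedIndexReducer extends Reducer<Text, Text, Text, Text> {
        @Override
        protected void reduce(Text word, Iterable<Text> docIds, Context context)
                throws IOException, InterruptedException {
            java.util.Set<String> seen = new java.util.TreeSet<>();
            for (Text id : docIds) seen.add(id.toString());
            context.write(word, new Text(String.join(",", seen)));
        }
    }

A driver class would wire these into a Job with Text output key and value types.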

TECHNICAL SKILLS

Hadoop/Big Data: HDFS, MapReduce, HBase, Pig, Hive, Sqoop, Flume, MongoDB, Avro, Hadoop Streaming, Cassandra, Oozie, ZooKeeper, Spark, Storm, Kafka

Java & J2EE Technologies: Core Java, Servlets, JSP, JDBC, JNDI, Java Beans

IDEs: Eclipse, NetBeans, WSAD, Oracle SQL Developer

Big data Analytics: Datameer 2.0.5

Frameworks: MVC, Struts, Hibernate, Spring and MRUnit

Languages: C, C++, Java, Python, Linux shell scripts, SQL

Databases: Cassandra, MongoDB, HBase, Teradata, Oracle, MySQL, DB2

Web Servers: JBoss, Web Logic, Web Sphere, Apache Tomcat

Web Technologies: HTML, XML, JavaScript, CSS, AJAX, JSON, Servlets,JSP

Reporting Tools: Jasper Reports, iReports

ETL Tools: Informatica, Pentaho

PROFESSIONAL EXPERIENCE

Hadoop Developer

Confidential - Minneapolis, MN

Responsibilities:

  • Developed MapReduce programs to parse the raw data and create intermediate data that was then loaded into partitioned Hive tables.
  • Involved in data ingestion into HDFS using Sqoop for full loads and Flume for incremental loads from a variety of sources such as web servers, RDBMS, and data APIs.
  • Ran multiple MapReduce jobs through Pig and Hive for data cleaning and pre-processing.
  • Developed MapReduce programs to join data from different data sources using optimized joins, implementing bucketed joins or map joins depending on the requirement.
  • Implemented the business logic layer for MongoDB services.
  • Executed Hive queries on tables stored in Hive to perform data analysis and meet business requirements.
  • Worked on creating Kafka topics and partitions and writing custom partitioner classes.
  • Worked on Big Data integration and analytics based on Hadoop, Spark, and Kafka.
  • Worked with Kafka on a proof of concept for carrying out log processing on a distributed system.
  • Integrated Apache Storm with Kafka to perform web analytics, and uploaded clickstream data from Kafka to HDFS, HBase, and Hive through the Storm integration.
  • Performed real-time analysis of the incoming data using the Kafka consumer API, Kafka topics, and Spark Streaming in Scala.
  • Used DataStax OpsCenter for maintenance operations and for keyspace and table management.
  • Implemented advanced procedures such as text analytics and processing using in-memory computing with Spark.
  • Developed Spark code using Python and Spark-SQL/Streaming for faster processing of data.
  • Streamed data in real time using Spark with Kafka.
  • Built a real-time pipeline for streaming data using Kafka and Spark Streaming.
  • Developed Kafka producers and consumers, Cassandra clients, and Spark jobs, along with components on HDFS and Hive.
  • Processed large data sets in parallel across the Hadoop cluster for pre-processing.
  • Developed the code for importing and exporting data into HDFS using Sqoop.
  • Imported data from structured data sources into HDFS using Sqoop incremental imports.
  • Implemented custom Kafka partitioners to send data to different categorized topics (a brief sketch follows this list).
  • Implemented a Storm topology with stream groupings to perform real-time analytical operations.
  • Experience in implementing Kafka spouts for streaming data and different bolts to consume data.
  • Created Hive tables and partitions and implemented incremental imports to support ad-hoc queries on structured data.
  • Wrote shell scripts that run multiple Hive jobs to incrementally refresh different Hive tables, which are then used to generate reports in Tableau for business use.
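
A minimal sketch of the custom Kafka partitioner idea referenced above, using Kafka's Java Partitioner interface; this is illustrative only, and the "category|eventId" key convention and class name are hypothetical:

    import java.util.List;
    import java.util.Map;
    import org.apache.kafka.clients.producer.Partitioner;
    import org.apache.kafka.common.Cluster;
    import org.apache.kafka.common.PartitionInfo;
    import org.apache.kafka.common.utils.Utils;

    // Routes records to partitions by a category prefix in the record key,
    // so related events for a category land on the same partition.
    public class CategoryPartitioner implements Partitioner {

        @Override
        public int partition(String topic, Object key, byte[] keyBytes,
                             Object value, byte[] valueBytes, Cluster cluster) {
            List<PartitionInfo> partitions = cluster.partitionsForTopic(topic);
            int numPartitions = partitions.size();
            if (keyBytes == null) {
                return 0;                     // no key: fall back to partition 0
            }
            // Hypothetical convention: keys look like "category|eventId".
            String category = key.toString().split("\\|", 2)[0];
            return Utils.toPositive(Utils.murmur2(category.getBytes())) % numPartitions;
        }

        @Override
        public void close() { }

        @Override
        public void configure(Map<String, ?> configs) { }
    }

The class would be registered on the producer through the partitioner.class producer property.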

Environment: Hadoop, Hive, Flume, Linux, Shell Scripting, Java, Eclipse, MongoDB, Kafka, Spark, ZooKeeper, Sqoop, Ambari.

Hadoop Developer

Confidential - Austin, TX

Responsibilities:

  • Developed MapReduce jobs in Java for data cleansing and preprocessing, and implemented complex data analytical algorithms.
  • Created Hive generic UDFs to process business logic with HiveQL (a brief sketch follows this list).
  • Involved in optimizing Hive queries, improving performance by configuring Hive query parameters.
  • Used Cassandra Query Language (CQL) to perform analytics on time series data.
  • Worked on the HBase shell, CQL, the HBase API, and the Cassandra Hector API as part of a proof of concept.
  • Moved data from HDFS to Cassandra using MapReduce and the BulkOutputFormat class.
  • Responsible for running Hadoop Streaming jobs to process terabytes of XML data.
  • Developed Oozie workflows for orchestrating and scheduling the ETL process.
  • Involved in implementing the Avro, ORC, and Parquet data formats for Apache Hive computations to handle custom business requirements.
  • Used Flume to collect, aggregate, and store web log data from different sources such as web servers, mobile, and network devices, and pushed it to HDFS.
  • Wrote Unix shell scripts in combination with the Talend data maps to process source files and load them into the staging database.
  • Worked on retrieving transaction data from RDBMS to HDFS, computing the total transacted amount per user using MapReduce, and saving the output in a Hive table.
  • Experience in implementing Kafka consumers and producers by extending the Kafka high-level API in Java and ingesting data to HDFS or HBase depending on the context.
  • Worked on creating the workflow to run multiple Hive and Pig jobs, which run independently with time and data availability.
  • Developed SQL scripts using Spark for handling different data sets and verifying their performance against MapReduce jobs.
  • Involved in converting MapReduce programs into Spark transformations using Spark RDDs and Python.
  • Worked on Amazon AWS concepts such as EMR and EC2 web services for fast and efficient processing of Big Data.
  • Responsible for maintaining and expanding AWS (cloud services) infrastructure using AWS SNS and SQS.
  • Developed Spark scripts using Python shell commands as per the requirements.
  • Experience implementing machine learning techniques in Spark using Spark MLlib.
  • Involved in moving data from Hive tables into Cassandra for real-time analytics.
  • Used Hadoop benchmarks for monitoring and testing the Hadoop cluster.
  • Involved in implementing test cases and testing MapReduce programs using MRUnit and other mocking frameworks.
  • Involved in cluster maintenance, including adding and removing cluster nodes, cluster monitoring and troubleshooting, and reviewing and managing data backups and Hadoop log files.
  • Implemented Maven build scripts for Maven projects and integrated them with Jenkins.
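
A hedged sketch of a Hive generic UDF of the kind mentioned above (illustrative only, not project code; the masking logic and names are assumptions):

    import org.apache.hadoop.hive.ql.exec.UDFArgumentException;
    import org.apache.hadoop.hive.ql.exec.UDFArgumentLengthException;
    import org.apache.hadoop.hive.ql.metadata.HiveException;
    import org.apache.hadoop.hive.ql.udf.generic.GenericUDF;
    import org.apache.hadoop.hive.serde2.objectinspector.ObjectInspector;
    import org.apache.hadoop.hive.serde2.objectinspector.ObjectInspectorConverters;
    import org.apache.hadoop.hive.serde2.objectinspector.primitive.PrimitiveObjectInspectorFactory;

    // Hypothetical generic UDF: masks all but the last four characters of a string column.
    public class MaskValueUDF extends GenericUDF {
        private ObjectInspectorConverters.Converter toStringConverter;

        @Override
        public ObjectInspector initialize(ObjectInspector[] arguments) throws UDFArgumentException {
            if (arguments.length != 1) {
                throw new UDFArgumentLengthException("mask_value() expects exactly one argument");
            }
            toStringConverter = ObjectInspectorConverters.getConverter(
                    arguments[0], PrimitiveObjectInspectorFactory.javaStringObjectInspector);
            return PrimitiveObjectInspectorFactory.javaStringObjectInspector;
        }

        @Override
        public Object evaluate(DeferredObject[] arguments) throws HiveException {
            Object raw = arguments[0].get();
            if (raw == null) {
                return null;
            }
            String s = (String) toStringConverter.convert(raw);
            int keep = Math.min(4, s.length());
            return "****" + s.substring(s.length() - keep);
        }

        @Override
        public String getDisplayString(String[] children) {
            return "mask_value(" + children[0] + ")";
        }
    }

The UDF would be registered in HiveQL with CREATE TEMPORARY FUNCTION mask_value AS 'MaskValueUDF'; before use in a query.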

Environment: Hadoop, AWS, MapReduce, Hive, Spark, Avro, Kafka, Storm, Linux, Sqoop, Shell Scripting, Oozie, Cassandra, Git, XML, Scala, Java, Maven, Eclipse, Oracle.

Hadoop Developer

Confidential - Miami, FL

Responsibilities:

  • Involved in the full project life cycle: design, analysis, logical and physical architecture modeling, development, implementation, and testing.
  • Responsible for managing data coming from different sources, and involved in HDFS maintenance and loading of structured and unstructured data.
  • Developed MapReduce programs to parse the raw data and store the refined data in tables.
  • Designed and modified database tables and used HBase queries to insert and fetch data from tables (a brief sketch follows this list).
  • Involved in moving all log files generated from various sources to HDFS for further processing through Flume.
  • Developed algorithms for identifying influencers within specified social network channels.
  • Involved in loading and transforming large sets of structured, semi-structured, and unstructured data from relational databases into HDFS using Sqoop imports.
  • Analyzed data with Hive, Pig, and Hadoop Streaming.
  • Responsible for analyzing and cleansing raw data by performing Hive queries and running Pig scripts on the data.
  • Developed Pig Latin scripts to extract data from web server output files and load it into HDFS.
  • Created Hive tables, loaded data, and wrote Hive queries that run internally as MapReduce jobs.
  • Used Oozie operational services for batch processing and scheduling workflows dynamically.
  • Populated HDFS and Cassandra with huge amounts of data using Apache Kafka.
  • Experienced in working with Apache Storm.
  • Hands on experience in application development using Java, RDBMS, and Linux shell scripting.
  • Performed Data mining investigations to find new insights related to customers.
  • Involved in forecasting based on present results and insights derived from data analysis.
  • Developed a domain-specific sentiment analysis system using supervised machine learning.
  • Configured the Hadoop environment with Kerberos authentication, NameNodes, and DataNodes.
  • Designed source-to-target mappings from SQL Server and Excel/flat files to Oracle using Informatica PowerCenter.
  • Created data marts and loaded the data using the Informatica tool.
  • Developed and generated insights based on brand conversations, which in turn helped drive brand awareness, engagement, and traffic to social media pages.
  • Involved in identifying topics and trends and building context around the brand.
  • Developed different formulas for calculating engagement on social media posts.
  • Involved in identifying and analyzing defects, questionable functional errors, and inconsistencies in output.
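
An illustrative sketch (not project code) of the HBase insert-and-fetch pattern referenced above, using the standard HBase Java client API; the table name, column family, and row key layout are assumptions:

    import org.apache.hadoop.conf.Configuration;
    import org.apache.hadoop.hbase.HBaseConfiguration;
    import org.apache.hadoop.hbase.TableName;
    import org.apache.hadoop.hbase.client.Connection;
    import org.apache.hadoop.hbase.client.ConnectionFactory;
    import org.apache.hadoop.hbase.client.Get;
    import org.apache.hadoop.hbase.client.Put;
    import org.apache.hadoop.hbase.client.Result;
    import org.apache.hadoop.hbase.client.Table;
    import org.apache.hadoop.hbase.util.Bytes;

    public class SocialEventStore {
        public static void main(String[] args) throws Exception {
            Configuration conf = HBaseConfiguration.create();   // reads hbase-site.xml from the classpath
            try (Connection connection = ConnectionFactory.createConnection(conf);
                 Table table = connection.getTable(TableName.valueOf("social_events"))) {

                // Insert one refined record keyed by channel and timestamp.
                Put put = new Put(Bytes.toBytes("twitter#2016-01-01T00:00:00"));
                put.addColumn(Bytes.toBytes("d"), Bytes.toBytes("author"), Bytes.toBytes("someUser"));
                put.addColumn(Bytes.toBytes("d"), Bytes.toBytes("sentiment"), Bytes.toBytes("positive"));
                table.put(put);

                // Fetch it back by row key.
                Result result = table.get(new Get(Bytes.toBytes("twitter#2016-01-01T00:00:00")));
                String sentiment = Bytes.toString(result.getValue(Bytes.toBytes("d"), Bytes.toBytes("sentiment")));
                System.out.println("sentiment = " + sentiment);
            }
        }
    }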

Environment: Java, NLP, HBase, Machine Learning, Hadoop, HDFS, MapReduce, Hortonworks, Hive, Apache Storm, Sqoop, Flume, Oozie, Apache Kafka, ZooKeeper, MySQL, and Eclipse

Java Developer

Confidential - Houston, TX

Responsibilities:

  • Developed high-level design documents, Use case documents, detailed design documents and Unit Test Plan documents and created Use Cases, Class Diagrams and Sequence Diagrams using UML.
  • Extensive involvement in database design, development, and coding of stored procedures, DDL and DML statements, functions, and triggers.
  • Utilized Hibernate for object/relational mapping purposes, providing transparent persistence onto SQL Server.
  • Developed a portlet-style user experience using Ajax and jQuery.
  • Used Spring IoC for creating the beans to be injected at runtime.
  • Modified the existing JSP pages using JSTL.
  • Expertise in web design using HTML5, XHTML, XML, CSS3, JavaScript, jQuery, AJAX, and AngularJS.
  • Integrated Spring dependency injection among different layers of the application, with Hibernate as the O/R mapping tool, for rapid development and ease of maintenance.
  • Designed and developed the web tier using HTML5, CSS3, JSP, Servlets, Struts, and the Tiles framework.
  • Used AJAX and JavaScript for validations and for integrating business server-side components on the client side within the browser.
  • Developed the UI panels using JSF, XHTML, CSS, and jQuery.
  • Extensively involved in writing object/relational mapping code using Hibernate, and developed Hibernate mapping files for configuring Hibernate POJOs for relational mapping (a brief sketch follows this list).
  • Developed RESTful web services using Spring IoC to provide users a way to run the job and generate a daily status report.
  • Developed and exposed SOAP web services using JAX-WS, WSDL, AXIS, JAXP, and JAXB.
  • Involved in developing business components using EJB Session Beans and persistence using EJB Entity beans.
  • Implemented the Connectivity to the Database Server Using JDBC.
  • Consumed Web Services using Apache CXF framework for getting remote information.
  • Used Eclipse as the IDE, and configured and deployed the application on the WebLogic application server.
  • Used Maven build scripts to automate the build and deployment process.
  • Used JMS in the project for sending and receiving the messages on the queue.
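
For illustration, a minimal Hibernate-mapped entity of the kind described above; the original work used hbm.xml mapping files, and the annotation-based form shown here (along with the entity and column names) is an assumed equivalent:

    import javax.persistence.Column;
    import javax.persistence.Entity;
    import javax.persistence.GeneratedValue;
    import javax.persistence.GenerationType;
    import javax.persistence.Id;
    import javax.persistence.Table;

    // Hypothetical persistent POJO expressing a simple object/relational mapping.
    @Entity
    @Table(name = "JOB_STATUS")
    public class JobStatus {

        @Id
        @GeneratedValue(strategy = GenerationType.IDENTITY)
        @Column(name = "ID")
        private Long id;

        @Column(name = "JOB_NAME", nullable = false, length = 100)
        private String jobName;

        @Column(name = "STATUS", nullable = false, length = 20)
        private String status;

        // Getters and setters used by Hibernate for property access.
        public Long getId() { return id; }
        public void setId(Long id) { this.id = id; }
        public String getJobName() { return jobName; }
        public void setJobName(String jobName) { this.jobName = jobName; }
        public String getStatus() { return status; }
        public void setStatus(String status) { this.status = status; }
    }

Persisting an instance through a Spring-managed Hibernate SessionFactory then reduces to session.save(jobStatus).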

Environment: Java, J2EE, Spring, Hibernate, HTML5, XHTML, XML, JavaScript, jQuery, AJAX, AngularJS, Oracle SQL, SOAP, REST

Java developer

Confidential

Responsibilities:

  • Developed the UI using HTML, CSS, JavaScript, and AJAX.
  • Used Oracle IDE to create web services for the EI application using a top-down approach.
  • Worked on creating a basic framework for Spring and a web-services-enabled environment for EI applications as a web service provider.
  • Created a SOAP handler to enable authentication and audit logging during web service calls.
  • Created service layer APIs and domain objects using Struts.
  • Designed, developed, and configured the applications using the Struts framework.
  • Created Spring DAO classes to call the database through the Spring-JPA ORM framework.
  • Wrote PL/SQL queries, created stored procedures, and invoked the stored procedures using Spring JDBC (a brief sketch follows this list).
  • Used Exception handling and Multi-threading for the optimum performance of the application.
  • Used the Core Java concepts to implement the Business Logic.
  • Created High level Design Document for Web Services and EI common framework and participated in review discussion meeting with client.
  • Deployed and configured the data source for the database in the WebLogic application server, utilized log4j for tracking errors and debugging, and maintained the source code using Subversion.
  • Used the ClearCase tool for build management and ANT for application configuration and integration.
  • Created, executed, and documented the tests necessary to ensure that an application and/or environment meets performance requirements (technical, functional, and user interface).
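
A hedged sketch of invoking a stored procedure through Spring JDBC as mentioned above; the DAO class, procedure, and parameter names are hypothetical:

    import java.util.Map;
    import javax.sql.DataSource;
    import org.springframework.jdbc.core.namedparam.MapSqlParameterSource;
    import org.springframework.jdbc.core.simple.SimpleJdbcCall;

    public class AccountDao {
        private final SimpleJdbcCall getBalanceCall;

        public AccountDao(DataSource dataSource) {
            // Wraps the hypothetical GET_ACCOUNT_BALANCE procedure defined in PL/SQL.
            this.getBalanceCall = new SimpleJdbcCall(dataSource)
                    .withProcedureName("GET_ACCOUNT_BALANCE");
        }

        public Number findBalance(String accountId) {
            MapSqlParameterSource in = new MapSqlParameterSource()
                    .addValue("p_account_id", accountId);
            Map<String, Object> out = getBalanceCall.execute(in);
            // Assumes the procedure declares an OUT parameter named p_balance.
            return (Number) out.get("p_balance");
        }
    }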

Environment: Windows, Linux, Rational ClearCase, Java, JAX-WS, SOAP, WSDL, JSP, JavaScript, Ajax, Oracle IDE, log4j, ANT, Struts, JPA, XML, HTML5, CSS3, Oracle WebLogic.

Software Developer

Confidential

Responsibilities:

  • Worked as a Development Team Member.
  • Coordinated with Business Analysts to gather the requirement and prepare data flow diagrams and technical documents.
  • Identified Use Cases and generated Class, Sequence and State diagrams using UML.
  • Used JMS for the asynchronous exchange of critical business data and events among J2EE components and legacy systems (a brief sketch follows this list).
  • Involved in designing, coding, and maintaining Entity Beans and Session Beans using the EJB 2.1 specification.
  • Involved in the development of the web interface using the MVC Struts framework.
  • Developed the user interface using JSP and tag libraries, CSS, HTML, and JavaScript.
  • Made database connections using properties files.
  • Used a session filter to implement timeouts for idle users.
  • Used stored procedures to interact with the database.
  • Implemented persistence using the DAO pattern and the Hibernate framework.
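
An illustrative sketch of the JMS-based asynchronous exchange noted above, using the standard javax.jms API; the JNDI names, class name, and payload are assumptions:

    import javax.jms.Connection;
    import javax.jms.ConnectionFactory;
    import javax.jms.MessageProducer;
    import javax.jms.Queue;
    import javax.jms.Session;
    import javax.jms.TextMessage;
    import javax.naming.InitialContext;

    public class OrderEventSender {
        public void send(String payload) throws Exception {
            InitialContext jndi = new InitialContext();
            // JNDI names are illustrative; they would come from the app server configuration.
            ConnectionFactory factory = (ConnectionFactory) jndi.lookup("jms/ConnectionFactory");
            Queue queue = (Queue) jndi.lookup("jms/OrderEventQueue");

            Connection connection = factory.createConnection();
            try {
                Session session = connection.createSession(false, Session.AUTO_ACKNOWLEDGE);
                MessageProducer producer = session.createProducer(queue);
                TextMessage message = session.createTextMessage(payload);
                producer.send(message);       // asynchronous hand-off to the legacy consumer
            } finally {
                connection.close();
            }
        }
    }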

Environment: J2EE, Struts 1.0, JavaScript, Swing, CSS, HTML, XML, XSLT, DTD, JUnit, EJB 2.1, Oracle, Tomcat, Eclipse, WebLogic 7.0/8.1.
