Hadoop Developer Resume

Birmingham, AL

SUMMARY

  • 8+ years of software development experience, including 4 years with Big Data technologies such as Hadoop and Hadoop ecosystem components like Hive, Pig, Sqoop, HBase, Flume, Impala, and Oozie.
  • Hands-on experience with Core Java and strong client communication skills for requirement gathering.
  • Good hands-on knowledge of the Hadoop ecosystem and its core components, MapReduce and HDFS.
  • Good understanding of Hadoop daemon processes such as JobTracker, TaskTracker, NameNode, and DataNode.
  • Worked on installing, configuring, and administering Hadoop clusters for distributions such as Cloudera and Hortonworks.
  • Efficient in writing MapReduce programs and using the Apache Hadoop API to analyze structured and unstructured data.
  • Expert in working with the Hive data warehouse tool: creating tables, distributing data through partitioning and bucketing, and writing and optimizing HiveQL queries.
  • Experienced in debugging Pig and Hive scripts and in debugging and optimizing MapReduce jobs.
  • Hands-on experience in managing and reviewing Hadoop logs.
  • Experience using Apache Spark.
  • Experienced in developing MapReduce programs using Apache Hadoop to work with Big Data.
  • Experience in dealing with structured and semi-structured data in HDFS.
  • Knowledge of UNIX shell scripting.
  • Good understanding of ETL tools.
  • Developed ETL processes to load data from multiple sources into HDFS using Flume and Sqoop, performed structural modifications using MapReduce and Hive, and analyzed the data.
  • Developed Pig Latin scripts using operators such as LOAD, STORE, DUMP, FILTER, DISTINCT, FOREACH ... GENERATE, GROUP, ORDER, LIMIT, and UNION to extract data from data files and load it into HDFS.
  • Used Cassandra to handle large amounts of data across many commodity servers.
  • Extended Hive and Pig core functionality by writing custom UDFs (see the Hive UDF sketch after this list).
  • Knowledge of Apache Solr.
  • Understanding of Data Structures and Algorithms.
  • Experience using Elasticsearch.
  • Hands-on experience with Ruby, a dynamic, general-purpose, object-oriented programming language.
  • Hands-on experience with the Akka toolkit.
  • Hands-on experience in configuring and working with Flume to load data from multiple sources directly into HDFS.
  • Experience with the Python scripting language.
  • Experience with the C and C++ programming languages.
  • Experience working with cloud platforms such as AWS and Azure.
  • Experience processing queries in MongoDB.
  • Experience in using Apache Sqoop to import and export data to and from HDFS and Hive.
  • Managed and reviewed MapR log files.
  • Experience with IBM Insights.
  • Experience with web services such as REST and SOAP.
  • Experience using and managing the version control tool Git and the Jenkins build server.
  • Hands-on experience with ETL tools such as Talend and Pentaho Kettle.
  • Hands-on experience implementing custom MapReduce file formats, Writables, and Partitioners (see the partitioner sketch after this list).
  • Experience with messaging and stream-processing frameworks such as Kafka and Storm.
  • Experience using streaming technologies.
  • Strong knowledge of Hadoop cluster installation, capacity planning, performance tuning, benchmarking, disaster recovery planning, and application deployment in production clusters.
  • Experience writing scripts in R.
  • Experience using the data integration software Talend to deliver real-time solutions.
  • Strong knowledge of HDFS and MapReduce framework internals.
  • Experience with data formats such as SequenceFile, Avro, and Parquet.
  • Basic knowledge in application design using Unified Modeling Language (UML).
  • Good exposure to databases such as MySQL.
  • Experience developing dashboards.
  • Experience using software development methodologies such as Agile to deliver solutions.
  • Developed and executed maintainable automation tests for acceptance, functional, and regression test cases; investigated and debugged test failures, updating tests or reporting bugs as necessary; and provided test coverage analysis based on automation results.
  • Comprehensive knowledge of Software Development Life Cycle coupled with excellent communication skills.
  • Worked on job scheduling to maximize CPU utilization and performed backup and restore of different components.
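
For illustration, a minimal sketch of the kind of custom Hive UDF noted in the summary above; the package, class name, column semantics, and masking rule are all hypothetical, as is the JAR name in the registration comments.

    // Hypothetical custom Hive UDF: masks all but the last four characters of a value.
    // Registered in Hive with (assumed JAR name):
    //   ADD JAR mask_udf.jar;
    //   CREATE TEMPORARY FUNCTION mask_ssn AS 'com.example.hive.MaskSsnUDF';
    package com.example.hive;

    import org.apache.hadoop.hive.ql.exec.UDF;
    import org.apache.hadoop.io.Text;

    public class MaskSsnUDF extends UDF {
        // Hive calls evaluate() once per row.
        public Text evaluate(Text input) {
            if (input == null) {
                return null;
            }
            String s = input.toString();
            if (s.length() <= 4) {
                return new Text(s);
            }
            return new Text("***-**-" + s.substring(s.length() - 4));
        }
    }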
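
For illustration, a minimal sketch of a custom MapReduce Partitioner of the kind mentioned in the summary above; the key/value types and the routing rule are hypothetical.

    package com.example.mr;

    import org.apache.hadoop.io.IntWritable;
    import org.apache.hadoop.io.Text;
    import org.apache.hadoop.mapreduce.Partitioner;

    // Hypothetical rule: route records to reducers by the first letter of the key,
    // so related keys land in the same partition.
    public class FirstLetterPartitioner extends Partitioner<Text, IntWritable> {
        @Override
        public int getPartition(Text key, IntWritable value, int numPartitions) {
            if (key == null || key.getLength() == 0) {
                return 0;
            }
            char first = Character.toLowerCase(key.toString().charAt(0));
            return first % numPartitions;
        }
    }
    // Wired into a job with: job.setPartitionerClass(FirstLetterPartitioner.class);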

TECHNICAL SKILLS

Programming and Scripting Languages: C++, C, Java, HTML, CSS, JavaScript, XML, R, Python and Bash.

Hadoop Ecosystem: HDFS, MapReduce, Hive, Pig, Sqoop, Flume, Cassandra, HBase, ZooKeeper, Oozie, Akka.

Tools, Technologies and Utilities: Java, Servlets, JDBC, JSP, Web Services, WEKA, RStudio.

Databases: MySQL, Oracle 9i/10g/11g, SQL Server 2000/2005/2008, Microsoft Access 2007.

Operating Systems: Windows XP/7/8/10, Linux (Red Hat/Ubuntu/CentOS).

Cluster Monitoring Tools: Hortonworks, Cloudera Manager.

Web/ App Servers: Apache Tomcat Server.

IDEs: Eclipse, Microsoft Visual Studio, NetBeans.

PROFESSIONAL EXPERIENCE

Confidential, Birmingham, AL

Hadoop Developer

Responsibilities:

  • Provided a solution using Hive and Sqoop (to export/import data) for faster data loads, replacing the traditional ETL process with HDFS-based loading of data into target tables.
  • Created UDFs and Oozie workflows to Sqoop data from source systems into HDFS and then into target tables.
  • Worked on Hadoop schema design; involved in file uploads from the UNIX/Linux file system to the Hadoop Distributed File System (HDFS) environment.
  • Implemented custom data types, InputFormats, RecordReaders, OutputFormats, and RecordWriters for MapReduce computations.
  • Processed large-scale data using Big Data components such as Hadoop.
  • Developed Pig UDFs to preprocess data for analysis (see the Pig UDF sketch after this list).
  • Created HBase tables to load large sets of structured, semi-structured and unstructured data coming from UNIX, NoSQL and a variety of portfolios.
  • Used Pig Latin scripts to extract the data from the output files, process it and load into HDFS.
  • Extensively involved in entire QA Process and defect Management life cycle.
  • Created reports for the BI team using Sqoop to export data into HDFS and Hive.
  • Implemented partitioning, dynamic partitioning, and bucketing in Hive.
  • Used the Kafka messaging framework together with Storm.
  • Used Storm to process unbounded streams of data.
  • Used Kafka in combination with Apache Storm and HBase for real-time analysis of streaming data (see the Kafka producer sketch after this list).
  • Used Storm as a distributed real-time computation system for processing large volumes of high-velocity data.
  • Used data formats such as Avro and Parquet.
  • Java unit and integration test experience with frameworks such as JUnit, Mockito, and TestNG.
  • Developed and executed maintainable automation tests for acceptance, functional, and regression test cases.
  • Used the Apache Apex platform with the Malhar library to quickly create new and non-trivial applications.
  • Delivered the solution using Agile Methodology.
  • Developed Web API using NodeJS and hosted on multiple load balanced API instances.
  • Used CQL (Cassandra Query Language) to process data.
  • Used Cassandra to process high volumes of NoSQL data.
  • Used Akka to build highly concurrent, distributed, and fault-tolerant applications on the JVM.
  • Developed and executed Pig scripts.
  • Created and processed queries in MongoDB, Cassandra, and HBase.
  • Used Spark for fast processing of data in Hive and HDFS and for transforming data in storage (see the Spark sketch after this list).
  • Worked on RESTful search to search documents in diverse formats using Elasticsearch.
  • Used Elasticsearch for real-time search and analytics capabilities.
  • Used Kafka in conjunction with ZooKeeper for deployment management, which necessitates monitoring ZooKeeper metrics alongside the Kafka clusters.
  • Developed Hive queries to process the data and generate the results in a tabular format.
  • Handled importing of data from multiple data sources using Sqoop, performed transformations using Hive, MapReduce and loaded data into HDFS.
  • Involved in designing and developing non-trivial ETL processes within Hadoop using tools like Pig, Sqoop, Flume, and Oozie.
  • Designed performance optimization involving data transmission, data extraction, business validations, service logic and job scheduling.
  • Wrote Hive queries for data analysis to meet business requirements.
  • Developed programs in Ruby.
  • Created Hive tables and worked on them using HiveQL.
  • Loaded and transformed large sets of structured, semi-structured, and unstructured data.
  • Automated all the jobs for pulling data from FTP server to load data into Hive tables using Oozie workflows.
  • Responsible for developing data pipeline using Flume, Sqoop and Pig to extract the data from weblogs and store in HDFS.
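
For illustration, a minimal sketch of the kind of Pig UDF used for preprocessing, as referenced above; the package, class name, and cleaning rule are hypothetical, as is the JAR name in the usage comments.

    package com.example.pig;

    import java.io.IOException;
    import org.apache.pig.EvalFunc;
    import org.apache.pig.data.Tuple;

    // Hypothetical Pig EvalFunc that trims and lower-cases a field before analysis.
    // Used from a Pig script with (assumed names):
    //   REGISTER clean_udf.jar;
    //   cleaned = FOREACH raw GENERATE com.example.pig.CleanField(name);
    public class CleanField extends EvalFunc<String> {
        @Override
        public String exec(Tuple input) throws IOException {
            if (input == null || input.size() == 0 || input.get(0) == null) {
                return null;
            }
            return input.get(0).toString().trim().toLowerCase();
        }
    }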
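
For illustration, a minimal sketch of a Kafka producer of the kind that could feed the Storm/HBase pipeline described above; the broker address, topic name, key, and payload are hypothetical.

    package com.example.streaming;

    import java.util.Properties;
    import org.apache.kafka.clients.producer.KafkaProducer;
    import org.apache.kafka.clients.producer.Producer;
    import org.apache.kafka.clients.producer.ProducerRecord;

    public class EventProducer {
        public static void main(String[] args) {
            Properties props = new Properties();
            props.put("bootstrap.servers", "broker1:9092");   // hypothetical broker
            props.put("key.serializer",
                    "org.apache.kafka.common.serialization.StringSerializer");
            props.put("value.serializer",
                    "org.apache.kafka.common.serialization.StringSerializer");

            // Each record is published to the "events" topic; a Storm topology
            // subscribed to the same topic would consume it and write to HBase.
            try (Producer<String, String> producer = new KafkaProducer<>(props)) {
                producer.send(new ProducerRecord<>("events", "user-42", "{\"action\":\"click\"}"));
            }
        }
    }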
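
For illustration, a minimal sketch of a Spark transformation over data in HDFS, as referenced above; the HDFS paths and the filter/normalization steps are hypothetical.

    package com.example.spark;

    import org.apache.spark.SparkConf;
    import org.apache.spark.api.java.JavaRDD;
    import org.apache.spark.api.java.JavaSparkContext;

    public class HdfsTransform {
        public static void main(String[] args) {
            SparkConf conf = new SparkConf().setAppName("HdfsTransform");
            try (JavaSparkContext sc = new JavaSparkContext(conf)) {
                // Read raw records from HDFS, drop empty lines, normalize case,
                // and write the transformed data back to HDFS (paths are hypothetical).
                JavaRDD<String> raw = sc.textFile("hdfs:///data/raw/events");
                JavaRDD<String> cleaned = raw
                        .filter(line -> !line.trim().isEmpty())
                        .map(String::toLowerCase);
                cleaned.saveAsTextFile("hdfs:///data/processed/events");
            }
        }
    }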

Environment: Hadoop, Java, J2EE, HDFS, Pig, Sqoop, HBase, Spark, Scala, AWS, Shell Scripting, Linux, and RDBMS.

Confidential, Indianapolis, IN

Hadoop Developer

Responsibilities:

  • Worked on the BI team on Big Data Hadoop cluster implementation and data integration while developing large-scale system software.
  • Involved in source system analysis, data analysis, and data modeling for ETL (Extract, Transform, and Load).
  • Worked in tuning Hive and Pig scripts to improve performance.
  • Designed the R&D and testing phases of the integration application development lifecycle (Agile).
  • Wrote MapReduce programs in Java (see the sketch after this list).
  • Developed solutions to process data into HDFS (Hadoop Distributed File System) within Hadoop.
  • Developed a wrapper script around the Teradata connector for Hadoop (TDP) to support option parameters.
  • Used Sqoop extensively to ingest data from various source systems into HDFS.
  • Involved in running Hadoop jobs for processing millions of records of text data.
  • Hive was used to produce results quickly based on the report that was requested.
  • Integrated data from multiple sources into the Hadoop cluster and analyzed it using Hive-HBase integration.
  • Developed Pig UDFs for needed functionality, such as a custom Pig loader known as the timestamp loader.
  • Extensively used Oozie and Zookeeper to automate the flow of jobs and coordination in the cluster respectively.
  • Used REST-compliant web services.
  • Worked with the ETL tool Informatica to extract data.
  • Experience with transactional processing of data using a data warehouse.
  • Knowledge of data warehouse technologies.
  • Developed shell scripts that act as wrappers to start Hadoop jobs and set configuration parameters.
  • Used Spark for fast processing of data in Hive and HDFS and for transforming data in storage.
  • Wrote test cases to test software throughout development cycles, including functional testing, unit testing, and continuous integration.
  • Tested the performance of the data sets on various NoSQL databases.
  • Worked with complex data structures of different types (structured, semi-structured) and de-normalized them for storage.
  • Worked with application teams to install operating system, Hadoop updates, patches, version upgrades as required.
  • Developed multiple MapReduce jobs in Java for data cleaning and preprocessing.
  • Involved in loading data from LINUX file system to HDFS.
  • Responsible for managing data from multiple sources.
  • Created and maintained technical documentation for launching Hadoop clusters and for executing Hive queries and Pig scripts.
  • Developed shell scripts to automate routine tasks.
  • Used Oozie and Zookeeper operational services for coordinating cluster and scheduling workflows.
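
For illustration, a minimal sketch of the kind of MapReduce program written in Java, as referenced above; the job shown here is a simple word count, and the input/output paths are hypothetical.

    package com.example.mr;

    import java.io.IOException;
    import org.apache.hadoop.conf.Configuration;
    import org.apache.hadoop.fs.Path;
    import org.apache.hadoop.io.IntWritable;
    import org.apache.hadoop.io.LongWritable;
    import org.apache.hadoop.io.Text;
    import org.apache.hadoop.mapreduce.Job;
    import org.apache.hadoop.mapreduce.Mapper;
    import org.apache.hadoop.mapreduce.Reducer;
    import org.apache.hadoop.mapreduce.lib.input.FileInputFormat;
    import org.apache.hadoop.mapreduce.lib.output.FileOutputFormat;

    public class WordCount {

        // Mapper: emit (word, 1) for every token in the input line.
        public static class TokenMapper extends Mapper<LongWritable, Text, Text, IntWritable> {
            private static final IntWritable ONE = new IntWritable(1);
            private final Text word = new Text();

            @Override
            protected void map(LongWritable key, Text value, Context context)
                    throws IOException, InterruptedException {
                for (String token : value.toString().split("\\s+")) {
                    if (!token.isEmpty()) {
                        word.set(token);
                        context.write(word, ONE);
                    }
                }
            }
        }

        // Reducer: sum the counts for each word.
        public static class SumReducer extends Reducer<Text, IntWritable, Text, IntWritable> {
            @Override
            protected void reduce(Text key, Iterable<IntWritable> values, Context context)
                    throws IOException, InterruptedException {
                int sum = 0;
                for (IntWritable v : values) {
                    sum += v.get();
                }
                context.write(key, new IntWritable(sum));
            }
        }

        public static void main(String[] args) throws Exception {
            Job job = Job.getInstance(new Configuration(), "word count");
            job.setJarByClass(WordCount.class);
            job.setMapperClass(TokenMapper.class);
            job.setCombinerClass(SumReducer.class);
            job.setReducerClass(SumReducer.class);
            job.setOutputKeyClass(Text.class);
            job.setOutputValueClass(IntWritable.class);
            FileInputFormat.addInputPath(job, new Path("/data/input"));      // hypothetical path
            FileOutputFormat.setOutputPath(job, new Path("/data/output"));   // hypothetical path
            System.exit(job.waitForCompletion(true) ? 0 : 1);
        }
    }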

Environment: Apache Hadoop, Pig, Hive, Hue, HBase, Sqoop, Oozie, IDE, Java (JDK 1.6), flat files, MySQL, Windows XP, UNIX.

Confidential, Indianapolis, IN

Hadoop Developer

Responsibilities:

  • Analyzed Big Data business requirements and translated them into Hadoop-centric technologies.
  • Worked on importing and exporting data from Oracle and Teradata into HDFS and Hive using Sqoop.
  • Implemented custom Hive UDFs to achieve comprehensive data analysis.
  • Developed custom Pig UDFs and custom input formats to perform various levels of optimization.
  • Worked on streaming log data into HDFS from web servers using Flume.
  • Implemented custom Flume interceptors to filter data as per requirements.
  • Used Hive and Pig to analyze data in HDFS to identify issues and behavioral patterns.
  • Created internal and external Hive tables and defined static and dynamic partitions for optimized performance.
  • Developed Pig Latin scripts for running advanced analytics on the data collected.
  • Configured daily workflow for extraction, processing and analysis of data using Oozie Scheduler.
  • Proactively involved in ongoing maintenance, support and improvements in Hadoop cluster.
  • Gained good experience with NoSQL databases.
  • Designed and implemented a MapReduce-based large-scale parallel relation-learning system.
  • Installed and benchmarked Hadoop/HBase clusters for internal use.
  • Wrote HBase client programs in Java and web services (see the sketch after this list).
  • Supported postproduction enhancements.
  • Experience with data modeling concepts: star schema, dimensional modeling, and relational design (ER).
  • Extensively used Pig to communicate with Hive via HCatalog and with HBase via storage handlers.
  • Created Hive tables to store the processed results in a tabular format.
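
For illustration, a minimal sketch of an HBase client program of the kind referenced above; the table name, column family, qualifier, and row key are hypothetical, and the code assumes hbase-site.xml is on the classpath.

    package com.example.hbase;

    import org.apache.hadoop.conf.Configuration;
    import org.apache.hadoop.hbase.HBaseConfiguration;
    import org.apache.hadoop.hbase.TableName;
    import org.apache.hadoop.hbase.client.Connection;
    import org.apache.hadoop.hbase.client.ConnectionFactory;
    import org.apache.hadoop.hbase.client.Get;
    import org.apache.hadoop.hbase.client.Put;
    import org.apache.hadoop.hbase.client.Result;
    import org.apache.hadoop.hbase.client.Table;
    import org.apache.hadoop.hbase.util.Bytes;

    public class HBaseClientExample {
        public static void main(String[] args) throws Exception {
            Configuration conf = HBaseConfiguration.create();   // reads hbase-site.xml from the classpath
            try (Connection connection = ConnectionFactory.createConnection(conf);
                 Table table = connection.getTable(TableName.valueOf("customer"))) {   // hypothetical table

                // Write one cell: row "row1", column family "info", qualifier "name".
                Put put = new Put(Bytes.toBytes("row1"));
                put.addColumn(Bytes.toBytes("info"), Bytes.toBytes("name"), Bytes.toBytes("Alice"));
                table.put(put);

                // Read the same cell back.
                Result result = table.get(new Get(Bytes.toBytes("row1")));
                byte[] value = result.getValue(Bytes.toBytes("info"), Bytes.toBytes("name"));
                System.out.println("name = " + Bytes.toString(value));
            }
        }
    }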

Environment: Hadoop, MapReduce, HDFS, Hive, Pig, Java, SQL, Sqoop, Oozie, Unix, Linux.

Confidential, Indianapolis, IN

Java Developer

Responsibilities:

  • Developed Java classes and helper classes in the business layer and tested them using JUnit.
  • Used the Eclipse Workbench (views, editors, perspectives, wizards) as a rich client platform.
  • Used JDBC extensively for database transactions (see the sketch after this list).
  • Used Rational Clear Case for version repository.
  • Involved in the development of interfaces for the application using JSP, Servlets, and JavaScript.
  • Created Java validation classes and utility classes.
  • Actively participated in Stress Testing of the existing business components using WebLogic Application Server.
  • Created class diagrams and sequence diagrams using Violet integrated with Eclipse.
  • Used Jalopy for code formatting.
  • Extensive use of RESTful web services throughout the modules to communicate with all external systems.
  • Developed wrapper classes and DAO classes used for validation and data lookup.
  • Implemented a web service over HTTP to expose package/product details to front-end systems.
  • Responsible for developing the full stack: back-end development spanning markup, JavaScript, application services, database, and build scripts.
  • Used a SAX parser to read the XML and propagate values to the business validation layer.
  • Involved in creating various reusable Helper and Utility classes which are used across all the modules of the application.
  • Wrote build and deployment scripts using shell scripting, was involved in performance analysis of the application, and fixed problems/suggested solutions.
  • Participated in reviewing the functional, business and high level design requirements. Developed the Use Case diagrams and Class diagrams.
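
For illustration, a minimal sketch of the kind of JDBC transaction handling referenced above; the JDBC URL, credentials, tables, and columns are hypothetical.

    package com.example.dao;

    import java.sql.Connection;
    import java.sql.DriverManager;
    import java.sql.PreparedStatement;
    import java.sql.SQLException;

    public class OrderDao {
        // Inserts an order and its line item in a single transaction,
        // rolling back if either statement fails (schema is hypothetical).
        public void saveOrder(int orderId, String item, int quantity) throws SQLException {
            String url = "jdbc:mysql://localhost:3306/shop";   // hypothetical database
            try (Connection conn = DriverManager.getConnection(url, "user", "password")) {
                conn.setAutoCommit(false);
                try (PreparedStatement insertOrder =
                             conn.prepareStatement("INSERT INTO orders (id) VALUES (?)");
                     PreparedStatement insertItem =
                             conn.prepareStatement(
                                     "INSERT INTO order_items (order_id, item, qty) VALUES (?, ?, ?)")) {
                    insertOrder.setInt(1, orderId);
                    insertOrder.executeUpdate();

                    insertItem.setInt(1, orderId);
                    insertItem.setString(2, item);
                    insertItem.setInt(3, quantity);
                    insertItem.executeUpdate();

                    conn.commit();
                } catch (SQLException e) {
                    conn.rollback();   // undo both inserts on any failure
                    throw e;
                }
            }
        }
    }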

Environment: J2EE, JDBC, Java 1.4, Servlets, JSP, Web services, SOAP, WSDL, UML, MVC, HTML, JavaScript 1.2, XML, My Eclipse.

Confidential

Application Developer

Responsibilities:

  • Worked with requirement analysis team to gather software requirements for application development.
  • Designed UML diagrams like use case diagrams, object diagrams and class diagrams to represent a detailed design phase.
  • Designed and developed user interface static and dynamic web pages using JSP, HTML and CSS.
  • Involved in generating screens and reports in JSP, Servlets, HTML, and JavaScript for the business users.
  • Provided support and maintenance after deploying the web application.
  • Worked on converting some modules to be multithreaded; multithreading was used on the server side to perform database connection pooling in Java (see the sketch after this list).
  • Used JavaScript for developing client side validation scripts.
  • Resolved issues reported by the client.
  • Created tables, indexes, views and other objects using SQL.
  • Developed custom packages to connect to standard data sources and retrieve data efficiently, eliminating the need for each team to rewrite the same code multiple times.
  • Worked on product deployment, documentation and support.
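
For illustration, a minimal sketch of the kind of server-side database connection pooling done with multithreading, as referenced above; the pool size, JDBC URL, and credentials are hypothetical.

    package com.example.pool;

    import java.sql.Connection;
    import java.sql.DriverManager;
    import java.sql.SQLException;
    import java.util.concurrent.BlockingQueue;
    import java.util.concurrent.LinkedBlockingQueue;

    // A very small thread-safe connection pool: worker threads borrow a connection,
    // use it, and return it, instead of opening a new connection per request.
    public class SimpleConnectionPool {
        private final BlockingQueue<Connection> pool;

        public SimpleConnectionPool(String url, String user, String password, int size)
                throws SQLException {
            pool = new LinkedBlockingQueue<>(size);
            for (int i = 0; i < size; i++) {
                pool.add(DriverManager.getConnection(url, user, password));
            }
        }

        // Blocks until a connection is available.
        public Connection borrow() throws InterruptedException {
            return pool.take();
        }

        // Returns a connection to the pool for reuse by another thread.
        public void release(Connection conn) {
            pool.offer(conn);
        }
    }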

Environment: Java, Android SDK 1.5 or later, Eclipse IDE, Windows, Linux, MySQL Lite.
