Sr. Hadoop Developer Resume

Atlanta, GA

SUMMARY

  • 7+ years of extensive IT experience with multinational clients, including 5+ years of recent experience in the Big Data/Hadoop ecosystem.
  • Hands-on experience working with Apache Hadoop ecosystem components such as MapReduce, Sqoop, Flume, Pig, Hive, HBase, Spark, Kafka, Oozie and ZooKeeper.
  • Excellent knowledge of Hadoop components such as HDFS, NameNode, DataNode, YARN, ResourceManager, NodeManager, and the MapReduce programming paradigm.
  • Experience installing, configuring, supporting and managing Big Data workloads and the underlying infrastructure of Hadoop clusters.
  • Experience in analyzing data using HiveQL, Pig Latin and extending HIVE and PIG core functionality by using custom UDFs.
  • Proficient in Relational Database Management Systems (RDBMS).
  • Extensive working knowledge of partitioned tables, UDFs, performance tuning and compression-related properties in Hive.
  • Good understanding of NoSQL databases and hands on experience in writing applications on NoSQL databases like HBase.
  • Hands-on experience using Amazon Web Services such as EC2, EMR, Redshift, DynamoDB and S3.
  • Hands-on use of Apache Kafka for tracking data ingestion into the Hadoop cluster, including custom Kafka encoders for custom input formats when loading data into Kafka partitions (see the sketch after this list).
  • Experience using Spark Streaming to ingest data from multiple data sources into HDFS.
  • Hands-on experience with stream processing, including Storm and Spark Streaming.
  • Knowledge of job workflow scheduling and monitoring tools such as Oozie.
  • Experience in analyzing data using HBase and custom Map Reduce programs in Java.
  • Proficient in importing and exporting data between HDFS and relational database systems using Sqoop.
  • Excellent knowledge of data transformations using MapReduce, Hive and Pig scripts across different file formats.
  • Experience with scripting languages such as Linux/Unix shell and Python.
  • Imported streaming data into HDFS using Flume and analyzed it using Pig and Hive.
  • Experience using Flume to aggregate log data from web servers and land it in HDFS.
  • Experience in scheduling and monitoring Oozie workflows for parallel execution of jobs.
  • Proficient in Core Java, Servlets, Hibernate, JDBC and Web Services.
  • Experience in all Phases of Software Development Life Cycle (Analysis, Design, Development, Testing and Maintenance) using Waterfall and Agile methodologies.
  • Experience using SequenceFile, Avro and Parquet file formats; managing and reviewing Hadoop log files.
  • Experience in Developing and maintaining applications on the AWS platform.
  • Experience using JSP, Servlets, Struts, JavaBeans, Apache Tomcat, WebLogic, WebSphere, JBoss, JDBC, RMI, Ajax, Unix, WSDL, XML, AWS, Vertica, Spring, Hibernate, AngularJS and JMS.
  • Hands on experience in working with RESTful web services using JAX-RS and SOAP web services using JAX-WS.
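
Below is a minimal, illustrative sketch of the kind of custom Kafka encoder described in the Kafka bullet above. The ClickEvent payload, broker address and topic name are hypothetical; newer Kafka clients expose this hook through the Serializer interface, while older releases used kafka.serializer.Encoder.

```java
import java.nio.charset.StandardCharsets;
import java.util.Map;
import java.util.Properties;

import org.apache.kafka.clients.producer.KafkaProducer;
import org.apache.kafka.clients.producer.ProducerRecord;
import org.apache.kafka.common.serialization.Serializer;
import org.apache.kafka.common.serialization.StringSerializer;

// Hypothetical event type; field names are illustrative only.
class ClickEvent {
    final String userId;
    final String page;

    ClickEvent(String userId, String page) {
        this.userId = userId;
        this.page = page;
    }
}

// Custom value serializer: encodes the event before it is written to a Kafka partition.
class ClickEventSerializer implements Serializer<ClickEvent> {
    @Override
    public void configure(Map<String, ?> configs, boolean isKey) { }

    @Override
    public byte[] serialize(String topic, ClickEvent event) {
        return (event.userId + "\t" + event.page).getBytes(StandardCharsets.UTF_8);
    }

    @Override
    public void close() { }
}

public class ClickEventProducer {
    public static void main(String[] args) {
        Properties props = new Properties();
        props.put("bootstrap.servers", "broker1:9092"); // placeholder broker address

        // Pass the serializers directly instead of configuring them by class name.
        try (KafkaProducer<String, ClickEvent> producer =
                     new KafkaProducer<>(props, new StringSerializer(), new ClickEventSerializer())) {
            // Keying by user id keeps a given user's events in the same partition.
            producer.send(new ProducerRecord<>("clickstream", "user-42",
                    new ClickEvent("user-42", "/home")));
        }
    }
}
```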

TECHNICAL SKILLS

Big Data Technologies: Pig, Hive, Sqoop, Flume, HBase, Kafka, Storm, Spark with Scala, Oozie, ZooKeeper, Hadoop Distributions (Cloudera, Hortonworks)

Java Technologies: Java/J2EE - JSP, Servlets, JDBC, JSTL, EJB, JUnit, RMI, JMS

Web Technologies: Ajax, JavaScript, jQuery, HTML, CSS, XML, Python

Programming Languages: Java, Scala, C/C++, Python

Databases: MySQL, MS-SQL Server, SQL, Oracle 11g, NoSQL (HBase, MongoDB, Cassandra)

Web Services: REST, AWS, SOAP, WSDL, UDDI

Tools: Ant, Maven, JUnit

Servers: Apache Tomcat, WebSphere, JBoss

IDEs: MyEclipse, Eclipse, IntelliJ IDEA, NetBeans, WSAD

Web/UI: HTML, JavaScript, XML, SOAP, WSDL

ETL/BI Tools: Talend, Tableau

PROFESSIONAL EXPERIENCE

Confidential, Atlanta, GA

Sr. Hadoop Developer

Responsibilities:

  • Developed applications using Hadoop ecosystem components such as MapReduce, Hive, Drill, Spark, Pig, Flume, Sqoop and HBase.
  • Assessed business rules, worked on source-to-target data mappings and collaborated with stakeholders.
  • Developed Pig scripts to transform raw data into business-ready data as specified by business users.
  • Familiarity with the Hadoop information architecture, design of data ingestion pipelines, data mining and modeling, advanced data processing and machine learning.
  • Experienced in installing Hadoop in pseudo-distributed and fully distributed mode environments.
  • Experience working in a Hadoop cluster environment.
  • Involved in transferring bulk data between RDBMS and HDFS using Sqoop.
  • Involved in developing Hive and Pig scripts to perform optimization and data sorting.
  • Experience with HBase concepts and loading data into HBase tables using Hive.
  • Highly motivated, able to work independently and as part of a team, with good communication skills.
  • Able to adapt to new environments and learn new tools.
  • Excellent communication and interpersonal skills with abilities in resolving complex software issues.
  • Strong experience in all the phases of software development life cycle including requirements gathering, analysis, design, implementation, deployment and support.
  • Wrote MapReduce programs for extraction, transformation and aggregation of data from multiple file formats including XML, JSON, CSV and other compressed formats.
  • Expertise in data migration from various databases to Hadoop HDFS and Hive using Sqoop.
  • Worked with Hive's data warehousing infrastructure to analyze large structured datasets.
  • Experienced in creating Hive schema, external tables and managing views.
  • Responsible for data loading; created Hive tables and partitions based on requirements.
  • Executed MapReduce programs to cleanse data in HDFS gathered from heterogeneous data sources and make it suitable for ingestion into the Hive schema for analysis (see the sketch after this list).
  • Wrote Spark applications in Scala using the DataFrame and Spark SQL APIs.
  • Imported data from sources such as HDFS and HBase into Spark RDDs.
  • Built a POC for Single Member Debug on Hive/HBase and Spark.
  • Configured, deployed, and maintained multi-node Dev and Test Kafka Clusters.
  • Loaded data into Spark RDDs and performed in-memory computation to generate the output response.
  • Configured Oozie workflows to run multiple Hive and Pig jobs, triggered independently by time and data availability.
  • Strong knowledge of distributed systems architecture and parallel processing; in-depth understanding of the MapReduce programming paradigm.
  • Imported data into HDFS using Sqoop, including incremental loads.
  • Designed and developed MapReduce jobs to process logs, feed the data warehouse, load Hive tables for analytics and store a daily data feed on HDFS for other teams' use.
  • Developed automated shell scripts responsible for data flow, monitoring and status reporting.
  • Took on-call responsibility and responded whenever needed if something went wrong with Hadoop jobs or clusters.
  • Developed bash scripts to pull log files from an FTP server and process them for loading into Hive tables.
  • Experience with Agile Methodologies and Test-Driven Development. Involved in daily SCRUM meetings.
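
A minimal sketch of the kind of map-only MapReduce cleansing job referenced in this list; the comma-delimited input layout, field count and Hive delimiter choice are illustrative assumptions rather than the project's actual schema.

```java
import java.io.IOException;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.LongWritable;
import org.apache.hadoop.io.NullWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Job;
import org.apache.hadoop.mapreduce.Mapper;
import org.apache.hadoop.mapreduce.lib.input.FileInputFormat;
import org.apache.hadoop.mapreduce.lib.output.FileOutputFormat;

public class CleanseRecordsJob {

    // Map-only job: drop malformed rows and normalise delimiters so the output
    // directory can be loaded straight into a Hive table.
    public static class CleanseMapper
            extends Mapper<LongWritable, Text, NullWritable, Text> {

        private static final int EXPECTED_FIELDS = 5; // illustrative schema width

        @Override
        protected void map(LongWritable key, Text value, Context context)
                throws IOException, InterruptedException {
            String[] fields = value.toString().split(",");
            if (fields.length != EXPECTED_FIELDS) {
                return; // skip malformed records
            }
            StringBuilder cleaned = new StringBuilder();
            for (int i = 0; i < fields.length; i++) {
                if (i > 0) {
                    cleaned.append('\u0001'); // Hive's default field delimiter
                }
                cleaned.append(fields[i].trim());
            }
            context.write(NullWritable.get(), new Text(cleaned.toString()));
        }
    }

    public static void main(String[] args) throws Exception {
        Job job = Job.getInstance(new Configuration(), "cleanse-records");
        job.setJarByClass(CleanseRecordsJob.class);
        job.setMapperClass(CleanseMapper.class);
        job.setNumReduceTasks(0); // map-only: no shuffle or reduce phase needed
        job.setOutputKeyClass(NullWritable.class);
        job.setOutputValueClass(Text.class);
        FileInputFormat.addInputPath(job, new Path(args[0]));
        FileOutputFormat.setOutputPath(job, new Path(args[1]));
        System.exit(job.waitForCompletion(true) ? 0 : 1);
    }
}
```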

Environment: Hadoop, Map Reduce, HDFS, PIG, Hive, Spark, Sqoop, HBase, Impala, Cloudera, Tableau, Eclipse, Scala, UNIX Shell Scripts, Java, RestClient, Firebug, Cassandra, Amazon Web Services with Cloud, Business Intelligence, HTML, XML, XML SPY, Putty.

Confidential, NYC, NY

Hadoop Developer

Responsibilities:

  • Installed and configured MapReduce, HIVE and the HDFS; implemented CDH3 Hadoop cluster on CentOS.
  • Assisted with performance tuning and monitoring.
  • Gained experience in reviewing and managing Hadoop log files.
  • Experienced in installing Hadoop in pseudo-distributed and fully distributed mode environments.
  • Exposure to Hadoop 1.x and Hadoop 2.x architectures.
  • Experience working in a Hadoop cluster environment.
  • Involved in transferring bulk data between RDBMS and HDFS using Sqoop.
  • Involved in developing Hive and Pig scripts to perform optimization and data sorting.
  • Experience with HBase concepts and loading data into HBase tables using Hive.
  • Highly motivated, able to work independently and as part of a team, with good communication skills.
  • Able to adapt to new environments and learn new tools.
  • Able to resolve complex software issues with excellent communication and interpersonal skills.
  • Developed MapReduce programs to parse the raw data, populate staging tables and store the refined data in partitioned tables in the EDW.
  • Created HBase tables to load large sets of structured, semi-structured and unstructured data coming from UNIX, NoSQL and a variety of portfolios.
  • Created Hive queries that helped market analysts spot emerging trends by comparing fresh data with EDW reference tables and historical metrics.
  • Enabled speedy reviews and first mover advantages by using Oozie to automate data loading into the Hadoop Distributed File System and PIG to pre-process the data.
  • Used XML Technologies like DOM for transferring data.
  • Involved in coding for DAO Objects using JDBC (using DAO pattern).
  • Used Flume to transport logs to HDFS.
  • Moved data from Hive tables into Cassandra for real-time analytics.
  • Organized documents into more usable clusters using Mahout.
  • Supported MapReduce programs running on the cluster.
  • Involved in loading data from the UNIX file system to HDFS.
  • Installed and configured Hive and wrote Hive UDFs (see the sketch after this list).
  • Involved in creating Hive tables, loading them with data and writing Hive queries that run internally as MapReduce jobs.
  • Created reports for the BI team using Sqoop to export data into HDFS and Hive.
  • Developed multiple MapReduce jobs in Java for data cleaning and preprocessing.
  • Managed and reviewed Hadoop log files.
  • Tested raw data and executed performance scripts.
  • Assisted with data capacity planning and node forecasting.
  • Collaborated with the infrastructure, network, database, application and BI teams to ensure data quality and availability.
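
A minimal sketch of a reflective Hive UDF of the kind mentioned above; the function name and string-normalisation logic are illustrative only.

```java
import org.apache.hadoop.hive.ql.exec.UDF;
import org.apache.hadoop.io.Text;

// Minimal Hive UDF; Hive discovers the evaluate() method by reflection.
public final class TrimUpper extends UDF {
    public Text evaluate(Text input) {
        if (input == null) {
            return null;
        }
        // Trim whitespace and upper-case the value before it lands in the query output.
        return new Text(input.toString().trim().toUpperCase());
    }
}
```

Once packaged into a JAR, a function like this is typically registered in Hive with ADD JAR and CREATE TEMPORARY FUNCTION before being used in HiveQL.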

Environment: Hadoop, HDFS, MapReduce, HBase, Hive, PIG, Sqoop, Cassandra, Impala, Cloudera Manager, Oozie, Zookeeper, Flume, RESTful Web Services, Java, Scala, Eclipse, Putty, HP ALM 11.52, UNIX, XML, dat files, Tivoli.

Confidential, Atlanta, GA

Hadoop Developer

Responsibilities:

  • Participated in the requirement gathering and analysis phase of the project, documenting business requirements by conducting workshops/meetings with business users.
  • Involved in installing and configuring the Hadoop ecosystem and Cloudera Manager using the CDH3 distribution.
  • Involved in managing and reviewing Hadoop log files.
  • Involved in running Hadoop streaming jobs to process terabytes of XML-format data.
  • Load and transform large sets of structured, semi structured and unstructured data.
  • Supported MapReduce programs running on the cluster.
  • Imported and exported data between RDBMS and HDFS using Sqoop.
  • Installed and configured Hive and wrote Hive UDFs.
  • Involved in creating Hive tables, loading the data and writing Hive queries that run internally as MapReduce jobs.
  • Wrote Hive queries on the data to meet business requirements.
  • Analyzed the data using Pig and wrote Pig scripts that group, join and sort the data.
  • Worked on MongoDB using its CRUD (Create, Read, Update and Delete), indexing, replication and sharding features (see the sketch after this list).
  • Designed and Developed Dashboards using Tableau.
  • Actively participated in weekly meetings with the technical teams to review the code.
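
An illustrative CRUD sketch against MongoDB using the Java driver, as referenced in the MongoDB bullet above; the connection string, database, collection and field names are placeholders.

```java
import com.mongodb.client.MongoClient;
import com.mongodb.client.MongoClients;
import com.mongodb.client.MongoCollection;
import org.bson.Document;

import static com.mongodb.client.model.Filters.eq;
import static com.mongodb.client.model.Updates.set;

public class CustomerCrudExample {
    public static void main(String[] args) {
        // Connection string, database and collection names are placeholders.
        try (MongoClient client = MongoClients.create("mongodb://localhost:27017")) {
            MongoCollection<Document> customers =
                    client.getDatabase("crm").getCollection("customers");

            // Create
            customers.insertOne(new Document("customerId", "C-1001")
                    .append("name", "Jane Doe")
                    .append("region", "GA"));

            // Read
            Document found = customers.find(eq("customerId", "C-1001")).first();
            System.out.println(found);

            // Update
            customers.updateOne(eq("customerId", "C-1001"), set("region", "NY"));

            // Delete
            customers.deleteOne(eq("customerId", "C-1001"));

            // An index on the lookup key supports the indexing/sharding setup described above.
            customers.createIndex(new Document("customerId", 1));
        }
    }
}
```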

Environment: Hadoop, Map Reduce, MongoDB, Hive, Pig, Sqoop, Core Java, Cloudera, HDFS, Oracle SQL, Eclipse, Tableau, Windows XP, UNIX.

Confidential

Java/ Hadoop Developer

Responsibilities:

  • Developed JSP, JSF and Servlets to dynamically generate HTML and display the data to the client side.
  • Used Hibernate Framework for persistence onto oracle database.
  • Written and debugged the ANT Scripts for building the entire web application.
  • Developed web services in Java using SOAP and WSDL, and used WSDL to publish the services to other applications.
  • Implemented Java Message Services (JMS) using JMS API.
  • Involved in managing and reviewing Hadoop log files.
  • Installed and configured Hadoop, YARN, Map Reduce, Flume, HDFS, developed multiple Map Reduce jobs in Java for data cleaning.
  • Coded Hadoop MapReduce jobs for energy generation and PS.
  • Coded Servlets, a SOAP client and Apache CXF REST APIs to deliver data from our application to external and internal consumers.
  • Worked on Cloudera distribution system for running Hadoop jobs on it.
  • Expertise in writing Hadoop jobs to analyze data using MapReduce, Hive, Pig, Solr and Splunk.
  • Created a SOAP web service using JAX-WS to enable clients to consume it (see the sketch after this list).
  • Experience importing and exporting data between HDFS and relational database systems (RDBMS) using Sqoop.
  • Experienced in designing and developing multi-tier scalable applications using Java and J2EE Design Patterns.
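
A minimal JAX-WS sketch of the kind of SOAP service mentioned above, assuming a Java 8-era stack where the javax.jws and javax.xml.ws APIs are available; the service name, operation and endpoint URL are hypothetical.

```java
import javax.jws.WebMethod;
import javax.jws.WebService;
import javax.xml.ws.Endpoint;

// Hypothetical service contract; JAX-WS generates the WSDL from the annotations.
@WebService
public class GenerationReportService {

    @WebMethod
    public double dailyGenerationKwh(String plantId, String date) {
        // Placeholder logic; a real implementation would query the backing store.
        return 1234.5;
    }

    public static void main(String[] args) {
        // Publish a lightweight endpoint; the WSDL is served at the URL with ?wsdl appended.
        Endpoint.publish("http://localhost:8080/ws/generation", new GenerationReportService());
    }
}
```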

Environment: Java, Hadoop, MapR, HTML, Java Script, SQL Server, PL/SQL, JSP, Spring, Hibernate, Web Services, SOAP, SOA, JSF, Java, JMS, Junit, Oracle, Eclipse, SVN, XML, CSS, Log4j, Ant, Apache Tomcat.
