
Sr. Big Data/Hadoop Developer Resume


Herndon, VA

SUMMARY

  • 8+ years of professional Java development experience, which includes excellent experience in the Big Data ecosystem - Hadoop and data engineering technologies.
  • Experienced in installation, configuration, management, and deployment of Hadoop clusters, HDFS, MapReduce, Pig, Hive, Sqoop, Flume, Oozie, HBase, and Zookeeper.
  • Expertise in importing data from various data sources and performing transformations, with hands-on experience developing and debugging YARN (MR2) jobs to process large data sets.
  • Experienced in extending Pig and Hive core functionality by writing custom UDFs for data analysis (a minimal UDF sketch follows this summary). Performed data transformation, file processing, and user-behavior analysis by running Pig Latin scripts; expertise in creating Hive internal/external tables and views using a shared metastore and writing HiveQL scripts. Developed Hive queries that help visualize business requirements.
  • Excellent experience importing and exporting Teradata data using Sqoop between HDFS and RDBMS/mainframe systems, and vice versa. Also worked on incremental imports by creating Sqoop metastore jobs.
  • Experienced in using Apache Flume for collecting, aggregating, and moving large amounts of data from application servers, and in handling a variety of data arriving at streaming velocity.
  • Experienced in Extraction, Transformation, and Loading (ETL) of data from multiple sources such as flat files, XML files, and databases. Used Informatica for ETL processing based on business needs and extensively used the Oozie workflow engine to run multiple Hive and Pig jobs.
  • Excellent understanding of Zookeeper and Kafka for monitoring and managing Hadoop jobs, and used Cloudera CDH 4.x and CDH 5.x for monitoring and managing Hadoop clusters.
  • Experienced in working with HCatalog to share schemas across distributed applications, and in batch processing and writing programs using Apache Spark for real-time analytics and streaming of data.
  • Experienced in NoSQL technologies like HBase and Cassandra for data extraction and storing huge volumes of data. Also experienced in the data warehouse life cycle, its methodologies, and its tools for reporting and data analysis.
  • Expertise in creating action filters, parameters, and calculated sets for preparing dashboards and worksheets in Tableau.
  • Good exposure to Talend Open Studio for data integration.
  • Extensive knowledge in developing ANT scripts to build and deploy applications, and experience with Maven to build and manage Java projects.
  • Experienced in creating use case models and use case, class, and sequence diagrams using Microsoft Visio and Rational Rose. Experience in design and development of object-oriented analysis and design (OOAD) based systems using Rational Rose.
  • Experienced in building and performance tuning of core Java, JDBC, Servlets, JSP, JavaScript, web services, SQL, and stored procedures.
  • Experienced in RDBMS concepts; worked with Oracle 10g/11g and SQL Server, with good experience writing stored procedures, functions, and triggers using PL/SQL.
  • Major strengths are familiarity with multiple software systems, the ability to learn new technologies quickly, adaptability to new environments, self-motivation, and being a focused, adaptive, and quick-learning team player with excellent interpersonal, technical, and communication skills.
  • Strong oral and written communication, initiative, interpersonal, learning, and organizing skills, matched with the ability to manage time and people effectively.
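
A minimal sketch of a custom Hive UDF of the kind described above, assuming a hypothetical normalize_zip function used during data analysis; the class and function names are illustrative and not taken from any of the projects below.

    package com.example.hive.udf;

    import org.apache.hadoop.hive.ql.exec.Description;
    import org.apache.hadoop.hive.ql.exec.UDF;
    import org.apache.hadoop.io.Text;

    @Description(name = "normalize_zip",
                 value = "_FUNC_(str) - returns the first five digits of a ZIP code, or NULL")
    public final class NormalizeZipUDF extends UDF {

        // Hive calls evaluate() once per row; returning null propagates SQL NULL.
        public Text evaluate(final Text input) {
            if (input == null) {
                return null;
            }
            // Keep only the leading five digits (e.g. "20170-1234" -> "20170").
            String digits = input.toString().replaceAll("[^0-9]", "");
            return digits.length() < 5 ? null : new Text(digits.substring(0, 5));
        }
    }

Once packaged in a JAR, such a function would typically be registered in Hive with ADD JAR and CREATE TEMPORARY FUNCTION before being used in HiveQL queries.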

TECHNICAL SKILLS

Hadoop Core Services: HDFS, Map Reduce, Spark, YARN.

Hadoop Distribution: Hortonworks, Cloudera, Apache.

NO SQL Databases: HBase, Cassandra.

Hadoop Data Services: Hive, Pig, Impala, Sqoop, Flume, Spark, Kafka.

Hadoop Operational Services: Zookeeper, Oozie.

Cloud Computing Tools: Amazon AWS.

Languages: C, Java, Python, SQL, PL/SQL, Pig Latin, HiveQL, Unix, JavaScript, Shell Scripting.

Java & J2EE Technologies: Core Java, Servlets, Hibernate, Spring, Struts, JMS, EJB.

Application Servers: WebLogic, WebSphere, Tomcat.

Databases: Oracle, MySQL, Microsoft SQL Server, Teradata.

Business Intelligence Tools: Tableau, JMP 11, Talend (ETL)

Operating Systems: UNIX, Windows, LINUX.

Build Tools: Jenkins, Maven, ANT.

Development Tools: Microsoft SQL Studio, Toad, Eclipse, NetBeans.

Development Methodologies: Agile/Scrum, Waterfall.

PROFESSIONAL EXPERIENCE

Confidential, Herndon, VA

Sr. Big Data/Hadoop Developer

Responsibilities:

  • Installed and set up Hadoop CDH clusters for development and production environments.
  • Installed and configured Hive, Pig, Sqoop, Flume, Cloudera Manager, and Oozie on the Hadoop cluster.
  • Planned hardware and software installation for the production cluster and communicated with multiple teams to get it done.
  • Migrated data from Hadoop to an AWS S3 bucket using DistCp, and also migrated data between old and new clusters using DistCp.
  • Developed multiple MapReduce jobs in Java for data cleaning and pre-processing, and analyzed data in Pig (a minimal cleaning job is sketched after this list).
  • Monitored multiple Hadoop cluster environments using Cloudera Manager; monitored workload and job performance and collected metrics for the Hadoop cluster when required.
  • Installed Hadoop patches, updates, and version upgrades when required.
  • Installed and configured Cloudera Manager, Hive, Pig, Sqoop, and Oozie on the CDH4 cluster.
  • Involved in implementing High Availability and automatic failover infrastructure to overcome the NameNode single point of failure, utilizing ZooKeeper services.
  • Performed an upgrade of the development environment from CDH 4.2 to CDH 4.6.
  • Designed and developed ETL workflows using Oozie for business requirements, including automating the extraction of data from a MySQL database into HDFS using Sqoop scripts.
  • Designed and created the complete end-to-end ETL process using Talend jobs, and created test cases for validating the data in the data marts and the data warehouse. Captured data daily from OLTP systems and various XML, Excel, and CSV sources and loaded the data into the Talend ETL tool.
  • Automated the end-to-end workflow from data preparation to the presentation layer for the Artist Dashboard project using shell scripting.
  • Developed MapReduce programs used to extract and transform data sets; the resulting data sets were loaded to Cassandra, and vice versa, using Kafka.
  • Registered consumers with the Kafka messaging system's brokers and pulled the data to HDFS (see the consumer sketch after this list).
  • Orchestrated Sqoop scripts, Pig scripts, and Hive queries using Oozie workflows and sub-workflows.
  • Conducted RCA to find data issues and resolve production problems.
  • Proactively involved in ongoing maintenance, support, and improvements of the Hadoop cluster.
  • Performed data analytics in Hive and then exported these metrics back to an Oracle database using Sqoop.
  • Involved in minor and major release work activities.
  • Collaborated with business users, product owners, and developers to contribute to the analysis of functional requirements.
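
A minimal sketch of the kind of map-only data-cleaning MapReduce job mentioned above, assuming pipe-delimited input records with a fixed field count; the class name, field layout, and paths are illustrative assumptions rather than project code.

    package com.example.mr;

    import java.io.IOException;

    import org.apache.hadoop.conf.Configuration;
    import org.apache.hadoop.fs.Path;
    import org.apache.hadoop.io.NullWritable;
    import org.apache.hadoop.io.Text;
    import org.apache.hadoop.mapreduce.Job;
    import org.apache.hadoop.mapreduce.Mapper;
    import org.apache.hadoop.mapreduce.lib.input.FileInputFormat;
    import org.apache.hadoop.mapreduce.lib.output.FileOutputFormat;

    public class CleanRecordsJob {

        public static class CleanMapper extends Mapper<Object, Text, Text, NullWritable> {
            private static final int EXPECTED_FIELDS = 8;   // assumed schema width

            @Override
            protected void map(Object key, Text value, Context context)
                    throws IOException, InterruptedException {
                String[] fields = value.toString().split("\\|", -1);
                // Drop malformed or empty records; pass everything else through unchanged.
                if (fields.length != EXPECTED_FIELDS || fields[0].trim().isEmpty()) {
                    context.getCounter("cleaning", "dropped").increment(1);
                    return;
                }
                context.write(value, NullWritable.get());
            }
        }

        public static void main(String[] args) throws Exception {
            Job job = Job.getInstance(new Configuration(), "clean-records");
            job.setJarByClass(CleanRecordsJob.class);
            job.setMapperClass(CleanMapper.class);
            job.setNumReduceTasks(0);                        // map-only cleaning pass
            job.setOutputKeyClass(Text.class);
            job.setOutputValueClass(NullWritable.class);
            FileInputFormat.addInputPath(job, new Path(args[0]));
            FileOutputFormat.setOutputPath(job, new Path(args[1]));
            System.exit(job.waitForCompletion(true) ? 0 : 1);
        }
    }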
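
A minimal sketch of pulling records from a Kafka topic into an HDFS file, as in the consumer bullet above; it assumes a recent kafka-clients consumer API, and the broker address, topic, group id, and output path are hypothetical. A production version would run continuously, roll output files, and manage offsets more carefully.

    package com.example.streaming;

    import java.time.Duration;
    import java.util.Collections;
    import java.util.Properties;

    import org.apache.hadoop.conf.Configuration;
    import org.apache.hadoop.fs.FSDataOutputStream;
    import org.apache.hadoop.fs.FileSystem;
    import org.apache.hadoop.fs.Path;
    import org.apache.kafka.clients.consumer.ConsumerRecord;
    import org.apache.kafka.clients.consumer.ConsumerRecords;
    import org.apache.kafka.clients.consumer.KafkaConsumer;

    public class KafkaToHdfs {
        public static void main(String[] args) throws Exception {
            Properties props = new Properties();
            props.put("bootstrap.servers", "broker1:9092");
            props.put("group.id", "hdfs-sink");
            props.put("key.deserializer",
                      "org.apache.kafka.common.serialization.StringDeserializer");
            props.put("value.deserializer",
                      "org.apache.kafka.common.serialization.StringDeserializer");

            FileSystem fs = FileSystem.get(new Configuration());
            try (KafkaConsumer<String, String> consumer = new KafkaConsumer<>(props);
                 FSDataOutputStream out = fs.create(new Path("/data/raw/claims.txt"), true)) {
                consumer.subscribe(Collections.singletonList("claims"));
                // Poll a few batches and write each record as one line in HDFS.
                for (int i = 0; i < 10; i++) {
                    ConsumerRecords<String, String> records = consumer.poll(Duration.ofSeconds(5));
                    for (ConsumerRecord<String, String> record : records) {
                        out.write((record.value() + "\n").getBytes("UTF-8"));
                    }
                    out.hflush();   // make the written lines visible to readers
                }
            }
        }
    }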

Environment: Cloudera Hadoop, Talend Open Studio, MapReduce, HDFS, Hive, Pig, Sqoop, Oozie, Flume, Zookeeper, LDAP

Confidential, Omaha, NE

Sr. Big Data/Hadoop Developer

Responsibilities:

  • Involved in installing and configuring the Hadoop ecosystem and Cloudera Manager using CDH3 and CDH4 distributions.
  • Responsible for managing data coming from different sources; involved in HDFS maintenance and loading of structured and unstructured data.
  • Imported data from Teradata using Sqoop with the Teradata connector.
  • Created a data pipeline of MapReduce programs using chained mappers.
  • Implemented optimized joins across different data sets to get top claims by state using MapReduce.
  • Visualized HDFS data for customers using a BI tool with the help of the Hive ODBC driver.
  • Worked on a POC of Talend integration with Hadoop, in which Talend jobs were created to extract data from Hadoop.
  • Imported data using Sqoop to load data from MySQL to HDFS on a regular basis.
  • Worked on social media (Facebook, Twitter, etc.) data crawling using Java and R, with MongoDB for unstructured data storage.
  • Integrated the Quartz scheduler with Oozie workflows to get data from multiple data sources in parallel using forks.
  • Created partitions and buckets based on state for further processing using bucket-based Hive joins.
  • Experienced with different kinds of compression techniques such as LZO, GZip, and Snappy.
  • Created Hive generic UDFs to process business logic that varies based on policy.
  • Imported relational database data using Sqoop into Hive dynamic-partition tables using staging tables.
  • Worked on custom Pig loaders and storage classes to work with a variety of data formats such as JSON and XML.
  • Used the Oozie workflow engine to manage interdependent Hadoop jobs and to automate several types of Hadoop jobs, such as Java MapReduce, Hive, Pig, and Sqoop.
  • Developed unit test cases using the JUnit, EasyMock, and MRUnit testing frameworks (an MRUnit sketch follows this list).
  • Experienced in monitoring the cluster using Cloudera Manager.
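
A minimal sketch of an MRUnit unit test of the sort mentioned above, exercising a small map-only filter; the mapper, record layout, and expected outputs are illustrative assumptions rather than project code.

    package com.example.mr;

    import java.io.IOException;

    import org.apache.hadoop.io.LongWritable;
    import org.apache.hadoop.io.NullWritable;
    import org.apache.hadoop.io.Text;
    import org.apache.hadoop.mapreduce.Mapper;
    import org.apache.hadoop.mrunit.mapreduce.MapDriver;
    import org.junit.Before;
    import org.junit.Test;

    public class StateFilterMapperTest {

        // Tiny mapper under test: keeps only claims from one state (field 2 of a
        // pipe-delimited record); purely illustrative.
        public static class StateFilterMapper
                extends Mapper<LongWritable, Text, Text, NullWritable> {
            @Override
            protected void map(LongWritable key, Text value, Context context)
                    throws IOException, InterruptedException {
                String[] fields = value.toString().split("\\|", -1);
                if (fields.length > 1 && "NE".equals(fields[1])) {
                    context.write(value, NullWritable.get());
                }
            }
        }

        private MapDriver<LongWritable, Text, Text, NullWritable> mapDriver;

        @Before
        public void setUp() {
            mapDriver = MapDriver.newMapDriver(new StateFilterMapper());
        }

        @Test
        public void keepsMatchingState() throws IOException {
            Text record = new Text("123|NE|2015-01-01|100.00");
            mapDriver.withInput(new LongWritable(0), record)
                     .withOutput(record, NullWritable.get())
                     .runTest();
        }

        @Test
        public void dropsOtherStates() throws IOException {
            mapDriver.withInput(new LongWritable(1), new Text("124|FL|2015-01-01|50.00"))
                     .runTest();   // no expected output
        }
    }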

Environment: Hadoop, HDFS, HBase, Spark, MapReduce, Teradata, MySQL, Java, Hive, Pig, Sqoop, Flume, Oozie, SQL, Cloudera Manager

Confidential, Orlando, FL

Hadoop Developer

Responsibilities:

  • Imported data from different relational data sources such as RDBMS and Teradata into HDFS using Sqoop.
  • Imported bulk data into HBase using MapReduce programs.
  • Performed analytics on time-series data stored in HBase using the HBase API (see the scan sketch after this list).
  • Designed and implemented incremental imports into Hive tables.
  • Used the REST API to access HBase data and perform analytics.
  • Handled importing of data from various data sources, performed transformations using Hive and MapReduce, loaded data into HDFS, and extracted data from MySQL into HDFS using Sqoop.
  • Worked on loading and transforming large sets of structured, semi-structured, and unstructured data.
  • Involved in collecting, aggregating, and moving data from servers to HDFS using Apache Flume.
  • Wrote Hive jobs to parse the logs and structure them in tabular format to facilitate effective querying of the log data.
  • Created reports for the BI team using Sqoop to export data into HDFS and Hive.
  • Involved in creating Hive tables and loading data into dynamic-partition tables.
  • Experienced in managing and reviewing the Hadoop log files.
  • Migrated ETL jobs to Pig scripts to perform transformations, joins, and some pre-aggregations before storing the data in HDFS.
  • Deployed and tested the system on a Hadoop MapR cluster.
  • Worked on different file formats such as sequence files, XML files, and map files using MapReduce programs.
  • Developed multiple MapReduce jobs in Java for data cleaning and preprocessing.
  • Imported data from an RDBMS environment into HDFS using Sqoop for report generation and visualization purposes using Tableau.
  • Worked on the Oozie workflow engine for job scheduling.
  • Created and maintained technical documentation for launching Hadoop clusters and for executing Pig scripts.
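
A minimal sketch of scanning time-series data from HBase over a time range using the HBase Java client API, as referenced above; it assumes the newer Connection/Table client API, and the table name, column family, and qualifier are hypothetical.

    package com.example.hbase;

    import org.apache.hadoop.conf.Configuration;
    import org.apache.hadoop.hbase.HBaseConfiguration;
    import org.apache.hadoop.hbase.TableName;
    import org.apache.hadoop.hbase.client.Connection;
    import org.apache.hadoop.hbase.client.ConnectionFactory;
    import org.apache.hadoop.hbase.client.Result;
    import org.apache.hadoop.hbase.client.ResultScanner;
    import org.apache.hadoop.hbase.client.Scan;
    import org.apache.hadoop.hbase.client.Table;
    import org.apache.hadoop.hbase.util.Bytes;

    public class TimeSeriesScan {
        public static void main(String[] args) throws Exception {
            Configuration conf = HBaseConfiguration.create();
            long end = System.currentTimeMillis();
            long start = end - 24L * 60 * 60 * 1000;          // last 24 hours

            try (Connection connection = ConnectionFactory.createConnection(conf);
                 Table table = connection.getTable(TableName.valueOf("metrics"))) {
                Scan scan = new Scan();
                scan.addColumn(Bytes.toBytes("d"), Bytes.toBytes("value"));
                scan.setTimeRange(start, end);                // filter on cell timestamps
                try (ResultScanner scanner = table.getScanner(scan)) {
                    long count = 0;
                    double sum = 0.0;
                    for (Result result : scanner) {
                        byte[] cell = result.getValue(Bytes.toBytes("d"), Bytes.toBytes("value"));
                        if (cell != null) {
                            sum += Double.parseDouble(Bytes.toString(cell));
                            count++;
                        }
                    }
                    System.out.printf("rows=%d avg=%.2f%n", count, count == 0 ? 0.0 : sum / count);
                }
            }
        }
    }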

Environment: Hadoop, HDFS, MapReduce, Hive, HBase, Oozie, Sqoop, Pig, Java, Tableau, REST API, Maven

Confidential

Java Developer

Responsibilities:

  • Involved in the design, development, and support phases of the Software Development Life Cycle (SDLC).
  • Primarily responsible for design and development using Java, J2EE, XML, Oracle SQL, PL/SQL, and XSLT.
  • Experienced in gathering data for requirements and use case development.
  • Reviewed the functional, design, source code, and test specifications.
  • Developed the presentation layer using JSP and CSS, with client-side validations using JavaScript.
  • Involved in developing the complete front end using JavaScript and CSS.
  • Implemented backend configuration of the DAO and XML generation modules of DIS.
  • Used JDBC for database access and also used the Data Transfer Object (DTO) design pattern (a DAO/DTO sketch follows this list).
  • Performed unit testing and rigorous integration testing of the whole application.
  • Involved in the design and development of the site using JSP, Servlets, JavaScript, and JDBC.
  • Wrote and executed test scripts using JUnit and was also actively involved in system testing.
  • Developed an XML parsing tool for regression testing.
  • Worked on documentation that meets the required compliance standards; also monitored end-to-end testing activities.
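
A minimal sketch of the DAO/DTO pattern over plain JDBC mentioned above, assuming a hypothetical CUSTOMER table and an already-configured DataSource; it uses try-with-resources for brevity rather than the JDK 1.5-era finally blocks the original project would have required.

    package com.example.dao;

    import java.sql.Connection;
    import java.sql.PreparedStatement;
    import java.sql.ResultSet;
    import java.sql.SQLException;
    import javax.sql.DataSource;

    // DTO: a plain data carrier passed between layers, with no persistence logic.
    class CustomerDTO {
        private long id;
        private String name;

        public long getId() { return id; }
        public void setId(long id) { this.id = id; }
        public String getName() { return name; }
        public void setName(String name) { this.name = name; }
    }

    // DAO: hides the JDBC details behind a narrow, testable interface.
    public class CustomerDao {
        private final DataSource dataSource;

        public CustomerDao(DataSource dataSource) {
            this.dataSource = dataSource;
        }

        public CustomerDTO findById(long id) throws SQLException {
            String sql = "SELECT ID, NAME FROM CUSTOMER WHERE ID = ?";
            try (Connection conn = dataSource.getConnection();
                 PreparedStatement ps = conn.prepareStatement(sql)) {
                ps.setLong(1, id);
                try (ResultSet rs = ps.executeQuery()) {
                    if (!rs.next()) {
                        return null;                    // no such customer
                    }
                    CustomerDTO dto = new CustomerDTO();
                    dto.setId(rs.getLong("ID"));
                    dto.setName(rs.getString("NAME"));
                    return dto;
                }
            }
        }
    }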

Environment: Java, JavaScript, HTML, CSS, JDK 1.5.1, JDBC, Oracle 9i/10g, XML, XSL, Solaris, and UML.

Confidential

Java Developer

Responsibilities:

  • Worked on Java Struts 1.0 backed by a SQL Server database to develop an internal application for ticket creation (see the Action sketch after this list).
  • Designed and developed the GUI using JSP, HTML, DHTML, and CSS.
  • Mapped an internal tool to the ServiceNow ticket creation tool.
  • Individually developed parser logic to decode the spawn file generated on the client side and to generate tickets based on business requirements.
  • Used XSL transforms on certain XML data.
  • Developed ANT scripts for compilation and deployment and performed unit testing using JUnit.
  • Built SQL queries for fetching the required data and columns from the production database.
  • Heavily involved in resolving database-related issues with stored procedures, triggers, and tables based on requirements.
  • Prepared documentation and user guides to identify the various attributes and metrics needed from the business.
  • Used SVN version control as the code repository.
  • Conducted Knowledge Transfer (KT) sessions for new recruits on the business value and technical functionality incorporated in the developed modules.
  • Created a maintenance plan for the production database; Oracle Certified Java Programmer 6.
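
A minimal sketch of a Struts Action for a ticket-creation flow like the one described above, written against the Struts 1.1-style execute() signature; the TicketForm bean, forward names, and validation logic are illustrative assumptions, not project code.

    package com.example.web;

    import javax.servlet.http.HttpServletRequest;
    import javax.servlet.http.HttpServletResponse;

    import org.apache.struts.action.Action;
    import org.apache.struts.action.ActionForm;
    import org.apache.struts.action.ActionForward;
    import org.apache.struts.action.ActionMapping;

    public class CreateTicketAction extends Action {

        @Override
        public ActionForward execute(ActionMapping mapping, ActionForm form,
                                     HttpServletRequest request,
                                     HttpServletResponse response) throws Exception {
            TicketForm ticketForm = (TicketForm) form;   // form bean populated by Struts

            // Delegate to a service/DAO layer in a real app; here we only validate
            // the summary field and stash a message for the JSP.
            if (ticketForm.getSummary() == null
                    || ticketForm.getSummary().trim().length() == 0) {
                request.setAttribute("error", "Ticket summary is required");
                return mapping.findForward("failure");
            }

            request.setAttribute("message", "Ticket created for: " + ticketForm.getSummary());
            return mapping.findForward("success");
        }
    }

    // Hypothetical form bean (normally declared in struts-config.xml as an ActionForm).
    class TicketForm extends ActionForm {
        private String summary;

        public String getSummary() { return summary; }
        public void setSummary(String summary) { this.summary = summary; }
    }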

Environment: MS Windows 2000, OS/390, J2EE (JSP, Struts 1.1), SQL Server 2005, Eclipse, Tomcat 6.
