
Sr. Hadoop Developer Resume


MO

PROFESSIONAL SUMMARY:

  • 10 years of experience in the Information Technology Industry with strong exposure to software project management, design, development, implementation, maintenance/support and integration of software applications.
  • 4+ years of work experience as a Hadoop Developer with good knowledge of the Hadoop framework, the Hadoop Distributed File System (HDFS) and parallel processing implementation.
  • Experience with Hadoop ecosystem tools: HDFS, MapReduce, Hive, Pig, HBase, Sqoop and Talend.
  • Excellent understanding of Hadoop architecture and its components, such as HDFS, JobTracker, TaskTracker, NameNode, DataNode and the MapReduce programming paradigm.
  • Good exposure to Apache Hadoop MapReduce programming, Hive, Pig scripting and HDFS.
  • Experience in managing and reviewing Hadoop log files. 
  • Hands-on experience importing and exporting data using the Hadoop data management tool Sqoop.
  • In depth understanding of Data Structures and Algorithms.
  • Strong experience in writing MapReduce programs for data analysis. Hands-on experience in writing custom partitioners for MapReduce.
  • Experience in developing Big Data projects using Hadoop, Hive, HDP, Pig, Flume, Storm and MapReduce open source tools/technologies.
  • Extensive experience as a Hadoop Developer with strong expertise in Hive, Pig and Spark.
  • Experience in Big Data analytics with hands-on experience in data extraction, transformation, loading, data analysis and data visualization using the Cloudera platform.
  • Performed data analysis using Hive and Pig. 
  • Excellent understanding and knowledge of NoSQL databases such as MongoDB, HBase and Cassandra.
  • Experience with distributed systems, large-scale non-relational data stores, RDBMS, NoSQL map-reduce systems, data modeling, database performance and multi-terabyte data warehouses.
  • Experience in Software Development Life Cycle (Requirements Analysis, Design, Development, Testing, Deployment and Support). 
  • Hands-on experience in application development using Java, RDBMS and Linux shell scripting.
  • Experience working with Java/J2EE, JDBC, ODBC, JSP, Eclipse, JavaBeans, EJB and Servlets.
  • Experience using IDEs such as Eclipse and NetBeans and the Maven build tool. Development experience with DBMSs such as Oracle, MS SQL Server, Teradata and MySQL.
  • Strong knowledge of data warehousing, including Extract, Transform, and Load Processes. 
  • Hands-on experience writing queries, stored procedures, functions and triggers in SQL.
  • Support development, testing, and operations teams during new system deployments.
  • Evaluate and propose new tools and technologies to meet the needs of the organization.
  • An excellent team player and self-starter with good communication skills and proven abilities to finish tasks before target deadlines.
  • Involved in reviewing the project scope, requirements, architecture diagram, proof of concept (POC) design and development guidelines on Talend.
  • Experience with version control tools such as Git and SVN.
  • Experience in shell scripting (Bash) and Python.
  • Involved in DevOps migration/automation processes for build and deploy systems.

SKILLS SUMMARY:

Hadoop Ecosystem: Hadoop MapReduce, HDFS, HBase, Hive, Pig, Sqoop, ZooKeeper, Kafka, Oozie

NoSQL: MongoDB, Cassandra

Databases: MS SQL Server, MySQL, Oracle 9i/10g, MS Access, Teradata V2R5

Languages: Java JDK 1.4/1.5/1.6 (JDK 5, JDK 6), C/C++, SQL, Teradata SQL, PL/SQL, Servlets, JavaBeans, JDBC, JNDI, JTA, JPA, E

Operating Systems: Windows Server, Windows XP/Vista, Mac OS, UNIX, Linux

SQL Server Tools: SQL Server Management Studio, Enterprise Manager, Query Analyzer, Profiler, Export/Import (DTS).

TECHNICAL EXPERIENCE:

Sr. Hadoop Developer

Confidential, MO

Responsibilities:

  • Evaluate business requirements and prepare detailed specifications that follow project guidelines required to develop written programs.
  • Responsible for building scalable distributed data solutions using Hadoop.
  • Analyze large amounts of data sets to determine optimal way to aggregate and report on it.
  • Develop simple to complex MapReduce jobs using Hive to cleanse and load mainframe data.
  • Handle importing of data from various data sources, perform transformations using Hive, MapReduce, load data into HDFS and extract the data from MySQL into HDFS using Sqoop.
  • Export the analyzed data from Hive tables to mainframe DB2 databases using Sqoop for visualization and to generate reports for the BI team.
  • Extensively used Hive for data cleansing.
  • Create partitioned tables in Hive.
  • Manage and review Hadoop log files.
  • Involved in creating Hive tables, loading them with data and writing Hive queries, which run internally as MapReduce jobs.
  • Created MapReduce programs using Apache Hadoop for working with Big Data.
  • Use Hive to analyze the partitioned and bucketed data and compute various metrics for reporting.
  • Used Unix Bash scripts to validate files transferred from Unix to HDFS file systems.
  • Load and transform large sets of structured, semi structured and unstructured data.
  • Manage data coming from different sources.
  • Scheduled the scripts to cleanse and load the data into partitioned Hive tables using Autosys.
  • Administered table access for various groups of users using the Sentry application in Cloudera.
  • Work with application teams to install operating system, Hadoop updates, patches, version upgrades as required.
  • Load and transform large sets of structured, semi structured and unstructured data using Hadoop/Big Data concepts.
  • Worked as a DevOps resource for release automation to achieve continuous integration and continuous delivery (CI/CD).
  • Created numerous Unix shell scripts for issue analysis.
  • Used shell scripts to automate the existing file watcher and file validation processes, including an MD5 checksum check.
  • Ported chain Map Reduce jobs to Scala and Spark.
  • Worked with the ETL tool Pentaho Data Integration. Created many KTRs and KJBs to extract and load data from SQL to HDFS.
  • Used Talend to develop processes for extracting, cleansing, transforming, integrating and loading data into a data mart database.
  • Experience in designing and developing complex Talend/ETL Jobs and Java Routines. 
  • Scheduled the Pentaho jobs in Autosys for a data migration project.
  • Used Pentaho ETL to incorporate the Sqoop jobs that import data from DB2 tables into Hive tables.
  • Experience in designing and developing POCs using Scala and Spark SQL, and deploying them on the YARN cluster.
  • Proficient in Spark with Scala for loading data from local file systems, HDFS, relational and NoSQL databases using Spark SQL, importing data into RDDs and ingesting data from a range of sources.
  • Used Apache Solr 4.8 for search configuration. 
  • Involved in Solr Search development in dataconfig model.
  • Develop SolrCloud test protocols or plans for testing revised applications and review test results.
  • Designed and created Talend jobs for the audit process and exception handling framework.
  • Worked with parallel connectors for parallel processing to improve job performance while working with bulk data sources in Talend.

Environment: Hadoop, MapReduce, HDFS, Hive, Pig, Java, SQL, Sqoop, Scala, Spark, Eclipse, DB2, Pentaho, Talend, Shell, Python, Solr

Sr. Hadoop Developer

Confidential, MD

Responsibilities:
  • Evaluate business requirements and prepare detailed specifications that follow project guidelines required to develop written programs.
  • Responsible for building scalable distributed data solutions using Hadoop.
  • Analyze large amounts of data sets to determine optimal way to aggregate and report on it.
  • Develop simple to complex MapReduce Jobs using Hive and Pig.
  • Optimized MapReduce jobs to use HDFS efficiently by using various compression mechanisms.
  • Handle importing of data from various data sources, perform transformations using Hive, MapReduce, load data into HDFS and extract the data from MySQL into HDFS using Sqoop.
  • Export the analyzed data to the relational databases using Sqoop for visualization and to generate reports for the BI team.
  • Extensively used Pig for data cleansing.
  • Create partitioned tables in Hive.
  • Manage and review Hadoop log files.
  • Involved in creating Hive tables, loading them with data and writing Hive queries, which run internally as MapReduce jobs.
  • Use Hive to analyze the partitioned and bucketed data and compute various metrics for reporting.
  • Install and configure Pig and write Pig Latin scripts.
  • Developed Pig Latin scripts to extract the data from the web server output files to load into HDFS.
  • Load and transform large sets of structured, semi structured and unstructured data.
  • Manage data coming from different sources.
  • Work with application teams to install operating system, Hadoop updates, patches, version upgrades as required.
  • Experienced in working on DevOps/Agile operations process and tools area (Code review, unit test automation, Build & Release automation, Environment, Service, Incident and Change Management).
  • Experience in migrating data from relational databases to Cassandra.
  • Expert in Cassandra Query Language (CQL) to execute queries on the data persisting in the Cassandra cluster. 
  • Wrote shell scripts for daily and routine jobs.
  • Created Oozie workflows to run multiple Hive queries and Pig scripts simultaneously.
  • Generated property list for every application dynamically using Python. 
  • Designed and developed POCs using Scala, Spark SQL and MLlib, then deployed them on the YARN cluster.
  • Involved in writing a Java client program to connect to the Cassandra cluster.
  • Involved in converting MapReduce programs into Spark transformations using Spark RDDs in Scala.

Environment: Hadoop, MapReduce, HDFS, Hive, Impala, Cassandra, Pig, Java (JDK 1.6), SQL, Sqoop, Scala, Spark, Eclipse

Hadoop Developer

Confidential, OH

Responsibilities:
  • Gathered User requirements and designed technical and functional specifications.
  • Worked on analyzing the Hadoop cluster and different Big Data analytic tools including Pig, the HBase database and Sqoop.
  • Responsible for building scalable distributed data solutions using Hadoop.
  • Implemented a nine-node CDH3 Hadoop cluster on Red Hat Linux.
  • Involved in loading data from LINUX file system to HDFS.
  • Worked on installing cluster, commissioning and decommissioning of DataNode, NameNode recovery, capacity planning, and slots configuration.
  • Created HBase tables to store variable data formats of PII data coming from different portfolios.
  • Implemented a script to transmit Sys Prin information from Oracle to HBase using Sqoop.
  • Implemented best income logic using Pig scripts and UDFs.
  • Implemented test scripts to support test driven development and continuous integration.
  • Worked on tuning the performance of Pig queries.
  • Worked with application teams to install operating system, Hadoop updates, patches, version upgrades as required.
  • Responsible for managing data coming from different sources.
  • Involved in loading data from UNIX file system to HDFS.
  • Loaded and transformed large sets of structured, semi structured and unstructured data.
  • Handled cluster coordination services through ZooKeeper.
  • Experienced in managing and reviewing Hadoop log files.
  • Job management using Fair Scheduler.
  • Exported the analyzed data to the relational databases using Sqoop for visualization and to generate reports for the BI team.
  • Responsible for cluster maintenance, added and removed cluster nodes, cluster monitoring and troubleshooting, managed and reviewed data backups, managed and reviewed Hadoop log files.
  • Installed Oozie workflow engine to run multiple Hive and Pig jobs.
  • Analyzed large amounts of data sets to determine optimal way to aggregate and report on it.
  • Supported in setting up QA environment and updated configurations for implementing scripts with Pig and Sqoop.

Environment: Hadoop, HDFS, Pig, Sqoop, HBase, Shell Scripting, Ubuntu, Linux Red Hat.

Hadoop Developer

Confidential, WI

Responsibilities:
  • Involved in review of functional and non-functional requirements.
  • Facilitated knowledge transfer sessions.
  • Installed and configured Hadoop MapReduce and HDFS; developed multiple MapReduce jobs in Java for data cleaning and pre-processing.
  • Imported and exported data into HDFS and Hive using Sqoop.
  • Experienced in defining job flows.
  • Experienced in managing and reviewing Hadoop log files.
  • Extracted files from CouchDB through Sqoop, placed them in HDFS and processed them.
  • Experienced in running Hadoop streaming jobs to process terabytes of XML format data.
  • Loaded and transformed large sets of structured, semi structured and unstructured data.
  • Responsible for managing data coming from different sources.
  • Good experience with NoSQL databases.
  • Supported MapReduce programs running on the cluster.
  • Involved in loading data from UNIX file system to HDFS.
  • Installed and configured Hive and also wrote Hive UDFs.
  • Involved in creating Hive tables, loading data and writing Hive queries, which ran internally as MapReduce jobs.
  • Gained very good business knowledge on health insurance, claim processing, fraud suspect identification, appeals process etc.
  • Developed a custom file system plugin for Hadoop so that it can access files on Data Platform.
  • This plugin allows Hadoop MapReduce programs, HBase, Pig and Hive to work unmodified and access files directly.
  • Designed and implemented a MapReduce-based large-scale parallel relation-learning system.
  • Extracted feeds from social media sites such as Facebook and Twitter using Python scripts.
  • Set up and benchmarked Hadoop/HBase clusters for internal use.
  • Set up a Hadoop cluster on Amazon EC2 using Whirr for a POC.
  • Wrote a recommendation engine using Mahout.

Environment: Java 6, Eclipse, Oracle 10g, Subversion, Hadoop, Hive, HBase, Linux, MapReduce.

J2EE Developer

Confidential, OK

Responsibilities:
  • Involved in designing the application and prepared Use case diagrams, class diagrams, sequence diagrams.
  • Developed Servlets and JSP based on MVC pattern using Struts Action framework.
  • Used Tiles for setting the header, footer and navigation and Apache Validator Framework for Form validation.
  • Used resource and property files for i18n support.
  • Involved in writing Hibernate queries and Hibernate specific configuration and mapping files.
  • Used Log4J logging framework to write log messages with various levels.
  • Involved in fixing bugs and minor enhancements for the front-end modules.
  • Used JUnit framework for writing Test Classes and Ant for starting up the application server in various modes.
  • Followed the SDLC and used ClearCase for version control.

Environment: Java JDK 1.4, EJB 2.x, Hibernate 2.x, Jakarta Struts 1.2, JSP, Servlet, JavaScript, MS SQL Server 7.0, Eclipse 3.x, WebSphere 6, Ant, Windows XP, Unix, Excel Macro Development.

Java Developer

Confidential

Responsibilities:
  • Used PROC/SQL to fetch tables from Teradata warehouse.
  • Involved in requirement analysis, development and documentation.
  • Used MVC architecture (Jakarta Struts framework) for Web tier.
  • Participated in developing the form beans and action mappings required for the Struts implementation and the Struts validation framework.
  • Developed front-end screens with JSP using Eclipse.
  • Involved in development of medical records module. Responsible for development of the functionality using Struts and EJB components.
  • Coded DAO objects using JDBC, following the DAO pattern.
  • Used XML and XSDs to define data formats.
  • Implemented J2EE design patterns (Value Object, Singleton, DAO) for the presentation, business and integration tiers of the project.
  • Involved in bug fixing and functionality enhancements.
  • Designed and developed a logging mechanism for each order process using Log4J.
  • Involved in writing Oracle SQL queries.
  • Involved in Check-in and Checkout process using CVS.
  • Developed additional functionality in the software as per business requirements.
  • Involved in requirement analysis and complete development of client side code.
  • Followed Sun coding and documentation standards.
  • Participated in project planning with Business Analysts and team members to analyze the business requirements and translated business requirements into working software.
  • Developed software application modules using disciplined software development process.

Environment: Java, J2EE, JSP, EJB, Ant, Struts 1.2, Log4J, WebLogic 7.0, JDBC, MyEclipse, Windows XP, CVS, Oracle.
