
Hadoop Developer Resume

Owings Mills, MD

SUMMARY:

  • 8+ years of professional IT work experience in Analysis, Design, Administration, Development, Deployment and Maintenance of critical software and big data applications.
  • 3+ years of experience on Big Data platforms as both Developer and Administrator.
  • Hands-on experience in developing and deploying enterprise applications using major Hadoop ecosystem components such as MapReduce, YARN, Hive, Pig, HBase, Flume, Sqoop, Spark Streaming, Spark SQL, Storm, Kafka, Oozie, and Cassandra.
  • Hands-on experience using the MapReduce programming model for batch processing of data stored in HDFS (a minimal sketch follows this summary).
  • Exposure to administrative tasks such as installing Hadoop and ecosystem components like Hive and Pig.
  • Installed and configured multiple Hadoop clusters of different sizes and with ecosystem components like Pig, Hive, Sqoop, Flume, HBase, Oozie and Zookeeper.
  • Worked on the major Hadoop distributions, Cloudera and Hortonworks.
  • Responsible for designing and building a Data Lake using Hadoop and its ecosystem components.
  • Handled Data Movement, data transformation, Analysis and visualization across the lake by integrating it with various tools.
  • Defined extract - translate-load (ETL) and extract-load-translate (ELT) processes for the Data Lake.
  • Good expertise in planning, installing, and configuring Hadoop clusters based on business needs.
  • Good experience working with cloud environments such as Amazon Web Services (AWS) EC2 and S3.
  • Transformed and aggregated data for analysis by implementing workflow management of Sqoop, Hive, and Pig scripts.
  • Experience working on different file formats like Avro, Parquet, ORC, Sequence and Compression techniques like Gzip, Lzo, and Snappy in Hadoop.
  • Experience in retrieving data from databases like MySQL, Teradata, Informix, DB2, and Oracle into HDFS using Sqoop and ingesting it into HBase and Cassandra.
  • Experience writing Oozie workflows and Job Controllers for job automation.
  • Integrated Oozie with Hue and scheduled workflows for multiple Hive, Pig and Spark Jobs.
  • In-depth knowledge of Scala and experience building Spark applications using Scala.
  • Good experience working with Tableau and Spotfire, enabling JDBC/ODBC data connectivity from those tools to Hive tables.
  • Designed neat and insightful dashboards in Tableau.
  • Designed an array of reports including Crosstab, Chart, Drill-Down, Drill-Through, Customer-Segment, and Geodemographic-Segmentation reports.
  • Deep understanding of Tableau features such as site and server administration, calculated fields, table calculations, parameters, filters (normal and quick), highlighting, level of detail, granularity, aggregation, reference lines, and many more.
  • Adequate knowledge of Scrum, Agile and Waterfall methodologies.
  • Designed and developed multiple J2EE Model 2 (MVC) based web applications.
  • Worked with various tools and IDEs such as Eclipse, IBM Rational, Apache Ant, MS Office, PL/SQL Developer, and SQL*Plus.
  • Highly motivated, with the ability to work independently or as an integral part of a team, and committed to the highest level of professionalism.
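
A minimal sketch of the MapReduce batch-processing pattern referenced in the summary. It is illustrative only: the EventCountJob class name, the comma-separated input layout, and the command-line paths are hypothetical placeholders rather than any specific production job.

    import java.io.IOException;

    import org.apache.hadoop.conf.Configuration;
    import org.apache.hadoop.fs.Path;
    import org.apache.hadoop.io.IntWritable;
    import org.apache.hadoop.io.LongWritable;
    import org.apache.hadoop.io.Text;
    import org.apache.hadoop.mapreduce.Job;
    import org.apache.hadoop.mapreduce.Mapper;
    import org.apache.hadoop.mapreduce.Reducer;
    import org.apache.hadoop.mapreduce.lib.input.FileInputFormat;
    import org.apache.hadoop.mapreduce.lib.output.FileOutputFormat;

    public class EventCountJob {

        // Mapper: emits (eventType, 1) for each input record.
        public static class EventMapper extends Mapper<LongWritable, Text, Text, IntWritable> {
            private static final IntWritable ONE = new IntWritable(1);
            private final Text eventType = new Text();

            @Override
            protected void map(LongWritable key, Text value, Context context)
                    throws IOException, InterruptedException {
                String[] fields = value.toString().split(",");
                if (fields.length > 1) {
                    eventType.set(fields[1]); // assumes the event type sits in the second column
                    context.write(eventType, ONE);
                }
            }
        }

        // Reducer (also used as combiner): sums the counts per event type.
        public static class EventReducer extends Reducer<Text, IntWritable, Text, IntWritable> {
            @Override
            protected void reduce(Text key, Iterable<IntWritable> values, Context context)
                    throws IOException, InterruptedException {
                int sum = 0;
                for (IntWritable v : values) {
                    sum += v.get();
                }
                context.write(key, new IntWritable(sum));
            }
        }

        public static void main(String[] args) throws Exception {
            Job job = Job.getInstance(new Configuration(), "event count");
            job.setJarByClass(EventCountJob.class);
            job.setMapperClass(EventMapper.class);
            job.setCombinerClass(EventReducer.class);
            job.setReducerClass(EventReducer.class);
            job.setOutputKeyClass(Text.class);
            job.setOutputValueClass(IntWritable.class);
            FileInputFormat.addInputPath(job, new Path(args[0]));   // HDFS input directory
            FileOutputFormat.setOutputPath(job, new Path(args[1])); // HDFS output directory
            System.exit(job.waitForCompletion(true) ? 0 : 1);
        }
    }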

PROFESSIONAL EXPERIENCE:

Confidential, Owings Mills, MD

Hadoop Developer

Responsibilities:

  • Worked on a Hadoop cluster that scaled from 4 nodes in development to 8 nodes in pre-production and up to 24 nodes in production.
  • Involved in the complete implementation lifecycle, specializing in writing custom MapReduce, Pig, and Hive programs.
  • Exported the analyzed data to relational databases using Sqoop for visualization and to generate reports for the BI team.
  • Extensively used Hive (HQL) queries to search for particular strings in Hive tables stored in HDFS.
  • Possess good Linux and Hadoop system administration skills, including networking and shell scripting, and familiarity with open-source configuration management and deployment tools such as Chef.
  • Worked with Puppet for application deployment
  • Experience developing custom UDFs in Java to extend Hive and Pig Latin functionality.
  • Created HBase tables to store data arriving in various formats from different sources.
  • Used Maven to build and deploy code on the YARN cluster.
  • Good knowledge of building Apache Spark applications using Scala.
  • Developed several business services as Java RESTful web services using the Spring MVC framework.
  • Managed and scheduled jobs to remove duplicate log data files in HDFS using Oozie.
  • Used Apache Oozie for scheduling and managing Hadoop jobs; knowledge of HCatalog for Hadoop-based storage management.
  • Expert in designing and building data ingest pipelines using technologies such as Spring Integration, Apache Storm, and Kafka.
  • Used Flume extensively in gathering and moving log data files from Application Servers to a central location in Hadoop Distributed File System (HDFS).
  • Implemented test scripts to support test driven development and continuous integration.
  • Moved data from HDFS to a MySQL database and vice versa using Sqoop.
  • Responsible for managing data coming from different sources.
  • Analyzed the Cassandra database and compared it with other open-source NoSQL databases to determine which best suited the current requirements.
  • Used File System check (FSCK) to check the health of files in HDFS.
  • Developed the UNIX shell scripts for creating the reports from Hive data.
  • Experienced in loading and transforming large sets of structured, semi-structured, and unstructured data.
  • Analyzed large data sets to determine the optimal way to aggregate and report on them.
  • Applied Java/J2EE application development skills with object-oriented analysis and was extensively involved throughout the Software Development Life Cycle (SDLC).
  • Involved in the pilot of Hadoop cluster hosted on Amazon Web Services (AWS)
  • Extensively used Sqoop to get data from RDBMS sources like Teradata and Netezza.
  • Created a complete processing engine based on Cloudera's distribution.
  • Involved in collecting metrics for Hadoop clusters using Ganglia and Ambari.
  • Extracted files from CouchDB and MongoDB through Sqoop and placed them in HDFS for processing.
  • Used Spark Streaming to collect data from Kafka in near real time, perform the necessary transformations and aggregations on the fly to build the common learner data model, and persist the data in a NoSQL store (HBase); see the sketch after this list.
  • Configured Kerberos for the clusters
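
The Kafka-to-Spark Streaming-to-HBase flow described above followed roughly the shape below. This is a simplified Java sketch under stated assumptions, not the production job: the broker address, topic, group id, HBase table, and column-family names are hypothetical, and the real pipeline applied further transformations before persisting.

    import java.util.Arrays;
    import java.util.HashMap;
    import java.util.Map;

    import org.apache.hadoop.hbase.HBaseConfiguration;
    import org.apache.hadoop.hbase.TableName;
    import org.apache.hadoop.hbase.client.Connection;
    import org.apache.hadoop.hbase.client.ConnectionFactory;
    import org.apache.hadoop.hbase.client.Put;
    import org.apache.hadoop.hbase.client.Table;
    import org.apache.hadoop.hbase.util.Bytes;
    import org.apache.kafka.clients.consumer.ConsumerRecord;
    import org.apache.kafka.common.serialization.StringDeserializer;
    import org.apache.spark.SparkConf;
    import org.apache.spark.streaming.Durations;
    import org.apache.spark.streaming.api.java.JavaInputDStream;
    import org.apache.spark.streaming.api.java.JavaStreamingContext;
    import org.apache.spark.streaming.kafka010.ConsumerStrategies;
    import org.apache.spark.streaming.kafka010.KafkaUtils;
    import org.apache.spark.streaming.kafka010.LocationStrategies;

    public class LearnerEventStream {
        public static void main(String[] args) throws InterruptedException {
            SparkConf conf = new SparkConf().setAppName("learner-event-stream");
            // 10-second micro-batches for near-real-time processing.
            JavaStreamingContext jssc = new JavaStreamingContext(conf, Durations.seconds(10));

            Map<String, Object> kafkaParams = new HashMap<>();
            kafkaParams.put("bootstrap.servers", "broker1:9092");           // hypothetical broker
            kafkaParams.put("key.deserializer", StringDeserializer.class);
            kafkaParams.put("value.deserializer", StringDeserializer.class);
            kafkaParams.put("group.id", "learner-stream");                  // hypothetical group id

            // Direct stream from Kafka; each record carries one learner event.
            JavaInputDStream<ConsumerRecord<String, String>> stream =
                KafkaUtils.createDirectStream(
                    jssc,
                    LocationStrategies.PreferConsistent(),
                    ConsumerStrategies.<String, String>Subscribe(
                        Arrays.asList("learner_events"), kafkaParams));     // hypothetical topic

            // Persist each micro-batch into HBase, opening one connection per partition.
            stream.foreachRDD(rdd -> rdd.foreachPartition(records -> {
                try (Connection hbase = ConnectionFactory.createConnection(HBaseConfiguration.create());
                     Table table = hbase.getTable(TableName.valueOf("learner_model"))) {
                    while (records.hasNext()) {
                        ConsumerRecord<String, String> record = records.next();
                        // Assumes each record is keyed by a non-null learner id.
                        Put put = new Put(Bytes.toBytes(record.key()));
                        put.addColumn(Bytes.toBytes("d"), Bytes.toBytes("event"),
                                      Bytes.toBytes(record.value()));
                        table.put(put);
                    }
                }
            }));

            jssc.start();
            jssc.awaitTermination();
        }
    }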

Environment: Hadoop, MapReduce, HDFS, Ambari, Hive, Sqoop, Apache Kafka, Oozie, SQL, Alteryx, Flume, Spark, Cassandra, Scala, Java, AWS, GitHub.

Confidential, Warrendale, PA

Hadoop Data Analyst

Responsibilities:

  • Worked on a cloud platform built as a scalable distributed data solution using Hadoop on a 40-node AWS cluster to run analysis on 25+ terabytes of customer usage data.
  • Worked on analyzing the Hadoop stack and different big data analytic tools, including Pig, Hive, HBase, and Sqoop.
  • Designed and implemented a semi-structured data analytics platform leveraging Hadoop.
  • Worked on performance analysis and improvements for Hive and Pig scripts at the MapReduce job-tuning level.
  • Involved in Optimization of Hive Queries.
  • Developed a framework to load and transform large sets of unstructured data from UNIX systems into Hive tables.
  • Involved in Data Ingestion to HDFS from various data sources.
  • Developed Spark code using Scala and Spark-SQL/Streaming for faster testing and processing of data.
  • Extensively used Apache Sqoop for efficiently transferring bulk data between Apache Hadoop and relational databases.
  • Automated Sqoop, Hive, and Pig jobs using Oozie scheduling.
  • Extensive knowledge in NoSQL databases like HBase
  • Worked extensively with importing metadata into Hive and migrated existing tables and applications to work on Hive and AWS cloud.
  • Responsible for continuously monitoring and managing the Elastic MapReduce (EMR) cluster through the AWS console.
  • Good knowledge of writing and using user-defined functions in Hive, Pig, and MapReduce.
  • Helped the business team by installing and configuring Hadoop ecosystem components along with the Hadoop admin.
  • Developed multiple Kafka producers and consumers from scratch per the business requirements (see the sketch after this list).
  • Worked on loading log data into HDFS through Flume
  • Created and maintained technical documentation for executing Hive queries and Pig Scripts.
  • Worked on debugging and performance tuning of Hive and Pig jobs.
  • Used Oozie to schedule various jobs on Hadoop cluster.
  • Used Hive to analyze the partitioned and bucketed data.
  • Worked on establishing connectivity between Tableau and Hive.
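
A minimal Java sketch of the kind of Kafka producer described above; the broker address, topic name, record key, and payload are hypothetical placeholders.

    import java.util.Properties;

    import org.apache.kafka.clients.producer.KafkaProducer;
    import org.apache.kafka.clients.producer.Producer;
    import org.apache.kafka.clients.producer.ProducerRecord;

    public class UsageEventProducer {
        public static void main(String[] args) {
            Properties props = new Properties();
            props.put("bootstrap.servers", "broker1:9092");   // hypothetical broker
            props.put("acks", "all");                         // wait for full acknowledgement
            props.put("key.serializer",
                      "org.apache.kafka.common.serialization.StringSerializer");
            props.put("value.serializer",
                      "org.apache.kafka.common.serialization.StringSerializer");

            try (Producer<String, String> producer = new KafkaProducer<>(props)) {
                // Keying by customer id keeps one customer's events ordered within a partition.
                producer.send(new ProducerRecord<>("usage_events", "customer-42",
                                                   "{\"action\":\"login\"}"));
            }
        }
    }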

Environment: Hortonworks 2.4, Hadoop, HDFS, MapReduce, MongoDB, Java, VMware, Hive, Eclipse, Pig, HBase, AWS, Tableau, Sqoop, Flume, Linux, UNIX

Confidential, CA

Hadoop Developer

Responsibilities:

  • Loaded data from different data sources (Teradata and DB2) into HDFS using Sqoop and loaded it into partitioned Hive tables.
  • Developed Hive UDFs to bring all the customers' email IDs into a structured format (see the sketch after this list).
  • Developed bash scripts to pull the Tlog files from the FTP server and then process them for loading into Hive tables.
  • Used Sqoop to load data from DB2 into the HBase environment.
  • Used INSERT OVERWRITE to refresh the Hive data with HBase data daily.
  • Scheduled all the bash scripts using the Resource Manager scheduler.
  • Developed Oozie workflows for daily incremental loads, which pull data from Teradata and import it into Hive tables.
  • Developed Pig scripts to transform the data into a structured format, automated through Oozie coordinators.
  • Worked on loading the data from MySQL to HBase where necessary using Sqoop
  • Developed Hive queries for analysis across different banners.
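
The email-normalization UDF mentioned above followed roughly this shape. It is a simplified sketch: the class name, package, and normalization rule are hypothetical stand-ins for the production UDF.

    import org.apache.hadoop.hive.ql.exec.UDF;
    import org.apache.hadoop.io.Text;

    // Trims and lower-cases an email address so it joins consistently across sources.
    // Registered in Hive with, for example:
    //   CREATE TEMPORARY FUNCTION normalize_email AS 'com.example.udf.NormalizeEmail';
    public class NormalizeEmail extends UDF {
        public Text evaluate(Text email) {
            if (email == null) {
                return null;
            }
            return new Text(email.toString().trim().toLowerCase());
        }
    }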

Environment: Windows 7, Hadoop, HDFS, MapReduce, Sqoop, Hive, Pig, HBase, Teradata, DB2, Oozie, MySQL, Eclipse

Confidential

Hadoop/Java Developer

Responsibilities:

  • Worked with SQL and NoSQL (MongoDB, Cassandra, Hadoop) data stores
  • Managed and reviewed Hadoop log files
  • Ran Hadoop Streaming jobs to process terabytes of XML-format data
  • Worked on Hadoop cluster migrations and upgrades
  • Extensively worked with Cloudera Hadoop distribution components and custom packages
  • Built reporting using Tableau
  • Applied ETL principles and best practices
  • Developed the application using the Spring MVC Framework; performed client-side validations using AngularJS and Node.js
  • Developed the user interface using JSP, HTML, CSS, and JavaScript to simplify the complexities of the application.
  • Used an AJAX framework for dynamic searching of bill expense information.
  • Created a dynamic end-to-end REST API with the LoopBack Node.js framework.
  • Configured the Spring framework for the entire business logic layer.
  • Developed code using various patterns like Singleton, Front Controller, Adapter, DAO, MVC, Template, Builder and Factory Patterns
  • Used table-per-hierarchy inheritance in Hibernate and mapped polymorphic associations.
  • Developed one-to-many, many-to-one, and one-to-one annotation-based mappings in Hibernate (see the sketch after this list).
  • Developed DAO service methods to populate the domain model objects using Hibernate.
  • Used the Spring Framework's BeanFactory for initializing services.
  • Used the Java Collections API extensively, including Lists, Sets, and Maps.
  • Wrote DAO classes using Spring and Hibernate to interact with the database for persistence.
  • Used Apache Log4J for logging and debugging.
  • Used Hibernate in data access layer to access and update information in the database.
  • Followed TDD and developed test cases using JUnit for all the modules developed.
  • Used Log4J to capture the log that includes runtime exceptions, monitored error logs and fixed the problems.
  • Created a Maven build file to build the application and deployed it on WebSphere Application Server.
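
A condensed Java sketch of the annotation-based Hibernate mappings described above (one-to-many and many-to-one). The Bill and ExpenseItem entities are hypothetical stand-ins for the actual domain model, shown in one file for brevity.

    import java.util.ArrayList;
    import java.util.List;

    import javax.persistence.CascadeType;
    import javax.persistence.Entity;
    import javax.persistence.FetchType;
    import javax.persistence.GeneratedValue;
    import javax.persistence.Id;
    import javax.persistence.JoinColumn;
    import javax.persistence.ManyToOne;
    import javax.persistence.OneToMany;

    @Entity
    public class Bill {
        @Id
        @GeneratedValue
        private Long id;

        // One bill owns many expense line items; removing the bill removes its items.
        @OneToMany(mappedBy = "bill", cascade = CascadeType.ALL, orphanRemoval = true)
        private List<ExpenseItem> items = new ArrayList<>();

        // getters and setters omitted
    }

    @Entity
    class ExpenseItem {
        @Id
        @GeneratedValue
        private Long id;

        // Many line items belong to one bill; loaded lazily through a foreign-key column.
        @ManyToOne(fetch = FetchType.LAZY)
        @JoinColumn(name = "bill_id")
        private Bill bill;

        // getters and setters omitted
    }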

Environment: Java, Struts, Hibernate ORM, LoopBack Framework, Spring Application Framework, EJB, JSP, Servlets, JMS, XML, SOAP, WSDL, JDBC, JavaScript, UML, HTML, AngularJS, Node.js, JNDI, Subversion (SVN), Maven, Log4J, SpringSource Tool Suite (STS), Windows XP, WebSphere Application Server, Oracle.

Confidential

JAVA Developer

Responsibilities:

  • Worked with SQL and NoSQL (MongoDB, Cassandra, Hadoop) data stores
  • Managed and reviewed Hadoop log files
  • Ran Hadoop Streaming jobs to process terabytes of XML-format data
  • Worked on Hadoop cluster migrations and upgrades
  • Extensively worked with Cloudera Hadoop distribution components and custom packages
  • Built reporting using Tableau
  • Applied ETL principles and best practices
  • Developed the application using the Spring MVC Framework; performed client-side validations using AngularJS and Node.js
  • Developed the user interface using JSP, HTML, CSS, and JavaScript to simplify the complexities of the application.
  • Used an AJAX framework for dynamic searching of bill expense information.
  • Created a dynamic end-to-end REST API with the LoopBack Node.js framework.
  • Configured the Spring framework for the entire business logic layer.
  • Developed code using various patterns like Singleton, Front Controller, Adapter, DAO, MVC, Template, Builder and Factory Patterns
  • Used table-per-hierarchy inheritance in Hibernate and mapped polymorphic associations.
  • Developed one-to-many, many-to-one, and one-to-one annotation-based mappings in Hibernate.
  • Developed DAO service methods to populate the domain model objects using Hibernate.
  • Used the Spring Framework's BeanFactory for initializing services.
  • Used the Java Collections API extensively, including Lists, Sets, and Maps.
  • Wrote DAO classes using Spring and Hibernate to interact with the database for persistence (see the sketch at the end of this section).
  • Used Apache Log4J for logging and debugging.
  • Used Hibernate in data access layer to access and update information in the database.
  • Followed TDD and developed test cases using JUnit for all the modules developed.
  • Used Log4J to capture the log that includes runtime exceptions, monitored error logs and fixed the problems.
  • Created a Maven build file to build the application and deployed it on WebSphere Application Server.

Environment: Java, J2EE, JDBC, JSP, Struts, JMS, Spring, SQL, MS-Access, JavaScript, HTML
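
A condensed Java sketch of the Spring-and-Hibernate DAO pattern described above. The BillDao class and the Bill entity are hypothetical stand-ins, and it assumes a Hibernate SessionFactory and transaction manager are already configured in the Spring context.

    import java.util.List;

    import org.hibernate.SessionFactory;
    import org.springframework.beans.factory.annotation.Autowired;
    import org.springframework.stereotype.Repository;
    import org.springframework.transaction.annotation.Transactional;

    // DAO that delegates persistence to Hibernate; Spring injects the SessionFactory
    // and wraps each public method in a transaction.
    @Repository
    @Transactional
    public class BillDao {

        private final SessionFactory sessionFactory;

        @Autowired
        public BillDao(SessionFactory sessionFactory) {
            this.sessionFactory = sessionFactory;
        }

        public void save(Bill bill) {
            sessionFactory.getCurrentSession().saveOrUpdate(bill);
        }

        public Bill findById(Long id) {
            return sessionFactory.getCurrentSession().get(Bill.class, id);
        }

        @SuppressWarnings("unchecked")
        public List<Bill> findAll() {
            return sessionFactory.getCurrentSession().createQuery("from Bill").list();
        }
    }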
