
Sr. Hadoop/Big Data Developer Resume


Chicago, IL

SUMMARY

  • Over 9 years of overall software development experience in Big Data technologies, the Hadoop ecosystem and Java/J2EE technologies.
  • Worked on data modeling using various machine learning algorithms via R and Python (GraphLab); worked with programming languages such as Core Java and Scala.
  • In-depth knowledge of Hadoop architecture and its components, such as HDFS, Job Tracker, Task Tracker, NameNode, DataNode, YARN, Resource Manager, Node Manager and MapReduce.
  • Experience in the Amazon AWS cloud, including services such as EC2, S3, EBS, ELB, AMI, Route 53, Auto Scaling, CloudFront, CloudWatch and Security Groups.
  • Good understanding of job workflow scheduling and monitoring tools such as Oozie and Control-M.
  • Worked on developing ETL processes to load data from multiple data sources into HDFS using Flume and Sqoop, perform structural modifications using MapReduce and Hive, and analyze data using visualization/reporting tools.
  • Designed, configured and deployed Amazon Web Services (AWS) for a multitude of applications utilizing the AWS stack (Including EC2, Route53, S3, RDS, Cloud Formation, Cloud Watch, SQS, IAM), focusing on high-availability, fault tolerance, and auto-scaling.
  • Worked on HDFS, Name Node, Job Tracker, Data Node, Task Tracker and the MapReduce concepts.
  • Experience in front-end technologies like HTML, CSS, HTML5, CSS3, and AJAX.
  • Experience in building high performance and scalable solutions using various Hadoop ecosystem tools like Pig, Hive, Sqoop, Spark, Zookeeper, Solr and Kafka.
  • Defined real time data streaming solutions across the cluster using Spark Streaming, Apache Storm, Kafka, Nifi and Flume.
  • Solid experience in optimizing Hive queries using partitioning and bucketing techniques to control data distribution and enhance performance (see the sketch after this list).
  • Experience in importing and exporting data between databases such as MySQL and Oracle and HDFS/Hive using Sqoop.
  • Expertise with Application servers and web servers like Oracle WebLogic, IBM WebSphere and Apache Tomcat.
  • Experience working in environments using Agile (scrum) and Waterfall methodologies.
  • Expertise in database modeling and development using SQL and PL/SQL on MySQL and Teradata.
  • Experienced on Spark Architecture including Spark Core, Spark SQL, Data Frames, Spark Streaming and Spark MLlib.
  • Experience with NoSQL databases such as HBase and Cassandra and their integration with the Hadoop cluster.
  • Experienced on cloud integration with AWS using Elastic Map Reduce (EMR), Simple Storage Service (S3), EC2, Redshift and Microsoft Azure.
  • Experienced on different Relational Data Base Management Systems like Teradata, PostgreSQL, DB2, Oracle and SQL Server.
  • Experience in building, deploying and integrating applications in Application Servers with ANT, Maven and Gradle.
  • Significant application development experience with REST Web Services, SOAP, WSDL, and XML.
  • Expertise in database design, creation and management of schemas, writing stored procedures, functions, DDL and DML SQL queries, and writing complex queries for Oracle.
  • Strong hands-on development experience with Java, J2EE (Servlets, JSP, Java Beans, EJB, JDBC, JMS, Web Services) and related technologies.
  • Experience in working with different data sources like Flat files, XML files and Databases.
  • Experience in database design, entity relationships, database analysis, programming SQL, stored procedures, PL/SQL, packages and triggers in Oracle and MongoDB on Unix/Linux.
  • Worked on different operating systems like UNIX/Linux, Windows XP and Windows 7, 8 and 10.
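
A minimal sketch of the Hive partitioning and bucketing approach referenced above, in Scala, assuming a Spark 2.x SparkSession with Hive support; the sales_staging table, column names, bucket count and dates are hypothetical illustrations, not details from an actual engagement.

```scala
import org.apache.spark.sql.SparkSession

object HiveLayoutSketch {
  def main(args: Array[String]): Unit = {
    // SparkSession wired to the Hive metastore so saveAsTable creates managed tables
    val spark = SparkSession.builder()
      .appName("hive-partition-bucket-sketch")
      .enableHiveSupport()
      .getOrCreate()

    // Hypothetical staging table already loaded into the warehouse
    val sales = spark.table("sales_staging")

    // Partition by load_date so date filters prune whole directories, and
    // bucket by customer_id so joins on that key avoid a full shuffle
    sales.write
      .partitionBy("load_date")
      .bucketBy(32, "customer_id")
      .sortBy("customer_id")
      .format("parquet")
      .saveAsTable("sales_by_day")

    // Only the matching load_date partition is scanned by this query
    spark.sql("SELECT SUM(amount) FROM sales_by_day WHERE load_date = '2018-06-01'").show()

    spark.stop()
  }
}
```

The bucket count here is arbitrary; in practice it would be tuned to data volume and cluster size.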

TECHNICAL SKILLS

Major Skills: Big Data, Hadoop, Cassandra, Hive, Redshift, Sqoop, NoSQL, Oozie, Pig, Hortonworks, MapReduce, HBase, Zookeeper, Spark, Python, HDFS, AWS, EC2, EC3, MySQL, HTML, jQuery, Java, J2EE, MS Azure, Apache Flume, Kafka, Agile, MongoDB.

Big data/Hadoop: Hadoop 3.0, HDFS, MapReduce, Hive 2.3, Pig 0.17, Sqoop 1.4, Oozie 4.3, Hue, Flume 1.8, Kafka 1.1 and Spark

NoSQL Databases: HBase, Cassandra, MongoDB 3.6

Cloud Technology: Amazon Web Services (AWS), EC2, EC3, Elasticsearch, Microsoft Azure.

Languages: Java, J2EE, PL/SQL, Pig Latin, HQL, R 3.5, Python 3.6, XPath

Java Tools & Web Technologies: EJB, JSF, Servlets, JSP, JSTL, CSS3/CSS2, HTML5, XHTML, XML, XSL, XSLT

Databases: Oracle 12c/11g, MySQL, DB2, MS SQL Server 2016/2014

Frameworks: Struts, Spring 5.0, Hibernate 5.2, MVC

Web Services: SOAP, RESTful, JAX-WS, Apache Axis

Application Servers: Apache Tomcat 9.0.8, JBoss, IBM WebSphere, WebLogic

Scripting Languages: Shell Scripting, JavaScript.

Tools and IDEs: SVN, Maven, Gradle, Eclipse 4.7, NetBeans 8.2

Open Source: Hibernate, Spring IOC, Spring MVC, Spring Web Flow, Spring AOP

Methodologies: Agile, RAD, JAD, RUP, Waterfall & Scrum

PROFESSIONAL EXPERIENCE

Confidential - Washington, DC

Sr. Big Data/Hadoop Developer

Responsibilities:

  • Responsible for building scalable distributed data solutions using Hadoop Cloudera.
  • Involved in story-driven agile development methodology and actively participated in daily scrum meetings.
  • Created RDDs and applied data filters in Spark, and created Cassandra tables and Hive tables for user access.
  • Used partitioning, bucketing, map-side joins and parallel execution to optimize Hive queries, decreasing execution time from hours to minutes.
  • Designed the AWS cloud migration using AWS EMR, DynamoDB, Redshift and event processing with Lambda functions.
  • Worked on importing data from MySQL DB to HDFS and vice-versa using Sqoop to configure Hive Metastore with MySQL, which stores the metadata for Hive tables.
  • Worked with NoSQL databases like HBase in creating HBase tables to load large sets of semi-structured data coming from various sources.
  • Worked with different actions in Oozie, such as Sqoop, Pig, Hive and shell actions, to design workflows.
  • Mastered major Hadoop distributions such as Hortonworks and Cloudera along with numerous open-source projects, and prototyped various applications that utilize modern Big Data tools.
  • Analyzed large and critical datasets using Cloudera, HDFS, HBase, MapReduce, Hive UDF, Pig, Sqoop, Zookeeper and Spark.
  • Developed Hive Scripts, Pig scripts, Unix Shell scripts, Spark programming for all ETL loading processes and converting the files into parquet in the Hadoop File System.
  • Created applications using Kafka that monitor consumer lag within Apache Kafka clusters.
  • Involved in converting Hive/SQL queries into Spark transformations using Spark DataFrames and Scala (a sketch follows this list).
  • Loaded and transformed large sets of structured, semi structured data through Sqoop.
  • Optimized existing algorithms in Hadoop using SparkContext, Hive SQL and DataFrames.
  • Worked on custom Pig Loaders and Storage classes to work with a variety of data formats such as JSON, Compressed CSV, etc.
  • Developed Shell, Perl and Python scripts to automate and provide control flow to Pig scripts; designed the Redshift data model and performed Redshift performance analysis and improvements.
  • Implemented solutions for ingesting data from various sources and processing the Data-at-Rest utilizing Big Data technologies such as Hadoop, Map Reduce Frameworks, HBase, Hive.
  • Worked with Apache Hadoop ecosystem components such as HDFS, Hive, Sqoop, Pig and MapReduce, as well as Spark and Python.
  • Created HBase tables to load large sets of structured, semi-structured and unstructured data coming from UNIX, NoSQL and a variety of portfolios.
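
A sketch of the Hive-to-Spark conversion described above, in Scala, assuming a Spark 2.x SparkSession with Hive support; the orders table, columns and output path are illustrative placeholders rather than project specifics.

```scala
import org.apache.spark.sql.SparkSession
import org.apache.spark.sql.functions._

object HiveToDataFrameSketch {
  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder()
      .appName("hive-to-dataframe-sketch")
      .enableHiveSupport()
      .getOrCreate()

    // Equivalent HiveQL:
    //   SELECT region, COUNT(*) AS orders, SUM(amount) AS revenue
    //   FROM orders WHERE status = 'COMPLETE' GROUP BY region
    val summary = spark.table("orders")                // hypothetical Hive table
      .filter(col("status") === "COMPLETE")
      .groupBy("region")
      .agg(count(lit(1)).as("orders"), sum("amount").as("revenue"))

    // Persist the result as Parquet for downstream consumers (path is illustrative)
    summary.write.mode("overwrite").parquet("/data/warehouse/order_summary")

    spark.stop()
  }
}
```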

Environment: Hadoop 3.0, Cassandra 3.11, Hive 2.3, Redshift, HDFS, MySQL 8.2, Sqoop 1.4, NoSQL, Oozie 4.3, Pig 0.17, Hortonworks, MapReduce, HBase, Zookeeper 3.4, Spark, Unix, Kafka, JSON, Python 3.6

Confidential - Chicago, IL

Sr. Hadoop/Big Data Developer

Responsibilities:

  • Responsible for building scalable distributed data solutions using Spark and Hadoop, applying a solid understanding of Hadoop HDFS, MapReduce and other ecosystem projects.
  • Worked on analyzing Hadoop cluster using different big data analytic tools including Kafka, Pig, Hive and MapReduce.
  • Involved in story-driven agile development methodology and actively participated in daily scrum meetings.
  • Worked on both batch and streaming data processing, with ingestion into NoSQL stores and HDFS using different file formats such as Parquet and Avro.
  • Developed multiple Kafka producers and consumers per business requirements and customized partitioning to obtain optimized results.
  • Involved in configuration and development of the Hadoop environment on the AWS cloud, including EC2, EMR, Redshift, CloudWatch and Route 53.
  • Developed batch scripts to fetch data from AWS S3 storage and perform the required transformations in Scala using the Spark framework.
  • Configured Spark Streaming to receive real-time data from Kafka and store the streamed data in HDFS using Scala (see the sketch after this list).
  • Responsible for developing a data pipeline using Flume, Sqoop and Pig to extract data from weblogs and store it in HDFS.
  • Developed Pig Latin scripts to extract data from the web server output files to load into HDFS.
  • Developed data pipeline using MapReduce, Flume, Sqoop and Pig to ingest customer behavioral data into HDFS for analysis.
  • Used Spark for interactive queries, processing of streaming data and integration with popular NoSQL database for huge volume of data.
  • Used the Spark-Cassandra Connector to load data to and from Cassandra.
  • Handled importing data from different data sources into HDFS using Sqoop, performing transformations using Hive and MapReduce, and then loading the data into HDFS.
  • Exported the analyzed data to the relational databases using Sqoop, to further visualize and generate reports for the BI team.
  • Collected and aggregated large amounts of log data using Flume and staged the data in HDFS for further analysis.
  • Analyzed the data by performing Hive queries (HiveQL) and running Pig scripts (Pig Latin) to study customer behavior.
  • Extracted large volumes of data feed on different data sources, performed transformations and loaded the data into various Targets.
  • Developed data-formatted web applications and deployed the scripts using HTML5, XHTML, CSS, and client-side scripting with JavaScript.
  • Involved in loading and transforming large sets of Structured, Semi-Structured and Unstructured data and analyzed them by running Hive queries and Pig scripts
  • Assisted in cluster maintenance, monitoring and troubleshooting, and managed and reviewed data backups and log files.
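
A sketch of the Kafka-to-HDFS streaming path referenced above, in Scala with the spark-streaming-kafka-0-10 integration; the broker list, topic, consumer group, batch interval and HDFS path are placeholders, not values from the actual project.

```scala
import org.apache.kafka.common.serialization.StringDeserializer
import org.apache.spark.SparkConf
import org.apache.spark.streaming.{Seconds, StreamingContext}
import org.apache.spark.streaming.kafka010.KafkaUtils
import org.apache.spark.streaming.kafka010.LocationStrategies.PreferConsistent
import org.apache.spark.streaming.kafka010.ConsumerStrategies.Subscribe

object KafkaToHdfsSketch {
  def main(args: Array[String]): Unit = {
    val conf = new SparkConf().setAppName("kafka-to-hdfs-sketch")
    val ssc = new StreamingContext(conf, Seconds(30))   // 30-second micro-batches

    // Kafka consumer settings; broker list, topic and group id are placeholders
    val kafkaParams = Map[String, Object](
      "bootstrap.servers" -> "broker1:9092,broker2:9092",
      "key.deserializer" -> classOf[StringDeserializer],
      "value.deserializer" -> classOf[StringDeserializer],
      "group.id" -> "clickstream-ingest",
      "auto.offset.reset" -> "latest"
    )

    val stream = KafkaUtils.createDirectStream[String, String](
      ssc, PreferConsistent, Subscribe[String, String](Seq("clickstream"), kafkaParams))

    // Keep only the message payloads and land each non-empty micro-batch in HDFS
    stream.map(_.value())
      .foreachRDD { rdd =>
        if (!rdd.isEmpty()) {
          rdd.saveAsTextFile(s"hdfs:///data/raw/clickstream/batch_${System.currentTimeMillis()}")
        }
      }

    ssc.start()
    ssc.awaitTermination()
  }
}
```

Where the same stream also has to land in Cassandra, the spark-cassandra-connector's saveToCassandra call on each RDD would be the usual substitution for the HDFS write, under the same assumptions.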

Environment: Hadoop 3.0, Pig 0.17, Hive 2.3, HBase, Oozie 4.1, Sqoop 1.4, Kafka 1.1, Spark, Impala, HDFS, MapReduce, Redshift, Scala, Flume, NoSQL, Cassandra 3.11, XHTML, CSS3, HTML5, JavaScript

Confidential - Seattle, WA

Sr. Java/ Hadoop Developer

Responsibilities:

  • Involved in Installing Hadoop Ecosystem components.
  • Involved in HDFS maintenance and administering it through the Hadoop Java API.
  • Analyzed the data using Spark, Hive and produced summary results to downstream systems.
  • Created Shell scripts for scheduling data cleansing scripts and ETL loading process.
  • Installed and configured a multi-node, fully distributed Hadoop cluster.
  • Analyzed large and critical datasets using Cloudera, HDFS, HBase, MapReduce, Hive, UDF, Pig, Sqoop, Zookeeper and Spark.
  • Designed and implemented MapReduce based large-scale parallel relation-learning system.
  • Developed and delivered quality services on-time and on-budget. Solutions developed by the team use Java, XML, HTTP, SOAP, Hadoop, Pig and other web technologies.
  • Created the JDBC data sources in WebLogic.
  • Used the existing database reference tables for consumption via JDBC mapping.
  • Used HTML, CSS, JDBC drivers, JSP, AJAX, Google APIs and web mashups.
  • Involved in end-to-end data processing: ingestion, processing, quality checks and splitting.
  • Imported data into HDFS from various SQL databases and files using Sqoop and from streaming systems using Storm into Big Data Lake.
  • Involved in scripting (Python and shell) to provision and spin up virtualized Hadoop clusters.
  • Developed custom aggregate functions using Spark SQL and performed interactive querying (a sketch follows this list).
  • Wrote Pig scripts to store the data into HBase.
  • Created Hive tables, dynamic partitions and buckets for sampling, and worked on them using HiveQL.
  • Exported the analyzed data to Teradata using Sqoop for visualization and to generate reports for the BI team.
  • Experienced in loading and transforming large sets of structured, semi-structured and unstructured data.
  • Developed the code which will create XML files and Flat files with the data retrieved from Databases and XML files.
  • Extracted files from RDBMS through Sqoop, placed them in HDFS and processed them.
  • Configured Fair Scheduler to provide service level agreements for multiple users of a cluster.
  • Loaded data into the cluster from dynamically generated files using FLUME and from RDBMS using Sqoop.
  • Involved in writing Java APIs for interacting with HBase.
  • Involved in writing Flume and Hive scripts to extract, transform and load data into the database.
  • Participated in development/implementation of Cloudera Hadoop environment.
  • Implemented Partitioning, Dynamic Partitions and Buckets in HIVE for efficient data access.
  • Load and transform large sets of structured, semi structured and unstructured data using Hadoop/Big Data concepts.
  • Ingested semi structured data using Flume and transformed it using Pig.
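
A sketch of a custom aggregate function of the kind mentioned above, using the Spark 2.x UserDefinedAggregateFunction API in Scala; the sensor_readings table and the value-range aggregation are hypothetical examples, not the functions actually delivered.

```scala
import org.apache.spark.sql.{Row, SparkSession}
import org.apache.spark.sql.expressions.{MutableAggregationBuffer, UserDefinedAggregateFunction}
import org.apache.spark.sql.types._

// Custom aggregate returning max(value) - min(value) per group
class RangeUDAF extends UserDefinedAggregateFunction {
  override def inputSchema: StructType = StructType(StructField("value", DoubleType) :: Nil)
  override def bufferSchema: StructType = StructType(
    StructField("min", DoubleType) :: StructField("max", DoubleType) :: Nil)
  override def dataType: DataType = DoubleType
  override def deterministic: Boolean = true

  override def initialize(buffer: MutableAggregationBuffer): Unit = {
    buffer(0) = Double.MaxValue
    buffer(1) = Double.MinValue
  }

  override def update(buffer: MutableAggregationBuffer, input: Row): Unit = {
    if (!input.isNullAt(0)) {
      val v = input.getDouble(0)
      buffer(0) = math.min(buffer.getDouble(0), v)
      buffer(1) = math.max(buffer.getDouble(1), v)
    }
  }

  override def merge(buffer1: MutableAggregationBuffer, buffer2: Row): Unit = {
    buffer1(0) = math.min(buffer1.getDouble(0), buffer2.getDouble(0))
    buffer1(1) = math.max(buffer1.getDouble(1), buffer2.getDouble(1))
  }

  override def evaluate(buffer: Row): Double = buffer.getDouble(1) - buffer.getDouble(0)
}

object RangeUdafDemo {
  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder().appName("udaf-sketch").enableHiveSupport().getOrCreate()

    // Register the aggregate for use in interactive Spark SQL queries
    spark.udf.register("value_range", new RangeUDAF)
    spark.sql(
      "SELECT sensor_id, value_range(reading) AS reading_range " +
      "FROM sensor_readings GROUP BY sensor_id").show()

    spark.stop()
  }
}
```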

Environment: Cloudera, HDFS, HBase, MapReduce, Hive 2.0, UDF, Pig 0.16, Sqoop 1.1, Zookeeper 2.8, Spark, RDBMS, Kafka 1.0, Teradata r13, Java, XML, HTTP, SOAP, Hadoop 2.8, and Flume

Confidential - Monroe, LA

Sr. Java/J2EE Developer

Responsibilities:

  • Responsible for designing Rich user Interface Applications using JavaScript, CSS, HTML and Ajax and developed web services by using SOAP UI.
  • Applied J2EE design patterns such as Factory, Singleton, Business Delegate, DAO, Front Controller and MVC.
  • Created POJO layer to facilitate the sharing of data between the front end and the J2EE business objects.
  • Implemented Log4j by enabling logging at runtime without modifying the application binary.
  • Provided ANT build script for building and deploying the application.
  • Involved in configuring and deploying the application on WebLogic Application Server.
  • Used CVS for maintaining the source code; designed, developed and deployed on Apache Tomcat Server.
  • Created and modified Stored Procedures, Functions, Triggers and Complex SQL Commands using PL/SQL.
  • Developed shell scripts in Unix and procedures using SQL and PL/SQL to process data from the input file and load it into the database.
  • Designed and developed a web-based application using HTML5, CSS, JavaScript (jQuery), AJAX, and the JSP framework.
  • Involved in the migration of build and deployment process from ANT to Maven.
  • Developed Custom Tags to simplify the JSP code. Designed UI Screens using JSP, Struts tags and HTML.
  • Developed a multi-user web application using JSP, Servlets, JDBC, Spring and Hibernate framework to provide the needed functionality.
  • Used JSP, JavaScript, Bootstrap, jQuery, AJAX, CSS3, and HTML4 for the data and presentation layers.
  • Involved in J2EE Design Patterns such as Data Transfer Object (DTO), DAO, Value Object and Template.
  • Developed SQL Queries for performing CRUD operations in Oracle for the application.
  • Implemented modules using Java APIs, Java collection, Threads, XML, and integrating the modules.
  • Developed the presentation-layer GUI using JavaScript, JSP, HTML, XHTML, CSS and custom tags, and developed client-side validations using the Struts validation framework.
  • Worked on UML diagrams like Class Diagram, Sequence Diagram required for implementing the Quartz scheduler.
  • Extensively used Eclipse IDE for developing, debugging, integrating and deploying the application.
  • Managed and maintained NoSQL databases, mainly MongoDB, and used multithreading in back-end components in the production domain.
  • Extensively used Java multithreading for downloading files from a URL.

Environment: CSS2, HTML4, Ajax, PL/SQL, UNIX, SQL, Hibernate 3, Oracle 10g, Maven, JavaScript, Spring MVC

Confidential

Java Developer

Responsibilities:

  • Designed and developed Java back-end batch jobs to update product offer details; Core Java coding and development using multithreading and design patterns.
  • Used the Spring MVC framework to develop the application and its architecture.
  • Used Spring dependency injection to inject all the required dependencies into the application.
  • Developed screens, controller classes, business services and the DAO layer for the respective modules.
  • Involved in developing the business logic using POJOs.
  • Developed graphical user interfaces using HTML and JSPs for user interaction.
  • Developed web pages using the AngularJS UI framework.
  • Created set of classes using DAO pattern to decouple the business logic and data
  • Implemented Hibernate in the data access object layer to access and update information in the SQL Server database.
  • Used various Core Java concepts such as Multi-Threading, Exception Handling, Collection APIs to implement various features and enhancements
  • Wrote test cases in JUnit for unit testing of classes
  • Interfaced with the Oracle back-end database using Hibernate Framework and XML configured files
  • Created dynamic HTML pages, used JavaScript for client-side validations, and AJAX to create interactive front-end GUI.
  • Consumed Web Services for transferring data between different applications
  • Used Restful Web services to retrieve credit history of the applicants
  • Involved in coding, maintaining, and administering Servlets and JSP components to be deployed on Spring Boot.
  • Wrote PL/SQL queries, stored procedures, and triggers to perform back-end database operations.
  • Built scripts using Maven to build the J2EE application.
  • Used Eclipse IDE for developing code modules in the development environment
  • Performed connectivity with the SQL database using JDBC.
  • Implemented the logging mechanism using Log4j framework
  • Used GIT version control to track and maintain the different version of the application.

Environment: Java 1.2, Spring, HTML4, AngularJS, Hibernate, Oracle 9i, AJAX, PL/SQL, Maven, J2EE, Eclipse IDE, SQL
