
Sr. Big Data Developer Resume


Atlanta, GA

SUMMARY

  • 8+ years of experience in Development, Testing, Implementation, Maintenance and Enhancements on various IT projects, with Big Data experience implementing end-to-end Hadoop solutions.
  • Experience with all stages of the SDLC and Agile Development model right from the requirement gathering to Deployment and production support.
  • Good knowledge of Amazon AWS services such as EMR and EC2, which provide fast and efficient processing.
  • Extensive experience in installing, configuring and using Big Data ecosystem components like Hadoop MapReduce, HDFS, Sqoop, Pig, Hive, Impala, Spark and Zookeeper
  • Expertise in using J2EE application servers such as IBM Web Sphere, JBoss and web servers like Apache Tomcat.
  • Experience in different Hadoop distributions like Cloudera (CDH3 & CDH4) and the Hortonworks Data Platform (HDP).
  • Experience in developing web services with XML-based standards such as SOAP, WSDL, and UDDI using Apache Axis.
  • Solid understanding of Hadoop MRv1 and MRv2 (YARN) architecture.
  • Well versed in writing and deploying Oozie Workflows and Coordinators.
  • Highly skilled in integrating Amazon Kinesis streams with Spark Streaming applications to build long-running real-time applications.
  • Good working experience using Sqoop to import data from RDBMS into HDFS and vice versa.
  • Extensive experience in Extraction, Transformation and Loading (ETL) of data from multiple sources into Data Warehouse and Data Mart.
  • Good knowledge in SQL and PL/SQL to write Stored Procedures and Functions and writing unit test cases using JUnit.
  • Hands on experience in configuring and working with Flume to load the data from multiple sources directly into HDFS.
  • In-depth knowledge of handling large amounts of data using the Spark DataFrames/Datasets API and case classes (a minimal sketch follows this list).
  • Good knowledge in implementing various data processing techniques using Apache HBase for handling the data and formatting it as required.
  • Working experience in Impala, Mahout, Spark SQL, Storm, Avro, Kafka, and AWS.
  • Experience with Java web framework technologies like Apache Camel and Spring Batch.
  • Experience in version control and source code management tools like GIT, SVN, and Bitbucket.
  • Hands on experience working with databases like Oracle, SQL Server and MySQL.
  • Great knowledge of working with Apache Spark Streaming API on Big Data Distributions in an active cluster environment.
  • Proficiency in developing secure enterprise Java applications using technologies such as Maven, Hibernate, XML, HTML, CSS, and version control systems.
  • Developed and implemented Apache NiFi across various environments and wrote QA scripts in Python for tracking files.
  • Excellent understanding of Hadoop and underlying framework including storage management.
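
The following is a minimal sketch, in Java, of the kind of Spark DataFrames/Datasets work referenced above: loading raw records, dropping incomplete rows, and writing a simple aggregate. The input path, column names, and validation rule are hypothetical and shown only for illustration.

```java
import org.apache.spark.sql.Dataset;
import org.apache.spark.sql.Row;
import org.apache.spark.sql.SparkSession;
import static org.apache.spark.sql.functions.col;

public class DataFrameCleansing {
    public static void main(String[] args) {
        SparkSession spark = SparkSession.builder()
                .appName("DataFrameCleansingSketch")
                .getOrCreate();

        // Hypothetical input: claims records with id, amount, and state columns
        Dataset<Row> raw = spark.read().json("hdfs:///data/raw/claims");

        Dataset<Row> cleansed = raw
                .na().drop(new String[] {"id", "amount"})   // drop rows missing key fields
                .filter(col("amount").gt(0));               // basic validation rule

        // Summarize total amount per state for downstream reporting
        cleansed.groupBy("state")
                .sum("amount")
                .write()
                .mode("overwrite")
                .parquet("hdfs:///data/curated/claims_by_state");

        spark.stop();
    }
}
```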

TECHNICAL SKILLS

Hadoop/Big Data Technologies: Hadoop 3.0, HDFS, MapReduce, HBase 1.4, Apache Pig, Hive 2.3, Sqoop 1.4, Apache Impala 2.1, Oozie 4.3, Yarn, Apache Flume 1.8, Kafka 1.1, Zookeeper

Cloud Platform: Amazon AWS, EC2, S3, EMR, MS Azure, Azure SQL Database, Azure SQL Data Warehouse, Azure Analysis Services, HDInsight, Azure Data Lake, Data Factory

Programming Language: Java, Scala, Python 3.6, SQL, PL/SQL, Shell Scripting, Storm 1.0, JSP, Servlets

Frameworks: Spring 5.0.5, Hibernate 5.2, Struts 1.3, JSF, EJB, JMS

Web Technologies: HTML, CSS, JavaScript, jQuery 3.3, Bootstrap 4.1, XML, JSON, AJAX

Operating Systems: Linux, UNIX, Windows 10/8/7

IDE and Tools: Eclipse 4.7, NetBeans 8.2, IntelliJ, Maven

NoSQL Databases: HBase 1.4, Cassandra 3.11, MongoDB, Accumulo

SDLC Methodologies: Agile, Waterfall

Version Control: GIT, SVN, CVS

PROFESSIONAL EXPERIENCE

Confidential, Atlanta GA

Sr. Big Data Developer

Responsibilities:

  • Developed Big Data applications using Spark and Scala.
  • Worked on Big Data eco-systems including Hive, MongoDB, Zookeeper, Spark Streaming with MapR distribution.
  • Involved in Agile methodologies, daily scrum meetings, and sprint planning.
  • Involved in writing Spark applications using Scala to perform various data cleansing, validation, transformation and summarization activities according to the requirement.
  • Loaded data into Spark RDDs and performed in-memory computation to generate the output as per the requirements.
  • Performed multiple MapReduce jobs in Pig and Hive for data cleaning and pre-processing.
  • Built Hadoop solutions for big data problems using MR1 and MR2 in YARN.
  • Handled importing of data from various data sources, performed transformations using Hive, PIG, and loaded data into HDFS.
  • Proactively involved in ongoing maintenance, support and improvements in Hadoop cluster.
  • Developed Spark code using Scala and Spark-SQL/Streaming for faster testing and processing of data.
  • Used Hive to analyze data ingested into HBase by using Hive-HBase integration and compute various metrics for reporting on the dashboard.
  • Worked on analyzing and writing Hadoop MapReduce jobs using the Java API, Pig, and Hive.
  • Developed reports, dashboards using Tableau for quick reviews to be presented to business.
  • Worked on configuring and managing disaster recovery and backup on Cassandra Data.
  • Worked on MongoDB and HBase databases, which differ from classic relational databases.
  • Involved in converting HiveQL into Spark transformations using Spark RDDs and Scala programming (see the sketch after this list).
  • Used Hive to perform data validation on the data ingested using Sqoop and cleansed the data.
  • Developed several business services using Java RESTful Web Services using Spring MVC framework.
  • Involved in identifying job dependencies to design workflow for Oozie and YARN resource management.
  • Designed solution for various system components using Microsoft Azure.
  • Moved data using Sqoop between HDFS and relational database systems, and handled ongoing maintenance and troubleshooting.
  • Explored with Spark to improve the performance and optimization of the existing algorithms in Hadoop using Spark context, Spark-SQL, Data Frame, pair RDD.
  • Created Hive Tables, loaded claims data from Oracle using Sqoop and loaded the processed data into target database.
  • Implemented Security in Web Applications using Azure and deployed Web Applications to Azure.
  • Worked on analyzing Hadoop cluster and different big data analytic tools including Pig, HBase database and Sqoop.
  • Participated in all aspects of Software Development Life Cycle (SDLC) and Production troubleshooting, Software testing using Standard Test Tool.
  • Exported data from HDFS to RDBMS via Sqoop for Business Intelligence, visualization and user report generation.
  • Developed Apache NiFi flows dealing with various kinds of data formats such as XML, JSON, and Avro.
  • Worked on importing data from HDFS to MySQL database and vice versa using Sqoop.
  • Configured the Hive metastore with MySQL, which stores the metadata for Hive tables.
  • Performed data analytics in Hive and then exported those metrics back to Oracle Database using Sqoop.
  • Upgraded the Hadoop Cluster from CDH3 to CDH4, setting up High Availability Cluster and integrating Hive with existing applications.
  • Supported NoSQL databases in enterprise production and loaded data into HBase using Impala and Sqoop.
  • Developed many distributed, transactional, portable applications using Enterprise JavaBeans (EJB) architecture for Java 2 Enterprise Edition (J2EE) platform.
  • Used Cloudera Manager for installation and management of Hadoop Cluster.
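
A minimal sketch of converting a HiveQL aggregation into Spark transformations, as referenced above, using the Java pair-RDD API. The table name, column names, and output path are hypothetical; it assumes a SparkSession with Hive support on the cluster.

```java
// HiveQL being rewritten (hypothetical):
//   SELECT member_id, SUM(claim_amount) FROM claims GROUP BY member_id
import org.apache.spark.api.java.JavaPairRDD;
import org.apache.spark.api.java.JavaRDD;
import org.apache.spark.sql.Row;
import org.apache.spark.sql.SparkSession;
import scala.Tuple2;

public class HiveQlToSpark {
    public static void main(String[] args) {
        SparkSession spark = SparkSession.builder()
                .appName("HiveQlToSparkSketch")
                .enableHiveSupport()   // assumes the Hive metastore is reachable
                .getOrCreate();

        // Read the Hive table as an RDD of rows
        JavaRDD<Row> claims = spark.table("claims").javaRDD();

        // GROUP BY member_id / SUM(claim_amount) expressed as pair-RDD transformations
        JavaPairRDD<String, Double> totals = claims
                .mapToPair(row -> new Tuple2<>(
                        row.getString(row.fieldIndex("member_id")),
                        row.getDouble(row.fieldIndex("claim_amount"))))
                .reduceByKey(Double::sum);

        totals.saveAsTextFile("hdfs:///tmp/claim_totals");
        spark.stop();
    }
}
```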

Environment: Flume 1.8, Tableau, GIT, Kafka 1.1, MapReduce, JSON, AVRO, Teradata, Maven, SOAP, Hadoop 3.0, Oozie 4.3, Zookeeper 3.4, Cassandra 3.0, Sqoop 1.4, Apache NiFi 1.4, ETL, Azure, Hive 2.3, HBase 1.4, Pig 0.17, HDFS 3.1.

Confidential, Phoenix, AZ

Spark/Hadoop Developer

Responsibilities:

  • Worked on Hadoop cluster scaling from 4 nodes in development environment to 8 nodes in pre-production stage and up to 24 nodes in production.
  • Extensively migrated the existing architecture to Spark Streaming to process live streaming data (see the sketch after this list).
  • Involved in writing the shell scripts for exporting log files to Hadoop cluster through automated process.
  • Utilized Agile and Scrum Methodology to help manage and organize a team of developers with regular code review sessions.
  • Exported the analyzed data to the relational databases using Sqoop for visualization and to generate reports for the BI team.
  • Extensively used Hive/HQL queries to search for particular strings in Hive tables stored in HDFS.
  • Worked with multiple teams to provision AWS infrastructure for development and production environments.
  • Gathered the business requirements from the Business Partners and Subject Matter Experts.
  • Created new custom columns depending on the use case while ingesting data into the Hadoop data lake using PySpark.
  • Designed and developed applications in Spark using Scala to compare the performance of Spark with Hive and SQL.
  • Stored data in S3 buckets on AWS cluster on top of Hadoop.
  • Developed time driven automated Oozie workflows to schedule Hadoop jobs.
  • Designed and implemented MapReduce jobs to support distributed processing using Java, Hive, Scala, and Apache Pig.
  • Involved in collecting, aggregating and moving data from servers to HDFS using Apache Flume.
  • Involved in converting Hive/SQL queries into Spark transformations using Spark RDD's.
  • Worked with NoSQL databases like HBase, creating HBase tables to load large sets of semi-structured data.
  • Worked with Apache Nifi for managing the flow of data from sources through automated data flow.
  • Created, modified and executed DDL and ETL scripts for De-normalized tables to load data into Hive and AWS Redshift tables.
  • Implemented a POC to migrate MapReduce programs into Spark transformations using Spark and Scala.
  • Managed and reviewed data backups and Hadoop log files on the Hortonworks cluster.
  • Extracted files from CouchDB through Sqoop, placed them in HDFS, and processed them.
  • Implemented a distributed messaging queue to integrate with Cassandra using Apache Kafka and Zookeeper.
  • Participated in development/implementation of Cloudera Impala Hadoop environment.
  • Used a test-driven approach for developing the application and implemented unit tests using the Python unittest framework.
  • Involved in ad hoc stand-up and architecture meetings to set daily priorities.
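
A minimal sketch of the Spark Streaming work referenced above, consuming a Kafka topic through the spark-streaming-kafka-0-10 direct stream API in Java and counting records per batch. The broker address, topic, and group id are hypothetical.

```java
import java.util.*;
import org.apache.kafka.clients.consumer.ConsumerRecord;
import org.apache.kafka.common.serialization.StringDeserializer;
import org.apache.spark.SparkConf;
import org.apache.spark.streaming.Durations;
import org.apache.spark.streaming.api.java.*;
import org.apache.spark.streaming.kafka010.*;

public class StreamingIngest {
    public static void main(String[] args) throws InterruptedException {
        SparkConf conf = new SparkConf().setAppName("StreamingIngestSketch");
        JavaStreamingContext jssc = new JavaStreamingContext(conf, Durations.seconds(10));

        Map<String, Object> kafkaParams = new HashMap<>();
        kafkaParams.put("bootstrap.servers", "broker1:9092");   // hypothetical broker
        kafkaParams.put("key.deserializer", StringDeserializer.class);
        kafkaParams.put("value.deserializer", StringDeserializer.class);
        kafkaParams.put("group.id", "ingest-group");
        kafkaParams.put("auto.offset.reset", "latest");

        // Direct stream over the hypothetical "events" topic
        JavaInputDStream<ConsumerRecord<String, String>> stream =
                KafkaUtils.createDirectStream(
                        jssc,
                        LocationStrategies.PreferConsistent(),
                        ConsumerStrategies.<String, String>Subscribe(
                                Collections.singletonList("events"), kafkaParams));

        // Count the records arriving in each 10-second batch
        stream.map(ConsumerRecord::value)
              .count()
              .print();

        jssc.start();
        jssc.awaitTermination();
    }
}
```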

Environment: Hadoop 3.0, Spark 2.4, Agile, Sqoop 1.4, Hive 2.3, AWS, PySpark, Scala 2.12, Oozie 4.3, Apache Pig 0.17, NoSQL, MapReduce, Hortonworks

Confidential, New Albany, OH

Hadoop Developer

Responsibilities:

  • Installed and configured Hadoop Ecosystem components and Cloudera manager using CDH distribution.
  • Worked on S3 buckets on AWS to store Cloud Formation Templates and worked on AWS to create EC2 instances.
  • Used Sqoop to import the data from RDBMS to Hadoop Distributed File System (HDFS) and later analyzed the imported data using Hadoop Components
  • Used Oozie scripts for deployment of the application and Perforce as the secure versioning software.
  • Involved in installing and configuring the Hadoop ecosystem and Cloudera Manager using CDH3 and CDH4 distributions.
  • Worked on NoSQL databases for POC purposes in storing images and URIs.
  • Involved in creating Hive tables and applying HiveQL on those tables, which invoked and ran MapReduce jobs automatically.
  • Involved in loading and transforming large sets of structured, semi-structured, and unstructured data, and analyzed them by running Hive queries and Pig scripts.
  • Integrated Kafka with Spark Streaming for high-speed data processing.
  • Implemented POC to migrate MapReduce programs into Spark transformations using Spark and Scala.
  • Developed different kinds of custom filters and handled pre-defined filters on HBase data using the HBase API (see the sketch after this list).
  • Used an Enterprise Data Warehouse database to store the information and make it accessible across the organization.
  • Exported result sets from Hive to MySQL using the Sqoop export tool for further processing.
  • Collected and aggregated large amounts of log data using Flume and staged the data in HDFS for further analysis.
  • Worked with the Avro data serialization system to handle JSON data formats.
  • Involved in creating Hive tables, loading them with data, and writing Hive queries, which run internally as MapReduce jobs.
  • Worked on Spark for in-memory computations and compared DataFrames for optimizing performance.
  • Created web services request-response mappings by importing source and target definition using WSDL file.
  • Developed custom UDFs to generate unique keys for use in Apache Pig transformations.
  • Created conversion scripts using Oracle SQL queries, functions and stored procedures, test cases and plans before ETL migrations.
  • Developed Shell scripts to read files from edge node to ingest into HDFS partitions based on the file naming pattern.
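
A minimal sketch of the HBase filtering referenced above, scanning a table with a pre-defined SingleColumnValueFilter through the HBase 1.x Java client API. The table, column family, and qualifier names are hypothetical.

```java
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.hbase.HBaseConfiguration;
import org.apache.hadoop.hbase.TableName;
import org.apache.hadoop.hbase.client.*;
import org.apache.hadoop.hbase.filter.CompareFilter.CompareOp;
import org.apache.hadoop.hbase.filter.SingleColumnValueFilter;
import org.apache.hadoop.hbase.util.Bytes;

public class HBaseFilterScan {
    public static void main(String[] args) throws Exception {
        Configuration conf = HBaseConfiguration.create();
        try (Connection connection = ConnectionFactory.createConnection(conf);
             Table table = connection.getTable(TableName.valueOf("images"))) {

            // Keep only rows whose cf:status column equals "ACTIVE"
            SingleColumnValueFilter filter = new SingleColumnValueFilter(
                    Bytes.toBytes("cf"), Bytes.toBytes("status"),
                    CompareOp.EQUAL, Bytes.toBytes("ACTIVE"));
            filter.setFilterIfMissing(true);   // skip rows without the column

            Scan scan = new Scan();
            scan.setFilter(filter);

            try (ResultScanner scanner = table.getScanner(scan)) {
                for (Result result : scanner) {
                    System.out.println(Bytes.toString(result.getRow()));
                }
            }
        }
    }
}
```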

Environment: Hadoop 2.3, AWS, Sqoop 1.2, HDFS, Oozie 4.1, Cassandra 3.0, MongoDB 3.5, Hive 2.1, Kafka, Spark 2.1, MapReduce, Scala 2.0, MySQL, Flume, JSON

Confidential, McLean, VA

Java/J2EE Developer

Responsibilities:

  • Applied J2EE Design Patterns such as Factory, Singleton, and Business delegate, DAO, Front Controller Pattern and MVC.
  • Designed Rich user Interface Applications using JavaScript, CSS, HTML and AJAX and developed web services by using SOAP.
  • Implemented modules using Java APIs, Java collection, Threads, XML, and integrating the modules.
  • Successfully installed and configured the IBM WebSphere Application server and deployed the business tier components using EAR file.
  • Designed and developed business components using Session and Entity Beans in EJB.
  • Implemented CORS (Cross-Origin Resource Sharing) using Node.js and developed REST services using Node.js with the Express and Mongoose modules.
  • Used Log4j for Logging various levels of information like error, info, debug into the log files.
  • Developed integrated applications and lightweight components using the Spring framework, using IoC features from Spring Web MVC to configure the application context for the Spring bean factory.
  • Used JDBC prepared statements called from Servlets for database access (see the sketch after this list).
  • Used AngularJS to connect the web application to back-end APIs and used RESTful methods to interact with several APIs.
  • Involved in build/deploy applications using Maven and integrated with CI/CD server Jenkins.
  • Implemented jQuery features to develop dynamic queries to fetch data from the database.
  • Involved in developing module for transformation of files across the remote systems using JSP and servlets.
  • Worked on Development bugs assigned in JIRA for Sprint following agile process.
  • Used ANT scripts to fetch, build, and deploy application to development environment.
  • Extensively used JavaScript to provide users with interactive, speedy, functional, and more usable user interfaces.
  • Implemented MVC architecture using Apache Struts, JSP, and Enterprise JavaBeans.
  • Used AJAX and JSON to make asynchronous calls to the project server to fetch data on the fly.
  • Developed batch programs to update and modify metadata of large number of documents in FileNet Repository using CE APIs
  • Worked on creating a test harness using POJOs which would come along with the installer and test the services every time the installer would be run.
  • Worked on creating Packages, Stored Procedures & Functions in Oracle using PL/SQL and TOAD.
  • Used JNDI to perform lookup services for the various components of the system.
  • Deployed the application and tested on JBoss Application Server. Collaborated with Business Analysts during design specifications.
  • Developed Apache Camel middleware routes, JMS endpoints, and Spring service endpoints, and used Camel FreeMarker to customize REST responses.
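
A minimal sketch of calling the database from a Servlet through a JDBC PreparedStatement, as referenced above. The JNDI name, table, and columns are hypothetical; a container-managed DataSource is assumed.

```java
import java.io.IOException;
import java.sql.Connection;
import java.sql.PreparedStatement;
import java.sql.ResultSet;
import javax.naming.InitialContext;
import javax.naming.NamingException;
import javax.servlet.ServletException;
import javax.servlet.http.*;
import javax.sql.DataSource;

public class AccountLookupServlet extends HttpServlet {

    @Override
    protected void doGet(HttpServletRequest req, HttpServletResponse resp)
            throws ServletException, IOException {
        String accountId = req.getParameter("accountId");   // hypothetical request parameter
        try {
            // Look up the container-managed DataSource bound in JNDI (hypothetical name)
            DataSource ds = (DataSource) new InitialContext().lookup("java:comp/env/jdbc/AppDS");
            try (Connection con = ds.getConnection();
                 PreparedStatement ps = con.prepareStatement(
                         "SELECT account_name, balance FROM accounts WHERE account_id = ?")) {
                ps.setString(1, accountId);                  // bind parameter, avoids SQL injection
                try (ResultSet rs = ps.executeQuery()) {
                    if (rs.next()) {
                        resp.getWriter().printf("%s: %s%n",
                                rs.getString("account_name"), rs.getBigDecimal("balance"));
                    } else {
                        resp.sendError(HttpServletResponse.SC_NOT_FOUND);
                    }
                }
            }
        } catch (NamingException | java.sql.SQLException e) {
            throw new ServletException("Database lookup failed", e);
        }
    }
}
```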

Environment: J2EE, MVC, JavaScript (ES2016), CSS3, HTML5, AJAX, SOAP, Java 7, Jenkins 1.9, Maven, ANT, Apache Struts, Apache Camel

Confidential

Java Developer

Responsibilities:

  • Wrote complex SQL queries and programmed stored procedures, packages, and triggers.
  • Designed HTML prototypes, visual interfaces and interaction of Web-based design.
  • Designed and developed business tiers using EJBs and used Session Beans to encapsulate the business logic.
  • Involved in the configuration of Spring MVC and Integration with Hibernate.
  • Used Eclipse IDE to configure and deploy the application onto WebLogic application server using Maven build scripts to automate the build and deployment process
  • Designed CSS based page layouts that are cross-browser compatible and standards-compliant.
  • Used the Spring framework for dependency injection and JDBC connectivity.
  • Created data sources and connection pools in WebLogic and deployed applications on the server.
  • Developed XML and XSLT pages to store and present data to the user using parsers.
  • Developed RESTful Web services client to consume JSON messages
  • Implemented business logic with POJO using multithreading and design patterns.
  • Created test cases for the DAO layer and service layer using JUnit and tracked bugs using JIRA.
  • Used Struts, Front Controller, and Singleton patterns for developing the Action and Servlet classes.
  • Worked on web-based reporting system with HTML, JavaScript, and JSP.
  • Wrote REST Web Services for sending and getting data from the external interface
  • Implemented the application using the Spring Boot framework and handled security using Spring Security (see the sketch after this list).
  • Researched and evaluated JavaScript frameworks, including AngularJS and Node.js.
  • Developed stored procedures and triggers using PL/SQL to calculate and update the tables to implement business logic
  • Used Subversion (SVN) as the configuration management tool to manage the code repository.
  • Used Maven as the build tool and TortoiseSVN as the source version controller.
  • Used jQuery for basic animation and end-user screen customization purposes.
  • Used GIT as the version control system and performed module- and unit-level testing with JUnit and Log4j.
  • Participated in Unit testing and functionality testing for tracking errors and debugging the code.
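
A minimal sketch of a Spring Boot application with a REST controller secured through Spring Security (HTTP Basic, Spring Security 5.x style), as referenced above. The endpoint and access rules are hypothetical.

```java
import org.springframework.boot.SpringApplication;
import org.springframework.boot.autoconfigure.SpringBootApplication;
import org.springframework.context.annotation.Configuration;
import org.springframework.security.config.annotation.web.builders.HttpSecurity;
import org.springframework.security.config.annotation.web.configuration.EnableWebSecurity;
import org.springframework.security.config.annotation.web.configuration.WebSecurityConfigurerAdapter;
import org.springframework.web.bind.annotation.GetMapping;
import org.springframework.web.bind.annotation.PathVariable;
import org.springframework.web.bind.annotation.RestController;

@SpringBootApplication
public class OrdersApplication {
    public static void main(String[] args) {
        SpringApplication.run(OrdersApplication.class, args);
    }
}

@RestController
class OrderController {
    // Hypothetical endpoint returning a JSON payload
    @GetMapping("/api/orders/{id}")
    public String getOrder(@PathVariable String id) {
        return "{\"orderId\": \"" + id + "\", \"status\": \"OPEN\"}";
    }
}

@Configuration
@EnableWebSecurity
class SecurityConfig extends WebSecurityConfigurerAdapter {
    @Override
    protected void configure(HttpSecurity http) throws Exception {
        // Require authentication for the API and use HTTP Basic
        http.authorizeRequests()
            .antMatchers("/api/**").authenticated()
            .anyRequest().permitAll()
            .and()
            .httpBasic();
    }
}
```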

Environment: SQL, HTML 4, MVC, Hibernate, Eclipse, Maven, CSS2, JSON, JUnit, JavaScript, PL/SQL, jQuery
