Sr. Big Data Developer Resume Rensselaer, NY - Hire IT People

SUMMARY:

Around 8 years of experience in Development, Testing, Implementation, Maintenance and Enhancements on various IT Projects and experience in Big Data in implementing end - to-end Hadoop solutions.
Experience with all stages of the SDLC and Agile Development model right from the requirement gathering to Deployment and production support.
Good Knowledge in Amazon AWS concepts like EMR and EC2 web services which provides fast and efficient processing
Extensive experience in installing, configuring and using Big Data ecosystem components like Hadoop MapReduce, HDFS, Sqoop, Pig, Hive, Impala, Spark and Zookeeper
Expertise in using J2EE application servers such as IBM Web Sphere, JBoss and web servers like Apache Tomcat.
Experience in different Hadoop distributions like Cloudera (CDH3 & CDH4) and Horton Works Distributions (HDP).
Experience in developing web services with XML based protocols such as SOAP, Axis, UDDI and WSDL.
Solid understanding of Hadoop MRV1 and Hadoop MRV2 (or) YARN Architecture.
Very well verse in writing and deploying Oozie Workflows and Coordinators.
Highly skill in integrating Amazon Kinesis streams with Spark streaming applications to build long running real-time applications.
Good working experience on using Sqoop to import data into HDFS from RDBMS and vice-versa.
Extensive experience in Extraction, Transformation and Loading (ETL) of data from multiple sources into Data Warehouse and Data Mart.
Good knowledge in SQL and PL/SQL to write Stored Procedures and Functions and writing unit test cases using JUnit.
Hands on experience in configuring and working with Flume to load the data from multiple sources directly into HDFS.
In-depth knowledge of handling large amounts of data utilizing Spark Data Frames/Datasets API and Case Classes.
Good knowledge in implementing various data processing techniques using Apache HBase for handling the data and formatting it as required.
Working experience in Impala, Mahout, Spark SQL, Storm, Avro, Kafka, and AWS.
Experience with Java web framework technologies like Apache Camel and Spring Batch.
Experience in version control and source code management tools like GIT, SVN, and Bit Bucket.
Hands on experience working with databases like Oracle, SQL Server and MySQL.
Great knowledge of working with Apache Spark Streaming API on Big Data Distributions in an active cluster environment.
Proficiency in developing secure enterprise Java applications using technologies such as Maven, Hibernate, XML, HTML, CSS Version Control Systems
Developing and implementing Apache NIFI across various environments, written QA scripts in Python for tracking files.
Excellent understanding of Hadoop and underlying framework including storage management.

TECHNICAL SKILLS:

Hadoop/Big Data Technologies: Hadoop 3.0, HDFS, MapReduce, HBase 1.4, Apache Pig, Hive 2.3, Sqoop 1.4, Apache Impala 2.1, Oozie 4.3, Yarn, Apache Flume 1.8, Kafka 1.1, Zookeeper

Cloud Platform: Amazon AWS, EC2, EC3, MS Azure, Azure SQL Database, Azure SQL Data Warehouse, Azure Analysis Services, HDInsight, Azure Data Lake, Data Factory

Programming Language: Java, Scala, Python 3.6, SQL, PL/SQL, Shell Scripting, Storm 1.0, JSP, Servlets

Frameworks: Spring 5.0.5, Hibernate 5.2, Struts 1.3, JSF, EJB, JMS

Web Technologies: HTML, CSS, JavaScript, JQuery 3.3, Bootstrap 4.1, XML, JSON, AJAX

Operating Systems: Linux, Unix, Windows 10/8/7

IDE and Tools: Eclipse 4.7, NetBeans 8.2, IntelliJ, Maven

NoSQL Databases: HBase 1.4, Cassandra 3.11, MongoDB, Accumulo

SDLC Methodologies: Agile, Waterfall

Version Control: GIT, SVN, CVS

WORK EXPERIENCE:

Confidential, Rensselaer, NY

Sr. Big Data Developer

Responsibilities:

Worked on Big Data infrastructure for batch processing as well as real-time processing. Responsible for building scalable distributed data solutions using Hadoop.
Involved with all the phases of Software Development Life Cycle (SDLC) methodologies throughout the project life cycle.
Involved in Agile methodologies, daily scrum meetings, spring planning.
Developed a JDBC connection to get the data from Azure SQL and feed it to a Spark Job.
Configured Spark streaming to receive real time data from the Apache Kafka and store the stream data to HDFS using Scala.
Developed the Sqoop scripts to make the interaction between Hive and vertical Database.
Involved in converting Hive/SQL queries into Spark transformations using Spark RDDs, and Scala.
Primarily involved in Data Migration process using Azure by integrating with Github repository and Jenkins.
Deployed the application in Hadoop cluster mode by using spark submit scripts.
Worked on the large-scale Hadoop YARN cluster for distributed data processing and analysis using Spark, Hive.
Implemented various Hadoop Distribution environments such as Cloudera and Hortonworks.
Experienced of building Data Warehouse in Azure platform using Azure data bricks and data factory
Implemented monitoring and established best practices around usage of elastic search
Worked on Apache Nifi as ETL tool for batch processing and real time processing.
Upgraded the Hadoop Cluster from CDH3 to CDH4, setting up High Availability Cluster and integrating HIVE with existing applications.
Captured the data logs from web server into HDFS using Flume for analysis.
Involved in developing code to write canonical model JSON records from numerous input sources to Kafka Queues.
Built code for real time data ingestion using Java, Map R-Streams (Kafka) and STORM.
Used GitHub as repository for committing code and retrieving it and Jenkins for continuous integration.
Reviewed the HDFS usage and system design for future scalability and fault-tolerance.
Loaded and transformed large sets of structured, semi structured and unstructured data in various formats like text, zip, XML and JSON.
Involved in PL/SQL query optimization to reduce the overall run time of stored procedures.
Developed Python, Shell/Perl Scripts and Power shell for automation purpose and Component unit testing using Azure Emulator.
Built the automated build and deployment framework using Jenkins.

Environment: Hadoop 3.0, Azure, HDFS, Scala 2.12, SQL, Hive 1.2, spark 2.4, Kafka 2.1, YARN, Apache Nifi, ETL, Sqoop 1.4, Jenkins, XML, PL/SQL, Python 3.7, GitHub, Hortonworks, Cloudera

Confidential, Phoenix, AZ

Spark/Hadoop Developer

Responsibilities:

Worked on Hadoop cluster scaling from 4 nodes in development environment to 8 nodes in pre-production stage and up to 24 nodes in production.
Extensively migrated existing architecture to Spark Streaming to process the live streaming data.
Involved in writing the shell scripts for exporting log files to Hadoop cluster through automated process.
Utilized Agile and Scrum Methodology to help manage and organize a team of developers with regular code review sessions.
Exported the analyzed data to the relational databases using Sqoop for visualization and to generate reports for the BI team.
Extensively used Hive/HQL or Hive queries to query or search for a particular string in Hive tables in HDFS.
Worked with multiple teams to provision AWS infrastructure for development and production environments.
Gathered the business requirements from the Business Partners and Subject Matter Experts.
Created custom new columns depending up on the use case while ingesting the data into Hadoop Lake using pyspark.
Designed and developed applications in Spark using Scala to compare the performance of Spark with Hive and SQL.
Stored data in S3 buckets on AWS cluster on top of Hadoop.
Developed time driven automated Oozie workflows to schedule Hadoop jobs.
Designed and implemented map reduce jobs to support distributed processing using java, Hive, Scala and Apache Pig.
Involved in collecting, aggregating and moving data from servers to HDFS using Apache Flume.
Involved in converting Hive/SQL queries into Spark transformations using Spark RDD's.
Worked with NoSQL databases like HBase in creating HBase tables to load large sets of semi-structured.
Worked with Apache Nifi for managing the flow of data from sources through automated data flow.
Created, modified and executed DDL and ETL scripts for De-normalized tables to load data into Hive and AWS Redshift tables.
Implemented POC to migrate MapReduce programs into Spark transformations using and Scala.
Managed and review data backups, Manage and review Hadoop log files Hortonworks Cluster.
Extracted files from CouchDB through Sqoop and placed in HDFS and processed.
Implemented a distributed messaging queue to integrate with Cassandra using Apache Kafka and Zookeeper.
Participated in development/implementation of Cloudera Impala Hadoop environment.
Used Test driven approach for developing the application and Implemented the unit tests using Python Unit Test framework
Involved in Ad Hoc stand up and architecture meetings to set up daily priorities.

Environment: Hadoop 3.0, Spark 2.4, Agile, Sqoop 1.4, Hive 2.3, AWS, pyspark, Scala 2.12, Oozie 4.3, Apache Pig 0.17, NoSQL, NoSQL, MapReduce, Hortonworks

Confidential, New Albany, OH

Hadoop Developer

Responsibilities:

Installed and configured Hadoop Ecosystem components and Cloudera manager using CDH distribution.
Worked on S3 buckets on AWS to store Cloud Formation Templates and worked on AWS to create EC2 instances.
Used Sqoop to import the data from RDBMS to Hadoop Distributed File System (HDFS) and later analyzed the imported data using Hadoop Components
Used Oozie scripts for deployment of the application and perforce as the secure versioning software.
Involved in Installing, Configuring Hadoop Eco-System, Cloudera Manager using CDH3, CDH4 Distributions.
Worked on No-SQL databases for POC purpose in storing images and URIs.
Involved in creating Hive tables, then applied HiveQL on those tables, this will invoke and run MapReduce jobs automatically.
Involved in loading and transforming large sets of Structured, Semi-Structured and Unstructured data and analyzed them by running Hive queries and Pig scripts
Highly skilled in integrating Kafka with Spark streaming for high-speed data processing
Implemented POC to migrate MapReduce programs into Spark transformations using Spark and Scala.
Developed different kind of custom filters and handled pre-defined filters on HBase data using API.
Used Enterprise Data Warehouse database to store the information and to make it access all over organization.
Exported of result set from HIVE to MySQL using Sqoop export tool for further processing.
Collected and aggregating large amounts of log data using Flume and staging data in HDFS for further analysis.
Worked with Avro Data Serialization system to work with JSON data formats.
Involved in creating Hive tables, loading with data and writing hive queries which runs internally in MapReduce way.
Worked on Spark for in memory commutations and comparing the Data Frames for optimizing performance.
Created web services request-response mappings by importing source and target definition using WSDL file.
Developed custom UDF's to generate unique key for the use in Apache pig transformations.
Created conversion scripts using Oracle SQL queries, functions and stored procedures, test cases and plans before ETL migrations.
Developed Shell scripts to read files from edge node to ingest into HDFS partitions based on the file naming pattern.

Environment: Hadoop 2.3, AWS, Sqoop 1.2, HDFS, Oozie 4.1, Cassandra 3.0, MongoDB 3.5, Hive 2.1, Kafka, Spark 2.1, MapReduce, Scala 2.0, MySQL, Flume, JSON

Confidential, McLean, VA

Java/J2EE Developer

Responsibilities:

Applied J2EE Design Patterns such as Factory, Singleton, and Business delegate, DAO, Front Controller Pattern and MVC.
Designed Rich user Interface Applications using JavaScript, CSS, HTML and AJAX and developed web services by using SOAP.
Implemented modules using Java APIs, Java collection, Threads, XML, and integrating the modules.
Successfully installed and configured the IBM WebSphere Application server and deployed the business tier components using EAR file.
Designed and developed business components using Session and Entity Beans in EJB.
Implemented CORS (Cross Origin Resource Sharing) using Node JS and developed REST services using Node and Express, Mongoose modules.
Used Log4j for Logging various levels of information like error, info, debug into the log files.
Developed integrated applications and light weight component using spring framework and IOC features from spring web MVC to configure application context for spring bean factory.
Used JDBC prepared statements to call from Servlets for database access.
Used Angular JS to connect the web application to back-end APIs, used RESTFUL methods to interact with several API's.
Involved in build/deploy applications using Maven and integrated with CI/CD server Jenkins.
Implemented JQuery features to develop the dynamic queries to fetch the data from database.
Involved in developing module for transformation of files across the remote systems using JSP and servlets.
Worked on Development bugs assigned in JIRA for Sprint following agile process.
Used ANT scripts to fetch, build, and deploy application to development environment.
Extensively used JavaScript to provide the users with interactive, Speedy, functional and more usable user interfaces.
Implemented MVC architecture using Apache struts, JSP & Enterprise Java Beans.
Used AJAX and JSON to make asynchronous calls to the project server to fetch data on the fly.
Developed batch programs to update and modify metadata of large number of documents in FileNet Repository using CE APIs
Worked on creating a test harness using POJOs which would come along with the installer and test the services every time the installer would be run.
Worked on creating Packages, Stored Procedures & Functions in Oracle using PL/SQL and TOAD.
Used JNDI to perform lookup services for the various components of the system.
Deployed the application and tested on JBoss Application Server. Collaborated with Business Analysts during design specifications.
Developed Apache Camel middleware routes, JMS endpoints, spring service endpoints and used Camel free marker to customize REST responses.

Environment: J2EE, MVC, JavaScript 2016, CSS3, HTML 5, AJAX, SOAP, Java 7, Jenkins 1.9, Maven, ANT, Apache struts, Apache Camel

Confidential

Java Developer

Responsibilities:

Wrote complex SQL queries and programmed stored procedures, packages, and triggers.
Designed HTML prototypes, visual interfaces and interaction of Web-based design.
Designed and developed a business tiers using EJBs and Used Session Beans to encapsulate the Business Logic.
Involved in the configuration of Spring MVC and Integration with Hibernate.
Used Eclipse IDE to configure and deploy the application onto WebLogic application server using Maven build scripts to automate the build and deployment process
Designed CSS based page layouts that are cross-browser compatible and standards-compliant.
Used spring framework for Dependency Injection and JDBC connectivity.
Created data source and connection pools in Web Logic and deployed applications on the server
Developed XML and XSLT pages to store and present data to the user using parsers.
Developed RESTful Web services client to consume JSON messages
Implemented business logic with POJO using multithreading and design patterns.
Created test cases for DAO Layer and service layer using JUNIT and bug tracking using JIRA.
Used Struts, Front Controller and Singleton patterns, for developing the action and Servlets classes, Involved in
Worked on web-based reporting system with HTML, JavaScript, and JSP.
Wrote REST Web Services for sending and getting data from the external interface
Implemented the application using Spring Boot Framework and handles the security using spring security.
Researched and Executed of JavaScript Frameworks, including Angular JS and Node JS.
Developed stored procedures and triggers using PL/SQL to calculate and update the tables to implement business logic
Used Subversion (SVN) as the configuration management tool to manage the code repository.
Used Maven as the build tool and Tortoise SVN as the Source version controller
Used JQuery for basic animation and end user screen customization purposes.
Used GIT as Version Control System performed Module and Unit Level Testing with JUnit and log4j
Participated in Unit testing and functionality testing for tracking errors and debugging the code.

Environment: SQL, HTML 4, MVC, Hibernate, Eclipse, Maven, CSS2, JSON, JUNIT, JavaScript, PL/SQL, JQuery, JUnit

We provide IT Staff Augmentation Services!

Sr. Big Data Developer Resume

Rensselaer, NY

We'd love your feedback!

Resume Categories

Client Services

Job Seekers

Visa Sponsorship