Hadoop Developer/Big Data Developer Resume
Jacksonville, FL
SUMMARY
- Over 8 years of IT experience in the analysis, implementation, and testing of enterprise-wide applications, data warehouses, client-server technologies, and web-based applications.
- Over 6 years of experience developing with Apache Spark for data transformation and processing.
- Over 6 years of experience in administrative tasks such as multi-node Hadoop installation and maintenance.
- Experience in deploying Hadoop 2.0 (YARN) and administering HBase, Hive, Sqoop, HDFS, and MapR.
- Installed, configured, supported, and managed Apache Ambari on Hortonworks Data Platform 2.5 and Cloudera Distribution Hadoop 5.x across Linux, Rackspace, and AWS cloud infrastructure.
- Understand the security requirements for Hadoop and its integration with Kerberos infrastructure.
- Good knowledge of Kerberos security; maintained clusters by adding and removing nodes.
- Hands-on experience in Linux administration activities on RHEL and CentOS.
- Experience in extracting, transforming and loading (ETL) data with Hadoop and Spark
- Experience in minor and major upgrades of Hadoop and the Hadoop ecosystem.
- Monitored Hadoop clusters using tools like Nagios, Ganglia, Ambari, and Cloudera Manager.
- Hadoop cluster capacity planning, performance tuning, monitoring, and troubleshooting.
- Involved in benchmarking Hadoop/HBase cluster file systems with various batch jobs and workloads.
- Set up Linux environments: passwordless SSH, file system creation, firewall configuration, and Java installation.
- Set up MySQL master-slave replication and helped applications maintain their data in MySQL servers.
- Experienced in job scheduling using the Fair, Capacity, and FIFO schedulers, and in inter-cluster data copying with the DistCp tool.
- Hands-on experience in analyzing log files for Hadoop ecosystem services and finding root causes.
- Experience in AWS cloud administration; actively involved in building highly available, scalable, cost-effective, and fault-tolerant systems using multiple AWS services.
- Project work involving file transmission and electronic data interchange; trade capture, verification, processing, and routing operations; banking report generation; and operational management.
- Experience working with Hadoop clusters and integration with ecosystem components such as Hive, HBase, Pig, Sqoop, Spark, Oozie, and Flume.
- Experienced in AWS CloudFront, including creating and managing distributions to provide access to S3 buckets or HTTP servers running on EC2 instances.
- Good working knowledge of Vertica DB architecture, column orientation and High Availability.
- Performed systems analysis for several information systems, identifying and documenting performance and administrative bottlenecks.
PROFESSIONAL EXPERIENCE
HADOOP DEVELOPER/BIG DATA DEVELOPER
Confidential, Jacksonville, FL
Responsibilities:
- Work on a large-scale team as the sole Hadoop/big data developer supporting data transformations using Spark
- Work with DB2, PostgreSQL, and MySQL databases to source healthcare data for Spark jobs
- Building and maintaining a daily incremental load program for delta data (a sketch of this pattern follows this section)
- Involved in data validation for Spark transformation jobs
- Writing Spark code in Scala, using Gradle 4.6 and the Eclipse Scala IDE to package programs into JARs
- Hands-on work with JSON, Avro, Parquet, and CSV files
- Developed business relationships and integrated with other IT departments to ensure successful implementation and support of project efforts
- Building data pipelines and data marts to move and store customer marketing data
- Developing ETL jobs to extract data for business analysis and customer user experience
- Collaborating with Business System Analysts to receive correct mapping requirements for code
- Collaborating and communicating with QA team to properly test and deploy code
- Communicating with business to transfer business logic needs into actual code
- Provided scope, sizing, and time estimates for completing code, and provided information to the project manager as input to the project plan
- Determined impacts and integration points and participated in capacity planning with project manager
- Created application design specification documents from which code will be written
- Interfaced with external vendors and customers through crosswalk mapping across different, often complex architectures
- Created and modified code for moderately complex system design that may span platforms
- Ensured code complied with architectural and SDLC standards
- Responded to and resolved production support issues with programs
- Developing in an agile methodology with daily scrums and bi-weekly sprints
Environment: Hadoop, Hortonworks, Spark, Hive, HBase, Eclipse Scala IDE (Gradle), SQL, WinSCP, PuTTY, Unix, MySQL, RDBMS
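A minimal sketch of the daily incremental (delta) load pattern described above, in Spark/Scala. The source table, timestamp column, JDBC URL, and HDFS paths are illustrative assumptions, not details from the original project:

```scala
import org.apache.spark.sql.SparkSession
import org.apache.spark.sql.functions.current_date

object DailyDeltaLoad {
  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder().appName("DailyDeltaLoad").getOrCreate()

    // High-water mark from the previous run; in practice this would come
    // from a control table or checkpoint file (passed as an argument here).
    val lastRunTs = args(0) // e.g. "2019-06-01 00:00:00"

    // Pull only rows changed since the last run from the source RDBMS.
    val delta = spark.read
      .format("jdbc")
      .option("url", "jdbc:postgresql://dbhost:5432/claims") // illustrative
      .option("dbtable",
        s"(SELECT * FROM member_claims WHERE last_updated > '$lastRunTs') AS d")
      .option("user", sys.env("DB_USER"))
      .option("password", sys.env("DB_PASS"))
      .load()

    // Append the day's delta to the Parquet data mart, partitioned by load date.
    delta.withColumn("load_date", current_date())
      .write
      .mode("append")
      .partitionBy("load_date")
      .parquet("hdfs:///data/marts/member_claims")

    spark.stop()
  }
}
```

Tracking the high-water-mark timestamp in a control table rather than a job argument makes reruns idempotent.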
HADOOP DEVELOPER
Confidential, Westport, CT
Responsibilities:
- Responsible for building scalable distributed data solutions using Hadoop.
- Installed the OS and administered the Hadoop stack on CDH5 (with YARN), including configuration management, monitoring, debugging, and performance tuning; scripted Hadoop package installation and configuration to support fully automated deployments.
- Installed, configured, and maintained Hortonworks HDP 2.2 using Ambari and manually through the CLI.
- Building and maintaining scalable data pipelines using the Hadoop ecosystem and other open source components like Hive and HBase.
- Involved in installing and configuring Hadoop MapReduce and HDFS; developed multiple MapReduce jobs in Java for data cleaning and preprocessing.
- Importing and exporting data into HDFS and Hive using Sqoop and Flume.
- Monitoring data streaming between web sources and HDFS through monitoring tools (a streaming-ingestion sketch follows this section).
- Day-to-day operational support of our Cloudera Hadoop clusters in lab and production, at multi-petabyte scale.
- Involved in creating a Spark cluster in HDInsight by provisioning Azure compute resources with Spark installed and configured.
- Setting up automated processes to analyze system and Hadoop log files for predefined errors and send alerts to the appropriate groups; excellent working knowledge of SQL and databases.
- Commissioning and decommissioning of DataNodes in the cluster in case of problems.
- Setting up automated processes to archive/clean unwanted data on the cluster, in particular on the NameNode and Secondary NameNode.
- Regular discussions with other technical teams regarding upgrades, process changes, special processing, and feedback.
- Handled Azure Storage, including Blob Storage and File Storage, and set up Azure CDN and load balancers.
- Involved in analyzing system failures, identifying root causes, and recommending courses of action; documented system processes and procedures for future reference.
- Worked with systems engineering team to plan and deploy new Hadoop environments and expand existing Hadoop clusters.
- Enabled the processing, management, storage, and analysis of data using a data fabric.
- Leveraged the data with machine learning algorithms.
Environment: Hadoop, Confluent Kafka, Hortonworks HDF, HDP, NiFi, Linux, Redshift, Splunk, YARN, Cloudera 5.13, Spark, Tableau, Microsoft Azure, data fabric, data mesh.
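A minimal sketch of the web-source-to-HDFS streaming ingestion referenced above, using Spark Structured Streaming against the Confluent Kafka listed in this environment. The broker addresses, topic name, and paths are illustrative assumptions:

```scala
import org.apache.spark.sql.SparkSession

object KafkaToHdfs {
  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder().appName("KafkaToHdfs").getOrCreate()

    // Read a continuous stream from Kafka (brokers and topic are illustrative).
    val stream = spark.readStream
      .format("kafka")
      .option("kafka.bootstrap.servers", "broker1:9092,broker2:9092")
      .option("subscribe", "web.events")
      .option("startingOffsets", "latest")
      .load()

    // Kafka values arrive as bytes; cast to string before landing on HDFS.
    val events = stream.selectExpr("CAST(value AS STRING) AS event")

    // Land micro-batches as Parquet on HDFS, with a checkpoint for recovery.
    val query = events.writeStream
      .format("parquet")
      .option("path", "hdfs:///data/raw/web_events")
      .option("checkpointLocation", "hdfs:///checkpoints/web_events")
      .start()

    query.awaitTermination()
  }
}
```

The checkpoint location lets the query resume from its last committed offsets after a restart.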
HADOOP DEVELOPER/ADMIN
Confidential, New York, NY
Responsibilities:
- Worked on analyzing Cloudera Hadoop and Hortonworks clusters and different big data analytic tools, including Pig, Hive, and Sqoop
- Installed and configured a CDH cluster, using Cloudera Manager for easy management of the existing Hadoop cluster.
- Installed/Configured/Maintained Apache Hadoop clusters for application development and Hadoop tools like Hive, Pig, HBase, Zookeeper and Sqoop.
- Setting up machines with network control, static IPs, disabled firewalls, and swap memory.
- Working on setting up a 100-node production cluster and a 40-node backup cluster at two different data centers
- Performance tuning and managing growth of the OS, disk usage, and network traffic
- Responsible for building scalable distributed data solutions using Hadoop.
- Analyze latest Big Data Analytic technologies and their innovative applications in both business intelligence analysis and new service offerings.
- Experienced in managing and reviewing Hadoop log files.
- Using Pig built-in functions to convert fixed-width files to delimited files (a Spark equivalent is sketched after this section).
- Involved in scheduling the Oozie workflow engine to run multiple Hive and Pig jobs
- Responsible for cluster maintenance: adding and removing cluster nodes, cluster monitoring and troubleshooting, and managing and reviewing data backups and Hadoop log files.
- Managed datasets using pandas data frames and MySQL; queried MySQL databases from Python using the Python MySQL connector (MySQLdb) package to retrieve information.
- Used Django configuration to manage URLs and application parameters.
- Created Oozie workflows to run multiple MR, Hive, and Pig jobs.
- Set up Azure Content Delivery Network (CDN), Azure DNS, Load Balancer, and DDoS Protection in the environment.
- Experience in implementing DAG and high availability.
- Experience with several migration tools, such as IdFix, the On-Ramp tool, Microsoft Remote Connectivity Analyzer, Microsoft network bandwidth analyzer, and SCCM.
- Worked on data fabrics covering multiple sources of data: in the cloud, on premises, at the edge, and in other storage locations.
- Designed for secure and reliable access to data irrespective of the storage location by using data fabrics.
Environment: Hadoop, HDFS, Pig, Sqoop, Shell Scripting, Ubuntu, Red Hat Linux, Spark, Scala, Hortonworks, Cloudera Manager, Apache YARN, Python, Machine Learning, Microsoft Azure.
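The fixed-width conversion above was done with Pig built-ins; since this environment also includes Spark and Scala, here is a minimal Spark/Scala sketch of the same transformation. The column layout and paths are illustrative assumptions:

```scala
import org.apache.spark.sql.SparkSession

object FixedWidthToDelimited {
  // Illustrative layout: (fieldName, startOffset, length) for each column.
  val layout = Seq(("id", 0, 8), ("name", 8, 20), ("amount", 28, 10))

  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder().appName("FixedWidthToDelimited").getOrCreate()

    // Slice each fixed-width line into fields, then join with a delimiter.
    val delimited = spark.sparkContext
      .textFile("hdfs:///data/in/fixed_width.txt")
      .map { line =>
        layout.map { case (_, start, len) =>
          line.slice(start, start + len).trim
        }.mkString("|")
      }

    delimited.saveAsTextFile("hdfs:///data/out/delimited")
    spark.stop()
  }
}
```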
HADOOP DEVELOPER/ADMIN
Confidential, McLean, VA
Responsibilities:
- Launched and configured Amazon EC2 cloud instances and S3 buckets using AWS, Ubuntu Linux, and RHEL
- Installed applications on AWS EC2 instances and configured storage on S3 buckets
- Implemented and maintained monitoring and alerting of production and corporate servers/storage using AWS CloudWatch.
- Developed Pig scripts to transform the raw data into intelligent data as specified by business users.
- Worked in AWS environment for development and deployment of Custom Hadoop Applications.
- Worked closely with the data modelers to model the new incoming data sets.
- Involved in the end-to-end process of Hadoop jobs that used various technologies such as Sqoop, Pig, Hive, MapReduce, Spark, and shell scripts (for scheduling a few jobs).
- Expertise in designing and deploying Hadoop clusters and different big data analytic tools, including Pig, Hive, Oozie, Zookeeper, Sqoop, Flume, Spark, Impala, and Cassandra, with the Hortonworks distribution.
- Involved in creating Hive and Pig tables, loading data, and writing Hive queries and Pig scripts
- Explored Spark to improve the performance and optimization of existing algorithms in Hadoop using Spark Context, Spark SQL, Data Frames, pair RDDs, and Spark on YARN.
- Developed Spark code using Scala and Spark SQL/Streaming for faster testing and processing of data; configured, deployed, and maintained multi-node Dev and Test Kafka clusters.
- Performed transformation, cleaning, and filtering of imported data using Hive and MapReduce, and loaded the final data into HDFS.
- Imported data from different sources such as HDFS and HBase into Spark RDDs (a sketch of this pattern follows this section).
- Developed a data pipeline using Kafka and Storm to store data into HDFS.
- Performed real time analysis on the incoming data.
- Used Spark Streaming to divide streaming data into batches as input to the Spark engine for batch processing.
Environment: Apache Hadoop, HDFS, MapReduce, Sqoop, Flume, Pig, Hive, HBase, Oozie, Scala, Spark, Linux.
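A minimal sketch of the import-clean-load pattern above: reading raw records from HDFS into a Spark RDD, filtering malformed rows, and writing the cleaned data back to HDFS. The paths and the four-field, comma-delimited record format are illustrative assumptions:

```scala
import org.apache.spark.{SparkConf, SparkContext}

object CleanAndLoad {
  def main(args: Array[String]): Unit = {
    val sc = new SparkContext(new SparkConf().setAppName("CleanAndLoad"))

    // Load raw comma-delimited records from HDFS into an RDD.
    val raw = sc.textFile("hdfs:///data/raw/transactions")

    // Keep only well-formed rows: correct field count and a numeric amount.
    val cleaned = raw
      .map(_.split(",", -1))
      .filter(f => f.length == 4 && f(3).matches("""-?\d+(\.\d+)?"""))
      .map(_.mkString(","))

    cleaned.saveAsTextFile("hdfs:///data/clean/transactions")
    sc.stop()
  }
}
```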
SDET
Confidential, New York, NY
Responsibilities:
- Responsible for the implementation, ongoing setup, and administration of Hadoop infrastructure
- Analyzed technical and functional requirements documents, and designed and developed the QA test plan, test cases, and test scenarios, maintaining the end-to-end process flow.
- Developed testing scripts for an internal brokerage application utilized by branch and financial market representatives to recommend and manage customer portfolios, including international and capital markets.
- Designed and developed smoke and regression automation scripts, and automated the functional testing framework for all modules using Selenium WebDriver.
- Configured Selenium WebDriver, TestNG, Maven, Cucumber, and a BDD framework, and created Selenium automation scripts in Java using TestNG.
- Performed data-driven testing by developing a Java-based library to read test data from Excel and properties files (a sketch of this pattern follows this section).
- Extensively performed DB2 database testing to validate trade entries from the mainframe to the backend system.
- Developed a data-driven framework with Java, Selenium WebDriver, and Apache POI, used to perform multiple trade order entries.
- Developed an internal application using Angular.js and Node.js connecting to Oracle on the backend.
- Expertise in debugging issues in the front end of web-based applications developed using HTML5, CSS3, AngularJS, Node.js, and Java.
- Applied various testing techniques in test cases to cover all business scenarios for quality coverage.
- Executed system, integration, and regression tests in the testing environment.
- Conducted defect triage meetings and defect root cause analysis; tracked defects in HP ALM Quality Center, managed defects by following up on open items, and retested defects with regression testing.
- Provided QA/UAT sign-off after closely reviewing all test cases in Quality Center, along with receiving policy sign-off for the project.
Environment: HP ALM, Selenium WebDriver, JUnit, Cucumber, AngularJS, Node.js, Jenkins, GitHub, Windows, UNIX, Agile, MS SQL, IBM DB2, PuTTY, WinSCP, FTP Server, Notepad++, C#, DB Visualizer.
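A minimal sketch of the data-driven Selenium/TestNG pattern described above, reading trade rows from Excel with Apache POI. The project scripts were written in Java; this sketch is in Scala for consistency with the other examples here, and the page URL, element IDs, and spreadsheet layout are illustrative assumptions:

```scala
import java.io.FileInputStream
import org.apache.poi.xssf.usermodel.XSSFWorkbook
import org.openqa.selenium.{By, WebDriver}
import org.openqa.selenium.chrome.ChromeDriver
import org.testng.Assert
import org.testng.annotations.{AfterClass, BeforeClass, DataProvider, Test}

class TradeEntryTest {
  private var driver: WebDriver = _

  @BeforeClass
  def setUp(): Unit = {
    driver = new ChromeDriver() // assumes chromedriver is on the PATH
  }

  // Reads (symbol, quantity) rows from the first sheet; row 0 is a header.
  @DataProvider(name = "trades")
  def trades(): Array[Array[AnyRef]] = {
    val wb = new XSSFWorkbook(new FileInputStream("testdata/trades.xlsx"))
    val sheet = wb.getSheetAt(0)
    val rows = (1 to sheet.getLastRowNum).map { i =>
      val row = sheet.getRow(i)
      // Both cells are assumed to be stored as text in the sheet.
      Array[AnyRef](row.getCell(0).getStringCellValue,
                    row.getCell(1).getStringCellValue)
    }.toArray
    wb.close()
    rows
  }

  // One trade-entry run per spreadsheet row.
  @Test(dataProvider = "trades")
  def enterTrade(symbol: String, quantity: String): Unit = {
    driver.get("https://test.example.com/trade-entry") // illustrative URL
    driver.findElement(By.id("symbol")).sendKeys(symbol)
    driver.findElement(By.id("quantity")).sendKeys(quantity)
    driver.findElement(By.id("submit")).click()
    Assert.assertTrue(driver.findElement(By.id("status")).getText.contains("Accepted"))
  }

  @AfterClass
  def tearDown(): Unit = driver.quit()
}
```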