Hadoop Developer Resume
Washington, DC
SUMMARY
- 8+ years of experience as a Software Engineer with a major focus on Big Data technologies: Hadoop ecosystem, HDFS, MapReduce framework, HBase, Hive, Sqoop, Kafka, Oozie, and Kubernetes, along with Java, Spring, microservices, and web services.
- Experience working with DevOps/AWS practices such as Continuous Integration, Continuous Delivery, Continuous Deployment, automation of configuration management, and security.
- Extensive knowledge of Azure Blob containers, Azure virtual machines, Azure SQL Data Warehouse, Azure SQL Server, and the Azure cloud.
- Involved in all phases of the Software Development Life Cycle (SDLC): requirements gathering, analysis, design, development, testing, production, and post-production support.
- Experience in writing MapReduce programs for analyzing Big Data in different formats, both structured and unstructured.
- Developed MapReduce jobs for business use cases using Java, Pig, and Hive.
- Experience in loading data using Hive and writing scripts for data transformations using Hive and Pig.
- Knowledge and working experience in developing Apache Spark programs using Scala.
- Good knowledge of Spark SQL queries for loading tables into HDFS and running select queries on them.
- Hands-on experience with message brokers such as Apache Kafka.
- Experience in writing workflows using Oozie and automating them with Autosys scheduling.
- Experience in creating Impala views on Hive tables for fast access to data.
- Developed UDFs and used them in Hive queries (a hedged sketch appears at the end of this summary).
- Developed Pig Latin scripts for handling business transformations.
- Implemented Sqoop for transferring large datasets between Hadoop and RDBMSs.
- Experience with Oozie Workflow Engine in running workflow jobs with actions that run Hadoop Map/Reduce and Pig jobs.
- Knowledge of installation and administration of multi-node virtualized clusters using Cloudera Hadoop and Apache Hadoop.
- Experience setting up instances behind Elastic Load Balancer in AWS for high availability.
- Hands-on experience with the infrastructure automation tool Terraform and the serverless automation service AWS Lambda.
- Experience in working with CI/CD pipeline using tools like Jenkins and Chef.
- Worked on Jenkins for continuous integration and end-to-end automation of builds and deployments, managing plugins such as Maven and Ant.
- Wrote Chef cookbooks to automate system operations.
- Hands-on experience in SCM tools like GIT and SVN for merging and branching.
- Knowledge in working with continuous deployment tools like Chef, Puppet, Ansible.
- Hands-on experience with Ansible server and workstation setups to manage deployments.
- Worked on creating Docker containers and Docker consoles for managing the application lifecycle.
- Good understanding of the OpenShift platform for managing Docker containers using Docker Swarm and Kubernetes clusters.
- Good understanding of Pivotal Cloud Foundry (PCF) architecture (Diego architecture), PCF components, and their functionality.
- Created a strategic roadmap and business case for Pivotal Cloud Foundry (PCF) as a PaaS, using DevOps and a Spring/microservices-based architecture on the AWS platform.
- Experience with Pivotal Cloud Foundry (PCF) and the implementation of microservices in PCF, moving applications from development Docker containers into the production Cloud Foundry environment.
- Excellent communication, configuration, and technical documentation skills.
- Ability to work closely with teams to ensure high-quality, timely delivery of builds and releases.
- Excellent relationship management skills & ability to conceive efficient solutions utilizing technology. Industrious individual who thrives on a challenge, working effectively with all levels of management.
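For illustration, a minimal sketch of the Hive UDF work noted above; the class name, column semantics, and registration statement are hypothetical placeholders rather than details of an actual engagement.

import org.apache.hadoop.hive.ql.exec.UDF;
import org.apache.hadoop.io.Text;

// Minimal Hive UDF sketch: trims and lower-cases a string column.
// Registered in Hive with:
//   CREATE TEMPORARY FUNCTION normalize_text AS 'com.example.NormalizeTextUDF';
public class NormalizeTextUDF extends UDF {
    public Text evaluate(Text input) {
        if (input == null) {
            return null;
        }
        return new Text(input.toString().trim().toLowerCase());
    }
}

It would then be invoked as SELECT normalize_text(customer_name) FROM customers; the column and table names here are likewise illustrative.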
TECHNICAL SKILLS:
Big data Technologies: Hadoop, HDFS, Pig, Hive, MapReduce, Cassandra, Kafka.
Configuration Management: Chef, Puppet, Vagrant, Maven, Ansible, Docker, Gradle, Splunk, AWS OpsWorks.
Continuous Integration Tools: NPM, Grunt, Gulp, Jenkins, JIRA.
Web Servers: Apache, Tomcat, WebSphere, Nginx, JBoss.
Databases: Oracle, DB2, MySQL, MongoDB, Microsoft SQL Server.
Scripting Languages: JavaScript, Python, Shell, Bash, C, HTML, PHP.
Build Tools: Ant, Maven, Make, Hudson, Jenkins, Bamboo, AWS CodeDeploy.
Version Control Tools: Subversion (SVN), ClearCase, Git, GitHub, Perforce, AWS CodeCommit.
SDLC: Agile, Scrum.
Web Technologies: HTML, CSS, JavaScript, jQuery, Bootstrap, XML, JSON, XSD, XSL, XPath.
Operating Systems: Red Hat Linux, CentOS, Windows.
PROFESSIONAL EXPERIENCE
Confidential, Washington DC
Hadoop Developer
Responsibilities:
- Involved in importing real-time data into Hadoop using Kafka and implemented Oozie jobs for daily data loads.
- Loaded the data from Teradata to HDFS using Teradata Hadoop connectors.
- Imported data from different sources such as HDFS and HBase into Spark RDDs.
- Developed Spark scripts using Python shell commands as per requirements.
- Issued SQL queries via Impala to process the data stored in HDFS and HBase.
- Used the Spark-Cassandra Connector to load data to and from Cassandra.
- Used a RESTful web services API to connect to MapR tables; the database connection was implemented through this API.
- Involved in developing Hive DDL to create, alter, and drop Hive tables, and worked with Storm and Kafka.
- Worked with application teams to install operating system and Hadoop updates, patches, and version upgrades as required.
- Experience in data migration from RDBMS to Cassandra. Created data-models for customer data using the Cassandra Query Language.
- Responsible for building scalable distributed data solutions using a Hadoop cluster environment with the Hortonworks distribution.
- Experienced in developing Spark scripts for data analysis in both Python and Scala. Designed and developed various modules of the application with J2EE design architecture.
- Implemented modules using core Java APIs and Java collections, and integrated the modules. Experienced in transferring data from different data sources into HDFS using Kafka producers, consumers, and brokers (a hedged consumer sketch follows this list).
- Installed Kibana using Salt scripts and built custom dashboards to visualize important data stored in Elasticsearch.
- Wrote ConfigMap and DaemonSet manifests to install Filebeat on Kubernetes pods and ship log files to Logstash or Elasticsearch, in order to monitor different types of logs in Kibana.
- Created databases in InfluxDB, worked on the interface created for Kafka, and checked measurements in the databases.
- Installed Kafka Manager to track consumer lag and monitor Kafka metrics; also used it for adding topics, partitions, etc.
- Generated consumer group lag reports from Kafka using its API.
- Ran log aggregation, website activity tracking, and distributed-system commit logs using Apache Kafka.
- Involved in creating Hive tables and loading and analyzing data using Hive queries.
- Developed multiple MapReduce jobs in Java for data cleaning and pre-processing.
- Loaded data from different sources (databases and files) into Hive using the Talend tool.
- Created and maintained technical documentation for launching Hadoop clusters and for executing Hive queries and Pig scripts.
- Experienced in using Apache Mesos for running many applications on a dynamically shared pool of nodes.
- Used Oozie and Zookeeper operational services for coordinating cluster and scheduling workflows.
- Developed a Site Reliability Engineering model to monitor and measure container metrics, network metrics, and logging, using the PCF Metrics model for AWS and OpenStack cloud clusters.
- Implemented Flume, Spark, and Spark Streaming framework for real time data processing.
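As a hedged illustration of the Kafka consumer work referenced above, a minimal Java consumer sketch follows; the broker address, group id, and topic name are hypothetical placeholders, not project specifics.

import java.time.Duration;
import java.util.Collections;
import java.util.Properties;
import org.apache.kafka.clients.consumer.ConsumerRecord;
import org.apache.kafka.clients.consumer.ConsumerRecords;
import org.apache.kafka.clients.consumer.KafkaConsumer;

// Minimal Kafka consumer sketch: reads string events from a topic and prints them.
public class LogEventConsumer {
    public static void main(String[] args) {
        Properties props = new Properties();
        props.put("bootstrap.servers", "localhost:9092");
        props.put("group.id", "hdfs-ingest");
        props.put("key.deserializer", "org.apache.kafka.common.serialization.StringDeserializer");
        props.put("value.deserializer", "org.apache.kafka.common.serialization.StringDeserializer");

        try (KafkaConsumer<String, String> consumer = new KafkaConsumer<>(props)) {
            consumer.subscribe(Collections.singletonList("log-events"));
            while (true) {
                ConsumerRecords<String, String> records = consumer.poll(Duration.ofMillis(500));
                for (ConsumerRecord<String, String> record : records) {
                    // In practice, records like these would be landed in HDFS for downstream processing.
                    System.out.printf("offset=%d key=%s value=%s%n",
                            record.offset(), record.key(), record.value());
                }
            }
        }
    }
}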
Confidential, San Jose, CA
Hadoop Java Developer
Responsibilities:
- Involved in the full project life cycle: design, analysis, logical and physical architecture modeling, development, implementation, and testing.
- Developed MapReduce programs to parse raw data and store the refined data in tables (a hedged sketch follows this list).
- Designed and modified database tables and used HBase queries to insert and fetch data from tables.
- Developed algorithms for identifying influencers within specified social network channels.
- Developed and updated social media analytics dashboards on a regular basis.
- Involved in fetching brand data from social media applications such as Facebook and Twitter.
- Performed data mining investigations to find new insights related to customers.
- Involved in forecasting based on current results and insights derived from data analysis.
- Developed a per-domain sentiment analysis system using supervised machine learning.
- Involved in collecting data and identifying data patterns to build trained models using machine learning.
- Responsible for managing data coming from various sources.
- Involved in generating analytics for brand pages.
- Experienced in working with Apache Spark
- Responsible for maintaining and supporting the application.
- Developed and generated insights based on brand conversations, which in turn helped drive brand awareness, engagement, and traffic to social media pages.
- Involved in identifying topics and trends and building context around the brand.
- Developed different formulas for calculating engagement on social media posts.
- Maintained project documentation for the module.
- Involved in identifying and analyzing defects, questionable function errors, and inconsistencies in output.
- Involved in reviewing technical documentation and providing feedback.
- Involved in fixing issues arising out of duration testing.
- Developed metrics graphs using Kibana.
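A hedged sketch of the MapReduce parsing work mentioned above; the input layout (tab-delimited brand and message fields), class names, and paths are illustrative assumptions rather than project specifics.

import java.io.IOException;
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.LongWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Job;
import org.apache.hadoop.mapreduce.Mapper;
import org.apache.hadoop.mapreduce.Reducer;
import org.apache.hadoop.mapreduce.lib.input.FileInputFormat;
import org.apache.hadoop.mapreduce.lib.output.FileOutputFormat;

// Minimal MapReduce sketch: counts posts per brand from raw tab-delimited social feed lines.
public class BrandPostCount {

    public static class ParseMapper extends Mapper<LongWritable, Text, Text, LongWritable> {
        private static final LongWritable ONE = new LongWritable(1);

        @Override
        protected void map(LongWritable key, Text value, Context context)
                throws IOException, InterruptedException {
            String[] fields = value.toString().split("\t");
            if (fields.length >= 2) {
                context.write(new Text(fields[0]), ONE); // brand -> 1
            }
        }
    }

    public static class SumReducer extends Reducer<Text, LongWritable, Text, LongWritable> {
        @Override
        protected void reduce(Text key, Iterable<LongWritable> values, Context context)
                throws IOException, InterruptedException {
            long sum = 0;
            for (LongWritable v : values) {
                sum += v.get();
            }
            context.write(key, new LongWritable(sum));
        }
    }

    public static void main(String[] args) throws Exception {
        Job job = Job.getInstance(new Configuration(), "brand post count");
        job.setJarByClass(BrandPostCount.class);
        job.setMapperClass(ParseMapper.class);
        job.setReducerClass(SumReducer.class);
        job.setOutputKeyClass(Text.class);
        job.setOutputValueClass(LongWritable.class);
        FileInputFormat.addInputPath(job, new Path(args[0]));
        FileOutputFormat.setOutputPath(job, new Path(args[1]));
        System.exit(job.waitForCompletion(true) ? 0 : 1);
    }
}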
Confidential
Software Engineer
Responsibilities:
- Planned, deployed, monitored, and maintained Amazon AWS cloud infrastructure consisting of multiple EC2 nodes and VMware VMs as required in the environment. Used security groups, network ACLs, internet gateways, NAT instances, and route tables to ensure a secure zone for organizations in the AWS public cloud.
- Created monitors, alarms and notifications for EC2 hosts using CloudWatch.
- Implemented and maintained Chef Configuration management spanning several environments in VMware and the AWS cloud.
- Worked on multiple AWS instances, setting up security groups, Elastic Load Balancers, AMIs, and Auto Scaling to design cost-effective, fault-tolerant, and highly available systems.
- Created S3 buckets and managed their policies, and utilized S3 and Glacier for archival storage and backup on AWS (a hedged SDK sketch follows this list).
- Created public and private subnets within the VPC and launched EC2 instances into them based on requirements.
- Used the AWS CLI to automate backups of ephemeral data stores to S3 buckets and EBS, and created nightly AMIs of mission-critical production servers as backups.
- Virtualized servers using Docker for test and development environment needs.
- Wrote Chef cookbooks for various DB configurations to modularize and optimize end-product configuration, converted production support scripts to Chef recipes, and provisioned AWS servers using Chef recipes.
- Well versed in configuring access for inbound and outbound traffic to RDS database services, DynamoDB tables, and EBS volumes, and in setting alarms for notifications or automated actions.
- Expert knowledge of Bash shell scripting and automation of cron jobs.
- Implemented a Git mirror for the SVN repository, enabling users to work with both Git and SVN.
- Implemented Continuous Integration using Jenkins and Git.
- Developed build and deployment scripts using ANT and MAVEN as build tools in Jenkins to move from one environment to other environments.
- Configured and verified connections to RDS databases running on MySQL engines.
- Responsible for plugin management, user management, regular incremental backups, and regular maintenance for recovery.
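The S3 backup work above was done with the AWS CLI and console; purely as a hedged illustration, an equivalent sketch using the AWS SDK for Java v1 is shown below, with the bucket name, object key, and file path as hypothetical placeholders.

import java.io.File;
import com.amazonaws.regions.Regions;
import com.amazonaws.services.s3.AmazonS3;
import com.amazonaws.services.s3.AmazonS3ClientBuilder;

// Minimal S3 backup sketch: creates a bucket if needed and uploads a nightly archive.
public class S3BackupUploader {
    public static void main(String[] args) {
        AmazonS3 s3 = AmazonS3ClientBuilder.standard()
                .withRegion(Regions.US_EAST_1)
                .build();

        String bucket = "nightly-backups-example";
        if (!s3.doesBucketExistV2(bucket)) {
            s3.createBucket(bucket);
        }

        // Upload a dump of an ephemeral data store; archival to Glacier would be handled by lifecycle rules.
        s3.putObject(bucket, "backups/app-data.tar.gz", new File("/tmp/app-data.tar.gz"));
    }
}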
Confidential
Java Developer
Responsibilities:
- Involved in the Object Oriented Analysis and Design using UML including development of class diagrams, Use Case Diagrams, Sequence diagrams, and State Diagrams
- Developed the application using J2EE architecture
- Developed the view pages in JSP, using CSS and validations using Servlets
- Programmed various backend services using Java JDBC for accessing the Oracle database, establishing and reusing database connections and writing stored procedures (a hedged sketch follows this list).
- Used the Struts validation, Struts Custom tags and Tiles Framework in the presentation layer
- Responsible for application build and releases using ANT as an application building tool and deploying the applications on WebLogic
- Involved in end-to-end coding and testing of the system, including writing unit test cases.
- Maintained the code repository using VSS and ClearCase to keep the codebase in sync with other phases of projects running simultaneously.
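As a hedged illustration of the JDBC stored-procedure work above, a minimal sketch follows; the connection URL, credentials, and procedure name (GET_ACCOUNT_STATUS) are hypothetical placeholders rather than details of the actual system.

import java.sql.CallableStatement;
import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.SQLException;
import java.sql.Types;

// Minimal JDBC sketch: calls an Oracle stored procedure with one IN and one OUT parameter.
public class AccountStatusDao {
    public String getAccountStatus(long accountId) throws SQLException {
        String url = "jdbc:oracle:thin:@//dbhost:1521/ORCL";
        try (Connection conn = DriverManager.getConnection(url, "app_user", "secret");
             CallableStatement stmt = conn.prepareCall("{call GET_ACCOUNT_STATUS(?, ?)}")) {
            stmt.setLong(1, accountId);
            stmt.registerOutParameter(2, Types.VARCHAR);
            stmt.execute();
            return stmt.getString(2);
        }
    }
}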