Hadoop Developer Resume
Fairfax, VA
SUMMARY:
- A motivated, results-driven professional with 7+ years of experience in software development, including 3+ years of heavy exposure to Big Data/Hadoop. Actively involved in the analysis and implementation of trending Big Data ecosystem and NoSQL technologies across verticals such as Finance, Healthcare, and Insurance.
- Excellent understanding of Hadoop architecture and its components, including HDFS, JobTracker, TaskTracker, NameNode, DataNode, ResourceManager, NodeManager, and the MapReduce programming paradigm.
- Experience in installing, configuring, and troubleshooting Hadoop ecosystem components such as MapReduce, HDFS, Sqoop, Impala, Pig, Flume, Hive, HBase, and ZooKeeper.
- Experience with the Cloudera distributions CDH3, CDH4, and CDH5, as well as MapR.
- Experience in upgrading existing Hadoop clusters to the latest releases.
- Experienced in using NFS (Network File System) for NameNode metadata backup.
- Experience in using Cloudera Manager 4.0 and 5.0 for installation and management of Hadoop clusters.
- Experience in supporting data analysis projects using Elastic MapReduce (EMR) on the Amazon Web Services (AWS) cloud.
- Good knowledge of AWS services such as EMR and EC2, which provide fast, efficient processing of Big Data.
- Good knowledge of Hadoop cluster administration, monitoring, and management using Cloudera Manager.
- Experience exporting data to and importing data from Amazon S3.
- Configured NameNode HA on an existing Hadoop cluster using a ZooKeeper quorum.
- Expertise in UNIX shell scripting using ksh and bash.
- Experienced in developing and implementing MapReduce jobs in Java to process and perform various analytics on large datasets (a representative sketch follows this summary).
- Good experience in writing Pig Latin scripts and Hive queries.
- Good understanding of data structures and algorithms.
- Good experience developing ETL scripts for data cleansing and transformation.
- Experience in designing both time driven and data driven automated workflows using Oozie.
- Experience in supporting analysts by administering and configuring Hive.
- Hands-on programming experience in technologies such as Java, JSP, Servlets, SQL, JDBC, HTML, XML, and UNIX.
- Experience writing SQL queries and working with Oracle and MySQL.
- Expertise in object-oriented analysis and design (OOAD), UML, and the use of various design patterns.
- Experienced working with end users on requirements gathering, user experience, and issue resolution.
- Experience in preparing deployment packages, deploying to Dev and QA environments, and preparing deployment instructions for the Production Deployment Team.
- Team player with excellent analytical, communication and project documentation skills.
- Experienced with Agile methodology and iterative development.
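By way of illustration, the sketch below shows the kind of MapReduce job this summary refers to: a minimal word-count job written in Java against the Hadoop 2.x API. It is a representative sketch only; the class names and the choice of word counting as the analytic are hypothetical, not taken from any engagement below.

```java
import java.io.IOException;
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.IntWritable;
import org.apache.hadoop.io.LongWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Job;
import org.apache.hadoop.mapreduce.Mapper;
import org.apache.hadoop.mapreduce.Reducer;
import org.apache.hadoop.mapreduce.lib.input.FileInputFormat;
import org.apache.hadoop.mapreduce.lib.output.FileOutputFormat;

// Counts occurrences of each word in the input; a stand-in for the
// analytics jobs described in this summary.
public class WordCount {

  public static class TokenMapper
      extends Mapper<LongWritable, Text, Text, IntWritable> {
    private static final IntWritable ONE = new IntWritable(1);
    private final Text word = new Text();

    @Override
    protected void map(LongWritable key, Text value, Context context)
        throws IOException, InterruptedException {
      for (String token : value.toString().split("\\s+")) {
        if (!token.isEmpty()) {
          word.set(token);
          context.write(word, ONE);   // emit (word, 1) for each token
        }
      }
    }
  }

  public static class SumReducer
      extends Reducer<Text, IntWritable, Text, IntWritable> {
    @Override
    protected void reduce(Text key, Iterable<IntWritable> values, Context context)
        throws IOException, InterruptedException {
      int sum = 0;
      for (IntWritable v : values) {
        sum += v.get();
      }
      context.write(key, new IntWritable(sum));  // total count per word
    }
  }

  public static void main(String[] args) throws Exception {
    Job job = Job.getInstance(new Configuration(), "word count");
    job.setJarByClass(WordCount.class);
    job.setMapperClass(TokenMapper.class);
    job.setCombinerClass(SumReducer.class);  // pre-aggregate on the map side
    job.setReducerClass(SumReducer.class);
    job.setOutputKeyClass(Text.class);
    job.setOutputValueClass(IntWritable.class);
    FileInputFormat.addInputPath(job, new Path(args[0]));
    FileOutputFormat.setOutputPath(job, new Path(args[1]));
    System.exit(job.waitForCompletion(true) ? 0 : 1);
  }
}
```

Reusing the reducer as a combiner pre-aggregates counts on the map side, a common optimization for jobs of this shape.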
TECHNICAL SKILLS:
Programming/Scripting Languages: Java, C, J2EE, UNIX Shell, Python
Hadoop Ecosystems: Apache Hadoop, HDFS, Hive, Pig, Flume, Impala, Oozie, Zookeeper, HBase and Sqoop.
Operating Systems: Linux, Windows 7, Windows Server 2003, Windows Server 2008.
RDBMS: Oracle, MySQL and SQL Server.
Web/XML Technologies: JSP, Servlets, Applets, HTML, XHTML, CSS, XML, JSON, JavaScript, jQuery
IDE: Eclipse, NetBeans
NoSQL Database: HBase, Cassandra
PROFESSIONAL EXPERIENCE:
Confidential, Fairfax, VA
Hadoop Developer
Responsibilities:
- Responsible for architecting Hadoop clusters with CDH4 on CentOS and managing them with Cloudera Manager.
- Responsible for building scalable distributed data solutions using Hadoop; installed and configured Hive, Pig, Sqoop, Flume, and Oozie on the Hadoop cluster.
- Developed simple to complex MapReduce jobs using Hive and Pig.
- Created several UDFs, UDAFs, and UDTFs for Hive in Java (a minimal UDF sketch follows this list).
- Used Maven extensively to build jar files of MapReduce programs and deployed them to the cluster.
- Optimized MapReduce jobs to use HDFS efficiently through various compression mechanisms.
- Handled importing of data from various data sources, performed transformations using Hive and MapReduce, loaded data into HDFS, and extracted data from MySQL into HDFS using Sqoop.
- Analyzed the data by running Hive queries and Pig scripts to study customer behavior.
- Implemented business logic by writing UDFs in Java and used various UDFs from Piggybank and other sources.
- Used Flume to collect log data from different sources and loaded it into Hive tables, using different SerDes to store the data in JSON, XML, and SequenceFile formats.
- Wrote shell scripts to monitor the Hadoop daemon services and respond to any warning or failure conditions.
- Continuously monitored and managed the Hadoop cluster using Cloudera Manager.
- Worked with application teams to install operating system and Hadoop updates, patches, and version upgrades as required.
- Developed MapReduce programs in Java to process data stored in HDFS.
- Loaded business account and customer details into HDFS from an RDBMS using Sqoop.
- Involved in schema design for the application using Hive and HBase.
- Installed and configured Hadoop MapReduce and HDFS in a non-production environment.
- Coordinated with other teams on quarterly deployments and deployed new functionality to the production environment.
- Techno-functional responsibilities included interfacing with users, identifying functional and technical gaps, estimating, designing custom solutions, development, leading developers, producing documentation, and production support.
- Documented operational problems following standards and procedures, using JIRA as the reporting tool.
- Provided production rollout support, monitoring the solution post go-live and resolving issues discovered by the client and client services teams.
- Worked on analyzing the Hadoop stack and various big data analytic tools, including Pig, Hive, HBase, and Sqoop (a minimal Hive UDF sketch follows this list).
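As a concrete illustration of the Hive UDF work noted above, here is a minimal sketch written against the classic org.apache.hadoop.hive.ql.exec.UDF API that CDH4-era Hive exposed. The function name, class name, and normalization logic are hypothetical.

```java
import org.apache.hadoop.hive.ql.exec.Description;
import org.apache.hadoop.hive.ql.exec.UDF;
import org.apache.hadoop.io.Text;

// Normalizes a string column to lower case with trimmed whitespace.
// Registered in Hive with:
//   ADD JAR udfs.jar;
//   CREATE TEMPORARY FUNCTION normalize_str AS 'NormalizeString';
@Description(name = "normalize_str",
    value = "_FUNC_(str) - returns str lower-cased and trimmed")
public class NormalizeString extends UDF {
  // Hive resolves evaluate() by reflection on the argument types.
  public Text evaluate(Text input) {
    if (input == null) {
      return null;  // preserve SQL NULL semantics
    }
    return new Text(input.toString().trim().toLowerCase());
  }
}
```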
Environment: Hadoop, HDFS, Hive, Pig, Sqoop, HBase, Hue, Linux, MapReduce, Cloudera distribution of Hadoop, and Flume.
Confidential, San Mateo, CA
Hadoop Developer
Responsibilities:
- Worked on setting up a 100-node Hadoop cluster for the production environment.
- Upgraded production Hadoop clusters from CDH4U1 to CDH5.2 and Cloudera Manager 4.x to 5.1.
- Migrated production Hadoop clusters from MRv1 to YARN, including application migration.
- Involved in the full project life cycle: design, analysis, logical and physical architecture modeling, development, implementation, and testing.
- Responsible for managing data coming from different sources; involved in HDFS maintenance and loading of structured and unstructured data.
- Developed MapReduce programs to parse the raw data and store the refined data in tables.
- Designed and modified database tables and used HBase queries to insert and fetch data from tables.
- Involved in moving all log files generated from various sources to HDFS for further processing through Flume.
- Involved in loading and transforming large sets of structured, semi-structured, and unstructured data from relational databases into HDFS using Sqoop imports.
- Developed Pig Latin scripts to extract data from web server output files and load it into HDFS.
- Created Hive tables, loaded data, and wrote Hive queries that run internally as MapReduce jobs.
- Used Oozie Operational Services for batch processing and scheduling workflows dynamically.
- Populated HDFS and Cassandra with large volumes of data using Apache Kafka (a producer sketch follows this list).
- Hands-on experience in application development using Java, RDBMS, and Linux shell scripting.
- Involved in fetching brand data from social media applications such as Facebook and Twitter.
- Created a complete processing engine, based on Cloudera's distribution, tuned for performance.
- Analyzed web log data using HiveQL to extract the number of unique visitors per day, page views, and visit duration on the website.
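To illustrate the Kafka ingestion noted above, here is a minimal producer sketch. It uses the current Kafka Java client (org.apache.kafka.clients.producer) for clarity, although clusters of the CDH4/CDH5 era ran the older 0.8 client; the broker address, topic name, and record contents are hypothetical.

```java
import java.util.Properties;
import org.apache.kafka.clients.producer.KafkaProducer;
import org.apache.kafka.clients.producer.Producer;
import org.apache.kafka.clients.producer.ProducerConfig;
import org.apache.kafka.clients.producer.ProducerRecord;

// Publishes log lines to a Kafka topic from which downstream consumers
// load HDFS and Cassandra.
public class LogProducer {
  public static void main(String[] args) {
    Properties props = new Properties();
    props.put(ProducerConfig.BOOTSTRAP_SERVERS_CONFIG, "broker1:9092");
    props.put(ProducerConfig.KEY_SERIALIZER_CLASS_CONFIG,
        "org.apache.kafka.common.serialization.StringSerializer");
    props.put(ProducerConfig.VALUE_SERIALIZER_CLASS_CONFIG,
        "org.apache.kafka.common.serialization.StringSerializer");

    // try-with-resources closes the producer, flushing buffered records.
    try (Producer<String, String> producer = new KafkaProducer<>(props)) {
      // Keying by source host keeps one host's records in one partition,
      // preserving their order for downstream consumers.
      producer.send(new ProducerRecord<>("web-logs", "host-01",
          "127.0.0.1 - GET /index.html 200"));
    }
  }
}
```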
Environment: Hadoop, HDFS, Hive, Pig, Sqoop, HBase, Hue, Linux, MapReduce, Oozie, Apache Kafka, Cassandra, and Flume.
Confidential, Englewood, CO
Hadoop Admin/ Developer
Responsibilities:
- Loaded customer data such as service installations, technical help line calls, and interactions from the Confidential web site into HDFS using Flume.
- Implemented a 100-node CDH4 Hadoop cluster on Red Hat Linux using Cloudera Manager.
- Optimized MapReduce jobs to use HDFS efficiently through various compression mechanisms.
- Developed simple to complex MapReduce jobs using Hive and Pig.
- Handled importing of data from various data sources, performed transformations using Hive and MapReduce, loaded data into HDFS, and migrated data from MySQL to HDFS using Sqoop.
- Analyzed the data by performing Hive queries and running Pig scripts to study customer behavior.
- Implemented business logic by writing UDFs in Java and used various UDFs from Piggybank and other sources.
- Exported the analyzed data to relational databases using Sqoop for visualization and to generate reports for the BI team.
- Continuously monitored and managed the Hadoop cluster using Cloudera Manager.
- POC:
- Set up Amazon Web Services (AWS) to evaluate whether Hadoop was a feasible solution.
- Set up a Hadoop cluster using EMR (Elastic MapReduce), AWS's managed Hadoop framework, running on EC2.
- Used Maven extensively to build MapReduce jar files and deployed them to AWS EC2 virtual servers in the cloud.
- Used an S3 bucket to store the jars and input datasets, and used DynamoDB to store the processed output (an S3 staging sketch follows this list).
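The sketch below illustrates the S3 staging step described in this POC, using the AWS SDK for Java v1. The region, bucket name, object keys, and file paths are hypothetical.

```java
import java.io.File;
import com.amazonaws.regions.Regions;
import com.amazonaws.services.s3.AmazonS3;
import com.amazonaws.services.s3.AmazonS3ClientBuilder;

// Stages a MapReduce job jar and an input dataset in S3 so an EMR
// cluster can pick them up.
public class S3Stager {
  public static void main(String[] args) {
    // Credentials are resolved from the default provider chain
    // (environment variables, profile, or instance role).
    AmazonS3 s3 = AmazonS3ClientBuilder.standard()
        .withRegion(Regions.US_EAST_1)   // region is an assumption
        .build();

    // Bucket, keys, and local paths are hypothetical.
    s3.putObject("poc-hadoop-bucket", "jars/analytics-job.jar",
        new File("target/analytics-job.jar"));
    s3.putObject("poc-hadoop-bucket", "input/dataset.csv",
        new File("data/dataset.csv"));
  }
}
```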
Environment: CDH4, Cloudera Manager, MapReduce, HDFS, Hive, Pig, HBase, Flume, MySQL, Sqoop, Oozie, AWS.
Confidential
Java developer
Responsibilities:
- Identified system requirements and developed system specifications; responsible for high-level design and development of use cases.
- Involved in designing database connections using JDBC (a JDBC sketch follows this list).
- Involved in the design and development of the UI using HTML, JavaScript, and CSS.
- Developed, coded, tested, debugged, and deployed JSPs and Servlets for the input and output forms on the web browsers.
- Created Java Beans accessed from JSPs to transfer data across tiers.
- Modified the database using SQL, PL/SQL, stored procedures, triggers, and views in Oracle 9i.
- Experience in working through the bug queue, analyzing and fixing bugs, and escalating bugs.
- Involved in significant customer interaction, resulting in stronger customer relationships.
- Responsible for working with other developers across the globe on implementation of common solutions.
- Involved in Unit Testing.
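A minimal sketch of the JDBC access pattern behind these JSP/Servlet forms appears below. The connection URL, credentials, table, and class name are hypothetical, and the try-with-resources syntax shown requires a newer JDK than the JDK 1.6 listed in the environment.

```java
import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.PreparedStatement;
import java.sql.ResultSet;
import java.sql.SQLException;

// Looks up a customer record through plain JDBC, as used behind the
// JSP/Servlet input and output forms described above.
public class CustomerDao {
  // Connection details hardcoded for brevity only.
  private static final String URL = "jdbc:oracle:thin:@//dbhost:1521/ORCL";

  public String findCustomerName(int customerId) throws SQLException {
    String sql = "SELECT name FROM customers WHERE id = ?";
    try (Connection conn = DriverManager.getConnection(URL, "app_user", "secret");
         PreparedStatement ps = conn.prepareStatement(sql)) {
      ps.setInt(1, customerId);             // bind the parameter safely
      try (ResultSet rs = ps.executeQuery()) {
        return rs.next() ? rs.getString("name") : null;
      }
    }
  }
}
```

Using a PreparedStatement with bound parameters avoids string concatenation in the SQL and the injection risk that comes with it.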
Environment: Tomcat Web Server, JDK 1.6, Servlets, JDBC, JSP, Oracle 9i, HTML, DHTML, XML, CSS, JavaScript, Windows.
Confidential
Java Developer
Responsibilities:
- Developed the user interface screens using Swing for accepting various system inputs such as contractual terms, monthly data pertaining to production, inventory and transportation.
- Involved in designing Database Connections using JDBC.
- Involved in design and Development of UI using HTML, JavaScript and CSS.
- Involved in creating tables and stored procedures in SQL for data manipulation and retrieval using SQL Server 2000; modified the database using SQL, PL/SQL, stored procedures, triggers, and views in Oracle.
- Developed the business components (in core Java) used for the calculation module (calculating various entitlement attributes).
- Involved in the logical and physical database design and implemented it by creating suitable tables, views and triggers.
- Created the related procedures and functions used by JDBC calls in the above components (a CallableStatement sketch follows this list).
- Involved in fixing bugs and minor enhancements for the front-end modules.
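As an illustration of the stored-procedure work above, here is a minimal sketch of a JDBC CallableStatement invoking such a procedure. The procedure name, parameters, and connection details are hypothetical.

```java
import java.sql.CallableStatement;
import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.SQLException;
import java.sql.Types;

// Invokes a stored procedure that computes an entitlement amount,
// mirroring the JDBC-to-stored-procedure calls described above.
public class EntitlementClient {
  public static double computeEntitlement(int contractId) throws SQLException {
    // Connection details hardcoded for brevity only.
    String url = "jdbc:sqlserver://dbhost:1433;databaseName=entitlements";
    try (Connection conn = DriverManager.getConnection(url, "app_user", "secret");
         CallableStatement cs = conn.prepareCall("{call compute_entitlement(?, ?)}")) {
      cs.setInt(1, contractId);                     // IN: contract to evaluate
      cs.registerOutParameter(2, Types.DECIMAL);    // OUT: computed amount
      cs.execute();
      return cs.getDouble(2);
    }
  }
}
```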
Environment: JDK 1.3, Swing, JDBC, JavaScript, HTML, Resin, SQL Server 2000, TextPad, Toad, MS Visual SourceSafe, Windows 2000, HP-UX.