Hadoop Developer Resume
New York, NY
SUMMARY
- 7+ years of professional IT experience, including 3 years of comprehensive experience in Big Data Hadoop development and ecosystem analytics in sectors such as Banking, Insurance and Communication
- Well experienced in analyzing data using custom MapReduce programs in Java, HiveQL and Pig Latin, and in writing custom UDFs to extend Hive and Pig core functionality (a brief illustrative UDF sketch appears at the end of this summary)
- Sound knowledge and hands-on experience with Hadoop architecture and its components, such as HDFS, NameNode, JobTracker, DataNode, TaskTracker, MapReduce, Hive, Pig, Zookeeper and Oozie
- Experienced in using Sqoop and Flume to import and export data between RDBMS and HDFS
- Good understanding of, and well versed in, NoSQL databases like HBase, Cassandra and MongoDB.
- Developed POC (proof of concept) on random data using shell scripts
- Good understanding of Data Mining and Machine Learning techniques
- Experience with workflow engines like Oozie, used to run Hadoop MapReduce and Pig jobs
- Knowledge of administrative tasks such as installing Hadoop and ecosystem components like Hive and Pig
- Experience in installing and using the Cloudera Manager tool to manage Hadoop clusters
- Experience in the integration of various data sources like RDBMS, Spreadsheets, and Text files
- Well experienced in Web and business environments, including the Java platform, Java Servlets, JUnit, JDBC, J2EE and JSP
- Good understanding of XML methodologies (XML, XSL, XSD) including Web Services and SOAP
- Familiar with popular frameworks like Struts, Hibernate, Spring MVC and AJAX
- Well experienced in using application servers like WebLogic and WebSphere, and Java tools in client-server environments
- Work experience with cloud infrastructure like Amazon Web Services (AWS)
- Experience in managing and reviewing Hadoop log files
- Major strengths include familiarity with multiple software systems, the ability to quickly learn new technologies and adapt to new environments, and excellent interpersonal, technical and communication skills as a self-motivated, focused team player
- Strong work ethic and the ability to work efficiently in a team, with good leadership skills
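Below is a minimal sketch of the kind of custom Hive UDF referenced above; the class name and behavior (trimming and lower-casing a string column) are illustrative assumptions, not details of a specific project.

```java
// Minimal Hive UDF sketch (hypothetical example): normalizes a string column.
import org.apache.hadoop.hive.ql.exec.UDF;
import org.apache.hadoop.io.Text;

public final class NormalizeText extends UDF {
    // Hive resolves evaluate() by reflection; passing nulls through keeps NULL semantics.
    public Text evaluate(final Text input) {
        if (input == null) {
            return null;
        }
        return new Text(input.toString().trim().toLowerCase());
    }
}
```

Such a UDF would typically be packaged into a JAR and registered with ADD JAR and CREATE TEMPORARY FUNCTION before being called from HiveQL.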
TECHNICAL SKILLS:
Hadoop/Big Data: HDFS, MapReduce, HBase, Pig, Hive, Sqoop, Flume, Avro and Oozie
NoSQL Databases: HBase, Cassandra, MongoDB
Java & J2EE Technologies: Java Servlets, JUnit, Java Database Connectivity (JDBC), J2EE, JSP
IDEs & Tools: Eclipse, Cygwin, PuTTY
Programming languages: C, C++, Java, Python, Linux shell scripts
Databases: Oracle 11g/10g/9i, MySQL, DB2, MS-SQL Server, Teradata
Operating Systems: Windows, Macintosh, Ubuntu (Linux), RedHat
Web Technologies: HTML, XML, JavaScript, JSP, JDBC
Testing: Hive Testing, Hadoop Testing, Quality Center (QC), MRUnit Testing, JUnit Testing
ETL Tools: Informatica, Pentaho
PROFESSIONAL EXPERIENCE
Confidential, New York, NY
Hadoop Developer
Responsibilities:
- Worked on a Hadoop cluster that ranged from 4 to 8 nodes during the pre-production stage and was at times extended up to 24 nodes during production
- Used Sqoop to import data from RDBMS into the Hadoop Distributed File System (HDFS) and later analyzed the imported data using Hadoop components
- Developed custom MapReduce programs to analyze data and used Pig Latin to clean unwanted data (a brief MapReduce sketch appears at the end of this list)
- Applied various performance optimizations such as using the distributed cache for small datasets, partitioning and bucketing in Hive, and map-side joins
- Involved in creating Hive tables, then applied HiveQL on those tables for data validation.
- Moved the data from Hive tables into MongoDB collections.
- Involved in loading and transforming large sets of Structured, Semi-Structured and Unstructured data and analyzed them by running Hive queries and Pig scripts
- Participated in requirement gathering from the experts and business partners and converted the requirements into technical specifications
- Used Zookeeper to manage coordination among the clusters
- Analyzed MongoDB and compared it with other open-source NoSQL databases to determine which best suited the current requirements.
- Assisted in exporting the analyzed data to RDBMS using Sqoop.
- Created and maintained technical documentation for launching Hadoop clusters and for executing Hive queries and Pig scripts
- Installed the Oozie workflow engine to run multiple Hive and Pig jobs that execute independently based on time and data availability
- Assisted application teams in installing Hadoop updates, operating system patches and version upgrades when required
- Assisted in cluster maintenance, cluster monitoring and troubleshooting, and managed and reviewed data backups and log files
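The following is a minimal, illustrative sketch of the kind of data-cleaning MapReduce program mentioned above; the delimiter, expected field count and class names are assumptions made for the example, not details of the actual project.

```java
import java.io.IOException;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.LongWritable;
import org.apache.hadoop.io.NullWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Job;
import org.apache.hadoop.mapreduce.Mapper;
import org.apache.hadoop.mapreduce.lib.input.FileInputFormat;
import org.apache.hadoop.mapreduce.lib.output.FileOutputFormat;

public class RecordCleaner {

    // Map-only job: keep well-formed records, count and drop malformed ones.
    public static class CleanMapper
            extends Mapper<LongWritable, Text, Text, NullWritable> {

        private static final int EXPECTED_FIELDS = 8; // assumption for illustration

        @Override
        protected void map(LongWritable key, Text value, Context context)
                throws IOException, InterruptedException {
            String[] fields = value.toString().split("\\|", -1);
            if (fields.length == EXPECTED_FIELDS) {
                context.write(value, NullWritable.get());
            } else {
                context.getCounter("cleaning", "malformed_records").increment(1);
            }
        }
    }

    public static void main(String[] args) throws Exception {
        Job job = new Job(new Configuration(), "record-cleaner");
        job.setJarByClass(RecordCleaner.class);
        job.setMapperClass(CleanMapper.class);
        job.setNumReduceTasks(0);                 // map-only: no reduce phase
        job.setOutputKeyClass(Text.class);
        job.setOutputValueClass(NullWritable.class);
        FileInputFormat.addInputPath(job, new Path(args[0]));
        FileOutputFormat.setOutputPath(job, new Path(args[1]));
        System.exit(job.waitForCompletion(true) ? 0 : 1);
    }
}
```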
Environment: Hadoop, Pig, Hive, Sqoop, Flume, MapReduce, HDFS, LINUX, Oozie, MongoDB
Confidential - Madison, WI
Hadoop Developer
Responsibilities:
- Developed solutions to process data into HDFS, analyzed the data using MapReduce, Pig and Hive, and produced summary results from Hadoop for downstream systems
- Used Sqoop extensively to import data from various systems/sources (like MySQL) into HDFS
- Applied Hive queries to perform data analysis on HBase using the Storage Handler in order to meet the business requirements
- Created components like Hive UDFs for missing functionality in HIVE for analytics.
- Hands-on experience with NoSQL databases like HBase and Cassandra in a POC (proof of concept) for storing URLs and images (a brief HBase client sketch appears at the end of this list).
- Developed scripts and batch jobs to schedule a bundle (a group of coordinators) consisting of various Hadoop programs using Oozie
- Exported the analyzed data to the relational databases using Sqoop for visualization and to generate reports for the BI team.
- Worked with cloud services like Amazon Web Services (AWS)
- Involved in ETL, Data Integration and Migration
- Used different file formats like Text files, Sequence Files, Avro
- Provided cluster coordination services through Zookeeper
- Assisted in creating and maintaining technical documentation for launching Hadoop clusters and for executing Hive queries and Pig scripts
- Assisted in cluster maintenance, cluster monitoring, adding and removing cluster nodes, and troubleshooting
- Installed and configured Hadoop MapReduce and HDFS; developed multiple MapReduce jobs in Java for data cleaning and preprocessing.
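A minimal sketch of how a record might be written to HBase from Java in such a POC, using the older HTable client API of that era; the table name, column family and values are hypothetical.

```java
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.hbase.HBaseConfiguration;
import org.apache.hadoop.hbase.client.HTable;
import org.apache.hadoop.hbase.client.Put;
import org.apache.hadoop.hbase.util.Bytes;

public class UrlPocWriter {
    public static void main(String[] args) throws Exception {
        Configuration conf = HBaseConfiguration.create();
        // A "url_poc" table with a "meta" column family is an assumption for this sketch.
        HTable table = new HTable(conf, "url_poc");
        try {
            Put put = new Put(Bytes.toBytes("row-001"));
            put.add(Bytes.toBytes("meta"), Bytes.toBytes("url"),
                    Bytes.toBytes("http://example.com/page"));
            table.put(put);
        } finally {
            table.close();
        }
    }
}
```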
Environment: MapReduce, HDFS, Sqoop, Flume, LINUX, Oozie, Hadoop, Pig, Hive, HBase, Cassandra, Hadoop Cluster, Amazon Web Services
Confidential, Missouri
Java/Hadoop Developer
Responsibilities:
- Analyzed Hadoop clusters and other big data analytical tools, including Hive and Pig, and databases like HBase
- Used Hadoop to build scalable distributed data solutions
- Extracted feeds from social media sites such as Facebook and Twitter using Flume
- Used Sqoop extensively to ingest data from various source systems into HDFS.
- Developed Pig UDFs for needed functionality, such as a custom Pig loader known as the timestamp loader (a simplified UDF sketch appears at the end of this list).
- Wrote Hive queries for data analysis to meet the business requirements
- Created Hive tables and worked on them using Hive QL.
- Used Zookeeper to coordinate clusters
- Created HBase tables to store variable data formats of PII data coming from different portfolios
- Installed the cluster, worked on commissioning and decommissioning of DataNodes, NameNode recovery, capacity planning, and slots configuration
- Stored variable data formats coming from different portfolios by creating HBase tables
- Assisted in managing and reviewing Hadoop log files
- Assisted in loading large sets of data (Structure, Semi Structured, Unstructured)
- Implemented Hadoop cluster on Ubuntu Linux
- Installed and configured Flume, Sqoop, Pig, Hive, HBase on Hadoop clusters
- Managed Hadoop clusters, including adding and removing cluster nodes for maintenance and capacity needs.
- Assisted in setting up QA environment and updating configurations for implementing scripts with Pig and Sqoop
- Used Hibernate ORM framework with Spring framework for data persistence and transaction management
- Used struts validation framework for form level validation
- Wrote test cases in JUnit for unit testing of classes
- Involved in developing templates and screens in HTML and JavaScript
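As a simplified illustration of the Pig UDF work above, the sketch below shows an EvalFunc that normalizes an epoch-millis string into an ISO-style timestamp; the class name and format are assumptions (the loader itself would be a LoadFunc, which is more involved).

```java
import java.io.IOException;
import java.text.SimpleDateFormat;
import java.util.Date;

import org.apache.pig.EvalFunc;
import org.apache.pig.data.Tuple;

public class ToIsoTimestamp extends EvalFunc<String> {
    private final SimpleDateFormat iso =
            new SimpleDateFormat("yyyy-MM-dd'T'HH:mm:ss");

    @Override
    public String exec(Tuple input) throws IOException {
        // Return null for empty or null input so Pig treats it as a missing value.
        if (input == null || input.size() == 0 || input.get(0) == null) {
            return null;
        }
        long epochMillis = Long.parseLong(input.get(0).toString());
        return iso.format(new Date(epochMillis));
    }
}
```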
Environment: Hadoop, HDFS, Hive, Flume, Sqoop, HBase, Pig, Eclipse, MySQL, Ubuntu, Zookeeper, Java (JDK 1.6)
ITC InfoTech - Paramus, NJ Oct 2010 - Jan 2012
Java/J2EE Developer
Confidential
Responsibilities:
- Developed EJBs (Session Beans and Entity Beans) on WebSphere Studio Application Developer.
- Used different design patterns, like MVC, EJB Session Facade and Controller Servlets, while implementing the framework (a minimal Session Facade sketch appears at the end of this list).
- Front End was built using JSPs, jQuery, JavaScript and HTML.
- Built Custom Tags for JSPs.
- Built the report module based on Crystal Reports.
- Integrated data from multiple data sources.
- Generated schema difference reports for the database using TOAD.
- Built Prototypes for internationalization.
- Wrote Stored Procedures in DB2.
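A minimal sketch of the Session Facade pattern with an EJB 2.x stateless session bean, as used in this framework; the bean name, business method and delegated operations are illustrative assumptions (home/remote interfaces and the deployment descriptor are omitted).

```java
import javax.ejb.SessionBean;
import javax.ejb.SessionContext;

public class OrderFacadeBean implements SessionBean {

    private SessionContext context;

    // Coarse-grained business method exposed through the facade's remote interface;
    // it would coordinate calls to fine-grained entity beans on behalf of the client.
    public void placeOrder(String customerId, String productId, int quantity) {
        // Entity-bean lookups and updates would go here.
    }

    // EJB 2.x lifecycle callbacks required by the container.
    public void ejbCreate() {}
    public void ejbActivate() {}
    public void ejbPassivate() {}
    public void ejbRemove() {}
    public void setSessionContext(SessionContext ctx) { this.context = ctx; }
}
```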
Environment: J2EE (JSPs, Servlets, EJB), HTML, Struts, DB2
Confidential - Atlanta, GA
Java/J2EE Developer
Responsibilities:
- Coded end to end (i.e., from the GUI on the client side, through the middleware, to the database, connecting the back-end systems) on a subset of sub-modules belonging to the above modules.
- Worked extensively on Swing.
- Most of the business logic was provided in Session Beans, and the database transactions were performed using Container-Managed Entity Beans.
- Worked on parsing XML using DOM and SAX (a brief SAX sketch appears at the end of this list).
- Implemented EJB Transactions.
- Used JMS for messaging with IBM MQ-Series.
- Wrote stored procedures.
- Developed the presentation layer, which was built using Servlets and JSP and MVC architecture, on WebSphere Studio Application Developer (WSAD).
- Mentored other programmers.
- Studied the implementation of Struts
- Implemented security access control on both the client and server sides, including applet signing and JAR signing
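As a brief illustration of the SAX parsing mentioned above, the sketch below counts occurrences of a hypothetical element in an XML document; the element name and file argument are assumptions.

```java
import javax.xml.parsers.SAXParser;
import javax.xml.parsers.SAXParserFactory;

import org.xml.sax.Attributes;
import org.xml.sax.helpers.DefaultHandler;

public class OrderCountHandler extends DefaultHandler {

    private int orderCount;

    @Override
    public void startElement(String uri, String localName,
                             String qName, Attributes attributes) {
        // Count every <order> element encountered in the stream.
        if ("order".equals(qName)) {
            orderCount++;
        }
    }

    public int getOrderCount() {
        return orderCount;
    }

    public static void main(String[] args) throws Exception {
        SAXParser parser = SAXParserFactory.newInstance().newSAXParser();
        OrderCountHandler handler = new OrderCountHandler();
        parser.parse(args[0], handler);   // args[0]: path to the XML file
        System.out.println("orders parsed: " + handler.getOrderCount());
    }
}
```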
Environment: Java, Java Swing, JSP, Servlets, JDBC, Applets, JCE 1.2, RMI, EJB, XML/XSL, VisualAge Java (VAJ), Visual C++, J2EE
Confidential
Java developer
Responsibilities:
- Implemented the project according to the Software Development Life Cycle (SDLC).
- Implemented JDBC for mapping an object-oriented domain model to a traditional relational database (a minimal JDBC sketch appears at the end of this list).
- Created Stored Procedures to manipulate the database and to apply the business logic according to the user’s specifications.
- Developed generic classes encapsulating frequently used functionality so that they could be reused.
- Implemented an exception management mechanism using exception handling application blocks to handle exceptions.
- Designed and developed user interfaces using JSP, JavaScript and HTML.
- Involved in Database design and developing SQL Queries, stored procedures on MySQL.
- Used CVS for maintaining the Source Code.
- Logging was done through log4j.
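A minimal sketch of the JDBC-based mapping described above, in JDK 1.6-era style (no try-with-resources); the table, columns and connection details are assumptions for illustration.

```java
import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.PreparedStatement;
import java.sql.ResultSet;
import java.util.ArrayList;
import java.util.List;

public class CustomerDao {

    private final String url;
    private final String user;
    private final String password;

    public CustomerDao(String url, String user, String password) {
        this.url = url;
        this.user = user;
        this.password = password;
    }

    // Maps rows of a hypothetical CUSTOMER table onto plain Java values;
    // older (pre-JDBC-4) drivers would also need Class.forName driver registration.
    public List<String> findCustomerNamesByCity(String city) throws Exception {
        List<String> names = new ArrayList<String>();
        Connection conn = DriverManager.getConnection(url, user, password);
        try {
            PreparedStatement ps = conn.prepareStatement(
                    "SELECT NAME FROM CUSTOMER WHERE CITY = ?");
            ps.setString(1, city);
            ResultSet rs = ps.executeQuery();
            while (rs.next()) {
                names.add(rs.getString("NAME"));
            }
            rs.close();
            ps.close();
        } finally {
            conn.close();
        }
        return names;
    }
}
```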
Environment: Java, JavaScript, HTML, JDBC Drivers, SOAP Web Services, Unix, Shell scripting, SQL Server