Hadoop/Big Data Consultant Resume
Charlotte, NC
PROFESSIONAL SUMMARY:
- Total of 9 years of IT experience, with 3+ years in Big Data/Hadoop and 7 years in Java
- Experience in installing, configuring and using ecosystem components such as Hadoop MapReduce, HDFS, Sqoop, Pig, Flume, Hive, HBase, ZooKeeper and Cassandra
- Experience with Hadoop CDH3 and CDH4
- Hands-on experience with Hadoop components such as JobTracker, TaskTracker, NameNode, DataNode, MapReduce and the HDFS framework
- Experienced in developing and implementing MapReduce jobs in Java to process and perform various analytics on large datasets
- Configured ZooKeeper, Cassandra and Sqoop on the existing Hadoop cluster
- Hands-on experience with Hadoop administration, including configuration management, monitoring, debugging and performance tuning
- Expertise in UNIX shell scripting using ksh and bash
- Good experience in writing Pig Latin scripts and Hive queries
- Experience in SQL and NoSQL development
- Good understanding of Data Structure and Algorithms
- Skilled in programming with Java and Java EE 5 using JSP, JSF, JavaScript, Servlets and EJB
- Worked on query tuning, application server performance tuning, server performance improvement settings, database tuning and network performance analysis
- Good experience in implementing service-oriented architecture using web services
- Experience in the build process and deployment of bug fixes and customizations from the development team to application servers (IBM WebSphere, Tomcat)
- Developed Java utilities and tools for the EAM team to address various problems in production
- Good knowledge of ETL Scripts for Data Acquisition and Transformation
- Strong data cleansing and data migration experience using ETL Informatica
- Expertise in object-oriented analysis and design (OOAD) using UML and various design patterns
- Worked with end users on requirement gathering, user experience and issue resolution
- Experience in preparing deployment packages, deploying to Dev and QA environments, and preparing deployment instructions for the Production Deployment Team
- Team player with excellent analytical, communication and project documentation skills
TECHNICAL SKILLS:
BIG DATA ECOSYSTEM: Hadoop, MapReduce, HDFS, HBase, ZooKeeper, Hive, Pig, Sqoop, Oozie, Flume, Cloudera, Hortonworks, Amazon Elastic MapReduce (EMR)
PROGRAMMING LANGUAGES: Java SE/J2EE (JDK 1.5/1.6), C, Visual Basic
SCRIPTING LANGUAGES: JavaScript, Perl, PHP
DATABASES: SQL Server 2005/2008, Oracle 10g, Teradata, MySQL
APPLICATION/WEB SERVERS: Apache Tomcat, IBM WebSphere 6.x/7.0, JBoss 5.0, WebLogic
DEVELOPMENT TOOLS/IDE: Eclipse, Documentum Composer, Microsoft Visual Studio 2008
PLUG-IN TOOLS: ANT, Maven, Log4j, JUnit
OPERATING SYSTEMS: UNIX, Linux, Ubuntu, CentOS, Windows XP/Vista/7/8
PROFESSIONAL EXPERIENCE:
Confidential, Charlotte, NC
Hadoop/Big Data Consultant
Responsibilities:
- Involved in the architecture and development of Hadoop clusters with CDH4 on Linux using Cloudera Manager
- Involved in loading data from Linux file system to HDFS
- Developed MapReduce jobs in Java for data cleaning and pre-processing, performed various analytics, and wrote the output back to HDFS (a minimal sketch follows this section)
- Imported and exported structured data into HDFS using Sqoop
- Loaded large volumes of application server logs from different web servers using Flume and performed various analytics using Pig Latin and Hive queries
- Developed Pig UDFs to pre-process the data
- Exported analyzed data to Oracle database using Sqoop for generating reports
- Involved in creating Hive Tables, loading with data and writing Hive queries to do analytics on the data
- Performance tuning of Hadoop cluster and MapReduce routines against very large data sets
- Monitored Hadoop cluster job performance and performed capacity planning and managed nodes on Hadoop cluster
Environment: CDH4 Cloudera Distribution, Sqoop, Pig Latin, Hive, Flume, HDFS, MapReduce, Eclipse IDE, UNIX Shell Scripting
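Below is a minimal sketch of the kind of data-cleaning MapReduce job described above, intended only as an illustration: the class name, pipe delimiter and field checks are assumptions, not the actual production code.

```java
import java.io.IOException;

import org.apache.hadoop.io.LongWritable;
import org.apache.hadoop.io.NullWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Mapper;

// Hypothetical data-cleaning mapper: drops malformed records and re-emits the rest.
public class CleaningMapper extends Mapper<LongWritable, Text, Text, NullWritable> {

    @Override
    protected void map(LongWritable key, Text value, Context context)
            throws IOException, InterruptedException {
        // Assume pipe-delimited records with at least three mandatory fields.
        String[] fields = value.toString().split("\\|");
        if (fields.length < 3 || fields[0].trim().isEmpty()) {
            return; // skip incomplete or malformed records
        }
        // Emit the cleaned record; the job's output is written back to HDFS.
        context.write(new Text(value.toString().trim()), NullWritable.get());
    }
}
```

In practice such a mapper could run as a map-only job (zero reducers) so the cleaned records land directly in HDFS for downstream analytics.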
Confidential, Dallas, TX
Hadoop/Big data Consultant
Responsibilities:
- Installed and configured Hadoop MapReduce and developed multiple MapReduce jobs in Java for data cleaning and preprocessing
- Imported and exported data into HDFS and Hive using Sqoop
- Managed and reviewed Hadoop log files
- Extracted data files from MySQL using Sqoop and loaded them into Cassandra for processing
- Ran Hadoop streaming jobs to process terabytes of XML-format data
- Loaded and transformed large sets of structured, semi-structured and unstructured data
- Responsible for managing data coming from different sources
- Gained good experience with NoSQL databases
- Supported MapReduce programs running on the cluster
- Involved in loading data from the UNIX file system to HDFS
- Installed and configured Hive and wrote Hive UDFs (see the sketch after this section)
- Involved in creating Hive tables, loading them with data and writing Hive queries that run internally as MapReduce jobs
- Gained very good business knowledge of health insurance, claim processing, fraud suspect identification, the appeals process, etc.
- Developed a custom FileSystem plug-in for Hadoop so it can access files on the Data Platform
- Developed unit test cases using the JUnit 4.8.2 and JMock libraries
- Set up a Hadoop cluster on Amazon EC2 using Apache Whirr for a POC
Environment: Java 6 (JDK 1.6), Eclipse, MySQL, Subversion, Hadoop, MapReduce, Hive, HBase, Cassandra, Linux
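One way a simple Hive UDF like those mentioned above might look; the function name and normalization logic are assumptions for illustration, not the project's actual UDF.

```java
import org.apache.hadoop.hive.ql.exec.UDF;
import org.apache.hadoop.io.Text;

// Hypothetical UDF that trims and lower-cases a string column before analysis.
public final class NormalizeText extends UDF {

    public Text evaluate(Text input) {
        if (input == null) {
            return null; // preserve NULLs rather than failing the query
        }
        return new Text(input.toString().trim().toLowerCase());
    }
}
```

Such a UDF would typically be packaged in a JAR, added with ADD JAR, and registered with CREATE TEMPORARY FUNCTION before being used in HiveQL queries.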
Confidential, Memphis, TN
J2EE Developer/ Analyst
Responsibilities:
- Designed and Developed EJB Session Beans for business logic implementation
- Implemented design patterns such as Transfer Object and Singleton
- Wrote a validation framework and developed data transfer objects for request and response messages
- Involved in writing web services for various transactions that exist as XML-based transactions
- Developed shell scripts that make RMI/IIOP remote calls to the EJBs to start and stop the interfaces
- Made extensive JDBC calls to access data from the database; developed the SQL and stored procedures necessary for data access
- Configured the data sources in WebLogic application server
- Used JDOM for parsing and transforming XML documents (see the sketch after this section)
- Implemented JUnit for unit testing and Log4J for logging at runtime
- Used CVS for version control of the product
- Implemented Test Driven Development methodology for project deliverables
- Developed build and deployment scripts using Ant
Environment: Java 1.4, EJB, Oracle 9i, JUnit, JAX-RPC, Log4J, Eclipse 3.2, WebLogic 9.2
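A rough sketch of JDOM-based XML parsing of the kind referenced above; the element names and file handling are illustrative assumptions.

```java
import java.io.File;
import java.util.List;

import org.jdom.Document;
import org.jdom.Element;
import org.jdom.input.SAXBuilder;

// Hypothetical parser that reads transaction elements out of an XML message file.
public class TransactionParser {

    public void parse(File xmlFile) throws Exception {
        SAXBuilder builder = new SAXBuilder();
        Document doc = builder.build(xmlFile);                // parse the XML document
        Element root = doc.getRootElement();
        List transactions = root.getChildren("transaction");  // assumed element name
        for (Object item : transactions) {
            Element txn = (Element) item;
            // Read a couple of child elements; the names are placeholders.
            System.out.println(txn.getChildText("id") + " -> " + txn.getChildText("status"));
        }
    }
}
```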
Confidential, Atlanta, GA
Java Developer
Responsibilities:
- Involved in various phases of software development such as modeling, system analysis and design, programming and testing, using the Agile methodology
- Involved in the application development based on the business requirements using Eclipse and Tomcat application server
- Created shell scripts for scheduling various data cleansing, data transformation and ETL loading processes
- Handled structured and unstructured data and applied the ETL process
- Development work also included bug fixing, unit testing and code reviews
- As an active build manager, responsible for weekly builds and for deploying the EAR to the Dev and QA application servers for testing
- Developed SQL queries and stored procedures and used JDBC to interact with the database (see the sketch after this section)
- Used AABS Issue Tracking System for project management
Environment: Java 1.5, Struts, Hibernate 3.0, Oracle 10g, EJB, JPA, JSP, IBM Rational Application Developer, AJAX, jQuery, JMS, Log4j, JUnit, JMock, Linux, ANT, WebSphere, Shell script
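A hedged sketch of the JDBC/stored-procedure access pattern referenced above; the procedure name, parameters and connection details are placeholders, not the project's actual schema.

```java
import java.sql.CallableStatement;
import java.sql.Connection;
import java.sql.DriverManager;

// Hypothetical DAO-style helper that calls an Oracle stored procedure via JDBC.
public class OrderStatusDao {

    public String fetchStatus(String orderId) throws Exception {
        // Placeholder driver/connection details; the application would normally use a
        // container-managed data source instead of DriverManager.
        Class.forName("oracle.jdbc.OracleDriver");
        Connection conn = DriverManager.getConnection(
                "jdbc:oracle:thin:@dbhost:1521:ORCL", "app_user", "app_password");
        try {
            CallableStatement stmt = conn.prepareCall("{call GET_ORDER_STATUS(?, ?)}");
            stmt.setString(1, orderId);
            stmt.registerOutParameter(2, java.sql.Types.VARCHAR);
            stmt.execute();
            return stmt.getString(2); // OUT parameter holding the status
        } finally {
            conn.close();
        }
    }
}
```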
Confidential, Morgantown, WV
Graduate Research Assistant
Responsibilities:
- Studied the organizational structure of the West Virginia State Police Forensic Laboratory
- Worked in a group of four on project requirements and implementation
- Created web pages for the UI and wrote code in VB.NET for functional implementation using Visual Studio 2005
- Populated the database with data obtained from the WVSPFL
- Implemented several SQL queries and stored procedures in MS SQL Server 2005
- Created electronic forms which reflect the existing paper based forms to make the transition of data easy and faster
- Tested and debugged the visual basic code for the functionality implemented in the project
- Used Weka tool to implement data mining algorithms on the crime data available in the database to identify patterns and trends in the data
Environment: Visual Studio 2005/2008, SQL Server 2005/2008, Windows XP, Windows Server 2003/2008, Weka
Confidential
Web Developer
Responsibilities:
- Developed the web interface using Struts, JavaScript, HTML and CSS.
- Extensively used the Struts controller component classes for developing the applications.
- Involved in developing the business tier using stateless session beans (acting as a session facade) and message-driven beans.
- Used JDBC and Hibernate to connect to the Oracle database.
- Data sources were configured in the app server and accessed from the DAOs through Hibernate.
- Used the Business Delegate, Service Locator and DTO design patterns in designing the web module of the application.
- Developed SQL stored procedures and prepared statements for updating and accessing data from the database.
- Involved in developing database-specific data access objects (DAOs) for Oracle (see the sketch after this section).
- Used CVS for source code control and JUNIT for unit testing.
- Used Eclipse to develop entity and session beans.
- Deployed the entire application on WebSphere Application Server.
- Followed coding and documentation standards.
Environment: Java, J2EE, JDK, JavaScript, XML, Struts, JSP, Servlets, JDBC, EJB, Hibernate, Web services, JMS, JSF, JUnit, CVS, IBM WebSphere, Eclipse, Oracle 9i, Linux.
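A minimal sketch of the Hibernate-backed DAO pattern described above; the entity name, query and session-factory wiring are assumptions for illustration rather than the project's actual classes.

```java
import java.util.List;

import org.hibernate.Session;
import org.hibernate.SessionFactory;

// Hypothetical DAO that reads through Hibernate from a data source configured in the app server.
public class CustomerDao {

    private final SessionFactory sessionFactory;

    public CustomerDao(SessionFactory sessionFactory) {
        this.sessionFactory = sessionFactory;
    }

    public List findByCity(String city) {
        Session session = sessionFactory.openSession();
        try {
            // HQL against an assumed Customer entity mapped to an Oracle table.
            return session.createQuery("from Customer c where c.city = :city")
                          .setString("city", city)
                          .list();
        } finally {
            session.close();
        }
    }
}
```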