Big Data / Senior Hadoop Consultant Resume
Miami, FL
SUMMARY:
- 8+ years of experience in software design, development, maintenance, testing, and troubleshooting of enterprise applications.
- Expertise in core Java and JDBC; proficient in using Java APIs for application development.
- Strong knowledge of Object-Oriented Programming (OOP) concepts, including polymorphism, abstraction, inheritance, and encapsulation.
- Over three years of experience in design, development, maintenance, and support of Big Data analytics using Hadoop ecosystem components such as HDFS, Hive, Pig, HBase, Sqoop, Flume, ZooKeeper, MapReduce, and Oozie.
- Strong working experience with ingestion, storage, querying, processing and analysis of big data.
- Extensive experience working with Hadoop architecture and its core components: MapReduce, HDFS, JobTracker, TaskTracker, NameNode, and DataNode.
- Experience in installation, configuration, supporting and managing Hadoop clusters.
- Good experience writing MapReduce programs in Java.
- Expertise in writing Hadoop Jobs for analyzing data using Hive and Pig.
- Loaded streaming log data from various web servers into HDFS using Flume.
- Successfully loaded files into Hive and HDFS from Oracle, SQL Server, Teradata, and Netezza using Sqoop.
- Worked in multiple environments on the installation and configuration of Hadoop clusters.
- Experience with SQL, PL/SQL, and database concepts.
- Good experience with job workflow scheduling tools such as Oozie.
- Good understanding of NoSQL databases.
- Experience creating databases, tables, and views in Hive and Impala (see the sketch after this list).
- Experience with performance tuning of MapReduce and Hive jobs.
- Loaded and transformed large sets of structured, semi-structured, and unstructured data using Hadoop ecosystem components.
- Experience in working with different data sources like Flat files, XML files and Databases.
- Experience in database design, entity relationships, database analysis, and programming SQL stored procedures, PL/SQL packages, and triggers in Oracle, as well as MongoDB, on Unix/Linux.
- Experience in various phases of the Software Development Life Cycle (analysis, requirements gathering, design), with expertise in documenting requirement specifications, functional specifications, test plans, source-to-target mappings, and SQL joins.
- Worked on different operating systems like UNIX/Linux, Windows XP and Windows 2K.
- Goal-oriented self-starter, quick learner, and team player, proficient in handling multiple projects simultaneously.
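A minimal sketch of the HDFS and Hive steps referenced above; directory, database, and table names are hypothetical placeholders, not details from an actual engagement:

```sh
# Stage raw web-server log files into HDFS (hypothetical paths)
hadoop fs -mkdir -p /data/landing/weblogs
hadoop fs -put /var/log/httpd/access.log /data/landing/weblogs/

# Create a Hive database, an external table over the landed files, and a view
hive -e "
CREATE DATABASE IF NOT EXISTS weblogs;
CREATE EXTERNAL TABLE IF NOT EXISTS weblogs.access_raw (line STRING)
LOCATION '/data/landing/weblogs';
CREATE VIEW IF NOT EXISTS weblogs.access_sample AS
SELECT line FROM weblogs.access_raw LIMIT 100;
"
```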
TECHNICAL SKILLS:
Programming Languages: C, C++, Java/J2EE, Python, Linux shell scripts
NoSQL: Cassandra, HBase
Databases: Oracle 9i/10g/11g, MySQL
Operating Systems: Windows Vista/XP/2000/7/8, Unix, Linux (RHEL, CentOS), Solaris
PROFESSIONAL EXPERIENCE:
Confidential, Miami, FL
Big Data / Senior Hadoop Consultant
Responsibilities:
- Designed and developed ETL workflows in Oozie, automating the extraction of data from different databases into HDFS with Sqoop scripts, transformation and analysis in Hive/Pig, and parsing of the raw data with MapReduce.
- Created Hive tables, loaded claims data from Oracle using Sqoop, and exported the processed data into a Netezza database (see the sketch after this entry).
- Developed Pig scripts and UDFs per business requirements.
- Developed MapReduce programs for data extraction and data manipulation.
- Worked with different file formats and compression techniques in Hadoop.
- Performed data analytics in Hive and exported the resulting metrics back to an Oracle database using Sqoop.
- Tuned the performance of Hive queries and MapReduce programs for different applications.
- Involved in a POC evaluating Impala on a prototype cluster.
- Installed and set up Hadoop CDH clusters for the development and production environments.
- Installed and configured Hive, Pig, Sqoop, Flume, Cloudera Manager, and Oozie on the Hadoop cluster.
- Implemented High Availability and automatic failover for the NameNode, removing it as a single point of failure, using the ZooKeeper service.
- Performed a Major upgrade in development environment from CDH 4 to CDH 5.
- Worked with big data developers, designers, and scientists to troubleshoot MapReduce and Hive jobs and tune them for high performance.
- Proactively involved in ongoing maintenance, support, and improvement of the Hadoop cluster.
- Collaborated with business users, product owners, and developers to contribute to the analysis of functional requirements.
Environment: CDH (CDH 4 & CDH 5), Hive, Pig, Oozie, Flume, Sqoop, Cloudera Manager, Cassandra, Tableau
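A minimal sketch of the Sqoop import/export steps described in this role, assuming hypothetical hosts, credentials, and table names (the actual connection details and schemas are not reproduced here):

```sh
# Import claims data from Oracle into a Hive table (hypothetical names)
sqoop import \
  --connect jdbc:oracle:thin:@oracle-host:1521:ORCL \
  --username etl_user -P \
  --table CLAIMS \
  --hive-import \
  --hive-table claims_raw \
  --num-mappers 4

# Export the processed results (HDFS output of the Hive/Pig jobs) to Netezza
sqoop export \
  --connect jdbc:netezza://netezza-host:5480/ANALYTICS \
  --username etl_user -P \
  --table CLAIMS_SUMMARY \
  --export-dir /user/hive/warehouse/claims_summary \
  --input-fields-terminated-by '\001'
```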
Confidential, Chicago, IL
Big Data / Hadoop Consultant
Responsibilities:
- Installed and configured Hadoop cluster in Dev, Test and Production environments.
- Performed both major and minor upgrades to the existing CDH cluster.
- Handled commissioning and decommissioning of nodes on the existing cluster.
- Copied data from one cluster to another using DistCp and automated the dumping procedure with shell scripts (see the sketch after this entry).
- Involved in business requirements gathering and analysis of business use cases.
- Prepared System Design document with all functional implementations.
- Involved in data modeling sessions to develop models for Hive tables.
- Studied the existing enterprise data warehouse setup and provided design and architecture suggestions for converting it to Hadoop using MapReduce, Hive, Sqoop, and Pig Latin.
- Converted ETL logic to Hadoop mappings.
- Gained extensive hands-on experience with Hadoop file system commands for file handling operations.
- Worked with SequenceFiles, RCFiles, map-side joins, bucketing, and partitioning to improve Hive performance and storage.
- Parsed XML files using MapReduce to extract sales-related attributes and stored them in HDFS.
- Built TBUILD scripts to import data from Teradata using the Teradata Parallel Transporter APIs.
- Used Sqoop import and export functionality to transfer large data sets between a DB2 database and HDFS.
Environment: CDH, Hadoop, HDFS, MapReduce, Hive, Sqoop, Pig, XML, Cloudera Manager, Teradata
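A sketch of the DistCp copy and the DB2-to-HDFS Sqoop transfer mentioned above, with hypothetical cluster, host, and table names:

```sh
# Copy a dataset between clusters with DistCp (NameNode URIs are hypothetical)
hadoop distcp -update hdfs://source-nn:8020/data/sales hdfs://dest-nn:8020/data/sales

# Import a DB2 table into HDFS with Sqoop
sqoop import \
  --connect jdbc:db2://db2-host:50000/SALESDB \
  --username etl_user -P \
  --table ORDERS \
  --target-dir /data/staging/orders \
  --num-mappers 4
```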
Confidential, Atlanta, GA
Big Data / Hadoop Consultant
Responsibilities:
- Worked on setting up the Hadoop cluster for the dev, test, and prod environments.
- Pulled data from Oracle databases into the Hadoop cluster using Sqoop import.
- Used Flume to ingest log data from the reaper logs and syslogs into the Hadoop cluster.
- Pre-processed the data and created fact tables using Hive.
- Exported the resulting data set to SQL Server for further analysis.
- Generated reports using Tableau report designer.
- Automated all jobs, from pulling data out of the source databases to loading results into SQL Server, using shell scripts (a sketch follows this entry).
- Used Ganglia to monitor the cluster around the clock.
- Supported Data Analysts in running Map Reduce Programs.
- Worked on importing and exporting data into HDFS and Hive using Sqoop.
- Worked on analyzing data with Hive and Pig.
- Installed and configured NFS; used nslookup to check DNS records.
Environment: CDH, Hadoop, HDFS, MapReduce, Hive, Pig, Flume, Sqoop, Tableau, SQL Server, Ganglia.
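A sketch of the kind of end-to-end shell automation described in this role (source database to HDFS to Hive to SQL Server); the hosts, table names, password file, and HQL script are hypothetical placeholders:

```sh
#!/bin/bash
# End-to-end pipeline sketch: Oracle -> HDFS -> Hive -> SQL Server
set -e

# 1. Pull the source data from Oracle into HDFS
sqoop import \
  --connect jdbc:oracle:thin:@oracle-host:1521:ORCL \
  --username etl_user --password-file /user/etl_user/oracle.pwd \
  --table WEB_EVENTS \
  --target-dir /data/staging/web_events \
  --num-mappers 4

# 2. Pre-process the staged data and build the fact tables in Hive
hive -f build_fact_tables.hql

# 3. Export the aggregated results to SQL Server for reporting
sqoop export \
  --connect "jdbc:sqlserver://mssql-host:1433;databaseName=Reporting" \
  --username etl_user --password-file /user/etl_user/mssql.pwd \
  --table DAILY_METRICS \
  --export-dir /user/hive/warehouse/daily_metrics
```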
Confidential
Java Developer
Responsibilities:
- Gathered specifications for the Library site from different departments and users of the services.
- Assisted in proposing suitable UML class diagrams for the project.
- Wrote SQL scripts to create and maintain the database, roles, users, tables, views, procedures, and triggers in Oracle.
- Designed and implemented the UI using HTML, JSP, JavaScript and Java.
- Implemented multi-threading functionality using the Java threading API.
- Extensively worked on IBM WebSphere 6.0 while implementing the project.
Environment: Java, Servlets, JDBC, HTML, JavaScript, SQL Server, IBM WebSphere 6.0.
Confidential
Java Developer
Responsibilities:
- Involved in Analysis, Design, Coding and Development of custom Interfaces.
- Involved in the feasibility study of the project.
- Gathered requirements from the client for designing the Web Pages.
- Participated in designing the user interface for the application using HTML, DHTML, and Java Server Pages (JSP).
- Wrote client-side scripts using JavaScript and server-side scripts using JavaBeans, and used servlets to handle the business logic.
- Developed the Form Beans and Data Access Layer classes.
- XML was used to transfer the data between different layers.
- Involved in writing complex sub-queries and used Oracle for generating on-screen reports.
- Worked on database interaction layer for insertions, updating and retrieval operations on data.
- Deployed EJB Components on WebLogic.
- Involved in deploying the application in test environment using Tomcat.
Environment: Java, Servlets, JDBC, HTML, DHTML, JavaScript, SQL Server, WebLogic