- 7+ years of experience in the IT industry, including 2+ years in Big Data technologies and 4+ years in Java, database management systems, and data warehouse systems.
- Hands-on experience working with the Hadoop ecosystem, including Hive, Pig, HBase, Cassandra, Oozie, Kafka, and Flume.
- Excellent understanding of Hadoop architecture and its components, such as HDFS, JobTracker, TaskTracker, NameNode, DataNode, YARN, and the MapReduce programming paradigm.
- Highly capable of processing large structured, semi-structured, and unstructured datasets and supporting Big Data applications.
- Strong experience writing custom UDFs for Hive and Pig, with a strong understanding of Pig and Hive analytical functions.
- Experience importing and exporting data using Sqoop, from relational databases to HDFS and from HDFS to relational databases.
- Extensively worked on Oozie for workflow management, with separate workflows for each layer, such as the Staging, Transformation, and Archive layers.
- Experienced in installing and configuring Hadoop clusters on major Hadoop distributions.
- Extensively worked with NoSQL databases such as HBase, Cassandra, and MongoDB.
- Worked on MapReduce programs for parallel processing of data and for custom input formats.
- Extensively worked on Pig for ETL transformations and optimized Hive queries.
- Worked with Flume to transfer log data from external source systems to HDFS.
- Developed Oozie workflows to automate loading data into HDFS and preprocessing it with Pig, and used ZooKeeper to coordinate the clusters.
- Deployed, configured, and managed Linux servers in VMs.
- Strong UNIX Shell Scripting skills.
- Extensive experience working with databases such as SQL Server and MySQL, and writing stored procedures, functions, joins, and triggers for different data models.
- Strong coding experience in Core Java and strong hands-on experience with Java and J2EE frameworks.
- Experience working with Java, J2EE, JDBC, ODBC, JSP, Eclipse, JavaBeans, EJB, and Servlets.
- Expert in developing web page interfaces using JSP, Java Swing, and HTML.
- Excellent understanding of JavaBeans and the Hibernate framework for implementing model logic that interacts with relational databases.
- Always looking for new challenges that broaden my experience and knowledge and further develop the skills I have already acquired.
Big Data Ecosystems: HDFS, Hive, Pig, MapReduce, Sqoop, HBase, Cassandra, ZooKeeper, Flume, Kafka, and Oozie
Languages: C, C++, Java, J2EE, Spring, Hibernate, Java Servlets, JDBC, JUnit, Python, and Perl
Web Technologies: HTML, DHTML, XHTML, XML, CSS, Ajax, and JavaScript
Databases: MySQL, Oracle 10g/11g, NoSQL, MongoDB, Microsoft SQL Server, DB2, Sybase, PL/SQL, and SQL*Plus
Operating Systems: Linux, Unix, Windows, and Mac OS X
Web Servers: Apache Tomcat 5.x, BEA WebLogic 8.x, IBM WebSphere 6.00/5.11, Eclipse IDE, and NetBeans
Design & Modelling Tools: UML Use Cases, Sequence & class diagrams
Methodologies: Waterfall, Scrum, and Agile
Distributions: Cloudera, Hortonworks, and Apache Hadoop
Confidential, San Rafael, CA
- Configured, implemented, deployed, and maintained the Hadoop/Big Data ecosystem.
- Performed extraction, transformation, and loading (ETL) of data from multiple sources such as flat files, XML files, and databases.
- Used an ETL tool for processing based on business needs and made extensive use of the Oozie workflow engine to run multiple Hive and Pig jobs.
- Loaded and transferred large, complex sets of structured, semi-structured, and unstructured data using Sqoop.
- Implemented MapReduce jobs using tools such as Hive, Pig, and Sqoop on the YARN architecture.
- Provided NoSQL solutions in MongoDB and Cassandra for extracting and storing large volumes of data.
- Integrated business intelligence reporting solutions such as Tableau with various databases.
- Used Apache Spark for large-scale data processing, real-time analytics, and real-time streaming of data.
- Wrote and tuned complex SQL queries for performance.
- Worked closely with business stakeholders, UX designers, solution architects, and other team members to achieve results together.
- Participated in business requirement analysis, solution design, detailed design, solution development, testing, and deployment of various products.
- Delivered robust, flexible, and scalable high-quality solutions that meet or exceed customer requirements and expectations.
Environment: Java, Hadoop, Hive, Pig, Oozie, Sqoop, YARN, MongoDB, Cassandra, Tableau, Spark, SQL, XML, Eclipse, Maven, JUnit, Linux, Windows, Subversion
Confidential, Kansas City, MO
- Installed and configured Hadoop MapReduce and HDFS; developed multiple MapReduce jobs in Java for data cleaning and preprocessing.
- Imported and exported data into HDFS and Hive using Sqoop.
- Experience in defining job flows.
- Experience in managing and reviewing Hadoop log files.
- Extracted files from RDBMS through Sqoop, placed them in HDFS, and processed them.
- Experience in running Hadoop streaming jobs to process terabytes of XML data.
- Gained solid experience with NoSQL databases.
- Supported MapReduce programs running on the cluster.
- Involved in loading data from the Unix file system to HDFS.
- Created Hive tables, loaded them with data, and wrote Hive queries that run internally as MapReduce jobs.
- Replaced Hive's default Derby metadata store with MySQL.
- Developed Pig Latin scripts to extract data from web server output files and load it into HDFS.
- Developed Pig UDFs to preprocess the data for analysis.
- Developed Hive queries for the analysts.
- Involved in loading data from Linux and Unix file systems to HDFS.
- Loaded and transformed large sets of structured, semi-structured, and unstructured data.
- Worked with various Hadoop file formats, including TextFile, SequenceFile, and RCFile.
- Supported setting up the QA environment and updating configurations for implementing Pig scripts.
- Developed a custom FileSystem plugin for Hadoop so that it can access files on the Data Platform. This plugin allows Hadoop MapReduce programs, HBase, Pig, and Hive to work unmodified and access files directly.
- Designed and implemented a MapReduce-based, large-scale parallel relation learning system.
Environment: Hadoop, Hive, HBase, MapReduce, HDFS, Pig, Cassandra, Java (JDK 1.6)
Confidential, Columbus, OH
- Responsible for building a system that ingests terabytes of data per day onto Hadoop from a variety of data sources, providing high storage efficiency and an optimized layout for analytics.
- Responsible for converting a wide online video and ad impression tracking system, the source of truth for billing, from a legacy stream-based architecture to a MapReduce architecture, reducing support effort.
- Used Cloudera Crunch to develop data pipelines that ingest data from multiple data sources and process it.
- Used Sqoop to move data from relational databases to HDFS. Used Flume to move data from web logs onto HDFS.
- Used Pig to apply transformations, cleaning, and deduplication to data from raw data sources.
- Used MRUnit for unit testing.
- Experienced in managing and reviewing Hadoop log files.
- Created an ad hoc analytical job pipeline using Hive and Hadoop Streaming to compute various metrics and stored them in HBase for downstream applications.
Environment: JDK 1.6, Red Hat Linux, HDFS, MapReduce, Hive, Pig, Sqoop, Flume, ZooKeeper, Oozie, Python, Crunch, HBase, MRUnit
- Involved in designing and implementing the user interface for the General Information pages and Administrator functionality.
- Designed the front end using JSP and the business logic in servlets.
- Used the Struts framework for the application, based on the MVC-II architecture, and implemented the Validator framework.
- Mapped the servlets in the deployment descriptor (XML).
- Used HTML, JSP, JSP tag libraries, and Struts Tiles to develop the presentation tier.
- Deployed the application on JBoss Application Server and configured database connection pooling.
- Developed stored procedures and triggers for business rules.
- Performed unit tests and integration tests of the application.
- Used CVS as a documentation repository and version control tool.
Environment: Java, J2EE, JDBC, Servlets, JSP, Struts, HTML, CSS, JavaScript, UML, JBoss Application Server 4.2, MySQL
- Developed the complete business tier with session beans.
- Used Web services (SOAP) for transmission of large blocks of XML data over HTTP.
- Used XSL/XSLT for transforming common XML format into internal XML format.
- Used Apache Ant for the entire build process.
- Implemented database connectivity using JDBC with an Oracle 9i database as the back end.
- Designed and developed the application based on the Struts framework using the MVC design pattern.
- Used CVS for version control and JUnit for unit testing.
- Deployed the application on JBoss Application Server.