- Over eight (8+) years of extensive IT experience, including 3+ years in Big Data and Hadoop framework technologies.
- Proficient in project implementation (SDLC), specifically the integration of business intelligence strategy, requirement gathering, requirement analysis, data modeling, information processing, system design, testing, and training.
- Experienced in conducting current-state (as-is) system analysis, defining the future state (to-be), eliciting requirements, developing functional and technical requirements, mapping business processes to application capabilities, conducting fit-gap analysis, developing, configuring, and prototyping solutions, and implementing to-be processes and solutions based on application capability and industry best practices.
- In-depth knowledge of Hadoop architecture and its components, such as HDFS, JobTracker, TaskTracker, NameNode, DataNode, and YARN, and of the MapReduce programming paradigm for analyzing large data sets efficiently.
- Hands-on experience with Hadoop ecosystem components such as Hive, Pig, Sqoop, MapReduce, and Flume.
- Hands-on experience with related/complementary open source software platforms and languages (e.g. Java, Linux).
- Experience in importing and exporting terabytes of data between HDFS and relational database systems using Sqoop.
- Experience in analyzing data using HiveQL, Pig Latin, and Java.
- Hands-on experience with NoSQL databases such as MongoDB and HBase. Knowledge of job workflow scheduling and monitoring tools such as Oozie and ZooKeeper.
- Experience processing data using Pig and Hive, including creating Hive tables, loading data, and writing Hive queries.
- Followed test-driven development within Agile and Scrum methodologies to produce high-quality software.
- Good knowledge of data mining tools such as Weka for analyzing data.
- Good knowledge of Hadoop administration activities such as installation and configuration of clusters using Apache and Cloudera distributions.
- Involved in data modeling, system and data analysis, design, and development for data warehousing environments. Strong knowledge of ETL methods; developed mapping spreadsheets for the ETL team with source-to-target data mappings, physical naming standards, data types, volumetrics, domain definitions, and corporate metadata definitions.
- Established and maintained comprehensive data model documentation including detailed descriptions of business entities, attributes, and data relationships.
- Comprehensive knowledge and experience in process improvement, normalization/de-normalization, data extraction, data cleansing, data manipulation.
- Worked on database and ETL testing: functions, stored procedures, packages, constraints, loading data into tables, and executing scripts. Also performed test case preparation and execution and defect management.
- Experience in Object Oriented Analysis and Design, Java technologies.
- Experienced in working on large projects and teams; a good team player and quick learner, having worked across different teams.
- Repeatedly demonstrated success under aggressive project schedules and deadlines; flexible, results-oriented, and adaptable to the environment to meet the goals of the product and organization.
- Excellent work ethic, self-motivated, and team-oriented. Continually provided value-added services to clients through thoughtful experience and excellent communication skills.
Hadoop/Big Data: HDFS, MapReduce, Hive, Pig, Sqoop, Flume, Oozie, Spark, and ZooKeeper
NoSQL Databases: MongoDB, HBase
Relational Databases: SQL, Oracle
Languages: Java, Pig Latin, HiveQL, C, C++, Python, Scala
Operating Systems: Linux and Windows XP/Vista/7/8/10
Tools and IDE: Eclipse, Weka, SAS, SAP, STATA, Microsoft Project, Microsoft Office suite desktop applications (e.g., MS Visio, Word, Excel, PowerPoint)
Networks: TCP/IP, LAN / WAN
Confidential, Mayfield, OH
Sr. Hadoop Developer & Admin
- Converted the existing relational database model to the Hadoop ecosystem.
- Generated datasets and loaded them into the Hadoop ecosystem.
- Worked with Linux systems and RDBMS databases on a regular basis to ingest data using Sqoop.
- Worked with Spark to create structured data from the pool of unstructured data received.
- Wrote Spark programs in Java, and at times Scala, to implement intermediate functionality such as counting events or records from Flume sinks or Kafka topics.
- Managed and reviewed Hadoop and HBase log files.
- Developed multiple Kafka producers and consumers from scratch as per the software requirement specifications.
- Created, altered, and deleted topics (Kafka queues) as required, with varying configurations for replication factor, partitions, and TTL.
- Responsible for managing data coming from different sources.
- Loaded data into the Hadoop cluster from relational databases using Sqoop and from other sources using Flume.
- Involved in loading data from UNIX file system and FTP to HDFS.
- Designed and implemented Hive queries and functions for evaluation, filtering, loading, and storing of data.
- Created Hive tables and worked with them using HiveQL.
- Developed workflows in Oozie to automate loading data into HDFS and pre-processing it with Pig.
- Collected log data from web servers and integrated it into HDFS using Flume.
- Used Hive to perform transformations, event joins, and some pre-aggregations before storing the data in HDFS.
- Supported the existing MapReduce programs running on the cluster.
- Followed Agile methodology for the entire project.
- Prepared technical design documents and detailed design documents.
Environment: Hadoop, HDFS, MapReduce, Hive, Pig, Flume, Kafka, Sqoop, Oozie, MySQL, Cassandra, Java, SQL.
Confidential, Minneapolis, MN
- Moved all crawl data flat files generated from various retailers to HDFS for further processing.
- Developed MapReduce programs using Hadoop for working with Big Data.
- Used Flume to import data from applications.
- Created Hive external tables, loaded data into them, and queried the data using HQL.
- Involved in partitioning and joining Hive tables for Hive query optimization.
- Used Pig and Hive to analyze data.
- Created HBase tables to load large sets of structured, semi-structured, and unstructured data.
- Good knowledge of analyzing data in HBase.
- Enabled speedy reviews and first-mover advantages by using Oozie to automate data loading into the Hadoop Distributed File System and Pig to pre-process the data.
- Developed Sqoop scripts to enable interaction between Pig and the MySQL database.
- Scheduled and monitored jobs using Oozie.
- Extended Hive and Pig core functionality with custom UDFs.
- Debugged Pig and Hive scripts and optimized MapReduce jobs.
- Managed and reviewed Hadoop log files.
- Tested raw data and executed performance scripts.
Environment: Hadoop, Pig, Hive, MapReduce, Sqoop, Flume, Oozie, HBase, Cloudera Manager
Confidential, Nashville, TN
- Responsible for the development & unit testing of Staffing Request module using Struts.
- Prepared architectural prototype to validate the architecture and baseline for the development.
- Involved in system design and development in Core Java using Collections and multithreading.
- Used Hibernate Query Language to store and retrieve data from the database.
- Configured JMS queues and topics on the JBoss server.
- Used Struts tag libraries and custom tag libraries extensively while coding JSP pages.
- Interacted with clients to understand their needs and proposed designs to the team to implement the requirements.
- Trained team members to understand and use the system.
- Developed PL/SQL objects like packages, procedures and functions.
- Consistently adhered to quality processes when delivering tasks to the client.
- Provided Test Scripts and Templates with test results of each task delivered to the client team.
- Applied Java/J2EE best practices: minimized unnecessary object creation, encouraged proper garbage collection of unused objects, minimized database calls, and fetched data from the database in bulk to get the best application performance.
Environment: J2EE (JSPs, Servlets, EJB), HTML, Struts, DB2, Hibernate 3.0, Log4j, JUnit 3.8.1, Eclipse 3.1.1, JBoss Plugins, JMS in JBoss, CVS, CSS and JS, SQL Server
Confidential, Chicago, IL
- Used UML to develop use cases, sequence diagrams, and preliminary class diagrams for the system, and was involved in low-level design.
- Extensively used Eclipse IDE for building, testing, and deploying applications.
- Developed the entire framework for the projects, based on Struts MVC and Spring MVC.
- Developed a new screen for the VAT using ICEfaces.
- Designed and developed a batch process for VAT.
- Developed controllers, repositories, service modules, form beans, and validations.
- Developed beans and persisted them in the database using JDBC and Hibernate.
- Connected beans to the database using the Hibernate configuration file.
- Involved in development of the Spring DAO layer, which invoked database queries.
- Developed Session Beans for the transactions in the application.
- Created SQL queries, PL/SQL Stored Procedures, Functions for the Database layer by studying the required business objects and validating them with Stored Procedures. Also used JPA with Hibernate provider.
- Wrote Ant scripts to build the entire module and deploy it on WebLogic Application Server.
- Implemented JUnit framework to write test cases for different modules and resolved the test findings.
- Used Subversion for software versioning and as a revision control system.
- Played a critical role in planning and overseeing software development activities, leading teams against competing deliverables, and actively identifying production issues and bringing them to quick resolution.
Environment: JDK 1.6, ICEfaces, DAO, JPA, JSP, Servlets, Hibernate, WebLogic 10.3.4, AJAX, SVN, JDBC, Web Services, XML, XSLT, CSS, DOM, HTML, ANT, DB2, MS SQL, UML, JUnit, jQuery, Toad
Junior Java Developer
- Gathered requirements for the project and was involved in the analysis phase.
- Developed a quick prototype for the project to help the business decide on necessary changes to the requirements.
- Used a strict object-oriented model to separate employee roles and their specifications, achieving extensibility and clarity.
- Installed, configured, and clustered BEA WebLogic Server on the Windows NT platform.
- Developed dynamic content of presentation layer using JSP.
- Involved in the design of tables in Oracle to store the pay information.
- Used JDBC to interact with the Oracle database for storage and retrieval of information.
- Tested code using JUnit.
- Created UML class and sequence diagrams using Rational Rose.
Environment: Java, JavaScript, HTML, JUnit, WebLogic, Tomcat 4.x, Oracle 8i