- 8+ years of overall experience in the IT industry, including Java development, database management, Big Data technologies, and web applications in multi-tiered environments using Java, Hadoop, Spark, Hive, HBase, Pig, Sqoop, J2EE (Spring, JSP, Servlets), JDBC, HTML, CSS, and JavaScript (AngularJS).
- 4+ years of comprehensive experience in Big Data Analytics, Hadoop and its ecosystem components.
- Working knowledge of the AWS environment and Spark on AWS, with strong experience in cloud computing platforms and AWS services.
- Extensive experience in Hadoop architecture and its components, including HDFS, Job Tracker, Task Tracker, Name Node, Data Node, and MapReduce concepts.
- Well versed in installing, configuring, supporting, and managing Big Data and the underlying infrastructure of Hadoop clusters, including CDH3 and CDH4 clusters.
- Developed a Cassandra based database and related web service for storing unstructured data.
- Experience in NoSQL databases including HBase, Cassandra.
- Designed a Cassandra NoSQL based database and associated RESTful web service that persists high-volume user profile data for vertical teams.
- Experience in building large-scale, highly available web applications; working knowledge of web services and other integration patterns.
- Developed simple to complex MapReduce streaming jobs using the Java language.
- Developed Hive scripts for end user / analyst requirements to perform ad hoc analysis.
- Experience in managing and reviewing Hadoop log files.
- Experience in using Pig, Hive, Sqoop, and Cloudera Manager.
- Very good understanding of partitioning and bucketing concepts in Hive; designed both managed and external tables in Hive to optimize performance.
- Experience in importing and exporting data using Sqoop from HDFS to Relational Database Systems and vice-versa.
- Hands on experience in RDBMS, and Linux shell scripting.
- Developed UDF, UDAF, and UDTF functions and implemented them in Hive queries.
- Experience in analyzing data using HiveQL, Pig Latin and Map Reduce.
- Developed Map Reduce jobs to automate transfer of data from HBase.
- Knowledge of job workflow scheduling and monitoring tools like Oozie and Zookeeper.
- Knowledge of data warehousing and ETL tools like Informatica and Pentaho.
- Experienced in Oracle Database Design and ETL with Informatica.
- Proven ability in defining goals, coordinating teams and achieving results.
- Experienced in developing Procedures, Functions, Packages, Views, Materialized Views, function-based indexes, Triggers, Dynamic SQL, and ad-hoc reporting using SQL.
- Experience with Business Intelligence and data warehouse (DW) applications.
- Generated ETL reports using Tableau and created statistics dashboards for Analytics.
- Proficient in using data visualization tools like Tableau and MS Excel.
- Experience in setting up HIVE, PIG, HBASE, and SQOOP on Ubuntu Operating system.
- Excellent Java development skills using J2EE, J2SE, Spring, Servlets, JUnit, MRUnit, JSP, and JDBC.
- Excellent global exposure to various work cultures and experience interacting with diverse client teams.
- Ability to work effectively in cross-functional team environments and experience of providing training to business users.
- Excellent interpersonal and communication skills, creative, research-minded, technically competent and result-oriented with problem solving and leadership skills.
- Practical understanding of data modeling concepts like Star Schema modeling, Snowflake Schema modeling, and Fact and Dimension tables.
- Collaborate with data architects for data model management and version control, conduct data model reviews with project team members, and create data objects (DDL).
- Collaborate with BI teams to create reporting data structures.
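The MapReduce work summarized above follows the standard map, shuffle, and reduce phases; as a minimal illustration (a pure-Python word-count sketch, not actual Hadoop code):

```python
from collections import defaultdict

def map_phase(lines):
    # Mapper: emit a (word, 1) pair for every word in every input line
    for line in lines:
        for word in line.split():
            yield (word.lower(), 1)

def shuffle(pairs):
    # Shuffle/sort: group values by key, as Hadoop does between map and reduce
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups

def reduce_phase(groups):
    # Reducer: sum the emitted counts for each word
    return {word: sum(counts) for word, counts in groups.items()}

lines = ["big data big cluster", "data pipeline"]
counts = reduce_phase(shuffle(map_phase(lines)))
```

In a real streaming job the mapper and reducer would be separate programs reading stdin and writing stdout, with Hadoop performing the shuffle.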
Programming Languages : C, C++, Java, Unix Shell Scripting, PL/SQL
J2EE Technologies: Spring, Servlets, JSP, JDBC, Hibernate.
Big Data Ecosystem: HDFS, HBase, MapReduce, Hive, Pig, Spark, Kafka, Storm, Sqoop, Impala, Cassandra, Oozie, Zookeeper, Flume.
DBMS : Oracle 11g, SQL Server, MySQL, IBM DB2.
Modeling Tools: UML on Rational Rose 4.0
Web Services: RESTful, SOAP.
IDEs : Eclipse, NetBeans, WinSCP, Visual Studio and IntelliJ.
Operating systems : Windows, UNIX, Linux (Ubuntu), Solaris, CentOS.
Version and Source Control: CVS, SVN and IBM Rational Clear Case.
Servers: Apache Tomcat, WebLogic and WebSphere.
Frameworks : MVC, Spring, Struts, Log4j, JUnit, Maven, Ant.
Confidential, Atlanta, Georgia
- Develop, automate, and maintain scalable cloud infrastructure to help process terabytes of data.
- Solve issues in Hive and introduce tools to optimize queries for better performance.
- Work with the data science team and software engineers to automate and scale their work.
- Automate and build scalable infrastructure in AWS.
- Designed and implemented Hive and Pig UDFs for evaluating, filtering, loading, and storing data.
- Created Hive tables as internal or external tables per requirements, defined with appropriate static and dynamic partitions for efficiency.
- Loaded and transformed large sets of structured and semi-structured data using Hive and Impala.
- Connected Hive and Impala to the Tableau reporting tool and generated graphical reports.
- Configured and tested Presto ODBC/JDBC connectivity, and documented and demonstrated its efficiency in querying data for business use.
- Worked with ActiveBatch to automate incremental scripts; observed and resolved numerous issues.
- Wrote many scripts in Redshift and moved scope from Hive to Redshift for relational data.
- Wrote Lambda functions to stream incoming data from APIs, created tables in DynamoDB, and then ingested the data into AWS S3.
- Created clusters in EMR with the Hive, Spark, Hue, Zeppelin-Sandbox, Ganglia, and Presto-Sandbox applications and scaled up the nodes; automated cluster launches using Python scripts.
- Built a framework for incremental queries using shell scripts; worked in SQL Server and PostgreSQL.
- Participated in multiple Big Data POCs to evaluate different architectures, tools, and vendor products.
- Solved numerous issues in Hive, Impala, and Presto; dealt with large datasets.
- Analyzed large datasets and revised existing workflows for efficiency, working within an Agile methodology.
Environment: MapReduce, HDFS, Core Java, Eclipse, Hive, Pig, Impala, Tableau, Spark, Hue, Ganglia, Presto-Sandbox, Zeppelin-Sandbox, SQL Server, PostgreSQL, Agile, Python.
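The automated EMR cluster launch described above can be sketched in Python; the cluster name, instance types, and counts below are illustrative placeholders, not the originals, and the actual API call via boto3 is shown commented out:

```python
# Sketch of automating an EMR cluster launch. In a live environment the
# dict would be passed to boto3.client("emr").run_job_flow(**params).
params = {
    "Name": "analytics-cluster",  # hypothetical cluster name
    "ReleaseLabel": "emr-5.30.0",
    "Applications": [{"Name": app} for app in
                     ("Hive", "Spark", "Hue", "Ganglia", "Presto", "Zeppelin")],
    "Instances": {
        "InstanceGroups": [
            {"InstanceRole": "MASTER", "InstanceType": "m5.xlarge",
             "InstanceCount": 1},
            {"InstanceRole": "CORE", "InstanceType": "m5.xlarge",
             "InstanceCount": 4},  # scale up nodes by raising this count
        ],
        "KeepJobFlowAliveWhenNoSteps": True,
    },
    "JobFlowRole": "EMR_EC2_DefaultRole",
    "ServiceRole": "EMR_DefaultRole",
}
# import boto3
# cluster_id = boto3.client("emr").run_job_flow(**params)["JobFlowId"]
```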
Confidential, Phoenix, Arizona
- Installed and configured MapReduce, Hive, and HDFS; implemented a CDH5 Hadoop cluster on CentOS. Assisted with performance tuning and monitoring.
- Conducted code reviews to ensure systems operations and prepare code modules for staging.
- Served as project manager for this project, contributing to management and estimation activities.
- Ran a Scrum-based Agile development group.
- Played a key role in driving high-performance infrastructure strategy, architecture, and scalability.
- Utilized high-level information architecture to design modules for complex programs.
- Wrote scripts to automate application deployments and configurations; monitored YARN applications.
- Implemented HAWQ to render queries faster than other Hadoop-based query interfaces.
- Wrote MapReduce programs to clean and pre-process data coming from different sources.
- Implemented various output formats, such as SequenceFile and Parquet, in MapReduce programs; also implemented multiple output formats in the same program to match the use cases.
- Implemented test scripts to support test driven development and continuous integration.
- Converted text files into Avro and then into Parquet format so the files could be used with other Hadoop ecosystem tools.
- Experienced in loading and transforming large sets of structured, semi-structured, and unstructured data.
- Exported the analyzed data to HBase using Sqoop to generate reports for the BI team.
- Analyzed large amounts of data sets to determine optimal way to aggregate and report on it.
- Participated in the requirement gathering and analysis phase of the project, documenting business requirements by conducting workshops and meetings with various business users.
- Worked on external HAWQ tables, loading data directly from CSV files and then loading it into internal tables.
- Responsible for implementation and ongoing administration of Hadoop infrastructure.
- Performance tuning of Hadoop clusters and Hadoop MapReduce routines.
- Manage and review Hadoop log files.
- File system management and monitoring.
- Point of Contact for Vendor escalation.
Environment: MapReduce, HDFS, Hive, Pig, Hue, Oozie, Core Java, Eclipse, HBase, Flume, Cloudera Manager, Oracle 10g, DB2, IDMS, VSAM, SQL*Plus, Toad, PuTTY, Windows NT, UNIX Shell Scripting, Pentaho Big Data, YARN, HAWQ, Spring XD, CDH.
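The log clean-up and pre-processing step above can be sketched as a mapper-style function; the "ip timestamp status path" field layout here is an assumed example, not the original log format:

```python
def parse_log_line(line):
    # Mapper-style clean-up: turn a raw access-log line into a structured
    # record. The four-field "ip timestamp status path" layout is assumed.
    parts = line.split()
    if len(parts) != 4:
        return None  # drop malformed records, as the cleaning job would
    ip, timestamp, status, path = parts
    return {"ip": ip, "ts": timestamp, "status": int(status), "path": path}

raw = ["10.0.0.1 2015-06-01T10:00:00 200 /home", "garbage-line"]
records = [r for r in map(parse_log_line, raw) if r]
```

In the actual MapReduce job each cleaned record would be emitted as a key/value pair for downstream aggregation rather than collected in a list.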
Confidential -NYC, NY
- Developed the application using Struts Framework that leverages classical Model View Controller (MVC) architecture.
- Created business logic using Servlets and POJOs and deployed them on the WebLogic server.
- Involved in running Hadoop jobs for processing millions of records of text data.
- Responsible for Cluster maintenance, adding and removing cluster nodes, Cluster Monitoring and Troubleshooting, manage and review data backups and log files.
- Worked with systems engineering team to plan and deploy new Hadoop environments and expand existing Hadoop Clusters.
- Monitored multiple Hadoop clusters environments using Ganglia.
- Managing and scheduling Jobs on a Hadoop cluster.
- Involved in defining job flows, managing and reviewing log files.
- Monitored workload, job performance, and capacity planning using Cloudera Manager.
- Installed Oozie workflow engine to run multiple Map Reduce, Hive and Pig jobs.
- Implemented MapReduce programs on log data to transform it into a structured form and find user information.
- Responsible for loading and transforming large sets of structured, semi structured and unstructured data.
- Collected the log data from web servers and integrated into HDFS using Flume.
- Responsible to manage data coming from different sources.
- Extracted files from CouchDB, placed them into HDFS using Sqoop, and pre-processed the data for analysis.
- Gained experience with NoSQL databases.
- Created and maintained technical documentation for launching Hadoop clusters and for executing Hive queries and Pig scripts.
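The Flume-based log collection above amounts to buffering events from web servers and flushing batches to a sink such as HDFS; a simplified stand-in (not Flume itself, with an in-memory sink in place of HDFS):

```python
class LogCollector:
    # Simplified stand-in for a Flume agent: buffer events from a source
    # and flush them to a sink once the batch size is reached. Here the
    # sink is an in-memory list; in production it would be HDFS.
    def __init__(self, sink, batch_size=3):
        self.sink = sink
        self.batch_size = batch_size
        self.buffer = []

    def collect(self, event):
        self.buffer.append(event)
        if len(self.buffer) >= self.batch_size:
            self.flush()

    def flush(self):
        if self.buffer:
            self.sink.append(list(self.buffer))
            self.buffer.clear()

sink = []
agent = LogCollector(sink, batch_size=2)
for event in ["GET /a", "GET /b", "GET /c"]:
    agent.collect(event)
```

Batching like this is what lets the collection tier absorb bursts from many web servers without issuing a write to the sink per event.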
Confidential - Scranton, PA
Sr. Java Developer
- Analyzed client requirements and design.
- Agile methodology is followed for development process.
- Developed the presentation layer using HTML5, CSS3, and Ajax
- Developed the application using Struts Framework that uses Model View Controller (MVC) architecture with JSP as the view
- Extensively used Spring IOC for Dependency Injection and worked on Custom MVC Frameworks loosely based on Struts
- Used RESTful Web services for transferring data between applications
- Configured Spring with the ORM framework Hibernate for handling DAO classes and to bind objects to the relational model
- Adopted J2EE design patterns like Singleton, Service Locator and Business Facade
- Developed POJO classes and used annotations to map with database tables
- Used Java Message Service (JMS) for reliable and asynchronous exchange of important information such as Credit card transactions report
- Used Multi-Threading to handle more users
- Developed Hibernate JDBC code for establishing communication with database
- Worked with DB2 database for persistence with the help of PL/SQL querying
- Used SQL queries to retrieve information from database
- Developed various triggers, functions, procedures, views for payments
- XSL/XSLT is used for transforming and displaying reports
- Used GIT to keep track of all work and all changes in source code
- Used JProfiler for performance tuning
- Wrote python scripts to parse XML documents and load the data in database
- Wrote test cases that adhere to a Test Driven Development (TDD) pattern
- Used JUnit, a test framework that uses annotations to identify methods that specify a test
- Used Log4j to log messages depending on the message type and level
- Built the application using Maven and deployed it using WebSphere Application Server
Environment: Java 8, Spring framework, Spring Model View Controller (MVC), Struts 2.0, XML, Hibernate 3.0, UML, Java Server Pages (JSP) 2.0, Python, Servlets 3.0, JDBC4.0, JUnit, Log4j, MAVEN, Win 7, HTML, RESTClient, Eclipse, Agile Methodology, Design Patterns, WebSphere 6.1.
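The Python scripts for parsing XML documents mentioned above can be sketched with the standard library; the element and attribute names below are illustrative, not the originals:

```python
import xml.etree.ElementTree as ET

# Hypothetical payment report document, standing in for the real XML feeds.
doc = """<payments>
  <payment id="1" amount="120.50"/>
  <payment id="2" amount="75.00"/>
</payments>"""

# Parse the document and flatten each <payment> element into a row
# that could then be loaded into the database via JDBC or a bulk loader.
root = ET.fromstring(doc)
rows = [(p.get("id"), float(p.get("amount"))) for p in root.findall("payment")]
```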
- Involved in various phases of the Software Development Life Cycle (SDLC) such as design, development, unit testing, and maintenance.
- Agile Scrum methodology was followed for the development process.
- Designed different design specifications for application development that includes front-end, back-end using design patterns.
- Involved in developing JSPs for client data presentation and client-side data validation within the forms.
- Developed the application by using the Spring MVC framework.
- Used the Collections framework to transfer objects between the different layers of the application.
- Developed data mapping to create a communication bridge between various application interfaces using XML, and XSL.
- Used Spring IOC to inject the parameter values for dynamic parameters.
- Developed JUnit tests for unit-level testing.
- Actively involved in code review and bug fixing for improving the performance.
- Documented application for its functionality and its enhanced features.
- Created connection through JDBC and used JDBC statements to call stored procedures.
- Provide L3 application support as primary on call.
- Involved in the development of Report Generation module which includes volume statistics report, Sanctions Monitoring Metrics report, and TPS report.
- Implemented the Online List Management (OLM) and FMM modules using Spring and Hibernate.
- Wrote various SQL, PL/SQL queries and stored procedures for data retrieval.
- Created configuration files for the application using the Spring framework.
- Developed the system by following the agile methodology.
- Involved in vital phases of the Software development life cycle (SDLC) that includes Design, Development, Testing, Deployment and Maintenance Support.
- Applied OOAD principles for the analysis and design of the system.
- Created real time web applications using Node.js
- WebSphere Application Server was used to deploy the build.
- Developed business objects using the Spring Framework.
- Performed data validation in Struts Form beans and Action Classes.
- Eclipse IDE is used for Development, Testing and Debugging of the application.
- Used a DOM parser to parse the XML files.
- Log4j framework has been used for logging debug, info & error data.
- Used Oracle 10g Database for data persistence.
- SQL Developer was used as a database client.
- Used WinSCP to transfer files between local and remote systems.
- Performed Test Driven Development (TDD) using JUnit.
- Used Ant script for build automation.
- Used Rational Clear Quest for defect logging and issue tracking.