- Over 8 years of experience in Information Technology, including 3 years of experience in Big Data/Hadoop.
- Experience working with BI teams to transform big data requirements into Hadoop-centric technologies.
- Experience in performance tuning Hadoop clusters by gathering and analyzing the existing infrastructure.
- Strong object-oriented concepts with complete software development life cycle experience: requirements gathering, conceptual design, analysis, detailed design, development, mentoring, and system and user acceptance testing.
- Excellent knowledge of Hadoop architecture and ecosystem components such as HDFS, JobTracker, TaskTracker, NameNode, DataNode and MapReduce.
- Used different Hive SerDes, such as RegexSerDe.
- Extensive knowledge of data serialization formats such as Avro and Sequence files.
- Excellent understanding and knowledge of NoSQL databases such as HBase.
- Experience in supporting data analysts in running Pig and Hive queries.
- Developed Map Reduce programs to perform analysis.
- Imported and exported data into HDFS and Hive using Sqoop.
- Experience in writing shell scripts to dump shared data from MySQL servers to HDFS.
- Highly knowledgeable in the Writable and WritableComparable interfaces, the Mapper and Reducer classes, and Hadoop data types such as IntWritable, BytesWritable and Text.
- Experience in using Oozie 0.1 for managing Hadoop jobs.
- Experience in cluster coordination using Zookeeper.
- Extensive development experience in IDEs such as Eclipse, NetBeans, Forte and STS.
- Expertise in relational databases such as Oracle and MySQL.
- Experience in designing both time-driven and data-driven automated workflows using Oozie 3.0 to run Hadoop MapReduce 2.0 jobs.
- Experience in installing, configuring, supporting and managing Cloudera's Hadoop platform, including CDH3 and CDH4 clusters.
- Experienced in setting up SSH, SCP, SFTP connectivity between UNIX hosts.
- Extensive experience working with customers to gather the information needed to analyze and debug technical problems and provide data or code fixes; building service patches for each version release; performing unit, integration, user acceptance and system testing; and providing technical solution documents for users.
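As a quick illustration of the MapReduce programming model referenced above, here is a minimal sketch of word-count logic in plain Java; the map and reduce methods stand in for Hadoop's Mapper and Reducer classes, and no Hadoop runtime or Writable types are involved.

```java
import java.util.*;

// Minimal sketch of MapReduce word-count semantics in plain Java.
// The two phases mirror Hadoop's Mapper/Reducer contract without
// the actual org.apache.hadoop APIs or cluster runtime.
public class WordCountSketch {
    // "Map" phase: emit a (word, 1) pair for every token in every line.
    static List<Map.Entry<String, Integer>> map(List<String> lines) {
        List<Map.Entry<String, Integer>> pairs = new ArrayList<>();
        for (String line : lines)
            for (String word : line.toLowerCase().split("\\s+"))
                if (!word.isEmpty())
                    pairs.add(new AbstractMap.SimpleEntry<>(word, 1));
        return pairs;
    }

    // "Shuffle + reduce" phase: group pairs by key and sum the counts.
    static Map<String, Integer> reduce(List<Map.Entry<String, Integer>> pairs) {
        Map<String, Integer> counts = new TreeMap<>();
        for (Map.Entry<String, Integer> p : pairs)
            counts.merge(p.getKey(), p.getValue(), Integer::sum);
        return counts;
    }

    public static void main(String[] args) {
        Map<String, Integer> counts =
            reduce(map(Arrays.asList("big data", "big hadoop data")));
        System.out.println(counts); // {big=2, data=2, hadoop=1}
    }
}
```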
Programming Languages: Java 1.4, C++, C, SQL, Pig, PL/SQL.
Java Technologies: JDBC.
Frameworks: Jakarta Struts 1.1, JUnit and JTest.
Databases: Oracle 8i/9i, NoSQL, MySQL, MS SQL Server.
IDEs & Utilities: Eclipse, JCreator, NetBeans.
Web Technologies: HTML, XML.
Protocols: TCP/IP, HTTP and HTTPS.
Operating Systems: Linux, MacOS, Windows 98/2000/NT/XP.
Hadoop Ecosystem: Hadoop and MapReduce, Sqoop, Hive, Pig, HBase, HDFS, Oozie.
Confidential, Hartford, CT
- Worked on evaluation and analysis of the Hadoop cluster and different big data analytic tools, including Pig, the HBase database and Sqoop.
- Responsible for building scalable distributed data solutions using Hadoop.
- Involved in loading data from LINUX file system to Hadoop Distributed File System.
- Created HBase tables to store various formats of PII data coming from different portfolios.
- Experience in managing and reviewing Hadoop log files.
- Exported the analyzed and processed data to relational databases using Sqoop for visualization and report generation by the BI team.
- Installed the Oozie workflow engine to run multiple Hive and Pig jobs.
- Analyzed large data sets to determine the optimal way to aggregate and report on them.
- Worked with the Data Science team to gather requirements for various data mining projects.
- Analyzed large data sets by running Hive queries and Pig scripts.
- Created dashboards using Tableau to analyze data for reporting.
- Supported setting up the QA environment and updating configurations for implementation scripts with Pig and Sqoop.
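One recurring design decision when storing multi-portfolio data in HBase, as in the PII tables above, is the row-key scheme. The sketch below is purely illustrative (the key layout and names are hypothetical, not the production schema): a portfolio prefix keeps each portfolio's rows contiguous for bounded scans, while reversing a sequential numeric id helps avoid hot-spotting a single region.

```java
// Hypothetical composite HBase row-key builder, plain Java only
// (no HBase client library needed to illustrate the key layout).
public class RowKeySketch {
    // Build "portfolio|reversedCustomerId": the prefix bounds scans by
    // portfolio; reversing the id spreads monotonically increasing ids
    // across regions instead of piling writes onto one region server.
    static String rowKey(String portfolio, long customerId) {
        return portfolio + "|"
            + new StringBuilder(String.valueOf(customerId)).reverse();
    }

    public static void main(String[] args) {
        System.out.println(rowKey("auto", 12345)); // auto|54321
    }
}
```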
Environment: Hadoop, HDFS, Pig, Sqoop, HBase, Shell Scripting, Linux Red Hat
Confidential Angeles, CA
- Exported data from DB2 to HDFS using Sqoop and NFS mount approach.
- Moved data from HDFS to Cassandra using MapReduce and the BulkOutputFormat class.
- Developed MapReduce programs for applying business rules to the data.
- Developed and executed Hive queries for denormalizing the data.
- Installed and configured Hadoop Cluster for development and testing environment.
- Implemented the Fair Scheduler on the JobTracker to share cluster resources among users' MapReduce jobs.
- Analyzed web log data using HiveQL to extract the number of unique visitors per day, page views, visit duration and the most purchased product on the website.
- Exported the analyzed data to the relational databases using Sqoop for visualization and to generate reports by our BI team.
- Dumped data into HDFS using Sqoop for analysis.
- Developed data pipelines using Pig and Hive from Teradata and Netezza data sources; these pipelines used customized UDFs to extend the ETL functionality.
- Developed job flows in Oozie to automate the workflow for extracting data from Teradata and Netezza.
- Developed a data pipeline into DB2 containing user purchasing data from Hadoop.
- Implemented partitioning, dynamic partitions and buckets in Hive, and wrote MapReduce programs to analyze and process the data.
- Streamlined Hadoop jobs and workflow operations using Oozie workflow engine.
- Involved in the product life cycle, developed using the Scrum methodology.
- Mentored the team in technical discussions and technical reviews.
- Involved in code reviews and verifying bug analysis reports.
- Automated workflows using shell scripts.
- Performance-tuned Hive queries written by other developers.
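The web-log analysis described above was essentially a HiveQL query of the form SELECT day, COUNT(DISTINCT user) ... GROUP BY day. As a self-contained illustration of that aggregation, here is the same logic in plain Java over a simplified "day,user" log format (the format and names are illustrative, not the actual log schema):

```java
import java.util.*;

// Plain-Java sketch of "unique visitors per day", mirroring a HiveQL
// COUNT(DISTINCT user) GROUP BY day over simplified "day,user" lines.
public class UniqueVisitorsSketch {
    static Map<String, Integer> uniqueVisitorsPerDay(List<String> logLines) {
        // Collect the distinct users seen on each day.
        Map<String, Set<String>> visitors = new TreeMap<>();
        for (String line : logLines) {
            String[] f = line.split(",");            // f[0]=day, f[1]=user
            visitors.computeIfAbsent(f[0], d -> new HashSet<>()).add(f[1]);
        }
        // Reduce each day's user set to its size, i.e. COUNT(DISTINCT user).
        Map<String, Integer> counts = new TreeMap<>();
        visitors.forEach((day, users) -> counts.put(day, users.size()));
        return counts;
    }

    public static void main(String[] args) {
        List<String> logs = Arrays.asList(
            "2013-01-01,alice", "2013-01-01,bob",
            "2013-01-01,alice", "2013-01-02,bob");
        System.out.println(uniqueVisitorsPerDay(logs)); // {2013-01-01=2, 2013-01-02=1}
    }
}
```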
Environment: Hadoop, HDFS, Hive, MapReduce 2.0, Sqoop 2.0.0, Oozie 3.0, Shell Scripting, Ubuntu, Linux Red Hat.
Confidential, Minnetonka, MN
- Extensively used Core Java, Servlets, JSP and XML.
- Used Struts 1.2 in the presentation tier.
- Generated the Hibernate XML and Java Mappings for the schemas
- Used DB2 Database to store the system data
- Actively involved in the system testing
- Involved in fixing bugs and unit testing with test cases using JUnit
- Wrote complex SQL queries and stored procedures
- Used IBM WebSphere as the application server.
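The Hibernate XML mappings mentioned above pair a persistent class with its database table. A minimal illustrative mapping file is sketched below; the class, table and column names are hypothetical, not taken from the original system.

```xml
<?xml version="1.0"?>
<!DOCTYPE hibernate-mapping PUBLIC
    "-//Hibernate/Hibernate Mapping DTD 3.0//EN"
    "http://hibernate.sourceforge.net/hibernate-mapping-3.0.dtd">
<hibernate-mapping>
  <!-- Maps the hypothetical Customer class to the CUSTOMER table -->
  <class name="com.example.Customer" table="CUSTOMER">
    <id name="id" column="CUSTOMER_ID">
      <generator class="native"/>
    </id>
    <property name="name" column="NAME"/>
  </class>
</hibernate-mapping>
```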
Environment: Java 1.2/1.3, Swing, Applet, Servlet, JSP, XML, HTML, JavaScript, Oracle, DB2, PL/SQL
Confidential
Programmer Analyst/Java Developer
Responsibilities:
- Involved in the complete software development life cycle: requirement analysis, conceptual design, detailed design, development, and system and user acceptance testing.
- Involved in Design and Development of the System using Rational Rose and UML.
- Involved in business analysis and developed use cases and program specifications to capture the business functionality.
- Improved coding standards, code reuse and performance of the Extend application by making effective use of various design patterns (Business Delegate, View Helper, DAO, Value Object and other basic patterns).
- Designed the system using JSPs and Servlets.
- Designed the application using the Process Object, DAO, Data Object, Value Object, Factory and Delegation patterns.
- Involved in integrating RFID into the software and developing the code for its API.
- Coordinated between teams as project coordinator, organizing design and architectural meetings.
- Designed and developed class diagrams, identified objects and their interactions, and specified sequence diagrams for the system using Rational Rose.
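To illustrate the DAO and Value Object patterns listed above, here is a minimal sketch in plain Java; the in-memory DAO, class names and fields are hypothetical stand-ins for the real persistence layer, which would typically sit behind the same interface using JDBC or an ORM.

```java
import java.util.*;

// Illustrative DAO + Value Object sketch (all names hypothetical).
public class DaoSketch {
    // Value Object: an immutable data carrier passed between tiers.
    static final class CustomerVO {
        final int id;
        final String name;
        CustomerVO(int id, String name) { this.id = id; this.name = name; }
    }

    // DAO interface: callers depend on this, not on storage details.
    interface CustomerDao {
        void save(CustomerVO c);
        CustomerVO findById(int id);
    }

    // In-memory implementation; a JDBC- or Hibernate-backed DAO would
    // implement the same interface without changing any caller code.
    static final class InMemoryCustomerDao implements CustomerDao {
        private final Map<Integer, CustomerVO> store = new HashMap<>();
        public void save(CustomerVO c) { store.put(c.id, c); }
        public CustomerVO findById(int id) { return store.get(id); }
    }

    public static void main(String[] args) {
        CustomerDao dao = new InMemoryCustomerDao();
        dao.save(new CustomerVO(1, "Alice"));
        System.out.println(dao.findById(1).name); // Alice
    }
}
```

The point of the pattern is the seam at the interface: swapping the storage technology changes only the DAO implementation, never the business tier.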
Environment: JDK 1.3, J2EE, JSP, Servlets, HTML, XML, UML, Rational Rose, AWT, WebLogic 5.1, Oracle 8i, SQL, PL/SQL.
References: Available upon request.