Big Data Consultant Resume
Wilmington, DE
SUMMARY
- 8+ years of experience with an emphasis on Big Data technologies and the design and development of Java-based enterprise applications.
- Extensive experience developing Big Data projects using Hadoop, MapReduce, Pig, Hive, Sqoop, Flume, and Oozie.
- Good experience working with the Hortonworks and Cloudera distributions.
- Experience in installation, configuration, supporting and managing Hadoop clusters.
- Implemented standards and processes for Hadoop based application design and implementation.
- Responsible for writing MapReduce programs using Java.
- Performed logical implementation of, and interaction with, HBase.
- Developed MapReduce jobs to automate transfer of data from HBase.
- Experience in end-to-end DevOps practices.
- Performed data analysis using Hive and Pig.
- Wrote MapReduce programs in Hadoop using Pig, Hive, and Scala.
- Loaded streaming log data from various web servers into HDFS using Flume.
- Successfully loaded files into Hive and HDFS from Oracle and SQL Server using Sqoop.
- Assist with the addition of Hadoop processing to the IT infrastructure.
- Strong understanding of NoSQL databases such as HBase, MongoDB, and Cassandra.
- Worked across multiple environments on installation and configuration.
- Documented and explained implemented processes and configurations during upgrades.
- Support development, testing, and operations teams during new system deployments.
- Evaluate and propose new tools and technologies to meet the needs of the organization.
- Experience in using Sqoop, Oozie and Cloudera Manager.
- Good Knowledge on Hadoop Cluster architecture and monitoring the cluster.
- Strong understanding of data warehouse concepts, including ETL, star and snowflake schemas, and data modeling experience using normalization, business process analysis and reengineering, dimensional data modeling, and physical and logical data modeling.
- Implemented stand-alone installation, file system management, backups, process control, user administration, and device management in a networked environment.
- An excellent team player and self-starter with good communication skills and proven abilities to finish tasks before target deadlines.
TECHNICAL SKILLS
Programming Languages: Java, C, C++, SQL, Pig Latin, PL/SQL.
Java/J2EE Technologies: Applets, Swing, JDBC, JSON, JMS, JSP, Servlets, JSF, jQuery
Frameworks: MVC, Struts, Spring, Hibernate
IDEs & Utilities: Eclipse, JCreator, NetBeans, Maven
Web Dev. Technologies: HTML, XML.
Protocols: TCP/IP, HTTP and HTTPS.
Operating Systems: Linux, Mac OS, Windows 98/2000/NT/XP.
Databases: Oracle 8i/9i, MySQL, MS SQL Server.
NoSQL Databases: HBase, Cassandra, MongoDB
Hadoop Ecosystem: Hadoop, MapReduce, Sqoop, Hive, Pig, HBase, HDFS, ZooKeeper, Flume, Oozie, Kafka.
PROFESSIONAL EXPERIENCE
Confidential, Wilmington, DE
Big Data Consultant
Responsibilities:
- Analyzed data using Pig and Hive and loaded data into HDFS using Sqoop.
- Made extensive use of Confidential scripting to load data into HDFS.
- Worked with the QA and Production team in the data loading process.
- Worked with TWS to schedule jobs.
- Data processing involved working with Ant builds.
- Implemented batch processing in Spark using Scala.
- Installed Kafka on the Hadoop cluster and wrote producer and consumer code in Java to establish connections.
- Data loading involved creating Hive tables and partitions based on the requirement.
- Worked with various types of SerDes.
- Worked on HCatalog, which allows Pig and MapReduce to take advantage of the SerDe data-format transformation definitions written for Hive.
- Worked on various UDFs used in the MES solution to give DAs and developers a way to encrypt and decrypt data.
- Worked on DevOps tools like Chef and Puppet to configure and maintain the production environment
- Worked with different file formats (ORCFile, RCFile, SequenceFile, TextFile) and compression codecs (Gzip, Snappy, LZO).
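The partitioned, compressed Hive tables described above could be created along these lines; the table, column, and partition names below are illustrative assumptions, not the project's actual schema:

```sh
# Hypothetical sketch: create a partitioned, ORC-backed Hive table with
# Snappy compression, then load a day's staged data into one partition.
# All identifiers and paths are placeholders for illustration.
hive -e "
  CREATE TABLE IF NOT EXISTS web_logs (
    user_id STRING,
    url     STRING,
    ts      TIMESTAMP
  )
  PARTITIONED BY (load_date STRING)
  STORED AS ORC
  TBLPROPERTIES ('orc.compress' = 'SNAPPY');

  LOAD DATA INPATH '/staging/web_logs/2015-06-01'
  INTO TABLE web_logs PARTITION (load_date = '2015-06-01');
"
```

Choosing ORC with Snappy trades a little compression ratio for faster reads, which suits the analytical queries mentioned above.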
Environment: Hadoop, HDFS, Pig, Hive, MapReduce, Sqoop, Hortonworks distribution, Spark, Scala, Confidential scripting, Ubuntu, Red Hat Linux, Kafka
Confidential, Austin, TX
Hadoop Developer
Responsibilities:
- Worked on analyzing data and writing Hadoop MapReduce jobs using the Java API, Pig, and Hive.
- Responsible for building scalable distributed data solutions using Hadoop.
- Involved in loading data from edge node to HDFS using Confidential scripting.
- Worked on cluster installation, commissioning and decommissioning of DataNodes, NameNode high availability, capacity planning, and slot configuration.
- Created HBase tables to store variable data formats of PII data coming from different portfolios.
- Implemented a script to transmit sysprin information from Oracle to HBase using Sqoop.
- Implemented best-income logic using Pig scripts and UDFs.
- Implemented test scripts to support test driven development and continuous integration.
- Worked on tuning the performance of Pig queries.
- Worked with application teams to install operating system, Hadoop updates, patches, version upgrades as required.
- Responsible for managing data coming from different sources.
- Loaded and transformed large sets of structured, semi-structured, and unstructured data.
- Experience in managing and reviewing Hadoop log files.
- Job management using Fair scheduler.
- Exported the analyzed data to relational databases using Sqoop for visualization and to generate reports for the BI team.
- Responsible for cluster maintenance, adding and removing cluster nodes, cluster monitoring and troubleshooting, manage and review data backups, manage and review Hadoop log files.
- Installed the Oozie workflow engine to run multiple Hive and Pig jobs.
- Analyzed large amounts of data sets to determine optimal way to aggregate and report on it.
- Supported in setting up QA environment and updating configurations for implementing scripts with Pig and Sqoop.
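The Sqoop transfers described above (Oracle to HBase, and export of analyzed results back to a relational database) might be invoked roughly as follows; the connection strings, table names, and column-family names are placeholders, not the project's real values:

```sh
# Hypothetical sketch of the Oracle-to-HBase import and a relational
# export of analyzed results. All identifiers are illustrative assumptions.
sqoop import \
  --connect jdbc:oracle:thin:@//dbhost:1521/ORCL \
  --username etl_user -P \
  --table SYSPRIN_INFO \
  --hbase-table sysprin \
  --column-family cf \
  --hbase-row-key SYSPRIN_ID

sqoop export \
  --connect jdbc:mysql://dbhost/reports \
  --username etl_user -P \
  --table daily_summary \
  --export-dir /user/hive/warehouse/daily_summary
```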
Environment: Hadoop, HDFS, Pig, Sqoop, HBase, Confidential scripting, Ubuntu, Red Hat Linux.
Confidential, Houston, TX
Hadoop Developer
Responsibilities:
- Involved in review of functional and non-functional requirements.
- Facilitated knowledge transfer sessions.
- Installed and configured Hadoop MapReduce and HDFS; developed multiple MapReduce jobs in Java for data cleaning and preprocessing.
- Imported and exported data into HDFS and Hive using Sqoop.
- Experienced in defining job flows.
- Experienced in managing and reviewing Hadoop log files.
- Experienced in running Hadoop streaming jobs to process terabytes of XML-format data.
- Loaded and transformed large sets of structured, semi-structured, and unstructured data.
- Responsible for managing data coming from different sources.
- Gained good experience with NoSQL databases.
- Supported MapReduce programs running on the cluster.
- Involved in loading data from UNIX file system to HDFS.
- Installed and configured Hive and wrote Hive UDFs.
- Involved in creating Hive tables, loading them with data, and writing Hive queries that run internally as MapReduce jobs.
- Gained strong business knowledge of health insurance, claims processing, fraud-suspect identification, and the appeals process.
- Developed a custom file system plugin for Hadoop so it can access files on the Data Platform.
- This plugin allows Hadoop MapReduce programs, HBase, Pig, and Hive to work unmodified and access files directly.
- Designed and implemented a MapReduce-based large-scale parallel relation-learning system.
- Extracted feeds from social media sites such as Facebook and Twitter using Python scripts.
- Set up and benchmarked Hadoop/HBase clusters for internal use.
- Set up a Hadoop cluster on Amazon EC2 using Apache Whirr for a POC.
- Wrote a recommendation engine using Mahout.
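A Hadoop streaming job of the kind mentioned above could be launched roughly as follows; the jar path, mapper/reducer script names, and HDFS paths are assumptions for illustration:

```sh
# Hypothetical sketch: run a Hadoop streaming job over XML input using
# external mapper/reducer scripts. Paths and script names are illustrative.
hadoop jar $HADOOP_HOME/share/hadoop/tools/lib/hadoop-streaming-*.jar \
  -input  /data/raw/xml \
  -output /data/clean/parsed \
  -mapper  parse_xml.py \
  -reducer aggregate.py \
  -file parse_xml.py \
  -file aggregate.py
```

Streaming lets the mappers and reducers be written in any language that reads stdin and writes stdout, which is convenient for ad hoc XML parsing.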
Environment: Java 6 (JDK 1.6), Eclipse, Subversion, Hadoop, Hive, HBase, Linux, MapReduce, HDFS, Hortonworks and Cloudera Hadoop distributions, DataStax, IBM DataStage 8.1, Oracle 11g/10g, PL/SQL, SQL*Plus, Toad 9.6, Windows NT, UNIX, Confidential scripting.
Confidential, Newark, NJ
Hadoop Developer
Responsibilities:
- Responsible for architecting Hadoop clusters.
- Assisted with the addition of Hadoop processing to the IT infrastructure.
- Performed data analysis using Hive and Pig.
- Loaded log data into HDFS using Flume.
- Monitored the Hadoop cluster using tools such as Nagios, Ganglia, and Cloudera Manager.
- Wrote automation scripts to monitor HDFS and HBase through cron jobs.
- Planned, designed, and implemented the processing of massive amounts of marketing information, complete with information enrichment, text analytics, and natural language processing.
- Prepared a multi-cluster test harness to exercise the system for performance and failover.
- Developed a high-performance cache, making the site stable and improving its performance.
- Created a complete processing engine, based on Cloudera's distribution, enhanced for performance.
- Provided administrative support for parallel computation research on a 24-node Fedora Linux cluster.
- Built and supported standards-based infrastructure capable of supporting tens of thousands of computers in multiple locations.
- Negotiated and managed projects related to designing and deploying this architecture.
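The cron-based HDFS monitoring mentioned above could be wired up along these lines; the report parsing, thresholds, and alert address are illustrative assumptions, since the exact `dfsadmin -report` output varies by Hadoop version:

```sh
# Hypothetical sketch: a cron entry plus a small check script that alerts
# when HDFS usage or dead DataNodes cross a threshold. All values and
# parsing patterns are illustrative, not the cluster's real configuration.
# crontab entry (every 15 minutes):
#   */15 * * * * /opt/scripts/check_hdfs.sh

# /opt/scripts/check_hdfs.sh
used=$(hdfs dfsadmin -report | awk '/DFS Used%/{gsub(/%/,""); print int($3); exit}')
dead=$(hdfs dfsadmin -report | awk '/Dead datanodes/{print $3; exit}')
if [ "${used:-0}" -gt 85 ] || [ "${dead:-0}" -gt 0 ]; then
  echo "HDFS alert: used=${used}% dead=${dead}" | mail -s "HDFS health" ops@example.com
fi
```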
Environment: Hive, Pig, HBase, ZooKeeper, Sqoop, Java, JDBC, JNDI, Struts, Maven, Trac, Subversion, JUnit, SQL, Spring, Hibernate, Oracle, XML, Altova XMLSpy, PuTTY, Eclipse.
Confidential, Albany, NY
J2EE Developer
Responsibilities:
- Involved in Presentation Tier Development using JSF Framework and ICE Faces tag Libraries.
- Involved in business requirement gathering and technical specifications.
- Implemented J2EE standards, MVC2 architecture using JSF Framework.
- Implemented Servlets, JSP, and Ajax to design the user interface.
- Extensive experience in building GUI (Graphical User Interface) using JSF and ICE Faces.
- Developed Rich Enterprise Applications using ICE Faces and Portlets technologies.
- Experience using ICE Faces Tag Libraries to develop user interface components.
- Used JSF, JSP, JavaScript, HTML, and CSS to manipulate, validate, and customize error messages in the user interface.
- Used EJBs (session beans) to implement the business logic, JMS for communicating updates to various other applications, and MDBs for routing priority requests.
- All the Business logic in all the modules is written in core Java.
- Wrote web services using SOAP for sending data to and retrieving data from the external interface.
- Developed a web-based reporting for monitoring system with HTML and Tiles using Struts framework.
- The middleware services layer is implemented using stateless EJBs (Enterprise JavaBeans) in a WebSphere environment.
- Used Design patterns such as Business delegate, Service locator, Model View Controller, Session façade, DAO.
- Funds transfers are sent to another application asynchronously using JMS.
- Involved in implementing the JMS (Java messaging service) for asynchronous communication.
- Involved in writing JMS Publishers to post messages.
- Involved in writing MDBs (message-driven beans) as subscribers.
- Designed and developed Oracle PL/SQL procedures and wrote SQL and PL/SQL scripts for extracting data to the system.
- Tuned ETL procedures and star schemas to optimize load and query performance.
- Created stored procedures using PL/SQL for data modification (DML insert, update, delete) in Oracle.
- Interaction with Oracle database is implemented using Hibernate.
Environment: J2EE, EJB, JSF, ICE Faces, Web Services, XML, XSD, Agile, Microsoft Visio, ClearCase, Oracle 9i/10g, ETL, WebLogic 8.1/10.3, RAD, SOAP, HTML, Log4j, Servlets, JSP, UNIX.
Confidential
J2EE Developer
Responsibilities:
- Involved in designing the application and prepared Use case diagrams, class diagrams, sequence diagrams.
- Developed Servlets and JSP based on MVC pattern using Struts Action framework.
- Used Tiles for setting the header, footer and navigation and Apache Validator Framework for Form validation.
- Used resource and properties files for i18n support.
- Involved in writing Hibernate queries and Hibernate specific configuration and mapping files.
- Used the Log4j logging framework to write log messages at various levels.
- Involved in fixing bugs and minor enhancements for the front-end modules.
- Used JUnit framework for writing Test Classes.
- Used Ant for starting up the application server in various modes.
- Used Clear Case for version control.
- Followed the SDLC process.
Environment: Java JDK 1.4, EJB 2.x, Hibernate 2.x, Jakarta Struts 1.2, JSP, Servlets, JavaScript, MS SQL Server 7.0, Eclipse 3.x, WebSphere 6, Ant, Windows XP, UNIX, Excel macro development.
Confidential
J2EE Developer
Responsibilities:
- Involved in Requirement Analysis, Development and Documentation.
- Used MVC architecture (Jakarta Struts framework) for Web tier.
- Participated in developing the form beans and action mappings required for the Struts implementation and validation framework.
- Developed front-end screens with JSP using Eclipse.
- Involved in Development of Medical Records module. Responsible for development of the functionality using Struts and EJB components.
- Coded DAO objects using JDBC (DAO pattern).
- Used XML and XSDs to define data formats.
- Implemented J2EE design patterns (Value Object, Singleton, DAO) in the presentation, business, and integration tiers of the project.
- Involved in Bug fixing and functionality enhancements.
- Designed and developed a logging mechanism for each order process using Log4j.
- Involved in writing Oracle SQL Queries.
- Involved in Check-in and Checkout process using CVS.
- Developed additional functionality in the software as per business requirements.
- Involved in requirement analysis and complete development of client side code.
- Followed Sun standard coding and documentation standards.
- Participated in project planning with business analysts and team members to analyze business requirements and translate them into working software.
- Developed software application modules using disciplined software development process.
Environment: Java, J2EE, JSP, EJB, Ant, Struts 1.2, Log4j, WebLogic 7.0, JDBC, MyEclipse, Windows XP, CVS, Oracle.