Big Data Consultant Resume
Eden Prairie, MN
SUMMARY:
- Over 9 years of professional IT experience, including experience with the Big Data ecosystem and Java/J2EE technologies.
- Good exposure to production-environment processes such as change management, incident management, and managing escalations.
- Experience with AWS components such as Amazon EC2 instances, S3 buckets, and EBS volumes.
- Experience with Hadoop architecture and its components, including HDFS, JobTracker, TaskTracker, NameNode, DataNode, and the MapReduce programming paradigm.
- Hands-on experience installing, configuring, and using Hadoop ecosystem components such as MapReduce, HDFS, HBase, Hive, Sqoop, Pig, Flume, and Kafka.
- In-depth knowledge of analyzing data using HiveQL, Pig Latin, HBase, and custom MapReduce programs in Java.
- Good knowledge of Hadoop cluster architecture and experience working with Hadoop clusters on Cloudera (CDH5) and Hortonworks distributions.
- Excellent understanding of NoSQL databases such as MongoDB and HBase.
- Extensive knowledge of file formats such as Avro, SequenceFile, Parquet, ORC, and RCFile.
- Experience importing data from relational database systems into HDFS and exporting it back using Sqoop.
- Good knowledge of job scheduling and workflow design tools such as Oozie.
- Good experience creating real-time data streaming solutions using Apache Spark, Kafka, and Flume.
- Experienced in developing Spark applications using the Spark Core, Spark SQL, and Spark Streaming APIs.
- Experience extending Hive and Pig core functionality by writing custom UDFs.
- Experience managing Hadoop clusters using Cloudera Manager and Ambari.
- Experience installing and monitoring standalone and multi-node Kafka and Storm clusters.
- Very good experience with the complete project life cycle (design, development, testing, and implementation) under Rapid Application Development (RAD), Agile, and Scrum software development processes.
- Highly skilled in object-oriented architectures and patterns, systems analysis, software design, effective coding practices, databases, and servers.
- Designed and maintained Oozie workflows to manage the flow of jobs in the cluster.
- Worked as a developer on a Big Data team with Hadoop on the AWS cloud and its ecosystem.
- Hands-on experience in application development using Java, RDBMS, and Linux shell scripting.
- Experience in Java, JSP, Servlets, WebLogic, WebSphere, Hibernate, Spring, JBoss, JDBC, JavaScript, Ajax, jQuery, XML, and HTML.
- Experience with compression techniques such as Gzip, LZO, Snappy, and Bzip2.
- Experience working with multiple operating systems, including Windows and Linux, with strong skills in troubleshooting, finding, and fixing critical problems.
- Functional knowledge of Banking and Health Insurance domain.
- Ability to adapt to evolving technology, strong sense of responsibility and accomplishment.
- Proficient with Core Java and AWT, as well as markup and web technologies such as HTML5, XHTML, DHTML, CSS, XML 1.1, XSL, XSLT, XPath, XQuery, Angular.js, and Node.js.
- Worked with version control systems such as Subversion, Perforce, and Git to provide a common platform for all developers.
- Articulate in written and verbal communication along with strong interpersonal, analytical, and organizational skills.
- Experience preparing deployment packages, deploying to Dev and QA environments, and preparing deployment instructions for the Production Deployment Team.
- Highly motivated team player with the ability to work independently and adapt quickly to new and emerging technologies.
- Creatively communicate and present models to business customers and executives, utilizing a variety of formats and visualization methodologies.
TECHNICAL SKILLS:
Big Data Technologies: Apache Hadoop (MRv1, MRv2), Hive, Pig, Sqoop, HBase, MongoDB, Flume, Spark, Zookeeper, Oozie.
Languages: C, Java, SQL/PLSQL.
Methodologies: Agile, RAD, V-model.
Databases: Oracle, MySQL, MongoDB, HBase, MS SQL Server.
Web Technologies: HTML, JSP, JSF, CSS, JavaScript, JSON & AJAX
IDEs: Eclipse, NetBeans
Build tools: Maven, Ant.
Web services: SOAP & RESTful Web Services
Cloud solutions: Amazon Web Services (AWS)
Monitoring Tools: Wireshark, Nagios, Ganglia
Operating System: Windows, Ubuntu, Red Hat Linux, CentOS.
Scripting languages: JavaScript, Shell Scripting.
PROFESSIONAL EXPERIENCE:
Confidential, Eden Prairie, MN
Big Data Consultant
Responsibilities:
- Involved in installing and configuring parcels for various Hadoop ecosystem components, including HDFS, Pig, Hive, Sqoop, and HBase.
- Evaluate, refine, and continuously improve the efficiency and accuracy of existing Predictive Models.
- Developed a data pipeline using Sqoop to ingest customer behavioral data into HDFS for analysis.
- Worked on cluster installation, commissioning and decommissioning of DataNodes, NameNode recovery, and capacity planning.
- Imported data into Hive from RDBMS systems using Sqoop.
- Wrote Hive queries to parse logs and structure them in tabular format, enabling effective querying of the log data for business analytics.
- Used Hive partitioning and bucketing, performed different types of joins on Hive tables, and implemented Hive SerDes.
- Used HBase to perform fast, random reads and writes on all stored data and integrated it with other components such as Hive.
- Provided production support for cluster maintenance .
- Triggered workflows based on time or availability of data using Oozie.
- Monitored and debugged Hadoop jobs/applications running in production.
- Provided user support and application support on the Hadoop infrastructure.
- Installed Kafka on the Hadoop cluster and configured producers and consumers using Java.
- All metrics data is published directly to Kafka, where it is consumed by a Spark Streaming consumer group.
- Loaded data from various data sources into HBase using Kafka.
- Performed performance tuning of Kafka and Storm clusters and benchmarked real-time streams.
- Added authorization to the server, using each user's Kerberos identity to determine their role and the operations they could perform.
- Used Spark for interactive queries, processing of streaming data, and integration with popular NoSQL databases for large volumes of data.
- Implemented Spark applications in Java and Scala, using DataFrames and the Spark SQL API for faster data processing.
- Used Spark SQL to analyze web logs, applying Spark transformations and actions in Scala to compute statistics for web server monitoring (a minimal sketch follows this list).
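The following is a minimal sketch of the kind of Spark Streaming job described above, consuming metric events from Kafka and computing a simple per-batch count. The topic name, ZooKeeper quorum, consumer group, and class name are hypothetical, and the receiver-based Kafka 0.8 integration is assumed.

    import java.util.HashMap;
    import java.util.Map;

    import org.apache.spark.SparkConf;
    import org.apache.spark.api.java.function.Function;
    import org.apache.spark.streaming.Durations;
    import org.apache.spark.streaming.api.java.JavaDStream;
    import org.apache.spark.streaming.api.java.JavaPairReceiverInputDStream;
    import org.apache.spark.streaming.api.java.JavaStreamingContext;
    import org.apache.spark.streaming.kafka.KafkaUtils;

    import scala.Tuple2;

    public class MetricsStreamingJob {
        public static void main(String[] args) throws Exception {
            SparkConf conf = new SparkConf().setAppName("MetricsStreamingJob");
            JavaStreamingContext jssc = new JavaStreamingContext(conf, Durations.seconds(30));

            // Topic name, ZooKeeper quorum, and consumer group are placeholders.
            Map<String, Integer> topics = new HashMap<String, Integer>();
            topics.put("metrics", 1);
            JavaPairReceiverInputDStream<String, String> kafkaStream =
                    KafkaUtils.createStream(jssc, "zk-host:2181", "metrics-consumer-group", topics);

            // Extract the message payload from each (key, value) record.
            JavaDStream<String> events = kafkaStream.map(new Function<Tuple2<String, String>, String>() {
                public String call(Tuple2<String, String> record) {
                    return record._2();
                }
            });

            // A simple per-batch statistic: count of metric events every 30 seconds.
            events.count().print();

            jssc.start();
            jssc.awaitTermination();
        }
    }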
Environment: AWS, EMR, EC2, S3, Hortonworks Distribution of Hadoop, HDFS, Oozie, Java (JDK 1.6), Eclipse, MySQL, Kafka, Impala, Spark SQL, Spark Streaming, UNIX Shell Scripting
Confidential, Omaha NE
Hadoop Developer
Responsibilities:
- Handled the installation and configuration of a Hadoop cluster.
- Handled data exchange between HDFS and various web applications and databases using Flume and Sqoop.
- Closely monitored and analyzed MapReduce job executions on the cluster at the task level.
- Adjusted cluster configuration properties based on the volume of data being processed and the performance of the cluster.
- Commissioned and decommissioned DataNodes from the cluster in case of problems.
- Worked on NoSQL databases such as MongoDB and ingested the data into HDFS.
- Understood concepts such as replication and sharding and implemented them using MongoDB.
- Worked on NoSQL (HBase) to support enterprise production.
- Loaded data into HBase using Hive and Sqoop.
- Involved in upgrading the Hadoop cluster from CDH3 to CDH4.
- Created Hive tables and worked on them using HiveQL.
- Performed cluster co-ordination and assisted with data capacity planning and node forecasting using Zookeeper.
- Installed Hadoop, MapReduce, and HDFS; developed multiple jobs in Pig and Hive for data cleaning and pre-processing.
- Developed job flows in Oozie to automate the workflow for extraction of data from warehouses and weblogs.
- Wrote Pig UDFs for custom data processing (cleaning, editing, and formatting unstructured data); a minimal sketch follows this list.
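Below is a minimal sketch of a Pig UDF of the kind described above, written in Java; the class name and the specific cleaning rule (trim and upper-case a single field) are assumptions for illustration.

    import java.io.IOException;

    import org.apache.pig.EvalFunc;
    import org.apache.pig.data.Tuple;

    // Trims and upper-cases a single chararray field; returns null for empty input.
    public class CleanField extends EvalFunc<String> {
        @Override
        public String exec(Tuple input) throws IOException {
            if (input == null || input.size() == 0 || input.get(0) == null) {
                return null;
            }
            return input.get(0).toString().trim().toUpperCase();
        }
    }

A UDF like this would typically be packaged in a jar, registered in the Pig script, and invoked from a FOREACH ... GENERATE statement.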
Environment: Hadoop, MapReduce, HDFS, Hive, Java, Hortonworks distribution of Hadoop, Pig, HBase, Linux, XML, MySQL, MySQL Workbench, Java 6, Eclipse, Oracle 10g, PL/SQL, SQL*Plus.
Confidential, San Francisco, CA
Sr. Hadoop Developer/ Admin
Responsibilities:
- Installed and configured Hadoop MapReduce and HDFS; developed multiple MapReduce jobs in Java for data cleaning and preprocessing (see the sketch after this list).
- Experience in installing, configuring and using Hadoop ecosystem components.
- Experience in administering, installing, upgrading, and managing CDH3, Pig, Hive, and HBase.
- Importing and exporting data into HDFS and Hive using Sqoop.
- Experienced in defining job flows.
- Knowledge in performance troubleshooting and tuning Hadoop clusters.
- Experienced in managing and reviewing Hadoop log files.
- Participated in development/implementation of Cloudera Hadoop environment.
- Loaded and transformed large sets of structured, semi-structured, and unstructured data.
- Responsible for managing data coming from different sources.
- Good understanding of NoSQL databases such as HBase, Cassandra, and MongoDB.
- Supported MapReduce programs running on the cluster.
- Involved in loading data from the UNIX file system to HDFS.
- Installed and configured Hive and wrote Hive UDFs.
- Involved in creating Hive tables, loading them with data, and writing Hive queries that run internally as MapReduce jobs.
- Implemented a CDH3 Hadoop cluster on CentOS.
- Worked on installing cluster, commissioning & decommissioning of datanode, namenode recovery, capacity planning, and slots configuration.
- Created HBase tables to store variable data formats of PII data coming from different portfolios.
- Implemented best income logic using Pig scripts .
- Provided cluster coordination services through ZooKeeper.
- Exported the analyzed data to the relational databases using Sqoop for visualization and to generate reports for the BI team.
- Supported in setting up QA environment and updating configurations for implementing scripts with Pig and Sqoop.
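The following is a minimal sketch of the kind of data-cleaning MapReduce job mentioned at the top of this list, written against the org.apache.hadoop.mapreduce API; the class name, delimiter, and expected field count are assumptions.

    import java.io.IOException;

    import org.apache.hadoop.io.LongWritable;
    import org.apache.hadoop.io.NullWritable;
    import org.apache.hadoop.io.Text;
    import org.apache.hadoop.mapreduce.Mapper;

    // Map-only cleaning step: drops malformed rows and normalizes whitespace in each field.
    public class CleansingMapper extends Mapper<LongWritable, Text, NullWritable, Text> {
        private static final int EXPECTED_FIELDS = 12; // assumed record width
        private final Text outValue = new Text();

        @Override
        protected void map(LongWritable key, Text value, Context context)
                throws IOException, InterruptedException {
            String[] fields = value.toString().split("\\|", -1); // pipe-delimited input assumed
            if (fields.length != EXPECTED_FIELDS) {
                context.getCounter("cleansing", "malformed").increment(1);
                return; // skip malformed rows
            }
            StringBuilder cleaned = new StringBuilder();
            for (int i = 0; i < fields.length; i++) {
                if (i > 0) {
                    cleaned.append('|');
                }
                cleaned.append(fields[i].trim());
            }
            outValue.set(cleaned.toString());
            context.write(NullWritable.get(), outValue);
        }
    }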
Environment: Hadoop, MapReduce, HDFS, Hive, Java, SQL, Datameer, Pig, ZooKeeper, Sqoop, CentOS, Solr.
Confidential, Johnston, IA
Java/ J2EE Developer
Responsibilities:
- Involved in Analysis, Design, Development and Testing of application modules.
- Analyzed complex system relationships and improved the performance of various screens.
- Developed various user interface screens using the Struts framework.
- Worked with the Spring framework for dependency injection.
- Developed JSP pages using JavaScript, jQuery, and AJAX for client-side validation and CSS for data formatting.
- Written domain, mapper and DTO classes and hbm.xml files to access data from DB2 tables.
- Developed various reports using Adobe APIs and Web services.
- Wrote test cases using JUnit and coordinated with the testing team for integration tests.
- Fixed bugs and improved performance through root cause analysis in production support.
- Analyzed data by writing Hive queries (HiveQL) and running Pig scripts (Pig Latin) to study customer behavior.
- Stored the data in an Apache Cassandra cluster.
- Used Impala to query the Hadoop data stored in HDFS.
- Managed and reviewed Hadoop log files.
- Supported and troubleshot MapReduce programs running on the cluster.
- Loaded data from the Linux file system into HDFS.
- Installed and configured Hive and wrote Hive UDFs (a minimal sketch follows this list).
- Created tables, loaded data, and wrote queries in Hive.
- Developed scripts to automate routine DBA tasks using Linux shell scripts and Python.
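Below is a minimal sketch of a simple Hive UDF in Java of the kind referenced above; the class name and the masking rule are hypothetical.

    import org.apache.hadoop.hive.ql.exec.UDF;
    import org.apache.hadoop.io.Text;

    // Masks all but the last four characters of an account number.
    public final class MaskAccount extends UDF {
        public Text evaluate(Text input) {
            if (input == null) {
                return null;
            }
            String value = input.toString();
            if (value.length() <= 4) {
                return new Text(value);
            }
            StringBuilder masked = new StringBuilder();
            for (int i = 0; i < value.length() - 4; i++) {
                masked.append('X');
            }
            masked.append(value.substring(value.length() - 4));
            return new Text(masked.toString());
        }
    }

Such a UDF is packaged in a jar, added to the Hive session, and registered with CREATE TEMPORARY FUNCTION before being used in queries.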
Environment: JDK 1.4.2, Swing, XML, SQL, Windows XP/7, Web services, Hyperion 8/9.3, Citrix, Mainframes, CVS.
Confidential, Lansing, MI
Java/ J2EE Developer
Responsibilities:
- Created Use case, Sequence diagrams, functional specifications and User Interface diagrams using Star UML.
- Involved in complete requirement analysis, design, coding and testing phases of the project.
- Participated in JAD meetings to gather requirements and understand the end users' system.
- Developed user interfaces using JSP, HTML, XML and JavaScript.
- Generated XML Schemas and used XML Beans to parse XML files.
- Created stored procedures and functions; used JDBC to process database calls for DB2/AS400 and SQL Server databases (see the sketch after this list).
- Developed code to create XML files and flat files from data retrieved from databases and XML files.
- Created data source and helper classes used by all the interfaces to access and manipulate data.
- Developed a web application called iHUB (integration hub) to initiate all the interface processes using the Struts framework, JSP, and HTML.
- Developed the interfaces using Eclipse 3.1.1 and JBoss 4.1; involved in integration testing, bug fixing, and production support.
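The following is a minimal sketch of the kind of JDBC stored-procedure call described above; the connection URL, credentials, procedure name, and class name are hypothetical, and the appropriate JDBC driver is assumed to be registered.

    import java.sql.CallableStatement;
    import java.sql.Connection;
    import java.sql.DriverManager;

    // Calls a stored procedure over JDBC and reads an output parameter.
    public class MemberLookupDao {
        public String findMemberName(int memberId) throws Exception {
            // URL, credentials, and procedure name are placeholders.
            Connection conn = DriverManager.getConnection(
                    "jdbc:db2://db-host:50000/MEMBERS", "appuser", "secret");
            try {
                CallableStatement stmt = conn.prepareCall("{call GET_MEMBER_NAME(?, ?)}");
                stmt.setInt(1, memberId);
                stmt.registerOutParameter(2, java.sql.Types.VARCHAR);
                stmt.execute();
                return stmt.getString(2);
            } finally {
                conn.close();
            }
        }
    }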
Environment: Java 1.3, Servlets, JSPs, JavaMail API, JavaScript, HTML, Spring Batch XML Processing, MySQL 2.1, Swing, Java Web Server 2.0, JBoss 2.0, Red Hat Linux 7.1.
Confidential
Java/ J2EE developer
Responsibilities:
- Designed and developed a Struts-like MVC 2 web framework using the front-controller design pattern, which has been used successfully in a number of production systems (a minimal sketch follows this list).
- Normalized Oracle database, conforming to design concepts and best practices.
- Designed the high-level and detailed design for Quiet Time alerts by utilizing the existing Message Broker infrastructure; the architecture reduced the initial estimate by 400 man-hours.
- Invoked the web service from Message flows through XML gateway in a secured manner.
- Developed the user interface allowing members to set up preferences, using Struts-based JSPs with a custom tag library, form beans, and action classes.
- Developed Ant scripts to build the different modules of the project, such as building the binary files and scripts for deploying to the server.
- Provided technical assistance to the Infrastructure team in successfully maneuvering the challenges faced during installation and execution of the project.
- Worked closely with the Infrastructure team during the build process in deployment and configuration of the Message broker. Resolved the environmental issues in a timely manner and performed IT checkout to ensure successful deployment on all the Message broker servers.
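As a rough illustration of the front-controller pattern mentioned at the top of this list, the sketch below shows a single servlet dispatching requests to views by path; the class, path, and JSP names are hypothetical.

    import java.io.IOException;

    import javax.servlet.ServletException;
    import javax.servlet.http.HttpServlet;
    import javax.servlet.http.HttpServletRequest;
    import javax.servlet.http.HttpServletResponse;

    // Single entry point that maps each request path to a JSP view.
    public class FrontControllerServlet extends HttpServlet {
        protected void doGet(HttpServletRequest request, HttpServletResponse response)
                throws ServletException, IOException {
            String action = request.getPathInfo(); // e.g. "/alerts/preferences"
            String view;
            if ("/alerts/preferences".equals(action)) {
                // A full framework would resolve a handler object here and let it
                // populate the request before choosing the view.
                view = "/WEB-INF/jsp/preferences.jsp";
            } else {
                view = "/WEB-INF/jsp/notFound.jsp";
            }
            request.getRequestDispatcher(view).forward(request, response);
        }
    }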
Environment: MVC 2, Message Broker, ESQL, Java, JSP, Struts, Hibernate, Apache Ant, Log4j, XML gateway.