Resume
Hadoop Consultant Louisville, KY
SUMMARY:
- 5+ years of experience in Business Intelligence, support, and maintenance in Client/Server and Big Data environments
- Around 3 years of experience implementing Big Data analytics on the Hadoop ecosystem
- Excellent knowledge of Hadoop architecture and components such as HDFS, JobTracker, TaskTracker, NameNode, DataNode, YARN, ResourceManager, NodeManager, and MapReduce
- Over six years of experience implementing reports using tools such as Crystal Reports and Web Intelligence
- Experience in importing and exporting data between HDFS and relational database systems using Sqoop
- Solid understanding of the Hadoop Distributed File System (HDFS); capable of processing large sets of structured, semi-structured, and unstructured data and supporting systems application architecture
- Knowledge of the full project cycle and of transitioning among environments such as Development, UAT, and Production
- Experience with multiple RDBMSs including Oracle, Microsoft SQL Server, Microsoft Access, and Teradata, and with associated SQL dialects such as Transact-SQL (T-SQL) and PL/SQL
- Assisted organizations in attaining cost-effective production and increased efficiency by implementing best practices for process improvement
- Strong application development and problem-solving/analytical skills
- Quick to develop, implement, and fix code against product/feature requirements with innovative and robust ideas; goal-oriented team player
SKILLS:
Languages: Java, Python, Ruby, R, Shell (Bash), PHP, C/C++, SQL, PL/SQL
Tools and Utilities: Big Data - Apache Hadoop, Pig, Hive, HBase, Sqoop, Oozie, ZooKeeper
Operating Systems: Red Hat Linux, HP-UX, AIX, Windows 2000/2003
Reporting Tools: Tableau, Crystal Reports, Business Objects Web Intelligence
Databases: MS SQL Server, Oracle, MS Access, Sybase
Web Technologies: ASP 3.0, JavaScript, VBScript, HTML, IIS 4.0/5.0, VB.NET 2003, ASP.NET, Remedy
PROFESSIONAL EXPERIENCE:
Confidential, Louisville, KY
Hadoop Consultant
Responsibilities:
- Diagnose, assess, troubleshoot and fix issues within the Open Source environment
- Performance tuning and resource/capacity management of users and scheduled jobs (see the HDFS capacity-check sketch below)
- Enforce security compliance and BI governance rules
- Documentation of all environmental settings and configurations
- Planning and upgrading of the environment, both hardware and software (where applicable)
- Provide front line support to teams using the Hadoop environments
- Ensure availability and stability of the systems in production
- Along with the rest of the team, actively research and share learning/advancements in the Hadoop space, especially related to administration
- 3+ years of experience using the Hadoop tool set, including Sqoop, Oozie, Flume, and YARN
- 5+ years of strong UNIX/Linux skills
- Knowledge of Hadoop architecture and the evolving role of Hadoop in the enterprise environment
- Hands-on experience with Cloudera-specific tools
- Experience in deploying a Hadoop environment
- Knowledge of Hadoop-related coding languages (MapReduce, Java)
Environment: Cloudera Hadoop, HDFS, Pig, Hive, Sqoop, Shell Scripting, Core Java, Netezza, Oracle 11g, Linux, Unix.
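For illustration, a minimal sketch of the kind of HDFS capacity check used when monitoring resource and capacity headroom on the cluster; the class name is hypothetical and the commented-out NameNode URI is a placeholder, not a value from the actual environment.

```java
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.FsStatus;

public class HdfsCapacityCheck {
    public static void main(String[] args) throws Exception {
        // Picks up fs.defaultFS from core-site.xml on the classpath;
        // the URI below is a placeholder shown for illustration only.
        Configuration conf = new Configuration();
        // conf.set("fs.defaultFS", "hdfs://namenode.example.com:8020");

        try (FileSystem fs = FileSystem.get(conf)) {
            FsStatus status = fs.getStatus();
            long capacity = status.getCapacity();
            long used = status.getUsed();
            long remaining = status.getRemaining();

            System.out.printf("HDFS capacity : %d GB%n", capacity / (1024L * 1024 * 1024));
            System.out.printf("HDFS used     : %d GB (%.1f%%)%n",
                    used / (1024L * 1024 * 1024), 100.0 * used / capacity);
            System.out.printf("HDFS remaining: %d GB%n", remaining / (1024L * 1024 * 1024));
        }
    }
}
```

In practice a check like this would be scheduled (for example via cron or Oozie) and its output fed into capacity-planning reports or alerting.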
Confidential, Louisville, KY
Hadoop Consultant
Responsibilities:
- Responsible for developing efficient MapReduce programs on the AWS cloud to process more than 20 years' worth of data and to detect and separate fraudulent information (see the MapReduce sketch below)
- Uploaded and processed more than 30 terabytes of data from various structured and unstructured sources into HDFS (AWS cloud) using Sqoop and Flume
- Played a key role in setting up a 40-node Hadoop cluster running Apache Spark, working closely with the Hadoop administration team
- Worked with the advanced analytics team to design fraud detection algorithms and then developed MapReduce programs to efficiently run the algorithm on the huge datasets
- Developed Scala programs to perform data scrubbing for unstructured data
- Responsible for designing and managing the Sqoop jobs that uploaded the data from Oracle to HDFS and Hive
- Helped troubleshoot Scala issues while working with MicroStrategy to produce illustrative reports, dashboards, and ad hoc analysis
- Used Flume to collect log data containing error messages from across the cluster
- Designed and maintained Oozie workflows to manage the flow of jobs in the cluster
- Played a key role in installation and configuration of the various Hadoop ecosystem tools such as Solr, Kafka, Pig, HBase and Cassandra
- Created dashboards using Tableau
- Used TIBCO Jaspersoft Studio for iReport analysis on the AWS cloud
- Applied Teradata and general DBMS concepts for early instance creation
- Provided upper management with daily updates on project progress, including the classification levels achieved on the data
Environment: Java, Hadoop, Hive, Pig, Sqoop, Flume, HBase, Oracle 10g, Teradata, Cassandra.
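For illustration, a minimal sketch of the kind of map-only MapReduce filter referenced above for separating suspicious records; the comma-delimited input layout, the fraud-score column index, the threshold, and the class names are assumptions made for the example, not details of the actual fraud-detection algorithm.

```java
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.NullWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Job;
import org.apache.hadoop.mapreduce.Mapper;
import org.apache.hadoop.mapreduce.lib.input.FileInputFormat;
import org.apache.hadoop.mapreduce.lib.output.FileOutputFormat;

import java.io.IOException;

/** Map-only job that keeps records whose (hypothetical) fraud-score column exceeds a threshold. */
public class FraudFilterJob {

    public static class FraudMapper extends Mapper<Object, Text, NullWritable, Text> {
        private static final int FRAUD_SCORE_COLUMN = 7; // placeholder column index
        private static final double THRESHOLD = 0.9;     // placeholder cutoff

        @Override
        protected void map(Object key, Text value, Context context)
                throws IOException, InterruptedException {
            String[] fields = value.toString().split(",");
            if (fields.length > FRAUD_SCORE_COLUMN) {
                try {
                    double score = Double.parseDouble(fields[FRAUD_SCORE_COLUMN]);
                    if (score >= THRESHOLD) {
                        // Emit suspicious records only; clean records are dropped.
                        context.write(NullWritable.get(), value);
                    }
                } catch (NumberFormatException ignored) {
                    // Malformed rows are skipped.
                }
            }
        }
    }

    public static void main(String[] args) throws Exception {
        Job job = Job.getInstance(new Configuration(), "fraud-filter");
        job.setJarByClass(FraudFilterJob.class);
        job.setMapperClass(FraudMapper.class);
        job.setNumReduceTasks(0); // map-only: filtering needs no aggregation
        job.setOutputKeyClass(NullWritable.class);
        job.setOutputValueClass(Text.class);
        FileInputFormat.addInputPath(job, new Path(args[0]));
        FileOutputFormat.setOutputPath(job, new Path(args[1]));
        System.exit(job.waitForCompletion(true) ? 0 : 1);
    }
}
```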
Confidential, Memphis, TN
Hadoop Consultant
Responsibilities:
- Created, validated, and maintained scripts to load data into and out of tables in Oracle PL/SQL and SQL Server 2008 R2
- Wrote stored procedures and triggers
- Converted, tested, and validated Oracle scripts for SQL Server
- Developed Kafka producers and consumers (see the producer sketch below), HBase clients, and Spark and Hadoop MapReduce jobs, along with components on HDFS and Hive
- Handled importing of data from various data sources, performed transformations using Hive, MapReduce, and Spark, and loaded the data into HDFS
- Used Solr for database integration from IBM Maximo to SQL Server
- Upgraded IBM Maximo database from 5.2 to 7.5
- Analyzed, validated, and documented the changed records for the IBM Maximo web application
- Imported data from MySQL databases into Hive using Sqoop
- Fetched data to and from HBase using MapReduce jobs
- Wrote MapReduce jobs
- Developed, validated, and maintained HiveQL queries
- Ran reports using Pig and Hive queries
- Analyzed data with Hive and Pig
- Designed Hive tables to load data to and from external files
- Wrote and implemented Apache Pig scripts to load data from and store data into Hive
Environment: Cloudera, Hadoop, Hive, Pig, Sqoop, HBase, SQL
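For illustration, a minimal sketch of a Kafka producer of the kind referenced above; the broker list, topic name, and payload are placeholders, not values from the actual project.

```java
import org.apache.kafka.clients.producer.KafkaProducer;
import org.apache.kafka.clients.producer.Producer;
import org.apache.kafka.clients.producer.ProducerRecord;

import java.util.Properties;

public class RecordChangeProducer {
    public static void main(String[] args) {
        Properties props = new Properties();
        // Broker list and topic name below are placeholders for illustration.
        props.put("bootstrap.servers", "broker1:9092,broker2:9092");
        props.put("key.serializer", "org.apache.kafka.common.serialization.StringSerializer");
        props.put("value.serializer", "org.apache.kafka.common.serialization.StringSerializer");
        props.put("acks", "all"); // wait for full acknowledgement before treating a send as complete

        try (Producer<String, String> producer = new KafkaProducer<>(props)) {
            // A single illustrative message; a real job would stream change records in a loop.
            ProducerRecord<String, String> record =
                    new ProducerRecord<>("maximo-changes", "record-123", "{\"status\":\"UPDATED\"}");
            producer.send(record);
            producer.flush();
        }
    }
}
```

A matching consumer would subscribe to the same topic and hand records to a downstream HBase or Hive load step.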