
Big Data Consultant Resume

Austin, TX

SUMMARY

  • 9 years of experience in application development, administration, architecture, and data analytics, with specialization in Java and Big Data technologies.
  • 4 years of extensive experience as a Hadoop Developer with strong expertise in MapReduce and Hive.
  • Strong hands-on experience designing optimized solutions using Hadoop components such as MapReduce, Hive, Sqoop, Pig, HDFS, Flume, and Oozie.
  • Strong understanding of the Cassandra and HBase NoSQL databases.
  • Actively involved in requirements gathering, analysis, design, reviews, coding, and code reviews.
  • Expertise in designing web applications using Java/J2EE under Agile Scrum methodologies. Strong skills in writing UNIX and Linux shell scripts.
  • Experience in writing Python scripts. Expertise in JMS and IBM WebSphere MQ, including writing MDBs to listen to message queues.
  • Extensive experience with RAD 6.0, RSA, WebSphere (WSAD 5.1), Eclipse 3.1.2, MyEclipse, and Oracle 9i.
  • Extensive experience in developing three-tier, N-tier, and distributed applications using J2EE technologies.
  • Strong analytical skills with proficiency in debugging and problem solving. Experience in sizing and scaling distributed databases.
  • Experience in performing database consistency checks using DBCC utilities and the Index Tuning Wizard.
  • Experience in OLAP and ETL/data warehousing, creating different data models, and maintaining data marts.
  • Experience in designing logical and physical database models using ERwin.
  • Implemented Kerberos in the Hadoop cluster environment; Kerberos acts as a security gateway that authenticates every user entering the Hadoop cluster.
  • The Kerberos deployment includes the Key Distribution Center (KDC), user principals, and the HDFS nodes as its components; implemented these components in the system and handled tickets related to the Hadoop security setup (a minimal authentication sketch follows this list).
  • Communicated with the Cloudera team whenever a critical issue could not be resolved easily, since the Hadoop-Kerberos security environment is supported by Cloudera.
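
Illustrative only: a minimal Java sketch of how a client process might authenticate to a Kerberized Hadoop cluster with a keytab before touching HDFS. The principal name and keytab path are hypothetical placeholders, and the sketch assumes the cluster's core-site.xml/hdfs-site.xml are on the classpath.

    import java.io.IOException;
    import org.apache.hadoop.conf.Configuration;
    import org.apache.hadoop.fs.FileSystem;
    import org.apache.hadoop.fs.Path;
    import org.apache.hadoop.security.UserGroupInformation;

    public class KerberizedHdfsClient {
        public static void main(String[] args) throws IOException {
            Configuration conf = new Configuration();
            // Tell the Hadoop client that the cluster expects Kerberos authentication.
            conf.set("hadoop.security.authentication", "kerberos");
            UserGroupInformation.setConfiguration(conf);
            // Obtain a ticket from the KDC using a keytab instead of an interactive password.
            UserGroupInformation.loginUserFromKeytab(
                    "etl-user@EXAMPLE.COM",               // hypothetical principal
                    "/etc/security/keytabs/etl.keytab");  // hypothetical keytab path
            // Subsequent HDFS calls run under the authenticated user.
            FileSystem fs = FileSystem.get(conf);
            System.out.println(fs.exists(new Path("/user/etl")));
        }
    }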

TECHNICAL SKILLS

Operating Systems: Windows 2000 Server, Windows 2000 Advanced Server, Windows Server 2003, Windows NT, Windows 98/XP, CentOS, Debian, Fedora, UNIX, Linux (RHEL)

Databases: MS SQL Server 2000/2005/2008, MS Access, Teradata, Oracle, DB2, Cassandra

Languages: Java, C, C++, Pig Latin, HiveQL; MS BI: SSIS, SSAS, SSRS

Tools/Utilities: MapReduce, Sqoop, Flume, Oozie, SQL Profiler, HBase, Jenkins, Stash, Agile, Git

Reporting Tools: Tableau, Impala, QlikView, Datameer

Web Utilities: HTTP, IIS Administration, Apache

PROFESSIONAL EXPERIENCE

Confidential - Austin, TX

Big data Consultant

Responsibilities:

  • Installed and configured a multi-node, fully distributed Hadoop cluster.
  • Involved in installing Hortonworks Hadoop ecosystem components.
  • Responsible for managing data coming from different sources and for developing Hadoop production clusters.
  • Administered the Hadoop cluster environment, including adding and removing cluster nodes, capacity planning, and performance tuning.
  • Wrote complex MapReduce programs.
  • Involved in the design, installation, and maintenance of Kafka and Ambari.
  • Loaded data into the cluster from dynamically generated files using Flume and from relational databases using Sqoop.
  • Wrote Java APIs for interacting with HBase (see the sketch after this list); built with Maven and worked with JSP, Servlets, Web 2.0, the Struts/Spring frameworks, Hibernate ORM, REST APIs, and AngularJS.
  • Involved in writing Flume and Hive scripts to extract, transform, and load data into the database.
  • Used a data lake as the central data store.
  • Installed and configured Hadoop MapReduce and HDFS; developed multiple MapReduce jobs in Java for data cleaning and preprocessing.
  • Experienced with Teradata services.
  • Experienced in importing and exporting data between HDFS/Hive and relational stores using Sqoop.
  • Knowledgeable in performance troubleshooting and tuning of Hadoop clusters.
  • Expert in Spark, Scala, Storm, Hue, and Samza.
  • Implemented partitioning, dynamic partitions, and buckets in Hive for efficient data access.
  • Experienced in running Amazon EMR.
  • Installed and configured Hive, wrote Hive UDFs, and used JUnit for unit testing of MapReduce code.
  • Experienced in working with various data sources such as Hortonworks, Teradata, and Oracle.
  • Worked on data management and data integration.
  • Loaded and transformed large sets of structured, semi-structured, and unstructured data using Hadoop/Big Data techniques.
  • Project leader of a seven-member team.
  • Served as the technical expert on Hadoop architecture, guiding the team and helping them solve problems.
  • Installed the Oozie workflow engine to run multiple Hive and Pig jobs that trigger independently based on time and data availability.
  • Expertise in Kerberos and LDAP integration.
  • Very familiar with data visualization.
  • Familiar with parallel-processing databases such as Teradata and Netezza.
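
Illustrative only: a minimal sketch of the kind of Java client code used to interact with HBase, assuming the standard HBase client API; the table, column family, and row key names are hypothetical.

    import java.io.IOException;
    import org.apache.hadoop.conf.Configuration;
    import org.apache.hadoop.hbase.HBaseConfiguration;
    import org.apache.hadoop.hbase.TableName;
    import org.apache.hadoop.hbase.client.Connection;
    import org.apache.hadoop.hbase.client.ConnectionFactory;
    import org.apache.hadoop.hbase.client.Get;
    import org.apache.hadoop.hbase.client.Put;
    import org.apache.hadoop.hbase.client.Result;
    import org.apache.hadoop.hbase.client.Table;
    import org.apache.hadoop.hbase.util.Bytes;

    public class HBaseClientSketch {
        public static void main(String[] args) throws IOException {
            // Picks up hbase-site.xml (ZooKeeper quorum, etc.) from the classpath.
            Configuration conf = HBaseConfiguration.create();
            try (Connection conn = ConnectionFactory.createConnection(conf);
                 Table table = conn.getTable(TableName.valueOf("events"))) { // hypothetical table
                // Write one cell: row key -> column family "d", qualifier "status".
                Put put = new Put(Bytes.toBytes("row-001"));
                put.addColumn(Bytes.toBytes("d"), Bytes.toBytes("status"), Bytes.toBytes("OK"));
                table.put(put);
                // Read the same cell back.
                Result result = table.get(new Get(Bytes.toBytes("row-001")));
                System.out.println(Bytes.toString(
                        result.getValue(Bytes.toBytes("d"), Bytes.toBytes("status"))));
            }
        }
    }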

Environment: Java, Hadoop, Hortonworks, Hive, Pig, Sqoop, Flume, HBase, Oracle 10g, Teradata, Cassandra, Scala, Spark, Netezza, Spring, Kafka, AWS, Amazon EMR, SSIS, SSRS, SSAS, data lake

Confidential - Pittsburgh

Hadoop Developer

Responsibilities:

  • Installed and configured Hadoop MapReduce and HDFS, and developed multiple MapReduce jobs in Java for data processing.
  • Experience in metadata management.
  • Worked on Spark Streaming, RDD creation, and graph analytics.
  • Defined workflows using the Oozie framework for automation.
  • Implemented Flume multiplexing to stream data from upstream pipes into HDFS.
  • Responsible for reviewing Hadoop log files.
  • Loaded and transformed large sets of unstructured and semi-structured data.
  • Performed data completeness, correctness, transformation, and data quality testing using SQL.
  • Exported the analyzed data to relational databases using Sqoop for visualization and to generate reports for the BI team.
  • Worked on platforms such as Kafka clusters.
  • Implemented Hive partitioning (static and dynamic) and bucketing.
  • Handled importing of data from various data sources, performed transformations using Hive and MapReduce, and loaded data into HDFS.
  • Assisted in the creation of ETL processes for the transformation of data from existing RDBMS systems.
  • Developed profiling/logging interceptors for the Struts action classes using the Struts Action Invocation Framework (SAIF).
  • Wrote Apache Pig scripts to process the HDFS data.
  • Wrote Hive queries for data analysis to meet the business requirements.
  • Involved in installing Hadoop Ecosystem components.
  • Involved in HDFS maintenance and loading of structured and unstructured data.
  • Installed and configured Hadoop, MapReduce, and HDFS.
  • Used HiveQL to analyze the data and identify correlations.
  • Developed multiple MapReduce jobs in Java for data cleaning and preprocessing (a minimal mapper sketch follows this list).
  • Installed and configured Pig and wrote Pig Latin scripts.
  • Wrote MapReduce jobs using Scala and Splunk.
  • Strong understanding of the REST architectural style and its application to well-performing web sites for global usage.
  • Developer on the Big Data team; worked with Hadoop on the AWS cloud and its ecosystem.
  • Worked on Storm and Apache Apex.
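
Illustrative only: a minimal sketch of a map-only Java MapReduce job of the kind used for data cleaning, assuming tab-delimited input; the field count, delimiter, and input/output paths are hypothetical.

    import java.io.IOException;
    import org.apache.hadoop.conf.Configuration;
    import org.apache.hadoop.fs.Path;
    import org.apache.hadoop.io.NullWritable;
    import org.apache.hadoop.io.Text;
    import org.apache.hadoop.mapreduce.Job;
    import org.apache.hadoop.mapreduce.Mapper;
    import org.apache.hadoop.mapreduce.lib.input.FileInputFormat;
    import org.apache.hadoop.mapreduce.lib.output.FileOutputFormat;

    public class CleanRecordsJob {

        // Drops malformed rows and trims whitespace from the fields of the rest.
        public static class CleanMapper extends Mapper<Object, Text, NullWritable, Text> {
            private static final int EXPECTED_FIELDS = 5; // hypothetical schema width

            @Override
            protected void map(Object key, Text value, Context context)
                    throws IOException, InterruptedException {
                String[] fields = value.toString().split("\t", -1);
                if (fields.length != EXPECTED_FIELDS || fields[0].isEmpty()) {
                    context.getCounter("clean", "dropped").increment(1);
                    return; // skip malformed record
                }
                StringBuilder out = new StringBuilder();
                for (int i = 0; i < fields.length; i++) {
                    if (i > 0) out.append('\t');
                    out.append(fields[i].trim());
                }
                context.write(NullWritable.get(), new Text(out.toString()));
            }
        }

        public static void main(String[] args) throws Exception {
            Job job = Job.getInstance(new Configuration(), "clean-records");
            job.setJarByClass(CleanRecordsJob.class);
            job.setMapperClass(CleanMapper.class);
            job.setNumReduceTasks(0); // map-only: cleaned rows go straight back to HDFS
            job.setOutputKeyClass(NullWritable.class);
            job.setOutputValueClass(Text.class);
            FileInputFormat.addInputPath(job, new Path(args[0]));
            FileOutputFormat.setOutputPath(job, new Path(args[1]));
            System.exit(job.waitForCompletion(true) ? 0 : 1);
        }
    }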

Environment: Apache Hadoop, HDFS, Cloudera Manager, CentOS, Java, MapReduce, Eclipse, Hive, Pig, Sqoop, Oozie, SQL, Scala, Terraform, CloudFormation, Hadoop on AWS, SSIS, SSRS, SSAS

Confidential - Atlanta, GA

Hadoop Architect

Responsibilities:

  • Resolved user support requests.
  • Administered and supported Hadoop clusters.
  • Loaded data from relational databases into Hadoop using Sqoop.
  • Provided guidance to ETL/data warehousing teams on where to store intermediate and final output files across the various layers in Hadoop.
  • Worked collaboratively to manage build outs of large data clusters.
  • Helped design big data clusters and administered them.
  • Worked both independently and as an integral part of the development team.
  • Communicated all issues and participated in weekly strategy meetings.
  • Administered back end services and databases in the virtual environment.
  • Worked on Spark, Scala, and Storm.
  • Implemented big data systems in cloud environments.
  • Created security and encryption systems for big data.
  • Performed administration, troubleshooting, and maintenance of ETL and ELT processes.
  • Collaborated with multiple teams for design and implementation of big data clusters in cloud environments
  • Developed Pig Latin scripts for the analysis of semi-structured data.
  • Developed industry-specific UDFs (user-defined functions).
  • Created Hive tables, loaded data into them, and wrote Hive UDFs (a minimal UDF sketch follows this list).
  • Used Sqoop to import data into HDFS and Hive from other data systems.
  • Continuously monitored and managed the Hadoop cluster through Cloudera Manager.
  • Migrated ETL processes from relational databases to Hive to evaluate easier data manipulation.
  • Developed Hive queries to process the data for visualization.
  • Installed and configured Apache Hadoop to test the maintenance of log files in the Hadoop cluster.
  • Installed and configured Hive, Pig, Sqoop, Flume and Oozie on the Hadoop cluster.
  • Installed Oozie workflow engine to run multiple Hive and Pig Jobs.
  • Developed a custom file system plugin for Hadoop to access files on data platform.
  • The custom file system plugin allows Hadoop Map Reduce programs, HBase, Pig, and Hive to access files directly.
  • Extensive Teradata knowledge and experience.
  • Extracted feeds from social media sites.
  • Imported data using Sqoop to load data from Oracle to HDFS on a regular basis.
  • Developed scripts and batch jobs to schedule various Hadoop programs.
  • Wrote Hive queries for data analysis to meet business requirements.
  • Created Hive tables and worked with them using HiveQL.
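
Illustrative only: a minimal sketch of a Hive UDF written in Java against the classic org.apache.hadoop.hive.ql.exec.UDF API; the function name and normalization rules are hypothetical.

    import org.apache.hadoop.hive.ql.exec.UDF;
    import org.apache.hadoop.io.Text;

    // Hypothetical UDF that normalizes free-text status values before analysis.
    public final class NormalizeStatus extends UDF {
        public Text evaluate(Text input) {
            if (input == null) {
                return null;
            }
            String s = input.toString().trim().toUpperCase();
            // Collapse common variants into canonical values (illustrative mapping only).
            if (s.startsWith("OK") || s.equals("SUCCESS")) {
                return new Text("SUCCESS");
            }
            if (s.startsWith("ERR") || s.equals("FAILED")) {
                return new Text("FAILURE");
            }
            return new Text("UNKNOWN");
        }
    }

Once packaged into a jar, such a function would typically be registered in Hive with ADD JAR and CREATE TEMPORARY FUNCTION before being used in HiveQL queries.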

Environment: HDFS, Hive, ETL, Pig, UNIX, Linux, CDH 4 distribution, Tableau, Impala, Teradata, Sqoop, Flume, Oozie

Confidential - Louisville, KY

Hadoop Admin/Architect

Responsibilities:

  • Installation and configuration of the Hadoop cluster.
  • Worked with the Cloudera support team to fine-tune the cluster.
  • Worked closely with the SA team to make sure all hardware and software were properly set up for optimum resource usage.
  • Developed a custom file system plugin for Hadoop so it can access files on the Hitachi Data Platform.
  • The plugin allows Hadoop MapReduce programs, HBase, Pig, and Hive to work unmodified and access files directly.
  • The plugin also provided data locality for Hadoop across host nodes and virtual machines.
  • Wrote data ingesters and MapReduce programs.
  • Developed MapReduce jobs to analyze data and produce heuristic reports.
  • Performed extensive data validation using Hive and wrote Hive UDFs.
  • Added, decommissioned, and rebalanced nodes.
  • Created a POC to store server log data in Cassandra to identify system alert metrics (see the sketch after this list).
  • Rack awareness configuration.
  • Client machine configuration.
  • Configuration, monitoring, and management tools.
  • HDFS support and maintenance.
  • Cluster HA setup.
  • Applied patches and performed version upgrades.
  • Incident management, problem management, and change management.
  • Performance management and reporting.
  • Recovered from NameNode failures.
  • Scheduled MapReduce jobs using FIFO and Fair schedulers.
  • Installation and configuration of other open source software such as Pig, Hive, HBase, Flume, and Sqoop.
  • Integrated with relational databases using Sqoop and JDBC connectors.
  • Worked with the development team to tune jobs; knowledge of writing Hive jobs.
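
Illustrative only: a minimal sketch of how such a Cassandra POC might write server log entries, assuming a DataStax Java driver (3.x-style) API; the contact point, keyspace, and table definition are hypothetical.

    import com.datastax.driver.core.Cluster;
    import com.datastax.driver.core.PreparedStatement;
    import com.datastax.driver.core.Session;
    import java.util.Date;
    import java.util.UUID;

    public class ServerLogPoc {
        public static void main(String[] args) {
            try (Cluster cluster = Cluster.builder().addContactPoint("127.0.0.1").build()) {
                Session session = cluster.connect();
                // One-time schema setup for the POC (hypothetical keyspace and table).
                session.execute("CREATE KEYSPACE IF NOT EXISTS poc WITH replication = "
                        + "{'class': 'SimpleStrategy', 'replication_factor': 1}");
                session.execute("CREATE TABLE IF NOT EXISTS poc.server_logs ("
                        + "host text, ts timestamp, id uuid, level text, message text, "
                        + "PRIMARY KEY ((host), ts, id))");
                // Insert one alert-level log line; a real ingester would stream many.
                PreparedStatement insert = session.prepare(
                        "INSERT INTO poc.server_logs (host, ts, id, level, message) VALUES (?, ?, ?, ?, ?)");
                session.execute(insert.bind(
                        "node01", new Date(), UUID.randomUUID(), "ALERT", "disk usage above 90%"));
            }
        }
    }

Partitioning by host with time-ordered clustering keeps each node's alerts together, which makes per-host alert metrics a simple range query.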

Environment: Windows 2000/2003, UNIX, Linux, Java, Apache Hadoop (HDFS, MapReduce), Pig, Hive, HBase, Flume, Sqoop, Cassandra, NoSQL

Confidential - Auburn Hills, MI

SQL/Hadoop Developer

Responsibilities:

  • Developed against the Hadoop ecosystem: Hadoop, MapReduce, HBase, Sqoop, and Amazon Elastic MapReduce (EMR).
  • Developed a scalable, cost-effective, and fault-tolerant data warehouse system on the Amazon EC2 cloud.
  • Developed MapReduce/EMR jobs to analyze the data and provide heuristics and reports (a minimal aggregation sketch follows this list).
  • The heuristics were used to improve campaign targeting and efficiency.
  • Imported and exported data into HDFS and Hive using Sqoop.
  • Responsible for loading unstructured data into the Hadoop Distributed File System (HDFS).
  • Created and scheduled jobs for maintenance
  • Configured Database Mail
  • Monitored file growth.
  • Maintained operators, categories, alerts, notifications, jobs, and schedules.
  • Maintained database response times and proactively generated performance reports.
  • Automated most of the DBA tasks and monitoring statistics.
  • Developed complex stored procedures, views, clustered/non-clustered indexes, triggers (DDL, DML, LOGON), and user-defined functions.
  • Created a mirrored database using Database Mirroring with High Performance Mode
  • Created database snapshots and stored procedures to load data from the snapshot database to the report database
  • Restored development and staging databases from production as required.
  • Involved in resolving deadlock and performance issues.
  • Performed query optimization and performance tuning for long-running queries and created new indexes on tables for faster I/O.
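
Illustrative only: a minimal sketch of the kind of aggregation an EMR MapReduce job could run to feed campaign-targeting heuristics, assuming comma-delimited event logs; the log layout and field positions are hypothetical.

    import java.io.IOException;
    import org.apache.hadoop.conf.Configuration;
    import org.apache.hadoop.fs.Path;
    import org.apache.hadoop.io.IntWritable;
    import org.apache.hadoop.io.Text;
    import org.apache.hadoop.mapreduce.Job;
    import org.apache.hadoop.mapreduce.Mapper;
    import org.apache.hadoop.mapreduce.lib.input.FileInputFormat;
    import org.apache.hadoop.mapreduce.lib.output.FileOutputFormat;
    import org.apache.hadoop.mapreduce.lib.reduce.IntSumReducer;

    public class CampaignClickCounts {

        // Emits (campaignId, 1) for every click event in a comma-delimited log line.
        public static class ClickMapper extends Mapper<Object, Text, Text, IntWritable> {
            private static final IntWritable ONE = new IntWritable(1);
            private final Text campaign = new Text();

            @Override
            protected void map(Object key, Text value, Context context)
                    throws IOException, InterruptedException {
                // Hypothetical layout: timestamp,campaignId,eventType,...
                String[] f = value.toString().split(",", -1);
                if (f.length >= 3 && "click".equalsIgnoreCase(f[2])) {
                    campaign.set(f[1]);
                    context.write(campaign, ONE);
                }
            }
        }

        public static void main(String[] args) throws Exception {
            Job job = Job.getInstance(new Configuration(), "campaign-click-counts");
            job.setJarByClass(CampaignClickCounts.class);
            job.setMapperClass(ClickMapper.class);
            // The built-in summing reducer also serves as a combiner to cut shuffle volume.
            job.setCombinerClass(IntSumReducer.class);
            job.setReducerClass(IntSumReducer.class);
            job.setOutputKeyClass(Text.class);
            job.setOutputValueClass(IntWritable.class);
            FileInputFormat.addInputPath(job, new Path(args[0]));
            FileOutputFormat.setOutputPath(job, new Path(args[1]));
            System.exit(job.waitForCompletion(true) ? 0 : 1);
        }
    }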

Environment: MS SQL Server 2000/2005, Windows 2000/2003 Server, DTS, WebLogic, Red Hat Enterprise Linux, MS Access, XML, Hadoop, MapReduce, HBase, Sqoop, Amazon Elastic MapReduce, CDH, Cassandra, NoSQL, Teradata

Confidential, IA

SQL/Linux Administrator

Responsibilities:

  • Installed and configured Linux-based systems.
  • Installed, configured, and maintained supported open source Linux operating systems (CentOS, Debian, Fedora).
  • Monitored the health and stability of Linux and Windows system environments.
  • Diagnosed and resolved problems associated with DNS, DHCP, VPN, NFS, and Apache.
  • Scripting expertise including Bash, PHP, Perl, JavaScript, and UNIX shell.
  • Maintained and Monitored Replication by managing the profile parameters
  • Implemented Log Shipping and Database Mirroring
  • Used BCP Utility and Bulk Insert for bulk operations on data
  • Automated and enhanced daily administrative tasks, including disk space management, backup, and recovery.
  • Used DTS and SSIS to Import and Export various forms of data
  • Performed performance tuning, capacity planning, server partitioning, and database security configuration on a regular basis to maintain consistency.
  • Created alerts and notifications to notify system errors
  • Used SQL Server Profiler for troubleshooting, monitoring and optimization of SQL Server
  • Worked with developers in creation of Stored Procedures, triggers and User Defined Functions to handle the complex business rules data and audit analysis
  • Provided 24X7 on call Support
  • Generated daily, weekly, and monthly reports.

Confidential

SQL Server Admin

Responsibilities:

  • Set up SQL Server configuration settings.
  • Exported and imported data from other data sources, such as flat files, using DTS Import/Export.
  • Backed up, packaged, and distributed databases more efficiently using Redgate tools.
  • Automated common tasks and reused functionality in applications using Redgate.
  • Rebuilt indexes at regular intervals for better performance.
  • Designed and implemented a comprehensive backup plan and disaster recovery strategies.
  • Involved in troubleshooting and fine-tuning databases for performance and concurrency.
  • Monitored and improved performance using execution plans and index tuning.
  • Managed the clustered environment.
  • Used log shipping for database synchronization.
  • Implemented SQL logins, roles, and authentication modes as part of security policies for various categories of user support.
  • Monitored SQL Server performance using Profiler to find performance issues and deadlocks.
  • Maintained database consistency with DBCC checks at regular intervals.
