Big Data Consultant Resume
NYC
PROFESSIONAL SUMMARY:
- 6.5 years of IT experience, including 4.5 years with the Hadoop ecosystem, covering installation and configuration of Hadoop ecosystem components in existing clusters.
- Experience in Hadoop administration (HDFS, MapReduce, Hive, Pig, Sqoop, Flume, Oozie, HBase) and NoSQL administration.
- Set up automated 24x7 monitoring and escalation infrastructure for Hadoop clusters using Nagios and Ganglia.
- Experience in installing Hadoop clusters using different distributions: Apache Hadoop, Cloudera, and Hortonworks (HDP).
- Good experience in understanding clients' Big Data business requirements and translating them into Hadoop-centric solutions.
- Analyzed clients' existing Hadoop infrastructure, identified performance bottlenecks, and provided performance tuning accordingly.
- Installed, configured, and maintained HBase.
- Worked with Sqoop to import and export data between relational databases such as MySQL and Oracle and HDFS/Hive.
- Defined job flows in the Hadoop environment using tools like Oozie for data scrubbing and processing.
- Experience in configuring ZooKeeper to provide cluster coordination services.
- Loaded logs from multiple sources directly into HDFS using tools like Flume.
- Experience in benchmarking and in backup and recovery of NameNode metadata and data residing in the cluster.
- Familiar with commissioning and decommissioning nodes on a Hadoop cluster.
- Adept at configuring NameNode high availability; worked on disaster recovery with Hadoop clusters.
- Well experienced in building DHCP, PXE (with Kickstart), DNS, and NFS servers and using them to build out infrastructure in a Linux environment.
- Experienced in Linux administration tasks such as IP management (IP addressing, subnetting, Ethernet bonding, and static IPs).
- Strong knowledge of Hadoop HDFS architecture and the MapReduce framework.
- Experience in deploying and managing multi-node development, testing, and production clusters.
- Experience in understanding security requirements for Hadoop and integrating with Kerberos authentication infrastructure: KDC server setup, realm/domain creation, and ongoing management.
- Worked on setting up NameNode high availability for a major production cluster and designed automatic failover using ZooKeeper and quorum journal nodes (see the configuration sketch after this list).
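A minimal, illustrative hdfs-site.xml sketch of the HA and automatic-failover setup described above; the nameservice, NameNode, and JournalNode names (mycluster, nn1/nn2, jn1-jn3) are placeholders, not actual production values, and the full setup also needs the client failover proxy provider and fencing settings:

    <property><name>dfs.nameservices</name><value>mycluster</value></property>
    <property><name>dfs.ha.namenodes.mycluster</name><value>nn1,nn2</value></property>
    <property><name>dfs.namenode.rpc-address.mycluster.nn1</name><value>nn1-host:8020</value></property>
    <property><name>dfs.namenode.rpc-address.mycluster.nn2</name><value>nn2-host:8020</value></property>
    <!-- shared edit log stored on the quorum journal nodes -->
    <property><name>dfs.namenode.shared.edits.dir</name>
      <value>qjournal://jn1:8485;jn2:8485;jn3:8485/mycluster</value></property>
    <!-- automatic failover via ZKFC; ha.zookeeper.quorum is set in core-site.xml -->
    <property><name>dfs.ha.automatic-failover.enabled</name><value>true</value></property>

Once the configuration is in place, the failover controllers are initialized once with hdfs zkfc -formatZK and then run alongside each NameNode.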
TECHNICAL SKILLS:
Operating Systems: Red Hat, CentOS, Ubuntu, Solaris, Windows 2008/2008 R2
Hardware: Sun Ultra Enterprise servers (E3500, E4500), SPARCserver 1000, SPARCserver 20
Languages: Core Java, C
Hadoop Distributions: Cloudera, Hortonworks (HDP)
Hadoop Ecosystem: MapReduce, YARN, HDFS, Sqoop, Hive, Pig, HBase, Flume, Oozie
Tools: JIRA, PuTTY, WinSCP, FileZilla
Protocols: TCP/IP, FTP, SSH, SFTP, SCP, SSL, ARP, DHCP, TFTP, RARP, PPP and POP3
Databases: HBase (NoSQL), Sybase, Oracle 7.x/8.0/9i, MySQL, SQL
PROFESSIONAL EXPERIENCE:
Confidential, NYC
Big Data Consultant
Responsibilities:
- Responsible for architecting Hadoop clusters; translated functional and technical requirements into detailed architecture and design.
- Worked exclusively on the Cloudera distribution of Hadoop.
- Installed and configured a fully distributed, multi-node Hadoop cluster spanning a large number of nodes.
- Provided Hadoop, OS, and hardware optimizations.
- Set up the machines with network controls, static IPs, disabled firewalls, and tuned swap memory.
- Installed and configured Cloudera Manager for easier management of the existing Hadoop cluster.
- Administered and supported the Hortonworks distribution.
- Worked on setting up high availability for a major production cluster and designed automatic failover control using ZooKeeper and quorum journal nodes.
- Implemented the Fair Scheduler on the JobTracker to allocate a fair share of resources to small jobs.
- Performed operating system installations and Hadoop version updates using automation tools.
- Configured Oozie for workflow automation and coordination.
- Implemented a rack-aware topology on the Hadoop cluster.
- Imported and exported structured data between relational databases and HDFS/Hive using Sqoop (see the command sketch after this list).
- Configured ZooKeeper to provide node coordination and clustering support.
- Configured Flume to efficiently collect, aggregate, and move large amounts of log data from many different sources into HDFS; defined channel selectors to multiplex data into different sinks (an example agent configuration follows this list).
- Developed scripts for benchmarking with TeraSort/TeraGen (example commands follow this list).
- Implemented the Kerberos security authentication protocol for the existing cluster.
- Backed up data on a regular basis to a remote cluster using DistCp (included in the command sketch below).
- Experienced in troubleshooting production-level issues in the cluster and its functionality.
- Regularly commissioned and decommissioned nodes depending on data volume.
- Monitored and configured a test cluster on Amazon Web Services for further testing and gradual migration.
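A minimal Sqoop sketch of the kind of import/export work described above; the connection string, credentials, and table names (dbhost, sales, orders, order_summary) are illustrative placeholders:

    # import an RDBMS table into HDFS and a Hive table (placeholder connection details)
    sqoop import \
      --connect jdbc:mysql://dbhost/sales \
      --username etl_user -P \
      --table orders \
      --hive-import --hive-table sales.orders \
      --num-mappers 4

    # export processed results back to the relational database
    sqoop export \
      --connect jdbc:mysql://dbhost/sales \
      --username etl_user -P \
      --table order_summary \
      --export-dir /user/hive/warehouse/sales.db/order_summary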
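An illustrative Flume agent configuration for the multiplexing setup mentioned above; the agent, channel, and path names are placeholders, and the event header used for routing (logtype) is assumed to be set upstream, for example by an interceptor:

    agent1.sources = src1
    agent1.channels = chErrors chOther
    agent1.sinks = sinkErrors sinkOther

    # tail an application log (placeholder path)
    agent1.sources.src1.type = exec
    agent1.sources.src1.command = tail -F /var/log/app/app.log
    agent1.sources.src1.channels = chErrors chOther

    # multiplexing channel selector: route events by the "logtype" header
    agent1.sources.src1.selector.type = multiplexing
    agent1.sources.src1.selector.header = logtype
    agent1.sources.src1.selector.mapping.error = chErrors
    agent1.sources.src1.selector.default = chOther

    agent1.channels.chErrors.type = memory
    agent1.channels.chOther.type = memory

    # each channel drains to its own HDFS directory
    agent1.sinks.sinkErrors.type = hdfs
    agent1.sinks.sinkErrors.channel = chErrors
    agent1.sinks.sinkErrors.hdfs.path = /flume/errors/%Y-%m-%d
    agent1.sinks.sinkErrors.hdfs.useLocalTimeStamp = true
    agent1.sinks.sinkOther.type = hdfs
    agent1.sinks.sinkOther.channel = chOther
    agent1.sinks.sinkOther.hdfs.path = /flume/other/%Y-%m-%d
    agent1.sinks.sinkOther.hdfs.useLocalTimeStamp = true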
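Representative benchmarking and backup commands for the TeraGen/TeraSort and DistCp work noted above; the examples jar name, row count, paths, and NameNode hostnames are placeholders and vary by distribution and environment:

    # generate ~100 GB of synthetic input (1,000,000,000 rows x 100 bytes), then sort and validate
    hadoop jar hadoop-mapreduce-examples.jar teragen 1000000000 /benchmarks/terasort-input
    hadoop jar hadoop-mapreduce-examples.jar terasort /benchmarks/terasort-input /benchmarks/terasort-output
    hadoop jar hadoop-mapreduce-examples.jar teravalidate /benchmarks/terasort-output /benchmarks/terasort-report

    # copy a directory to the remote backup cluster, transferring only changed files
    hadoop distcp -update hdfs://prod-nn:8020/data/events hdfs://backup-nn:8020/data/events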
Environment: Hortonworks Hadoop, Pig 0.11, Hive 0.10, Sqoop 1.4.3, Flume, MapReduce, HDFS, Oozie, Hue, HCatalog, Java, Eclipse, Linux.
Confidential
Lead Engineer
Responsibilities:
- Experienced in Hadoop 1.x and Hadoop 2.x installation and configuration, including Hadoop clients.
- Set up new Hadoop users, created Kerberos principals, and validated their access (see the command sketch after this list).
- Troubleshooting installation & configuration issues.
- Experienced in deploying Hadoop clusters in pseudo-distributed (single-node) and fully distributed modes.
- Involved in installing Hadoop ecosystem components (Hadoop, MapReduce, YARN, Pig, Hive, Sqoop, Flume, ZooKeeper, and HBase); involved in HDFS maintenance and administering it through the Hadoop Java API.
- Expert in importing and exporting data to and from HDFS using Sqoop and Flume.
- Configured the FIFO and Fair Schedulers to provide service-level guarantees for multiple users of a cluster.
- Managed Hadoop cluster nodes, connectivity, and security.
- Experienced in managing user and group access to various Big Data environments.
- Installed the NameNode, Secondary NameNode, YARN (ResourceManager, NodeManager, ApplicationMaster), and DataNodes.
- Used Sqoop to migrate data between HDFS and MySQL or Oracle, and deployed Hive and HBase integration to perform OLAP operations on HBase data.
- Responsible for configuring the Hadoop cluster and troubleshooting common cluster problems.
- Handled issues related to cluster startup, node failures, and several Java-specific errors on the system.
- Performed cluster configuration and HDFS data transfer (DistCp and HFTP) for inter- and intra-cluster data movement.
- Installed and configured HDP 2.x; responsible for implementation and ongoing administration of the Hadoop infrastructure.
- Monitored an already configured cluster of 40 nodes; installed and configured the Hadoop components HDFS, Hive, and HBase.
- Communicating with the development teams and attending daily meetings.
- Addressed and troubleshot issues on a daily basis; worked with data delivery teams to set up new Hadoop users, including creating Linux users, setting up Kerberos principals, and testing HDFS and Hive access.
- Performed cluster maintenance, including adding and removing nodes; monitored Hadoop cluster connectivity and security.
- Managed and reviewed Hadoop log files; performed file system management and monitoring as well as HDFS support and maintenance.
- Diligently teamed with the infrastructure, network, database, application, and business intelligence teams to guarantee high data quality and availability.
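An illustrative sketch of the user-onboarding steps referred to above; the user name, realm, keytab path, and Hive host (analyst1, EXAMPLE.COM, hiveserver) are placeholders:

    # create the OS account and an HDFS home directory for the new user
    useradd analyst1
    sudo -u hdfs hadoop fs -mkdir /user/analyst1
    sudo -u hdfs hadoop fs -chown analyst1:analyst1 /user/analyst1

    # create the Kerberos principal and export a keytab (run on the KDC host)
    kadmin.local -q "addprinc -randkey analyst1@EXAMPLE.COM"
    kadmin.local -q "xst -k /etc/security/keytabs/analyst1.keytab analyst1@EXAMPLE.COM"

    # validate access to HDFS and Hive with the new credentials
    kinit -kt /etc/security/keytabs/analyst1.keytab analyst1@EXAMPLE.COM
    hadoop fs -ls /user/analyst1
    hive -e "show databases;"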
Environment: Cloudera Hadoop, HDFS, Pig 0.11, Hive 0.10, Sqoop 1.4.3, Flume, Oozie, Cassandra, Hue, Core Java, Eclipse, Linux.
Confidential
Trainee Engineer
Responsibilities:
- Developed new features based on 3GPP specifications.
- Dual-Carrier HSDPA and Enhanced CELL_FACH: implemented the features based on detailed design documents.
- Expertise in L3/RRC, RRM, interfaces to PHY, MAC, RLC and NAS, idle-mode and mobility procedures (e.g., cell selection, reselection, handover, SRVCC), DC-HSDPA, and DL message and IE processing.
- Analyzed logs from lab and field tests around the world, resolved protocol issues, and optimized the software.
- Proficient in chipset certification (GCF, PTCRB, 10776, IOT, etc.) and in the use of test equipment and TTCN.
- Capable of identifying protocol conformance issues across the whole 3GPP protocol stack.
- Paging reception and operations in idle mode.
- Power saving for modem (Connected/Idle state).
- Code and bug management using ClearCase and Perforce.
- ARM debugging using Lauterbach T32.
- Modem log analysis to uncover the root cause of problems.
- Inter-operator development testing with Ericsson and NSN.
- Designed and implemented CMAS, CBS, and ETWS alert systems per 3GPP specifications and operator requirements.
Environment: Visual Studio, Source Insight, Trace32, Perforce, Git/Gerrit, ClearQuest, C, Windows