Hadoop Administrator Resume

Chicago, IL

SUMMARY:

  • Over 4 years of experience in Hadoop big data administration
  • Over 5 years of experience in the information technology field
  • Hands-on experience installing and configuring MapReduce, HDFS, Impala, Flume, Oozie, Hive, Sqoop, and Pig
  • Installed, configured, and upgraded operating systems as required
  • Created bash scripts to automate processes and cleanups (a sample sketch follows this list)
  • Skilled in importing data into HDFS from sources such as Oracle, Teradata, and S3
  • Experienced in installing and configuring Cloudera and Hortonworks distributions, including Cloudera on AWS
  • Handled security permissions for users across Linux, HDFS, and Hue
  • Worked closely with management and other departments to configure the system efficiently
  • Skilled in implementing the Fair Scheduler to manage resources during peak times
  • Gained extensive experience managing and reviewing Hadoop Log files
  • Monitored the performance of the Hadoop ecosystem and estimated cluster capacity
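
As a flavor of the scripted cleanups above, here is a minimal bash sketch; the paths and the 14-day retention window are assumptions for illustration, not details from an actual environment:

    #!/usr/bin/env bash
    # Minimal cleanup sketch: prunes aged local Hadoop logs and HDFS scratch
    # data. LOG_DIR, HDFS_TMP, and RETENTION_DAYS are hypothetical values.
    set -euo pipefail

    LOG_DIR="/var/log/hadoop"        # assumed local log location
    HDFS_TMP="/tmp/hive-staging"     # assumed HDFS scratch directory
    RETENTION_DAYS=14

    # Remove local log files older than the retention window
    find "$LOG_DIR" -type f -name '*.log*' -mtime +"$RETENTION_DAYS" -delete

    # Clear aged HDFS staging data; HDFS expands the quoted glob itself,
    # and -skipTrash frees the space immediately
    hdfs dfs -rm -r -f -skipTrash "$HDFS_TMP/*" 2>/dev/null || true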

PROFESSIONAL EXPERIENCE:

Confidential, Chicago, IL

Hadoop Administrator

Responsibilities:

  • Meticulously structured Cloudera production and non-production clusters
  • Efficiently worked as a team in a fast-paced environment to meet deadlines
  • Implemented practices to ensure the cluster was scalable
  • Assembled a governance practice to keep cluster health in check
  • Carefully worked with management and architects to assemble a disaster recovery plan
  • Configured Flume and Kafka to set up a live stream of data
  • Created a daily job to compress data from 300GB/day to 42GB/day and coalesce 250,000 files/day into 300 files/day (a sketch of one approach follows this list)
  • Greatly reduced the number of blocks to ease cluster memory stress
  • Enabled Sentry and Kerberos to ensure data protection
  • Worked closely with developers to develop and test new jobs
  • Used Sqoop to import over 36TB of data from Teradata and converted it to Parquet (an illustrative command follows this list)
  • Fashioned a new sandbox cluster to test upgrades and beta services
  • Provided monthly updates on the health of the cluster to upper management
  • Trained other DBAs to assist in day to day activities
  • Worked with Oracle and Cloudera to troubleshoot issues and implement upgrades
  • Tested a smaller Hortonworks cluster
  • Built a POC cluster in AWS; planning to migrate the whole cluster to the cloud within two years
  • Received the highest employee year-end review score in my division
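
A sketch of how the daily compaction job above might be implemented, assuming a Hive-based rewrite; the table names, partition column, and codec are illustrative assumptions rather than details of the actual job:

    #!/usr/bin/env bash
    # Hypothetical daily compaction: rewrites yesterday's partition with
    # output compression enabled and Hive's small-file merge turned on.
    DAY="$(date -d yesterday +%Y-%m-%d)"

    hive -e "
      SET hive.exec.compress.output=true;
      SET mapred.output.compression.codec=org.apache.hadoop.io.compress.GzipCodec;
      SET hive.merge.mapfiles=true;
      SET hive.merge.mapredfiles=true;
      SET hive.merge.smallfiles.avgsize=256000000;
      SET hive.exec.dynamic.partition=true;
      SET hive.exec.dynamic.partition.mode=nonstrict;
      -- events_raw and events_compact are placeholder tables, both partitioned by dt
      INSERT OVERWRITE TABLE events_compact PARTITION (dt)
      SELECT * FROM events_raw WHERE dt='${DAY}';
    "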
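The Teradata-to-Parquet imports mentioned above were driven through Sqoop; an illustrative command follows, with the JDBC URL, credentials, table, and target path all hypothetical:

    # Illustrative only: every connection detail here is a placeholder.
    # --as-parquetfile writes Parquet directly during the import.
    sqoop import \
      --connect jdbc:teradata://td-prod.example.com/DATABASE=sales \
      --username etl_user -P \
      --table ORDERS \
      --split-by ORDER_ID \
      --num-mappers 8 \
      --as-parquetfile \
      --target-dir /data/raw/orders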

Confidential, Farmington Hills, MI

Hadoop Administrator

Responsibilities:
  • Set up, configured, and managed security for the Cloudera Hadoop cluster
  • Used Hive and Impala to perform data analysis
  • Loaded log data into HDFS using Flume and Kafka (a sample Flume agent configuration follows this list)
  • Created multi-cluster tests to assess the system’s performance and failover
  • Built a scalable Hadoop cluster for data solutions
  • Responsible for maintenance and creation of nodes
  • Managed log files, backups, and capacity
  • Found and troubleshot Hadoop errors
  • Worked with other teams to decide the hardware configuration
  • Implemented cluster high availability
  • Scheduled jobs using the Fair Scheduler (a sample allocation file follows this list)
  • Configured alerts to find possible errors
  • Handled patches and updates
  • Maintained the Linux systems by creating automated scripts
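
A minimal sketch of the kind of Flume agent used to land log data in HDFS, as referenced above; the agent name, tailed file, and HDFS path are assumptions for illustration:

    #!/usr/bin/env bash
    # Write a hypothetical Flume agent config, then start the agent.
    cat > /etc/flume-ng/conf/logs-agent.conf <<'EOF'
    a1.sources = r1
    a1.channels = c1
    a1.sinks = k1

    a1.sources.r1.type = exec
    a1.sources.r1.command = tail -F /var/log/app/app.log
    a1.sources.r1.channels = c1

    a1.channels.c1.type = memory
    a1.channels.c1.capacity = 10000

    a1.sinks.k1.type = hdfs
    a1.sinks.k1.channel = c1
    a1.sinks.k1.hdfs.path = /data/logs/%Y-%m-%d
    a1.sinks.k1.hdfs.fileType = DataStream
    a1.sinks.k1.hdfs.rollInterval = 3600
    a1.sinks.k1.hdfs.useLocalTimeStamp = true
    EOF

    flume-ng agent -n a1 -c /etc/flume-ng/conf -f /etc/flume-ng/conf/logs-agent.conf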
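For the Fair Scheduler use above, queues are defined in an allocation file; below is a hedged sketch with hypothetical queue names, weights, and limits (on a Cloudera Manager-managed cluster this would normally be edited through the CM resource pools UI instead):

    #!/usr/bin/env bash
    # Hypothetical fair-scheduler allocation file; the queue names and
    # resource figures are illustrative only.
    cat > /etc/hadoop/conf/fair-scheduler.xml <<'EOF'
    <?xml version="1.0"?>
    <allocations>
      <queue name="etl">
        <weight>3.0</weight>
        <minResources>10240 mb, 10 vcores</minResources>
      </queue>
      <queue name="adhoc">
        <weight>1.0</weight>
        <maxRunningApps>20</maxRunningApps>
      </queue>
    </allocations>
    EOF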

Confidential, Plano, TX

Linux Administrator/Hadoop Administrator

Responsibilities:
  • Analyzed the Hadoop cluster and other big data analysis tools, including Pig
  • Implemented multiple nodes on a CDH3 Hadoop cluster running Red Hat Linux
  • Imported data from the Linux file system into HDFS
  • Installed clusters, started and stopped DataNodes, and recovered NameNodes
  • Assisted with capacity planning and slot configuration
  • Created HBase tables to house data from different sources
  • Moved data from Teradata into HBase using Sqoop (an illustrative command follows this list)
  • Worked with a team to successfully tune Spark query performance
  • Excelled in managing and reviewing Hadoop log files
  • Worked with management to determine the optimal way to report on datasets
  • Installed, configured, and monitored Hadoop Clusters using Cloudera
  • Balanced and tuned HDFS, Hive, Impala, MapReduce, and Oozie workflows
  • Maintained and backed up meta-data
  • Configured Kerberos for the clusters
  • Used data integration tools like Flume and Sqoop
  • Set up automated processes to analyze the system and find errors
  • Supported the Linux engineering department in cluster hardware upgrades
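
An illustrative Sqoop-to-HBase import along the lines of the Teradata transfer above; the connection string, table, row key, and column family are all hypothetical:

    # Illustrative only: every name and connection detail is a placeholder.
    sqoop import \
      --connect jdbc:teradata://td-host.example.com/DATABASE=dw \
      --username loader -P \
      --table CUSTOMERS \
      --hbase-table customers \
      --column-family cf \
      --hbase-row-key CUSTOMER_ID \
      --num-mappers 4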
