Team Lead Big Data Engineer Resume

SUMMARY

  • Hands-on lead of a Big Data Engineers group. 17+ years of IT experience, including over 5 years of extensive experience with Hadoop and AWS technologies.
  • Successful implementation of high performance, scalable Big Data analytics platforms.
  • In-depth understanding of Hadoop architecture and its components.
  • Experience managing day-to-day decisions, work assignments and work sequencing.
  • Ability to drive implementation of key Big Data infrastructure initiatives.
  • Experience working in global settings with onshore and offshore teams and team members.
  • Excellent communication and relationship-building skills and a strong work ethic.

TECHNICAL SKILLS

Hadoop: Hortonworks HDP, Cloudera CDH, Pivotal HD;

Big Data: HDFS, YARN, MapReduce, Spark, Hive, Tez, Impala, Pig, ZooKeeper, NiFi, HBase, Phoenix, Ranger, Sentry, Kafka, Flume, HAWQ, Elasticsearch, Oozie, Zeppelin Notebook, Cassandra;

Cloud: AWS - S3, EC2, IAM, RDS, CloudWatch, CloudFront, DynamoDB, Redshift, Route 53, SNS, HDP and HDC clusters, SAML access; Microsoft Azure, Snowflake;

BI tools: MicroStrategy, Tableau;

Security: Kerberos, Centrify, UNAB;

Analytics: Bedrock, Kyvos, Jethro, H2O, Datameer, Alpine Data;

Databases: MySQL, PostgreSQL;

Monitoring: Pepperdata, Grafana, Ganglia, Nagios;

Automation: Ansible, Ansible Tower;

Languages: Shell scripting, Python, LabVIEW, Matlab, C/C++, HTML, XML, KML, Visual Basic, Pascal;

OS: Linux, Windows, DOS

PROFESSIONAL EXPERIENCE

Confidential

Team Lead Big Data Engineer

Responsibilities:

  • Assembled, mentored and led a team supporting the Big Data platform for the Confidential Digital Ad Sales business. The platform is built on several on-premises and AWS cloud Hadoop clusters using key technologies: Hive, Spark, Talend and Bedrock. The team includes onshore and offshore members and delivers uninterrupted 24x7 availability of the Big Data platform.
  • Coordinated a cluster migration project between data centers, which included evaluating different cluster architectures and options for virtualization and cloud solutions.
  • Managed the building of Hadoop clusters in virtualized environments with detached storage such as Isilon and NetApp. Planned and implemented all steps of cluster migration from physical on-premises environments to virtualized environments.
  • Led all stages of design, planning, building and tuning of a 2.7-petabyte Hadoop cluster and several smaller clusters in a physical on-premises environment. Led cluster capacity growth planning and cluster expansion, from hardware evaluation and purchasing to the addition of new servers to existing clusters.
  • Migrated projects, including Hive databases, HDFS directories, workflows, code and access policies, between on-premises clusters and to AWS cloud clusters.
  • Initiated and implemented multiple automations for environment stabilization.
  • Led administration of an AWS account with over 1 petabyte of data. Automated growth monitoring and lifecycle management for AWS S3 and Glacier buckets. Provided user access to the AWS account, including creation of users, groups, roles and policies in IAM. Backed up data from HDFS to S3 and Glacier in AWS and restored data from S3.
  • Led and guided the vendor transition for the Big Data team and trained new team members to follow Hadoop best practices and Confidential standards.
  • Worked with Hortonworks Data Cloud and with Hortonworks and Cloudera clusters on EC2; created AMIs for critical servers and resized AWS clusters by adding new EC2 instances.
  • Supported data consolidation projects in which data from multiple sources, arriving at various speeds and in various formats, was ingested into Hadoop, usually with the Bedrock or Talend ingestion tools. Subsequent steps included parsing and cleaning the data with Python, Spark and Scala and transforming it in Hive for further analysis and reports in Tableau and MicroStrategy.
  • Supported the Big Data platform and Big Data analytics tools. Responsibilities included cluster setup, tuning, performance monitoring, software installation and configuration, software upgrades and backup.
  • Installed, upgraded, resized and decommissioned Hadoop clusters of Hortonworks, Cloudera and Pivotal distributions of varying sizes and configurations.
  • Set up high availability for NameNode, ResourceManager and HiveServer2. Implemented well-balanced cluster capacity management in a multi-tenant environment.
  • Created Hive databases and tables, evaluating more efficient data formats such as ORC and Parquet. Analyzed Spark, Hive and Tez applications to improve performance for ETL tasks. Tuned applications to use cluster resources more efficiently.
  • Configured and maintained security on Hadoop clusters with cluster-level Active Directory Kerberos authentication and with server-level user access control through Centrify and UNAB.
  • Provided access to cluster resources by managing LDAP-Active Directory user authentication in Ranger with Ranger UI and in Sentry with Hue.
  • Conducted multiple PoCs and tested the performance of various technologies: Jethro, Kyvos, AtScale, LLAP, Spark2, Redshift and Snowflake. Provided support for the AtScale, Datameer, H2O and Alpine Data platforms used by end users for business intelligence analytics.

Confidential

Senior Research Engineer

Responsibilities:

  • Provided general Linux administration.
  • Structured and organized large amounts of data from acoustical and video sensors for further processing and analysis.
  • As a member of a team, created software for on-water and underwater motion detection. Obtained a patent for the technology.
  • Analyzed data using different techniques such as Fourier transform, correlation functions, statistical analysis, interpolation and linear and nonlinear regression for real-time signal processing applications.
  • Implemented mathematical models for differential equations (wave and heat equations) allowing for visualization and analysis of corresponding physical effects.
  • Designed, developed, tested and maintained programs for multi-channel simultaneous data acquisition at various speeds, real-time signal processing and post-processing, data analysis, signal generation, process and instrument control.
  • Developed software for control, video processing and video recording for IP-cameras with Pan-Tilt-Zoom capabilities. Integrated camera control with AIS, acoustical and radar sensors, allowing camera to automatically follow river vessels.
  • As a member of a team, created software for a diagnostic system for nondestructive detection of cracks in structures (continuous and impact methods).
  • Developed software for linear and nonlinear land mine detection.
  • Wrote testing programs for termite detection, automated measurement of mechanical impedance and microwave detection of decay spots in wooden beams inside walls.
  • Participated in projects for adaptive real-time transmission of large amounts of video and acoustical data over networks with unstable bandwidth.
  • Performed operations on extremely large numbers, well beyond the range of standard numeric operations.