Sr. Big Data Engineer Resume

Charlotte, NC

PROFESSIONAL SUMMARY:

  • Certified Big Data Engineer with 15 years of experience designing and developing Big Data platforms with cutting-edge technologies (Spark, Kafka, NiFi, AWS, CDH) to build end-to-end data pipelines with attention to data quality, scalability, performance, and maintainability.
  • Design and development of ELT and data streaming platforms using the Big Data ecosystem and cutting-edge technologies
  • Experience in design and development of ingestion frameworks from multiple sources to Hadoop using the Spark framework with PySpark and PyCharm
  • Prepare data sets to requirements so that data scientists can analyze data usage patterns and make recommendations
  • Experienced in building data ingestion platforms from various channels to Hadoop using PySpark and the Spark Streaming framework
  • Design and perform data transformations using data mapping and data processing tools such as Spark SQL, PySpark, Python, and Scala
  • Specialized in software development of data pipelines and implementation of streaming analytics platforms on Cloudera and AWS
  • Specialized in software analysis, risk analysis, and reliability analysis, and in acting on the findings
  • Experienced in Big Data architecture and in developing highly scalable, large-scale distributed data processing systems
  • Knowledgeable in designing, deploying, and operating highly available, scalable, and fault-tolerant systems using Amazon Web Services (AWS).
  • Design, build, and manage analytics infrastructure that data analysts and data scientists can use for big data analytics
  • Development of Machine Learning and AI solutions for IoT, advanced power flow distributions, and Smart Grid using TensorFlow, Spark, TigerGraph, and Python.
  • Experienced with event-driven and scheduled AWS Lambda functions to trigger various AWS resources.
  • Experienced working in Agile/Scrum environments.

TECHNICAL SUMMARY:

Big Data Technologies: Spark, Kafka, NiFi, PySpark, Python, Anaconda, Pandas, GPU, Scala, GCP & AWS Analytics

Development Technologies: GCP, BigQuery, AWS RDS, AWS EMR/EC2, Glue, Redshift, GCP & AWS Analytics

Development Technologies: PySpark, GitHub, UCD, Jenkins, Spark, Kafka, Python, AWS EMR/EC2, Glue, Athena, Ambari, Cloudbreak, AWS CloudWatch, Cloudera Manager, Knox, Grafana

PROFESSIONAL EXPERIENCE:

Confidential, Charlotte, NC

Sr. Big Data Engineer

Responsibilities:

  • Implemented PySpark's built-in machine learning routines and utilities to create full machine learning pipelines on the Hadoop 3 platform
  • Design and develop Spark DataFrames that behave like SQL tables and perform data manipulations such as grouping, aggregating, and filtering
  • Model tuning and selection with the PySpark machine learning framework to create and fit models
  • Development of Machine Learning and AI solutions for IoT, advanced power flow distributions, and Smart Grid using TensorFlow, PySpark, Graph DB, CUDA, Numba, and Python.
  • Use the pyspark.sql module, which provides optimized data queries through the Spark session
  • Create a Pipeline with the pyspark.ml module that combines all the Estimators and Transformers
  • Development of real-time streaming analytics using HDP, HDF, NiFi, and PySpark
  • Implementation of the AWS cloud computing platform using AWS EMR/EC2, Glue, Redshift, Athena, S3, RDS, DynamoDB, and Python
  • Understand the existing data pipelines and implement faster streaming analytics using HDP and HDF data flow with Apache NiFi, Kafka, and Spark

Environment: - AWS, Hortonworks, NiFi, PySpark, Scala, Kafka, Java, HDFS, Hive, Red Hat Linux, Ambari, Redshift and DynamoDB

Confidential, Atlanta, GA

Sr. Technical Architect

Responsibilities:

  • Design and development of a Big Data analytical platform as per client requirements, and engagement in technical discussions
  • Understand the existing data pipelines and implement faster streaming analytics using HDP and HDF data flow with Apache NiFi, Kafka, and Spark
  • Performed real-time stream processing through the data lake using HDP, HDF, NiFi, and PySpark
  • Implement the AWS cloud computing platform using AWS EMR/EC2, Glue, Redshift, Athena, S3, RDS, DynamoDB, and Python
  • Managing and developing frameworks at enterprise level and developing custom processors for specific requirements using NiFi
  • Implemented HDF data flow and Data Plane to manage, secure, and govern data across data centers
  • Processed data sets in various formats (ORC, Parquet, Avro, JSON) using PySpark

Environment: - AWS, Hortonworks, NiFi, PySpark, Scala, Kafka, Java, HDFS, Hive, Red Hat Linux, Ambari, Redshift and DynamoDB

Confidential, Atlanta, GA

Sr. Big Data Architect

Responsibilities:

  • Understand the existing data pipelines and implement faster streaming analytics using HDP and HDF data flow with Apache NiFi, Kafka, and Spark
  • Design and development of a Big Data analytical platform as per client requirements, and engagement in technical discussions
  • Design and deployment of code in on-premises, AWS cloud, or hybrid environments
  • Implementation of HDF data flow and Data Plane to manage, secure, and govern data across data centers and in the cloud
  • Performed real-time stream processing through the data lake using HDP, HDF, NiFi, Spark/Scala, and Kafka

Environment: - Cloudera, AWS, EMR, Redshift, Linux, Cloudera Manager, CloudWatch

Confidential, Atlanta, GA/Dublin, OH

Sr. Big Data Architect

Responsibilities:

  • Implementation and management of Hortonworks Data Platform with high-availability solutions (HIPAA)
  • Developed a high-speed BI layer on the Hadoop platform with Apache Spark, Java, and Python
  • Developed core Java client APIs for HBase to perform CRUD operations on HBase tables
  • Worked with HBase Java constructors and the HBase Java classes Put, Get, and Result
  • Experience in using version control systems such as ClearCase, CVS, SVN, and Git.
  • Worked with the application team to implement Cassandra and fine-tune it according to requirements
  • Thorough understanding of client requirements; implemented data ingestion confidentiality and ingest workflows

Environment: - Hadoop, Hortonworks, NiFi, Spark, Scala, Kafka, Java, HDFS, Hive, AWS, Red Hat Linux, Ambari, Grafana, Zeppelin

Confidential, Atlanta, GA

Sr. Oracle Consultant

Responsibilities:

  • Development of Oracle applications as per client requirements
  • Installation and configuration of Apache Hadoop clusters in pseudo-distributed and fully distributed modes
  • Set up and managed HDFS Federation, CDH 4.2, and YARN/MapReduce 2 (MR2)
  • Set up and managed Hadoop and Big Data tooling: Hadoop administration, Hive, HBase, Pig, Mahout, Spark, Linux, scripting (Python, Perl, Shell), and open-source tools

Environment: - Oracle, PL/SQL, Oracle BDA, PostgreSQL, StreamSets, GoldenGate, Cloudera

Confidential

Technical Lead

Responsibilities:

  • Coordinate with team members and plan project tasks in an Agile environment
  • Development of Oracle applications as per client requirements
  • Performed RAC upgrades from 10g RAC to 11g RAC
  • Provided technical expertise on multiple environments (Development, Integration
  • Installed and configured a 4-node 11g RAC with ASM
  • Perform day-to-day RAC maintenance operations.
  • Supported application code builds in adherence to the software development lifecycle (SDLC)

Environment: - Oracle, Oracle RAC, ASM, PostgreSQL
