Big Data Developer Resume

Mountain View, CA

PROFESSIONAL SUMMARY:

  • Experienced and self-motivated Big Data Developer with over 6 years of extensive experience in the Big Data ecosystem and Enterprise Data Warehouse systems.
  • Agile development experience with a proven track record of successful implementations.
  • Experience in designing and developing complex business applications.
  • Experience in Relational Databases (Oracle/PostgreSQL)
  • Strong knowledge of the Big Data ecosystem: Cloudera Impala, Hive, HDFS, Pig, Sqoop, Oozie, Apache Airflow, Spark
  • Experience in PySpark, Python, Scala, Cassandra, AWS EMR, S3, and AWS Data Pipeline, and in tuning Hive and Impala queries.
  • Worked in agile teams and well versed in the SAFe agile methodology.

WORK HISTORY:

Big Data Developer

Confidential, Mountain View, CA

Responsibilities:

  • Currently working as a Spark developer (Scala) and onsite coordinator, building a Hadoop-based analytics platform for measuring Confidential online TV, video, and other digital content across the web and apps.
  • Worked closely with clients to establish problem specifications and system designs
  • Developed applications for GDPR compliance to support further business in the EU.
  • Used the Apache Airflow platform to programmatically author, schedule, and monitor data pipelines (a minimal DAG sketch follows this list).
  • Hands-on experience in Python scripting.
  • Used AWS Data Pipeline to create complex data processing workloads that are fault tolerant, repeatable, and highly available.
  • Used AWS EMR to run Spark jobs, which provided dynamic cluster resizing.
  • Performed data profiling of source systems and defined data rules for their extraction and integration.
  • Collaborated with product management to design, build and test systems.
  • Worked in agile teams and am well versed in the SAFe agile methodology.
  • Worked successfully in a fast-paced environment, both independently and in collaborative teams.
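
The Airflow pipelines above were defined as Python DAGs. The following is a minimal sketch of that pattern, assuming Airflow 2.x; the DAG id, owner, schedule, and spark-submit commands are illustrative placeholders rather than the production pipeline.

    from datetime import datetime, timedelta

    from airflow import DAG
    from airflow.operators.bash import BashOperator

    default_args = {
        "owner": "data-eng",               # hypothetical owner
        "retries": 2,
        "retry_delay": timedelta(minutes=10),
    }

    # One DAG: ingest content metrics daily, then validate the output.
    with DAG(
        dag_id="daily_content_metrics",    # illustrative name
        start_date=datetime(2021, 1, 1),
        schedule_interval="@daily",
        default_args=default_args,
        catchup=False,
    ) as dag:
        ingest = BashOperator(
            task_id="spark_ingest",
            bash_command="spark-submit --deploy-mode cluster s3://bucket/jobs/ingest.py",
        )
        validate = BashOperator(
            task_id="validate_counts",
            bash_command="spark-submit --deploy-mode cluster s3://bucket/jobs/validate.py",
        )
        ingest >> validate    # validate runs only after a successful ingest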

Environment: AWS EMR, AWS S3, AWS Data Pipeline, Apache Airflow, Spark.

Big Data Developer

Confidential, Tampa, FL

Responsibilities:

  • Worked as a Hadoop developer (Scala) and onsite coordinator, building a Hadoop-based analytics platform for measuring online TV, video, and other digital content across the web and apps.
  • Designed and developed Big Data components including HDFS, Impala, Hive, Sqoop, Pig, and Oozie in the Cloudera distribution.
  • Involved in loading and transforming large datasets from relational databases into HDFS and vice versa using Sqoop imports and exports.
  • Scheduled workflows using the Oozie workflow engine.
  • Created partitions and buckets based on state for further processing with bucket-based Hive joins.
  • Implemented advanced processing such as text analytics in Hive using regular expressions and window functions.
  • Solved performance issues in Hive scripts with an understanding of the execution plan, joins, grouping, and aggregation, and how they translate to MapReduce jobs.
  • Solved performance issues in Impala scripts with an understanding of the execution plan, joins, grouping, and aggregation.
  • Performed transformations, cleaning, and filtering on imported data using Hive, MapReduce, and Impala, and loaded the final data into HDFS.
  • Developed Spark scripts using Scala and the PySpark shell.
  • Handled a PySpark proof of concept.
  • Wrote complex Hive queries involving external Hive tables dynamically partitioned on date.
  • Loaded data into Spark RDDs/DataFrames and performed in-memory computation to generate the output response (see the PySpark sketch after this list).
  • Developed Spark code and Spark SQL for faster testing and processing of data.
  • Knowledge of cloud technologies such as AWS EMR.
  • Performed data profiling of source systems and defined data rules for their extraction and integration.
  • Worked on a data validation strategy by analyzing the data architecture, data ingestion framework, and Hadoop information architecture.
  • Carried out specified data processing and statistical techniques.
  • Analyzed the provider's raw and aggregated data, drew conclusions, and developed recommendations.
  • Worked closely with clients to establish problem specifications and system designs
  • Performed regression and system-level testing to verify software quality and function before it was released.
  • Collaborated with product management to design, build and test systems.
  • Worked in agile teams and am well versed in the SAFe agile methodology.
  • Worked successfully in a fast-paced environment, both independently and in collaborative teams.
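
The following is a minimal PySpark sketch of the pattern described in the bullets above: cleaning imported data in memory with DataFrames, then writing it into a date-partitioned external Hive table via dynamic partitioning. The table names, columns, and paths are hypothetical, not the actual schema.

    from pyspark.sql import SparkSession

    spark = (
        SparkSession.builder
        .appName("content-metrics")
        .enableHiveSupport()          # needed to read/write Hive tables
        .getOrCreate()
    )

    # External Hive table partitioned on date (hypothetical schema)
    spark.sql("""
        CREATE EXTERNAL TABLE IF NOT EXISTS viewership (
            device_id STRING,
            minutes_watched DOUBLE,
            state STRING
        )
        PARTITIONED BY (view_date STRING)
        STORED AS PARQUET
        LOCATION '/data/warehouse/viewership'
    """)

    # Allow a single insert to populate many date partitions at once
    spark.sql("SET hive.exec.dynamic.partition=true")
    spark.sql("SET hive.exec.dynamic.partition.mode=nonstrict")

    # In-memory cleaning/filtering with DataFrames; the partition column
    # must come last in the select for insertInto
    raw = spark.table("staging.viewership_raw")      # hypothetical staging table
    cleaned = (
        raw.filter(raw.minutes_watched > 0)          # basic filtering/cleaning
           .select("device_id", "minutes_watched", "state", "view_date")
    )
    cleaned.write.insertInto("viewership")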

Environment: Cloudera Impala, Hive, Sqoop, PostgreSQL, AWS EMR, Oozie, Pig, Spark, Parquet, ORC

Senior Developer

Confidential, San Francisco, CA

Responsibilities:

  • Involved in defining complex business rules for extracting and transforming data across multiple source systems to provide both independent and integrated views of the data.
  • Performed data profiling of source systems and defined data rules for their extraction and integration.
  • Transformed complex Netezza queries.
  • Wrote complex SQL queries at varying levels of granularity for the analysis dimensions and provided business insights to users, enabling guided decision making and what-if analysis.
  • Involved in the analysis, design, and testing phases and responsible for documenting technical specifications; very good understanding of Impala query performance tuning.
  • Responsible for creating Hive external tables on the finalized data in HDFS and partitioning and bucketing the data (a sketch follows this list).
  • Provided technical support for production environments: resolved issues, analyzed defects, and provided and implemented solutions for them.
  • Communicated emergency problems, progress status and support services to all clients.
  • Drafted comprehensive reports to document bugs and design flaws.
  • Collaborated with product management to design, build and test systems.
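
The following is a minimal sketch of the external-table step above: exposing finalized HDFS data as a partitioned, bucketed Hive external table. The DDL is plain HiveQL, issued here through PySpark's SQL interface for consistency with the earlier sketches; the schema, bucket count, and location are illustrative assumptions.

    from pyspark.sql import SparkSession

    spark = SparkSession.builder.appName("ext-tables").enableHiveSupport().getOrCreate()

    # Partitioning prunes whole directories at query time; bucketing by
    # account_id clusters rows for bucket-based joins (hypothetical schema).
    spark.sql("""
        CREATE EXTERNAL TABLE IF NOT EXISTS finalized_metrics (
            account_id STRING,
            metric_name STRING,
            metric_value DOUBLE
        )
        PARTITIONED BY (load_date STRING)
        CLUSTERED BY (account_id) INTO 32 BUCKETS
        STORED AS PARQUET
        LOCATION '/data/final/metrics'
    """)

    # Register partitions already present under the external location
    spark.sql("MSCK REPAIR TABLE finalized_metrics")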

Environment: Cloudera Impala, Oracle, Netezza, Hive, Sqoop, Parquet

Developer

Confidential

Responsibilities:

  • Worked closely with business users to define the panelist Intab function.
  • Involved in defining complex business rules for extracting and transforming data across multiple source systems to provide both independent and integrated views of the data.
  • Performed data profiling of panel data systems and defined data rules for their extraction and integration.
  • Provided technical support for production environments: resolved issues, analyzed defects, and provided and implemented solutions for them.
  • Coordinated with technical team members and conducted system integration testing services.
  • Performed requirement gathering, designing, and development of data models for operational control and monitoring.
  • Coordinated with other client teams to organize and conduct system and integration testing services.
  • Communicated emergency problems, progress status and support services to all clients.

Environment: Netezza, Oracle, Java

Developer

Confidential

Responsibilities:

  • Prepared technical and functional designs in collaboration with teams and business users.
  • Developed data warehouse solutions and applications by translating business and functional needs.
  • Suggested steps and concepts for detailed estimation, design, and functionality of data warehouses.
  • Maintained program documentation, operational procedures, and user guidelines per client requirements.
  • Coordinated with technical team members and conducted system integration testing.
  • Performed requirement gathering, designing, and development of data models for operational control and monitoring.
  • Regularly imported, cleaned, transformed, validated, and modeled data to understand it and draw conclusions for decision-making purposes.

Environment: Netezza, Oracle, Java

Developer

Confidential

Responsibilities:

  • Prepared technical and functional designs in collaboration with teams and business users.
  • Developed data warehouse solutions and applications by translating business and functional needs.
  • Maintained program documentation, operational procedures, and user guidelines per client requirements.
  • Coordinated with technical team members and conducted system integration testing.
  • Performed requirement gathering, designing, and development of data models for operational control and monitoring.
  • Regularly imported, cleaned, transformed, validated, and modeled data to understand it and draw conclusions for decision-making purposes.
  • Debugged and modified software components.
  • Developed code fixes and enhancements for inclusion in future code releases and patches.
  • Built, tested and deployed scalable, highly available and modular software products.

Environment: Netezza, Oracle, Java
