
Big Data Developer Resume


SUMMARY:

  • Versatile professional with over 4 years of technology and business experience, with core competencies in Big Data and domain strengths in retail banking, product development, and telecommunications
  • Designed and documented the architecture and development process to convert database and data warehouse models into Hadoop-based systems
  • Experience in loading data coming from different data sources into HDFS and automate data ingestion and transformation jobs
  • Performed data extraction and data wrangling using the Pandas and NumPy modules in Python
  • Developed predictive analytics to generate business insights
  • Queried data from SQL data sources to build visualizations and dashboards in Tableau
  • Created Hive tables with dynamic partitioning and bucketing, tuned their performance, and queried them using HiveQL
  • Good knowledge of Spark's in-memory capabilities and its modules: Spark Core, Spark SQL, Spark Streaming, and MLlib
  • Experience in developing Spark jobs using the PySpark API
  • Programmed in Hive, Spark SQL, and Python to streamline incoming data and build data pipelines that yield useful insights
  • Working knowledge of streaming applications and scheduling workflows
  • Ability to work under pressure and adapt to constantly changing work environment
  • Possess good communication, analytical and organizational skills and able to multi-task efficiently
  • Self-motivated, excellent team player and ability to work independently as well
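A minimal sketch of the kind of Pandas/NumPy data wrangling described above; the dataset, column names, and values are illustrative, not taken from any real system:

```python
import numpy as np
import pandas as pd

# Hypothetical transaction extract with a missing value and
# inconsistent category casing (illustrative data only).
raw = pd.DataFrame({
    "account_id": [101, 101, 102, 103, 103],
    "amount": [250.0, np.nan, 90.5, 40.0, 60.0],
    "channel": ["ATM", "ONLINE", "atm", "BRANCH", "online"],
})

# Typical wrangling steps: normalize categories, impute missing
# amounts with the median, then aggregate per account.
raw["channel"] = raw["channel"].str.upper()
raw["amount"] = raw["amount"].fillna(raw["amount"].median())
summary = (raw.groupby("account_id", as_index=False)["amount"]
              .sum()
              .rename(columns={"amount": "total_amount"}))
```

The same cleaned frame can then be written out for downstream visualization or loaded into HDFS for analytics.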

TECHNICAL SKILLS:

Programming: Python, Scala, R

Python Libraries (data manipulation): NumPy, Pandas, Matplotlib, Plotly, scikit-learn

RDBMS: MySQL, Oracle, PostgreSQL

NoSQL: MongoDB, Cassandra, HBase

Methodologies: Agile

Big Data: Hadoop, HDFS, MapReduce, Spark, Hive, Pig, Sqoop, Apache Airflow, Kafka, Oozie

Spark: Spark Core, Spark SQL, Spark Streaming, PySpark, Scala

Analytical Tools: SQL, Jupyter Notebook, Tableau, Zeppelin

Others: AWS, TWS, Shell Script

PROFESSIONAL EXPERIENCE:

Confidential

Big Data Developer

Responsibilities:

  • Designed and implemented data ingestion techniques for data coming from various source systems
  • Developed Python scripts to collect data from source systems and store it on HDFS to run analytics
  • Created tables in Hive and integrated data between Hive and Spark
  • Created Hive Partitioned and Bucketed tables to improve performance
  • Created Hive tables with User defined functions
  • Worked on the core and Spark SQL modules of Spark extensively using Python
  • Performed extensive studies of different technologies and captured metrics by running different algorithms
  • Defined data layouts and rules in consultation with ETL teams
  • Worked in an Agile environment and participated in daily stand-up/Scrum meetings
  • Reduced application elapsed time by streaming the applications, helping the business shorten time to market

Environment: Hadoop, MapReduce, HDFS, Hive, Spark, SQL, Sqoop, Python, Airflow, NoSQL
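The partitioned and bucketed Hive tables mentioned above can be sketched in HiveQL; the database, table, and column names here are assumptions for illustration, not the actual schema:

```sql
-- Partitioned, bucketed table stored as ORC (names are illustrative).
CREATE TABLE IF NOT EXISTS retail.transactions (
  txn_id     BIGINT,
  account_id BIGINT,
  amount     DECIMAL(12,2)
)
PARTITIONED BY (txn_date STRING)
CLUSTERED BY (account_id) INTO 32 BUCKETS
STORED AS ORC;

-- Dynamic partitioning routes rows to partitions at insert time.
SET hive.exec.dynamic.partition.mode=nonstrict;
INSERT INTO TABLE retail.transactions PARTITION (txn_date)
SELECT txn_id, account_id, amount, txn_date
FROM staging.transactions_raw;
```

Partitioning prunes scans by date, while bucketing on the join key speeds up joins and sampling.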

Confidential

Big Data Developer

Responsibilities:

  • Ingested data between Oracle and Hive in both directions using Sqoop
  • Ingested data from mainframe systems into Hive for analytics consumption
  • Delivered proofs of concept on Hive file systems
  • Created Hive tables with UDFs, dynamic partitioning, and bucketing to improve performance
  • Worked on Spark core and Spark SQL modules using PySpark API
  • Worked with Spark RDDs and DataFrames to query and fetch Hive data
  • Worked with multiple data formats and Hadoop file formats such as Avro, Parquet, ORC, and JSON
  • Involved in code review and bug fixing for improving the performance
  • Worked in an Agile environment and participated in Scrum meetings
  • Completed big data analytics proof of concepts
  • Completed data ingestion and prepared data for analytics applications consumption

Environment: Hadoop, MapReduce, HDFS, Hive, Spark, Pig, Sqoop, SQL, Python, Oracle, Kafka, Oozie, Tableau
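The Oracle-to-Hive ingestion above is typically a Sqoop invocation of roughly this shape; the JDBC URL, credentials, and table names are placeholders, and the command only runs on a Hadoop cluster with Sqoop installed:

```shell
# Sketch only: connection details and table names are assumptions.
sqoop import \
  --connect jdbc:oracle:thin:@//dbhost:1521/ORCL \
  --username etl_user -P \
  --table CUSTOMER_TXNS \
  --hive-import \
  --hive-table analytics.customer_txns \
  --num-mappers 4
```

`--num-mappers` controls parallelism of the import, and `--hive-import` creates and loads the Hive table in one step.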

Confidential

Python Developer

Responsibilities:

  • Involved in the design, development, testing, deployment, and maintenance of the website
  • Performed data analysis using Python libraries
  • Designed and developed Python code per user requirements
  • Debugged software and improved its performance
  • Helped create data lakes in the Big Data environment as part of new strategic initiatives
  • Automated the creation of Hive tables and data ingestion processes in Python
  • Fixed critical gaps in the system
  • Migrated data from the existing traditional data warehouse to the Hadoop platform

Environment: Python, HTML, CSS, SQL, PLSQL, Oracle and Windows
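The Hive table automation described above can be sketched with plain Python string templating; the helper name, schema, and table names are hypothetical, chosen only to illustrate the approach:

```python
def hive_ddl(table, columns, partition_col=None):
    """Render a CREATE TABLE statement for Hive from a (name, type) list.

    `table`, `columns`, and `partition_col` are illustrative inputs; a real
    pipeline would also handle storage locations, formats, and validation.
    """
    cols = ",\n  ".join(f"{name} {ctype}" for name, ctype in columns)
    ddl = f"CREATE TABLE IF NOT EXISTS {table} (\n  {cols}\n)"
    if partition_col:
        ddl += f"\nPARTITIONED BY ({partition_col} STRING)"
    return ddl + "\nSTORED AS ORC;"

# Generate DDL for a hypothetical orders table, partitioned by date.
print(hive_ddl("sales.orders",
               [("order_id", "BIGINT"), ("amount", "DECIMAL(10,2)")],
               partition_col="order_date"))
```

A script like this can loop over a schema catalog and emit one DDL statement per source table, which is then executed via the Hive CLI or beeline.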
