
Big Data Developer Resume


Lafayette, LA

SUMMARY

  • Over 9 years of IT experience in application design, development, testing, technical support and guidance in Python, Java and cloud tools.
  • Experience working in all stages of Software Development Life Cycle (SDLC) including Requirements, Analysis and Design, implementation, integration and testing, deployment and maintenance.
  • Hands-on experience in using Hadoop technologies such as MapReduce, HDFS, Hive, Spark, Oozie, Pig, Kafka, Flume, NiFi, Impala, Storm and Zookeeper.
  • Hands-on experience in developing a Spark Streaming application in Scala that consumes data from Kafka, writes the data to other Kafka topics and persists the data to HBase.
  • Hands-on experience in developing Scala processors that consume from and publish data to Kafka topics.
  • Experienced with Git as a version control system and used Ansible for application deployment and task automation.
  • Experience in using Rundeck to deploy applications and Kibana to check logs.
  • Hands-on experience in importing/exporting data using Sqoop from HDFS to RDBMS and vice versa.
  • Hands-on experience with Cassandra, HBase and MongoDB NoSQL databases.
  • Good understanding of distributed technologies such as Spark and Hadoop.
  • Experience in working with Elastic MapReduce (EMR) and setting up environments on Amazon AWS EC2 instances.
  • Experience in loading data into Spark RDDs and performing in-memory computation to generate output responses.
  • Hands-on experience in the AWS Cloud platform and its features, including services such as EC2, S3, EBS, VPC, ELB, IAM, Lambda and Auto Scaling.
  • Good knowledge of Route 53, CloudFront, CloudWatch, CloudTrail, CloudFormation and Docker.
  • Good Knowledge of Snowflake, Airflow Scheduler and Netflix Genie.
  • Experience in using the AWS SDKs for Java and Python.
  • Experience in uploading data, hosting static websites, encrypting data, implementing bucket policies and setting up CORS in S3 using the web console, the AWS CLI and the AWS SDK for Python (Boto3); see the sketch at the end of this list.
  • Experienced with Jenkins as a Continuous Integration / Continuous Deployment tool and strong experience with the Ant and Maven build frameworks.
  • Experience in technologies such as Core Java, HTML5, CSS3, AJAX, XHTML, JavaScript, CSS, jQuery and Bootstrap.
  • Strong abilities in Design Patterns, Database Design, Normalization, writing Stored Procedures, Triggers, Views, Functions in MS SQL Server, Oracle and PostgreSQL.
  • Adept at preparing business requirements documents, defining project plans and writing system requirements specifications.
  • Worked on Python classes around the respective APIs so that they could be incorporated into the overall application.
  • Experience with setting up and developing flows in Apache NiFi using processors and process groups.
  • Worked with different distributions of Hadoop such as Hortonworks and Cloudera.
  • Worked with different file formats such as SequenceFile, Avro, ORC, Parquet and CSV using different compression techniques.
  • Experience in Apache Flume for collecting, aggregating and moving large amounts of data from application servers.
  • Expertise in writing custom UDFs to incorporate complex business logic into Hive queries.
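
A minimal sketch of the kind of S3 automation summarized above, using the AWS SDK for Python (Boto3); the bucket name, file paths and CORS rules are placeholders, not taken from any actual project.

```python
import boto3

# Placeholder bucket name and object keys for illustration only.
BUCKET = "example-data-bucket"

s3 = boto3.client("s3")

# Upload a local file to S3.
s3.upload_file("data/events.csv", BUCKET, "raw/events.csv")

# Apply a simple CORS configuration so browser clients can GET objects.
s3.put_bucket_cors(
    Bucket=BUCKET,
    CORSConfiguration={
        "CORSRules": [
            {
                "AllowedMethods": ["GET"],
                "AllowedOrigins": ["*"],
                "AllowedHeaders": ["*"],
                "MaxAgeSeconds": 3600,
            }
        ]
    },
)
```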

TECHNICAL SKILLS

Big Data Technologies: HDFS, MapReduce, Spark, Apache Flume, Kafka, Pig, Hive, Oozie, Apache Solr, YARN, Zookeeper, Sqoop, Impala

Databases: Oracle 10g/11g, MySQL, PostgreSQL, Cassandra, HBase, MongoDB, Spark SQL

Programming Languages: Python, Scala, Java, JavaScript, Spring Boot

Cloud Tools: Amazon EC2, Amazon S3, DynamoDB, Redshift, IAM, CloudFront, Elastic MapReduce, Lambda, Kinesis, Route 53, CloudTrail, AWS Data Pipeline, AWS Database Migration Service

ETL Tools: Informatica

Data Analytics: Python, NumPy and SciPy

Build Tools: Apache Maven, SBT

Platforms: Linux, Mac, Windows

Version Control: GIT and SVN

PROFESSIONAL EXPERIENCE

Confidential, Lafayette LA

Big Data Developer

Responsibilities:

  • Understanding the project documentation, analyzing it and converting it into technical requirements.
  • Requirement analysis and Estimation of project timelines.
  • Participating in Sprint Planning and Releases.
  • Performed efficient delivery of code based on the principles of Test Driven Development (TDD) and continuous integration, in line with Agile software methodology principles and the Scrum process.
  • Developed a bare Spark Streaming application in Scala that consumes messages from a Kafka topic and publishes them to another Kafka topic.
  • Developed Scala processors that consume messages from a Kafka topic, process them and send them to another Kafka topic, as well as persist them into HBase in Avro format (see the Python sketch of this pattern at the end of this list).
  • Prepared the case classes based on the data models that are written to Kafka and HBase.
  • Assisted in writing an HTTP client which calls an external service to fetch more details about the message and publishes them to a Kafka topic.
  • Assisted in writing a Terminology API which queries MongoDB to fetch more details about the message and publishes them to another Kafka topic.
  • Transformed the detailed messages fetched from the external service/API as per the defined data models.
  • Created Hive tables, as well as views on top of those tables, based on the data models stored in HBase.
  • Added metrics and logging to check the application behavior in Grafana dashboards.
  • Updated the Ansible configurations and used Rundeck to deploy the applications to different environments.
  • Staged the application in production by writing the transformed models to a dead topic to check the behavior of the application.
  • Troubleshot the application behavior and changed the Kafka configurations to make the application more stable.
  • Loaded the existing tables into Spark DataFrames/Datasets and created new Hive tables after performing the necessary transformations based on the business requirements.
  • Used Kibana to check the logs of the application.
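
The processors described above were written in Scala; the following is only a minimal Python sketch of the same consume-transform-publish pattern, using the kafka-python library with placeholder broker and topic names (the Avro serialization and HBase persistence are not reproduced).

```python
import json
from kafka import KafkaConsumer, KafkaProducer

# Placeholder broker and topic names for illustration only.
consumer = KafkaConsumer(
    "input-topic",
    bootstrap_servers="localhost:9092",
    value_deserializer=lambda v: json.loads(v.decode("utf-8")),
)
producer = KafkaProducer(
    bootstrap_servers="localhost:9092",
    value_serializer=lambda v: json.dumps(v).encode("utf-8"),
)

for message in consumer:
    record = message.value
    # Example transformation step: enrich or normalize the consumed record.
    record["processed"] = True
    # Publish the transformed record to the downstream topic.
    producer.send("output-topic", record)
```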

Environment: Hadoop, YARN, Spark-Core, Spark-Streaming, Spark-SQL, Scala, Kafka, Hive, Avro, Ibis, Impala, HBase, MongoDB, Ansible, Rundeck, UDeploy, Kibana, Git, Linux, Robo Mongo.

Confidential, Charlotte NC

Big Data Cloud Analytics Developer

Responsibilities:

  • Responsible for building scalable distributed data solutions using Hadoop and Spark.
  • Used Spark Streaming APIs to perform transformations and actions on the fly for building the common learner data model, which gets data from Kafka in near real time and persists it into Cassandra.
  • Configured, deployed and maintained multi-node Dev and Test Kafka clusters.
  • Developed Python scripts and UDFs using both DataFrames/SQL and RDD/MapReduce in Spark for data aggregation, queries and writing data back into the OLTP system through Sqoop (see the sketch at the end of this list).
  • Experienced in performance tuning of Spark applications (Spark Streaming, Spark SQL): setting the right batch interval, the correct level of parallelism and memory tuning.
  • Experience in writing UDFs to parse data as per the requirements in Spark Streaming using the Spark context.
  • Loaded data into Spark RDDs/DataFrames and performed in-memory computation to generate the output responses.
  • Optimized existing algorithms in Hadoop using the Spark context, Spark SQL, DataFrames and pair RDDs.
  • Performed advanced procedures such as text analytics and processing using the in-memory computing capabilities of Spark with Python.
  • Experienced in handling large datasets using partitions, Spark in-memory capabilities, broadcasts in Spark, effective and efficient joins, transformations and other optimizations during the ingestion process itself.
  • Worked on migrating MapReduce programs into Spark transformations using Spark and Python.
  • Designed, developed and maintained data integration programs in a Hadoop and RDBMS environment with both traditional and non-traditional source systems, as well as RDBMS and NoSQL data stores, for data access and analysis.
  • Good experience working with Amazon AWS for setting up the Hadoop cluster.
  • Involved in creating Hive tables, and loading and analyzing data using Hive queries.
  • Developed Hive queries to process the data and generate the data cubes for visualization.
  • Implemented schema extraction for Parquet and Avro file formats in Hive.
  • Good knowledge of Talend Open Studio for designing ETL jobs for processing of data.
  • Implemented Partitioning, Dynamic Partitions, Buckets in HIVE.
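
A minimal PySpark sketch of the DataFrame/UDF style of aggregation described above; the table, column and database names are illustrative, not taken from the actual project.

```python
from pyspark.sql import SparkSession
from pyspark.sql import functions as F
from pyspark.sql.types import StringType

spark = (
    SparkSession.builder
    .appName("learner-aggregation-sketch")
    .enableHiveSupport()
    .getOrCreate()
)

# Illustrative UDF: normalize a course code before aggregating.
normalize = F.udf(lambda c: c.strip().upper() if c else None, StringType())

# Load an existing Hive table, apply the UDF, aggregate, and write back.
events = spark.table("learner_db.raw_events")
summary = (
    events
    .withColumn("course", normalize(F.col("course_code")))
    .groupBy("course")
    .agg(F.count("*").alias("event_count"))
)
summary.write.mode("overwrite").saveAsTable("learner_db.course_summary")
```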

Environment: Hadoop, YARN, Spark-Core, Spark-Streaming, Spark-SQL, Python, Kafka, Hive, Avro, Sqoop, Amazon AWS, Impala, Cassandra, Linux.

Confidential, AL/DE

Hadoop / AWS Developer

Responsibilities:

  • Involved in gathering requirements from different teams to design the ETL migration process from RDBMS to the Hadoop cluster.
  • Used AWS EMR to process the data.
  • Extensively used Sqoop to import data from RDBMS and export it back.
  • Used compression techniques (Snappy, Gzip) with file formats such as Parquet, Avro and SequenceFile to leverage the storage in HDFS.
  • Developed custom UDFs for cleansing and transforming data.
  • Implemented Hive dynamic partitions and buckets depending on the downstream business requirements; analyzed the existing Hive queries and implemented advanced queries using functions like RANK to optimize their performance (see the sketch at the end of this list).
  • Used Apache NiFi flows to convert raw XML data into ORC and Parquet files.
  • Performed data integrity checks.
  • Worked with Pig as an ETL tool to do transformations, joins and some pre-aggregations before storing the data onto HDFS.
  • Used the Oozie workflow engine to manage interdependent Hadoop jobs.
  • Implemented Oozie coordinators to schedule the workflows, leveraging both data and time dependent properties.
  • Used Subversion (SVN) for version control and code management.
  • Wrote shell scripts to automate end-to-end jobs and dealt with tools like AutoSys to automate jobs.
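
The dynamic-partition work above was done directly in HiveQL; the sketch below wraps an illustrative dynamic-partition insert in Python via PyHive, which is an assumed client (it is not listed in the environment), with placeholder host, table and column names.

```python
from pyhive import hive

# Assumed HiveServer2 endpoint; table and column names are illustrative.
conn = hive.connect(host="hive-server", port=10000, username="etl_user")
cursor = conn.cursor()

# Enable non-strict dynamic partitioning for the session.
cursor.execute("SET hive.exec.dynamic.partition=true")
cursor.execute("SET hive.exec.dynamic.partition.mode=nonstrict")

# Insert into a partitioned table, letting Hive derive the partition
# values from the last column of the SELECT (load_date).
cursor.execute("""
    INSERT INTO TABLE sales_orc PARTITION (load_date)
    SELECT order_id, customer_id, amount, load_date
    FROM sales_staging
""")
```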

Environment: Cloudera Hadoop, HDFS, Hive, Pig, MapReduce, Sqoop, Zookeeper, AWS EMR, AutoSys, Oozie, Avro, SVN, Shell Scripting, Linux

Confidential, DE

Hadoop / AWS Developer

Responsibilities:

  • Involved in developing Web services in Service Oriented Architecture (SOA).
  • Configured AWS Security Groups, which act as virtual firewalls controlling the traffic for one or more AWS EC2 instances.
  • Configured launch configurations, Auto Scaling groups, target groups and Classic/Application load balancers.
  • Configured AWS Identity and Access Management (IAM) to securely manage AWS users and groups, and used policies and roles to allow or deny access to AWS resources.
  • Experience in AWS CloudFront, including creating and managing distributions to provide access to S3 buckets or HTTP servers running on EC2 instances.
  • Designed and created CloudFormation templates to create stacks.
  • Configured custom metrics for AWS CloudWatch for detailed monitoring (see the sketch at the end of this list).
  • Implemented continuous integration using Jenkins. Configured security to Jenkins and added multiple nodes for continuous deployments.
  • Used shell scripting for loading data from edge node to HDFS.
  • Configured MySQL Database to store Hive metadata.
  • Very good understanding of partitioning and bucketing concepts in Hive; designed both managed and external tables in Hive for optimized performance.
  • Solved performance issues in Hive and Pig scripts with an understanding of joins, grouping and aggregation and how they translate to MapReduce jobs.
  • Worked on tuning Hive and Pig scripts to improve performance.
  • Good experience in writing MapReduce programs in Java on the MRv2 / YARN environment.
  • Good experience in troubleshooting performance issues and tuning Hadoop clusters.
  • Imported and exported data into HDFS and Hive using Sqoop.
  • Involved in the start-to-end process of Hadoop jobs that used various technologies such as Sqoop, Pig, Hive, MapReduce and shell scripts (for scheduling of a few jobs).
  • Used Tableau 9.0 to create reports representing analysis in graphical format.
  • Experience in managing and reviewing Hadoop log files.
  • Created MapReduce jobs using Pig Latin and Hive Queries.
  • Used Git as version control for scripts and configurations.
  • Used Amazon Route 53 to manage public and private hosted zones.
  • Created SNS topics and managed subscriptions.
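
A minimal Boto3 sketch of publishing the kind of custom CloudWatch metric mentioned above; the namespace, metric name and dimension values are placeholders.

```python
import boto3

cloudwatch = boto3.client("cloudwatch")

# Publish one data point for an assumed application-level metric.
cloudwatch.put_metric_data(
    Namespace="Custom/IngestionJobs",
    MetricData=[
        {
            "MetricName": "RecordsProcessed",
            "Dimensions": [{"Name": "Environment", "Value": "dev"}],
            "Value": 12500,
            "Unit": "Count",
        }
    ],
)
```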

Environment: Hortonworks Data Platform 2.2, AWS EC2, S3, IAM, VPC, Python 2.7, Boto3, Jenkins, AWS CloudWatch, Git, Route 53, Linux, Hadoop, Sqoop, Flume, Oozie, MapReduce, HDFS, Pig, Hive, HBase, MySQL, Ubuntu

Confidential, DE

Python Developer

Responsibilities:

  • Involved in the analysis, design and development of the project life cycle.
  • Designed and created web pages using HTML, CSS, Bootstrap, JavaScript, jQuery, Ajax and JSON.
  • Developed RESTful APIs using Flask.
  • Developed web pages to connect and interact with the database using Django and SQLAlchemy.
  • Involved in database design based on requirements.
  • Involved in writing and modifying stored procedures, views, and tables in SQL database.
  • Used Python to write views, models, templates and database queries; Django's MVC pattern takes care of the interaction between model and view, leaving the template/HTML file (see the sketch at the end of this list).
  • Continuously maintained and troubleshot the Python Django modules.
  • Used Visio for data modeling and database design.
  • Resolved ongoing problems and accurately documented progress of the complete project.
  • Provided support for existing applications.
  • Developed release notes for deployment; the support team uses the document for deployment.
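
A minimal sketch of the Django model/view split described above; the model, fields and template path are illustrative only, not taken from the actual application.

```python
# models.py -- a minimal illustrative model.
from django.db import models

class Order(models.Model):
    customer = models.CharField(max_length=100)
    total = models.DecimalField(max_digits=10, decimal_places=2)
    created = models.DateTimeField(auto_now_add=True)


# views.py -- the view queries the model and hands the result to a template.
from django.shortcuts import render

def order_list(request):
    orders = Order.objects.order_by("-created")
    return render(request, "orders/order_list.html", {"orders": orders})
```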

Environment: Python 2.7, Flask, Django, HTML, CSS, JavaScript, Linux, jQuery, SQL Server, Git, Shell Scripting

Confidential, DE

Python Developer

Responsibilities:

  • Participated in requirement gathering and worked closely with the architect in designing and modeling.
  • Involved in the design, development and testing phases of the application using Agile methodology.
  • Designed and maintained databases using Python and developed a Python-based RESTful web service API using Flask, SQLAlchemy and PostgreSQL (see the sketch at the end of this list).
  • Designed and developed components using Python with the Django framework; implemented code in Python to retrieve and manipulate data.
  • Developed consumer-based features and applications using Python, Django, HTML, Behavior Driven Development (BDD) and pair programming.
  • Worked closely with back-end developers to find ways to push the limits of existing web technology.
  • Designed and developed the UI for the website with HTML, XHTML, CSS, JavaScript and AJAX.
  • Involved in writing SQL queries and implementing cursors, object types, sequences, indexes, stored procedures, functions, packages and triggers in SQL Server.
  • Designed dynamic client-side JavaScript code to build web forms and performed simulations for web application pages.
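
A minimal sketch of a Flask REST endpoint backed by SQLAlchemy of the sort described above, using the Flask-SQLAlchemy extension (an assumption; plain SQLAlchemy could be used instead) and a placeholder PostgreSQL connection string.

```python
from flask import Flask, jsonify
from flask_sqlalchemy import SQLAlchemy

app = Flask(__name__)
# Placeholder connection string; not taken from any real environment.
app.config["SQLALCHEMY_DATABASE_URI"] = "postgresql://user:pass@localhost/appdb"
db = SQLAlchemy(app)

class Customer(db.Model):
    id = db.Column(db.Integer, primary_key=True)
    name = db.Column(db.String(100), nullable=False)

@app.route("/api/customers", methods=["GET"])
def list_customers():
    # Return all customers as a JSON list.
    customers = Customer.query.all()
    return jsonify([{"id": c.id, "name": c.name} for c in customers])

if __name__ == "__main__":
    app.run(debug=True)
```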

Environment: Python 2.7, Django 1.4, web2py, Flask, Struts, JavaScript, AJAX, XML, SQL Server, HTML, XHTML, CSS, Git

Confidential, DE

Python Developer

Responsibilities:

  • Responsible for gathering requirements, system analysis, design, development, testing and deployment.
  • Participated in the complete SDLC process.
  • Developed business logic using Python 2.7.
  • Used the Django framework for database layer development.
  • Developed the user interface (GUI) using CSS, HTML, JavaScript and jQuery.
  • Responsible for setting up the Python REST API framework using Django.
  • Created database using MySQL, wrote several queries to extract data from database.
  • Wrote scripts in Python for automation of testing jobs.
  • Deployment and Build of various environments including Linux and UNIX.
  • Used Jira as the project management tool for issue tracking and bug tracking.
  • Effectively communicated with the external vendors to resolve queries.
  • Implemented monitoring and established best practices around using Elasticsearch.

Environment: Python 2.7, Django 1.4, C++, Java, Jenkins, JSON, XML, SOAPUI, HTML, Restful API, Shell Scripting, SQL, MySQL, GIT, Linux.

Confidential, DE

Software Programmer

Responsibilities:

  • Involved in the analysis, design and development of the project life cycle.
  • Designed and developed Web pages (presentation layer) using Java/JSP.
  • Coded client-side validations heavily using JavaScript.
  • Involved in writing and modifying stored procedures, views, and tables in SQL Server database.
  • Designed and developed various levels of security measures, such as data access and login privileges, according to the levels of users' logins.
  • Developed an API to write XML documents from a database. Utilized XML and XSL Transformation for dynamic web-content and database connectivity.
  • Involved in business meetings to gather additional requirements for the application.
  • Created numerous reports using SSRS.
  • Involved in configuring and deploying the application using WebSphere.
  • Involved in code reviews and mentored the team in resolving issues.
  • Undertook the integration and testing of the various parts of the application.
  • Used Subversion for version control and log4j for logging errors.
  • Prepared code walkthroughs, test cases and test plans.

Environment: Java, JSP, HTML, JavaScript, SQL Server, SSRS, Web Services.
