Sr. BigData Developer Resume
San Antonio, TX
SUMMARY
- More than 7 years of total experience in the IT industry.
- Responsible for complete SDLC management using different methodologies such as Agile, Incremental, and Waterfall.
- Underwent extensive training on Big Data spanning 45 days.
- Covered modules such as the Big Data Technology Landscape and worked on business use cases (PoCs) using technologies like Pig, Hive, Spark, Oozie, MapReduce, and Flume.
- Worked hands-on with ETL processes.
- Tech-savvy Big Data developer.
- Extensive experience in setting up Hadoop clusters.
- Experience in working with different data sources like Flat files, XML files and Databases.
- Hands on experience in migrating data from RDBMS to Hadoop ecosystem.
- Worked on writing various Linux scripts to stream data from multiple data sources like Oracle and Teradata.
- Extracted data from Teradata into Confidential using Sqoop.
- Hands-on experience working with ecosystem components like Hive, Pig, Sqoop, MapReduce, and Oozie.
- Trained in, and have hands-on work experience with, the DataStax distribution of Confidential.
- Load and transform large sets of structured, semi-structured, and unstructured data.
- Responsible for managing data coming from different sources.
- Part of the Project Delivery team for a reputed client, interacting directly with the customer.
- Work experience supporting Kafka, Storm, HBase, Sqoop, and other Hadoop components in both administrative and development capacities for Big Data applications.
- Good knowledge of Hadoop architecture and ecosystem components.
- Data scrubbing and processing with Oozie.
- Good interpersonal skills; fast learner, team player, self-directed and self-motivated.
TECHNICAL SKILLS
Big Data: Spark, Flume, Hadoop, Pig, Kafka, Oozie, Hive, Sqoop, MapReduce, Confidential BigInsight, Cloudera Distribution, Confidential Distribution
NoSQL: Confidential, DynamoDB, HBase, Hive (warehouse)
IDE: Eclipse, Visual Studio 2008, Visual Studio 2010
Programming Languages: Scala, Python, MapReduce, Pig Latin, HiveQL, Java, Shell Scripting, Maven
Databases: SQL Server 2005, SQL Server 2008, Teradata, SAS
Operating Systems: macOS, CentOS 6.5, Windows
Cloud Platforms: AWS EMR, AWS EC2, Azure HDInsight
Development Platform: GitHub
Containers: Docker, Linux Containers, Anaconda
Notebooks: Jupyter, Zeppelin
CI/CD: Jenkins, Travis CI, Confidential UCD
PROFESSIONAL EXPERIENCE
Confidential, San Antonio, TX
Sr. BigData Developer
Responsibilities:
- Technical requirement analysis for setting up the Confidential Data Platform environment.
- Overall requirement analysis.
- Low-Level Design and High-Level Design preparation.
- Setting up edge node configuration for the user.
- Installation and Configuration of Apache Kafka.
- Writing Python code for Kafka producers and consumers to get the data into Hive tables, including Spark Streaming code to write data into Confidential (an illustrative sketch follows this list).
- Configuration changes to the Hive environment to meet project requirements.
- Designed and developed the Spark code for transforming the data stored in Hive.
- Writing Spark code to calculate the time and status for a given claim.
- Spark code to write the data back to Hive tables so that it can be read by Tableau.
- Developing Spark code according to the requirement for a given insight and providing it to the Data Science team.
- Writing Hive scripts to perform the high-level data transformation and analysis.
- Used Hive UDFs and Hive collections to perform the operations.
- PySpark code to read data from Hive and calculate the IQR by converting the Spark DataFrame to a pandas DataFrame (second sketch below).
- Writing SAS scripts for data movement from SAS to Confidential.
- Setting up Docker images and containers for PySpark and Python.
- Testing Docker images with the complete application deployment.
- Feature Engineering and data structuring.
- Involved in the analysis and selection of the right machine learning algorithm for our model.
- Model prediction data analysis and efficiency calculation.
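A minimal sketch of the Kafka producer/consumer piece, using the kafka-python client; the broker address, topic name, and record fields are hypothetical stand-ins, not the project's actual values:

```python
# Hypothetical sketch: publish claim events to Kafka and read them back.
from kafka import KafkaProducer, KafkaConsumer
import json

producer = KafkaProducer(
    bootstrap_servers="broker:9092",                        # assumed broker
    value_serializer=lambda v: json.dumps(v).encode("utf-8"),
)
producer.send("claims", {"claim_id": 1, "status": "OPEN"})  # hypothetical topic
producer.flush()

consumer = KafkaConsumer(
    "claims",
    bootstrap_servers="broker:9092",
    auto_offset_reset="earliest",
    value_deserializer=lambda v: json.loads(v.decode("utf-8")),
)
for message in consumer:
    # Downstream, each record would be landed in a Hive staging table.
    print(message.value)
    break
```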
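And a minimal sketch of the IQR calculation, assuming the metrics already live in a Hive table; the table and column names are hypothetical:

```python
# Hypothetical sketch: read a Hive table into pandas and compute the IQR.
from pyspark.sql import SparkSession

spark = SparkSession.builder.enableHiveSupport().getOrCreate()
pdf = spark.table("claims_metrics").toPandas()   # hypothetical Hive table

q1 = pdf["processing_time"].quantile(0.25)       # hypothetical column
q3 = pdf["processing_time"].quantile(0.75)
iqr = q3 - q1                                    # interquartile range for outlier bounds
print(iqr)
```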
Technologies Used: Confidential Data Platform (HDP), Jupyter Notebook, Hive, Apache Kafka, Ambari, Spark 2.2, Python, PySpark, Docker, Linux Shell Scripting, SAS, GitHub, Travis CI.
Confidential, Dallas, TX
Sr. BigData Developer
Responsibilities:
- Business and scope analysis for developing and managing the data lake.
- Overall requirement analysis.
- Low-Level Design and High-Level Design preparation.
- Designed and developed the code for modeling jobs per client requirements in Agile mode, and standardized it using coding standards and code-versioning techniques.
- Writing Spark code to read CSV files from the S3 bucket (an illustrative sketch follows this list).
- Spark code to read metadata from DynamoDB.
- Performing data analysis on the DataFrames and calculating the metrics.
- Performing validation for each column against predefined rules.
- Updating the DynamoDB table with the report.
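A minimal PySpark sketch of the S3-read-and-validate flow (the project itself was written in Scala); the bucket, table, and column names here are hypothetical:

```python
# Hypothetical sketch: read a CSV from S3, apply one validation rule,
# and record the result in a DynamoDB report table via boto3.
import boto3
from pyspark.sql import SparkSession
from pyspark.sql.functions import col

spark = SparkSession.builder.appName("s3-validation").getOrCreate()
df = (spark.read
      .option("header", "true")
      .csv("s3://example-bucket/input/data.csv"))           # hypothetical bucket

# Example rule: count rows whose 'amount' column is null.
null_count = df.filter(col("amount").isNull()).count()

table = boto3.resource("dynamodb").Table("validation_report")  # hypothetical table
table.put_item(Item={"file": "data.csv", "null_amounts": null_count})
```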
Technologies Used: Amazon EMR, S3, AWS Lambda, Spark 2.0, DynamoDB, Scala, SFTP, Teradata, Linux Shell Scripting.
Confidential, Bethesda, MD
Sr. BigData Developer
Responsibilities:
- Overall analysis of requirements.
- Design of the entire system.
- Configuration and setup of Confluent Kafka.
- Writing scripts to write data from Kafka topics to the Confidential consumer.
- Automating the Kafka scripts to consume the data.
- Writing Spark code to read XML data as an RDD and transform it into a DataFrame (an illustrative sketch follows this list).
- Writing Spark code to transform the data into the required format for ML model input.
- The data has to be trimmed, filtered, and mathematically transformed into an insight for the model.
- Oozie workflow creation for the Spark action.
- Filtering and auditing XML DataFrames and logging them to Confidential as flat files.
- Storing DataFrames as Hive tables.
- Creating the Hive structure.
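A minimal PySpark sketch of the XML-to-DataFrame step (the project code was Scala), assuming one XML record per line and parsing with the standard library; element, path, and table names are hypothetical:

```python
# Hypothetical sketch: parse XML records from HDFS into Rows, build a
# DataFrame, and persist it as a Hive table.
import xml.etree.ElementTree as ET
from pyspark.sql import Row, SparkSession

spark = SparkSession.builder.enableHiveSupport().getOrCreate()

def parse(xml_string):
    root = ET.fromstring(xml_string)
    return Row(claim_id=root.findtext("id"),     # hypothetical elements
               status=root.findtext("status"))

rdd = (spark.sparkContext
       .textFile("/data/claims/*.xml")           # hypothetical path
       .map(parse))
spark.createDataFrame(rdd).write.mode("overwrite").saveAsTable("claims_xml")
```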
Technologies Used: Confidential distribution of Hadoop (Confidential BigInsight 4.1), Spark, Kafka, Hive, Oozie, Git, Scala, SFTP, Teradata, Linux Shell Scripting, Travis CI.
Confidential
Sr. BigData Developer
Responsibilities:
- A tech-savvy approach to the Confidential BigInsight distribution and environment.
- Requirement gathering and analysis.
- Low-level design and data structuring.
- Setting up the Apache Confidential cluster.
- Confidential data modelling.
- MapReduce programs for data transformation and validation (an illustrative sketch follows this list).
- Data transformation by writing Pig scripts.
- Pig scripts for data validation.
- Oozie workflow creation.
- Writing MapReduce programs for Confidential connectivity.
- Pig connectivity.
- Confidential performance tuning.
- Confidential security enablement (SSL and data encryption at rest).
- Confidential OpsCenter security and administration.
- Took the initiative to learn the new technologies required for the development and provided knowledge transfer to fellow team members to stay current.
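A Hadoop Streaming-style sketch of the validation pass in Python (the project's MapReduce jobs were written natively in Java); the record layout and rule are hypothetical:

```python
# Hypothetical streaming mapper: tag each input record as valid/invalid
# so a downstream reducer can count and route them.
import sys

for line in sys.stdin:
    fields = line.rstrip("\n").split(",")
    # Example rule: a valid record has exactly 5 fields and a non-empty id.
    valid = len(fields) == 5 and fields[0] != ""
    print("{}\t{}".format("valid" if valid else "invalid", line.rstrip("\n")))
```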
Technologies Used: Confidential distribution of Hadoop (Confidential BigInsight 4.1), DataStax distribution of Confidential, MapReduce, Pig, Oozie, Git, Scala, SFTP, Linux Shell Scripting, CentOS 6.0.
Confidential
BigData Developer
Responsibilities:
- Business and scope analysis for developing and managing the data lake.
- Overall requirement analysis.
- Setting up a 6-node HDP cluster.
- Transferring data using Sqoop.
- Writing Pig scripts to transform the data.
- Refining data using Pig.
- Creating the Hive structure (an illustrative sketch follows this list).
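A minimal sketch of the Hive table creation through Spark SQL (the original work likely used the Hive CLI directly); the table and column names are hypothetical:

```python
# Hypothetical sketch: define a Hive table to hold the Pig-refined data.
from pyspark.sql import SparkSession

spark = SparkSession.builder.enableHiveSupport().getOrCreate()
spark.sql("""
    CREATE TABLE IF NOT EXISTS sales_refined (
        order_id STRING,
        amount   DOUBLE,
        order_dt DATE
    )
    STORED AS ORC
""")
```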
Technologies Used: Confidential Hadoop, Tableau, Sqoop, Hive, Pig, Linux Shell Scripting.
Confidential
BigData Developer
Responsibilities:
- Building a generic framework to convert unstructured logs into a structured format.
- Analyzing different Confidential logs, consisting of IIS logs, JBoss logs, and web application logs.
- Converting these logs to a structured format using an MR/Spark program (an illustrative sketch follows this list).
- The output of the framework is processed with Pig to further structure the logs and save them into Hive tables.
- Visualization is done using Qlik Sense on top of the Hive tables.
- Developing the framework by writing MapReduce programs in Java, and also porting the framework to Spark by writing the code in Scala for faster, real-time processing.
- Responsible for processing the data with Pig by writing Pig scripts.
- Storing the data into Hive tables.
- Administration of the HDP cluster.
- Configuration and setup of the Confidential Hadoop cluster.
- Automating the jobs using the Oozie scheduler.
- Analyze the data to produce useful insights.
- Verified the functionality of the application as per the requirements.
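A minimal PySpark sketch of the log-structuring step (the production framework was Java MapReduce, later ported to Scala Spark); the regex, paths, and field names are hypothetical:

```python
# Hypothetical sketch: parse raw web-server log lines into structured rows
# and persist them as a Hive table for downstream Pig processing.
import re
from pyspark.sql import Row, SparkSession

spark = SparkSession.builder.enableHiveSupport().getOrCreate()

LOG_RE = re.compile(r'^(\S+) \S+ \S+ \[([^\]]+)\] "(\S+) (\S+)[^"]*" (\d{3})')

def structure(line):
    m = LOG_RE.match(line)
    if m is None:
        return None                              # unparseable line: drop it
    return Row(ip=m.group(1), ts=m.group(2), method=m.group(3),
               path=m.group(4), status=int(m.group(5)))

rows = (spark.sparkContext
        .textFile("/logs/webapp/*.log")          # hypothetical path
        .map(structure)
        .filter(lambda r: r is not None))
spark.createDataFrame(rows).write.mode("append").saveAsTable("web_logs")
```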
Technologies Used: Spark, Scala, HDP Hadoop, Hive, Impala, Qlik Sense, Oozie, Linux Shell Scripting.
Confidential
BigData Developer
Responsibilities:
- An engine that takes in clickstream data and recommends the most relevant product to the customer.
- The recommendation is based on the product the customer is currently viewing.
- For each item, customers get a list of other products they will most likely be interested in buying (an illustrative sketch follows this list).
- Developing the engine.
- Configuration and setup of a 6-node HDP cluster.
- Installation and administration of the HDP cluster.
- Architecture design.
- Writing MapReduce code for data analysis.
- Modelling and creating Hive tables for structured data storage.
- Creating various D3.js charts for data visualization.
- Writing Sqoop scripts for data transfer from RDBMS to Confidential.
- Benchmarking tools (Pig, Hive, R).
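A minimal Python sketch of the item-to-item co-occurrence idea behind the recommendations (the production engine used Java MapReduce and RHadoop): products viewed in the same session are counted as pairs, and the most co-viewed items become the recommendations. The session data below is fabricated for illustration:

```python
# Hypothetical sketch: count co-viewed product pairs per session and
# recommend the items most often seen alongside the current product.
from collections import Counter
from itertools import combinations

sessions = [                      # toy clickstream sessions
    ["shoes", "socks", "laces"],
    ["shoes", "laces"],
    ["socks", "shirt"],
]

pair_counts = Counter()
for viewed in sessions:
    for a, b in combinations(sorted(set(viewed)), 2):
        pair_counts[(a, b)] += 1
        pair_counts[(b, a)] += 1

def recommend(product, top_n=3):
    scored = [(other, n) for (p, other), n in pair_counts.items() if p == product]
    return [other for other, _ in sorted(scored, key=lambda x: -x[1])[:top_n]]

print(recommend("shoes"))         # ['laces', 'socks']
```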
Technologies Used: Pig, Sqoop, Hive, HDP Platform, R, RHadoop, D3.js.
Confidential
Developer
Responsibilities:
- P & Confidential is an American multinational consumer goods Confidential headquartered in downtown Cincinnati, Ohio.
- P & Confidential receives online product orders and reviews.
- The team is responsible for maintaining and servicing the application and providing a seamless experience to the end user, thereby increasing the revenue of P & Confidential.
Confidential
Responsibilities:
- Worked in a team to carry out change requests and bug fixes.
- Responsibilities included the functionality and content of the site.
- Capture the issue reported by the business team.
- Find the root cause of the issue.
- Find a solution or fix.
- Implement the fix and test it in the local environment.
- Pass it on to the testing team for further tests.
- Once approved, merge the fix into the production environment.
Technologies Used: Java, JavaScript, Web services.
Developer
Confidential
Responsibilities:
- Confidential Solutions is a consulting and Confidential company based in Hubli, Karnataka, India.
- Confidential Solutions had a requirement to connect every employee with one another for the purpose of sharing status updates.
- The requirement was to build something like an organizational Facebook portal for employees, connecting employees more easily while consuming less of their time.
- Requirement gathering and analysis.
- Low-level design and data structuring.
- Design and code development.
- Getting client feedback and applying it.
- Implementing fixes and testing them in the local environment.
- Created the online portal, which was successfully deployed on the customer's LAN.
Technologies Used: Java, JavaScript, Web services, MySQL, Eclipse, PHP.