Big Data Developer Resume
SUMMARY:
- To take up challenging assignments in which my knowledge, skills, and experience are leveraged to the fullest extent, contributing to the growth and development of the organization through creative and innovative ideas.
- Over 6.8 years of IT experience, including around 5 years in Big Data and the Hadoop ecosystem.
- Expertise in end-to-end project planning and implementation, from scope management across environments such as release-based maintenance, custom application development, enterprise-wide application deployment, testing support, and quality management, in adherence to international guidelines and norms.
- Extensive experience with major components of the Hadoop ecosystem, including MapReduce, HDFS, Hive, Pig, HBase, ZooKeeper, Oozie, and Flume.
- Expertise in setting up processes for Hadoop-based application design and implementation.
- Experience in importing and exporting data with Sqoop between HDFS and relational database systems.
- Experience in managing and reviewing Hadoop log files.
- Experienced in processing Big Data on the Apache Hadoop framework using MapReduce programs.
- Experience working on Windows and UNIX/Linux platforms with technologies such as Big Data, SQL, XML, HTML, Core Java, shell scripting, and Cornerstone DB.
- Experience in Spark using Python (PySpark).
- Experience with Cornerstone DB and the Centralized Data Management Framework.
- Cornerstone data ingestion of SOR feeds, Cloak Registration, and the Cornerstone feed release process.
- Proficient knowledge of the logistics and banking domains.
- Experience interacting with customers and working at client locations for real-time field testing of products and services.
- Ability to work effectively with associates at all levels within the organization.
- Strong background in mathematics, with very good analytical and problem-solving skills.
- Hands-on experience with the Hadoop ecosystem for ingestion, storage, querying, processing, and analysis of big data.
- Knowledge of agile methodologies for delivering software solutions.
- Ability to meet deadlines and handle multiple tasks; decisive, with strong leadership qualities, flexibility in work schedules, and good communication skills.
- Good interpersonal skills; committed, result-oriented, and hard-working, with a zeal to learn new technologies.
TECHNICAL SKILLS:
Hadoop/Big Data: HDFS, MapReduce, Sqoop, Hive, Pig, HBase, ZooKeeper, Spark (Python), cluster configuration, Flume, AWS
Distributions: Hortonworks, Cloudera and MapR
Java Technologies: Core Java.
Databases: SQL, NoSQL (HBase), MySQL, and Cornerstone Database
Programming Languages: Java, SQL, Shell, Python
IDEs/Utilities: Eclipse
Protocols: TCP/IP, SSH, HTTP & HTTPS
Scripting: HTML, JavaScript, CSS, XML.
Operating Systems: Windows, Linux, and UNIX.
Version control: Git, SVN, CVS.
Tools: FileZilla, Putty.
PROFESSIONAL EXPERIENCE:
Confidential
Big Data Developer
Responsibilities
- Developed data ingestion scripts to ingest data from Oracle and Teradata into HDFS/Hive using PySpark (a minimal sketch of this pattern follows this list).
- Wrote Python scripts using Spark SQL across the various data stages to analyze the data and keep it well formed.
- Used multiple Spark SQL functions to clean the data and store it in an optimized format.
- Worked with the client and business management to gather requirements and understand the functional aspects of the application.
- Responsible for writing Hive queries to analyze data in the Hive warehouse using Spark SQL.
- Responsible for writing data validation queries that compare two different data sets in the Hive warehouse using Spark SQL.
- Developed test cases and test scripts based on documented sources.
- Involved in gathering the business scope and technical requirements, and created technical specifications.
- Coordinated with other team members to ensure that all work products integrated into a complete solution, and took a supporting role with other team members to resolve issues or complete tasks sooner.
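A minimal PySpark sketch of the ingestion pattern described above, assuming a Spark build with Hive support and the Oracle JDBC driver on the classpath; the hostnames, credentials, and table names are hypothetical placeholders:

    # JDBC ingestion from Oracle into Hive with PySpark.
    # Connection details and table names are hypothetical.
    from pyspark.sql import SparkSession, functions as F

    spark = (SparkSession.builder
             .appName("ingest_oracle_to_hive")
             .enableHiveSupport()
             .getOrCreate())

    # Read the source table from Oracle over JDBC.
    src = (spark.read.format("jdbc")
           .option("url", "jdbc:oracle:thin:@//db-host:1521/ORCLPDB")
           .option("dbtable", "SALES.ORDERS")
           .option("user", "etl_user")
           .option("password", "***")
           .load())

    # Light cleanup with Spark SQL functions, then write to a Hive table.
    cleaned = src.dropDuplicates().withColumn("order_date", F.to_date("ORDER_DATE"))
    cleaned.write.mode("overwrite").format("parquet").saveAsTable("staging.orders")

The same pattern covers Teradata by swapping in the Teradata JDBC URL and driver.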
Technologies/Tools Used: HDFS, Hadoop MR, Hive/HQL, Pig, Spark, Python, UNIX shell scripting, Eclipse, PuTTY.
Confidential
Big Data Developer
Responsibilities
- Worked with roughly 3 TB of data on a cluster of around 300 nodes.
- Good understanding of and hands-on experience with Hadoop stack internals, Hive, Pig, and MapReduce.
- Involved in loading data from the UNIX file system to HDFS.
- Fetched Cornerstone data through Hive and used it as source data for the application in different business scenarios.
- Developed Hive queries for analysts; all processed data was imported into the Hive warehouse, which enabled business analysts and operations groups to write their own Hive queries.
- Involved in creating Hive tables, loading them with data, and writing Hive queries that run MapReduce jobs in the backend.
- Implemented partitioning, dynamic partitions, and bucketing in Hive (see the sketch after this list).
- Developed Pig Latin scripts to extract data from web server output files and load it into HDFS.
- Loaded data from HDFS into Kognitio, which connects directly to Tableau for visualization.
- Handled data coming from different sources and was involved in HDFS maintenance and in loading structured and unstructured data.
- Worked with the client and business management to gather requirements and understand the functional aspects of the application.
- Understood and contributed to the project's technical design as well as the requirement specifications.
- Coordinated with other team members to ensure that all work products integrated into a complete solution, and took a supporting role with other team members to resolve issues or complete tasks sooner.
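An illustrative PySpark equivalent of the Hive partitioning and bucketing work described above (in Hive itself this would be PARTITIONED BY / CLUSTERED BY DDL with dynamic-partition inserts); the table and column names are hypothetical:

    # Hive-style partitioning and bucketing from PySpark.
    # Table and column names are hypothetical.
    from pyspark.sql import SparkSession

    spark = SparkSession.builder.enableHiveSupport().getOrCreate()
    events = spark.table("staging.events_raw")

    # Partition by load date and bucket by customer id; Spark's bucketBy
    # mirrors Hive's CLUSTERED BY ... INTO n BUCKETS.
    (events.write
           .mode("overwrite")
           .partitionBy("load_date")
           .bucketBy(32, "customer_id")
           .sortBy("customer_id")
           .saveAsTable("warehouse.events"))

Partitioning prunes whole directories at query time, while bucketing pre-hashes rows so joins and sampling on the bucket column avoid a full shuffle.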
Technologies/Tools Used: HDFS, Hadoop MR, Hive/HQL, Pig, Kognitio, Sqoop, UNIX shell scripting, Eclipse, PuTTY, Tableau, Cornerstone DB.
Confidential
Big Data Developer
Responsibilities
- Created views on top of Hive tables over data residing in the data lake.
- Worked on sequence files, RC files, map-side joins, bucketing, and partitioning for Hive performance and storage improvements (a sketch follows this list).
- Involved in configuring batch jobs to ingest source files into the data lake.
- Extensive experience creating Hive tables, loading them with data, and writing Hive queries that run internally as MapReduce jobs.
- Pre-processed large data sets in parallel across the Hadoop cluster.
- Worked across analysis, design, coding, and testing.
- Created the data sets for various test scenarios.
- Wrote Hive and Pig scripts to load data from the various data stages.
- Developed SQL statements to improve back-end communications.
- Created shell scripts to execute the other scripts.
- Performed performance evaluation and reporting at each step of the entire workflow.
- Worked with the client and business management to gather requirements and understand the functional aspects of the application.
- Handled data coming from different sources and was involved in HDFS maintenance and in loading structured and unstructured data.
- Built knowledge among team members.
- Understood and contributed to the project's technical design as well as the requirement specifications.
- Developed test cases and test scripts based on documented sources.
- Assisted with technical specifications and other deliverable documents.
- Coordinated with other team members to ensure that all work products integrated into a complete solution, and took a supporting role with other team members to resolve issues or complete tasks sooner.
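For illustration, a sketch of two of the performance techniques above, columnar storage and a map-side (broadcast) join, driven from PySpark against Hive tables; the table names are hypothetical:

    # Columnar storage plus a map-side join for Hive, driven from PySpark.
    # Table names are hypothetical.
    from pyspark.sql import SparkSession

    spark = SparkSession.builder.enableHiveSupport().getOrCreate()

    # Keep the large fact table in a columnar format (ORC) for faster scans.
    spark.sql("""
        CREATE TABLE IF NOT EXISTS warehouse.orders_orc
        STORED AS ORC
        AS SELECT * FROM staging.orders
    """)

    # Map-side join: the MAPJOIN hint broadcasts the small dimension table
    # to every task instead of shuffling the large fact table.
    result = spark.sql("""
        SELECT /*+ MAPJOIN(c) */ o.order_id, c.customer_name
        FROM warehouse.orders_orc o
        JOIN warehouse.customers c ON o.customer_id = c.customer_id
    """)
    result.show()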
Technologies/Tools Used: HDFS, Hadoop, MapReduce, Hive/HQL, Pig, Sqoop, MySQL, UNIX shell scripting, Eclipse, PuTTY, Tableau, Cornerstone DB.
Confidential
Big Data Developer
Responsibilities
- Contributed from the analysis phase through testing.
- Created the data sets for various scenarios.
- Responsible for writing MapReduce programs to read and write data in various scenarios.
- Wrote Hive scripts to load data from the various data stages.
- Created Pig scripts to identify duplicates and pick out the most recent customer data (a PySpark equivalent is sketched after this list).
- Created shell scripts to execute the scripts and MapReduce programs from the shell.
- Also created the design document for the above-mentioned modules.
- Worked on optimizing the MapReduce jobs and Hive queries.
- Assisted team members with their tasks and kept the onsite team updated.
- Carried out end-to-end integration testing.
- Built knowledge among team members.
- Understood and contributed to the project's technical design as well as the requirement specifications.
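For illustration, the duplicate-elimination and latest-record logic described above, expressed here in PySpark rather than Pig; the table and column names are hypothetical:

    # Keep only the most recent record per customer: a PySpark equivalent
    # of the Pig dedup logic. Table and column names are hypothetical.
    from pyspark.sql import SparkSession, functions as F
    from pyspark.sql.window import Window

    spark = SparkSession.builder.enableHiveSupport().getOrCreate()
    customers = spark.table("staging.customers_raw")

    # Rank each customer's records by update time, newest first.
    w = Window.partitionBy("customer_id").orderBy(F.col("updated_at").desc())
    latest = (customers
              .withColumn("rn", F.row_number().over(w))
              .filter(F.col("rn") == 1)   # keep only the newest record
              .drop("rn"))

    latest.write.mode("overwrite").saveAsTable("warehouse.customers_latest")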
Technologies/Tools Used: HDFS, Hadoop, MapReduce, Hive/HQL, Pig, Sqoop, MySQL, UNIX shell scripting, Eclipse, PuTTY.