Hadoop / Spark Developer Resume
SUMMARY:
- 6+ years of experience across the Software Development Life Cycle (SDLC) and Agile methodology, including analysis, design, development, testing, implementation and deployment, with a strong emphasis on object-oriented analysis, ETL design, development and implementation, testing and deployment of data warehouses, and Big Data processing covering ingestion, storage, querying and analysis.
- Experience with data file formats such as Avro and Parquet, compression techniques such as Snappy and Gzip, and processing with Hive and Pig.
- Experience in building and maintaining multiple Hadoop clusters of different sizes and configurations, and in setting up rack topology for large clusters.
- Good understanding of Hadoop architecture and hands-on experience with Hadoop components such as JobTracker, TaskTracker, NameNode, DataNode, MapReduce concepts and the HDFS framework.
- Work experience in ingestion, storage, querying, processing and analysis of Big Data, with hands-on experience in Hadoop ecosystem development including MapReduce, HDFS, Hive, Pig, Spark, Cloudera Navigator, HBase, ZooKeeper, Kafka, Sqoop, Flume, Oozie, Cassandra and AWS.
- Extensive experience in using Flume to transfer log data files to Hadoop Distributed File System (HDFS).
- Experience in creating real time data streaming solutions using Apache Spark Core, Spark SQL & Data Frames, Spark Streaming.
- Experience in importing and exporting data between HDFS and relational database management systems using Sqoop, and in ingesting data into Hadoop with tools such as Sqoop and Flume.
- Successfully loaded files to Hive and HDFS from Oracle and SQL Server using Sqoop.
- Worked with Hive/HQL to query data from Hive tables in HDFS.
- Used the NoSQL databases HBase and MongoDB for storing large tables, bringing data into HBase using Pig and Sqoop.
- Experienced in moving data from different sources using Kafka producers and consumers, and in preprocessing data using Storm topologies.
- Experienced in migrating ETL transformations to Pig Latin scripts, including transformations and join operations.
- Good understanding on Spark architecture and its components.
- Experience in writing Pig Latin scripts.
- Expertise in implementing Spark and Scala applications using higher-order functions for both batch and interactive analysis requirements.
- Extensive experience working with Spark tools such as RDD transformations, Spark MLlib and Spark SQL.
- Good knowledge of executing Spark SQL queries against data in Hive by using HiveContext in Spark.
- Good understanding of R Programming, Data Mining and Machine Learning techniques.
- Hands-on experience with Spark, Scala, Hive, Pig and Sqoop.
- Created DataFrames in Spark with Scala.
- Improved the performance and optimization of existing algorithms in Hadoop using Spark Context, Spark SQL, DataFrames, pair RDDs and Spark on YARN (a minimal Spark with Scala sketch follows this summary).
- Developed Spark code using Scala and Spark SQL/Streaming for faster testing and processing of data.
- Experience in installing and setting up Hadoop environments in the cloud through Amazon Web Services (AWS) offerings such as EMR and EC2, which provide efficient processing of data, and Amazon Simple Storage Service (S3).
- Experience in web technologies such as HTML5, JavaScript, jQuery and CSS3.
- Strong analytical and problem-solving skills with a good understanding of system development methodologies, tools and techniques.
- Strong working experience on web standards.
- Proven experience in working with software methodologies like Waterfall/Iterative and Agile SCRUM.
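The following is a minimal sketch, in Scala, of the kind of Spark work summarized above: reading an existing Hive table, applying DataFrame and RDD transformations with higher-order functions, and running a Spark SQL query. The database, table and column names (sales_db.orders, customer_id, amount, status) are hypothetical placeholders, and the Hive-enabled SparkSession is used here in place of the older HiveContext mentioned in the summary.

```scala
import org.apache.spark.sql.SparkSession
import org.apache.spark.sql.functions._

object CustomerSummary {
  def main(args: Array[String]): Unit = {
    // Hive-enabled session so Spark SQL can query existing Hive tables
    val spark = SparkSession.builder()
      .appName("CustomerSummary")
      .enableHiveSupport()
      .getOrCreate()
    import spark.implicits._

    // Hypothetical Hive table; column names are placeholders
    val orders = spark.table("sales_db.orders")

    // DataFrame transformations: filter, aggregate, sort
    val revenueByCustomer = orders
      .filter($"status" === "COMPLETED")
      .groupBy($"customer_id")
      .agg(sum($"amount").as("total_revenue"))
      .orderBy($"total_revenue".desc)

    // The same aggregation expressed with RDD higher-order functions (map / reduceByKey)
    val revenueRdd = orders.rdd
      .map(r => (r.getAs[String]("customer_id"), r.getAs[Double]("amount")))
      .reduceByKey(_ + _)
    revenueRdd.take(5).foreach(println)

    // Spark SQL against the derived data
    revenueByCustomer.createOrReplaceTempView("revenue_by_customer")
    spark.sql("SELECT customer_id, total_revenue FROM revenue_by_customer LIMIT 20").show()

    spark.stop()
  }
}
```

A job like this would typically be packaged and launched with spark-submit on YARN; the same API calls also work interactively in spark-shell, which covers both the batch and interactive analysis cases mentioned above.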
PROFESSIONAL EXPERIENCE:
Hadoop / Spark Developer
Confidential
Responsibilities:
- Developed Big Data Solutions that enabled the business and technology teams to make data-driven decisions on the best ways to acquire customers and provide them business solutions.
- Involved in installing, configuring and managing Hadoop Ecosystem components like Spark, Hive, Pig, Sqoop, Kafka and Flume.
- Involved in installing Hadoop and Spark clusters on Amazon Web Services (AWS).
- Responsible for data ingestion using Flume and Kafka.
- Responsible for loading and managing unstructured and semi-structured data coming from different sources into the Hadoop cluster using Flume.
- Used the RegEx, JSON and Avro SerDes packaged with Hive for serialization and deserialization to parse the contents of streamed log data.
- Developed Spark programs for batch processing and Spark Streaming applications for real-time processing (see the sketch after the Environment line below).
- Developed MapReduce programs to cleanse and parse data in HDFS obtained from various data sources and to perform map-side joins using the distributed cache.
- Migrated the existing data to Hadoop from RDBMS (SQL Server and Oracle) using Sqoop for processing the data.
- Used Hive data warehouse tool to analyze the data in HDFS and developed Hive queries.
- Created internal and external tables with properly defined static and dynamic partitions for efficiency.
- Exported the business-required information to an RDBMS using Sqoop to make the data available to the BI team for generating reports.
- Implemented daily workflow for extraction, processing and analysis of data with Oozie.
Environment: Hadoop, Spark, Spark Streaming, Spark MLlib, Scala, Hive, Pig, HCatalog, MapReduce, Oozie, Sqoop, Flume and Kafka.
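A hedged sketch of a Spark Streaming consumer of the kind described above, written in Scala against the spark-streaming-kafka-0-10 integration. The broker address, topic name, consumer group and the per-level log count are illustrative assumptions, not the actual application logic.

```scala
import org.apache.kafka.common.serialization.StringDeserializer
import org.apache.spark.SparkConf
import org.apache.spark.streaming.{Seconds, StreamingContext}
import org.apache.spark.streaming.kafka010.{ConsumerStrategies, KafkaUtils, LocationStrategies}

object LogStreamJob {
  def main(args: Array[String]): Unit = {
    val conf = new SparkConf().setAppName("LogStreamJob")
    // Micro-batches every 10 seconds
    val ssc = new StreamingContext(conf, Seconds(10))

    // Placeholder Kafka connection settings
    val kafkaParams = Map[String, Object](
      "bootstrap.servers"  -> "broker1:9092",
      "key.deserializer"   -> classOf[StringDeserializer],
      "value.deserializer" -> classOf[StringDeserializer],
      "group.id"           -> "log-stream-job",
      "auto.offset.reset"  -> "latest",
      "enable.auto.commit" -> (false: java.lang.Boolean)
    )

    // Direct stream from a hypothetical "app-logs" topic
    val stream = KafkaUtils.createDirectStream[String, String](
      ssc,
      LocationStrategies.PreferConsistent,
      ConsumerStrategies.Subscribe[String, String](Seq("app-logs"), kafkaParams)
    )

    // Count log lines per level (e.g. INFO/WARN/ERROR) in each micro-batch
    stream.map(record => record.value())
      .map(line => (line.split(" ")(0), 1L))
      .reduceByKey(_ + _)
      .print()

    ssc.start()
    ssc.awaitTermination()
  }
}
```

The direct stream reads Kafka partitions in parallel without a separate receiver, which is the usual choice when Flume and Kafka feed high-volume log data into Spark.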
Hadoop Developer
Confidential
Responsibilities:
- Hired as a Hadoop Developer; primary responsibilities included designing, implementing and maintaining applications that receive transaction-based data generated by different bank applications across all US locations.
- Job duties involved the design and development of various modules in the Hadoop Big Data platform and processing data using MapReduce, Hive, Pig and Sqoop.
- Designed, developed and tested Map Reduce programs to acquire data and store it into HDFS.
- Processed that data using technologies such as MapReduce, Pig and Sqoop, structured it and ingested it into Hive tables.
- Implemented Big Data solutions for clients.
- Set up Big Data architectures for companies with various needs, implementing Hadoop ecosystems and integrating relational database systems into the Hadoop ecosystem.
- Responsible for building scalable distributed data solutions using Hadoop
- Wrote multiple MapReduce programs for data extraction, transformation and aggregation from multiple file formats including XML, JSON, CSV and other compressed file formats.
- Collected log data from web servers and ingested it into HDFS using Apache Flume.
- Worked with different file formats such as TEXTFILE, AVRO, ORC and PARQUET for Hive querying and processing (see the sketch after the Environment line below).
- Installed and worked on a Hortonworks Hadoop cluster.
- Experienced in using the Apache Ambari console for monitoring Hadoop and HBase cluster health.
- Worked on various cluster administration activities such as commissioning and decommissioning DataNodes and NameNode recovery.
- Loading log data directly into HDFS using Flume.
- Experienced in managing and reviewing Hadoop log files.
- Developed Pig Latin scripts for analysis and transformation of semi-structured and structured data.
- Worked on tuning the performance of Pig queries.
- Installed the Oozie workflow engine to run multiple Pig and Sqoop jobs in a specified sequence.
- Exported the analyzed data to the relational databases using Sqoop for visualization and for generating reports.
- Hands-on experience in installing and configuring HBase and ZooKeeper.
- Used Sqoop to import the data from Oracle to Hadoop Distributed File System (HDFS) and later analyzed the imported data using PIG scripts.
- Used File System Check (FSCK) to check the health of files in HDFS
Environment: Hadoop, MapReduce, HDFS, PIG, Hive, Flume, Sqoop, Oozie, Storm, Kafka, Spark, Scala, MongoDB, Cassandra, Cloudera, Zookeeper, AWS, MySQL, Talend, Shell Scripting, Java, Git.
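As a hedged illustration of the Hive file-format and partitioning work mentioned above, the sketch below issues HiveQL through a Hive-enabled SparkSession in Scala: an external TEXTFILE staging table over raw landed files, a managed ORC table partitioned by date, and a dynamic-partition insert between them. The database, table and column names and the HDFS path are hypothetical; the same DDL could equally be run directly in the Hive CLI or beeline.

```scala
import org.apache.spark.sql.SparkSession

object HiveTableSetup {
  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder()
      .appName("HiveTableSetup")
      .enableHiveSupport()
      .getOrCreate()

    // External table over raw text files landed by Flume/Sqoop (path and columns are placeholders)
    spark.sql(
      """CREATE EXTERNAL TABLE IF NOT EXISTS staging.web_logs_raw (
        |  log_ts STRING,
        |  user_id STRING,
        |  url STRING
        |) ROW FORMAT DELIMITED FIELDS TERMINATED BY '\t'
        |STORED AS TEXTFILE
        |LOCATION '/data/raw/web_logs'""".stripMargin)

    // Managed table stored as ORC, partitioned by date for efficient querying
    spark.sql(
      """CREATE TABLE IF NOT EXISTS analytics.web_logs (
        |  log_ts STRING,
        |  user_id STRING,
        |  url STRING
        |) PARTITIONED BY (log_date STRING)
        |STORED AS ORC""".stripMargin)

    // Allow dynamic partitions, then load from the raw staging table
    spark.sql("SET hive.exec.dynamic.partition=true")
    spark.sql("SET hive.exec.dynamic.partition.mode=nonstrict")
    spark.sql(
      """INSERT OVERWRITE TABLE analytics.web_logs PARTITION (log_date)
        |SELECT log_ts, user_id, url, to_date(log_ts) AS log_date
        |FROM staging.web_logs_raw""".stripMargin)

    spark.stop()
  }
}
```

Partitioning by log_date keeps Hive and Spark SQL queries that filter on date scanning only the relevant directories, which is the usual motivation for static and dynamic partitioning in tables like these.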
Tableau Developer
Confidential
Responsibilities:
- Gathered business requirements from end users, analyzed and documented the same for various reporting needs.
- Analyzed complex data from various data repositories and became familiar with user base locations.
- Created views and dashboards based on requirements and published them to Tableau Server for Business and End User teams to be reviewed.
- Designed and developed mock-up Tableau Dashboards to explore options for visualization of data, presentation, and analysis.
- Created side-by-side bar chart comparisons of vendors associated with Confidential, along with the profit/revenue generated.
- Created line charts to display trends over time and paired them with bar charts to show revenue growth from users upgrading to additional features of the mobile app, adding a filter to choose between business and personal users.
- Executed and tested required queries and reports before publishing.
- Participated in weekly meetings, reviews and user group meetings, and communicated with stakeholders and business groups.
Business Analyst
Confidential
Responsibilities:
- Involved in developing business logic components using JavaBeans and Servlets.
- Involved in utilizing WebLogic-specific connection pools to interact with business data from the business components.
- Worked extensively with Node.js and Angular.js.
- Created enterprise deployment strategy and designed the enterprise deployment process to deploy Web Services, J2EE programs on more than 7 different SOA/WebLogic instances across development, test and production environments.
- Programmed backend service interactions using Core Data and RESTful services.
- Designed and Developed User Interface using JSP.
- Developed Server-side validation classes using JDBC calls.
- Designed the user interface using HTML, Swing, CSS, XML, JavaScript and JSP.
- Implemented the presentation using a combination of Java Server Pages (JSP) to render the HTML and well-defined API interface to allow access to the application services layer.
- Implemented Configuration of Engage Administration and property files.
- Input validations were done using JavaScript.
- Developed applications using Mobile Supply Chain.
Environment: Java, J2EE, Servlets, JSP, JSF, JavaScript, HTML, Swing, CSS, XML, Eclipse, Internet Explorer, Linux, WebLogic, DB2, jQuery UI, jQuery, Node.js, Angular.js, ExtJS.