
Hadoop Developer/Lead Resume


Richmond, VA

SUMMARY

  • Almost 8 years of experience in the analysis, design, and implementation of applications running on various platforms.
  • 3+ years of experience as a Hadoop/Spark developer with good knowledge of MapReduce, Hive, Scala, and Spark.
  • Hands-on experience developing big data projects with open source tools/technologies including Hadoop, Hive, Oozie, Spark, Kafka, MapReduce, HDFS, Pig, ZooKeeper, Flume, Sqoop, and Impala.
  • Strong development experience in Apache Spark using Scala.
  • Experience using Spark with Scala for large-scale streaming data processing.
  • Good experience with NoSQL databases such as HBase, Cassandra, and MongoDB.
  • Experience and strong understanding of all phases of the SDLC using Agile Scrum and Waterfall development methodologies.
  • Good understanding of and working knowledge of REST services.
  • Built Spark Streaming applications that receive real-time data from Kafka and store the streamed data to HDFS (see the sketch after this list).
  • Experience managing scalable Hadoop clusters, including cluster design, provisioning, custom configuration, monitoring, and maintenance across Hadoop distributions: Cloudera CDH and Hortonworks HDP.
  • Experience with the Hadoop ecosystem, including Spark, Storm, HDFS, Hive, NiFi, Kafka, Sqoop, and HBase.
  • Experienced in configuring Workflow scheduling using Oozie.
  • Experience importing and exporting data using Sqoop between relational database systems (RDBMS) and HDFS.
  • Configured Spark Streaming to receive real-time data from Kafka and store the streamed data to HDFS using Scala.
  • Experience working with various data warehousing and ETL tools.
  • Good knowledge of integrating various data sources such as RDBMS, spreadsheets, text files, and XML files.
  • Experience analyzing data using HiveQL and Pig Latin, and extending Hive and Pig core functionality with custom UDFs.
  • Experience developing and deploying shell and Python scripts for automation, notification, and monitoring.
  • In-depth knowledge of object-oriented programming (OOP) methodologies and features such as inheritance, polymorphism, exception handling, and templates, with development experience in Java technologies.
  • Developed Spark applications using Scala for easy Hadoop transitions.
  • Experienced in data warehousing, developing ETL mappings and scripts in Informatica PowerCenter 9.6 and PowerMart 9.6 using Designer, Repository Manager, Workflow Manager, and Workflow Monitor.
  • Able to work within a team environment as well as independently.
  • Proficient in analyzing information system needs, evaluating end-user requirements, custom-designing solutions, and troubleshooting complex information systems.
  • Worked with the business analysis team to analyze the feasibility of system requirements and proactively offered recommendations suggesting new workflows.
  • Excellent written and oral communication, presentation, analytical, and problem-solving skills, along with conflict resolution and negotiation techniques.
  • Leadership in projects requiring strong customer interface and technical excellence.
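
A minimal Scala sketch of the Kafka-to-HDFS streaming pattern described above, assuming the spark-streaming-kafka-0-10 integration; the broker address, topic name, consumer group, and output path are hypothetical placeholders.

    import org.apache.kafka.common.serialization.StringDeserializer
    import org.apache.spark.SparkConf
    import org.apache.spark.streaming.{Seconds, StreamingContext}
    import org.apache.spark.streaming.kafka010.KafkaUtils
    import org.apache.spark.streaming.kafka010.LocationStrategies.PreferConsistent
    import org.apache.spark.streaming.kafka010.ConsumerStrategies.Subscribe

    object KafkaToHdfs {
      def main(args: Array[String]): Unit = {
        val conf = new SparkConf().setAppName("KafkaToHdfs")
        val ssc = new StreamingContext(conf, Seconds(30)) // 30-second micro-batches

        val kafkaParams = Map[String, Object](
          "bootstrap.servers"  -> "broker1:9092",               // hypothetical broker
          "key.deserializer"   -> classOf[StringDeserializer],
          "value.deserializer" -> classOf[StringDeserializer],
          "group.id"           -> "hdfs-sink",                  // hypothetical consumer group
          "auto.offset.reset"  -> "latest"
        )

        val stream = KafkaUtils.createDirectStream[String, String](
          ssc, PreferConsistent, Subscribe[String, String](Seq("events"), kafkaParams))

        // Persist each non-empty micro-batch to HDFS as text files
        stream.map(_.value).foreachRDD { (rdd, time) =>
          if (!rdd.isEmpty) rdd.saveAsTextFile(s"hdfs:///data/events/batch-${time.milliseconds}")
        }

        ssc.start()
        ssc.awaitTermination()
      }
    }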

TECHNICAL SKILLS

CORE COMPETENCY TECHNOLOGIES: Scala, Spark Streaming, Spark SQL, Kafka, C#

DATABASES: Microsoft SQL Server, Oracle, Hive

OPERATING SYSTEMS: Windows, UNIX

WEB PRESENTATION FRAMEWORKS: JavaScript, HTML, AJAX, jQuery, CSS, JSON, SharePoint Designer, Visual Studio

DEVELOPMENT TOOLS: Microsoft Visual Studio 2005/08/10, SharePoint Designer 2007/2010

WEB SERVERS: Internet Information Server (IIS 6.0/7.0/8.0), Active Directory, DNS

SHAREPOINT TECHNOLOGY: SharePoint Server, SharePoint Designer, Office 365

PROFESSIONAL EXPERIENCE

Confidential, Richmond, VA

Hadoop Developer/Lead

Responsibilities:

  • Responsible for the design, development, and delivery of data from operational systems and files into the ODS, downstream data marts, and files.
  • Troubleshot and developed on Hadoop technologies including HDFS, Hive, HBase, Spark, and Impala, and performed Hadoop ETL development via tools such as Informatica and Teradata.
  • Responsible for building solutions involving large data sets using SQL methodologies and data integration tools in any database.
  • Used Spark API over Cloudera Hadoop YARN to perform analytics on data in Hive.
  • Implemented advanced procedures like text analytics and processing using the in-memory computing capabilities through Apache Spark.
  • Enhanced and optimized the product's Spark code to aggregate, group, and run data mining tasks using the Spark framework.
  • Worked with Spark SQL and DataFrame functions to perform data transformations and aggregations on complex semi-structured data (see the sketch after this list).
  • Experience developing Spark applications using the Spark RDD, Spark SQL, and DataFrame APIs.
  • Used Control-M to schedule workflows to run Spark jobs to transform data on a persistent schedule.
  • Worked with various data warehousing and ETL tools.
  • Developed and deployed shell and Python scripts for automation, notification, and monitoring.
  • Extensively used Apache Kafka, Apache Spark, HDFS, and Apache Impala to build near real-time data pipelines that ingest, transform, store, and analyze clickstream data to provide a better personalized user experience.
  • Used Git to check code into Bitbucket; the CI/CD pipeline then promotes the code to SIT and PROD.
  • Used Confluence to keep track of project documents.
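
A minimal Scala sketch of the kind of DataFrame transformation and aggregation work described above; the Hive table, column names, and target data mart are hypothetical placeholders.

    import org.apache.spark.sql.SparkSession
    import org.apache.spark.sql.functions._

    object ClickstreamAggregates {
      def main(args: Array[String]): Unit = {
        val spark = SparkSession.builder
          .appName("ClickstreamAggregates")
          .enableHiveSupport() // read and write Hive tables on the cluster
          .getOrCreate()

        // Hypothetical clickstream table landed in the ODS
        val clicks = spark.table("ods.clickstream")

        // Aggregate events and distinct users per page per day
        val daily = clicks
          .withColumn("event_date", to_date(col("event_ts")))
          .groupBy(col("event_date"), col("page_id"))
          .agg(count(lit(1)).as("events"),
               countDistinct(col("user_id")).as("unique_users"))

        // Publish to a downstream data mart table, partitioned for pruning
        daily.write.mode("overwrite")
          .partitionBy("event_date")
          .saveAsTable("mart.daily_page_stats")

        spark.stop()
      }
    }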

Confidential, Jackson, MS

Hadoop Developer

Responsibilities:

  • Used Spark API over Cloudera Hadoop YARN to perform analytics on data in Hive.
  • Configured Spark Streaming to receive real-time data from Kafka and store the streamed data to HDFS using Scala.
  • Implemented advanced procedures like text analytics and processing using the in-memory computing capabilities through Apache Spark.
  • Enhanced and optimized the product's Spark code to aggregate, group, and run data mining tasks using the Spark framework.
  • Ingested data using NiFi, wrote the stream data into Kafka, and analyzed the data through Spark.
  • Used Spark Streaming APIs to perform the required transformations and actions on the learner data model, which receives data from Kafka in near real time.
  • Migrated MapReduce programs into Spark transformations using Spark and Scala (see the sketch after this list).
  • Experience developing Spark applications using the Spark RDD, Spark SQL, and DataFrame APIs.
  • Used Apache Oozie to schedule workflows to run Spark jobs to transform data on a persistent schedule.
  • Exported batch files into AWS S3 using MapReduce jobs.
  • Developed and deployed shell and Python scripts for automation, notification, and monitoring.
  • Extensively used Apache Kafka, Apache Spark, HDFS, and Apache Impala to build near real-time data pipelines that ingest, transform, store, and analyze clickstream data to provide a better personalized user experience.
  • Managed scalable Hadoop clusters, including cluster design, provisioning, custom configuration, monitoring, and maintenance on the Hortonworks HDP distribution.
  • Enabled speedy reviews and first-mover advantages by using Oozie to automate data loading into the Hadoop Distributed File System and Pig to pre-process the data.
  • Built a lambda architecture using Apache Kafka, Spark Streaming, Spark SQL, HDFS, and HBase to develop and provide a near real-time personalization experience for customers.
  • Hands on experience in creating RDDs, transformations and actions while implementing Spark applications.
  • Used Git to check code into Bitbucket; the CI/CD pipeline then promotes the code to SIT and PROD.
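
A minimal Scala sketch of the MapReduce-to-Spark migration pattern mentioned above: the classic mapper/reducer word count collapses into a few RDD transformations. The HDFS paths are hypothetical placeholders.

    import org.apache.spark.sql.SparkSession

    object WordCountMigration {
      def main(args: Array[String]): Unit = {
        val spark = SparkSession.builder.appName("WordCountMigration").getOrCreate()
        val sc = spark.sparkContext

        sc.textFile("hdfs:///data/raw/logs")            // replaces the job's InputFormat
          .flatMap(_.split("\\s+"))                     // map phase: emit one token per word
          .map(word => (word, 1))                       // key by word, as the mapper would
          .reduceByKey(_ + _)                           // reduce phase: sum counts per key
          .saveAsTextFile("hdfs:///data/out/wordcount") // replaces the reducer's OutputFormat

        spark.stop()
      }
    }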

Confidential, Dallas, TX

Hadoop Developer

Responsibilities:

  • Used Apache Hue web interface to monitor the Hadoop cluster and run the jobs.
  • Used Spark API over Cloudera Hadoop YARN to perform analytics on data in Hive.
  • Developed Spark code using Scala and Spark-SQL/Streaming for faster processing of data.
  • Migrated complex MapReduce programs and Hive scripts into Spark RDD transformations and actions.
  • Developed Scala scripts and UDFs using both SQL and RDD/MapReduce in Spark for data aggregation and queries, writing data back into the RDBMS through Sqoop.
  • Responsible for loading data into Spark RDDs and performing in-memory data computation to generate the output response.
  • Converted all the VAP processing from Netezza and reimplemented it using Spark DataFrames and RDDs.
  • Good experience in object-oriented programming languages (C/C++, C#, Java).
  • Imported and exported data into HDFS and Hive using Sqoop.
  • Worked with various data warehousing and ETL tools.
  • Migrated an existing on-premises application to AWS.
  • Created Hive queries that helped market analysts spot emerging trends by comparing fresh data with EDW reference tables and historical metrics.
  • Wrote UDFs and MapReduce jobs depending on the specific requirement.
  • Developed workflow in Oozie to automate the tasks of loading the data into HDFS and pre-processing with Hive.
  • Involved in scheduling Oozie workflow engine to run multiple Hive and Pig jobs.
  • Created Hive schemas using performance techniques like partitioning and bucketing (see the sketch after this list).
  • Analyzed HBase data in Hive by creating external partitioned and bucketed tables.
  • Developed Oozie workflow jobs to execute HIVE and MapReduce actions.
  • Extensively worked in code reviews and code remediation to meet the coding standards.
  • Involved in collecting and aggregating large amounts of log data using Apache Kafka and staging data in HDFS for further analysis.
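
A minimal Scala sketch of the partitioned and bucketed Hive schemas mentioned above, issued through Spark's Hive support; the database, table, and column names are hypothetical placeholders.

    import org.apache.spark.sql.SparkSession

    object HiveSchemaSetup {
      def main(args: Array[String]): Unit = {
        val spark = SparkSession.builder
          .appName("HiveSchemaSetup")
          .enableHiveSupport()
          .getOrCreate()

        // Partition by load date so queries prune whole directories;
        // bucket by customer_id so joins and sampling touch fewer files.
        spark.sql("""
          CREATE TABLE IF NOT EXISTS edw.transactions (
            txn_id      BIGINT,
            customer_id BIGINT,
            amount      DECIMAL(12,2)
          )
          PARTITIONED BY (load_date STRING)
          CLUSTERED BY (customer_id) INTO 32 BUCKETS
          STORED AS ORC
        """)

        spark.stop()
      }
    }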

Confidential, Austin, TX

Hadoop Developer

Responsibilities:

  • Installed and configured Hadoop MapReduce and HDFS; developed multiple MapReduce jobs for data cleaning and preprocessing.
  • Experience in installing, configuring, and using Hadoop ecosystem components.
  • Imported and exported data into HDFS and Hive using Sqoop.
  • Experienced in defining job flows.
  • Created Hive tables populated from the relevant EDW tables.
  • Worked on processing unstructured data using Pig and Hive.
  • Developed and deployed shell and Python scripts for automation, notification, and monitoring.
  • Managed scalable Hadoop clusters, including cluster design, provisioning, custom configuration, monitoring, and maintenance on the Hortonworks HDP distribution.
  • Used RDDs to perform transformations on datasets as well as actions like count, reduce, and first (see the sketch after this list).
  • Extracted and processed log data in HDFS using Flume.
  • Implemented various checkpoints on RDDs to disk to handle job failures and aid debugging.
  • Developed Spark SQL jobs to load tables into HDFS and run select queries on top.
  • Knowledge in performance troubleshooting and tuning Hadoop clusters.
  • Responsible to manage data coming from different sources.
  • Gained good experience with NoSQL databases like MongoDB.
  • Supported MapReduce programs running on the cluster.
  • Involved in loading data from the UNIX file system into HDFS.
  • Installed and configured Hive and wrote Hive UDFs.
  • Involved in creating Hive tables, loading them with data, and writing Hive queries that run internally as MapReduce jobs.
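
A minimal Scala sketch of the RDD transformations, actions, and disk checkpointing mentioned above; the input path and checkpoint directory are hypothetical placeholders.

    import org.apache.spark.sql.SparkSession

    object RddActionsDemo {
      def main(args: Array[String]): Unit = {
        val spark = SparkSession.builder.appName("RddActionsDemo").getOrCreate()
        val sc = spark.sparkContext
        sc.setCheckpointDir("hdfs:///tmp/checkpoints") // reliable storage for lineage cuts

        val lines  = sc.textFile("hdfs:///data/flume/logs")
        val errors = lines.filter(_.contains("ERROR")).cache() // transformation, kept in memory
        errors.checkpoint() // persist to disk so failed jobs recover without full lineage replay

        // Actions: materialize results from the transformed RDD
        println(s"error count: ${errors.count()}")
        println(s"first error: ${errors.first()}") // assumes at least one match exists
        val longest = errors.map(_.length).reduce((a, b) => math.max(a, b))
        println(s"longest error line: $longest chars")

        spark.stop()
      }
    }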

Confidential, Charlotte, NC

Hadoop Consultant

Responsibilities:

  • Worked on analyzing the Hadoop cluster and different big data analytic tools, including Pig and the HBase NoSQL database.
  • Imported and exported data in HDFS and Hive using MapReduce.
  • Extracted BSON files from MongoDB, placed them in HDFS, and processed them.
  • Designed and developed MapReduce jobs to process data coming in BSON format.
  • Experience with NoSQL databases.
  • Extracted and processed log data in HDFS using Flume.
  • Developed MapReduce programs to parse the raw data, populate staging tables and store the refined data in partitioned tables in the EDW.
  • Wrote Hive UDFs to extract data from staging tables (see the sketch after this list).
  • Involved in creating Hive tables, loading with data.
  • Good understanding of Storm for reading log files.
  • Hands-on experience writing MapReduce code to turn unstructured data into structured data and to insert data into HBase from HDFS.
  • Managed scalable Hadoop clusters, including cluster design, provisioning, custom configuration, monitoring, and maintenance on the Hortonworks HDP distribution.
  • Experience creating integrations between Hive and HBase.
  • Used Oozie scheduler to submit workflows.
  • Review QA test cases with the QA team.
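
A minimal Scala sketch of a staging-table Hive UDF of the kind mentioned above, written against the classic org.apache.hadoop.hive.ql.exec.UDF API; the extraction rule (pulling a key=value field out of a delimited column) and the registration snippet are hypothetical examples.

    import org.apache.hadoop.hive.ql.exec.UDF
    import org.apache.hadoop.io.Text

    class ExtractField extends UDF {
      // Hive resolves evaluate() by reflection; stay null-safe for NULL columns.
      def evaluate(row: Text, key: Text): Text = {
        if (row == null || key == null) return null
        row.toString.split(";")                // e.g. "user_id=42;page=home"
          .map(_.split("=", 2))
          .collectFirst { case Array(k, v) if k == key.toString => new Text(v) }
          .orNull
      }
    }

    // Registered in Hive along these lines (names are illustrative):
    //   ADD JAR extract-field.jar;
    //   CREATE TEMPORARY FUNCTION extract_field AS 'ExtractField';
    //   SELECT extract_field(raw_col, 'user_id') FROM staging.events;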

Confidential, Jacksonville, FL

Java Developer

Responsibilities:

  • Involved in Analysis, Design, Coding and Development of custom Interfaces.
  • Assisted in proposing suitable UML class diagrams for the project.
  • Wrote SQL scripts to create and maintain the database, roles, users, tables, views, procedures and triggers in Oracle.
  • Designed and implemented the UI using HTML and Java.
  • Worked on database interaction layer for insertions, updating and retrieval operations on data.
  • Coordinated and communicated with onsite resources regarding issues raised in the production environment and fixed day-to-day issues.
  • Looked after release management and code reviews.
  • Partly used Hibernate, EJB, and web services.
  • Involved in developing build file for the project.
  • Involved in all Payday Transactions issue fixes and enhancements.
  • Supported with UAT, Pre-Prod and Production Build management.
  • Involved in the analysis of Safe/Drawer Transactions, Loan deposit modules and development of Collection Letters.
  • Coordination with team for Fixes and Releases.
  • Involved in all Title Transactions and printing of the documents.
  • Applied CSS for a consistent look and feel across pages in the application.
  • Worked on web pages and business objects using JavaScript and XML in a mixed ASP.NET environment.
  • Used JavaScript functions to implement complex business rules and validation of front end forms.
  • Experience with Change Management /Change Control Boards.
  • Performed Microsoft SQL Server database analysis, integration, and reporting.

Confidential, Boise, ID

Java Developer

Responsibilities:

  • Actively involved in all phases of SDLC and followed agile methodology throughout the project.
  • Used Java/J2EE patterns such as Model View Controller (MVC), Business Delegate, Session Facade, Service Locator, Data Transfer Object, Data Access Object, Singleton, and Factory.
  • Involved in the design and development of the business tier using service beans (stateless EJBs) and JavaBeans, DAO stored procedures, and a data access layer using JDBC and Hibernate.
  • Used Spring Framework for DI, integrated with the Struts Framework and Hibernate.
  • Used iBatis as an SQL Mapping tool to store the persistent data and to communicate with Oracle.
  • Used ESB in developing enterprise applications.
  • Created JSF Custom Components and configured managed beans to meet the requirement.
  • Worked on development and integration of all the applications in the project, including the WebSphere Portal presentation layer and the services layer containing processes.
  • Designed and developed SOAP web services, extensively using WSDL and the IBM RSA IDE.
  • Used JUnit, JTest, and Struts test cases for testing the application modules, and Log4J for logging.
  • Configured and integrated IBM WebSphere Application Server and MQ Series.
  • Developed MDBs for receiving and processing data from WebSphere MQ Series.
  • Used Ant scripts to build and deploy the application on IBM WebSphere Application Server.

Confidential, Houston, TX

Java Developer

Responsibilities:

  • Involved in Analysis, Design, Coding and Development of custom Interfaces.
  • Involved in the feasibility study of the project.
  • Gathered requirements from the client for designing the Web Pages.
  • Gathered specifications for the Library site from different departments and users of the services.
  • Assisted in proposing suitable UML class diagrams for the project.
  • Wrote SQL scripts to create and maintain the database, roles, users, tables, views, procedures and triggers in Oracle.
  • Designed and implemented the UI using HTML and Java.
  • Worked on database interaction layer for insertions, updating and retrieval operations on data.
  • Coordinated and communicated with onsite resources regarding issues raised in the production environment and fixed day-to-day issues.
  • Looked after release management and code reviews.
  • Partly used Hibernate, EJB, and web services.
  • Involved in developing build file for the project.
  • Involved in all Payday Transactions issue fixes and enhancements.
  • Supported with UAT, Pre-Prod and Production Build management.
  • Involved in the analysis of Safe/Drawer Transactions, Loan deposit modules and development of Collection Letters.
  • Coordination with team for Fixes and Releases.
  • Involved in all Title Transactions and printing of the documents.
