Hadoop Developer/Lead Resume
Richmond, VA
SUMMARY
- Almost 8 years of experience in the analysis, design, and implementation of applications running on various platforms.
- 3+ years of experience as a Hadoop/Spark Developer with good knowledge of MapReduce, Hive, Scala, and Spark.
- Hands-on experience developing Big Data projects with open-source tools and technologies: Hadoop, HDFS, MapReduce, Hive, Pig, Oozie, Spark, Kafka, ZooKeeper, Flume, Sqoop, and Impala.
- Strong development experience in Apache Spark using Scala.
- Experience using Spark with Scala for large-scale streaming data processing.
- Good experience with NoSQL databases such as HBase, Cassandra, and MongoDB.
- Experience with and strong understanding of all phases of the SDLC under Agile (Scrum) and Waterfall development methodologies.
- Good understanding and knowledge of REST services.
- Built Spark Streaming applications in Scala that receive real-time data from Kafka and store the streams in HDFS (see the sketch at the end of this summary).
- Experience managing scalable Hadoop clusters, including cluster design, provisioning, custom configuration, monitoring, and maintenance across different Hadoop distributions: Cloudera CDH and Hortonworks HDP.
- Experience with the Hadoop ecosystem, including Spark, Storm, HDFS, Hive, NiFi, Kafka, Sqoop, and HBase.
- Experienced in configuring workflow scheduling using Oozie.
- Experience importing and exporting data between relational database systems (RDBMS) and HDFS using Sqoop.
- Experience working with various data warehousing and ETL tools.
- Good knowledge of integrating various data sources such as RDBMS, spreadsheets, text files, and XML files.
- Experience analyzing data with HiveQL and Pig Latin, and extending Hive and Pig core functionality with custom UDFs.
- Experience developing and deploying shell and Python scripts for automation, notification, and monitoring.
- In-depth knowledge of object-oriented programming (OOP) methodologies and features such as inheritance, polymorphism, exception handling, and templates, with development experience in Java technologies.
- Developed Spark applications in Scala to ease transitions onto Hadoop.
- Experienced in data warehousing, developing ETL mappings and scripts in Informatica PowerCenter 9.6 and PowerMart 9.6 using Designer, Repository Manager, Workflow Manager, and Workflow Monitor.
- Able to work within a team environment as well as independently.
- Proficient in analyzing information system needs, evaluating end - user requirements, custom designing solutions, troubleshooting for complex information systems management.
- Worked with Business Analysts team to analyze the feasibility of the System requirements and proactively offered recommendations suggesting new workflows.
- Excellent written and oral communication, presentation, analytical, and problem-solving skills, along with conflict-resolution and negotiation techniques.
- Led projects requiring strong customer interface and technical excellence.
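
The Spark Streaming work summarized above follows a common Kafka-to-HDFS pattern. A minimal Scala sketch of that pattern is below; the broker address, topic name, consumer group, and HDFS paths are hypothetical placeholders, not actual project values.

```scala
import org.apache.kafka.common.serialization.StringDeserializer
import org.apache.spark.SparkConf
import org.apache.spark.streaming.{Seconds, StreamingContext}
import org.apache.spark.streaming.kafka010.ConsumerStrategies.Subscribe
import org.apache.spark.streaming.kafka010.KafkaUtils
import org.apache.spark.streaming.kafka010.LocationStrategies.PreferConsistent

object KafkaToHdfs {
  def main(args: Array[String]): Unit = {
    val ssc = new StreamingContext(new SparkConf().setAppName("KafkaToHdfs"), Seconds(30))

    val kafkaParams = Map[String, Object](
      "bootstrap.servers"  -> "broker1:9092",              // hypothetical broker
      "key.deserializer"   -> classOf[StringDeserializer],
      "value.deserializer" -> classOf[StringDeserializer],
      "group.id"           -> "hdfs-sink",                 // hypothetical consumer group
      "auto.offset.reset"  -> "latest"
    )

    val stream = KafkaUtils.createDirectStream[String, String](
      ssc, PreferConsistent, Subscribe[String, String](Seq("events"), kafkaParams))

    // Persist each non-empty micro-batch under a timestamped HDFS directory.
    stream.map(_.value).foreachRDD { (rdd, time) =>
      if (!rdd.isEmpty()) rdd.saveAsTextFile(s"hdfs:///data/events/batch-${time.milliseconds}")
    }

    ssc.start()
    ssc.awaitTermination()
  }
}
```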
TECHNICAL SKILLS
Core Competency Technologies: Scala, Spark Streaming, Spark SQL, Kafka, C#
DATABASES: Microsoft SQL Server, Oracle, Hive
OPERATING SYSTEMS: Windows, UNIX
WEB PRESENTATION FRAMEWORKS: JavaScript, HTML, AJAX, jQuery, CSS, JSON, SharePoint Designer, Visual Studio
DEVELOPMENT TOOLS: Microsoft Visual Studio 2005/08/10, SharePoint Designer 2007/2010
WEB SERVERS: Internet Information Server (IIS 6.0/7.0/8.0), Active Directory, DNS
SHAREPOINT TECHNOLOGY: SharePoint Server, SharePoint Designer, Office 365
PROFESSIONAL EXPERIENCE
Confidential, Richmond, VA
Hadoop Developer/Lead
Responsibilities:
- Responsible for the design, development, and delivery of data from operational systems and files into the ODS, downstream data marts, and files.
- Troubleshot and developed on Hadoop technologies including HDFS, Hive, HBase, Spark, and Impala, and built Hadoop ETL with tools such as Informatica and Teradata.
- Responsible for building solutions over large data sets using SQL methodologies and data integration tools against any database.
- Used the Spark API over Cloudera Hadoop YARN to perform analytics on data in Hive.
- Implemented advanced procedures such as text analytics and processing using the in-memory computing capabilities of Apache Spark.
- Enhanced and optimized product Spark code to aggregate, group, and run data-mining tasks using the Spark framework.
- Worked with Spark SQL and DataFrame functions to perform transformations and aggregations on complex semi-structured data (see the sketch after this list).
- Developed Spark applications using the Spark RDD, Spark SQL, and DataFrame APIs.
- Used Control-M to schedule workflows that run Spark jobs to transform data on a recurring schedule.
- Worked with various data warehousing and ETL tools.
- Developed and deployed shell and Python scripts for automation, notification, and monitoring.
- Extensively used Apache Kafka, Apache Spark, HDFS, and Apache Impala to build near-real-time data pipelines that ingest, transform, store, and analyze clickstream data to provide a more personalized user experience.
- Used Git to check code into Bitbucket; a CI/CD pipeline promoted the code to the SIT and PROD environments.
- Used Confluence to track project documentation.
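
The semi-structured DataFrame work called out above typically looks like the following Spark SQL sketch: read nested JSON, flatten the fields of interest, and aggregate. The input path, field names, and the analytics.daily_clicks Hive table are hypothetical assumptions for illustration.

```scala
import org.apache.spark.sql.SparkSession
import org.apache.spark.sql.functions.{count, countDistinct, to_date}

object ClickstreamRollup {
  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder
      .appName("ClickstreamRollup")
      .enableHiveSupport()
      .getOrCreate()
    import spark.implicits._

    // Semi-structured JSON clickstream landed on HDFS (hypothetical path and schema).
    val clicks = spark.read.json("hdfs:///landing/clickstream/")

    // Flatten the nested page.url field and aggregate views per user per day.
    val daily = clicks
      .select($"userId", $"page.url".as("url"), to_date($"eventTime").as("day"))
      .groupBy($"userId", $"day")
      .agg(count($"url").as("pageViews"), countDistinct($"url").as("distinctPages"))

    // Hypothetical Hive target table for downstream reporting.
    daily.write.mode("overwrite").saveAsTable("analytics.daily_clicks")
  }
}
```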
Confidential, Jackson, MS
Hadoop Developer
Responsibilities:
- Used the Spark API over Cloudera Hadoop YARN to perform analytics on data in Hive.
- Configured Spark Streaming in Scala to receive real-time data from Kafka and store the streams in HDFS.
- Implemented advanced procedures such as text analytics and processing using the in-memory computing capabilities of Apache Spark.
- Enhanced and optimized product Spark code to aggregate, group, and run data-mining tasks using the Spark framework.
- Ingested data using NiFi, wrote the stream data into Kafka, and analyzed it with Spark.
- Used the Spark Streaming APIs to perform the required transformations and actions on the learner data model, which receives its data from Kafka in near real time.
- Migrated MapReduce programs into Spark transformations using Spark and Scala (see the sketch after this list).
- Developed Spark applications using the Spark RDD, Spark SQL, and DataFrame APIs.
- Used Apache Oozie to schedule workflows that run Spark jobs to transform data on a recurring schedule.
- Exported batch files into AWS S3 using a MapReduce job.
- Developed and deployed shell and Python scripts for automation, notification, and monitoring.
- Extensively used Apache Kafka, Apache Spark, HDFS, and Apache Impala to build near-real-time data pipelines that ingest, transform, store, and analyze clickstream data to provide a more personalized user experience.
- Managed scalable Hadoop clusters, including cluster design, provisioning, custom configuration, monitoring, and maintenance on the Hortonworks HDP distribution.
- Enabled speedy reviews and first-mover advantages by using Oozie to automate data loading into HDFS and Pig to pre-process the data.
- Built a lambda architecture using Apache Kafka, Spark Streaming, Spark SQL, HDFS, and HBase to provide a near-real-time personalization experience for customers.
- Hands-on experience creating RDDs, transformations, and actions while implementing Spark applications.
- Used Git to check code into Bitbucket; a CI/CD pipeline promoted the code to the SIT and PROD environments.
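
On the MapReduce-to-Spark migrations mentioned above, a whole mapper/reducer pair usually collapses into a few RDD transformations. A minimal word-count-style sketch with hypothetical HDFS paths:

```scala
import org.apache.spark.sql.SparkSession

object MapReduceToSpark {
  def main(args: Array[String]): Unit = {
    val sc = SparkSession.builder.appName("MapReduceToSpark").getOrCreate().sparkContext

    // Map phase -> flatMap/map; shuffle-sort plus reduce phase -> reduceByKey.
    val counts = sc.textFile("hdfs:///input/logs")        // hypothetical input
      .flatMap(_.split("\\s+"))
      .filter(_.nonEmpty)
      .map(word => (word, 1L))
      .reduceByKey(_ + _)

    counts.saveAsTextFile("hdfs:///output/word-counts")   // hypothetical output
  }
}
```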
Confidential, Dallas, TX
Hadoop Developer
Responsibilities:
- Used the Apache Hue web interface to monitor the Hadoop cluster and run jobs.
- Used the Spark API over Cloudera Hadoop YARN to perform analytics on data in Hive.
- Developed Spark code using Scala and Spark SQL/Streaming for faster data processing.
- Migrated complex MapReduce programs and Hive scripts into Spark RDD transformations and actions.
- Developed Scala scripts and UDFs using both SQL and the RDD/MapReduce APIs in Spark for data aggregation and queries, writing data back into the RDBMS through Sqoop.
- Loaded data into Spark RDDs and performed in-memory computation to generate the output response.
- Converted all VAP processing from Netezza, reimplementing it with Spark DataFrames and RDDs.
- Good experience in object-oriented programming languages (C/C++, C#, Java).
- Importing and Exporting data into HDFS and Hive using Sqoop.
- Worked with various data warehousing and ETL tools.
- Migrated an existing on-premises application to AWS.
- Created Hive queries that helped market analysts spot emerging trends by comparing fresh data with EDW reference tables and historical metrics.
- Wrote UDFs and MapReduce jobs according to specific requirements.
- Developed workflow in Oozie to automate the tasks of loading the data into HDFS and pre-processing with Hive.
- Involved in scheduling Oozie workflow engine to run multiple Hive and Pig jobs.
- Created Hive schemas using performance techniques such as partitioning and bucketing (see the sketch after this list).
- Analyzed HBase data in Hive by creating external partitioned and bucketed tables.
- Developed Oozie workflow jobs to execute HIVE and MapReduce actions.
- Extensively worked in code reviews and code remediation to meet the coding standards.
- Involved in collecting and aggregating large amounts of log data using Apache Kafka and staging data in HDFS for further analysis.
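
For the partitioning and bucketing noted above, one way to express that layout from Spark (rather than raw Hive DDL) is the DataFrameWriter API; the table and column names here are hypothetical illustrations, not the project's actual schema.

```scala
import org.apache.spark.sql.SparkSession

object PartitionedBucketedLayout {
  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder
      .appName("PartitionedBucketedLayout")
      .enableHiveSupport()
      .getOrCreate()

    val txns = spark.table("staging.transactions_raw")    // hypothetical source table

    // Partitioning prunes whole directories at query time; bucketing on the
    // join key lets joins and aggregations on customer_id avoid full shuffles.
    txns.write
      .partitionBy("ingest_date")
      .bucketBy(32, "customer_id")
      .sortBy("customer_id")
      .format("parquet")
      .saveAsTable("edw.transactions")
  }
}
```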
Confidential, Austin, TX
Hadoop Developer
Responsibilities:
- Installed and configured Hadoop MapReduce and HDFS; developed multiple MapReduce jobs for data cleaning and preprocessing.
- Experience installing, configuring, and using Hadoop ecosystem components.
- Imported and exported data into HDFS and Hive using Sqoop.
- Experienced in defining job flows.
- Created Hive tables populated from the relevant EDW tables.
- Worked on processing unstructured data using Pig and Hive.
- Developed and deployed shell and Python scripts for automation, notification, and monitoring.
- Managed scalable Hadoop clusters, including cluster design, provisioning, custom configuration, monitoring, and maintenance on the Hortonworks HDP distribution.
- Used RDDs to perform transformations on datasets as well as actions such as count, reduce, and first.
- Extracted and processed log data in HDFS using Flume.
- Checkpointed RDDs to disk to handle job failures and aid debugging (see the sketch after this list).
- Developed Spark SQL jobs to load tables into HDFS and run select queries on top of them.
- Knowledge in performance troubleshooting and tuning Hadoop clusters.
- Responsible to manage data coming from different sources.
- Gained good experience with NoSQL databases such as MongoDB.
- Supported MapReduce programs running on the cluster.
- Involved in loading data from the UNIX file system into HDFS.
- Installed and configured Hive and wrote Hive UDFs.
- Involved in creating Hive tables, loading them with data, and writing Hive queries that run internally as MapReduce jobs.
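
The RDD work above combined transformations, actions, and disk checkpoints. A small Scala sketch of that pattern, under hypothetical paths:

```scala
import org.apache.spark.sql.SparkSession

object RddCheckpointDemo {
  def main(args: Array[String]): Unit = {
    val sc = SparkSession.builder.appName("RddCheckpointDemo").getOrCreate().sparkContext

    // Reliable checkpoints go to HDFS so a restarted job can cut long lineage chains.
    sc.setCheckpointDir("hdfs:///tmp/checkpoints")        // hypothetical directory

    val lines  = sc.textFile("hdfs:///data/app/web.log")  // hypothetical log path
    val errors = lines.filter(_.contains("ERROR"))

    errors.cache()
    errors.checkpoint() // materialized (and lineage truncated) on the first action below

    // Actions: count, reduce, first.
    val total = errors.count()
    if (total > 0) {
      val longest = errors.reduce((a, b) => if (a.length >= b.length) a else b)
      println(s"errors=$total, first=${errors.first()}, longest=${longest.take(80)}")
    }
  }
}
```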
Confidential, Charlotte, NC
Hadoop Consultant
Responsibilities:
- Worked on analyzing the Hadoop cluster and different big data analytic tools, including Pig and the HBase NoSQL database.
- Imported and exported data in HDFS and Hive using MapReduce.
- Extracted BSON files from MongoDB, placed them in HDFS, and processed them.
- Designed and developed MapReduce jobs to process data arriving in BSON format.
- Experience with NoSQL databases.
- Extracted and processed log data in HDFS using Flume.
- Developed MapReduce programs to parse the raw data, populate staging tables and store the refined data in partitioned tables in the EDW.
- Wrote Hive UDFs to extract data from staging tables (see the sketch after this list).
- Involved in creating Hive tables, loading with data.
- Good understanding of Storm for reading log files.
- Hands-on experience writing MapReduce code to turn unstructured data into structured data and insert it into HBase from HDFS.
- Managed scalable Hadoop clusters, including cluster design, provisioning, custom configuration, monitoring, and maintenance on the Hortonworks HDP distribution.
- Experience creating integrations between Hive and HBase.
- Used Oozie scheduler to submit workflows.
- Reviewed QA test cases with the QA team.
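
A classic Hive UDF is any JVM class extending org.apache.hadoop.hive.ql.exec.UDF with an evaluate method, so it can be written in Scala as easily as Java. The sketch below is a hypothetical example (extracting the domain from an e-mail column), not one of the project's actual UDFs:

```scala
import org.apache.hadoop.hive.ql.exec.UDF
import org.apache.hadoop.io.Text

// Hive locates `evaluate` by reflection, so no interface beyond UDF is needed.
class ExtractDomain extends UDF {
  def evaluate(email: Text): Text = {
    if (email == null) return null
    val s  = email.toString
    val at = s.lastIndexOf('@')
    if (at < 0 || at == s.length - 1) null
    else new Text(s.substring(at + 1).toLowerCase)
  }
}

// Registered in Hive after packaging into a jar (hypothetical names):
//   ADD JAR hdfs:///udfs/extract-domain.jar;
//   CREATE TEMPORARY FUNCTION extract_domain AS 'ExtractDomain';
//   SELECT extract_domain(email) FROM staging.users;
```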
Confidential, Jacksonville, FL
Java Developer
Responsibilities:
- Involved in Analysis, Design, Coding and Development of custom Interfaces.
- Assisted in proposing suitable UML class diagrams for the project.
- Wrote SQL scripts to create and maintain the database, roles, users, tables, views, procedures and triggers in Oracle.
- Designed and implemented the UI using HTML and Java.
- Worked on database interaction layer for insertions, updating and retrieval operations on data.
- Coordinated & Communicated with onsite resources regarding issues rose in production environment and used to fix day to day issues.
- Looked after Release Management & code reviews.
- Partly used Hibernate EJB and Web Services.
- Involved in developing build file for the project.
- Involvement in all Payday Transactions Issue Fixes and Enhancements.
- Supported with UAT, Pre-Prod and Production Build management.
- Involved in the analysis of Safe/Drawer Transactions, Loan deposit modules and development of Collection Letters.
- Coordination with team for Fixes and Releases.
- Involved in all Title transactions and the printing of related documents.
- Applied CSS for a consistent look and feel across pages in the application.
- Worked on web pages and business objects using JavaScript and XML in a mixed ASP.NET environment.
- Used JavaScript functions to implement complex business rules and validation of front end forms.
- Experience with Change Management /Change Control Boards.
- Performed Microsoft SQL Server database analysis, integration, and reporting.
Confidential, Boise, ID
Java Developer
Responsibilities:
- Actively involved in all phases of SDLC and followed agile methodology throughout the project.
- Used Java/J2EE patterns such as Model View Controller (MVC), Business Delegate, Session Façade, Service Locator, Data Transfer Object, Data Access Object, Singleton, and Factory.
- Involved in the design and development of the business tier using service beans (stateless EJBs) and JavaBeans, DAO stored procedures, and a data access layer using JDBC and Hibernate.
- Used Spring Framework for DI, integrated with the Struts Framework and Hibernate.
- Used iBatis as an SQL Mapping tool to store the persistent data and to communicate with Oracle.
- Used ESB in developing enterprise applications.
- Created JSF Custom Components and configured managed beans to meet the requirement.
- Worked on the development and integration of all applications in the project, including the WebSphere Portal presentation layer and the services layer containing processes.
- Designed and developed SOAP web services, making extensive use of WSDL and the IBM RSA IDE.
- Used JUnit, JTest, and Struts test cases for testing the application modules, and Log4j for logging.
- Configured and integrated IBM WebSphere Application Server and MQSeries.
- Developed MDBs for receiving and processing data from WebSphere MQ.
- Used Ant scripts to build and deploy the application on IBM WebSphere Application Server.
Confidential, Houston, TX
Java Developer
Responsibilities:
- Involved in Analysis, Design, Coding and Development of custom Interfaces.
- Involved in the feasibility study of the project.
- Gathered requirements from the client for designing the Web Pages.
- Gathered specifications for the Library site from different departments and users of the services.