Hadoop Developer Resume
Omaha
PROFESSIONAL SUMMARY:
- 7+ years of experience in the Software Development Life Cycle (SDLC) and Agile methodology, covering analysis, design, development, testing, implementation, and maintenance in Hadoop, data warehousing, Linux, and Java.
- 5 years of experience providing Big Data solutions using Hadoop 2.x, HDFS, MR2, YARN, Kafka, Pig, Hive, Impala, Sqoop, HBase, Cassandra, Cloudera Manager, Hortonworks, Zookeeper, Oozie, and Hue.
- Experienced in building highly scalable Big Data solutions using Hadoop across multiple distributions (Cloudera, Hortonworks) and NoSQL platforms (HBase).
- Implemented Big Data batch processes using Hadoop, MapReduce, YARN, Pig, and Hive.
- Experience in importing and exporting data using Sqoop from HDFS/Hive/HBase to Relational Database Systems and vice-versa.
- Worked extensively on the Hortonworks HDP and HDF platforms.
- Hands-on experience in in-memory data processing with Apache Spark using Scala and Python.
- Developed Spark scripts using Scala shell commands as per requirements (see the Spark sketch after this list).
- Interacted with clients to understand their business problems related to Big Data, cloud computing, and NoSQL (HBase, Cassandra) technologies.
- Experienced in using Kafka as a distributed publisher-subscriber messaging system.
- Good experience in writing Pig scripts and Hive Queries for processing and analyzing large volumes of data.
- Experience in optimizing MapReduce jobs using combiners and partitioners to deliver the best results.
- Used GitHub for continuous integration services.
- Experienced in designing, developing, and implementing connectivity products that allow efficient exchange of data between the core database engine and the Hadoop ecosystem.
- Extended Hive and Pig core functionality by writing custom UDFs (a UDF sketch follows this list).
- Good knowledge of Amazon AWS services like Redshift, EMR, and EC2, which provide fast and efficient processing of Big Data.
- Hands-on experience using BI and data-integration tools like Tableau and Informatica.
- Involved in predictive modeling of customer data after cleansing it using Big Data technologies.
- Experience in understanding Hadoop security requirements and integrating with Kerberos authentication and authorization infrastructure.
- Strong scripting skills in Python and Unix shell.
- Involved in creating a data warehouse of the transformed data using RDBMS services like MS SQL Server.
- Worked on PySpark APIs for data transformations
- Used SOAP and RESTful web services with HTML, XML, JSON, JavaScript, and jQuery.
- Involved in reviewing MVC JavaScript frameworks like AngularJS.
- Experience in managing and reviewing Hadoop log files.
- Good knowledge of using Apache NiFi to automate data movement between different Hadoop systems.
- Hands on experience in application development using RDBMS and Linux shell scripting.
- Good working experience in Agile/Scrum methodologies, including technical discussions with clients and daily scrum calls covering project analysis, specs, and development.
- Ability to work independently as well as in a team and able to effectively communicate with customers, peers and management at all levels in and outside the organization.
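A minimal sketch, in Scala, of the kind of Spark script referenced in this summary; the application name, paths, and column names are illustrative placeholders, not details from any specific engagement.

```scala
import org.apache.spark.sql.SparkSession
import org.apache.spark.sql.functions.upper

object CustomerCleanup {
  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder().appName("CustomerCleanup").getOrCreate()
    import spark.implicits._

    // Hypothetical landing-zone path and schema
    val raw = spark.read.option("header", "true").csv("/data/raw/customers")

    // In-memory transformation: drop incomplete rows, normalize a column, cache for reuse
    val cleaned = raw
      .filter($"customer_id".isNotNull)
      .withColumn("state", upper($"state"))
      .cache()

    cleaned.write.mode("overwrite").parquet("/data/curated/customers")
    spark.stop()
  }
}
```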
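And a minimal sketch of a custom Hive UDF of the sort mentioned above, written in Scala against the classic org.apache.hadoop.hive.ql.exec.UDF API; the masking rule and function name are hypothetical.

```scala
import org.apache.hadoop.hive.ql.exec.UDF
import org.apache.hadoop.io.Text

// Hypothetical UDF that masks all but the last four characters of a value,
// e.g. for account numbers: SELECT mask_tail(acct_no) FROM accounts;
class MaskTail extends UDF {
  def evaluate(input: Text): Text = {
    if (input == null) return null
    val s = input.toString
    val visible = 4
    if (s.length <= visible) new Text(s)
    else new Text("*" * (s.length - visible) + s.takeRight(visible))
  }
}
```

Packaged into a JAR, such a function would be registered in Hive with ADD JAR and CREATE TEMPORARY FUNCTION mask_tail AS 'MaskTail'.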
TECHNICAL SKILLS:
Hadoop Ecosystem: Hadoop, MapReduce, Sqoop, Hive (HCatalog), Oozie, Pig, HDFS, Zookeeper, Flume, Spark, Kafka
NoSQL Databases: HBase
Java & J2EE Technologies: Core Java, Servlets, JSP, JDBC, JNDI, Java Beans.
Languages: C, C++, Java, Scala, SQL, PL/SQL, Pig Latin, HiveQL, Unix shell scripting.
Databases: Oracle 11g/10g/9i, MySQL, DB2, MS SQL Server.
Application Servers: Apache Tomcat, JBoss, IBM WebSphere, WebLogic.
Web Services: WSDL, SOAP, Apache CXF, Apache Axis, REST.
Methodologies: Scrum, Agile, Waterfall.
PROFESSIONAL EXPERIENCE:
Confidential, Omaha
Hadoop Developer
Responsibilities:
- Working in Agile, successfully completed stories related to the ingestion, transformation, and publication of data on time.
- Architected and created a data lake from different source systems such as RDBMS and Teradata.
- Used Hortonworks for code development, building data pipelines with HDP and HDF.
- Created external and internal tables on the Hadoop data lake to manage upstream and downstream data flow.
- Developed complex HiveQL queries using the JSON SerDe.
- Involved in converting Hive/SQL queries into Spark transformations using Spark RDDs and Scala (see the conversion sketch after this list).
- Worked on PySpark APIs for data transformations.
- Worked with the JSON, Parquet, and ORC Hadoop file formats (illustrated in the format sketch after this list).
- Worked extensively on creating Sqoop jobs to manage data flow into the data lake.
- Involved in importing real-time data into Hadoop using Kafka and implemented Oozie jobs for daily imports.
- As part of support, responsible for troubleshooting MapReduce jobs, Pig jobs, and Hive queries.
- Worked on performance tuning of Hive & Pig Jobs.
- Performed various optimizations on Hive to improve query efficiency.
- Using Hive for ETL jobs and cleaning the data as per requirements.
- Implemented ETL processes in DataStage to load a data warehouse.
- Performed data compilation and manipulation using the APIs created.
- Worked on HBase to generate additional columns for the data lake zones.
- Wrote Apache Spark Streaming applications on the Big Data distribution in the active cluster environment.
- Implemented Spark using Scala and Spark SQL for faster testing and processing of data.
- Experience with the software problem resolution process (identification, diagnosis and resolution).
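A hedged illustration of the Hive-to-Spark conversion work noted above: the same aggregation expressed once as HiveQL (in the comment) and once as Spark DataFrame transformations. The table and column names are placeholders.

```scala
import org.apache.spark.sql.SparkSession
import org.apache.spark.sql.functions._

val spark = SparkSession.builder()
  .appName("HiveToSpark")
  .enableHiveSupport()
  .getOrCreate()

// Original HiveQL (placeholder table/columns):
//   SELECT region, COUNT(*) AS cnt FROM datalake.orders
//   WHERE order_status = 'OPEN' GROUP BY region;

// Equivalent Spark transformations over the same Hive table
val result = spark.table("datalake.orders")
  .filter(col("order_status") === "OPEN")
  .groupBy("region")
  .agg(count("*").as("cnt"))

result.show()
```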
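And a short sketch of moving data across the file formats listed above, reading JSON and writing Parquet and ORC; the paths are illustrative.

```scala
import org.apache.spark.sql.SparkSession

val spark = SparkSession.builder().appName("FormatConversion").getOrCreate()

// Read newline-delimited JSON from a landing zone (hypothetical path)
val events = spark.read.json("/datalake/landing/events")

// Write the same data as columnar Parquet and ORC for downstream consumers
events.write.mode("overwrite").parquet("/datalake/curated/events_parquet")
events.write.mode("overwrite").orc("/datalake/curated/events_orc")
```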
Confidential, Plano, TX
Hadoop Developer
Roles & Responsibilities:
- Working in Agile, successfully completed stories related to the ingestion, transformation, and publication of data on time.
- Expert in implementing advanced procedures like text analytics and processing using the in-memory computing capabilities of Apache Spark, written in Scala.
- Hands-on experience in in-memory data processing with Apache Spark using Scala and Python.
- Ingested data sets from different databases and servers using the Sqoop import tool and the MFT (Managed File Transfer) inbound process.
- Designed and implemented large-scale pub-sub message queues using Apache Kafka (see the producer sketch at the end of this section).
- Developed Spark scripts using Scala shell commands as per requirements.
- Used Cloudera Manager and Hortonworks for code development, building data pipelines with HDP and HDF.
- Used Spark Streaming to consume topics from the distributed messaging source Kafka and periodically push batches of data to Spark for real-time processing (see the streaming sketch at the end of this section).
- Experience building a data lake for the Claims Initiation and Updates process.
- Supported several clients updating the database at the same time by arranging the process with a queue in the Profits Analyzer (a Hadoop and Java program).
- Used Apache NiFi to copy data from the local file system to HDFS.
- Performed design and object modeling using UML (use cases, test cases, sequence and class diagrams) and Unix shell scripting.
- Used GitHub for continuous integration services.
- Developed complex HiveQL queries using the JSON SerDe.
- Experience working in an AWS environment.
- Involved in converting Hive/SQL queries into Spark transformations using Spark RDDs and Scala.
- Worked on PySpark APIs for data transformations.
- Worked with the JSON, Parquet, and other Hadoop file formats.
- Involved in importing real-time data into Hadoop using Kafka and implemented Oozie jobs for daily imports.
- As part of support, responsible for troubleshooting MapReduce jobs, Pig jobs, and Hive queries.
- Using Hive for ETL jobs and cleaning the data as per requirements.
- Implemented ETL processes in DataStage to load a data warehouse.
- Performed data compilation and manipulation using the APIs created.
- Wrote Apache Spark Streaming applications on the Big Data distribution in the active cluster environment.
- Implemented Spark using Scala and Spark SQL for faster testing and processing of data.
Environment: Agile Scrum, MapReduce, Hive, Pig, Sqoop, Spark, Scala, MFT, Oozie, Flume, Java, ETL, SQL Server, RDBMS, CentOS, UNIX, Linux, Cloudera CDH4, CDH5, Hortonworks, C++.
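A minimal Scala sketch of the Kafka publisher side of the pub-sub setup described in this section; the broker address, topic name, key, and payload are hypothetical.

```scala
import java.util.Properties
import org.apache.kafka.clients.producer.{KafkaProducer, ProducerRecord}

val props = new Properties()
props.put("bootstrap.servers", "broker1:9092") // hypothetical broker
props.put("key.serializer", "org.apache.kafka.common.serialization.StringSerializer")
props.put("value.serializer", "org.apache.kafka.common.serialization.StringSerializer")

val producer = new KafkaProducer[String, String](props)
// Publish a claim-update event to a hypothetical topic
producer.send(new ProducerRecord("claims-updates", "claim-123", """{"status":"initiated"}"""))
producer.close()
```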
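And a sketch of the Spark Streaming consumer pattern from this section, assuming the spark-streaming-kafka-0-10 integration; the topic, group id, batch interval, and output path are placeholders.

```scala
import org.apache.kafka.common.serialization.StringDeserializer
import org.apache.spark.SparkConf
import org.apache.spark.streaming.{Seconds, StreamingContext}
import org.apache.spark.streaming.kafka010.KafkaUtils
import org.apache.spark.streaming.kafka010.LocationStrategies.PreferConsistent
import org.apache.spark.streaming.kafka010.ConsumerStrategies.Subscribe

val conf = new SparkConf().setAppName("KafkaToHdfs")
val ssc = new StreamingContext(conf, Seconds(60)) // hypothetical micro-batch interval

val kafkaParams = Map[String, Object](
  "bootstrap.servers" -> "broker1:9092", // hypothetical broker
  "key.deserializer" -> classOf[StringDeserializer],
  "value.deserializer" -> classOf[StringDeserializer],
  "group.id" -> "claims-stream",
  "auto.offset.reset" -> "latest"
)

val stream = KafkaUtils.createDirectStream[String, String](
  ssc, PreferConsistent, Subscribe[String, String](Seq("claims-updates"), kafkaParams))

// Push each micro-batch of message values to HDFS (placeholder path)
stream.map(_.value).foreachRDD { rdd =>
  if (!rdd.isEmpty()) rdd.saveAsTextFile(s"/datalake/landing/claims/${System.currentTimeMillis}")
}

ssc.start()
ssc.awaitTermination()
```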
Confidential, Reston, VA
Big data cloud Engineer
Roles & Responsibilities:
- Worked on analyzing the Hadoop cluster using different big data analytic tools including Kafka, Pig, Hive, and MapReduce.
- Developed parser and loader MapReduce applications to retrieve data from HDFS and store it in HBase and Hive.
- Imported data from MySQL and Oracle into HDFS using Sqoop.
- Imported unstructured data into HDFS using Flume.
- Wrote MapReduce Java programs to analyze log data for large-scale data sets (a sketch follows at the end of this section).
- Involved in creating Hive (HCatalog) tables, and loading and analyzing data using Hive queries.
- Worked hands-on with the ETL process and was involved in developing Hive scripts for the extraction, transformation, and loading of data into other data warehouses.
- Used Hive/Impala join queries to join multiple tables of a source system and load them into Elasticsearch tables (see the Elasticsearch sketch at the end of this section).
- Using Hive for ETL jobs and cleaning the data as per requirements.
- Used Sqoop to export data into the RDBMS.
- Involved in creating a data warehouse of the transformed data using RDBMS services like MS SQL Server.
- Performed application development against the Cloudera Manager and Hortonworks consoles.
- Involved in running ad-hoc queries using Pig Latin, Hive, or Java MapReduce.
- Implemented Spark using Scala and Spark SQL for faster testing and processing of data.
- Implemented Elasticsearch to decrease query times and increase search capabilities.
- Used the Spark API over Hortonworks Hadoop YARN to perform analytics on data in Hive.
- Responsible for continuous monitoring and management of the Elastic MapReduce cluster through the AWS console.
- Installed and configured Apache services and supported them on Linux production servers.
- Configured Spark Streaming to receive real-time data from Kafka and store the streamed data to HDFS using Scala.
- Involved in writing UNIX shell and Perl scripts to automate deployments to the application server.
- Worked with NoSQL databases like Cassandra.
Environment: Hadoop 1.0.0, Oracle 11g/10g, Python, Hortonworks, MapReduce, Hive, HBase, Flume, Sqoop, Pig, Zookeeper, Java, ETL, SQL Server, RDBMS, CentOS, UNIX, Linux, Cloudera Manager, CDH3, C++.
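A compact sketch of the log-analysis MapReduce pattern described in this section; the original programs were in Java, but it is expressed here in Scala for consistency with the other sketches, and the log layout (severity level as the first token of each line) is an assumption.

```scala
import org.apache.hadoop.conf.Configuration
import org.apache.hadoop.fs.Path
import org.apache.hadoop.io.{IntWritable, LongWritable, Text}
import org.apache.hadoop.mapreduce.{Job, Mapper, Reducer}
import org.apache.hadoop.mapreduce.lib.input.FileInputFormat
import org.apache.hadoop.mapreduce.lib.output.FileOutputFormat
import scala.jdk.CollectionConverters._

// Counts log lines per severity level (hypothetical layout: "LEVEL message...")
class LevelMapper extends Mapper[LongWritable, Text, Text, IntWritable] {
  private val one = new IntWritable(1)
  private val level = new Text()
  override def map(key: LongWritable, value: Text,
                   ctx: Mapper[LongWritable, Text, Text, IntWritable]#Context): Unit = {
    val fields = value.toString.split("\\s+", 2)
    if (fields.nonEmpty) { level.set(fields(0)); ctx.write(level, one) }
  }
}

class LevelReducer extends Reducer[Text, IntWritable, Text, IntWritable] {
  override def reduce(key: Text, values: java.lang.Iterable[IntWritable],
                      ctx: Reducer[Text, IntWritable, Text, IntWritable]#Context): Unit = {
    val sum = values.asScala.map(_.get).sum
    ctx.write(key, new IntWritable(sum))
  }
}

object LogLevelCount {
  def main(args: Array[String]): Unit = {
    val job = Job.getInstance(new Configuration(), "log-level-count")
    job.setJarByClass(getClass)
    job.setMapperClass(classOf[LevelMapper])
    job.setCombinerClass(classOf[LevelReducer]) // combiner reuses the reducer
    job.setReducerClass(classOf[LevelReducer])
    job.setOutputKeyClass(classOf[Text])
    job.setOutputValueClass(classOf[IntWritable])
    FileInputFormat.addInputPath(job, new Path(args(0)))
    FileOutputFormat.setOutputPath(job, new Path(args(1)))
    System.exit(if (job.waitForCompletion(true)) 0 else 1)
  }
}
```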
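And a sketch of the Hive-join-to-Elasticsearch load described above, assuming the elasticsearch-hadoop (elasticsearch-spark) connector is on the classpath; the endpoint, table, and index names are all hypothetical.

```scala
import org.apache.spark.sql.SparkSession
import org.elasticsearch.spark.sql._

val spark = SparkSession.builder()
  .appName("HiveToEs")
  .enableHiveSupport()
  .config("es.nodes", "es-host:9200") // hypothetical Elasticsearch endpoint
  .getOrCreate()

// Join two source-system Hive tables (placeholder names), as in the Hive/Impala work above
val joined = spark.sql(
  """SELECT c.customer_id, c.name, o.order_id, o.amount
    |FROM src.customers c
    |JOIN src.orders o ON c.customer_id = o.customer_id""".stripMargin)

// Index the joined rows into Elasticsearch for faster search
joined.saveToEs("customer_orders") // index name is illustrative
```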
Confidential, Virginia
Hadoop Developer
Roles & Responsibilities:
- Worked on analyzing the Hadoop cluster using different big data analytic tools including Pig, HBase, and Sqoop.
- Implemented business logic using Struts action components within the Struts and Hibernate frameworks.
- Migrated the needed data from MySQL into HDFS using Sqoop and imported various formats of unstructured data from logs into HDFS using Flume.
- Used multithreading to invoke the database and implemented complex modules containing business logic using the Collections, Reflection, and Generics APIs.
- Involved in Pig Latin programming.
- As part of support, responsible for troubleshooting MapReduce jobs, Pig jobs, and Hive queries.
- Imported and exported data between MySQL/Oracle and Hive using Sqoop.
- Experienced in analyzing data with Hive and Pig.
- Responsible for operational support of the production system.
- Loaded log data directly into HDFS using Flume.
- Developed a Message Handler Adapter that converts data objects into XML messages and invokes an enterprise service, and vice versa, using Java, JMS, and MQ Series (see the JMS sketch below).
Environment: Apache Hadoop, HDFS, Java MapReduce, Eclipse, Hive, Pig, Sqoop, Flume, Oozie, Java/J2EE, Oracle 10g, SQL, PL/SQL, JSP, EJB, Struts, Hibernate, WebLogic 8.0, HTML, AJAX, JavaScript, JDBC, XML, JMS.
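A compact Scala sketch of the JMS send path behind the message-handler adapter described in this section, against the standard javax.jms API; the JNDI names and XML payload are hypothetical.

```scala
import javax.jms.{ConnectionFactory, Queue, Session, TextMessage}
import javax.naming.InitialContext

// JNDI lookups assume a configured provider (names are hypothetical)
val ctx = new InitialContext()
val cf = ctx.lookup("jms/ConnectionFactory").asInstanceOf[ConnectionFactory]
val queue = ctx.lookup("jms/HandlerQueue").asInstanceOf[Queue]

val conn = cf.createConnection()
val session = conn.createSession(false, Session.AUTO_ACKNOWLEDGE)
val producer = session.createProducer(queue)

// Data object serialized to XML before being handed to the enterprise service
val xml = "<order><id>42</id><status>NEW</status></order>"
val msg: TextMessage = session.createTextMessage(xml)
producer.send(msg)

producer.close(); session.close(); conn.close()
```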
Confidential
Java Developer
Roles & Responsibilities:
- Worked on both WebLogic Portal 9.2 for Portal development and WebLogic 8.1 for Data Services Programming.
- Worked on creating EJBs that implemented business logic.
- Developed the presentation layer using JSP, HTML, CSS and client validations using JavaScript.
- Involved in designing and development of the ecommerce site using JSP, Servlet, EJBs, JavaScript and JDBC.
- Used SOAP and RESTful web services with HTML, XML, JSON, JavaScript, and jQuery.
- Involved in reviewing MVC JavaScript frameworks like AngularJS.
- Used Eclipse 6.0 as IDE for application development.
- Validated all forms using Struts validation framework and implemented Tiles framework in the presentation layer.
- Configured Struts framework to implement MVC design patterns.
- Designed and developed GUI using JSP, HTML, DHTML and CSS.
- Worked with JMS for messaging interface.
Environment: Java, J2EE, HTML, DHTML, CSS, JavaScript, JSP, Servlets, XML, EJB, Struts, WebLogic 8.1, SQL Server 2008 R2, UNIX, Linux, Windows 7/Vista/XP.
