Sr. Big Data/Hadoop Developer Resume
Dallas, TX
SUMMARY:
- 9+ years of strong experience in software development using Big Data, Hadoop, Apache Spark, Java/J2EE, Scala, and Python technologies
- Strong understanding of Software Development Lifecycle (SDLC)
- Experience in dealing with log files to extract data and copy it into HDFS using Flume.
- Developed test classes using MRUnit to validate MapReduce input and output
- Installing, configuring, and managing Hadoop clusters, THOR, and data science tools.
- Managing the Hadoop distribution with Cloudera Manager, Cloudera Navigator, and Hue.
- Good knowledge and understanding of Hadoop architecture and various components of the Hadoop ecosystem: HDFS, MapReduce, Pig, Sqoop, and Hive.
- Hands-on experience in developing MapReduce programs using Apache Hadoop for analyzing Big Data
- Experience administering and configuring NoSQL databases such as Cassandra and MongoDB.
- Experience handling Kafka clusters; created several topologies to support real-time processing requirements.
- Good working knowledge of creating Hive tables and using HiveQL for data analysis to meet business requirements.
- Experience working with Java, J2EE, JDBC, ODBC, JSP, Java Eclipse, and MS SQL Server.
- Extensive experience with SQL, PL/SQL and database concepts.
- Expertise in debugging and performance tuning of Oracle and Java applications, with strong knowledge of Oracle 11g and SQL.
- Experience working with distributions such as MapR, Hortonworks, and Cloudera.
- Experience working with NoSQL databases such as HBase and MongoDB.
- Experience in managing and reviewing Hadoop log files.
- Experience testing MapReduce programs using MRUnit, JUnit, and EasyMock (see the test sketch after this list)
- Experienced in performing real-time analytics on HDFS using HBase
- Exposure to working with DataFrames and optimizing SLAs.
- Developed custom MapReduce programs for data analysis and data cleaning using Pig Latin scripts.
- Good understanding of end-to-end content lifecycle, web content management, content publishing/deployment, and delivery processes.
- Hands-on experience with build tools like ANT and Maven
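Below is a minimal sketch of the kind of MRUnit-based MapReduce test referenced above; the mapper, its log format, and the event-counting rule are hypothetical illustrations, not code from any project listed here.

    import java.io.IOException;
    import org.apache.hadoop.io.IntWritable;
    import org.apache.hadoop.io.LongWritable;
    import org.apache.hadoop.io.Text;
    import org.apache.hadoop.mapreduce.Mapper;
    import org.apache.mrunit.mapreduce.MapDriver;
    import org.junit.Test;

    public class LogEventMapperTest {

        // Hypothetical mapper: emits (eventType, 1) for each log line
        public static class LogEventMapper extends Mapper<LongWritable, Text, Text, IntWritable> {
            private static final IntWritable ONE = new IntWritable(1);

            @Override
            protected void map(LongWritable key, Text value, Context ctx)
                    throws IOException, InterruptedException {
                String[] fields = value.toString().split("\\s+");
                if (fields.length > 1) {
                    ctx.write(new Text(fields[1]), ONE);  // second token is the event type
                }
            }
        }

        @Test
        public void emitsOneCountPerEvent() throws IOException {
            MapDriver.newMapDriver(new LogEventMapper())
                     .withInput(new LongWritable(0), new Text("2016-01-01 LOGIN user42"))
                     .withOutput(new Text("LOGIN"), new IntWritable(1))
                     .runTest();
        }
    }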
TECHNICAL SKILLS:
Big Data & Hadoop: Hadoop, MapReduce, HDFS, HBase, Hive, Pig, Oozie, Sqoop, Spark, Impala, ZooKeeper, Flume, Kafka
Programming Languages: Java JDK 1.7/1.8, SQL, PL/SQL
Java/J2EE Technologies: Servlets, JSP, JSTL, JDBC, JMS, JNDI, RMI, EJB, JFC/Swing, AWT, Applets, Multi-threading, Java Networking
Frameworks: Struts 2.x/1.x, Spring 2.x, Hibernate 3.x
IDEs: Eclipse 3.x, IntelliJ
Web technologies: JSP, JavaScript, jQuery, AJAX, XML, XSLT, HTML, DHTML, CSS
Web Services: SOAP, REST, WSDL
XML Tools: JAXB, Apache Axis, Altova XMLSpy
Methodologies: Agile, Scrum, RUP, TDD, OOAD, SDLC
Modeling Tools: UML, Visio
Testing technologies/tools: JUnit
Database Servers: Oracle 8i/9i/10g, DB2, SQL Server 2000/2005/2008, MySQL
Version Control: CVS, SVN
Build Tools: ANT, Maven
Platforms: Windows 2000/98/95/NT4.0, UNIX
PROFESSIONAL EXPERIENCE:
Confidential, Dallas, TX
Sr. Big Data/Hadoop Developer
Responsibilities:
- Gathering business requirements from the Business Partners and Subject Matter Experts.
- Used ZooKeeper and THOR to provide coordination services for the cluster
- Documented the requirements, including the available code to be implemented using Spark, Hive, HDFS, and SOLR.
- Experience in using ZooKeeper technologies.
- Expertise in integrating Kafka with Spark Streaming for high-speed data processing (see the sketch after this list).
- Used various Spark Transformations and Actions for cleansing the input data
- Load and transform large sets of structured, semi structured and unstructured data.
- Scheduled the Oozie workflow engine to run multiple Hive and Pig jobs, which run independently based on time and data availability
- Used Sqoop tool to load data from RDBMS into HDFS.
- Involved in loading data from UNIX file system to HDFS
- Work on code reviews, Bug fixes and documentation
- Developed Pig UDFs to pre-process the data for analysis
- Created producer, consumer, and ZooKeeper setup for Kafka replication
- Good knowledge of writing shell, Python, and Perl scripts on Linux.
- Experience in importing and exporting data between HDFS and relational database systems (RDBMS) such as Teradata using Sqoop.
- Hands-on knowledge of writing code in Scala
- Hands-on experience in large-scale data processing using Spark
- Responsible for developing data pipeline by implementing Kafka producers and consumers and configuring brokers
- Familiar with installation of Oozie workflow engine to run multiple Hive and Pig jobs that run independently with time and data availability.
- Streamed data in real time using Spark with Kafka
- Used Pig UDFs written in Python and Java, and applied sampling to large data sets.
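Below is a minimal sketch of Kafka-to-Spark Streaming integration of the kind described above, using the spark-streaming-kafka-0-10 direct stream API; the topic, broker address, and consumer group are hypothetical.

    import java.util.Collections;
    import java.util.HashMap;
    import java.util.Map;
    import org.apache.kafka.common.serialization.StringDeserializer;
    import org.apache.spark.SparkConf;
    import org.apache.spark.streaming.Durations;
    import org.apache.spark.streaming.api.java.JavaStreamingContext;
    import org.apache.spark.streaming.kafka010.ConsumerStrategies;
    import org.apache.spark.streaming.kafka010.KafkaUtils;
    import org.apache.spark.streaming.kafka010.LocationStrategies;

    public class ClickStreamJob {
        public static void main(String[] args) throws InterruptedException {
            SparkConf conf = new SparkConf().setAppName("ClickStreamJob");
            JavaStreamingContext jssc = new JavaStreamingContext(conf, Durations.seconds(10));

            Map<String, Object> kafkaParams = new HashMap<>();
            kafkaParams.put("bootstrap.servers", "broker1:9092");  // hypothetical broker
            kafkaParams.put("key.deserializer", StringDeserializer.class);
            kafkaParams.put("value.deserializer", StringDeserializer.class);
            kafkaParams.put("group.id", "clickstream-group");      // hypothetical group

            // Pull records from a hypothetical "clicks" topic and count them per batch
            KafkaUtils.createDirectStream(jssc,
                    LocationStrategies.PreferConsistent(),
                    ConsumerStrategies.<String, String>Subscribe(
                            Collections.singletonList("clicks"), kafkaParams))
                .map(record -> record.value())
                .count()
                .print();

            jssc.start();
            jssc.awaitTermination();
        }
    }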
Environment: Java, Hortonworks, Hadoop, HDFS, Hive, Tez, Pig, Sqoop, Hue, HBase, Kafka, Storm, Oozie, ZooKeeper, YARN, MapReduce, HCatalog, Avro, Parquet, Tableau, JSP, Oracle, Teradata, SQL, Log4j, RAD, WebSphere, Eclipse, AJAX, JavaScript, jQuery, CSS3, SVN, PuTTY, FTP, Linux, Cron, Shell Scripting and SQL Developer.
Confidential, Franklin Lakes, NJ
Sr. Big Data/Hadoop Developer
Responsibilities:
- Installed and configured HDFS, Hadoop Map Reduce, developed various Map Reduce jobs in Java for data cleaning and preprocessing.
- Analyzed various RDDs using Scala and Python with Spark.
- Optimized MapReduce jobs to use HDFS efficiently by using various compression mechanisms.
- Experience in implementing Spark RDDs in Scala (see the sketch after this list).
- Involved in data ingestion into HDFS using Sqoop and Flume from a variety of sources.
- Responsible for managing data from various sources.
- Worked on the conversion of existing MapReduce batch applications for better performance.
- Performed big data analysis using Pig and user-defined functions (UDFs).
- Worked with both external and managed Hive tables for optimized performance.
- Developed Hive scripts to meet analysts' requirements.
- Stored, processed, and analyzed huge data sets to extract valuable insights.
- Extensively used Pig to communicate with Hive via HCatalog and with HBase via storage handlers.
- Familiarity with NoSQL databases such as MongoDB and Cassandra.
- Developed Pig Latin scripts to extract the data from the web server output files to load into HDFS.
- Analyzed customer data by performing Hive queries to understand user behavior.
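Below is a minimal sketch of the RDD cleansing and aggregation work described above, shown with Spark's Java API to keep all examples in this resume in one language (the project itself used Scala and Python); the input path and CSV layout are hypothetical.

    import org.apache.spark.SparkConf;
    import org.apache.spark.api.java.JavaPairRDD;
    import org.apache.spark.api.java.JavaRDD;
    import org.apache.spark.api.java.JavaSparkContext;
    import scala.Tuple2;

    public class OrderCleanser {
        public static void main(String[] args) {
            JavaSparkContext sc = new JavaSparkContext(new SparkConf().setAppName("OrderCleanser"));

            // Hypothetical input: CSV lines of the form orderId,customerId,amount
            JavaRDD<String> lines = sc.textFile("hdfs:///data/orders");

            JavaPairRDD<String, Double> totalsByCustomer = lines
                .filter(line -> !line.isEmpty() && line.split(",").length == 3)  // drop malformed rows
                .mapToPair(line -> {
                    String[] f = line.split(",");
                    return new Tuple2<>(f[1], Double.parseDouble(f[2]));
                })
                .reduceByKey(Double::sum);  // total amount per customer

            totalsByCustomer.saveAsTextFile("hdfs:///data/orders_cleaned");
            sc.stop();
        }
    }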
Environment: HDFS, Pig, Hive, HBase, Sqoop, Spark, Oozie, Flume, Kafka, Linux Shell Scripting, Java, J2EE, JSP, JSF, Eclipse, Maven, SQL, HTML, XML, XSLT, Oracle, MySQL, Ajax/JavaScript, web services APIs, PuTTY
Confidential, Houston, TX
Big Data/Hadoop Developer
Responsibilities:
- Installed Hadoop and configured multiple nodes on the Cloudera platform.
- Set up and optimized standalone, pseudo-distributed, and fully distributed clusters.
- Developed simple to complex MapReduce streaming jobs
- Used Impala to query the Hadoop data stored in HDFS.
- Manage and review Hadoop log files
- Worked extensively on creating Oozie workflows for scheduling different Hive, MapReduce, and shell script jobs.
- Worked on migrating tables in SQL to Hive using Sqoop.
- Implemented Kafka messaging services to stream large data volumes and insert them into the database.
- Created Hive tables and was involved in writing Hive UDFs and loading data (see the sketch after this list).
- Used ECL scripts in development of HPCC Systems.
- Imported data into HDFS and Hive from other data systems by using Sqoop.
- Installed Oozie Workflow engine to run multiple Hive and Pig Jobs.
- Worked on partitioning and bucketing Hive tables and running the scripts in parallel to reduce their run time
- Configured Hadoop system files to accommodate new data sources and updated the existing Hadoop cluster configuration
- Involved in loading data from UNIX file system to HDFS.
- Analyzed data using Pig Latin, HiveQL, HBase, and custom MapReduce programs in Java
- Wrote SOLR queries for various search documents.
- Involved in gathering business requirements and prepared detailed specifications that follow project guidelines required to develop written programs.
- Actively participated in code reviews, meetings, and resolving technical issues.
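Below is a minimal sketch of a Hive UDF of the kind described above; the class and its normalization rule are hypothetical. Once packaged into a JAR, such a function would be registered in Hive with ADD JAR followed by CREATE TEMPORARY FUNCTION normalize_state AS 'NormalizeState'.

    import org.apache.hadoop.hive.ql.exec.UDF;
    import org.apache.hadoop.io.Text;

    // Hypothetical UDF that normalizes free-text state names to two-letter codes
    public final class NormalizeState extends UDF {
        public Text evaluate(Text input) {
            if (input == null) {
                return null;
            }
            String s = input.toString().trim().toUpperCase();
            if (s.equals("TEXAS")) {
                s = "TX";
            }
            return new Text(s);
        }
    }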
Environment: HDFS, Pig, Hive, HBase, Sqoop, Spark, Oozie, Flume, Kafka, AWS, Linux, Shell Scripting, Java, J2EE, JSP, JSF, Eclipse, Maven, SQL, HTML, XML, XSLT, Oracle, MySQL, Ajax/JavaScript, web services APIs
Confidential, Salt Lake City, Utah
Big Data/Hadoop Developer
Responsibilities:
- Involved in loading data from UNIX file system to HDFS.
- Installed and configured Hadoop Map Reduce, HDFS and Hive, Pig, Sqoop, Flume and Oozie on the Hadoop cluster.
- Hands-on experience writing MapReduce code to convert unstructured data into structured data and to insert data into HBase from HDFS
- Used Pig and Hive in the analysis of data.
- Designing and developing ECL scripts for HPCC Systems.
- Used all of Pig's complex data types (tuples, bags, and maps) for handling data
- Experienced with NoSQL databases and their query interfaces
- Experienced in managing and reviewing Hadoop log files.
- Extracted data from CouchDB through Sqoop, placed it in HDFS, and processed it
- Worked on Kafka to produce streamed data into topics and consume that data.
- Created Data model for Hive tables
- Developed Linux shell scripts for creating reports from Hive data.
- Involved in creating Hive tables, loading data into them, and writing Hive queries to analyze the data.
- Implemented Pig jobs to clean, parse and structure the event data to facilitate effective downstream analysis.
- Built reusable Pig UDFs for business requirements, enabling developers to use them for data parsing and aggregation (see the sketch after this list).
- Created Hive Managed Tables and External tables and loaded the transformed data to those tables.
- Implemented Hive dynamic partitions and buckets depending on the downstream business requirements.
- Used Oozie to automate interdependent Hadoop jobs.
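Below is a minimal sketch of a reusable Pig UDF of the kind described above; the log format and extraction rule are hypothetical. In Pig Latin such a function would be registered with REGISTER and then invoked like any built-in function.

    import java.io.IOException;
    import org.apache.pig.EvalFunc;
    import org.apache.pig.data.Tuple;

    // Hypothetical UDF that extracts the event type from a raw log line
    public class ExtractEventType extends EvalFunc<String> {
        @Override
        public String exec(Tuple input) throws IOException {
            if (input == null || input.size() == 0 || input.get(0) == null) {
                return null;
            }
            String[] fields = input.get(0).toString().split("\\s+");
            return fields.length > 1 ? fields[1] : null;  // second token holds the event type
        }
    }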
Environment: Java 7, Eclipse IDE, Hive, HBase, MapReduce, Oozie, Sqoop, Pig, Spark, Flume, Impala, MySQL, PL/SQL, Kafka, Linux
Confidential, Bridgewater, NJ
Sr. Java Developer
Responsibilities:
- Developed the application under the JEE architecture; designed dynamic, browser-compatible user interfaces using JSP, custom tags, HTML, CSS, and JavaScript.
- Used Spring, Hibernate, and Web Services Frameworks.
- Developed and deployed SOA/web services (SOAP and RESTful) using Eclipse IDE (see the sketch after this list).
- Developed the user interface using JSP, JSP tag libraries, and JavaScript to simplify the complexities of the application.
- Implemented Model-View-Controller (MVC) architecture using the Jakarta Struts framework at the presentation tier.
- Contributed significantly to designing the object model for the project as a senior developer and architect.
- Responsible for development of Business Services.
- Deployed J2EE applications on WebSphere Application Server by building and deploying the EAR file using an ANT script
- Involved in testing and integration of the program at the module level.
- Worked with production support team in debugging and fixing various production issues.
- Used stored procedures and triggers extensively to develop the backend business logic in the Oracle database.
- Involved in performance improvement and bug fixing.
- Involved in code review and designed prototypes
- Responsible for preparing the foundation of web projects by coding specific Java data objects, Java source files, XML files and SQL statements designed for graphical presentation, data manipulation and security.
- Used the Oracle WebLogic application server to deploy the application.
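Below is a minimal sketch of a RESTful endpoint of the kind described above, using standard JAX-RS annotations; the resource, path, and response are hypothetical placeholders.

    import javax.ws.rs.GET;
    import javax.ws.rs.Path;
    import javax.ws.rs.PathParam;
    import javax.ws.rs.Produces;
    import javax.ws.rs.core.MediaType;

    // Hypothetical resource exposing account lookups as a RESTful service
    @Path("/accounts")
    public class AccountResource {

        @GET
        @Path("/{id}")
        @Produces(MediaType.TEXT_PLAIN)
        public String getAccount(@PathParam("id") String id) {
            // In the real service this would delegate to the business-services layer
            return "account:" + id;
        }
    }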
Environment: Java 1.5, JSP, AJAX, XML, Spring 3.0, Hibernate 2.0, Web Services, WebSphere 7.0, JUnit, Oracle 10g, SQL, PL/SQL, Log4j, RAD 7.0/7.5, ClearCase, UNIX, HTML, CSS, JavaScript
Confidential
Java Developer
Responsibilities:
- Worked with the business community to define business requirements and analyze the possible technical solutions.
- Requirement gathering, Business Process flow, Business Process Modeling and Business Analysis.
- Responsible for system analysis, design and development using J2EE architecture.
- Interacted with business analysts for requirement gathering
- Developed and deployed UI-layer logic for sites using JSP, XML, JavaScript, HTML/DHTML, and Ajax
- Developed custom tags for a table utility component
- Used various Java, J2EE APIs including JDBC, XML, Servlets, and JSP.
- Carried out integration testing & acceptance testing
- Involved in Java application testing and maintenance in development and production.
- Participated in the team meetings and discussed enhancements, issues and proposed feasible solutions.
- Involved in various phases of the Software Development Life Cycle (SDLC), including design, development, and unit testing.
- Involved in mentoring specific projects in applying the new SDLC based on the Agile Unified Process, especially from the project management, requirements, and architecture perspectives.
- Designed and developed view, model, and controller components implementing MVC
- Designed, developed, and documented stored procedures and functions for the project, and reviewed the code to check for errors (see the sketch after this list).
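Below is a minimal sketch of calling a stored procedure over JDBC, as described above; the procedure name, parameters, and connection details are hypothetical.

    import java.sql.CallableStatement;
    import java.sql.Connection;
    import java.sql.DriverManager;
    import java.sql.SQLException;
    import java.sql.Types;

    public class OrderStatusDao {

        // Calls a hypothetical stored procedure that returns an order's status
        public String getOrderStatus(String orderId) throws SQLException {
            try (Connection con = DriverManager.getConnection(
                     "jdbc:mysql://localhost:3306/app", "user", "password");  // hypothetical
                 CallableStatement cs = con.prepareCall("{call get_order_status(?, ?)}")) {
                cs.setString(1, orderId);
                cs.registerOutParameter(2, Types.VARCHAR);
                cs.execute();
                return cs.getString(2);
            }
        }
    }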
Environment: JDK 1.3, J2EE, JDBC, Servlets, JSP, XML, XSL, CSS, HTML, DHTML, JavaScript, UML, Eclipse 3.0, Tomcat 4.1, MySQL