Hadoop Developer Resume
Pataskala, Ohio
SUMMARY
- Over 7 years of professional IT experience with a strong emphasis on Big Data and Hadoop ecosystem technologies across multiple industries, including financial services, banking, insurance, healthcare, and the public sector.
- Excellent understanding/knowledge of Hadoop architecture and its components, including the Hadoop Distributed File System (HDFS), JobTracker, TaskTracker, NameNode, DataNode, and the Hadoop MapReduce programming paradigm.
- Experienced in installing, configuring, supporting, and monitoring 50+ node Hadoop clusters using the Hortonworks and MapR distributions.
- Involved in converting Hive/SQL queries into Spark transformations using Spark RDDs, Python, and Scala, with good experience using the Spark shell, the PySpark shell, and Spark Streaming.
- Good exposure to Apache Hadoop MapReduce programming, Pig scripting, distributed applications, and HDFS.
- Expertise in importing and exporting data with Sqoop between HDFS and relational database systems.
- Hands-on experience installing, configuring, and using Hadoop ecosystem components (Hive, Pig, Sqoop, etc.) on Linux/Unix, including Hadoop administration.
- Experience in modeling and developing a Hive warehouse, taking into account business use cases and query performance.
- Good knowledge of Hadoop cluster architecture and cluster monitoring, as well as SOLR/Lucene.
- Expertise in job workflow scheduling and monitoring tools like Oozie and Zookeeper.
- Extended Hive and Pig core functionality by writing custom UDFs (a minimal Hive UDF sketch follows this summary).
- Experience in managing and reviewing Hadoop log files.
- Worked on NoSQL databases such as HBase, MongoDB, and Cassandra.
- Experience in using Apache Flume for collecting, aggregating and moving large amounts of data from application servers.
- Expertise in implementing Spark jobs using Python and Spark SQL for faster testing and processing of data.
- Experienced in working with the Spark ecosystem, using Spark SQL and Python queries on different data formats such as text and CSV files.
- Experience in Amazon AWS cloud services (EC2, EBS, S3).
- Proficient in managing Hadoop clusters using Cloudera Manager Tool.
- In-depth understanding of Data Structure and Algorithms.
- Very good experience in complete project life cycle (design, development, testing and implementation) of Client Server and Web applications.
- Extensive experience working with Oracle, DB2, SQL Server, PL/SQL, and MySQL databases.
- Hands on experience in application development using Java, RDBMS, and Unix shell scripting.
- Familiar with Java virtual machine (JVM) and multi-threaded processing.
- Experience in Object Oriented Analysis, Design (OOAD) and development of software using UML Methodology, good knowledge of J2EE design patterns and Core Java design patterns.
- Solid understanding of the REST architectural style and its application to well-performing web sites for global usage.
- Extensive experience in middle-tier development using J2EE technologies such as JDBC, JNDI, JSP, Servlets, JSF, Struts, Spring, Hibernate, EJB, JBoss, and Ajax.
- Expertise in troubleshooting and development on Hadoop technologies including HDFS, MapReduce2, YARN, Hive, Pig, Flume, HBase, MongoDB, Accumulo, Tez, Sqoop, Zookeeper, Spark, Kafka, and Storm.
- Ability to adapt to evolving technology, strong sense of responsibility and accomplishment.
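A minimal sketch of the kind of custom Hive UDF mentioned above, using the classic org.apache.hadoop.hive.ql.exec.UDF API; the class name and the account-masking logic are hypothetical illustrations, not details taken from the projects below.

```java
import org.apache.hadoop.hive.ql.exec.UDF;
import org.apache.hadoop.io.Text;

// Hypothetical UDF: masks all but the last four characters of an account number.
// Registered in Hive (after ADD JAR) with:
//   CREATE TEMPORARY FUNCTION mask_acct AS 'MaskAccount';
public class MaskAccount extends UDF {
    public Text evaluate(Text input) {
        if (input == null) {
            return null;
        }
        String value = input.toString();
        if (value.length() <= 4) {
            return new Text(value);
        }
        StringBuilder masked = new StringBuilder();
        for (int i = 0; i < value.length() - 4; i++) {
            masked.append('*');
        }
        masked.append(value.substring(value.length() - 4));
        return new Text(masked.toString());
    }
}
```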
TECHNICAL SKILLS
- Hadoop/Big Data: HDFS, MapReduce, Hive, Pig, Sqoop, Flume, Oozie, Kafka, ZooKeeper
- NoSQL Databases: HBase, Cassandra, CouchDB
- Databases/ETL Tools: Teradata, MS SQL Server, Oracle, Informix, Sybase, Informatica, DataStage
- Java/J2EE Technologies: Java, J2EE, Spring, Hibernate, EJB, Web Services (JAX-RPC, JAXP, JAXM), JMS, JNDI, Servlets, JSP, Jakarta Struts
- Application Servers: BEA WebLogic, IBM WebSphere, JBoss, Tomcat
- Methodologies: UML, OOAD
- Web Technologies: HTML, AJAX, CSS, XHTML, XML, XSL, XSLT, WSDL, SOAP
PROFESSIONAL EXPERIENCE
Confidential - Pataskala, Ohio
Hadoop Developer
Responsibilities:
- Created Hive tables and loaded retail transactional data from Teradata using Sqoop.
- Loaded Home mortgage data from the existing DWH tables (SQL Server) to HDFS using Sqoop.
- Wrote Hive Queries to have a consolidated view of the mortgage and retail data.
- Orchestrated hundreds of Sqoop scripts, Pig scripts, and Hive queries using Oozie workflows and sub-workflows.
- Loaded load-ready files from mainframes into Hadoop, converting the files to ASCII format.
- Explored Spark to improve performance and optimize existing algorithms in Hadoop, using Spark Context, Spark SQL, DataFrames, pair RDDs, and Spark on YARN.
- Developed MapReduce programs to write data with headers and footers, and shell scripts to convert the data to a fixed-length format suitable for mainframe CICS consumption (a minimal sketch follows this list).
- Imported and exported data into HDFS using Sqoop and Kafka.
- Implemented test scripts to support test driven development and continuous integration.
- Used Maven for continuous build integration and deployment.
- Performed AWS data migrations between database platforms, such as SQL Server to Amazon Aurora, using RDS tooling.
- Involved in scheduling the Oozie workflow engine to run multiple Pig jobs.
- Responsible for developing a data pipeline using Flume, Sqoop, and Pig to extract data from weblogs and store it in HDFS.
- Proficient work experience with NoSQL databases, particularly Cassandra.
- Data scrubbing and processing with Oozie.
- Developed Pig Latin scripts to extract data from the web server output files to load into HDFS.
- Involved in developing Hive DDLs to create, alter and drop tables.
- Created and maintained technical documentation for launching Hadoop clusters and for executing Hive queries and Pig Scripts.
- Used Cassandra as the project's NoSQL database.
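A minimal sketch of the header/footer technique referenced in the MapReduce bullet above, assuming a single reducer: the header is written in setup(), the fixed-length detail records in reduce(), and a trailer with a record count in cleanup(). The record layouts and the HDR/TRL markers are hypothetical, not taken from the project.

```java
import java.io.IOException;

import org.apache.hadoop.io.NullWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Reducer;

// Runs as a single reducer so the header and trailer bracket the whole output file.
public class HeaderFooterReducer
        extends Reducer<Text, Text, NullWritable, Text> {

    private long recordCount = 0;

    @Override
    protected void setup(Context context)
            throws IOException, InterruptedException {
        // Header record identifying the extract (hypothetical layout).
        context.write(NullWritable.get(),
                new Text(String.format("%-10s%-20s", "HDR", "MORTGAGE_EXTRACT")));
    }

    @Override
    protected void reduce(Text key, Iterable<Text> values, Context context)
            throws IOException, InterruptedException {
        for (Text value : values) {
            // Pad each detail record to a fixed 80-byte layout (hypothetical width).
            context.write(NullWritable.get(),
                    new Text(String.format("%-80s", value.toString())));
            recordCount++;
        }
    }

    @Override
    protected void cleanup(Context context)
            throws IOException, InterruptedException {
        // Trailer record carrying the detail record count.
        context.write(NullWritable.get(),
                new Text(String.format("%-10s%010d", "TRL", recordCount)));
    }
}
```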
Confidential, Hauppauge, NY
Hadoop Developer
Responsibilities:
- Worked with Technology and Business groups for Hadoop migration strategy.
- Transferred data to and from Hadoop cluster, using Sqoop and various storage media such as Informix tables and flat files.
- Log data stored in HBase was processed and analyzed and then imported into the Hive warehouse, enabling business analysts to write HQL queries.
- Developed MapReduce programs and Hive queries to analyze sales pattern and customer satisfaction index over the data present in various relational database tables.
- Worked extensively on performance optimization by adopting appropriate design patterns for MapReduce jobs, analyzing I/O latency, map time, combiner time, reduce time, etc.
- Developed Pig scripts in areas where extensive hand-written coding needed to be reduced.
- Developed UDFs for Pig as needed.
- Followed Agile methodology for the entire project.
- Orchestrated hundreds of Sqoop scripts, Pig scripts, and Hive queries using Oozie workflows and sub-workflows.
- Loaded load-ready files from mainframes into Hadoop, converting the files to ASCII format.
- Involved in creating Hive tables, loading them with data, and writing Hive queries that run internally as MapReduce jobs.
- Used Hive optimization techniques for joins and followed best practices when writing Hive scripts.
- Used Hive to analyze the partitioned and bucketed data and compute various metrics for reporting.
- Developed PL/SQL queries based on the requirement.
- Worked closely with the testing team, helping them write test cases and fixing bugs.
- Have experience working with data marts and data warehouses.
- Analyzed web log data using HiveQL to extract the number of unique visitors per day, page views, visit duration, and the most purchased product on the website (a MapReduce analogue is sketched after this list).
- Exported the analyzed data to the relational databases using Sqoop for visualization and to generate reports by our BI team.
- Installed and maintained a Puppet-based configuration management system.
- Used Puppet configuration management to manage the cluster.
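The web-log analysis above was done in HiveQL; since this project also developed MapReduce programs, the sketch below shows the equivalent unique-visitors-per-day computation in MapReduce. The tab-separated log layout (visitorId, date, URL) is an assumption for illustration only.

```java
import java.io.IOException;
import java.util.HashSet;
import java.util.Set;

import org.apache.hadoop.io.IntWritable;
import org.apache.hadoop.io.LongWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Mapper;
import org.apache.hadoop.mapreduce.Reducer;

// Counts unique visitors per day. Log lines are assumed to be
// tab-separated: visitorId <TAB> yyyy-MM-dd <TAB> url (hypothetical layout).
public class UniqueVisitorsByDay {

    public static class VisitorMapper
            extends Mapper<LongWritable, Text, Text, Text> {
        @Override
        protected void map(LongWritable offset, Text line, Context context)
                throws IOException, InterruptedException {
            String[] fields = line.toString().split("\t");
            if (fields.length >= 2) {
                // key = date, value = visitor id
                context.write(new Text(fields[1]), new Text(fields[0]));
            }
        }
    }

    public static class VisitorReducer
            extends Reducer<Text, Text, Text, IntWritable> {
        @Override
        protected void reduce(Text date, Iterable<Text> visitors, Context context)
                throws IOException, InterruptedException {
            // Simple in-memory distinct count; fine for a sketch,
            // not for extremely high-cardinality days.
            Set<String> unique = new HashSet<String>();
            for (Text visitor : visitors) {
                unique.add(visitor.toString());
            }
            context.write(date, new IntWritable(unique.size()));
        }
    }
}
```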
Confidential, Boston, MA
Hadoop developer
Responsibilities:
- Worked with Business partners to gather Business requirements.
- Developed the application by using the Spring MVC framework.
- Created connection through JDBC and used JDBC statements to call stored procedures.
- Responsible for building scalable distributed data solutions using Hadoop.
- Developed Pig Latin scripts to extract the data from the web server output files to load into HDFS.
- Developed the Pig UDFs to pre-process the data for analysis (a minimal sketch appears after this project's environment line).
- Implemented multiple MapReduce jobs in Java for data cleansing and pre-processing.
- Experienced in loading data from UNIX file system to HDFS.
- Developed job workflow in Oozie to automate the tasks of loading the data into HDFS.
- Responsible for creating Hive tables, loading data and writing Hive queries.
- Effectively involved in creating the partitioned tables in Hive.
- Worked extensively with Sqoop for importing metadata from Oracle.
- Configured Sqoop and developed scripts to extract data from SQL Server into HDFS.
- Expertise in exporting analyzed data to relational databases using Sqoop.
- Implemented the Fair Scheduler on the JobTracker to share cluster resources among the MapReduce jobs submitted by users.
- Provided cluster coordination services through ZooKeeper.
- Responsible for running Hadoop streaming jobs to process terabytes of XML data.
- Gained experience in managing and reviewing Hadoop log files.
- Used Sequence and Avro file formats and Snappy compression while storing data in HDFS.
- Installed and configured Pig and wrote Pig Latin scripts.
- Wrote MapReduce jobs using Pig Latin.
- Involved in ETL, Data Integration and Migration. Imported data using Sqoop to load data from Oracle to HDFS on regular basis.
Environment: Hadoop 1.x, HDFS, MapReduce, Hive 0.10, Pig, Sqoop, HBase, Shell Scripting, Oozie, Oracle 10g, SQL Server 2008, Ubuntu 13.04, Spring MVC, J2EE, Java 6.0, JDBC, Apache Tomcat
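A minimal sketch of a Pig UDF for pre-processing, of the kind described in this project's responsibilities, using Pig's EvalFunc API; the UDF name and the normalization logic are hypothetical, not taken from the project.

```java
import java.io.IOException;

import org.apache.pig.EvalFunc;
import org.apache.pig.data.Tuple;

// Hypothetical UDF: trims and upper-cases a chararray field before analysis.
// Used from Pig Latin as:
//   REGISTER my-udfs.jar;
//   cleaned = FOREACH raw GENERATE NormalizeField(customer_name);
public class NormalizeField extends EvalFunc<String> {
    @Override
    public String exec(Tuple input) throws IOException {
        if (input == null || input.size() == 0 || input.get(0) == null) {
            return null;
        }
        return input.get(0).toString().trim().toUpperCase();
    }
}
```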
Confidential
Sr. JAVA Developer
Responsibilities:
- Analyzed the business requirements, performed gap analysis, and transformed them into detailed design specifications.
- Involved in design process using UML & RUP (Rational Unified Process).
- Performed Code Reviews and responsible for Design, Code and Test signoff.
- Assisted the team in development, clarifying design issues and fixing defects.
- Developed business-tier logic using stateful and stateless session beans.
- Responsible for Design and development of web services to test the security aspects of Web Services enabled CICS Transaction Gateway.
- Extensively used SQL queries, PL/SQL stored procedures, and triggers to retrieve and update information in the Oracle database using JDBC (a minimal sketch follows this list).
- Expert in writing, configuring and maintaining the Hibernate configuration files and writing and updating Hibernate mapping files for each Java object to be persisted.
- Expert in writing Hibernate Query Language (HQL) and tuning Hibernate queries for better performance.
- Deployed the application in Weblogic and used Weblogic Workshop for development and testing.
- Involved in application performance tuning (code refactoring).
- Wrote test cases using JUnit, following test-first development.
- Wrote build files using Ant and used Maven in conjunction with Ant to manage builds.
- Ran nightly builds to deploy the application to different servers.
- Experience working in Agile methodology.
- Worked with both SOAP and RESTful web services.
- Worked with Complex SQL queries, Functions and Stored Procedures.
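A minimal sketch of invoking a PL/SQL stored procedure over JDBC, as described in the bullets above; the procedure name, parameters, JDBC URL, and credentials are hypothetical, shown only to illustrate the CallableStatement pattern.

```java
import java.sql.CallableStatement;
import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.Types;

public class StoredProcedureCall {
    public static void main(String[] args) throws Exception {
        // Hypothetical Oracle JDBC URL, credentials, and procedure signature.
        try (Connection conn = DriverManager.getConnection(
                     "jdbc:oracle:thin:@//dbhost:1521/ORCL", "app_user", "app_password");
             CallableStatement call =
                     conn.prepareCall("{call update_account_status(?, ?)}")) {
            call.setLong(1, 12345L);                      // IN: account id
            call.registerOutParameter(2, Types.VARCHAR);  // OUT: status message
            call.execute();
            System.out.println("Procedure returned: " + call.getString(2));
        }
    }
}
```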
JAVA Developer
Responsibilities:
- Involved in design and development using UML with Rational Rose.
- Played a significant role in performance tuning and optimizing the memory consumption of the application.
- Developed various enhancements and features using Java 5.0
- Developed advanced server-side classes using networking, I/O, and multithreading.
- Designed and developed various complex and advanced user interfaces using Swing.
- Used SAX/DOM XML parsers for parsing XML files (a SAX sketch follows this list).
- Designed and developed a Struts-like MVC 2 web framework using the front-controller design pattern, which has been used successfully in a number of production systems.
- Worked on Java Mail API. Involved in the development of Utility class to consume messages from the message queue and send the emails to customers.
- Used JUnit framework for unit testing and Log4j to capture runtime exception logs.
- Used JDBC to invoke stored procedures and for database connectivity.
- Responsible for data reconciliation with EOD files using scheduled batch process.
- Responsible for system development using J2EE architecture.
- Used Spring Framework for dependency injection, transaction management and AOP.
- Involved in Spring MVC model integration for front-end request action controllers.
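A minimal SAX parsing sketch matching the XML-parsing bullet above; the input file name and the element names are hypothetical, used only to show the SAXParser/DefaultHandler callback pattern.

```java
import javax.xml.parsers.SAXParser;
import javax.xml.parsers.SAXParserFactory;

import org.xml.sax.Attributes;
import org.xml.sax.helpers.DefaultHandler;

public class OrderXmlParser {
    public static void main(String[] args) throws Exception {
        SAXParser parser = SAXParserFactory.newInstance().newSAXParser();
        // "orders.xml" and the <orderId> element are hypothetical.
        parser.parse("orders.xml", new DefaultHandler() {
            private final StringBuilder text = new StringBuilder();

            @Override
            public void startElement(String uri, String localName,
                                     String qName, Attributes attributes) {
                text.setLength(0); // reset the buffer for each new element
            }

            @Override
            public void characters(char[] ch, int start, int length) {
                text.append(ch, start, length);
            }

            @Override
            public void endElement(String uri, String localName, String qName) {
                if ("orderId".equals(qName)) {
                    System.out.println("Parsed order id: " + text.toString().trim());
                }
            }
        });
    }
}
```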