Hadoop/Big Data/Spark Developer Resume
Plano, Texas
SUMMARY
- 10 years of experience in the IT industry covering development, analysis, design, testing, and system maintenance using Java/J2EE technologies, including 5+ years of Big Data/Hadoop development.
- Sound working knowledge of Hadoop ecosystem components such as Hadoop MapReduce, HDFS, HBase, and Hive. Hands-on experience installing and configuring Sqoop, Pig, ZooKeeper, and Flume.
- Cross-functional exposure to Apache Hadoop MapReduce programming, Pig scripting, distributed applications, and HDFS. Solid understanding of Hadoop MRv1 and MRv2 (YARN) architecture.
- Extensive involvement in Hadoop cluster architecture and cluster monitoring.
- Experienced in building highly scalable Big Data solutions using Hadoop across multiple distributions (e.g. Cloudera) and NoSQL platforms (HBase and Cassandra).
- Up-to-date knowledge of Amazon AWS services such as EMR and EC2, which provide fast and efficient processing of Big Data.
- Experience in managing and reviewing Hadoop log files. Experience in writing Map Reduce programs and using Apache Hadoop API for analyzing the data.
- Strong experience in developing, debugging, and tuning MapReduce jobs in the Hadoop environment. Used compression techniques (Snappy) with appropriate file formats to make efficient use of storage in HDFS.
- Involved in setting up standards and processes for Hadoop-based application design and implementation. Expertise in developing Pig and Hive scripts for data analysis.
- Hands-on experience in the data mining process: implementing complex business logic, optimizing queries using HiveQL, and controlling data distribution through partitioning and bucketing techniques to enhance performance.
- Experience in importing and exporting data using Sqoop from HDFS to Relational Database Systems and vice-versa.
- Experience managing Hadoop clusters using the Cloudera Manager tool.
- Experience working with Flume to handle large volumes of streaming data.
- Good working knowledge of the Hadoop Hue ecosystem.
- Extensive experience in migrating ETL operations into HDFS systems using Pig Scripts.
- Detailed knowledge of big data analytics libraries (MLlib) and utilization of data exploration tools like Spark-SQL.
- Expert in implementing advanced procedures such as text analytics and processing using in-memory computing capabilities like Apache Spark, written in Scala.
- Knowledge on Apache Spark and its stack.
- Experience in creating and designing data ingestion pipelines using technologies such as Spring Integration and Apache Storm with Kafka.
- Experience with the Oozie workflow engine in running workflow jobs with actions that run Hadoop MapReduce and Pig jobs.
- Worked on PowerExchange in Informatica. Scheduled and monitored jobs using the Informatica monitoring tool.
- Developed applications using Core Java, Multithreading, Collections, JDBC, Swing, Networking, Reflections.
- Java/J2EE software developer with experience in Core Java and web-based applications, with expertise in reviewing client requirements, prioritizing requirements, creating project proposals (scope, estimation), and baselining project plans.
- Implemented core modules in large cross-platform applications using JAVA, J2EE, Hibernate, Python, Spring, JSP, Servlets, EJB, JDBC, JavaScript, XML, and HTML.
- Devised continuous integration of Java projects with build tools Maven and ANT.
- Working knowledge of configuring monitoring tools such as Ganglia and Nagios.
- Hands-on expertise in using relational databases like Oracle, MySQL, PostgreSQL and MS-SQL Server.
- Extensive experience in developing and deploying applications using Web Logic, Apache Tomcat and JBOSS.
- Developed unit test cases using the JUnit, EasyMock, and MRUnit testing frameworks.
- Experienced with version control systems such as SVN, ClearCase, Git, and Bitbucket.
- Experience using IDE tools such as Eclipse, MyEclipse, RAD, and NetBeans.
- Experience in designing Use Cases, Class diagrams, Sequence and Collaboration diagrams for multi-tiered object-oriented system architectures.
- Extensive experience with design and development of J2EE-based applications involving technologies such as Java Server Pages (JSP), Java Message Service (JMS), and Java Database Connectivity (JDBC).
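The partitioning and bucketing technique noted above can be sketched in miniature. Hive's real bucketing (`CLUSTERED BY ... INTO N BUCKETS`) uses its own deterministic hash over the clustering column; the hash function below is a hypothetical stand-in, but the hash-then-modulo placement of rows into a fixed number of buckets is the same idea.

```python
# Illustrative sketch (not Hive's exact hash): bucketing spreads rows across
# a fixed number of files so joins and sampling can prune work.
def hash_key(key):
    # Simple deterministic string hash (illustrative stand-in for Hive's).
    h = 0
    for ch in str(key):
        h = (h * 31 + ord(ch)) & 0x7FFFFFFF
    return h

def bucket_for(key, num_buckets):
    """Assign a row to a bucket by hashing its bucketing column."""
    return hash_key(key) % num_buckets

rows = ["cust_1001", "cust_1002", "cust_1003", "cust_1004"]
buckets = {}
for r in rows:
    buckets.setdefault(bucket_for(r, 4), []).append(r)
# Every row lands in exactly one of 4 buckets, deterministically.
```

Because the placement is deterministic, two tables bucketed the same way can be joined bucket-by-bucket instead of shuffling all rows.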
TECHNICAL SKILLS
Big Data Ecosystem: HDFS, HBase, Hadoop MapReduce, ZooKeeper, Hive, Pig, Sqoop, Oozie, Cassandra, Spark, Apache Storm, Hadoop Hue
Tools: Visual SourceSafe, WinCVS, MKS, PuTTY, WinSCP, Rational Rose, Eclipse, Rational Application Developer
Programming Languages: Java 8, Java 7 (J2EE), COBOL, FOCUS, C, C++, Python, Perl, JavaScript, CUDA
Technologies: J2EE (JSP, Servlets, EJB, JMS, JDBC), multithreading, Collections, Spring, JavaScript, Hibernate, Struts, design patterns, JUnit, JDK 1.5, log4j
Databases: Oracle 9i/10g, DB2, Sybase, Netezza
Operating Systems: Windows 95/NT/2000, Red Hat Linux
Servers: WebSphere, Tomcat, JBoss
User Interfaces: HTML, JSP, XML, CSS, PHP
PROFESSIONAL EXPERIENCE
Confidential, Plano, Texas
Hadoop/Big Data/Spark Developer
Responsibilities:
- Created an end-to-end fault-tolerant mailing solution leveraging AWS EC2/EMR instances and Amazon services such as STS and SQS queues, using Java 8.
- Used AWS Lambda functions to connect AWS services and provide fault-tolerant connectivity.
- Developed Node.js-compatible AngularJS and Angular UI components such as pagination, sorting, and range sliders with real-time HTTPS GET updates for the Squadron process-monitoring framework.
- Worked on Angular integration, developing Angular modules for front-end compatibility.
- Integrated team code and maintained it on the proprietary GitHub instance.
- Leveraged Scala and Scallop, a Scala-based argument-parsing library, for the Controls Management framework.
- Utilized Scala-based frameworks such as Play and Json4s for JSON validation and manipulation.
- Utilized Snowflake SQL and the Spark/Scala Snowflake connectors to query and transform tables.
- Leveraged Spark DataFrames for data manipulation and fast transformations for on-the-fly table updates and view generation.
- Worked on multi-level data validation between Oracle tables and their corresponding Snowflake tables, using partitioning and slicing queries to match data across both platforms.
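The multi-level validation in the last bullet can be illustrated with a minimal Python sketch: compare source (Oracle) and target (Snowflake) tables first by row count, then by per-partition checksum, so any mismatch is localized to a slice. The partition key and checksum scheme here are illustrative assumptions, not the production queries.

```python
# Hedged sketch: two-level reconciliation between a source and target table,
# each represented as a list of dict rows (a stand-in for query results).
from collections import defaultdict

def partition_checksums(rows, partition_key):
    """Group rows by a partition column and checksum each slice."""
    sums = defaultdict(int)
    for row in rows:
        # Order-insensitive checksum: sum of hashes of the sorted row items.
        sums[row[partition_key]] += sum(hash((k, v)) for k, v in sorted(row.items()))
    return dict(sums)

def validate(source_rows, target_rows, partition_key):
    report = {"row_count_match": len(source_rows) == len(target_rows)}
    src = partition_checksums(source_rows, partition_key)
    tgt = partition_checksums(target_rows, partition_key)
    report["mismatched_partitions"] = sorted(
        p for p in set(src) | set(tgt) if src.get(p) != tgt.get(p)
    )
    return report
```

In practice the same idea runs as paired SQL against both platforms (COUNT and an aggregate checksum per partition), with only the per-partition summaries moved across.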
Confidential, Chicago, IL
Sr. Hadoop/Big Data/Developer
Responsibilities:
- Extracted and updated data in HDFS using the Sqoop import and export command-line utilities.
- Responsible for developing a data pipeline using Flume, Sqoop, and Pig to extract data from weblogs and store it in HDFS.
- Used HCatalog to access Hive table metadata from MapReduce and Pig code.
- Responsible for managing data from multiple sources; implemented best-income logic using Pig scripts.
- Involved in developing Hive UDFs for the needed functionality.
- Involved in creating Hive tables, loading them with data, and writing Hive queries. Used Hive to analyze partitioned and bucketed data and compute various metrics for reporting.
- Created and managed data indexing; developed custom tokenizers, relevance tuning, and filters; and added functionality.
- Used Hive connections to analyze data sourced from Oracle.
- Used Pig for transformations, event joins, filtering bot traffic, and pre-aggregations before storing the data in HDFS.
- Developed a data pipeline using Flume, Sqoop, Pig, and Java MapReduce to ingest customer behavioral data and financial histories into HDFS for analysis.
- Involved in emitting processed data from Hadoop to relational databases and external file systems using Sqoop.
- Developed multiple MapReduce jobs in Java 8 for data cleaning and pre-processing.
- Involved in running Hadoop jobs for processing millions of records of text data.
- Assisted in exporting analyzed data to relational databases using Sqoop.
- Involved in loading data from UNIX file system to HDFS.
- Created HBase tables to store different data formats.
- Experience in managing and reviewing Hadoop log files.
- Exported the analyzed data to relational databases using Sqoop for visualization and to generate reports for the BI team.
- Monitored jobs using the Informatica monitoring tool.
- Extensive knowledge of debugging MapReduce programs and Hive UDFs using Eclipse.
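The weblog filtering and pre-aggregation described above can be sketched in the shape of a Hadoop Streaming mapper/reducer pair. The field layout (user agent, then URL, tab-separated) and the bot marker are illustrative assumptions, not the production schema.

```python
# Sketch of a Streaming-style job: filter bot traffic in the map step,
# pre-aggregate hit counts per URL in the reduce step.
def map_line(line):
    """Emit (url, 1) for human traffic; drop lines that look like bots."""
    parts = line.split("\t")            # assumed layout: user_agent <TAB> url
    if len(parts) != 2 or "bot" in parts[0].lower():
        return []                       # filter bot traffic and broken records
    return [(parts[1], 1)]

def reduce_counts(pairs):
    """Pre-aggregate hit counts per URL before the data lands in HDFS."""
    totals = {}
    for url, n in pairs:
        totals[url] = totals.get(url, 0) + n
    return totals

logs = [
    "Mozilla/5.0\t/home",
    "Googlebot/2.1\t/home",
    "Mozilla/5.0\t/cart",
    "Mozilla/5.0\t/home",
]
pairs = [kv for line in logs for kv in map_line(line)]
counts = reduce_counts(pairs)           # {'/home': 2, '/cart': 1}
```

In an actual Hadoop Streaming job the mapper and reducer would read stdin and write tab-separated key-value pairs to stdout; the filter-then-aggregate logic is the same.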
Environment: Hadoop, HDFS, MapReduce, Hive, Flume, HBase, Sqoop, Pig, Java (JDK 1.6), Eclipse, MySQL, Ubuntu, ZooKeeper, Oozie, Apache Kafka, Apache Storm.
Confidential, Kansas, MO
Hadoop/Java/Big Data Developer
Responsibilities:
- Responsible for building scalable distributed data solutions using Hadoop.
- Installed and configured Hive, Pig, Sqoop, Flume and Oozie on the Hadoop cluster.
- Developed pig scripts for analyzing large data sets in the HDFS.
- Collected the logs from the physical machines and the OpenStack controller and integrated into HDFS using Flume.
- Good knowledge of analyzing data using Python scripting for Hadoop Streaming. Set up and benchmarked Hadoop/HBase clusters for internal use.
- Developed simple to complex MapReduce jobs in Java 7 and Java 8, complemented by Hive and Pig.
- Optimized MapReduce jobs to use HDFS efficiently by applying various compression mechanisms.
- Handled importing of data from various data sources, performed transformations using Hive and MapReduce, loaded data into HDFS, and extracted data from MySQL into HDFS using Sqoop.
- Analyzed the data by performing Hive queries (HiveQL) and running Pig scripts (Pig Latin) to study customer behavior.
- Used UDFs to implement business logic in Hadoop.
- Queried and wrote Hadoop data in HDFS, HBase, and Cassandra using Impala.
- Developed programs in Spark, based on the application, for faster data processing than standard MapReduce programs.
- Developed Spark code using Scala and Spark SQL for faster testing and data processing.
- Experience with batch processing of data sources using Apache Spark.
- Implemented business logic by writing UDFs in Java and used various UDFs from other sources.
- Continuously monitored and managed the Hadoop cluster using Cloudera Manager.
- Analyzed large datasets to determine the optimal way to aggregate and report on them.
- Supported setting up the QA environment and updating configurations for implementing scripts with Pig and Sqoop.
- Implemented daily cron jobs that automate parallel data-loading tasks into HDFS using Autosys and Oozie coordinator jobs. Installed the Oozie workflow engine to run multiple Hive and Pig jobs.
- Established connections to ingest data in and from HDFS
- Worked with application teams to install operating system, Hadoop updates, patches, version upgrades as required.
- Exported the analyzed data to the relational databases using Sqoop for visualization and to generate reports for the BI team
- Diverse experience in loading and transforming large sets of structured, semi-structured, and unstructured data.
- Wrote multiple MapReduce programs in Java for data extraction, transformation, and aggregation from multiple file formats, including XML, JSON, CSV, and other compressed formats.
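The multi-format extraction in the last bullet boils down to normalizing each input format into one record shape before aggregating. A minimal Python sketch, with hypothetical field names (a real job would drive these from a schema):

```python
# Normalize JSON and CSV records into a common dict shape, then aggregate.
import csv
import io
import json

CSV_FIELDS = ["id", "event", "amount"]   # assumed CSV column order

def parse_record(line, fmt):
    if fmt == "json":
        return json.loads(line)
    if fmt == "csv":
        row = next(csv.reader(io.StringIO(line)))
        return dict(zip(CSV_FIELDS, row))
    raise ValueError("unsupported format: %s" % fmt)

records = [
    parse_record('{"id": "1", "event": "click", "amount": "5"}', "json"),
    parse_record("2,click,7", "csv"),
]
# Aggregate once the shapes match, regardless of source format.
total = sum(int(r["amount"]) for r in records if r["event"] == "click")  # 12
```

In the MapReduce versions this parsing lived in the mapper (an XML branch would follow the same pattern), so the reducer only ever saw uniform key-value records.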
Environment: Hadoop, MapReduce, HDFS, Hive, Spark, Pig, SQL, Cloudera Manager, Sqoop, Storm, Solr, Flume, Cassandra, Oozie, Java (JDK 1.6), Eclipse
Confidential, Phoenix, AZ
Java Developer
Responsibilities:
- Defined the requirements for new platform (product) capabilities after due research and study of prospective client requirements.
- Defined the architecture and provided the optimal design approach for new capabilities during the product cycle.
- Conducted reviews (code/artifacts) during the implementation phase.
- Analyzed and rectified issues identified during product security reviews.
- Provided technical support to the product engineering team during the implementation phase.
- Prepared client presentations for prospective clients to showcase the product's existing capabilities and features.
- Involved in the analysis, specification, design, implementation, and testing phases of the Software Development Life Cycle (SDLC).
- Implemented service-layer classes using Spring IoC and AOP.
- Actively interacted with business analysts for requirements gathering and analysis. Developed design specifications using UML, including use case, class, and sequence diagrams.
- Implemented and maintained an AJAX-based rich client for an improved customer experience.
- Developed the presentation and controller layers using JSP, HTML, and JavaScript; the business logic using Spring IoC, AOP, DTO, and JTA; and the persistence layer (DAO, Hibernate) for all modules.
- Developed the application using industry-standard design patterns such as Service Locator, Singleton, Business Delegate, MVC, and Factory for reusability.
- Developed Java Message Service (JMS) messaging with message-driven beans by configuring JMS queues, topics, and connection factories.
- Used JavaScript for client-side validation in JSP pages.
- Developed the code using the Eclipse 3.2 IDE and deployed to Tomcat Server. Developed the Ant build.xml to add functionality to the build process.
- Implemented web services components (SOAP, WSDL, and UDDI) to interact with external systems.
- Developed the JUnit test framework and executed unit test cases using JUnit for fixes.
Environment: MySQL, JDK 1.5, AJAX, JavaScript, JSP, Spring 3.0, DAO, Hibernate 3.2, UML, design patterns, JMS, Eclipse 3.2, Oracle 10g, ANT, JUnit, HTML, DHTML, XML, SLF4J, XSL, CSS, JMeter, Windows XP, and UNIX
Confidential, New York
Java Developer
Responsibilities:
- Organized requirements-gathering discussions with all stakeholders.
- Distributed and tracked issues, communicated them to developers, and reported status to the manager on a daily basis.
- Involved in preparing high-level design and low-level design documents.
- Developed according to the specified design.
- Published SOAP based web services using JAX-WS, JAXB, XSD, XML Bean and XML.
- Developed the front end based on the Struts MVC architecture.
- Used SoapUI to test the web services.
- Used the Struts and Spring frameworks for the newly designed UI infrastructure services to interact with legacy application systems.
- Developed Action classes, Action forms, validate methods, and the struts-config.xml file using Struts, and used various Struts tag libraries.
- Used Enterprise Java Beans (EJB session beans) in developing business layer APIs.
- Used Hibernate as the ORM.
- Applied J2EE design patterns like Business Delegate, DAO and Singleton.
- Deployed and tested the application using Tomcat web server.
- Performed client-side validation using JavaScript.
- Involved in developing DAOs using JDBC.
- Used HQL and the Criteria API extensively.
- Developed complex SQL queries, stored procedures, functions, triggers and created indexes wherever applicable in Oracle database.
- Coordinated with the onshore development team.
- Involved in debugging and testing the application for change requests.
- Prepared weekly and monthly status reports.
- Coordinated with the entire offshore team on filling weekly timesheets in Clarity and Fieldglass.
- Gave code walkthroughs on deliverables to newly joined team members.
- Planned forecasts for individuals' task sheets.
- Prepared test case documents for enhancements.
- Used JUnit for unit testing and prepared a JUnit test cases document.
Environment: JDK 1.5/1.4, J2EE, Servlets, Struts, Spring, Hibernate 3/3.5/4.0, HQL, Maven 3.0, JAX-WS, JAXB, XML, XSD, SoapUI, jQuery, CSS, JUnit, Oracle 9i/10g, SQL, PL/SQL, Quality Center, SSH shell, SSH client, PuTTY, VSS, WAS, WebSphere, Visual Studio, Microsoft Visio, Microsoft Project, UML, SharePoint, Windows XP, and UNIX.
