Hadoop Developer Resume

St. Louis, Missouri

SUMMARY:

  • Over 5 years of professional IT experience with Analysis, Design, Development, Testing, Implementation and Production Support.
  • Over 3 years of experience in Big Data and Hadoop ecosystem tools, including Pig, Hive, Sqoop, Oozie, ZooKeeper, Flume, and Impala.
  • Strong experience in Big Data development, testing, and deployment to production.
  • Involved in all phases of the Software Development Life Cycle (SDLC).
  • Requirement analysis, design, development, testing, and production deployment of J2EE business applications, web-based and n-tier applications, using the following core technologies: Java, Servlets, JSP, JSTL, and XML.
  • Proficient in handling big data using Hadoop architecture, HDFS, MapReduce, HBase, Hive, Pig, Flume, Oozie, and Sqoop.
  • Working experience with Cloudera Hadoop distribution versions.
  • Used the Cloudera Hue web interface to execute the respective scripts.
  • Expertise in writing MapReduce jobs in Java.
  • Expertise in using Hadoop and its ecosystem commands.
  • Expertise in moving/loading data from RDBMS (Oracle, DB2) to Hadoop HDFS using Sqoop scripts.
  • Proficient in processing data using Apache Pig by registering User Defined Functions (UDFs) written in Java.
  • Skilled in scheduling recurring Hadoop jobs using Apache Oozie workflows.
  • Proficient in designing and querying NoSQL databases such as HBase.
  • Strong proficiency in R for data transformation, filtering, and analytics.
  • Experience with the RDD architecture, implementing Spark operations on RDDs, and optimizing Spark transformations and actions (see the sketch after this list).
  • Good knowledge of Spark components such as Spark SQL, MLlib, Spark Streaming, and GraphX.
  • Configured, deployed, and maintained multi-node Dev and Test Kafka clusters.
  • Knowledgeable in building, configuring, monitoring, and supporting Hadoop environments using Cloudera Manager, Hortonworks, AWS, and Apache Ambari.
  • Developed various Pig and Hive UDFs (User Defined Functions) to extend functionality and solve multiple big data filtering problems.
  • Extensive knowledge and experience in various OS platforms such as UNIX, Linux (RHEL), AIX, and Windows.
  • Hands-on experience writing UNIX shell and Bash scripts.
  • Strong experience in MapReduce programming, customizing the framework at various levels, and working with input formats such as SequenceFileInputFormat and KeyValueTextInputFormat.
  • Worked with the following NoSQL databases: MongoDB, HBase, and Cassandra, ensuring faster access to data on HDFS.
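
A minimal sketch of the Spark RDD work referenced above, assuming a hypothetical log-filtering job: it shows a lazy transformation, caching of a reused RDD, and two actions using the Java Spark API. The application name, input path, and filter condition are illustrative assumptions, not details from a specific project.

    import org.apache.spark.SparkConf;
    import org.apache.spark.api.java.JavaRDD;
    import org.apache.spark.api.java.JavaSparkContext;

    public class RddSketch {
        public static void main(String[] args) {
            SparkConf conf = new SparkConf().setAppName("RddSketch");
            try (JavaSparkContext sc = new JavaSparkContext(conf)) {
                // Transformations are lazy; nothing executes until an action is invoked.
                JavaRDD<String> lines = sc.textFile("hdfs:///data/events/*.log"); // hypothetical input path
                JavaRDD<String> errors = lines.filter(line -> line.contains("ERROR"));
                errors.cache(); // keep the reused RDD in memory to avoid recomputation
                long errorCount = errors.count();               // action: triggers the computation
                long distinctCount = errors.distinct().count(); // second action reuses the cached RDD
                System.out.println(errorCount + " error lines, " + distinctCount + " distinct");
            }
        }
    }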

TECHNICAL SKILLS:

Hadoop Distributions: Apache, Cloudera CDH, Hortonworks HDP

Big Data Technologies: Apache Hadoop (MRv1, MRv2), Hive, Pig, Sqoop, HBase, Flume, Oozie, Ambari, Spark, Tez, Kafka, Storm, R, Elasticsearch, Solr

Cloud Platforms: Amazon Web Services (EC2), Google Cloud Platform

Operating Systems: Windows, Linux RHEL & Unix

Languages: C, C++, Java, PL/SQL, UNIX Shell

Web Technologies: HTML, JSP, JSF, CSS, JavaScript

IDEs: Eclipse, JBoss, IBM WebSphere

Reporting Tools: SAP Business Objects, MicroStrategy, Tableau

Web Servers / App Servers: Apache Tomcat 6.0/7.0, IBM WebSphere 6.0/7.0, JBoss 4.3

Databases: Oracle, MySQL, SQL Server 2008, MongoDB (NoSQL), LDA

ETL Tools: Informatica, Datastage

PROFESSIONAL EXPERIENCE:

Confidential, St. Louis, Missouri

Hadoop Developer

Responsibilities:

  • Developed Big Data Solutions that enabled the business and technology teams to make data-driven decisions on the best ways to acquire customers and provide them business solutions.
  • Involved in installing, configuring, and managing Hadoop ecosystem components such as Hive, Pig, Sqoop, and Flume.
  • Migrated existing data from RDBMS to Hadoop using Sqoop for processing.
  • Responsible for loading unstructured and semi-structured data from different sources into the Hadoop cluster using Flume, and for managing it.
  • Developed MapReduce programs to cleanse and parse data in HDFS obtained from various data sources and to perform map-side joins using the distributed cache (see the sketch after this list).
  • Responsible for creating Hive tables, loading the structured data resulting from MapReduce jobs into the tables, and writing Hive queries to further analyze the data.
  • Wrote SQL queries and performed back-end testing for data validation to check data integrity during migration from back-end to front-end.
  • Used the Hive data warehouse tool to analyze the data in HDFS and developed Hive queries.
  • Worked on setting up Pig, Hive, Redshift, and HBase on multiple nodes and developed using Pig, Hive, HBase, and MapReduce.
  • Used the JSON and Avro SerDes packaged with Hive for serialization and deserialization to parse the contents of streamed log data.
  • Implemented custom Hive UDFs to achieve comprehensive data analysis.
  • Wrote MRUnit tests for unit testing the MapReduce jobs.
  • Wrote test cases in JUnit for unit testing of classes.
  • Worked on parsing the XML files using DOM/SAX parsers and used JAXB for data retrieval.
  • Implemented daily workflow for extraction, processing and analysis of data with Oozie.
  • Responsible for troubleshooting MapReduce jobs by reviewing the log files.
  • Created tables, inserted data, and executed various Cassandra Query Language (CQL 3) commands on tables using cqlsh.
  • Set up the component integration testing environment and performed troubleshooting.
  • Involved in deployment of application on WebSphere Application Server.
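
A minimal sketch of the map-side join referenced above, assuming a small tab-separated lookup file of id/name pairs has been shipped with the job through the distributed cache (for example, job.addCacheFile(new URI("/ref/lookup.txt#lookup.txt"))). Class, file, and field names are illustrative assumptions.

    import java.io.BufferedReader;
    import java.io.FileReader;
    import java.io.IOException;
    import java.util.HashMap;
    import java.util.Map;
    import org.apache.hadoop.io.LongWritable;
    import org.apache.hadoop.io.Text;
    import org.apache.hadoop.mapreduce.Mapper;

    public class MapSideJoinMapper extends Mapper<LongWritable, Text, Text, Text> {
        private final Map<String, String> customers = new HashMap<>();

        @Override
        protected void setup(Context context) throws IOException {
            // The cached file is localized and symlinked as "lookup.txt" in the task's working directory.
            try (BufferedReader reader = new BufferedReader(new FileReader("lookup.txt"))) {
                String line;
                while ((line = reader.readLine()) != null) {
                    String[] parts = line.split("\t");
                    customers.put(parts[0], parts[1]); // customerId -> customerName
                }
            }
        }

        @Override
        protected void map(LongWritable key, Text value, Context context)
                throws IOException, InterruptedException {
            // Join each record against the in-memory lookup table on its first column; no reduce phase needed.
            String[] fields = value.toString().split("\t");
            String name = customers.getOrDefault(fields[0], "UNKNOWN");
            context.write(new Text(fields[0]), new Text(name + "\t" + fields[1]));
        }
    }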

Environment: Hadoop, Hive, MapReduce, Oozie, Core Java, JDK 6, Spring 3.0, Spring MVC, Bootstrap, Spring AOP, Spring JDBC, DB2, JSON, JAXB, XML, Oracle 11g, Sqoop, Flume, Eclipse, Hue.

Confidential, Reston, Virginia

Big Data Developer

Responsibilities:

  • Worked on data loading, querying, and data extraction for the data present in the Hadoop File System.
  • Automated all jobs that extract data from different data sources and push the result sets to the Hadoop Distributed File System.
  • Worked on data migration from existing data stores to the Hadoop File System (HDFS).
  • Responsible for loading and managing the data from different databases to HDFS.
  • Worked on importing and exporting data into HDFS and Hive using Sqoop.
  • Exported the data from Oracle, MySQL, and SAP files to the Hadoop File System using Sqoop.
  • Used Pig to preprocess the data, performed data transformations, and then exported these metrics back to Teradata using Sqoop.
  • Provided ad-hoc queries and data metrics to the Business Users using Hive, Pig.
  • Developed Pig Latin scripts to extract and transform the data according to business needs and output files to load into HDFS.
  • Created tables and views using Hive with different file formats and compression.
  • Implemented dynamic partitioning and bucketing for the Hive tables (see the sketch after this list).
  • Worked on Hive for analyzing the datasets.
  • Created Hive queries that helped market analysts spot emerging trends by comparing fresh data with IDW tables and historical metrics.
  • Experience in creating HBase and Hive tables to store large sets of data.
  • Experienced in optimizing Hive Queries and provided support for end-to-end ETL Workflow.
  • Worked with Business Users in providing them the HQL queries for their data analysis.
  • Successfully loaded files to Hive and HDFS from HBase.
  • Worked on performing Data Quality checks for the data loaded in HDFS.
  • Knowledge of implementing an error-handling service for Hadoop jobs.
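
A minimal sketch of the dynamic partitioning and bucketing referenced above, driven from Java over the Hive JDBC driver. The connection URL, table names, and columns are illustrative assumptions; this shows the pattern only and is not necessarily how the project's queries were executed.

    import java.sql.Connection;
    import java.sql.DriverManager;
    import java.sql.SQLException;
    import java.sql.Statement;

    public class HivePartitionedLoad {
        public static void main(String[] args) throws SQLException {
            try (Connection conn = DriverManager.getConnection(
                    "jdbc:hive2://hive-server:10000/default", "hive", "");
                 Statement stmt = conn.createStatement()) {
                // Let partition values come from the SELECT itself instead of being hard-coded.
                stmt.execute("SET hive.exec.dynamic.partition=true");
                stmt.execute("SET hive.exec.dynamic.partition.mode=nonstrict");
                stmt.execute("SET hive.enforce.bucketing=true"); // honor CLUSTERED BY on the target table (Hive 0.x/1.x)
                // Target table is assumed to be partitioned by load_date and bucketed by order_id.
                stmt.execute("INSERT OVERWRITE TABLE sales_part PARTITION (load_date) "
                        + "SELECT order_id, amount, load_date FROM sales_staging");
            }
        }
    }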

Environment: Apache Hadoop (HDFS), Pig 0.14, Hive 0.14, Oozie 4.1, HBase, Apache Solr, Apache Tomcat, Jenkins, Nexus, Shell Scripts.

Confidential

ETL/Hadoop Developer

Responsibilities:

  • Developed solutions to process data into HDFS (Hadoop Distributed File System) within Hadoop.
  • Used Sqoop extensively to ingest data from various source systems into HDFS.
  • Involved in running Hadoop jobs for processing millions of records of text data.
  • Hive was used to produce results quickly based on the report that was requested.
  • Extensively used Oozie and Zookeeper to automate the flow of jobs and coordination in the cluster respectively.
  • Developed shell scripts that act as wrappers to start Hadoop jobs and set configuration parameters (see the driver sketch after this list).
  • Worked on stand-alone as well as distributed Hadoop applications.
  • Tested the performance of the data sets on various NoSQL databases.
  • Understood complex data structures of different types (structured, semi-structured) and de-normalized them for storage.
  • Worked with application teams to install operating system and Hadoop updates, patches, and version upgrades as required.
  • Created and maintained technical documentation for launching Hadoop clusters and for executing Hive queries and Pig scripts.
  • Developed shell scripts to automate routine tasks.
  • Used Oozie and Zookeeper operational services for coordinating cluster and scheduling workflows.
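
The wrapper scripts referenced above are not reproduced here; instead, below is a sketch of the kind of job driver such a script would typically invoke: a ToolRunner-based driver that picks up -D configuration overrides from the command line (for example, hadoop jar app.jar WordCountDriver -D mapreduce.job.reduces=4 /in /out). The word-count logic, class names, and paths are illustrative assumptions.

    import java.io.IOException;
    import org.apache.hadoop.conf.Configured;
    import org.apache.hadoop.fs.Path;
    import org.apache.hadoop.io.IntWritable;
    import org.apache.hadoop.io.LongWritable;
    import org.apache.hadoop.io.Text;
    import org.apache.hadoop.mapreduce.Job;
    import org.apache.hadoop.mapreduce.Mapper;
    import org.apache.hadoop.mapreduce.Reducer;
    import org.apache.hadoop.mapreduce.lib.input.FileInputFormat;
    import org.apache.hadoop.mapreduce.lib.output.FileOutputFormat;
    import org.apache.hadoop.util.Tool;
    import org.apache.hadoop.util.ToolRunner;

    public class WordCountDriver extends Configured implements Tool {

        public static class TokenizerMapper extends Mapper<LongWritable, Text, Text, IntWritable> {
            private static final IntWritable ONE = new IntWritable(1);
            private final Text word = new Text();

            @Override
            protected void map(LongWritable key, Text value, Context context)
                    throws IOException, InterruptedException {
                // Emit (token, 1) for every whitespace-separated token in the line.
                for (String token : value.toString().split("\\s+")) {
                    if (!token.isEmpty()) {
                        word.set(token);
                        context.write(word, ONE);
                    }
                }
            }
        }

        public static class SumReducer extends Reducer<Text, IntWritable, Text, IntWritable> {
            @Override
            protected void reduce(Text key, Iterable<IntWritable> values, Context context)
                    throws IOException, InterruptedException {
                int sum = 0;
                for (IntWritable v : values) {
                    sum += v.get();
                }
                context.write(key, new IntWritable(sum));
            }
        }

        @Override
        public int run(String[] args) throws Exception {
            // getConf() already contains any -D overrides parsed by ToolRunner.
            Job job = Job.getInstance(getConf(), "word-count");
            job.setJarByClass(WordCountDriver.class);
            job.setMapperClass(TokenizerMapper.class);
            job.setReducerClass(SumReducer.class);
            job.setOutputKeyClass(Text.class);
            job.setOutputValueClass(IntWritable.class);
            FileInputFormat.addInputPath(job, new Path(args[0]));
            FileOutputFormat.setOutputPath(job, new Path(args[1]));
            return job.waitForCompletion(true) ? 0 : 1;
        }

        public static void main(String[] args) throws Exception {
            System.exit(ToolRunner.run(new WordCountDriver(), args));
        }
    }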

Environment: Hadoop, HDFS, MapReduce, Pig, Hive, Sqoop, HBase, Oozie, Flume, java.

Confidential

ETL Developer

Responsibilities:

  • Involved in analyzing and designing database using Normalization techniques.
  • Developed Forms, Menus, Object Libraries, and the PL/SQL Library using Oracle Form Builder 10g/9i.
  • Created and maintained the development environment; resolved Production and Validation environment problems.
  • Handled PL/SQL compile-time and runtime errors, debugged stored procedures for business logic modifications, and responded to system events through triggers.
  • Created PL/SQL Stored Procedures, Functions, Triggers and Packages for implementing business logic.
  • Implemented exception handling using autonomous transactions and locks, and used savepoints, commits, and rollbacks to maintain transactional consistency and database integrity (see the sketch after this list).
  • Developed Front end user interface using Forms 10g.
  • Detected and Corrected bugs during system integration and user acceptance testing.
  • Involved in Planning, Implementation, Testing, and Documentation.
  • Followed an agile process in application development.
  • Mapped the source and target databases by studying the specifications and analyzed the required transforms.
  • Created numerous Mappings and Mapplets using transformations such as Filter, Aggregator, Lookup, Expression, Sequence Generator, Sorter, Joiner, and Update Strategy.
  • Created Custom Triggers, Stored Procedures, Packages and SQL Scripts.
  • Made changes to existing Oracle PL/SQL Functions, Procedures, and Packages, wrote new code as per the requirements, and handled exceptions.
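
The transactional work above was implemented in Oracle PL/SQL; as a brief illustration of the savepoint/commit/rollback pattern it describes, the sketch below shows the same idea from Java via JDBC (the language used for the other examples here), not the original PL/SQL code. The connection details, table, and amounts are made-up assumptions.

    import java.sql.Connection;
    import java.sql.DriverManager;
    import java.sql.SQLException;
    import java.sql.Savepoint;
    import java.sql.Statement;

    public class TransactionSketch {
        public static void main(String[] args) throws SQLException {
            try (Connection conn = DriverManager.getConnection(
                    "jdbc:oracle:thin:@//db-host:1521/ORCL", "app_user", "app_password")) {
                conn.setAutoCommit(false); // manage transaction boundaries explicitly
                try (Statement stmt = conn.createStatement()) {
                    stmt.executeUpdate("UPDATE accounts SET balance = balance - 100 WHERE id = 1");
                    Savepoint afterDebit = conn.setSavepoint("afterDebit"); // partial-rollback point
                    try {
                        stmt.executeUpdate("UPDATE accounts SET balance = balance + 100 WHERE id = 2");
                    } catch (SQLException e) {
                        conn.rollback(afterDebit); // undo the failed credit back to the savepoint, then rethrow
                        throw e;
                    }
                    conn.commit(); // both statements are kept only if both succeed
                } catch (SQLException e) {
                    conn.rollback(); // on any error, roll back the whole transaction
                    throw e;
                }
            }
        }
    }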

Environment: Oracle 9i/10g, Teradata, SQL, SQL*Plus, PL/SQL, SQL*Loader, Informatica PowerCenter 8.6/8.1, Export/Import, UNIX Server, Oracle Web Application Server, Forms and Reports 9i, Toad

Confidential

Java Developer

Responsibilities:

  • Analyzed and gathered the system requirements.
  • Created design documents and reviewed them with the team, in addition to assisting the business analyst/project manager with explanations to the line of business.
  • Developed the web tier using JSP to show account details and summary.
  • Designed and developed the UI using JSP, HTML and JavaScript.
  • Utilized JPA for object/relational mapping and transparent persistence to the SQL Server database (see the entity sketch after this list).
  • Used the Tomcat web server for development purposes.
  • Used Oracle as the database and Toad for query execution, and wrote SQL scripts and PL/SQL code for procedures and functions.
  • Developed application using Eclipse.
  • Used Log4j to print logging, debugging, warning, and info messages to the server console.
  • Interacted with Business Analyst for requirements gathering.
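
A minimal sketch of the JPA mapping and Log4j usage referenced above: an entity mapped to a table, with basic logging in a setter. The entity, table, and column names are illustrative assumptions, not from the original application.

    import javax.persistence.Column;
    import javax.persistence.Entity;
    import javax.persistence.Id;
    import javax.persistence.Table;
    import org.apache.log4j.Logger;

    @Entity
    @Table(name = "ACCOUNT_SUMMARY")
    public class AccountSummary {

        private static final Logger LOG = Logger.getLogger(AccountSummary.class);

        @Id
        @Column(name = "ACCOUNT_ID")
        private Long accountId;

        @Column(name = "BALANCE")
        private double balance;

        public Long getAccountId() {
            return accountId;
        }

        public double getBalance() {
            return balance;
        }

        public void setBalance(double balance) {
            LOG.debug("Updating balance for account " + accountId); // debug output goes to the configured appender
            this.balance = balance;
        }
    }
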
Environment: Java, J2EE, JUnit, XML, JavaScript, Log4j, CVS, Eclipse, Apache Tomcat, and Oracle.
