
Sr Hadoop Developer Resume


Kansas City, Missouri

SUMMARY

  • Around 9 years of professional IT work experience in Analysis, Design, Development, Deployment and Maintenance of critical software and big data applications.
  • 5+ years of hands-on experience with Hadoop, including extensive experience with Big Data technologies.
  • Hands-on experience in developing and deploying enterprise applications using major Hadoop ecosystem components such as Map Reduce, Hive, Pig, Hbase, Flume, YARN, Sqoop, Spark Streaming, Spark SQL, Storm, Kafka, Oozie, Cassandra and Zookeeper.
  • Hands-on experience with multiple distributions, including Cloudera, Hortonworks and MapR.
  • Experience in installation, configuration, support and management of the Cloudera Hadoop platform, including CDH4 and CDH5 clusters.
  • Capable of processing large sets of structured, semi-structured and unstructured data and supporting systems application architecture.
  • Experience in automating Sqoop, Hive and Pig scripts using the Oozie workflow scheduler.
  • Good experience in optimizing MapReduce algorithms using Mappers, Reducers, combiners and partitioners.
  • Extensive experience in developing PIG Latin Scripts and using Hive Query Language for data analytics.
  • Experience in using different file formats like CSV, Sequence, AVRO, RC, ORC, JSON and PARQUET and different compression techniques like LZO, Gzip, Bzip2 and Snappy.
  • Experience in big data ingestion tools like Sqoop, Flume and Apache Kafka.
  • Experience in using Flume and Kafka to load the log data from multiple sources into HDFS.
  • Hands on experience with NoSQL Databases like Hbase, MongoDB and Cassandra.
  • Experience in retrieving data from databases like MYSQL, Teradata, Informix, DB2 and Oracle into HDFS using Sqoop and ingesting them into Hbase and Cassandra.
  • Good understanding and experience with Software development methodologies like Agile and Waterfall.
  • Used Apache NiFi to automate data-loading tasks into HDFS.

TECHNICAL SKILLS

Big Data Ecosystem: Hadoop, Map Reduce, YARN, Pig, Hive, Hbase, Flume, Sqoop, Impala, Oozie, Zookeeper, Apache Kafka, Scala, MongoDB, Cassandra.

Hadoop Distributions: Cloudera (CDH3, CDH4, and CDH5), Hortonworks and MapR.

NoSQL Databases: Cassandra, Hbase, MongoDB, CouchDB.

Java: J2EE, JSP, CSS, jQuery, Servlets, HTML, JavaScript

Mainframe: JCL, COBOL, CICS, DB2.

Databases: MySQL, Oracle, DB2 for Mainframes, Teradata, Informix.

Operating Systems: Windows, Unix, Linux

Other Tools: PuTTY, WinSCP, FileZilla, Streamweaver, Compuset.

Languages: Java, SQL, HTML, JavaScript, JDBC, XML, and C.

Frameworks: Struts, Spring, Hibernate.

App/Web servers: WebSphere, WebLogic, JBoss, Tomcat.

PROFESSIONAL EXPERIENCE

SR HADOOP DEVELOPER

Confidential, Kansas City, Missouri

Responsibilities:

  • Responsible for building scalable distributed data solutions using Hadoop.
  • Worked in an Agile Scrum methodology with test-driven development.
  • Optimized Hive queries and ran Hive on top of the Spark execution engine.
  • Developed Kafka producers and consumers, Cassandra clients and Spark components on top of HDFS and Hive.
  • Populated HDFS and HBase with huge amounts of data using Apache Kafka.
  • Used Kafka to ingest data into the Spark engine.
  • Hands-on experience in Spark and Spark Streaming, creating RDDs and applying operations such as transformations and actions.
  • Developed a Spark Streaming script that consumes topics from the distributed messaging source Kafka and periodically pushes batches of data to Spark for real-time processing (see the sketch after this list).
  • Managed and scheduled Spark jobs on a Hadoop cluster using Oozie.
  • Built data pipelines using Kafka and Akka for handling terabytes of data.
  • Involved in integrating the Hadoop cluster with the Spark engine to perform batch and GraphX operations.
  • Worked on indexes, scalability and query language support using Cassandra.
  • Created Sqoop scripts for importing data from different data sources into Hive and Cassandra.
  • Used Hue for running Hive queries and created daily partitions in Hive to improve performance.
  • Implemented Apache NiFi flow topologies to perform cleansing operations before moving data into HDFS.
  • Developed a data flow to pull data from a REST API using Apache NiFi with context configuration enabled.
  • Used NiFi to provide real-time control over the movement of data between source and destination.
  • Experience working on Talend ETL for performing data migration and data synchronization processes on the data warehouse.
  • Installed and configured a Hadoop cluster using Amazon Web Services (AWS) for POC purposes.
  • Worked with Amazon AWS services such as Kinesis, Lambda, EMR and EC2 for fast and efficient processing of Big Data.
  • Responsible for maintaining and expanding the AWS cloud infrastructure.
  • Developed batch scripts to fetch data from AWS S3 storage and perform the required transformations in Scala using the Spark framework.
  • Wrote Python and shell scripts for various deployment and automation processes.
  • Extracted files from MongoDB through Sqoop, placed them in HDFS and processed them.
  • Developed a data pipeline using Flume, Sqoop, Pig and Java MapReduce to ingest behavioral data into HDFS for analysis.
  • Visualized the analytical results using the Tableau visualization tool.
  • Developed R scripts to implement predictive analysis graphs in Tableau.
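
A minimal PySpark sketch of the Kafka-to-Spark streaming pattern referenced above. It uses the Structured Streaming API as a stand-in for the original script; the broker addresses, topic name and checkpoint path are hypothetical placeholders, and the spark-sql-kafka package is assumed to be on the classpath.

# Consume a Kafka topic and process it in micro-batches (sketch, placeholders throughout).
from pyspark.sql import SparkSession

spark = (SparkSession.builder
         .appName("kafka-stream-sketch")
         .getOrCreate())

# Subscribe to a Kafka topic; broker list and topic name are placeholders.
events = (spark.readStream
          .format("kafka")
          .option("kafka.bootstrap.servers", "broker1:9092,broker2:9092")
          .option("subscribe", "behavior-events")
          .option("startingOffsets", "latest")
          .load()
          .selectExpr("CAST(key AS STRING) AS key", "CAST(value AS STRING) AS value"))

# Emit each micro-batch (to the console here; HDFS or Hive sinks are configured similarly).
query = (events.writeStream
         .outputMode("append")
         .format("console")
         .option("checkpointLocation", "/tmp/checkpoints/behavior-events")
         .trigger(processingTime="30 seconds")
         .start())

query.awaitTermination()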

Environment: Apache Spark, Kafka, Map Reduce, Cassandra, YARN, Sqoop, Oozie, HDFS, Hive, Pig, Java, Hortonworks, AWS, Kinesis, Linux, XML, Eclipse, MySQL.

SR HADOOP DEVELOPER

Confidential - Chicago, Illinois

Responsibilities:

  • Implemented Storm builder topologies to perform cleansing operations before moving data into Cassandra.
  • Developed ETL jobs to load data coming from various sources, such as mainframes and flat files, into a data warehouse.
  • Implemented test scripts for test driven development and integration.
  • Configured Hive, Pig, Impala, Sqoop, Flume and Oozie in Cloudera (CDH5).
  • Experience in using Sqoop to import data into Cassandra tables from different relational databases, and in importing data from various sources to the Cassandra cluster using Java APIs.
  • Created Cassandra tables to load large sets of structured, semi-structured and unstructured data coming from Linux, NoSQL and a variety of portfolios.
  • Involved in creating data-models for customer data using Cassandra Query Language.
  • Developed multiple Map Reduce jobs in Java for data cleaning and preprocessing.
  • Developed wrapper and utility automation scripts in Python.
  • Wrote scripts to automate application deployments and configuration while monitoring YARN.
  • Wrote MapReduce programs in Python with the Hadoop streaming API (see the sketch after this list).
  • Involved in creating Hive tables and loading them with data and writing Hive queries.
  • Migrated some ETL processes from Microsoft SQL Server to Hadoop, utilizing Pig as the data pipeline for easy data manipulation.
  • Developed Spark jobs using Scala in a test environment for faster data processing and used Spark SQL for querying.
  • Experience with AWS components such as Kinesis, Amazon EC2 instances, S3 buckets, CloudFormation templates and the Boto library.
  • Hands-on experience with various AWS services such as Redshift clusters and Route 53 domain configuration.
  • Developed scripts that load data into Spark RDDs and perform in-memory computation to generate the output.
  • Developed a Spark Streaming script that consumes topics from the distributed messaging source Kafka and periodically pushes batches of data to Spark for real-time processing.
  • Involved in converting Cassandra/Hive/SQL queries into Spark transformations using Spark RDDs in Scala.
  • Experience with Elasticsearch technologies and in creating custom SolrQuery components.
  • Implemented Kafka Custom encoders for custom input format to load data into Kafka Partitions.
  • Worked on different data sources such as Oracle, Netezza, MySQL, flat files, etc.
  • Extensively used Sqoop to get data from RDBMS sources like Teradata and Netezza.
  • Developed Talend jobs to move inbound files to HDFS file location based on monthly, weekly, daily and hourly partitioning.
  • Worked in an Agile methodology environment for meeting the requirement deadlines.
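
A minimal sketch of the Hadoop streaming pattern mentioned above: one Python file that acts as both mapper and reducer, reading stdin and emitting tab-separated key/value pairs. Word count stands in for the actual cleaning and preprocessing logic; the file name and HDFS paths in the usage comment are placeholders.

#!/usr/bin/env python
# Hadoop streaming sketch: run the same file as mapper or reducer, e.g.
#   hadoop jar hadoop-streaming.jar \
#     -input /data/raw -output /data/counts \
#     -mapper "wordcount.py map" -reducer "wordcount.py reduce" -file wordcount.py
import sys

def mapper():
    # Emit one (word, 1) pair per token, tab-separated, as streaming expects.
    for line in sys.stdin:
        for word in line.strip().split():
            print("%s\t1" % word.lower())

def reducer():
    # Streaming sorts mapper output by key, so equal keys arrive contiguously.
    current, count = None, 0
    for line in sys.stdin:
        key, value = line.rstrip("\n").split("\t", 1)
        if key == current:
            count += int(value)
        else:
            if current is not None:
                print("%s\t%d" % (current, count))
            current, count = key, int(value)
    if current is not None:
        print("%s\t%d" % (current, count))

if __name__ == "__main__":
    (mapper if sys.argv[1] == "map" else reducer)()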

Environment: Cloudera, Map Reduce, Spark SQL, Spark Streaming, Pig, Hive, Flume, Hue, Oozie, Java, Eclipse, Zookeeper, Cassandra, Hbase, Talend, GitHub, Agile.

HADOOP DEVELOPER

Confidential - Milwaukee, Wisconsin

Responsibilities:

  • Worked on analyzing Hadoop stack and different big data analytic tools including Pig, Hive, Hbase and Sqoop.
  • Experienced in implementing the Hortonworks distribution (HDP 2.1, HDP 2.2 and HDP 2.3).
  • Developed Map Reduce programs for refined queries on big data.
  • Experienced in working with Elastic MapReduce (EMR).
  • Created Hive tables and worked on them for data analysis to meet the requirements.
  • Worked with the business team in creating Hive queries for ad hoc access.
  • Developed a framework to load and transform large sets of unstructured data from UNIX systems into Hive tables.
  • Implemented Hive generic UDFs to implement business logic.
  • Used Hive to analyze the partitioned and bucketed data and compute various metrics for reporting.
  • In-depth understanding of the MapReduce and YARN architectures.
  • Installed and configured Pig for ETL jobs.
  • Developed Pig UDFs to pre-process the data for analysis.
  • Implemented indexing of logs from Oozie into Elasticsearch (see the first sketch after this list).
  • Developed workflows in Oozie to automate the tasks of loading data into HDFS and pre-processing it with Pig.
  • Analyzed the data by running Hive queries, Pig scripts, Spark SQL and Spark Streaming jobs.
  • Used Apache NiFi to copy data from the local file system to HDFS.
  • Used the NiFi user interface for managing the data flow process.
  • Extracted files from Cassandra through Sqoop and placed in HDFS for further processing.
  • Involved in creating a generic Sqoop import script for loading data into Hive tables from RDBMS sources (see the second sketch after this list).
  • Involved in continuous monitoring of operations using Storm.
  • Design, develop, unit test, and support ETL mappings and scripts for data marts usingTalend.
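
First sketch: pushing Oozie log lines into Elasticsearch, roughly as described in the indexing bullet above. The host, index name and log path are hypothetical, and the calls assume the elasticsearch-py client's bulk helper.

# Index Oozie log lines into Elasticsearch for searching (sketch, placeholders throughout).
from elasticsearch import Elasticsearch, helpers

es = Elasticsearch(["http://es-node1:9200"])  # placeholder host

def log_actions(path, index_name):
    # Turn each log line into a bulk-index action document.
    with open(path) as f:
        for line_no, line in enumerate(f, start=1):
            yield {
                "_index": index_name,
                "_source": {"line": line_no, "message": line.rstrip("\n")},
            }

# Bulk-index the file; helpers.bulk batches the index requests.
helpers.bulk(es, log_actions("/var/log/oozie/oozie.log", "oozie-logs"))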
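
Second sketch: a generic Sqoop import wrapped in a small Python script, in the spirit of the generic import script mentioned above. The JDBC URL, credentials, table name and target directory are placeholders.

# Generic Sqoop import wrapper: shell out to the sqoop CLI (sketch, placeholders throughout).
import subprocess

def sqoop_import(jdbc_url, table, target_dir, username, password_file, mappers=4):
    # Pull an RDBMS table into HDFS as delimited files.
    cmd = [
        "sqoop", "import",
        "--connect", jdbc_url,
        "--username", username,
        "--password-file", password_file,   # HDFS path holding the password
        "--table", table,
        "--target-dir", target_dir,
        "--num-mappers", str(mappers),
    ]
    subprocess.check_call(cmd)

if __name__ == "__main__":
    sqoop_import(
        jdbc_url="jdbc:mysql://db-host:3306/sales",   # placeholder connection
        table="orders",
        target_dir="/user/etl/orders",
        username="etl_user",
        password_file="/user/etl/.sqoop_pass",
    )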

Environment: Hortonworks, Hadoop, Map Reduce, HDFS, Hive, Pig, Sqoop, Apache Kafka, Apache Storm, Oozie, SQL, Flume, Spark, Hbase, Cassandra, Informatica, Java, GitHub.

HADOOP DEVELOPER

Confidential - Chicago, IL

Responsibilities:

  • Experienced in development using Cloudera distribution system.
  • Load and transform large data sets of structured, semi structured and unstructured data using Hadoop/Big Data concepts.
  • Involved in POCs for performance comparison of Spark SQL with Hive.
  • Imported data from Teradata using Sqoop with the Teradata connector.
  • Created sub-queries for filtering and faster execution of data.
  • Created multiple join tables and fetched the required data.
  • Worked in an AWS environment for development and deployment of custom Hadoop applications.
  • Installed and set up HBase and Impala.
  • Used Apache Impala to read, write and query Hadoop data in HDFS and Hbase.
  • Implemented partitioning, dynamic partitions and buckets in Hive (see the sketch after this list).
  • Supported Map Reduce programs running on the cluster.
  • Developed ETL test scripts based on technical specifications/data design documents and source-to-target mappings.
  • Worked on debugging and performance tuning of Hive and Pig jobs.
  • Bulk-loaded data into Oracle using the JDBC template.
  • Created Groovy scripts to load CSV files into Oracle tables.
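
A minimal sketch of Hive partitioning with dynamic partition inserts, issued here through a Hive-enabled SparkSession; the same DDL and DML could be run from the Hive shell. Table and column names are hypothetical placeholders.

# Create a date-partitioned Hive table and load it with dynamic partitioning (sketch).
from pyspark.sql import SparkSession

spark = (SparkSession.builder
         .appName("hive-partition-sketch")
         .enableHiveSupport()
         .getOrCreate())

# Allow Hive to derive partitions from the data itself (dynamic partitioning).
spark.sql("SET hive.exec.dynamic.partition=true")
spark.sql("SET hive.exec.dynamic.partition.mode=nonstrict")

spark.sql("""
    CREATE TABLE IF NOT EXISTS sales_part (
        order_id STRING,
        amount   DOUBLE
    )
    PARTITIONED BY (order_date STRING)
    STORED AS ORC
""")

# The partition column goes last in the SELECT so rows are routed to partitions dynamically.
spark.sql("""
    INSERT OVERWRITE TABLE sales_part PARTITION (order_date)
    SELECT order_id, amount, order_date FROM sales_staging
""")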

Environment: Cloudera, HDFS, Pig, Hive, Map Reduce, Python, Sqoop, Storm, Linux, Hbase, Impala, Java, SQL, Cassandra, MongoDB, SVN.

JAVA/HADOOP DEVELOPER

Confidential - Boston, Massachusetts

Responsibilities:

  • Developed JSP, JSF and Servlets to dynamically generate HTML and display the data to the client side.
  • Used the Hibernate framework for persistence with an Oracle database.
  • Wrote and debugged the Ant scripts for building the entire web application.
  • Developed web services in Java; experienced with SOAP and WSDL, and used WSDL to publish the services to other applications.
  • Implemented Java Message Service (JMS) messaging using the JMS API.
  • Involved in managing and reviewing Hadoop log files.
  • Installed and configured Hadoop, HDFS, YARN and Flume, and developed multiple Map Reduce jobs in Java for data cleaning.
  • Coded Hadoop Map Reduce jobs for energy generation and PS.
  • Coded Servlets, SOAP clients and Apache CXF REST APIs for delivering data from our application to external and internal consumers over the communication protocol.
  • Worked on the Cloudera distribution for running Hadoop jobs.
  • Expertise in writing Hadoop jobs to analyze data using Map Reduce, Hive, Pig and Solr.
  • Created a SOAP web service using JAX-WS to enable clients to consume a SOAP web service.
  • Experience in importing and exporting data using Sqoop from HDFS to Relational Database Systems (RDBMS) and vice versa.
  • Experienced in designing and developing multi-tier scalable applications using Java and J2EE design patterns.

Environment: MapR, Java, HTML, JavaScript, SQL Server, PL/SQL, JSP, Spring, Hibernate, Web Services, SOAP, SOA, JSF, Java, JMS, JUnit, Oracle, Eclipse, SVN, XML, CSS, Log4j, Ant, Apache Tomcat.

JAVA DEVELOPER

Confidential

Responsibilities:

  • Involved in projects utilizing Java and Java EE web applications to create fully integrated client management systems.
  • Developed the UI using HTML, JavaScript and JSP, and developed business logic and interfacing components using Business Objects, JDBC and XML.
  • Participated in user requirement sessions to analyze and gather business requirements.
  • Developed the user-facing site using Perl, back-end admin sites using Python and big data components using core Java.
  • Involved in development of the application using Spring Web MVC and other components of the Spring framework.
  • Elaborated use cases based on business requirements and was responsible for creating class diagrams and sequence diagrams.
  • Implemented object-relational mapping in the persistence layer using the Hibernate (ORM) framework.
  • Implemented REST web services with the Jersey API to handle customer requests.
  • Experienced in developing RESTful web services, both consumed and produced.
  • Used Hibernate for the Database connection and Hibernate Query Language (HQL) to add and retrieve the information from the Database.
  • Implemented Spring JDBC for connecting to the Oracle database.
  • Designed the application using the MVC framework for easy maintainability.
  • Provided bug fixing and testing for existing web applications.
  • Involved in the full system life cycle and responsible for development, testing and implementation.
  • Involved in Unit Testing, Integration Testing and System Testing.
  • Implemented form beans and their validations.
  • Wrote Hibernate components.
  • Developed client-side validations with JavaScript.

Environment: Spring, JSP, Servlets, REST, Oracle, AJAX, JavaScript, jQuery, Hibernate, WebLogic, Log4j, HTML, XML, CVS, Eclipse, SOAP Web Services, XSLT, XSD, UNIX, Maven, Mockito, JUnit, Jenkins, shell scripting, MVS, ISPF.
