
Hadoop/SQL Developer Resume


Denver, CO

PROFESSIONAL SUMMARY:

  • Software developer with 7+ years of experience in information technology, including 4+ years suggesting and implementing solutions for applications using Big Data technologies as a Hadoop Developer, and 3+ years in the design, development, analysis, deployment, and maintenance of software applications using Java/J2EE technologies across various industries.
  • Experience in installing, configuring and maintaining multiple Hadoop clusters of different sizes.
  • Exposure to the design and development of database-driven systems.
  • Good knowledge of Hadoop architectural components like Hadoop Distributed File System, Name Node, Data Node, Task Tracker, Job Tracker, and Map Reduce programming.
  • Experience in developing and deploying applications using Hadoop-based components like Hadoop MapReduce (MR1), YARN (MR2), HDFS, Hive, Pig, HBase, Flume, Sqoop, Spark (Streaming, Spark SQL, Spark ML), Storm, Kafka, Oozie, ZooKeeper, and Avro.
  • Exposure to Big Data technologies and the Hadoop ecosystem, with an in-depth understanding of MapReduce and Hadoop infrastructure.
  • Experience in writing MapReduce jobs using native Java code, Pig, and Hive for data processing (a minimal Java example follows this summary).
  • Hands on experience in importing and exporting data into HDFS and Hive using Sqoop.
  • Exposure to column-oriented NoSQL databases such as HBase and Cassandra.
  • Extensive experience working with structured, semi-structured, and unstructured data by implementing complex MapReduce programs using design patterns.
  • Excellent knowledge of multiple platforms such as Cloudera, Hortonworks, MapR, etc.
  • Familiar with data architecture including data ingestion pipeline design, Hadoop information architecture, data modeling and data mining, machine learning and advanced data processing.
  • Hands-on experience with major Big Data components: Apache Kafka, Apache Spark, ZooKeeper, and Avro.
  • Experienced in implementing unified data platforms using Kafka producers/consumers and implementing pre-processing using Storm topologies.
  • Strong experience in architecting real-time streaming applications and batch-style large-scale distributed computing applications using tools like Spark Streaming, Spark SQL, Kafka, Flume, MapReduce, and Hive.
  • Experience using various Hadoop distributions (Cloudera, Hortonworks, MapR, etc.) to fully implement and leverage new Hadoop features.
  • Experience in Apache Flume for collecting, aggregating, and moving large volumes of data from various sources such as web server and Telnet sources.
  • Great team player and quick learner with effective communication, motivation, and organizational skills, combined with attention to detail and a focus on business improvements.
  • Experienced in the complete SDLC, including requirements gathering, design, development, testing, and production deployment.
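
To illustrate the MapReduce experience noted above, here is a minimal, hedged sketch of a native Java token-count job; the class names and input/output paths are illustrative only and not taken from any project listed here.

```java
import java.io.IOException;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.IntWritable;
import org.apache.hadoop.io.LongWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Job;
import org.apache.hadoop.mapreduce.Mapper;
import org.apache.hadoop.mapreduce.Reducer;
import org.apache.hadoop.mapreduce.lib.input.FileInputFormat;
import org.apache.hadoop.mapreduce.lib.output.FileOutputFormat;

public class TokenCount {

    // Mapper: emit (token, 1) for every whitespace-separated token in an input line.
    public static class TokenMapper extends Mapper<LongWritable, Text, Text, IntWritable> {
        private static final IntWritable ONE = new IntWritable(1);
        private final Text word = new Text();

        @Override
        protected void map(LongWritable key, Text value, Context context)
                throws IOException, InterruptedException {
            for (String token : value.toString().split("\\s+")) {
                if (!token.isEmpty()) {
                    word.set(token);
                    context.write(word, ONE);
                }
            }
        }
    }

    // Reducer: sum the counts emitted for each token.
    public static class SumReducer extends Reducer<Text, IntWritable, Text, IntWritable> {
        @Override
        protected void reduce(Text key, Iterable<IntWritable> values, Context context)
                throws IOException, InterruptedException {
            int sum = 0;
            for (IntWritable count : values) {
                sum += count.get();
            }
            context.write(key, new IntWritable(sum));
        }
    }

    public static void main(String[] args) throws Exception {
        Job job = Job.getInstance(new Configuration(), "token count");
        job.setJarByClass(TokenCount.class);
        job.setMapperClass(TokenMapper.class);
        job.setCombinerClass(SumReducer.class);
        job.setReducerClass(SumReducer.class);
        job.setOutputKeyClass(Text.class);
        job.setOutputValueClass(IntWritable.class);
        FileInputFormat.addInputPath(job, new Path(args[0]));
        FileOutputFormat.setOutputPath(job, new Path(args[1]));
        System.exit(job.waitForCompletion(true) ? 0 : 1);
    }
}
```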

TECHNICAL SKILLS:

Hadoop/Big Data: HDFS, MapReduce, HBase, Pig, Hive, Sqoop

Data warehousing: Informatica PowerCenter, ETL, Informatica PowerExchange, Metadata, Data Mining, SQL, OLAP, OLTP, Workflow Manager and Workflow Monitor.

Programming languages: Java, Python, Linux shell scripts

Databases: MS-SQL Server, HBase, NoSQL Cassandra

Web Servers: WebLogic, WebSphere, Apache Tomcat, AWS

Web Technologies: HTML, XML, JavaScript, Python

Operating Systems: Linux, Unix, Windows 8, Windows 7, Windows Server 2008/2003

PROFESSIONAL EXPERIENCE:

Confidential, Denver, CO

Hadoop/SQL Developer

Responsibilities:

  • Imported log files of MasterCard, Base II, and Confidential organizations from mainframes using GoldenGate software (GGS) and ingested these log files into Hive by creating Hive external tables for each type of log file.
  • Wrote complex Hive and SQL queries for data analysis to meet business requirements.
  • Created Hive external tables to store the GGS output and used them for data analysis to meet business requirements.
  • Used ESP-scheduled jobs to automate the pipeline workflow and orchestrate the MapReduce jobs that extract the data in a timely manner.
  • Involved in Hive performance optimizations such as partitioning and bucketing, performed several types of joins on Hive tables, and implemented Hive SerDes such as JSON and Avro.
  • Designed and implemented Map Reduce-based large-scale parallel relation-learning system.
  • Worked with Parquet and the Avro data serialization system to handle all data formats.
  • Implemented several types of scripts, including shell, Python, and HQL scripts, to meet business requirements.
  • Developed Spark Streaming applications for real-time processing.
  • Experienced in managing and reviewing Hadoop log files.
  • Experienced in working with different scripting technologies like Python, Unix shell scripts.
  • Used the Spark API over Cloudera Hadoop YARN to perform analytics on data in Hive (see the sketch after this list).
  • Designed and developed a corporate intranet used in the daily workflow.
  • Applied different transformations and actions in Spark SQL, such as joins and collect.
  • Drove and led solution design services, including requirements analysis, functional and technical design leadership, and documentation/review with business and IT constituents.
  • Worked in agile development environments using continuous integration and deployment.
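
A minimal sketch of the Spark-over-Hive analytics referenced above, assuming the Spark 2.x Java API with Hive support; the table name, columns, and output path are hypothetical placeholders.

```java
import org.apache.spark.sql.Dataset;
import org.apache.spark.sql.Row;
import org.apache.spark.sql.SaveMode;
import org.apache.spark.sql.SparkSession;

public class HiveLogAnalytics {
    public static void main(String[] args) {
        // enableHiveSupport() lets Spark read the Hive metastore when running on YARN.
        SparkSession spark = SparkSession.builder()
                .appName("hive-log-analytics")
                .enableHiveSupport()
                .getOrCreate();

        // Query a partitioned Hive external table (names are illustrative).
        Dataset<Row> dailyCounts = spark.sql(
                "SELECT txn_type, COUNT(*) AS txn_count "
              + "FROM transaction_logs WHERE load_date = '2017-01-01' "
              + "GROUP BY txn_type");

        // Persist the aggregated result back to HDFS for downstream reporting.
        dailyCounts.write().mode(SaveMode.Overwrite)
                .parquet("hdfs:///analytics/daily_txn_counts");

        spark.stop();
    }
}
```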

Environment: Hadoop, HDFS, Hive, Spark, MapReduce, Cloudera, Avro, CDH, Shell script, HBase, Java, Eclipse, Python, MySQL.

Confidential, Richmond, VA

Sr. Big Data/ Hadoop Developer.

Responsibilities:

  • Involved in installing Hadoop Ecosystem components (Hadoop, MapReduce, Spark, Pig, Hive, Sqoop, Flume, Zookeeper and HBase).
  • Installed and configured a three-node cluster in fully distributed mode and in pseudo-distributed mode.
  • Imported and exported data (MySQL, CSV and text file) from local/ external file system and MySQL to HDFS on a regular basis.
  • Worked with structured, semi-structured, and unstructured data, automated using the BigBench tool.
  • Implemented Hive Generic UDFs to incorporate business logic into Hive queries.
  • Configured Spark Streaming to receive real-time data from Kafka and store the streamed data in HDFS (see the sketch after this list).
  • Worked with Spark to create structured data from the pool of unstructured data received.
  • Developed multiple MapReduce jobs in Java and Python for data cleaning and preprocessing.
  • Assisted with data capacity planning and node forecasting.
  • Involved in continuous monitoring and managing the Hadoop cluster using Hortonworks.
  • Experience and working knowledge of TVA.
  • Developed optimal strategies for distributing the web log data over the cluster; importing and exporting the stored web log data into HDFS and Hive using Sqoop.
  • Used Tableau for data visualization.
  • Developed Spark scripts by using Scala shell commands as per the requirement.
  • Loaded data from various data sources and legacy systems into the Teradata production and development warehouses using BTEQ, FastExport, MultiLoad, FastLoad, and Informatica.
  • Involved in migration projects to move data from data warehouses on Oracle/DB2 to Teradata.
  • Responsible for configuring the Hadoop cluster and troubleshooting common cluster problems.
  • Involved in handling the issues related to cluster start, node failures on the system.
  • Performed cluster configuration and inter- and intra-cluster data transfers (DistCp and HFTP).
  • Monitored workload, job performance and capacity planning using Cloudera Manager.
  • Worked collaboratively with all levels of business stakeholders to architect, implement and test Big Data based analytical solution from disparate sources.
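
One hedged way to express the Kafka-to-HDFS Spark Streaming flow above, assuming Spark 2.x Structured Streaming with the spark-sql-kafka connector on the classpath; the broker address, topic, and HDFS paths are placeholders.

```java
import org.apache.spark.sql.Dataset;
import org.apache.spark.sql.Row;
import org.apache.spark.sql.SparkSession;
import org.apache.spark.sql.streaming.StreamingQuery;

public class KafkaToHdfsStream {
    public static void main(String[] args) throws Exception {
        SparkSession spark = SparkSession.builder()
                .appName("kafka-to-hdfs")
                .getOrCreate();

        // Subscribe to a Kafka topic and receive records as a streaming DataFrame.
        Dataset<Row> kafkaStream = spark.readStream()
                .format("kafka")
                .option("kafka.bootstrap.servers", "broker1:9092")
                .option("subscribe", "web-events")
                .load();

        // Kafka values arrive as bytes; cast them to strings before landing them on HDFS.
        Dataset<Row> events = kafkaStream.selectExpr("CAST(value AS STRING) AS event_json");

        // Append each micro-batch to HDFS; the checkpoint directory makes the query restartable.
        StreamingQuery query = events.writeStream()
                .format("parquet")
                .option("path", "hdfs:///data/raw/web_events")
                .option("checkpointLocation", "hdfs:///checkpoints/web_events")
                .outputMode("append")
                .start();

        query.awaitTermination();
    }
}
```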

Environment: Hadoop, HDFS, MapReduce, Hive, Sqoop, Spark, Scala, Kafka, Oozie, NiFi, Cassandra, Python, Maven, Shell Scripting.

Confidential, Chevy Chase, MD

Big Data/ Hadoop Developer

Responsibilities:

  • Worked with the business users to gather, define business requirements and analyze the possible technical solutions.
  • Installed NameNode, Secondary NameNode, YARN (ResourceManager, NodeManager, ApplicationMaster), and DataNode.
  • Installed and configured HDP 2.2.
  • Responsible for implementation and ongoing administration of Hadoop infrastructure.
  • Developed workflow in Oozie to automate the tasks of loading the data into HDFS and pre-processing with Pig.
  • Migrated complex map reduce programs into in memory Spark processing using Transformations and actions.
  • Good knowledge of Teradata Manager, TDWM, PMON, DBQL, SQL Assistant, and BTEQ.
  • Gathered system design requirements and designed and wrote system specifications.
  • Designed and developed UNIX shell scripts as part of the ETL process to compare control totals, automate the process of loading, pulling and pushing data from and to different servers.
  • Experienced in working with different scripting technologies like Python, Unix shell scripts.
  • Mentored analyst and test team for writing Hive Queries.
  • Worked on optimizing and tuning the Teradata views and SQL’s to improve the performance of batch and response time of data for users.
  • Designed workflows with many sessions with decision, assignment task, event wait, and event raise tasks.
  • Extracted the needed data from the server into HDFS and bulk loaded the cleaned data into HBase.
  • Handled different time-series data using HBase, storing data and performing analytics based on time to improve query retrieval time (see the sketch after this list).
  • Involved in Agile methodologies, daily scrum meetings, and sprint planning.
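
A minimal sketch of the time-series row-key pattern referenced above, assuming the HBase 2.x Java client; the table name, column family, and sensor identifiers are illustrative assumptions.

```java
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.hbase.HBaseConfiguration;
import org.apache.hadoop.hbase.TableName;
import org.apache.hadoop.hbase.client.Connection;
import org.apache.hadoop.hbase.client.ConnectionFactory;
import org.apache.hadoop.hbase.client.Put;
import org.apache.hadoop.hbase.client.Result;
import org.apache.hadoop.hbase.client.ResultScanner;
import org.apache.hadoop.hbase.client.Scan;
import org.apache.hadoop.hbase.client.Table;
import org.apache.hadoop.hbase.util.Bytes;

public class SensorTimeSeries {
    public static void main(String[] args) throws Exception {
        Configuration conf = HBaseConfiguration.create();
        try (Connection conn = ConnectionFactory.createConnection(conf);
             Table table = conn.getTable(TableName.valueOf("sensor_events"))) {

            // Row key = entityId + '#' + reversed timestamp, so the newest reading sorts first
            // and a prefix scan returns one entity's history in reverse-chronological order.
            long reversedTs = Long.MAX_VALUE - System.currentTimeMillis();
            Put put = new Put(Bytes.toBytes("sensor-42#" + reversedTs));
            put.addColumn(Bytes.toBytes("d"), Bytes.toBytes("value"), Bytes.toBytes("73.4"));
            table.put(put);

            // Bound the scan by the row-key prefix to read back a single entity's readings.
            Scan scan = new Scan()
                    .withStartRow(Bytes.toBytes("sensor-42#"))
                    .withStopRow(Bytes.toBytes("sensor-42$"));
            try (ResultScanner results = table.getScanner(scan)) {
                for (Result row : results) {
                    System.out.println(Bytes.toString(row.getRow()));
                }
            }
        }
    }
}
```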

Environment: Hadoop, HDFS, Hive, Pig, Flume, Sqoop, Spark, MapReduce, Cloudera, Avro, Snappy, Zookeeper, CDH, NoSQL, HBase, Java (JDK 1.6), Eclipse, Python, MySQL.

Confidential, Dallas, TX

Big Data/ Hadoop Developer

Responsibilities:

  • Participated in Hadoop Deployment and infrastructure scaling.
  • Involved in creating Hive tables and loading and analyzing data using Hive queries.
  • Developed simple to complex MapReduce jobs using Hive and Pig.
  • Developed workflow in Oozie to automate the tasks of loading the data into HDFS and pre-processing with Pig.
  • Migrated complex map reduce programs into in memory Spark processing using Transformations and actions.
  • Parsed high-level design spec to simple ETL coding and mapping standards.
  • Maintained warehouse metadata, naming standards and warehouse standards for future application development.
  • Worked with Linux systems and RDBMS database on a regular basis in order to ingest data using Sqoop.
  • Implemented Kafka consumers to move data from Kafka partitions into Cassandra for near-real-time analysis (see the sketch after this list).
  • Involved in Hadoop cluster tasks like adding and removing nodes.
  • Managed and reviewed Hadoop log files and loaded log data into HDFS using Sqoop.
  • Created and maintained technical documentation for launching Hadoop clusters and for executing Hive queries, Pig scripts, and Sqoop jobs.
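
A hedged sketch of the Kafka-to-Cassandra consumer mentioned above, assuming the kafka-clients consumer API and the DataStax Java driver 3.x; the broker address, topic, keyspace, and table are placeholders rather than project details.

```java
import java.util.Collections;
import java.util.Properties;

import org.apache.kafka.clients.consumer.ConsumerRecord;
import org.apache.kafka.clients.consumer.ConsumerRecords;
import org.apache.kafka.clients.consumer.KafkaConsumer;

import com.datastax.driver.core.Cluster;
import com.datastax.driver.core.PreparedStatement;
import com.datastax.driver.core.Session;

public class EventsToCassandra {
    public static void main(String[] args) {
        Properties props = new Properties();
        props.put("bootstrap.servers", "broker1:9092");
        props.put("group.id", "events-to-cassandra");
        props.put("key.deserializer", "org.apache.kafka.common.serialization.StringDeserializer");
        props.put("value.deserializer", "org.apache.kafka.common.serialization.StringDeserializer");

        try (KafkaConsumer<String, String> consumer = new KafkaConsumer<>(props);
             Cluster cluster = Cluster.builder().addContactPoint("cassandra-node1").build();
             Session session = cluster.connect("analytics")) {

            consumer.subscribe(Collections.singletonList("events"));
            PreparedStatement insert = session.prepare(
                    "INSERT INTO events_by_id (event_id, payload) VALUES (?, ?)");

            // Poll Kafka partitions and write each record into Cassandra for near-real-time queries.
            while (true) {
                ConsumerRecords<String, String> records = consumer.poll(1000);
                for (ConsumerRecord<String, String> record : records) {
                    session.execute(insert.bind(record.key(), record.value()));
                }
            }
        }
    }
}
```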

Environment: Hortonworks, Hadoop, HDFS, Spark, Oozie, Pig, Hive, MapReduce, Sqoop, Cassandra, Linux.

Confidential

Java Developer

Responsibilities:

  • Participated in the Requirement collection process for Trading, Import & Exports.
  • Created/customized different save beans, load beans, and Xbeans for various requirements.
  • Deployment of customizations onto the Unix/on-site environment.
  • Completed JSP changes for UI part.
  • Created the required views and tables using SQL & DAT files.
  • Configuration and creation of different processes.
  • Created RESTful web services using Java and Node.js.
  • Developed Detail Design and Technical Design Documents.
  • Developed APIs to integrate with the architecture, including creating machine images.
  • Coded different Java validation classes for the application logic and utilities.
  • Developed JMS resources for asynchronous message receiving from various client systems (see the sketch after this list).
  • Extracted data from the XML files and saved the data in the Oracle database.
  • Implemented database interactions using JDBC/SQL with back-end Oracle 10g and 9i, and developed the necessary stored procedures and triggers in Oracle.
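
A minimal sketch of the asynchronous JMS receiving pattern noted above, using the standard javax.jms MessageListener API; the queue name and payload handling are illustrative assumptions.

```java
import javax.jms.Connection;
import javax.jms.ConnectionFactory;
import javax.jms.JMSException;
import javax.jms.Message;
import javax.jms.MessageConsumer;
import javax.jms.MessageListener;
import javax.jms.Queue;
import javax.jms.Session;
import javax.jms.TextMessage;

public class OrderMessageReceiver implements MessageListener {

    // Called by the JMS provider whenever a message arrives on the subscribed destination.
    @Override
    public void onMessage(Message message) {
        try {
            if (message instanceof TextMessage) {
                String body = ((TextMessage) message).getText();
                // Hand the payload off to the application's processing logic.
                System.out.println("Received order message: " + body);
            }
        } catch (JMSException e) {
            // Log and let the provider redeliver according to its policy.
            e.printStackTrace();
        }
    }

    // Registers the listener; the ConnectionFactory would normally be looked up from JNDI.
    public static void listen(ConnectionFactory factory) throws JMSException {
        Connection connection = factory.createConnection();
        Session session = connection.createSession(false, Session.AUTO_ACKNOWLEDGE);
        Queue queue = session.createQueue("ORDER.IN");
        MessageConsumer consumer = session.createConsumer(queue);
        consumer.setMessageListener(new OrderMessageReceiver());
        connection.start();
    }
}
```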

Environment: Java, J2EE, Servlets, JSP, Html, Rational Application Developer V7.0, Rational Clear Case LT, Toad, Oracle.

Confidential

Java Developer

Responsibilities:

  • Responsible and active in the analysis, design, implementation and deployment of full Software Development Lifecycle (SDLC) of the project.
  • Designed and developed user interface using JSP, HTML and JavaScript.
  • Developed Struts action classes, action forms and performed action mapping using Struts framework and performed data validation in form beans and action classes.
  • Defined the search criteria and pulled the customer's record from the database, made the required changes, and saved the updated record back to the database.
  • Validated the fields of user registration screen and login screen by writing JavaScript validations.
  • Used DAO and JDBC for database access (see the sketch after this list).
  • Designed and developed XML processing components for dynamic menus in the application.
  • Involved in post-production support and maintenance of the application.
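
A minimal sketch of the DAO/JDBC access pattern from the bullet above, written against a modern JDK for brevity; the table, columns, and Customer value object are hypothetical.

```java
import java.sql.Connection;
import java.sql.PreparedStatement;
import java.sql.ResultSet;
import java.sql.SQLException;
import javax.sql.DataSource;

public class CustomerDao {
    private final DataSource dataSource;

    public CustomerDao(DataSource dataSource) {
        this.dataSource = dataSource;
    }

    // Looks up a single customer record by primary key, or returns null if absent.
    public Customer findById(long id) throws SQLException {
        String sql = "SELECT id, name, email FROM customers WHERE id = ?";
        try (Connection conn = dataSource.getConnection();
             PreparedStatement stmt = conn.prepareStatement(sql)) {
            stmt.setLong(1, id);
            try (ResultSet rs = stmt.executeQuery()) {
                if (rs.next()) {
                    return new Customer(rs.getLong("id"), rs.getString("name"), rs.getString("email"));
                }
                return null;
            }
        }
    }
}

// Simple value object used by the DAO above.
class Customer {
    final long id;
    final String name;
    final String email;

    Customer(long id, String name, String email) {
        this.id = id;
        this.name = name;
        this.email = email;
    }
}
```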

Environment: Java 1.5, Struts, Servlets, HTML, XML, SQL, J2EE, JUnit, Tomcat 6.
