
Big Data Consultant Resume


Minnesota

PROFESSIONAL SUMMARY:

  • Over 8 years of experience with emphasis on Big Data/Hadoop technologies and the design and development of Java-based enterprise applications.
  • Experience across the complete software development life cycle (SDLC): user interaction, design, development, implementation, integration, documentation, testing, deployment, builds, configuration, and code management.
  • Expertise in the creation of on-prem and cloud data lakes.
  • Experience working with Cloudera Distribution of Hadoop.
  • Expertise in HDFS, MapReduce, Spark, Hive, Pig, Sqoop, HBase, Oozie, Flume, and various other ecosystem components.
  • Knowledge of Hadoop cluster setup, integration, and installation.
  • Expertise in Spark framework for batch and real time data processing.
  • Experience working with BI teams to translate big data requirements into Hadoop-centric technologies.
  • Experience in performance tuning the Hadoop cluster by gathering and analyzing the existing infrastructure.
  • Working experience designing and implementing complete end-to-end Hadoop infrastructure, including Pig, Hive, Sqoop, Oozie, Flume, and ZooKeeper.
  • Extensive experience with big data query tools like Pig Latin and HiveQL.
  • Experience with Sequence files, AVRO and HAR file formats and compression.
  • Experience in tuning and troubleshooting performance issues in Hadoop cluster.
  • Experience on monitoring, performance tuning, SLA, scaling and security in Big Data systems.
  • Experience working with MapReduce programs using Apache Hadoop to process Big Data.
  • Hands-on NoSQL database experience with HBase, MongoDB, and Cassandra.
  • Extensive experience in Data Ingestion, In-Stream data processing, Batch Analytics and Data Persistence strategy.
  • Experience in designing and architecting large scale distributed applications.
  • Knowledge on converting MapReduce applications to Spark.
  • Experience in working with flume to load the log data from multiple sources directly into HDFS.
  • Experience in data migration from existing data stores and mainframe NDM (Network Data Mover) to Hadoop.
  • Experience in handling multiple relational databases: MySQL, SQL Server, and Oracle.
  • Experience in supporting data analysis projects using Elastic MapReduce on the Amazon Web Services (AWS) cloud, including exporting and importing data into S3.
  • Experience in designing both time driven and data driven automated workflows using Oozie.
  • Experience in supporting analysts by administering and configuring Hive.
  • Experience in running Pig and Hive scripts.
  • Experience in fine-tuning MapReduce jobs for better scalability and performance.
  • Developed various MapReduce applications to perform ETL workloads on terabytes of data.
  • Performed Importing and exporting data into HDFS and Hive using Sqoop.
  • Experience in writing LINUX/UNIX shell scripts to dump sharded data from landing zones to HDFS.
  • Worked on predictive modeling techniques like Neural Networks, Decision Trees and Regression Analysis.
  • Experience in Data mining and Business Intelligence tools such as Teradata, Informatica and MSBI.
  • Strong experience as a senior Java Developer in Web/intranet, Client/Server technologies using Java, J2EE, Servlets, JSP, EJB, JDBC.
  • In-depth knowledge of object-oriented programming (OOP) methodologies and features such as inheritance, polymorphism, exception handling, and templates, with development experience in Java technologies.
  • Well experienced in training, mentoring, and motivating members of my own and other teams to achieve company goals.
  • Participated in in-room and telephonic Scrum meetings to gather and analyze requirements and plan development.
  • Strong experience in client interaction and in understanding business applications, business data flow, and data relations.
  • Experience with different operating systems: UNIX, LINUX, and Windows.
  • Strong troubleshooting and production support skills, with good interaction abilities with end users.
  • Good working knowledge of industry best practices for enterprise development, including implementing and adhering to design patterns.
  • Strong problem solving, analysis, implementation, installation, and configuration skills.
  • Good interpersonal skills; committed, result-oriented, and hard-working, with a zeal to learn new technologies and undertake challenging tasks. Excellent team member with strong communication skills, capable of meeting set deadlines.

TECHNICAL SKILLS:

Programming Languages: C, Java, C#, LINUX/UNIX Shell Scripting.

Big Data Technologies: Apache Hadoop, Cloudera Distribution (HDFS & MapReduce)

Hadoop Ecosystem: YARN, Spark, Pig, Hive, Sqoop, Flume, ZooKeeper, Oozie, Hue.

NoSQL: HBase, Cassandra, MongoDB

Database Tools: SQL Server 2008, MySQL, Oracle 10g.

Operating Systems: Windows XP/7/8, LINUX, UNIX.

Version Control & Issue Tracking: CVS, SVN, Git, MantisBT, and JIRA.

Networking Tools: PuTTY, FileZilla, and WinSCP.

IDEs and Utilities: Eclipse, NetBeans.

Data Integration Tools: Talend, Datameer

Others: Spring MVC, XML, SOAP, AWS

PROFESSIONAL EXPERIENCE:

Confidential, Minnesota

Big Data Consultant

Environment: Hadoop, CDH 4.X, Hue, MapReduce, Hive, Pig, Sqoop, Oozie, NoSQL, Core Java, JDBC, J2EE, Teradata, SVN, Eclipse, PuTTY, WinSCP, Shell Scripting and Ubuntu 10.

Responsibilities:

  • Performed Sqoop imports of data from Data warehouse platform to HDFS and built hive tables on top of the datasets.
  • Built ETL workflow to process data on hive tables.
  • Used Hue to create Oozie workflows performing different kinds of actions, such as Hive, Java, and MapReduce.
  • Worked extensively in Hive, using features such as UDFs and UDAFs.
  • Supported MapReduce programs running on the cluster.
  • Responsible for managing data coming from different sources.
  • Used Sequence and Avro file formats with Snappy compression when storing data in HDFS; used efficient columnar storage (Parquet) for data consumed by the business.
  • Worked extensively in MapReduce using Java; well versed with features such as multiple outputs.
  • Worked with various types of SerDes.
  • Mentored analysts and the test team in writing Hive queries.
  • Read Hive tables from MapReduce and made them available to all data nodes via the distributed cache. Used both Hue and XML for Oozie.
  • Gained good experience with NoSQL databases.
  • Extracted data from Teradata into HDFS using Sqoop.
  • Participated in building a test cluster for implementing Kerberos authentication, installing Cloudera Manager and Hue.
  • Worked with different file formats (ORC, RCFile, SequenceFile, TextFile) and different compression codecs (gzip, Snappy, LZO).
  • Created HBase tables to load large sets of structured, semi-structured, and unstructured data coming from LINUX, NoSQL, and a variety of portfolios.
  • Enabled speedy reviews and first-mover advantages by using Oozie to automate data loading into the Hadoop Distributed File System and Pig to pre-process the data.
  • Wrote Apache Pig scripts to process HDFS data.
  • Created Hive external tables to store processed results in a tabular format for ad-hoc reporting.
  • Wrote CLI commands using HDFS.
  • Managed and reviewed Hadoop log files.
  • Tested raw data and executed performance scripts.
  • Shared responsibility for administration of Hadoop, Hive, and Pig.
  • Developed shell scripts for creating reports from Hive data.
  • Wrote Pig UDFs to achieve the desired functionality.
  • Involved in requirements understanding and knowledge-transfer (KT) sessions.
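The Oozie workflows above were defined both through Hue and as hand-written XML. A minimal workflow fragment chaining a Hive action after a Sqoop import might look like the following sketch (the action names, script name, and `${...}` parameters are hypothetical placeholders, not taken from the actual project):

```xml
<workflow-app name="etl-wf" xmlns="uri:oozie:workflow:0.4">
  <start to="sqoop-import"/>
  <!-- Step 1: pull a source table into a staging directory on HDFS -->
  <action name="sqoop-import">
    <sqoop xmlns="uri:oozie:sqoop-action:0.2">
      <job-tracker>${jobTracker}</job-tracker>
      <name-node>${nameNode}</name-node>
      <command>import --connect ${jdbcUrl} --table SRC_TABLE --target-dir ${stagingDir}</command>
    </sqoop>
    <ok to="hive-load"/>
    <error to="fail"/>
  </action>
  <!-- Step 2: run a Hive script over the staged data -->
  <action name="hive-load">
    <hive xmlns="uri:oozie:hive-action:0.2">
      <job-tracker>${jobTracker}</job-tracker>
      <name-node>${nameNode}</name-node>
      <script>load_staging.hql</script>
    </hive>
    <ok to="end"/>
    <error to="fail"/>
  </action>
  <kill name="fail"><message>Workflow failed</message></kill>
  <end name="end"/>
</workflow-app>
```

Oozie evaluates the `ok`/`error` transitions of each action, which is what makes the time-driven and data-driven automation described in the summary possible.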

Confidential, Utah

Hadoop Developer

Environment: Hadoop, CDH 3.X, Hue, MapReduce, Hive, Pig, Sqoop, Oozie, NoSQL, Core Java, JDBC, J2EE, Oracle, MySQL, SVN, Eclipse, PuTTY, WinSCP, Shell Scripting and LINUX.

Responsibilities:

  • Installed and configured MapReduce, HIVE and the HDFS; implemented CDH3 Hadoop cluster on CentOS. Assisted with performance tuning and monitoring.
  • Worked on analyzing the Hadoop cluster and different big data analytic tools, including Pig, HBase, Cassandra databases, and Sqoop.
  • Responsible for building scalable distributed data solutions using Hadoop.
  • Involved in loading data from LINUX file system to HDFS.
  • Created HBase tables to store variable data formats of PII data coming from different portfolios.
  • Implemented a script to transmit information from Oracle and MySQL to HBase using Sqoop.
  • Implemented best income logic using Pig scripts and UDFs.
  • Implemented test scripts to support test driven development and continuous integration.
  • Worked on tuning the performance of Pig queries.
  • Worked with application teams to install operating system and Hadoop updates, patches, and version upgrades as required.
  • Responsible for managing data coming from different sources.
  • Loaded and transformed large sets of structured, semi-structured, and unstructured data.
  • Provided cluster coordination services through ZooKeeper.
  • Experience in managing and reviewing Hadoop log files.
  • Job management using the Fair Scheduler.
  • Responsible for cluster maintenance, adding and removing cluster nodes, cluster monitoring and troubleshooting, and managing and reviewing data backups and Hadoop log files.
  • Installed the Oozie workflow engine to run multiple Hive and Pig jobs.
  • Analyzed large amounts of data sets to determine optimal way to aggregate and report on it.
  • Supported in setting up QA environment and updating configurations for implementing scripts with Pig and Sqoop. Developed multiple MapReduce jobs in Java for data cleaning and preprocessing.
  • Assisted with data capacity planning and node forecasting.
  • Collaborated with the infrastructure, network, database, application and BI teams to ensure data quality and availability.
  • Administered Pig, Hive, and HBase, installing updates, patches, and upgrades.
  • Wrote script files for processing data and loading it to HDFS.
  • Wrote CLI commands using HDFS.
  • Managed and reviewed Hadoop log files.
  • Tested raw data and executed performance scripts.
  • Shared responsibility for administration of Hadoop, Hive and Pig.
  • Involved in requirements understanding and knowledge-transfer (KT) sessions.
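The MapReduce jobs for data cleaning and preprocessing center on per-record normalization logic. A sketch of that kind of logic, stripped of the Mapper/Reducer boilerplate so it stands alone (the class name, delimiter, and three-field record shape are illustrative assumptions, not the actual project schema):

```java
import java.util.Optional;

// Illustrative record-cleaning logic of the kind embedded in a Mapper's
// map() method: take a raw delimited line, normalize it, and drop
// malformed records.
public class RecordCleaner {
    public static Optional<String> clean(String rawLine) {
        if (rawLine == null || rawLine.trim().isEmpty()) {
            return Optional.empty(); // drop blank records
        }
        String[] fields = rawLine.split(",", -1);
        if (fields.length != 3) {
            return Optional.empty(); // drop records with the wrong arity
        }
        StringBuilder out = new StringBuilder();
        for (int i = 0; i < fields.length; i++) {
            if (i > 0) out.append(',');
            out.append(fields[i].trim().toLowerCase()); // normalize whitespace and case
        }
        return Optional.of(out.toString());
    }

    public static void main(String[] args) {
        System.out.println(clean(" ID01 , John ,MN").orElse("<dropped>"));
        System.out.println(clean("bad,record").orElse("<dropped>"));
    }
}
```

In an actual job this function would run inside `map()`, with the cleaned line emitted as output and dropped records counted via a Hadoop counter.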

Confidential

Java Developer

Environment: Java, JSP,HTML, CSS, MySQL, JDBC, Eclipse IDE, SPRING MVC, GIT.

Responsibilities:

  • Interacted with the clients to understand business requirements.
  • Analyzed and developed Use Case diagrams, Sequence diagrams and Activity diagrams using UML.
  • Involved in the development of application using Core Java, and JDBC.
  • Worked with various IDEs, such as Eclipse and NetBeans.
  • Used Core Java concepts in the application, such as multithreaded programming and thread synchronization using the wait, notify, and join methods.
  • Created cross-browser-compatible, standards-compliant CSS-based page layouts.
  • Performed unit testing on the applications developed.
  • Involved in Scrum meetings; developed features and fixed issues.
  • Worked with SVN commands.
  • Configured JDBC connection pooling to access the MySQL database.
  • Wrote many SQL procedures and SQL queries.
  • Built and deployed EAR and JAR files on test, stage, and production systems.
  • Designed UML diagrams (sequence diagrams and class diagrams).
  • Created database objects such as tables and views.
  • Performed regression testing, evaluated response times, and resolved connection pooling issues.
  • Involved in deployment activities.
  • Performed performance and unit testing.
  • Conducted KT sessions for new team members on functionality and high-level architecture.
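The wait/notify synchronization mentioned above follows a standard monitor pattern: one thread blocks until another publishes a value. A minimal self-contained sketch (the class and method names are illustrative, not from the project):

```java
// Minimal wait/notify handoff: the consumer blocks in take() until a
// producer thread calls put(). Both methods synchronize on the same
// object monitor.
public class Handoff {
    private String value;            // shared slot, guarded by this monitor
    private boolean ready = false;

    public synchronized void put(String v) {
        value = v;
        ready = true;
        notifyAll();                 // wake any thread blocked in take()
    }

    public synchronized String take() throws InterruptedException {
        while (!ready) {             // loop guards against spurious wakeups
            wait();                  // releases the monitor while blocked
        }
        return value;
    }

    public static void main(String[] args) throws InterruptedException {
        Handoff h = new Handoff();
        Thread producer = new Thread(() -> h.put("done"));
        producer.start();
        String result = h.take();    // blocks until the producer calls put()
        producer.join();
        System.out.println(result);
    }
}
```

The `while (!ready)` loop rather than a plain `if` is the important detail: `wait()` may return spuriously, so the condition must be rechecked under the monitor.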

Confidential

Java Developer

Environment: Java, MySQL, JDBC, Eclipse IDE, SOAP, SVN.

Responsibilities:

  • Interacted with the clients to understand business requirements.
  • Analyzed and developed Use Case diagrams, Sequence diagrams and Activity diagrams using UML.
  • Involved in the development of application using Core Java, and JDBC.
  • Worked with various IDEs, such as Eclipse and NetBeans.
  • Worked with SVN commands.
  • Used stateless session beans.
  • Configured JDBC connection pooling to access the MySQL database.
  • Wrote many SQL procedures and SQL queries.
  • Built and deployed EAR and JAR files on test, stage, and production systems.
  • Designed UML diagrams (sequence diagrams and class diagrams).
  • Created database objects such as tables and views.
  • Performed regression testing, evaluated response times, and resolved connection pooling issues.
  • Performed performance and unit testing.
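The JDBC connection pooling configured above rests on a simple idea: keep a fixed set of open connections and hand them out on demand instead of opening a new one per request. The mechanism can be sketched with a generic pool (production deployments used a real pooling DataSource; this standalone class with hypothetical names is illustrative only):

```java
import java.util.ArrayDeque;
import java.util.Deque;
import java.util.function.Supplier;

// Illustrative fixed-size object pool showing the idea behind JDBC
// connection pooling: acquire() reuses an idle resource when one is
// available, and creates new ones only up to the configured maximum.
public class SimplePool<T> {
    private final Deque<T> idle = new ArrayDeque<>();
    private final Supplier<T> factory;
    private final int maxSize;
    private int created = 0;

    public SimplePool(Supplier<T> factory, int maxSize) {
        this.factory = factory;
        this.maxSize = maxSize;
    }

    public synchronized T acquire() {
        if (!idle.isEmpty()) {
            return idle.pop();       // reuse a pooled resource
        }
        if (created < maxSize) {
            created++;
            return factory.get();    // grow the pool lazily
        }
        throw new IllegalStateException("pool exhausted");
    }

    public synchronized void release(T resource) {
        idle.push(resource);         // return the resource for reuse
    }

    public synchronized int createdCount() {
        return created;
    }

    public static void main(String[] args) {
        SimplePool<StringBuilder> pool = new SimplePool<>(StringBuilder::new, 2);
        StringBuilder a = pool.acquire();
        pool.release(a);
        StringBuilder b = pool.acquire(); // reused: no second object created
        System.out.println(a == b);
        System.out.println(pool.createdCount());
    }
}
```

In a real JDBC setup the `Supplier` would open connections via `DriverManager.getConnection(...)`, and `release` would validate the connection before returning it to the idle queue.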
