Big Data Lead Resume

Minneapolis, MN

SUMMARY

  • Over 7 years of experience commissioning, decommissioning, balancing, and managing nodes and tuning servers for optimal cluster performance.
  • Around 5 years of professional experience, including extensive Hadoop and Linux experience.
  • Experienced in installing, configuring, supporting, and monitoring 100+ node Hadoop clusters using Cloudera Manager and Hortonworks distributions.
  • Experience performing various major and minor Hadoop upgrades in large environments.
  • As an administrator, involved in cluster maintenance, troubleshooting, and monitoring, and followed proper backup and recovery strategies.
  • Experience in HDFS data storage and in supporting MapReduce jobs.
  • Involved in infrastructure setup and installation of the HDP stack on Amazon Cloud.
  • Experience ingesting data from RDBMS sources such as Oracle, SQL Server, and Teradata into HDFS using Sqoop.
  • Experience in big data technologies: Hadoop HDFS, MapReduce, Pig, Hive, Oozie, Sqoop, ZooKeeper, and NoSQL.
  • Integrated Hadoop into traditional ETL, accelerating the extraction, transformation, and loading of massive structured and unstructured data.
  • Loaded the aggregated data into a relational database for reporting, dashboarding, and ad-hoc analyses, which revealed ways to lower operating costs and offset the rising cost of programming (a sketch of this kind of load appears after this list).
  • Used innovation to improve operational processes and performance, ensured data was of the highest quality, and built and unit tested integration components.
  • Formulated highly detailed, practically implementable data warehouse solutions using the Informatica tool set.
  • Developed and coded Informatica mappings, sessions, and workflows for different stages of ETL.
  • Experience in benchmarking and in performing backup and disaster recovery of NameNode metadata and important sensitive data residing on the cluster.
  • Experience in designing and implementing HDFS access controls, directory and file permissions, and user authorization that facilitate stable, secure access for multiple users in a large multi-tenant cluster.
  • Experience using Ambari for installation and management of Hadoop clusters; experience with Ansible and related tools for configuration management.
  • Experience working in large environments and leading infrastructure support and operations.
  • Migrated applications from existing systems such as MySQL, Oracle, DB2, and Teradata to Hadoop.
  • Expertise with Hadoop, MapReduce, Pig, Sqoop, Oozie, and Hive.
  • Benchmarked Hadoop clusters to validate the hardware before and after installation and tweaked configurations to obtain better performance.
  • Experience administering Linux systems to deploy and monitor Hadoop clusters.
  • Great team player and quick learner with effective communication, motivation, and organizational skills, combined with attention to detail and focus on business improvements.
  • Adequate knowledge of and working experience with Agile and Waterfall methodologies.
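
Illustrative only: a minimal Spark (Scala) sketch of the kind of aggregate-to-reporting-database load described above. The HDFS path, JDBC URL, table, and column names are hypothetical placeholders, and the actual loads may equally have been built with the Informatica tooling also listed in this summary.

    // Illustrative sketch only: aggregate data in Spark and load the result into a
    // relational reporting table over JDBC. All names below are placeholders.
    import org.apache.spark.sql.SparkSession
    import org.apache.spark.sql.functions.sum

    object AggregateToReportingDb {
      def main(args: Array[String]): Unit = {
        val spark = SparkSession.builder().appName("aggregate-to-rdbms").getOrCreate()

        // Detail records already curated on HDFS (placeholder path and schema).
        val charges = spark.read.parquet("hdfs:///data/curated/charges")

        // Aggregate to the grain needed for reporting and dashboarding.
        val byCostCenter = charges
          .groupBy("cost_center", "charge_month")
          .agg(sum("amount").as("total_amount"))

        // Load the aggregate into a relational reporting table over JDBC
        // (requires the matching JDBC driver on the classpath).
        byCostCenter.write
          .format("jdbc")
          .option("url", "jdbc:oracle:thin:@//db-host:1521/REPORTS") // placeholder URL
          .option("dbtable", "rpt.cost_center_monthly")              // placeholder table
          .option("user", "rpt_loader")
          .option("password", sys.env("RPT_DB_PASSWORD"))
          .mode("overwrite")
          .save()

        spark.stop()
      }
    }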

TECHNICAL SKILLS

Hadoop/Big Data: HDFS, MapReduce, HBase, Pig, Hive, Sqoop, Cassandra, Power Pivot, Puppet, Oozie, ZooKeeper, Kafka, Spark, Unix

Big Data Analytics: Datameer 2.0.5

Frameworks: MVC, Struts, Hibernate, Spring

Databases: Oracle 11g/10g/9i, MySQL, DB2, MS-SQL Server

Web Servers: WebLogic, WebSphere, Apache Tomcat

Web Technologies: HTML, XML, JavaScript, AJAX, SOAP, WSDL

PROFESSIONAL EXPERIENCE

Confidential

Big Data Lead

Responsibilities:

  • Responsible for running the day-to-day operations of the technology platform.
  • Work activities specific to Production Services roles include Problem/Incident Management, Deployment, Operational Readiness, Capacity/Availability Management, Application Monitoring, Reporting, Production Governance, Triage, Associate Support, and Change/Configuration Management.
  • Responsible for identifying possible production failure scenarios, creating incident tickets in the ticket-tracking system, and communicating effectively with development and internal business operations teams.
  • Identifies vulnerabilities and opportunities for improvement, and maintains metrics to help develop analysis that drives improvement in all areas of Production Services.
  • Creates and enhances administrative, operational, and technical policies and procedures, adopting best-practice guidelines, standards, and procedures.
  • Takes ownership of escalations and performs troubleshooting, analysis, research, and resolution using advanced query and programming skills.
  • Performs analytical, technical, and administrative work in planning, installing, designing, and supporting new and existing equipment and software under moderate supervision.
  • Resolves complex issues as they arise.
  • Consults with end users to determine optimal configuration of equipment and applications.
  • Works on problems of minimal to moderate scope where analysis of the situation or data requires a review of identifiable factors.
  • Exercises judgment within defined procedures and practices to determine the appropriate action and documents it as needed for future reference.
  • Gained increased awareness of and exposure to basic technical principles, concepts, and techniques.
  • Coaches and mentors newly onboarded employees.
  • Initiates and provides leadership, strategic/tactical direction, and planning input on all information technology and client/business area issues and on the development of a technology environment that meets current and anticipated business requirements and objectives.
  • Participates with management in the development of technology products, service standards and development efforts that impact the client/business area.
  • Serves as an escalation point between the client/business area and internal management for the resolution of moderately complex unresolved problems, complaints and service requests.
  • Provides the client areas with technology products and service alternatives that improve the production services environment.

Environment: HDFS, MapReduce, Spark, Kafka, Hive, Pig, Unix, Sqoop, Ranger, HBase, Jenkins.

Confidential - Minneapolis, MN

Hadoop / Data Platform Architect

Responsibilities:

  • Designed and implemented an end-to-end big data platform solution on Teradata Appliance and the AWS cloud.
  • Managed Hadoop clusters in production, development, and disaster recovery environments.
  • Implemented Teradata Aster, a data science tool, and integrated it with Hadoop.
  • Developed Spark code using Scala and Spark SQL for faster processing and testing.
  • Handled data exchange between HDFS and RDBMS; wrote Spark applications in Scala to interact with a MySQL database using Spark SQL (see the sketch after this list).
  • Experienced in working with the Spark ecosystem using Scala and Hive queries on different data formats such as text files and Parquet.
  • Loaded data into Spark RDDs and performed in-memory computation to generate the output response.
  • Worked with the team on fetching live stream data from DB2 into HDFS tables using Spark Streaming.
  • Integrated Informatica BDM and Informatica Cloud with Hadoop.
  • Worked on the conversion of existing MapReduce batch applications to Spark for better performance.
  • Implemented Confidential Guardium to perform enterprise-level monitoring.
  • Integrated Splunk with Hadoop for log aggregation and monitoring dashboards.
  • Provisioned, installed, configured, monitored, and maintained HDFS, YARN, HBase, Flume, Sqoop, Oozie, Pig, Hive, Ranger, Ranger KMS, Falcon, SmartSense, Storm, and Kafka.
  • Recovered from node failures and troubleshot common Hadoop cluster issues.
  • Scripted Hadoop package installation and configuration to support fully automated deployments.
  • Automated Hadoop deployment using Ambari blueprints and the Ambari REST APIs.
  • Responsible for building a cluster on HDP 2.5.
  • Performed major Hadoop upgrades, including upgrading from HDP 2.5.3 to HDP 2.6.4.
  • Worked closely with developers to investigate problems and make changes to the Hadoop environment and associated applications.
  • Troubleshot many cloud-related issues such as DataNode outages, network failures, login issues, and missing data blocks.
  • Proven results-oriented person with a focus on delivery.
  • Imported and exported data into HDFS and Hive using Sqoop.
  • Performed HDFS cluster support and maintenance tasks such as adding and removing nodes without any effect on running jobs or data.
  • Used Python programming to develop a working and efficient network within the company.
  • Utilized Python in handling all hits on Django, Redis, and other applications.
  • Performed research regarding Python programming and its uses and efficiency.
  • Developed object-oriented programs to enhance company product management.
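
As referenced in the list above, a minimal sketch of the HDFS/RDBMS exchange pattern with Spark SQL in Scala. The connection details, table name, and output path are hypothetical placeholders rather than values from this project, and the MySQL JDBC driver is assumed to be on the classpath.

    // Minimal sketch, assuming Spark 2.x with the MySQL JDBC driver available.
    // All connection details, table names, and paths below are placeholders.
    import org.apache.spark.sql.SparkSession

    object MySqlHdfsExchange {
      def main(args: Array[String]): Unit = {
        val spark = SparkSession.builder()
          .appName("mysql-hdfs-exchange")
          .getOrCreate()

        // Pull a table from MySQL over JDBC into a DataFrame.
        val orders = spark.read
          .format("jdbc")
          .option("url", "jdbc:mysql://db-host:3306/sales")  // placeholder host/schema
          .option("dbtable", "orders")                        // placeholder table
          .option("user", "etl_user")
          .option("password", sys.env("MYSQL_PASSWORD"))
          .load()

        // Run a Spark SQL aggregation in memory, then persist the result to HDFS as Parquet.
        orders.createOrReplaceTempView("orders")
        val daily = spark.sql(
          "SELECT order_date, SUM(amount) AS total_amount FROM orders GROUP BY order_date")
        daily.write.mode("overwrite").parquet("hdfs:///data/curated/daily_order_totals")

        spark.stop()
      }
    }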

Environment: HDFS, MapReduce, Spark, Kafka, Hive, Pig, Unix, Sqoop, Ranger, Ranger KMS, Falcon, SmartSense, Storm.

Confidential - Sunnyvale, CA

Hadoop Architect

Responsibilities:

  • Involved in analysis, design, implementation, and bug-fixing activities.
  • Involved in reviewing functional and technical specification documents.
  • Created and configured domains in production, development, and testing environments using the configuration wizard.
  • Developed Spark applications using Scala with the Spark SQL/Streaming APIs for faster testing and processing of data.
  • Worked on the Spark SQL and Spark Streaming modules of Spark and used Scala to write code for all Spark use cases.
  • Worked on converting PL/SQL code into Scala code and converted PL/SQL queries into Hive queries.
  • Involved in creating and configuring the clusters in production environment and deploying the applications on clusters.
  • Deployed and tested the application using Tomcat web server.
  • Analyzed the specifications provided by the clients.
  • Involved in the design of the application.
  • Ability to understand functional requirements and design documents.
  • Developed use case diagrams, class diagrams, sequence diagrams, and data flow diagrams.
  • Coordinated with other functional consultants.
  • Performed web-related development with JSP, AJAX, HTML, XML, XSLT, and CSS.
  • Created and enhanced stored procedures, PL/SQL, and SQL for the Oracle 9i RDBMS.
  • Designed and implemented a generic parser framework using a SAX parser to parse XML documents that store SQL.
  • Created Hive tables and worked on them using HiveQL (see the sketch after this list).
  • Wrote Hive queries for data analysis to meet the business requirements.
  • Experienced in defining job flows.
  • Gained good experience with NoSQL databases such as HBase.
  • Identified the required data to be pooled into Hadoop and created the required Sqoop scripts, which were scheduled periodically to migrate data to the Hadoop environment.
  • Provided further maintenance and support, which involved working with the client and solving their problems, including major bug fixes.
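
As referenced in the list above, a minimal sketch of creating a Hive table and running a HiveQL query from Scala through a Hive-enabled SparkSession. The table, columns, and filter values are hypothetical placeholders, a configured Hive metastore is assumed, and the same HiveQL could equally be run directly from the Hive CLI.

    // Minimal sketch, assuming Spark with Hive support and a configured Hive metastore.
    // Table, column, and partition names are illustrative placeholders.
    import org.apache.spark.sql.SparkSession

    object HiveQlSketch {
      def main(args: Array[String]): Unit = {
        val spark = SparkSession.builder()
          .appName("hiveql-sketch")
          .enableHiveSupport()
          .getOrCreate()

        // Create a partitioned Hive table if it does not already exist.
        spark.sql(
          """CREATE TABLE IF NOT EXISTS web_logs (
            |  user_id STRING,
            |  url STRING,
            |  response_ms INT)
            |PARTITIONED BY (log_date STRING)
            |STORED AS PARQUET""".stripMargin)

        // A HiveQL analysis query of the kind written to meet reporting requirements.
        val slowPages = spark.sql(
          """SELECT url, AVG(response_ms) AS avg_ms
            |FROM web_logs
            |WHERE log_date = '2016-01-01'
            |GROUP BY url
            |HAVING AVG(response_ms) > 500""".stripMargin)

        slowPages.show()
        spark.stop()
      }
    }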

Environment: Java 1.4, WebLogic Server 9.0, Kafka, Oracle 10g, Web services monitoring, Web Drive, UNIX/Linux, Hadoop, Hive, JavaScript, HTML, CSS, XML.

Confidential

Hadoop Architect

Responsibilities:

  • Responsible for coding MapReduce programs and Hive queries, and for testing and debugging the MapReduce programs.
  • Developed Pig Latin scripts to analyze large data sets in areas where extensive coding needed to be reduced.
  • Migrated data from an Elasticsearch 1.4.3 cluster to Elasticsearch 5.6.4 using Logstash and Kafka for all environments.
  • Designed the infrastructure for the ELK clusters.
  • Developed Spark code using Scala and Spark SQL for faster testing and data processing.
  • Involved in converting Hive/SQL queries into Spark transformations using Spark RDDs and Scala (see the sketch after this list).
  • Worked on the Spark SQL and Spark Streaming modules of Spark and used Scala to write code for all Spark use cases.
  • Performed Elasticsearch and Logstash performance and configuration tuning.
  • Identified and remedied indexing issues, crawl errors, SEO penalties, etc.
  • Provided design recommendations and thought leadership, improved review processes, and resolved technical problems.
  • Benchmarked Elasticsearch 5.6.4 for the required scenarios.
  • Used X-Pack for monitoring and security on the Elasticsearch 5.6.4 cluster.
  • Created a POC on Hortonworks and suggested best practices for the HDP and HDF platforms.
  • Provided global search with Elasticsearch.
  • Implemented a Hadoop cluster on Hortonworks HDP 2.4 and assisted with performance tuning, monitoring, and troubleshooting.
  • Used the Sqoop tool to extract data from relational databases into Hadoop.
  • Worked closely with data warehouse architect and business intelligence analyst to develop solutions.
  • Responsible for performing peer code reviews, troubleshooting issues and maintaining status report.
  • Involved in creating Hive tables, loading them with data, and writing Hive queries, which invoke and run MapReduce jobs in the backend.
  • Installed and configured Hadoop clusters in DEV, QA, and production environments.
  • Performed upgrades to the existing Hadoop clusters.
  • Enabled Kerberos for Hadoop cluster authentication and integrated it with Active Directory for managing users and application groups.
  • Implemented commissioning and decommissioning of nodes in the existing cluster.
  • Worked with systems engineering team for planning new Hadoop environment deployments, expansion of existing Hadoop clusters.
  • Responsible for data ingestions using Talend.
  • Designed and presented a plan for a POC on Impala.
  • Experienced in migrating HiveQL to Impala to minimize query response time.
  • Monitoring workload, job performance and capacity planning using Cloudera Manager.
  • Worked with application teams to install OS level updates, patches and version upgrades required for Hadoop cluster environments.
  • Supported in setting up QA environment and updating configurations for implementing scripts with Pig, Hive and Sqoop.
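
As referenced in the list above, a minimal sketch of re-expressing a simple Hive/SQL aggregation as Spark RDD transformations in Scala. The input path, delimiter, and column positions are hypothetical placeholders; the actual conversions would depend on the real table layouts.

    // Minimal sketch, assuming CSV-like input on HDFS. Paths and column positions are
    // placeholders. Equivalent Hive query:
    //   SELECT product_id, SUM(quantity) FROM sales GROUP BY product_id
    import org.apache.spark.sql.SparkSession

    object HiveQueryToRdd {
      def main(args: Array[String]): Unit = {
        val spark = SparkSession.builder().appName("hive-to-rdd").getOrCreate()
        val sc = spark.sparkContext

        val totalsByProduct = sc.textFile("hdfs:///data/raw/sales")  // placeholder path
          .map(_.split(","))
          .filter(_.length >= 3)                                     // drop malformed rows
          .map(cols => (cols(1), cols(2).toLong))                    // (product_id, quantity)
          .reduceByKey(_ + _)

        totalsByProduct.saveAsTextFile("hdfs:///data/curated/sales_totals_by_product")
        spark.stop()
      }
    }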

Environment: Hadoop, HDFS, MapReduce, Hive, Flume, Sqoop, Hortonworks, Cloudera CDH4, HBase, Oozie, Pig, AWS EC2 cloud, Eclipse, Talend, ELK.
