Big Data Lead Resume
Minneapolis, MN
SUMMARY
- 7+ years of experience commissioning, decommissioning, balancing, and managing nodes and tuning servers for optimal cluster performance.
- Around 5 years of professional experience, including extensive Hadoop and Linux experience.
- Experienced in installing, configuring, supporting, and monitoring 100+ node Hadoop clusters using Cloudera Manager and Hortonworks distributions.
- Experience performing various major and minor Hadoop upgrades in large environments.
- As an administrator, involved in cluster maintenance, troubleshooting, and monitoring, and followed proper backup and recovery strategies.
- Experience in HDFS data storage and support for running MapReduce jobs.
- Involved in infrastructure setup and installation of the HDP stack on the Amazon cloud.
- Experience ingesting data from RDBMS sources such as Oracle, SQL Server, and Teradata into HDFS using Sqoop.
- Experience in big data technologies: Hadoop HDFS, MapReduce, Pig, Hive, Oozie, Sqoop, ZooKeeper, and NoSQL.
- Integrated Hadoop into traditional ETL, accelerating the extraction, transformation, and loading of massive structured and unstructured data.
- Loaded the aggregate data into a relational database for reporting, dashboarding, and ad hoc analyses, which revealed ways to lower operating costs and offset the rising cost of programming.
- Used innovation to improve operational processes and performance, ensuring data is of the highest quality; built and unit-tested integration components.
- Formulated highly detailed, practically implementable DW solutions using the Informatica toolset.
- Developed and coded Informatica mappings, sessions, and workflows for different stages of ETL.
- Experience in benchmarking and in performing backup and disaster recovery of NameNode metadata and important sensitive data residing on the cluster.
- Experience designing and implementing HDFS access controls, directory and file permissions, and user authorization to facilitate stable, secure access for multiple users in a large multi-tenant cluster.
- Experience using Ambari for installation and management of Hadoop clusters; experience in Ansible and related tools for configuration management.
- Experience working in large environments and leading infrastructure support and operations.
- Migrated applications from existing systems such as MySQL, Oracle, DB2, and Teradata to Hadoop.
- Expertise with Hadoop, MapReduce, Pig, Sqoop, Oozie, and Hive.
- Benchmarked Hadoop clusters to validate the hardware before and after installation and tweaked configurations to obtain better performance.
- Experience administering Linux systems to deploy and monitor Hadoop clusters.
- Great team player and quick learner with effective communication, motivation, and organizational skills, combined with attention to detail and business improvements.
- Adequate knowledge of and working experience in Agile and Waterfall methodologies.
TECHNICAL SKILLS
Hadoop/Big Data: HDFS, MapReduce, HBase, Pig, Hive, Sqoop, Cassandra, PowerPivot, Puppet, Oozie, ZooKeeper, Kafka, Spark, Unix
Big Data Analytics: Datameer 2.0.5
Frameworks: MVC, Struts, Hibernate, Spring
Databases: Oracle 11g/10g/9i, MySQL, DB2, MS-SQL Server
Web Servers: WebLogic, WebSphere, Apache Tomcat
Web Technologies: HTML, XML, JavaScript, AJAX, SOAP, WSDL
PROFESSIONAL EXPERIENCE
Confidential
Big Data Lead
Responsibilities:
- Responsible for running the day-to-day operations of the technology platform.
- Work activities specific to Production Services roles include Problem/Incident Management, Deployment, Operational Readiness, Capacity/Availability Management, Application Monitoring, Reporting, Production Governance, Triage, Associate Support, Change/Configuration Management.
- Responsible for identifying possible production failure scenarios, creating incident tickets in the ticket tracking system, and communicating effectively with development and internal business operations teams.
- Identifies vulnerabilities and opportunities for improvement, and maintains metrics to help develop analyses that drive improvement in all areas of Production Services.
- Creates and enhances administrative, operational, and technical policies and procedures, adopting best-practice guidelines, standards, and procedures.
- Takes ownership of escalations and performs troubleshooting, analysis, research, and resolution using advanced query and programming skills.
- Performs analytical, technical, and administrative work in planning, installing, designing and supporting new and existing equipment and software under moderate supervision.
- Resolves complex issues as they arise.
- Consults with end users to determine optimal configuration of equipment and applications.
- Works on problems of minimal to moderate scope where analysis of situations or data requires a review of identifiable factors.
- Exercises judgment within defined procedures and practices to determine appropriate action and documents it as needed for future reference.
- Increased awareness of and exposure to basic technical principles, concepts, and techniques.
- Coaches and mentors newly onboarded employees.
- Initiates and provides leadership, strategic/tactical direction, and planning input on all information technology and client/business area issues and in the development of a technology environment that meets current and anticipated business requirements and objectives.
- Participates with management in the development of technology products, service standards and development efforts that impact the client/business area.
- Serves as an escalation point between the client/business area and internal management for the resolution of moderately complex unresolved problems, complaints and service requests.
- Provides the client areas with technology products and service alternatives that improve the production services environment.
Environment: HDFS, MapReduce, Spark, Kafka, Hive, Pig, Unix, Sqoop, Ranger, HBase, Jenkins.
Confidential - Minneapolis, MN
Hadoop / Data Platform Architect
Responsibilities:
- Designed and implemented an end-to-end big data platform solution on Teradata Appliance and the AWS cloud.
- Managed Hadoop clusters in production, development, and disaster recovery environments.
- Implemented Teradata Aster, a data science tool, and integrated it with Hadoop.
- Developed Spark code using Scala and Spark SQL for faster processing and testing.
- Handled data exchange between HDFS and RDBMS; wrote Spark applications in Scala to interact with a MySQL database using Spark SQL (illustrated in the sketch after this role's environment line).
- Experienced in working with the Spark ecosystem, using Scala and Hive queries on different data formats such as text files and Parquet.
- Loaded data into Spark RDDs and performed in-memory computation to generate the output response.
- Involved with the team fetching live streaming data from DB2 into HDFS tables using Spark Streaming.
- Integrated Informatica BDM and Informatica Cloud with Hadoop.
- Worked on the conversion of existing MapReduce batch applications to Spark for better performance.
- Implemented Confidential Guardium to perform enterprise-level monitoring.
- Integrated Splunk with Hadoop for log aggregation and monitoring dashboards.
- Provisioned, installed, configured, monitored, and maintained HDFS, YARN, HBase, Flume, Sqoop, Oozie, Pig, Hive, Ranger, Ranger KMS, Falcon, SmartSense, Storm, and Kafka.
- Recovered from node failures and troubleshot common Hadoop cluster issues.
- Scripted Hadoop package installation and configuration to support fully automated deployments.
- Automated Hadoop deployment using Ambari blueprints and the Ambari REST APIs.
- Responsible for building a cluster on HDP 2.5.
- Performed major Hadoop upgrades, including upgrading from HDP 2.5.3 to HDP 2.6.4.
- Worked closely with developers to investigate problems and make changes to the Hadoop environment and associated applications.
- Troubleshot many cloud-related issues such as DataNodes going down, network failures, login issues, and missing data blocks.
- Proven results-oriented person with a focus on delivery.
- Imported and exported data into HDFS and Hive using Sqoop.
- Performed HDFS cluster support and maintenance tasks such as adding and removing nodes without any effect on running jobs and data.
- Used the Python programming language to develop a working and efficient network within the company.
- Utilized Python in handling all hits on Django, Redis, and other applications.
- Performed research regarding Python programming and its uses and efficiency.
- Developed object-oriented programs to enhance company product management.
Environment: HDFS, MapReduce, Spark, Kafka, Hive, Pig, Unix, Sqoop, Ranger, Ranger KMS, Falcon, SmartSense, Storm.
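Illustrative sketch (not the actual project code): a minimal Scala/Spark program showing the pattern described above, reading a MySQL table through Spark SQL over JDBC, caching it for in-memory computation, and writing an aggregate to HDFS. The connection URL, credentials, table, columns, and output path are hypothetical placeholders.

    import org.apache.spark.sql.SparkSession

    object MySqlToHdfsSketch {
      def main(args: Array[String]): Unit = {
        val spark = SparkSession.builder()
          .appName("mysql-to-hdfs-sketch")
          .getOrCreate()

        // Load a MySQL table into a DataFrame over JDBC (placeholder connection details).
        val orders = spark.read
          .format("jdbc")
          .option("url", "jdbc:mysql://db-host:3306/sales")   // hypothetical host/schema
          .option("dbtable", "orders")                         // hypothetical table
          .option("user", "etl_user")                          // hypothetical credentials
          .option("password", sys.env.getOrElse("DB_PASSWORD", ""))
          .load()

        // Cache the data in memory, aggregate with Spark SQL, and write the result to HDFS.
        orders.cache()
        orders.createOrReplaceTempView("orders")
        val daily = spark.sql(
          "SELECT order_date, COUNT(*) AS order_cnt, SUM(amount) AS total_amount " +
          "FROM orders GROUP BY order_date")
        daily.write.mode("overwrite").parquet("hdfs:///data/aggregates/daily_orders")

        spark.stop()
      }
    }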
Confidential - Sunnyvale, CA
Hadoop Architect
Responsibilities:
- Involved in analysis, design, implementation, and bug-fixing activities.
- Involved in reviewing functional and technical specification documents.
- Created and configured domains in production, development and testing environments using configuration wizard.
- Developed Spark applications using Scala with the Spark SQL/Streaming APIs for faster testing and processing of data.
- Worked on the Spark SQL and Spark Streaming modules of Spark and used Scala to write code for all Spark use cases.
- Worked on converting PL/SQL code into Scala code and converted PL/SQL queries into Hive queries.
- Involved in creating and configuring the clusters in production environment and deploying the applications on clusters.
- Deployed and tested the application using Tomcat web server.
- Analyzed the specifications provided by the clients.
- Involved in the design of the application.
- Ability to understand functional requirements and design documents.
- Developed use case diagrams, class diagrams, sequence diagrams, and data flow diagrams.
- Coordinated with other functional consultants.
- Web-related development with JSP, AJAX, HTML, XML, XSLT, and CSS.
- Created and enhanced stored procedures, PL/SQL, and SQL for the Oracle 9i RDBMS.
- Designed and implemented a generic parser framework using a SAX parser to parse XML documents that store SQL.
- Created Hive tables and worked on them using Hive QL (see the sketch after this role's environment line).
- Wrote Hive queries for data analysis to meet the business requirements.
- Experienced in defining job flows.
- Gained good experience with NoSQL databases such as HBase.
- Identified the required data to be pulled into Hadoop and created the required Sqoop scripts, which were scheduled periodically to migrate data to the Hadoop environment.
- Provided further maintenance and support, working with the client to solve their problems, including major bug fixes.
Environment: Java 1.4, WebLogic Server 9.0, Kafka, Oracle 10g, web services monitoring, Web Drive, UNIX/Linux, Hadoop, Hive, JavaScript, HTML, CSS, XML.
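Illustrative sketch (assumed, not the actual application code): creating a Hive table and running a Hive QL query from Scala through Spark's Hive support, in the spirit of the Hive work listed above. The table name, columns, and query are hypothetical.

    import org.apache.spark.sql.SparkSession

    object HiveQlSketch {
      def main(args: Array[String]): Unit = {
        val spark = SparkSession.builder()
          .appName("hive-ql-sketch")
          .enableHiveSupport()   // route SQL through the Hive metastore
          .getOrCreate()

        // Create a partitioned Hive table (hypothetical schema).
        spark.sql(
          """CREATE TABLE IF NOT EXISTS web_logs (
            |  user_id STRING,
            |  url     STRING,
            |  ts      TIMESTAMP)
            |PARTITIONED BY (dt STRING)
            |STORED AS PARQUET""".stripMargin)

        // A typical analysis query expressed in Hive QL and executed through Spark.
        spark.sql(
          """SELECT url, COUNT(DISTINCT user_id) AS visitors
            |FROM web_logs
            |WHERE dt = '2017-01-01'
            |GROUP BY url
            |ORDER BY visitors DESC
            |LIMIT 20""".stripMargin).show()

        spark.stop()
      }
    }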
Confidential
Hadoop Architect
Responsibilities:
- Responsible for coding MapReduce programs and Hive queries, and for testing and debugging the MapReduce programs.
- Developed Pig Latin scripts in areas where extensive coding needed to be reduced to analyze large data sets.
- Migrated data from an Elasticsearch 1.4.3 cluster to Elasticsearch 5.6.4 using Logstash and Kafka for all environments.
- Designed the infrastructure for the ELK clusters.
- Developed Spark code using Scala and Spark SQL for faster testing and data processing.
- Involved in converting Hive/SQL queries into Spark transformations using Spark RDDs and Scala (illustrated in the sketch after this role's environment line).
- Worked on the Spark SQL and Spark Streaming modules of Spark and used Scala to write code for all Spark use cases.
- Elasticsearch and Logstash performance and configuration tuning.
- Identified and remedied any indexing issues, crawl errors, SEO penalties, etc.
- Provided design recommendations and thought leadership, improved review processes, and resolved technical problems.
- Benchmarked Elasticsearch 5.6.4 for the required scenarios.
- Used X-Pack for monitoring and security on the Elasticsearch 5.6.4 cluster.
- Created a POC on Hortonworks and suggested best practices in terms of the HDP and HDF platforms.
- Provided global search with Elasticsearch.
- Implemented a Hadoop cluster on Hortonworks HDP 2.4 and assisted with performance tuning, monitoring, and troubleshooting.
- Used Sqoop tool to extract data from a relational database into Hadoop.
- Worked closely with data warehouse architect and business intelligence analyst to develop solutions.
- Responsible for performing peer code reviews, troubleshooting issues and maintaining status report.
- Involved in creating Hive tables, loading them with data, and writing Hive queries, which invoke and run MapReduce jobs in the backend.
- Installed and configured Hadoop cluster in DEV, QA and Production environments.
- Performed upgrade to the existing Hadoop clusters.
- Enabled Kerberos for Hadoop cluster authentication and integrated it with Active Directory for managing users and application groups.
- Implemented commissioning and decommissioning of new nodes in the existing cluster.
- Worked with systems engineering team for planning new Hadoop environment deployments, expansion of existing Hadoop clusters.
- Responsible for data ingestion using Talend.
- Designed and presented a plan for a POC on Impala.
- Experienced in migrating Hive QL to Impala to minimize query response time.
- Monitoring workload, job performance and capacity planning using Cloudera Manager.
- Worked with application teams to install OS level updates, patches and version upgrades required for Hadoop cluster environments.
- Supported in setting up QA environment and updating configurations for implementing scripts with Pig, Hive and Sqoop.
Environment: Hadoop, HDFS, MapReduce, Hive, Flume, Sqoop, Hortonworks, Cloudera CDH4, HBase, Oozie, Pig, AWS EC2 cloud, Eclipse, Talend, ELK.
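Illustrative sketch (assumptions: a tab-delimited input file with hypothetical path and column positions): a Hive-style GROUP BY aggregation re-expressed as Spark RDD transformations in Scala, as referenced above.

    import org.apache.spark.sql.SparkSession

    object HiveToRddSketch {
      def main(args: Array[String]): Unit = {
        val spark = SparkSession.builder().appName("hive-to-rdd-sketch").getOrCreate()
        val sc = spark.sparkContext

        // RDD equivalent of: SELECT product, SUM(amount) FROM sales GROUP BY product
        // over a tab-delimited text file (hypothetical path and column positions).
        val totals = sc.textFile("hdfs:///data/raw/sales.tsv")
          .map(_.split("\t"))
          .filter(_.length >= 3)                      // drop malformed rows
          .map(cols => (cols(1), cols(2).toDouble))   // (product, amount)
          .reduceByKey(_ + _)                         // aggregate per product

        totals.saveAsTextFile("hdfs:///data/out/product_totals")
        spark.stop()
      }
    }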