OpenStack Engineer/Hadoop Developer Resume
CA
SUMMARY
- 7+ years of professional IT experience overall, including hands-on experience in Big Data technologies and OpenStack engineering, with a focus on Cloudera, AWS, OpenStack, Linux and Windows systems.
- 1+ year of experience as an OpenStack Engineer/Hadoop Developer with a focus on OpenStack Object Storage (Swift), Compute (Nova), Image Service (Glance), Dashboard (Horizon), Identity (Keystone), Networking (Neutron) and Block Storage (Cinder).
- Excellent hands-on experience implementing OpenStack environments.
- Implemented Cloud Infrastructure as a Service Environment using Open Source Technology OpenStack to enable portability of cloud services across hybrid cloud environments.
- 3 years of experience as a Big Data Developer working with the Hadoop ecosystem and various analytical tools.
- Good working experience with Hadoop, MapReduce, Hive, HBase, HDFS, YARN, Spark, Storm, Cassandra, Kafka, Pig, Sqoop and Impala.
- Exposure to Hadoop architecture and its components: HDFS, Job Tracker, Task Tracker, NameNode and DataNode.
- Hands-on experience with Spark RDDs, DataFrames and Spark SQL.
- Experience importing and exporting data between HDFS/Hive and external databases using Sqoop.
- Imported and exported data between Cassandra/AWS and HDFS using the Spark API, and used Spark SQL to analyze the data.
- Hands-on experience with MapReduce programming to perform data transformations.
- Good understanding of messaging systems such as Kafka and of the Dataset API in Spark.
- Involved in loading data from UNIX file system to HDFS.
- Created Hive tables from JSON data using data serialization frameworks such as Avro.
- Very good understanding of partitioning and bucketing concepts in Hive; designed both managed and external tables in Hive to optimize performance.
- Used Pig Latin to analyze large-scale data.
- Worked on some of the AWS tools like S3, Redshift, EMR.
- Experienced in analyzing Cassandra and comparing it with other open-source NoSQL databases to determine which best suits the current requirements.
- Experience migrating ETL operations from Hive to Spark.
- Experience with Tableau Desktop BI reporting and Tableau dashboard development.
- 2 years of experience as an ETL Developer using IBM InfoSphere DataStage to build data marts, data warehouses and data conversions; hands-on experience in platform support for DataStage and in troubleshooting and tuning DataStage issues related to performance, integration and databases.
- Excellent understanding of data modeling (dimensional & relational), including star-schema and snowflake-schema designs with fact and dimension tables, across relational databases (Oracle), OLAP SAP BW, MS Excel, MS Access, Teradata, Netezza, Redshift, Hive and DB2.
- Experience creating reporting solutions using Teradata views, macros, Excel functions, Excel graphs and pivot tables.
- Good knowledge of Teradata architecture and concepts.
- Good knowledge of stored procedures, triggers, batch referential integrity, indexes and other Teradata features.
- Provided application support in Teradata and SQL Server environments, creating tables, stored procedures, views, triggers, rules, defaults, macros and functions.
- Extensively worked with Teradata utilities such as BTEQ, FastExport, FastLoad and MultiLoad to export and load data to/from different source systems, including flat files.
- Hands-on experience with query tools such as TOAD, SQL Developer, PL/SQL Developer, Teradata SQL Assistant and Queryman.
- Used external loaders such as MultiLoad and FastLoad to load data into the Teradata database.
- 1 year of experience as a Software Engineer working with the FactoryPro MES (Manufacturing Execution System).
- Good experience writing complex SQL queries against databases such as DB2, Oracle 10g, MySQL and MS SQL Server.
- Hands-on working knowledge of the IBM WebSphere DataPower Appliance XI52.
- Excellent understanding of EDI retail transactions and process flow: 850, 856, 810 & 997.
- Highly flexible and capable of self-learning new tools and technologies.
- Can handle tasks independently with little or no support.
TECHNICAL SKILLS
Big Data Ecosystem: Hadoop, MapReduce, Hive, HBase, Sqoop, Impala, Spark, Kafka, Flume, AWS.
Databases: MySQL, MS SQL Server, DB2, Oracle 11g
NoSQL: HBase, Cassandra
BI Tools: Tableau
Development Tools: Eclipse, NetBeans
Web Services: SOAP, REST
Web Technologies: HTML, XML, XSLT 2.0, JavaScript
EAI/SOA Integration: IBM WebSphere DataPower Integration Appliance XI52.
Languages: HiveQL, SQL, Java
Cloud: AWS
Operating Systems: Windows, Mac, Unix
Other Tools: MS Office, XML Spy, Atom, SoapUI 5.3.0
PROFESSIONAL EXPERIENCE
Confidential, CA
OpenStack Engineer/Hadoop developer
Responsibilities:
- Automated deployment of OpenStack clouds across data centers and availability zones.
- Installed OpenStack controller, compute, object storage and block storage services in an on-premises data center.
- Involved in installation and administration of OpenStack components.
- Installed and configured the image service (Glance).
- Launched instances from Horizon and the CLI (command-line interface).
- Configured and managed users and services with the Keystone identity service.
- Created various services and endpoints and bound them together in OpenStack.
- Created various roles in OpenStack.
- Set quotas for various projects and components in OpenStack.
- Managed and troubleshot the Neutron networking services.
- Created bridge networks for internal and external access.
- Managed database instances and schemas for the OpenStack services Nova, Neutron and Keystone.
- Moved the data from Hive tables into Cassandra DB.
- Involved in loading and transforming large sets of structured, semi-structured and unstructured data, and analyzed them by running Hive queries.
- Assisted application teams in installing Hadoop updates, operating system patches and version upgrades when required.
- Assisted in cluster maintenance, monitoring and troubleshooting, and managed and reviewed data backups and log files.
- Analyzed Cassandra and compared it with other open-source NoSQL databases to determine which best suits the current requirements.
- Used the Neutron command-line client to create routers, networks, ports, floating IPs, load balancer pools and VIPs for applications.
- Used Nova commands to build and manage OpenStack VMs of different flavors and different images.
- Responsible for reliability and uptime of Control Plane services through automated monitoring and alerting.
- Hands-on experience working with clustered RabbitMQ, used as the message queue in OpenStack.
- Used GitHub for code version management, and GitHub pull requests for code and change reviews.
Environment: Keystone, Nova, Neutron, Glance, Ubuntu, CentOS, GIT, Cassandra, RabbitMQ, Hadoop, Red Hat.
Confidential, CA
Hadoop Developer
Responsibilities:
- Involved in the end-to-end process of Hadoop cluster installation, configuration and monitoring.
- Involved in all stages of the data project lifecycle (data migration, validation, analysis and reporting) and designed optimized data mining processes.
- Worked on Cloudera to analyze data stored in HDFS.
- Participated in development/implementation of Cloudera Hadoop environment.
- Loaded and transformed large sets of structured, semi-structured and unstructured data.
- Handled importing of data from various data sources, performed transformations using Hive and MapReduce, loaded data into HDFS, and extracted data from Oracle into HDFS using Sqoop.
- Wrote Hive queries for an enterprise analytics project.
- Worked on NoSQL databases including HBase.
- Involved in loading data from UNIX file system to HDFS.
- Responsible for managing and reviewing Hadoop log files.
- Involved in creating Hive tables, loading them with data, and writing Hive queries that run internally as MapReduce jobs.
- Worked with CSV files when ingesting input from the MySQL database.
- Exported the analyzed data to the relational databases using Sqoop for visualization and to generate reports for the BI team.
- Supported in setting up QA environment and updating configurations for implementing scripts with Pig and Sqoop.
- Designed a star-schema model for inventory management to strengthen the ability to manage data.
- Wrote data queries (Oracle, MySQL & PostgreSQL) involving stored procedures, triggers, functions and indexes that helped optimize data retrieval.
- Continuously monitored and managed the Hadoop cluster using Cloudera Manager.
Environment: HDFS, Hive, Pig, UNIX, SQL, Java MapReduce, Hadoop cluster, HBase, Sqoop, Oozie, Linux, Cloudera, Zookeeper, Oracle 10g.
Confidential, NJ
Hadoop Developer
Responsibilities:
- Involved in creating Hive tables, loading them with data, and writing Hive queries that invoke MapReduce jobs in the backend.
- Extensively involved in installation and configuration of the Cloudera distribution of Hadoop (CDH).
- Designed and implemented Incremental Imports into Hive tables.
- Worked in Loading and transforming large sets of structured, semi structured and unstructured data.
- Involved in collecting, aggregating and moving data from servers to HDFS using Apache Flume.
- Wrote Hive jobs to parse logs and structure them in tabular format to facilitate effective querying of the log data.
- Experienced in managing and reviewing the Hadoop log files.
- Implemented workflows using the Apache Oozie framework to automate tasks.
- Worked with the Avro data serialization system to handle JSON data formats.
- Worked with different file formats such as SequenceFiles, XML files and MapFiles using MapReduce programs.
- Involved in unit testing; delivered unit test plans and results documents using JUnit and MRUnit.
- Developed scripts to automate data management end to end and keep all clusters in sync.
- Involved in setup and benchmarking of Hadoop/HBase clusters for internal use.
Environment: Hadoop, MapReduce, HDFS, Hive, Pig, Java (JDK 1.6), SQL, Cloudera Manager, Sqoop, Oozie, Eclipse.
Confidential, KY
ETL Developer
Responsibilities:
- Involved in gathering business requirements, interacting with business users, and translating the requirements into ETL high-level and low-level designs.
- Documented both high-level and low-level design documents; involved in ETL design and development of the data model.
- Imported and cleansed high-volume data from various sources such as Teradata, Oracle, flat files and SQL Server 2005.
- Developed complex ETL mappings and worked on transformations such as Source Qualifier, Joiner, Expression, Sorter, Aggregator, Sequence Generator, Normalizer, Connected Lookup, Unconnected Lookup, Update Strategy and Stored Procedure.
- Implemented Slowly Changing Dimension Type 1 and Type 2 for inserting and updating Target tables for maintaining the history.
- Loaded data from different sources, including Oracle, EBCDIC files (created copybook layouts for the source files) and ASCII-delimited flat files, into Oracle targets and flat files.
- Experience working with mapping variables, mapping parameters and workflow variables, and implementing SQL scripts and shell scripts in post-session and pre-session commands.
- Experience writing SQL*Loader scripts to prepare test data in the development and test environments and while fixing production bugs.
- Experience using the debugger to identify processing bottlenecks, and performing Informatica performance tuning to increase workflow throughput.
- Experience creating ETL deployment groups and packages for promotion to higher environments.
- Extensively worked with Teradata utilities such as BTEQ, FastExport, FastLoad and MultiLoad to export and load data to/from different source systems, including flat files.
- Provided application support in Teradata and SQL Server environments, creating tables, stored procedures, views, triggers, rules, defaults, macros and functions.
- Worked on performance tuning of Teradata database & Informatica mappings.
- Involved in designing the ETL process to extract, transform and load data from the OLTP Oracle database system to the Teradata data warehouse.
- Worked efficiently with Teradata Parallel Transporter and its generated code.
- Enhanced BTEQ scripts that validated the performance tables in the Teradata environment.
- Involved in various phases of the software development life cycle right from Requirements gathering, Analysis, Design, Development, and Testing to Production.
- Performed and documented the unit testing for validation of the mappings against the mapping specifications documents.
- Performed production support activities for the Informatica data warehouse, including monitoring and resolving production issues, pursuing bug fixes, and supporting end users.
- Experience writing and implementing FTP, archive and purge scripts in UNIX.
Environment: Informatica PowerCenter 9.1/8.6.1, Oracle 11g/10g/9i, DB2, MS Access, UNIX, Teradata.
Confidential
Software Engineer
Responsibilities:
- Actively Involved in gathering the requirements from clients.
- Reviewed user requirements and needs for new software and performed analysis, design, implementation, installation related to new software developed.
- Created/modified database objects such as tables and stored procedures in MS SQL Server 2008 during application development.
- Prepared SQL queries according to requirements and optimized them as requirements changed.
- Interacted with the development team on database requirements.
- Tested modules against all database logic.
- Performed weekly data backup and maintenance tasks.
- Performed user acceptance testing to verify all requirements.
- Worked with the development team to get bugs fixed.
- Delivered internal application and technical operations training.
- Provided technical support.