Hadoop Developer/Admin Resume
New York, NY
SUMMARY
- Over 7 years of extensive experience in Business Analysis and business process re-engineering, with proven expertise in Apache Hadoop development and administration for multi-node clusters in Linux environments.
- Comfortable bringing up and maintaining Hadoop clusters, including capacity planning, performance tuning, and monitoring.
- Comfortable writing MapReduce programs, with excellent debugging skills.
- Set up Hadoop clusters in AWS.
- Extremely comfortable with all types of hardware systems.
- Knowledge of Object-Oriented programming methodology and analysis.
- Programming languages: Java, JDBC, C/C++, Python, SQL, R, HTML, XML, and JavaScript.
- Strong technical and professional communication skills and the ability to learn new technologies quickly.
- Experience in Big Data and Hadoop technologies.
- Worked on CDH, CDM, Hadoop environment (HDFS) setup, MapReduce jobs, Hive, HBase, Pig, and NoSQL databases such as MongoDB.
- Worked on multi-clustered environments and set up the Cloudera Hadoop ecosystem.
- Good experience in designing, implementing, and improving analytic solutions for Big Data on Apache Hadoop, Hive, HBase, Flume, and Oozie.
TECHNICAL SKILLS
Hadoop ecosystem: Hadoop, Hive, Sqoop, HDFS, MapReduce, Pig Latin, CDH3, CDH4
Hadoop tools: Nagios, Ganglia, Tableau
RDBMS versions: Oracle 9i/10g/11g, MySQL, Greenplum
RDBMS Tools: TOAD, TKPROF, SQL Trace
Operating Systems: Red Hat Enterprise Linux 4.0/5.0, OEL 4.x/5.x, Windows 2000 Server, Windows XP
Scripting: UNIX shell scripting, Hadoop fs, shell, sed, Perl
Languages: C, C++, Java, SQL, PL/SQL
PROFESSIONAL EXPERIENCE
Confidential, New York, NY
Hadoop Developer/Admin
Responsibilities:
- Built a Hadoop cluster as part of a broad effort to store and process massive amounts of data.
- Drove the internal development of agile map/reduce programming practices and led a team to quickly deliver powerful and cost-effective big data results using Hadoop and Hive.
- Managed the Hadoop clusters end to end: setup, installation, monitoring, and maintenance, across Cloudera CDH and Apache Hadoop distributions.
- Responsible for analyzing and improving the efficiency, scalability, and stability of data collection, storage, and retrieval processes.
- Responsible for optimizing infrastructure at both the software and hardware level.
- Monitored cluster job performance and handled capacity planning (a sketch of typical health checks follows this list).
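A minimal sketch of the health and capacity checks referenced above, assuming the classic hadoop CLI of a CDH3/CDH4-era cluster; the exact commands and output handling are illustrative, not taken from the original projects:

```bash
#!/usr/bin/env bash
# Routine Hadoop cluster checks (illustrative; assumes the classic
# "hadoop" CLI available on CDH3/CDH4-era clusters).

hadoop dfsadmin -report | head -n 20      # overall HDFS capacity and usage
hadoop fsck / | grep -E 'Status|Corrupt'  # file system integrity summary
hadoop job -list                          # MapReduce jobs currently running
```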
Confidential, Ogden, UT
Hadoop Consultant
Responsibilities:
- Interfaced with the Client as part of the Requirements Engineering team to finalize the project scope.
- Experience setting up clusters using Cloudera Manager, with both manual and automated installation of Cloudera's Distribution including Apache Hadoop (CDH3 and CDH4).
- Installed the Linux OS and set up SSH across the cluster for remote communication between nodes.
- Experience with RPM and tarball installation; created Linux users and groups and handled basic Linux system administration such as user management.
- Created HDFS users and assigned quotas on the HDFS file system (see the first sketch after this list).
- Backed up NameNode metadata and Hive metadata.
- Monitored logs and system health, and ensured data integrity using fsck to detect block corruption.
- Good knowledge of Hadoop cluster connectivity and security, and of performance troubleshooting and tuning of Hadoop clusters.
- Installed and monitored clusters through the GUI dashboard and Nagios.
- Created shell scripts to periodically delete Hadoop log files.
- Experience recovering the NameNode from an NFS-mounted file system backup.
- Good understanding of Java concepts; implemented Pig scripts for MapReduce jobs.
- Loaded data from heterogeneous data sources into HDFS.
- Installed and configured Hive and Sqoop for data analysis.
- Used Sqoop to import data from Oracle into Hive; after analysis, exported the results back to the Oracle database (see the second sketch after this list).
- Loaded web server logs into Hive tables and analyzed them for customer interest in products and product support.
- Designed and implemented a prototype for web log analysis using the Hadoop technology stack.
- Designed processes for near real-time analysis of transactional data and weblogs.
- Created external and partitioned tables in Hive for querying.
- Good knowledge of data warehouse concepts; working knowledge of OBIEE RPD, report creation, and the Tableau visualization tool.
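A minimal sketch of the HDFS user/quota, fsck, and log-cleanup tasks above (the first sketch referenced in the list); the user name, paths, quota size, and retention window are hypothetical:

```bash
#!/usr/bin/env bash
# Illustrative HDFS administration on a CDH3/CDH4-era cluster.

# Create a home directory for a new HDFS user and cap its space usage.
sudo -u hdfs hadoop fs -mkdir /user/etluser
sudo -u hdfs hadoop fs -chown etluser /user/etluser
sudo -u hdfs hadoop dfsadmin -setSpaceQuota 500g /user/etluser

# Check for block corruption across the file system.
hadoop fsck / | grep -E 'Status|Corrupt'

# Purge Hadoop daemon logs older than 14 days (scheduled via cron).
find /var/log/hadoop -name '*.log.*' -mtime +14 -delete
```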
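And a sketch of the Sqoop round trip and the partitioned external table from the same list (the second sketch referenced there); the JDBC URL, credentials file, and table and directory names are hypothetical:

```bash
#!/usr/bin/env bash
# Illustrative Oracle -> Hive -> Oracle round trip with Sqoop and Hive.

# Import an Oracle table into Hive for analysis.
sqoop import \
  --connect jdbc:oracle:thin:@dbhost:1521:ORCL \
  --username etl --password-file hdfs:///user/etl/.pw \
  --table SALES --hive-import --hive-table sales_stg

# Create a partitioned external table over daily web-server logs.
hive -e "
CREATE EXTERNAL TABLE IF NOT EXISTS web_logs (
  host STRING, request STRING, status INT
)
PARTITIONED BY (log_date STRING)
ROW FORMAT DELIMITED FIELDS TERMINATED BY '\t'
LOCATION '/data/weblogs';
"

# Export the analyzed results back to Oracle.
sqoop export \
  --connect jdbc:oracle:thin:@dbhost:1521:ORCL \
  --username etl --password-file hdfs:///user/etl/.pw \
  --table SALES_SUMMARY --export-dir /user/hive/warehouse/sales_summary
```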
Confidential, NYC, NY
ETL Developer/Admin
Responsibilities:
- Reviewed the existing Informatica mappings and created design documents to migrate them to Netezza.
- Converted the existing XPONENT process to ELT to improve performance, reducing load time from 70 hours to 4 hours.
- Converted existing Oracle materialized and relational views into Netezza views.
- Worked on performance tuning of Netezza queries.
- Worked on identifying and setting up proper distribution keys (a sketch follows this list).
- Good knowledge of Netezza architecture, including zone maps and distribution keys.
- Created a design approach to lift and shift the existing mappings to Netezza.
- Created design documents to convert the existing mappings to use Informatica pushdown optimization.
- Analyzed the impact on downstream systems and recommended solutions to keep them intact.
- Planned the Dev, SIT, and QA environments.
- Involved in designing the data warehouse using a star schema, identifying the fact, dimension, and slowly changing dimension tables.
- Made ETL architecture decisions.
- Created mappings, workflows, and worklets, and scheduled them using Workflow Manager and UNIX.
- Created stored procedures in Netezza.
- Identified, debugged, and resolved issues.
- Developed reusable frameworks for DB constraints and NZLoad.
- Coordinated with the Oracle DBA and UNIX admin to achieve better DB performance and identify performance bottlenecks.
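A minimal sketch of the distribution-key and NZLoad work above, assuming the standard nzsql/nzload command-line tools; the database, table, column, and file names are hypothetical:

```bash
#!/usr/bin/env bash
# Illustrative Netezza distribution-key and bulk-load steps.

# Recreate a fact table distributed on its main join key, so
# co-located joins avoid redistributing data across SPUs.
nzsql -d EDW -c "
CREATE TABLE fact_rx AS
SELECT * FROM fact_rx_old
DISTRIBUTE ON (patient_id);
"

# Bulk-load a delimited extract with NZLoad, capturing rejects.
nzload -db EDW -t FACT_RX -df /data/extracts/fact_rx.dat \
       -delim '|' -maxErrors 10 -lf fact_rx.log -bf fact_rx.bad
```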
Confidential
SQL Developer
Responsibilities:
- Created and configured the database with two file groups, and created the tables with the required constraints.
- Analyzed business requirements and built logical data models describing all the data and its relationships.
- Created new database objects like Procedures, Functions, Packages, Triggers, Indexes and Views using T-SQL in SQL Server 2000.
- Wrote stored procedures and functions using PL/SQL in Oracle 9i.
- Worked on log shipping and replication to restore database backups.
- Validated change requests and made appropriate recommendations. Standardized the implementation of data.
- Promoted database objects from test/develop to production. Coordinated and communicated production schedules within development team.
- Modified database structures as directed by developers for test/develop environments and assisted in coding, design and performance tuning.
- Backed up and restored databases (a sketch of typical maintenance commands follows this list).
- Developed and implemented database and coding standards, improving performance and maintainability of corporate databases.
- Created DTS packages through the ETL process for vendors, in which records were extracted from flat file and Excel sources and loaded daily to the server.
- Updated statistics on both the old and new servers.
- Managed logins and roles, assigning rights and permissions; created views to implement security.
- Backed up master and system databases and restored them when necessary; supported the production environment.
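A minimal sketch of the maintenance commands referenced above, run through the osql CLI that shipped with SQL Server 2000; the server, database, object, and path names are hypothetical:

```bash
#!/usr/bin/env bash
# Illustrative SQL Server 2000 maintenance via osql (-E = trusted connection).

# Full backup of a user database.
osql -S PRODSQL01 -E -Q "BACKUP DATABASE Sales TO DISK = 'D:\\Backups\\Sales_full.bak' WITH INIT"

# Refresh optimizer statistics after large loads.
osql -S PRODSQL01 -E -d Sales -Q "EXEC sp_updatestats"

# Grant read access through a view rather than the base tables.
osql -S PRODSQL01 -E -d Sales -Q "GRANT SELECT ON dbo.vw_CustomerSafe TO reporting_role"
```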
Confidential
SQL Developer
Responsibilities:
- Created new logical and physical database designs to fit new business requirements and implemented them in SQL Server 2005.
- Migrated DTS packages to SSIS, modified the packages according to the new features of SSIS, and carried out a migration of databases from SQL Server.
- Filtered bad data from legacy systems using T-SQL statements, and implemented various constraints and triggers for data consistency.
- Dropped and recreated indexes on the DSS system while migrating data.
- Wrote complex T-SQL queries to perform data validation and graph validation, making sure test results matched expected results based on business requirements (see the sketch at the end of this list).
- Created TNS entries on workstations to implement connectivity to Oracle 10g server.
- Used TOAD to create complex stored procedures in Oracle using PL/SQL.
- Created views to facilitate easy user interface implementation, and triggers on them to facilitate consistent data entry into the database.
- Deployed reports to Report Manager and troubleshot any errors that occurred during execution.
- Scheduled the reports to run daily and weekly in Report Manager, and emailed them to the director and analysts for review in Excel.
- Maintained remote and central databases for data synchronization and prepared plans for replication.
- Developed documentation that sufficiently describes the technical deliverables as required for internal controls, and maintained documentation of all database processes for future reference.
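A minimal sketch of the T-SQL validation queries referenced above, run through sqlcmd against SQL Server 2005; the server, database, and table names are hypothetical:

```bash
#!/usr/bin/env bash
# Illustrative data-validation pass via sqlcmd (SQL Server 2005).

# Compare row counts between the legacy source and the migrated table.
sqlcmd -S PRODSQL02 -E -d Warehouse -Q "
SELECT 'legacy'   AS src, COUNT(*) AS row_count FROM LegacyDB.dbo.Orders
UNION ALL
SELECT 'migrated' AS src, COUNT(*) AS row_count FROM dbo.Orders;
"

# Flag rows that violate cleansing rules before the final insert.
sqlcmd -S PRODSQL02 -E -d Warehouse -Q "
SELECT OrderID, OrderDate
FROM dbo.Orders_stage
WHERE OrderDate IS NULL OR OrderDate > GETDATE();
"
```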