Junior Hadoop Developer Resume
San Francisco, CA
SUMMARY
- An enthusiastic bachelor's-level IT graduate with 6+ years of experience in IT. Over my career I have gained extensive experience with and understanding of Microsoft technologies, databases, virtualization, and the integration of ETL and big data technologies such as Hadoop and Informatica. Proven track record of holding positions that require a high degree of competence and responsibility. Respected team player who is willing to do what it takes to get the job done.
TECHNICAL SKILLS
Programming Languages: C++, Java, XML, UNIX shell scripting, and K-Shell.
Databases: Oracle 10g/11g, JDBC, TOAD, PL/SQL.
Operating Systems: Windows 7/8, UNIX, Microsoft SharePoint 2016, Microsoft Dynamics CRM 2016.
ETL Tool: Informatica PowerCenter 10.x/9.x, DataStage 11.3, Teradata.
Big Data Ecosystems: Hadoop, MapReduce, HDFS, HBase, Zookeeper, Hive, Pig, Sqoop, Oozie, Flume.
Reporting Tool: Tableau.
PROFESSIONAL EXPERIENCE
Junior Hadoop Developer
Confidential - San Francisco, CA
Responsibilities:
- Developed MapReduce programs to parse the raw data, populate staging tables and store the refined data in partitioned tables in the EDW.
- Created Hive queries that helped market analysts spot emerging trends by comparing fresh data with EDW reference tables and historical metrics.
- Enabled speedy reviews and first-mover advantages by using Oozie to automate data loading into the Hadoop Distributed File System and Pig to pre-process the data.
- Provided design recommendations and thought leadership to sponsors/stakeholders that improved review processes and resolved technical problems.
- Managed and reviewed Hadoop log files.
- Tested raw data and executed performance scripts.
- Shared responsibility for administration of Hadoop, Hive and Pig.
- Installed and configured MapReduce, Hive, and HDFS; implemented a CDH3 Hadoop cluster on CentOS. Assisted with performance tuning and monitoring.
- Created HBase tables to load large sets of structured, semi-structured and unstructured data coming from UNIX, NoSQL and a variety of portfolios.
- Supported code/design analysis, strategy development and project planning.
- Created reports for the BI team using Sqoop to import data into HDFS and Hive.
- Developed multiple MapReduce jobs in Java for data cleaning and preprocessing.
- Assisted with data capacity planning and node forecasting.
- Collaborated with the infrastructure, network, database, application and BI teams to ensure data quality and availability.
- Administered Pig, Hive, and HBase, including installing updates, patches, and upgrades.
ETL/BI Developer
Confidential, Mountain View, CA
Responsibilities:
- Hands-on experience with end-to-end data warehousing ETL routines, including developing graphs and notation to build data integration with Informatica, stress testing, and data quality processes.
- Developed several Informatica jobs for data integration of the source systems: Teradata, flat files, Oracle, and SQL Server.
- Provided production support for ETL Informatica jobs.
- Designed and deployed well-tuned graphs (generic and custom) for the UNIX environment.
- Practical experience working in multiple environments, including production, development, and testing.
- Thorough understanding of the issues involved in mappings, development and integration of a software product.
- Good fundamental knowledge of data warehouse concepts and methodologies.
- Experience in PL/SQL and its conversion into ETL Informatica.
- Strong working experience with data marts.
- Experience in requirement analysis, client interaction, development, testing and security.
- Worked in a sandbox environment while interacting extensively with EME to maintain version control on objects, using sandbox features such as check-in and check-out.
- Maintained locks on objects while working in the sandbox to maintain privacy.
- Ensured monthly audits were completed as outlined in the Quality Assurance Plan.
- Validated system-generated reports for source-to-target comparison, as well as reports created by the data correction.
- Hands-on experience in SharePoint, Java development, and Agile and Waterfall methodologies.
Environment: DataStage, Oracle 10g, SQL Server, Windows XP, UNIX, Teradata, SQL Developer, SQL, PL/SQL, Oracle SQL*Loader, Erwin 4.0, Trillium.
Programmer Analyst
Confidential, Kansas City, MO
Responsibilities:
- Worked on a web-based application using PHP Symfony and JavaScript with Ext JS.
- Created modules for both the client and server.
- Developed several DataStage jobs for data integration of the source systems: Teradata, flat files, Oracle, and SQL Server.
- Used the Informatica ETL tool to load data from source to target.
- Used Jira to track and meet project responsibilities.
- Used SQL to support client-related requests.
- Stored and merged code changes using GitHub.
Environment: Informatica, DataStage, PHP Storm 4.8, Symfony 4.2, JavaScript 4.2, Oracle 10g, SQL Server 2005, Windows XP, UNIX, Github 11.10.12.
ETL Developer/ Data Analyst
Confidential, Kansas City, Missouri
Responsibilities:
- Involved in business analysis and technical design sessions with business and technical staff to develop requirements document and ETL design specifications.
- Focal point for making sound decisions related to data collection, data analysis, data security, methodologies and designs.
- Used Erwin for logical and physical database modeling of the warehouse; responsible for creating database schemas based on the logical models.
- Maintained the data integrity during extraction, manipulation, processing, analysis and storage.
- Involved in performance tuning of targets, sources, mappings, and sessions.
- Wrote complex SQL scripts in place of Informatica lookups to improve performance on heavy data volumes.
- Created and monitored sessions using workflow manager and workflow monitor.
Environment: Informatica PowerCenter 9.5.1/9.1/8.6.1/8.1.1 SP4, Oracle 10g, SQL Server, Windows XP, UNIX, Teradata, SQL Developer, SQL, PL/SQL, Oracle SQL*Loader, Erwin 4.0, Trillium.
Junior ETL Developer/Data Analyst
Confidential, Tucson, AZ
Responsibilities:
- Worked with business analysts to identify appropriate sources for data warehouse and prepared the Business Release Documents, documented business rules, functional and technical designs, test cases, and user guides.
- Actively involved in the design and development of the star schema data model.
- Mentored Informatica developers on project for development, implementation, performance tuning of mappings and code reviews.
- Built data input and designed data collection screens.
- Managed database design, maintenance, administration, and security for the company.
- Data output: made data chart presentations, coded variables from original data, conducted statistical analysis as required, and provided summaries of the analysis.
- Implemented slowly changing and rapidly changing dimension methodologies; created aggregate fact tables for the creation of ad-hoc reports.
- Extensively worked on Connected & Unconnected Lookups, Router, Expressions, Source Qualifier, Aggregator, Filter, Sequence Generator, etc.
- Created and maintained surrogate keys on the master tables to handle SCD type 2 changes effectively.
- Extracted data from multiple sources (Oracle, SQL Server, DB2, and flat files) using Informatica and loaded it into a single data warehouse repository.
- Used SQL tools like TOAD to run SQL queries and validate the data in warehouse and mart.
- Designed and developed UNIX scripts to automate tasks.
- Involved in unit testing of the mappings and mapplets.
- Understood the components of a data quality plan and made informed choices between source data cleansing and target data cleansing.
Environment: Informatica PowerCenter 9.5.1/9.1/8.6.1, Ab Initio, Informatica Data Quality 8.6.2, Informatica Data Explorer, Oracle 11g, Erwin 4.0, TOAD 9.x, Shell Scripting, Teradata, SQL Server 2005/2008, Oracle SQL*Loader, PL/SQL, UNIX, Windows XP.
