Big Data Developer Resume Los Angeles - Hire IT People

PROFESSIONAL SUMMARY:

Over 7 years of total IT experience, including 3 years in Big Data technologies and 4 years implementing data warehousing solutions using IBM DataStage V8 and V11.
Extensive experience in using Hadoop ecosystem components like HDFS, MapReduce, Oozie, Pig, Hive, Sqoop, Flume, Kafka, Impala, HBase and ZooKeeper.
Experience in Apache Spark, Spark SQL and NoSQL databases like Cassandra, MongoDB and HBase.
Experience in installing, configuring and maintaining the Hadoop Cluster including YARN configuration using Cloudera, Hortonworks and AWS.
Experienced in integrating Hadoop with Apache Storm and Kafka. Expertise in loading clickstream data from Kafka into HDFS, HBase and Hive by integrating with Storm.
Expertise in the Scala programming language and Spark Core.
Experience in benchmarking Hadoop Cluster to tune and obtain the best performance out of it.
Good knowledge of S3 buckets, DynamoDB and Redshift.
Very good understanding of SQL, ETL and data warehousing technologies.
Familiar with all stages of Software Development Life Cycle, Issue Tracking, Version Control and Deployment.
Worked extensively on writing, tuning and profiling MapReduce and advanced MapReduce jobs in Java.
Experience in writing shell scripts, cron automation, regular expressions and MRUnit tests.
Hands-on experience with compression codecs like Snappy and BZIP2.
Implemented workflows in Oozie using Sqoop, MapReduce, Hive and other Java and Shell actions.
Good knowledge of working with Avro and Parquet formats.
Excellent knowledge of Data Flow Lifecycle and implementing transformations and analytic solutions.
Extended Hive and Pig core functionality by writing custom UDFs and creating SerDes in Hive.
Sound knowledge of designing data warehousing applications using tools like Teradata, Oracle and SQL Server.
Experience using the Talend ETL tool.
Knowledge of the Java Virtual Machine (JVM) and multithreaded processing.
Developed Web-Services module for integration using SOAP and REST.
Strong understanding of Agile Scrum and Waterfall SDLC methodologies.
Used DataStage Version Control to migrate the project from one environment to another.
Experience working with Parallel Extender for Parallel Processing to improve job performance while working with bulk data sources. Worked with most of the parallel stages applying different partitioning techniques.
Worked with various databases, including Oracle 10g/9i/8i/8.0/7.x, DB2, MS Access and SQL Server, with experience across the major relational database platforms.
Excellent analytical, problem solving, communication and interpersonal skills, with ability to interact with individuals at all levels.

TECHNICAL SKILLS:

Big Data Technologies: Hadoop, MapReduce, YARN, Pig, Hive, HBase, Sqoop, Scala, Spark, Python, PySpark, MongoDB with Python, Neo4j, Cassandra.

ETL Tools: IBM InfoSphere DataStage 11.3/9.1/8.5/8.1 (Manager, Designer, Director, Administrator), DataStage PX (Parallel Extender), QualityStage.

Databases: Oracle 10g/9i/8i/8.x/7.x, DB2 9.0/8.0/7.0, MS SQL Server 2005/7.0/6.5, MS Access 2000, SQL*Plus, SQL*Loader, TOAD 7.0 and Developer 2000.

Operating System: Sun Solaris 2.7/2.6, HP-UX 10.2/9.0, IBM AIX 4.3/4.2, Linux, MS DOS 6.22, Win 3.x/95/98/XP, Win NT 4.0, Sun Ultra, HP9000, IBM RS6000, AS400

Programming Skills: UNIX Shell Scripting, SQL, PL/SQL, SQL*Plus 3.3/8.0, Business Intelligence, C, C++, Java, JavaScript, SQL*Loader, VB, ASP, COBOL, HTML, XML.

Data Modeling Tools: Dimensional Data Modeling, Star Join Schema Modeling, Snowflake Modeling, Fact and Dimension Tables, Physical and Logical Data, Enterprise Database, Integration and Management, Microsoft Visual Studio.

BI Tools: OBIEE 11g, BI Publisher, Tableau 9.2, Cognos 10.x

PROFESSIONAL EXPERIENCE:

Confidential, Los Angeles

Big Data Developer

Responsibilities:

Involved in the design, development and testing phases of the Software Development Life Cycle.
Installed and configured Hadoop Ecosystem components.
Used the Spark DataFrame API to process structured and semi-structured files and load them back into an S3 bucket.
Migrated MapReduce jobs to Spark jobs to achieve better performance.
Performed advanced procedures like text analytics and processing, using the in-memory computing capabilities of Spark.
Automated and scheduled the Sqoop jobs using Unix shell scripts.
Imported the data from Oracle source and populated it into HDFS using Sqoop.
Developed a streaming data pipeline using Kafka and Storm to store data into HDFS.
Implemented a POC with Spark SQL to parse complex JSON records.
Automated the extraction of data from warehouses and weblogs by developing workflows and coordinator jobs in Oozie.
Developed MapReduce jobs to convert data files into Parquet file format.
Executed Hive queries on Parquet tables to perform data analysis to meet the business requirements.
Created table definitions and made the contents available as a schema-backed RDD.
Developed business-specific custom UDFs in Hive and Pig.
Exported the aggregated data to Oracle using Sqoop for reporting on the Tableau dashboard.
Analyzed large data sets to determine the optimal way to aggregate and report on them.

Environment: HDFS, MapReduce, Kafka, Storm, S3, Parquet, Pig, Hive, Sqoop, Spark, Oracle, Oozie, RedHat Linux, Tableau.

Confidential, Detroit

Big Data Developer

Responsibilities:

Involved in installing the cluster and configuring Hadoop ecosystem components.
Worked with the Hadoop administrator in rebalancing blocks and decommissioning nodes in the cluster.
Responsible for managing data coming from different sources.
Extracted data into HDFS using Flume and Kafka.
Imported and exported data between RDBMS and HDFS using Sqoop on a regular basis.
Developed, monitored and optimized MapReduce jobs for data cleaning and preprocessing.
Built a data pipeline using Pig and MapReduce in Java.
Implemented MapReduce jobs to write data into Avro format.
Automated all jobs for pulling data and loading it into Hive tables using Oozie workflows.
Used Hive to analyze the partitioned and bucketed data and compute various metrics for reporting on the dashboard.
Developed custom SerDes in Hive specific to the requirements.
Implemented pattern-matching algorithms with regular expressions, built profiles using Hive and stored the results in HBase.
Used Maven to build the application.
Implemented Unit Testing using MRUnit.

Environment: HDP, HDFS, Flume, Kafka, Sqoop, Pig, Hive, MapReduce, HBase, Oozie, MRUnit, Maven, Avro, RedHat Linux, RDBMS.

Confidential, Los Angeles

DataStage Developer

Responsibilities:

Designed, compiled, tested, scheduled and ran DataStage jobs.
Worked with various techniques such as schema bound views, partitioning, and ETL/Query optimization.
Worked across all phases of the development life cycle, including data cleansing, data integration, data conversion and performance tuning.
Developed Server jobs for extracting, transforming, integrating and loading data to targets.
Expertise in data warehousing techniques like data cleansing, Slowly Changing Dimensions and Change Data Capture.
Used DataStage Version Control to migrate the project from one environment to another.
Excellent analytical, problem solving, communication and interpersonal skills, with ability to interact with individuals at all levels.
Worked with SQL Server and created jobs to load data from SQL Server to Oracle.
Involved in extracting, transforming, loading and testing data from XML files, flat files, Oracle and DB2 using DataStage jobs.
Involved in performance tuning of the DataStage jobs and queries.
Wrote SQL in DB2 for use in DataStage and for testing the data.
Performed troubleshooting and performance tuning of ETL jobs.
Used DataStage Manager to import metadata from the repository, create new job categories and create new data elements.
Worked on NDM jobs for secure file transfers from one server to another.
Scheduled jobs with Autosys using Autosys scripts.
Used Microsoft Visual Studio to migrate jobs from one environment to another.
Used TFS to check in jobs from DataStage and files from the backend, and used the DSX generator to convert the files to .dsx for migration.
Provided front-end and back-end support for Hyperion Interactive Reporting.
Supported Production instances via Service Manager Tickets.

Environment: IBM InfoSphere DataStage 8.5, 9.5 - DataStage Designer, DataStage Director, DataStage Manager, DataStage Administrator, Autosys, Microsoft Visual Studio, Oracle, UNIX Shell Programming.

Confidential

DataStage Developer

Responsibilities:

Extensively used DataStage Designer to develop various Parallel jobs to extract, cleanse, transform, integrate and load data into Enterprise Data Warehouse tables.
Worked with the Business analysts and the DBA for requirements gathering, business analysis, testing, and project coordination.
Worked with DataStage Manager to import/export metadata and DataStage components between projects.
Involved in designing source-to-target mappings between sources and operational staging targets using a star schema; implemented logic for Slowly Changing Dimensions.
Involved in performance tuning of parallel jobs using performance statistics.
Used various standard and custom routines in DataStage jobs.
Tuned the Parallel jobs for better performance.
Responsible for adopting the standards for Stage and Link naming conventions.
Created and edited the design specification documents for the jobs.
Participated in discussions with Team leader, Group Members and Technical Manager regarding any technical and Business Requirement issues.
Developed parameterized, reusable DataStage jobs that can be run in multiple instances.
Performed unit testing on the jobs developed to ensure that they meet the requirements.
Coordinated with team members at times of change in Business requirements and change in Data Mart Schema.

Environment: DataStage 8.x (Designer, Director, Manager, Administrator), Oracle 8i, PL/SQL, UNIX Shell Programming, Windows NT.
