Big Data Engineer Resume
Long Beach, CA
SUMMARY:
- Over 8 years of experience in designing, developing and implementing Big Data and Data Warehouse solutions using various tools and technologies across domains.
- 3 years of experience implementing enterprise Big Data solutions on Hadoop along with ecosystem components such as Hive, Spark, Sqoop, Flume, Kafka, and MapReduce, and AWS services such as S3, EC2, EMR, Data Pipeline, Lambda, and Athena.
- Experience writing PySpark and Scala applications on the Spark framework, using RDDs and DataFrames for advanced data analytics (see the illustrative sketch at the end of this summary).
- Worked with structured and semi-structured formats such as SequenceFile, Avro, Parquet, RC/ORC, XML, and JSON in conjunction with Spark RDDs and DataFrames.
- Proficient in shell-scripting on Unix/Linux platforms.
- Extensively worked on identifying bottlenecks and performance tuning Hive scripts.
- Worked on various relational and columnar databases such as Oracle, DB2, SQL Server, Sybase, Redshift, Postgres, and Hive.
- Experience working with BI tools such as Tableau, Business Objects XI R2/R3, Crystal Reports, and Cognos.
- Implemented projects and enhancements using Agile methodology with weekly scrums.
- Good knowledge of sprint planning, conducting daily scrums, sprint reviews, and sprint retrospectives, helping stakeholders with user stories, preparing product and sprint backlogs, and monitoring sprint progress with burn-down charts.
- Possess good communication skills and strong problem-solving abilities; able to manage communication among team members in diverse situations.
- Versatile team player with excellent analytical, presentation, and interpersonal skills, and an aptitude for learning new technologies.
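The following is a minimal, illustrative PySpark sketch of the DataFrame/RDD and file-format work described above. The bucket, paths, and column names are hypothetical placeholders, not details from any engagement listed in this resume.

```python
from pyspark.sql import SparkSession
from pyspark.sql import functions as F

# Spark session for a batch analytics job
spark = SparkSession.builder.appName("format_demo").getOrCreate()

# Read a columnar Parquet dataset and a semi-structured JSON feed
# (bucket, prefixes, and column names are hypothetical placeholders)
orders = spark.read.parquet("s3a://example-bucket/warehouse/orders/")
events = spark.read.json("s3a://example-bucket/raw/events/")

# DataFrame-style aggregation: total order amount per customer
totals = (orders
          .groupBy("customer_id")
          .agg(F.sum("order_amount").alias("total_amount")))

# Drop down to the RDD API where row-level control is preferred
event_counts = (events.rdd
                .map(lambda row: (row["event_type"], 1))
                .reduceByKey(lambda a, b: a + b))

# Persist the curated result back to Parquet for downstream consumers
totals.write.mode("overwrite").parquet("s3a://example-bucket/curated/customer_totals/")

print(event_counts.take(5))
spark.stop()
```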
TECHNICAL SKILLS:
Hadoop Eco-system: Hadoop, Spark, Hive, HDFS, Sqoop, Zookeeper, MapReduce, Flume, Kafka, Impala & Hue on CDH 5.x
AWS tools: S3, Redshift, EC2, Data Pipeline, EMR, Kinesis, RDS, Glacier, Athena, Lambda
BI Applications: Business Objects XI R2/R3, Cognos 8, Tableau
Databases: Oracle 12c/11g/10g, SQL Server, Postgres, Sybase, DB2
Programming Languages: Python, Java, PL/SQL, Scala/PySpark
File Formats: JSON/XML, Parquet, Avro, Sequence, RC/ORC, Delimited/Fixed Width
PROFESSIONAL EXPERIENCE:
Confidential, Long Beach, CA
Big Data Engineer
Roles & Responsibilities:
- Involved in review of functional and non-functional requirements.
- Worked with admins to install and configure Hadoop MapReduce and HDFS, and developed multiple MapReduce jobs for data cleaning and preprocessing.
- Imported and exported data into HDFS and Hive using Sqoop.
- Managed and reviewed Hadoop log files.
- Extensively worked on Spark and Spark Streaming, leveraging RDDs and DataFrames through PySpark to work with Parquet files and Impala (see the sketch after this section).
- Set up and benchmarked Hadoop clusters using TeraGen and TeraSort for internal audit needs.
- Designed and converted existing SSIS packages to Hive jobs.
- Worked heavily on SQL optimization in close collaboration with other application teams.
Environment: Cloudera 5.11, Spark, Hive, Tableau, UNIX, SQL, SQL Server, Kafka, Sqoop, Attunity
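As a hedged illustration of the Spark Streaming and Parquet/Impala work above, the sketch below uses Spark Structured Streaming (a simplification of the original RDD-based approach). The broker, topic, schema fields, and HDFS paths are hypothetical, and it assumes the Spark Kafka connector package is available on the cluster.

```python
from pyspark.sql import SparkSession
from pyspark.sql import functions as F
from pyspark.sql.types import StructType, StructField, StringType, DoubleType

spark = SparkSession.builder.appName("kafka_to_parquet").getOrCreate()

# Schema of the incoming JSON messages (field names are hypothetical)
schema = StructType([
    StructField("record_id", StringType()),
    StructField("source_system", StringType()),
    StructField("amount", DoubleType()),
])

# Consume messages from Kafka (broker and topic names are hypothetical;
# requires the spark-sql-kafka connector on the cluster)
raw = (spark.readStream
       .format("kafka")
       .option("kafka.bootstrap.servers", "broker1:9092")
       .option("subscribe", "ingest_topic")
       .load())

# Kafka values arrive as bytes; cast to string and parse the JSON payload
parsed = (raw
          .select(F.from_json(F.col("value").cast("string"), schema).alias("rec"))
          .select("rec.*"))

# Land micro-batches as Parquet in HDFS, where Hive/Impala external tables can query them
query = (parsed.writeStream
         .format("parquet")
         .option("path", "/data/landing/ingest/")
         .option("checkpointLocation", "/data/checkpoints/ingest/")
         .outputMode("append")
         .start())

query.awaitTermination()
```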
Confidential, Los Angeles, CA
Data Engineer
Roles & Responsibilities:
- Responsible for managing Hadoop clusters and monitoring logs.
- Created Hadoop workflows within the IT infrastructure to facilitate workload offloading.
- Performed data analysis using Hive and Pig.
- Loaded OLTP data and external data into HDFS using Sqoop and Flume.
- Designed and implemented processing of periodic marketing information, including information enrichment, text analytics, and natural language processing.
- Developed a high-performance cache, stabilizing the site and improving its performance.
- Performed PoCs on AWS tools such as S3, EC2, and Redshift to migrate subsequent projects to Redshift.
- Created and managed tables, databases, security groups, and WLM configuration on the Redshift cluster.
- Utilized COPY and UNLOAD commands to move data between S3 and Redshift (see the sketch after this section).
Environment: Hadoop, Hive, MapReduce, Sqoop, HDFS, Oozie, UNIX, SQL, Oracle, Tableau, Informatica 8.6, AWS S3/EC2/EMR
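A minimal sketch of the Redshift COPY/UNLOAD pattern referenced above, issued from Python via psycopg2. The cluster endpoint, credentials, IAM role, bucket, and table names are all hypothetical placeholders.

```python
import psycopg2

# Cluster endpoint, credentials, IAM role, bucket, and table names are hypothetical
conn = psycopg2.connect(
    host="example-cluster.abc123.us-west-2.redshift.amazonaws.com",
    port=5439,
    dbname="analytics",
    user="etl_user",
    password="<redacted>",
)
conn.autocommit = True
cur = conn.cursor()

# COPY: bulk-load delimited, gzipped files from S3 into a Redshift table
cur.execute("""
    COPY marketing.campaign_facts
    FROM 's3://example-bucket/staging/campaign_facts/'
    IAM_ROLE 'arn:aws:iam::123456789012:role/RedshiftCopyRole'
    DELIMITER '|' GZIP TIMEFORMAT 'auto';
""")

# UNLOAD: export aggregated query results back to S3 for downstream consumers
cur.execute("""
    UNLOAD ('SELECT campaign_id, SUM(spend) FROM marketing.campaign_facts GROUP BY 1')
    TO 's3://example-bucket/exports/campaign_spend_'
    IAM_ROLE 'arn:aws:iam::123456789012:role/RedshiftCopyRole'
    DELIMITER '|' GZIP ALLOWOVERWRITE;
""")

cur.close()
conn.close()
```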
Confidential
DW/BI Developer
Roles & Responsibilities:
- Involved in the project from requirements gathering and BRD analysis through design and development of the warehouse, converting data from existing systems.
- Raised change requests, handled incident management, and analyzed and coordinated resolution of program flaws in the development environment, hot-fixing them in the QA, pre-prod, and prod environments during runs using the QA Complete ticketing system.
- Involved in testing, debugging, validation, and performance tuning of the data warehouse; helped develop optimal solutions for data warehouse deliverables.
- Involved in designing and building the Universe, classes, and objects.
- Created hierarchies and complex objects in the universe using various @functions, aliases, contexts, and aggregate objects.
- Involved in production planning and analysis with Business and Project Management.
Environment: Business Objects XI R2, Tableau, UNIX, SQL, PL/SQL, Oracle, DB2/Mainframes, Informatica PowerCenter 8, Control-M
Confidential
BI Developer
Roles & Responsibilities:
- Resolved production issues raised through JIRA on a priority basis.
- Analyzed change requests (CRs) received via JIRA and created SRs with Informatica for any PowerCenter product issues.
- Migrated objects per change requests (CRs) across all project phases (DEV, QA, and PROD) and trained new developers to provide support in UAT and production environments.
- Worked on VB Macros in Excel for reporting purposes.
- Ensured proper configuration of the PowerCenter domain components.
- Administered and developed Crystal and web reports on Business Objects XI R2/R3.
- Created drill-downs, sub-reports, formulas, and charts in Crystal Reports.
Environment: Business Objects XI R2/R3, Crystal Reports, UNIX, SQL, PL/SQL, Oracle, Sybase, Autosys
