Data Engineer Resume
Los Angeles, CA
PROFESSIONAL SUMMARY:
- Over 8 years of professional IT experience with a strong emphasis on the development and testing of software applications
- Over 4 years of comprehensive experience in Big Data and the Hadoop ecosystem
- Hands-on experience with Big Data technology stacks including HDFS, Hive, HBase, Flume, Kafka, MapReduce, Oozie, Sqoop, Spark, and Tez
- Proficient in analyzing, designing, and developing solutions and in translating business requirements into technical requirements, with a good understanding of product development
- Deep experience in data warehousing, delivering hybrid data engineering architectures using MPP databases and Hadoop
- Broad set of technology skills with hands-on experience in a wide variety of high-volume systems, from terabyte-class EDWs to analytical systems
- Hands-on experience setting up and creating data visualizations using Tableau
- Adept at working with technical and non-technical audiences
- Excellent communication and documentation skills
TECHNICAL SKILLS:
Big Data: Hive, Sqoop, Flume, Kafka, Oozie, Tableau, Spark, Scala
Java/Web: Java, J2EE, Servlets, Spring, JDBC, SQL, SQL Server, JSP, CSS, HTML, JavaScript, XML, SVN, Ant
Tools: Mercury Tools, SQL Server Management Studio, Oracle SQL Developer, Quality Center, ALM, McKesson ClaimsXten & Clear Claims Connection, Amisys Advance, Facets
Operating Systems: MVS, UNIX, Windows XP/7
Scripting Languages: Python, Shell Scripting
Databases: SQL Server 2008 R2, Oracle 10/11g
PROFESSIONAL EXPERIENCE
Confidential - Los Angeles, CA
Data Engineer
- Built critical data sets in a hybrid data hub (Hadoop and Netezza EDW) for analytics and BI needs serving all of Confidential's business units
- Managed complex projects and initiatives that significantly impacted business results and required a high degree of cross-functional participation and coordination
- Instrumental in integrating Hadoop with the Enterprise Data Warehouse, thereby saving 40% of space in Netezza
- Made extensive use of Sqoop and the Hive data warehouse environment for ETL and ELT
- Assisted the BI team with MicroStrategy setup using Hive for data analytics
- Worked with cross-functional teams such as Data Science and Machine Learning on various data ingestion needs
- Created and automated data ingestion pipelines using Oozie and shell scripts
- Created Hive tables using multiple storage formats and compression techniques
- Used Hive to analyze partitioned and bucketed data and compute various metrics for reporting (see the illustrative sketch after this list)
- Used Oozie to build streaming workflows that listen to various Kafka topics and transform the data using Hive before loading it into HDFS
- Involved in collecting and aggregating log data using Flume and staging it in HDFS for analysis
- Created multiple data pipelines capable of handling 1 TB of data
- Worked with core Python data structures, including lists and dictionaries
- Built data visualization dashboards in Tableau for real-time monitoring and data analysis
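Illustrative sketch (not production code): a minimal example of the kind of partitioned, bucketed, ORC-backed Hive table and reporting metric described above, shown through the standard HiveServer2 JDBC driver so the example is self-contained in Java; the connection URL, table, and column names are hypothetical placeholders, not the actual Confidential schema.

    import java.sql.Connection;
    import java.sql.DriverManager;
    import java.sql.ResultSet;
    import java.sql.Statement;

    public class HiveMetricSketch {
        public static void main(String[] args) throws Exception {
            // Requires the hive-jdbc driver on the classpath; endpoint and credentials are placeholders
            Class.forName("org.apache.hive.jdbc.HiveDriver");
            Connection conn = DriverManager.getConnection(
                    "jdbc:hive2://hive-gateway:10000/analytics", "etl_user", "");
            Statement stmt = conn.createStatement();

            // Partitioned, bucketed table stored as ORC with ZLIB compression
            stmt.execute(
                "CREATE TABLE IF NOT EXISTS sales_daily (" +
                "  order_id STRING, customer_id STRING, amount DOUBLE) " +
                "PARTITIONED BY (sale_date STRING) " +
                "CLUSTERED BY (customer_id) INTO 32 BUCKETS " +
                "STORED AS ORC TBLPROPERTIES ('orc.compress'='ZLIB')");

            // Simple reporting metric computed over the partitioned data
            ResultSet rs = stmt.executeQuery(
                "SELECT sale_date, COUNT(*) AS orders, SUM(amount) AS revenue " +
                "FROM sales_daily GROUP BY sale_date");
            while (rs.next()) {
                System.out.printf("%s %d %.2f%n",
                        rs.getString(1), rs.getLong(2), rs.getDouble(3));
            }
            conn.close();
        }
    }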
Environment: Hadoop, HDFS, Informatica PowerCenter, Oracle, Hive, Sqoop, Flume, Kafka, Netezza Striper, Tableau, Agile, Jira, GitHub
Confidential - Quincy, MA
Hadoop Developer
- Developed solutions to ingest data into HDFS (Hadoop Distributed File System), process it within Hadoop, and emit summary results from Hadoop to downstream systems
- Worked extensively with Sqoop to ingest data from various source systems into HDFS
- Worked entirely within an agile methodology and developed Spark scripts using the Scala shell
- Involved in transferring data from mainframe tables to HDFS and HBase tables using Sqoop
- Developed simple to complex MapReduce jobs using Hive, Pig, and Python (see the first sketch below)
- Analyzed the Hadoop cluster and various big data analytics tools, including MapReduce, Pig, and Hive
- Created partitions and buckets based on state to enable further processing with bucket-based Hive joins
- Involved in developing Pig UDFs for needed functionality, such as custom Pig loaders
- Involved in writing optimized Pig scripts and in developing and testing Pig Latin scripts
- Worked on custom Pig loader and storage classes to handle a variety of data formats such as JSON and compressed CSV
- Integrated the HBase NoSQL database with MapReduce to move bulk data into HBase (see the second sketch below)
- Used relational sources (SQL Server, DB2, Teradata) for integration into the Hadoop cluster and analyzed data through Hive-HBase integration
- Created HBase tables to store data arriving in variable formats from different portfolios
- Used Oozie to schedule workflows that perform shell and Hive actions
- Worked on stand-alone as well as distributed Hadoop applications
- Experienced in working with Avro data files using the Avro serialization system
- Implemented Kerberos security to safeguard the cluster
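Illustrative sketch (first of two for this role): a minimal MapReduce job matching the "process within Hadoop and emit summary results" pattern above, counting records per state from tab-delimited input; the input layout, field position, and paths are assumptions for illustration only.

    import java.io.IOException;

    import org.apache.hadoop.conf.Configuration;
    import org.apache.hadoop.fs.Path;
    import org.apache.hadoop.io.IntWritable;
    import org.apache.hadoop.io.LongWritable;
    import org.apache.hadoop.io.Text;
    import org.apache.hadoop.mapreduce.Job;
    import org.apache.hadoop.mapreduce.Mapper;
    import org.apache.hadoop.mapreduce.Reducer;
    import org.apache.hadoop.mapreduce.lib.input.FileInputFormat;
    import org.apache.hadoop.mapreduce.lib.output.FileOutputFormat;

    public class StateCountJob {

        // Emits (state, 1) for each tab-delimited record; the state field position is assumed
        public static class StateMapper extends Mapper<LongWritable, Text, Text, IntWritable> {
            private static final IntWritable ONE = new IntWritable(1);
            private final Text state = new Text();

            @Override
            protected void map(LongWritable key, Text value, Context context)
                    throws IOException, InterruptedException {
                String[] fields = value.toString().split("\t");
                if (fields.length > 0 && !fields[0].isEmpty()) {
                    state.set(fields[0]);
                    context.write(state, ONE);
                }
            }
        }

        // Sums counts per state to produce the summary fed to downstream systems
        public static class SumReducer extends Reducer<Text, IntWritable, Text, IntWritable> {
            @Override
            protected void reduce(Text key, Iterable<IntWritable> values, Context context)
                    throws IOException, InterruptedException {
                int sum = 0;
                for (IntWritable v : values) {
                    sum += v.get();
                }
                context.write(key, new IntWritable(sum));
            }
        }

        public static void main(String[] args) throws Exception {
            Job job = Job.getInstance(new Configuration(), "state-count-summary");
            job.setJarByClass(StateCountJob.class);
            job.setMapperClass(StateMapper.class);
            job.setCombinerClass(SumReducer.class);
            job.setReducerClass(SumReducer.class);
            job.setOutputKeyClass(Text.class);
            job.setOutputValueClass(IntWritable.class);
            FileInputFormat.addInputPath(job, new Path(args[0]));
            FileOutputFormat.setOutputPath(job, new Path(args[1]));
            System.exit(job.waitForCompletion(true) ? 0 : 1);
        }
    }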
Environment: Hadoop, HDFS, Pig, Hive, MapReduce, Sqoop, Oozie, ZooKeeper, HBase, Java, Eclipse, SQL Server, Shell Scripting
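Illustrative sketch (second of two for this role): writing variably formatted records into an HBase table via the standard HBase client API, as referenced in the bullets above; the table name, column family, row-key scheme, and payload are hypothetical.

    import org.apache.hadoop.conf.Configuration;
    import org.apache.hadoop.hbase.HBaseConfiguration;
    import org.apache.hadoop.hbase.TableName;
    import org.apache.hadoop.hbase.client.Connection;
    import org.apache.hadoop.hbase.client.ConnectionFactory;
    import org.apache.hadoop.hbase.client.Put;
    import org.apache.hadoop.hbase.client.Table;
    import org.apache.hadoop.hbase.util.Bytes;

    public class PortfolioHBaseWriter {
        public static void main(String[] args) throws Exception {
            // Reads hbase-site.xml from the classpath to locate the cluster
            Configuration conf = HBaseConfiguration.create();
            try (Connection connection = ConnectionFactory.createConnection(conf);
                 Table table = connection.getTable(TableName.valueOf("portfolio_events"))) {

                // Hypothetical row key: portfolio id + record id; the raw payload is stored as-is
                Put put = new Put(Bytes.toBytes("portfolioA|rec-0001"));
                put.addColumn(Bytes.toBytes("d"), Bytes.toBytes("format"), Bytes.toBytes("json"));
                put.addColumn(Bytes.toBytes("d"), Bytes.toBytes("payload"),
                        Bytes.toBytes("{\"field\":\"value\"}"));
                table.put(put);
            }
        }
    }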
Confidential - Durham, NC
Java Developer
- Involved in understanding the functional requirements and converting them into a technical design document
- Implemented the presentation layer using Servlets, JSP, CSS, HTML, and JavaScript
- Developed JSPs and Servlets to provide a mechanism for obtaining electronic and printed pricelists for list prices, regional prices and customer-specific prices
- Used Spring Framework to provide architectural flexibility
- Designed and developed a JDBC module to read and write data in Oracle and SQL Server databases and convert it to XML format (see the illustrative sketch after this list)
- Parsed XML data using the Xerces parser to display it on JSPs
- Designed and developed Session and Entity beans
- Implemented hierarchical control mechanism to provide different permission levels to different users to modify pricing rules
- Provided control mechanisms to allow a salesman to view customer accounts associated with his login
- Implemented hierarchical definition of products, customers and channels
- Involved in unit testing and developed test cases
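Illustrative sketch: a minimal version of the JDBC-to-XML step described above, reading rows over JDBC and emitting an XML document with the JAXP DOM and Transformer APIs; the connection URL, table, and element names are hypothetical placeholders.

    import java.io.StringWriter;
    import java.sql.Connection;
    import java.sql.DriverManager;
    import java.sql.ResultSet;
    import java.sql.Statement;

    import javax.xml.parsers.DocumentBuilderFactory;
    import javax.xml.transform.OutputKeys;
    import javax.xml.transform.Transformer;
    import javax.xml.transform.TransformerFactory;
    import javax.xml.transform.dom.DOMSource;
    import javax.xml.transform.stream.StreamResult;

    import org.w3c.dom.Document;
    import org.w3c.dom.Element;

    public class PriceListXmlExporter {
        public static void main(String[] args) throws Exception {
            Document doc = DocumentBuilderFactory.newInstance()
                    .newDocumentBuilder().newDocument();
            Element root = doc.createElement("priceList");
            doc.appendChild(root);

            // Hypothetical Oracle connection and pricing table; SQL Server is analogous with a different URL/driver
            try (Connection conn = DriverManager.getConnection(
                         "jdbc:oracle:thin:@dbhost:1521:PRICING", "app_user", "secret");
                 Statement stmt = conn.createStatement();
                 ResultSet rs = stmt.executeQuery(
                         "SELECT product_id, region, price FROM product_prices")) {
                while (rs.next()) {
                    // One XML element per row, carrying the pricing details as attributes
                    Element item = doc.createElement("price");
                    item.setAttribute("productId", rs.getString("product_id"));
                    item.setAttribute("region", rs.getString("region"));
                    item.setAttribute("amount", rs.getString("price"));
                    root.appendChild(item);
                }
            }

            // Serialize the DOM tree to an XML string for downstream display (e.g., on a JSP)
            Transformer transformer = TransformerFactory.newInstance().newTransformer();
            transformer.setOutputProperty(OutputKeys.INDENT, "yes");
            StringWriter out = new StringWriter();
            transformer.transform(new DOMSource(doc), new StreamResult(out));
            System.out.println(out);
        }
    }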
Environment: Java, J2EE, Servlets, Spring, JDBC, SQL, SQL Server, JSP, CSS, HTML, JavaScript, XML, Xerces, SVN, Ant
Confidential - Minneapolis, MN
QA Analyst
- Primarily responsible for developing and maintaining test plans and test cases for UHG's highly visible Member EOB Migration system
- Responsibilities also included providing requirements analysis for the Coordination of Benefits, Primary Claim Calculation and 835 Bundling projects
- Extensively worked on EDI Transactions 270, 271, 276, 277, 834, 835, 837
- Tested HIPAA claims processing through the clearinghouse
- Extensively worked on the Provider and Member Explanation of Benefits Migration
Environment: MS Visual Studio 2010, HP Quality Center 10.0, Oracle 9.6.1.1, UltraEdit, Microsoft Office