Big Data Senior Engineer Resume
SUMMARY
- 9 years of professional IT experience, including Hadoop ecosystem experience in the ingestion, storage, querying, processing, and analysis of big data.
- In-depth knowledge of Hadoop architecture and components such as HDFS, ResourceManager, NodeManager, NameNode, DataNode, and MapReduce concepts.
- 2.9 years of experience using Hadoop ecosystem components such as MapReduce, YARN, Pig, Hive, Sqoop, Flume, Oozie, and Spark.
- Experience writing Pig Latin scripts to sort, group, join, and filter data as part of data transformation, per business requirements.
- Experience writing custom UDFs to extend core Hive and Pig functionality.
- Designed and implemented a Hive-based data warehouse solution.
- Experience importing and exporting data between RDBMSs and HDFS/Hive using Sqoop.
- Excellent understanding and knowledge of NoSQL databases such as HBase.
- Good knowledge of optimizing MapReduce jobs using Combiners and Partitioners to deliver the best results.
- Hands-on experience in Spark programming with Python, with good knowledge of Spark architecture and its in-memory processing.
- Good knowledge of Spark Streaming and Spark SQL.
- In-depth knowledge of object-oriented programming concepts, JSP, JavaScript, HTML, XML, and Oracle 9i/10g.
- Good knowledge of JDBC, J2EE, IBM DB2, and IBM WebSphere 6.0 Application Server.
- Excellent experience monitoring production jobs using the Control-M scheduling tool.
- Development experience with IDEs such as Eclipse and IBM Application Server Toolkit.
- Good experience with SQL and knowledge of Aqua Data Studio, QueryTool, WinSCP, and PuTTY.
- Good knowledge of Agile Scrum practices.
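The MapReduce and Combiner items above can be illustrated with a minimal Hadoop Streaming-style word count in Python. This is a hypothetical sketch, not code from any listed project; the mapper combines counts locally (the role a Combiner plays in a MapReduce job) before emitting, and the reducer sums the partial counts per key.

```python
#!/usr/bin/env python
"""Hadoop Streaming-style word count with in-mapper combining.

Hypothetical sketch: the mapper aggregates counts locally before
emitting (reducing shuffle volume, as a Combiner would), and the
reducer sums the partial counts per word.
"""
from collections import defaultdict


def mapper(lines):
    """Emit sorted (word, partial_count) pairs, combining locally first."""
    counts = defaultdict(int)
    for line in lines:
        for word in line.split():
            counts[word.lower()] += 1
    return sorted(counts.items())


def reducer(pairs):
    """Sum partial counts per word; input arrives sorted by key, as after the shuffle."""
    totals = defaultdict(int)
    for word, count in pairs:
        totals[word] += count
    return dict(totals)


if __name__ == "__main__":
    lines = ["big data big", "data pipeline"]
    print(reducer(mapper(lines)))  # {'big': 2, 'data': 2, 'pipeline': 1}
```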
TECHNICAL SKILLS
Languages: J2SE 1.6, JDBC, Servlets, JSP, EJB 2.0
Frameworks: Hadoop, MVC
Big Data Ecosystem: MapReduce, HDFS, Hive, Pig, Sqoop, HBase, ZooKeeper, Flume, Oozie, Spark
Internet Technologies: HTML, XML, JavaScript, CSS
Databases: IBM DB2, Oracle 9i/10g
Tools / IDEs: Eclipse, NetBeans, PuTTY, WinSCP, Aqua Data Studio
Servers: IBM WebSphere, WebLogic, and Tomcat
Platforms: Linux, Windows
Source Control: Tortoise SVN
PROFESSIONAL EXPERIENCE
Confidential
Big Data Senior Engineer
Environment: Hadoop, HDFS, Pig, Hive, Sqoop, and Spark
Responsibilities:
- Gathered data from multiple data sources and organized it.
- Involved in loading data from the UNIX file system into HDFS.
- Extracted data from databases into HDFS using Sqoop.
- Handled importing of data from various data sources, performed transformations using Hive and Spark, and loaded the data into HDFS.
- Worked extensively on the Spark core and Spark SQL modules.
- Set up cron jobs to schedule the execution of shell scripts.
- Extended Hive functionality by writing custom UDFs.
- Developed simple to complex MapReduce jobs using Hive and Pig.
- Very good understanding of partitioning and bucketing concepts in Hive; designed both managed and external Hive tables to optimize performance.
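The custom Hive UDFs above were presumably Java-based; as a hedged illustration of the same per-row extension idea, Hive can also invoke a streaming script through its TRANSFORM clause. Below is a minimal Python script of that shape; the column names (user_id, amount) and the 10% rate are hypothetical, for illustration only.

```python
#!/usr/bin/env python
"""Row transform of the kind Hive can invoke via SELECT TRANSFORM ... USING.

Reads tab-separated (user_id, amount) rows and emits
(user_id, amount_with_tax) rows. Column names and the 10% rate
are hypothetical, for illustration only.
"""
import sys


def transform_row(line):
    """Apply the per-row logic a custom UDF would otherwise implement."""
    user_id, amount = line.rstrip("\n").split("\t")
    return "%s\t%.2f" % (user_id, float(amount) * 1.10)


if __name__ == "__main__":
    # In a Hive TRANSFORM pipeline, rows arrive on stdin and leave on stdout:
    for line in sys.stdin:
        if line.strip():
            print(transform_row(line))
```

A hypothetical HiveQL hook-up would look like: `SELECT TRANSFORM(user_id, amount) USING 'python transform.py' AS (user_id, amount_with_tax) FROM sales;` (table and column names assumed).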
Confidential
Big Data Engineer
Environment: Hadoop, HDFS, Pig, Hive, Sqoop, MySQL
Responsibilities:
- Involved in requirement gathering, design, and development.
- Involved in converting XML and JSON files to CSV using Pig scripts.
- Created Hive tables to store the processed results in a tabular format.
- Worked extensively on Hive queries and Pig scripts.
- Developed Sqoop scripts to load data from MySQL tables into Hive tables.
- Developed Pig and Hive scripts to process the data.
- Created Hive UDFs based on business requirements.
- Developed UNIX shell scripts to execute the Pig and Hive scripts.
- Set up cron jobs to schedule the execution of shell scripts.
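The XML/JSON-to-CSV conversion above was done with Pig scripts; as a standalone illustration of the same flattening step, here is an equivalent sketch in plain Python. The record fields (id, name) are hypothetical.

```python
#!/usr/bin/env python
"""JSON-to-CSV flattening sketch.

The project above used Pig for this conversion; this is an equivalent
standalone illustration in Python. Field names are hypothetical.
"""
import csv
import io
import json


def json_records_to_csv(records_json, fieldnames):
    """Convert a JSON array of objects into CSV text with a header row."""
    records = json.loads(records_json)
    out = io.StringIO()
    # extrasaction="ignore" drops JSON keys not listed in fieldnames
    writer = csv.DictWriter(out, fieldnames=fieldnames, extrasaction="ignore")
    writer.writeheader()
    writer.writerows(records)
    return out.getvalue()


if __name__ == "__main__":
    data = '[{"id": 1, "name": "alpha"}, {"id": 2, "name": "beta"}]'
    print(json_records_to_csv(data, ["id", "name"]))
```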
Confidential
Big Data Engineer
Environment: Hive, Sqoop, Talend ETL (basic jobs), and Spark
Responsibilities:
- Gathered data from multiple data sources and organized it.
- Connected with data architects to design the ETL workflow.
- Implemented algorithms with the Spark framework (Python) as per requirements.
- Developed Hive and Spark scripts for the ETL process.
- Monitored and debugged processes.
- Wrote custom UDFs for some of the requirements.
- Set up cron jobs to schedule the execution of shell scripts.
- Developed Sqoop scripts to load data from MySQL tables into Hive tables.
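As a hedged sketch of the record-level transform logic such PySpark ETL jobs apply, here is a plain Python function of the kind that would be passed to `rdd.map`; it is written standalone so it runs without Spark, and the schema (name, country, revenue) is hypothetical.

```python
#!/usr/bin/env python
"""Record-level ETL transform sketch.

In a PySpark job a function like this would be applied per record,
e.g. rdd.map(clean_record); it is plain Python here so it runs
standalone. The (name, country, revenue) schema is hypothetical.
"""


def clean_record(raw):
    """Normalize one raw CSV record: trim fields, uppercase country, cast revenue."""
    name, country, revenue = (field.strip() for field in raw.split(","))
    return {
        "name": name,
        "country": country.upper(),
        "revenue": float(revenue or 0.0),  # default a missing revenue to 0.0
    }


if __name__ == "__main__":
    rows = ["acme , us, 120.5", "globex,de , 80"]
    print([clean_record(r) for r in rows])
```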
Confidential
Software Engineer
Environment: Java 1.5, Servlets, JSP, JDBC, Apache Tomcat, Oracle 10g, Tortoise SVN
Responsibilities:
- Developed parsers for different types of tests and test data to structure the data into a standard format, and uploaded the structured data to a central repository (an Oracle database).
- Automated the parsing and upload of data according to a configurable schedule.
- Developed a web interface to display the data to modelers for analysis.
- Developed a report module to generate reports and charts from the structured data for further analysis.
- Prepared test case documents and performed unit testing.
- Hands-on experience with SVN as the version control tool.
Confidential
Senior System Analyst
Environment: Java 1.5, Servlets, JSP, JDBC, EJB 2.0, IBM WebSphere 6.0, IBM DB2, Oracle 9i/10g, Aqua Data Studio, PuTTY, WinSCP, QueryTool, Control-M, IBM Application Server Toolkit, Tortoise SVN
Responsibilities:
- Developed the session beans and entity beans per the TS.
- Prepared test case documents and performed unit testing per the FS.
- Prepared DB scripts where required and prepared code review documents.
- Checked sources into the corresponding branch of the SVN repository (e.g., the E12.03.06 VOB).
- Involved in SIT, UAT, and UVT testing until UVT sign-off, with end-to-end support until successful release to production.
- Fixed bugs from the error logs in the IBM WebSphere application server while SIT and UAT were in progress.
- Involved in fixing bugs raised in HP Quality Center 10.0.
- Supported end-of-day (EOD) processing for 11 countries using the Control-M batch scheduling tool.
Confidential
Software Engineer
Environment: Java 1.4, Servlets, JSP, JDBC, JavaScript, WebLogic, Oracle 9i/10g, Eclipse, PuTTY, WinSCP, QueryTool, Tortoise SVN
Responsibilities:
- Developed JDBC programs to connect to the Oracle database.
- Developed HTML pages and wrote JavaScript to validate form data.
- Involved in developing various JSP screens and servlet classes.
- Prepared test case documents and performed unit testing.
- Prepared code review documents and fixed bugs.
- Hands-on experience with SVN as the version control tool.
Confidential
Software Engineer
Environment: Java 1.4, Servlets, JSP, JDBC, JavaScript, IBM WebSphere, IBM DB2, Eclipse, PuTTY, WinSCP, QueryTool, Tortoise SVN
Responsibilities:
- Developed JDBC programs to connect to the IBM DB2 database.
- Developed HTML pages and wrote JavaScript to validate form data.
- Involved in developing JSP screens.
- Developed new servlet classes and enhanced existing servlets.
- Prepared test case documents and performed unit testing.
- Prepared DB patches where required and prepared code review documents.
- Hands-on experience with SVN as the version control tool.