Big Data Engineer Resume
SUMMARY:
- Around 8 years of experience in enterprise applications and in building data systems and pipelines that extract, classify, merge, and deliver useful insights from large volumes of complex data.
- Designed solutions for big data pipeline management over SQL and NoSQL data using Sqoop, Oozie, HDFS, MapReduce, Hive, Pig, and Spark.
- Experience installing the Hortonworks distribution on the AWS cloud.
- Experience with other distributions such as CDH and MapR.
- Experience installing and customizing the Confidential ingestion framework.
- Worked with different file formats such as Avro and Parquet.
- Experience analyzing Google clickstream data to help the business formulate its strategies.
- Strong working experience with Spark Core, Spark SQL, and Spark Streaming using Scala, as well as Hive, Flume, Kafka, Oozie, ZooKeeper, and HBase.
- Implemented Spark (Scala) RDD transformations and actions to migrate MapReduce programs (see the sketch at the end of this summary).
- Created low-latency, high-throughput queries that helped market analysts spot emerging trends by comparing fresh data with EDW reference tables and historical metrics via Hive, Spark, Spark SQL, HiveContext, and HBase.
- Exposure to Cloudera Manager, Ambari, ZooKeeper, Kafka, and Flume.
- Excellent knowledge of performance optimization techniques, J2EE and Java design patterns, UML design, and object-oriented modeling.
- Experience in analysis and design using SDLC, UML, and MS Visio.
- Expert in enterprise application design and development using J2EE, EJB, SOAP web services, RESTful services, the Spring Framework, JUnit, TestNG, Spock, SoapUI, JDBC, and ORM.
- Experience fine-tuning applications and improving application response time.
- Experience using messaging services such as MQ Series and Apache ActiveMQ.
- Worked extensively on the Spring Framework, including Spring Integration and Spring Security.
- Good exposure to SSL, including the SHA-1 to SHA-2 migration and the SSLv3 disablement initiative.
- Work experience with certificate-based security and a good understanding of PKI.
- Work experience with CI/CD using Jenkins, Maven, and Nexus repositories.
- Work experience with authentication and authorization products such as PingAccess and SiteMinder.
- Work experience with Splunk for client call tracing, dashboards, and reports.
- Expertise in formulating authentication strategies using OTP and SiteMinder/PingFederate.
- Experience with various SDLC methodologies such as Agile, Kanban, iterative/incremental, and Waterfall.
- Experience with cloud environments such as AWS and Azure.
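The bullet on migrating MapReduce programs refers to the pattern sketched below: a minimal Scala word count, the canonical MapReduce example, rewritten as RDD transformations and actions. The HDFS paths are hypothetical placeholders.

    import org.apache.spark.sql.SparkSession

    object WordCountMigration {
      def main(args: Array[String]): Unit = {
        val spark = SparkSession.builder()
          .appName("mapreduce-wordcount-migration")
          .getOrCreate()
        val sc = spark.sparkContext

        // flatMap/map play the role of the MapReduce mapper,
        // reduceByKey the role of the reducer.
        val counts = sc.textFile("hdfs:///data/input")   // hypothetical input path
          .flatMap(_.split("\\s+"))
          .map(word => (word, 1L))
          .reduceByKey(_ + _)

        counts.saveAsTextFile("hdfs:///data/output")     // the action triggers execution
        spark.stop()
      }
    }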
PROFESSIONAL EXPERIENCE:
Confidential
Big Data Engineer
Responsibilities:
- Designing the overall application architecture.
- Creating a data lake by integrating different upstream systems and various client feeds.
- Customizing the ingestion framework for Vizient.
- Performing transformations and validations using Scala Spark (a minimal sketch follows this entry).
- Designing data pipelines and performing data wrangling in CASK (CDAP).
Environment: HDP, Confidential ingestion framework, CASK (CDAP), Spark, Hive, HBase, Ranger and Kerberos.
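A minimal sketch of the transformation-and-validation step mentioned above, assuming a CSV client feed; the paths, column names, and target table are hypothetical.

    import org.apache.spark.sql.SparkSession
    import org.apache.spark.sql.functions._

    object FeedValidation {
      def main(args: Array[String]): Unit = {
        val spark = SparkSession.builder()
          .appName("feed-validation")
          .enableHiveSupport()
          .getOrCreate()

        val raw = spark.read.option("header", "true")
          .csv("hdfs:///landing/client_feed")            // hypothetical feed location

        // Basic validations: required key present, dates parseable.
        val validated = raw
          .filter(col("member_id").isNotNull)            // reject rows missing the key
          .withColumn("admit_date", to_date(col("admit_date"), "yyyy-MM-dd"))
          .filter(col("admit_date").isNotNull)           // drop rows with unparseable dates

        validated.write.mode("overwrite").format("orc")
          .saveAsTable("lake.validated_feed")            // hypothetical curated table
        spark.stop()
      }
    }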
Confidential
Big Data Engineer
Responsibilities:
- Creating and managing clusters on different Hadoop distributions: Hortonworks, Cloudera, and MapR.
- Installing and customizing the Enable ingestion framework on all of the above distributions.
- Creating source connectors in the Enable UI that represent the different source and target systems (see the consumer sketch after this entry).
- Defining the ingestion strategy and providing architectural recommendations for building the data lake.
Environment: HDP, CDH, MapR, AWS cloud, Hive, Sqoop, HBase, Kudu, Kafka, MapR Streams, MapR-DB
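For illustration, a hedged sketch of a Kafka consumer of the kind a source connector wraps; the broker address, group id, and topic are hypothetical, and in practice the Enable connectors are configured through the UI rather than hand-coded.

    import java.time.Duration
    import java.util.{Collections, Properties}
    import org.apache.kafka.clients.consumer.KafkaConsumer
    import scala.jdk.CollectionConverters._

    object KafkaSourceIngest {
      def main(args: Array[String]): Unit = {
        val props = new Properties()
        props.put("bootstrap.servers", "broker1:9092")   // hypothetical broker
        props.put("group.id", "enable-ingest")           // hypothetical consumer group
        props.put("key.deserializer",
          "org.apache.kafka.common.serialization.StringDeserializer")
        props.put("value.deserializer",
          "org.apache.kafka.common.serialization.StringDeserializer")

        val consumer = new KafkaConsumer[String, String](props)
        consumer.subscribe(Collections.singletonList("client_feed"))  // hypothetical topic

        // Poll and hand each record to the downstream landing-zone writer.
        while (true) {
          val records = consumer.poll(Duration.ofMillis(500))
          records.asScala.foreach { r =>
            println(s"offset=${r.offset} key=${r.key} value=${r.value}")
          }
        }
      }
    }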
Confidential
Big Data Developer
Responsibilities:
- Built data pipeline management for import/export, ETL processing, and operationalizing and modeling business scores; enhanced, optimized, and tuned pipelines based on Oozie workflows, schedules, and Control-M jobs.
- Ingested data using Sqoop and Flume, and exported score data for visualization via the IBM Campaign mart, Tableau, and Cognos for reporting and business intelligence.
- Produced the high-level design for the behavioral predictive propensity model calculations.
- Applied Spark DataFrames and Datasets for real-time processing.
- Fine-tuned Hive and Impala queries that join tables with billions of rows to create feature vectors for each data object, and reused them in Spark, Spark SQL, and HiveContext (see the join sketch after this entry).
- Loaded Avro and Parquet file formats for ETL and faster processing.
- Performed cluster computing for data cleansing and validation tasks using the Apache Spark Scala API.
- Used Jenkins and Maven to manage project deployments to different environments.
- Used Kerberos authentication and ID Vault for password encryption.
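A hedged sketch of the billion-row join pattern from the Hive/Impala tuning bullet, expressed with Spark DataFrames; the table and column names are hypothetical. Broadcasting the small dimension avoids shuffling the large fact table, the same effect as a Hive map-side join.

    import org.apache.spark.sql.SparkSession
    import org.apache.spark.sql.functions.broadcast

    object FeatureVectorJoin {
      def main(args: Array[String]): Unit = {
        val spark = SparkSession.builder()
          .appName("feature-vector-join")
          .enableHiveSupport()
          .getOrCreate()

        val events    = spark.table("edw.click_events")  // hypothetical billion-row fact table
        val customers = spark.table("edw.customer_dim")  // hypothetical small dimension

        // Broadcast join: ship the dimension to every executor
        // instead of shuffling the fact table.
        val features = events
          .join(broadcast(customers), Seq("customer_id"))
          .groupBy("customer_id")
          .count()                                       // stand-in for real feature logic

        features.write.mode("overwrite").saveAsTable("scores.feature_vectors")
        spark.stop()
      }
    }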
Confidential
Technical Lead
Responsibilities:
- Built use cases, class diagrams, and sequence diagrams using MS Visio and Rational Rose.
- Evaluated and performed feasibility studies of new vendor integrations to orchestrate seamlessly with the existing IAM infrastructure.
- Participated in product analysis activities with vendors.
- Evaluated and recommended which vendor product should be chosen as the strategic partner for biometric authentication.
- Developed POCs to evaluate vendor products.
- Created architecture documents describing new authentication strategies with vendor product integration.
- Developed new web services for integration.
- Developed authentication web flows in the Ping environment using customized Ping adapters.
- Customized PingAccess authentication flows by designing adapters that interact with other IAM applications.
- Demonstrated different use-case scenarios to LOB partners to show the authentication enhancements and the easy availability of applications on smart devices.
- Prepared a dashboard capturing a comparative analysis of the different vendor products.
- Managed SVN branching and merging activities.
- Used Jenkins as the standard build tool.
Environment: Java, J2EE, Spring, Spring MVC, PingAccess servers, SOAP web services, RESTful web services, enterprise applications such as WebLogic, Maven, SVN, Jenkins, BladeLogic, Nexus repo, and biometric vendors such as Authentify xFA, Daon IdentityX, Symantec VIP, and Vasco.
Confidential
Software Engineer
Responsibilities:
- Participated in design and architecture brainstorming sessions with US counterparts.
- Produced and reviewed high- and low-level design documentation.
- Performed coding and implementation.
- Developed an ESB-based modular approach for communication between the different modules within the system.
- Implemented the Spring Integration module.
- Implemented asynchronous logging.
- Implemented Spring Web Services.
- Used the UnboundID API for LDAP communication.
- Implemented messaging using MQ Series and RabbitMQ (see the publisher sketch after this entry).
- Used SVN for version control.
- Laid out build strategies and developed a standard build platform using Maven and Jenkins.
Environment: Java, J2EE, Spring, Spring Integration, MQ Series, eTrust LDAP server, VMware tc Server, Spring Web Services.
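A hedged sketch of the RabbitMQ side of the messaging work above, written in Scala against the standard Java client for consistency with the other examples; the host, queue name, and payload are hypothetical.

    import com.rabbitmq.client.ConnectionFactory

    object QueuePublisher {
      def main(args: Array[String]): Unit = {
        val factory = new ConnectionFactory()
        factory.setHost("localhost")                     // hypothetical broker host

        val connection = factory.newConnection()
        val channel = connection.createChannel()
        try {
          // Declare a durable queue and publish to it via the default exchange.
          channel.queueDeclare("orders", true, false, false, null)
          channel.basicPublish("", "orders", null, "order-created".getBytes("UTF-8"))
        } finally {
          channel.close()
          connection.close()
        }
      }
    }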