Paper Title
VISUAL CAUSAL INFERENCE DESIGN BASED ON INPUT IMAGES
Abstract
Abstract - Causal inference is the process of understanding and inferring the impact of one event on another. In Natural Language Processing (NLP), causal inference aims to predict future scenarios that may arise from observed behavior. Although this is a challenging task, language models can address it by understanding context and causal relationships.
Causal inference between images is the process of inferring causal relationships from the similarities and logical relationships between input images. As research on strong AI advances, the need has grown for models capable of logical reasoning that goes beyond simple classification and generation. This paper therefore addresses visual causal inference from input images using the Vision Transformer (ViT) architecture. Because self-attention allows ViT to capture global relationships readily, we leverage this strength to perform visual causal inference, using event-based causal relationships as background knowledge without any additional information.
Keywords - Vision Transformer, Visual Causal Inference, Data Analysis.