| Scene Graph Generation (SGG) is an intermediate visual understanding task that first encodes the visual content of a given image and then parses it into a compact summary graph. In this paper, we study the SGG task from the perspectives of academic innovation and technical application. Regarding academic innovation, we conduct research on the unbiased Scene Graph Generation task. Existing SGG approaches generally not only neglect the insufficient modality fusion between vision and language, but also fail to provide informative predicates due to biased relationship predictions, leaving SGG far from practical. To this end, we first present a novel Stacked Hybrid-Attention (SHA) network, which facilitates intra-modal refinement as well as inter-modal interaction, to serve as the encoder. We then devise an innovative Group Collaborative Learning (GCL) strategy to optimize the decoder. In particular, based on the observation that the recognition capability of a single classifier is limited on an extremely unbalanced dataset, we first deploy a group of classifiers that are expert in distinguishing different subsets of classes, and then cooperatively optimize them from two aspects to promote unbiased SGG. Experiments conducted on the VG and GQA datasets demonstrate that we not only establish a new state of the art on the unbiased metric, but also nearly double the performance compared with two baselines. Regarding technical application, we conduct research on emergency early-warning description generation in the electric power industry based on SGG technology. Most current early-warning systems in the electric power industry are based on object detection technologies, which can only provide annotations of dangerous targets within the image, ignoring the potential risk posed by relationships between pairs of objects. This omission limits the capabilities of emergency recognition and forewarning. To this end, we introduce SGG technology into current early-warning systems. Specifically, our method can not only identify dangerous objects, but also recognize the potential relationships that may cause an accident. We conduct experiments to verify the effectiveness and efficiency of our proposed early-warning description generation system. |