Data Augmentation with Large Language Models for Vietnamese Abstractive Text Summarization

Bibliographic Details
Main Authors:	Le, Huy M., Luong, Vy T., Luong, Ngoc Hoang
Format:	Conference Proceeding
Language:	eng
Subjects:	Analytical models; Data augmentation; Data models; Deep learning; GPT-3.5; large language model; Measurement; Natural languages; synthetic data generation; Training; Vietnamese text summarization

Description
Summary:	Text summarization plays a crucial role in managing the overwhelming volume of information available today. This task aims to condense large amounts of information into summaries. However, the lack of large-scale annotated data in certain languages, such as Vietnamese, poses a substantial challenge for developing effective summarization models. With the recent advancements in large language models, such as GPT-3.5, there is an opportunity to leverage these models to augment data for improving the performance of deep learning models in Vietnamese text summarization. In this paper, we propose an automatic approach that utilizes a large language model to generate additional training examples and to enhance the summarization process for Vietnamese texts.
ISSN:	2770-6850