Channel: PyData
Category: Science & Technology
Tags: python, learn to code, education, software, pydata, learn, coding, how to program, julia, opensource, scientific programming, numfocus, python 3, tutorial
Description: In the last few years, deep learning models and architectures have evolved rapidly, resulting in ongoing improvement in performance on different NLP tasks. However, as advanced as cutting-edge models may be, one of the major bottlenecks in their daily use is the amount of annotated data available for training them. Though different data augmentation methods have been applied successfully in image processing, data augmentation in NLP is still maturing. In this talk we will present different approaches for tackling the limited dataset size issue, using data augmentation and synthetic data generation. Text documents may contain several different formats of textual data. Our methodologies make use of different ways of augmentation, based on the input ontology and its positional coordinates in the document.