Discussion

Stop words

Gupta, D., Vani, K., & Leema, L. M. (2016). Plagiarism detection in text documents using sentence bounded stop word n-grams. Journal of Engineering Science and Technology, 11(10), 1403-1420.

Sentence annotation

Classification and Named Entity Detection

Pragmatic words

"你為什麼什麼都這樣?" "Why are you always like that?"

BERT

2. Preparing the data

To use BERT, we need to convert our data into the format it expects. Our reviews are in CSV files, but BERT wants the data in a TSV file with a specific format, given below (four columns and no header row):

  • Column 0: An ID for the row

  • Column 1: The label for the row, as an integer class label (0, 1, 2, 3, etc.)

  • Column 2: The same letter for every row. This is a throw-away column that we need to include because BERT expects it.

  • Column 3: The text examples we want to classify
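The conversion described above can be sketched with Python's csv module. The input column names ("review" and "label") are assumptions about the CSV, and "a" is an arbitrary choice for the throw-away letter:

```python
import csv

def csv_to_bert_tsv(csv_path, tsv_path, text_col="review", label_col="label"):
    """Convert a CSV of labeled reviews into the four-column TSV BERT expects.

    The column names text_col and label_col are assumptions about the
    input file; adjust them to match your own CSV header.
    """
    with open(csv_path, newline="", encoding="utf-8") as fin, \
         open(tsv_path, "w", newline="", encoding="utf-8") as fout:
        reader = csv.DictReader(fin)
        writer = csv.writer(fout, delimiter="\t")
        for i, row in enumerate(reader):
            # Col 0: row id; Col 1: int label; Col 2: throw-away letter; Col 3: text
            writer.writerow([i, int(row[label_col]), "a", row[text_col]])
```

The resulting file has no header row, matching the format listed above.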

Applications of BERT

BERT can be used for a wide variety of language tasks by adding only a small layer on top of the core model:

  1. Classification tasks such as sentiment analysis are done similarly to Next Sentence classification, by adding a classification layer on top of the Transformer output for the [CLS] token.

  2. In Question Answering tasks (e.g. SQuAD v1.1), the software receives a question regarding a text sequence and is required to mark the answer in the sequence. Using BERT, a Q&A model can be trained by learning two extra vectors that mark the beginning and the end of the answer.

  3. In Named Entity Recognition (NER), the software receives a text sequence and is required to mark the various types of entities (Person, Organization, Date, etc.) that appear in the text. Using BERT, a NER model can be trained by feeding the output vector of each token into a classification layer that predicts the NER label.
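The first two recipes above can be sketched in a few lines of NumPy. The classification weights (W, b) and the start/end vectors stand in for parameters that would actually be learned during fine-tuning; they are hypothetical, not BERT's real weights:

```python
import numpy as np

def softmax(x):
    # Numerically stable softmax over a vector of logits.
    e = np.exp(x - x.max())
    return e / e.sum()

def classify_cls(cls_vector, W, b):
    # (1) Classification: a single linear layer + softmax on the
    # Transformer output for the [CLS] token.
    return softmax(cls_vector @ W + b)

def answer_span(token_vectors, start_vec, end_vec):
    # (2) Question answering: two extra learned vectors score every token
    # as the beginning or end of the answer; the end is constrained to
    # come at or after the start.
    start = int(np.argmax(token_vectors @ start_vec))
    end = int(np.argmax(token_vectors[start:] @ end_vec)) + start
    return start, end
```

NER (3) follows the same pattern as classify_cls, except the linear layer is applied to every token's output vector rather than only to [CLS].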
