Course Description
 
               
This project-based course covers basic techniques for processing text data. It introduces language morphology, text representation, pre-processing, and feature extraction, which are used to obtain information such as text similarity and text clusters. Topics covered include: language morphology, string representation, regular expressions (regex), tokenization, text pre-processing, Bag of Words, TF-IDF, word similarity, word clustering, and web scraping. Students will complete group projects that apply text processing theories and concepts to problems in the field of Data Science.
 
Program Objectives (PO)
 
               
                 
                   
- Able to represent linguistic knowledge at the morphological, syntactic, and semantic levels of representation
- Able to mine text data from digital sources and process it using pre-processing, feature extraction, and text similarity techniques
- Able to model text data using classification and clustering
- Able to design solutions to text data problems using relevant text processing techniques