
A Review of Tools and Techniques for Preprocessing of Textual Data


Abstract

With the wide availability of computing facilities, a huge amount of data is now available in electronic form. Processing this data is required to discover new facts and knowledge, but dealing with huge datasets is challenging because real-world data is generally incomplete, inconsistent, and prone to errors or outliers. More than 80% of the data is unstructured or semi-structured. Data preprocessing prepares such data for analysis and has become an essential step in data mining. It takes 80% of the total effort of any data mining project and directly affects the quality of the mining results. Selecting the right preprocessing technique and tool helps to speed up the data mining process. This paper discusses different preprocessing techniques, surveys and compares the tools available for text preprocessing, and outlines the challenges faced, such as the knowledge of a language's sentence structure needed to perform tokenization, the difficulty of constructing a domain-specific stop-word list, and over-stemming and under-stemming.
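To make the three preprocessing steps named in the abstract concrete, the following is a minimal sketch in Python using only the standard library. It is not taken from the paper: the stop-word list, the suffix list, and the function names (`tokenize`, `remove_stop_words`, `naive_stem`, `preprocess`) are all assumptions chosen for illustration, and the stemmer is a deliberately crude suffix stripper rather than an established algorithm such as Porter's.

```python
import re

# Hypothetical, deliberately tiny stop-word list; real lists are much
# larger and often have to be built per domain.
STOP_WORDS = {"the", "is", "a", "of", "and", "in", "to"}

# Suffixes tried in order by the naive stemmer below (illustration only,
# not the Porter algorithm).
SUFFIXES = ("ing", "es", "ed", "s")


def tokenize(text):
    """Lowercase the text and split it into runs of ASCII letters."""
    return re.findall(r"[a-z]+", text.lower())


def remove_stop_words(tokens):
    """Drop every token that appears in STOP_WORDS."""
    return [t for t in tokens if t not in STOP_WORDS]


def naive_stem(token):
    """Strip the first matching suffix, keeping at least three characters."""
    for suffix in SUFFIXES:
        if token.endswith(suffix) and len(token) - len(suffix) >= 3:
            return token[: -len(suffix)]
    return token


def preprocess(text):
    """Tokenize, remove stop words, then stem each remaining token."""
    return [naive_stem(t) for t in remove_stop_words(tokenize(text))]


if __name__ == "__main__":
    print(preprocess("The mining of data is challenging"))
    # Over-stemming: "news" is reduced to "new", merging unrelated words.
    print(naive_stem("news"))
    # Under-stemming: "mine" and "mining" end up with different stems.
    print(naive_stem("mine"), naive_stem("mining"))
```

Even this toy version shows the challenges the abstract lists: the regular-expression tokenizer assumes words are separated by non-letter characters, which does not hold for every language; the stop-word list would need to be rebuilt for a specialised domain; and the suffix stripper both over-stems and under-stems.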
