Artificial Intelligence

2405 Submissions

[22] viXra:2405.0171 [pdf] submitted on 2024-05-31 02:37:45

Application of Table based K Nearest Neighbor for Index Optimization

Authors: Taeho Jo
Comments: 11 Pages.

This article proposes the modified KNN (K Nearest Neighbor) algorithm which receives a table as its input data and is applied to the index optimization. The motivations of this research are the successful results from applying the table based algorithms to the text categorizations in previous works and the fact that the index optimization can be viewed as a classification task where each word is classified into expansion, inclusion, or removal. In the proposed system, each word in the given text is classified into one of the three categories by the proposed KNN algorithm, associated words are added to ones which are classified into expansion, and ones which are classified into inclusion are kept by themselves without adding any word. The proposed KNN version is empirically validated as the better approach in deciding the importance level of words in news articles and opinions. In using the table based KNN algorithm, it is easier to trace results from categorizing words.
Category: Artificial Intelligence
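
The three-way labeling described in the abstract above can be sketched as a small post-processing step. This is a minimal illustration, not the author's implementation: the function name `optimize_index`, the `classify` callback standing in for the KNN classifier, and the `associated` lookup of associated words are all assumptions introduced here.

```python
from typing import Callable, Dict, List


def optimize_index(
    words: List[str],
    classify: Callable[[str], str],
    associated: Dict[str, List[str]],
) -> List[str]:
    """Build an optimized index from per-word labels.

    classify returns "expansion", "inclusion", or "removal" (hypothetical
    stand-in for the KNN classifier); associated maps a word to its
    associated words (hypothetical lookup).
    """
    index: List[str] = []
    for word in words:
        label = classify(word)
        if label == "removal":
            continue  # the word is dropped from the index
        index.append(word)  # kept for both expansion and inclusion
        if label == "expansion":
            # add associated words that are not already indexed
            for extra in associated.get(word, []):
                if extra not in index:
                    index.append(extra)
    return index


# Toy labels standing in for the classifier's decisions.
labels = {"engine": "expansion", "car": "inclusion", "the": "removal"}
result = optimize_index(
    ["the", "car", "engine"],
    classify=lambda w: labels[w],
    associated={"engine": ["motor", "car"]},
)
print(result)
```

Passing the classifier as a callback keeps the index-building logic independent of which KNN variant produces the labels.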

[21] viXra:2405.0170 [pdf] submitted on 2024-05-31 02:38:04

Specializing K Nearest Neighbor into String Vector based Version using String Vector Operation in Index Optimization

Authors: Taeho Jo
Comments: 11 Pages.

This article proposes the modified KNN (K Nearest Neighbor) algorithm which receives a string vector as its input data and is applied to the index optimization. The results from applying the string vector based algorithms to the text categorizations were successful in previous works, and the index optimization can be viewed as a classification task where each word is classified into expansion, inclusion, or removal. In the proposed system, each word in the given text is classified into one of the three categories by the proposed KNN algorithm, associated words are added to ones which are classified into expansion, and ones which are classified into inclusion are kept by themselves without adding any word. The proposed KNN version is empirically validated as the better approach in deciding the importance level of words in news articles and opinions. We need to define and characterize mathematically more operations on string vectors for modifying more advanced machine learning algorithms.
Category: Artificial Intelligence

[20] viXra:2405.0169 [pdf] submitted on 2024-05-31 02:38:19

Table based K Nearest Neighbor for Text Classification

Authors: Taeho Jo
Comments: 13 Pages.

This article proposes the modified KNN (K Nearest Neighbor) algorithm which receives a table as its input data and is applied to the text categorization. The motivations of this research are the successful results from applying the table based algorithms to the text categorizations in previous works and the expectation of synergy effect between the text categorization and the word categorization. In this research, we define the similarity metric between two tables representing texts, modify the KNN algorithm by replacing the existing similarity metric by the proposed one, and apply it to the text categorization. The proposed KNN is empirically validated as the better approach in categorizing texts in news articles and opinions. In using the table based KNN algorithm, it is easier to trace results from categorizing texts.
Category: Artificial Intelligence
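
As a rough illustration of replacing the similarity metric inside KNN, the sketch below classifies a table by majority vote among its most similar training tables. The abstract does not give the metric, so `table_similarity` (share of total weight carried by entries whose words occur in both tables) is an assumption, as are the representation of a table as a word-to-weight dictionary and the toy data.

```python
from collections import Counter
from typing import Dict, List, Tuple

Table = Dict[str, float]  # assumed representation: word -> weight


def table_similarity(a: Table, b: Table) -> float:
    """Assumed metric: weight of shared entries divided by total weight."""
    total = sum(a.values()) + sum(b.values())
    if total == 0:
        return 0.0
    shared = a.keys() & b.keys()
    return sum(a[k] + b[k] for k in shared) / total


def knn_classify(query: Table, training: List[Tuple[Table, str]], k: int = 3) -> str:
    """Vote among the k training tables most similar to the query."""
    ranked = sorted(
        training,
        key=lambda item: table_similarity(query, item[0]),
        reverse=True,
    )
    votes = Counter(label for _, label in ranked[:k])
    return votes.most_common(1)[0][0]


training = [
    ({"goal": 2.0, "match": 1.0}, "sports"),
    ({"team": 1.0, "goal": 1.0}, "sports"),
    ({"stock": 2.0, "market": 1.0}, "business"),
    ({"bank": 1.0, "market": 2.0}, "business"),
]
query = {"goal": 1.0, "team": 1.0}
predicted = knn_classify(query, training, k=3)
print(predicted)
```

Because the vote is taken over concrete shared entries, the words that drove a decision can be listed directly, which is one way to read the traceability claim in the abstract.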

[19] viXra:2405.0168 [pdf] submitted on 2024-05-31 02:38:35

Graph Similarity Metric for Modifying K Nearest Neighbor for Classifying Texts

Authors: Taeho Jo
Comments: 13 Pages.

This article proposes the modified KNN (K Nearest Neighbor) algorithm which receives a graph as its input data and is applied to the text categorization. The graph is more graphical for representing a word and the synergy effect between the text categorization and the word categorization is expected by combining them with each other. In this research, we propose the similarity metric between two graphs representing words, modify the KNN algorithm by replacing the existing similarity metric by the proposed one, and apply it to the text categorization. The proposed KNN is empirically validated as the better approach in categorizing texts in news articles and opinions. In this article, a word is encoded into a weighted and undirected graph and it is represented into a list of edges.
Category: Artificial Intelligence
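
The edge-list encoding mentioned in the abstract above lends itself to a short example. The abstract does not state the similarity metric, so the sketch below swaps in a weighted Jaccard ratio over edges as an assumption; the names `edge_weights` and `graph_similarity` and the toy graphs are hypothetical.

```python
from typing import Dict, List, Tuple

Edge = Tuple[str, str, float]  # (node, node, weight) in an undirected graph


def edge_weights(edges: List[Edge]) -> Dict[Tuple[str, str], float]:
    """Index edges by an order-independent key, since the graph is undirected."""
    weights: Dict[Tuple[str, str], float] = {}
    for u, v, w in edges:
        key = (u, v) if u <= v else (v, u)
        weights[key] = w
    return weights


def graph_similarity(g1: List[Edge], g2: List[Edge]) -> float:
    """Assumed metric: weighted Jaccard ratio over the two edge sets."""
    w1, w2 = edge_weights(g1), edge_weights(g2)
    keys = w1.keys() | w2.keys()
    numerator = sum(min(w1.get(k, 0.0), w2.get(k, 0.0)) for k in keys)
    denominator = sum(max(w1.get(k, 0.0), w2.get(k, 0.0)) for k in keys)
    return numerator / denominator if denominator else 0.0


g1 = [("a", "b", 1.0), ("b", "c", 0.5)]
g2 = [("b", "a", 0.5), ("c", "d", 1.0)]
similarity = graph_similarity(g1, g2)
print(similarity)
```

Normalizing each edge key makes `("a", "b")` and `("b", "a")` refer to the same edge, so the result does not depend on how the edge list was written down.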

[18] viXra:2405.0164 [pdf] submitted on 2024-05-31 03:52:19

Text Mining; Text Clustering; Table Similarity; Table based AHC Algorithm

Authors: Taeho Jo
Comments: 12 Pages. Text Mining; Text Clustering; Table Similarity; Table based AHC Algorithm

This article proposes the modified AHC (Agglomerative Hierarchical Clustering) algorithm which clusters tables, instead of numerical vectors, as the approach to the text clustering. The motivations of this research are the successful results from applying the table based algorithms to the text clustering tasks in previous works and the expectation of synergy effect between the text clustering and the word clustering. In this research, we define the similarity metric between tables representing texts, and modify the AHC algorithm by adopting the proposed similarity metric as the approach to the text clustering. The proposed AHC algorithm is empirically validated as the better approach in clustering texts in news articles and opinions. In using the table based AHC algorithm, it is easier to trace results from clustering texts.
Category: Artificial Intelligence

[17] viXra:2405.0158 [pdf] submitted on 2024-05-29 02:53:51

Applying Table based AHC Algorithm to Semantic Word Clustering

Authors: Taeho Jo
Comments: 11 Pages.

This article proposes the modified AHC (Agglomerative Hierarchical Clustering) algorithm which clusters tables, instead of numerical vectors, as the approach to the word clustering. The motivations of this research are the successful results from applying the table based algorithms to the text clustering tasks in previous works and the expectation of synergy effect between the text clustering and the word clustering. In this research, we define the similarity metric between tables representing words, and modify the AHC algorithm by adopting the proposed similarity metric as the approach to the word clustering. The proposed AHC algorithm is empirically validated as the better approach in clustering words in news articles and opinions. In using the table based AHC algorithm, it is easier to trace results from clustering words.
Category: Artificial Intelligence
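
Agglomerative clustering with a pluggable similarity metric, as the abstract above describes, can be sketched in a few lines. This is an illustration under assumptions: tables are word-to-weight dictionaries, `shared_weight_similarity` is a stand-in metric not taken from the paper, and average linkage is chosen here only for simplicity.

```python
from itertools import combinations
from typing import Callable, Dict, List

Table = Dict[str, float]  # assumed representation: entry -> weight


def shared_weight_similarity(a: Table, b: Table) -> float:
    """Stand-in metric: weight of shared entries divided by total weight."""
    total = sum(a.values()) + sum(b.values())
    if total == 0:
        return 0.0
    return sum(a[k] + b[k] for k in a.keys() & b.keys()) / total


def ahc(
    items: List[Table],
    similarity: Callable[[Table, Table], float],
    num_clusters: int,
) -> List[List[int]]:
    """Merge the most similar pair of clusters until num_clusters remain."""
    clusters = [[i] for i in range(len(items))]

    def cluster_similarity(c1: List[int], c2: List[int]) -> float:
        # average linkage: mean similarity over all cross-cluster pairs
        scores = [similarity(items[i], items[j]) for i in c1 for j in c2]
        return sum(scores) / len(scores)

    while len(clusters) > max(num_clusters, 1):
        a, b = max(
            combinations(range(len(clusters)), 2),
            key=lambda p: cluster_similarity(clusters[p[0]], clusters[p[1]]),
        )
        clusters[a] = clusters[a] + clusters[b]
        del clusters[b]  # safe because a < b
    return clusters


tables = [
    {"doc1": 1.0, "doc2": 1.0},
    {"doc1": 1.0, "doc2": 0.5},
    {"doc3": 1.0},
    {"doc3": 0.5, "doc4": 1.0},
]
clusters = ahc(tables, shared_weight_similarity, num_clusters=2)
print(clusters)
```

Only the `similarity` argument changes when tables are replaced by another representation; the merge loop itself stays the same.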

[16] viXra:2405.0157 [pdf] submitted on 2024-05-29 02:54:52

String Vector based AHC Algorithm for Clustering Words Semantically

Authors: Taeho Jo
Comments: 12 Pages.

This article proposes the modified AHC (Agglomerative Hierarchical Clustering) algorithm which clusters string vectors, instead of numerical vectors, as the approach to the word clustering. The results from applying the string vector based algorithms to the text clustering were successful in previous works and synergy effect between the text clustering and the word clustering is expected by combining them with each other; the two facts become motivations for this research. In this research, we define the operation on string vectors called semantic similarity, and modify the AHC algorithm by adopting the proposed similarity metric as the approach to the word clustering. The proposed AHC algorithm is empirically validated as the better approach in clustering words in news articles and opinions. We need to define and characterize mathematically more operations on string vectors for modifying more advanced machine learning algorithms.
Category: Artificial Intelligence

[15] viXra:2405.0156 [pdf] submitted on 2024-05-29 02:56:04

Clustering Words Semantically by Graph based Version of AHC Algorithm

Authors: Taeho Jo
Comments: 11 Pages.

This article proposes the modified AHC (Agglomerative Hierarchical Clustering) algorithm which clusters graphs, instead of numerical vectors, as the approach to the word clustering. The graph is more graphical for representing a word and the synergy effect between the text clustering and the word clustering is expected by combining them with each other. In this research, we propose the similarity metric between two graphs representing words, and modify the AHC algorithm by adopting the proposed similarity metric as the approach to the word clustering. The proposed AHC algorithm is empirically validated as the better approach in clustering words in news articles and opinions. In this article, a word is encoded into a weighted and undirected graph and it is represented into a list of edges.
Category: Artificial Intelligence

[14] viXra:2405.0155 [pdf] submitted on 2024-05-29 02:56:42

Extracting Keywords from Text by Feature Similarity based K Nearest Neighbor

Authors: Taeho Jo
Comments: 12 Pages.

This article proposes the modified KNN (K Nearest Neighbor) algorithm which considers the feature similarity and is applied to the keyword extraction. The texts which are given as features for encoding words into numerical vectors are semantic related entities, rather than independent ones, and the keyword extraction can be viewed as a binary classification where each word is classified into keyword or non-keyword. In the proposed system, a text which is given as the input is indexed into a list of words, each word is classified by the proposed KNN version, and the words which are classified into keyword are extracted as the output. The proposed KNN version is empirically validated as the better approach in deciding whether each word is a keyword or non-keyword in news articles and opinions. The significance of this research is to improve the classification performance by utilizing the feature similarities.
Category: Artificial Intelligence
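
One way to let a vector similarity account for related features is a normalized bilinear form, which reduces to cosine similarity when features are treated as independent. The sketch below is an assumption about how such a metric could look and is not taken from the paper; `feature_aware_similarity` and the feature similarity matrix `S` are hypothetical names.

```python
import math
from typing import List

Vector = List[float]
Matrix = List[List[float]]


def feature_aware_similarity(x: Vector, y: Vector, S: Matrix) -> float:
    """Normalized bilinear form; S[i][j] is the similarity of features i and j."""

    def bilinear(u: Vector, v: Vector) -> float:
        return sum(
            S[i][j] * u[i] * v[j]
            for i in range(len(u))
            for j in range(len(v))
        )

    denominator = math.sqrt(bilinear(x, x) * bilinear(y, y))
    return bilinear(x, y) / denominator if denominator else 0.0


x = [1.0, 0.0]
y = [0.0, 1.0]
identity = [[1.0, 0.0], [0.0, 1.0]]  # independent features: plain cosine
related = [[1.0, 0.5], [0.5, 1.0]]   # the two features are half similar

independent_score = feature_aware_similarity(x, y, identity)
related_score = feature_aware_similarity(x, y, related)
print(independent_score, related_score)
```

With independent features the two vectors share nothing, while the off-diagonal entries of `S` let them count as partially similar.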

[13] viXra:2405.0152 [pdf] submitted on 2024-05-29 02:57:42

Keyword Selection from Textual Data using Table based K Nearest Neighbor

Authors: Taeho Jo
Comments: 12 Pages.

This article proposes the modified KNN (K Nearest Neighbor) algorithm which receives a table as its input data and is applied to the keyword extraction. The table based algorithms worked successfully in text mining tasks such as text categorization and text clustering in previous works, and the keyword extraction is able to be mapped into the binary classification where each word is classified into keyword or non-keyword. In the proposed system, a text which is given as the input is indexed into a list of words, each word is classified by the proposed KNN version, and the words which are classified into keyword are extracted as the output. The proposed KNN version is empirically validated as the better approach in deciding whether each word is a keyword or non-keyword in news articles and opinions. In using the table based KNN algorithm, it is easier to trace results from categorizing words.
Category: Artificial Intelligence

[12] viXra:2405.0151 [pdf] submitted on 2024-05-29 02:58:14

K Nearest Neighbor Modified Into String Vector Based Version for Keyword Extraction

Authors: Taeho Jo
Comments: 11 Pages.

This article proposes the modified KNN (K Nearest Neighbor) algorithm which receives a string vector as its input data and is applied to the keyword extraction. The results from applying the string vector based algorithms to the text categorizations were successful in previous works and the keyword extraction is able to be mapped into the binary classification where each word is classified into keyword or non-keyword. In the proposed system, a text which is given as the input is indexed into a list of words, each word is classified by the proposed KNN version, and the words which are classified into keyword are extracted as the output. The proposed KNN version is empirically validated as the better approach in deciding whether each word is a keyword or non-keyword in news articles and opinions. We need to define and characterize mathematically more operations on string vectors for modifying more advanced machine learning algorithms.
Category: Artificial Intelligence

[11] viXra:2405.0150 [pdf] submitted on 2024-05-29 02:58:37

Modification of K Nearest Neighbor by Graph Similarity Metric for Keyword Extraction

Authors: Taeho Jo
Comments: 11 Pages.

This article proposes the modified KNN (K Nearest Neighbor) algorithm which receives a graph as its input data and is applied to the keyword extraction. The graph is more graphical for representing a word and the keyword extraction is able to be mapped into the binary classification where each word is classified into keyword or non-keyword. In the proposed system, a text which is given as the input is indexed into a list of words, each word is classified by the proposed KNN version, and the words which are classified into keyword are extracted as the output. The proposed KNN version is empirically validated as the better approach in deciding whether each word is a keyword or non-keyword in news articles and opinions. In this article, a word is encoded into a weighted and undirected graph and it is represented into a list of edges.
Category: Artificial Intelligence

[10] viXra:2405.0149 [pdf] submitted on 2024-05-29 02:59:11

Feature Similarity based K Nearest Neighbor for Optimizing of Text Indexes

Authors: Taeho Jo
Comments: 11 Pages.

This article proposes the modified KNN (K Nearest Neighbor) algorithm which considers the feature similarity and is applied to the index optimization. The texts which are given as features for encoding words into numerical vectors are semantic related entities, rather than independent ones, and the index optimization can be viewed as a classification task where each word is classified into expansion, inclusion, or removal. In the proposed system, each word in the given text is classified into one of the three categories by the proposed KNN algorithm, associated words are added to ones which are classified into expansion, and ones which are classified into inclusion are kept by themselves without adding any word. The proposed KNN version is empirically validated as the better approach in deciding the importance level of words in news articles and opinions. The significance of this research is to improve the classification performance by utilizing the feature similarities.
Category: Artificial Intelligence

[9] viXra:2405.0144 [pdf] submitted on 2024-05-27 21:45:26

Content based Word Clustering using Feature Similarity based AHC Algorithm

Authors: Taeho Jo
Comments: 12 Pages.

This article proposes the modified AHC (Agglomerative Hierarchical Clustering) algorithm which considers the feature similarity and is applied to the word clustering. The texts which are given as features for encoding words into numerical vectors are semantic related entities, rather than independent ones, and the synergy effect between the word clustering and the text clustering is expected by combining both of them with each other. In this research, we define the similarity metric between numerical vectors considering the feature similarity, and modify the AHC algorithm by adopting the proposed similarity metric as the approach to the word clustering. The proposed AHC algorithm is empirically validated as the better approach in clustering words in news articles and opinions. The significance of this research is to improve the clustering performance by utilizing the feature similarities.
Category: Artificial Intelligence

[8] viXra:2405.0140 [pdf] submitted on 2024-05-26 05:12:24

Using Table based Version of K Nearest Neighbor for Classifying Words Semantically

Authors: Taeho Jo
Comments: 11 Pages.

This article proposes the modified KNN (K Nearest Neighbor) algorithm which receives a table as its input data and is applied to the word categorization. The motivations of this research are the successful results from applying the table based algorithms to the text categorizations in previous works and the expectation of synergy effect between the text categorization and the word categorization. In this research, we define the similarity metric between two tables representing words, modify the KNN algorithm by replacing the existing similarity metric by the proposed one, and apply it to the word categorization. The proposed KNN is empirically validated as the better approach in categorizing words in news articles and opinions. In using the table based KNN algorithm, it is easier to trace results from categorizing words.
Category: Artificial Intelligence

[7] viXra:2405.0138 [pdf] submitted on 2024-05-26 06:53:45

Application of String Vector based K Nearest Neighbor to Semantic Word Classification

Authors: Taeho Jo
Comments: 12 Pages.

This article proposes the modified KNN (K Nearest Neighbor) algorithm which receives a string vector as its input data and is applied to the word categorization. The results from applying the string vector based algorithms to the text categorizations were successful in previous works and synergy effect between the text categorization and the word categorization is expected by combining them with each other; the two facts become motivations for this research. In this research, we define the operation on string vectors called semantic similarity, modify the KNN algorithm by replacing the existing similarity metric by the proposed one, and apply it to the word categorization. The proposed KNN is empirically validated as the better approach in categorizing words in news articles and opinions. We need to define and characterize mathematically more operations on string vectors for modifying more advanced machine learning algorithms.
Category: Artificial Intelligence

[6] viXra:2405.0136 [pdf] submitted on 2024-05-26 07:51:04

Modifying K Nearest Neighbor for Content based Word Classification by Graph Similarity Metric

Authors: Taeho Jo
Comments: 12 Pages.

This article proposes the modified AHC (Agglomerative Hierarchical Clustering) algorithm which considers the feature similarity and is applied to the word clustering. The texts which are given as features for encoding words into numerical vectors are semantic related entities, rather than independent ones, and the synergy effect between the word clustering and the text clustering is expected by combining both of them with each other. In this research, we define the similarity metric between numerical vectors considering the feature similarity, and modify the AHC algorithm by adopting the proposed similarity metric as the approach to the word clustering. The proposed AHC algorithm is empirically validated as the better approach in clustering words in news articles and opinions. The significance of this research is to improve the clustering performance by utilizing the feature similarities.
Category: Artificial Intelligence

[5] viXra:2405.0090 [pdf] submitted on 2024-05-17 22:35:51

Intelligent Description and the Principle of Free Energy

Authors: Friedrich Sösemann
Comments: 38 Pages.

Information measures the dependency between states, knowledge that between object and subject states and intelligence that between subject states. Descriptions store object states. Friston's free energy principle is intelligent, combining physics, computer science and biology, but is not new.
Category: Artificial Intelligence

[4] viXra:2405.0046 [pdf] submitted on 2024-05-09 00:29:42

Reasoning AI (RAI), Large Language Models (LLMs) and Cognition

Authors: Victor Senkevich
Comments: 4 Pages.

Do Large Language Models have cognitive abilities? Do Large Language Models have understanding? Is the correct recognition of verbal contexts or visual objects, based on pre-learning on a large training dataset, a manifestation of the ability to solve cognitive tasks? Or is any LLM just a statistical approximator that compiles averaged texts from its huge dataset close to the specified prompts? The answers to these questions require rigorous formal definitions of the cognitive concepts of "knowledge", "understanding" and related terms.
Category: Artificial Intelligence

[3] viXra:2405.0041 [pdf] submitted on 2024-05-07 21:08:56

Self-Supervised Pre-Training for Histological Image Transformer

Authors: Kum Song Ju, Ok Chol Choe, Ok Chol Ri
Comments: 9 Pages.

Image Transformer has recently achieved significant progress for natural image understanding, either using supervised (ViT, DeiT, etc.) or self-supervised (BEiT, MAE, etc.) pre-training techniques. In this paper, we propose HiT, a self-supervised pre-trained Histological Image Transformer model using large-scale unlabeled histological images for medical image processing tasks, which is essential since no supervised counterparts ever exist due to the lack of human-labeled histological images. We leverage HiT as the backbone network in a variety of vision-based histological image processing tasks. Experiment results have illustrated that the self-supervised pre-trained HiT model achieves new state-of-the-art results on these downstream tasks, e.g. histological image classification on the SIPaKMeD database achieved an accuracy of 97.45% and 99.29% for 5-class and 2-class classifications, respectively.
Category: Artificial Intelligence

[2] viXra:2405.0037 [pdf] submitted on 2024-05-07 20:59:35

Large Language Model for Automobile

Authors: Fei Ding
Comments: 6 Pages.

With the introduction of ChatGPT (OpenAI, 2022) from OpenAI, the power of these models to generate human-like text has captured widespread public attention. The scale of language models has burgeoned, progressing from modest multi-million-parameter architectures like ELMo (Peters et al., 2018) and GPT-1 (Radford et al., 2018), to behemoths boasting billions, even trillions of parameters, exemplified by the monumental GPT-3 (Brown et al., 2020), Switch Transformers (Fedus et al., 2022), GPT-4 (OpenAI, 2023), PaLM-2 (Anil et al., 2023), Claude (Claude, 2023) and Vicuna (Chiang et al., 2023). The expansion in scale has significantly raised hardware requirements, making it exceedingly challenging to deploy models on mobile devices such as smartphones and tablets. To deploy on cars, we trained a 7-billion-parameter automobile model, which outperforms GPT-3.5 in the automotive domain. Surpassing all models in areas such as automotive.
Category: Artificial Intelligence

[1] viXra:2405.0025 [pdf] submitted on 2024-05-06 19:50:49

Unlocking Customer Sentiments: A Sentiment Analysis of Amazon Product Reviews for Unlocked Mobile Phones

Authors: Apurba Poudel
Comments: 3 Pages.

In this study, I conducted sentiment analysis on product reviews of unlocked mobile phones sold on Amazon to explore customers' opinions and sentiments towards these devices. I classified the sentiment according to the rating given by the user and according to the written reviews by the users, respectively. This study collected a total of 400000 reviews from the Amazon website, focusing on unlocked mobile phones from various brands. The reviews were pre-processed and analyzed using Natural Language Processing (NLP) techniques, Bag of Words (BoW) model, LinearSVC, Word2Vec model and Long Short-Term Memory (LSTM) neural network. My analysis revealed that the majority of the reviews (approximately 70%) were positive. The positive reviews highlighted features such as the device's camera quality, battery life, display, and user interface. On the other hand, some negative reviews were found, mainly related to issues with the device's software and hardware. The negative reviews highlighted problems such as slow performance, freezing, and device malfunctioning. Moreover, the study found that some ratings do not correspond to the actual sentiment of the review. Some users gave ratings higher or lower compared to the calculated sentiment of their reviews.
Category: Artificial Intelligence
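
The comparison between rating-based and text-based sentiment in the abstract above can be illustrated with a small check. The star thresholds, the function names, and the sample data are assumptions made for this sketch; the text labels are taken as given from whatever classifier scored the written reviews.

```python
from typing import List, Tuple


def rating_to_sentiment(rating: int) -> str:
    """Map a star rating to a coarse label (thresholds are an assumption)."""
    if rating >= 4:
        return "positive"
    if rating <= 2:
        return "negative"
    return "neutral"


def find_mismatches(reviews: List[Tuple[int, str]]) -> List[int]:
    """Return positions where the rating label differs from the text label.

    Each review is (star_rating, text_sentiment), where text_sentiment is
    the label a text classifier assigned to the written review.
    """
    return [
        i
        for i, (rating, text_label) in enumerate(reviews)
        if rating_to_sentiment(rating) != text_label
    ]


# Hypothetical sample: two reviews whose stars disagree with their text.
sample = [(5, "positive"), (5, "negative"), (1, "negative"), (2, "positive")]
mismatched = find_mismatches(sample)
print(mismatched)
```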