Project Proposal - Rational or emotional: Comparative Study of Human and LLM Pathfinding in the Wikispeedia Game

Introduction

Abstract:

The project aims to explore and compare the strategies of humans and large language models (LLMs) in the Wikispeedia game. The goal of the game is to find the shortest path between two Wikipedia articles. We will investigate whether LLMs find shorter paths than humans and whether their path-finding strategies differ significantly from those of humans. Through these analyses, we aim to explore whether LLMs have a deeper or more superficial understanding of semantics than humans, and whether LLMs are more emotional or more rational than humans.

Motivation:

By exploring the differences in semantic understanding and exploration patterns between humans and LLMs, the project helps to clarify the cognitive strengths and limitations of both. This will help determine whether LLMs have near-human thinking patterns and comprehension abilities. As a result, the project can advance our knowledge and judgment of AI's semantic comprehension capabilities.

Research Questions and Story:

In this project, we aim to study the following questions:

Do LLMs outperform humans in finding the shortest paths between semantically closely related and unrelated words in the Wikispeedia game?

How do LLMs’ strategies differ from human strategies when navigating between concepts?

Do the paths taken by LLMs reflect a deeper or more superficial understanding of the semantic connections between words compared to human paths?

Are there any biases or noise in the paths taken by humans?

Is there a possibility that the AI relies on a priori knowledge?

By answering the questions above, we can assess the paths from four perspectives: performance strengths and weaknesses, strategy differences, depth of understanding, and the existence of bias.

Implementation:

Pipeline:

The research process follows this workflow:

Database:

We will use only the provided Wikispeedia dataset, from which we will extract the human navigation paths. We will also use the API to collect the LLM's navigation data in batches. We will then combine the two into a new dataset for analysis and evaluation.
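
The extraction of human navigation paths could look roughly like the sketch below. It assumes the tab-separated layout of the public Wikispeedia `paths_finished.tsv` file (hashed IP, timestamp, duration, a `;`-separated path in which `<` marks a back click, and a rating) with URL-encoded article names. The sample rows and the function names `resolve_back_clicks` and `load_human_paths` are hypothetical and only serve as illustration.

```python
import csv
import io
from urllib.parse import unquote

# Hypothetical excerpt in the assumed layout; these rows are not real data.
SAMPLE_TSV = (
    "# comment lines start with '#'\n"
    "user1\t1297740409\t166\tCat;Mammal;<;Animal;Dog\tNULL\n"
    "user2\t1297740500\t88\t%C3%85land;Europe\t3\n"
)


def resolve_back_clicks(steps):
    """Return every page the player landed on, treating '<' as a back click."""
    visited = []  # pages in the order they were displayed
    stack = []    # browsing history used to resolve '<'
    for step in steps:
        if step == "<":
            if len(stack) > 1:
                stack.pop()
                visited.append(stack[-1])  # the player is back on the previous page
        else:
            stack.append(step)
            visited.append(step)
    return visited


def load_human_paths(tsv_text):
    """Parse the assumed TSV layout into a list of decoded navigation paths."""
    rows = (
        line for line in io.StringIO(tsv_text)
        if line.strip() and not line.startswith("#")
    )
    reader = csv.reader(rows, delimiter="\t", quoting=csv.QUOTE_NONE)
    paths = []
    for _ip, _timestamp, _duration, path, _rating in reader:
        steps = [unquote(step) for step in path.split(";")]
        paths.append(resolve_back_clicks(steps))
    return paths


paths = load_human_paths(SAMPLE_TSV)
```

Keeping the revisited page in the output (rather than dropping back clicks) preserves the number of pages a player actually viewed, which matters if path length is later compared with the LLM's paths.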

Method/Metrics

Preprocessing and pre-analysis: perform initial data cleaning and initial data visualization.
We will use a simple analysis of path distance and semantic distance to measure efficiency: whether each step moves closer to the target page.
We will generate and visualize graph node embeddings using Node2Vec, enabling analysis of node relationships in a reduced-dimensional space.
We will design a closeness score based on graph embeddings to judge the merit of a navigation strategy, under which the optimal path shows a clear advantage.
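
The per-step check described above (does each click move closer to the target, both in the link graph and in embedding space?) can be sketched as follows. This is a minimal illustration, not the project's actual metric: the toy link graph, the 2-D vectors standing in for Node2Vec embeddings, and the names `distances_to`, `cosine`, and `step_report` are all assumptions made for the example.

```python
import math
from collections import deque

# Hypothetical toy link graph: article -> articles it links to.
LINKS = {
    "Cat": ["Mammal", "Pet"],
    "Mammal": ["Animal", "Cat"],
    "Pet": ["Dog", "Cat"],
    "Animal": ["Dog", "Mammal"],
    "Dog": ["Pet"],
}

# Hypothetical 2-D vectors; in the project these would come from Node2Vec.
EMBEDDINGS = {
    "Cat": (0.6, 0.8),
    "Mammal": (0.5, 0.5),
    "Animal": (0.7, 0.3),
    "Pet": (0.9, 0.1),
    "Dog": (1.0, 0.0),
}


def distances_to(target, links):
    """BFS over reversed links: fewest clicks needed from each article to target."""
    reverse = {}
    for src, dsts in links.items():
        for dst in dsts:
            reverse.setdefault(dst, []).append(src)
    dist = {target: 0}
    queue = deque([target])
    while queue:
        node = queue.popleft()
        for prev in reverse.get(node, []):
            if prev not in dist:
                dist[prev] = dist[node] + 1
                queue.append(prev)
    return dist


def cosine(u, v):
    """Cosine similarity between two vectors (0.0 if either has zero norm)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def step_report(path, links, embeddings):
    """For each click, report whether it moved closer in the graph and in embedding space."""
    target = path[-1]
    dist = distances_to(target, links)
    report = []
    for current, nxt in zip(path, path[1:]):
        closer_in_graph = dist.get(nxt, math.inf) < dist.get(current, math.inf)
        closer_in_meaning = (
            cosine(embeddings[nxt], embeddings[target])
            > cosine(embeddings[current], embeddings[target])
        )
        report.append((nxt, closer_in_graph, closer_in_meaning))
    return report


report = step_report(["Cat", "Mammal", "Animal", "Dog"], LINKS, EMBEDDINGS)
```

In this toy example the first click (`Cat` to `Mammal`) moves closer to `Dog` in embedding space but not in the link graph, which is the kind of disagreement between semantic and path distance the analysis is meant to surface.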

Tool

Python

OpenAI: GPT-4o mini
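
As an illustration of how the model listed above could be driven to produce navigation paths, the sketch below plays one game by repeatedly showing the current article's outgoing links and asking for the next click. The actual request to the OpenAI API is deliberately left out: `choose_link` is a hypothetical callable that would wrap it, and `make_stub` is only a stand-in for the model so that the loop can run offline. The prompt wording, the toy link graph, and all function names are assumptions.

```python
# Hypothetical toy link graph: article -> articles it links to.
LINKS = {
    "Cat": ["Mammal", "Pet"],
    "Mammal": ["Animal", "Cat"],
    "Pet": ["Dog", "Cat"],
    "Animal": ["Dog", "Mammal"],
    "Dog": ["Pet"],
}


def build_prompt(current, target, options):
    """Assumed prompt wording; the project's real prompt may differ."""
    return (
        f"You are playing Wikispeedia. Current article: {current}. "
        f"Target article: {target}. "
        f"Reply with exactly one of these links: {', '.join(options)}."
    )


def play_game(start, target, links, choose_link, max_steps=10):
    """Follow the chooser's clicks until the target is reached or the budget runs out."""
    path = [start]
    current = start
    while current != target and len(path) - 1 < max_steps:
        options = links.get(current, [])
        if not options:
            break  # dead end: no outgoing links
        choice = choose_link(build_prompt(current, target, options), options).strip()
        if choice not in options:
            break  # invalid answer: stop instead of guessing what was meant
        path.append(choice)
        current = choice
    return path


def make_stub(target):
    """Stand-in for the model: clicks the target if offered, else the first link."""
    def choose_link(prompt, options):
        return target if target in options else options[0]
    return choose_link


llm_path = play_game("Cat", "Dog", LINKS, make_stub("Dog"))
```

Passing the chooser in as a function keeps the game loop independent of any particular API client, so the same loop can be reused for batch collection or tested with a stub.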

Timeline

Date	Content
Now - Nov.15th	Data preprocessing and pre-analysis
Now - Nov.23rd	API exploration
Nov.16th - Nov.30th	Dataset generation
Nov.30th - Dec.13th	Processing and analysis
Dec.13th - Dec.20th	Evaluation, comparison and report

Conclusion

This project compares the abilities of humans and LLMs to find paths in the Wikispeedia game, aiming to understand the differences in strategy, efficiency, and semantic understanding. This study will determine whether LLMs exhibit similarities to human reasoning or possess their own unique logical patterns. Ultimately, our goal is to better assess the current capabilities of LLMs in understanding complex information.

Project Structure

The directory structure of the project looks like this:

├── data                        <- Project data files
│
├── src                         <- Source code
│   ├── data                            <- Data directory
│   ├── models                          <- Model directory
│   ├── utils                           <- Utility directory
│   ├── scripts                         <- Shell scripts
│
├── tests                       <- Tests of any kind
│
├── results.ipynb               <- a well-structured notebook showing the results
│
├── .gitignore                  <- List of files ignored by git
├── pip_requirements.txt        <- File for installing python dependencies
└── README.md
