[Info] Additional Fudan University corpus material #7

Open
hoochanlon opened this issue Jun 27, 2023 · 2 comments

Comments

@hoochanlon

hoochanlon commented Jun 27, 2023

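Download: https://bluepload.unstable.life/10100.zip (year: 2018; the text on the web page looks garbled because of a page-encoding problem, but the downloaded material displays correctly.)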

For a combined statistical analysis of stop-word lists in text clustering, see the figure below (footnote 1).

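My view matches that of the authors of the journal article cited here. A longer stop-word list is not automatically a better one; in practice the list also needs to be organized by domain before it can filter effectively and precisely.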
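To illustrate the point about domain classification, here is a minimal sketch that layers a per-domain stop-word set on top of a small general one. The word lists, the domain names, and the `filter_tokens` helper are all hypothetical and invented for illustration; they do not come from the Fudan corpus or from the cited article.

```python
# Hypothetical general-purpose stop words (function words only).
GENERAL_STOPWORDS = {"的", "了", "是", "和", "在"}

# Hypothetical domain-specific stop words: terms so common inside one
# domain that they carry little signal there, but may matter elsewhere.
DOMAIN_STOPWORDS = {
    "finance": {"公司", "市场"},
    "sports": {"比赛", "球队"},
}


def filter_tokens(tokens, domain=None):
    """Drop general stop words, plus the chosen domain's stop words if given."""
    stop = set(GENERAL_STOPWORDS)
    if domain is not None:
        # Unknown domains fall back to the general list only.
        stop |= DOMAIN_STOPWORDS.get(domain, set())
    return [t for t in tokens if t not in stop]


tokens = ["公司", "的", "比赛", "在", "市场", "进行"]
general_only = filter_tokens(tokens)
finance_view = filter_tokens(tokens, "finance")
print(general_only)
print(finance_view)
```

With only the general list, "公司" and "市场" survive; with the finance list applied they are removed, while "比赛" is kept because it is only a stop word in the hypothetical sports domain. Merging every domain list into one large list would have removed all three everywhere.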

Footnotes

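  1. 官琴, 邓三鸿, 王昊. 《中文文本聚类常用停用词表对比研究》 (a comparative study of commonly used stop-word lists for Chinese text clustering). 《数据分析与知识发现》 (Data Analysis and Knowledge Discovery), 2017, no. 3, p. 76.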

@Lotus-Blue

Experimental data from a paper of this caliber? I'd rather not rely on it....

@Lotus-Blue

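I'm skeptical even of the experimental data in CCF-A papers unless I can fully reproduce it locally, let alone a journal article like this one 🤣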
